19 Feb 2022
The 1999 revision to the C programming language brought several interesting
changes and additions to the standard. Among those are variable length arrays,
or VLAs for short, which allow array types to have lengths that are determined
at runtime. These can show up in different contexts, but for the purposes of
this post we will be looking at VLAs as parameters to functions. For example:
void f(int n, float v[n]);
Since array parameters decay to their corresponding pointer type, that
declaration is functionally equivalent, and compatible (6.7.5.3p7) to the
following:
void f(int n, float *v);
The length of v
in the first declaration is given by the value of a previous
parameter of the function, which can only be known at runtime. However, in a
function prototype (i.e. not a function definition) like the above, the length
of a VLA parameter is never computed and essentially ignored (6.7.5.2p5).
But in a function definition, the length is evaluated and must be greater than
zero. In practice this is rather useless because the sizeof operator doesn’t
work as one might expect it to with array parameters (6.5.3.4-note88):
void f(int n, float param[n]) {
float local[n];
int a=sizeof local,
b=sizeof param;
printf("n=%dn", n);
printf("sizeof local=%dn", a); // this will be n * sizeof(float)
printf("sizeof param=%dn", b); // but this will be sizeof(float *)
}
In other words, using sizeof with a VLA will evaluate the runtime-known length
and calculate the array size based on that, except when that VLA is a function
parameter, then, for whatever reason, it is the size of the corresponding
pointer type (in this example, sizeof local===n * sizeof(float)
but sizeof param==sizeof(float *)
). The length of a VLA parameter may be used when e.g.
computing indices when accessing multi-dimensional variable length arrays.
Alas, the standard mandates that the variable array length be computed when
the function is called. Of course, the expression in between the square
brackets is not limited to simple expressions like n
, so one can write
something like:
void f(int a, int b, int m[(a + b) / 2]) {}
or
void f(int x, int b[abs(x)]) {}
or even
void f(int v[getchar()]) {}
The following program should give you an idea of the kind of constructs
that these rules allow for (try it on compiler explorer):
int main(int argc, char *argv[printf("Hello")])
{
printf(" world!n");
}
The length expression of the VLA is evaluated before any of the statements in
main
. I couldn’t find anywhere in the standard saying whether this evaluation
order is well-defined but it is what clang and gcc do, and luckily, it does not
matter for the sake of this article, as we will see shortly.
Let us refer to the subset of C99 where function bodies must be empty as
disembodied C. You might naturally ask yourself what things can be
accomplished in this subset (though you can probably guess the answer from the
title of this page). In other words, what can you do in C if you are limited to
just evaluating expressions and no statements?
-
Using the comma operator, we can sequence arbitrarily many expressions for
their side effects, sovoid f() { printf("Press enter to confirm: "); getchar(); printf("thanks.n"); }
becomes
void f(char _[( printf("Press enter to confirm: "), getchar(), printf("thanks.n"), 1 )]) {}
In a comma expression, the operands are evaluated left-to-right and the value
of the last operand is the resulting value of the whole expression. The1
at the end ensures that the evaluated array length is>0 (to avoid UB). -
Functions in disembodied C are going to need a dummy parameter where the VLA
length expression evaluation can take place. For consistency, we will denote
it aschar _[...]
and give it the value""
(the empty string) when calling
said functions (note that the value we give it doesn’t actually matter, though
its size should be at least as big as the computed VLA size to avoid UB). -
If-else statements can be replaced with ternary conditional expressions, such that
void f(int n) { if (n 0) printf("positive!"); else printf("zero!"); }
becomes
void f(int n, char _[( (n 0) ? printf("positive!") : printf("zero!") , 1 )]) {}
-
Remember that the VLA length expression can access previous function arguments,
so parameter passing is essentially unchanged. -
We cannot return values, but we can use out parameters by taking advantage of
the fact that assignments are expressions in C, so instead ofint imax(int a, int b) { return a> b ? a : b; }
we can write
void imax(int *out, int a, int b, char _[ (*out=a> b ? a : b), 1 ]) {}
-
We cannot define local variables inside of expressions, but we can just add
extra function parameters to use as temporaries, rewritingvoid fswapf(float *a, float *b) { float tmp=*a; *a=*b; *b=tmp; }
as
static void fswapf_aux(float *a, float *b, float tmp, char _[( tmp=*a, *a=*b, *b=tmp, 1 )]) {} void fswapf(float *a, float *b, char _[( fswapf_aux(a, b, 0, ""), 1 )]) {}
Alternatively, if re-entrancy and thread-safety are disregarded, we could
just use global (static) variables. -
What about loops? Clearly we cannot use
while
orfor
inside expressions.
Thankfully, they are unnecessary thanks to recursion. For example:float sum(float *v, size_t n) { float sum=0.0f; for (size_t i=0; i
can be expressed as
/* the forward declaration is necessary */ static void sum_aux(float *out, float *v, size_t n, char *); static void sum_aux(float *out, float *v, size_t n, char _[( (n> 0) ? ( *out +=*v, sum_aux(out, v + 1, n - 1, ""), 1 ) : 1 )]) {} void sum(float *out, float *v, size_t n, char _[( *out=0.0f, sum_aux(out, v, n, ""), 1 )]) {}
In fact, any imperative-style loop can be turned into an equivalent recursive
loop (as any functional programmer will be happy to demonstrate), though since
C lacks closures or any form of anonymous functions, it can get quite unwieldy
(I hope you like auxiliary functions).The astute reader might point out that these two versions of
sum
are not
equivalent because the recursive definition may cause a stack overflow for
large enough values ofn
. This is unfortunately true, and the major hurdle
for the practicality of disembodied C, but does not preclude Turing
completeness (an ideal Turing machine has infinite memory at its
disposal). Luckily, modern compilers are smart, and if we write our functions
carefully to be tail recursive, they will often be able to perform tail
call elimination, removing the risk of a stack overflow. For the above
example, both gcc and clang are able to optimize the tail call (see on
compiler explorer). -
Although not relevant to standard C99, in case you had the idea of using gcc
statement expressions to bypass these limitations, the compiler will stop
you right on your tracks since it doesn’t allow those outside of function
bodies.
After all of that, it seems that basically anything can be done in disembodied
C, so, is there something that can’t be expressed in this C99 subset?
With the rules that we’ve laid down so far, yes. In particular:
-
It is not possible in the general case to use APIs that require callbacks.
For example, look at the standardqsort
function:void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));
The
compar
parameter is a pointer to a function that should return the
result of comparing two values in the given array. Since its parameters are
void pointers, we can’t have the VLA argument we need to perform computations
(arrays ofvoid
are not valid C), and we also cannot provide anint
return value. Thus there is no way (that I know of) to use this function in
standards-compliant disembodied C.This is not a dealbreaker since we could just reimplement
qsort
;). -
We cannot access the program arguments. The entry point of a program in
disembodied C looks like this:int main(int argc, char *argv[ /* code here ... */ ]) {}
In the VLA length expression for
argv
,argc
is in scope, butargv
is
not.Fortunately there is a way to work around this: we can use the common
extension wheremain
can take a third argument which is a pointer to the
environment. According to the standard (J.5.1), this is implemented in
hosted environments. In practice, POSIX environments and Microsoft both
support it. Then, our entry point looks like this instead:int main(int argc, char **argv, char *envp[/* ... */]) {}
Now
argv
is in scope within theenvp
array length expression.Side note: even though we can’t
return
,main
is the exception to the rule
that reaching the closing}
of a function returning non-void is verboten,
and is equivalent to returning zero (5.1.2.2.3). If we need to terminate
with an exit code other than zero, we can useexit()
.
It seems that with enough effort anything can be done with these peculiar
restrictions. As an example, I wrote implementations of the MD5 and SHA256
hash functions as well as Conway’s Game of Life using SDL for graphics.
I was also working on a more interesting and “useful” program but I kind of
lost motivation halfway through. I’ll update this article if I ever come back
to finish that.
In case it wasn’t clear enough, there is zero practical reason to ever do any
of this. I hope you’ll agree it’s a fun party trick though :).
Related posts
Tips & Trick For Healthy Glowing Skin
Send to HN Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam laoreet, nunc et accumsan cursus, neque eros sodales…
My Fight With Depression. Concussions
Send to HN Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam laoreet, nunc et accumsan cursus, neque eros sodales…
How I Traveled The World With Only $100
Send to HN Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam laoreet, nunc et accumsan cursus, neque eros sodales…
What’re People Buzzing About? Your Content Should Join The Conversation
Send to HN Sed faucibus ultrices orci ac malesuada. Cras eu ante dapibus, imperdiet lacus ac, pulvinar nulla. Interdum et…
Does Coffee Help Deduce Stress Hormone Levels?
Send to HN Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam laoreet, nunc et accumsan cursus, neque eros sodales…
Review Of Healthy Breakfast Meals For Energy Boost
Send to HN Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque hendrerit fringilla enim, ut scelerisque dui. In hac…
How Much Time On Social Networks Is Considered Healthy
Send to HN Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam laoreet, nunc et accumsan cursus, neque eros sodales…
My Fight With Depression. Concussions
Send to HN Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam laoreet, nunc et accumsan cursus, neque eros sodales…