Author: Irrwahn
Date:  
To: dng
Subject: Re: [DNG] [OT] Re: Studying C as told. (For help)
On Thu, 23 Jun 2016 08:23:01 +0100, Katolaz wrote:
[...]
> Thanks for your explanation :) Actually, my point was simply that the
> language allows you to do whatever you want with allocated space that
> you think is "an array of T". If I remove the explicit casts, the
> compiler gives a few warning (incompatible pointer type) but still
> compiles the code into an executable.


Unfortunately it does, I might add. :->

> I am not saying that this is the way to proceed. Such warnings have to
> be always resolved in any serious piece of code. But the language
> allows you to do that, simply because there is no strong typing
> whatsoever of any kind of either statically or dynamically allocated
> memory. Consider the snippet (no malloc here...):
>
> ===
> #include <stdio.h>
> int main(){
>
>     int a = 17;
>     char *b;
>
>     b = (char*)&a;
>     b[2]=21;
>     printf("a: %d\n", a);
>     b[2]=0;
>     printf("a: %d\n", a);
> }
> ===


For the sake of argument I will pretend you actually meant
to write "unsigned char" above, so we do not step into the
realm of dreaded undefined behavior right from the start. ;-)

> whose execution gives:
>
> katolaz@akela:~/test$ ./pointer_wierd
> a: 1376273
> a: 17
> katolaz@akela:~/test$
>
> What is the type of the variable a?


That is obvious: int.

A variable is an object (i.e. a contiguous region of memory)
that contains a value that is to be interpreted according to
the rules pertaining to the declared type of the variable, and
which has a designator (the name of the variable) attached to
it. (Those are my own words, not actual "Standardese".)

> What is an array of "char"? Even
> worse, this code will run smoothly on some architectures and give
> segfault on other ones.


That is already a very strong indication that it probably
is severely broken.

> And is still perfectly *legal* ANSI C :)


Alas, it is not. By making assumptions about the width of an int
(at least 3 bytes in this case) you are already deep in
implementation-defined land, IOW your code is not portable.
And then there's the issue of possible trap representations[1]
in an int, which makes the above code (at least potentially)
invoke undefined behavior:

| Certain object representations need not represent a value of
| the object type. If the stored value of an object has such a
| representation and is read by an lvalue expression that does
| not have character type, the behavior is undefined.

(C99 6.2.6.1p5, same in C11)

Note that the standard guarantees the absence of padding bits,
and hence of trap representations, only for unsigned char
(C99 6.2.6.2p1); an unsigned int may in theory have padding
bits, although hardly any current implementation does.
Furthermore, an unsigned int has at least 16 value bits, which
amounts to at least 2 bytes wherever CHAR_BIT is 8. Under those
two assumptions we can indeed convert your example into one
that is well defined:

#include <stdio.h>
int main(void) {
    unsigned int a = 17;
    unsigned char *b;
    b = (unsigned char *)&a;
    b[1]=21;
    printf("a: %u\n", a);
    b[1]=0;
    printf("a: %u\n", a);
}


But that's (almost?) as far as you can get with accessing a
typed[2] object after you tampered with its value by means of
altering (part of) its binary representation through an
unsigned char pointer. (Unless you know about architecture
and implementation details, but that's neither here nor there,
from the standard's POV.)

Caveat: Since we still know nothing about the endianness of the
machine it's run on, the output of the program is still
implementation defined. Tweaking it so it does produce the same
output regardless of endianness is left as an exercise. 8^)

Lesson: Although C has a lot of "loopholes" in its type system
(despite being a statically typed language), that does not imply
you can get away with everything. The wording in the standard
puts a lot of constraints on which forms of implicit type
conversion, explicit conversion (a.k.a. casts) and bit fiddling
are well defined. Because of that, C is actually surprisingly
strongly typed, even if that trait is not always immediately
obvious.
(Sub-lesson: The statement "It compiles!" is worthless as it is.)


[1] I concede that this is merely theoretical, as it applies
    mostly to ancient or exotic architectures, AFAIK.
    Nevertheless it is to be respected when discussing
    "perfectly legal ANSI C".


[2] Other than character type.

HTH, Regards
Urban