Author: Irrwahn
Date:  
To: dng
Old-topics: Re: [DNG] simple-netaid from scratch
Subject: Re: [DNG] semantic of sizeof operator in C (was: simple-netaid from scratch)
Didier Kryn wrote on 12.06.19 12:15:
[...]

Hi Didier,

please allow me to clear up some apparent misconceptions below.

>
>     What I meant in this discussion is that sizeof() allows to
> calculate the number of elements of an array, because we make
> assumptions on data layout, but this is an artefact and I don't think it
> is specified by the language whether the result is exact or not.
>
>     Let's consider the following type:
>
> typedef struct {int i; short h;} sesqui_int;
>
>     One would naively consider that sizeof(sesqui_int) is equal to 6.
> But, with gcc, the value is 8, which loses 2 bytes in which it could
> store a short or two chars. This is because this struct must be aligned
> on a 4-byte boundary and, if you make an array of these,
> sizeof(sesqui_int)*number_of_elements must give the size of the array.
> Gcc has chosen to return a wrong sizeof() for the sake of preserving a
> naive size arithmetic.


There is nothing wrong here. Gcc reports the size that is necessary to
store an object of type sesqui_int, including any padding that has been
applied, e.g. for alignment reasons. An array of n elements of that type
will in turn always be reported by sizeof as having *exactly* n times
that size, in bytes. Gcc is therefore in accordance with the language
definition.

I assume the misunderstanding here was that sizeof should report the
minimal size an object would occupy in the absence of any alignment
requirements etc. imposed by the actual platform. This is not what sizeof
is designed to do. Instead it shall report the *actual* amount of memory
required to store such an object. If you expected something else you
already made unwarranted assumptions about implementation details that
should not matter to you as the programmer.
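To make this concrete, here is a small sketch built around the sesqui_int type from above. The helper function names are made up for illustration, and the exact numbers (6 versus 8) depend on the platform; only the relations between the values are guaranteed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int i; short h; } sesqui_int;

/* Sum of the member sizes alone, ignoring any padding
   (6 where int is 4 bytes and short is 2; platform dependent). */
size_t sesqui_member_bytes(void)
{
    return sizeof(int) + sizeof(short);
}

/* Padding the implementation added to one object (often 2, possibly 0). */
size_t sesqui_padding_bytes(void)
{
    return sizeof(sesqui_int) - sesqui_member_bytes();
}

/* Size of a 10-element array type, as reported by sizeof itself. */
size_t sesqui_array10_bytes(void)
{
    return sizeof(sesqui_int[10]);
}
```

Whatever padding the platform applies, sesqui_array10_bytes() equals 10 * sizeof(sesqui_int), because the padding is part of each element's size.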

>
>     Another implementation of the C language might decide to add
> headers to arrays, in which it would store the size to perform strict
> runtime checks. In this case the size of an array would be larger than
> the sum of the sizes of its elements.


No, it must not. This is prohibited by the definitions and constraints
in the C standard. The introduction of array headers would for example
lead to
(void *)&array == (void *)&array[0]
not always being true, which would contradict the language definition.
In other words, your hypothetical implementation would implement some
language that is not C, by definition.
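A minimal sketch of what the "no array header" guarantee looks like in practice. The function names and the samples array are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the array object and its first element share one address. */
int array_starts_at_first_element(void)
{
    double samples[8] = {0};
    return (void *)&samples == (void *)&samples[0];
}

/* Returns 1 if neighbouring elements lie exactly sizeof(element) bytes
   apart, i.e. nothing is stored between (or in front of) the elements. */
int elements_are_contiguous(void)
{
    double samples[8] = {0};
    size_t gap = (size_t)((char *)&samples[1] - (char *)&samples[0]);
    return gap == sizeof samples[0];
}
```

Both functions return 1 on a conforming implementation.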

On a somewhat related note: Any padding present in a struct can never
appear at the start of that struct, i.e. the address of an object of
structure type is guaranteed to always compare equal to the address
of its first member.
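The same point, sketched with offsetof from <stddef.h>. The record type and the function names are invented for this example; padding may well appear between tag and value, but never before tag.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char tag; long value; } record;   /* hypothetical type */

/* The first member always sits at offset 0. */
int first_member_has_offset_zero(void)
{
    return offsetof(record, tag) == 0;
}

/* The struct and its first member share one address. */
int struct_address_equals_first_member(void)
{
    record r = { 'x', 42 };
    return (void *)&r == (void *)&r.tag;
}

/* Padding inserted between tag and value (platform dependent, may be 0). */
size_t padding_before_value(void)
{
    return offsetof(record, value) - sizeof(char);
}
```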

>
>     Therefore this use of sizeof(), even though widespread, remains a
> trick.


Not so. Num_array_elements = sizeof array / sizeof element is neither
a trick nor an accident, but rather idiomatic C. It is guaranteed by
the C standard (any version) to yield the correct element count.
Predicting the behavior of any non-trivial C program would be a
crapshoot otherwise. Moreover, it would make it impossible to reliably
allocate dynamic memory for arrays; consider the well-known (and
correct) idiom:

some_type *p = malloc(num_elements * sizeof *p);
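Both idioms in a compilable form, as a rough sketch. ARRAY_LEN, the primes table and the function names are my own choices for illustration, not anything standardized. Note that the macro only works on a real array, not on a pointer, and that the multiplication passed to malloc is not checked for overflow here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Element count of an array object; do not pass a pointer. */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* Counts the elements of a table whose length is never written down. */
size_t count_primes_table(void)
{
    static const int primes[] = { 2, 3, 5, 7, 11 };
    return ARRAY_LEN(primes);
}

/* Allocates n ints with the sizeof *p idiom, fills them with 0..n-1,
   verifies the last element and frees the block again.
   Returns 1 on success, 0 if n is 0 or the allocation failed. */
int iota_roundtrip_ok(size_t n)
{
    if (n == 0)
        return 0;
    int *p = malloc(n * sizeof *p);   /* *p is not dereferenced here */
    if (p == NULL)
        return 0;
    for (size_t i = 0; i < n; i++)
        p[i] = (int)i;
    int ok = (p[n - 1] == (int)(n - 1));
    free(p);
    return ok;
}
```

Writing sizeof *p instead of sizeof(int) keeps the allocation correct even if the element type of p is changed later.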


And while we're at it, please let me add some random interesting facts
about the sizeof operator one should be aware of:

* Being the operand of the sizeof operator is one of the few cases
where an array designator does not decay into a pointer to its first
element, and a non-array lvalue is not converted to the value stored
in the designated object; e.g. the *p in the example above does _not_
dereference p. (All of this is a fancy way of saying that sizeof
looks strictly only at the type of its operand, not its value).

* Since C99 there is one important exception to the rule that the
sizeof operator is evaluated at translation time, and that is when
applied to VLAs (variable length arrays) - for obvious reasons.

* The parentheses around the operand of sizeof are only mandatory if
said operand is a type name. For ordinary object designators (lvalues)
they arguably add unnecessary clutter and may mislead novices into
the false belief that sizeof is a function, which it is not.
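
The three points above can be checked with a few lines of C. This is only a sketch; all names in it are hypothetical, and the VLA part falls back to plain arithmetic on implementations that define __STDC_NO_VLA__ (VLAs became optional in C11).

```c
#include <assert.h>
#include <stddef.h>

int calls = 0;                       /* counts real calls of next_id() */
int next_id(void) { return ++calls; }

/* The operand is not evaluated: next_id() is never called here,
   sizeof only looks at its return type (int). */
int operand_is_not_evaluated(void)
{
    size_t n = sizeof next_id();
    return calls == 0 && n == sizeof(int);
}

/* As the operand of sizeof, buf stays an array; in the initializer
   of ptr it decays to a pointer to its first element. */
int array_does_not_decay(void)
{
    char buf[32] = "sizeof";
    const char *ptr = buf;
    return sizeof buf == 32 && sizeof *ptr == 1 && ptr[0] == 's';
}

/* For a VLA the size is only known at run time. n must be > 0. */
size_t vla_bytes(size_t n)
{
#ifndef __STDC_NO_VLA__
    int vla[n];
    return sizeof vla;               /* computed at run time */
#else
    return n * sizeof(int);          /* fallback where VLAs are absent */
#endif
}

/* Parentheses are required for a type name, optional for an expression. */
int parentheses_optional_for_expressions(void)
{
    long x = 0;
    return sizeof x == sizeof(long) && x == 0;
}
```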


I hope that helped clear things up a bit.


Best regards,

Urban
--
Sapere aude!