Casting char to unsigned short: what's going on behind the scenes?

Given this field:

char lookup_ext[8192] = {0}; // Gets filled later

      

And this statement:

unsigned short *slt = (unsigned short*) lookup_ext;

      

What's going on behind the scenes?

lookup_ext[1669] returns 67 = 0100 0011 (C), lookup_ext[1670] returns 78 = 0100 1110 (N), and lookup_ext[1671] returns 68 = 0100 0100 (D); but slt[1670] returns 18273 = 0100 0111 0110 0001.

I'm trying to port this to C#, so besides an easy way out of this, I'm also curious about what's actually going on here. It's been a while since I used C++ regularly.

Thanks!



4 answers


The statement you show doesn't cast a char to an unsigned short; it casts a pointer to char to a pointer to unsigned short. This means that the usual arithmetic conversions of the pointed-to data do not take place: the underlying char data is simply reinterpreted as unsigned shorts when accessed through the variable slt.

Note that sizeof(unsigned short) is unlikely to be one, so slt[1670] won't necessarily correspond to lookup_ext[1670]. More likely, if, say, sizeof(unsigned short) is two, it corresponds to lookup_ext[3340] and lookup_ext[3341].
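The index arithmetic can be sketched as follows. ByteRange, bytes_covered_by and check_index_mapping are hypothetical names introduced only for illustration, and the concrete numbers in the check assume sizeof(unsigned short) is two, which the language does not guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the first and last char index covered by slt[i].
struct ByteRange {
    std::size_t first;
    std::size_t last;
};

// Computes which elements of the char array an access to slt[i] would touch.
ByteRange bytes_covered_by(std::size_t i) {
    const std::size_t width = sizeof(unsigned short);
    return ByteRange{i * width, i * width + width - 1};
}

// Usage sketch: the concrete values only hold where unsigned short is 2 bytes.
void check_index_mapping() {
    if (sizeof(unsigned short) == 2) {
        const ByteRange r = bytes_covered_by(1670);
        assert(r.first == 3340);
        assert(r.last == 3341);
    }
}
```

So the bytes at lookup_ext[1669] through lookup_ext[1671] from the question are not the ones that slt[1670] looks at.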



Do you know why the original code uses this aliasing? If it isn't necessary, it might be worth cleaning up the C++ code and verifying that the behavior is unchanged before you port it.



If I understand correctly, the cast makes the char array of size 8192 be treated as an array of short int with half as many elements, i.e. 4096 (assuming a 2-byte short).

So I don't understand what you are comparing in your question. slt[1670] should correspond to lookup_ext[1670 * 2] and lookup_ext[1670 * 2 + 1].
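As a rough check against the value in the question, here is a sketch that splits 18273 into its two bytes. The helper names are made up for illustration, and it assumes a 16-bit unsigned short. Which of the two bytes sits at lookup_ext[3340] depends on the machine's byte order: on a little-endian machine the low byte comes first.

```cpp
#include <cassert>

// Hypothetical helper: the least significant 8 bits of a 16-bit value.
unsigned char low_byte(unsigned short v) {
    return static_cast<unsigned char>(v & 0xFF);
}

// Hypothetical helper: the most significant 8 bits of a 16-bit value.
unsigned char high_byte(unsigned short v) {
    return static_cast<unsigned char>((v >> 8) & 0xFF);
}

// 18273 is 0x4761, i.e. the bytes 0x47 ('G' in ASCII) and 0x61 ('a' in ASCII).
void check_question_value() {
    assert(low_byte(18273) == 0x61);
    assert(high_byte(18273) == 0x47);
}
```

Neither byte is one of the 67, 78, 68 values found at indices 1669 to 1671, which is consistent with slt[1670] reading a different part of the array.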



Well, this statement

char lookup_ext[8192] = {0}; // Gets filled later

      

creates an array with local or non-local storage, depending on where the definition occurs. Initializing it like this, with an aggregate initializer, sets all of its elements to zero (the first one explicitly, the rest implicitly). So I wonder why your program outputs non-zero values, unless the array is filled before it is read, in which case it makes sense.

unsigned short *slt = (unsigned short*) lookup_ext;

      

This makes reads through that pointer interpret the bytes of the array as unsigned short objects. Strictly speaking, this is undefined behavior, because you cannot be sure that the array is suitably aligned, and you would be reading through a pointer whose type does not match the type of the original object (char ↔ unsigned short). In C++, the only portable way to read a value out of the bytes of some other POD (plain old data: broadly speaking, all the structs and simple types that are also possible in C, such as short) is to use library functions such as memcpy or memmove.

So if you read *slt above, you interpret the first sizeof(*slt) bytes of the array as an unsigned short (this is called type punning).
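A minimal sketch of the memcpy approach, assuming you want the same value that slt[i] would give on your platform. read_ushort and check_read_ushort are hypothetical names, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the bytes of the i-th unsigned short out of a
// char buffer instead of dereferencing a cast pointer, so neither alignment
// nor aliasing is a concern.
unsigned short read_ushort(const char* bytes, std::size_t i) {
    unsigned short value;
    std::memcpy(&value, bytes + i * sizeof(value), sizeof(value));
    return value;
}

// Usage sketch: store a value with memcpy, then read it back by index.
void check_read_ushort() {
    char lookup_ext[8192] = {0};
    const unsigned short stored = 18273;
    std::memcpy(lookup_ext + 1670 * sizeof(stored), &stored, sizeof(stored));
    assert(read_ushort(lookup_ext, 1670) == 18273);
    assert(read_ushort(lookup_ext, 1669) == 0);
}
```

The result still depends on the platform's byte order and on sizeof(unsigned short), so a port to another language has to pick the same byte order explicitly.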



When you write "unsigned short *slt = (unsigned short*) lookup_ext;", nothing is copied: slt simply holds the address of lookup_ext. When slt is then dereferenced, a number of bytes equal to sizeof(unsigned short) is fetched from that location and interpreted as a single value. Since unsigned short is typically 2 bytes, *slt is built from the first two bytes of lookup_ext.
