Converting char * to structure
In the code here, there is a line:
struct iphdr * iph = (struct iphdr *)buffer;
in ProcessPacket, where buffer is of type char*. buffer is set to a value by recvfrom in the main function. How is a simple string (buffer) converted to a structure, and how is the data retrieved safely?
iphdr:
struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
    __u8 ihl:4,
         version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
    __u8 version:4,
         ihl:4;
#else
#error "Please fix <asm/byteorder.h>"
#endif
    __u8 tos;
    __u16 tot_len;
    __u16 id;
    __u16 frag_off;
    __u8 ttl;
    __u8 protocol;
    __u16 check;
    __u32 saddr;
    __u32 daddr;
    /* The options start here. */
};
The first thing to understand is that the bits in memory remain exactly the same regardless of the cast (struct iphdr *). The cast only says that buffer should now be treated as a pointer to struct iphdr instead of what it was before. You are telling the compiler to look at the bits through a different pair of glasses and to interpret them accordingly. The compiler sees that buffer has become a struct iphdr *, says "OK", and that's all. What matters is that you know exactly what buffer holds and cast it to the correct type.
If you wanted, you could cast buffer to int * (or any other pointer type) and the compiler wouldn't say anything. You will have problems later though.
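As a small illustration of "same bits, different interpretation", the sketch below reads the first four bytes of a byte array as a uint32_t. The helper name first_four_as_u32 is made up for this example, and it uses memcpy rather than a pointer cast so that it does not depend on how the bytes happen to be aligned. The bytes themselves are never modified; only the type used to read them changes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the blog): interpret the first four
 * bytes at `bytes` as a uint32_t. The caller must supply at least
 * four readable bytes. The numeric result depends on the machine's
 * byte order, because the bytes are reinterpreted, not converted. */
uint32_t first_four_as_u32(const unsigned char *bytes)
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);  /* copy the raw bits unchanged */
    return value;
}
```

Given the bytes 01 00 00 00, this yields 1 on a little-endian machine and 0x01000000 on a big-endian one, while the array still holds the same four bytes afterwards.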
buffer is not a string. It is a pointer to raw binary data. recvfrom fills buffer (in this example, see below) with raw IP/TCP frames (also called packets). Thus, the first sizeof(struct iphdr) bytes of buffer are the IP header, a struct iphdr. This is why the blog author uses the snippet you provided:
struct iphdr * iph = (struct iphdr *)buffer;
If the IP header carries options, the actual size of the header is iph->ihl*4 bytes rather than sizeof(struct iphdr).
The protocol field (iph->protocol) of the header is then checked in ProcessPacket (in the blog) to determine which transport protocol the packet contains.
If the transport protocol used is TCP, then the TCP header (and then the data) can be retrieved using (snippet from the blog):
unsigned short iphdrlen = iph->ihl*4;
struct tcphdr *tcph = (struct tcphdr*)(buffer + iphdrlen);
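The same lookup can also be sketched with explicit length checks, which the blog snippet leaves out. This is not the blog's code: the function name transport_offset is invented here, and it reads the fields by their byte positions in the IPv4 header (version and IHL in byte 0, protocol in byte 9) instead of going through struct iphdr.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the blog): given a raw IPv4 packet,
 * store the protocol number in *protocol and return the offset at
 * which the transport header (e.g. TCP) starts. Returns -1 if the
 * data is too short or does not look like an IPv4 header. */
long transport_offset(const unsigned char *pkt, size_t len, uint8_t *protocol)
{
    if (len < 20)                        /* minimum IPv4 header size */
        return -1;

    unsigned version = pkt[0] >> 4;      /* high nibble of byte 0 */
    unsigned ihl     = pkt[0] & 0x0F;    /* low nibble: length in 32-bit words */
    size_t hdr_len   = (size_t)ihl * 4;  /* same value as iph->ihl*4 */

    if (version != 4 || ihl < 5 || hdr_len > len)
        return -1;

    *protocol = pkt[9];                  /* same field as iph->protocol */
    return (long)hdr_len;
}
```

With a header whose first byte is 0x46 (IHL 6, so one 4-byte option), the function returns 24, and the TCP header would start at buffer + 24.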
Raw frames
The author of the blog created a socket using:
sock_raw = socket(AF_INET , SOCK_RAW , IPPROTO_TCP);
- The first parameter AF_INET means that you want IPv4 packets (as opposed to AF_INET6 for IPv6).
- The second parameter tells socket that you want raw frames.
- The third parameter (IPPROTO_TCP) makes sure you receive TCP frames.
Alternatively, if you want UDP frames, you can use:
sock_raw = socket(AF_INET , SOCK_RAW , IPPROTO_UDP);
If you are greedy and want every packet (please read up on the frame format before using this!):
socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
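The frame format matters because an AF_PACKET socket opened with SOCK_RAW hands you the link-layer header as well, so the IP header is no longer at the start of buffer. The sketch below assumes plain Ethernet frames without VLAN tags (a 14-byte header whose last two bytes are the EtherType); the function name and the two constants are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    SKETCH_ETH_HEADER_LEN = 14,     /* dst MAC (6) + src MAC (6) + EtherType (2) */
    SKETCH_ETHERTYPE_IPV4 = 0x0800  /* EtherType value for IPv4 */
};

/* Hypothetical helper: return the offset of the IPv4 header inside an
 * untagged Ethernet frame, or -1 if the frame is too short or does
 * not carry IPv4. */
long ipv4_offset_in_frame(const unsigned char *frame, size_t len)
{
    if (len < SKETCH_ETH_HEADER_LEN)
        return -1;

    /* The EtherType is stored in network byte order (big-endian). */
    unsigned ethertype = ((unsigned)frame[12] << 8) | frame[13];
    if (ethertype != SKETCH_ETHERTYPE_IPV4)
        return -1;

    return SKETCH_ETH_HEADER_LEN;
}
```

With such a socket, the IP header would then be read from buffer + 14 rather than from buffer itself, and only for frames where the function does not return -1.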
I think the code introduces undefined behavior because of the following two lines that are executed during the course of your program:
unsigned char *buffer = (unsigned char *)malloc(65536);
...
struct iphdr *iph = (struct iphdr*)buffer;
buffer is a pointer to a block of memory, allocated as unsigned char *, which is then cast to a pointer of type struct iphdr *. But struct iphdr most likely has stricter alignment requirements than unsigned char, and if the resulting pointer is not correctly aligned the behavior is undefined (cf. this online C11 draft standard):
6.3.2.3 Pointers
(7) A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. ...
While it may work (which is still one of the possible outcomes of UB), the program could also behave in ways you don't intend.
I would suggest copying the data into a properly aligned struct iphdr object:
unsigned char *buffer = (unsigned char *)malloc(65536);
...
struct iphdr iphobj;
memcpy(&iphobj,buffer,sizeof(struct iphdr));
...
Take care of the object's lifetime.
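A slightly fuller sketch of the copy-based approach follows, with a length check added. It assumes Linux with glibc, where <netinet/ip.h> provides struct iphdr; the helper name read_ip_header is invented for this example. Memory returned by malloc is suitably aligned for any object type, but a pointer at some offset into the buffer (for instance buffer + iphdrlen for the TCP header) need not be, and copying with memcpy avoids having to reason about that.

```c
#include <netinet/ip.h>  /* struct iphdr (assumes Linux/glibc) */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the fixed part of an IPv4 header from
 * `data` (which may be at any alignment) into *out.
 * Returns 0 on success, -1 if the data is too short or the header
 * fields are implausible. Multi-byte fields such as tot_len remain
 * in network byte order, exactly as they arrived. */
int read_ip_header(const unsigned char *data, size_t len, struct iphdr *out)
{
    if (len < sizeof *out)
        return -1;

    memcpy(out, data, sizeof *out);  /* byte-wise copy, no alignment needed */

    if (out->version != 4 || out->ihl < 5 || (size_t)out->ihl * 4 > len)
        return -1;

    return 0;
}
```

The caller owns the struct iphdr it passes in, so the copy stays valid for as long as that object does, independently of what later happens to buffer.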
Note that you tagged the question with C and C++, and the two languages have different rules (for example, regarding explicitly casting the result of malloc, which is required in C++ but not recommended in C).
But with regard to UB, I am absolutely sure that the code introduces UB in both C and C++.