Finding Binary Pattern in C (Read Buffered Binary)

Hey there. I am trying to write a small program that reads the four bytes following the last occurrence of "0xFF 0xC0 0x00 0x11", which can then easily be converted to binary or decimal. The point is that bytes 2-5 after the last occurrence of this hex pattern represent the width and height of the JPEG file.

#include <stdio.h>
#include <stdlib.h>

 int main () {
  FILE * pFile;
  long lSize;
  char * buffer;
  size_t result;

  pFile = fopen ( "pano8sample.jpg" , "rb" );
  if(pFile==NULL){
   fputs ("File error",stderr);
   exit (1);
  }

  fseek (pFile , 0 , SEEK_END);
  lSize = ftell (pFile);
  rewind (pFile);

  printf("\n\nFile is %ld bytes big\n\n", lSize);

  buffer = (char*) malloc (sizeof(char)*lSize);
  if(buffer == NULL){
   fputs("Memory error",stderr);
   exit (2);
  }

  result = fread (buffer,1,lSize,pFile);
  if(result != (size_t)lSize){
   fputs("Reading error",stderr);
   exit (3);
  }

  //0xFF 0xC0 0x00 0x11 (0x08)

  //Logic to check for hex/binary/dec

  fclose (pFile);
  free (buffer);
  return 0;
 }

      

The problem is I don't know how to read repeatedly from the buffer in memory and use the last value read as an int to compare against my binary / hex / dec pattern.

How can I do this?

+2


4 answers


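/* memmem() is a GNU/BSD extension: #define _GNU_SOURCE before #include <string.h> */
const unsigned char needle[4] = {0xff, 0xc0, 0x00, 0x11};
const unsigned char *hay = (const unsigned char *) buffer; /* leave buffer itself intact for free() */
const unsigned char *last_needle = NULL;
size_t left = (size_t) lSize;                              /* bytes still to be searched */
for (;;) {
  const unsigned char *p = memmem(hay, left, needle, 4);
  if (!p) break;
  last_needle = p;
  left -= (size_t) ((p + 4) - hay);
  hay = p + 4;
}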

      



If last_needle is not null, you can print the bytes at last_needle+4 ...
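Since memmem is a GNU/BSD extension rather than standard C, it may not be available on every platform. A minimal portable sketch of the same last-occurrence search, using only memcmp, could look like the following. The function name find_last is made up for illustration and is not part of the answer above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer): return a pointer to the last
 * occurrence of needle inside hay, or NULL if there is none. */
static const unsigned char *find_last(const unsigned char *hay, size_t hay_len,
                                      const unsigned char *needle, size_t needle_len)
{
    size_t i;

    if (needle_len == 0 || hay_len < needle_len)
        return NULL;

    /* Scan backwards, so the first hit is already the last occurrence. */
    for (i = hay_len - needle_len + 1; i-- > 0; ) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    }
    return NULL;
}
```

With the question's variables this would be called as find_last((const unsigned char *) buffer, (size_t) lSize, needle, 4). The original buffer pointer is never modified, so the later free (buffer) stays valid.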

+6



Instead of reading the entire file into memory, I would use a bit of a state machine. My C is a little rusty, but:

unsigned char searchChars[] = {0xFF,0xC0,0x00,0x11}; /* unsigned, because getc() returns 0..255 */
unsigned char lastBytes[5];
int curChar; int curSearch = 0;
while((curChar = getc(pFile)) != EOF) {     /* read one char, stop at end of file */

    if(curChar == searchChars[curSearch]) { /* found a match */
        curSearch++;                        /* search for next char */
        if(curSearch > 3) {                 /* found the whole string! */
            curSearch = 0;                  /* start searching again */
            if(fread(lastBytes,1,5,pFile) < 5) /* read 5 bytes */
                break;                      /* file ended early: lastBytes is incomplete */
        }
    } else { /* didn't find a match */
        /* go back to the start; the mismatching byte may itself be the first char */
        curSearch = (curChar == searchChars[0]);
    }
 }

      



At the end, lastBytes holds the 5 bytes immediately following the last occurrence of searchChars.
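To turn those 5 bytes into numbers: in a JPEG frame header, the two length bytes (00 11 here) are followed by one byte of sample precision (the 0x08 in the question's comment), then the height and then the width, each as a 16-bit big-endian value. A sketch of the decoding under that assumption follows; the struct and function names are invented for illustration.

```c
/* Hypothetical container for the decoded fields. */
struct sof_info {
    unsigned precision; /* bits per sample, typically 8 */
    unsigned height;    /* number of lines */
    unsigned width;     /* samples per line */
};

/* Decode the 5 bytes that follow FF C0 00 11. */
static struct sof_info decode_sof(const unsigned char b[5])
{
    struct sof_info info;

    info.precision = b[0];
    info.height = ((unsigned) b[1] << 8) | b[2]; /* high byte first */
    info.width  = ((unsigned) b[3] << 8) | b[4];
    return info;
}
```

Keeping the bytes as unsigned char matters here: with a plain char that happens to be signed, values of 0x80 and above would be sign-extended and corrupt the result.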

+2



Personally, I would use a function that swallows one character at a time. The function would use a state machine to perform simple regular-expression-style matching, keeping its state either in static local variables or in a parameter block structure. You need two sub-blocks, one for the match in progress and one for the last full match, each recording the corresponding positions or values as needed.

In this case, you should be able to design it by hand. For more complex requirements, check out Ragel.
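As a rough sketch of that design, assuming the state lives in a caller-supplied struct rather than in static variables, something like the following might do. All names here (struct matcher, matcher_init, matcher_feed) are made up for illustration, and the fallback on a mismatch relies on 0xFF appearing only as the first byte of this particular pattern.

```c
#include <stddef.h>

#define PATTERN_LEN 4
#define TAIL_LEN 5

static const unsigned char PATTERN[PATTERN_LEN] = {0xFF, 0xC0, 0x00, 0x11};

struct matcher {
    size_t matched;               /* pattern bytes matched so far */
    size_t tail_fill;             /* tail bytes captured; TAIL_LEN means idle */
    unsigned char cur[TAIL_LEN];  /* tail of the match in progress */
    unsigned char last[TAIL_LEN]; /* tail of the last fully captured match */
    int have_last;                /* non-zero once last[] is valid */
};

static void matcher_init(struct matcher *m)
{
    m->matched = 0;
    m->tail_fill = TAIL_LEN;
    m->have_last = 0;
}

/* Feed one byte of input to the matcher. */
static void matcher_feed(struct matcher *m, unsigned char c)
{
    /* If a pattern was just completed, collect the bytes that follow it. */
    if (m->tail_fill < TAIL_LEN) {
        m->cur[m->tail_fill++] = c;
        if (m->tail_fill == TAIL_LEN) {
            size_t i;
            for (i = 0; i < TAIL_LEN; i++)
                m->last[i] = m->cur[i];
            m->have_last = 1;
        }
    }

    /* Advance the pattern state; a mismatching byte may start a new match. */
    if (c == PATTERN[m->matched])
        m->matched++;
    else
        m->matched = (c == PATTERN[0]) ? 1 : 0;

    if (m->matched == PATTERN_LEN) {
        m->matched = 0;
        m->tail_fill = 0; /* start capturing the next TAIL_LEN bytes */
    }
}
```

The caller would loop over getc(pFile) until EOF, pass each byte to matcher_feed, and read last[] afterwards if have_last is set. If the input ends before a full tail has been captured, last[] still holds the tail of the previous complete match.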

+1



You can use the fscanf function in C/C++ if the data is ASCII encoded. If it is not, you will have to write your own function to do it. A simple way would be to read N bytes from the file, search those bytes for the pattern you want, and then repeat until EOF.

Your code actually reads the entire file at once (unnecessary if the bytes you are looking for are near the top of the file). It stores the file on the heap as a byte array (a char is one byte in C/C++), with buffer pointing to the start of a contiguous array in memory. Manipulate the buffer array the same way you would manipulate any other array.

Also, if you intend to do something after reading the size, make sure you free the malloced buffer object to avoid leaking.
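The "read N bytes, search, continue to EOF" idea has one catch: a pattern can straddle two reads. A possible sketch that carries the last few bytes of each chunk over to the next one is shown below. The helper name find_last_offset and the chunk size are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 4096

/* Hypothetical helper: scan fp in fixed-size chunks and return the file
 * offset of the last occurrence of needle, or -1 if there is none.
 * Assumes 0 < n <= CHUNK and that fp is positioned at the start. */
static long find_last_offset(FILE *fp, const unsigned char *needle, size_t n)
{
    unsigned char buf[2 * CHUNK];
    size_t keep = 0; /* bytes carried over from the previous chunk */
    long base = 0;   /* file offset of buf[0] */
    long last = -1;
    size_t got;

    if (n == 0 || n > CHUNK)
        return -1;

    while ((got = fread(buf + keep, 1, CHUNK, fp)) > 0) {
        size_t len = keep + got;
        size_t i;

        for (i = 0; i + n <= len; i++) {
            if (memcmp(buf + i, needle, n) == 0)
                last = base + (long) i;
        }

        /* Carry the final n-1 bytes over, so that a match split across
         * two chunks is still seen on the next pass. */
        keep = (len < n - 1) ? len : n - 1;
        memmove(buf, buf + len - keep, keep);
        base += (long) (len - keep);
    }
    return last;
}
```

The caller could then fseek to the returned offset plus 4 and read the bytes that follow. Only about two chunks' worth of memory is needed, regardless of the file size.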

0

