fscanf not working when called from Python via ctypes

I'm trying to wrap Google's word2vec so that I can use my pre-trained vectors from Python via ctypes.

I have some code here:

int initialize(){
  ...
  long long words, size;
  char *fname = "GoogleNews-vectors-negative300.bin.gz";
  strcpy(file_name, fname);
  printf("%s\n", file_name);
  f = fopen(file_name, "rb");
  if (f == NULL) {
      printf("Input file not found\n");
      return -1;
  }

  fscanf(f, "%lld ", &words);
  fscanf(f, "%lld", &size);
  printf("size of words is %lld\n", words);
  ...
}

      

This code works fine when I call it from a main function. However, when I compile it into a .so file and call it through ctypes, words never gets a value. Checking with ftell, I can see that fscanf does not advance the file position, and fscanf always returns 0. The file is binary, so I'm not sure fscanf is even appropriate here, but the only alternative I can think of is converting that 3GB binary into an even larger CSV and reading from that.

How can this be fixed? An alternative approach that avoids fscanf would also be fine.



1 answer


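The OP's file, "GoogleNews-vectors-negative300.bin.gz", is gzip-compressed, but the code is written to read the uncompressed format. A gzip stream starts with the bytes 0x1f 0x8b rather than ASCII digits, so fscanf with %lld fails to match anything, returns 0, and leaves the file position where it was. That is exactly the behaviour described in the question.

Please decompress the file (for example with gunzip), point fname at the resulting .bin file, and try again with the uncompressed version.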
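To make this kind of failure visible instead of silent, the header read can check for the gzip magic bytes and test the fscanf return value. The following is only a minimal sketch: the function name read_header is hypothetical, and it assumes that the uncompressed file begins with two integers written as text, which is what the fscanf calls in the question imply.

```c
#include <stdio.h>

/* Hypothetical helper: reads the "<words> <size>" text header from an
 * already opened file. Returns 0 on success and -1 on failure. */
int read_header(FILE *f, long long *words, long long *size) {
    unsigned char magic[2];

    /* A gzip stream starts with the two bytes 0x1f 0x8b. */
    if (fread(magic, 1, 2, f) == 2 && magic[0] == 0x1f && magic[1] == 0x8b) {
        fprintf(stderr, "file is gzip-compressed; decompress it first\n");
        return -1;
    }

    /* Go back to the start before parsing the header. */
    rewind(f);

    /* fscanf returns the number of items it converted; anything other
     * than 2 means the header was not read. */
    if (fscanf(f, "%lld %lld", words, size) != 2) {
        fprintf(stderr, "could not parse header\n");
        return -1;
    }
    return 0;
}
```

Called right after the fopen check in initialize, this would report a compressed input up front rather than leaving words unset.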
