Why does a global variable behave differently in different methods?
Purpose:
- Make a global array of a string (dictionary) if this size can be calculated only in a function
load()
. - Print dictionary on screen using function
print()
.
My approach:
Create a global pointer to a string, create an array of strings in, load()
and assign a local array to a global pointer.
Problem:
If I try to print the global array (and locally) internally load()
, everything works fine, but in the case of printing with a print()
segfault, it will happen somewhere at the end of the array. The findings of GDB and valgrind seem cryptic to me. I give up. What's wrong?
Source and dictionary here .
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
// length of the longest word in dictionary
#define LENGTH 45
// dictionary file
#define DICTIONARY "large"
// prototypes
void load(const char* dictionary);
void print(void);
// global dictionary size
int dict_size = 0;
// global dictionary
char **global_dict;
int main(void)
{
load(DICTIONARY);
print();
return 0;
}
/**
* Loads dictionary into memory.
*/
void load(const char* dictionary)
{
// open dictionary file
FILE *dict_file = fopen(dictionary, "r");
// compute size of dictionary
for (int c = fgetc(dict_file); c != EOF; c = fgetc(dict_file))
{
// look for '\n' (one '\n' means one word)
if (c == '\n')
{
dict_size++;
}
}
// return to beginning of file
fseek(dict_file, 0, SEEK_SET);
// local array
char *dict[dict_size];
// variables for reading
int word_length = 0;
int dict_index = 0;
char word[LENGTH + 1];
// iteration over characters
for (int c = fgetc(dict_file); c != EOF; c = fgetc(dict_file))
{
// allow only letters
if (c != '\n')
{
// append character to word
word[word_length] = c;
word_length++;
}
// if c = \n and some letters're already in the word
else if (word_length > 0)
{
// terminate current word
word[word_length] = '\0';
//write word to local dictionary
dict[dict_index] = malloc(word_length + 1);
strcpy(dict[dict_index], word);
dict_index++;
// prepare for next word
word_length = 0;
}
}
// make local dictioinary global
global_dict = dict;
}
/**
* Prints dictionary.
*/
void print(void)
{
for (int i = 0; i < dict_size; i++)
printf("%s %p\n", global_dict[i], global_dict[i]);
}
source to share
Explanation of the problem (inspired by @Ingo Leonhardt)
The problem is here:
char *dict[dict_size];
With such a declaration, memory for the array is automatically allocated (it is inviolable only in load()
), and after the load()
call, the dict
memory is available for rewriting. It seems that the rewriting happens somewhere at the end of the program (by accident).
Solution (inspired by @iharob)
char **dict = malloc((1 + dict_size) * sizeof(*dict));
This way I dynamically allocate memory for dict
(untouchable until the end of the program or free(global_dict)
)
PS I agree that globals are dangerous, but limiting assignment solves the problem with such a construct.
source to share
The answer is simple, you are assigning a pointer to a variable that is local to load()
and it is deallocated on return load()
, so it is invalid in print()
, which results in undefined behavior.
You even commented on this
// local array <-- this is your comment not mine
char *dict[dict_size];
You have two options:
-
Don't use a global variable, the pattern you used with a global variable does not have any advantage and it is very dangerous. You can return a dynamically allocated poitner from a function
load()
and then pass it toprint()
. -
Allocate an array of pointers using
malloc()
.global_dict = malloc(dict_size * sizeof(*global_dict));
Why don't I like global variables?
- Because you can do what you did in your program without even getting a warning from the compiler.
But of course you won't do that when you get experience, so this is more your fault than globals, but usually they see programmers who are still learning use globals to solve the problem of communicating data between functions. what the parameters are for.
So, using global variables + without knowing how to handle them correctly is bad, learn about function parameters instead and you will solve every problem that would require global variables to pass data through various functions in your program without using global variables ...
This is your own code, I removed the variable global_dict
and used dynamic memory allocation internally load()
, also I did some error checking on malloc()
, you should improve this part if you want the code to be robust, the rest is self-evident
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
// length of the longest word in dictionary
#define LENGTH 45
// dictionary file
#define DICTIONARY "large"
// prototypes
char **load(const char *dictionary);
void print(char **);
int main(void)
{
char **dictionary;
dictionary = load(DICTIONARY);
if (dictionary == NULL)
return -1;
print(dictionary);
/* Don't forget to free resources,
you might need to do it while
the program is still running,
leaked resources might quickly
become a problem.
*/
for (int i = 0 ; dictionary[i] != NULL ; ++i)
free(dictionary[i]);
free(dictionary);
return 0;
}
/**
* Loads dictionary into memory.
*/
char **load(const char *dictionary)
{
// open dictionary file
FILE *dict_file;
size_t dict_size;
char **dict;
char word[LENGTH + 1];
size_t word_length;
size_t dict_index;
dict_file = fopen(dictionary, "r");
if (dict_file == NULL) /* you should be able to notify this */
return NULL; /* failure to open file */
// compute size of dictionary
for (int c = fgetc(dict_file); c != EOF; c = fgetc(dict_file))
{
// look for '\n' (one '\n' means one word)
if (c == '\n')
{
dict_size++;
}
}
// return to beginning of file
fseek(dict_file, 0, SEEK_SET);
// local array
dict = malloc((1 + dict_size) * sizeof(*dict));
/* ^ add a sentinel to avoid storing the number of words */
if (dict == NULL)
return NULL;
// variables for reading
word_length = 0;
dict_index = 0;
// iteration over characters
for (int c = fgetc(dict_file); c != EOF; c = fgetc(dict_file))
{
// allow only letters
if (c != '\n')
{
// append character to word
word[word_length] = c;
word_length++;
}
// if c = \n and some letters're already in the word
else if (word_length > 0)
{
// terminate current word
word[word_length] = '\0';
//write word to local dictionary
dict[dict_index] = malloc(word_length + 1);
if (dict[dict_index] != NULL)
{
strcpy(dict[dict_index], word);
dict_index++;
}
// prepare for next word
word_length = 0;
}
}
dict[dict_index] = NULL;
/* We put a sentinel here so that we can find the last word ... */
return dict;
}
/**
* Prints dictionary.
*/
void print(char **dict)
{
for (int i = 0 ; dict[i] != NULL ; i++)
printf("%s %p\n", dict[i], (void *) dict[i]);
}
source to share
In load()
you create and populate a local variable char *dict[dict_size];
, and then assign only a pointer to that variable global_dict
. But once it load()
returns, you shouldn't be accessing local variables anymore. This is stealing hotel keys as it is explained in this answer
source to share