Finding the most frequent character in a file in C

I am writing a function that finds the most common alphabetic character in a file. The function must ignore all non-alphabetical characters.

At the moment I have the following:

int most_common(const char *filename)
{
    char frequency[26];
    int ch = 0;

    FILE *fileHandle;
    if ((fileHandle = fopen(filename, "r")) == NULL) {
        return -1;
    }

    for (ch = 0; ch < 26; ch++)
        frequency[ch] = 0;

    while (1) {
        ch = fgetc(fileHandle);
        if (ch == EOF) break;

        if ('a' <= ch && ch <= 'z')
            frequency[ch - 'a']++;
        else if ('A' <= ch && ch <= 'Z')
            frequency[ch - 'A']++;
    }

    int max = 0;
    for (int i = 1; i < 26; ++i)
        if (frequency[i] > frequency[max])
            max = i;

    return max;
}

      

The function now returns how many times the most frequent letter was encountered, not the letter itself. I am a little lost, as I am not sure what this function should look like. Does it make sense, and how can the problem be solved?

I am very grateful for your help.


1 answer


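The frequency array is indexed by letter position, not by character code: frequency[0] counts 'a' (and 'A'), so frequency[0] equals 5 if the file contained five of them.

In your code, max ends up holding an index into that array (0 to 25), not a character code, so the function returns a small number instead of the actual character.

The fix is to convert the winning index back into a character code before returning it. Tracking both the highest count and the index it belongs to makes that explicit.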
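To make the difference between an index and a character code concrete, here is a minimal sketch of the two conversions involved. The helper names letter_to_index and index_to_letter are hypothetical and do not appear in the question's code, and the arithmetic assumes an encoding such as ASCII in which the letters are contiguous.

```c
/* Hypothetical helper: map a letter to its slot in a 26-element table.
   Returns -1 for anything that is not an ASCII letter. */
int letter_to_index(int ch)
{
    if ('a' <= ch && ch <= 'z') {
        return ch - 'a';
    }
    if ('A' <= ch && ch <= 'Z') {
        return ch - 'A';
    }
    return -1;
}

/* Hypothetical helper: map a slot (0..25) back to an uppercase letter. */
int index_to_letter(int index)
{
    return index + 'A';
}
```

For example, letter_to_index('E') and letter_to_index('e') both give 4, and index_to_letter(4) gives 'E'.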

I would fix this with:



int maxCount = 0;
int maxChar = 0;
// i = 0 to 25, i.e. 'A' to 'Z'
for (int i = 0; i < 26; ++i)
{
  // if freq of this char is greater than the previous max freq
  if (frequency[i] > maxCount)
  {
      // store the value of the max freq
      maxCount = frequency[i];

      // store the index of the char that had the max freq
      maxChar = i;
  }
}

// maxChar is a zero-based position in the alphabet.
// Add the character code of 'A' to turn it back into a char code.
return maxChar + 'A';

Please note that I changed int i = 1 to int i = 0. Because maxCount starts at 0 instead of at frequency[0], the loop has to examine index 0 ('A') as well; starting at 1 would skip it. The loop condition stays i < 26: frequency has 26 elements with valid indices 0 to 25, index 25 is already 'Z', and reading frequency[26] would go past the end of the array.

Note the curly braces. Your brace style (no braces around single-statement blocks) is widely discouraged, because it is easy to add a second statement later and forget to add the braces.

Also, i++ is more common than ++i in loops like this. It makes no difference in this context, but I would suggest i++.
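Putting the pieces together, the whole function might look like the sketch below. A few details are assumptions added here rather than part of the original code: the counters are widened from char to long so they do not overflow on larger files, the file is closed before returning, and -1 is also returned when the file contains no letters at all. The result is an uppercase character code, and on a tie the letter that comes first in the alphabet wins.

```c
#include <stdio.h>

/* Sketch: returns the most frequent letter in the file as an uppercase
   character code, or -1 if the file cannot be opened or has no letters.
   Assumes an ASCII-compatible encoding with contiguous letters. */
int most_common(const char *filename)
{
    /* long rather than char, so counts do not overflow on larger files */
    long frequency[26] = {0};
    int ch;

    FILE *fileHandle = fopen(filename, "r");
    if (fileHandle == NULL) {
        return -1;
    }

    /* Count letters case-insensitively; ignore everything else. */
    while ((ch = fgetc(fileHandle)) != EOF) {
        if ('a' <= ch && ch <= 'z') {
            frequency[ch - 'a']++;
        } else if ('A' <= ch && ch <= 'Z') {
            frequency[ch - 'A']++;
        }
    }
    fclose(fileHandle);

    long maxCount = 0;
    int maxChar = -1;  /* stays -1 if no letter was seen */
    for (int i = 0; i < 26; i++) {
        if (frequency[i] > maxCount) {
            maxCount = frequency[i];
            maxChar = i;
        }
    }

    /* Convert the winning index back into a character code. */
    return (maxChar < 0) ? -1 : maxChar + 'A';
}
```

The caller can compare the result against -1 first and otherwise treat it as a char, for example by printing it with the %c format.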
