Convert words from camelCase to snake_case in C

What I am trying to code is this: if I enter camelcase, it should just print camelcase, but if the input contains any uppercase letters, for example camelCase, it should print camel_case.

Below is my attempt, but the problem is that if I enter camelCase, it outputs camel_ase.

Can someone tell me the reason and how to fix it?

#include <stdio.h>
#include <ctype.h>

int main() {
    char ch;
    char input[100];
    int i = 0;

    while ((ch = getchar()) != EOF) {
        input[i] = ch;
        if (isupper(input[i])) {
            input[i] = '_';
            //input[i+1] = tolower(ch);
        } else {
            input[i] = ch;
        }
        printf("%c", input[i]);

        i++;
    }
}

      



5 answers


Look at your code first and think about what happens when someone enters a word longer than 100 characters: the behavior is undefined. If you use a buffer for input, you always need to add checks so that you don't overflow it.

But then when you are printing characters directly, why do you need a buffer? This is completely unnecessary with the approach you are showing. Try the following:

#include <stdio.h>
#include <ctype.h>

int main()
{
    int ch;
    int firstChar = 1; // needed to also accept PascalCase
    while((ch = getchar())!= EOF)
    {
        if(isupper(ch))
        {
            if (!firstChar) putchar('_');
            putchar(tolower(ch));

        } else
        {
            putchar(ch);
        }
        firstChar = 0;
    }
}

      


Side note: I changed the type of ch to int. This is because getchar() returns an int, and putchar(), isupper() and islower() take an int, and they all work with the value of an unsigned char or EOF. Since char is allowed to be signed, on a platform with a signed char you get undefined behavior when calling these functions with a negative char. I know this is a little tricky. Another way to work around this problem is to always cast a char to unsigned char when calling a function that takes the value of an unsigned char as an int.
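To make the cast workaround concrete, here is a minimal sketch. The helper name count_upper is invented for this illustration only; it walks a plain char string and casts each element to unsigned char before handing it to isupper().

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original code):
   counts the uppercase letters in a char string. */
static size_t count_upper(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s)
    {
        /* The cast keeps the argument in the unsigned char range,
           even on platforms where plain char is signed. */
        if (isupper((unsigned char)*s))
        {
            ++count;
        }
    }
    return count;
}
```

In the default "C" locale, count_upper("camelCaseWord") returns 2.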




Since you are using a buffer that is useless as it stands, you may be interested to know that there is a possible solution that does use one: reading and writing a whole line at a time. This is slightly more efficient than calling a function for every single character. Here's an example:

#include <stdio.h>

static size_t toSnakeCase(char *out, size_t outSize, const char *in)
{
    const char *inp = in;
    size_t n = 0;
    while (n < outSize - 1 && *inp)
    {
        if (*inp >= 'A' && *inp <= 'Z')
        {
            if (n > outSize - 3)
            {
                out[n++] = 0;
                return n;
            }
            out[n++] = '_';
            out[n++] = *inp + ('a' - 'A');
        }
        else
        {
            out[n++] = *inp;
        }
        ++inp;
    }
    out[n++] = 0;
    return n;
}

int main(void)
{
    char inbuf[512];
    char outbuf[1024]; // twice the length of the input is an upper bound

    while (fgets(inbuf, 512, stdin))
    {
        toSnakeCase(outbuf, 1024, inbuf);
        fputs(outbuf, stdout);
    }
    return 0;
}

      

This version also avoids isupper() and tolower(), but sacrifices portability. It only works if the character encoding has the letters in sequence and has the uppercase letters before the lowercase letters. For ASCII, these assumptions are met. Keep in mind that what counts as a (capital) letter may also vary by language. The program above only works for the letters A-Z, as in English.
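If portability matters more than avoiding the library calls, the same line-based function could be written with isupper() and tolower(). The following is only a sketch under that assumption: the name toSnakeCasePortable is invented here, the interface is copied from toSnakeCase() above, and the cast to unsigned char follows the side note earlier in this answer.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical portable variant of toSnakeCase(): same interface, but it
   relies on <ctype.h> instead of assuming an ASCII-like encoding.
   Returns the number of bytes written, including the terminating 0. */
static size_t toSnakeCasePortable(char *out, size_t outSize, const char *in)
{
    size_t n = 0;
    if (outSize == 0)
    {
        return 0; /* no room even for the terminator */
    }
    for (; *in != '\0' && n + 1 < outSize; ++in)
    {
        unsigned char c = (unsigned char)*in;
        if (isupper(c))
        {
            if (n + 2 >= outSize)
            {
                break; /* not enough room for '_' plus the letter */
            }
            out[n++] = '_';
            out[n++] = (char)tolower(c);
        }
        else
        {
            out[n++] = *in;
        }
    }
    out[n++] = '\0';
    return n;
}
```

It can be dropped into the main() above in place of toSnakeCase(). Which characters count as uppercase then depends on the current locale rather than on the character encoding.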



There are two problems in the code:

  • You insert one character in each branch of the if, while one of them should insert two characters, and
  • You print characters as you go, but the first branch should print both _ and ch.

You can fix this by incrementing i with i++ on each insertion and printing the whole word at the end:



#include <ctype.h>
#include <stdio.h>

int main(void) {
    int ch; // <<== Has to be int, not char
    char input[100];
    size_t i = 0;

    // Leave room for '_', the letter, and the terminating '\0'
    while (i + 2 < sizeof(input) && (ch = getchar()) != EOF) {
        if (isupper(ch)) {
            if (i != 0) {
                input[i++] = '_';
            }
            ch = tolower(ch);
        }
        input[i++] = (char)ch;
    }
    input[i] = '\0'; // Null-terminate the string
    printf("%s\n", input);
    return 0;
}

      



I don't know exactly how to code in C, but I think you should do something like this.

if(isupper(input[i]))
{
    input[i] = tolower(ch);
    printf("_");

} else
{
    input[i] = ch;
}

      



There are several problems with the code:

  • ch is defined as char: you cannot test for end of file reliably if ch is not defined as int. getc() can return all values of the type unsigned char plus the special value EOF, which is negative. Define ch as int.

  • You store bytes in the array input and call isupper(input[i]). isupper() is defined only for the values returned by getc(), not for potentially negative values of type char if that type is signed on the target system. Use isupper(ch) or isupper((unsigned char)input[i]).

  • You don't check that i is small enough before storing bytes in input[i], causing a potential buffer overflow. Note that there is no need to store characters in an array for your problem.

  • You must store both '_' and the character converted to lowercase in the array. This is your main problem.

  • Whether Main should be converted to _main, to main, or left as Main is a matter of specification.

Here's a simpler version:

#include <ctype.h>
#include <stdio.h>

int main(void) {
    int c;

    while ((c = getchar()) != EOF) {
        if (isupper(c)) {
            putchar('_');
            putchar(tolower(c));
        } else {
            putchar(c);
        }
    }
    return 0;
}

      



You don't need an array to print the entered characters in the form you showed. The program might look like this.

#include <stdio.h>
#include <ctype.h>

int main( void )
{
    int c;

    while ((c = getchar()) != EOF && c != '\n')
    {
        if (isupper(c))
        {
            putchar('_');
            c = tolower(c);
        }
        putchar(c);
    }

    putchar('\n');

    return 0;
}

      

If you do want to use a character array, and you want it to contain a string, you must reserve one element of it for the terminating null character.

In this case, the program might look like this:

#include <stdio.h>
#include <ctype.h>

int main( void )
{
    char input[100];
    const size_t N = sizeof(input) / sizeof(*input);

    int c;
    size_t i = 0;

    while ( i + 1 < N && (c = getchar()) != EOF && c != '\n')
    {
        if (isupper(c))
        {
            input[i++] = '_';
            c = tolower(c);
        }
        if ( i + 1 != N ) input[i++] = c;
    }

    input[i] = '\0';

    puts(input);

    return 0;
}

      
