How to split a string by character position in C

I am using C to read an external text file. The input is small and looks like this:

0PAUL               22   ACACIA AVENUE                           02/07/1986RN666

      

As you can see, I don't have an obvious delimiter, and sometimes there is no space between the values. However, I know how many characters each field should have when splitting, which looks like this:

id = 1
name = 20
house number = 5
street name = 40
date of birth = 10
reference = 5

      

I created a structure in which I want to store this information and tried using fscanf to read the file. However, I find that something along the lines of the following just doesn't do what I need:

fscanf(file_in, "%1d, %20s", person.id[i], person.name[i]);

      

(The actual line tries to grab all of the input, but you should see where I'm going ...)

The long-term intention is to reformat the input file into a different output file that is a little easier on the eye.

I appreciate that I'm probably going about this the wrong way, but I would be grateful if someone could set me on the right path. If you are able to go easy on me regarding my apparent lack of understanding, I'd appreciate that as well.

Thanks for reading

+3




3 answers


If you have fixed column widths, you can use pointer arithmetic to access substrings of your string str. If you have a starting index begin,

printf("%s", str + begin) ;

      

will print the substring that starts at begin and runs to the end of the string. If you want to print only a certain number of characters, you can use printf's precision specifier .*, which takes the maximum length as an additional int argument:

printf("%.*s", length, str + begin) ;
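As a small illustration of the precision specifier, the sketch below prints one fixed-width field between square brackets so that any padding is visible. The helper name print_field is made up for this example, and it assumes that begin is not past the end of str.

```c
#include <assert.h>
#include <stdio.h>

/* Print at most `length` characters of `str`, starting at offset `begin`,
 * inside square brackets so that padding spaces are visible.
 * Returns printf's result, i.e. the number of characters written.
 * Assumption: `begin` is not past the end of `str`. */
int print_field(const char *str, int begin, int length)
{
    assert(str != NULL && begin >= 0 && length >= 0);
    return printf("[%.*s]\n", length, str + begin);
}
```

For instance, print_field("0PAUL   22", 1, 4) prints [PAUL]. Because the precision is only a maximum, asking for more characters than the string has left simply prints up to the end.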

      

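If you want to copy the substring to a temporary buffer, you can use strncpy, but be aware that it only null-terminates the result when the source string ends within the first length characters; for a field in the middle of a line you have to write the terminator yourself. Alternatively, you can use snprintf with the pattern above, which always terminates the buffer: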

char buf[length + 1];

snprintf(buf, sizeof(buf), "%.*s", length, str + begin) ;
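If you would rather copy the bytes directly than go through snprintf, the terminator has to be written by hand. A minimal sketch, where copy_field is a made-up helper name and the caller is assumed to provide a buffer of at least length + 1 bytes:

```c
#include <assert.h>
#include <string.h>

/* Copy at most `length` characters starting at str + begin into `buf` and
 * null-terminate the result. Stops early if the string ends first.
 * Assumptions: `begin` is not past the end of `str`, and `buf` has room
 * for length + 1 bytes. Returns the number of characters copied. */
size_t copy_field(char *buf, const char *str, size_t begin, size_t length)
{
    assert(buf != NULL && str != NULL);
    const char *src = str + begin;
    const char *nul = memchr(src, '\0', length);   /* does the string end early? */
    size_t n = nul ? (size_t)(nul - src) : length;

    memcpy(buf, src, n);
    buf[n] = '\0';
    return n;
}
```

With char buf[8], the call copy_field(buf, "0PAUL  22", 1, 4) leaves "PAUL" in buf.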

      



This will also copy any leading and trailing white space, which is probably not what you want. You can write a function to remove the unwanted spaces; there should be many examples here on SO.
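One possible shape for such a function is sketched here. The name trim is just an illustration; it modifies the buffer in place, so it must be given a writable string, not a string literal.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Remove leading and trailing white space from `s` in place.
 * Returns `s` for convenience. */
char *trim(char *s)
{
    assert(s != NULL);
    size_t start = 0;
    size_t end = strlen(s);

    /* The casts keep isspace well-defined for negative char values. */
    while (end > 0 && isspace((unsigned char)s[end - 1])) end--;
    while (start < end && isspace((unsigned char)s[start])) start++;

    memmove(s, s + start, end - start);   /* regions may overlap */
    s[end - start] = '\0';
    return s;
}
```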

You can also strip the spaces while copying the substring. The example code below does that with the isspace function/macro from <ctype.h>:

#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>

int extract(char *buf, const char *str, int len)
{
    const char *end = str + len;
    int tail = -1;
    int i = 0;

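    // skip leading white space
    // (the cast keeps isspace well-defined for negative char values)
    while (str < end && *str && isspace((unsigned char)*str)) str++;

    // copy string, remembering where the last non-space character ended
    while (str < end && *str) {
        if (!isspace((unsigned char)*str)) tail = i + 1;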
        buf[i++] = *str++;
    }

    if (tail < 0) tail= i;
    buf[tail] = '\0';

    return tail;
}

int main()
{
    char str[][80] = {
        "0PAUL               22   ACACIA AVENUE                     02/07/1986RN666",
        "1BOB                1    POLK ST                           01/04/1988RN802",
        "2ALICE              99   WEST HIGHLAND CAUSEWAY            28/06/1982RN774"
    };
    int i;

    for (i = 0; i < 3; i++) {
        char *p = str[i];
        char id[2];
        char name[20];
        char number[6];
        char street[35];
        char bday[11];
        char ref[11];

        extract(id,     p + 0, 1);
        extract(name,   p + 1, 19);
        extract(number, p + 20, 5);
        extract(street, p + 25, 34);
        extract(bday,   p + 59, 10);
        extract(ref,    p + 69, 10);

        printf("<person id='%s'>\n", id);
        printf("    <name>%s</name>\n", name);
        printf("    <house>%s</house>\n", number);
        printf("    <street>%s</street>\n", street);
        printf("    <birthday>%s</birthday>\n", bday);
        printf("    <reference>%s</reference>\n", ref);
        printf("</person>\n\n");        
    }

    return 0;
}

      

There is a danger here: when you access a string at a certain position, str + pos, you have to make sure that you don't go beyond the actual length of the string. For example, a line may be cut off after the name. When you then access the birthday, you are still accessing valid memory, but it may contain garbage.

You can avoid this problem by padding the whole line with spaces.
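A minimal sketch of such padding is shown here. The name pad_line is hypothetical, and the function assumes the buffer is at least width + 1 bytes long. It also drops a trailing newline such as the one fgets leaves behind.

```c
#include <assert.h>
#include <string.h>

/* Strip a trailing line ending from `line`, then pad it with spaces up to
 * `width` characters so that every offset below `width` is safe to read.
 * Assumption: `line` points to a buffer of at least width + 1 bytes.
 * Returns the length of the line before padding (without the line ending). */
size_t pad_line(char *line, size_t width)
{
    assert(line != NULL);
    size_t original = strcspn(line, "\r\n");
    size_t len = original;

    if (len < width) {
        memset(line + len, ' ', width - len);
        len = width;
    }
    line[len] = '\0';
    return original;
}
```

After pad_line(line, 74), each of the extract calls above reads either real data or spaces, and extract turns a field of spaces into an empty string.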

+1




Use fgets to read one line at a time, then extract each field from the input line. Warning: no bounds checks are performed on the output buffers, so care must be taken to size them appropriately.

For example, something like this (I haven't compiled it, so there may be some bugs):



    #include <stdio.h>
    #include <string.h>

    /* Copy pBuffer[start_idx..end_idx) into pOutBuffer and null-terminate it.
       end_idx is clamped to content_size; pOutBuffer must be large enough. */
    void copy_substr(const char * pBuffer, int content_size, int start_idx, int end_idx, char * pOutBuffer)
    {
        end_idx = end_idx > content_size ? content_size : end_idx;
        int j = 0;
        for (int i = start_idx; i < end_idx; i++)
            pOutBuffer[j++] = pBuffer[i];
        pOutBuffer[j] = 0;
    }

    void test_solution(void)
    {
        char buffer_char[200];
        // use your own FILE handle instead of stdin
        if (fgets(buffer_char, sizeof(buffer_char), stdin) == NULL)
            return;   // end of file or read error
        int len = (int)strlen(buffer_char);
        char temp_buffer[100];
        // Reading first field: str[0..1), so only the char 0 (len=1)
        int field_size = 1;
        int field_start_ofs = 0;
        copy_substr(buffer_char, len, field_start_ofs, field_start_ofs + field_size, temp_buffer);
    }
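The same idea can be driven by a table of widths so that the offsets don't have to be worked out by hand. The sketch below uses the six widths listed in the question; the names split_fields, WIDTHS, NFIELDS and MAX_FIELD are invented for this example, and the fields are copied with their padding intact.

```c
#include <assert.h>
#include <string.h>

#define NFIELDS   6
#define MAX_FIELD 40   /* widest field in the table below */

/* Field widths as listed in the question: id, name, house number,
 * street name, date of birth, reference. */
static const size_t WIDTHS[NFIELDS] = { 1, 20, 5, 40, 10, 5 };

/* Split `line` into NFIELDS null-terminated strings by accumulating the
 * widths into a running offset. Fields that start beyond the end of a
 * short line come out empty. Returns the number of fields that had data. */
size_t split_fields(const char *line, char out[NFIELDS][MAX_FIELD + 1])
{
    assert(line != NULL && out != NULL);
    size_t len = strcspn(line, "\r\n");   /* ignore the line ending from fgets */
    size_t offset = 0;
    size_t found = 0;

    for (size_t f = 0; f < NFIELDS; f++) {
        size_t n = 0;
        if (offset < len) {
            n = len - offset;
            if (n > WIDTHS[f]) n = WIDTHS[f];
            memcpy(out[f], line + offset, n);
            found++;
        }
        out[f][n] = '\0';
        offset += WIDTHS[f];
    }
    return found;
}
```

A caller would declare char fields[NFIELDS][MAX_FIELD + 1], pass each line read by fgets to split_fields, and then strip the padding from whichever fields need it.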

      

+2




sscanf is a good way to do this: you just have to read the line into a buffer, call sscanf several times and give it the right offsets. For example:

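char buffer[100];
/* "%s" would stop at the first space, so read the whole line with fgets */
fgets(buffer, sizeof buffer, file_in);

/* %d needs the address of an int (this assumes id and street_number hold ints) */
sscanf(buffer, "%1d", &person.id[i]);
sscanf(buffer+1, "%20s", person.name[i]);   /* stops at the first space; needs room for 21 chars */
sscanf(buffer+1+20, "%5d", &person.street_number[i]);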

      

etc. I feel like this is the easiest way to do it.

Please also consider using an array of structures instead of a structure of arrays; it just feels wrong to have person.id[i] and not person[i].id.
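To make the array-of-structures layout concrete, here is a rough sketch. The struct members and the parse_person helper are assumptions for illustration, since the question doesn't show the real structure; the offsets follow the widths given in the question (id: 1, name: 20, house number: 5).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One record per struct; the member names are illustrative. */
struct person {
    int  id;
    char name[21];        /* 20 characters + terminator */
    int  street_number;
};

/* Fill one struct from a fixed-width line.
 * Assumes the line holds at least the first 26 characters of a record
 * and that the id is a single digit. The name keeps its padding. */
void parse_person(struct person *p, const char *line)
{
    assert(p != NULL && line != NULL && strlen(line) >= 26);
    char number[6];

    p->id = line[0] - '0';
    memcpy(p->name, line + 1, 20);
    p->name[20] = '\0';
    memcpy(number, line + 21, 5);
    number[5] = '\0';
    p->street_number = (int)strtol(number, NULL, 10);   /* skips/ignores the padding */
}
```

With struct person people[100], each line is parsed with parse_person(&people[i], buffer), and the fields are then read as people[i].id, people[i].name and so on.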

+2








