No match for a working regex in C

I want to match the regex (?<=SEARCH_THIS=").+(?<!"\n) in C using PCRE.

However, the following code does not work as expected.

#include <pcreposix.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>


int main(void){
    regex_t re;
    regmatch_t matches[2];
    char *regex = "(?<=SEARCH_THIS=\").+(?<!\"\n)";
    char *file = "NO_MATCH=\"0\"\nSOMETHING_ELSE=\"1\"\nSOME_STUFF=\"1\"\nSEARCH_THIS=\"gimme that\"\nNOT_THIS=\"foobar\"\nTHIS_NEITHER=\"test\"\n";

    puts("compiling regex");
    int compErr = regcomp(&re, regex, REG_NOSUB | REG_EXTENDED);
    if(compErr != 0){
        char buffer[128];
        regerror(compErr, &re, buffer, 100);
        printf("regcomp failed: %s\n", buffer);
        return 0;
    }
    puts("executing regex");
    int err = regexec(&re, file, 2, matches, 0);
    if(err == 0){
        puts("no error");
        printf("heres the match: [.%*s]",matches[0].rm_eo-matches[0].rm_so,file+matches[0].rm_so);
    } else {
        puts("some error here!");
        char buffer[128];
        regerror(err, &re, buffer, 100);
        printf("regexec failed: %s\n", buffer);
    }
    return 0;
}

      

Console output:

compiling regex
executing regex
some error here!
regexec failed: No match

      

I checked the functionality of this regex here. Any idea what is going wrong?

EDIT #1

Compiler version

 $ arm-merlin-linux-uclibc-gcc --version
 arm-merlin-linux-uclibc-gcc (GCC) 4.2.1
 Copyright (C) 2007 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

      

Compile command

 $ arm-merlin-linux-uclibc-gcc -lpcre ./re_test.c -o re_test.o

      



1 answer


There are several issues in your code.

First, you use %*s to limit the length of the printed string. However, an integer width before the s conversion is the minimum field width: if the string is shorter than the width, it is padded with spaces, and if it is longer, the entire string is printed anyway. You need a different way to limit the output length, such as a precision (%.*s), which sets the maximum number of bytes printed. Just avoid modifying *file, because file points to a constant string.
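To make the width-versus-precision difference concrete, here is a minimal sketch. The helper names format_slice and format_padded are hypothetical and exist only for this illustration; they write into a buffer with snprintf so the results are easy to inspect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes at most `len` bytes of `src + start`
 * into `out`, using the "%.*s" precision form. */
int format_slice(char *out, size_t out_size, const char *src, int start, int len)
{
    assert(out != NULL && src != NULL && start >= 0 && len >= 0);
    assert((size_t)start <= strlen(src));
    /* The integer after the '.' is an upper bound on the bytes printed. */
    return snprintf(out, out_size, "%.*s", len, src + start);
}

/* Hypothetical helper: same call with "%*s", where the integer is only
 * a minimum field width, so long strings are never cut short. */
int format_padded(char *out, size_t out_size, const char *src, int start, int width)
{
    assert(out != NULL && src != NULL && start >= 0);
    assert((size_t)start <= strlen(src));
    return snprintf(out, out_size, "%*s", width, src + start);
}
```

With the match offsets from regexec, the call would look like format_slice(buf, sizeof buf, file, matches[0].rm_so, matches[0].rm_eo - matches[0].rm_so), which leaves the string literal untouched.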

Second, you pass the REG_NOSUB option to regcomp, but according to the man page this means that no substring positions are stored in the pmatch argument. So even if your regexec call did succeed, the following printf would use uninitialized values (this is undefined behavior).
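As a rough sketch of compiling without REG_NOSUB so that the offsets are actually filled in, the following uses the system <regex.h> rather than pcreposix.h, which is an assumption made to keep the example free of extra libraries. POSIX extended syntax has no lookbehind, so a capture group stands in for it here; this is a swapped-in technique, not your original pattern, and it excludes the closing quote from the result. The helper name find_search_this is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: copies the value of SEARCH_THIS="..." into `out`.
 * Returns 0 on success, or the nonzero regcomp/regexec error code. */
int find_search_this(const char *text, char *out, size_t out_size)
{
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = capture group */
    int rc;

    assert(text != NULL && out != NULL && out_size > 0);

    /* REG_NOSUB is deliberately absent, so regexec() reports offsets. */
    rc = regcomp(&re, "SEARCH_THIS=\"([^\"]*)\"", REG_EXTENDED);
    if (rc != 0)
        return rc;

    rc = regexec(&re, text, 2, m, 0);
    if (rc == 0) {
        /* Copy the captured span, truncating if the buffer is too small. */
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len >= out_size)
            len = out_size - 1;
        memcpy(out, text + m[1].rm_so, len);
        out[len] = '\0';
    }
    regfree(&re);
    return rc;
}
```

The same structure should carry over to pcreposix.h with your lookbehind pattern, as long as REG_NOSUB is dropped and matches[0] is read only after regexec returns 0.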



Finally, I suspect the problem is that the \" and \n sequences must be double-escaped; that is, you need to use \\\" and \\n in your regex string literal. That said, the code you posted worked for me as given (Ubuntu 14.04 x64), and the double-escaped version works as well.
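To show what each escaping level actually hands to the regex engine, here is a small sketch that inspects the bytes of both string literals. The names single_escaped, double_escaped, show_bytes and count_char are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Single escaping: the C compiler resolves \" and \n, so the regex
 * engine receives a bare quote and a real newline byte. */
const char single_escaped[] = "(?<!\"\n)";

/* Double escaping: a backslash survives compilation, and the regex
 * engine resolves \" and \n itself. */
const char double_escaped[] = "(?<!\\\"\\n)";

/* Hypothetical helper: counts occurrences of `c` in `s`. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

/* Hypothetical helper: prints the length and hex value of each byte. */
void show_bytes(const char *label, const char *s)
{
    assert(label != NULL && s != NULL);
    printf("%s len=%zu:", label, strlen(s));
    for (; *s != '\0'; ++s)
        printf(" %02x", (unsigned char)*s);
    putchar('\n');
}
```

The single-escaped literal is 7 bytes long and contains a real newline, while the double-escaped one is 9 bytes long and contains two backslashes instead. PCRE treats \n as a newline and \" as a literal quote, so both forms should describe the same pattern, which fits with both versions working on my machine.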

Taking all this into account, this is the output I get:

compiling regex
executing regex
no error
heres the match: [.gimme that"]

      
