When to quantify ignored pattern matching in the C sscanf function

Cppcheck 1.67 raised a portability issue in my source code at this line:

sscanf(s, "%d%*[,;.]%d", &f, &a);

      

This is the message I received from it:

scanf without field width limits may crash with huge inputs in some libc versions.

The original intention of the format string was to accept one of three possible delimiter characters between two integers, and today, thanks to Cppcheck [1], I see that %*[,;.] even accepts whole strings of delimiter characters. However, I doubt that my format string can cause a crash, because the unbounded part is ignored.

Possibly a buffer overflow problem? ... maybe behind the scenes?


[1] How to get lost between farsightedness and blindness:

I tried to fix it with %1*[,;.] (after reading some API docs), but Cppcheck insisted on the problem, so I also tried %*1[,;.] with the same "success". It seems that I have to suppress it for now ...

+3




2 answers


Congratulations on finding a bug in Cppcheck 1.67 (current version).

You have basically three workarounds:



  • Just ignore the false positive.
  • Change your format so that the field is assigned, which works here because you only want to match one character:

    char tmp;
    if(3 != sscanf(s, "%d %c%d", &f, &tmp, &a) || (tmp!=',' && tmp!=';' && tmp!= '.'))
        goto error;
    
          

  • Suppress the warning directly (inline suppressions are preferred):

    //cppcheck-suppress invalidscanf_libc
    if(2 != sscanf(s, "%d%*1[,;.]%d", &f, &a))
        goto error;
    
          

Remember to report the bug as "defect / false positive" so you can retire and forget about this workaround as quickly as possible.
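The second workaround can also be wrapped in a small function so that the separator check lives in one place. This is only a minimal sketch: the name parse_pair is hypothetical, and unlike the snippet above it uses "%d%c%d" without the space, so whitespace before the separator is rejected, as it is with the original format.

```c
#include <stdio.h>

/* Hypothetical helper: parses "<int><sep><int>", where <sep> is exactly one
   of ',', ';' or '.'. Returns 1 on success and 0 on failure. */
static int parse_pair(const char *s, int *f, int *a)
{
    char sep;

    /* %c reads exactly one character, so no unbounded scanset is involved. */
    if (sscanf(s, "%d%c%d", f, &sep, a) != 3)
        return 0;

    /* Accept only the three separators the original format allowed. */
    return sep == ',' || sep == ';' || sep == '.';
}
```

With this sketch, "12;34" is accepted, while "12:34" and "12,,34" are rejected.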

+2




When to quantify ignored pattern matching in the C sscanf function?

It's probably a good idea to always quantify (see below), but over-quantification can also distract from your intentions. In the case above, where a single char separator has to be skipped, quantification will definitely be helpful.
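To make the effect of quantification concrete, here is a small sketch that compares the unquantified and the quantified skip on the same input. The helper name scan_pair is hypothetical. The quantified form is written in the order the C standard specifies, with the assignment-suppressing * before the field width.

```c
#include <stdio.h>

/* Hypothetical helper: applies fmt to s and returns the number of fields
   that sscanf assigned (suppressed fields are not counted). */
static int scan_pair(const char *s, const char *fmt, int *f, int *a)
{
    *f = 0;
    *a = 0;
    return sscanf(s, fmt, f, a);
}
```

For the input "1,;.2", the unquantified format "%d%*[,;.]%d" skips the whole run of separators and returns 2, while the quantified format "%d%*1[,;.]%d" skips only the comma, fails to read the second integer at the semicolon, and returns 1. For "1;2", both formats return 2.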

Maybe a buffer overflow problem? ... maybe behind the scenes?

There won't be any crashes caused by your code. As for the "behind the scenes" question, I experimented with large input strings using the C library supplied with Borland C++ 5.6.4 and could not provoke an internal buffer overflow, even with inputs of more than 400 million characters.

Surprisingly, Cppcheck was not completely wrong: there is a portability issue, but a different one:



#include <stdio.h>
#include <assert.h>
#include <sstream>
#include <string>

// Scans "123" + count separators + "456", with or without an explicit width
// on the skipped field, prints a trace, and returns the sscanf result.
int traced_sscanf_set(const int count, const bool limited)
{
    const char sep = '.';
    printf("\n");
    std::stringstream ss;
    ss << "123" << std::string(count, sep) << "456";
    std::string s = ss.str();
    printf("string of size %d with %d '%c in it\n", (int)s.size(), count, sep);
    // Build the format string. The width is written before '*', as in the
    // traced run; standard C puts '*' first, e.g. "%*32767[.]".
    std::stringstream fs;
    fs << "%d%";
    if (limited) {
        fs << count;
    }
    fs << "*["<< sep << "]%d";
    std::string fmt = fs.str();
    printf("fmt: \"%s\"\n", fmt.c_str());
    int a = 0;
    int b = 0;
    const int sscanfResult = sscanf(s.c_str(), fmt.c_str(), &a, &b);
    printf("sscanfResult=%d, a=%d, b=%d\n", sscanfResult, a, b);
    return sscanfResult;
}

// The last case fails without an explicit width once the run exceeds 32767.
void test_sscanf()
{
    assert(traced_sscanf_set(0x7fff, true)==2);
    assert(traced_sscanf_set(0x7fff, false)==2);
    assert(traced_sscanf_set(0x8000, true)==2);
    assert(traced_sscanf_set(0x8000, false)==1);
}

      

The library I tested internally limits the consumed (and skipped) input to 32767 (2^15 - 1) characters unless a limit is explicitly specified in the format parameter.

For those interested, here is the trace output:

string of size 32773 with 32767 '. in it
fmt: "%d%32767*[.]%d"
sscanfResult=2, a=123, b=456

string of size 32773 with 32767 '. in it
fmt: "%d%*[.]%d"
sscanfResult=2, a=123, b=456

string of size 32774 with 32768 '. in it
fmt: "%d%32768*[.]%d"
sscanfResult=2, a=123, b=456

string of size 32774 with 32768 '. in it
fmt: "%d%*[.]%d"
sscanfResult=1, a=123, b=0
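To repeat the experiment with another C library, the test can be sketched in plain C. This is an assumption-laden sketch, not the code that produced the trace above: the name scan_separator_run is hypothetical, the width is written in the standard order (* before the width), and count must be greater than zero when limited is set. The result of the unlimited case depends on the library, so nothing is claimed about it here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: scans "123" + count separators + "456".
   If limited is nonzero, the skipped field gets an explicit width of count.
   Returns the sscanf result, or -1 if the allocation fails. */
static int scan_separator_run(int count, int limited)
{
    char fmt[64];
    int a = 0, b = 0, result;
    size_t n = (size_t)count;
    char *s = malloc(n + 7);          /* "123" + run + "456" + NUL */

    if (s == NULL)
        return -1;
    memcpy(s, "123", 3);
    memset(s + 3, '.', n);
    memcpy(s + 3 + n, "456", 4);      /* also copies the terminating NUL */

    /* Build "%d%*<count>[.]%d" or "%d%*[.]%d". */
    if (limited)
        snprintf(fmt, sizeof fmt, "%%d%%*%d[.]%%d", count);
    else
        snprintf(fmt, sizeof fmt, "%%d%%*[.]%%d");

    result = sscanf(s, fmt, &a, &b);
    printf("fmt: \"%s\" -> result=%d, a=%d, b=%d\n", fmt, result, a, b);
    free(s);
    return result;
}
```

Calling scan_separator_run(0x8000, 1) and scan_separator_run(0x8000, 0) and comparing the two return values shows whether a given library has a similar internal limit.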

      

+1

