Function in K & R example returns wrong string length?
This is from page 65 in K&R. The description says this function returns the length of a string. Here is the code:
int trim(char s[])
{
    int n;

    for (n = strlen(s)-1; n >= 0; n--)
        if (s[n] != ' ' && s[n] != '\t' && s[n] != '\n')
            break;
    s[n+1] = '\0';
    return n;
}
It looks like it should return n + 1. Is the null character not included in the string?
Example:
char s[4];
s[0] = 'c', s[1] = 'a', s[2] = 't', s[3] = '\0';
Doesn't that mean that the size of the string is 3 and that there are 3 elements in use? This function will return 2, which is not correct.
Also, how is the length of a string defined?
You are absolutely correct: the n returned in your example equals strlen(s) - 1, not strlen(s). Returning n + 1 gives the length:
#include <stdio.h>
#include <string.h>

int
trim (char s[])
{
    int initial_strlen = (int)strlen(s);
    int n;

    for (n = initial_strlen-1; n >= 0; n--) {
        if (s[n] != ' ' && s[n] != '\t' && s[n] != '\n')
            break;
    }
    s[n+1] = '\0';
    /* strlen returns size_t, so cast before printing with %d */
    printf ("s=%s, initial strlen=%d, current strlen=%d, n=%d\n",
            s, initial_strlen, (int)strlen(s), n);
    return n+1;
}

int
main (void)
{
    char buf[80];

    strcpy(buf, "cat   ");          /* three trailing spaces: length 6 */
    printf ("trim #1= %d\n", trim (buf));
    strcpy(buf, "cat\t\t\t   ");    /* three tabs, three spaces: length 9 */
    printf ("trim #2= %d\n", trim (buf));
    return 0;
}
Output example:
s=cat, initial strlen=6, current strlen=3, n=2
trim #1= 3
s=cat, initial strlen=9, current strlen=3, n=2
trim #2= 3
The problem is that this is simply bad code; it is an example of how code should not be written. As a rule, bad code contains a bug, and this example demonstrates the rule. :) It is bad because you cannot even say with certainty what the function is supposed to return. And if it is not meant to return strlen(s), why should it return strlen(s) - 1, especially for an empty string, where that value is -1?
I would write the function like this
size_t trim( char s[] )
{
    size_t n = strlen( s );

    while ( n != 0 && ( s[n-1] == ' ' || s[n-1] == '\t' || s[n-1] == '\n' ) ) --n;
    s[n] = '\0';
    return n;
}
Compare this with the code you provided. In my version it is clear that if the loop body never executes, the function returns strlen( s ). You do not even need to study what the loop does. If you remove the loop, for example
size_t trim( char s[] )
{
    size_t n = strlen( s );
    s[n] = '\0';
    return n;
}
the code remains clear and easy to read. That is the invariant: n is always the length of the resulting string.
As for the loop, it uses the idiom familiar from bidirectional iterators in C++, so again the code is easy to read. And there is no break statement. :)
Note that it is important that the function returns strlen( s ) as it is after the trailing whitespace has been removed. For example, the value can be used when you want to concatenate two strings.
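As a minimal sketch of the concatenation use case, assuming a trim that returns the new length as above: the returned value is the index of the terminating null, so a second string can be copied there directly without rescanning. The helper name trim_and_append and its cap parameter are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* trim: remove trailing blanks, tabs, newlines; return the new length */
size_t trim(char s[])
{
    size_t n = strlen(s);

    while (n != 0 && (s[n-1] == ' ' || s[n-1] == '\t' || s[n-1] == '\n'))
        --n;
    s[n] = '\0';
    return n;
}

/* trim_and_append (hypothetical helper): trim dst, then copy src right
   after the trimmed text. cap is the total size of dst in bytes.
   Returns the final length, or (size_t)-1 if the result would not fit
   (dst is then left trimmed but otherwise unchanged). */
size_t trim_and_append(char dst[], size_t cap, const char src[])
{
    size_t n = trim(dst);          /* index of the terminating '\0' */
    size_t m = strlen(src);

    if (m >= cap - n)              /* need n + m + 1 <= cap */
        return (size_t)-1;
    memcpy(dst + n, src, m + 1);   /* copies the '\0' as well */
    return n + m;
}
```

With a 16-byte buffer holding "cat \t\n", calling trim_and_append(buf, sizeof buf, "fish") leaves "catfish" in buf and returns 7.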
You are correct that this implementation of trim does not return the length of the resulting string.
However, this is not necessarily wrong.
My copy of K&R (2nd edition) reads:
The following function, trim, removes trailing blanks, tabs, and newlines from the end of a string, using a break to exit from a loop when the rightmost non-blank, non-tab, non-newline is found.
/* trim: remove trailing blanks, tabs, newlines */ [... code ...]
strlen returns the length of the string ....
Nowhere does it say what the return value of trim is supposed to mean. While I agree that the actual return value is unintuitive, it is not necessarily wrong, since we are never told how it should behave.
You can also check the errata for K&R's The C Programming Language (this example is not listed).
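To make the unintuitive return value concrete, here is a small sketch that keeps the loop from the question unchanged under the hypothetical name kr_trim, so its result can be compared with strlen. For a string that is empty or all whitespace, the loop runs past the front of the array index range and the function returns -1.

```c
#include <string.h>

/* kr_trim: the loop from the question, renamed (hypothetical name) for
   comparison. Returns the index of the last remaining character,
   or -1 if nothing remains. */
int kr_trim(char s[])
{
    int n;

    for (n = (int)strlen(s) - 1; n >= 0; n--)
        if (s[n] != ' ' && s[n] != '\t' && s[n] != '\n')
            break;
    s[n+1] = '\0';
    return n;
}
```

For "cat  " this returns 2 while strlen of the result is 3; for "" and " \t\n" it returns -1 and leaves an empty string. One way to read the value is as the index of the last remaining character rather than a length, though that reading is an interpretation and not something the quoted text states.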
On the one hand, the terminating null character is not counted as part of the string length. This is how strlen works, for example: the string "ABCD" has length 4, as you would expect.
On the other hand, the trim function above does return 1 less than the actual string length. Indeed, it should return n + 1.
The string length in your example is indeed 3. This is the correct length; the terminating null character is not considered part of it. If you counted the terminating null, the length would be 4.
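A short sketch of the distinction, using the array from the question: strlen counts characters up to but not including the terminating null, while sizeof applied to the array counts every element, null included. Both function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* chars_before_nul (hypothetical): the string length, which excludes
   the terminating '\0' */
size_t chars_before_nul(void)
{
    char s[4] = { 'c', 'a', 't', '\0' };
    return strlen(s);      /* 3 */
}

/* elements_in_array (hypothetical): storage occupied by the array,
   '\0' included */
size_t elements_in_array(void)
{
    char s[4] = { 'c', 'a', 't', '\0' };
    return sizeof s;       /* 4, since sizeof(char) is 1 */
}
```

Note that sizeof only behaves this way on a real array; inside a function such as trim, the parameter s is a pointer, so sizeof s would give the pointer size rather than the array size.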