Python str.split() inconsistent?
>>> ".a string".split('.')
['', 'a string']
>>> "a .string".split('.')
['a ', 'string']
>>> "a string.".split('.')
['a string', '']
>>> "a ... string".split('.')
['a ', '', '', ' string']
>>> "a ..string".split('.')
['a ', '', 'string']
>>> 'this  is a test'.split(' ')
['this', '', 'is', 'a', 'test']
>>> 'this  is a test'.split()
['this', 'is', 'a', 'test']
Why is split(' ') different from split() when the string being split uses spaces as its whitespace?
Why does split() not count an empty word between two delimiters, while split(' ') does?
The docs are clear about this (see @agf below), but I would like to know why this behavior was chosen.
I looked at the source code (here) and thought line 136 should just be less-than: ...
i < str_len
See the docs, where this is specifically mentioned:
If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']). The sep argument may consist of multiple characters (for example, '1<>2<>3'.split('<>') returns ['1', '2', '3']). Splitting an empty string with a specified separator returns [''].
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].
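The two algorithms described in the docs can be compared side by side. This is only a small sketch; the variable names are made up for illustration, and the sample strings are the ones from the docs plus a short space-padded string.

```python
# Explicit separator: every occurrence delimits a field, so empty strings appear.
with_sep = "1,,2".split(",")
multi_char = "1<>2<>3".split("<>")
empty_with_sep = "".split(",")

# The same space-padded string, split both ways.
padded = " 1  2 "
explicit_space = padded.split(" ")  # a single space is an ordinary separator
no_sep = padded.split()             # whitespace runs collapse, edges are trimmed

# With no separator, empty or whitespace-only input yields no fields at all.
only_whitespace = "   ".split()

print(with_sep)         # ['1', '', '2']
print(multi_char)       # ['1', '2', '3']
print(empty_with_sep)   # ['']
print(explicit_space)   # ['', '1', '', '2', '']
print(no_sep)           # ['1', '2']
print(only_whitespace)  # []
```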
Python tries to do what you would expect. Most people, without thinking about it too hard, would probably expect
'1 2 3 4 '.split()
to return
['1', '2', '3', '4']
Think about splitting data where spaces were used instead of tabs to create fixed-width columns: if the values have different widths, each row will contain a different number of spaces.
There is often a trailing space at the end of a line that you cannot see, and the default ignores it, so it gives you the answer you would expect.
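A minimal sketch of the fixed-width case follows. The rows are invented sample data (not from the question), padded with differing numbers of spaces, and one of them carries an invisible trailing space.

```python
# Invented sample rows: columns are aligned with spaces, not tabs.
rows = [
    "alice    30  london ",  # note the trailing space
    "bob      7   rome",
]

# The default split() collapses each run of spaces and drops the trailing one.
parsed = [row.split() for row in rows]

# Splitting on a single space instead produces a different field count per row.
field_counts = [len(row.split(" ")) for row in rows]

print(parsed)  # [['alice', '30', 'london'], ['bob', '7', 'rome']]
print(field_counts[0] != field_counts[1])  # True
```

With the default, every row comes back with the same three fields regardless of how much padding it had.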
As for the algorithm used when you do specify a delimiter, think about a line in a CSV file:
1,,3
This means there is data in the 1st and 3rd columns and no data in the 2nd, so you want
['1', '', '3']
otherwise you would not be able to tell which column each string came from.
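To make the column argument concrete, here is a small sketch that pairs the fields of such a line with column names. The header names are hypothetical, and the "collapsed" list only simulates what grouping consecutive delimiters would produce; it is not something str.split does.

```python
# Hypothetical column names for a three-column file.
header = ["first", "second", "third"]
line = "1,,3"

# Keeping the empty string preserves each value's position.
fields = line.split(",")
record = dict(zip(header, fields))

# Simulating grouped delimiters: the empty field vanishes and '3' shifts left.
collapsed = [field for field in fields if field]
misaligned = dict(zip(header, collapsed))

print(fields)      # ['1', '', '3']
print(record)      # {'first': '1', 'second': '', 'third': '3'}
print(misaligned)  # {'first': '1', 'second': '3'}
```

For real CSV data the standard library's csv module is usually the safer choice, since it also handles quoting; plain split is used here only to mirror the discussion.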