Reading csv with pandas with header comment

Question


I have CSV files with # in the header line:

s = '#one two three\n1 2 3'

If I use pd.read_csv, the # sign ends up in the first column name:

import pandas as pd
from io import StringIO
pd.read_csv(StringIO(s), delim_whitespace=True)
     #one  two  three
0     1    2      3

If I set the argument comment='#', then pandas completely ignores the line.

Is there an easy way to handle this case?

The second, related issue is how to handle quoting in this case. It works without #:

s = '"one one" two three\n1 2 3'
print(pd.read_csv(StringIO(s), delim_whitespace=True))
   one one  two  three
0        1    2      3

but it does not work with #:

s = '#"one one" two three\n1 2 3'
print(pd.read_csv(StringIO(s), delim_whitespace=True))
   #"one  one"  two  three
0      1     2    3    NaN

Thanks!

++++++++++ Update

Here is a test for the second example:

s = '#"one one" two three\n1 2 3'
# here I am cheating slicing the string
wanted_result = pd.read_csv(StringIO(s[1:]), delim_whitespace=True)
# is there a way to achieve the same result configuring somehow read_csv?
assert wanted_result.equals(pd.read_csv(StringIO(s), delim_whitespace=True))


python pandas csv

Andrea Zonca May 18 '15 at 19:57


2 answers

farhawa · Answer 1 · 2015-05-18T20:18:47+0000

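You can rename the first column returned by read_csv() as follows:

import pandas as pd
from io import StringIO

s = '#one two three\n1 2 3'
df = pd.read_csv(StringIO(s), delim_whitespace=True)
# '#one'.split("#") gives ['', 'one'], so index 1 is the name without "#"
new_name = df.columns[0].split("#")[1]
# rename() returns a new DataFrame, so assign the result back
df = df.rename(columns={df.columns[0]: new_name})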

manu190466 · Answer 2 · 2015-05-18T20:57:38+0000

You can delete the first # of your file like this:

s = u'#"one one" two three\n1 2 3'

import pandas as pd
from io import StringIO

# Split only on the first "#" so any later "#" in the data is left alone
wholefile = StringIO(s).read().split("#", 1)[1]

pd.read_csv(StringIO(wholefile), delim_whitespace=True)

   one one  two  three
0        1    2      3

The inconvenience is that you have to load the entire file into memory, but it works.
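
One possible way around the memory issue, offered as a sketch rather than a built-in pandas option: read only the header line yourself, strip the #, and let read_csv consume the rest of the stream with explicit column names. The helper name read_commented_header is hypothetical. The header is parsed with shlex.split, which follows shell-like quoting rules rather than CSV rules, so this assumes simple double-quoted names like the one in the question. sep=r"\s+" is used in place of delim_whitespace=True, which newer pandas versions deprecate.

```python
import shlex
from io import StringIO

import pandas as pd


def read_commented_header(buf):
    """Read whitespace-delimited data whose header line starts with '#'.

    Hypothetical helper: only the first line is read by hand; the rest of
    the stream is handed to pandas from its current position.
    """
    # Consume the header line and drop the leading comment character.
    header_line = buf.readline().lstrip("#")
    # shlex.split honours double quotes, so '"one one"' stays one name.
    names = shlex.split(header_line)
    # The stream now points at the first data row, so there is no header left.
    return pd.read_csv(buf, sep=r"\s+", header=None, names=names)


s = '#"one one" two three\n1 2 3'
df = read_commented_header(StringIO(s))
print(df)
```

With a real file, the same helper could be called inside a `with open(path) as f:` block (path being whatever file you are reading), so the data rows are never loaded into a Python string first.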
