Fix index column in pandas DataFrame

I have a pandas DataFrame, displayed in a Jupyter Notebook, whose first column appears without a name.

Can I name the first column? I cannot access it because it looks like a column of row names.

Or else, can I extract the first unnamed column and create a new DataFrame with ['accessions'] and ['symbols']?


2 answers


pd.DataFrame.rename_axis

This is the index ...
Using @JesseVogt's sample DataFrame df:

import pandas as pd

df = pd.DataFrame(
    data={
        'asc': [['XM', 'NM', 'XM'], ['NM', 'XM'], ['NM', 'NM', 'NM'], ['NM']],
        'sym': [{'CP', 'BT', 'MF'}, {'BC', 'CP'}, {'NT', 'IF', 'NT5'}, {'BA'}],
    },
    index=[('A', 'A'), ('A', 'C'), ('A', 'G'), ('A', 'U')]
)

      

You can give the index a name, which is then shown above the index values when the DataFrame is displayed:



df.rename_axis('MyName')

                 asc            sym
MyName                             
(A, A)  [XM, NM, XM]   {MF, BT, CP}
(A, C)      [NM, XM]       {BC, CP}
(A, G)  [NM, NM, NM]  {NT, IF, NT5}
(A, U)          [NM]           {BA}

      

Or you can reset the index to move this information into the DataFrame itself as a regular column:

df.rename_axis('MyName').reset_index()

   MyName           asc            sym
0  (A, A)  [XM, NM, XM]   {MF, BT, CP}
1  (A, C)      [NM, XM]       {BC, CP}
2  (A, G)  [NM, NM, NM]  {NT, IF, NT5}
3  (A, U)          [NM]           {BA}
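
As a rough sketch of how this might answer the original question: the index can be read directly through df.index, or moved into a column and the data columns renamed. The names 'MyName', 'accessions' and 'symbols' are illustrative only, and mapping asc and sym to 'accessions' and 'symbols' is an assumption about what the question intended.

```python
import pandas as pd

# Shortened version of the sample data used in this answer.
df = pd.DataFrame(
    data={
        'asc': [['XM', 'NM', 'XM'], ['NM']],
        'sym': [{'CP', 'BT', 'MF'}, {'BA'}],
    },
    index=[('A', 'A'), ('A', 'U')],
)

# The index is reachable directly, without turning it into a column.
labels = df.index.tolist()

# Name the index, move it into a column, then rename the data columns.
# 'MyName', 'accessions' and 'symbols' are placeholder names.
flat = (
    df.rename_axis('MyName')
      .reset_index()
      .rename(columns={'asc': 'accessions', 'sym': 'symbols'})
)
print(flat.columns.tolist())
```

After this, flat['MyName'] can be used like any other column.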

      



If you are only given a DataFrame and cannot change the way it is built, you can call reset_index to move the index into a column:

In [13]: df = pd.DataFrame(data={
    ...:     'asc': [['XM', 'NM', 'XM'], ['NM', 'XM'], ['NM', 'NM', 'NM'], ['NM']],
    ...:     'sym': [{'CP', 'BT', 'MF'}, {'BC', 'CP'}, {'NT', 'IF', 'NT5'}, {'BA'}],
    ...: }, index=[('A', 'A'), ('A', 'C'), ('A', 'G'), ('A', 'U')])

In [14]: df
Out[14]:
                 asc            sym
(A, A)  [XM, NM, XM]   {BT, CP, MF}
(A, C)      [NM, XM]       {CP, BC}
(A, G)  [NM, NM, NM]  {NT, NT5, IF}
(A, U)          [NM]           {BA}

In [15]: df.reset_index(drop=False)
Out[15]:
    index           asc            sym
0  (A, A)  [XM, NM, XM]   {BT, CP, MF}
1  (A, C)      [NM, XM]       {CP, BC}
2  (A, G)  [NM, NM, NM]  {NT, NT5, IF}
3  (A, U)          [NM]           {BA}

      



To discard the index completely instead of keeping it as a column, use drop=True. When the index is kept, the name of the new column can be changed by first setting the index name with df.index.name = 'some_name'.
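
A small sketch of both options, assuming a reduced version of the sample data; 'some_name' is just a placeholder:

```python
import pandas as pd

df = pd.DataFrame(
    {'asc': [['XM'], ['NM']]},
    index=[('A', 'A'), ('A', 'U')],
)

# drop=True throws the old index away and leaves a default 0..n-1 index.
dropped = df.reset_index(drop=True)

# Naming the index first makes reset_index use that name for the new column
# instead of the default 'index'.
named = df.copy()
named.index.name = 'some_name'
kept = named.reset_index()

print(dropped.columns.tolist())
print(kept.columns.tolist())
```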
