Consistent column creation and ordering in a Pandas DataFrame

I'm looking for an elegant, Pythonic way to make Pandas DataFrame columns consistent. Namely:

  • Make sure all columns in the master list are present, and if one is missing, add it as an empty placeholder column.
  • Make sure the columns are in the same order as the master list.

I have the following example that works, but is there a built-in Pandas method to achieve the same goal?

import pandas as pd
df1 = pd.DataFrame(data=[{'a': 1, 'b': 32, 'c': 32}])
print(df1)

      

   a   b   c
0  1  32  32

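column_master_list = ['b', 'c', 'e', 'd', 'a']

def get_dataframe_with_consistent_header(df, headers):
    # Add a NaN placeholder column for every header missing from df
    # (note: this modifies df in place).
    for col in headers:
        if col not in df.columns:
            df[col] = float('nan')
    # Select the columns in the order given by headers.
    return df[headers]

print(get_dataframe_with_consistent_header(df1, column_master_list))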

      

    b   c   e   d  a
0  32  32 NaN NaN  1

1 answer


You can use the reindex_axis method. Pass it the list of column names and specify 'columns' as the axis. NaN is the default fill value for missing entries:



>>> df1.reindex_axis(column_master_list, 'columns')
    b   c   e   d  a
0  32  32 NaN NaN  1
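
Note that reindex_axis was deprecated in pandas 0.21 and removed in pandas 1.0, so the call above fails on current versions. A minimal sketch of the same idea for newer pandas, assuming the df1 and column_master_list from the question, is to call reindex with the columns keyword. The fill_value line is an optional extra and not part of the original answer:

```python
import pandas as pd

df1 = pd.DataFrame(data=[{'a': 1, 'b': 32, 'c': 32}])
column_master_list = ['b', 'c', 'e', 'd', 'a']

# reindex returns a new DataFrame with the columns in the requested order;
# columns missing from df1 are created and filled with NaN by default.
result = df1.reindex(columns=column_master_list)
print(result)

# Optional: use a placeholder other than NaN for the newly created columns.
filled = df1.reindex(columns=column_master_list, fill_value=0)
print(filled)
```

Unlike the helper function in the question, this leaves df1 itself untouched, since reindex builds a new object rather than adding columns in place.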

      
