Handling duplicate columns in Pandas DataFrame constructor from SQLAlchemy Join

Question


I know read_csv has mangle_dupe_cols, but how can I do the same for a SQL connection in SQLAlchemy after running:

pd.DataFrame(result.fetchall(), columns=result.keys())

which gives me an error when I call df.info() because of the duplicate column names.


python pandas sqlalchemy

horatio1701d 01 Mar 14 at 14:49


1 answer

van · Accepted Answer · 2014-03-01T15:37:37+0000

You can write your own helper function that handles the duplicate column names. The code below is adapted from io.parsers._infer_columns:

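def mangle_dupe_cols(columns):
    # Work on a list copy: result.keys() may not support item assignment,
    # and the caller's sequence should not be modified.
    columns = list(columns)
    counts = {}  # column name -> number of times seen so far
    for i, col in enumerate(columns):
        cur_count = counts.get(col, 0)
        if cur_count > 0:
            # Repeats become 'name.1', 'name.2', ...
            columns[i] = '%s.%d' % (col, cur_count)
        counts[col] = cur_count + 1
    return columns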

pd.DataFrame(result.fetchall(), columns=mangle_dupe_cols(result.keys()))
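
To see the renaming on an actual join without setting up SQLAlchemy, here is a minimal sketch that uses the standard-library sqlite3 module instead. The tables (users, pets), their contents, and the helper name dedupe_names are made up for illustration; the helper follows the same counting idea as the function above but returns a new list.

```python
import sqlite3


def dedupe_names(names):
    """Return a new list in which repeated names get a '.N' suffix.

    Hypothetical helper for this sketch; same counting scheme as above.
    """
    counts = {}
    out = []
    for name in names:
        seen = counts.get(name, 0)
        out.append("%s.%d" % (name, seen) if seen else name)
        counts[name] = seen + 1
    return out


# In-memory database with two invented tables that share column names.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER, name TEXT);
    CREATE TABLE pets (id INTEGER, name TEXT, owner_id INTEGER);
    INSERT INTO users VALUES (1, 'ann');
    INSERT INTO pets VALUES (10, 'rex', 1);
    """
)
cur = conn.execute("SELECT * FROM users JOIN pets ON pets.owner_id = users.id")
raw_names = [d[0] for d in cur.description]  # column names as the driver reports them
rows = cur.fetchall()
conn.close()

unique_names = dedupe_names(raw_names)
print(raw_names)
print(unique_names)
```

The resulting unique_names list can be passed as columns= together with rows to pd.DataFrame, just like the SQLAlchemy call above. Note that this scheme does not guard against a mangled name colliding with a column that is already called, say, id.1.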
