Filtering multiple Pandas columns

I have a function that takes a pandas DataFrame as input:

def dfColumnFilter(df, columnFilter, columnName):
    ''' Returns a filtered DataFrame

    Keyword arguments:
    df           :  DataFrame in which to apply the filter
    columnFilter :  The list of values to filter by
    columnName   :  The DataFrame column to apply the columnFilter to '''

    # Keep the rows whose value in columnName appears in columnFilter.
    # (A return inside the loop would stop after the first value, and
    # chaining == over several different values would leave no rows.)
    return df[df[columnName].isin(columnFilter)]

      

How can I make this work for n columns?



1 answer


You can use variadic positional arguments (*args) to pass any number of (column, value) pairs:

def filter_df(df, *args):
    for k, v in args:
        df = df[df[k] == v]
    return df

      

It can be used like this:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 1, 1], 'b': [1, 3, 3, 3]})
>>> filter_df(df, ('a', 1), ('b', 3))
   a  b
2  1  3
3  1  3
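
The loop in filter_df slices the frame once per pair. As a rough sketch of an alternative (not necessarily what anyone in this thread had in mind), the conditions can be AND-ed into one boolean mask and applied once. The name filter_df_mask and the list handling via isin are assumptions added here for illustration; the list case mirrors the "list of values" filter from the question.

```python
import pandas as pd


def filter_df_mask(df, *pairs):
    """Hypothetical variant of filter_df that builds one combined mask.

    Each pair is (column, value). If value is a list, tuple, or set,
    a row matches when its entry is any one of those values.
    """
    # Start with every row selected, then narrow the mask pair by pair.
    mask = pd.Series(True, index=df.index)
    for column, value in pairs:
        if isinstance(value, (list, tuple, set)):
            mask &= df[column].isin(list(value))
        else:
            mask &= df[column] == value
    # Index the frame a single time with the combined condition.
    return df[mask]


df = pd.DataFrame({'a': [1, 2, 1, 1], 'b': [1, 3, 3, 3]})
result = filter_df_mask(df, ('a', 1), ('b', [2, 3]))
print(result)
```

With the sample frame, this keeps the rows where a is 1 and b is either 2 or 3. Calling it with no pairs returns all rows unchanged.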

      

Note

In theory, you could use **kwargs, which would give a nicer call syntax:

filter_df(df, a=1, b=2)

      

but then you can only use it for columns whose names are valid Python identifiers.
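
A minimal sketch of that keyword-based variant, assuming a hypothetical name filter_df_kwargs and an invented 'unit price' column to show the limitation. In CPython, a name that is not a valid identifier can still be passed by unpacking a dict, though that gives up most of the readability gain.

```python
import pandas as pd


def filter_df_kwargs(df, **conditions):
    """Hypothetical keyword version: each keyword is column=value."""
    for column, value in conditions.items():
        df = df[df[column] == value]
    return df


df = pd.DataFrame({
    'a': [1, 2, 1, 1],
    'b': [1, 3, 3, 3],
    'unit price': [5, 5, 6, 5],  # illustrative column, not a valid identifier
})

# Works directly for identifier-like column names.
by_name = filter_df_kwargs(df, a=1, b=3)

# 'unit price' cannot be written as a keyword; dict unpacking is a workaround.
by_dict = filter_df_kwargs(df, **{'a': 1, 'unit price': 5})

print(by_name)
print(by_dict)
```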

Edit

See the comment by @Goyo for a better implementation.
