Find minimum values across multiple worksheets with pandas

Question

How do I find the minimum value at each index across multiple worksheets that share the same index?

Suppose:

    worksheet 1

    index    A   B   C
        0    2   3   4.28
        1    3   4   5.23

    worksheet 2

    index    A   B   C
        0    9   6   5.9
        1    1   3   4.1

    worksheet 3

    index    A   B   C
        0    9   6   6.0
        1    1   3   4.3

    ... (Worksheet 4, Worksheet 5) ...

By comparing the C column across the worksheets, I want a DataFrame that looks like:

index      min(c)
    0       4.28
    1       4.1

+3

python pandas excel min worksheet

rick sarkar June 27. 17 at 5:23 am

2 answers

You need read_excel with the parameter sheetname=None (spelled sheet_name in pandas 0.21 and later) to get an OrderedDict of all the sheets, and then a list comprehension with reduce using numpy.fmin:

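import pandas as pd

# sheet_name=None loads every sheet (the keyword was spelled sheetname before pandas 0.21)
dfs = pd.read_excel('file.xlsx', sheet_name=None)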
print (dfs)
OrderedDict([('Sheet1',    A  B     C
0  2  3  4.28
1  3  4  5.23), ('Sheet2',    A  B    C
0  9  6  5.9
1  1  3  4.1), ('Sheet3',    A  B    C
0  9  6  6.0
1  1  3  4.3)])

from functools import reduce
import numpy as np

# element-wise minimum of the C column across all sheets
df = reduce(np.fmin, [v['C'] for v in dfs.values()])
print (df)
0    4.28
1    4.10
Name: C, dtype: float64
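
For readers without the workbook at hand, the same fold can be tried on in-memory data. This is only a sketch: the dictionary below is a hand-built stand-in for what read_excel returns, filled with the three sheets from the question, and the name min_c is chosen here for illustration.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Stand-in for the result of pd.read_excel(..., sheet_name=None):
# a mapping from sheet name to DataFrame (values taken from the question).
dfs = {
    "Sheet1": pd.DataFrame({"A": [2, 3], "B": [3, 4], "C": [4.28, 5.23]}),
    "Sheet2": pd.DataFrame({"A": [9, 1], "B": [6, 3], "C": [5.9, 4.1]}),
    "Sheet3": pd.DataFrame({"A": [9, 1], "B": [6, 3], "C": [6.0, 4.3]}),
}

# reduce applies np.fmin pairwise: fmin(fmin(Sheet1.C, Sheet2.C), Sheet3.C).
min_c = reduce(np.fmin, [sheet["C"] for sheet in dfs.values()])
print(min_c)
```

Because reduce only ever combines two Series at a time, the same line works unchanged for five sheets or fifty.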

Solution with concat:

df = pd.concat([v['C'] for k,v in dfs.items()],axis=1).min(axis=1)
print (df)
0    4.28
1    4.10
dtype: float64
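
One reason the concat route may be attractive is that it lines the sheets up on their index before taking the minimum. The sketch below assumes a case the question does not cover, namely a third sheet with an extra row, and uses hypothetical names (wide, row_min); the keys argument labels each column with its sheet name.

```python
import pandas as pd

# Hypothetical sheets; Sheet3 has an extra row (index 2) the others lack.
dfs = {
    "Sheet1": pd.DataFrame({"C": [4.28, 5.23]}, index=[0, 1]),
    "Sheet2": pd.DataFrame({"C": [5.9, 4.1]}, index=[0, 1]),
    "Sheet3": pd.DataFrame({"C": [6.0, 4.3, 7.5]}, index=[0, 1, 2]),
}

# One column per sheet, labelled with the sheet name; rows align on the index,
# and cells with no matching row become NaN.
wide = pd.concat([sheet["C"] for sheet in dfs.values()], axis=1, keys=list(dfs))

# min(axis=1) skips missing values by default, so row 2 keeps its single value.
row_min = wide.min(axis=1)
print(wide)
print(row_min)
```

Keeping the intermediate wide table around also makes it easy to inspect which sheet contributed which value.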

If you need to set the index column in read_excel:

# sheet_name was spelled sheetname before pandas 0.21
dfs = pd.read_excel('file.xlsx', sheet_name=None, index_col='index')
print (dfs)
OrderedDict([('Sheet1',        A  B     C
index            
0      2  3  4.28
1      3  4  5.23), ('Sheet2',        A  B    C
index           
0      9  6  5.9
1      1  3  4.1), ('Sheet3',        A  B    C
index           
0      9  6  6.0
1      1  3  4.3)])


df = pd.concat([v['C'] for k,v in dfs.items()], axis=1).min(axis=1)
print (df)
index
0    4.28
1    4.10
dtype: float64

+3

jezrael June 27. 17 at 5:36 am

piRSquared · Accepted Answer · 2017-06-27T05:26:10+0000

from functools import reduce
import numpy as np

# ws1, ws2, ws3 are the worksheets, each already loaded as a DataFrame
reduce(np.fmin, [ws1.C, ws2.C, ws3.C])

index
0    4.28
1    4.10
Name: C, dtype: float64
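
The answer does not say why it uses np.fmin rather than np.minimum. One plausible reason is missing data: np.fmin ignores a NaN when the other operand is a number, while np.minimum propagates it. A small sketch with made-up columns (c1 and c2 are hypothetical, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical C columns; the second sheet is missing its value for index 1.
c1 = pd.Series([4.28, 5.23], name="C")
c2 = pd.Series([5.9, np.nan], name="C")

# fmin treats NaN as "no value": the real number on the other side wins.
ignoring_nan = np.fmin(c1, c2)

# minimum propagates NaN: any NaN operand makes the result NaN.
propagating_nan = np.minimum(c1, c2)

print(ignoring_nan)
print(propagating_nan)
```

If every sheet is guaranteed to be complete, the two functions give the same result.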

This generalizes well with a comprehension

reduce(np.fmin, [w.C for w in [ws1, ws2, ws3, ws4, ws5]])

If you insist on the column name from your question

from functools import reduce

reduce(np.fmin, [ws1.C, ws2.C, ws3.C]).to_frame('min(C)')

       min(C)
index        
0        4.28
1        4.10

You can also use pd.concat on a dictionary and use pd.Series.min with the parameter level=1

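pd.concat(dict(enumerate([w.C for w in [ws1, ws2, ws3]]))).min(level=1)
# equivalently
# pd.concat(dict(enumerate([w.C for w in [ws1, ws2, ws3]])), axis=1).min(axis=1)
# on pandas 2.0+, where min(level=...) is no longer available:
# pd.concat(dict(enumerate([w.C for w in [ws1, ws2, ws3]]))).groupby(level=1).min()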

index
0    4.28
1    4.10
Name: C, dtype: float64
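
As far as I know, the level argument of Series.min was deprecated in pandas 1.3 and removed in 2.0, so on recent versions the stacked form needs a groupby instead. The following sketch rebuilds ws1 to ws3 from the question's C columns (the A and B columns are left out for brevity) and shows the intermediate MultiIndex Series that the dictionary concat produces.

```python
import pandas as pd

# C columns from the question's three worksheets.
ws1 = pd.DataFrame({"C": [4.28, 5.23]})
ws2 = pd.DataFrame({"C": [5.9, 4.1]})
ws3 = pd.DataFrame({"C": [6.0, 4.3]})

# Concatenating a dict stacks the Series end to end and adds the dict keys
# as an outer index level: level 0 = sheet position, level 1 = row index.
stacked = pd.concat(dict(enumerate([w.C for w in [ws1, ws2, ws3]])))

# Group on the inner level (the original row index) and take the minimum.
result = stacked.groupby(level=1).min()
print(stacked)
print(result)
```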

Note:

dict(enumerate([w.C for w in [ws1, ws2, ws3]]))

is another way to say

{0: ws1.C, 1: ws2.C, 2: ws3.C}
