How to add rows for all missing values of one level of a multi-index?

Suppose I have the following dataframe df, indexed with a 3-level multi-index:

In [52]: df
Out[52]: 
          C
L0 L1 L2   
0  w  P   1
   y  P   2
      R   3
1  x  Q   4
      R   5
   z  S   6

      

Code for creating DataFrame:

import pandas as pd

# Note: the `labels` argument was renamed to `codes` in pandas 0.24
idx = pd.MultiIndex(levels=[[0, 1], ['w', 'x', 'y', 'z'], ['P', 'Q', 'R', 'S']],
                    codes=[[0, 0, 0, 1, 1, 1], [0, 2, 2, 1, 1, 3], [0, 0, 2, 1, 2, 3]],
                    names=['L0', 'L1', 'L2'])

df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

      

The possible values for level L2 are 'P', 'Q', 'R' and 'S', but some of these values are missing for specific combinations of values of the other levels. For example, the combination (L0=0, L1='w', L2='Q') is missing from df.

I would like to add enough rows to df so that, for each combination of values of the levels other than L2, there is exactly one row for each of the L2 values. For the added rows, the value of column C must be 0.

IOW, I want to expand df so that it looks like this:

          C
L0 L1 L2     
0  w  P   1
      Q   0
      R   0
      S   0
   y  P   2
      Q   0
      R   3
      S   0
1  x  P   0
      Q   4
      R   5
      S   0
   z  P   0
      Q   0
      R   0
      S   6

      

Requirements:

  • the operation should leave the column types unchanged;
  • the operation should add the smallest number of rows needed to complete only the specified level (L2).

Is there an easy way to accomplish this extension?



1 answer


Assuming L2 initially contains all the possible values you need, you can use the unstack/stack trick:

df.unstack('L2', fill_value=0).stack(level=1)
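As a self-contained check, the one-liner above can be run against the question's frame as sketched below. The second half is an alternative that is not part of the original answer: if L2 does not already contain every value you need, one option is to build the full index yourself and call reindex. The names all_l2, outer and full_idx are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Rebuild the question's frame from explicit tuples.
idx = pd.MultiIndex.from_tuples(
    [(0, 'w', 'P'), (0, 'y', 'P'), (0, 'y', 'R'),
     (1, 'x', 'Q'), (1, 'x', 'R'), (1, 'z', 'S')],
    names=['L0', 'L1', 'L2'])
df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

# Approach from the answer: move L2 into the columns, filling the
# holes with 0, then move it back into the index.
expanded = df.unstack('L2', fill_value=0).stack('L2')

# Alternative sketch (an assumption, not from the answer): list the
# wanted L2 values explicitly and reindex against every
# (L0, L1) pair that actually occurs in df.
all_l2 = ['P', 'Q', 'R', 'S']                 # hypothetical list of wanted values
outer = df.index.droplevel('L2').unique()     # existing (L0, L1) pairs
full_idx = pd.MultiIndex.from_tuples(
    [(l0, l1, l2) for (l0, l1) in outer for l2 in all_l2],
    names=df.index.names)
expanded_reindex = df.reindex(full_idx, fill_value=0)

print(expanded.sort_index())
print(expanded_reindex)
```

Both variants only add rows for (L0, L1) pairs that already exist, and passing fill_value=0 avoids introducing NaN, so column C should keep its integer dtype.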

      


