How to add rows for all missing sibling values with multiple indices?
Suppose I have the following dataframe df, indexed with a 3-level multi-index:
In [52]: df
Out[52]:
          C
L0 L1 L2
0  w  P   1
   y  P   2
      R   3
1  x  Q   4
      R   5
   z  S   6
Code for creating DataFrame:
import pandas as pd

# Note: the `labels` argument was renamed to `codes` in pandas 0.24.
idx = pd.MultiIndex(levels=[[0, 1], ['w', 'x', 'y', 'z'], ['P', 'Q', 'R', 'S']],
                    codes=[[0, 0, 0, 1, 1, 1], [0, 2, 2, 1, 1, 3], [0, 0, 2, 1, 2, 3]],
                    names=['L0', 'L1', 'L2'])
df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)
The possible values for level L2 are 'P', 'Q', 'R' and 'S', but some of these values are missing for specific combinations of values of the other levels. For example, the combination (L0=0, L1='w', L2='Q') is missing in df.
I would like to add enough rows to df so that, for each combination of values of the levels other than L2, there is exactly one row for each possible value of level L2. For added rows, the value of column C must be 0.
IOW, I want to expand df so that it looks like this:
          C
L0 L1 L2
0  w  P   1
      Q   0
      R   0
      S   0
   y  P   2
      Q   0
      R   3
      S   0
1  x  P   0
      Q   4
      R   5
      S   0
   z  P   0
      Q   0
      R   0
      S   6
Requirements:
- the operation should leave the column types unchanged;
- the operation should add the smallest number of rows needed to complete only the specified level (L2).
Is there an easy way to accomplish this extension?
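To make the target concrete, here is a minimal sketch of one possible approach, not a definitive answer. It assumes a pandas version in which pd.MultiIndex accepts codes= (0.24 or later) and that df.index.levels[2] lists exactly the possible L2 values. The names prefixes, l2_values, full_index and expanded are illustrative only. The idea is to cross the (L0, L1) pairs that already occur in df with every L2 value, then reindex with fill_value=0 so that no NaN is introduced and the integer dtype of C is kept.

```python
import pandas as pd

# Rebuild the example frame from the question.
idx = pd.MultiIndex(
    levels=[[0, 1], ['w', 'x', 'y', 'z'], ['P', 'Q', 'R', 'S']],
    codes=[[0, 0, 0, 1, 1, 1], [0, 2, 2, 1, 1, 3], [0, 0, 2, 1, 2, 3]],
    names=['L0', 'L1', 'L2'],
)
df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

# Distinct (L0, L1) pairs that actually occur in df, in order of appearance.
prefixes = df.index.droplevel('L2').unique()

# Assumption: the stored level values are exactly the possible L2 values.
l2_values = df.index.levels[2]

# Cross only the existing prefixes with every L2 value, so no new
# (L0, L1) combinations are created.
full_index = pd.MultiIndex.from_tuples(
    [(l0, l1, l2) for (l0, l1) in prefixes for l2 in l2_values],
    names=df.index.names,
)

# fill_value=0 fills the added rows directly, avoiding a NaN round trip
# that would upcast the integer column to float.
expanded = df.reindex(full_index, fill_value=0)
print(expanded)
```

Building the full Cartesian product of all three levels (for example with pd.MultiIndex.from_product) would also add (L0, L1) pairs that do not exist in df, which is why the sketch crosses only the existing prefixes with the L2 values.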