Cleaner pandas / numpy code to build an equivalence matrix?

I have a pandas DataFrame and would like to create an equivalence matrix (or whatever it is called) where each cell has one value if df.Col[i] == df.Col[j] and another value when they are not equal.

The following code works:

import pandas as pd

df = pd.DataFrame({"Col":[1, 2, 3, 1, 2]}, index=["A","B","C","D","E"])
df

    Col
A   1
B   2
C   3
D   1
E   2

sm = pd.DataFrame(columns=df.index, index=df.index)
for i in df.index:
    for j in df.index:
        if df.Col[i] == df.Col[j]:
            sm.loc[i, j] = 3
        else:
            sm.loc[i, j] = -1
sm

     A   B   C   D   E
A    3  -1  -1   3  -1
B   -1   3  -1  -1   3
C   -1  -1   3  -1  -1
D    3  -1  -1   3  -1
E   -1   3  -1  -1   3

      

But there must be a better way. Perhaps using numpy? Any thoughts?

[Edit]

Building on what piRsquared wrote, maybe something like this?

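m = df.values == df.values[:, 0]
# where(cond, other) keeps cells where cond is True and replaces the rest,
# so equal pairs (m True) must be the ones replaced by 3
sm = pd.DataFrame(None, df.index, df.index).where(~m, 3).where(m, -1)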

      

Can this be improved?

+3




3 answers


import numpy as np

v = df.values        # shape (n, 1)
m = v == v[:, 0]     # broadcasts to (n, n): m[i, j] is True when Col[i] == Col[j]
pd.DataFrame(np.where(m, 1, -1), df.index, df.index)

   A  B  C  D  E
A  1 -1 -1  1 -1
B -1  1 -1 -1  1
C -1 -1  1 -1 -1
D  1 -1 -1  1 -1
E -1  1 -1 -1  1
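
The broadcasting comparison above can be wrapped in a small reusable function. The following is only a sketch: the name equivalence_matrix and the parameters same and diff are hypothetical, chosen here so that the 3 / -1 values from the question can be passed in.

```python
import numpy as np
import pandas as pd


def equivalence_matrix(series, same=1, diff=-1):
    """Hypothetical helper: pairwise equality matrix labelled by the index."""
    v = series.to_numpy()
    # v[:, None] has shape (n, 1); comparing it with v (shape (n,))
    # broadcasts to (n, n), where mask[i, j] is True when v[i] == v[j].
    mask = v[:, None] == v
    return pd.DataFrame(np.where(mask, same, diff),
                        index=series.index, columns=series.index)


df = pd.DataFrame({"Col": [1, 2, 3, 1, 2]}, index=["A", "B", "C", "D", "E"])
sm = equivalence_matrix(df["Col"], same=3, diff=-1)
print(sm)
```

Passing the column as a Series keeps the array one-dimensional, so no [:, 0] slice is needed.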

      



+3




# initialize sm to all 1s
sm = pd.DataFrame(columns=df.index, index=df.index, data=1)
# build a boolean mask that is True where the two values are equal
n = len(df)
mask = (np.asarray(df)[:, None] == np.asarray(df)).reshape(n, n)
# set non-equivalent elements to -1
sm = sm.where(mask, -1)
sm
Out[129]: 
   A  B  C  D  E
A  1 -1 -1  1 -1
B -1  1 -1 -1  1
C -1 -1  1 -1 -1
D  1 -1 -1  1 -1
E -1  1 -1 -1  1
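
The reshape above is needed because comparing the 2-D array against itself with an added axis produces a 3-D result. One possible way to avoid it, sketched here as a swapped-in technique rather than this answer's own method, is to take the column as a 1-D array and use np.equal.outer, which compares every pair of elements. The same df as in the question is assumed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Col": [1, 2, 3, 1, 2]}, index=["A", "B", "C", "D", "E"])

col = df["Col"].to_numpy()          # 1-D array of length n
# ufunc.outer applies the comparison to all pairs: mask[i, j] = (col[i] == col[j])
mask = np.equal.outer(col, col)     # boolean array of shape (n, n)

# start from all 1s and overwrite the non-equivalent cells with -1
sm = pd.DataFrame(1, index=df.index, columns=df.index).where(mask, -1)
print(sm)
```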

      



+1




Here's a compact solution using multiplication:

a = df.values
sm = pd.DataFrame(4*(a[:,0]==a)-1, df.index, df.index)

      

To get -1 and 1 instead, replace 4 with 2.

Example run -

In [41]: df
Out[41]: 
   Col
A    1
B    2
C    3
D    1
E    2

In [42]: a = df.values

In [43]: pd.DataFrame(4*(a[:,0] == a)-1, df.index, df.index)
Out[43]: 
   A  B  C  D  E
A  3 -1 -1  3 -1
B -1  3 -1 -1  3
C -1 -1  3 -1 -1
D  3 -1 -1  3 -1
E -1  3 -1 -1  3
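
The constants 4 and -1 are one case of a general rescaling of the boolean mask: since True counts as 1 and False as 0, (same - diff) * mask + diff gives same where the values match and diff elsewhere. A short sketch of that arithmetic follows; the names same and diff are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Col": [1, 2, 3, 1, 2]}, index=["A", "B", "C", "D", "E"])
a = df.values                       # shape (n, 1)
mask = a[:, 0] == a                 # shape (n, n) boolean mask

same, diff = 3, -1                  # illustrative target values
# True -> (same - diff) * 1 + diff = same; False -> 0 + diff = diff
scaled = (same - diff) * mask + diff
result = pd.DataFrame(scaled, df.index, df.index)

# with same=3, diff=-1 this is exactly 4 * mask - 1
assert np.array_equal(scaled, 4 * mask - 1)
print(result)
```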

      

+1

