Multiply every element in a column by every element in another column in the same dataframe
I need to multiply every element of one column by every element of another column in the same DataFrame. My original dataset looks something like this:
origin sum sum2
a. 2 1
b. 4 2
c. 6 3
The result I am expecting is similar to:
origin dest result (sum * sum2)
a. a. 2
a. b. 4
a. c. 6
b. a. 4
b. b. 8
b. c. 12
c. a. 6
c. b. 12
c. c. 18
The script I am writing is the following, but I cannot get the results I want:
x = 0
numerator = []
for index1, row1 in df.iterrows():
    constant = row1
    numerator.append([])
    for index2, row2 in df.iterrows():
        result = row2*constant
        numerator[x].append(result)
    x = x + 1
You can use the following:
- numpy.outer for the multiplication
- numpy.ravel to flatten the result
- MultiIndex.from_product for the new index built from the column origin
- the DataFrame constructor
- reset_index to turn the MultiIndex levels into columns
import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_product([df.origin, df.origin], names=['origin','dest'])
data = np.outer(df['sum'], df['sum2']).ravel()
df = pd.DataFrame(data, index=mux, columns=['result']).reset_index()
print(df)
origin dest result
0 a. a. 2
1 a. b. 4
2 a. c. 6
3 b. a. 4
4 b. b. 8
5 b. c. 12
6 c. a. 6
7 c. b. 12
8 c. c. 18
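To see why the flattened products line up with the new index, here is a minimal, self-contained sketch on the sample data from the question. It assumes only that numpy and pandas are installed; the names `products`, `flat`, and `out` are illustrative and not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'origin': ['a.', 'b.', 'c.'],
                   'sum': [2, 4, 6],
                   'sum2': [1, 2, 3]})

# products[i, j] holds df['sum'][i] * df['sum2'][j]
products = np.outer(df['sum'], df['sum2'])

# ravel() flattens row by row: all of row 0, then row 1, and so on
flat = products.ravel()

# from_product enumerates the pairs in the same order: the first level
# changes slowest, so pair number k belongs to flat[k]
mux = pd.MultiIndex.from_product([df['origin'], df['origin']],
                                 names=['origin', 'dest'])

out = pd.DataFrame(flat, index=mux, columns=['result']).reset_index()
print(out)
```

Because both the flattening and the index enumeration run "first axis slowest", no sorting or explicit alignment step is needed.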
You can use np.outer for the multiplication:
np.outer(df['sum'], df['sum2'])
Out:
array([[ 2,  4,  6],
       [ 4,  8, 12],
       [ 6, 12, 18]])
This can be converted to a series with labels like this:
pd.DataFrame(np.outer(df['sum'], df['sum2']),
             index=df['origin'],
             columns=df['origin']).rename_axis('dest', axis=1).stack()
Out:
origin  dest
a.      a.       2
        b.       4
        c.       6
b.      a.       4
        b.       8
        c.      12
c.      a.       6
        b.      12
        c.      18
dtype: int64
To get a flat DataFrame instead of a Series, chain to_frame and reset_index:
(pd.DataFrame(np.outer(df['sum'], df['sum2']),
              index=df['origin'],
              columns=df['origin']).rename_axis('dest', axis=1).stack()
   .to_frame('result').reset_index())
Out:
origin dest result
0 a. a. 2
1 a. b. 4
2 a. c. 6
3 b. a. 4
4 b. b. 8
5 b. c. 12
6 c. a. 6
7 c. b. 12
8 c. c. 18
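As a possible alternative that stays entirely in pandas, the same pairing can be sketched as a cross join. This is a different technique from the np.outer approach above, and it assumes pandas 1.2 or newer, where `merge` accepts `how='cross'`. The names `left`, `right`, `pairs`, and `out` are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'origin': ['a.', 'b.', 'c.'],
                   'sum': [2, 4, 6],
                   'sum2': [1, 2, 3]})

# Left side carries origin and sum; right side carries dest and sum2
left = df[['origin', 'sum']]
right = df[['origin', 'sum2']].rename(columns={'origin': 'dest'})

# Cross join: every left row is paired with every right row
pairs = left.merge(right, how='cross')

# Multiply within each pair and keep only the requested columns
pairs['result'] = pairs['sum'] * pairs['sum2']
out = pairs[['origin', 'dest', 'result']]
print(out)
```

The cross join materializes both input columns for every pair, so for large frames it will likely use more memory than np.outer, which only stores the products.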
import pandas as pd
import itertools
# Make data example
df = pd.DataFrame()
df['origin']=['a.','b.','c.']
df['sum'] = [2,4,6]
df['sum2'] = [1,2,3]
# Record sum and sum2 for a. b. c.
df_dict = df.set_index('origin').to_dict()
df_final = pd.DataFrame()
for x,y in itertools.product(df['origin'],df['origin']):
    df_final = pd.concat([df_final,pd.DataFrame([x,y,df_dict['sum'][x]*df_dict['sum2'][y]]).T],axis=0)
df_final.columns =['origin','dest','result (sum * sum2)']
Result
origin dest result (sum * sum2)
0 a. a. 2
0 a. b. 4
0 a. c. 6
0 b. a. 4
0 b. b. 8
0 b. c. 12
0 c. a. 6
0 c. b. 12
0 c. c. 18
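In the result above every row carries index 0, because each one-row frame is concatenated with its own index. A minimal variant, sketched under the same sample data, collects the rows in a list and builds the DataFrame once, which gives a 0..8 index and avoids repeated concatenation. The names `lookup` and `rows` are illustrative.

```python
import itertools
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'origin': ['a.', 'b.', 'c.'],
                   'sum': [2, 4, 6],
                   'sum2': [1, 2, 3]})

# Look up sum and sum2 by origin label
lookup = df.set_index('origin').to_dict()

# One tuple per (origin, dest) pair, in itertools.product order
rows = [(x, y, lookup['sum'][x] * lookup['sum2'][y])
        for x, y in itertools.product(df['origin'], df['origin'])]

# Build the frame in a single call instead of concatenating inside the loop
df_final = pd.DataFrame(rows, columns=['origin', 'dest', 'result (sum * sum2)'])
print(df_final)
```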