How to sum column values using Python
I have a rowset that looks like this:
defaultdict(<type 'dict'>,
{
u'row1': {u'column1': 33, u'column2': 55, u'column3': 23},
u'row2': {u'column1': 32, u'column2': 32, u'column3': 17},
u'row3': {u'column1': 31, u'column2': 87, u'column3': 18}
})
I want to be able to easily get the sum of column1, column2, and column3. Ideally this would work for any number of columns, with the result in a hashmap of the form columnName => columnSum. As you might have guessed, I have no way to get the totals from the database in the first place, hence the question.
4 answers
>>> from collections import defaultdict
>>> x = defaultdict(dict,
{
u'row1': {u'column1': 33, u'column2': 55, u'column3': 23},
u'row2': {u'column1': 32, u'column2': 32, u'column3': 17},
u'row3': {u'column1': 31, u'column2': 87, u'column3': 18}
})
>>> sums = defaultdict(int)
>>> for row in x.itervalues():
...     for column, val in row.iteritems():
...         sums[column] += val
...
>>> sums
defaultdict(<type 'int'>, {u'column1': 96, u'column3': 58, u'column2': 174})
Even better, use collections.Counter (Python 2.7+):
>>> from collections import Counter
>>> sums = Counter()
>>> for row in x.values():
...     sums.update(row)
...
>>> sums
Counter({u'column2': 174, u'column1': 96, u'column3': 58})
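The sessions above are Python 2 (itervalues, iteritems, u'' strings). For Python 3, a minimal sketch of the same Counter idea might look like the following. The names column_totals and rowset are made up for illustration, and the data is copied from the question.

```python
from collections import Counter


def column_totals(rowset):
    """Return a plain dict mapping each column name to the sum of its values."""
    totals = Counter()
    for row in rowset.values():
        # Counter.update() with a mapping adds each value to the running total
        # for that key instead of overwriting it.
        totals.update(row)
    return dict(totals)


# Data from the question (hypothetical variable name).
rowset = {
    "row1": {"column1": 33, "column2": 55, "column3": 23},
    "row2": {"column1": 32, "column2": 32, "column3": 17},
    "row3": {"column1": 31, "column2": 87, "column3": 18},
}

totals = column_totals(rowset)
print(totals)  # {'column1': 96, 'column2': 174, 'column3': 58}
```

This works for any number of columns because the column names are never listed explicitly.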
A generator expression inside a dict constructor does the trick. Looking each column up by name avoids hard-coding the number of columns or relying on dict ordering:
>>> foo
defaultdict(<type 'dict'>, {u'row1': {u'column1': 33, u'column3': 23, u'column2': 55}, u'row2': {u'column1': 32, u'column3': 17, u'column2': 32}, u'row3': {u'column1': 31, u'column3': 18, u'column2': 87}})
>>> columns = foo.values()[0].keys()
>>> dict((k, sum(row[k] for row in foo.values())) for k in columns)
{u'column1': 96, u'column3': 58, u'column2': 174}
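If some rows might be missing a column (not the case in the question's data, so this is an assumption on my part), a Python 3 sketch could collect every column name first and count a missing cell as 0. The function name sum_columns and the choice of 0 as the default are illustrative, not something the question specifies.

```python
def sum_columns(rows):
    """Return {column_name: total} for a dict of row dicts."""
    # Collect every column name that appears in at least one row.
    columns = set()
    for row in rows.values():
        columns.update(row)
    # Assumption: a column missing from a row contributes 0 to its total.
    return {col: sum(row.get(col, 0) for row in rows.values()) for col in columns}


# Data from the question.
foo = {
    "row1": {"column1": 33, "column2": 55, "column3": 23},
    "row2": {"column1": 32, "column2": 32, "column3": 17},
    "row3": {"column1": 31, "column2": 87, "column3": 18},
}

totals = sum_columns(foo)
ragged = sum_columns({"a": {"x": 1}, "b": {"x": 2, "y": 5}})
```

Here totals holds the three column sums from the question, and ragged shows the missing-cell behaviour: "y" only appears in one row, so its total is just that row's value.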
Here is another approach: first put your data into a matrix, then multiply the matrix by a vector of ones.
import numpy as np
A = np.random.normal(size = (3,3))
Multiplying by a vector of ones adds up the entries of each row, so this gives one total per row:
np.dot(A, np.ones(3))
To get one total per column instead, which is what the question asks for, transpose the matrix first:
np.dot(A.T, np.ones(3))
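The snippet above uses random data. To apply the idea to the rowset from the question, one possible sketch (Python 3, assuming every row has the same columns) builds the matrix with a fixed column order and maps the sums back to column names. The variable names are hypothetical, and A.sum(axis=0) is shown next to the dot product because it computes the same column totals more directly.

```python
import numpy as np

# Data from the question.
rowset = {
    "row1": {"column1": 33, "column2": 55, "column3": 23},
    "row2": {"column1": 32, "column2": 32, "column3": 17},
    "row3": {"column1": 31, "column2": 87, "column3": 18},
}

# Fix a column order so matrix columns can be mapped back to names.
columns = sorted({col for row in rowset.values() for col in row})

# One matrix row per rowset row, one matrix column per named column.
A = np.array([[row[col] for col in columns] for row in rowset.values()])

# Column totals via the dot product with a ones vector (one entry per row).
via_dot = np.dot(A.T, np.ones(A.shape[0]))

# The same totals, computed by summing along axis 0.
col_sums = A.sum(axis=0)

totals = dict(zip(columns, col_sums.tolist()))
print(totals)  # {'column1': 96, 'column2': 174, 'column3': 58}
```

For a rowset this small, the pure-Python answers are simpler; NumPy mainly pays off when the data is already numeric and large.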