Counting T / F Values for Multiple Conditions

Question

I am starting to use pandas.

I am looking for mutations in several patients, and I have 16 different conditions. For each condition I check the MUT column for that change, which gives me True or False for every row, and then I count the True / False values. So far I have only written this for 4 of them, and I would like to do it in a for loop instead.

Can you suggest an easier way instead of writing the same code 16 times?

s1=df["MUT"]
A_T= s1.str.contains("A:T")
ATnum= A_T.value_counts(sort=True)

s2=df["MUT"]
A_G=s2.str.contains("A:G")
AGnum=A_G.value_counts(sort=True)

s3=df["MUT"]
A_C=s3.str.contains("A:C")
ACnum=A_C.value_counts(sort=True)

s4=df["MUT"]
A__=s4.str.contains("A:-")
A_num=A__.value_counts(sort=True)

python pandas count

kant Jul 20 '15 at 19:34

2 answers

Just use value_counts. This will give you the count of all unique values in your column, so there is no need to create 16 variables:

In [5]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'MUT': np.random.randint(0, 16, 100)})
df['MUT'].value_counts()

Out[5]:
6     11
14    10
13     9
12     9
1      8
9      7
15     6
11     6
8      5
5      5
3      5
2      5
10     4
4      4
7      3
0      3
dtype: int64
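
For a string column like the one in the question, the same idea might look like the sketch below. The sample data is made up for illustration, and it assumes each cell of MUT holds exactly one mutation label such as "A:T". If a cell can hold several mutations or extra text, a substring check with str.contains is the safer route.

```python
import pandas as pd

# Hypothetical sample data: one mutation label per row.
df = pd.DataFrame({"MUT": ["A:T", "A:G", "A:T", "A:C", "A:-", "A:G", "A:T"]})

# One call counts every distinct label; no per-condition variables needed.
counts = df["MUT"].value_counts()
print(counts)

# Look up a single condition by its label.
print(counts["A:T"])
```

The result is a Series indexed by the mutation labels, so any individual count can be read off by label.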

EdChum Jul 20 '15 at 19:47

Michael0x2a · Accepted Answer · 2015-07-20T19:40:41+0000

I'm not an expert on using Pandas, so I don't know if there is a cleaner way to do this, but maybe the following might work?

chars = 'TGC-'
nums = {}

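for char in chars:
    s = df["MUT"]
    # True where the entry contains the pattern, e.g. "A:T"
    A = s.str.contains("A:" + char)
    # Number of True / False values for this condition
    num = A.value_counts(sort=True)
    nums[char] = num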

ATnum = nums['T']
AGnum = nums['G']
# ...etc

Basically, loop over each character (T, G, C, -), compute the True / False counts for the corresponding pattern, and store the result in the dictionary under that character. Once the loop is over, you can pull any of the counts you need out of the dictionary.
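
If a single table of results is more convenient than a dictionary of separate Series, the loop can be folded into one DataFrame. This is only a sketch: the sample data and the `patterns` list are hypothetical stand-ins for the real column and the full set of 16 conditions, and it assumes MUT has no missing values.

```python
import pandas as pd

# Hypothetical sample data standing in for the real MUT column.
df = pd.DataFrame({"MUT": ["A:T", "A:G", "A:T", "A:C", "A:-", "A:G", "A:T"]})

# Hypothetical list of conditions; extend it to all 16.
patterns = ["A:T", "A:G", "A:C", "A:-"]

# One boolean column per pattern; regex=False treats each
# pattern as a literal substring rather than a regular expression.
matches = pd.DataFrame(
    {p: df["MUT"].str.contains(p, regex=False) for p in patterns}
)

# Summing booleans counts the True values; the remaining rows are False.
true_counts = matches.sum()
summary = pd.DataFrame({"True": true_counts, "False": len(df) - true_counts})
print(summary)
```

Each row of `summary` is one condition, so a count can be read with something like `summary.loc["A:T", "True"]`. Unlike value_counts on a boolean Series, this always reports both counts, even when a pattern never matches.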
