Finding the IQR by group
I want to find the IQR of a range of values in a data frame. The values are grouped, so I need the IQR of each group in the data frame. I have the following table:
Block DNAname Spot_Size Molarity Cy3_Fluorescence
1 DNA 01 100pl 100 14266
1 DNA 01 100pl 100 16020
1 DNA 01 100pl 100 15705
1 DNA 01 100pl 100 15783
1 DNA 01 100pl 100 15834
1 DNA 01 100pl 50 12248
1 DNA 01 100pl 50 12209
1 DNA 01 100pl 50 12511
1 DNA 01 100pl 50 12316
1 DNA 01 100pl 50 12469
1 DNA 01 100pl 25 9626
1 DNA 01 100pl 25 9804
1 DNA 01 100pl 25 9794
1 DNA 01 100pl 25 10020
1 DNA 01 100pl 25 9739
1 DNA 01 100pl 10 7158
1 DNA 01 100pl 10 6802
1 DNA 01 100pl 10 7378
1 DNA 01 100pl 10 5949
1 DNA 01 100pl 10 7484
1 DNA 01 100pl 5 5257
1 DNA 01 100pl 5 5560
1 DNA 01 100pl 5 6076
1 DNA 01 100pl 5 5925
I am running the following code to find the IQR:
aggregate(Cy3.DNA1.100pl.1uM$Cy3_Fluorescence,
          list(Molarity = Cy3.DNA1.100pl.1uM$Molarity,
               Spot_Size = Cy3.DNA1.100pl.1uM$Spot_Size),
          IQR)
This gives me an output:
Molarity Spot_Size x
5 100pl 384
10 100pl 576
25 100pl 65
50 100pl 221
100 100pl 129
This output groups all the molarities correctly, but the IQR is not what I expect. If I use mean as the function in the code above instead of IQR, the value of x (the mean) is correct:
Molarity Spot_Size x
5 100pl 5752.4
10 100pl 6954.2
25 100pl 9796.6
50 100pl 12350.6
100 100pl 15521.6
The expected IQRs are as follows:
Molarity IQR
100 324.25
50 258
25 363
10 519.5
5 400
Any help would be much appreciated. I would also love to hear any ideas on how to compute the IQR when the data contain several spot sizes (ranging from 100pl to 400pl) in addition to the molarity categories.
Thank you.
It is not clear whether your problem lies in the aggregation or in your (??) definition of the IQR. There are many ways to calculate an IQR (see this and this). As far as I can tell, none of them gives the expected results in your post.
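To see how much the choice of definition matters, here is a minimal sketch that runs the Molarity 100 values from your table through each of the nine quantile algorithms that base R's quantile() offers. IQR() accepts a `type` argument and passes it on. The variable names `x` and `iqr_by_type` are mine, chosen only for illustration.

```r
# Cy3_Fluorescence values for Molarity 100, copied from the table in the question
x <- c(14266, 16020, 15705, 15783, 15834)

# quantile() supports nine algorithms (type = 1..9); IQR() passes `type` through
iqr_by_type <- sapply(1:9, function(t) IQR(x, type = t))
names(iqr_by_type) <- paste0("type", 1:9)
print(iqr_by_type)

# The default is type 7, which is what a plain IQR(x) call uses
stopifnot(IQR(x) == iqr_by_type[["type7"]])
```

If one of the printed values matched your expected 324.25, you could pass that `type` to IQR() inside the aggregation. In my runs none did, which is why I suspect the expected values come from some other calculation.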
As for aggregating by spot size and molarity, here are two ways:
# use aggregate(...) in base R - will be slow with large datasets
aggregate(Cy3_Fluorescence~Molarity+Spot_Size,df,IQR)
# Molarity Spot_Size Cy3_Fluorescence
# 1 5 100pl 478.5
# 2 10 100pl 576.0
# 3 25 100pl 65.0
# 4 50 100pl 221.0
# 5 100 100pl 129.0
# use data.table - will be extremely fast.
library(data.table)
setDT(df)[,list(IQR=IQR(Cy3_Fluorescence)),by=list(Molarity,Spot_Size)]
# Molarity Spot_Size IQR
# 1: 100 100pl 129.0
# 2: 50 100pl 221.0
# 3: 25 100pl 65.0
# 4: 10 100pl 576.0
# 5: 5 100pl 478.5
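For reference, here is a self-contained sketch of how `df` can be rebuilt from the table in the question, so the numbers above can be reproduced. It assumes only the rows shown in the post. Note that the posted table lists just four rows for Molarity 5, which would explain why I get 478.5 for that group while the question reports 384: the reported mean of 5752.4 also points to a fifth replicate that did not make it into the post.

```r
# Data copied from the table in the question (Block 1, DNA 01, 100pl only).
# The posted table has only four rows for Molarity 5.
df <- data.frame(
  Spot_Size = "100pl",
  Molarity  = rep(c(100, 50, 25, 10, 5), times = c(5, 5, 5, 5, 4)),
  Cy3_Fluorescence = c(
    14266, 16020, 15705, 15783, 15834,
    12248, 12209, 12511, 12316, 12469,
     9626,  9804,  9794, 10020,  9739,
     7158,  6802,  7378,  5949,  7484,
     5257,  5560,  6076,  5925
  )
)

# One IQR per Molarity/Spot_Size combination (default quantile type 7)
res <- aggregate(Cy3_Fluorescence ~ Molarity + Spot_Size, data = df, FUN = IQR)
print(res)

# Group sizes, to check that every group has the number of replicates you expect
n_per_group <- aggregate(Cy3_Fluorescence ~ Molarity + Spot_Size, data = df, FUN = length)
print(n_per_group)
```

Because Spot_Size is already part of the grouping formula, the same call should work unchanged once the data frame contains the other spot sizes (100pl to 400pl): each Molarity/Spot_Size pair gets its own row.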