R: running Fisher's exact test in a loop gives an error message

My dataframe looks like this:

595.00000    18696      984.00200     32185    Group1  
935.00000    18356      1589.00000    31580    Group2            
40.00010     19251      73.00000      33096    Group3            
1058.00000   18233      1930.00000    31239    Group4                
19.00000     19272      27.00000      33142    Group5            
1225.00000   18066      2149.00000    31020    Group6  
....                 

      

For each group, I want to do an exact Fisher test.

table <- matrix(c(595.00000, 984.00200, 18696, 32185), ncol=2, byrow=TRUE)
Group1 <- fisher.test(table, alternative="greater")

      

I tried looping over the data frame with:

for (i in 1:nrow(data.frame))  
 {  
 table= matrix(c(data.frame$V1, data.frame$V2, data.frame$V3, data.frame$V4), ncol=2, byrow=T)    
fisher.test(table, alternative="greater")  
}

      

But I got this error message:

Error in fisher.test(table, alternative = "greater") :  
FEXACT error 40.  
Out of workspace.  
In addition: Warning message:  
In fisher.test(table, alternative = "greater")  :  
'x' has been rounded to integer: Mean relative difference: 2.123828e-06

      

How can I fix this problem, or is there another way to iterate over the data?



1 answer


Your first error: Out of workspace

?fisher.test
fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
        control = list(), or = 1, alternative = "two.sided",
        conf.int = TRUE, conf.level = 0.95,
        simulate.p.value = FALSE, B = 2000)

      

You should try increasing the workspace argument (default = 2e5).

However, this happens in your case because your counts are really large. As a rule of thumb, if all the elements in your matrix are > 5 (or, in your case, 10, since df = 1), you can safely approximate the exact test with the chi-square test of independence, chisq.test. For your case, I think you should use chisq.test.
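As a quick way to check that rule of thumb, here is a minimal sketch using the Group1 counts from the question. It relies on the expected component that chisq.test returns; the variable names tab and res are only illustrative.

```r
# Group1 counts from the question, rounded to integers
tab <- matrix(round(c(595.00000, 984.00200, 18696, 32185)),
              ncol = 2, byrow = TRUE)

res <- chisq.test(tab)

# Expected cell counts under independence
res$expected

# TRUE when every expected count clears the threshold of 10
all(res$expected > 10)
```

If the last line gives TRUE for a row, the chi-square approximation should be reasonable for that row.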

The warning message occurs because some of your values are not integers (for example 984.00200). So, if you really want to apply fisher.test to every row, do this (assuming your data is in a data.frame called df):

# fisher.test with bigger workspace
apply(as.matrix(df[,1:4]), 1, function(x) 
         fisher.test(matrix(round(x), ncol=2), workspace=1e9)$p.value)

      



Or, if you prefer to use chisq.test instead (which I think you should for values this large: it is faster, with no significant difference in the p-values):

apply(as.matrix(df[,1:4]), 1, function(x) 
         chisq.test(matrix(round(x), ncol=2))$p.value)

      

This will extract the p-values.
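For comparison with the loop in the question, here is a sketch of an explicit for loop that does the same job. Note that the loop in the question never uses i: data.frame$V1 is the whole column, so every iteration builds one tall table from all rows instead of a 2x2 table for a single row. The data frame name df, the column names V1 to V4 and grp, and the two example rows are assumptions; the cell order follows the Group1 example in the question.

```r
# Two example rows in the layout of the question (names are assumed)
df <- data.frame(V1 = c(595, 935), V2 = c(18696, 18356),
                 V3 = c(984.002, 1589), V4 = c(32185, 31580),
                 grp = c("Group1", "Group2"))

p_values <- numeric(nrow(df))
for (i in seq_len(nrow(df))) {
  # Build the 2x2 table for row i only, rounding to integer counts
  tab <- matrix(round(c(df$V1[i], df$V3[i], df$V2[i], df$V4[i])),
                ncol = 2, byrow = TRUE)
  p_values[i] <- fisher.test(tab, alternative = "greater")$p.value
}

# Label each p-value with its group
names(p_values) <- df$grp
p_values
```

Unlike the original loop, this stores each result; a bare fisher.test call inside a for loop prints nothing and its value is discarded.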

Edit 1: I just noticed that you are using a one-sided Fisher's exact test. Maybe you should stick with Fisher's test and a larger workspace, as I'm not sure about a one-tailed chi-square test of independence: its p-value is already computed from the right tail of the distribution (and you cannot simply divide the p-value by 2, since the distribution is asymmetric).
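A sketch of the one-sided variant, simply passing alternative="greater" through to fisher.test as in the question. The data frame df and its columns V1 to V4 are assumed names, filled here with the Group1 and Group3 rows from the question.

```r
# Group1 and Group3 rows from the question (column names are assumed)
df <- data.frame(V1 = c(595, 40.0001), V2 = c(18696, 19251),
                 V3 = c(984.002, 73),  V4 = c(32185, 33096))

# One-sided Fisher's exact test per row; round() avoids the
# "'x' has been rounded to integer" warning
p_greater <- apply(as.matrix(df[, 1:4]), 1, function(x)
  fisher.test(matrix(round(x), ncol = 2),
              alternative = "greater", workspace = 1e9)$p.value)

p_greater
```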

Edit 2: Since you need the group name alongside each p-value and you already have a data.frame, I suggest you use the data.table package like this:

# example data
set.seed(45)
df <- as.data.frame(matrix(sample(10:200, 20), ncol=4))
df$grp <- paste0("group", 1:nrow(df))
# load package
require(data.table)
dt <- data.table(df, key="grp")
dt[, p.val := fisher.test(matrix(c(V1, V2, V3, V4), ncol=2), 
                workspace=1e9)$p.value, by=grp]
> dt
#     V1  V2  V3  V4    grp        p.val
# 1: 130  65  76  82 group1 5.086256e-04
# 2:  70  52 168 178 group2 1.139934e-01
# 3:  55 112 195  34 group3 7.161604e-27
# 4:  81  43  91  80 group4 4.229546e-02
# 5:  75  10  86  50 group5 4.212769e-05

      
