lme error: "Error in reStruct"
Four hives were equipped with sensors that recorded temperature, humidity, pressure, and sound level (decibels) inside the hive. These are the response variables.
The treatment was wifi exposure: the treatment hives were exposed to wifi from day 1 to day 20 and again from day 35 to day 45, and data were collected until day 54. Number of hives = 4; number of sensor readings per hive = roughly a million.
I am having difficulty fitting mixed-effects models.
There is one data frame containing all of the hive response variables:
names(Hives)
[1] "time" "dht22_t" "dht11_t" "dht22_h"
[5] "dht11_h" "db" "pa" "treatment_hive"
[9] "wifi"
time is in "%Y-%m-%d %H:%M:%S" format; dht11_t, dht22_t, dht11_h and dht22_h are the temperature and humidity readings. "wifi" is a dichotomous variable (1 = on, 0 = off) that marks the exposure periods, and treatment_hive is another dichotomous variable that identifies the wifi-exposed hives (1 = exposed, 0 = control).
Here is the error I am getting.
attach(Hives)
model2 = lme(pa_t~wifi*treatment_hive, random=time, na.action=na.omit, method="REML",)
Error in reStruct(random, REML = REML, data = NULL) :
Object must be a list or a formula
Here's some sample data:
time dht22_t dht11_t dht22_h dht11_h db pa treatment_hive wifi
1 01/09/2014 15:19 NA NA NA NA 51.75467 NA 0 1
2 01/09/2014 15:19 30.8 31 59.8 44 55.27682 100672 0 1
3 01/09/2014 15:19 30.8 31 60.3 44 54.81995 100675 0 1
4 01/09/2014 15:19 30.8 31 60.9 44 54.14134 100671 0 1
5 01/09/2014 15:19 30.8 31 61.1 44 53.88574 100672 0 1
6 01/09/2014 15:19 30.8 31 61.2 44 53.68800 100680 0 1
R version 2.15.1 (2012-06-22) Platform: i486-pc-linux-gnu (32-bit)
attached packages: [1] ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.7.1 nlme_3.1-104
[6] lme4_0.999999-0 Matrix_1.0-6 lattice_0.20-6
There are a lot of issues here. Some are programming-related (suitable for StackOverflow), but the statistical issues are probably more important (suitable for CrossValidated or r-sig-mixed-models@r-project.org).
tl;dr If you just want to avoid the error, I think you need random=~1|hive (using whatever your hive-identifying variable is called) to fit a model in which the baseline response (intercept) varies across hives, but I would suggest you read on ...
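As a minimal sketch of that quick fix, the code below fits a random-intercept model with lme(). It runs on simulated data, not the real sensor readings: the column name hive is hypothetical (the question does not say which variable identifies the hives), and the response is taken to be pa, since pa_t does not appear in names(Hives).

```r
library(nlme)

set.seed(1)

# Simulated stand-in for the real data: 4 hives, 50 readings each.
# `hive` is a HYPOTHETICAL column name; use whatever identifies your hives.
Hives <- data.frame(
  hive           = factor(rep(1:4, each = 50)),
  treatment_hive = rep(c(0, 0, 1, 1), each = 50),
  wifi           = rep(rep(c(1, 0), each = 25), times = 4)
)

# Simulated pressure: a hive-specific baseline plus a small exposure effect.
hive_effect <- rnorm(4, sd = 5)
Hives$pa <- 100670 + hive_effect[Hives$hive] +
  2 * Hives$wifi * Hives$treatment_hive +
  rnorm(nrow(Hives), sd = 3)

# `random` must be a formula (or a list of formulas), not a bare variable:
# ~ 1 | hive lets the intercept vary from hive to hive.
model2 <- lme(pa ~ wifi * treatment_hive,
              random    = ~ 1 | hive,
              data      = Hives,
              na.action = na.omit,
              method    = "REML")

summary(model2)
```

Passing data = Hives also removes the need for attach(Hives).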
- Can we have a (small!) reproducible example?
- don't use attach(Hives); use data=Hives in your lme() call (not necessarily a problem, but [much] better practice)
- with only four hives, it is somewhat questionable whether fitting a random effect across hives will work (although with millions of observations you might get away with it).
- the random effect should consist of a categorical (factor) grouping variable; in your case, I think "hive" is a grouping variable, but I cannot tell from your question which variable identifies the hives
- you should almost certainly fit a model that accounts for trends over time and for variation in those time trends across hives, i.e. a random-slopes model, which would be expressed as formula=...~...+time, random=~time|hive (where the ... represents the bits of your existing model)
- you will need to convert time to something sensible before using it in your model (see ?strptime or the lubridate package); something like seconds / minutes / hours from the start time is probably most sensible. (What is your temporal resolution? Do you have multiple sensors per hive, in which case you should also consider a random effect of sensor?)
- with millions of data points, your model is likely to fit very slowly; you may want to look at the lme4 package
- with millions of data points, everything will be statistically significant, and the results will be very sensitive to aspects of the data that the model does not capture, for example (1) non-linear trends over time (consider fitting additive models of the time trend with mgcv::gamm or the gamm4 package); (2) temporal autocorrelation (consider adding a correlation argument to your lme model).
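The time-conversion and random-slopes suggestions in the list above can be sketched roughly as follows. This is an illustration on simulated hourly data, not the real sensor readings: the column names hive and hours are hypothetical, as are the simulated effect sizes, and the timestamps are assumed to follow the "%Y-%m-%d %H:%M:%S" format given in the question.

```r
library(nlme)

set.seed(2)
n_per_hive <- 60

# Simulated hourly timestamps stored as strings, as in the question.
start  <- as.POSIXct("2014-09-01 15:19:00", tz = "UTC")
stamps <- format(start + 3600 * (0:(n_per_hive - 1)), "%Y-%m-%d %H:%M:%S")

Hives <- data.frame(
  hive             = factor(rep(1:4, each = n_per_hive)),  # hypothetical name
  time             = rep(stamps, times = 4),
  treatment_hive   = rep(c(0, 0, 1, 1), each = n_per_hive),
  stringsAsFactors = FALSE
)

# Convert the character timestamps to a numeric "hours since start" variable.
t_parsed    <- as.POSIXct(strptime(Hives$time, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
Hives$hours <- as.numeric(difftime(t_parsed, min(t_parsed), units = "hours"))

# Two exposure windows, loosely mimicking the design in the question.
Hives$wifi <- as.integer(Hives$hours < 20 |
                         (Hives$hours >= 35 & Hives$hours < 45))

# Simulated response with a hive-specific intercept and time trend.
b0 <- rnorm(4, sd = 5)
b1 <- rnorm(4, sd = 0.1)
Hives$pa <- 100670 + b0[Hives$hive] +
  (0.05 + b1[Hives$hive]) * Hives$hours +
  2 * Hives$wifi * Hives$treatment_hive +
  rnorm(nrow(Hives), sd = 1)

# Random intercept and random slope of time within each hive.
model3 <- lme(pa ~ wifi * treatment_hive + hours,
              random    = ~ hours | hive,
              data      = Hives,
              na.action = na.omit,
              method    = "REML")

summary(model3)
```

Temporal autocorrelation could then be explored by passing a correlation structure through lme()'s correlation argument, for instance nlme's corAR1; whether that is adequate for the real data is a modelling question this sketch does not settle.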