Why does read.zoo give a Date index when times are available?

I am trying to understand difficulties I have had in the past when reading data into zoo objects. The following two calls to read.zoo give different results even though the default value of the tz argument is "", and passing tz = "" is the only difference between the two calls:

Lines <- "2013-11-25 12:41:21         2 
2013-11-25 12:41:22.25      2 
2013-11-25 12:41:22.75      75 
2013-11-25 12:41:24.22      3 
2013-11-25 12:41:25.22      1 
2013-11-25 12:41:26.22      1"

library(zoo)
z <- read.zoo(text = Lines, index = 1:2)

> dput(z)
structure(c(2L, 2L, 75L, 3L, 1L, 1L), index = structure(c(16034, 
16034, 16034, 16034, 16034, 16034), class = "Date"), class = "zoo")

z <- read.zoo(text = Lines, index = 1:2, tz="")
> dput(z)
structure(c(2L, 2L, 75L, 3L, 1L, 1L), index = structure(c(1385412081, 
1385412082.25, 1385412082.75, 1385412084.22, 1385412085.22, 1385412086.22
), class = c("POSIXct", "POSIXt"), tzone = ""), class = "zoo")
> 

      

+3




3 answers


The default index class is "Date" unless tz is used, in which case the default is "POSIXct". So the first example in the question gives the "Date" class because that is the default, and the second gives "POSIXct" because tz was specified.

If you want to specify the class without relying on these defaults, be explicit and use the FUN argument:

read.zoo(...whatever..., FUN = as.Date)
read.zoo(...whatever..., FUN = as.POSIXct) # might need FUN=paste,FUN2=as.POSIXct
read.zoo(...whatever..., FUN = as.yearmon)
# etc. 

      

The FUN argument can also be a custom function, as shown in the examples in the package.
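As a minimal sketch of a custom FUN, assume the index is stored as compact numbers such as 20131125. The data and the helper name parse_compact_date are made up for illustration and are not part of zoo:

```r
library(zoo)

# Made-up data: the first column holds dates written as yyyymmdd numbers.
Lines <- "20131125 2
20131225 3
20131226 8"

# Hypothetical helper (not part of zoo): convert yyyymmdd values to Date.
parse_compact_date <- function(x, ...) {
  as.Date(as.character(x), format = "%Y%m%d")
}

z <- read.zoo(text = Lines, FUN = parse_compact_date)
class(index(z))
```

Because the conversion is given explicitly, the result does not depend on any heuristic; without FUN, an index like this would look numeric.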

Note that it always assumes the standard format (e.g. "%Y-%m-%d" in the case of the "Date" class) if no format is specified, and it never tries to detect the format automatically.
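To illustrate the point about formats, here is a small sketch with made-up data in day/month/year order. Since the format is not detected automatically, it is passed explicitly through format=:

```r
library(zoo)

# Made-up data: dates written as dd/mm/yyyy, which is not the standard format.
Lines <- "25/11/2013 2
26/11/2013 3"

# format given and no tz: the index is read as Date using this format.
z <- read.zoo(text = Lines, format = "%d/%m/%Y")
class(index(z))
```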

How it works is explained in detail in ?read.zoo, and there are many examples in ?read.zoo (the examples section contains 78 lines of code) as well as in an entire vignette (one of six vignettes) dedicated just to read.zoo: "Reading Data in zoo".

ADDED. Expanded the above. Also, in the development version of zoo available here, the heuristic has been improved, and with this improvement the first example in the question recognizes date/times and selects POSIXct. Some clarification of the simple heuristic has also been added to the read.zoo help file, so that the many examples provided need not be relied upon as much.



Here are some examples. Note that the heuristic referred to only determines the class of the time index. It can only identify the "numeric", "Date" and "POSIXct" classes. It cannot identify other classes (although you can specify them yourself using FUN=). Also, the heuristic does not identify formats. If the format is not provided using format= or implicitly through FUN=, then the standard format is assumed, e.g. "%Y-%m-%d" in the case of "Date".

Lines <- "2013-11-25 12:41:21  2 
2013-12-25 12:41:22.25      3 
2013-12-26 12:41:22.75      8"

# explicit.  Uses POSIXct.
z <- read.zoo(text = Lines, index = 1:2, FUN = paste, FUN2 = as.POSIXct) 

# tz implies POSIXct
z <- read.zoo(text = Lines, index = 1:2, tz = "")

# heuristic: Date now; devel ver uses POSIXct
z <- read.zoo(text = Lines, index = 1:2) 


Lines <- "2013-11-25 2 
2013-12-25 3 
2013-12-26 8"

z <- read.zoo(text = Lines, FUN = as.Date) # explicit.  Uses Date.
z <- read.zoo(text = Lines, format = "%Y-%m-%d") # format & no tz implies Date
z <- read.zoo(text = Lines) # heuristic: Date

      

Notes:

(1) In general, it is safer to be explicit by using FUN, or by using tz and/or format, than to rely on any heuristic. If you are explicit by using FUN, or semi-explicit by using tz and/or format, then read.zoo behaves the same in the current and development versions.

(2) It is safer to rely on the documentation than on the internals, because the internals can change without warning and have in fact changed in the development version. If you really do want to look at the code anyway, the key statement that selects the index class when FUN is not explicitly defined is the if (is.null(FUN)) ... statement in the read.zoo source.

(3) I recommend using read.zoo as simpler, more direct and more compact than workarounds such as read.table followed by zoo. I have been using read.zoo for many years, as have many others, and it seems pretty solid to me, but if anyone has problems with read.zoo or with the documentation (always possible, since there is quite a lot of it), they can always be reported. Even though the package has been around for many years, improvements are still being made.
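Following note (1), one way to guard against differences between versions is to be explicit and then check the index class right after reading. This is only a sketch; the stopifnot check is an addition for illustration, not something read.zoo requires:

```r
library(zoo)

Lines <- "2013-11-25 12:41:21 2
2013-11-25 12:41:22.25 2"

# Explicit: paste columns 1 and 2 together, then convert with as.POSIXct.
z <- read.zoo(text = Lines, index = 1:2, FUN = paste, FUN2 = as.POSIXct)

# Fail early if the index is not of the class that was asked for.
stopifnot(inherits(index(z), "POSIXct"))
```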

+3




The answer is (of course) in the sources for read.zoo(), where there is:

....
ix <- if (missing(format) || is.null(format)) {
    if (missing(tz) || is.null(tz)) 
        processFUN(ix)
    else processFUN(ix, tz = tz)
}
else {
    if (missing(tz) || is.null(tz)) 
        processFUN(ix, format = format)
    else processFUN(ix, format = format, tz = tz)
}
....

      

Even though the default value for tz is "", in your first case tz is considered missing (by missing()), and therefore processFUN(ix) is used. When you set tz = "", it is no longer missing, and hence you get processFUN(ix, tz = tz).
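The behaviour of missing() with a default argument can be seen in a few lines of base R. The function describe_tz below is hypothetical and only mimics the branching; it is not part of zoo:

```r
# Hypothetical function (not part of zoo): tz has a default of "",
# yet missing(tz) is still TRUE when the caller does not pass it.
describe_tz <- function(tz = "") {
  if (missing(tz)) "tz missing" else "tz supplied"
}

describe_tz()         # "tz missing"
describe_tz(tz = "")  # "tz supplied"
```

So a default value and an explicitly passed value that happen to be equal can still send the code down different branches.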



Looking at read.zoo(), this could perhaps be handled better by having tz = NULL or just tz (no default) in the arguments, and then in the code, if tz needs to be set to "" for some reason, doing:

if (missing(tz) || is.null(tz)) {
    tz <- ""
}

      

or maybe even that is not needed, if all that is wanted is to avoid the confusion about the two different calls?

+9




I suspect the problem lies in how you used read.zoo. Here's what I did:

library(zoo)
tt <- read.table(text=Lines)
z <- zoo(as.integer(tt[,3]), order.by=as.POSIXct(paste(tt[,1], tt[,2])))

      

z is now a proper zoo object:

R> z
2013-11-25 12:41:21.00 2013-11-25 12:41:22.25 2013-11-25 12:41:22.75 
                     2                      2                     75  
2013-11-25 12:41:24.22 2013-11-25 12:41:25.22 2013-11-25 12:41:26.22 
                     3                      1                      1 
R> class(z)
[1] "zoo"
R> class(index(z))
[1] "POSIXct" "POSIXt" 
R> 

      

And by making sure that I used a POSIXct object for the index, I do get a POSIXct index back.

+2








