# R: time series with repeating time index entries

I am a n00b in R and a n00b on Stack Overflow (just joined), so forgive me if I failed to use markup (which I don't know) or missed something in the readme.

If you don't mind, I'll describe my whole problem here, as perhaps you would be kind enough to shed some light on how best to do this!

Stage 1

Build a separate time series object for each TS. Below is an example of the data. Basically, I am loading a CSV file containing several irregular time series (for example TS1 and TS2 below), so in a perfect world I would split them into separate irregular time series objects (such as zoo objects?), one each for TS1, TS2, and so on. This issue was discussed here ( R / zoo: handle non-standard index entries but not lose data? ), but I tried that approach repeatedly and failed.

```
Date TS Data
21/05/2014 TS1 0.95
17/04/2014 TS1 1.02
27/03/2014 TS1 0.90
30/01/2014 TS1 0.80
12/12/2013 TS1 0.70
18/09/2013 TS1 0.67
01/11/2012 TS1 0.71
01/11/2012 TS1 0.70
21/05/2014 TS2 0.47
20/05/2014 TS2 0.51
16/05/2014 TS2 0.49
15/05/2014 TS2 0.55
10/05/2014 TS2 0.63
07/05/2014 TS2 0.77
```

As you can see, the problem is the duplicate date index `01/11/2012` for TS1, which causes `read.zoo` to fail, so my split data object is not created.

Stage 2

What I would like to do is, for every irregular date, add together the data from all series on that date. Since the time series are all irregular and have different frequencies, I would like to use the previous value for any `TS` that has no entry on that date. For example, for `21/05/2014` the calculation is simple, since TS1 and TS2 both have an entry, so the answer is `0.47 + 0.95`. But for `20/05/2014` only `TS2` has an entry, so the value to use for `TS1` is the most recent one before that date, that is, the `17/04/2014` value `1.02`, so the calculation for `20/05/2014` must be `0.51 + 1.02`. Maybe the simplest way to achieve this would be to convert each TS to daily values, so that the previous value is used until a new data point arrives? But this seems wasteful / unnecessary for Stage 3 below.
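To make the `20/05/2014` arithmetic concrete, here is a minimal base R sketch of the "use the previous value" rule. It uses only a few rows of the example, sorted by date, and the helper name `last_value` is made up for illustration; it is not part of zoo or any other package.

```r
# A few rows from the example above, sorted by ascending date.
ts1 <- data.frame(date  = as.Date(c("2014-03-27", "2014-04-17", "2014-05-21")),
                  value = c(0.90, 1.02, 0.95))
ts2 <- data.frame(date  = as.Date(c("2014-05-16", "2014-05-20", "2014-05-21")),
                  value = c(0.49, 0.51, 0.47))

# Hypothetical helper: value of the latest observation at or before `on`,
# or NA if the series has not started yet. Assumes `series` is sorted by date.
last_value <- function(series, on) {
  i <- findInterval(as.numeric(on), as.numeric(series$date))
  if (i == 0) NA_real_ else series$value[i]
}

on <- as.Date("2014-05-20")
total <- last_value(ts1, on) + last_value(ts2, on)  # 1.02 + 0.51
print(total)
```

TS1 has no entry on that date, so the `17/04/2014` value is carried forward, which matches the calculation described above.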

Stage 3

Having created this cumulative sum of all the TSs, I want to fit a polynomial curve to it. I also want to differentiate this curve in order to find the rate of change at a given date.

Any help would be greatly appreciated! I feel like banging my head against the wall repeatedly would be much more fun than doing anything else at this stage!

Thanks

Update: I now have the following code, based on Grothendieck's answer.

```
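library(scales)
library(zoo)
library(ggplot2)

# Build one summed series from the rows of a single group (one user).
f <- function(z) {
  # Split by the second column; average values that share the same date.
  zz <- read.zoo(z, header = TRUE, split = 2, format = "%d/%m/%Y", aggregate = mean)
  z.fill <- na.locf(zz)                 # carry the last observation forward
  z.fill <- (z.fill >= 0.5) * z.fill    # zero out values below 0.5
  z.fill <- na.fill(z.fill, 0)          # replace remaining NAs with 0
  zfill.mat <- matrix(z.fill, NROW(z.fill))
  z.sum <- rowSums(zfill.mat)           # sum across series on each date
  zsum <- zoo(z.sum, time(z.fill))
  return(zsum)
}

DF <- read.csv(file.choose(), header = TRUE, as.is = TRUE)
DF.S <- split(DF[-2], DF[[2]])          # one data frame per value of column 2
user <- DF[1, 2]                        # column 2 value in the first row
Ret <- lapply(DF.S, f)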
```

One problem remains:

`Ret` contains a list of zoo objects, one per user. I can access an element by typing `Ret$` followed by the user's name, but since the user changes, I need to make this dynamic. I tried to build a dynamic expression like:

`x <- paste("Ret$'", user, "'", sep = "")`

`plot(x)`

but could not get it to evaluate.
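For reference, a list element can be looked up by a name stored in a variable using `[[`, so no expression needs to be pasted together. The sketch below uses a small stand-in list with made-up names and values, not the real `Ret`.

```r
# Stand-in for `Ret`: one element per user (hypothetical names and values).
Ret <- list(alice = c(1.2, 1.5), bob = c(0.7, 0.9))
user <- "bob"        # a name held in a variable, as with `user <- DF[1, 2]`

x <- Ret[[user]]     # `[[` looks the element up by the string stored in `user`
print(x)

# `paste()` only builds a character string; it does not fetch the element.
s <- paste("Ret$'", user, "'", sep = "")
print(is.character(s))
```

With the real data, `plot(Ret[[user]])` would then plot the series for whichever user is stored in `user`.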


`read.zoo` has an `aggregate=` argument that takes a function used to aggregate values that share the same date within a series. Here we take the `mean` of the repeated days within a series, but you could use `sum` or any other function. (If the data came from a file, we would replace the `text = Lines` argument to `read.zoo` with something like `"myfile.dat"`.) Then we use `na.locf` to fill in the NAs, sum the rows, and use `na.omit` to remove any leading NAs, giving `zsum`. We then compute a grid `g` with a regular (daily) interval and a spline function `splfun`, and evaluate this function and its derivative on the grid, which, when converted to zoo, gives `zspl` and `zder`. Finally, we plot them.

```
Lines <- "Date TS Data
21/05/2014 TS1 0.95
17/04/2014 TS1 1.02
27/03/2014 TS1 0.90
30/01/2014 TS1 0.80
12/12/2013 TS1 0.70
18/09/2013 TS1 0.67
01/11/2012 TS1 0.71
01/11/2012 TS1 0.70
21/05/2014 TS2 0.47
20/05/2014 TS2 0.51
16/05/2014 TS2 0.49
15/05/2014 TS2 0.55
10/05/2014 TS2 0.63
07/05/2014 TS2 0.77"
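library(zoo)

# One column per series; duplicate dates within a series are averaged.
z <- read.zoo(text = Lines, header = TRUE, split = 2, format = "%d/%m/%Y",
              aggregate = mean)

# Carry the last observation forward, sum across series, drop leading NA rows.
zsum <- na.omit(zoo(rowSums(na.locf(z)), time(z)))

# Daily grid from the first to the last date of the summed series.
g <- seq(start(zsum), end(zsum), "day")

# Interpolating spline, evaluated with its first derivative on the grid.
splfun <- splinefun(time(zsum), coredata(zsum))
zspl <- zoo(splfun(g), g)
zder <- zoo(splfun(g, deriv = 1), g)
plot(merge(zspl, zder))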
```
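The code above uses an interpolating spline rather than the polynomial mentioned in the question. If a polynomial is really wanted, one possible base R sketch is below. The degree of 2 is an arbitrary assumption, `poly_deriv` is a made-up helper name, and the input is just the last six summed values of the example, worked out by hand with TS1 carried forward.

```r
# Summed values for the last six dates of the example (TS1 carried forward).
dates <- as.Date(c("2014-05-07", "2014-05-10", "2014-05-15",
                   "2014-05-16", "2014-05-20", "2014-05-21"))
total <- c(1.79, 1.65, 1.57, 1.51, 1.53, 1.42)

# Days since the first observation keep the predictor small.
x <- as.numeric(dates - dates[1])

degree <- 2  # assumption: choose a degree that suits the data
fit <- lm(total ~ poly(x, degree, raw = TRUE))
b <- unname(coef(fit))  # b[1] + b[2] * x + b[3] * x^2

# Hypothetical helper: derivative of the fitted polynomial,
# sum over k of k * b[k + 1] * x^(k - 1).
poly_deriv <- function(x0) {
  k <- seq_len(degree)
  sapply(x0, function(v) sum(k * b[k + 1] * v^(k - 1)))
}

rate_last <- poly_deriv(max(x))  # rate of change per day on the last date
print(rate_last)
```

Unlike the spline, a low-degree polynomial fitted by least squares does not pass through every point, so its derivative is smoother but follows the data less closely.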
