Calculating tidal ranges

I have a data frame that contains the following tide information. I am trying to write a function that takes four parameters (low.max, hi.max, hi.earliest, hi.latest). For example: show me all days where the low is 2 feet or less, the high is 6 feet or less, and the high occurs between 10:00 and 16:00. Right now I'm looping through the rows to do this (I have the hi.max / low.max part sort of working that way), but I'm new to R and I guess there is a more R-like approach.

  date      day  time       ft      cm     H/L
2013/01/01  Tue 07:03 AM    8.1     247     H
2013/01/01  Tue 12:49 PM    5.1     155     L
2013/01/01  Tue 05:30 PM    5.7     174     H
2013/01/02  Wed 12:03 AM    0.5     15      L
2013/01/02  Wed 07:33 AM    8.1     247     H
2013/01/02  Wed 01:40 PM    4.4     134     L
2013/01/02  Wed 06:32 PM    5.3     162     H
2013/01/03  Thu 12:42 AM    1.4     43      L
2013/01/03  Thu 08:03 AM    8.1     247     H
2013/01/03  Thu 02:33 PM    3.5     107     L
2013/01/03  Thu 07:46 PM    4.9     149     H

      

Adding dput() output:

structure(list(Date = structure(c(15706, 15706, 15706, 15707, 
15707, 15707, 15707, 15708, 15708, 15708), class = "Date"), Day = c("Tue", 
"Tue", "Tue", "Wed", "Wed", "Wed", "Wed", "Thu", "Thu", "Thu"
), Time = c("7:03 AM", "12:49 PM", "5:30 PM", "12:03 AM", "7:33 AM", 
"1:40 PM", "6:32 PM", "12:42 AM", "8:03 AM", "2:33 PM"), Pred.Ft. = c(8.1, 
5.1, 5.7, 0.5, 8.1, 4.4, 5.3, 1.4, 8.1, 3.5), Pred.cm. = c(247L, 
155L, 174L, 15L, 247L, 134L, 162L, 43L, 247L, 107L), High_Low = c("H", 
"L", "H", "L", "H", "L", "H", "L", "H", "L")), .Names = c("Date", 
"Day", "Time", "Pred.Ft.", "Pred.cm.", "High_Low"), row.names = c(NA, 
10L), class = "data.frame")

      

What I've tried so far for the hi / lo part, regardless of timing:

  tides <- read.csv("TideData.csv", stringsAsFactors = FALSE)

  # Stop one row early so that tides[i + 1, ] stays in bounds
  for (i in seq_len(nrow(tides) - 1)) {
    # && short-circuits and suits the scalar tests inside if()
    if (tides[i, 6] == "L" && tides[i, 4] <= low.max
        && tides[i + 1, 6] == "H" && tides[i + 1, 4] <= hi.max) {

      # write the matching rows out to a df

    }
  }

      

2 answers


Subsetting data is a very simple operation in R and is well described, e.g., in the manual An Introduction to R.

Assuming your data is named x, use the subset operator [ to specify the rows you want to keep:

x[x$Pred.Ft < 2, ]

        Date Day     Time Pred.Ft. Pred.cm. High_Low
4 2013-01-02 Wed 12:03 AM      0.5       15        L
8 2013-01-03 Thu 12:42 AM      1.4       43        L

      

Or just the high tides:

x[x$Pred.Ft > 6, ]

        Date Day    Time Pred.Ft. Pred.cm. High_Low
1 2013-01-01 Tue 7:03 AM      8.1      247        H
5 2013-01-02 Wed 7:33 AM      8.1      247        H
9 2013-01-03 Thu 8:03 AM      8.1      247        H

      



To combine logical conditions, use | for OR or & for AND. So, to get a set of both low and high tides in one step:

x[x$Pred.Ft > 6 | x$Pred.Ft < 2, ]


        Date Day     Time Pred.Ft. Pred.cm. High_Low
1 2013-01-01 Tue  7:03 AM      8.1      247        H
4 2013-01-02 Wed 12:03 AM      0.5       15        L
5 2013-01-02 Wed  7:33 AM      8.1      247        H
8 2013-01-03 Thu 12:42 AM      1.4       43        L
9 2013-01-03 Thu  8:03 AM      8.1      247        H
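
As a small illustration of combining conditions with &, the sketch below keeps only the low tides of 2 feet or less. It rebuilds a few rows of the question's data so that it runs on its own; the name lows is only for illustration.

```r
# A few rows from the question's dput() output, rebuilt so the example is self-contained
x <- data.frame(
  Date     = as.Date(c("2013-01-01", "2013-01-01", "2013-01-02", "2013-01-02")),
  Time     = c("7:03 AM", "12:49 PM", "12:03 AM", "7:33 AM"),
  Pred.Ft. = c(8.1, 5.1, 0.5, 8.1),
  High_Low = c("H", "L", "L", "H"),
  stringsAsFactors = FALSE
)

# Both conditions must hold: the row is a low tide AND it is 2 ft or less
lows <- x[x$High_Low == "L" & x$Pred.Ft. <= 2, ]
lows
```

Only the 12:03 AM low on 2013-01-02 passes both tests; the 5.1 ft low and the two highs are dropped.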

      


To get only the spring tides, try this. Since you know that each low is followed by a high, you can calculate the difference in tide levels with diff, and then return only the rows where the difference is above the threshold:

x$Tidediff <- c(NA, diff(x$Pred.Ft))
na.omit(x[x$Tidediff > 6, ])

        Date Day    Time Pred.Ft. Pred.cm. High_Low Tidediff
5 2013-01-02 Wed 7:33 AM      8.1      247        H      7.6
9 2013-01-03 Thu 8:03 AM      8.1      247        H      6.7
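
The diff approach relies on the rows strictly alternating between low and high. If that is not guaranteed, a variant that checks the High_Low column explicitly might look like the sketch below; the names prev_type, is_rise and big_rises are made up for this example.

```r
# Heights and tide types from the question's dput() output
x <- data.frame(
  Pred.Ft. = c(8.1, 5.1, 5.7, 0.5, 8.1, 4.4, 5.3, 1.4, 8.1, 3.5),
  High_Low = c("H", "L", "H", "L", "H", "L", "H", "L", "H", "L"),
  stringsAsFactors = FALSE
)

# Change in height relative to the previous reading (NA for the first row)
x$Tidediff <- c(NA, diff(x$Pred.Ft.))

# Tide type of the previous row (NA for the first row)
prev_type <- c(NA, head(x$High_Low, -1))

# TRUE only for a high tide whose previous row is a low tide
is_rise <- x$High_Low == "H" & prev_type == "L"

# which() drops the NA from the first row, so na.omit() is not needed
big_rises <- x[which(is_rise & x$Tidediff > 6), ]
big_rises
```

On this data the result is the same two rows (5 and 9) as above, but a row is only counted when it really is a high that follows a low.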

      


Use the function by to process together the records that share the same date value:

L.lt.2 <- by(tides, tides$Date, FUN = function(d) d[
                          d$High_Low == "L" & d$Pred.Ft. <= 2, "Date", drop = FALSE])
# %I (12-hour clock) is the hour code meant to be paired with %p, rather than %H
H.lt.6.b.4 <- by(tides, tides$Date, FUN = function(d) d[
               d$High_Low == "H" & d$Pred.Ft. <= 6 &
               as.POSIXct(d$Time, format = "%I:%M %p") <=
                   as.POSIXct("4:00 PM", format = "%I:%M %p"),
               "Date", drop = FALSE])
intersect(L.lt.2, H.lt.6.b.4)
#[[1]]
#character(0)

      



I didn't bother adding the extra time requirement (the earliest allowed time), since the data wasn't constructed to support testing it. I'll leave that as an "exercise", as it only involves adding one more logical vector to the [i, ...] selection. (It would have been better to build an example in which at least one date met the goal.)
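
For reference, one way to put all four parameters together is sketched below. The function name find_days and the helper clock_to_minutes are hypothetical, and the sketch assumes that a day qualifies when it has any qualifying low and any qualifying high on the same calendar date, and that times are strings such as "7:03 AM". Times are converted to minutes after midnight by hand so that the result does not depend on the locale's AM/PM strings.

```r
# The ten rows from the question's dput() output
x <- data.frame(
  Date     = as.Date(rep(c("2013-01-01", "2013-01-02", "2013-01-03"), times = c(3, 4, 3))),
  Time     = c("7:03 AM", "12:49 PM", "5:30 PM", "12:03 AM", "7:33 AM",
               "1:40 PM", "6:32 PM", "12:42 AM", "8:03 AM", "2:33 PM"),
  Pred.Ft. = c(8.1, 5.1, 5.7, 0.5, 8.1, 4.4, 5.3, 1.4, 8.1, 3.5),
  High_Low = c("H", "L", "H", "L", "H", "L", "H", "L", "H", "L"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "h:mm AM/PM" -> minutes after midnight
clock_to_minutes <- function(t) {
  hm   <- strsplit(sub(" (AM|PM)$", "", t), ":", fixed = TRUE)
  hour <- as.integer(vapply(hm, `[`, character(1), 1)) %% 12  # 12 AM and 12 PM both become 0
  mins <- as.integer(vapply(hm, `[`, character(1), 2))
  (hour + ifelse(grepl("PM$", t), 12, 0)) * 60 + mins
}

# Hypothetical function: all rows of the days that have a low <= low.max and
# a high <= hi.max occurring between hi.earliest and hi.latest (inclusive)
find_days <- function(tides, low.max, hi.max, hi.earliest, hi.latest) {
  mins   <- clock_to_minutes(tides$Time)
  low_ok <- tides$High_Low == "L" & tides$Pred.Ft. <= low.max
  hi_ok  <- tides$High_Low == "H" & tides$Pred.Ft. <= hi.max &
            mins >= clock_to_minutes(hi.earliest) &
            mins <= clock_to_minutes(hi.latest)
  # Compare dates as character strings so intersect() behaves the same in every R version
  good <- intersect(as.character(tides$Date[low_ok]),
                    as.character(tides$Date[hi_ok]))
  tides[as.character(tides$Date) %in% good, ]
}

find_days(x, low.max = 2, hi.max = 6, hi.earliest = "10:00 AM", hi.latest = "4:00 PM")
find_days(x, low.max = 2, hi.max = 6, hi.earliest = "10:00 AM", hi.latest = "7:00 PM")
```

With the thresholds from the question, no day in this sample qualifies. Widening the window to 7:00 PM returns the four rows for 2013-01-02, which has a 0.5 ft low and a 5.3 ft high at 6:32 PM.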
