Plot temporal coverage of tags with ggplot

I am trying to reproduce a plot that shows the temporal coverage of a group of electronic tags, but have had little success. I have attached a simple example of the kind of plot I want to produce, along with the data behind it. Any help generating this plot using ggplot would be greatly appreciated.

Note that in the graph I don't care about the year; I just want to visualize the days and months during which each tag was recording data. Also note that for tags like 4120, which was deployed late in the year (September) and kept recording data into the beginning of the next year (April), the bar continues to the end of the year, and then a second bar starts at January and shows the rest of the tag's record.

dat <- structure(list(Tag_Num = c(44386L, 44387L, 44388L, 44390L, 52236L, 
52237L, 52238L, 60639L, 60641L, 61921L, 61925L, 61932L, 61936L, 
61938L, 61940L, 61957L, 63975L, 63977L, 87565L, 100949L), Deploy = structure(c(1L, 
3L, 2L, 9L, 5L, 7L, 14L, 6L, 4L, 13L, 15L, 20L, 10L, 12L, 8L, 
19L, 16L, 11L, 18L, 17L), .Label = c("5/4/2004", "5/5/2004", 
"5/6/2004", "6/22/2011", "6/24/2005", "6/24/2011", "6/26/2005", 
"6/30/2006", "7/3/2004", "9/1/2006", "9/10/2007", "9/11/2007", 
"9/12/2006", "9/15/2007", "9/21/2006", "9/22/2006", "9/24/2010", 
"9/6/2008", "9/7/2006", "9/9/2006"), class = "factor"), Recover = structure(c(14L, 
14L, 14L, 2L, 18L, 17L, 3L, 16L, 15L, 7L, 4L, 12L, 9L, 6L, 13L, 
8L, 5L, 11L, 1L, 10L), .Label = c("12/20/2008", "12/31/2004", 
"3/14/2008", "3/21/2007", "4/18/2007", "5/12/2008", "5/15/2007", 
"5/16/2007", "5/21/2007", "5/22/2011", "5/8/2008", "5/9/2007", 
"7/26/2006", "9/10/2004", "9/20/2011", "9/22/2011", "9/25/2005", 
"9/8/2005"), class = "factor")), .Names = c("Tag_Num", "Deploy", 
"Recover"), class = "data.frame", row.names = c(NA, -20L))

      

This figure no longer matches the specified dataset, but still gives an example of what I am trying to accomplish.

enter image description here



1 answer


I found a solution, although I ended up relying on Julian dates to get this to work. I relied heavily on the lubridate, dplyr and ggplot2 packages.

I spent a while thinking about what the dataset should look like. With only five rows, you could easily add the second row for tag 4120 by hand. Here's a way to do it for the entire dataset using do() from dplyr.

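library(dplyr)
library(lubridate)

# Split any deployment that crosses a year boundary into two rows:
# deploy date to Dec 31, then Jan 1 to the recovery date.
# Deploy/Recover are factors in dat, so convert to character before combining.
dat2 = dat %>%
    group_by(Tag_Num) %>%
    do(if (year(mdy(.$Deploy)) != year(mdy(.$Recover))) {
        data.frame(Deploy = c(as.character(.$Deploy), paste("1/1", year(mdy(.$Recover)), sep = "/")),
                   Recover = c(paste("12/31", year(mdy(.$Deploy)), sep = "/"), as.character(.$Recover)),
                   stringsAsFactors = FALSE)
    } else {
        data.frame(Deploy = as.character(.$Deploy), Recover = as.character(.$Recover),
                   stringsAsFactors = FALSE)
    })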

      

The dataset now looks like this:

  Tag_Num    Deploy    Recover
1    4001  1/1/2014   9/1/2014
2    4120  9/1/2013 12/31/2013
3    4120  1/1/2014  4/20/2014
4    4356  1/1/2011  6/29/2011
5    4665 3/15/2010 10/17/2010
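The year-splitting step can also be sketched in base R for a single deployment, which may make it easier to see what each group produces. This is only an illustration: split_at_year() is a hypothetical helper (not part of dplyr or lubridate), it assumes dates in m/d/Y format with the deploy date on or before the recovery date, and unlike the do() call above it emits one row per calendar year touched.

```r
# Hypothetical helper: split one deployment into one row per calendar year.
# Assumes m/d/Y strings and deploy <= recover.
split_at_year <- function(deploy, recover) {
    deploy  <- as.Date(as.character(deploy),  format = "%m/%d/%Y")
    recover <- as.Date(as.character(recover), format = "%m/%d/%Y")
    years <- seq(as.integer(format(deploy, "%Y")),
                 as.integer(format(recover, "%Y")))
    # Start with full calendar years, then trim the first and last row
    starts <- as.Date(paste0(years, "-01-01"))
    ends   <- as.Date(paste0(years, "-12-31"))
    starts[1] <- deploy
    ends[length(ends)] <- recover
    data.frame(Deploy = starts, Recover = ends)
}

# Tag 4120 from the example: deployed in September, recovered in April
split_at_year("9/1/2013", "4/20/2014")
```

A deployment that stays within one year comes back as a single unchanged row.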

      

I converted the Deploy and Recover dates to Julian day (day of year) for the actual plotting. I also extracted the deployment year, so you could technically do something like color by year in the plot.

dat2 = dat2 %>% ungroup %>% 
    mutate(year = year(mdy(Deploy)), JDeploy = yday(mdy(Deploy)), 
          JRecover = yday(mdy(Recover)), Tag_Num = factor(Tag_Num))

      



  Tag_Num    Deploy    Recover year JDeploy JRecover
1    4001  1/1/2014   9/1/2014 2014       1      244
2    4120  9/1/2013 12/31/2013 2013     244      365
3    4120  1/1/2014  4/20/2014 2014       1      110
4    4356  1/1/2011  6/29/2011 2011       1      180
5    4665 3/15/2010 10/17/2010 2010      74      290
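As a quick check on the day-of-year values above, here is a minimal base R sketch of the same conversion without lubridate. day_of_year() is a hypothetical helper introduced only for illustration, and it assumes m/d/Y strings. One thing it makes visible is that leap years shift every day after February by one, so December 31 is day 366 in 2004 but day 365 in 2013.

```r
# Hypothetical helper: day of year from an m/d/Y string, base R only.
day_of_year <- function(x) {
    d <- as.Date(as.character(x), format = "%m/%d/%Y")
    as.POSIXlt(d)$yday + 1L  # POSIXlt's yday is zero-based
}

day_of_year(c("9/1/2013", "4/20/2014", "12/31/2013", "12/31/2004"))
# → 244 110 365 366
```

Because the plot ignores the year, bars from leap and non-leap years can be offset by one day relative to each other, which is unlikely to be visible at this scale.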

      

To put months on the x-axis instead of the Julian day, I calculated the approximate Julian day of the middle of each month to use as the axis breaks. This seems a bit hacky to me, but I wasn't sure how else to define the breaks.

# Breaks in Julian day that fall at roughly the middle of each month
xbreaks = yday(paste(2013, 1:12, c(15, 14, rep(15, 10)), sep = "-"))
# Alternative: breaks at the start of each month rather than midmonth
xbreaks2 = yday(paste(2013, 1:12, 1, sep = "-"))
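If the dummy year in the breaks feels hacky, one possible alternative is to build the month-start breaks directly from the month lengths. This is just a sketch, assuming a non-leap year; days_in_month and month_starts are names made up for this example.

```r
# Month lengths for a non-leap year (assumption: leap days are ignored)
days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

# Day of year on which each month starts: 1, then running totals
month_starts <- cumsum(c(1, head(days_in_month, -1)))
names(month_starts) <- month.abb

month_starts
```

These values should match xbreaks2 for 2013, and month.abb could serve as the axis labels.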

      

Then plot it using ggplot2. The plot relies on as.numeric of the factor Tag_Num for the y positions used in geom_segment. The y-axis labels are then set from the levels of Tag_Num. You can change the y-axis order by changing the order of the levels of Tag_Num in the dataset.

EDIT

With more tags (as in the updated dataset in the OP), the default numeric breaks along the y-axis no longer correspond to each unique tag. You can fix this by setting breaks in scale_y_continuous.

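library(ggplot2)

# Tag_Num is a factor; geom_segment needs numeric y values, so use its integer codes
ggplot(dat2, aes(x = JDeploy, xend = JRecover, y = as.numeric(Tag_Num), yend = as.numeric(Tag_Num))) +
    geom_segment(size = 5) +  # with ggplot2 >= 3.4.0, use linewidth = 5 instead
    # One break per tag, labelled from the factor levels
    scale_y_continuous(breaks = seq_along(levels(dat2$Tag_Num)), labels = paste("Tag", levels(dat2$Tag_Num))) + 
    ylab(NULL) + 
    xlab(NULL) +
    scale_x_continuous(breaks = xbreaks2, labels = format(ISOdate(2004,1:12,1),"%b"))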
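Changing the y-axis order by reordering the levels of Tag_Num might look something like the sketch below. It uses a small toy data frame built from the five example rows shown earlier rather than the real dat2, and ordering by total days covered is an arbitrary choice made only for illustration.

```r
# Toy data from the five example rows (not the full dataset)
toy <- data.frame(
    Tag_Num  = factor(c(4001, 4120, 4120, 4356, 4665)),
    JDeploy  = c(1, 244, 1, 1, 74),
    JRecover = c(244, 365, 110, 180, 290)
)

# Total days covered per tag, summed over split rows
days_covered <- tapply(toy$JRecover - toy$JDeploy, toy$Tag_Num, sum)

# Reorder the factor levels from least to most coverage
toy$Tag_Num <- factor(toy$Tag_Num, levels = names(sort(days_covered)))

levels(toy$Tag_Num)
# → "4356" "4665" "4120" "4001"
```

Since the plot maps as.numeric(Tag_Num) to y and takes its labels from the levels, the bars and labels should follow the new order together.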

      
