How to denote a specific range of values in R?
Data
My original data frame contains information about lane changes by various drivers. Each driver changes lanes multiple times. I have created a column lane.change that contains yes in the row where the vehicle changes lanes. Below is an example data frame that contains 2 lane changes for one driver:
x <- structure(list(file.ID = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), class = "factor", .Label = "Car1"), frames = 1:11,
lane.change = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L,
1L, 1L, 1L), .Label = c("no", "yes"), class = "factor"),
y.m = c(80, 80, 80, 81, 82, 82, 82, 83, 84, 84, 84)), row.names = c(NA,
-11L), class = "data.frame", .Names = c("file.ID", "frames",
"lane.change", "y.m"))
Lane change plot:
The ranges LC1 and LC2 show the extent of each lane change in this data.
What I want to do:
I want to label the range of values shown in the graph, which represents the total duration of a lane change. So my desired output is:
Desired output:
> x
file.ID frames lane.change range_LC y.m
1 Car1 1 no . 80
2 Car1 2 no . 80
3 Car1 3 no LC1 80
4 Car1 4 yes LC1 81
5 Car1 5 no LC1 82
6 Car1 6 no . 82
7 Car1 7 no LC2 82
8 Car1 8 yes LC2 83
9 Car1 9 no LC2 84
10 Car1 10 no . 84
11 Car1 11 no . 84
What I have tried and problems:
I know that I can get the corresponding frames using x[which(x$lane.change == "yes"), "frames"]. But the goal is to label the rows immediately before and after each lane change as well, and I am stuck on how to do this. I also want to apply it to all drivers in the original data, each of whom has a different number of lane changes (>= 2). Please tell me which function to use. I would prefer to use dplyr and purrr. Thanks in advance.
First, I will define some helper functions.
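library(dplyr)  # lag(), lead(), first(), and last() come from dplyr

is_changing <- function(x) {
  # TRUE where a value differs from its predecessor or its successor
  x != lag(x, default = first(x)) | x != lead(x, default = last(x))
}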
This function checks whether each value in the vector differs from the value immediately before or after it, so it flags both sides of every change (whether an increase or a decrease).
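As a quick illustration of that behaviour, here is a minimal sketch on a made-up five-element vector (the vector `v` is hypothetical and not taken from the question's data). The helper is restated so the snippet runs on its own.

```r
library(dplyr)

# Restated helper: TRUE where a value differs from a neighbour
is_changing <- function(x) {
  x != lag(x, default = first(x)) | x != lead(x, default = last(x))
}

# Hypothetical input: one "yes" surrounded by "no"
v <- c("no", "no", "yes", "no", "no")
flags <- is_changing(v)
print(flags)
# FALSE  TRUE  TRUE  TRUE FALSE
```

The `default` arguments pad each end with the vector's own first and last value, so the endpoints are never flagged merely for lacking a neighbour.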
The following function takes a vector of TRUE/FALSE values and assigns a new index to each run of TRUE values.
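true_run_index <- function(x) {
  r <- rle(x)
  v <- r$values
  v[v] <- seq_len(sum(v))  # number the TRUE runs 1, 2, ...
  v[v == 0] <- NA          # FALSE runs became 0; mark them as NA
  rep(v, r$lengths)        # expand back to the original length
}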
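To make the run-length step concrete, the following sketch applies the function to a small hypothetical logical vector (`flags` is made up for illustration). It uses only base R, and the helper is restated so the snippet is self-contained.

```r
# Restated helper: give each run of TRUE values its own index
true_run_index <- function(x) {
  r <- rle(x)                # runs of identical values
  v <- r$values              # one TRUE/FALSE per run
  v[v] <- seq_len(sum(v))    # TRUE runs become 1, 2, ...; FALSE runs become 0
  v[v == 0] <- NA            # FALSE runs are marked NA
  rep(v, r$lengths)          # expand each run back to its length
}

# Hypothetical input with two separate TRUE runs
flags <- c(FALSE, TRUE, TRUE, FALSE, TRUE)
idx <- true_run_index(flags)
print(idx)
# NA  1  1 NA  2
```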
Then we can apply these to your sample data:
# Either column identifies the same ranges in this sample
x %>% mutate(LC = true_run_index(is_changing(lane.change)))
x %>% mutate(LC = true_run_index(is_changing(y.m)))
# file.ID frames lane.change y.m LC
# 1 Car1 1 no 80 NA
# 2 Car1 2 no 80 NA
# 3 Car1 3 no 80 1
# 4 Car1 4 yes 81 1
# 5 Car1 5 no 82 1
# 6 Car1 6 no 82 NA
# 7 Car1 7 no 82 2
# 8 Car1 8 yes 83 2
# 9 Car1 9 no 84 2
# 10 Car1 10 no 84 NA
# 11 Car1 11 no 84 NA
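The question also asks for this to work per driver and for labels such as LC1 instead of a bare index. One possible sketch, assuming the data has one row per frame and is sorted by frame within each file.ID: group by driver before calling the helpers, then build the label with paste0(). The two-driver data frame `drivers` below is hypothetical.

```r
library(dplyr)

is_changing <- function(x) {
  x != lag(x, default = first(x)) | x != lead(x, default = last(x))
}

true_run_index <- function(x) {
  r <- rle(x)
  v <- r$values
  v[v] <- seq_len(sum(v))
  v[v == 0] <- NA
  rep(v, r$lengths)
}

# Hypothetical data: two drivers with one lane change each
drivers <- data.frame(
  file.ID = rep(c("Car1", "Car2"), each = 7),
  frames = rep(1:7, times = 2),
  lane.change = c("no", "no", "yes", "no", "no", "no", "no",
                  "no", "no", "no", "no", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

labelled <- drivers %>%
  group_by(file.ID) %>%   # numbering restarts for every driver
  mutate(
    LC = true_run_index(is_changing(lane.change)),
    range_LC = if_else(is.na(LC), ".", paste0("LC", LC))
  ) %>%
  ungroup()

print(labelled)
```

Note that two lane changes only one frame apart would produce overlapping neighbourhoods, which this approach merges into a single range.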
A solution using functions from dplyr and data.table. x4 is the final result.
library(dplyr)
library(data.table)

x2 <- x %>%
  # in this data "yes" runs get even run ids, so halving gives whole numbers
  mutate(LC_ID = rleid(lane.change) / 2) %>%
  mutate(LC_ID2 = ifelse(LC_ID %% 1 == 0, paste0("LC", LC_ID), NA)) %>%
  # copy each label onto the row just after (lag) and just before (lead)
  mutate(LC_ID3 = lag(LC_ID2), LC_ID4 = lead(LC_ID2))
x3 <- mutate(x2, range_LC = coalesce(LC_ID2, LC_ID3, LC_ID4, "."))
x4 <- x3 %>% select(file.ID, frames, lane.change, range_LC, y.m)
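The division by 2 can look opaque, so here is a small sketch of the intermediate values, using the lane.change pattern from the sample data as a plain character vector. It relies on two assumptions that hold for the sample but may not hold in general: the series starts with "no", and each lane change is a single "yes" row.

```r
library(data.table)

# lane.change values from the sample data, as a character vector
lane.change <- c("no", "no", "no", "yes", "no", "no",
                 "no", "yes", "no", "no", "no")

# rleid() numbers consecutive runs of equal values
run_id <- rleid(lane.change)
print(run_id)
# 1 1 1 2 3 3 3 4 5 5 5

# Runs alternate no/yes/no/..., so "yes" runs have even ids
# and their halves are whole numbers: 1, 2, ...
half <- run_id / 2
yes_rows <- which(half %% 1 == 0)
print(yes_rows)
# 4 8
```

If a driver's data began with "yes", the parity would flip and the "no" runs would receive the whole numbers instead, so that case would need separate handling.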