Set the relative minimum value of a column to a single column data frame R
I am working with a dataframe in R and I would like to find the min value for each date and then put that value in one column in the dataframe. The catch is that the min value for each date should only appear in the rows of that date. Take a look below and it will be clearer.
Date Last
1 2015-06-21 2106.25
2 2015-06-21 2105.25
3 2015-06-21 2105.75
4 2015-06-22 2106.75
5 2015-06-22 2107.00
6 2015-06-22 2106.75
7 2015-06-23 2117.25
8 2015-06-23 2117.25
9 2015-06-23 2117.00
Desired output:
Date Last Min
1 2015-06-21 2106.25 2105.25
2 2015-06-21 2105.25 2105.25
3 2015-06-21 2105.75 2105.25
4 2015-06-22 2106.75 2106.75
5 2015-06-22 2107.00 2106.75
6 2015-06-22 2106.75 2106.75
7 2015-06-23 2117.25 2117.00
8 2015-06-23 2117.25 2117.00
9 2015-06-23 2117.00 2117.00
To extract min from each date, I use:
MinVal <- sapply(split(data, data$Date), function(x) min(x$Last))
The output is fine:
2015-06-21 2015-06-22 2015-06-23
2105.25 2106.75 2117.00
Now how do I get this back into the dataframe in a single column named Min? Using this sapply creates a matrix that I don't want:
data$Min <- sapply(names(MinVal), function(i) ifelse(data$Date == i, MinVal[i], NA))
This command will create the original dataframe:
data <- structure(list(Date = structure(c(16607, 16607, 16607, 16608,
16608, 16608, 16609, 16609, 16609), class = "Date"), Last = c(2106.25,
2105.25, 2105.75, 2106.75, 2107, 2106.75, 2117.25, 2117.25, 2117
)), .Names = c("Date", "Last"), class = "data.frame", row.names = c(NA,
9L))
Thank!
source to share
Consider using a package data.table
for such operations. Here's an example:
library(data.table)
datadt <- data.table(data)
datadt[,Min:=min(Last),by=Date]
datadt
This leads to the desired result:
Date Last Min
1: 2015-06-21 2106.25 2105.25
2: 2015-06-21 2105.25 2105.25
3: 2015-06-21 2105.75 2105.25
4: 2015-06-22 2106.75 2106.75
5: 2015-06-22 2107.00 2106.75
6: 2015-06-22 2106.75 2106.75
7: 2015-06-23 2117.25 2117.00
8: 2015-06-23 2117.25 2117.00
9: 2015-06-23 2117.00 2117.00
The :=
in operator data.table
allows you to create columns on an existing data table. This argument can make the by
group unique.
source to share
One way dplyr
::
data %>%
group_by(Date) %>% #group
mutate(Min = min(Last)) #add min Last per group
Output:
Source: local data frame [9 x 3]
Groups: Date
Date Last Min
1 2015-06-21 2106.25 2105.25
2 2015-06-21 2105.25 2105.25
3 2015-06-21 2105.75 2105.25
4 2015-06-22 2106.75 2106.75
5 2015-06-22 2107.00 2106.75
6 2015-06-22 2106.75 2106.75
7 2015-06-23 2117.25 2117.00
8 2015-06-23 2117.25 2117.00
9 2015-06-23 2117.00 2117.00
source to share
Here's another solution:
cbind(data[order(data$Date),],Min=min(data$Last))
Output:
Date Last Min
1 2015-06-21 2106.25 2105.25
2 2015-06-21 2105.25 2105.25
3 2015-06-21 2105.75 2105.25
4 2015-06-22 2106.75 2105.25
5 2015-06-22 2107.00 2105.25
6 2015-06-22 2106.75 2105.25
7 2015-06-23 2117.25 2105.25
8 2015-06-23 2117.25 2105.25
9 2015-06-23 2117.00 2105.25
source to share