How can I reorder the loaded time series data?
I have data in a txt file (mydata.txt) something like this:
Variable, DateTime, Value, Quality
A, 01-01-1970 00:00:00, 0, 0
A, 01-01-1970 00:02:00, 2, 2
A, 01-01-1970 00:04:00, 4, 1
A, 01-01-1970 00:06:00, 6, 0
B, 01-01-1970 00:02:00, 0.2, 0
B, 01-01-1970 00:04:00, 0.4, 1
B, 01-01-1970 00:06:00, 0.6, 1
B, 01-01-1970 00:10:00, 1.0, 0
C, 01-01-1970 00:00:00, 20.0, 0
C, 01-01-1970 00:04:00, 16.0, 0
C, 01-01-1970 00:08:00, 12.0, 3
And I can load it into R without issue with
read.csv("mydata.txt", header = TRUE, sep = ",")
or
read.table("mydata.txt", header = TRUE, sep = ",")
But , what I would like to work with r would be something like this:
DateTime, A_Value, A_Quality, B_Value, B_Quality, C_Value, C_Quality
01-01-1970 00:00:00, 0, 0, NA, NA, 20.0, 0
01-01-1970 00:02:00, 2, 2, 0.2, 0, NA, NA
01-01-1970 00:04:00, 4, 1, 0.4, 1, 16.0, 0
01-01-1970 00:06:00, 6, 0, 0.6, 1, NA, NA
01-01-1970 00:08:00, NA, NA, NA, NA, 12.0, 3
01-01-1970 00:10:00, NA, NA, 1.0, 0, NA, NA
(where the first column is the date / time type).
I don't know what or how many different variables are in my file (i.e. A, B, ... Z) and I don't know their names - all I know is their column.
How do I get from a dataset that I have in a text file into a dataset that I would like to work with in R?
Thanks in advance!
source to share
Read your data as usual:
mydata <- read.table("mydata.txt", header = TRUE, sep = ",")
Then "reformat" it from the so-called "long" format to "wide" format using one of several methods.
It's just 1 line in the R base with reshape
:
reshape(mydata, direction = "wide", idvar = "DateTime", timevar = "Variable")
# DateTime Value.A Quality.A Value.B Quality.B Value.C Quality.C
# 1 01-01-1970 00:00:00 0 0 NA NA 20 0
# 2 01-01-1970 00:02:00 2 2 0.2 0 NA NA
# 3 01-01-1970 00:04:00 4 1 0.4 1 16 0
# 4 01-01-1970 00:06:00 6 0 0.6 1 NA NA
# 8 01-01-1970 00:10:00 NA NA 1.0 0 NA NA
# 11 01-01-1970 00:08:00 NA NA NA NA 12 3
source to share
You can do this using the package reshape2
:
First step: melt
yourdata.frame
require(reshape2)
df.m <- melt(df, id.var = 1:2) # changed names(df)[1:2] to 1:2 (following @Anandamahto comment)
Second step: cast
to the result:
dcast(df.m, DateTime ~ Variable + variable, fill=NA)
# DateTime A_Value A_Quality B_Value B_Quality C_Value C_Quality
# 1 01-01-1970 00:00:00 0 0 NA NA 20 0
# 2 01-01-1970 00:02:00 2 2 0.2 0 NA NA
# 3 01-01-1970 00:04:00 4 1 0.4 1 16 0
# 4 01-01-1970 00:06:00 6 0 0.6 1 NA NA
# 5 01-01-1970 00:08:00 NA NA NA NA 12 3
# 6 01-01-1970 00:10:00 NA NA 1.0 0 NA NA
source to share