R: Converting from character to POSIXct loses hours and minutes
Good morning,
I am trying to convert from character to POSIXct, but when I do, I lose hours and minutes from the data.
hourlyData (dataframe)
Login Expo EquityUSD Period UnrealizedProfitUSD
1 252957 0.00 7.187185 2014-02-03 00:00:00.000 0.00000
2 252957 0.00 7.187772 2014-02-03 01:00:00.000 0.00000
3 252957 0.00 7.188198 2014-02-03 02:00:00.000 0.00000
4 252957 0.00 7.187825 2014-02-03 03:00:00.000 0.00000
5 252957 0.00 7.187079 2014-02-03 04:00:00.000 0.00000
6 252957 0.00 7.187079 2014-02-03 05:00:00.000 0.00000
7 252957 0.00 7.188731 2014-02-03 06:00:00.000 0.00000
8 252957 0.00 7.186279 2014-02-03 07:00:00.000 0.00000
9 252957 0.00 7.187185 2014-02-03 08:00:00.000 0.00000
When I check class(hourlyData$Period), I get "character". However, when I convert this column with hourlyData$Period = as.POSIXct(hourlyData$Period), I get the following output:
hourlyData
Login Expo EquityUSD Period UnrealizedProfitUSD
1 252957 0.00 7.187185 2014-02-03 0.00000
2 252957 0.00 7.187772 2014-02-03 0.00000
3 252957 0.00 7.188198 2014-02-03 0.00000
4 252957 0.00 7.187825 2014-02-03 0.00000
5 252957 0.00 7.187079 2014-02-03 0.00000
6 252957 0.00 7.187079 2014-02-03 0.00000
7 252957 0.00 7.188731 2014-02-03 0.00000
8 252957 0.00 7.186279 2014-02-03 0.00000
9 252957 0.00 7.187185 2014-02-03 0.00000
The hours and minutes have been removed from the Period column. Does anyone know why this is happening, or how to prevent it?
Thanks,
Mike
The other answers hint at the problem, but don't really address it. as.POSIXct(...) behaves strangely when it is passed a character vector that contains invalid times: instead of returning NA for just the elements with invalid times, it removes the time portion from all elements.
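One way to see why a single bad element affects the whole vector is to test the candidate formats by hand. The sketch below assumes (as I understand it for R versions of that era, not verified for every release) that as.POSIXct() without a format argument tries a short list of standard formats and keeps the first one that parses every non-NA element. The candidates vector and the tz = "UTC" choice are mine, for illustration only.

```r
# Same test data as in the example further down: the last element has hour 25.
x <- sprintf("%02d:00:00", 20:25)
y <- sprintf("%s %s", "2018-01-01", x)

# Illustrative subset of the standard formats (assumed, not the full list).
candidates <- c("%Y-%m-%d %H:%M:%OS", "%Y-%m-%d %H:%M", "%Y-%m-%d")

# For each candidate, does strptime() parse every element of y?
all_parse <- vapply(
  candidates,
  function(f) !any(is.na(strptime(y, f, tz = "UTC"))),
  logical(1)
)
all_parse
```

Only the date-only format parses all six strings, since strptime() ignores trailing characters it does not need. That would explain the time portion disappearing everywhere.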
You can "fix" this by passing the format argument explicitly, even if you are using the standard format (see the last as.POSIXct() call in the example below).
x <- sprintf('%02d:00:00', 20:25) # 25:00:00 is not a valid time spec.
y <- sprintf('%s %s', '2018-01-01',x) # last element has invalid time
as.POSIXct(head(y,-1)) # works fine
## [1] "2018-01-01 20:00:00 HST" "2018-01-01 21:00:00 HST" "2018-01-01 22:00:00 HST" "2018-01-01 23:00:00 HST" "2018-01-02 00:00:00 HST"
as.POSIXct(y) # fails miserably
## [1] "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST"
as.POSIXct(y, tz='UTC') # tz does not fix this...
## [1] "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC"
as.POSIXct(y, format='%Y-%m-%d %H:%M:%S') # but this does...
## [1] "2018-01-01 20:00:00 HST" "2018-01-01 21:00:00 HST" "2018-01-01 22:00:00 HST" "2018-01-01 23:00:00 HST" "2018-01-02 00:00:00 HST" NA
Running R 3.4.0 on Win 7 x64.
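Applied to the data frame in the question, the same idea might look like the sketch below. The three-row hourlyData here is a made-up miniature with one deliberately invalid hour, and tz = "UTC" is an assumption. %OS is used instead of %S so that the ".000" fractional seconds are read on input.

```r
# Hypothetical miniature of hourlyData; the third Period is deliberately invalid.
hourlyData <- data.frame(
  Login  = 252957,
  Period = c("2014-02-03 00:00:00.000",
             "2014-02-03 01:00:00.000",
             "2014-02-03 25:00:00.000"),
  stringsAsFactors = FALSE
)

# An explicit format makes invalid elements come back as NA
# instead of changing how the whole column is parsed.
parsed <- as.POSIXct(hourlyData$Period,
                     format = "%Y-%m-%d %H:%M:%OS",
                     tz = "UTC")

# Indices of strings that were present but failed to parse.
bad <- which(is.na(parsed) & !is.na(hourlyData$Period))
hourlyData$Period[bad]

# Replace the character column only after inspecting the failures.
hourlyData$Period <- parsed
format(hourlyData$Period, "%Y-%m-%d %H:%M:%S")
```

Looking at the rows listed in bad before overwriting the column shows which input strings caused the problem.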