Convert chr value to num from multiple columns?

I have this dataframe and I want to change the chr values to num:

> dput(Df)
structure(list(`@MeasurementDateGMT` = c("2016-09-01 00:00:00", 
"2016-09-01 01:00:00", "2016-09-01 02:00:00", "2016-09-01 03:00:00", 
"2016-09-01 04:00:00", "2016-09-01 05:00:00", "2016-09-01 06:00:00", 
"2016-09-01 07:00:00", "2016-09-01 08:00:00", "2016-09-01 09:00:00", 
"2016-09-01 10:00:00", "2016-09-01 11:00:00", "2016-09-01 12:00:00", 
"2016-09-01 13:00:00", "2016-09-01 14:00:00", "2016-09-01 15:00:00", 
"2016-09-01 16:00:00", "2016-09-01 17:00:00", "2016-09-01 18:00:00", 
"2016-09-01 19:00:00", "2016-09-01 20:00:00", "2016-09-01 21:00:00", 
"2016-09-01 22:00:00", "2016-09-01 23:00:00"), `@Value` = c("10.9", 
"9.8", "9.9", "14.1", "13.6", "16.5", "15", "18.5", "18", "17", 
"16.6", "12", "12.1", "18.1", "15.9", "15.9", "16.9", "21.6", 
"23.5", "40.7", "16.6", "12.7", "12.4", "12.2")), .Names = c("@MeasurementDateGMT", 
"@Value"), class = "data.frame", row.names = c(NA, 24L))

      

Conversion code:

columns <- sapply(Df, is.factor)
Df[, columns] <- lapply(Df[, columns, drop = FALSE], function(x) as.numeric(as.character(x)))

      

Result:

> str(Df)
'data.frame':   24 obs. of  2 variables:
 $ @MeasurementDateGMT: chr  "2016-09-01 00:00:00" "2016-09-01 01:00:00" "2016-09-01 02:00:00" "2016-09-01 03:00:00" ...
 $ @Value             : chr  "10.9" "9.8" "9.9" "14.1" ...

      

They are still chr. What did I miss? Any ideas?



2 answers


We can use type.convert.

Df[] <- lapply(Df, function(x) type.convert(x, as.is = TRUE))
str(Df)
#'data.frame':   24 obs. of  2 variables:
#$ @MeasurementDateGMT: chr  "2016-09-01 00:00:00" "2016-09-01 01:00:00" "2016-09-01 02:00:00" "2016-09-01 03:00:00" ...
#$ @Value             : num  10.9 9.8 9.9 14.1 13.6 16.5 15 18.5 18 17 ...

If we need to convert the 'datetime' column (the first column),

Df[[1]] <- as.POSIXct(Df[[1]], tz = "GMT")
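As a small illustration of the datetime conversion, the sketch below parses two timestamps copied from the question. The explicit format string and tz = "GMT" are assumptions based on the column name @MeasurementDateGMT; adjust them if the data was recorded in another zone.

```r
# Two timestamps copied from the question's data
x <- c("2016-09-01 00:00:00", "2016-09-01 01:00:00")

# Assumption: the values are GMT, as the column name suggests
dt <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "GMT")

# The result is a date-time vector, so time arithmetic works on it
class(dt)
gap_hours <- as.numeric(difftime(dt[2], dt[1], units = "hours"))
gap_hours
```

Passing the time zone explicitly avoids the strings being interpreted in the machine's local zone.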

      

Since the columns are all character in the OP's post, we don't need to convert them to character before applying type.convert. Otherwise (for example, with factor columns), use type.convert(as.character(x), ...
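To make the point about column types concrete, here is a minimal sketch on a two-row stand-in for the question's data (Demo is a hypothetical name, not part of the original post). It shows why selecting columns with is.factor changes nothing when the columns are character, and what type.convert does instead.

```r
# Two-row stand-in for the question's data (names kept, rows shortened)
Demo <- data.frame(
  `@MeasurementDateGMT` = c("2016-09-01 00:00:00", "2016-09-01 01:00:00"),
  `@Value` = c("10.9", "9.8"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# The question selects columns with is.factor(); both columns are character,
# so the selector is all FALSE and the assignment touches no column
is_fac <- sapply(Demo, is.factor)
any(is_fac)

# type.convert() inspects the values: numeric-looking strings are parsed,
# while the date strings stay character because as.is = TRUE
Demo[] <- lapply(Demo, type.convert, as.is = TRUE)
col_classes <- sapply(Demo, class)
col_classes
```

Using is.character as the selector together with as.numeric would also select the date column and turn it into NA, which is why type.convert is the safer choice here.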




OK, if we need dplyr to do this:

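library(dplyr)
# funs() is deprecated and mutate_all() is superseded in dplyr >= 1.0;
# across() is the current equivalent
res <- Df %>% 
         mutate(across(everything(), ~ type.convert(as.character(.x), as.is = TRUE)))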
str(res)
#'data.frame':   24 obs. of  2 variables:
#$ @MeasurementDateGMT: chr  "2016-09-01 00:00:00" "2016-09-01 01:00:00" "2016-09-01 02:00:00" "2016-09-01 03:00:00" ...
#$ @Value             : num  10.9 9.8 9.9 14.1 13.6 16.5 15 18.5 18 17 ...

      


Or another option is data.table:

library(data.table)
setDT(Df)[, lapply(.SD, function(x) type.convert(x, as.is = TRUE))]
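Note that the data.table call above returns a converted copy; the columns of Df itself stay character unless the result is assigned. A minimal sketch of an in-place variant using `:=`, on a two-row stand-in named DT (a hypothetical name), might look like this:

```r
library(data.table)

# Two-row stand-in for the question's data
DT <- data.table(
  `@MeasurementDateGMT` = c("2016-09-01 00:00:00", "2016-09-01 01:00:00"),
  `@Value` = c("10.9", "9.8")
)

# Update every column by reference; (cols) on the left-hand side means
# "the columns named in cols", not a new column called cols
cols <- names(DT)
DT[, (cols) := lapply(.SD, type.convert, as.is = TRUE), .SDcols = cols]

col_classes <- sapply(DT, class)
col_classes
```

Updating by reference avoids copying the table, which matters mostly for large data.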

      



You can use dplyr::mutate_if, which applies the function (in this case as.numeric) to all columns that satisfy the predicate function (in this case is.character).

library(dplyr)

Df %>% 
  janitor::clean_names() %>% # removes the "@" from names since that messes up mutate_if
  tibble::as_tibble() %>% # just for the nice printing
  mutate_if(is.character, as.numeric)

#> Warning in eval(substitute(expr), envir, enclos): NAs introduced by
#> coercion

#> # A tibble: 24 x 2
#>    x_measurementdategmt x_value
#>                   <dbl>   <dbl>
#>  1                   NA    10.9
#>  2                   NA     9.8
#>  3                   NA     9.9
#>  4                   NA    14.1
#>  5                   NA    13.6
#>  6                   NA    16.5
#>  7                   NA    15.0
#>  8                   NA    18.5
#>  9                   NA    18.0
#> 10                   NA    17.0
#> # ... with 14 more rows

      



But the above doesn't work well for the first column, as it is a datetime. It just gets set to NA by as.numeric because it contains non-numeric characters. You should probably convert it to a datetime variable instead.

Df %>% 
  janitor::clean_names() %>%
  tibble::as_tibble() %>% 
  mutate(x_measurementdategmt = lubridate::as_datetime(x_measurementdategmt)) %>% 
  mutate_if(is.character, as.numeric)
#> # A tibble: 24 x 2
#>    x_measurementdategmt x_value
#>                  <dttm>   <dbl>
#>  1  2016-09-01 04:00:00    10.9
#>  2  2016-09-01 05:00:00     9.8
#>  3  2016-09-01 06:00:00     9.9
#>  4  2016-09-01 07:00:00    14.1
#>  5  2016-09-01 08:00:00    13.6
#>  6  2016-09-01 09:00:00    16.5
#>  7  2016-09-01 10:00:00    15.0
#>  8  2016-09-01 11:00:00    18.5
#>  9  2016-09-01 12:00:00    18.0
#> 10  2016-09-01 13:00:00    17.0
#> # ... with 14 more rows

      
