Convert chr value to num from multiple columns?

Question

I have this dataframe and I want to change the chr values to num:

> dput(Df)
structure(list(`@MeasurementDateGMT` = c("2016-09-01 00:00:00", 
"2016-09-01 01:00:00", "2016-09-01 02:00:00", "2016-09-01 03:00:00", 
"2016-09-01 04:00:00", "2016-09-01 05:00:00", "2016-09-01 06:00:00", 
"2016-09-01 07:00:00", "2016-09-01 08:00:00", "2016-09-01 09:00:00", 
"2016-09-01 10:00:00", "2016-09-01 11:00:00", "2016-09-01 12:00:00", 
"2016-09-01 13:00:00", "2016-09-01 14:00:00", "2016-09-01 15:00:00", 
"2016-09-01 16:00:00", "2016-09-01 17:00:00", "2016-09-01 18:00:00", 
"2016-09-01 19:00:00", "2016-09-01 20:00:00", "2016-09-01 21:00:00", 
"2016-09-01 22:00:00", "2016-09-01 23:00:00"), `@Value` = c("10.9", 
"9.8", "9.9", "14.1", "13.6", "16.5", "15", "18.5", "18", "17", 
"16.6", "12", "12.1", "18.1", "15.9", "15.9", "16.9", "21.6", 
"23.5", "40.7", "16.6", "12.7", "12.4", "12.2")), .Names = c("@MeasurementDateGMT", 
"@Value"), class = "data.frame", row.names = c(NA, 24L))

Conversion code:

columns <- sapply(Df, is.factor)
Df[, columns] <- lapply(Df[, columns, drop = FALSE], function(x) as.numeric(as.character(x)))

Result:

> str(Df)
'data.frame':   24 obs. of  2 variables:
 $ @MeasurementDateGMT: chr  "2016-09-01 00:00:00" "2016-09-01 01:00:00" "2016-09-01 02:00:00" "2016-09-01 03:00:00" ...
 $ @Value             : chr  "10.9" "9.8" "9.9" "14.1" ...

They are still chr. What did I miss? Any ideas?

+3

char numbers r dataframe

laukok May 30 '17 at 4:11


2 answers

You can use dplyr::mutate_if, which applies the function (in this case as.numeric) to all columns that satisfy the predicate function (in this case is.character).

library(dplyr)

Df %>% 
  janitor::clean_names() %>% # removes the "@" from names since that messes up mutate_if
  tibble::as_tibble() %>% # just for the nice printing
  mutate_if(is.character, as.numeric)

#> Warning in eval(substitute(expr), envir, enclos): NAs introduced by
#> coercion

#> # A tibble: 24 x 2
#>    x_measurementdategmt x_value
#>                   <dbl>   <dbl>
#>  1                   NA    10.9
#>  2                   NA     9.8
#>  3                   NA     9.9
#>  4                   NA    14.1
#>  5                   NA    13.6
#>  6                   NA    16.5
#>  7                   NA    15.0
#>  8                   NA    18.5
#>  9                   NA    18.0
#> 10                   NA    17.0
#> # ... with 14 more rows

But the above doesn't work well for the first column, as it is a datetime. It just gets set to NA by as.numeric because it contains non-numeric characters. You should probably change it to a datetime variable instead.

Df %>% 
  janitor::clean_names() %>%
  tibble::as_tibble() %>% 
  mutate(x_measurementdategmt = lubridate::as_datetime(x_measurementdategmt)) %>% 
  mutate_if(is.character, as.numeric)
#> # A tibble: 24 x 2
#>    x_measurementdategmt x_value
#>                  <dttm>   <dbl>
#>  1  2016-09-01 04:00:00    10.9
#>  2  2016-09-01 05:00:00     9.8
#>  3  2016-09-01 06:00:00     9.9
#>  4  2016-09-01 07:00:00    14.1
#>  5  2016-09-01 08:00:00    13.6
#>  6  2016-09-01 09:00:00    16.5
#>  7  2016-09-01 10:00:00    15.0
#>  8  2016-09-01 11:00:00    18.5
#>  9  2016-09-01 12:00:00    18.0
#> 10  2016-09-01 13:00:00    17.0
#> # ... with 14 more rows
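
In dplyr 1.0.0 and later, mutate_if is superseded by across() combined with where(). The following is a minimal sketch of the same predicate-based conversion in that style. It uses a small made-up data frame (`toy`, with hypothetical column names `when` and `value`) rather than the OP's data, and it assumes the timestamps are in GMT.

```r
library(dplyr)

# Toy stand-in for the OP's data (hypothetical names and values)
toy <- data.frame(
  when  = c("2016-09-01 00:00:00", "2016-09-01 01:00:00"),
  value = c("10.9", "9.8"),
  stringsAsFactors = FALSE
)

out <- toy %>%
  # Convert the datetime column first so it is no longer character
  mutate(when = as.POSIXct(when, tz = "GMT")) %>%
  # Then convert every remaining character column to numeric
  mutate(across(where(is.character), as.numeric))

str(out)
```

Converting the datetime column first means the predicate no longer matches it, so no NAs are introduced by coercion.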

+1

austensen May 30 '17 at 4:22


akrun · Accepted Answer · 2017-05-30T04:14:44+0000

We can use type.convert.

Df[] <- lapply(Df, function(x) type.convert(x, as.is = TRUE))
str(Df)
#'data.frame':   24 obs. of  2 variables:
#$ @MeasurementDateGMT: chr  "2016-09-01 00:00:00" "2016-09-01 01:00:00" "2016-09-01 02:00:00" "2016-09-01 03:00:00" ...
#$ @Value             : num  10.9 9.8 9.9 14.1 13.6 16.5 15 18.5 18 17 ...
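
For context on why the code in the question left the columns unchanged: both columns are character, not factor, so sapply(Df, is.factor) returns FALSE everywhere and no column is selected. A minimal base R sketch on a small made-up data frame (`toy` and its column names are hypothetical, not the OP's data) shows the difference when selecting on is.character instead, converting only those columns that parse cleanly as numbers.

```r
# Toy data frame (hypothetical values) with character columns, as in the question
toy <- data.frame(
  when  = c("2016-09-01 00:00:00", "2016-09-01 01:00:00"),
  value = c("10.9", "9.8"),
  stringsAsFactors = FALSE
)

# The question's predicate: no column is a factor, so nothing gets converted
is_fac <- sapply(toy, is.factor)
print(is_fac)

# Select character columns instead, keeping only those that parse without NAs
is_chr <- sapply(toy, is.character)
parses <- sapply(toy[is_chr], function(x) !anyNA(suppressWarnings(as.numeric(x))))
num_cols <- names(parses)[parses]

toy[num_cols] <- lapply(toy[num_cols], as.numeric)
str(toy)
```

The anyNA check is one possible guard against turning the datetime column into NAs; type.convert with as.is = TRUE achieves a similar effect by leaving non-numeric strings as character.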

If we need to convert the 'datetime' column (the first column),

Df[[1]] <- as.POSIXct(Df[[1]])
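
One thing to watch: without a tz argument, as.POSIXct interprets the strings in the session's local time zone. The column name suggests the values are GMT, so a sketch that makes this explicit might look like the following. The GMT assumption and the `stamps` vector are illustrative, not taken from the answer.

```r
# Two sample timestamps in the same layout as the question's first column
stamps <- c("2016-09-01 00:00:00", "2016-09-01 01:00:00")

# Parse with an explicit time zone and format (assumption: values are GMT)
parsed <- as.POSIXct(stamps, tz = "GMT", format = "%Y-%m-%d %H:%M:%S")

# Inspect the result: clock times in GMT and the gap between the two readings
print(format(parsed, "%H:%M", tz = "GMT"))
print(difftime(parsed[2], parsed[1], units = "mins"))
```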

Since the columns are all character in the OP's post, we don't need to convert them to character before applying type.convert; otherwise, use type.convert(as.character(x), ..

OK, if we need dplyr to do this:

library(dplyr)
res <- Df %>% 
         mutate_all(funs(type.convert(as.character(.), as.is = TRUE)))
str(res)
#'data.frame':   24 obs. of  2 variables:
#$ @MeasurementDateGMT: chr  "2016-09-01 00:00:00" "2016-09-01 01:00:00" "2016-09-01 02:00:00" "2016-09-01 03:00:00" ...
#$ @Value             : num  10.9 9.8 9.9 14.1 13.6 16.5 15 18.5 18 17 ...

Or another option is data.table:

library(data.table)
setDT(Df)[, lapply(.SD, function(x) type.convert(x, as.is = TRUE))]
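
Note that the expression above returns a new data.table and does not store the converted columns unless the result is assigned. A possible variant that updates the columns by reference with := is sketched below, on a small made-up table (`toy` and its column names are hypothetical).

```r
library(data.table)

# Toy stand-in for the OP's data (hypothetical names and values)
toy <- data.table(
  when  = c("2016-09-01 00:00:00", "2016-09-01 01:00:00"),
  value = c("10.9", "9.8")
)

cols <- names(toy)

# Replace every column in place with its type-converted version;
# strings that do not look numeric stay character because as.is = TRUE
toy[, (cols) := lapply(.SD, type.convert, as.is = TRUE), .SDcols = cols]

str(toy)
```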
