Error in R when replacing a character with a string
I am reading a file with a structure like this:
[1111111]aaaa;bbbb;cccc
[2222222]dddd;ffff;gggg
And I want to have a data frame like this:
Column A Column B Column C Column D
1111111 aaaa bbbb cccc
2222222 dddd ffff gggg
So I need to split on ; and remove the [ and ] characters.
So here's my code:
Read file
df<-read.csv("file.csv",sep=";")
Replace []
df_V1 <- gsub(pattern="[",replacement="",df$V1) #ERROR HERE!
df_V1 <- gsub(pattern="]",replacement=";",df$V1) #Replace the ] to ;
Then put it all together
df_V1 <- do.call(rbind.data.frame,strsplit(df_V1,split=";"))
Data<- cbind(
df_V1,
df[,c(2:ncol(df))])
And here is my result:
View(Data)
Column A Column B Column C Column D
[1111111 aaaa bbbb cccc
[2222222 dddd ffff gggg
I don't know why the first [ can't be replaced. I've already tried using gsub and removing the first character of the string, but neither seems to solve it. Any ideas?
Thank you for your time.
2 answers
First we can read the data with readLines, then change the lines with gsub, and then read the result with read.csv.
read.csv(text=sub(";", "", gsub("[][]", ";", lines)),
sep=";", header=FALSE, col.names = paste0("Column", LETTERS[1:4]), stringsAsFactors=FALSE)
# ColumnA ColumnB ColumnC ColumnD
#1 1111111 aaaa bbbb cccc
#2 2222222 dddd ffff gggg
data
lines <- readLines("file1.txt")
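As for why the original gsub call fails: in a regular expression, [ opens a character class, so the pattern "[" on its own is an incomplete regex and gsub raises an error, while a lone "]" is treated as an ordinary character. In "[][]" the outer brackets delimit a class whose members are ] and [ (a ] placed first inside a class is taken literally). A minimal sketch of the intermediate steps, assuming an in-memory copy of the two sample lines instead of file1.txt (the variable names are only illustrative):

```r
# Sample lines from the question, used here instead of reading file1.txt
lines <- c("[1111111]aaaa;bbbb;cccc", "[2222222]dddd;ffff;gggg")

# Two ways to match a literal "[": escape it, or switch off regex matching
no_open  <- gsub("\\[", "", lines)
no_open2 <- gsub("[", "", lines, fixed = TRUE)
identical(no_open, no_open2)
# [1] TRUE

# Step 1 of the answer: the class "[][]" turns both brackets into ";"
step1 <- gsub("[][]", ";", lines)
# Step 2: sub() drops only the first ";" (the one that replaced "[")
step2 <- sub(";", "", step1)
step2
# [1] "1111111;aaaa;bbbb;cccc" "2222222;dddd;ffff;gggg"
```

Either of the first two forms could replace the failing line in the question, if you prefer to keep that approach.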
If the columns are indeed fixed in length, then read_fwf in the readr library is useful.
library(readr)
read_fwf(
"[1111111]aaaa;bbbb;cccc
[2222222]dddd;ffff;gggg
", fwf_cols("Column A"=c(2,8), "Column B"=c(10,13), "Column C"=c(15,18), "column D"=c(20,23)))
# `Column A` `Column B` `Column C` `column D`
# <int> <chr> <chr> <chr>
# 1 1111111 aaaa bbbb cccc
# 2 2222222 dddd ffff gggg
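If you would rather not add a package, the same fixed-width idea can be sketched with read.fwf from base R (utils). This assumes the widths really are constant; the sample lines are held in memory here for illustration, and the column names are borrowed from the other answer:

```r
# Sample data from the question, held in memory for illustration
txt <- c("[1111111]aaaa;bbbb;cccc", "[2222222]dddd;ffff;gggg")

# Negative widths skip characters: the "[", the "]" and each ";"
df <- read.fwf(textConnection(txt),
               widths = c(-1, 7, -1, 4, -1, 4, -1, 4),
               col.names = paste0("Column", LETTERS[1:4]),
               stringsAsFactors = FALSE)
df
```

For a real file, pass the file path in place of textConnection(txt).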