Splitting a string into substrings of different lengths in R
I've read similar threads, but my substrings have different lengths (9, 3 and 5 characters) and I couldn't find an answer that covers that case.
I need to split a 17-character string into three substrings: the first is 9 characters long, the next is 3, and the last is 5.
Example:
N12345671004UN005
N34567892902UN002
I would like to split each row into three columns:
First column, 9 characters:
"N12345671"
"N34567892"
Second column, 3 characters:
"004"
"902"
Third column, 5 characters:
"UN005"
"UN002"
2 answers
You can try read.fwf and specify the widths:
ff <- tempfile()
cat(file=ff, instr, sep='\n')
read.fwf(ff, widths=c(9,3,5), colClasses=rep('character', 3))
#         V1  V2    V3
#1 N12345671 004 UN005
#2 N34567892 902 UN002
Or using tidyr/dplyr
library(dplyr)
library(tidyr)
as.data.frame(instr) %>%
extract(instr, into=paste0('V', 1:3), '(.{9})(.{3})(.{5})')
#         V1  V2    V3
#1 N12345671 004 UN005
#2 N34567892 902 UN002
Or a combination of sub and read.table:
read.table(text=sub('(.{9})(.{3})(.{5})', '\\1 \\2 \\3', instr),
colClasses=rep('character', 3))
#         V1  V2    V3
#1 N12345671 004 UN005
#2 N34567892 902 UN002
data
instr = c("N12345671004UN005", "N34567892902UN002")
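If you would rather avoid the temporary file and the regex, the same split can probably be done with base R's substring by deriving the start and end positions from the widths. This is only a minimal sketch: the names widths, starts, ends, parts and res are my own, and it assumes every string is exactly 17 characters long.

```r
# Input strings, as in the question
instr <- c("N12345671004UN005", "N34567892902UN002")

# Field widths; the end of each field is the running total of the widths
widths <- c(9, 3, 5)
ends   <- cumsum(widths)       # 9 12 17
starts <- ends - widths + 1    # 1 10 13

# Assumption: every string is exactly as long as the widths add up to
stopifnot(all(nchar(instr) == sum(widths)))

# One character vector per field; substring() is vectorised over instr
parts <- Map(function(s, e) substring(instr, s, e), starts, ends)
names(parts) <- paste0("V", seq_along(widths))

# Combine the fields into a data frame of character columns
res <- as.data.frame(parts, stringsAsFactors = FALSE)
res
```

Because the positions are computed from widths, changing the layout only means editing that one vector.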