Splitting a string into substrings of different lengths in R

Question

I've read similar threads, but my substrings have different lengths (9, 3, and 5 characters), and I couldn't find an answer for that case.

I need to split a 17-character string into three substrings: the first 9 characters, the next 3, and the last 5.

Example:

 N12345671004UN005
 N34567892902UN002

I would like to split the rows into three columns:

First column (9 characters):

"N12345671"      
"N34567892"

Second column (3 characters):

"004"          
"902"

Third column (5 characters):

"UN005"  
"UN002"

string r

zone1, June 12, 2015 at 15:40

2 answers

You can try read.fwf and specify the widths:

# instr is the character vector defined under "data" below
ff <- tempfile()
cat(file=ff, instr, sep='\n')   # write one string per line
read.fwf(ff, widths=c(9,3,5), colClasses=rep('character', 3))
#        V1  V2    V3
#1 N12345671 004 UN005
#2 N34567892 902 UN002

Or using tidyr/dplyr

library(dplyr)
library(tidyr)
as.data.frame(instr) %>%
       extract(instr, into=paste0('V', 1:3), '(.{9})(.{3})(.{5})')
#         V1  V2    V3
#1 N12345671 004 UN005
#2 N34567892 902 UN002

Or a combination of sub and read.table:

read.table(text=sub('(.{9})(.{3})(.{5})', '\\1 \\2 \\3', instr),
              colClasses=rep('character', 3))
#         V1  V2    V3
#1 N12345671 004 UN005 
#2 N34567892 902 UN002

data

instr = c("N12345671004UN005", "N34567892902UN002")

akrun, June 12, 2015 at 16:03

mts · Accepted Answer · 2015-06-12T15:44:37+0000

instr = c("N12345671004UN005", "N34567892902UN002")
out1 = substr(instr, 1, 9)
out2 = substr(instr, 10, 12)
out3 = substr(instr, 13, 17)
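
The three substr calls above hard-code the start and end positions. If the widths change, the same idea can be generalized by computing the positions from the widths. This is only a rough sketch: split_fixed is a hypothetical helper name, not a base R function, and it assumes every string is at least sum(widths) characters long.

```r
# Hypothetical helper: split each string in x into fixed-width pieces.
split_fixed <- function(x, widths) {
  ends   <- cumsum(widths)        # last position of each piece: 9 12 17
  starts <- ends - widths + 1     # first position of each piece: 1 10 13
  # One substr call per piece; substr is vectorized over x
  cols <- lapply(seq_along(widths), function(i) substr(x, starts[i], ends[i]))
  names(cols) <- paste0("V", seq_along(widths))
  as.data.frame(cols, stringsAsFactors = FALSE)
}

instr <- c("N12345671004UN005", "N34567892902UN002")
res <- split_fixed(instr, c(9, 3, 5))
res$V1  # "N12345671" "N34567892"
res$V2  # "004" "902"
res$V3  # "UN005" "UN002"
```

The columns stay character, so leading zeros such as "004" are preserved.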
