How to use numeric list as variable input in gsub template?

Question

I would like to keep only the first half of each string. The imported data has every name duplicated, in one column of a larger data frame:

fname: TimmyTimmy, PopPop, AdnanAdnan, KobeKobe.

My first idea was to count the characters in each string, divide by 2, and then remove that many characters from the beginning of each string with gsub, using fn_len as a variable inside the pattern.

fn_len: 5, 6, 5, 4

df$fname <- 
    gsub("^[[:alpha:]]{df$fn_len}", "", df$fname)

This returns an error: invalid regular expression, reason 'Invalid contents of {}'

The code works if I use a single number (like 1, 2, 3, 4 or 5), so I obviously don't understand some of the pattern rules here.

Alternatively, is there a better way to do this from the start?

+3

string r gsub

panstotts Dec 12 '14 at 1:54

2 answers

If the strings follow the pattern shown in the example (a name immediately repeated), a backreference can collapse the repeat:

 gsub("([A-Za-z]+)\\1+", "\\1", str1)
 #[1] "Timmy" "Pop"   "Adnan" "Kobe"

or

 scan(text=sub('(?<=[a-z])(?=[A-Z])', ' ', str1, perl=TRUE),
                            what='', quiet=TRUE)[c(TRUE, FALSE)]
 #[1] "Timmy" "Pop"   "Adnan" "Kobe"

or

 sapply(strsplit(str1, '(?<=[a-z])(?=[A-Z])', perl=TRUE), `[`,1)
 #[1] "Timmy" "Pop"   "Adnan" "Kobe"

Update

The backreference version also works for names that start with a lowercase letter:

  gsub('([A-Za-z]+)\\1+', '\\1', str2)
  #[1] "Timmy" "Pop"   "Adnan" "Kobe"  "tim"

data

 str1 <- c("TimmyTimmy", "PopPop", "AdnanAdnan", "KobeKobe")
 str2 <- c(str1, 'timtim')
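
One caveat with the unanchored backreference pattern: it collapses any repeated run of letters, so a name that was never duplicated but contains a doubled letter gets altered too. A minimal sketch, assuming each value should either be an exact two-fold repeat or be left alone. The `str3` vector and the anchored pattern are illustrative and not part of the original answer, and `perl = TRUE` is used here only to pin down the regex engine:

```r
# Hypothetical input: two duplicated names plus one that was never duplicated
str3 <- c("TimmyTimmy", "AnnaAnna", "Anna")

# Unanchored pattern: also collapses the "nn" inside the lone "Anna"
loose <- gsub("([A-Za-z]+)\\1+", "\\1", str3, perl = TRUE)
print(loose)
#[1] "Timmy" "Anna"  "Ana"

# Anchored pattern: only rewrites strings that are exactly one name repeated twice
strict <- sub("^([A-Za-z]+)\\1$", "\\1", str3, perl = TRUE)
print(strict)
#[1] "Timmy" "Anna"  "Anna"
```

If all values are known to be duplicated, the unanchored version from the answer is fine as it stands.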

+2

akrun Dec 12 '14 at 2:46

MrFlick · Accepted Answer · 2014-12-12T01:56:40+0000

A substring operation really looks like the better fit here:

fname<-c("TimmyTimmy", "PopPop", "AdnanAdnan", "KobeKobe")
substr(fname, 1, nchar(fname)/2)
# [1] "Timmy" "Pop"   "Adnan" "Kobe"
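
As for why the original attempt fails: `df$fn_len` inside the quoted pattern is just literal text, not the column, and `gsub` takes a single pattern rather than one per element. A rough sketch of both routes on a data frame, assuming a `df` with an `fname` column as in the question. The names `first_half`, `pats` and `via_regex` are illustrative, and the `sprintf`/`mapply` variant is shown only to answer the literal question, not as the recommended approach:

```r
# Example data frame mirroring the question (assumed structure)
df <- data.frame(fname = c("TimmyTimmy", "PopPop", "AdnanAdnan", "KobeKobe"),
                 stringsAsFactors = FALSE)

# Half the length of each string; integer division keeps the result an integer
df$fn_len <- nchar(df$fname) %/% 2L

# Recommended: substr is vectorised over both the strings and the stop positions
first_half <- substr(df$fname, 1, df$fn_len)
print(first_half)
#[1] "Timmy" "Pop"   "Adnan" "Kobe"

# Literal answer: build one pattern per row, then apply them pairwise
pats <- sprintf("^[[:alpha:]]{%d}", df$fn_len)
via_regex <- unname(mapply(function(p, x) sub(p, "", x), pats, df$fname))
print(via_regex)
#[1] "Timmy" "Pop"   "Adnan" "Kobe"
```

Either result can be assigned back with `df$fname <- first_half`.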
