Match on only part of the text in a column for a lookup

I have two data files. The first one is a "master sheet" where I compile the data, with the corresponding columns that look like this:

Family          ID                     Size
Tyrannidae      Empidonax traillii
Tyrannidae      Empidonax atriceps
Conopophagidae  Conopophaga lineata

      

where Size is the column I want to fill. I need the ID used in later analysis to contain both the genus and species names. The size data is keyed by genus (the first word in ID) and lives in a separate file, for example:

Genus         Size
Empidonax     13
Conopophaga   6

      

Is there a way in R to match on only a specific part of the text in the ID column, rather than all of it, so that the Size column can then be populated? Desired result:

Family          ID                     Size
Tyrannidae      Empidonax traillii     13
Tyrannidae      Empidonax atriceps     13
Conopophagidae  Conopophaga lineata    6

      

Or is it just easier to split the ID column in two, fill in Size and then concatenate the two back?

Thanks



2 answers


Since data.table allows X[Y] joins, it seems a good fit here. So here's a data.table solution:



library(data.table)
master <- data.table(Family = c("Tyrannidae", "Tyrannidae", "Conopophagidae"),
                     ID = c("Empidonax traillii", "Empidonax traillii", "Conopophaga lineata"))
dt <- data.table(Genus = c("Empidonax", "Conopophaga"), Size = c(13, 6))

# get Genus: drop everything from the first space onward, keeping the first word of ID
master[, Genus := gsub(" .*$", "", ID)]
# set key to Genus so that X[Y] joins on it
setkey(master, "Genus")
master[dt] # X[Y]

#          Genus         Family                  ID Size
# 1:   Empidonax     Tyrannidae  Empidonax traillii   13
# 2:   Empidonax     Tyrannidae  Empidonax traillii   13
# 3: Conopophaga Conopophagidae Conopophaga lineata    6
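
If the goal is to fill Size in the master table itself rather than build a new joined table, an update join may be a closer fit. The following is only a sketch, assuming a data.table version that supports the on= argument; it reuses the table names from above with the question's sample data, and the Genus column is just a temporary helper that is dropped at the end:

```r
library(data.table)

master <- data.table(
  Family = c("Tyrannidae", "Tyrannidae", "Conopophagidae"),
  ID     = c("Empidonax traillii", "Empidonax atriceps", "Conopophaga lineata")
)
dt <- data.table(Genus = c("Empidonax", "Conopophaga"), Size = c(13, 6))

# Temporary helper column: the first word of ID
master[, Genus := sub(" .*$", "", ID)]

# Update join: for rows of master whose Genus matches dt, copy dt's Size in by reference
master[dt, on = "Genus", Size := i.Size]

# Drop the helper column so only Family, ID and Size remain
master[, Genus := NULL]
```

Because no key is set, master keeps its original row order, and any genus missing from dt would simply get NA in Size.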

      



If master and size are your data frames, you can create a Genus column and then use merge to get the merged data frame.



# the regex deletes everything from the first space onward, leaving the genus
master$Genus <- gsub(" .*$", "", master$ID)
merge(master, size, by = "Genus")
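
Note that merge() sorts the result by the key and, by default, drops master rows whose genus has no entry in size. If you would rather fill Size in place without keeping a Genus column, a lookup with match() is one option. This is a sketch in base R, assuming the data frames are named master and size as above and using the question's sample data:

```r
master <- data.frame(
  Family = c("Tyrannidae", "Tyrannidae", "Conopophagidae"),
  ID     = c("Empidonax traillii", "Empidonax atriceps", "Conopophaga lineata"),
  stringsAsFactors = FALSE
)
size <- data.frame(
  Genus = c("Empidonax", "Conopophaga"),
  Size  = c(13, 6),
  stringsAsFactors = FALSE
)

# First word of each ID (the genus), kept as a plain vector rather than a column
genus <- sub(" .*$", "", master$ID)

# match() gives, for each genus, its row position in size (NA if absent)
master$Size <- size$Size[match(genus, size$Genus)]
```

This keeps every row of master in its original order; a genus with no entry in size ends up with NA.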

      
