Lookup value from another column that matches the variable

I have a dataframe that looks like this:

animal_id   trait_id    sire_id dam_id
    1         25.05        0       0
    2         -46.3        1       2
    3          41.6        1       2
    4         -42.76       3       4
    5         -10.99       3       4
    6         -49.81       5       4
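For anyone who wants to try the answers, the table above could be rebuilt roughly like this. This is only a sketch: the name df and the column types (integer ids, numeric trait) are assumptions, since the question shows just the printed values.

```r
# Rebuild the example data; 0 in sire_id / dam_id means "parent unknown"
df <- data.frame(
  animal_id = 1:6,
  trait_id  = c(25.05, -46.3, 41.6, -42.76, -10.99, -49.81),
  sire_id   = c(0L, 1L, 1L, 3L, 3L, 5L),
  dam_id    = c(0L, 2L, 2L, 4L, 4L, 4L)
)
```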

      

I want to create another variable containing the "trait_id" score for each "sire_id" and "dam_id".

All sires (sire_id) and dams (dam_id) are also present in the animal_id column. So what I want to do is look up their value in trait_id and store it in a new variable.

As a result, I want:

animal_id   trait_id    sire_id trait_sire  dam_id  trait_dam
     1       25.05         0        NA        0        NA
     2       -46.3         1       25.05      2       -46.3
     3       41.6          1       25.05      2       -46.3
     4      -42.76         3       41.6       4       -42.76
     5      -10.99         3       41.6       4       -42.76
     6      -49.81         5      -10.99      4       -42.76

      

Any suggestion would be greatly appreciated.



3 answers


You can use match; match(col, df$animal_id) gives, for each element of col, the index of its match in animal_id, which can then be used to look up the corresponding values of trait_id:



df[c("trait_sire", "trait_dam")] <- 
    lapply(df[c("sire_id", "dam_id")], function(col) df$trait_id[match(col, df$animal_id)])

df
#  animal_id trait_id sire_id dam_id trait_sire trait_dam
#1         1    25.05       0      0         NA        NA
#2         2   -46.30       1      2      25.05    -46.30
#3         3    41.60       1      2      25.05    -46.30
#4         4   -42.76       3      4      41.60    -42.76
#5         5   -10.99       3      4      41.60    -42.76
#6         6   -49.81       5      4     -10.99    -42.76
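To make the intermediate step visible, here is a small sketch of what match returns for the sire column on its own. The data frame is rebuilt from the question (the name df and the column types are assumptions), and idx is just an illustrative name.

```r
df <- data.frame(
  animal_id = 1:6,
  trait_id  = c(25.05, -46.3, 41.6, -42.76, -10.99, -49.81),
  sire_id   = c(0, 1, 1, 3, 3, 5),
  dam_id    = c(0, 2, 2, 4, 4, 4)
)

# Row position in animal_id of each sire; sire 0 does not exist, so it gives NA
idx <- match(df$sire_id, df$animal_id)
idx
# [1] NA  1  1  3  3  5

# Indexing trait_id with those positions looks up the sire's trait
df$trait_id[idx]
# [1]     NA  25.05  25.05  41.60  41.60 -10.99
```

Indexing with NA yields NA, which is why animals without a known sire end up with a missing trait_sire rather than an error.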

      



With a data.table join ...

library(data.table)
setDT(DT)  # DT holds the question's data; setDT converts it to a data.table in place

DT[, trait_sire := 
  .SD[.SD, on=.(animal_id = sire_id), x.trait_id ]
]

DT[, trait_dam := 
  .SD[.SD, on=.(animal_id = dam_id), x.trait_id ]
]

   animal_id trait_id sire_id dam_id trait_sire trait_dam
1:         1    25.05       0      0         NA        NA
2:         2   -46.30       1      2      25.05    -46.30
3:         3    41.60       1      2      25.05    -46.30
4:         4   -42.76       3      4      41.60    -42.76
5:         5   -10.99       3      4      41.60    -42.76
6:         6   -49.81       5      4     -10.99    -42.76

      

The syntax is x[i, on=, j], where j is some function of the columns. To see how it works, try DT[DT, on=.(animal_id = dam_id)] and variations on it. Some notes:

  • The i.* / x.* syntax helps distinguish which table a column came from.
  • If j is v := expression, the expression is assigned to the column v.
  • The join x[i, ...] uses the rows of i to look up rows of x.
  • The on= syntax is like .(xcol = icol).
  • Inside j, the table itself can be written as .SD.
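The notes above can be seen in action with a small sketch that returns the joined columns instead of assigning them. The data is rebuilt from the question (column types are assumed), and the output column names animal, sire and trait_sire are chosen here purely for illustration.

```r
library(data.table)

DT <- data.table(
  animal_id = 1:6,
  trait_id  = c(25.05, -46.3, 41.6, -42.76, -10.99, -49.81),
  sire_id   = c(0L, 1L, 1L, 3L, 3L, 5L),
  dam_id    = c(0L, 2L, 2L, 4L, 4L, 4L)
)

# Self-join: each row of i (the second DT) looks up the row of x (the first DT)
# whose animal_id equals that row's sire_id.
# i.* columns come from the looking-up row, x.* columns from the matched row.
res <- DT[DT, on = .(animal_id = sire_id),
          .(animal = i.animal_id, sire = i.sire_id, trait_sire = x.trait_id)]
res
```

The result has one row per row of i; rows whose sire_id has no match in animal_id (here the founder with sire 0) get NA for the x.* columns.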

One of the advantages of this approach over match is that it extends to joins on more than one column, for example on = .(xcol = icol, xcol2 = icol2), or even "non-equi joins" like on = .(xcol < icol). It is also part of a consistent syntax for working with tables (explained in the package's introductory materials), not specialized code for a single task.
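As a rough illustration of the multi-column case, suppose animal ids were only unique within a herd. The herd column, the tables animals and offspring, and their values are all hypothetical and not part of the question's data; this is a sketch of the on= syntax only.

```r
library(data.table)

# Hypothetical lookup table: animal_id repeats across herds
animals <- data.table(
  herd      = c("A", "A", "B"),
  animal_id = c(1L, 2L, 1L),
  trait_id  = c(25.05, -46.3, 41.6)
)

# Hypothetical offspring, each with a herd and a sire id within that herd
offspring <- data.table(
  herd    = c("A", "B", "B"),
  sire_id = c(1L, 1L, 2L)
)

# Join on two columns at once: herd must match and animal_id must equal sire_id
offspring[, trait_sire :=
  animals[offspring, on = .(herd = herd, animal_id = sire_id), x.trait_id]
]
offspring
```

A single match call compares one vector against another, so the same lookup in base R would need a combined key or a merge.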



You can do it using match (in base R) in one pass, with no need to iterate:

df[c("trait_sire", "trait_dam")] <- 
cbind(with(df, trait_id[match(sire_id, animal_id)]), 
      with(df, trait_id[match(dam_id, animal_id)]))

  # animal_id trait_id sire_id dam_id trait_sire trait_dam
# 1         1    25.05       0      0         NA        NA
# 2         2   -46.30       1      2      25.05    -46.30
# 3         3    41.60       1      2      25.05    -46.30
# 4         4   -42.76       3      4      41.60    -42.76
# 5         5   -10.99       3      4      41.60    -42.76
# 6         6   -49.81       5      4     -10.99    -42.76

      
