How to prepend the variable (column) name to every row of each column

I would like to add the column name to each string of characters in the column. Here's a small data frame to use.

df <-structure(list(CoA = c("Baton Rouge", "Birmingham", "Chattanooga", 
"Columbia", "Houston"), CoB = c("Haddonfield, NJ", "Haddonfield, NJ", 
"Philadelphia, PA", "Hackensack, NJ", "Princeton, NJ"), CoC = c("St. Louis, Missouri", 
"Kansas City, Missouri", "Jefferson City, Missouri", "Belleville, Illinois", 
"Overland Park, Kansas")), .Names = c("CoA", "CoB", "CoC"), row.names = c(NA, 
-5L), class = "data.frame")

      

I tried the following, but R recycles the whole company vector against each column of df instead of pairing each name with its own column.

company <- colnames(df)
new <- sapply(df, function(x) paste(company, x, sep = ", ")) 

      

This is what I want, but for all columns:

paste(colnames(df[1]), df$CoA, sep = ", ")
[1] "CoA, Baton Rouge" "CoA, Birmingham"  "CoA, Chattanooga" "CoA, Columbia"    "CoA, Houston"

      

I tried various regexes and didn't get anywhere. How do I get sapply to perform this insert operation on each column?

Thank you for your help.



1 answer


Here's a possible solution:

mx <- sapply(colnames(df),function(name){ paste(name,df[,name],sep=", ")})

> mx
     CoA                CoB                     CoC                            
[1,] "CoA, Baton Rouge" "CoB, Haddonfield, NJ"  "CoC, St. Louis, Missouri"     
[2,] "CoA, Birmingham"  "CoB, Haddonfield, NJ"  "CoC, Kansas City, Missouri"   
[3,] "CoA, Chattanooga" "CoB, Philadelphia, PA" "CoC, Jefferson City, Missouri"
[4,] "CoA, Columbia"    "CoB, Hackensack, NJ"   "CoC, Belleville, Illinois"    
[5,] "CoA, Houston"     "CoB, Princeton, NJ"    "CoC, Overland Park, Kansas"

      

Note that sapply returns a matrix; if you want a data.frame, just do as.data.frame(mx).
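
As a minimal sketch of that conversion, the snippet below rebuilds a reduced two-column copy of the question's data so it runs on its own. The name res and the stringsAsFactors = FALSE argument are my additions (the latter keeps the columns as character in R versions before 4.0, where factors were the default).

```r
# Reduced copy of the question's data, rebuilt so the snippet is self-contained
df <- data.frame(CoA = c("Baton Rouge", "Birmingham"),
                 CoB = c("Haddonfield, NJ", "Philadelphia, PA"),
                 stringsAsFactors = FALSE)

# Character matrix with one column per column name
mx <- sapply(colnames(df), function(name) paste(name, df[, name], sep = ", "))

# Convert the matrix to a data frame; the column names carry over
res <- as.data.frame(mx, stringsAsFactors = FALSE)
str(res)
```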

Explanation:

sapply applies a function to every element of the vector or list passed as its first argument, X (in this case we are passing colnames(df)).
The function to be applied to each element is passed as the argument FUN.
In this case, we pass the following function as FUN:



function(name){ 
   paste(name,df[,name],sep=", ")
   # equivalent to return(paste(name,df[,name],sep=", "))
}

      

This function is called once for every element of colnames(df), and each element is passed as the first argument (i.e. the argument name).
So, using name (remember, it holds a single column name), we select that column of df, prepend the column name with paste, and return the resulting character vector.
sapply does the rest: it automatically binds the resulting vectors into one matrix (since simplify=TRUE by default; otherwise a list of vectors would be returned, as lapply does).
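
To make the difference between the two return shapes concrete, here is a small sketch on a reduced copy of the question's data. The helper name add_name and the variables as_matrix and as_list are hypothetical, introduced only for this illustration.

```r
# Reduced copy of the question's data, rebuilt so the snippet is self-contained
df <- data.frame(CoA = c("Baton Rouge", "Birmingham"),
                 CoB = c("Haddonfield, NJ", "Philadelphia, PA"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: prepend a column's name to each of its values
add_name <- function(name) paste(name, df[, name], sep = ", ")

as_matrix <- sapply(colnames(df), add_name)  # simplified to a character matrix
as_list   <- lapply(colnames(df), add_name)  # list of character vectors, no simplification

is.matrix(as_matrix)
is.list(as_list)
```

Both calls run the same function on the same names; only the container of the results differs.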

EDIT:

As @hadley correctly pointed out, the result of sapply with simplify=TRUE does not always have the same shape (for example, it changes if you have only one row or one column).
So this is a safer solution:

# simplify = FALSE makes sapply always return a named list, one element per column
df2 <- as.data.frame(sapply(colnames(df),
                            function(name) paste(name, df[, name], sep = ", "),
                            simplify = FALSE))

> df2
               CoA                   CoB                           CoC
1 CoA, Baton Rouge  CoB, Haddonfield, NJ      CoC, St. Louis, Missouri
2  CoA, Birmingham  CoB, Haddonfield, NJ    CoC, Kansas City, Missouri
3 CoA, Chattanooga CoB, Philadelphia, PA CoC, Jefferson City, Missouri
4    CoA, Columbia   CoB, Hackensack, NJ     CoC, Belleville, Illinois
5     CoA, Houston    CoB, Princeton, NJ    CoC, Overland Park, Kansas
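
The one-row edge case can be sketched as follows. The data frame one_row and the names add_name, simplified, and kept_list are hypothetical, made up for this illustration; stringsAsFactors = FALSE is only there to keep character columns on R versions before 4.0.

```r
# Hypothetical single-row data frame to show the edge case
one_row <- data.frame(CoA = "Baton Rouge", CoB = "Haddonfield, NJ",
                      stringsAsFactors = FALSE)

add_name <- function(name) paste(name, one_row[, name], sep = ", ")

# With the default simplify = TRUE, length-1 results collapse to a named vector
simplified <- sapply(colnames(one_row), add_name)

# With simplify = FALSE, the result stays a named list, one element per column
kept_list <- sapply(colnames(one_row), add_name, simplify = FALSE)

is.matrix(simplified)                               # no longer a matrix
dim(as.data.frame(simplified))                      # columns became rows
dim(as.data.frame(kept_list, stringsAsFactors = FALSE))  # one row, two columns
```

Converting the simplified vector gives a data frame with the wrong orientation, which is why going through the list is the more predictable route.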

      
