How to change the name of a data frame

I have a recurring situation where I set a value at the top of a long R script, and that value is then used to subset one or more data frames. Something like this:

city_code <- "202"

      

At the end of the whole process, I would like to store the results in a dataframe that is named appropriately, say based on the addition of "city_code" to a shared stub.

city_results <- paste("city_stats", city_code, sep = "")

      

My problem is that I cannot figure out how to give the resulting data frame the name stored in city_results. There is a lot of information on how to rename the columns of a data frame, but not on how to rename the data frame itself. Based on the suggested answer, here's a clarification:

Thank you @mike-wise. It is helpful to study Hadley Wickham's Advanced R with a specific problem in hand.

library(dplyr)
gear_code <- 4
gear_subset <- paste("mtcars_", gear_code, sep = "")
mtcars_subset <- mtcars %>% filter(gear == gear_code)
head(mtcars_subset)
write.csv(mtcars_subset, file = paste(gear_subset, ".csv", sep = ""))

      

This allows me to write a subset to the appropriate CSV file. Your suggestion works, but I still cannot, for example, refer to the data frame by its new name:

assign(gear_subset, mtcars_subset)
head(gear_subset)

      

+4




2 answers


The truth is, objects in R don't have names as such. There are various kinds of environments, including a global one for each process. These environments hold lists of names that point to objects, and two different names can refer to the same object. To my knowledge, this is best explained in the Environments chapter of Hadley Wickham's book Advanced R: http://adv-r.had.co.nz/Environments.html.

Thus, it is not possible to change the name of the data frame, because there is nothing to change.

But you can make a new name (for example newname) point to the same object (in your case the data frame) as an existing name (for example oldname) by simply doing:

   newname <- oldname

      

Note that if you modify the object through one of these variables, R makes a new copy and the two names no longer refer to the same object. This is due to R's copy-on-modify semantics. See this post for an explanation: What exactly is copy-on-modify semantics in R, and where is the canonical source?
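The copying behavior described above can be seen in a few lines. This is only a minimal sketch; the data frame and its column x are made up for illustration.

```r
oldname <- data.frame(x = 1:3)   # illustrative data, not from the question
newname <- oldname               # both names now refer to the same object

newname$x[1] <- 100L             # modifying through newname makes R copy the object

oldname$x                        # unchanged
newname$x                        # first element replaced
```

After the modification the two names refer to different objects, so changes made through one are not visible through the other.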

Hope this helps. I know the pain. Dynamic and functional languages are different from static and procedural languages ...



It is of course possible to compute a new name for the data frame and register it in the environment with the assign command, and that may be what you are looking for. However, referring to the object later through that name would be rather awkward.

Example (assuming df is the data frame in question):

   assign(  paste("city_stats", city_code, sep = ""), df )

      

As always, see the documentation for assign (http://stat.ethz.ch/R-manual/R-devel/library/base/html/assign.html) for more information.

Edit: In response to your edit and the various comments about the problems with using eval(parse(...)), you can look up the object by its name like this:

head(get(gear_subset))
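To make the difference concrete, here is a small sketch of the full round trip with assign and get. It uses base R subsetting in place of the filter() call so that it runs without dplyr; that substitution is mine, not part of the question.

```r
gear_code <- 4
gear_subset <- paste0("mtcars_", gear_code)           # the character string "mtcars_4"
mtcars_subset <- mtcars[mtcars$gear == gear_code, ]   # base R stand-in for filter()

assign(gear_subset, mtcars_subset)   # binds the data frame to the name mtcars_4

head(gear_subset)        # only the string, because gear_subset holds a name
head(get(gear_subset))   # the data frame that the name refers to
```

The call head(gear_subset) in the question returns the string because gear_subset is itself a character vector; get() is what turns that string back into the object.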

      

+12




Generally, you should not programmatically generate names for data frames in your global environment. Doing so is a good sign that you should be using a list to make your life easier. See the FAQ "How do I make a list of data frames?" for many examples and more detailed discussion.

Using your specific example, I would rewrite it in one of several ways.

library(dplyr)
gear_code <- 4
gear_subset <- paste("mtcars_", gear_code, sep = "")
mtcars_subset <- mtcars %>% filter(gear == gear_code)
head(mtcars_subset)
write.csv(mtcars_subset, file = paste(gear_subset, ".csv", sep = ""))

      

The goal is to write a CSV named mtcars_X.csv that holds the subset of mtcars with gear == X. You shouldn't need to keep an intermediate data frame around; this should work fine:

gear_code <- 4
mtcars %>% filter(gear == gear_code) %>%
    write.csv(file = paste0('mtcars_', gear_code, '.csv'))

      

But maybe you are coding it like this because you want to do it for each value of gear, and this is where dplyr's group_by helps:

CSVs for all gears



# Write one CSV per value of gear; inside do(), `.` is the data for the current group
mtcars %>% group_by(gear) %>%
  do(csv = write.csv(file = sprintf("mt_gear_%s.csv", .$gear[1]), x = .))

      

Data frames for each gear level:

If you really need separate data frame objects for each gear level, storing them in a list is the way to go.

gear_df = split(mtcars, mtcars$gear)

      

This gives you a list of three data frames, one for each level of gear. They are already named with the levels, so to see the data frame with all the gear == 4 rows, do

gear_df[["4"]]

      

This is usually easier to work with than three separate data frames. Anything you want to do to all of the data frames you can do at once with lapply, and even if you want to use a for loop, it's simpler than eval(parse()) or get().
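As a rough sketch of what working with the list looks like, the following applies the same operation to every element and then loops over the names. The summaries chosen, the variable names rows_per_gear, mean_mpg and out_dir, and the use of tempdir() as the output location are all assumptions for illustration.

```r
gear_df <- split(mtcars, mtcars$gear)

# Apply the same operation to every data frame in the list
rows_per_gear <- sapply(gear_df, nrow)
mean_mpg <- lapply(gear_df, function(d) mean(d$mpg))

# A for loop over the list names works too, e.g. one CSV per gear
out_dir <- tempdir()   # hypothetical output location
for (g in names(gear_df)) {
  write.csv(gear_df[[g]], file = file.path(out_dir, paste0("mtcars_", g, ".csv")))
}
```

The name of each list element doubles as the piece that would otherwise have been pasted into a variable name, so no assign or get is needed.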

+1

