Best way to get required external data for R package

I have an R package that predicts gender from first names. This requires several fairly large datasets, which I have placed in a separate R package. Ideally, the package gender could depend on the package genderdata, and both would be accepted by CRAN. However, it looks like CRAN will not accept genderdata because it is too large (26 MB). (I think "large" means >= 5 MB.)

So my question is this: what is the best way to get this data into my package gender if I can't list the package genderdata under Imports: in the DESCRIPTION file?

My thought is to depend on devtools and provide a function like this:

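install_gender_data <- function() {
  # Install genderdata from GitHub only if it is not already installed
  if (!requireNamespace("genderdata", quietly = TRUE)) {
    devtools::install_github("lmullen/gender-data-pkg")
  }
}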

      

Then I would also use .onLoad() and the package startup message to tell users to run this function if they do not already have genderdata.
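A minimal sketch of that startup check, under a couple of assumptions: has_gender_data() is a hypothetical helper name, and the message is emitted from .onAttach() rather than .onLoad(), since R's documentation recommends .onAttach() for startup messages. install_gender_data() is the function shown above.

```r
# Hypothetical helper: TRUE if the data package is installed and loadable.
has_gender_data <- function(pkg = "genderdata") {
  requireNamespace(pkg, quietly = TRUE)
}

# Startup hook: R calls .onAttach(libname, pkgname) when the package is attached.
.onAttach <- function(libname, pkgname) {
  if (!has_gender_data()) {
    # packageStartupMessage() lets users silence the note with
    # suppressPackageStartupMessages().
    packageStartupMessage(
      "The genderdata package is not installed. ",
      "Run install_gender_data() to install it."
    )
  }
}
```

With this in place, library(gender) would print the reminder only when genderdata is missing, and stay silent otherwise.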



1 answer


Have a look at Hadley Wickham's "babynames" package. http://cran.r-project.org/web/packages/babynames/index.html


