Loading CSV files into SparkR

In R, I created two datasets, which I saved as CSV files with

write.csv(liste, file="/home/.../liste.csv", row.names=FALSE)
write.csv(data, file="/home/.../data.csv", row.names=FALSE)

Now I want to open these CSV files in SparkR. So I type

liste <- read.df(sqlContext, "/home/.../liste.csv", "com.databricks.spark.csv", header="true", delimiter="\t")
data <- read.df(sqlContext, "/home/.../data.csv", "com.databricks.spark.csv", header="true", delimiter="\t")

It turns out that one dataset, "liste", loads into SparkR successfully, but for some strange reason "data" cannot be loaded.

"liste" is just a vector of numbers in R, whereas "data" is a data.frame that I had loaded into R and then removed some parts of. SparkR gives me this error message:

Error: returnStatus == 0 is not TRUE



1 answer


"liste" is a local R object, so write.csv can write it out as expected. "data", however, is a SparkR DataFrame, which write.csv cannot serialize: it writes only the pointer to the DataFrame, not the DataFrame itself. That's why the resulting file is only 33 kB.
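A minimal sketch of two workarounds, assuming a Spark 1.x SparkR session with the spark-csv package on the classpath (the "..." in the paths is kept from the question, and "data_out" is just a hypothetical output directory name):

# Option 1: collect the distributed DataFrame into a local R data.frame,
# so write.csv has actual data to serialize rather than a pointer
local_data <- collect(data)
write.csv(local_data, file="/home/.../data.csv", row.names=FALSE)

# Option 2: let Spark itself write the data through the same csv source
# used for reading; note this creates a directory of part files,
# not a single csv file
write.df(data, path="/home/.../data_out", source="com.databricks.spark.csv", header="true", mode="overwrite")

Either way, read.df should then find real CSV content. Keep in mind that write.csv produces comma-separated files, so the delimiter="\t" option in the question's read.df calls would need to be dropped or changed to ",".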


