How can I change RowMatrix to Array in Spark or export it as CSV?

I have this code in Scala:

val mat: CoordinateMatrix = new CoordinateMatrix(data)
val rowMatrix: RowMatrix = mat.toRowMatrix()

val svd: SingularValueDecomposition[RowMatrix, Matrix] = rowMatrix.computeSVD(100, computeU = true)

val U: RowMatrix = svd.U // The U factor is a RowMatrix.
val S: Vector = svd.s // The singular values are stored in a local dense vector.
val V: Matrix = svd.V // The V factor is a local dense matrix.

val uArray: Array[Double] = U.toArray // doesn't work, because RowMatrix has no toArray method
val sArray: Array[Double] = S.toArray // works
val vArray: Array[Double] = V.toArray // works


How can I change U to uArray or a similar type that can be printed to a CSV file?



2 answers

This is a basic operation. Here is what you need to do, given that U is a RowMatrix:

val U = svd.U


rows is a RowMatrix member that gives you the underlying RDD[Vector], one vector per row.

You just need to call rows on your RowMatrix and map over the RDD[Vector], turning each vector into an array and joining its values into a comma-separated string. The result is an RDD[String].

val rdd = U.rows.map(x => x.toArray.mkString(","))
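If a local array is what you want rather than a file, the same rows RDD can be collected on the driver. The following is a minimal sketch, assuming the matrix fits in driver memory; the object name RowMatrixToArray, the helper toLocalArrays, and the tiny input matrix are made up for illustration. Keep in mind that a RowMatrix carries no meaningful row indices, so the order of the collected rows should not be relied on to match the original input.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

object RowMatrixToArray {
  // Hypothetical helper: pulls every row to the driver as a plain array.
  // Only safe when the whole matrix fits in driver memory.
  def toLocalArrays(m: RowMatrix): Array[Array[Double]] =
    m.rows.map(_.toArray).collect()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("rowmatrix-to-array").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Tiny made-up matrix standing in for svd.U.
      val m = new RowMatrix(sc.parallelize(Seq(
        Vectors.dense(1.0, 2.0),
        Vectors.dense(3.0, 4.0))))
      toLocalArrays(m).foreach(row => println(row.mkString(",")))
    } finally {
      sc.stop()
    }
  }
}
```

Calling toLocalArrays(svd.U) would give an Array[Array[Double]]; if a flat Array[Double] is needed, .flatten on the result produces one in row-major order.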


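All you need to do now is save the RDD, for example with rdd.saveAsTextFile(path). Note that this writes a directory of part files at path rather than a single CSV file.

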
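Alternatively, if the result is small enough to collect on the driver, you can write it to a single local file:

import java.io.PrintWriter
import org.apache.spark.rdd.RDD

// Collects all lines to the driver and writes them to a local file.
def exportRowMatrix(matrix: RDD[String], fileName: String): Unit = {
  val pw = new PrintWriter(fileName)
  matrix.collect().foreach(line => pw.println(line))
  pw.close() // flushes and closes the file
}

val rdd = U.rows.map(x => x.toArray.mkString(","))
exportRowMatrix(rdd, "U.csv")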
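The PrintWriter approach can be made a little more robust by closing the file even when writing fails, and by accepting any iterator of lines instead of a collected array. The sketch below is plain Scala with no Spark dependency; CsvExport, writeLines, and toCsvLine are hypothetical names chosen for illustration.

```scala
import java.io.PrintWriter

object CsvExport {
  // Formats one row of doubles as a comma-separated line.
  def toCsvLine(row: Array[Double]): String = row.mkString(",")

  // Writes each line to fileName and always closes the file,
  // even if iterating over the lines throws.
  def writeLines(lines: Iterator[String], fileName: String): Unit = {
    val pw = new PrintWriter(fileName)
    try lines.foreach(line => pw.println(line))
    finally pw.close()
  }
}
```

With an RDD[String] of CSV lines, one option is to pass its toLocalIterator to writeLines, which brings one partition at a time to the driver instead of the whole result at once.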



