Remove constant columns from RDD and compute covariance matrix

My RDD may have constant-valued columns. In other words, the variance of some of the columns may be zero. My goal is to remove all such columns from the RDD (and ultimately calculate the covariance matrix for the remaining columns). How can I do this?

Thanks and greetings,



1 answer


An RDD is immutable, so I don't think you want to remove anything from it in place. Instead, you `map` the rows into the shape that suits you and/or `filter` out what you don't need, which gives you a new RDD (more details in the documentation).
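To make that concrete, here is a minimal sketch of the idea, not a definitive implementation. It uses a plain Python list with the built-in `map` and `functools.reduce` as a stand-in for the RDD, so it runs without a cluster; the sample `rows` and the helper names (`to_minmax`, `merge_minmax`, `drop_constant`, `covariance`) are made up for illustration. With a real RDD of numeric rows, the same functions could be passed to `rdd.map(...)` and `rdd.reduce(...)`.

```python
from functools import reduce

# Toy rows standing in for the RDD's contents (assumed data); column 1 is constant.
rows = [
    [1.0, 5.0, 2.0],
    [2.0, 5.0, 4.0],
    [3.0, 5.0, 9.0],
]

def to_minmax(row):
    # Turn each value into a (min, max) pair for its column.
    return [(v, v) for v in row]

def merge_minmax(a, b):
    # Merge two per-column (min, max) lists element-wise.
    return [(min(x[0], y[0]), max(x[1], y[1])) for x, y in zip(a, b)]

# With a real RDD this step would be: rdd.map(to_minmax).reduce(merge_minmax)
minmax = reduce(merge_minmax, map(to_minmax, rows))

# A column has zero variance exactly when its min equals its max.
keep = [i for i, (lo, hi) in enumerate(minmax) if lo != hi]

def drop_constant(row):
    # Build a new row containing only the non-constant columns.
    return [row[i] for i in keep]

# With a real RDD: rdd.map(drop_constant), which returns a new RDD.
reduced = list(map(drop_constant, rows))

def covariance(data):
    # Sample covariance (divides by n - 1) on local data, for illustration only.
    n = len(data)
    d = len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in data) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]

print(keep)
print(covariance(reduced))
```

Comparing min and max avoids computing a floating-point variance just to test it against zero. The local `covariance` helper only suits data that fits in memory; for a large RDD you would want a distributed routine instead.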
