
How to convert matrix to RDD[Vector] in spark

Posted: 2017-07-21 11:38:10


The matrix is generated from SVD, and I am using the SVD results for clustering analysis.

If your clustering algorithm only accepts an RDD as its input, here's how you can do the transformation:

  import org.apache.spark.SparkContext
  import org.apache.spark.mllib.linalg.{DenseVector, Matrix, Vector}
  import org.apache.spark.rdd.RDD

  def toRDD(sc: SparkContext, m: Matrix): RDD[Vector] = {
        // Matrix.toArray is column-major, so grouping by numRows yields the columns.
        val columns: Iterator[Array[Double]] = m.toArray.grouped(m.numRows)
        // Transpose the columns into rows. Skip this step if you want a column-major RDD.
        val rows: Seq[Seq[Double]] = columns.toSeq.transpose
        // Type the Seq as Vector so parallelize returns RDD[Vector], not RDD[DenseVector]
        // (RDD is invariant in its element type).
        val vectors: Seq[Vector] = rows.map(row => new DenseVector(row.toArray))
        sc.parallelize(vectors)
  }
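The grouped-then-transpose trick can be checked without a Spark cluster, since it only depends on the column-major layout that `Matrix.toArray` uses. A minimal standalone sketch in plain Scala (the object and helper name `toRows` are mine, not from the post):

```scala
object MatrixToRows {
  // Recover the rows of a matrix from its column-major flat array,
  // mirroring the layout that Spark's Matrix.toArray produces.
  def toRows(data: Array[Double], numRows: Int): Seq[Seq[Double]] =
    data.grouped(numRows).map(_.toSeq).toSeq.transpose

  def main(args: Array[String]): Unit = {
    // Column-major storage of the 2x3 matrix:
    //   1.0  3.0  5.0
    //   2.0  4.0  6.0
    val rows = toRows(Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0), numRows = 2)
    println(rows) // List(List(1.0, 3.0, 5.0), List(2.0, 4.0, 6.0))
  }
}
```

With an actual SVD result this would be called along the lines of `toRDD(sc, svd.V)`, since `computeSVD` returns V as a local Matrix.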

  


Original source: http://www.cnblogs.com/xiaoma0529/p/7216802.html
