Monday, 29 June 2015

Fetch/Operate on Row Elements in a DataFrame

dataFrame.map(r => r(0)) gives you values typed as Any. If you want a specific result type, cast inside the mapping with .asInstanceOf[T], i.e. r => r(0).asInstanceOf[YOUR_TYPE].
Another approach is to use getAs[T], e.g. to fetch a value from a column that holds a list of strings:

row.getAs[Seq[String]]("columnName")
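Note that the type parameter here is Seq[String], not Array[String] or List[String]. The value Spark hands back is a WrappedArray, which is a Seq. If you ask for an Array instead, for example, you will see: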

java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Ljava.lang.String;
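
The cast failure can be reproduced without Spark. The sketch below is plain Scala and assumes only that the column value arrives as a Seq wrapper around an array, as the exception message suggests. CastDemo and columnValue are names made up for illustration.

```scala
import scala.util.Try

object CastDemo {
  // Stand-in for what a Row hands back for an array column: a Seq wrapping
  // an array, statically typed as Any (assumption based on the exception above).
  val columnValue: Any = Array("a", "b").toSeq

  // Works: the wrapper is a Seq.
  val asSeq: Seq[String] = columnValue.asInstanceOf[Seq[String]]

  // Both fail with ClassCastException: the wrapper is neither an Array nor a List.
  val asArray: Try[Int] = Try(columnValue.asInstanceOf[Array[String]].length)
  val asList: Try[Int] = Try(columnValue.asInstanceOf[List[String]].length)

  def main(args: Array[String]): Unit = {
    println(asSeq.mkString(","))
    println(asArray.isFailure)
    println(asList.isFailure)
  }
}
```

If an Array is really needed, one option is to convert after the cast: row.getAs[Seq[String]]("columnName").toArray.
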
1.  Join all the values in a specific column into a single String

val rules = ruleTable
  .select("colname")
  // Each Row holds one String column; extract it rather than casting the whole Seq.
  .map(_.getString(0))
  .reduce(_ + "," + _)
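
For comparison, here is a small plain-Scala sketch of the same join on an ordinary collection. The values stand in for the column contents and are made up, as is the JoinDemo object. mkString is shown as an alternative because reduce throws on an empty collection.

```scala
object JoinDemo {
  // Hypothetical column values, as they would look after collecting to the driver.
  val ruleNames: Seq[String] = Seq("rule_a", "rule_b", "rule_c")

  // Same shape as the reduce above.
  val viaReduce: String = ruleNames.reduce(_ + "," + _)

  // Same result, and also defined for an empty input.
  val viaMkString: String = ruleNames.mkString(",")
  val emptyJoined: String = Seq.empty[String].mkString(",")

  def main(args: Array[String]): Unit = {
    println(viaReduce)
    println(viaMkString)
  }
}
```

On a DataFrame, one way to apply this is to collect the column to the driver, map each Row to its String, and then call mkString. That is only reasonable when the column is small enough to fit in driver memory.
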

2.  Sum a subset of a Row's values into a single value

sqlContext
  .sql(s"SELECT id, customer_id, $rules FROM txnScoreTable")
  .map { row =>
    // Columns 0 and 1 are id and customer_id; the next ruleNum columns are the rule scores.
    val totalScore = row.toSeq.slice(2, 2 + ruleNum)
      .map(_.asInstanceOf[Double]) // cast each element, not the Seq itself
      .sum

    Row(row(0), row(1), rfsFormula(ruleNum, totalScore, weight))
  }
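
The slice-and-sum step can be tried on a plain Seq[Any], which is what row.toSeq returns. In this sketch the values, ruleNum, and the ScoreDemo object are made up for illustration; rfsFormula and weight from the snippet above are not reproduced.

```scala
object ScoreDemo {
  // Hypothetical row contents: id, customer_id, then three rule scores.
  val rowValues: Seq[Any] = Seq(1L, 42L, 0.5, 1.5, 2.0)
  val ruleNum = 3

  // Cast each element individually: a Seq is never a Double.
  val totalScore: Double = rowValues
    .slice(2, 2 + ruleNum)
    .map(_.asInstanceOf[Double])
    .sum

  def main(args: Array[String]): Unit = println(totalScore)
}
```

The per-element cast throws a ClassCastException if a score is stored as an Int or Long, so this assumes every rule column holds Double values.
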
