Friday, 15 January 2016

DataFrame transformations/actions can only be invoked by the driver

Only the driver is able to perform transformations and actions on RDDs/DataFrames.
If you hit the exception shown in this post, your code is most likely executing on a worker node (e.g. inside the closure of another transformation or action).

The code below generates the following error when it runs inside a task on an executor:

val vertexMap = vertices.zipWithUniqueId
val vertexYId = vertexMap.lookup("vertexY")

16/01/11 11:34:43 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 4)
org.apache.spark.SparkException: RDD transformations and actions can only be invoked by the driver, not inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
        at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:87)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
        at org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:873)
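
One possible way around this, in the spirit of the Stack Overflow answer referenced below, is to call lookup only from the driver, or to collect the pair RDD into an ordinary Map on the driver and broadcast it so that closures read from the Map instead of touching the RDD. The following is a minimal sketch, not the original job: the local master, the sample vertices and edges, and the names DriverSideLookup, edgeIds and idByVertex are assumptions made for illustration, and it presumes the map is small enough to fit in driver memory.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DriverSideLookup {

  // Hypothetical helper: translates (src, dst) vertex names into their unique IDs.
  def edgeIds(sc: SparkContext): Array[(Long, Long)] = {
    // Sample data, assumed for illustration only.
    val vertices = sc.parallelize(Seq("vertexX", "vertexY", "vertexZ"))
    val edges = sc.parallelize(Seq(("vertexX", "vertexY"), ("vertexY", "vertexZ")))

    val vertexMap = vertices.zipWithUniqueId() // RDD[(String, Long)]

    // Fine: lookup is an action invoked on the driver, outside any closure.
    val vertexYId: Seq[Long] = vertexMap.lookup("vertexY")
    println(s"vertexY -> $vertexYId")

    // Collect to a plain Map on the driver, then ship it to the executors.
    val idByVertex = sc.broadcast(vertexMap.collectAsMap())

    // The closure only reads the broadcast Map; it never references vertexMap.
    edges
      .map { case (src, dst) => (idByVertex.value(src), idByVertex.value(dst)) }
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run on one machine.
    val conf = new SparkConf().setAppName("driver-side-lookup").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try edgeIds(sc).foreach(println)
    finally sc.stop()
  }
}
```

If the ID mapping is too large to collect, a join between the two RDDs keyed on the vertex name is the usual alternative, since a join is itself a transformation invoked from the driver.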

Reference:
http://stackoverflow.com/questions/26351382/how-to-convert-scala-rdd-to-map
