After adding the spark-xml library, I encountered the error below.
Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
at [Source: {"id":"0","name":"newAPIHadoopFile"}; line: 1, column: 1]
com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
at [Source: {"id":"0","name":"newAPIHadoopFile"}; line: 1, column: 1]
at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
at com.fasterxml.jackson.databind.deser.BeanDeserializerFactory.addBeanProps(BeanDeserializerFactory.java:533)
Solution:
dependencyOverrides ++= Set(
"com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
)
The error is caused by the classpath providing a different version of Jackson
than the one Spark expects, which is 2.4.4.
val xmlDf = sqlContext.read
  .format("xml")
  .option("rowTag", "Rec")
  .option("attributePrefix", "")
  .option("valueTag", "value")
  .load(filePath)
2. XML with an array struct and nested elements
explode(Column e) creates a new row for each element in the given array or map column.
import org.apache.spark.sql.functions.explode
xmlDf
  .select(xmlDf.col("Ref"), explode(xmlDf.col("Comm")).as("Comm"))
  .select("Comm.Type", "Ref.AcctID")
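To make the row-per-element behaviour concrete, here is a minimal sketch that models the select above with plain Scala collections instead of Spark. The case classes RefInfo, CommInfo and Rec, the object ExplodeSketch and the sample values are all hypothetical; only the names Ref, Comm, Type and AcctID come from the query. It is an illustration of the semantics, not of how Spark implements explode.

```scala
// Plain-Scala model of explode on an array column; no Spark required.
// RefInfo, CommInfo and Rec are hypothetical stand-ins for the XML structure.
case class RefInfo(acctId: String)                  // models Ref.AcctID
case class CommInfo(commType: String)               // models Comm.Type
case class Rec(ref: RefInfo, comm: Seq[CommInfo])   // one <Rec> row with an array of Comm

object ExplodeSketch {
  // Emits one (Type, AcctID) pair per element of the comm array.
  // A record whose comm array is empty produces no output rows,
  // which matches the behaviour of explode.
  def explodeComm(recs: Seq[Rec]): Seq[(String, String)] =
    recs.flatMap(rec => rec.comm.map(c => (c.commType, rec.ref.acctId)))

  def main(args: Array[String]): Unit = {
    // Made-up sample data for illustration only.
    val recs = Seq(
      Rec(RefInfo("A1"), Seq(CommInfo("email"), CommInfo("phone"))),
      Rec(RefInfo("B2"), Seq(CommInfo("fax"))),
      Rec(RefInfo("C3"), Seq.empty)
    )
    explodeComm(recs).foreach { case (t, id) => println(s"$t\t$id") }
  }
}
```

The record with two Comm entries becomes two rows that share the same AcctID, and the record with no Comm entries disappears from the result. If such rows need to be kept, that has to be handled separately.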