Saturday, 24 October 2015

Enable Spark History Log Server

Spark Web UI: http://<driver-node>:4040

Note that this information is only available for the duration of the application by default. To view the web UI after the fact, set spark.eventLog.enabled to true before starting the application. This configures Spark to log Spark events that encode the information displayed in the UI to persisted storage.

If Spark is run on Mesos or YARN, it is still possible to reconstruct the UI of a finished application through Spark’s history server, provided that the application’s event logs exist. You can start the history server (no admin account is required) as follows:

$ mkdir ~/logs
$ export SPARK_LOG_DIR=$HOME/logs/   # location for the history server's own logs
$ /opt/spark-1.5/sbin/start-history-server.sh hdfs:///workspace/logs
 
When using the file-system provider class, the base logging directory must be supplied in the spark.history.fs.logDirectory configuration option, and it should contain sub-directories that each represent one application’s event logs. The history server serves a web interface at http://<server-url>:18080 by default.


The following two options should be set on the Spark command line (for example, when invoking spark-submit):
--conf "spark.eventLog.enabled=true"
--conf "spark.eventLog.dir=hdfs:///workspace/logs"
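Put together, a submission with event logging enabled might look like the sketch below. The class name, jar, and master are placeholders, and it assumes the HDFS log directory already exists:

```shell
# Hypothetical spark-submit invocation with event logging enabled.
# com.example.MyApp and my-app.jar are placeholders for your application.
spark-submit \
  --class com.example.MyApp \
  --master yarn \
  --conf "spark.eventLog.enabled=true" \
  --conf "spark.eventLog.dir=hdfs:///workspace/logs" \
  my-app.jar
```

Once the application finishes, its event log under hdfs:///workspace/logs is what the history server reads to rebuild the UI.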

The following two options should be set in conf/spark-defaults.conf (as plain key–value pairs, without the --conf prefix):
--conf "spark.history.retainedApplications=100"
--conf "spark.history.fs.logDirectory=hdfs:///workspace/logs"
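As a sketch, these are the same two entries written in the defaults-file format; a temporary file stands in for $SPARK_HOME/conf/spark-defaults.conf here (the real path depends on your installation):

```shell
# Write the two history-server settings in spark-defaults.conf format.
# A temp file stands in for $SPARK_HOME/conf/spark-defaults.conf.
CONF=$(mktemp)
cat >> "$CONF" <<'EOF'
spark.history.retainedApplications  100
spark.history.fs.logDirectory       hdfs:///workspace/logs
EOF

# Sanity check: both history keys are present.
grep -c '^spark\.history\.' "$CONF"   # → 2
```

The history server reads this file on startup, so it must be restarted after the settings change.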

Reference:
http://spark.apache.org/docs/latest/monitoring.html
http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/admin_spark_history_server.html
