Wednesday 18 March 2015

Build a Spark SBT project in IntelliJ

1. Prerequisites

1. Download IntelliJ 14.

2. Install the Java 6 JDK for use with the IntelliJ IDE.

3. Install the Scala and SBT plugins for IntelliJ.
Configure -> Plugins -> type "scala" -> Browse repositories -> Install Scala -> restart IntelliJ

4. Set the JDK and the Scala SDK in your project structure.
File -> Project Settings -> Project Structure -> Project JDK -> select your JDK install directory.
Download Scala SDK 2.11.6 and select its directory in the “Scala SDK” dialog.

2. Create a new SBT project

Create new project -> Scala -> SBT -> Fill in Project SDK, SBT version, Scala version ->
Check "Use auto-import" and "Create directories..." -> Finish

Edit the build.sbt file to add the dependencies.
Make sure that the Scala version in build.sbt matches the Scala version of the Spark builds published in the Maven repository.
To check this, search for the Spark artifacts at http://search.maven.org
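
As a rough sketch of the dependency edit described above, a build.sbt could look like the following. The project name and all version numbers are assumptions for illustration only; substitute the versions that search.maven.org actually lists for your Scala version.

```scala
// build.sbt (sketch; project name and version numbers are assumptions)
name := "spark-sbt-example"

version := "0.1.0"

// Must match the Scala suffix of the Spark artifact (spark-core_2.11 here).
scalaVersion := "2.11.6"

// %% appends the Scala binary version (_2.11) to the artifact name.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"
```

The blank lines between settings keep the file valid for older sbt 0.13 releases as well. If the Spark build you need is only published for another Scala version, change scalaVersion to match rather than the artifact name.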


IntelliJ resolves the dependencies whenever build.sbt changes.
After resolving them successfully, it spends a while indexing the Scala and Spark jars.

3. Import an existing SBT project

Import project -> Choose your SBT project folder -> Import project from SBT ->
Fill in Project SDK -> Check "Use auto-import" and "Create directories..." -> Finish


(1) Manually add an external jar to the project.
Right click the project -> Open Module Settings -> Libraries -> Click "+" -> select the jar -> OK.

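(2) Set up from build.sbt.
Copy any jars that you want to use to the 'lib' directory. sbt puts these jars on the classpath during compilation, testing, running, and when using the interpreter.
'lib' is already sbt's default location for such jars; the setting below states it explicitly, and you can change the path to use a different directory.
unmanagedBase := baseDirectory.value / "lib"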

4. Check out an SBT project from GitHub

Create a repo in GitHub.

In the IntelliJ IDE,

(1) Check out from version control -> GitHub -> Fill in your GitHub username and password ->
Copy and paste the repo URL -> Clone.

(2) Right click the project -> Project -> Fill in Project SDK -> OK

(3) Click the SBT task button (rightmost bar) -> click the "Refresh SBT project" button.



5. Check in an SBT project to GitHub

(1) On the main menu of IntelliJ -> VCS -> Import into Version Control -> Share Project on GitHub
-> Give the repo a name -> Choose the files to commit -> Share.
A new repository is created in GitHub with the selected files.

(2) Right click the modified file -> Git -> Commit File -> Choose files and add a message -> Commit and Push -> Push.

6. Work with branches

(1) Create a new branch, then check it in to GitHub.
In the main menu, VCS -> Git -> Branches -> New Branch -> Give a branch name -> the IDE automatically switches to the new branch.
After committing and pushing your updates, you will find the new branch in GitHub.

(2) Check out from a GitHub branch.
In the main menu, VCS -> Git -> Branches -> Choose from the "Remote Branches" list -> Check out to local branch.


7. Pack a jar file

When you are ready to build a new jar (without dependencies):

1. From SBT,
Click "Terminal" at the bottom of the IDE -> Type "sbt package" -> Run
The project jar file is written to the Project/target/scala-2.11/ directory.

2. From the IDE,
File | Project Structure | Artifacts -> press Alt+Insert or click the plus icon to create a new artifact -> choose Jar -> From modules with dependencies.
Next, go to Build | Build Artifacts -> choose your artifact.
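
For the SBT route above, the file name of the packaged jar is derived from the name, version, and scalaVersion settings in build.sbt. A minimal sketch, with hypothetical values that are not taken from the original project:

```scala
// build.sbt fragment (the name and version values are hypothetical)
name := "spark-sbt-example"

version := "0.1.0"

// The Scala binary version (2.11) determines the target/scala-2.11 folder.
scalaVersion := "2.11.6"
```

With these settings, "sbt package" would write target/scala-2.11/spark-sbt-example_2.11-0.1.0.jar under the project root. The jar contains only the project's own classes, not its library dependencies.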

Reference:
https://docs.sigmoidanalytics.com/index.php/Step_by_Step_instructions_on_how_to_build_Spark_App_with_IntelliJ_IDEA
