How do I create a Spark application using Scala IDE and Maven?

I am new to Scala, Spark, and Maven, and would like to create the Spark application described here. It uses the Mahout library.

I have Scala IDE installed and want to use Maven to manage the dependencies (the Mahout library and the Spark library). I couldn't find a good tutorial to start with. Can anyone help me figure this out?



2 answers


First, try compiling a simple Maven application in Scala IDE. The key parts of a Maven project are its directory structure and pom.xml. I don't use Scala IDE myself, but this tutorial looks helpful: http://scala-ide.org/docs/tutorials/m2eclipse/

The next step is to add the Spark dependency to pom.xml. You can follow this post: http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/



For the latest versions of the Spark and Mahout artifacts, check http://mvnrepository.com/artifact/org.apache.spark and http://mvnrepository.com/artifact/org.apache.mahout

Hope it helps.



To get started, you will need the following tools (versions based on what was recently available):

  • Scala IDE for Eclipse - download the latest Scala IDE from here.

  • Scala Version - 2.11 (make sure the Scala compiler is set to this version as well)

  • Spark version 2.2 (provided as a Maven dependency)

  • winutils.exe

To work in a Windows environment, you need the Hadoop binaries in Windows format. winutils provides these, and you need to set the hadoop.home.dir property to the directory whose bin subfolder contains winutils.exe. You can download winutils.exe from here and place it at, for example, c:/hadoop/bin/winutils.exe
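Before setting the property, it can help to confirm that winutils.exe is where you expect it. The following is a minimal sketch using only the JDK; the object name WinutilsCheck, the helper hasWinutils, and the c:/hadoop location are assumptions for illustration, not part of Spark or Hadoop.

```scala
import java.io.File

// Hypothetical helper (not a Spark/Hadoop API): checks the winutils layout.
object WinutilsCheck {
  // True when <hadoopHome>/bin/winutils.exe exists as a regular file.
  def hasWinutils(hadoopHome: String): Boolean =
    new File(new File(hadoopHome, "bin"), "winutils.exe").isFile

  def main(args: Array[String]): Unit = {
    // Assumed install location, matching the example path above; adjust as needed.
    val hadoopHome = "c:/hadoop"
    if (hasWinutils(hadoopHome)) {
      // Point Hadoop at the parent of bin, not at bin itself.
      System.setProperty("hadoop.home.dir", hadoopHome)
      println(s"hadoop.home.dir set to $hadoopHome")
    } else {
      println(s"winutils.exe not found under $hadoopHome/bin")
    }
  }
}
```

Running this once on a new machine should make a misplaced winutils.exe obvious before Spark reports a less direct error.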

You can then define the Spark Core dependency in your project's Maven pom.xml to get started:



   <dependency> <!-- Spark dependency -->
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-core_2.11</artifactId>
     <version>2.2.0</version>
     <scope>provided</scope>
   </dependency>

      

Then, in your Java / Scala class, set this property to start a local environment on Windows:

System.setProperty("hadoop.home.dir", "c:/hadoop/");
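To show where that property fits, here is a minimal sketch of a complete Scala entry point that runs a tiny job in local mode against spark-core 2.2. The object name SimpleSparkApp, the sample words, and the c:/hadoop/ path are assumptions for illustration; the property line only matters on Windows.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example application; names and data are illustrative only.
object SimpleSparkApp {
  def main(args: Array[String]): Unit = {
    // Only needed on Windows; assumed location of the folder containing bin/winutils.exe.
    System.setProperty("hadoop.home.dir", "c:/hadoop/")

    // "local[*]" runs Spark inside this JVM using all available cores.
    val conf = new SparkConf().setAppName("SimpleSparkApp").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Count occurrences of each word in a small in-memory collection.
      val words = sc.parallelize(Seq("spark", "scala", "maven", "spark"))
      val counts = words.map(w => (w, 1)).reduceByKey(_ + _).collect()
      counts.foreach { case (word, n) => println(s"$word: $n") }
    } finally {
      // Always release the context, even if the job fails.
      sc.stop()
    }
  }
}
```

If the class runs from the IDE and prints the counts, the Maven dependency and the winutils setup are both working.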

      

More detailed setup instructions can be found here.
