Building a Spark Application for HDInsight using IntelliJ Part 1 of 2

For developers with a Microsoft .NET background who want to get familiar with building Spark applications in the Scala programming language, this blog post series is a walkthrough: installing the development tools, building a simple Spark application, and then submitting it to an HDInsight Spark cluster.

My HDInsight configuration is Spark 2.0 (HDI 3.5) with Azure Data Lake Store as the primary storage.

The articles I used to understand the installation and setup:
https://docs.microsoft.com/en-us/azure/azure-toolkit-for-intellij-installation

A point of confusion for me was that HDInsight Tools for IntelliJ is deprecated; it has been rolled into Azure Toolkit for IntelliJ. Online articles that refer to HDInsight Tools are still valid, but read them with Azure Toolkit in mind.
Download

I’m working on a Windows Server 2012 R2 VM.
JDK 8

IntelliJ Installation
Start IntelliJ
Create New Project

Before creating a project, we need to install Azure Toolkit so that we can use the Spark on HDInsight project template.
Select the Spark on HDInsight (Scala) project template. As a side note, an alternative is to set the project up with Maven, but I found this approach generally easier.


Enter a project name.
For Project SDK, select Java 1.8 by locating it at C:\Program Files\Java\jdk1.8.0_131
For Scala SDK, download and select the latest version. In this screenshot I had previously downloaded it.
For Spark SDK, browse to where you downloaded spark-assembly-2.0.0-hadoop2.7.0-SNAPSHOT.jar and select it.

(Screenshot: selected SDKs)
Note: For a Spark 2.0 cluster, you need Java 1.8 and Scala 2.11 or above. Spark 2.0 itself is built against Scala 2.11 by default, so a 2.11.x SDK is the safest match.
Click Finish
Go to File > Project Structure and set Project language level to 7.
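
Once the project is created, a quick way to confirm that it picked up the intended SDKs is to run a tiny Scala object from IntelliJ. The following is only a minimal sketch: the object and method names are my own (hypothetical, not part of Azure Toolkit), and it checks for Java 1.8 and the Scala 2.11 line, which is the Scala version Spark 2.0 is built against by default.

```scala
// Sanity check for the SDKs selected in the project wizard.
// SdkVersionCheck and its methods are illustrative names, not toolkit APIs.
object SdkVersionCheck {

  // Reduces a version string such as "2.11.8" or "1.8.0_131" to "major.minor".
  def majorMinor(version: String): String =
    version.split("[._-]").take(2).mkString(".")

  // Java 8 reports itself as "1.8.x" in the java.version system property.
  def isJava8(javaVersion: String): Boolean =
    majorMinor(javaVersion) == "1.8"

  // Assumption: we want the 2.11 line, matching Spark 2.0's default build.
  def isScala211(scalaVersion: String): Boolean =
    majorMinor(scalaVersion) == "2.11"

  def main(args: Array[String]): Unit = {
    val javaVersion = System.getProperty("java.version")
    val scalaVersion = scala.util.Properties.versionNumberString
    println(s"Java $javaVersion, is 1.8: ${isJava8(javaVersion)}")
    println(s"Scala $scalaVersion, is 2.11: ${isScala211(scalaVersion)}")
  }
}
```

If either line reports false, revisit File > Project Structure and check which SDKs the project is actually using.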

To write code and submit to HDInsight, see my next blog post Building a Spark Application for HDInsight using IntelliJ Part 2 of 2

