Hello World in Apache Spark

In this post, we present a hello world application in Apache Spark.

Spark is a general-purpose engine for large-scale data processing. Its main differentiating factor compared to the MapReduce framework is its ability to cache intermediate results in memory.

Install Scala and SBT

First, install Scala and SBT.
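With Scala and SBT in place, the project needs a build definition that pulls in Spark as a dependency. Below is a minimal build.sbt sketch; the project name and the Scala and Spark version numbers are assumptions (pick whatever matches your installation), and spark-core is marked "provided" because spark-submit supplies the Spark jars at runtime.

```scala
// build.sbt -- a minimal sketch; the versions below are assumptions,
// adjust them to match your installed Scala and Spark
name := "hello-world"

version := "1.0"

scalaVersion := "2.10.4"

// "provided": spark-submit puts the Spark jars on the classpath at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
```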

Download Apache Spark

Download Spark from here. Uncompress the archive to a directory and set the environment variable SPARK_HOME to point at the extracted contents.
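As a sketch, setting SPARK_HOME and adding Spark's binaries to the PATH might look like the following; the exact directory is an assumption and depends on where you extracted the download and which release you picked.

```shell
# a sketch, assuming the tarball was extracted under /opt;
# adjust the path to wherever you uncompressed Spark
export SPARK_HOME=/opt/spark-1.3.1-bin-hadoop2.6
export PATH="$SPARK_HOME/bin:$PATH"
```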

The source code is available on GitHub.

import org.apache.spark.{SparkConf, SparkContext}

object HelloWorld {
  def main(args: Array[String]): Unit = {

    // initialise the Spark context
    val conf = new SparkConf().setAppName("HelloWorld")
    val sc = new SparkContext(conf)

    println("************")
    println("Hello, world!")
    println("************")

    // terminate the Spark context
    sc.stop()
  }
}
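To run the application, package it with SBT and hand the resulting jar to spark-submit. The commands below are a sketch: the jar path is what SBT produces for a project named hello-world built against Scala 2.10 (both assumptions), and "local[*]" runs Spark locally using all available cores.

```shell
# a sketch of packaging and running the job; the jar path is what sbt
# produces for the assumed build definition and may differ in your project
sbt package
$SPARK_HOME/bin/spark-submit \
  --class HelloWorld \
  --master "local[*]" \
  target/scala-2.10/hello-world_2.10-1.0.jar
```

The application name passed to setAppName ("HelloWorld") is what appears in the Spark web UI, while --class must match the object containing main.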
Written on June 9, 2015