Run Apache Spark on Windows (yeah, I know!)

Download the suitable distribution from Apache’s Spark website (select the version and the package type you’d want).

Download winutils.exe from here or here.

Place it in a directory (for example, C:/Hadoop/bin/winutils.exe), then open a command prompt in the directory containing winutils.exe and run the following command:

winutils.exe chmod 777 /tmp/hive

You need to set environment variables HADOOP_HOME and spark.driver.host before you proceed further.

Set HADOOP_HOME = C:/Hadoop (Note: winutils.exe is looked up under %HADOOP_HOME%/bin, so point HADOOP_HOME at the root directory, not at the bin folder.)


Set spark.driver.host=localhost


Run <your-spark-directory>/bin/spark-shell.cmd

Once the shell is up, open the Spark web UI in your browser:

http://localhost:4040/
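If the shell starts but complains about winutils, a quick sanity check from inside spark-shell can help narrow things down. The snippet below is only a sketch: winutilsPath is a helper name made up for illustration, and it assumes HADOOP_HOME was set as an environment variable before the shell was launched.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper (not part of Spark or Hadoop): builds the location
// where winutils.exe is expected, i.e. <HADOOP_HOME>/bin/winutils.exe.
def winutilsPath(hadoopHome: String): Path =
  Paths.get(hadoopHome, "bin", "winutils.exe")

// None here means the variable is not visible to this process.
val hadoopHome: Option[String] = sys.env.get("HADOOP_HOME")

hadoopHome match {
  case Some(home) =>
    val exe = winutilsPath(home)
    println(s"HADOOP_HOME = $home, winutils.exe found: ${Files.exists(exe)}")
  case None =>
    println("HADOOP_HOME is not set; open a new command prompt after setting it and try again.")
}
```

Paste it with :paste in the shell. If HADOOP_HOME shows up as not set, the usual cause is that the shell was started from a command prompt opened before the variable was saved.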

To test Spark, create a file called test.json and add the following data to it.

{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}
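One thing to watch: by default Spark's JSON reader expects each line of the file to hold one complete JSON object, so a pretty-printed file like the one above may come back as a single _corrupt_record column. A possible workaround, sketched here, is to collapse the record onto one line before saving it. The file name test.json in the current working directory and the variable names are assumptions for illustration only.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// The same glossary record as above, kept pretty-printed for readability.
val prettyJson =
  """{
    |  "glossary": {
    |    "title": "example glossary",
    |    "GlossDiv": {
    |      "title": "S",
    |      "GlossList": {
    |        "GlossEntry": {
    |          "ID": "SGML",
    |          "SortAs": "SGML",
    |          "GlossTerm": "Standard Generalized Markup Language",
    |          "Acronym": "SGML",
    |          "Abbrev": "ISO 8879:1986",
    |          "GlossDef": {
    |            "para": "A meta-markup language, used to create markup languages such as DocBook.",
    |            "GlossSeeAlso": ["GML", "XML"]
    |          },
    |          "GlossSee": "markup"
    |        }
    |      }
    |    }
    |  }
    |}""".stripMargin

// Trim every line and join them, so the whole object sits on a single line.
val singleLine = prettyJson.split("\n").map(_.trim).mkString

// Write the one-line record to test.json (assumed location: current directory).
val target = Paths.get("test.json")
Files.write(target, singleLine.getBytes(StandardCharsets.UTF_8))
println(s"Wrote ${target.toAbsolutePath}")
```

This uses only the JDK file API, so it can be pasted into spark-shell with :paste or run in a plain Scala REPL.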

Run the following commands in the Spark shell, in sequence, to test it:


// spark-shell already provides a SparkContext named sc, so it does not need to be declared.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val df = sqlContext.read.json("path/test.json")

// Displays the content of the DataFrame to stdout
df.show()
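Since the test data is nested, it can be worth looking at the schema and pulling out a few inner fields as well. The following is a minimal sketch rather than the only way to do it. It assumes it runs inside spark-shell (which supplies sc), that path/test.json is replaced with the real location of the file, and that the file stores the record on a single line. The name entry is just an illustrative variable.

```scala
// Assumes spark-shell, which already defines sc.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val df = sqlContext.read.json("path/test.json")

// Print the schema Spark inferred from the nested JSON.
df.printSchema()

// Nested fields can be reached with dot notation.
df.select("glossary.title").show()

// Pull two fields out of the innermost GlossEntry object.
val entry = df.select(
  "glossary.GlossDiv.GlossList.GlossEntry.ID",
  "glossary.GlossDiv.GlossList.GlossEntry.GlossTerm")
entry.show()
```

If printSchema() lists only a _corrupt_record column, the file was most likely not parsed as one JSON object per line, and the selects will fail until that is fixed.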
