Setting up Spark single node with local disk
OS: Ubuntu 14.04 in GENI
Please use your SSH key to log in to Ubuntu 14.04 in GENI.
If your login shell is not /bin/bash, you might want to change it. You can use
$ sudo chsh -s /bin/bash YourUserName
Use the curl command to download the auto-install shell script, saving it as Install.sh
$ curl https://dl.dropboxusercontent.com/u/12787647/iCAIR/Ubuntu1404SparkInstall.sh > Install.sh
Use sh to run the shell script. It will automatically install Spark and other packages.
$ sh Install.sh
After installing Spark on your system, you can run the Spark interactive shell with the following command
$ ~/spark/bin/spark-shell
Basic command practice
scala> val sakanaFile = sc.textFile("README.md")
sakanaFile: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:21
scala> sakanaFile.count()
res0: Long = 98
scala> val linesWithSpark = sakanaFile.filter(line => line.contains("Spark"))
linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at filter at <console>:23
scala> linesWithSpark.count()
res1: Long = 19
scala> linesWithSpark.collect()
scala> linesWithSpark.collect.foreach(println)
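The RDD operations used above (filter, count, collect, foreach) deliberately mirror Scala's own collection API, so you can practice the same pattern on a plain Scala list without Spark at all. A minimal sketch, using made-up sample lines rather than the real README.md:

```scala
object FilterDemo extends App {
  // Hypothetical sample lines standing in for README.md
  val lines = List(
    "Apache Spark is a fast and general engine",
    "It runs on the JVM",
    "Spark supports Scala, Java, Python and R"
  )

  // Same pattern as the RDD version: keep lines containing "Spark"
  val linesWithSpark = lines.filter(line => line.contains("Spark"))

  println(linesWithSpark.size) // prints 2
  linesWithSpark.foreach(println)
}
```

The difference is that the Scala list is evaluated eagerly in local memory, while the RDD version builds a lazy computation that Spark only executes when an action such as count() or collect() is called.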