Installing Spark

Mac

  • brew install apache-spark
  • Create a log4j.properties file from the bundled template (adjust 2.0.0 to the version brew installed)
    • cd /usr/local/Cellar/apache-spark/2.0.0/libexec/conf
    • cp log4j.properties.template log4j.properties
  • Edit log4j.properties and, on the log4j.rootCategory line, change the log level from INFO to ERROR so the shell prints only errors
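
If you would rather script the log-level edit than do it by hand, the following is a minimal sketch. It assumes the template contains a line of the form `log4j.rootCategory=INFO, console`; the function names `set_root_log_level` and `update_properties_file` are made up for this example and are not part of Spark.

```python
import re
from pathlib import Path


def set_root_log_level(properties_text, level="ERROR"):
    """Return the properties text with the log4j.rootCategory level replaced."""
    # Match "log4j.rootCategory=<LEVEL>" at the start of a line and keep
    # everything after the level (for example ", console") untouched.
    pattern = re.compile(r"^(log4j\.rootCategory\s*=\s*)\w+", re.MULTILINE)
    return pattern.sub(lambda match: match.group(1) + level, properties_text)


def update_properties_file(path, level="ERROR"):
    """Rewrite the file at `path` in place (hypothetical helper)."""
    properties_path = Path(path)
    properties_path.write_text(set_root_log_level(properties_path.read_text(), level))


# Demo on an in-memory sample that mimics the assumed template line.
sample = "# Log to the console\nlog4j.rootCategory=INFO, console\n"
updated = set_root_log_level(sample)
print(updated)
```

To apply it to the real file, call `update_properties_file("log4j.properties")` from inside the conf directory after copying the template.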

Linux

  • Manual install guide: https://www.tutorialspoint.com/apache_spark/apache_spark_installation.htm
  • On Arch Linux, install from the AUR: yay -S apache-spark

Test it

  • cd into the Spark installation directory
  • Look for a text file to experiment with, such as README.md or CHANGES.txt
  • Start the Python shell: pyspark
  • rdd = sc.textFile("README.md")
  • rdd.count()
  • You should get a count of the number of lines in that file.
  • quit()
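
To sanity-check the line count that pyspark reports, you can count the lines of the same file with plain Python. This is only a rough cross-check, sketched under the assumption that the file is ordinary UTF-8 text; `count_lines` is a hypothetical helper, and the demo uses a throwaway temporary file rather than the real README.md.

```python
import os
import tempfile


def count_lines(path):
    """Count the lines in a text file without using Spark."""
    # Text mode uses universal newlines, so \n, \r\n and \r all end a line.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


# Demo on a small temporary file with three lines (one of them blank).
with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as tmp:
    tmp.write("# Apache Spark\n\nSpark is a fast engine.\n")
    sample_path = tmp.name

sample_count = count_lines(sample_path)
os.remove(sample_path)
print(sample_count)
```

Running `count_lines("README.md")` from the Spark installation directory should normally give the same number as the count shown in the pyspark shell.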