This repository was archived by the owner on Aug 21, 2025. It is now read-only.

octoai/octo-spark


Prerequisites

  • Scala
  • Spark

This is tested against Spark 2.0.0 for Hadoop 2.7 and Spark 1.6.2 for CDH4. We prefer Spark 2.0.0, and that is what this repo uses. SPARK_HOME below refers to your Spark installation directory.

  • sbt
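The commands in the rest of this README assume SPARK_HOME (and, for the HBase examples, HBASE_HOME) are set. A minimal setup sketch — the paths below are placeholders for illustration, so adjust them to your installations:

```shell
# Example environment setup (paths are placeholders; adjust to your installs)
export SPARK_HOME=~/etc/spark-2.0.0-bin-hadoop2.7
export HBASE_HOME=~/etc/hbase-0.94.15-cdh4.7.1
```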

DB Adapters

Clone

git clone git@github.com:octoai/octo-spark.git

Build

$ sbt clean package

Execute

Cassandra

Hello World Example

$  ~/etc/spark-2.0.0-bin-hadoop2.6/bin/spark-submit --class com.octo.HelloWorldExample --jars ~/etc/spark-cassandra-connector-assembly-2.0.0-M3.jar --properties-file cassandra.conf target/scala-2.10/octo-spark_2.10-0.0.1.jar 
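A minimal cassandra.conf for the command above might look like the following. These are standard Spark / spark-cassandra-connector property names, but the exact keys this example reads are an assumption; the host value is a placeholder:

```properties
# Spark properties file: one  key  value  pair per line
spark.master                        local[2]
spark.cassandra.connection.host     127.0.0.1
```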

HBase

We follow the Cloudera distribution of HBase (http://www.cloudera.com/).

We refer to the HBase home directory as HBASE_HOME. The commands below use a relative path to the HBase jar; change it to match your directory structure.
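The --driver-class-path argument in the commands below is built by listing every jar under $HBASE_HOME/lib and filtering out Netty jars, which can conflict with the Netty bundled in Spark. A small sketch of that pipeline, using a throwaway directory for illustration:

```shell
# Demonstrate the jar-listing pipeline with a throwaway lib directory
mkdir -p /tmp/hbase-lib-demo
touch /tmp/hbase-lib-demo/hbase-client.jar /tmp/hbase-lib-demo/netty-3.6.6.jar

# List all jars one per line, dropping anything matching netty*.jar
echo /tmp/hbase-lib-demo/*.jar | xargs -n1 | grep -v 'netty.*\.jar$'
```

In the real commands this pipeline runs inside $( ... ); depending on your shell and Spark version you may need to join the surviving entries with ':' to form a valid classpath.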

Hello World Example

 $SPARK_HOME/bin/spark-submit --class "HBaseHelloWorld" --driver-class-path "$(echo $HBASE_HOME/lib/*.jar |xargs -n1|grep -v 'netty.*\.jar$')" --jars "../../../../etc/hbase-0.94.15-cdh4.7.1/hbase-0.94.15-cdh4.7.1-security.jar" --verbose --properties-file "hbase.conf" target/scala-2.11/octo-spark_2.11-0.0.1.jar

Product Recommender

 $SPARK_HOME/bin/spark-submit --class "ProductRecommender" --driver-class-path "$(echo $HBASE_HOME/lib/*.jar |xargs -n1|grep -v 'netty.*\.jar$')" --jars "../../../../etc/hbase-0.94.15-cdh4.7.1/hbase-0.94.15-cdh4.7.1-security.jar" --verbose --properties-file "hbase.conf" target/scala-2.11/octo-spark_2.11-0.0.1.jar

Time Recommender

 $SPARK_HOME/bin/spark-submit --class "TimeRecommender" --driver-class-path "$(echo $HBASE_HOME/lib/*.jar |xargs -n1|grep -v 'netty.*\.jar$')" --jars "../../../../etc/hbase-0.94.15-cdh4.7.1/hbase-0.94.15-cdh4.7.1-security.jar" --verbose --properties-file "hbase.conf" target/scala-2.11/octo-spark_2.11-0.0.1.jar
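Each HBase command reads its settings from hbase.conf via --properties-file, which uses Spark's standard properties-file format. The keys below are generic Spark settings shown only to illustrate that format — the specific keys these examples expect are not documented here:

```properties
# Spark properties-file format: one  key  value  pair per line
spark.master            local[2]
spark.executor.memory   1g
```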

Troubleshooting

  • If you get a lot of compile-time errors involving Java ClassNotFoundException, the most likely cause is a Scala version mismatch. Make sure your Scala version is consistent across Spark, sbt, and the connector.
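Spark 2.0.0 is built against Scala 2.11, so one way to keep versions consistent is to pin scalaVersion in build.sbt to match the Spark and connector builds you use (the version number below is illustrative):

```scala
// build.sbt — pin the Scala version to match your Spark/connector builds
scalaVersion := "2.11.8"
```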

About

Octo-spark libraries and integration
