duyydang/spark-kafka-integration
This project is a small base example showing how to use Spark with Kafka.

Setting it up takes a few steps.

1. Run Docker Compose

Open a terminal and run: docker compose up -d
The docker-compose.yml file includes Kafdrop, a web UI for monitoring all the messages and topics in Kafka.
Additionally, you can uncomment the "spark" service to run Spark in the same Docker environment as Kafka.
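The compose file itself is not reproduced here, but a minimal sketch of the Kafka + Kafdrop setup it describes could look like the following (image names, versions, and ports are assumptions, not the repo's actual file):

```yaml
services:
  kafka:
    image: apache/kafka:3.7.0        # assumed image/version
    ports:
      - "9092:9092"

  kafdrop:
    image: obsidiandynamics/kafdrop  # web UI for Kafka topics and messages
    ports:
      - "9000:9000"
    environment:
      KAFKA_BROKERCONNECT: kafka:9092
    depends_on:
      - kafka

  # Uncomment to run Spark in the same Docker network as Kafka:
  # spark:
  #   image: bitnami/spark
```

With this running, Kafdrop is reachable at http://localhost:9000 and shows brokers, topics, and messages.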

2. Run the Spark master

$SPARK_HOME/sbin/start-master.sh

You need to install Spark before running this; download it from https://spark.apache.org/downloads.html. (I may write a setup guide later, or you can search for one.)
To actually execute jobs in standalone mode you also need at least one worker attached to the master: $SPARK_HOME/sbin/start-worker.sh spark://<master-host>:7077

3. Submit the Spark job

In this case I'm running Spark in standalone mode on my local machine rather than in Docker. Check the master URL in the Spark UI (http://localhost:8080 by default) and replace the --master value below with it:

spark-submit \
  --master spark://duydang:7077 \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1 \
  spark_kafka_integration.py
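The contents of spark_kafka_integration.py are not shown in this README, but a minimal Structured Streaming job that reads from Kafka and prints to the console could look like this (the topic name, bootstrap address, offsets, and console sink are assumptions, not the repo's actual code):

```python
# Minimal sketch of a Spark-Kafka streaming job. Assumed values:
# topic "events", broker on localhost:9092, console output.

KAFKA_BOOTSTRAP = "localhost:9092"
TOPIC = "events"


def kafka_source_options(bootstrap_servers: str, topic: str) -> dict:
    """Build the options dict for Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "earliest",
    }


if __name__ == "__main__":
    # pyspark import is kept inside the guard so the helper above
    # can be imported without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = (
        SparkSession.builder
        .appName("spark-kafka-integration")
        .getOrCreate()
    )

    df = (
        spark.readStream
        .format("kafka")
        .options(**kafka_source_options(KAFKA_BOOTSTRAP, TOPIC))
        .load()
    )

    # Kafka delivers key/value as binary; cast to strings before printing.
    query = (
        df.select(col("key").cast("string"), col("value").cast("string"))
        .writeStream
        .format("console")
        .outputMode("append")
        .start()
    )
    query.awaitTermination()
```

The --packages flag in the spark-submit command above is what pulls in the Kafka source used by .format("kafka"); it is not bundled with Spark itself.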

About

This project uses Kafka deployed in Docker together with Spark Streaming running on the local machine.
