This repository contains PySpark code for a search engine over movie plot summaries, built on TF-IDF (term frequency-inverse document frequency). The backend is a Flask application that exposes an API: it accepts a user's query and returns a list of candidate movies for that query.

Description

The goal is to let users find movies by searching their plot summaries. The system uses TF-IDF to identify the summaries most relevant to the user's search query.
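To illustrate how TF-IDF can rank summaries against a query, here is a minimal sketch in plain Python. It is not the repository's PySpark code: the function names (`tokenize`, `build_index`, `search`), the cosine-similarity scoring, and the three sample summaries are all assumptions made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep runs of letters and digits as terms.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(summaries):
    """Build TF-IDF vectors for a dict of {title: plot summary}."""
    term_counts = {title: Counter(tokenize(text)) for title, text in summaries.items()}
    n_docs = len(summaries)
    # Document frequency: in how many summaries each term appears.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    vectors = {
        title: {term: tf * idf[term] for term, tf in counts.items()}
        for title, counts in term_counts.items()
    }
    return vectors, idf


def search(query, vectors, idf, top_k=5):
    """Return up to top_k (title, score) pairs ranked by cosine similarity."""
    q_counts = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {term: tf * idf[term] for term, tf in q_counts.items()}
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))
    if q_norm == 0:
        return []  # no query term carries any weight in this corpus
    scores = []
    for title, vec in vectors.items():
        dot = sum(w * vec.get(term, 0.0) for term, w in q_vec.items())
        d_norm = math.sqrt(sum(w * w for w in vec.values()))
        if dot > 0 and d_norm > 0:
            scores.append((title, dot / (q_norm * d_norm)))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Made-up sample data, for illustration only.
SUMMARIES = {
    "Space Rescue": "An astronaut is stranded on a distant planet and must survive until rescue arrives.",
    "Heist Night": "A crew of thieves plans a daring bank robbery in one night.",
    "Lost Planet": "Explorers discover a planet with strange creatures.",
}

vectors, idf = build_index(SUMMARIES)
results = search("astronaut stranded planet", vectors, idf)
print(results)
```

Terms that occur in every summary get an IDF of zero and so do not influence the ranking, which is why a common word such as "a" has no effect on the scores here.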

Getting Started

1. Data Upload

2. Build the Docker image

docker build -t search_engine .

3. Run the container

docker run --rm -p 3000:5000 -it search_engine

4. Final steps

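  • Go to http://0.0.0.0:3000/ (or http://localhost:3000/ if your browser cannot open 0.0.0.0) and enter your search query. Port 3000 on the host is mapped to port 5000 inside the container.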
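For orientation, a Flask backend listening on port 5000 inside the container could look roughly like the sketch below. The `/search` route, the `q` parameter, the `CATALOG` data, and the `rank_movies` placeholder are hypothetical and are not taken from this repository; the placeholder does simple word matching in place of the real TF-IDF ranking.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Made-up catalog, for illustration only.
CATALOG = {
    "Space Rescue": "an astronaut stranded on a distant planet",
    "Heist Night": "thieves plan a bank robbery",
}


def rank_movies(query):
    # Hypothetical placeholder: return titles whose summary shares a word with the query.
    words = query.lower().split()
    return [
        title
        for title, summary in CATALOG.items()
        if any(word in summary.split() for word in words)
    ]


@app.route("/search")
def search_endpoint():
    # Read the query string parameter, e.g. /search?q=planet
    query = request.args.get("q", "").strip()
    if not query:
        return jsonify(error="missing query parameter 'q'"), 400
    return jsonify(query=query, results=rank_movies(query))


if __name__ == "__main__":
    # Bind to all interfaces so the port published by Docker is reachable.
    app.run(host="0.0.0.0", port=5000)
```

Binding to 0.0.0.0 matters inside a container: Flask's development server listens on 127.0.0.1 by default, which a port published with `-p 3000:5000` cannot reach.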

Technologies Used

  • PySpark

  • Flask

  • Docker

License

This project is licensed under the MIT License.