Description

Quickly deploy the NLTK sentiment analysis model called "VADER" with this FastAPI powered project.

This project can be run directly from GitHub Codespaces without having to install anything locally. Just run "make start" after your Codespaces environment is ready and follow the "Usage" section.

API

POST /predict

Request:

{
  "text": "I love this movie!"
}

Response:

{
  "sentiment": "positive",
  "score": 0.692
}
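
The README does not say how the "sentiment" label is derived from the score. As a minimal sketch, assuming the score is a compound-style value in [-1, 1] and that the commonly cited ±0.05 cut-offs are used, the mapping could look like this. The function name, the threshold default, and the "negative" and "neutral" labels are assumptions, not taken from the project:

```python
def label_from_compound(score: float, threshold: float = 0.05) -> str:
    """Map a score in [-1, 1] to a coarse label (thresholds are assumed)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"
```

With these assumed thresholds, the score from the example response above falls in the "positive" band.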

GET /health

{
  "status": "up",
  "timestamp": "2023-10-08T23:22:30Z"
}
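
The timestamp in this example is an ISO 8601 UTC value with a trailing "Z". A sketch of how such a payload could be produced with the standard library follows; the function name is hypothetical and the project's actual handler may differ:

```python
from datetime import datetime, timezone
from typing import Optional


def health_payload(now: Optional[datetime] = None) -> dict:
    """Build a health response with a UTC timestamp like 2023-10-08T23:22:30Z."""
    # Fall back to the current time; convert any aware datetime to UTC.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "status": "up",
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```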

GET /metadata

{
  "model_version": "0.5.0",
  "model_type": "VADER Sentiment Analysis",
  "model_training_date": "2014-11-17"
}

Setup

You can either install all dependencies on your host OS or run the project in a Docker container. Either way, you need to have "make" installed.

Using docker

After installing Docker, build the image:

make build-image

To run the tests:

make test-docker

To start the API server:

make start-docker

Running on your host

Make sure you have "python3" and "python3-venv" installed on your system. To run the tests:

make test

To start the API server:

make start

Cleaning up

make clean

Usage

Swagger UI

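You can explore and test the API from the automatically generated GUI (thanks to FastAPI & Swagger UI). With the server running, open the "/docs" path, for example http://127.0.0.1:8000/docs when using the address from the CLI examples below.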

CLI

# create POST request to predict the sentiment of "This movie is amazing!"
curl -X 'POST' \
  'http://127.0.0.1:8000/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"text": "This movie is amazing!"}'

# get health check
curl -X 'GET' \
  'http://127.0.0.1:8000/health' \
  -H 'accept: application/json'

# get model metadata
curl -X 'GET' \
  'http://127.0.0.1:8000/metadata' \
  -H 'accept: application/json'
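
The same POST request can be made from Python. This is a sketch using only the standard library, assuming the server is reachable at the address used in the curl examples; the helper names are hypothetical:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # same address as the curl examples


def build_predict_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST /predict request."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)

# Example (requires the API server to be running):
# print(predict("This movie is amazing!"))
```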

Implementation details

  1. Changed metadata endpoint information to match the version and commit date from the VADER repository.
  2. Used the VSCode extensions for the black formatter and isort, and left their configuration in ".vscode/settings.json".
  3. Created one module for the pydantic schemas and another for the VADER sentiment analysis method. In the latter, the SentimentIntensityAnalyzer is declared globally to avoid loading the lexicon file on each request.
  4. In "main.py", a GET request on the root path "/" is redirected to the "/docs" path, so that the Swagger UI can be reached from the URLs generated by VSCode and Codespaces.
  5. Used "pyproject.toml" instead of "setup.py", since it is the new standardized format for describing Python packages.
  6. The Dockerfile from ".devcontainer" is used by GitHub Codespaces / VSCode for creating a devcontainer with the required dependencies. It is also used by the "build-image", "start-docker" and "test-docker" Makefile targets. Because of this dual role I decided not to include the source files in the image itself and thus in both cases the project root is mounted inside the container.
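
The point of item 3 (build the analyzer once, reuse it for every request) can be illustrated without NLTK. The sketch below uses a hypothetical stand-in class with a tiny made-up lexicon; in the project the shared object is the SentimentIntensityAnalyzer:

```python
class LexiconAnalyzer:
    """Stand-in for an analyzer whose constructor is expensive."""

    loads = 0  # counts how many times the lexicon was "loaded"

    def __init__(self) -> None:
        LexiconAnalyzer.loads += 1
        # Tiny hypothetical lexicon; a real one would be read from a file.
        self.lexicon = {"love": 1.0, "amazing": 1.0, "hate": -1.0}

    def score(self, text: str) -> float:
        # Strip basic punctuation and sum the per-word scores.
        words = [word.strip("!.,?").lower() for word in text.split()]
        return sum(self.lexicon.get(word, 0.0) for word in words)


# Created once at import time and shared by every request handler.
ANALYZER = LexiconAnalyzer()


def handle_predict(text: str) -> float:
    """Hypothetical handler body: reuses the module-level analyzer."""
    return ANALYZER.score(text)
```

Calling the handler repeatedly leaves the load counter at one, which is the saving the global declaration is meant to provide.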

Error Handling

As this is a very simple API, there are not many possible sources of errors. One example of error handling with FastAPI is the use of pydantic schemas to validate the length of the input text and return a descriptive error message:

from pydantic import BaseModel, Field

class SentimentPredictInput(BaseModel):
    text: str = Field(min_length=1, max_length=256)  # reject empty or overly long text

For an empty input, the error looks like this:

{
  "detail": [
    {
      "type": "string_too_short",
      "loc": [
        "body",
        "text"
      ],
      "msg": "String should have at least 1 characters",
      "input": "",
      "ctx": {
        "min_length": 1
      },
      "url": "https://errors.pydantic.dev/2.4/v/string_too_short"
    }
  ]
}
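
A client can turn an error body of this shape into short, readable messages. The helper below is a sketch with a hypothetical name, and it relies only on the "detail", "loc" and "msg" fields shown above:

```python
def summarize_validation_errors(payload: dict) -> list:
    """Return one 'location: message' string per entry in payload['detail']."""
    return [
        f"{'.'.join(str(part) for part in error['loc'])}: {error['msg']}"
        for error in payload.get("detail", [])
    ]
```

For the empty-input error above, this yields a single entry that points at "body.text".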

Logging

The API uses uvicorn's logging mechanism, configured through the "vaderapi/configs/log_conf.yaml" file. This allows the logging format and the log handlers to be customized. By default, logs are written both to standard output and to a file named "server.log" in the root of the project.
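
The contents of the YAML file are not reproduced here. As a rough sketch of what a configuration with a console handler and a file handler can look like, here is an equivalent dictionary in Python's standard logging schema; the formatter string, handler names and function name are assumptions, not the project's actual settings:

```python
import logging
import logging.config


def build_log_config(log_path: str = "server.log") -> dict:
    """Return a dictConfig-style mapping with stdout and file handlers."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "default": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "default",
                "stream": "ext://sys.stdout",
            },
            "file": {
                "class": "logging.FileHandler",
                "formatter": "default",
                "filename": log_path,
            },
        },
        "loggers": {
            # Route the "uvicorn" logger to both handlers.
            "uvicorn": {"handlers": ["console", "file"], "level": "INFO", "propagate": False},
        },
    }
```

A YAML file with the same structure can be passed to uvicorn through its "--log-config" option (YAML support requires PyYAML), or the dictionary can be applied directly with logging.config.dictConfig.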