A Dockerized service that exposes Pandoc's document conversion functionality through a REST API.
- Simple REST API to access Pandoc
- Direct subprocess calls to the pandoc binary (no Python module dependency)
- Compatible with amd64 and arm64 architectures
- Easily deployable via Docker
To install the latest version of the Pandoc Service, run the following command:
docker pull ghcr.io/schweizerischebundesbahnen/pandoc-service:latest

To start the Pandoc service container, execute:
docker run --init --detach \
--publish 9082:9082 \
--name pandoc-service \
--env REQUEST_BODY_LIMIT_MB=500 \
ghcr.io/schweizerischebundesbahnen/pandoc-service:latest

The service will be accessible on port 9082.
The REQUEST_BODY_LIMIT_MB environment variable sets the maximum allowed size (in megabytes) for uploaded files or request bodies processed by the Pandoc service. The default is 500 MB.
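A client can check a payload against this limit before uploading it. The following is a minimal sketch, not part of the service: the helper name `fits_body_limit` is hypothetical, and it assumes one megabyte is counted as 1024 * 1024 bytes, which the service may handle differently.

```python
def fits_body_limit(payload: bytes, limit_mb: int = 500) -> bool:
    """Return True if the payload is no larger than limit_mb megabytes.

    Assumption: one megabyte is 1024 * 1024 bytes. The service itself
    may count the limit differently.
    """
    return len(payload) <= limit_mb * 1024 * 1024
```

Set `limit_mb` to the value passed to the container through REQUEST_BODY_LIMIT_MB.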
To extend or customize the service, use it as a base image in the Dockerfile:
FROM ghcr.io/schweizerischebundesbahnen/pandoc-service:latest

To build the Docker image from the source with a custom version, use:
docker build \
--build-arg APP_IMAGE_VERSION=0.0.0 \
--file Dockerfile \
--tag pandoc-service:0.0.0 .

Replace 0.0.0 with the desired version number.
To start the Docker container with your custom-built image:
docker run --init --detach \
--publish 9082:9082 \
--name pandoc-service \
pandoc-service:0.0.0

To stop the running container, execute:

docker container stop pandoc-service

The project includes several test methods to ensure functionality.
To run the container structure tests:

container-structure-test test --image pandoc-service:local --config ./tests/container/container-structure-test.yaml

To test the Docker image build and API functionality:

bash tests/shell/test_pandoc_service.sh

This script builds the image, starts a container, and performs tests on all endpoints.
# Prepare testing
poetry install

# Run all Python tests
poetry run pytest -v

# Run a specific test
poetry run pytest tests/test_docx_post_process.py -v

# Run all tests and linting
poetry run tox

poetry run pre-commit run --all

For more detailed testing information, see the tests README.
Pandoc Service provides the following endpoints:
GET /version
| HTTP code | Content-Type | Response |
| --- | --- | --- |
| 200 | application/json | { "python": "3.12.5", "timestamp": "2024-09-23T12:23:09Z", "pandoc": "3.6.2", "pandocService": "0.0.0" } |
curl -X GET -H "Content-Type: application/json" http://localhost:9082/version
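The same request can be made from Python with the standard library. This is a sketch that assumes the service is reachable at http://localhost:9082 as in the run command above; the helper names `parse_version` and `fetch_version` are illustrative and not part of the service.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9082"  # assumption: container published on port 9082


def parse_version(body: bytes) -> dict:
    """Decode the JSON body returned by GET /version."""
    return json.loads(body.decode("utf-8"))


def fetch_version(base_url: str = BASE_URL) -> dict:
    """Call GET /version and return the parsed version information."""
    with urllib.request.urlopen(f"{base_url}/version", timeout=10) as response:
        return parse_version(response.read())
```

With a running container, `fetch_version()["pandoc"]` should give the Pandoc version reported by the service.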
GET /static/openapi.json
| HTTP code | Content-Type | Response |
| --- | --- | --- |
| 200 | application/json | openapi.json |
curl -X GET -H "Content-Type: application/json" http://localhost:9082/static/openapi.json
GET /docx-template
| HTTP code | Content-Type | Response |
| --- | --- | --- |
| 200 | application/vnd.openxmlformats-officedocument.wordprocessingml.document | binary document content |
curl -X GET -H "Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document" http://localhost:9082/docx-template
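After downloading the template, a quick sanity check can confirm that the bytes look like a DOCX file. The sketch below relies on DOCX files being ZIP archives that conventionally contain a `word/document.xml` entry. The helper name `looks_like_docx` is hypothetical, and the check is a heuristic, not a full validation.

```python
import io
import zipfile


def looks_like_docx(data: bytes) -> bool:
    """Heuristic check: a ZIP archive containing word/document.xml."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        return "word/document.xml" in archive.namelist()
```

The same check can be applied to the output of the conversion endpoints before the file is passed on.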
POST /convert/html/to/docx
| Parameter name | Type | Data type | Description |
| --- | --- | --- | --- |
| encoding | optional | string | Encoding of provided HTML (default: utf-8) |
| file_name | optional | string | Output filename (default: converted-document.pdf) |
| paper_size | optional | string | Paper size for the output document. Supported values: A5, A4, A3, B5, B4, JIS_B5, JIS_B4, LETTER, LEGAL, LEDGER |
| orientation | optional | string | Page orientation. Supported values: portrait, landscape |

| HTTP code | Content-Type | Response |
| --- | --- | --- |
| 200 | application/vnd.openxmlformats-officedocument.wordprocessingml.document | DOCX document (binary data) |
| 400 | plain/text | Error message with exception |
| 500 | plain/text | Error message with exception |

curl -X POST -H "Content-Type: application/html" --data @input_html http://localhost:9082/convert/html/to/docx --output output.docx

With custom paper size and orientation:
curl -X POST -H "Content-Type: application/html" --data @input_html "http://localhost:9082/convert/html/to/docx?paper_size=A4&orientation=landscape" --output output.docx
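The query parameters can also be assembled programmatically. The sketch below builds the request with Python's standard library and rejects unsupported `paper_size` and `orientation` values on the client side, using the lists from the parameter table. The function name `build_convert_request` and the default base URL are assumptions for illustration. The `Content-Type` header mirrors the curl example above.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Supported values as listed in the parameter table.
PAPER_SIZES = {"A5", "A4", "A3", "B5", "B4", "JIS_B5", "JIS_B4",
               "LETTER", "LEGAL", "LEDGER"}
ORIENTATIONS = {"portrait", "landscape"}


def build_convert_request(
    html: bytes,
    base_url: str = "http://localhost:9082",  # assumption: local container
    paper_size: Optional[str] = None,
    orientation: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST request for /convert/html/to/docx without sending it."""
    if paper_size is not None and paper_size not in PAPER_SIZES:
        raise ValueError(f"unsupported paper_size: {paper_size}")
    if orientation is not None and orientation not in ORIENTATIONS:
        raise ValueError(f"unsupported orientation: {orientation}")

    params = {}
    if paper_size is not None:
        params["paper_size"] = paper_size
    if orientation is not None:
        params["orientation"] = orientation

    url = f"{base_url}/convert/html/to/docx"
    if params:
        url += "?" + urllib.parse.urlencode(params)

    return urllib.request.Request(
        url,
        data=html,
        method="POST",
        headers={"Content-Type": "application/html"},
    )
```

To send the request, pass the result to `urllib.request.urlopen` and write `response.read()` to a file such as output.docx.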
POST /convert/html/to/docx-with-template
| Parameter name | Type | Data type | Description |
| --- | --- | --- | --- |
| source | required | file | Source HTML content as multipart/form-data |
| template | optional | file | Custom DOCX template file as multipart/form-data |
| encoding | optional | string | Encoding of provided HTML (default: utf-8) |
| file_name | optional | string | Output filename (default: converted-document.docx) |
| paper_size | optional | string | Paper size for the output document. Supported values: A5, A4, A3, B5, B4, JIS_B5, JIS_B4, LETTER, LEGAL, LEDGER |
| orientation | optional | string | Page orientation. Supported values: portrait, landscape |

| HTTP code | Content-Type | Response |
| --- | --- | --- |
| 200 | application/vnd.openxmlformats-officedocument.wordprocessingml.document | DOCX document (binary data) |
| 400 | plain/text | Error message with exception |
| 500 | plain/text | Error message with exception |
curl -X POST -F "source=@input.html" -F "template=@template.docx" http://localhost:9082/convert/html/to/docx-with-template --output output.docx

With custom paper size and orientation:

curl -X POST -F "source=@input.html" -F "template=@template.docx" "http://localhost:9082/convert/html/to/docx-with-template?paper_size=A4&orientation=landscape" --output output.docx
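For clients without a multipart helper library, the form body can be encoded by hand. The following sketch uses only the Python standard library. The field names `source` and `template` come from the parameter table. The function names, the file names `input.html` and `template.docx`, the `text/html` part type, and the base URL are assumptions for illustration.

```python
import uuid
import urllib.request
from typing import Dict, Optional, Tuple

DOCX_MIME = (
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
)


def encode_multipart(
    files: Dict[str, Tuple[str, bytes, str]],
) -> Tuple[bytes, str]:
    """Encode {field: (filename, content, content_type)} as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for field, (filename, content, content_type) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_template_request(
    source_html: bytes,
    template_docx: Optional[bytes] = None,
    base_url: str = "http://localhost:9082",  # assumption: local container
) -> urllib.request.Request:
    """Build a POST request for /convert/html/to/docx-with-template."""
    # File names below are hypothetical placeholders.
    files = {"source": ("input.html", source_html, "text/html")}
    if template_docx is not None:
        files["template"] = ("template.docx", template_docx, DOCX_MIME)
    body, content_type = encode_multipart(files)
    return urllib.request.Request(
        f"{base_url}/convert/html/to/docx-with-template",
        data=body,
        method="POST",
        headers={"Content-Type": content_type},
    )
```

The `template` part is omitted when no template is supplied, matching its optional status in the parameter table.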