A Gradio-based web application that detects and rewrites sentences according to the IBM Style Guide using a custom-trained Hugging Face model.
- Supports multiple file formats: .txt, .docx, .pdf, .md, .adoc, .dita
- Detects issues sentence-by-sentence and provides rewrites
- Simple Gradio UI for easy file uploads and results viewing
- Python 3.7 or higher
- Git
- (Optional) Hugging Face CLI for model access
- Clone the repository
  git clone https://github.com/your-username/your-repo.git
  cd your-repo
- Create and activate a virtual environment
  python -m venv venv
  # Linux/Mac
  source venv/bin/activate
  # Windows
  venv\Scripts\activate
- Install dependencies
  pip install -r requirements.txt
- Run the app
  python app.py
- Open the Gradio interface
  After the app starts, the terminal displays a local URL (e.g., http://127.0.0.1:7860). Open it in your browser.
- Upload a file
  Use the file upload widget to select a supported file. The app extracts the text, detects issues, and displays corrections.
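The extract, detect, and correct flow described in the last step can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the names split_sentences, review_text, and demo_rewrite are hypothetical, the regex-based sentence splitter is a naive stand-in, and treating "rewrite differs from original" as a detected issue is an assumption about how detection works.

```python
import re
from typing import Callable, Dict, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def review_text(text: str, rewrite: Callable[[str], str]) -> List[Dict[str, object]]:
    """Run `rewrite` on each sentence and flag the ones whose wording changed."""
    results = []
    for sentence in split_sentences(text):
        suggestion = rewrite(sentence).strip()
        results.append(
            {
                "original": sentence,
                "rewrite": suggestion,
                # Assumption: a sentence counts as an issue when the rewrite differs.
                "changed": suggestion != sentence,
            }
        )
    return results


def demo_rewrite(sentence: str) -> str:
    """Hypothetical stand-in for the model so the sketch runs without a download."""
    return sentence.replace("utilize", "use")


if __name__ == "__main__":
    for row in review_text("Please utilize the button. Click Save.", demo_rewrite):
        print(row)
```

To try this with the real model, the stand-in could be replaced by a small wrapper around a Transformers text2text-generation pipeline, for example `lambda s: pipe(s)[0]["generated_text"]`, assuming `pipe` has been created beforehand.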
- .txt: Plain text files
- .docx: Microsoft Word documents
- .pdf: PDF files
- .md: Markdown files (HTML tags are stripped)
- .adoc: AsciiDoc files
- .dita: DITA XML files (extracts <p> paragraph text)
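Two of the format-specific behaviors listed above, stripping HTML tags from Markdown and pulling <p> text out of DITA XML, can be approximated with the standard library alone. The sketch below is an assumption about how such extraction might look, not the app's actual implementation; the function names strip_html_tags and dita_paragraphs are hypothetical, and .docx and .pdf are left out because they need third-party libraries.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape
from typing import List


def strip_html_tags(markdown_text: str) -> str:
    """Remove inline HTML tags from Markdown text and decode entities such as &amp;."""
    without_tags = re.sub(r"<[^>]+>", "", markdown_text)
    return unescape(without_tags)


def dita_paragraphs(dita_xml: str) -> List[str]:
    """Return the text of every <p> element, including text inside nested inline tags."""
    root = ET.fromstring(dita_xml)
    paragraphs = []
    for p in root.iter("p"):
        # itertext() walks nested elements; split/join collapses runs of whitespace.
        text = " ".join("".join(p.itertext()).split())
        if text:
            paragraphs.append(text)
    return paragraphs


if __name__ == "__main__":
    print(strip_html_tags("Click <b>Save</b> &amp; close."))
    sample = "<topic id='t1'><title>T</title><body><p>First <b>bold</b> step.</p></body></topic>"
    print(dita_paragraphs(sample))
```

A regex is a simplification here: it handles simple inline tags but not every HTML edge case, so a real parser may be a better fit for messy input.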
This app uses the Hugging Face model gtrivedi/style-guide-base (with its corresponding tokenizer) for correction:
from transformers import pipeline

pipe = pipeline(
    "text2text-generation",
    model="gtrivedi/style-guide-base",
    tokenizer="gtrivedi/style-guide-base",
)

├── app.py # Main application file
├── requirements.txt # Python dependencies
├── .gradio/ # Gradio-specific configurations (do not modify)
├── venv/ # Virtual environment (ignored by Git)
└── README.md # This file
Include the following in your .gitignore to avoid committing environment and config folders:
venv/
.gradio/
Review resources directory
Contributions are welcome! Feel free to open issues or submit pull requests for improvements.
This project is maintained by Gaurav. Feel free to reach out with any questions or feedback.