This repo compares different strategies for shaping an LLM's output style, with special attention given to applications in library science.
Clone this repo and, using uv, install the dependencies:

uv sync
Add the following environment variables:
export PYTHONPATH=".venv/bin/python"
export HF_TOKEN="" # optional
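If you do set a token, the scripts and notebooks can authenticate with the Hugging Face Hub before pushing anything. A minimal sketch of picking it up in Python (the exact login mechanism used in this repo is an assumption):

```python
import os

from huggingface_hub import login

# HF_TOKEN is optional; only authenticate when it is set.
token = os.environ.get("HF_TOKEN")
if token:
    login(token=token)
```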
The repo contains notebooks for creating and comparing different strategies for shaping an LLM's text output.
Special attention is given to the field of library science, using IIIF metadata from Northwestern University's Digital Collections.
The first two notebooks are related to fine-tuning: the first creates a dataset from a digital collection, and the second walks through fine-tuning an open-source model.
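As a rough illustration of the dataset-building step, the sketch below pulls a IIIF Presentation v3 manifest and keeps its label and summary as one record; the manifest URL and record fields are placeholders, not the notebook's actual logic:

```python
import requests
from datasets import Dataset

# Placeholder URL; substitute a real IIIF v3 manifest from the digital collection.
MANIFEST_URL = "https://example.org/iiif/manifest.json"

def first_value(lang_map):
    """Return the first string from a IIIF v3 language map, e.g. {"none": ["A title"]}."""
    return next(iter(lang_map.values()))[0] if lang_map else ""

manifest = requests.get(MANIFEST_URL, timeout=30).json()
record = {
    "title": first_value(manifest.get("label", {})),
    "description": first_value(manifest.get("summary", {})),
}

# A real training set would collect many such records (plus image URLs) before saving.
dataset = Dataset.from_list([record])
print(dataset)
```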
The third notebook generates a prompt that is informed by the field of stylometry.
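For context, a stylistically informed prompt can fold simple corpus statistics into the instructions. The sketch below computes a few coarse stylometric features from sample descriptions and templates them into a prompt; the features and wording are illustrative assumptions, not the notebook's exact approach:

```python
import re
import statistics

def stylometric_profile(samples):
    """Compute a few coarse stylometric statistics over example descriptions."""
    sentences = [s for text in samples for s in re.split(r"[.!?]+\s*", text) if s]
    words = [w for s in sentences for w in s.split()]
    return {
        "avg_sentence_len": round(statistics.mean(len(s.split()) for s in sentences), 1),
        "avg_word_len": round(statistics.mean(len(w) for w in words), 1),
        "type_token_ratio": round(len({w.lower() for w in words}) / len(words), 2),
    }

samples = [
    "Black-and-white photograph of students gathered on the quad.",
    "Hand-colored postcard depicting the lakefront campus, circa 1910.",
]
profile = stylometric_profile(samples)

prompt = (
    "Describe this image as a catalog record. "
    f"Target roughly {profile['avg_sentence_len']} words per sentence, "
    f"an average word length of {profile['avg_word_len']} characters, "
    f"and a type-token ratio near {profile['type_token_ratio']}."
)
print(prompt)
```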
The last notebook is a comparison of:
- the base model
- using the stylistically informed prompt
- the fine tuned model
A Hugging Face account is not required, but there are optional steps for pushing the dataset and fine-tuned model to the Hugging Face Hub.
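If you do want to push to the Hub, the pattern is roughly as follows (the repo name is a placeholder, and you must be authenticated, e.g. via HF_TOKEN above):

```python
from datasets import Dataset

dataset = Dataset.from_list([{"title": "Example", "description": "An example description."}])

# Repo name is a placeholder; the fine-tuned model and processor expose a
# similar .push_to_hub() method.
dataset.push_to_hub("your-username/nul-dc-style-dataset")
```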
To test prompts and models, use the inference.py script:
uv run inference.py \
  --model=HuggingFaceTB/SmolVLM-Instruct \
  --image=https://iiif.dc.library.northwestern.edu/iiif/3/1ece48b0-8a49-491d-9f3f-90dc8bcca1ac/full/\!300,300/0/default.jpg
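If you prefer to call the model from Python rather than the CLI, the snippet below follows the standard SmolVLM usage from the transformers documentation; it approximates what inference.py likely does, but the script's internals are not shown here, so treat it as an assumption:

```python
import requests
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

MODEL_ID = "HuggingFaceTB/SmolVLM-Instruct"
IMAGE_URL = (
    "https://iiif.dc.library.northwestern.edu/iiif/3/"
    "1ece48b0-8a49-491d-9f3f-90dc8bcca1ac/full/!300,300/0/default.jpg"
)

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID)

image = Image.open(requests.get(IMAGE_URL, stream=True).raw)

# Build a chat-style prompt with one image and one text instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image for a library catalog."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

generated_ids = model.generate(**inputs, max_new_tokens=200)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```

Swapping MODEL_ID for your fine-tuned checkpoint, or substituting the stylometric prompt from the third notebook, mirrors the comparison carried out in the last notebook.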