
Commit b4e9846
Major Changes for 1.0.0
Signed-off-by: AkshathRaghav <[email protected]>
1 parent 18d7cfb · commit b4e9846
27 files changed: +2513 −1906 lines

.github/workflows/formatter.yml (+27, new file)

```yaml
name: Run formatting on the codebase.

on: [push]

jobs:
  run-script:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Python 3.9
        uses: actions/setup-python@v3
        with:
          python-version: 3.9

      - name: Install dependencies (if any)
        run: |
          python -m pip install --upgrade pip
          pip install --upgrade autopep8

      - name: Run the script
        run: |
          autopep8 --in-place --recursive .
```

README.md (+72 −31)
@@ -24,11 +24,17 @@

 ## 🤔 What is this?

-This repository contains code to abstract the LLM output constraining process. It helps you define your grammar rules using Pydantic and Typing in a pythonic way, and inherently embeds metadata from these dataclasses into the prompt. Parsing is enabled in JSON, TOML and XML formats, with custom parsers that avoid the issues faced by `json.loads` (..etc) while parsing direct outputs. It can also create GNBF grammr from the same, which is used by the [llama.cpp](https://github.com/ggerganov/llama.cpp/) package for sampling logits smartly.
+GrammarFlow abstracts the **LLM constraining process for complex-response tasks**. It helps you define your grammar rules using Pydantic and Typing in a pythonic way, and inherently embeds metadata from these dataclasses into the prompt. Parsing is enabled in JSON, TOML and XML formats, with custom parsers that avoid the issues faced by `json.loads` (etc.) when parsing raw outputs.

-The goal of this package was to overcome the issues faced when using langchain's output parsers with instruct language models. While GPT-4 produces consistent results in returning the correct formats, Llama-7B would cause parsing errors in my testing chains with more complex prompts.
+Importantly, the package supports the generation of **GBNF grammar**, which integrates seamlessly with the [llama.cpp](https://github.com/ggerganov/llama.cpp/) package. This integration allows for more intelligent sampling of logits, improving the quality of model responses.

-> Please reach out to `araviki [at] purdue [dot] edu` or open an issue on Github if you have any questions or inquiry related to GrammarFlow and its usage.
+The goal of this package was to overcome the issues faced when using LangChain's output parsers with instruct language models locally. While GPT-4 produces consistent results in returning the correct formats, local models from families like Llama and Mistral would cause parsing errors in my testing chains whenever I needed more than a single string response. Recently, GrammarFlow was extended with more features for anyone working with LLMs on complex use cases: multi-grammar generation, regex patterns, etc.
+
+Moreover, GrammarFlow is meant for use cases involving (any kind of) AI agents, as well as extracting content from text or question-answering problems. This gives it an *edge over* batched LLM generation and schema recomposing, since those methods require *many more calls* to an inference function, which increases the total cost of an iteration when using a paid service like GPT or Gemini.
+
+Kindly go through the [`Remarks`](https://github.com/e-lab/SyntaxShaper/tree/main?tab=readme-ov-file#remarks) section to get a complete understanding of what we're doing.
+
+> Please reach out to `araviki[at]purdue[dot]edu` or open an issue on GitHub if you have any questions or inquiries related to GrammarFlow and its usage.
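
To make the Pydantic-first flow concrete, here is a minimal sketch of what a grammar definition looks like; the `Citation`/`Answer` models are hypothetical, not templates shipped with the package:

```python
from typing import List
from pydantic import BaseModel, Field

# Hypothetical grammar: plain Pydantic models double as grammar rules,
# and field metadata (types, regex patterns) gets embedded into the prompt.
class Citation(BaseModel):
    source: str
    url: str = Field(..., pattern=r"https?://\S+")  # regex constraint

class Answer(BaseModel):
    thought: str
    citations: List[Citation]  # nested models and typing containers are allowed
```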
 ## Results:

@@ -68,21 +74,37 @@ GrammarFlow was tested against popular LLM datasets, with a focus on constraining

 |-------------------------------------------------------------------------------+------------------------|
 ```

-## ⚡ Quick Install
+## ⚡ Installation
+
+#### Quick Install

 `pip install grammarflow`

+#### (Not so quick) Install
+
+```
+conda create --name grammarflow python=3.9 -y
+conda activate grammarflow
+
+git clone https://github.com/e-lab/SyntaxShaper
+cd SyntaxShaper
+pip install .
+```
 ## 📃 Code Usage

+> The [guide](https://github.com/e-lab/SyntaxShaper/blob/main/guide.ipynb) contains an in-depth explanation of all the classes and functions.
+
 Map out what your agent chain is doing. Understand what its goals are and what data needs to be carried forward from one step to the next.
 For example, consider the [ReAct prompting framework](https://react-lm.github.io/). In every call, we want to pass in the Action and subsequent Observation to the next call.

 ```python
 from grammarflow import *
-from grammarflow.prompt.template import Agent
-from grammarflow.grammars.template import AgentStep
-from grammarflow.tools.LLM import LocalLlama
+from grammarflow.prompt.template import Agent  # Prompt
+from grammarflow.grammars.template import AgentStep  # Structured Grammar
+from grammarflow.tools.llm import LocalLlama  # Barebones inference call; interfaces with llama.cpp

 llm = LocalLlama()
 prompt = Agent()
@@ -91,14 +113,15 @@ prompt = Agent()
 system_context = """Your goal is to think and plan out how to solve questions using agent tools provided to you. Think about all aspects of your thought process."""
 user_message = """Who is Vladimir Putin?"""

-with Constrain(prompt, 'xml') as manager:
+with Constrain('xml') as manager:
     # Makes the changes to the prompt
-    manager.format_prompt(
+    prompt = manager.format(
+        prompt,
         placeholders={'prompt': user_message, 'instructions': system_context},
         grammars=[{'model': AgentStep}]
     )

-    llm_response = llm(manager.prompt, temperature=0.01)
+    llm_response = llm(prompt, temperature=0.01)

     # Parse the response into a custom dataclass for holding values
     response = manager.parse(llm_response)
@@ -111,16 +134,17 @@ observation = PerformSomeAction(

 ## Features

-GrammarFlow is mainly meant to be an add-on to your existing LLM applications. It works on the input to and output from your `llm()` call, treating everything in between as a black box. It contains pre-made template prompts for local GGUF models like [Llama2 (70B, 13B, 7B)](https://huggingface.co/TheBloke/Upstage-Llama-2-70B-instruct-v2-GGUF), [Mistral](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF), [Mixtral](https://huggingface.co/TheBloke/Synthia-MoE-v3-Mixtral-8x7B-GGUF) and has template grammars for common tasks. Making these prompts and grammars are trivial and require minimal effort, as long as you know the format of what you're building.
+GrammarFlow is mainly meant to be an add-on to your existing LLM applications. It works on the input to and output from your `llm()` call, treating everything in between as a black box. It contains pre-made template prompts for local GGUF models like [Llama2 (70B, 13B, 7B)](https://huggingface.co/TheBloke/Upstage-Llama-2-70B-instruct-v2-GGUF), [Mistral](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF), [Mixtral](https://huggingface.co/TheBloke/Synthia-MoE-v3-Mixtral-8x7B-GGUF) and has template grammars for common tasks like Chain-of-Thought and Iterative Agents. Making these prompts and grammars is trivial and requires minimal effort, as long as you know the format of what you're building.

-- [X] **GBNF Support**: Converts any Pydantic model to GNBF grammar for using with [llama.cpp](https://github.com/ggerganov/llama.cpp/)'s token-based sampling. Enables adding regex patterns directly.
+- [X] **GBNF Support**: Converts any Pydantic model to GBNF grammar for use with [llama.cpp](https://github.com/ggerganov/llama.cpp/)'s token-based sampling. Enables adding regex patterns directly through Pydantic's `Field(..., pattern="")`.
 - [x] **Easy Integration**: Integrates with any package or stack by just manipulating the prompt and decoding the result into a pythonic data abstractor. Treats everything in between as a **black box**.
-- [x] **Handles Complex Grammars**: Can handle typing objects ('List', 'Dict', etc.) and nested Pydantic logic with complex data-types.
-- [x] **Experiments with different 'formats'**: Defines grammar rules in XML, JSON and TOML formats. JSON is the standard, while XML is best for (+3) nested parsing and TOML is best when you want to get multiple models parsed simulatenously. Each has it's own usecase as described in the demo.
-- [x] **Reduces hallucinations or garbage results during sampling**: GBNF grammars allow for controlled whitespacing/identation and model generation ordering, while parsing logic allows for ignoring incorrect terminal symbols.
+- [x] **Handles Complex Grammars**: Can handle typing objects (`List`, `Dict`, etc.) and nested Pydantic logic with complex data types.
+- [x] **Experiments with different 'formats'**: Defines grammar rules in XML, JSON and TOML formats. JSON is the standard, while XML is best for nested parsing and TOML is best when you want multiple models parsed simultaneously (see the sketch below). Each has its own use case, as described in the [guide](https://github.com/e-lab/SyntaxShaper/blob/main/guide.ipynb).
+- [x] **Reduces hallucinations or garbage results during sampling**: GBNF grammars allow for controlled whitespacing/indentation and model ordering, while parsing logic allows for ignoring incorrect terminal symbols.
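
Here is the multi-model case the TOML bullet refers to, as a minimal sketch; it reuses `prompt`, `llm`, and the placeholder strings from the usage example above, and the `Thought`/`Action` models are hypothetical stand-ins for your own grammars:

```python
from pydantic import BaseModel

# Hypothetical grammars standing in for your own models.
class Thought(BaseModel):
    reasoning: str

class Action(BaseModel):
    tool: str
    tool_input: str

with Constrain('toml') as manager:
    # Two grammars in one prompt; TOML keeps the parsed models separable.
    prompt = manager.format(
        prompt,
        placeholders={'prompt': user_message, 'instructions': system_context},
        grammars=[{'model': Thought}, {'model': Action}]
    )
    response = manager.parse(llm(prompt, temperature=0.01))
```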

 ### Examples (@ samples/)
-1. For a general overview of what GrammarFlow can do, look at [demo.ipynb](https://github.com/e-lab/SyntaxShaper/blob/main/samples/demo.ipynb).
+1. For a general overview of what GrammarFlow can do, look at [guide.ipynb](https://github.com/e-lab/SyntaxShaper/blob/main/guide.ipynb).
 2. For my modification to [ReAct's](https://github.com/ysymyth/ReAct) evaluation code on [HotPotQA](https://hotpotqa.github.io/), look at [hotpotqa_modified](https://github.com/e-lab/SyntaxShaper/blob/main/samples/hotpotqa/hotpotqa_modified.ipynb).
 3. I've also added an implementation of a [data annotator](https://github.com/e-lab/SyntaxShaper/blob/main/samples/bert_finetuning/annotator.ipynb) for this [BERT fine-tuning guide](https://www.datasciencecentral.com/how-to-fine-tune-bert-transformer-with-spacy-3/).

@@ -160,42 +184,59 @@ from grammarflow import GNBF

 grammar = GNBF(Project).generate_grammar()

-# Verify with LlamaGrammar
-GNBF.verify_grammar(grammar)
+# Verify with LlamaGrammar from llama-cpp-python
+GNBF.verify_grammar(grammar, format_='json')
 ```

 Results:
 ```
-root ::= project ws
-project ::= "{" ws "\"name\":" ws string "," ws "\"description\":" ws string "," ws "\"project-url\":" ws string "," ws "\"team-members\":" ws teammember "," ws "\"grammars\":" ws grammars "}" ws
-ws ::= [ \t\n]*
+root ::= ws Project
+Project ::= nl "{" "\"Project\":" ws "{" ws "\"name\":" ws string "," nl "\"description\":" ws string "," nl "\"project-url\":" ws string "," nl "\"team-members\":" ws TeamMember "," nl "\"grammars\":" ws Task "}" ws "}"
+ws ::= [ \t\n]
+nl ::= [\n]
 string ::= "\"" (
   [^"\\] |
-  "\\" (["\\/bfnrt] | "u" [0-9a-fa-f] [0-9a-fa-f] [0-9a-fa-f] [0-9a-fa-f])
+  "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
 )* "\""
-teammember ::= "{" ws "\"name\":" ws string "," ws "\"role\":" ws string "}" ws
-number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([ee] [-+]? [0-9]+)?
-taskupdate ::= "{" ws "\"update-time\":" ws number "," ws "\"comment\":" ws string "," ws "\"status\":" ws status "}" ws
+TeamMember ::= nl "{" ws "\"name\":" ws string "," nl "\"role\":" ws string "}"
+number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)?
+boolean ::= ("True" | "False")
+TaskUpdate ::= nl "{" ws "\"update-time\":" ws number "," nl "\"comment\":" ws string "," nl "\"status\":" ws boolean "}"
 array ::= "[" ws (
   due-date-value
   ("," ws due-date-value)*
 )? "]" ws
 due-date-value ::= string
-task ::= "{" ws "\"title\":" ws string "," ws "\"description\":" ws string "," ws "\"assigned-to\":" ws teammember "," ws "\"due-date\":" ws array "," ws "\"updates\":" ws taskupdate "}" ws
+Task ::= nl "{" ws "\"title\":" ws string "," nl "\"description\":" ws string "," nl "\"assigned-to\":" ws TeamMember "," nl "\"due-date\":" ws array "," nl "\"updates\":" ws TaskUpdate "}"
 ```
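For reference, these rules map one-to-one onto the Pydantic models defined earlier in the README. A reconstruction inferred from the grammar itself (the aliases and exact field types are assumptions, not copied from the source):

```python
from typing import List
from pydantic import BaseModel, Field

class TeamMember(BaseModel):
    name: str
    role: str

class TaskUpdate(BaseModel):
    update_time: float = Field(..., alias='update-time')  # `number` rule above
    comment: str
    status: bool                                           # `boolean` rule above

class Task(BaseModel):
    title: str
    description: str
    assigned_to: TeamMember = Field(..., alias='assigned-to')
    due_date: List[str] = Field(..., alias='due-date')     # `array` of strings
    updates: TaskUpdate

class Project(BaseModel):
    name: str
    description: str
    project_url: str = Field(..., alias='project-url')
    team_members: TeamMember = Field(..., alias='team-members')
    grammars: Task
```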

-You can use this grammar to pass into [llama.cpp](https://github.com/ggerganov/llama.cpp/) through a barebones LLM class that is provided.
+You can pass this grammar into [llama.cpp](https://github.com/ggerganov/llama.cpp/) through a [barebones LLM class](https://github.com/e-lab/SyntaxShaper/blob/main/grammarflow/tools/llm.py) that is provided.

 ```python
+from grammarflow import LocalLlama
+
 llm = LocalLlama()
-response = llm(manager.prompt, grammar=manager.get_grammar(CoT), stop_at=manager.stop_at)
+
+with Constrain('xml') as manager:
+    prompt = manager.format(...)
+    response = llm(prompt, grammar=manager.get_grammar(CoT), stop_at=prompt.stop_at)
 ```
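
If you would rather skip the helper class, the same grammar string can be handed to [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) directly. A minimal sketch, where the model path is a placeholder assumption and `grammar`/`prompt` come from the snippets above:

```python
from llama_cpp import Llama, LlamaGrammar

# Placeholder path: any local GGUF checkpoint works here.
llm = Llama(model_path="models/llama-2-7b.Q4_K_M.gguf")

# Compile the GNBF-generated string into a llama.cpp grammar object.
llama_grammar = LlamaGrammar.from_string(grammar)

# Token-level sampling is now constrained to grammar-legal continuations.
out = llm(prompt, grammar=llama_grammar, temperature=0.01, max_tokens=512)
print(out["choices"][0]["text"])
```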

-## Remarks!
+## Remarks
+
+Please keep in mind that this package is purely software driven and aims to make developers' lives simpler. It works across model families and parameter counts with great success in parsing.
+
+However, as the complexity of the prompt increases, the accuracy and 'performance' of the model's thinking capability will degrade. This is attributed to the context-window problem that a lot of researchers are working to improve. LLMs are autoregressive models that track previously seen tokens in order to iteratively predict the next one, and thus produce (a lot of) token probabilities in every generation. Decoding strategies like **nucleus sampling** (used in GPT) and **beam search** are expensive and need to be combined with other methods to prune bad thinking patterns at generation time.
+
+In language models, a larger prompt provides more context, leading to a wider range of plausible continuations and increasing the uncertainty in the next token's prediction. Mathematically, this manifests as **higher entropy in the distribution** over possible next tokens, reflecting a greater number of likely sequences or "divergent trees" during decoding. Incorporating grammar-based constraining forces outputs to adhere to predefined syntactic rules, increasing computational complexity and reducing flexibility in generation. This constraint **narrows the search space of possible outputs**, complicating the task of finding optimal sequences that satisfy both grammatical and contextual criteria.

-Please keep in mind that this package is purely software driven and aims to make developers lives a little simpler. It can work across model families and parameter counts with great success in parsing.
+This is why people have come up with great workarounds like prompting strategies, prompt pruning, and batch-processing prompts (as in [JSONFormer](https://github.com/1rgs/jsonformer/blob/main/jsonformer/) and [super-json-mode](https://github.com/varunshenoy/super-json-mode/blob/main/superjsonmode/)). Using those practices along with this library **boosts the efficiency** of whatever you're building!

-However, with an increase in complexity of the prompt, the accuracy and 'performance' of the model's thinking capability will fail. This is attributed to the greater possibility
+> Batch-processing techniques entail generating simple strings in batches and subsequently formatting them into JSON structures manually. This approach, while straightforward, encounters significant limitations when the generated content requires internal consistency or interdependence among fields.
+
+For instance, take the generation of responses for a Chain of Thought (CoT) prompt. Traditional batch processing might yield a series of isolated responses, each reflecting distinct, possibly unrelated thought processes. When these responses need to be structured into a JSON format that adheres to a list, manual entry is not sufficient: it cannot ensure that subsequent entries are contextually aligned with previous ones.
+
+This is where GrammarFlow steps in -- leveraging context-free grammars (CFGs) combined with carefully engineered prompts to guide the generation process.
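
To make the entropy claim above concrete, here is the quantity that paragraph is gesturing at; this is the standard definition, not notation from the package:

```latex
% Next-token uncertainty over vocabulary V, given prefix x:
%   H(p) = -\sum_{t \in V} p(t \mid x) \log p(t \mid x)
% Grammar-constrained sampling zeroes out the tokens the GBNF rules forbid at
% each step and renormalizes, shrinking the support of p(\cdot \mid x) and
% pruning the "divergent trees" at decode time.
H(p) = -\sum_{t \in V} p(t \mid x)\, \log p(t \mid x)
```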
 ## Citation