Commit 9270176

Add NuExtract notebook (#2311)
Ticket: CVS-149016
1 parent d5458a0 commit 9270176

5 files changed

+725
-14
lines changed

.ci/spellcheck/.pyspelling.wordlist.txt

Lines changed: 1 addition & 0 deletions
@@ -527,6 +527,7 @@ notus
 nsamples
 nsfw
 NSFW
+NuExtract
 num
 numpy
 NumPy
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# Structure Extraction with NuExtract and OpenVINO
+
+[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/nuextract-structure-extraction/nuextract-structure-extraction.ipynb)
+
+[NuExtract](https://huggingface.co/numind/NuExtract) is a text-to-JSON Large Language Model (LLM) that extracts arbitrarily complex information from text and turns it into structured data.
+
+## Notebook Contents
+
+The tutorial consists of the following steps:
+
+- Install prerequisites
+- Download and convert the model from a public source using the [OpenVINO integration with Hugging Face Optimum](https://huggingface.co/blog/openvino)
+- Compress model weights to INT8 and INT4 with [OpenVINO NNCF](https://github.com/openvinotoolkit/nncf)
+- Create a structure extraction inference pipeline with the [Generate API](https://github.com/openvinotoolkit/openvino.genai)
+- Launch an interactive Gradio demo with the structure extraction pipeline
+
+## Installation Instructions
+
+This is a self-contained example that relies solely on its own code.<br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For details, please refer to the [Installation Guide](../../README.md).
+
+<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/nuextract-structure-extraction/README.md" />
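The download, compression, and pipeline steps listed in the README could look roughly like the sketch below. The notebook's actual commands are not part of this excerpt, so this is an illustration under stated assumptions: `export_command` and `run_extraction` are hypothetical helper names, the export goes through the `optimum-cli export openvino` command with its `--weight-format` option, and generation uses `openvino_genai.LLMPipeline`. The set of accepted weight formats is deliberately narrowed for the example.

```python
from typing import List


def export_command(model_id: str, output_dir: str, weight_format: str = "int4") -> List[str]:
    """Build an optimum-cli invocation that exports and compresses a model.

    Hypothetical helper: only the argument list is built here; pass it to
    subprocess.run(...) to perform the export.
    """
    allowed = {"fp16", "int8", "int4"}  # subset chosen for this sketch
    if weight_format not in allowed:
        raise ValueError(f"unsupported weight format: {weight_format}")
    return [
        "optimum-cli", "export", "openvino",
        "--model", model_id,
        "--weight-format", weight_format,
        output_dir,
    ]


def run_extraction(model_dir: str, prompt: str, device: str = "CPU", max_new_tokens: int = 256) -> str:
    """Generate text from an exported model directory (hypothetical helper)."""
    # Imported lazily so the command builder above works without OpenVINO installed.
    import openvino_genai

    pipe = openvino_genai.LLMPipeline(model_dir, device)
    return pipe.generate(prompt, max_new_tokens=max_new_tokens)


print(" ".join(export_command("numind/NuExtract", "NuExtract-int4")))
```

Keeping the export as an argument list makes it easy to switch between INT8 and INT4 compression without editing a shell string.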
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+import gradio as gr
+from typing import Callable
+
+example_text = """We introduce Mistral 7B, a 7-billion-parameter language model engineered for
+superior performance and efficiency. Mistral 7B outperforms the best open 13B
+model (Llama 2) across all evaluated benchmarks, and the best released 34B
+model (Llama 1) in reasoning, mathematics, and code generation. Our model
+leverages grouped-query attention (GQA) for faster inference, coupled with sliding
+window attention (SWA) to effectively handle sequences of arbitrary length with a
+reduced inference cost. We also provide a model fine-tuned to follow instructions,
+Mistral 7B - Instruct, that surpasses Llama 2 13B - chat model both on human and
+automated benchmarks. Our models are released under the Apache 2.0 license.
+Code: https://github.com/mistralai/mistral-src
+Webpage: https://mistral.ai/news/announcing-mistral-7b/"""
+
+example_schema = """{
+    "Model": {
+        "Name": "",
+        "Number of parameters": "",
+        "Number of max token": "",
+        "Architecture": []
+    },
+    "Usage": {
+        "Use case": [],
+        "Licence": ""
+    }
+}"""
+
+
+def make_demo(fn: Callable):
+    with gr.Blocks() as demo:
+        gr.Markdown("# Structure Extraction with NuExtract and OpenVINO")
+
+        with gr.Row():
+            with gr.Column():
+                text_textbox = gr.Textbox(
+                    label="Text",
+                    placeholder="Text from which to extract information",
+                    lines=5,
+                )
+                schema_textbox = gr.Code(
+                    label="JSON Schema",
+                    language="json",
+                    lines=5,
+                )
+            with gr.Column():
+                model_output_textbox = gr.Code(
+                    label="Model Response",
+                    language="json",
+                    interactive=False,
+                    lines=10,
+                )
+        with gr.Row():
+            gr.ClearButton(components=[text_textbox, schema_textbox, model_output_textbox])
+            submit_button = gr.Button(value="Submit", variant="primary")
+        with gr.Row():
+            gr.Examples(examples=[[example_text, example_schema]], inputs=[text_textbox, schema_textbox])
+
+        submit_button.click(
+            fn,
+            [text_textbox, schema_textbox],
+            [model_output_textbox],
+        )
+    return demo
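`make_demo` expects a callable that takes the text and the JSON schema and returns a string for the output box. A minimal sketch of such a callback follows. Everything in it is an assumption rather than the notebook's code: `build_prompt`, `make_callback`, and `fake_generate` are hypothetical names, and the prompt markers (`<|input|>`, `### Template:`, `### Text:`, `<|output|>`, `<|end-output|>`) follow the format described on the NuExtract model card as best recalled.

```python
import json
from typing import Callable


def build_prompt(text: str, schema: str) -> str:
    """Build a NuExtract-style prompt (marker format is an assumption)."""
    # Re-serialize the schema so malformed JSON fails early with a clear error.
    template = json.dumps(json.loads(schema), indent=4)
    return f"<|input|>\n### Template:\n{template}\n### Text:\n{text}\n<|output|>\n"


def make_callback(generate: Callable[[str], str]) -> Callable[[str, str], str]:
    """Adapt a prompt -> completion function to the (text, schema) -> str shape."""

    def fn(text: str, schema: str) -> str:
        try:
            prompt = build_prompt(text, schema)
        except (json.JSONDecodeError, TypeError) as err:
            # Report the problem as JSON so it renders in the JSON output box.
            return json.dumps({"error": f"Invalid schema: {err}"}, indent=4)
        raw = generate(prompt)
        # Keep only the text before the end marker, if the model emitted one.
        return raw.split("<|end-output|>")[0].strip()

    return fn


def fake_generate(prompt: str) -> str:
    """Stand-in generator for illustration; a real one would call the LLM pipeline."""
    return '{"Model": {"Name": "Mistral 7B"}}<|end-output|>'


callback = make_callback(fake_generate)
print(callback("We introduce Mistral 7B.", '{"Model": {"Name": ""}}'))
```

With Gradio installed, the result could be passed as `make_demo(callback).launch()`. Validating the schema before generation means a typo in the schema box produces an immediate error message instead of a wasted model call.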
