
Commit ec705a3

refactor: Resolve accidental merge change

1 parent af69651 · commit ec705a3

40 files changed (+11898 −686 lines)

Diff for: README.md

+10
@@ -60,6 +60,7 @@ Download a llama model to try running the llama C++ integration. You can find a
Double-click on Nitro to run it. After downloading your model, make sure it's saved to a specific path. Then, make an API call to load your model into Nitro.

```zsh
curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \
  -H 'Content-Type: application/json' \
@@ -90,6 +91,15 @@ Table of parameters
| `system_prompt` | String | The prompt to use for system rules. |
| `pre_prompt` | String | The prompt to use for internal configuration. |

***OPTIONAL***: You can run Nitro on a different port (e.g. 5000 instead of the default 3928) by launching it manually from the terminal:

```zsh
./nitro 1 127.0.0.1 5000   # usage: ./nitro [thread_num] [host] [port]
```
- `thread_num`: the number of threads for the Nitro web server
- `host`: the host address, normally `127.0.0.1` (localhost) or `0.0.0.0` (all network interfaces)
- `port`: the port Nitro listens on
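For example (the values below are illustrative, not defaults, and `/path/to/your_model.gguf` is a placeholder for your own model file), you could start Nitro with 4 threads on all interfaces at port 5000 and then point the `loadmodel` call at that port:

```zsh
# Start Nitro with 4 threads, listening on all interfaces at port 5000
./nitro 4 0.0.0.0 5000

# Subsequent API calls must then target port 5000 instead of the default 3928
curl -X POST 'http://localhost:5000/inferences/llamacpp/loadmodel' \
  -H 'Content-Type: application/json' \
  -d '{"llama_model_path": "/path/to/your_model.gguf"}'
```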
**Step 4: Perform Inference on Nitro for the First Time**

```zsh

Diff for: docs/docs/api.md

Whitespace-only changes.

Diff for: docs/docs/api/overview.md

-7
This file was deleted.

Diff for: docs/docs/community/changelog.md

-4
This file was deleted.

Diff for: docs/docs/community/coc.md

-54
This file was deleted.

Diff for: docs/docs/community/contribuiting.md

-41
This file was deleted.

Diff for: docs/docs/community/support.md

-27
This file was deleted.

Diff for: docs/docs/examples/llm.md

+54
@@ -0,0 +1,54 @@
---
title: Simple chatbot with Nitro
---

This guide provides instructions to create a chatbot powered by Nitro using the GGUF model.

## Step 1: Download the Model

First, you'll need to download the chatbot model.

1. **Navigate to the Models Folder**
   - Open your project directory.
   - Locate and open the `models` folder within the directory.

2. **Select a GGUF Model**
   - Visit the Hugging Face repository at [TheBloke's Models](https://huggingface.co/TheBloke).
   - Browse through the available models.
   - Choose the model that best fits your needs.

3. **Download the Model**
   - Once you've selected a model, download it with a command like the one below (shown here for the Zephyr 7B model); swap in the URL of your chosen model.

```bash title="Downloading Zephyr 7B Model"
wget https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q5_K_M.gguf?download=true
```
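Note: because the URL ends in `?download=true`, wget may append the query string to the saved filename. Passing wget's standard `-O` flag pins the output name (the filename below simply matches the Zephyr model above):

```bash
wget 'https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q5_K_M.gguf?download=true' \
  -O zephyr-7b-beta.Q5_K_M.gguf
```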
## Step 2: Load the Model

Now, you'll set up the model in your application.

1. **Open the `app.py` File**

   - In your project directory, find and open the `app.py` file.

2. **Configure the Model Path**

   - Modify the model path in `app.py` to point to your downloaded model.
   - Update the configuration parameters as necessary.

```python title="Example Configuration" {2}
dat = {
    "llama_model_path": "nitro/interface/models/zephyr-7b-beta.Q5_K_M.gguf",
    "ctx_len": 2048,
    "ngl": 100,
    "embedding": True,
    "n_parallel": 4,
    "pre_prompt": "A chat between a curious user and an artificial intelligence",
    "user_prompt": "USER: ",
    "ai_prompt": "ASSISTANT: ",
}
```
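On the wire, loading this configuration amounts to a `loadmodel` call like the sketch below (assuming Nitro's default host and port; the JSON mirrors the `dat` dictionary above, with Python's `True` becoming JSON's `true`):

```bash
curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \
  -H 'Content-Type: application/json' \
  -d '{
        "llama_model_path": "nitro/interface/models/zephyr-7b-beta.Q5_K_M.gguf",
        "ctx_len": 2048,
        "ngl": 100,
        "embedding": true,
        "n_parallel": 4,
        "pre_prompt": "A chat between a curious user and an artificial intelligence",
        "user_prompt": "USER: ",
        "ai_prompt": "ASSISTANT: "
      }'
```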

Congratulations! Your Nitro chatbot is now set up. Feel free to experiment with different configuration parameters to tailor the chatbot to your needs.

For more information on parameter settings and their effects, please refer to [Run Nitro](using-nitro) for a comprehensive parameters table.

Diff for: docs/docs/features/chat.md

+172
@@ -0,0 +1,172 @@
---
title: Chat Completion
---

The Chat Completion feature in Nitro provides a flexible way to interact with any local Large Language Model (LLM).

## Single Request Example

To send a single query to your chosen LLM, use a request like the one below:

<div style={{ width: '50%', float: 'left', clear: 'left' }}>

```bash title="Nitro"
curl http://localhost:3928/inferences/llamacpp/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "",
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ]
  }'
```
</div>

<div style={{ width: '50%', float: 'right', clear: 'right' }}>

```bash title="OpenAI"
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ]
  }'
```
</div>

This command sends a simple "Hello" message to your local LLM and returns its reply.
### Dialog Request Example

For ongoing conversations or multiple queries, the dialog request feature is ideal. Here’s how to structure a multi-turn conversation:

<div style={{ width: '50%', float: 'left', clear: 'left' }}>

```bash title="Nitro"
curl http://localhost:3928/inferences/llamacpp/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      },
      {
        "role": "assistant",
        "content": "The Los Angeles Dodgers won the World Series in 2020."
      },
      {
        "role": "user",
        "content": "Where was it played?"
      }
    ]
  }'
```
</div>

<div style={{ width: '50%', float: 'right', clear: 'right' }}>

```bash title="OpenAI"
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      },
      {
        "role": "assistant",
        "content": "The Los Angeles Dodgers won the World Series in 2020."
      },
      {
        "role": "user",
        "content": "Where was it played?"
      }
    ]
  }'
```
</div>
### Chat Completion Response

Below are examples of responses from both the Nitro server and OpenAI:

<div style={{ width: '50%', float: 'left', clear: 'left' }}>

```js title="Nitro"
{
  "choices": [
    {
      "finish_reason": null,
      "index": 0,
      "message": {
        "content": "Hello, how may I assist you this evening?",
        "role": "assistant"
      }
    }
  ],
  "created": 1700215278,
  "id": "sofpJrnBGUnchO8QhA0s",
  "model": "_",
  "object": "chat.completion",
  "system_fingerprint": "_",
  "usage": {
    "completion_tokens": 13,
    "prompt_tokens": 90,
    "total_tokens": 103
  }
}
```
</div>

<div style={{ width: '50%', float: 'right', clear: 'right' }}>

```js title="OpenAI"
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello there, how may I assist you today?"
      }
    }
  ],
  "created": 1677652288,
  "id": "chatcmpl-123",
  "model": "gpt-3.5-turbo-0613",
  "object": "chat.completion",
  "system_fingerprint": "fp_44709d6fcb",
  "usage": {
    "completion_tokens": 12,
    "prompt_tokens": 9,
    "total_tokens": 21
  }
}
```
</div>
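
As a sketch of what the transition involves (not an exhaustive migration guide): an OpenAI-style request typically needs only its URL swapped for the local Nitro endpoint, with the `Authorization` header dropped, since a local server requires no API key:

```bash
# Before: hosted OpenAI endpoint (API key required)
#   curl https://api.openai.com/v1/chat/completions \
#     -H "Authorization: Bearer $OPENAI_API_KEY" ...

# After: local Nitro endpoint -- same JSON body, no API key
curl http://localhost:3928/inferences/llamacpp/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
```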

The chat completion feature in Nitro showcases its compatibility with OpenAI, making the transition between OpenAI and local AI models more straightforward. For further details and advanced usage, please refer to the [API reference](https://nitro.jan.ai/api).
