Skip to content
This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Commit 831d9ba

Browse files
authored
Merge pull request #154 from janhq/new-api-ref-update
fix navigating + change api path to api-ref
2 parents 29dff69 + 4de8456 commit 831d9ba

File tree

12 files changed

+40
-18
lines changed

12 files changed

+40
-18
lines changed
File renamed without changes.

docs/docs/examples/chatbox.md

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
---
2+
title: Nitro with Chatbox
3+
---
4+
5+
:::info COMING SOON
6+
:::
7+
8+
<!--
9+
## What is Chatbox?
10+
11+
## How to use Nitro as backend -->

docs/docs/features/chat.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@ To send a single query to your chosen LLM, follow these steps:
1111
<div style={{ width: '50%', float: 'left', clear: 'left' }}>
1212

1313
```bash title="Nitro"
14-
curl http://localhost:3928/inferences/llamacpp/chat_completion \
14+
curl http://localhost:3928/v1/chat/completions \
1515
-H "Content-Type: application/json" \
1616
-d '{
1717
"model": "",
@@ -53,7 +53,7 @@ For ongoing conversations or multiple queries, the dialog request feature is ide
5353
<div style={{ width: '50%', float: 'left', clear: 'left' }}>
5454

5555
```bash title="Nitro"
56-
curl http://localhost:3928/inferences/llamacpp/chat_completion \
56+
curl http://localhost:3928/v1/chat/completions \
5757
-H "Content-Type: application/json" \
5858
-d '{
5959
"messages": [

docs/docs/features/embed.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ Here’s an example showing how to get the embedding result from the model:
1717
<div style={{ width: '50%', float: 'left', clear: 'left' }}>
1818

1919
```bash title="Nitro" {1}
20-
curl http://localhost:3928/inferences/llamacpp/embedding \
20+
curl http://localhost:3928/v1/embeddings \
2121
-H 'Content-Type: application/json' \
2222
-d '{
2323
"input": "Hello",

docs/docs/features/prompt.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@ curl http://localhost:3928/inferences/llamacpp/loadmodel \
4141
### Testing the Assistant
4242

4343
```bash title="Pirate Assistant"
44-
curl http://localhost:3928/inferences/llamacpp/chat_completion \
44+
curl http://localhost:3928/v1/chat/completions \
4545
-H "Content-Type: application/json" \
4646
-d '{
4747
"messages": [

docs/docs/new/about.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ For instance, compare the Nitro inference call:
2424
<div style={{ width: '50%', float: 'left', clear: 'left' }}>
2525

2626
```bash title="Nitro chat completion"
27-
curl http://localhost:3928/inferences/llamacpp/chat_completion \
27+
curl http://localhost:3928/v1/chat/completions \
2828
-H "Content-Type: application/json" \
2929
-d '{
3030
"model": "gpt-3.5-turbo",

docs/docs/new/install.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -141,7 +141,7 @@ Simple testcase with nitro, after starting the server, you can run the following
141141
"embedding": false
142142
}'
143143
# Send a prompt request to nitro
144-
curl -s --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \
144+
curl -s --location 'http://localhost:3928/v1/chat/completions' \
145145
--header 'Content-Type: application/json' \
146146
--data '{
147147
"messages": [
@@ -172,7 +172,7 @@ Simple testcase with nitro, after starting the server, you can run the following
172172
173173
# Send a prompt request to nitro
174174
set "curl_data2={\"messages\":[{\"content\":\"Hello there\",\"role\":\"assistant\"},{\"content\":\"Write a long and sad story for me\",\"role\":\"user\"}],\"stream\":true,\"model\":\"gpt-3.5-turbo\",\"max_tokens\":100,\"stop\":[\"hello\"],\"frequency_penalty\":0,\"presence_penalty\":0,\"temperature\":0.7}"
175-
curl.exe -s -w "%%{http_code}" --location "http://localhost:3928/inferences/llamacpp/chat_completion" ^
175+
curl.exe -s -w "%%{http_code}" --location "http://localhost:3928/v1/chat/completions" ^
176176
--header "Content-Type: application/json" ^
177177
--data "%curl_data2%"
178178
```

docs/docs/new/quickstart.md

Lines changed: 17 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ Next, we need to download a model. For this example, we'll use the [Llama2 7B ch
2626
- Create a `/model` and navigate into it:
2727
```bash
2828
mkdir model && cd model
29-
wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf?download=true
29+
wget -O llama-2-7b-model.gguf https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf?download=true
3030
```
3131

3232
## Step 3: Run Nitro server
@@ -43,14 +43,28 @@ To check if the Nitro server is running:
4343
curl http://localhost:3928/healthz
4444
```
4545

46-
## Step 4: Making an Inference
46+
## Step 4: Load model
47+
48+
To load the model to Nitro server, you need to run:
49+
50+
```bash title="Load model"
51+
curl http://localhost:3928/inferences/llamacpp/loadmodel \
52+
-H 'Content-Type: application/json' \
53+
-d '{
54+
"llama_model_path": "/model/llama-2-7b-model.gguf",
55+
"ctx_len": 512,
56+
"ngl": 100,
57+
}'
58+
```
59+
60+
## Step 5: Making an Inference
4761

4862
Finally, let's make an actual inference call using Nitro.
4963

5064
- In your terminal, execute:
5165

5266
```bash title="Nitro Inference"
53-
curl http://localhost:3928/inferences/llamacpp/chat_completion \
67+
curl http://localhost:3928/v1/chat/completions \
5468
-H "Content-Type: application/json" \
5569
-d '{
5670
"messages": [

docs/docusaurus.config.js

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -103,7 +103,7 @@ const config = {
103103
{
104104
spec: "openapi/NitroAPI.yaml", // can be local file, url, or parsed json object
105105
// spec: "openapi/OpenAIAPI.yaml",
106-
route: "/api/",
106+
route: "/api-reference/",
107107
},
108108
],
109109
theme: {

docs/openapi/NitroAPI.yaml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -134,7 +134,7 @@ paths:
134134
schema:
135135
$ref: "#/components/schemas/StatusResponse"
136136

137-
/inferences/llamacpp/embedding:
137+
/v1/embeddings:
138138
post:
139139
operationId: createEmbedding
140140
tags:
@@ -162,7 +162,7 @@ paths:
162162
schema:
163163
$ref: "#/components/schemas/CreateEmbeddingResponse"
164164

165-
/inferences/llamacpp/chat_completion:
165+
/v1/chat/completions:
166166
post:
167167
operationId: createChatCompletion
168168
tags:

0 commit comments

Comments
 (0)