This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Commit f3024c3

removed tensorrt-llm and made general improvements to all docs

1 parent ab88347

File tree

17 files changed: +314 -345 lines

docs/docs/architecture/cortexrc.mdx

Lines changed: 0 additions & 1 deletion

@@ -34,7 +34,6 @@ You can configure the following parameters in the `.cortexrc` file:
 | `apiServerPort` | Port number for the Cortex.cpp API server. | `39281` |
 | `logFolderPath` | Path to the folder where logs are located. | User's home folder. |
 | `logLlamaCppPath` | The llama-cpp engine log file path. | `./logs/cortex.log` |
-| `logTensorrtLLMPath` | The tensorrt-llm engine log file path. | `./logs/cortex.log` |
 | `logOnnxPath` | The onnxruntime engine log file path. | `./logs/cortex.log` |
 | `maxLogLines` | The maximum number of log lines written to the file. | `100000` |
 | `checkedForUpdateAt` | The last time updates were checked. | `0` |
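To make the table concrete, here is a minimal sketch of a `.cortexrc` after this change, assuming the file keeps the flat YAML key-value layout the table implies; the values shown are the documented defaults.

```sh
# Sketch: inspect the config; keys and values mirror the table above.
# The home-folder location of .cortexrc is an assumption.
cat ~/.cortexrc
# apiServerPort: 39281
# logFolderPath: /home/<user>
# logLlamaCppPath: ./logs/cortex.log
# logOnnxPath: ./logs/cortex.log    # logTensorrtLLMPath no longer applies
# maxLogLines: 100000
# checkedForUpdateAt: 0
```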

docs/docs/architecture/data-folder.mdx

Lines changed: 5 additions & 12 deletions

@@ -51,18 +51,11 @@ it typically follows the structure below:
 ├── cortex.db
 ├── engines/
 │   ├── cortex.llamacpp/
-│   │   ├── deps/
-│   │   │   ├── libcublasLt.so.12
-│   │   │   └── libcudart.so.12
-│   │   └── linux-amd64-avx2-cuda-12-0/
-│   │   └── ...
-│   └── cortex.tensorrt-llm/
-│   ├── deps/
-│   │   └── ...
-│   └── linux-cuda-12-4/
-│   └── v0.0.9/
-│   ├── ...
-│   └── libtensorrt_llm.so
+│   ├── deps/
+│   │   ├── libcublasLt.so.12
+│   │   └── libcudart.so.12
+│   └── linux-amd64-avx2-cuda-12-0/
+│   └── ...
 ├── files
 ├── logs/
 │   ├── cortex-cli.log
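A quick, hedged way to confirm the flattened engine layout shown above; the `~/cortexcpp` data-folder path is an assumption, so substitute your configured data folder.

```sh
# Sketch: after the change, engines/ holds only the llama.cpp engine
# pieces (layout mirrors the tree above; data-folder path is assumed)
ls ~/cortexcpp/engines
# cortex.llamacpp/  deps/  linux-amd64-avx2-cuda-12-0/
```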

docs/docs/basic-usage/index.mdx

Lines changed: 2 additions & 3 deletions

@@ -36,7 +36,7 @@ curl --request DELETE \
 
 ## Engines
 Cortex currently supports a general Python Engine for highly customised deployments and
-3 specialized ones for different multi-modal foundation models: llama.cpp, ONNXRuntime and TensorRT-LLM.
+2 specialized ones for different multi-modal foundation models: llama.cpp and ONNXRuntime.
 
 By default, Cortex installs `llama.cpp` as its main engine, as it can be used on most laptops,
 desktop environments and operating systems.
@@ -58,8 +58,7 @@ curl --request GET \
 "name": "linux-amd64-avx2-cuda-12-0",
 "version": "v0.1.49"
 }
-],
-"tensorrt-llm": []
+]
 }
 ```
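To see the trimmed response shape end to end, here is a hedged example of the List Engines call edited above; the `/v1/engines` path is the endpoint referenced by these docs and `39281` is the documented default port.

```sh
# Sketch: list engines; the response no longer carries a "tensorrt-llm" key
curl --request GET \
  --url http://127.0.0.1:39281/v1/engines
```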

docs/docs/capabilities/models/index.mdx

Lines changed: 2 additions & 9 deletions

@@ -3,15 +3,9 @@ title: Model Overview
 description: The Model section overview
 ---
 
-:::warning
-🚧 Cortex.cpp is currently under active development. Our documentation outlines the intended behavior
-of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
 Models in Cortex are used for inference purposes (e.g., chat completion, embedding, etc.) after they
 have been downloaded locally. Currently, we support different engines including `llama.cpp` with the
-GGUF model format, TensorRT-LLM for optimized inference on NVIDIA hardware, and ONNX for edge or
-different model deployments.
+GGUF model format, and ONNX for edge or different model deployments.
 
 In the future, you will also be able to run remote models (like OpenAI GPT-4 and Claude 3.5 Sonnet) via
 Cortex. Support for OpenAI and Anthropic engines is under development and will be available soon.
@@ -27,7 +21,6 @@ can facilitate the following:
 Cortex supports multiple model formats, and each format requires a specific engine to run:
 - GGUF - run with `llama-cpp` engine
 - ONNX - run with `onnxruntime` engine
-- TensorRT-LLM - run with `tensorrt-llm` engine
 
 Within the Python Engine (currently under development), you can run models in other formats
 
@@ -45,6 +38,6 @@ These models are ready to be downloaded and you can check them out at the link above
 
 Built-in models are made available across the following variants:
 
-- **By format**: `gguf`, `onnx`, and `tensorrt-llm`
+- **By format**: `gguf` and `onnx`
 - **By Size**: `7b`, `13b`, and more.
 - **By quantization method**: `q4`, `q8`, and more.
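As a hedged illustration of how the remaining variants surface in practice; the model ids below are illustrative, not confirmed hub entries, and assume a `cortex pull` download command.

```sh
# Sketch: pull a built-in model by format variant (ids are illustrative)
cortex pull llama3.1:gguf
cortex pull llama3.1:onnx
```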

docs/docs/capabilities/models/model-yaml.mdx

Lines changed: 0 additions & 5 deletions

@@ -6,11 +6,6 @@ description: The model.yaml
 import Tabs from "@theme/Tabs";
 import TabItem from "@theme/TabItem";
 
-:::warning
-🚧 Cortex is currently under active development. Our documentation outlines the intended behavior of
-Cortex, which may not yet be fully implemented in the codebase.
-:::
-
 Cortex uses a `model.yaml` file to specify the configuration desired for each model. Models can be downloaded
 from the Cortex Model Hub or Hugging Face repositories. Once downloaded, the model data is parsed and stored
 in the `models` directory.
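For orientation, a heavily hedged sketch of where a downloaded model's `model.yaml` lands; the data-folder path and the fields shown are assumptions, not the authoritative schema.

```sh
# Sketch: a parsed model.yaml in the models directory (fields illustrative)
cat ~/cortexcpp/models/<model>/model.yaml
# model: <model-id>
# engine: llama-cpp    # or onnxruntime, matching the model format
# ...
```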

docs/docs/cli/config.mdx

Lines changed: 35 additions & 10 deletions

@@ -9,6 +9,11 @@ import TabItem from "@theme/TabItem";
 
 # `cortex config`
 
+:::warning
+At the moment, the `cortex config` command only supports a few configurations. More
+configurations will be added soon.
+:::
+
 This command allows you to update server configurations such as CORS and Allowed Headers.
 
 ## Usage
@@ -65,14 +70,34 @@ This command returns all server configurations.
 For example, it returns the following:
 
 ```
-+-------------------------------------------------------------------------------------+
-| Config name | Value |
-+-------------------------------------------------------------------------------------+
-| allowed_origins | http://localhost:39281 |
-+-------------------------------------------------------------------------------------+
-| allowed_origins | http://127.0.0.1:39281/ |
-+-------------------------------------------------------------------------------------+
-| cors | true |
-+-------------------------------------------------------------------------------------+
++-----------------------+-------------------------------------+
+| Config name | Value |
++-----------------------+-------------------------------------+
+| allowed_origins | http://localhost:39281 |
++-----------------------+-------------------------------------+
+| allowed_origins | http://127.0.0.1:39281 |
++-----------------------+-------------------------------------+
+| allowed_origins | http://0.0.0.0:39281 |
++-----------------------+-------------------------------------+
+| cors | true |
++-----------------------+-------------------------------------+
+| huggingface_token | |
++-----------------------+-------------------------------------+
+| no_proxy | example.com,::1,localhost,127.0.0.1 |
++-----------------------+-------------------------------------+
+| proxy_password | |
++-----------------------+-------------------------------------+
+| proxy_url | |
++-----------------------+-------------------------------------+
+| proxy_username | |
++-----------------------+-------------------------------------+
+| verify_host_ssl | true |
++-----------------------+-------------------------------------+
+| verify_peer_ssl | true |
++-----------------------+-------------------------------------+
+| verify_proxy_host_ssl | true |
++-----------------------+-------------------------------------+
+| verify_proxy_ssl | true |
++-----------------------+-------------------------------------+
 
-```
+```
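A hedged companion to the listing above: the subcommand name for dumping all configurations is assumed from this page's "returns all server configurations" description, so verify it with `cortex config -h`.

```sh
# Sketch: print the full configuration table shown above
# (subcommand name is an assumption; check `cortex config -h`)
cortex config status
```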

docs/docs/cli/engines/index.mdx

Lines changed: 16 additions & 11 deletions

@@ -9,8 +9,8 @@ import TabItem from "@theme/TabItem";
 
 This command allows you to manage various engines available within Cortex.
 
-
 **Usage**:
+
 <Tabs>
 <TabItem value="MacOs/Linux" label="MacOs/Linux">
 ```sh
@@ -24,26 +24,25 @@ This command allows you to manage various engines available within Cortex.
 </TabItem>
 </Tabs>
 
-
 **Options**:
 
 | Option | Description | Required | Default value | Example |
 |-------------------|-------------------------------------------------------|----------|---------------|-----------------|
 | `-h`, `--help` | Display help information for the command. | No | - | `-h` |
 {/* | `-vk`, `--vulkan` | Install Vulkan engine. | No | `false` | `-vk` | */}
 
----
-# Subcommands:
+
 ## `cortex engines list`
+
 :::info
 This CLI command calls the following API endpoint:
 - [List Engines](/api-reference#tag/engines/get/v1/engines)
 :::
-This command lists all the Cortex's engines.
-
 
+This command lists all of Cortex's engines.
 
 **Usage**:
+
 <Tabs>
 <TabItem value="MacOs/Linux" label="MacOs/Linux">
 ```sh
@@ -58,6 +57,7 @@ This command lists all the Cortex's engines.
 </Tabs>
 
 For example, it returns the following:
+
 ```
 +---+--------------+-------------------+---------+----------------------------+---------------+
 | # | Name | Supported Formats | Version | Variant | Status |
@@ -66,18 +66,19 @@ For example, it returns the following:
 +---+--------------+-------------------+---------+----------------------------+---------------+
 | 2 | llama-cpp | GGUF | 0.1.34 | linux-amd64-avx2-cuda-12-0 | Ready |
 +---+--------------+-------------------+---------+----------------------------+---------------+
-| 3 | tensorrt-llm | TensorRT Engines | | | Not Installed |
-+---+--------------+-------------------+---------+----------------------------+---------------+
 ```
 
 ## `cortex engines get`
+
 :::info
 This CLI command calls the following API endpoint:
 - [Get Engine](/api-reference#tag/engines/get/v1/engines/{name})
 :::
+
 This command returns an engine detail defined by an engine `engine_name`.
 
 **Usage**:
+
 <Tabs>
 <TabItem value="MacOs/Linux" label="MacOs/Linux">
 ```sh
@@ -92,18 +93,19 @@ This command returns an engine detail defined by an engine `engine_name`.
 </Tabs>
 
 For example, it returns the following:
+
 ```
 +-----------+-------------------+---------+-----------+--------+
 | Name | Supported Formats | Version | Variant | Status |
 +-----------+-------------------+---------+-----------+--------+
 | llama-cpp | GGUF | 0.1.37 | mac-arm64 | Ready |
 +-----------+-------------------+---------+-----------+--------+
 ```
+
 :::info
 To get an engine name, run the [`engines list`](/docs/cli/engines/list) command.
 :::
 
-
 **Options**:
 
 | Option | Description | Required | Default value | Example |
@@ -114,16 +116,18 @@ To get an engine name, run the [`engines list`](/docs/cli/engines/list) command.
 
 
 ## `cortex engines install`
+
 :::info
 This CLI command calls the following API endpoint:
 - [Init Engine](/api-reference#tag/engines/post/v1/engines/{name}/init)
 :::
+
 This command downloads the required dependencies and installs the engine within Cortex. Currently, Cortex supports two engines:
 - `llama-cpp`
 - `onnxruntime`
-- `tensorrt-llm`
 
 **Usage**:
+
 <Tabs>
 <TabItem value="MacOs/Linux" label="MacOs/Linux">
 ```sh
@@ -133,7 +137,6 @@ This command downloads the required dependencies and installs the engine within
 <TabItem value="Windows" label="Windows">
 ```sh
 cortex.exe engines install [options] <engine_name>
-
 ```
 </TabItem>
 </Tabs>
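With `tensorrt-llm` gone, installing the two remaining engines follows directly from the usage line above.

```sh
# Install the two engines this page still documents
cortex engines install llama-cpp
cortex engines install onnxruntime
```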
@@ -150,6 +153,7 @@ This command downloads the required dependencies and installs the engine within
 This command uninstalls the engine within Cortex.
 
 **Usage**:
+
 <Tabs>
 <TabItem value="MacOs/Linux" label="MacOs/Linux">
 ```sh
@@ -164,6 +168,7 @@ This command uninstalls the engine within Cortex.
 </Tabs>
 
 For Example:
+
 ```bash
 ## Llama.cpp engine
 cortex engines uninstall llama-cpp
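A quick follow-up to the uninstall example: re-running the documented `engines list` subcommand should show a table without a tensorrt-llm row.

```sh
# Verify the remaining engine set after this change
cortex engines list
```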
