README.md
+2-1
@@ -43,6 +43,7 @@ ModelCache
## News
+- 🔥🔥[2024.10.22] Added tasks for 1024 developer day.
- 🔥🔥[2024.04.09] Added Redis Search to store and retrieve embeddings in multi-tenant. This can reduce the interaction time between Cache and vector databases to 10ms.
- 🔥🔥[2023.12.10] Integrated LLM embedding frameworks such as 'llmEmb', 'ONNX', 'PaddleNLP', 'FastText', and the image embedding framework 'timm' to bolster embedding functionality.
- 🔥🔥[2023.11.20] Integrated local storage, such as sqlite and faiss. This enables you to initiate quick and convenient tests.
@@ -60,7 +61,7 @@ Codefuse-ModelCache is a semantic cache for large language models (LLMs). By cac
You can find the start script in `flask4modelcache.py` and `flask4modelcache_demo.py`.
--`flask4modelcache_demo.py`: A quick test service that embeds SQLite and FAISS. No database configuration required.
+-`flask4modelcache_demo.py`: A quick test service that embeds SQLite and FAISS. No database configuration required.
-`flask4modelcache.py`: The standard service that requires MySQL and Milvus configuration.
- Download the offline model bin file from [Hugging Face](https://huggingface.co/shibing624/text2vec-base-chinese/tree/main) and place the downloaded bin file in the `model/text2vec-base-chinese` folder.
4. Download the offline model bin file from [Hugging Face](https://huggingface.co/shibing624/text2vec-base-chinese/tree/main) and place the downloaded bin file in the `model/text2vec-base-chinese` folder.
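Compared with calling the model directly, Cache Service call latency shows a stable distribution, and performance is not affected as the model's parameter count grows. Traditionally, model call latency tends to rise as the parameter count increases, because larger models need more compute. The cache service avoids repeated computation by storing frequently accessed data, which to some extent decouples latency from model complexity.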
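
The decoupling described above can be illustrated with a small sketch. This is not the ModelCache implementation: `ToySemanticCache`, `embed`, `cosine`, and the `0.8` threshold are hypothetical names and values chosen for illustration, and the bag-of-words embedding is a stand-in for a real embedding model and vector store. It only shows that a similar prompt is answered from the cache, so the model (however large) is not called again.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToySemanticCache:
    """Hypothetical in-memory cache placed in front of a model call."""

    def __init__(self, model_fn, threshold=0.8):
        self.model_fn = model_fn      # the expensive call being cached
        self.threshold = threshold    # illustrative similarity cutoff
        self.entries = []             # list of (embedding, answer) pairs
        self.model_calls = 0          # counts cache misses

    def query(self, prompt):
        vec = embed(prompt)
        # Linear scan; a real deployment would use a vector index instead.
        for cached_vec, answer in self.entries:
            if cosine(vec, cached_vec) >= self.threshold:
                return answer         # cache hit: the model is not invoked
        self.model_calls += 1         # cache miss: pay the model's latency once
        answer = self.model_fn(prompt)
        self.entries.append((vec, answer))
        return answer
```

On a hit, the cost is one embedding plus a similarity lookup, which does not depend on the size of the model behind `model_fn`; only misses pay the model's latency.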