Commit 8a5b570 (parent b09005c)

Fix CoT logic to handle file / image uploads

5 files changed: 78 additions, 67 deletions

chatbot/Dockerfile (15 additions, 15 deletions)

```diff
@@ -9,21 +9,21 @@
 FROM python:3.10-slim
 
 # Setting build related env vars
-ENV PORT 5000
-ENV OPENAI_API_KEY "DEFAULT_API_KEY"
-ENV OPENAI_API_BASE "http://localhost:8000/v1"
-ENV AGENT_NAME "Jarvis"
-ENV MY_MODEL "models/7B/gguf-model.bin"
-ENV DEBUG "false"
-ENV WEAVIATE_HOST "localhsot"
-ENV WEAVIATE_PORT "8080"
-ENV WEAVIATE_GRPC_HOST "localhost"
-ENV WEAVIATE_GRPC_PORT "50051"
-ENV WEAVIATE_LIBRARY "tinyllm"
-ENV RESULTS 1
-ENV ONESHOT "false"
-ENV RAG_ONLY "false"
-ENV USE_SYSTEM "false"
+ENV PORT=5000
+ENV OPENAI_API_KEY="DEFAULT_API_KEY"
+ENV OPENAI_API_BASE="http://localhost:8000/v1"
+ENV AGENT_NAME="Jarvis"
+ENV MY_MODEL="models/7B/gguf-model.bin"
+ENV DEBUG="false"
+ENV WEAVIATE_HOST="localhost"
+ENV WEAVIATE_PORT="8080"
+ENV WEAVIATE_GRPC_HOST="localhost"
+ENV WEAVIATE_GRPC_PORT="50051"
+ENV WEAVIATE_LIBRARY="tinyllm"
+ENV RESULTS=1
+ENV ONESHOT="false"
+ENV RAG_ONLY="false"
+ENV USE_SYSTEM="false"
 
 # Set the working directory
 WORKDIR /app
```
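Note: `ENV KEY=value` is the form current Docker documentation recommends; the space-separated `ENV KEY value` form is legacy and behaves the same for single assignments. For reference, a minimal sketch of how such defaults are typically read on the Python side (an assumed pattern for illustration, not code from this repo):

```python
# Hypothetical consumption of the Dockerfile defaults above (sketch only).
import os

PORT = int(os.environ.get("PORT", "5000"))
WEAVIATE_HOST = os.environ.get("WEAVIATE_HOST", "localhost")
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
RESULTS = int(os.environ.get("RESULTS", "1"))
```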

chatbot/Dockerfile-docman (8 additions, 8 deletions)

```diff
@@ -8,14 +8,14 @@
 FROM python:3.10-slim
 
 # Setting build related env vars
-ENV MAX_CHUNK_SIZE 1024
-ENV UPLOAD_FOLDER uploads
-ENV WEAVIATE_HOST localhost
-ENV WEAVIATE_GRPC_HOST localhost
-ENV WEAVIATE_PORT 8080
-ENV WEAVIATE_GRPC_PORT 50051
-ENV PORT 5001
-ENV COLLECTIONS_ADMIN true
+ENV MAX_CHUNK_SIZE=1024
+ENV UPLOAD_FOLDER=uploads
+ENV WEAVIATE_HOST=localhost
+ENV WEAVIATE_GRPC_HOST=localhost
+ENV WEAVIATE_PORT=8080
+ENV WEAVIATE_GRPC_PORT=50051
+ENV PORT=5001
+ENV COLLECTIONS_ADMIN=true
 
 # Set the working directory
 WORKDIR /app
```

chatbot/README.md (6 additions, 9 deletions)

````diff
@@ -35,7 +35,7 @@ docker run \
 
 ```bash
 # Install required packages
-pip install fastapi uvicorn python-socketio jinja2 openai bs4 pypdf requests lxml aiohttp weaviate-client
+pip install -r requirements.txt
 
 # Run the chatbot web server - change the base URL to be where you host your llmserver
 OPENAI_API_BASE="http://localhost:8000/v1" python3 server.py
@@ -47,6 +47,7 @@ Some RAG (Retrieval Augmented Generation) features including:
 
 * Summarizing external websites and PDFs (paste a URL in chat window)
 * If a Weaviate host is specified, the chatbot can use the vector database information to respond. See [rag](../rag/weaviate/) for details on how to set up Weaviate.
+* Perform chain of thought (CoT) reasoning with the `/think on` command.
 * Command - There are information commands using `/`
 
 ```
@@ -56,7 +57,10 @@ Some RAG (Retrieval Augmented Generation) features including:
 /news # List top 10 headlines from current news
 /stock [company] # Display stock symbol and current price
 /weather [location] # Provide current weather conditions
-/rag [library] [opt:number] [prompt] # Answer prompt based on response from Qdrant collection
+/rag on [library] [opt:number] # Route all prompts through RAG using specified library
+/rag off # Disable
+/think on # Perform Chain of Thought thinking on relevant prompts
+/think off # Disable
 ```
 
 See the [rag](../rag/) for more details about RAG.
@@ -83,13 +87,6 @@ The `/news` command will fetch the latest news and have the LLM summarize the to
 
 <img width="930" alt="image" src="https://github.com/jasonacox/TinyLLM/assets/836718/2732fe07-99ee-4795-a8ac-42d9a9712f6b">
 
-### Alternative System Prompts
-
-* A Hacker’s Guide to Language Models - Jeremy Howard [[link](https://www.youtube.com/watch?v=jkrNMKz9pWU&ab_channel=JeremyHoward)]
-
-    You are an autoregressive language model that has been fine-tuned with instruction-tuning and RLHF. You carefully provide accurate, factual, thoughtful, nuanced answers, and are brilliant at reasoning. If you think there might not be a correct answer, you say so. Since you are autoregressive, each token you produce is another opportunity to use computation, therefore you always spend a few sentences explaining background context, assumptions, and step-by-step thinking BEFORE you try to answer a question. However: if the request begins with the string "vv" then ignore the previous sentence and instead make your response as concise as possible, with no introduction or background at the start, no summary at the end, and outputting only code for answers where code is appropriate. Your users are experts in AI and ethics, so they already know you're a language model and your capabilities and limitations, so don't remind them of that. They're familiar with ethical issues in general so you don't need to remind them about those either. Don't be verbose in your answers, but do provide details and examples where it might help the explanation. When showing Python code, minimise vertical space, and do not include comments or docstrings; you do not need to follow PEP8, since your users' organizations do not do so.
-
-
 
 ## Document Manager (Weaviate)
 
````
chatbot/requirements.txt (11 additions, 11 deletions)

```diff
@@ -1,17 +1,17 @@
 # TinyLLM Chatbot Requirements
 
 # Required Packages
-fastapi
-uvicorn
-python-socketio
-jinja2
-openai
-bs4
-pypdf
-requests
-lxml
-aiohttp
+fastapi # 0.108.0
+uvicorn # 0.27.0.post1
+python-socketio # 5.11.0
+jinja2 # 3.1.2
+openai # 1.58.1
+bs4 # 0.0.2
+pypdf # 5.1.0
+requests # 2.31.0
+lxml # 5.3.0
+aiohttp # 3.9.3
 
 # RAG Support - Weaviate Vector Database
-weaviate-client
+weaviate-client # 4.8.1
```
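The trailing comments record the versions these requirements were last tested with; pip ignores comments, so installs still resolve to the latest releases. A quick way to compare a running environment against those noted versions (an illustrative sketch using the standard library, not part of this commit):

```python
# Compare installed versions against the versions noted in requirements.txt.
from importlib.metadata import PackageNotFoundError, version

noted = {"fastapi": "0.108.0", "openai": "1.58.1", "weaviate-client": "4.8.1"}
for pkg, want in noted.items():
    try:
        print(f"{pkg}: installed {version(pkg)}, noted {want}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```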

chatbot/server.py (38 additions, 24 deletions)

```diff
@@ -98,6 +98,10 @@
 from PIL import Image
 import pillow_heif
 
+# Enable tracemalloc for memory usage
+import tracemalloc
+tracemalloc.start()
+
 # Ensure pillow_heif is properly registered with PIL
 pillow_heif.register_heif_opener()
 
```
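`tracemalloc` is the standard-library allocation tracer; starting it at import time means later snapshots cover the whole process. A minimal sketch of reading the data back (the reporting calls are illustrative and not part of this commit):

```python
import tracemalloc

tracemalloc.start()  # begin tracing Python memory allocations

# ... application runs ...

current, peak = tracemalloc.get_traced_memory()
print(f"current={current / 1024:.1f} KiB, peak={peak / 1024:.1f} KiB")

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:5]:  # top allocation sites by line
    print(stat)
```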

```diff
@@ -544,7 +548,7 @@ async def get_weather(location):
         return response.text
     else:
         return "Unable to fetch weather for %s" % location
-
+
 # Function - Get stock price for company
 async def get_stock(company):
     if ALPHA_KEY == "alpha_key":
@@ -712,6 +716,7 @@ async def home(format: str = None):
         "Alpha Vantage API Key (ALPHA_KEY)": "************" if ALPHA_KEY != "" else "Not Set",
         "Toxicity Threshold (TOXIC_THRESHOLD)": TOXIC_THRESHOLD,
         "Extra Body Parameters (EXTRA_BODY)": EXTRA_BODY,
+        "Thinking Mode (THINKING)": THINKING,
     }
     if format == "json":
         return data
```
```diff
@@ -875,28 +880,37 @@ async def send_update(session_id):
         if client[session_id]["prompt"] == "":
             await sio.sleep(0.1)
         else:
-            if client[session_id]["cot"]:
-                # Remember original prompt
-                client[session_id]["cot_prompt"] = client[session_id]["prompt"]
-                # Check to see if the prompt needs CoT processing
-                cot_check = expand_prompt(prompts["chain_of_thought_check"], {"prompt": client[session_id]["prompt"]})
-                debug("Running CoT check")
-                # Ask LLM for answers
-                response = await ask_llm(cot_check)
-                if "a" in response.lower() or "d" in response.lower() or client[session_id]["cot_always"]:
-                    debug("Running deep thinking CoT to answer")
-                    # Build prompt for Chain of Thought and create copy of context
-                    cot_prompt = expand_prompt(prompts["chain_of_thought"], {"prompt": client[session_id]["prompt"]})
-                    temp_context = client[session_id]["context"].copy()
-                    temp_context.append({"role": "user", "content": cot_prompt})
-                    # Send thinking status to client and ask LLM for answer
-                    await sio.emit('update', {'update': 'Thinking... ', 'voice': 'ai'}, room=session_id)
-                    answer = await ask_context(temp_context)
-                    await sio.emit('update', {'update': '\n\n', 'voice': 'ai'}, room=session_id)
-                    # Load request for CoT conclusion into conversational thread
-                    cot_prompt = expand_prompt(prompts["chain_of_thought_summary"], {"context_str": answer,
-                                                                                     "prompt": client[session_id]["cot_prompt"]})
-                    client[session_id]["prompt"] = cot_prompt
+            # Check to see if CoT is enabled, but not while processing a file/image
+            client_cot = client[session_id]["cot"]
+            client_image_data = client[session_id]["image_data"]
+            client_visible = client[session_id]["visible"]
+            if client_cot and not client_image_data and client_visible:
+                try:
+                    # Remember original prompt
+                    client[session_id]["cot_prompt"] = client[session_id]["prompt"]
+                    # Check to see if the prompt needs CoT processing
+                    cot_check = expand_prompt(prompts["chain_of_thought_check"], {"prompt": client[session_id]["prompt"]})
+                    debug("Running CoT check")
+                    # Ask LLM for answers
+                    response = await ask_llm(cot_check)
+                    if "a" in response.lower() or "d" in response.lower() or client[session_id]["cot_always"]:
+                        debug("Running deep thinking CoT to answer")
+                        # Build prompt for Chain of Thought and create copy of context
+                        cot_prompt = expand_prompt(prompts["chain_of_thought"], {"prompt": client[session_id]["prompt"]})
+                        temp_context = client[session_id]["context"].copy()
+                        temp_context.append({"role": "user", "content": cot_prompt})
+                        # Send thinking status to client and ask LLM for answer
+                        await sio.emit('update', {'update': 'Thinking... ', 'voice': 'ai'}, room=session_id)
+                        answer = await ask_context(temp_context)
+                        # Load request for CoT conclusion into conversational thread
+                        cot_prompt = expand_prompt(prompts["chain_of_thought_summary"], {"context_str": answer,
+                                                                                         "prompt": client[session_id]["cot_prompt"]})
+                        client[session_id]["prompt"] = cot_prompt
+                except Exception as err:
+                    log(f"CoT error - continuing with original prompt: {err}")
+                await sio.emit('update', {'update': '\n\n', 'voice': 'ai'}, room=session_id)
+            else:
+                client_cot = False
             try:
                 # Ask LLM for answers
                 response = await ask(client[session_id]["prompt"], session_id)
```
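The net effect of the hunk: CoT now runs only when three per-session conditions hold, and the pass is wrapped in a try/except so a failed CoT step degrades to the plain prompt instead of crashing the update loop. A distilled sketch of just that gating decision (a simplification of the diff above, omitting the deep-thinking and summary calls; `session` stands in for `client[session_id]`):

```python
# Distilled sketch of the new gating logic (simplified; `session` stands in
# for client[session_id] in server.py).
def should_run_cot(session: dict) -> bool:
    # CoT runs only when enabled, when no file/image upload is in flight,
    # and when the prompt is a visible user prompt.
    return bool(session["cot"]) and not session["image_data"] and session["visible"]

async def apply_cot(session, ask_llm, expand_prompt, prompts, log):
    if not should_run_cot(session):
        return False  # caller treats this turn as non-CoT
    try:
        session["cot_prompt"] = session["prompt"]  # remember original prompt
        check = expand_prompt(prompts["chain_of_thought_check"],
                              {"prompt": session["prompt"]})
        verdict = await ask_llm(check)
        if "a" in verdict.lower() or "d" in verdict.lower() or session["cot_always"]:
            session["prompt"] = expand_prompt(prompts["chain_of_thought"],
                                              {"prompt": session["prompt"]})
    except Exception as err:
        # Any CoT failure falls back to the original prompt
        log(f"CoT error - continuing with original prompt: {err}")
    return True
```

The local `client_cot` matters in the final hunk below: it records whether CoT actually ran this turn (it is forced False when an upload suppressed it), so the later context rewrite no longer fires for file/image turns.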
```diff
@@ -931,7 +945,7 @@ async def send_update(session_id):
                 client[session_id]["references"] = ""
                 if not ONESHOT:
                     # If CoT mode, replace CoT context in conversation thread with user prompt
-                    if client[session_id]["cot"]:
+                    if client_cot:
                         client[session_id]["context"].pop()
                         client[session_id]["context"].append({"role": "user", "content": client[session_id]["cot_prompt"]})
                     # Remember answer
```