Commit 4bdb9ef

Merge pull request #90 from DL4DS/dev2main
Update main with current state of dev
2 parents 81a188d + 8e1bf6f commit 4bdb9ef

48 files changed: +2470 -443 lines changed

.flake8

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+[flake8]
+max-line-length = 88
+extend-ignore = E203, E266, E501, W503
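
As a rough illustration of what this configuration expresses, the sketch below parses the same three lines with Python's standard `configparser`. The helper name `load_flake8_settings` is hypothetical and is not part of the commit. A line length of 88 matches Black's default, and E203 and W503 are codes commonly suppressed when Flake8 runs alongside Black.

```python
import configparser

# Contents of the .flake8 file added in this commit.
FLAKE8_CONFIG = """\
[flake8]
max-line-length = 88
extend-ignore = E203, E266, E501, W503
"""


def load_flake8_settings(text: str) -> dict:
    """Parse flake8 INI text into a small dictionary (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["flake8"]
    return {
        "max_line_length": section.getint("max-line-length"),
        # Split the comma-separated error codes into a clean list.
        "extend_ignore": [
            code.strip() for code in section["extend-ignore"].split(",")
        ],
    }


settings = load_flake8_settings(FLAKE8_CONFIG)
print(settings)
```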

.gitattributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+*.jpg filter=lfs diff=lfs merge=lfs -text
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+name: Code Quality and Security Checks
+
+on:
+  push:
+    branches: [ main, dev_branch ]
+  pull_request:
+    branches: [ main, dev_branch ]
+
+jobs:
+  code-quality:
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v3
+
+    - name: Set up Python
+      uses: actions/setup-python@v4
+      with:
+        python-version: '3.11'
+
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install flake8 black bandit
+
+    - name: Run Black
+      run: black --check .
+
+    - name: Run Flake8
+      run: flake8 .
+
+    - name: Run Bandit
+      run: |
+        bandit -r .
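
The same three checks can be reproduced locally before pushing. The following is a minimal sketch, not part of the commit: `run_checks` and `main` are hypothetical names, and the sketch assumes `black`, `flake8`, and `bandit` are installed and on `PATH`. Unlike the workflow, which stops at the first failing step, it runs every check and reports all failures.

```python
import subprocess

# Commands taken from the workflow's "Run Black", "Run Flake8",
# and "Run Bandit" steps.
CHECKS = [
    ["black", "--check", "."],
    ["flake8", "."],
    ["bandit", "-r", "."],
]


def run_checks(checks, runner=subprocess.run):
    """Run each command in order; return the names of the tools that failed."""
    failed = []
    for command in checks:
        try:
            ok = runner(command).returncode == 0
        except FileNotFoundError:
            # The tool is not installed; treat that as a failure.
            ok = False
        if not ok:
            failed.append(command[0])
    return failed


def main() -> int:
    """Return a process exit code: 0 if every check passed, 1 otherwise."""
    failures = run_checks(CHECKS)
    for name in failures:
        print(f"{name} reported problems or is not installed")
    return 1 if failures else 0
```

Calling `sys.exit(main())` from the repository root would mirror the pass/fail outcome of the CI job. The `runner` parameter exists only so the function can be exercised without invoking the real tools.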

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -165,7 +165,9 @@ cython_debug/
 .ragatouille/*
 */__pycache__/*
 .chainlit/translations/
+code/.chainlit/translations/
 storage/logs/*
 vectorstores/*
 
-*/.files/*
+*/.files/*
+code/storage/models/

Dockerfile

Lines changed: 8 additions & 1 deletion
@@ -26,6 +26,13 @@ WORKDIR /code/code
 
 RUN --mount=type=secret,id=HUGGINGFACEHUB_API_TOKEN,mode=0444,required=true
 RUN --mount=type=secret,id=OPENAI_API_KEY,mode=0444,required=true
+RUN --mount=type=secret,id=CHAINLIT_URL,mode=0444,required=true
+RUN --mount=type=secret,id=LITERAL_API_URL,mode=0444,required=true
+RUN --mount=type=secret,id=LLAMA_CLOUD_API_KEY,mode=0444,required=true
+RUN --mount=type=secret,id=OAUTH_GOOGLE_CLIENT_ID,mode=0444,required=true
+RUN --mount=type=secret,id=OAUTH_GOOGLE_CLIENT_SECRET,mode=0444,required=true
+RUN --mount=type=secret,id=LITERAL_API_KEY_LOGGING,mode=0444,required=true
+RUN --mount=type=secret,id=CHAINLIT_AUTH_SECRET,mode=0444,required=true
 
 # Default command to run the application
-CMD ["sh", "-c", "python -m modules.vectorstore.store_manager && chainlit run main.py --host 0.0.0.0 --port 7860"]
+CMD ["sh", "-c", "python -m modules.vectorstore.store_manager && uvicorn app:app --host 0.0.0.0 --port 7860"]
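
With nine required secrets, a missing value is easier to diagnose if the application checks for all of them at startup. The sketch below is one possible check and is not taken from the commit. It assumes the secrets reach the running process as environment variables with the same names as the secret IDs, which the diff does not show, and `missing_secrets` is a hypothetical helper.

```python
import os

# Secret IDs declared in the Dockerfile's `RUN --mount=type=secret` lines.
REQUIRED_SECRETS = [
    "HUGGINGFACEHUB_API_TOKEN",
    "OPENAI_API_KEY",
    "CHAINLIT_URL",
    "LITERAL_API_URL",
    "LLAMA_CLOUD_API_KEY",
    "OAUTH_GOOGLE_CLIENT_ID",
    "OAUTH_GOOGLE_CLIENT_SECRET",
    "LITERAL_API_KEY_LOGGING",
    "CHAINLIT_AUTH_SECRET",
]


def missing_secrets(env=None, required=REQUIRED_SECRETS):
    """Return the required names that are unset or empty in `env`.

    Assumption: secrets are exposed as environment variables named
    after their secret IDs.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

A startup hook could call `missing_secrets()` and raise an error listing the returned names, so that a misconfigured deployment fails with one clear message.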

README.md

Lines changed: 18 additions & 39 deletions
@@ -15,10 +15,14 @@ You can find a "production" implementation of the Tutor running live at [DL4DS T
 Hugging Face [Space](https://huggingface.co/spaces/dl4ds/dl4ds_tutor). It is pushed automatically from the `main` branch of this repo by this
 [Actions Workflow](https://github.com/DL4DS/dl4ds_tutor/blob/main/.github/workflows/push_to_hf_space.yml) upon a push to `main`.
 
-A "development" version of the Tutor is running live at [DL4DS Tutor -- Dev](https://dl4ds-tutor-dev.hf.space) from this Hugging Face
+
+A "development" version of the Tutor is running live at [DL4DS Tutor -- Dev](https://dl4ds-tutor-dev.hf.space/) from this Hugging Face
 [Space](https://huggingface.co/spaces/dl4ds/tutor_dev). It is pushed automatically from the `dev_branch` branch of this repo by this
 [Actions Workflow](https://github.com/DL4DS/dl4ds_tutor/blob/dev_branch/.github/workflows/push_to_hf_space_prototype.yml) upon a push to `dev_branch`.
 
+## Setup
+
+Please visit [setup](https://dl4ds.github.io/dl4ds_tutor/guide/setup/) for more information on setting up the project.
 
 ## Running Locally
 
@@ -34,7 +38,7 @@ A "development" version of the Tutor is running live at [DL4DS Tutor -- Dev](htt
 3. **To test Data Loading (Optional)**
 ```bash
 cd code
-python -m modules.dataloader.data_loader
+python -m modules.dataloader.data_loader --links "your_pdf_link"
 ```
 
 4. **Create the Vector Database**
@@ -43,47 +47,16 @@ A "development" version of the Tutor is running live at [DL4DS Tutor -- Dev](htt
 python -m modules.vectorstore.store_manager
 ```
 - Note: You need to run the above command when you add new data to the `storage/data` directory, or if the `storage/data/urls.txt` file is updated.
-- Alternatively, you can set `["vectorstore"]["embedd_files"]` to `True` in the `code/modules/config/config.yaml` file, which will embed files from the storage directory every time you run the below chainlit command.
 
-5. **Run the Chainlit App**
+6. **Run the FastAPI App**
 ```bash
-chainlit run main.py
+cd code
+uvicorn app:app --port 7860
 ```
 
-See the [docs](https://github.com/DL4DS/dl4ds_tutor/tree/main/docs) for more information.
-
-## File Structure
-
-```plaintext
-code/
-├── modules
-│   ├── chat  # Contains the chatbot implementation
-│   ├── chat_processor  # Contains the implementation to process and log the conversations
-│   ├── config  # Contains the configuration files
-│   ├── dataloader  # Contains the implementation to load the data from the storage directory
-│   ├── retriever  # Contains the implementation to create the retriever
-│   └── vectorstore  # Contains the implementation to create the vector database
-├── public
-│   ├── logo_dark.png  # Dark theme logo
-│   ├── logo_light.png  # Light theme logo
-│   └── test.css  # Custom CSS file
-└── main.py
-
-
-docs/  # Contains the documentation to the codebase and methods used
+## Documentation
 
-storage/
-├── data  # Store files and URLs here
-├── logs  # Logs directory, includes logs on vector DB creation, tutor logs, and chunks logged in JSON files
-└── models  # Local LLMs are loaded from here
-
-vectorstores/  # Stores the created vector databases
-
-.env  # This needs to be created, store the API keys here
-```
-- `code/modules/vectorstore/vectorstore.py`: Instantiates the `VectorStore` class to create the vector database.
-- `code/modules/vectorstore/store_manager.py`: Instantiates the `VectorStoreManager:` class to manage the vector database, and all associated methods.
-- `code/modules/retriever/retriever.py`: Instantiates the `Retriever` class to create the retriever.
+Please visit the [docs](https://dl4ds.github.io/dl4ds_tutor/) for more information.
 
 
 ## Docker
@@ -97,4 +70,10 @@ docker run -it --rm -p 8000:8000 dev
 
 ## Contributing
 
-Please create an issue if you have any suggestions or improvements, and start working on it by creating a branch and by making a pull request to the main branch.
+Please create an issue if you have any suggestions or improvements, and start working on it by creating a branch and by making a pull request to the `dev_branch`.
+
+Please visit [contribute](https://dl4ds.github.io/dl4ds_tutor/guide/contribute/) for more information on contributing.
+
+## Future Work
+
+For more information on future work, please visit [roadmap](https://dl4ds.github.io/dl4ds_tutor/guide/readmap/).

code/.chainlit/config.toml

Lines changed: 8 additions & 6 deletions
@@ -20,7 +20,7 @@ allow_origins = ["*"]
 
 [features]
 # Process and display HTML in messages. This can be a security risk (see https://stackoverflow.com/questions/19603097/why-is-it-dangerous-to-render-user-generated-html-or-javascript)
-unsafe_allow_html = false
+unsafe_allow_html = true
 
 # Process and display mathematical expressions. This can clash with "$" characters in messages.
 latex = true
@@ -49,6 +49,8 @@ auto_tag_thread = true
 # Sample rate of the audio
 sample_rate = 44100
 
+edit_message = true
+
 [UI]
 # Name of the assistant.
 name = "AI Tutor"
@@ -59,11 +61,11 @@ name = "AI Tutor"
 # Large size content are by default collapsed for a cleaner ui
 default_collapse_content = true
 
-# Hide the chain of thought details from the user in the UI.
-hide_cot = true
+# Chain of Thought (CoT) display mode. Can be "hidden", "tool_call" or "full".
+cot = "hidden"
 
 # Link to your github repo. This will add a github button in the UI's header.
-# github = "https://github.com/DL4DS/dl4ds_tutor"
+github = "https://github.com/DL4DS/dl4ds_tutor"
 
 # Specify a CSS file that can be used to customize the user interface.
 # The CSS file can be served from the public directory or via an external link.
@@ -85,7 +87,7 @@ custom_meta_image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/f/
 # custom_build = "./public/build"
 
 [UI.theme]
-default = "dark"
+default = "light"
 #layout = "wide"
 #font_family = "Inter, sans-serif"
 # Override default MUI light theme. (Check theme.ts)
@@ -115,4 +117,4 @@ custom_meta_image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/f/
 #secondary = "#BDBDBD"
 
 [meta]
-generated_by = "1.1.304"
+generated_by = "1.1.402"
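
The third hunk replaces the boolean `hide_cot` with the string-valued `cot`. A minimal sketch of that migration for an already-parsed `[UI]` table might look like the following. `migrate_cot_setting` is a hypothetical helper, not a Chainlit API, and mapping `hide_cot = false` to `"full"` is an assumption: the diff only shows `true` becoming `"hidden"`.

```python
# Modes named in the new config comment.
VALID_COT_MODES = ("hidden", "tool_call", "full")


def migrate_cot_setting(ui_settings: dict) -> dict:
    """Return a copy of the [UI] settings with `hide_cot` replaced by `cot`."""
    migrated = dict(ui_settings)
    if "hide_cot" in migrated:
        hide = migrated.pop("hide_cot")
        # Assumption: False maps to "full"; the diff only shows
        # True -> "hidden". An explicit `cot` value takes precedence.
        migrated.setdefault("cot", "hidden" if hide else "full")
    if "cot" in migrated and migrated["cot"] not in VALID_COT_MODES:
        raise ValueError(f"unknown cot mode: {migrated['cot']!r}")
    return migrated


old_ui = {"name": "AI Tutor", "hide_cot": True}
print(migrate_cot_setting(old_ui))
```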

code/__init__.py

Lines changed: 0 additions & 1 deletion
This file was deleted.
