A tool to find and extract course offerings from the CUD Portal using browser automation.
- Python 3.10+ (3.13 recommended)
- Gemini API key (from Google AI Studio)
### Linux

On Linux systems (particularly Arch-based distributions), you'll need to use a virtual environment due to the externally managed environment restrictions (PEP 668).

    # Create a virtual environment
    python -m venv venv
    # Activate the virtual environment
    source venv/bin/activate
    # Install dependencies
    pip install -r requirements.txt
    # Install Playwright browsers
    playwright install

### Windows
On Windows, you can install packages directly:

    # Install dependencies
    pip install -r requirements.txt
    # Install Playwright browsers
    playwright install

Alternatively, create a virtual environment first (recommended). The activation command below is for Unix-like shells:

    # Create a virtual environment (recommended)
    python -m venv venv
    # Activate the virtual environment
    source venv/bin/activate
    # Install dependencies
    pip install -r requirements.txt
    # Install Playwright browsers
    playwright install

To set up your Gemini API key:

- Go to Google AI Studio
- Create a new API key
- Create a `.env` file in the project root with the following content:

      GEMINI_API_KEY=your_api_key_here

Run the command-line version:

    python offerings_scraper.py

Follow the prompts to enter your CUD Portal credentials and search criteria. The results will be saved to `results.csv` and `course_offerings.xlsx`.
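
Once a run has finished, a quick way to sanity-check the output is to read `results.csv` back with the standard library. The sketch below is not part of the project; `summarize_csv` is a hypothetical helper, and it assumes the file is UTF-8 encoded with a header row, which this README does not confirm.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Iterating consumes the data rows; the header is read on first access.
        row_count = sum(1 for _ in reader)
        columns = list(reader.fieldnames or [])
    return columns, row_count


if __name__ == "__main__":
    results = Path("results.csv")  # file name taken from the README
    if results.exists():
        columns, row_count = summarize_csv(results)
        print(f"{row_count} rows, columns: {columns}")
    else:
        print("results.csv not found; run offerings_scraper.py first.")
```

Printing the column names first is useful because the exact columns the scraper writes are not listed here.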
For a more user-friendly experience, run the Streamlit web application:

    streamlit run app.py

This will open a web browser with the Schedule Finder interface where you can:
- Log in with your CUD Portal credentials and Gemini API key
- Chat with the assistant to extract course data or search for specific information
- Browse, filter, and download course offerings data
- Search for courses by instructor, year, or course code
For example, you can:
- Ask "Extract all course offerings from the CUD portal" to scrape the data
- Ask "Show me all courses taught by Dr. Said Elnaffar" to filter results
- Ask "What courses are available for 2nd year students?" to get year-specific offerings
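
The instructor query above amounts to a case-insensitive filter over the extracted rows. As a rough offline equivalent (not how the assistant itself is implemented), the sketch below filters rows loaded with `csv.DictReader`. The column names `Course` and `Instructor`, the helper `filter_rows`, and the sample rows are all hypothetical; check the header of your own `results.csv` before adapting it.

```python
import csv
import io


def filter_rows(rows, column, needle):
    """Keep rows whose `column` contains `needle`, ignoring case."""
    needle = needle.casefold()
    return [row for row in rows if needle in (row.get(column) or "").casefold()]


# Hypothetical sample data; the real results.csv columns may differ.
SAMPLE = """Course,Instructor
COURSE-A,Dr. Said Elnaffar
COURSE-B,Instructor B
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
matches = filter_rows(rows, "Instructor", "elnaffar")
print([row["Course"] for row in matches])  # prints ['COURSE-A']
```

To run it against real output, replace `io.StringIO(SAMPLE)` with an open handle on `results.csv`.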
If you see this error on Linux:

    error: externally-managed-environment

This means your Python installation is managed by the system package manager. Always use a virtual environment as described in the Linux installation instructions above.
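
If you are unsure whether the virtual environment is actually active, the interpreter can tell you. This minimal sketch relies on the standard `venv` behaviour that `sys.prefix` differs from `sys.base_prefix` inside an environment; `in_virtualenv` is a hypothetical helper name, not something the project provides.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run 'source venv/bin/activate' first.")
```

Run it with the same `python` you use for `pip install`; if it reports no active environment, `pip` will hit the externally managed restriction.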
If you encounter issues with Playwright browser installation:

    # Try running with admin privileges
    sudo playwright install
    # Or specify the browser
    playwright install chromium
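
To confirm that the browser install worked, you can try launching Chromium from Python. The sketch below uses Playwright's synchronous API (`sync_playwright`, `chromium.launch`); the helper `check_chromium` is hypothetical, and it assumes Chromium is the browser the scraper uses, which this README does not state.

```python
def check_chromium():
    """Return (ok, message) describing whether Chromium can be launched."""
    try:
        from playwright.sync_api import sync_playwright
    except ImportError as exc:
        return False, f"Playwright is not importable: {exc}"
    try:
        with sync_playwright() as p:
            browser = p.chromium.launch()  # headless by default
            version = browser.version
            browser.close()
    except Exception as exc:  # e.g. browser binaries not downloaded yet
        return False, f"Chromium failed to launch: {exc}"
    return True, f"Chromium {version} launched successfully"


if __name__ == "__main__":
    ok, message = check_chromium()
    print(message)
```

A failure message mentioning a missing executable usually means `playwright install` has not completed for the active environment.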