A PyQt6 + VLC-based language learning video player with interactive subtitles and word-level dictionary lookup.
VLCL is a custom video player designed for language acquisition. Unlike standard media players, it renders subtitles in a fully interactive overlay, enabling word-by-word interaction, dictionary lookup, and future spaced repetition integration.
The goal is to turn passive video watching into active vocabulary learning.

Video playback:
- VLC-backed media playback (python-vlc)
- Supports local video files (MKV, MP4, etc.)
- Basic play/pause/seek integration via the VLC API


Subtitles:
- External .srt loading via pysubs2
- Manual subtitle timing engine (not VLC subtitles)
- Subtitle selection based on playback timestamp
- Supports line navigation and replay

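
A minimal sketch of how timestamp-based subtitle selection could work, assuming cues do not overlap. `Cue` and `SubtitleIndex` are hypothetical names, not the project's actual classes. In the real player, the cues would presumably come from pysubs2 events (which expose `start` and `end` in milliseconds) and the timestamp from python-vlc's `MediaPlayer.get_time()`, which also reports milliseconds.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence


class Cue(NamedTuple):
    """One subtitle line with its display window in milliseconds (hypothetical type)."""
    start_ms: int
    end_ms: int
    text: str


class SubtitleIndex:
    """Finds the cue active at a playback timestamp (hypothetical helper).

    Assumes cues do not overlap; overlapping cues would need a different lookup.
    """

    def __init__(self, cues: Sequence[Cue]) -> None:
        # Sort once by start time so every lookup can use binary search.
        self._cues = sorted(cues, key=lambda c: c.start_ms)
        self._starts = [c.start_ms for c in self._cues]

    def cue_at(self, time_ms: int) -> Optional[Cue]:
        # Index of the last cue whose start time is <= time_ms.
        i = bisect_right(self._starts, time_ms) - 1
        if i >= 0 and time_ms < self._cues[i].end_ms:
            return self._cues[i]
        return None  # before the first cue, or in a gap between cues

    def replay_position(self, time_ms: int) -> Optional[int]:
        """Timestamp to seek to in order to replay the current cue, if any."""
        cue = self.cue_at(time_ms)
        return cue.start_ms if cue is not None else None
```

Calling `cue_at` from a timer on every tick keeps the overlay independent of VLC's own subtitle rendering, and `replay_position` gives the seek target for the replay key.
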

Interactive overlay:
- Fully custom QWidget overlay rendered above the video
- Word-level segmentation of subtitles
- Clickable words (mouse hit-testing per word bounding box)
- Transparent rendering layer over video
- Toggle visibility (S key)

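
The per-word hit-testing described above can be sketched without any Qt dependency. This is an illustration only: `WordBox`, `layout_words`, and `hit_test` are hypothetical names, and the fixed `char_width` is a stand-in for real font measurement (in a Qt overlay, widths would more likely come from `QFontMetrics.horizontalAdvance`).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WordBox:
    """Axis-aligned bounding box for one rendered word (hypothetical type)."""
    word: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def layout_words(line: str, origin_x: float = 0.0, origin_y: float = 0.0,
                 char_width: float = 10.0, line_height: float = 24.0) -> List[WordBox]:
    """Lay words out left to right using a fixed per-character width (a heuristic)."""
    boxes: List[WordBox] = []
    x = origin_x
    for word in line.split():
        width = len(word) * char_width
        boxes.append(WordBox(word, x, origin_y, width, line_height))
        x += width + char_width  # one character of spacing between words
    return boxes


def hit_test(boxes: List[WordBox], px: float, py: float) -> Optional[str]:
    """Return the word under the mouse position, or None if the click missed."""
    for box in boxes:
        if box.contains(px, py):
            return box.word
    return None
```

For example, with the defaults, `hit_test(layout_words("I like tea"), 25, 5)` lands inside the second box, while a click at x = 15 falls in the gap between the first two words and returns `None`.
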

Dictionary lookup:
- Online dictionary API integration (dictionaryapi.dev)
- Async lookup via Qt threads
- Popup window display on word click
- Basic caching (planned/partial depending on version)

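
One possible shape for the basic caching mentioned above is a thin wrapper around whatever function performs the network request. `CachedLookup` and its `fetch` parameter are hypothetical; in the app, `fetch` would be the code that calls the dictionary API, and the wrapper would be used from the worker thread rather than the UI thread.

```python
from typing import Any, Callable, Dict, Optional


class CachedLookup:
    """In-memory cache in front of a dictionary fetch function (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], Optional[Any]]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, Any] = {}

    def lookup(self, word: str) -> Optional[Any]:
        # Normalize the key so "Hello" and "hello " share one cache entry.
        key = word.strip().lower()
        if not key:
            return None
        if key in self._cache:
            return self._cache[key]
        result = self._fetch(key)
        if result is not None:
            # Cache only successful lookups so a transient failure can be retried.
            self._cache[key] = result
        return result
```

Keeping the fetch function injectable also makes the cache easy to exercise in tests without touching the network.
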

Keyboard shortcuts:
- S → toggle subtitle overlay
- R → replay current subtitle segment


Tech stack:
- Python 3.9+
- PyQt6 (UI + event loop)
- VLC (media playback engine)
- pysubs2 (subtitle parsing)
- Dictionary API (REST)

graph TD
A[PlayerWidget]
A --> B[VLC MediaPlayer]
A --> C[SubtitleEngine<br/>SRT parsing + timing]
A --> D[SubtitleOverlay<br/>word rendering + interaction]
A --> E[DictionaryPopup<br/>lookup UI]
A --> F[DictionaryWorker<br/>async API fetch]
flowchart LR
A[Video Playback Time] --> B[SubtitleEngine]
B --> C[Subtitle Line]
C --> D[SubtitleOverlay renders words]
D --> E[User clicks word]
E --> F[DictionaryWorker async request]
F --> G[Dictionary API response]
G --> H[DictionaryPopup display]
- Occasional VLC decoder timestamp warnings (harmless)
- The dictionary lookup can return None and must be handled safely (partially done; still needs a loading state and clear error messages)
- Overlay positioning still being tuned
- Word bounding logic is approximate (not glyph-accurate yet)
- No responsive layout scaling for different video sizes
- No keyboard navigation between words/subtitles
- No pause-on-click behavior yet
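
To illustrate the `None` handling issue listed above, here is a sketch of a normalization step that always hands the popup a well-formed object. `Entry`, `Sense`, and `normalize` are hypothetical names. The field names read from the raw data (`word`, `phonetic`, `meanings`, `partOfSpeech`, `definitions`, `definition`, `example`) reflect the response shape dictionaryapi.dev is assumed to return, and should be checked against the live API.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Sense:
    part_of_speech: str
    definition: str
    example: Optional[str] = None


@dataclass
class Entry:
    """What the popup renders; never None, even when the lookup failed."""
    word: str
    phonetic: Optional[str] = None
    senses: List[Sense] = field(default_factory=list)
    error: Optional[str] = None


def normalize(word: str, raw: Any) -> Entry:
    """Convert a raw API payload (assumed shape) into an Entry the UI can trust."""
    if raw is None:
        return Entry(word, error="Lookup failed. Check your connection and try again.")
    # A successful response is assumed to be a non-empty list of entry dicts.
    if not isinstance(raw, list) or not raw or not isinstance(raw[0], dict):
        return Entry(word, error="No definition found.")
    first = raw[0]
    entry = Entry(word=str(first.get("word") or word),
                  phonetic=first.get("phonetic") or None)
    for meaning in first.get("meanings") or []:
        if not isinstance(meaning, dict):
            continue
        pos = meaning.get("partOfSpeech") or "unknown"
        for item in meaning.get("definitions") or []:
            if isinstance(item, dict) and item.get("definition"):
                entry.senses.append(
                    Sense(pos, item["definition"], item.get("example") or None))
    if not entry.senses:
        entry.error = "No definition found."
    return entry
```

With this in place, the popup only needs to check `entry.error` and iterate `entry.senses`, and never touches the raw payload.
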

Planned stability fixes:
- Fix dictionary normalization layer (strict schema)
- Remove all raw API dependency from UI layer
- Add safe fallback rendering for missing data
- Improve word tokenization (punctuation handling)
- Fix overlay sizing relative to video frame

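
As a sketch of what improved punctuation handling might look like, the tokenizer below splits a subtitle line into word and non-word tokens while keeping every character, so the overlay can still render the line exactly as written. `Token`, `tokenize`, and `lookup_key` are hypothetical names, and the regular expression is one reasonable choice, not the project's current logic.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    text: str
    is_word: bool  # True only for tokens that should be clickable


# A word is a run of letters, optionally joined by apostrophes or hyphens
# ("don't", "well-known"). Digits, whitespace, and any other single character
# become non-word tokens, so no input character is ever dropped.
_TOKEN_RE = re.compile(
    r"(?P<word>[^\W\d_]+(?:['’-][^\W\d_]+)*)|\d+|\s+|.",
    re.DOTALL,
)


def tokenize(line: str) -> List[Token]:
    """Split a subtitle line into tokens that concatenate back to the input."""
    return [Token(m.group(0), m.lastgroup == "word")
            for m in _TOKEN_RE.finditer(line)]


def lookup_key(token: Token) -> str:
    """Form of the word to send to the dictionary: lowercase, straight apostrophe."""
    return token.text.lower().replace("’", "'")
```

Only tokens with `is_word` set would get a bounding box and a click handler; punctuation is drawn but ignored by hit-testing.
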

Planned interaction improvements:
- Pause video on word click
- Auto-resume after popup close
- Add hover highlight on words
- Add subtitle click → seek to timestamp
- Add keyboard word navigation (left/right word selection)

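
The pause-on-click and auto-resume items could be handled by a small controller that remembers whether playback was running when the word was clicked. This is a sketch: `PauseOnLookup` and `PlayerLike` are hypothetical, and the two player methods are modeled on python-vlc's `MediaPlayer.is_playing()` and `MediaPlayer.set_pause()`.

```python
from typing import Protocol


class PlayerLike(Protocol):
    """The two player methods this sketch relies on (modeled on python-vlc)."""
    def is_playing(self) -> bool: ...
    def set_pause(self, do_pause: int) -> None: ...


class PauseOnLookup:
    """Pauses playback on word click and resumes it when the popup closes."""

    def __init__(self, player: PlayerLike) -> None:
        self._player = player
        self._resume_on_close = False

    def on_word_clicked(self) -> None:
        # Only pause (and later resume) if playback was actually running, so a
        # video the user paused by hand stays paused after the popup closes.
        if self._player.is_playing():
            self._resume_on_close = True
            self._player.set_pause(1)

    def on_popup_closed(self) -> None:
        if self._resume_on_close:
            self._player.set_pause(0)
        self._resume_on_close = False
```

Clicking a second word while the popup is still open leaves the resume flag untouched, so playback still resumes once the popup finally closes.
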

Planned dictionary features:
- Add phonetics + audio pronunciation playback
- Multiple meanings display (part-of-speech grouping)
- Example sentences
- Language selection (EN → FR/DE/etc.)
- Offline fallback dictionary cache


Planned vocabulary tracking:
- Save clicked words to local database (SQLite)
- Spaced repetition system (Anki-like scheduling)
- “Known / Learning / Unknown” tagging
- Word frequency tracking per video
- Subtitle export with highlighted vocabulary

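
As a rough idea of how saving clicked words and the Known / Learning / Unknown tagging might start out, the sketch below uses Python's built-in `sqlite3` module. The table layout and the function names (`open_store`, `record_click`, `set_status`) are assumptions made for illustration. Click counts here are global; per-video frequency tracking would need an extra column or table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS words (
    word   TEXT PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'unknown'
           CHECK (status IN ('known', 'learning', 'unknown')),
    clicks INTEGER NOT NULL DEFAULT 0
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the word database; ':memory:' is handy for experiments."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_click(conn: sqlite3.Connection, word: str) -> None:
    """Insert the word if it is new, then increment its click counter."""
    key = word.strip().lower()
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (key,))
        conn.execute("UPDATE words SET clicks = clicks + 1 WHERE word = ?", (key,))


def set_status(conn: sqlite3.Connection, word: str, status: str) -> None:
    """Tag a saved word; the CHECK constraint rejects unsupported statuses."""
    with conn:
        conn.execute("UPDATE words SET status = ? WHERE word = ?",
                     (status, word.strip().lower()))
```

A spaced repetition scheduler could later build on the same table by adding due-date and interval columns.
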

Planned subtitle rendering improvements:
- Better word alignment (glyph-level layout, not heuristic widths)
- Multi-line subtitle shaping engine
- Subtitle timing smoothing (reduce jitter)
- Optional dual subtitles (native + target language)


Longer-term ideas:
- Auto subtitle detection from MKV tracks (VLC integration optional)
- Whisper-based subtitle generation (AI transcription)
- Sentence segmentation vs raw subtitle lines
- Context-aware dictionary suggestions

This project is intentionally:
- local-first
- low-latency
- interaction-heavy
- built around comprehension, not passive viewing

The goal is not a media player with subtitles.
It is:
a reading + listening + vocabulary acquisition environment built on top of video.
