LWT 3.2.0: Shell-free mobile client, security hardening, and the Kids' Library

@HugoFara released this 23 Jun 14:35

A feature release built around two themes: making LWT a real backend for a shell-free mobile client, and a broad, methodical security-hardening pass across the auth, upload, and fetch surfaces. It also adds a new source of easy reading material, the Global Digital Library.

Highlights

Shell-free / mobile client
The web shell is no longer required to render the app. The global navbar, the reader chrome (book navigation + audio player), and UI translations are now served from the REST API and rendered client-side, and the review surface and feedback sounds are bundled. A packaged client can choose its server, register and log in in-app, and keep its session alive with proactive token refresh (with a clean 401 teardown). This is what the Lukaisu Android client connects to.
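
As a rough illustration of the session handling described above, a client could refresh its token shortly before expiry and drop the session on a 401. This is a minimal sketch, not LWT's or Lukaisu's actual code: the `Token` and `Session` names, the `margin` parameter, and the refresh callback are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_at: float  # Unix timestamp in seconds


class Session:
    """Hypothetical client-side session holder (illustration only)."""

    def __init__(self, token: Token, refresh: Callable[[Token], Token],
                 margin: float = 60.0) -> None:
        self.token: Optional[Token] = token
        self._refresh = refresh  # assumed to call the server's refresh endpoint
        self._margin = margin    # refresh this many seconds before expiry

    def ensure_fresh(self, now: float) -> None:
        # Proactive refresh: renew while the old token is still valid,
        # instead of waiting for a request to fail.
        if self.token is not None and self.token.expires_at - now <= self._margin:
            self.token = self._refresh(self.token)

    def handle_response(self, status: int) -> bool:
        # Clean 401 teardown: forget the token so the UI can show the login screen.
        if status == 401:
            self.token = None
            return False
        return True
```

Calling `ensure_fresh` before each request (or on a timer) keeps the session alive without the user noticing, while `handle_response` gives a single place to react when the server rejects the token anyway.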

New text source: the Global Digital Library ("Kids' Library")
Browse and search openly-licensed (CC-BY / CC-BY-SA) children's and early-grade readers — including StoryWeaver content — straight from the New Text page, filling the gap in easy texts that Gutenberg and Wikisource leave. Books import via ePUB extraction, image-only picture books are rejected, and difficulty tiers come from GDL's reading levels. The home page shows beginner-aware GDL suggestions: low-vocabulary readers see the easy books first, advanced readers see them below the classics.

Registration without email + recovery code + captcha
The username is now the unique identity, so sign-up needs only a username + password. Email becomes an optional recovery channel. Accounts without an email get a recovery code that is shown only once; the /password/recover reset flow rotates the code each time it is used. Registration is protected by a self-hosted ALTCHA proof-of-work captcha (no third-party service, no user puzzle; ALTCHA_ENABLED / ALTCHA_HMAC_KEY), plus a honeypot and submission-timing check.
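
To make the proof-of-work idea concrete, here is a generic hash-based sketch: the server signs a challenge with an HMAC key, the browser brute-forces a number, and the server checks both the signature and the solution. It is written under stated assumptions and is not ALTCHA's actual wire format or LWT's code; the function names, field names, and `max_number` default are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge(hmac_key: bytes, max_number: int = 50_000) -> dict:
    """Server side: hide a secret number behind a hash and sign the result."""
    salt = secrets.token_hex(12)
    number = secrets.randbelow(max_number + 1)
    challenge = hashlib.sha256(f"{salt}{number}".encode()).hexdigest()
    signature = hmac.new(hmac_key, challenge.encode(), hashlib.sha256).hexdigest()
    return {"salt": salt, "challenge": challenge,
            "signature": signature, "max_number": max_number}


def solve(ch: dict):
    """Client side: the 'work' is a brute-force search for the number."""
    for n in range(ch["max_number"] + 1):
        if hashlib.sha256(f"{ch['salt']}{n}".encode()).hexdigest() == ch["challenge"]:
            return n
    return None


def verify(hmac_key: bytes, ch: dict, number: int) -> bool:
    """Server side: check the challenge is ours, then check the solution."""
    expected = hmac.new(hmac_key, ch["challenge"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ch["signature"]):
        return False  # challenge was not issued with this key, or was altered
    digest = hashlib.sha256(f"{ch['salt']}{number}".encode()).hexdigest()
    return hmac.compare_digest(digest, ch["challenge"])
```

Because the signature is derived from a server-held key, the server does not have to store issued challenges to recognize its own; a real deployment would still need expiry and replay protection, which this sketch omits.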

StarDict dictionary uploads via archives (#233)
The import form now accepts .zip, .tar.gz, .tar.bz2, .tar.xz, and .tgz containing the StarDict triplet, and FreeDict downloads import directly. Extraction is shared via a new ArchiveExtractor (zip-bomb cap, path-traversal guard, automatic cleanup).
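
The guards named above (size cap, path-traversal check, cleanup) can be sketched for the zip case as follows. This is an illustration in Python rather than LWT's ArchiveExtractor; the function name `safe_extract_zip` and the limits are assumptions chosen for the example.

```python
import io
import os
import zipfile
from typing import List


def safe_extract_zip(data: bytes, dest: str,
                     max_total: int = 50 * 1024 * 1024,
                     max_files: int = 100) -> List[str]:
    """Validate every entry before extracting anything (hypothetical helper)."""
    dest_root = os.path.realpath(dest)
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        # Zip-bomb cap: bound the entry count and the declared uncompressed size.
        # A stricter version could also count bytes while streaming each entry.
        if len(infos) > max_files:
            raise ValueError("too many entries")
        if sum(info.file_size for info in infos) > max_total:
            raise ValueError("archive expands beyond the size cap")
        # Path-traversal guard: every resolved target must stay under dest_root.
        for info in infos:
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {info.filename!r}")
        zf.extractall(dest_root)
        return [info.filename for info in infos]
```

Running the extraction inside `tempfile.TemporaryDirectory()` gives the automatic cleanup for free: the directory is removed when the `with` block exits, whether the import succeeded or raised.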

Security hardening

A multi-phase audit closed a wide range of issues, each with regression tests:

  • XSS: fixed json_encode-into-<script> breakouts (missing JSON_HEX_TAG | JSON_HEX_AMP), DOM sinks in the word popup / tooltips / Glosbe translations, and the addslashes-into-attribute anti-pattern (feed browse, confirm dialogs).
  • CSRF: added real CSRF enforcement to the auth POST endpoints (/login, /register, /password/*, email re-verification) and fixed bulk vocabulary / texts actions that posted without a token.
  • Auth: open-redirect fix on auth_redirect, timing-safe OAuth state comparison, and invalidation of remember-me + API tokens on password change/reset.
  • Authz / IDOR: cross-table ownership guards on dictionaries, feeds, and sentence lookups (languageBelongsToCurrentUser).
  • SSRF: outbound fetches (RSS, web/article extractors, Gutenberg, Wiktionary) now route through a central safeHttpGet that disables stream-level redirects and re-validates every hop.
  • Uploads: defense in depth for importers — filename sanitization at the boundary, tar list-before-extract with file caps, size caps on subtitle/JSON/CSV imports, BOM/UTF-16 handling, and reliable temp-file cleanup.
  • Audio: hardened position save (pagehide + sendBeacon, periodic checkpoint), float precision, Whisper MIME re-validation, and a rate limit on transcription.
  • Dependency scans (composer audit, npm audit --omit=dev) report 0 advisories.
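
The SSRF item above describes a pattern worth spelling out: never let the HTTP layer follow redirects on its own, and validate each hop's destination before fetching it. The sketch below shows that loop in Python under stated assumptions; it is not LWT's safeHttpGet. The names `safe_get`, `fetch_once`, and `resolve` are hypothetical, and the fetcher and resolver are injected so the logic can be exercised without network access.

```python
import ipaddress
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin, urlsplit

# fetch_once performs ONE request with redirects disabled and returns
# (status, Location header or None, body).
Fetch = Callable[[str], Tuple[int, Optional[str], bytes]]
Resolve = Callable[[str], List[str]]  # hostname -> list of IP address strings


def is_allowed(url: str, resolve: Resolve) -> bool:
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    addresses = resolve(parts.hostname)
    if not addresses:
        return False
    # Reject loopback, private, link-local, and other non-public ranges.
    return all(ipaddress.ip_address(a).is_global for a in addresses)


def safe_get(url: str, fetch_once: Fetch, resolve: Resolve,
             max_hops: int = 5) -> bytes:
    for _ in range(max_hops + 1):
        if not is_allowed(url, resolve):
            raise ValueError(f"blocked destination: {url}")
        status, location, body = fetch_once(url)
        if status in (301, 302, 303, 307, 308) and location:
            # Resolve relative redirects, then loop so the new URL is re-validated.
            url = urljoin(url, location)
            continue
        return body
    raise ValueError("too many redirects")
```

The point of re-validating inside the loop is that a public URL can redirect to an internal address; checking only the first URL would miss that. A production version would also need to pin the resolved address for the actual connection to avoid DNS rebinding, which this sketch does not attempt.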

Fixed

  • Navbar hamburger hidden under the status/camera bar on edge-to-edge phones — the navbar now respects safe-area insets.
  • Login/registration field icons rendering outside the input (PurgeCSS stripped Bulma's .icon.is-left/right; now safelisted).
  • Misspelled currentlangage settings key broke the current-language TTS voice and term-translation language context.
  • 429 PHP 8.5 deprecation warnings cleared (redundant setAccessible(true) in tests; ord() on a multi-byte char).
  • Saving a text with multi-word expressions in multi-user mode 500'd on a binding-misalignment FK violation.
  • Saving 2+ tags on a term/text threw on the 20-char cap (Tagify comma-serialization now split in the service layer).
  • Multi-word term selection captured inline translation hints (now reads the clean surface form).
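
The tag fix above comes down to treating a comma-serialized value as a list before applying the per-tag length cap. A minimal sketch of that split, assuming a 20-character limit per tag as the note states; the function name `split_tags` and the trimming and de-duplication behavior are illustrative assumptions, not LWT's service-layer code.

```python
from typing import List

MAX_TAG_LENGTH = 20  # per-tag cap mentioned in the release note


def split_tags(raw: str) -> List[str]:
    """Split a comma-serialized tag string and validate each tag on its own."""
    tags: List[str] = []
    for part in raw.split(","):
        tag = part.strip()
        if not tag or tag in tags:
            continue  # skip empty fragments and duplicates
        if len(tag) > MAX_TAG_LENGTH:
            raise ValueError(f"tag too long: {tag!r}")
        tags.append(tag)
    return tags
```

With this shape, a value such as "irregular-verbs,past-tense" is 26 characters as a whole but passes, because each individual tag is within the cap.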

Developer proposals (docs only)

  • Single data_hex word identity (#237) — replace the TERM<hex> class-as-index with a data_hex attribute.
  • Term-status model + FSRS scheduling (#238) — collapse the scattered 1–5/98/99 literals onto TermStatus and align review scheduling with Anki/FSRS.

Both are proposals; implementation is deferred.


Full changelog: https://github.com/HugoFara/lwt/blob/main/CHANGELOG.md