Skip to content

Latest commit

 

History

History
92 lines (63 loc) · 2.66 KB

File metadata and controls

92 lines (63 loc) · 2.66 KB

Janitr Plan (v0)

Goal

Build a local-first classifier that flags undesired content with low false positives and user-adjustable thresholds.

Scope (initial)

  • Inputs: text content (DMs, posts, web text) with URL parsing.
  • Outputs: labels[] + confidence + optional reasons.
  • Classes: scam, crypto, ai_generated_reply, promo, clean.

Non-goals (v0)

  • No account-level bot scoring.
  • No real-time social graph analysis.
  • No remote inference dependencies.

Principles

  • Local-first: run on CPU without network calls for inference.
  • Conservative thresholds: prioritize low false positives.
  • Provenance: store where each text came from.
  • Fidelity: store data without losing important information.

Roadmap

Phase 0 — Definitions

  • Finalize labels and labeling rules.
  • Define minimum data schema (JSONL) with provenance fields.
  • Define data-fidelity rules (preserve original text, links, and metadata).
  • Decide user-adjustable thresholds and defaults.

Phase 1 — Data Sourcing (AI-first)

  • Ingest with OpenClaw only (no scripts in this repo).
  • Use the provided browser to collect all source content.
  • Use AI models to label scam content at scale.
  • For ai_generated_reply, source candidates by searching X for “AI reply”.
  • Collect non-crypto promo/ads and label as promo.
  • Store provenance for every record (platform, source id/url, timestamp).

Phase 2 — Baseline ML (CPU)

  • TF-IDF + Logistic Regression (or Linear SVM).
  • Train primarily on AI-labeled data.
  • Calibrate class thresholds to control false positives.

Phase 3 — Model Upgrade (optional)

  • Small transformer embeddings (MiniLM/Distil) + light classifier.
  • Quantization for speed.
  • Evaluate improvements vs baseline.

Phase 4 — Packaging

  • CLI interface returning JSON.
  • Simple config for thresholds and rule toggles.
  • Optional local daemon for integration.

Data Strategy

  • Use AI models to label the bulk of the dataset.
  • Add ai_generated_reply samples by searching X for “AI reply”.
  • Add promo samples from non-crypto marketing content.
  • Keep provenance fields for source tracking.

Evaluation

  • Report precision/recall/F1 per class.
  • Emphasize scam precision.
  • Maintain a fixed holdout set from day one.

Thresholding

  • Per-class confidence cutoffs.
  • Default to conservative (low false positives).
  • Allow user overrides in config.

Risks

  • Label noise from weak labeling sources.
  • Domain shift across platforms (X vs Discord vs web).
  • Over-filtering if thresholds are too aggressive.

Open Questions

  • Which AI models to use for labeling at scale?
  • How to version and update label data?
  • Do we need multilingual support in v0?