
# MiniCrit-1.5B

**Adversarial Financial Critic Model for Autonomous LLM Trading Systems**

License: MIT · DOI: [10.5281/zenodo.17594497](https://doi.org/10.5281/zenodo.17594497) · ORCID: [0009-0009-2503-2010](https://orcid.org/0009-0009-2503-2010) · PRs welcome


## 🔥 Overview

MiniCrit-1.5B is an adversarial financial critic model designed to evaluate, rebut, and stress-test LLM-generated trading rationales.
It functions as a validator layer inside multi-LLM autonomous trading engines, improving safety, reducing hallucinations, and increasing discipline in trading decisions.
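As a concrete illustration, a validator layer of this kind can be sketched as a gate that only forwards rationales the critic does not reject. This is a minimal sketch: the `critique` heuristic and the 0.5 threshold below are hypothetical stand-ins for the actual MiniCrit model call, not its API.

```python
from dataclasses import dataclass

@dataclass
class Critique:
    score: float   # 0.0 (fatally flawed) .. 1.0 (well-supported)
    rebuttal: str  # adversarial counter-argument when the rationale is weak

def critique(rationale: str) -> Critique:
    # Placeholder for the MiniCrit model call; a trivial heuristic stands in
    # here so the gate logic below is runnable.
    weak = "guaranteed" in rationale.lower() or "always" in rationale.lower()
    return Critique(score=0.2 if weak else 0.9,
                    rebuttal="Overconfident claim lacks evidence." if weak else "")

def validator_gate(rationale: str, threshold: float = 0.5) -> bool:
    """Forward the rationale only if the critic does not reject it."""
    return critique(rationale).score >= threshold

# An overconfident rationale is blocked; a hedged one passes.
print(validator_gate("AAPL always rallies after earnings, guaranteed profit"))  # False
print(validator_gate("Momentum and volume suggest a modest long bias in AAPL"))  # True
```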

This repository includes:

  • FinRebut-600 β€” 600 realistic rationales + adversarial counter-arguments
  • MiniCrit-12k β€” 12,132 institutional rationale–critique pairs
  • 0.5B LoRA critic checkpoint (CPU-trainable)
  • ATAC-LoRA training pipeline and notebook
  • Model card + Zenodo DOI + ORCID metadata
  • Forward-testing benchmarks and full reproducibility workflow

## 📚 Project Links

| Resource | Link |
| --- | --- |
| Repository | https://github.com/wmaousley/MiniCrit-1.5B |
| Dataset (FinRebut-600) | https://huggingface.co/datasets/wmaousley/finrebut-600 |
| Dataset (MiniCrit-12k) | https://huggingface.co/datasets/wmaousley/minicrit-training-12k |
| Zenodo DOI | https://doi.org/10.5281/zenodo.17594497 |
| ORCID | https://orcid.org/0009-0009-2503-2010 |

## 🧠 Model Summary

  • Model Name: MiniCrit-1.5B
  • Type: LoRA-extended adversarial financial critic
  • Role: Detect flawed reasoning, hallucinations, or missing evidence in LLM-generated trading rationales
  • Training Pipeline: Nightly ATAC-LoRA
  • Datasets Included:
    • FinRebut-600 (600 samples)
    • MiniCrit-12k (12,132 samples, CC-BY-4.0)
  • Target Hardware: 8Γ—A100-80GB (Lambda Labs grant request)
  • Artifacts: Checkpoints, notebook, scripts, dataset, model card
  • Forward-Test Performance:
    • Sharpe ratio improved from +0.2 β†’ +0.8 on 1-week window
    • Reduced hallucination-driven trade decisions

## 📈 Training Results (v1.3.x)

| Metric | Value |
| --- | --- |
| Base model | Qwen2-0.5B-Instruct |
| LoRA rank | 16 |
| Loss (start → end) | TBD |
| Training time | ~XX minutes (M2 Ultra) |
| Paper-trading Sharpe | +0.8 vs +0.2 baseline |
| Dataset | MiniCrit-12k |

## πŸ“ Repository Structure

```
MiniCrit-1.5B/
├── data/
│   └── finrebut-600.csv
├── notebooks/
│   └── ATAC_LoRA_MiniCrit.ipynb
├── checkpoints/
│   └── minicrit_lora_0.5b.pt
├── paper/
│   └── minicrit_preprint.pdf
└── src/
    └── training/
```


## 🚀 Quickstart

### 1. Create environment

```bash
python3.10 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Or open the training notebook:

```
notebooks/ATAC_LoRA_MiniCrit.ipynb
```
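Once the environment is set up, the critic can be queried over a trading rationale. The prompt template below is an assumption for illustration, not the repository's official format; adapt it to the actual template used in the ATAC-LoRA notebook.

```python
def build_critic_prompt(rationale: str) -> str:
    """Wrap a trading rationale in an instruction asking for a critique.
    NOTE: hypothetical template, not the official MiniCrit prompt format."""
    return (
        "You are an adversarial financial critic. Identify flawed reasoning, "
        "hallucinations, or missing evidence in the following trading "
        f"rationale, then write a concise rebuttal.\n\nRationale: {rationale}\n\nCritique:"
    )

prompt = build_critic_prompt("Buy TSLA: social sentiment is up 40% this week.")
print(prompt.splitlines()[0][:40])

# Hypothetical loading sketch, assuming the checkpoint is exported as a
# PEFT-style adapter over the Qwen2-0.5B-Instruct base named in the
# training table (the repo ships `checkpoints/minicrit_lora_0.5b.pt`;
# the exact loading path may differ):
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   from peft import PeftModel
#   base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
#   model = PeftModel.from_pretrained(base, "checkpoints/")
```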

## 📄 Citation

Ousley, W. A. (2025). *MiniCrit-1.5B: Adversarial Financial Critic Model and FinRebut-600 Dataset* (v1.2.0). Zenodo. https://doi.org/10.5281/zenodo.17594497

```bibtex
@dataset{ousley2025minicrit,
  author    = {William A. Ousley},
  title     = {{MiniCrit-1.5B: Adversarial Financial Critic Model and FinRebut-600 Dataset}},
  year      = {2025},
  version   = {1.2.0},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.17594497},
  url       = {https://doi.org/10.5281/zenodo.17594497}
}
```

πŸ… Author

William Alexander Ousley
PMP β€’ CSIE β€’ CSAP
AI/ML Researcher β€” Autonomous Trading Systems
ORCID: https://orcid.org/0009-0009-2503-2010

## 🤝 Contributors

MiniCrit is an independent research project maintained by:

  • William Alexander Ousley β€” Creator, lead researcher, dataset engineer, and model developer.

Contributions are welcome.
If you would like to collaborate (datasets, pipeline upgrades, reproducibility fixes, or model improvements), please open an issue or submit a pull request.

## 💠 Funding & Acknowledgements

This project is part of an ongoing effort to build transparent, open-source adversarial evaluators for financial LLM systems.

Special acknowledgements:

  • Lambda Labs Research Grant (Pending Review) β€” 2,000 A100-80GB compute hours requested
  • CloudRift Research Grant (Under Review) β€” 1,000 GPU hours requested
  • HuggingFace β€” Hosting the FinRebut-600 dataset
  • Zenodo / CERN β€” DOI archival and long-term preservation
  • GitHub β€” Repository infrastructure and distribution ecosystem

This is an independent research project and is not affiliated with any institution, employer, or sponsor.

## 🧭 Project Roadmap (2025)

### Phase 1 – Dataset Expansion (Q4 2025)

- Expand FinRebut-600 → FinRebut-2000
- Add macro-driven and high-volatility rationale categories
- Introduce multi-rater adjudication (LLM + human)

### Phase 2 – Model Improvements

- Scale MiniCrit-1.5B → MiniCrit-3B (LoRA or QLoRA)
- Add cross-model adversarial scoring (multi-LLM validation)
- Integrate chain-of-thought flaw and hallucination detection

### Phase 3 – Evaluation Framework

- Build a standalone MiniCrit Evaluator API
- Create benchmark tasks for:
  - fallacy detection
  - weak reasoning detection
  - hallucination classification
  - adversarial rebuttal generation

### Phase 4 – Research Publication

- Draft a full 8–12 page technical report
- Publish via Zenodo / TechRxiv
- Add an appendix covering datasets, methodology, and ablations

## 🔄 System Workflow

```mermaid
flowchart TD
    A[User or LLM Generates Trading Rationale] --> B[MiniCrit Model]
    B --> C{Critique?}
    C -->|Weak Reasoning| D[Generate Adversarial Rebuttal]
    C -->|Acceptable| E[Score & Pass Forward]
    D --> F[Store in FinRebut Dataset]
    F --> G[Nightly ATAC-LoRA Training]
    E --> H[Ensemble Validator]
    H --> I[Autonomous Trading Engine]
```

ASCII fallback (for GitHub mobile or Markdown viewers that don't support Mermaid):

```
[ Rationale ] → [ MiniCrit ] → { Acceptable? }
       | Yes → Score → Validator → Trade Engine
       | No  → Rebuttal → Dataset → Nightly Training
```
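The same routing can be sketched as a plain control-flow loop. This is a minimal sketch with stubbed components: the function names, the dict shape returned by the critic, and the 0.5 score threshold are all illustrative, not the system's actual interfaces.

```python
def run_workflow(rationale, critic, validator, engine, dataset):
    """Route a rationale through critique, then to training data or the engine."""
    verdict = critic(rationale)                       # MiniCrit critique step
    if verdict["acceptable"]:
        if validator(rationale, verdict["score"]):    # ensemble validator
            return engine(rationale)                  # forward to trading engine
        return "rejected-by-validator"
    # Weak reasoning: store the rebuttal pair for nightly ATAC-LoRA training.
    dataset.append((rationale, verdict["rebuttal"]))
    return "stored-for-training"

# Stub components so the routing is runnable end to end.
dataset = []
critic = lambda r: ({"acceptable": False, "rebuttal": "No evidence cited."}
                    if "guaranteed" in r
                    else {"acceptable": True, "score": 0.9})
validator = lambda r, s: s >= 0.5
engine = lambda r: "trade-executed"

print(run_workflow("guaranteed breakout in NVDA", critic, validator, engine, dataset))   # stored-for-training
print(run_workflow("NVDA momentum supports a small long", critic, validator, engine, dataset))  # trade-executed
```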
