YOLO Model Training with Automated Dataset System

This project provides comprehensive documentation and examples for training various YOLO model versions with zero dataset preparation required. The automated dataset system handles any dataset format and structure automatically.

Features

Multi-Model Support: YOLOv8, YOLOv5, and YOLO11 training
Smart Configuration: Automatic dataset preparation and configuration
Interactive Training: Step-by-step guided setup for beginners
GPU Optimization: Automatic GPU memory management with corrected estimation formulas (Sept 2025)
Comprehensive Monitoring: TensorBoard integration with real-time metrics
Checkpoint Management: Automatic saving and resuming of training sessions
Export Pipeline: Convert trained models to multiple formats (ONNX, TorchScript, etc.)
Validation Tools: Comprehensive model evaluation and testing
Memory Management: Automatic GPU memory cleanup and optimization

Zero Dataset Preparation Required!

Simply Place Your Dataset and Train!

# 1. Place ANY dataset in dataset/ folder (any structure/format)
# 2. Run training - everything happens automatically!

# If using virtual environment:
.venv/bin/python train.py    # Linux/Mac
# .venv\Scripts\python.exe train.py    # Windows

# Or with system Python:
python train.py

The system automatically:

Detects dataset structure (flat, nested, mixed)
Converts any format (YOLO, COCO, XML, custom)
Reorganizes to YOLO standard
Generates data.yaml configuration
Starts training immediately
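
The structure-detection step listed above can be pictured roughly as follows. This is a minimal sketch, not the project's actual detector: the function name `detect_structure`, the list of image extensions, and the rule it applies (images directly in the root mean "flat", images only in subfolders mean "nested", both mean "mixed") are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; the real system may accept more.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def detect_structure(root):
    """Classify a dataset folder as 'flat', 'nested', 'mixed', or 'empty'.

    Hypothetical helper: the project's own rules may differ.
    """
    root = Path(root)
    images = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    ]
    in_root = any(p.parent == root for p in images)
    in_subdirs = any(p.parent != root for p in images)
    if in_root and in_subdirs:
        return "mixed"
    if in_root:
        return "flat"
    if in_subdirs:
        return "nested"
    return "empty"
```

Calling `detect_structure("dataset")` on a folder that holds images both at the top level and inside subfolders would return `"mixed"` under these assumed rules.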

Supported Sources:

Roboflow exports (any format)
Kaggle datasets
Custom annotations
Mixed sources
Any organization structure

Quick Start

NEW: QUICK_START_GUIDE.md - Get training in minutes with your dataset!

1. Setup Environment

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/Mac
# or
.venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

2. Set Roboflow API Key (Optional)

export ROBOFLOW_API_KEY="your_api_key_here"

3. Prepare Dataset (Automatic!)

# Option A: Automatic (Recommended)
# Just place your dataset in dataset/ folder and run training!

# Option B: Manual preparation (if needed)
python utils/prepare_dataset.py dataset/ --format yolov8

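# Option C: Roboflow export (run in Python; reads the key set in step 2)
import os
from roboflow import Roboflow
rf = Roboflow(api_key=os.environ["ROBOFLOW_API_KEY"])
# "workspace", "project_id" and the version number are placeholders for your own values
project = rf.workspace("workspace").project("project_id")
dataset = project.version(1).download("yolov8")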

Note: Replace python with .venv/bin/python (Linux/Mac) or .venv\Scripts\python.exe (Windows) if using virtual environment.

4. Train YOLO Model (Any Version!)

Full Interactive Experience (Recommended for Beginners)

python train.py

The system will guide you through:

YOLO version selection (YOLO11, YOLOv8, YOLOv5)
Model size selection (nano to xlarge)
Training parameters (epochs, batch size, image size)
Advanced options and results folder naming
Automatic TensorBoard launch for real-time monitoring

Partial Interactive (Skip YOLO Selection)

python train.py --model-type yolov8

Uses specified YOLO version, prompts for other parameters

Fully Automated (No Prompts)

python train.py --model-type yolov8 --non-interactive --results-folder my_experiment

Uses all defaults, creates organized results folder

Custom Configuration (No Prompts)

python train.py \
  --model-type yolov8 \
  --epochs 100 \
  --batch-size 16 \
  --image-size 640 \
  --results-folder custom_run

Important: Replace python with your virtual environment path if using one:

Linux/Mac: .venv/bin/python
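Windows: .venv\Scripts\python.exe

For a fully scripted run, the same options can be combined with --device and --non-interactive:

python train.py --image-size 640 --device cuda --results-folder production_run --non-interactive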


### 5. Monitor Training with TensorBoard

**Automatic Monitoring** (Recommended):
- TensorBoard launches automatically during training
- Opens in your browser with real-time metrics
- Continues running after training for result analysis

**Manual TensorBoard Management**:
```bash
# Check TensorBoard status and open in browser
python -m utils.tensorboard_manager status

# List all experiments with TensorBoard data
python -m utils.tensorboard_manager list

# Launch TensorBoard for specific experiment
python -m utils.tensorboard_manager launch experiment_name

# Stop TensorBoard when done
python -m utils.tensorboard_manager stop
```

6. Test Automated Dataset System (Optional)

# Test the automated dataset system
python tests/test_auto_dataset.py

# Run comprehensive YOLO testing
python tests/test_comprehensive_yolo.py

# Run standard tests
python -m pytest tests/ -v

Automated Dataset System Features

Smart Detection & Conversion

Structure Detection: Automatically identifies flat, nested, or mixed dataset structures
Format Conversion: Handles YOLO, COCO, XML, and custom annotation formats
Class Detection: Automatically detects classes from labels, annotations, or mapping files
Split Management: Creates optimal train/validation/test splits automatically
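
To make the format-conversion step concrete, here is a small sketch of the usual COCO-to-YOLO bounding box arithmetic. COCO stores a box as `[x_min, y_min, width, height]` in pixels, while a YOLO label line stores `class x_center y_center width height` normalized by the image size. The function names are hypothetical and are not taken from this repository.

```python
def coco_to_yolo(bbox, image_width, image_height):
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to a YOLO box (x_center, y_center, width, height) normalized to 0..1."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / image_width,
        (y_min + h / 2) / image_height,
        w / image_width,
        h / image_height,
    )


def yolo_label_line(class_id, bbox, image_width, image_height):
    """Format one line of a YOLO .txt label file."""
    coords = coco_to_yolo(bbox, image_width, image_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in coords)


# A 200x100 box at (100, 50) in a 400x200 image is centered and half-size.
print(yolo_label_line(0, [100, 50, 200, 100], 400, 200))
# prints "0 0.500000 0.500000 0.500000 0.500000"
```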

Zero Configuration Required

Automatic Organization: Converts any structure to YOLO standard
Smart Validation: Detects and reports dataset issues
Error Recovery: Handles corrupted files and missing labels gracefully
YOLO Compatibility: Works with YOLOv8, YOLOv5, and YOLO11
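
As an illustration of what "YOLO standard" means for the generated configuration, the sketch below writes a minimal Ultralytics-style data.yaml. It assumes the images have already been organized into images/train, images/val and images/test under the dataset root; the helper name `write_data_yaml` is hypothetical, and the project's own generator may emit additional fields.

```python
from pathlib import Path


def write_data_yaml(dataset_root, class_names, out_name="data.yaml"):
    """Write a minimal Ultralytics-style data.yaml and return its path.

    Assumes an images/train, images/val, images/test layout (an assumption
    of this sketch) and class names that need no YAML quoting.
    """
    root = Path(dataset_root).resolve()
    lines = [
        f"path: {root.as_posix()}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        "names:",
    ]
    # Class IDs follow the order of class_names, starting at 0.
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    out_path = root / out_name
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out_path
```

For example, `write_data_yaml("dataset", ["cat", "dog"])` would produce a file whose `names` section maps 0 to cat and 1 to dog.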

Production Ready

Comprehensive Testing: 100% test coverage for all YOLO versions
Error Handling: Robust error handling and recovery
Performance Optimized: Fast dataset preparation and validation
Integration Ready: Seamlessly integrates with training pipeline

GPU Memory Management

The system includes intelligent GPU memory management to prevent CUDA out-of-memory errors:

Smart Memory Features

Automatic memory estimation before training starts
Safety warnings for risky configurations
Memory cleanup after training completion
Emergency recovery from out-of-memory errors

Quick GPU Commands

# Check memory status
python gpu_memory_cli.py status

# Test if configuration will fit before training
python gpu_memory_cli.py check --model l --image-size 1280 --batch-size 4 --version yolov8

# Clear GPU memory if needed
python gpu_memory_cli.py clear

# Monitor memory during training
python gpu_memory_cli.py monitor

Memory Tips

If you get CUDA out-of-memory errors:

Reduce batch size: Try --batch-size 4 or --batch-size 2
Reduce image size: Try --image-size 640 instead of --image-size 1280
Use smaller model: Try YOLOv8n or YOLOv8s instead of YOLOv8l
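
The first tip can also be automated. The following is a rough sketch of a retry loop that halves the batch size after each out-of-memory failure. `train_fn` is a hypothetical callable standing in for whatever actually starts training; the sketch relies only on the fact that PyTorch raises CUDA out-of-memory errors as a `RuntimeError` whose message contains "out of memory". It is not the project's emergency recovery code.

```python
def train_with_backoff(train_fn, batch_size, min_batch_size=1):
    """Call train_fn(batch_size), halving batch_size after each
    out-of-memory failure until it succeeds or drops below min_batch_size.

    Returns (batch_size_used, result_of_train_fn).
    """
    while batch_size >= min_batch_size:
        try:
            return batch_size, train_fn(batch_size)
        except RuntimeError as error:
            # Only retry on out-of-memory errors; re-raise anything else.
            if "out of memory" not in str(error).lower():
                raise
            batch_size //= 2
    raise RuntimeError("out of memory even at the minimum batch size")
```

In a real run it would usually make sense to free cached GPU memory between attempts (for instance with `python gpu_memory_cli.py clear`) before retrying.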

Documentation Structure

Complete Documentation System

QUICK_START_GUIDE.md - NEW! Get training in minutes
docs/README.md - Main documentation hub
docs/workflow/README.md - Comprehensive workflow documentation

Quick Start with Documentation

Start here: docs/README.md - Main documentation hub
System overview: docs/workflow/01-system-overview/01-system-overview.md
Training workflows: docs/workflow/04-integration-workflows/01-training-workflows.md

Complete Documentation Coverage

The documentation covers ALL files in ALL repository directories:

System Overview - What the system does and why it matters
Core Components - Main training script, configuration, utilities, dataset system
Supporting Systems - Examples, testing, export, environment setup
Integration Workflows - Complete training processes, data flow, error handling
Validation & Testing - Quality assurance and maintenance procedures

Examples & Scripts

examples/export_dataset.py - Practical export script

Supported YOLO Versions

| Version | Status | Export Format | Training Method |
|---------|--------|---------------|-----------------|
| YOLO11 | New (latest) | `yolo11` | Repository/Ultralytics |
| YOLOv8 | Recommended | `yolov8` | Ultralytics |
| YOLOv5 | Stable | `yolo` | Repository/Ultralytics |
| YOLOv6 | Limited | `yolo` | Repository |
| YOLOv7 | Limited | `yolo` | Repository |
| YOLOv9 | Experimental | `yolo` | Repository |

Training Command Line Options

Basic Training Commands

# Full interactive experience - selects YOLO version and all parameters
python train.py

# Train YOLOv8 with interactive configuration (recommended for beginners)
python train.py --model-type yolov8

# Train YOLOv8 with custom parameters (no prompts)
python train.py --model-type yolov8 --epochs 200 --batch-size 16 --image-size 640

# Train with custom results folder (no folder prompt)
python train.py --model-type yolov8 --results-folder experiment_2024

# Skip all interactive prompts (use defaults)
python train.py --model-type yolov8 --non-interactive

# Resume training from checkpoint
python train.py --model-type yolov8 --resume logs/previous_run/weights/last.pt

# Validate only (no training)
python train.py --model-type yolov8 --validate-only

# Export model after training
python train.py --model-type yolov8 --export

Command Line Arguments

--model-type: Choose between yolo11, yolov8, yolov5
--epochs: Number of training epochs
--batch-size: Training batch size
--image-size: Input image size
--device: Training device (cpu, cuda, auto)
--results-folder: Custom folder name for results (skips interactive prompt)
--non-interactive: Skip all interactive configuration prompts (use defaults)
--resume: Path to checkpoint for resuming training
--validate-only: Only validate, don't train
--export: Export model after training
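
For readers who want to see how such a flag set maps onto Python, here is a sketch of an `argparse` parser that mirrors the arguments listed above. It is reconstructed from this list only; the real train.py may define defaults, help text, or extra options differently. Leaving the defaults as `None` is an assumption that lets a caller tell "not given" apart from an explicit value.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags; train.py may differ."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--model-type", choices=["yolo11", "yolov8", "yolov5"])
    parser.add_argument("--epochs", type=int)
    parser.add_argument("--batch-size", type=int)
    parser.add_argument("--image-size", type=int)
    parser.add_argument("--device", choices=["cpu", "cuda", "auto"])
    parser.add_argument("--results-folder")
    parser.add_argument("--non-interactive", action="store_true")
    parser.add_argument("--resume")
    parser.add_argument("--validate-only", action="store_true")
    parser.add_argument("--export", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--model-type", "yolov8", "--epochs", "100", "--batch-size", "16", "--non-interactive"]
)
print(args.model_type, args.epochs, args.batch_size, args.non_interactive)
# prints "yolov8 100 16 True"
```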

Interactive Configuration

When you run training without --non-interactive, the system will prompt you for:

YOLO Version: Choose between YOLO11, YOLOv8, YOLOv5 (if not specified)
Model Size: Choose between n (nano), s (small), m (medium), l (large), x (xlarge)
Training Duration: Number of epochs (default: 100)
Batch Size: Training batch size (default: 8)
Image Size: Input resolution (default: 1024)
Learning Rate: Training learning rate (default: 0.01)
Device: GPU or CPU training (default: cuda if available)
Advanced Options: Early stopping patience, augmentation, validation frequency

Pro Tips:

Press Enter to accept default values
Use --non-interactive for automated scripts
Combine with --results-folder to skip folder naming prompt
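
The "press Enter to accept the default" behavior can be sketched with a small helper like the one below. The name `ask` and its signature are hypothetical and not taken from the project; the `input_fn` parameter exists only so the helper can be exercised without a terminal.

```python
def ask(label, default, cast=str, input_fn=input):
    """Prompt for a value; an empty answer keeps the default.

    Re-prompts until cast() accepts the answer. Hypothetical helper.
    """
    while True:
        raw = input_fn(f"{label} [{default}]: ").strip()
        if not raw:
            return default
        try:
            return cast(raw)
        except ValueError:
            print(f"Invalid value for {label}: {raw!r}")
```

A call such as `epochs = ask("Number of epochs", 100, int)` would return 100 when the user just presses Enter, and would ask again if the answer is not an integer.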

TensorBoard Integration

Automatic Training Monitoring

The system includes automatic TensorBoard integration that provides real-time training visualization:

During Training

Auto-launch: TensorBoard opens automatically in your browser
Real-time metrics: Live loss curves, accuracy plots, and training progress
Model visualization: Network architecture and computational graphs
Persistent access: TensorBoard remains running after training completes

After Training

Result analysis: Continue viewing training metrics and model performance
Experiment comparison: Compare different training runs and experiments
Easy management: Simple commands to control TensorBoard sessions

TensorBoard Management Commands

# Check if TensorBoard is running and open in browser
python -m utils.tensorboard_manager status

# List all experiments with their TensorBoard data status
python -m utils.tensorboard_manager list

# Launch TensorBoard for a specific experiment
python -m utils.tensorboard_manager launch experiment_name

# Launch on custom port
python -m utils.tensorboard_manager launch experiment_name --port 6007

# Stop all TensorBoard processes
python -m utils.tensorboard_manager stop

TensorBoard Features

Training Metrics: Loss curves (box, classification, DFL losses)
Validation Metrics: mAP50, mAP50-95, precision, recall
Model Architecture: Visual representation of YOLO network structure
Training Images: Sample batches with augmentations and predictions
Hyperparameters: Complete training configuration tracking
System Metrics: GPU utilization, memory usage, training speed

Manual Access

If you need to manually access TensorBoard for any experiment:

# For current training
http://localhost:6006

# View experiment logs directly
tensorboard --logdir logs/experiment_name/experiment_name
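
If you prefer to start TensorBoard from a script rather than the shell, the command above can be assembled in Python. This sketch only builds the argument list using the standard `--logdir` and `--port` options of the `tensorboard` CLI; the helper name is hypothetical and it does not reproduce what utils.tensorboard_manager does internally.

```python
import shlex


def tensorboard_command(logdir, port=6006):
    """Build the argv for launching TensorBoard on a log directory."""
    return ["tensorboard", "--logdir", str(logdir), "--port", str(port)]


cmd = tensorboard_command("logs/experiment_name/experiment_name", port=6007)
print(shlex.join(cmd))
# To actually launch it, pass cmd to subprocess.Popen(cmd).
```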

Custom Results Folder Feature

When you run training, the system prompts for a custom folder name
Results are organized in logs/your_custom_name/ instead of the default logs/yolo_training/
Each training run gets its own organized folder with weights, plots, and logs
Folder names are automatically cleaned of invalid characters
Existing folders can be reused or new names can be chosen
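
As an illustration of the folder-name cleaning mentioned above, the sketch below replaces anything outside a conservative character set with underscores. The exact rules used by the project are not documented here, so the allowed characters, the helper name `clean_folder_name`, and the fallback to the default folder name are all assumptions.

```python
import re


def clean_folder_name(name, fallback="yolo_training"):
    """Replace characters that are unsafe in folder names with underscores.

    Hypothetical rule set: keeps letters, digits, dot, dash and underscore,
    and falls back to the default name if nothing usable remains.
    """
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", name.strip())
    cleaned = cleaned.strip("._")
    return cleaned or fallback


print(clean_folder_name("my exp: v1/2"))
# prints "my_exp_v1_2"
```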

Interactive Configuration Feature

Beginner-Friendly: Step-by-step prompts for all major training parameters
Smart Defaults: Press Enter to accept recommended values
Model Selection: Choose from nano (n) to xlarge (x) model sizes
Parameter Guidance: Helpful explanations for each setting
Validation: Input validation with helpful error messages
Flexible: Use --non-interactive to skip prompts for automation

Robust System Architecture

The system is built with enterprise-grade reliability:

Core Components

Configuration Management: Centralized config with validation and environment overrides
Data Pipeline: Robust dataset handling with automatic validation and preprocessing
Training Engine: Comprehensive training loop with checkpoint management and monitoring
Evaluation System: Multi-metric evaluation with visualization and reporting
Export Utilities: Multi-format model export (ONNX, TorchScript, CoreML, TensorRT)

Reliability Features

Comprehensive Testing: 98 tests covering all major components
Error Handling: Graceful failure handling with detailed logging
Data Validation: Automatic dataset structure and format validation
Checkpoint Management: Robust save/load with automatic cleanup
Real-time Monitoring: Automatic TensorBoard integration with persistent access
Training Visualization: Live metrics, loss curves, and model performance tracking

Prerequisites

Python 3.8+ with pip
Roboflow account with annotated dataset (optional; only needed for Roboflow exports)
API key from Roboflow (optional; only needed for Roboflow exports)
GPU (recommended for training)

Installation

Option 1: Full Installation

pip install -r requirements.txt

Option 2: Minimal Installation

pip install ultralytics roboflow torch torchvision

Option 3: GPU Support

# Install PyTorch with CUDA support
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install ultralytics roboflow

Learning Path

Beginner (Start Here)

Read docs/README.md - Main documentation hub
Try docs/workflow/01-system-overview/01-system-overview.md - System overview
Follow docs/workflow/04-integration-workflows/01-training-workflows.md - Training workflows
Run the example script (examples/export_dataset.py)
Train your first model with python train.py

Intermediate

Explore other YOLO versions
Customize training parameters
Experiment with different architectures
Optimize for your use case

Advanced

Research latest YOLO versions
Contribute to the community
Deploy models to production
Optimize for edge devices

Common Issues

Export Problems

Format not found: Use yolo format as fallback
API key errors: Verify environment variable is set
Permission denied: Check Roboflow project access

Training Problems

Memory errors: Reduce batch size and image size
CUDA issues: Verify PyTorch CUDA installation
Path errors: Check data.yaml file paths
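
For the CUDA check in particular, a short script along these lines reports what PyTorch can see. It uses only standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.get_device_name`, `torch.version.cuda`); the wrapper function `cuda_report` is a hypothetical name for this sketch.

```python
def cuda_report():
    """Return a one-line summary of whether PyTorch can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__}: CUDA not available (CPU only)"
    name = torch.cuda.get_device_name(0)
    return f"PyTorch {torch.__version__}: CUDA {torch.version.cuda} on {name}"


print(cuda_report())
```

If this reports that CUDA is not available on a machine with an NVIDIA GPU, reinstalling PyTorch with the CUDA wheel index shown under Installation is a reasonable next step.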

Performance Issues

Slow training: Use smaller model variants
Low accuracy: Increase dataset size and quality
Overfitting: Add more augmentation and regularization

Testing & Quality Assurance

# Run all tests
python -m pytest tests/ -v

# Run specific test categories
python -m pytest tests/test_config.py -v
python -m pytest tests/test_data_loader.py -v
python -m pytest tests/test_training.py -v

Demo and Examples

Interactive Demo

python demo_interactive_training.py

Shows all available training modes and options

Quick Examples

# Interactive training with YOLO version selection
python train.py

# Non-interactive training with defaults
python train.py --model-type yolov8 --non-interactive --results-folder quick_test

# Custom configuration
python train.py --model-type yolov8 --epochs 50 --batch-size 4 --image-size 640

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Support

Issues: Use GitHub Issues for bug reports
Discussions: Use GitHub Discussions for questions
Documentation: Check the docs/ folder for detailed guides
Examples: See the examples/ folder for practical usage

Acknowledgments

Ultralytics team for YOLOv8 implementation
Roboflow for dataset management tools
PyTorch community for deep learning framework
Contributors and users of this project

tSopermon/YOLO-auto-training
