# The Annotation Garden Project

🌐 **[View Live Dashboard](https://neuromechanist.github.io/image-annotation)**

An image annotation system for the Natural Scenes Dataset (NSD, 73,000 COCO images), built on multiple Vision-Language Models (VLMs).

## Key Features

- **Multi-model support**: OLLAMA, OpenAI GPT-4V, Anthropic Claude
- **Batch processing**: Handle 25k+ annotations with real-time progress
- **Web dashboard**: Interactive visualization and analysis interface
- **Annotation tools**: Reorder, filter, export, and manipulate annotations
- **Research-ready**: Structured JSON output with comprehensive metrics

## Quick Start

### Prerequisites

- Python 3.11+
- Node.js 18+
- OLLAMA (for local models)
- API keys for OpenAI/Anthropic (optional)

### Installation

```bash
# Clone and set up
git clone https://github.com/neuromechanist/hed-image-annotation.git
cd hed-image-annotation

# Python environment
conda activate torch-312  # or create it: conda create -n torch-312 python=3.12
pip install -e .

# Frontend
cd frontend && npm install
```

### Quick Usage

```bash
# Start OLLAMA (for local models)
ollama serve

# Test VLM service
python -m image_annotation.services.vlm_service

# Run frontend dashboard
cd frontend && npm run dev
# Visit http://localhost:3000

# Configuration
cp config/config.example.json config/config.json
# Edit config.json with API keys and NSD image paths
```
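
As a rough illustration of what `config.json` might contain (the field names below are assumptions for illustration only; the authoritative schema is in `config/config.example.json`):

```json
{
  "models": ["gemma3:4b", "llava:latest"],
  "openai_api_key": "<your OpenAI key>",
  "anthropic_api_key": "<your Anthropic key>",
  "nsd_image_dir": "/path/to/NSD_stimuli/shared1000/"
}
```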

## Architecture

- **Backend**: FastAPI with OLLAMA/OpenAI/Anthropic integration
- **Frontend**: Next.js dashboard with real-time progress tracking
- **Storage**: JSON files with database support for large datasets
- **Processing**: Stateless VLM calls with comprehensive metrics
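
To make "comprehensive metrics" concrete, here is a minimal sketch of the kind of record a stateless VLM call could produce. The field names are illustrative assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class AnnotationRecord:
    """One model's annotation of one image, with basic metrics (illustrative schema)."""
    image: str
    model: str
    text: str
    total_tokens: int
    latency_s: float

    @property
    def tokens_per_second(self) -> float:
        # Throughput metric: generated tokens divided by wall-clock latency.
        return self.total_tokens / self.latency_s if self.latency_s > 0 else 0.0

record = AnnotationRecord(
    image="shared1000/nsd_00001.png",
    model="gemma3:4b",
    text="A kitchen with a wooden table.",
    total_tokens=42,
    latency_s=2.0,
)

print(record.tokens_per_second)              # 21.0
print(json.dumps(asdict(record), indent=2))  # self-describing JSON record
```

Storing one such JSON object per call keeps processing stateless: each record is self-describing and can be aggregated later without shared state.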

## Annotation Tools

Python utilities for post-processing annotations:

```python
from image_annotation.utils import reorder_annotations, remove_model, export_to_csv

# Reorder model annotations
reorder_annotations("annotations/", ["best_model", "second_best"])

# Remove underperforming models
remove_model("annotations/", "poor_model")

# Export for analysis
export_to_csv("annotations/", "results.csv", include_metrics=True)
```

## Programmatic Usage

```python
from image_annotation.services.vlm_service import VLMService, VLMPrompt

# Initialize and process
service = VLMService(model="gemma3:4b")
results = service.process_batch(
    image_paths=["path/to/image.jpg"],
    prompts=[VLMPrompt(id="describe", text="Describe this image")],
    models=["gemma3:4b", "llava:latest"],
)

# Results include comprehensive metrics
for result in results:
    print(f"Tokens: {result.token_metrics.total_tokens}")
    print(f"Speed: {result.performance_metrics.tokens_per_second}/sec")
```

## Development

```bash
# Test with real data (no mocks)
pytest tests/ --cov

# Format code
ruff check --fix . && ruff format .
```

## NSD Research Usage

1. **Download NSD images** to `/path/to/NSD_stimuli/shared1000/`
2. **Configure models** in `config/config.json`
3. **Process in batches** using `VLMService.process_batch()`
4. **Post-process** with annotation tools
5. **Export results** to CSV for analysis
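
A minimal sketch of the batching in step 3, using only the standard library (the directory, file pattern, and batch size are placeholder assumptions; the real work would go through `VLMService.process_batch()`):

```python
from pathlib import Path
from typing import Iterator

def batched(paths: list[Path], size: int) -> Iterator[list[Path]]:
    """Yield successive fixed-size batches of image paths."""
    for i in range(0, len(paths), size):
        yield paths[i:i + size]

# Collect NSD stimulus images (directory and pattern are placeholders)
image_dir = Path("/path/to/NSD_stimuli/shared1000")
paths = sorted(image_dir.glob("*.png")) if image_dir.exists() else []

# Hand each batch to the VLM service; 100 images per batch is an arbitrary choice
for batch in batched(paths, size=100):
    pass  # e.g. service.process_batch(image_paths=[str(p) for p in batch], ...)
```

Batching this way bounds memory use and lets the dashboard report progress after each batch rather than only at the end.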

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed research workflows.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, standards, and the submission process.

## License

This project is licensed under CC-BY-NC-SA 4.0; see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Natural Scenes Dataset (NSD) team
- LangChain and OLLAMA communities