diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000..38f8e1c7 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,215 @@ +# Contributing to MiroThinker + +Thank you for your interest in contributing to MiroThinker! This document provides guidelines and instructions for contributing. + +## Table of Contents + +- [Code of Conduct](#code-of-conduct) +- [Getting Started](#getting-started) +- [Development Setup](#development-setup) +- [How to Contribute](#how-to-contribute) +- [Pull Request Process](#pull-request-process) +- [Coding Standards](#coding-standards) +- [Running Tests](#running-tests) +- [Project Structure](#project-structure) + +## Code of Conduct + +Be respectful and inclusive. We welcome contributions from everyone. + +## Getting Started + +1. Fork the repository +2. Clone your fork: + ```bash + git clone https://github.com/YOUR_USERNAME/MiroThinker.git + cd MiroThinker + ``` +3. Create a branch for your changes: + ```bash + git checkout -b your-feature-name + ``` + +## Development Setup + +### Prerequisites + +- Python 3.12+ +- [uv](https://github.com/astral-sh/uv) package manager +- Git + +### Installation + +```bash +# Install dependencies for miroflow-agent +cd apps/miroflow-agent +uv sync + +# Install dependencies for gradio-demo +cd ../gradio-demo +uv sync + +# Install dependencies for libs +cd ../../libs/miroflow-tools +uv sync +``` + +### Environment Configuration + +Copy the example environment file and configure your API keys: + +```bash +cd apps/miroflow-agent +cp .env.example .env +# Edit .env with your API keys +``` + +Required keys for minimal setup: +- `SERPER_API_KEY` - Google search API +- `JINA_API_KEY` - Web scraping +- `E2B_API_KEY` - Code execution sandbox +- `SUMMARY_LLM_*` - LLM for content summarization + +## How to Contribute + +### Reporting Bugs + +1. Check existing issues to avoid duplicates +2. Use a clear, descriptive title +3. 
Include: + - Steps to reproduce + - Expected behavior + - Actual behavior + - Environment details (OS, Python version, etc.) + +### Suggesting Features + +1. Open an issue with the "enhancement" label +2. Describe the feature and its use case +3. Explain why it would benefit the project + +### Submitting Code + +1. Create a feature branch +2. Make your changes +3. Add tests if applicable +4. Run tests and linting +5. Submit a pull request + +## Pull Request Process + +1. **Branch naming**: Use descriptive names like `fix/resource-cleanup` or `feat/add-api-endpoint` + +2. **Commit messages**: Follow conventional commits: + ``` + type(scope): description + + # Examples: + fix(orchestrator): resolve state pollution on rollback + feat(mcp): add support for SSE transport + docs(readme): update installation instructions + ``` + +3. **PR description**: Include: + - Summary of changes + - Related issue number (if any) + - Testing performed + - Screenshots (if applicable) + +4. **Code review**: Address all review comments + +5. 
**CI checks**: Ensure all tests pass + +## Coding Standards + +### Python Style + +- Follow [PEP 8](https://peps.python.org/pep-0008/) +- Use type hints where appropriate +- Write docstrings for public functions and classes + +### Code Formatting + +We use [Ruff](https://github.com/astral-sh/ruff) for linting and formatting: + +```bash +# Run linting +just lint + +# Format code +just format + +# Sort imports +just sort-imports + +# Run all pre-commit checks +just precommit +``` + +### File Structure + +``` +apps/ +├── miroflow-agent/ # Core agent logic +│ ├── src/ +│ │ ├── core/ # Orchestrator, tool executor +│ │ ├── llm/ # LLM clients +│ │ └── utils/ # Utilities +│ └── tests/ # Unit tests +├── gradio-demo/ # Web UI +├── collect-trace/ # Trace collection +└── visualize-trace/ # Visualization tools + +libs/ +└── miroflow-tools/ # MCP tools library +``` + +## Running Tests + +### Unit Tests + +```bash +cd apps/miroflow-agent +uv run pytest tests/ -v +``` + +### Test Coverage + +```bash +uv run pytest tests/ --cov=src --cov-report=html +``` + +### Running Specific Tests + +```bash +# Run a specific test file +uv run pytest tests/test_orchestrator.py -v + +# Run tests matching a pattern +uv run pytest tests/ -k "test_tool" -v +``` + +## Project Structure + +| Directory | Description | +|-----------|-------------| +| `apps/miroflow-agent/` | Core agent framework | +| `apps/gradio-demo/` | Gradio web interface | +| `apps/collect-trace/` | Training trace collection | +| `apps/visualize-trace/` | Trace visualization | +| `apps/lobehub-compatibility/` | LLM compatibility layer | +| `libs/miroflow-tools/` | MCP server tools | + +## Getting Help + +- 💬 [Discord](https://discord.com/invite/GPqEnkzQZd) +- 🐛 [GitHub Issues](https://github.com/MiroMindAI/MiroThinker/issues) +- 📖 [Documentation](https://github.com/MiroMindAI/MiroThinker#readme) + +## License + +By contributing, you agree that your contributions will be licensed under the MIT License.
+ +--- + +Thank you for contributing to MiroThinker! 🎉 \ No newline at end of file diff --git a/apps/gradio-demo/main.py b/apps/gradio-demo/main.py index fb50bdb1..0412a268 100644 --- a/apps/gradio-demo/main.py +++ b/apps/gradio-demo/main.py @@ -24,6 +24,10 @@ # Create global cleanup thread pool for operations that won't be affected by asyncio.cancel cleanup_executor = ThreadPoolExecutor(max_workers=2, thread_name_prefix="cleanup") +# Register cleanup on exit to prevent resource leaks +import atexit +atexit.register(cleanup_executor.shutdown) + logger = logging.getLogger(__name__) # Set DEMO_MODE for simplified tool configuration @@ -399,13 +403,15 @@ async def check_cancellation(): "data": {"workflow_id": workflow_id, "error": f"Stream error: {str(e)}"}, } finally: - cancel_event.set() - stream_queue.close() + cancel_event.set() # Signal pipeline to stop try: - future.result(timeout=1.0) + # Wait longer for pipeline thread to finish + future.result(timeout=5.0) except Exception: - pass - executor.shutdown(wait=False) + pass # Thread may have been cancelled + finally: + stream_queue.close() # Close queue after thread is done + executor.shutdown(wait=True, cancel_futures=True) # ========================= Gradio Integration ========================= diff --git a/apps/gradio-demo/tests/__init__.py b/apps/gradio-demo/tests/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/apps/gradio-demo/tests/test_gradio_utils.py b/apps/gradio-demo/tests/test_gradio_utils.py new file mode 100644 index 00000000..7651c2f4 --- /dev/null +++ b/apps/gradio-demo/tests/test_gradio_utils.py @@ -0,0 +1,180 @@ +# Copyright (c) 2025 MiroMind +# This source code is licensed under the MIT License. + +""" +Unit tests for Gradio demo utilities. 
+ +Tests for: +- ThreadSafeAsyncQueue +- filter_google_search_organic +""" + +import asyncio +import pytest + + +class ThreadSafeAsyncQueue: + """Thread-safe async queue wrapper (copied for testing).""" + + def __init__(self): + self._queue = asyncio.Queue() + self._loop = None + self._closed = False + + def set_loop(self, loop): + self._loop = loop + + async def put(self, item): + """Put data safely from any thread""" + if self._closed: + return + await self._queue.put(item) + + def put_nowait_threadsafe(self, item): + """Put data from other threads - use direct queue put for lower latency""" + if self._closed or not self._loop: + return + self._loop.call_soon_threadsafe(lambda: self._queue.put_nowait(item)) + + async def get(self): + return await self._queue.get() + + def close(self): + self._closed = True + + +def filter_google_search_organic(organic: list) -> list: + """ + Filter Google search organic results to remove unnecessary information + """ + result = [] + for item in organic: + result.append( + { + "title": item.get("title", ""), + "link": item.get("link", ""), + } + ) + return result + + +class TestThreadSafeAsyncQueue: + """Tests for ThreadSafeAsyncQueue class.""" + + @pytest.mark.asyncio + async def test_put_and_get(self): + """Test basic put and get operations.""" + queue = ThreadSafeAsyncQueue() + queue.set_loop(asyncio.get_event_loop()) + + await queue.put({"test": "data"}) + result = await queue.get() + assert result == {"test": "data"} + + @pytest.mark.asyncio + async def test_closed_queue_rejects_put(self): + """Test that a closed queue rejects put operations.""" + queue = ThreadSafeAsyncQueue() + queue.set_loop(asyncio.get_event_loop()) + + queue.close() + await queue.put({"test": "data"}) + + # Queue should be empty because put was rejected + assert queue._queue.empty() + + @pytest.mark.asyncio + async def test_close_sets_flag(self): + """Test that close() sets the closed flag.""" + queue = ThreadSafeAsyncQueue() + assert queue._closed is
False + + queue.close() + assert queue._closed is True + + @pytest.mark.asyncio + async def test_put_nowait_threadsafe(self): + """Test threadsafe put operation.""" + queue = ThreadSafeAsyncQueue() + queue.set_loop(asyncio.get_event_loop()) + + queue.put_nowait_threadsafe({"test": "data"}) + + # Allow event loop to process the call_soon_threadsafe + await asyncio.sleep(0.01) + + result = await queue.get() + assert result == {"test": "data"} + + @pytest.mark.asyncio + async def test_put_nowait_threadsafe_closed(self): + """Test that put_nowait_threadsafe respects closed flag.""" + queue = ThreadSafeAsyncQueue() + queue.set_loop(asyncio.get_event_loop()) + + queue.close() + queue.put_nowait_threadsafe({"test": "data"}) + + await asyncio.sleep(0.01) + assert queue._queue.empty() + + @pytest.mark.asyncio + async def test_put_nowait_threadsafe_no_loop(self): + """Test that put_nowait_threadsafe handles missing loop.""" + queue = ThreadSafeAsyncQueue() + # Don't set loop + + queue.put_nowait_threadsafe({"test": "data"}) + + await asyncio.sleep(0.01) + assert queue._queue.empty() + + +class TestFilterGoogleSearchOrganic: + """Tests for filter_google_search_organic function.""" + + def test_basic_filtering(self): + """Test basic filtering of search results.""" + organic = [ + {"title": "Result 1", "link": "https://example.com/1", "snippet": "..."}, + {"title": "Result 2", "link": "https://example.com/2", "snippet": "..."}, + ] + result = filter_google_search_organic(organic) + assert len(result) == 2 + assert result[0] == {"title": "Result 1", "link": "https://example.com/1"} + assert result[1] == {"title": "Result 2", "link": "https://example.com/2"} + + def test_missing_fields(self): + """Test handling of missing fields.""" + organic = [ + {"title": "Result 1"}, # Missing link + {"link": "https://example.com/2"}, # Missing title + {}, # Missing both + ] + result = filter_google_search_organic(organic) + assert result[0] == {"title": "Result 1", "link": ""} + assert 
result[1] == {"title": "", "link": "https://example.com/2"} + assert result[2] == {"title": "", "link": ""} + + def test_empty_list(self): + """Test handling of empty list.""" + result = filter_google_search_organic([]) + assert result == [] + + def test_extra_fields_ignored(self): + """Test that extra fields are ignored.""" + organic = [ + { + "title": "Result", + "link": "https://example.com", + "snippet": "text", + "date": "2024-01-01", + } + ] + result = filter_google_search_organic(organic) + assert "snippet" not in result[0] + assert "date" not in result[0] + + +if __name__ == "__main__": + pytest.main([__file__, "-v"]) \ No newline at end of file diff --git a/apps/miroflow-agent/tests/__init__.py b/apps/miroflow-agent/tests/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/apps/miroflow-agent/tests/test_parsing_utils.py b/apps/miroflow-agent/tests/test_parsing_utils.py new file mode 100644 index 00000000..d929bb83 --- /dev/null +++ b/apps/miroflow-agent/tests/test_parsing_utils.py @@ -0,0 +1,132 @@ +# Copyright (c) 2025 MiroMind +# This source code is licensed under the MIT License. + +""" +Unit tests for parsing utilities. 
+ +Tests for: +- filter_none_values +- safe_json_loads +- _fix_backslash_escapes +- extract_llm_response_text +""" + +import pytest +from src.utils.parsing_utils import ( + filter_none_values, + safe_json_loads, + _fix_backslash_escapes, +) + + +class TestFilterNoneValues: + """Tests for filter_none_values function.""" + + def test_filter_none_values_basic(self): + """Test filtering None values from a dictionary.""" + input_dict = {"a": 1, "b": None, "c": 3} + result = filter_none_values(input_dict) + assert result == {"a": 1, "c": 3} + + def test_filter_none_values_all_none(self): + """Test filtering when all values are None.""" + input_dict = {"a": None, "b": None} + result = filter_none_values(input_dict) + assert result == {} + + def test_filter_none_values_no_none(self): + """Test filtering when no values are None.""" + input_dict = {"a": 1, "b": 2, "c": 3} + result = filter_none_values(input_dict) + assert result == input_dict + + def test_filter_none_values_empty_dict(self): + """Test filtering an empty dictionary.""" + result = filter_none_values({}) + assert result == {} + + def test_filter_none_values_non_dict(self): + """Test that non-dict values are returned unchanged.""" + assert filter_none_values("string") == "string" + assert filter_none_values(123) == 123 + assert filter_none_values([1, 2, 3]) == [1, 2, 3] + assert filter_none_values(None) is None + + def test_filter_none_values_nested(self): + """Test that filtering is shallow: None values inside nested structures are kept.""" + input_dict = {"a": {"nested": None}, "b": [1, None, 3]} + result = filter_none_values(input_dict) + assert result == {"a": {"nested": None}, "b": [1, None, 3]} + + +class TestFixBackslashEscapes: + """Tests for _fix_backslash_escapes function.""" + + def test_fix_windows_path(self): + """Test fixing Windows paths with backslashes.""" + json_str = r'{"path": "C:\Users\test"}' + result = _fix_backslash_escapes(json_str) + # Should escape the backslashes before uppercase letters + assert r"\\" in result + + def
test_fix_backslash_before_digit(self): + """Test fixing backslashes before digits.""" + json_str = r'{"value": "\1\2\3"}' + result = _fix_backslash_escapes(json_str) + assert r"\\" in result + + def test_preserve_valid_escapes(self): + """Test that valid escape sequences are preserved.""" + json_str = r'{"text": "line1\nline2\ttab"}' + result = _fix_backslash_escapes(json_str) + # Should not modify valid escape sequences + assert "\\n" in result and "\\t" in result + + def test_empty_string(self): + """Test handling empty string.""" + result = _fix_backslash_escapes("") + assert result == "" + + +class TestSafeJsonLoads: + """Tests for safe_json_loads function.""" + + def test_valid_json(self): + """Test parsing valid JSON string.""" + json_str = '{"name": "test", "value": 123}' + result = safe_json_loads(json_str) + assert result == {"name": "test", "value": 123} + + def test_invalid_json_repair(self): + """Test that invalid JSON is repaired when possible.""" + # Missing closing brace - json_repair should fix it + json_str = '{"name": "test"' + result = safe_json_loads(json_str) + assert isinstance(result, dict) + + def test_empty_json(self): + """Test parsing empty JSON object.""" + result = safe_json_loads("{}") + assert result == {} + + def test_json_with_escaped_chars(self): + """Test parsing JSON with escaped characters.""" + json_str = '{"text": "line1\\nline2"}' + result = safe_json_loads(json_str) + assert result["text"] == "line1\nline2" + + def test_json_with_nested_structure(self): + """Test parsing nested JSON.""" + json_str = '{"outer": {"inner": {"value": 1}}}' + result = safe_json_loads(json_str) + assert result["outer"]["inner"]["value"] == 1 + + def test_json_with_list(self): + """Test parsing JSON with list.""" + json_str = '{"items": [1, 2, 3]}' + result = safe_json_loads(json_str) + assert result["items"] == [1, 2, 3] + + +if __name__ == "__main__": + pytest.main([__file__, "-v"]) \ No newline at end of file