# Contributing Guide ‐ Osservatorio

Welcome! We are glad you want to contribute to the Osservatorio ISTAT project. This guide will help you get started the right way.
```bash
# Fork the repository on GitHub,
# then clone your fork
git clone https://github.com/YOUR_USERNAME/Osservatorio.git
cd Osservatorio

# Add the upstream remote
git remote add upstream https://github.com/AndreaBozzo/Osservatorio.git

# Set up the local environment
python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Install pre-commit hooks
pre-commit install

# Test the hooks
pre-commit run --all-files

# Check that everything works
pytest tests/unit/test_config.py -v
python src/api/istat_api.py
```
Use the Bug Report Template:
- Clear description of the problem
- Steps to reproduce
- Environment details (OS, Python version)
- Stack trace, if available
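Filled in, a report covering these points might look like the following (all details here are purely illustrative, not a real issue):

```markdown
## Bug Report

**Description:** The converter exits with a traceback when given an empty XML file.

**Steps to reproduce:**
1. Create an empty file `empty.xml`
2. Run the converter on it

**Environment:** Windows 11, Python 3.11

**Stack trace:**
(paste the full traceback here)
```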
Use the Feature Request Template:
- Description of the problem it solves
- Proposed solution
- Alternatives considered
- Implementation notes
Use the Documentation Template:
- Improve existing docs
- Translate to other languages
- Add examples and tutorials
- API documentation
Follow the Development Workflow:
```bash
# 1. Sync with upstream
git checkout main
git pull upstream main

# 2. Create a feature branch
git checkout -b feature/description
# or bug/description
# or docs/description

# 3. Work on your changes
# ... edit files ...

# 4. Commit with conventional commits
git add .
git commit -m "feat: add new data validation feature"

# 5. Push to your fork
git push origin feature/description

# 6. Open a Pull Request on GitHub
```
We use Conventional Commits:
```bash
# Format: type(scope): description
feat:     add new feature
fix:      bug fix
docs:     documentation changes
style:    formatting changes
refactor: code refactoring
test:     adding tests
chore:    maintenance tasks

# Examples:
git commit -m "feat(api): add PowerBI dataset validation"
git commit -m "fix(dashboard): resolve memory leak in data loader"
git commit -m "docs(wiki): add troubleshooting guide"
git commit -m "test(converter): add unit tests for XML parsing"
```
```bash
# 1. Run all tests
pytest

# 2. Check coverage (target: 60%+)
pytest --cov=src tests/ --cov-report=term

# 3. Lint and format
black .
flake8 .
isort .

# 4. Security scan
bandit -r src/
safety check

# 5. Pre-commit hooks
pre-commit run --all-files
```
```python
# File: tests/unit/test_new_feature.py
import pytest

from src.module import NewFeature


class TestNewFeature:
    def test_basic_functionality(self):
        """Test basic functionality."""
        feature = NewFeature()
        result = feature.process()
        assert result is not None

    def test_error_handling(self):
        """Test error handling."""
        feature = NewFeature()
        with pytest.raises(ValueError):
            feature.process(invalid_input=True)

    @pytest.mark.parametrize("input,expected", [
        ("test1", "result1"),
        ("test2", "result2"),
    ])
    def test_multiple_inputs(self, input, expected):
        """Test multiple input scenarios."""
        feature = NewFeature()
        result = feature.process(input)
        assert result == expected
```
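Since `NewFeature` above is only a placeholder, here is a self-contained variant you can actually run, using a pytest fixture for shared setup (the `WordCounter` class and its method names are illustrative, not part of the project):

```python
import pytest


class WordCounter:
    """Toy feature used to illustrate the test patterns above."""

    def process(self, text: str) -> int:
        if not isinstance(text, str):
            raise ValueError("text must be a string")
        return len(text.split())


@pytest.fixture
def counter() -> WordCounter:
    """Shared instance for every test in this module."""
    return WordCounter()


def test_basic_functionality(counter):
    assert counter.process("hello world") == 2


def test_error_handling(counter):
    with pytest.raises(ValueError):
        counter.process(None)


@pytest.mark.parametrize("text,expected", [("one", 1), ("a b c", 3)])
def test_multiple_inputs(counter, text, expected):
    assert counter.process(text) == expected
```

Run it with `pytest -v` from the repository root; fixtures keep the setup in one place when several tests need the same object.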
```python
# ✅ Good: Follow PEP 8
import pandas as pd
# get_logger comes from the project's logging utilities


class DataProcessor:
    """Process ISTAT data for analysis."""

    def __init__(self, config: dict) -> None:
        """Initialize processor with configuration."""
        self.config = config
        self._logger = get_logger(__name__)

    def process_data(self, data: pd.DataFrame) -> pd.DataFrame:
        """Process the input data and return a cleaned version."""
        try:
            cleaned_data = self._clean_data(data)
            return self._validate_data(cleaned_data)
        except Exception as e:
            self._logger.error(f"Data processing failed: {e}")
            raise

    def _clean_data(self, data: pd.DataFrame) -> pd.DataFrame:
        """Private method for data cleaning."""
        return data.dropna()


# ❌ Bad: Poor style
class dataprocessor:              # missing PascalCase
    def __init__(self,config):    # no type hints, no spacing
        self.config=config        # no spacing around =
    def process_data(self,data):  # no spacing, no type hints
        cleanedData=data.dropna() # camelCase in Python
        return cleanedData
```
```python
def convert_xml_to_tableau(
    self,
    xml_input: Union[str, Path],
    dataset_id: str,
    dataset_name: str
) -> Dict[str, Any]:
    """
    Convert ISTAT XML data to Tableau-compatible formats.

    Args:
        xml_input: Path to XML file or XML content string
        dataset_id: ISTAT dataset identifier (e.g., 'DCIS_POPRES1')
        dataset_name: Human-readable dataset name

    Returns:
        Dictionary containing conversion results with keys:
        - success: bool indicating conversion success
        - files_created: dict with paths to generated files
        - data_quality: dict with quality metrics
        - summary: dict with conversion summary

    Raises:
        ValueError: If XML content is invalid
        FileNotFoundError: If the XML file path doesn't exist
        SecurityError: If file path validation fails

    Example:
        >>> converter = IstatXMLtoTableauConverter()
        >>> result = converter.convert_xml_to_tableau(
        ...     "data/raw/population.xml",
        ...     "DCIS_POPRES1",
        ...     "Popolazione Residente"
        ... )
        >>> print(result['summary']['files_created'])
        3
    """
```
```python
# ✅ Good: Use security utilities
from src.utils.secure_path import SecurePathValidator
from src.utils.security_enhanced import security_manager


def process_file(file_path: str) -> None:
    """Process file with security validation."""
    validator = SecurePathValidator()
    safe_path = validator.validate_path(file_path)
    with validator.safe_open(safe_path, 'r') as file:
        content = file.read()
        # Process content...


# ❌ Bad: Direct file access
def process_file(file_path: str) -> None:
    """Insecure file processing."""
    with open(file_path, 'r') as file:  # No path validation
        content = file.read()
```
```python
# ✅ Good: Validate all inputs
def api_endpoint(user_input: str) -> str:
    """API endpoint with input validation."""
    # Sanitize input
    clean_input = security_manager.sanitize_input(user_input)

    # Validate input
    if not clean_input or len(clean_input) > 1000:
        raise ValueError("Invalid input")

    return process_input(clean_input)


# ❌ Bad: No validation
def api_endpoint(user_input: str) -> str:
    """Insecure API endpoint."""
    return process_input(user_input)  # Direct use of user input
```
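`security_manager` is project-specific; the same sanitize-then-validate idea can be sketched with the standard library alone (the length limit matches the example above, and `clean.upper()` is a stand-in for the real `process_input`):

```python
import html
import re

MAX_INPUT_LEN = 1000  # illustrative limit, as in the example above


def sanitize_input(user_input: str) -> str:
    """Strip control characters, trim whitespace, escape HTML metacharacters."""
    cleaned = re.sub(r"[\x00-\x1f\x7f]", "", user_input)
    return html.escape(cleaned.strip())


def api_endpoint(user_input: str) -> str:
    """Sanitize first, validate second, process last."""
    clean = sanitize_input(user_input)
    if not clean or len(clean) > MAX_INPUT_LEN:
        raise ValueError("Invalid input")
    return clean.upper()  # stand-in for process_input()
```

A real endpoint would validate against a whitelist of expected values where possible; escaping alone only protects the HTML output context.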
```python
# ✅ Good: Use generators and chunking
def process_large_dataset(file_path: str) -> Iterator[pd.DataFrame]:
    """Process large dataset in chunks."""
    chunk_size = 10000
    for chunk in pd.read_csv(file_path, chunksize=chunk_size):
        yield process_chunk(chunk)


# ❌ Bad: Load everything into memory
def process_large_dataset(file_path: str) -> pd.DataFrame:
    """Memory-intensive processing."""
    df = pd.read_csv(file_path)  # Could be huge
    return process_dataframe(df)
```
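The same pattern works without pandas; a minimal stdlib-only sketch that chunks any iterable (names like `chunked` and the summing `process_chunk` are illustrative):

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield fixed-size lists from any iterable, keeping memory bounded."""
    it = iter(rows)
    while chunk := list(islice(it, chunk_size)):
        yield chunk


def process_chunk(chunk: List[int]) -> int:
    """Stand-in for real per-chunk work: sum the values."""
    return sum(chunk)


totals = [process_chunk(c) for c in chunked(range(10), chunk_size=4)]
# chunks [0..3], [4..7], [8, 9] -> totals [6, 22, 17]
```

Only one chunk is materialized at a time, which is the whole point of the chunked-processing pattern above.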
```python
# ✅ Good: Use caching for expensive operations
from functools import lru_cache


@lru_cache(maxsize=128)
def expensive_calculation(param: str) -> float:
    """Cached expensive calculation."""
    # Expensive computation...
    return result

# Clear the cache when needed
expensive_calculation.cache_clear()
```
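Put into runnable form, the caching and invalidation behavior looks like this (the squared-value function is just a stand-in for real work; `call_count` exists only to show when recomputation happens):

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=128)
def expensive_calculation(param: int) -> int:
    """Cached stand-in for an expensive computation."""
    global call_count
    call_count += 1
    return param * param


expensive_calculation(7)
expensive_calculation(7)  # served from the cache, no recomputation
assert call_count == 1
assert expensive_calculation.cache_info().hits == 1

expensive_calculation.cache_clear()  # invalidate when inputs go stale
expensive_calculation(7)             # recomputed after the clear
assert call_count == 2
```

`cache_info()` is handy during tuning: a low hit rate suggests the cache key (the argument tuple) varies too much to be worth caching.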
Before submitting a PR:
- Clear description of the change
- Tests updated/added
- Documentation updated
- Changelog entry (if needed)
- All tests pass locally
- Code coverage maintained or improved
- Security scan passed
- Pre-commit hooks passed
```markdown
## Description
Brief description of changes

## Type of Change
- [ ] Bug fix (non-breaking change)
- [ ] New feature (non-breaking change)
- [ ] Breaking change (fix/feature causing existing functionality to not work)
- [ ] Documentation update

## Testing
- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Manual testing performed

## Screenshots (if applicable)
Add screenshots for UI changes

## Additional Notes
Any additional context or notes
```
- Automated Checks: CI/CD must pass
- Code Review: At least 1 approval required
- Manual Testing: For significant changes
- Documentation Review: For public API changes
- Security Review: For security-related changes
We use labels to categorize issues:
- `bug` - Bug reports
- `enhancement` - Feature requests
- `documentation` - Documentation improvements
- `question` - General questions
- `duplicate` - Duplicate issues
- `invalid` - Invalid issues

- `priority: critical` - Critical issues (security, data loss)
- `priority: high` - High priority features/fixes
- `priority: medium` - Medium priority items
- `priority: low` - Low priority nice-to-haves

- `component: api` - API-related issues
- `component: dashboard` - Dashboard/UI issues
- `component: converter` - Data conversion issues
- `component: security` - Security-related issues
- `component: testing` - Testing infrastructure
- `component: docs` - Documentation issues

- `status: waiting-for-feedback` - Waiting for user feedback
- `status: in-progress` - Currently being worked on
- `status: ready-for-review` - Ready for code review
- `status: blocked` - Blocked by external dependencies
Contributors with significant contributions:
- Core maintainers
- Feature contributors
- Documentation contributors
- Bug hunters
- Community helpers
We recognize various types of contributions:
- 💻 Code contributions
- 📖 Documentation
- 🐛 Bug reports
- 💡 Ideas & suggestions
- 🔍 Code reviews
- 📢 Community building
- 🌍 Translations
- GitHub Discussions: for general questions
- Issues: for bug reports and feature requests
- Wiki: for detailed documentation
- IRC/Discord: [link when available]
- New contributors welcome!
- Pair programming sessions available
- Code review learning opportunities
- Documentation contribution guidance
- Weekly office hours: [TBD]
- Timezone-friendly sessions
- Open to all contributors
- Focus on mentoring and Q&A
We follow Keep a Changelog:
```markdown
## [Unreleased]
### Added
- New feature X for better data processing

### Changed
- Improved performance of API calls

### Fixed
- Fixed bug in XML parsing

### Security
- Updated dependencies for security patches

## [1.2.0] - 2025-01-20
### Added
- Dashboard live deployment
- Security manager implementation
```
We pledge to make participation in our community a harassment-free experience for everyone.
Examples of behavior that contributes to a positive environment:
- Using welcoming and inclusive language
- Being respectful of differing viewpoints
- Gracefully accepting constructive criticism
- Focusing on community benefit
- Showing empathy towards community members
Violations can be reported to the project maintainers. All complaints will be reviewed and investigated promptly and fairly.
Thank you for your interest in contributing to Osservatorio ISTAT! Every contribution, no matter how small, helps make this project better for everyone.
Happy coding! 🚀