Skip to content

Latest commit

 

History

History
226 lines (144 loc) · 7.15 KB

File metadata and controls

226 lines (144 loc) · 7.15 KB

🛡️ Contributing to Android Malware Detection with Machine Learning

Welcome, contributors 👋
We’re thrilled that you want to contribute to Android Malware Detection with Machine Learning, an open-source AI-powered project designed to identify and classify malicious Android applications using machine learning and explainable AI (XAI).

Your ideas, code, and enthusiasm help strengthen the security ecosystem. 🧠💪


📚 Quick links

  • Project README: README.md
  • Main entry points: main.py
  • Feature extraction: FeatureExtractionModule/FeatureExtraction.py

🧭 Table of contents

  1. Getting started
  2. Fork and clone
  3. Development setup
  4. Branching and commits
  5. Pull request process
  6. Code style and quality
  7. Testing
  8. Areas you can contribute to
  9. Reporting issues
  10. Security and responsible disclosure
  11. Community and support
  12. Need help?

🎯 Getting started

Before you make changes, read README.md to understand the goals and architecture. The most relevant files and folders for contributors are:

  • FeatureExtractionModule/FeatureExtraction.py — static feature extraction logic
  • MachineLearningModule/MachineLearningFlow.py — end-to-end ML flow and orchestration
  • MachineLearningModule/Classifiers/ — classifier implementations
  • Datasets/ — CSV files used for experiments (e.g. Drebin_v1.csv)
  • ModelEvaluation/ — evaluation utilities and metrics
  • Util/ — helper functions used across modules

If you are new to the repository, begin with small doc fixes, tests or simple bug fixes to learn the codebase.


🍴 Fork and Clone

Fork this repo using the Fork button on GitHub.

Clone your fork locally:

git clone https://github.com/<your-username>/AndroidMalwareDetection.git
cd AndroidMalwareDetection

Add the upstream remote:

git remote add upstream https://github.com/VarnitKumar/AndroidMalwareDetection.git

Keep your fork updated regularly:

git fetch upstream
git merge upstream/main

⚙️ Development setup

Recommended steps (Windows example):

  1. Create a virtual environment and activate it:

    python -m venv venv

    venv\Scripts\activate

  2. Install dependencies:

    pip install -r requirements.txt

  3. Run a quick smoke test to ensure the environment works. For example:

    python main.py

Notes:

  • If you add packages, update requirements.txt and mention them in your PR.
  • For heavy experiments, run them on a machine with sufficient RAM (4GB+ recommended).

🌿 Branching and commits

  • Branch names: feature/<short>, fix/<short>, chore/<short>, experiment/<name>.
  • Make focused commits and write clear commit messages.
  • Use verbs and a short scope prefix when useful.

Commit message format :

<type>: <short summary>

Common types:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation changes
  • test: Adding/updating tests
  • refactor: Code improvement without changing functionality
  • style: Code style or formatting

Examples:

  • feat: add permission-based extractor
  • fix: avoid division by zero in Normalisation
  • feat: integrate SHAP explainability module

Include a concise description in the commit body when more context helps reviewers.


🚀 Pull request process

When you open a PR against main, include:

  • A short title and 1–2 paragraph summary of the change and why it matters.
  • Steps to reproduce or test locally (commands + expected output).
  • Any dataset or model artifacts required for testing (or a subset/sample).
  • Notes about performance differences or new dependencies.
  • Links to related issues or discussion threads.

PR checklist (maintainers will look for):

  • Code compiles / scripts run without errors for the described steps
  • Tests for new behavior or a clear explanation why tests are not required
  • Updated docs or README when behavior changes
  • No secrets or large binary files committed (use links or artifact storage)

🎨 Code style and quality

  • Follow PEP 8 for Python. Use meaningful names and small functions.
  • Add or update docstrings for public functions and classes.
  • Prefer type hints for public APIs.
  • If you introduce a new module, include a short module-level docstring describing responsibilities.
  • If you use a linter (recommended: flake8), include configuration in the repo or list commands in your PR.

🧪 Testing

  • Add unit tests for any new logic or bug fixes. Place tests under a tests/ directory.

  • Run tests with pytest:

    pip install pytest python -m pytest -q

  • For data-heavy tests, include small synthetic or sampled CSVs to keep CI fast.


🤖 Areas you can contribute to

  • Feature extractors: permission analysis, API calls, intent filters.
  • Preprocessing: missing value handling, normalization, imbalance correction.
  • Models: add classifiers, hyperparameter tuning, or cross-validation improvements.
  • Explainability: integrate or improve SHAP/LIME analyses and visualizations.
  • Testing & CI: add tests and a minimal GitHub Actions workflow (optional).
  • Documentation: improve README examples, usage guides, and notebooks.

If you'd like to work on a larger feature, open an issue with a short design proposal so maintainers can provide feedback first.


🐛 Reporting issues

When creating an issue, include:

  • A descriptive title
  • Steps to reproduce (commands, exact file names)
  • Expected vs actual behavior
  • Error messages / tracebacks and relevant log excerpts
  • Minimal dataset or sample input if possible

Label issues clearly (bug, enhancement, question) to help triage.


🔒 Security and responsible disclosure

If you find a security vulnerability (example: accidental credentials, data exfiltration paths), do not create a public issue. Instead contact the repository owner(s) directly or use GitHub's private security advisory to report the issue.


🤝 Community and support

  • For help getting started, open an issue with the help-wanted label.

  • If you want maintainers to review a proposed design before implementation, open an issue titled RFC: short description and outline the approach.

  • Join discussions on GitHub or reach out via email (if provided in the repo).

  • Be respectful, collaborative, and constructive.

  • Support other contributors and share knowledge.

  • Use issues and PRs for discussions, not personal communication.

  • Avoid spam or irrelevant comments.


❓ Need help?

  • Ask in the Discord server (check README).
  • If you’re stuck on a task, open/continue the discussion on the Issue itself.

Thank you for helping to make Android safer through open-source collaboration ❤️. Your contributions empower innovation in cybersecurity and AI research.