diff --git a/.github/workflows/ci-go.yml b/.github/workflows/ci-go.yml
index 5e2adfb..54348f7 100644
--- a/.github/workflows/ci-go.yml
+++ b/.github/workflows/ci-go.yml
@@ -17,6 +17,7 @@ on:
 
 jobs:
   golang-ci:
+    if: github.repository == 'MiroMindAI/trace-blame'
     runs-on: ubuntu-latest-m
     steps:
       - uses: actions/checkout@v4
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..a26b2b7
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,63 @@
+# Contributing to trace-blame
+
+Thank you for your interest in contributing to trace-blame!
+
+## Getting Started
+
+1. Fork the repository and clone your fork:
+   ```bash
+   git clone https://github.com//trace-blame.git
+   cd trace-blame
+   ```
+
+2. Ensure you have Go 1.24+ installed (see `go.mod` for the exact version).
+
+3. Fetch test data (symlinked from upstream HTA):
+   ```bash
+   git clone --depth 1 --filter=blob:none --sparse \
+     https://github.com/facebookresearch/HolisticTraceAnalysis.git /tmp/hta
+   cd /tmp/hta && git sparse-checkout set tests/data
+   ln -sf /tmp/hta/tests ../trace-blame/tests
+   ```
+
+4. Build and verify:
+   ```bash
+   go build -o trace-blame ./cmd/trace-blame/
+   go vet ./...
+   go test -timeout 20m ./...
+   ```
+
+## Making Changes
+
+1. Create a feature branch from `main`:
+   ```bash
+   git checkout -b my-feature main
+   ```
+
+2. Make your changes. Please follow these conventions:
+   - Write idiomatic Go — run `go vet` and `gofmt` before committing.
+   - Keep commits focused — one logical change per commit.
+   - Add or update tests for any new functionality.
+
+3. Ensure all tests pass before opening a pull request.
+
+## Pull Requests
+
+- Open PRs against the `main` branch.
+- Provide a clear title and description of what your change does and why.
+- Link any related issues.
+- CI runs automatically on PRs that touch Go source, module files, or the workflow itself.
+
+## Reporting Issues
+
+- Use [GitHub Issues](https://github.com/MiroMindAI/trace-blame/issues) to report bugs or request features.
+- Include reproduction steps, expected behaviour, and actual behaviour.
+- Attach trace files or error output when applicable.
+
+## Code of Conduct
+
+Be respectful and constructive. We follow the [Contributor Covenant](https://www.contributor-covenant.org/version/2/1/code_of_conduct/).
+
+## License
+
+By contributing, you agree that your contributions will be licensed under the same license as the project (see [LICENSE](LICENSE)).
diff --git a/README.md b/README.md
index 5f62e6c..b808c36 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 A Go CLI for analyzing PyTorch Profiler traces.
 
-Reimplements [HolisticTraceAnalysis](https://github.com/facebookresearch/HolisticTraceAnalysis) with following features:
+Reimplements most of the workflows in [HolisticTraceAnalysis](https://github.com/facebookresearch/HolisticTraceAnalysis) with following features:
 
 1. `install-skill` supports agent usage.
 2. single go binary.
@@ -39,3 +39,20 @@ trace-blame idle-time-breakdown --db trace.db --ranks 0,1
 | CUPTI | `cupti-counter-data` |
 
 Run `trace-blame` with no arguments for usage, or `trace-blame -h` for flag details.
+
+## Roadmap
+
+- Expand debugging workflows beyond the current HTA coverage
+- Navigate up and down the operator call stack from within the agent
+- Traverse forward and backward along a CUDA stream in the agent
+- Support memory-profiling workflows
+
+Ideas and contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) to get started.
+
+## Other Awesome Tools
+
+Check out these tools to make debugging pytorch training job easier:
+
+- [hta](https://github.com/facebookresearch/HolisticTraceAnalysis) analyzes torch profile.
+- [tlparse](https://github.com/meta-pytorch/tlparse/) analyzes torch compile process.
+- [mosaic](https://github.com/facebookresearch/mosaic) analyzes torch memory snapshot.
\ No newline at end of file