Test: Added Stryker Mutator (Dynamic Analysis) #99

Merged
aliciach1 merged 26 commits into main from tool-stryker-cici
Oct 30, 2025


@qianxuege qianxuege commented Oct 23, 2025

Summary

  • Stryker Mutator is a mutation-testing framework that intentionally breaks the code in tiny ways to see if the tests fail.
  • Since NodeBB is a large and complex application, we will use Stryker to test its plugins or specific modules, rather than attempting a full mutation test on the entire core codebase.

How Stryker Mutator Works

  1. Stryker copies the source files into a sandbox and generates mutants
    • Each small, artificial change is called a mutant.
  2. It then runs tests for each mutant.
    • If tests fail, that mutant is killed ✅ → your tests caught the bug.
    • If tests pass, that mutant survived ❌ → your tests missed it.
  3. It will compute a mutation score at the end. The higher the score, the more fault-resistant the test suite is.

Evidence that you had successfully installed the tool

  • new package has been added to package.json: "@stryker-mutator/mocha-runner": "^9.2.0"
  • stryker.config.json was created in the root directory

Artifacts that demonstrate that you have successfully run the tool on your repository

  • Run NODE_ENV=test DISABLE_EMAIL=true npx mocha test/categories.js to make sure that the non-mutated code passes all tests.
  • Then, run npx stryker run --logLevel info to run Stryker mutator on src/categories/create.js
  • Attached is a screenshot of Stryker running.
[Screenshots: five captures of the Stryker run, taken 2025-10-23 between 8:04 PM and 8:05 PM]

What types of problems are you hoping your tooling will catch? What types of problems does this particular tool catch?

  • Overall, I hope that my tooling will verify whether or not my code produces expected results, and mitigate the chance of crashing when I deploy my code.
  • Stryker exposes logical weaknesses in the test suite, i.e. it finds places where the tests execute code but are insufficient to verify correctness.
    • It is a dynamic analysis tool that evaluates how reliable the existing test suite is.

What types of customization are possible or necessary?

A PRIORI CUSTOMIZATION (initial setup)

  1. Mutation Scope: defines which files Stryker will mutate. This is necessary; otherwise Stryker will mutate everything and cause crashes.
    • e.g. "mutate": [ "src/categories/**/*.js" ],
    • These should only target business logic — not tests, build output, or framework code.
  2. Test Runner Integration: tells Stryker how to run the tests. This is necessary and can impact running speed.
    • Built-in: mocha, jest, karma, vitest, etc.
    • External command: "testRunner": "command" when you have complex bootstrap logic.
    • I experimented with both mocha and command
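  3. Coverage Analysis Mode: controls whether Stryker uses coverage data to decide which tests to run for each mutant. It is necessary and is enabled by default.
    • "perTest" (default) → runs only the tests that cover each mutant; most precise and usually fastest.
    • "all" → tests only the mutants covered by the suite as a whole, but runs every test for each of them; less precise.
    • "off" → no coverage analysis; every test runs for every mutant, so it is the slowest.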
  4. File Exclusions (ignorePatterns): prevents sandbox from copying gigabytes of data. It is not necessary but would be good to have.
    • It is common to exclude node_modules, build artifacts, uploads, coverage reports, logs, etc.
  5. Timeouts & Concurrency: prevent false timeouts on slow integration tests. These are set by default but we can tune them according to CPU and test runtime.
    • e.g. "timeoutMS": 300000, "timeoutFactor": 1.5, "concurrency": 4
  6. Reporters: Select which outputs matter to the workflow.
    • This project has html (local debugging), progress (terminal feedback), and clear-text (CI logs) turned on
  7. Environment Bootstraps: to ensure sandboxes mimic production
    • preloaded stryker-sandbox-bootstrap.js
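Put together, the initial setup could look roughly like the sketch below. It is written as a CommonJS `stryker.config.js` only so the example stays in JavaScript; this PR actually uses `stryker.config.json`. The `ignorePatterns` paths and the `mochaOptions.spec` entry are assumptions for illustration, not the values committed here.

```javascript
// Hypothetical stryker.config.js, a JavaScript equivalent of the JSON config.
const config = {
  // 1. Mutation scope: only the module under study.
  mutate: ['src/categories/**/*.js'],

  // 2. Test runner integration (the mocha runner needs @stryker-mutator/mocha-runner).
  testRunner: 'mocha',
  // Assumed spec file, taken from the mocha command used for the dry run.
  mochaOptions: { spec: ['test/categories.js'] },

  // 3. Coverage analysis mode.
  coverageAnalysis: 'perTest',

  // 4. Files the sandbox should not copy (illustrative paths, adjust to the repo).
  ignorePatterns: ['build', 'coverage', 'logs', 'public/uploads'],

  // 5. Timeouts and concurrency.
  timeoutMS: 300000,
  timeoutFactor: 1.5,
  concurrency: 4,

  // 6. Reporters.
  reporters: ['html', 'progress', 'clear-text'],
};

module.exports = config;
```

Keeping each option next to a comment naming the customization it implements makes it easier to see which settings are required (scope, runner) and which are tuning knobs.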

OVER TIME CUSTOMIZATION (ongoing tuning)

  1. Exclude or include specific mutation types if they produce noise
    • e.g. "mutator": { "excludedMutations": ["BooleanLiteral", "StringLiteral"] }
  2. Thresholds & Quality Gates: Add numeric gates that fail CI if the mutation score drops
    • e.g. "thresholds": { "high": 80, "low": 60, "break": 50 }
  3. Incremental Mutations: Enable incremental mode so Stryker only re-tests changed files between commits
  4. Split or Modular Runs: Break the project into separate configs or commands and run them in parallel/rotation in CI to manage runtime
  5. Parallelization / Cloud Execution: integrate with GitHub Actions to distribute mutants across cores or agents
  6. Custom Hooks & Build Commands: custom pre-sandbox hooks that copy assets/templates since our app requires setup
  7. Continuous Calibration: Periodically raise thresholds, re-include modules we skipped earlier, trim unproductive mutation types, measure runtime vs. benefit to keep it practical.
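As a rough sketch of how the quality gate behaves, the snippet below pairs a hypothetical tuning fragment with a small helper that classifies a mutation score against the thresholds. `classifyScore` is not a Stryker API. It is an illustrative function that assumes the usual reading of the options: at or above `high` is fine, between `low` and `high` is a warning, below `low` is a danger zone, and below `break` fails the run.

```javascript
// Hypothetical fragment of the config holding the ongoing-tuning options.
const tuning = {
  mutator: { excludedMutations: ['StringLiteral'] },
  thresholds: { high: 80, low: 60, break: 50 },
  incremental: true, // only re-test mutants affected by changes since the last run
};

// Illustrative helper (not part of Stryker): maps a score to a gate outcome.
function classifyScore(score, { high, low, break: breakAt }) {
  if (breakAt != null && score < breakAt) return 'break'; // CI should fail
  if (score >= high) return 'ok';
  if (score >= low) return 'warning';
  return 'danger';
}

for (const score of [85, 70, 55, 40]) {
  console.log(score, classifyScore(score, tuning.thresholds));
}
```

Starting with a low `break` value and raising it over time matches the continuous-calibration idea: the gate only blocks merges once the team trusts the score.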

How can/should this tool be integrated into a development process?

  • Stryker is a quality amplifier: it measures whether the existing tests actually detect logic errors. It therefore does not replace tests; it should be used to continuously validate their effectiveness.
  • Running Stryker keeps us from treating 100% line coverage as the only measure of test quality.
  • We could run Stryker occasionally during feature or refactor work to ensure the tests we wrote actually fail on bugs.
    • Before merging to main, occasionally mutate all files in our src folder; for pull requests, mostly run Stryker in incremental mode so it only re-tests changed files.
  • Since NodeBB is a large and complex application, we will use Stryker to test its plugins or specific modules, rather than attempting a full mutation test on the entire core codebase. For context, running Stryker on the full application would take 10+ hours.

Are there many false positives? False negatives? True positive reports about things you don’t care about?

  • Stryker generated 236 mutants in src/categories/create.js. Our tests killed 200 of them (the tests failed as expected), 1 mutant timed out (the tests hung or ran too long), and 35 mutants survived (no test failed).
  • A surviving mutant can be a true positive (a legitimate gap in the test suite),
    • e.g. const smallestOrder = firstChild.length ? firstChild[0].score - 1 : 1; was mutated to const smallestOrder = firstChild.length ? firstChild[0].score + 1 : 1; and no test caught it.
  • or a false positive / low-impact mutant (a change to cosmetic or unused code that is untested but not dangerous).
    • e.g. description: data.description ? data.description : '', → description: data.description ? data.description : "Stryker was here!"
  • Of the 35 survivors, roughly 20–25 are true positives and 8–10 are false positives.
  • Some of the true positives are ones I don't really care about,
    • e.g. [Survived] ConditionalExpression src/categories/create.js:216:7 - if (copyParent) { + if (true) {, since if (copyParent) is essentially the same as saying if (copyParent == true)
  • False negatives are possible when tests are structured in a way that hides failures, e.g. using mocks/stubs that bypass real logic, or catching errors without asserting on results, but none were reported by Stryker.


coveralls commented Oct 23, 2025

Pull Request Test Coverage Report for Build 18956348932

Details

  • 4 of 4 (100.0%) changed or added relevant lines in 2 files are covered.
  • 5 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.01%) to 78.447%

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/admin/versions.js | 1 | 87.88% |
| src/controllers/admin/dashboard.js | 2 | 85.34% |
| src/middleware/render.js | 2 | 78.89% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 18755099280 | -0.01% |
| Covered Lines | 24896 |
| Relevant Lines | 29877 |

💛 - Coveralls

@qianxuege qianxuege changed the title from "Test: Added Stryker Mutator (Dynamic Analysis)" to "WIP: Added Stryker Mutator (Dynamic Analysis)" Oct 23, 2025
@qianxuege qianxuege changed the title from "WIP: Added Stryker Mutator (Dynamic Analysis)" to "Test: Added Stryker Mutator (Dynamic Analysis)" Oct 23, 2025
@qianxuege qianxuege added the documentation and enhancement labels Oct 24, 2025

@aliciach1 aliciach1 left a comment

please remove output file in reports/mutation/mutation.json. Otherwise, looks good!

@qianxuege (Author)

> please remove output file in reports/mutation/mutation.json. Otherwise, looks good!

Removed! thanks

@qianxuege qianxuege requested a review from aliciach1 October 30, 2025 22:15
@qianxuege qianxuege self-assigned this Oct 30, 2025

@aliciach1 aliciach1 left a comment

Checked out the branch locally and ran the tests; they pass. The changes thoroughly integrate Stryker Mutator into the CI pipeline and don't break anything.
