AI-Driven Test Selection on Pull Request acceptance tests #244

Open
srbarrios opened this issue Feb 16, 2025 · 1 comment
Labels: AI, Medium Sized Project (175 hours), Uyuni

srbarrios commented Feb 16, 2025

Project Title

AI-Driven Test Selection on Pull Request acceptance tests

Description

Large test suites can slow down CI/CD pipelines, leading to longer feedback loops and inefficient resource usage. This project aims to leverage machine learning (ML) to predict which tests should be executed based on recent code changes, commit history, past test failures, and code coverage.

By analyzing this data, we can train an ML model to prioritize high-risk tests and reduce overall test execution time. The goal is to cut the execution time of the Pull Request acceptance tests by running only the most relevant tests, as selected by this ML model.
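
As a rough illustration only (not from the issue itself), the kind of per-test feature row such a model might consume could look like the sketch below; the column names, values, and the `pandas` layout are assumptions, not the project's actual schema.

```python
import pandas as pd

# Illustrative only: one row per (pull request, test) pair, with features
# derived from commit history, past failures, and coverage overlap.
example_row = pd.DataFrame([{
    "test_id": "features/example.feature",  # hypothetical test identifier
    "files_changed": 12,                    # files touched by the PR
    "touches_covered_code": 1,              # 1 if the PR modifies code this test covers
    "past_failure_rate": 0.18,              # fraction of recent runs where the test failed
    "days_since_last_failure": 3,
    "label_should_run": 1,                  # training label, e.g. "test failed on this change"
}])
print(example_row)
```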

This project is a continuation of our current work in the Uyuni project, which will be presented at SeleniumConf 2025.

The project will involve:

  • Extracting commit history to identify impacted files (see the sketch below).
  • Analyzing test execution history to track past failures.
  • Processing code coverage data to map tests to code changes.
  • Training a machine learning model (e.g., Random Forest, XGBoost) to recommend which tests to run.
  • Integrating the trained model into our GitHub Actions to dynamically select tests for each Pull Request.

This approach ensures that tests are executed intelligently, reducing test cycle time while maintaining high test coverage.
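
A minimal sketch of the commit-history step mentioned above, assuming plain `git log` output is enough to identify impacted files; the revision range and the `changed_files` helper are illustrative, not the project's actual tooling.

```python
import subprocess
from collections import Counter

def changed_files(rev_range="origin/master..HEAD"):
    """Count how often each file was touched in the given revision range."""
    out = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:", rev_range],
        capture_output=True, text=True, check=True,
    ).stdout
    return Counter(line.strip() for line in out.splitlines() if line.strip())

# Example: the files most frequently modified by the commits under review
for path, count in changed_files().most_common(10):
    print(count, path)
```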

Deliverables

  • Data extraction scripts for:
    • Commit history from Git (files changed, commit messages)
    • Test execution logs (pass/fail results, error messages)
    • JaCoCo code coverage reports (a parsing sketch follows this list)
  • A trained ML model that predicts which tests should run based on commit history and past test results.
  • Integration with GitHub Actions to automate test selection.
  • Comprehensive documentation on setup, training, and deployment of the model.
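
As a hedged illustration of the JaCoCo deliverable, the sketch below walks a JaCoCo XML report and lists classes with at least one covered line; the report path and the `covered_classes` helper are placeholders, not Uyuni's actual build layout.

```python
import xml.etree.ElementTree as ET

def covered_classes(jacoco_xml="jacocoTestReport.xml"):  # placeholder path
    """Yield fully qualified class names that have at least one covered line."""
    root = ET.parse(jacoco_xml).getroot()
    for package in root.iter("package"):
        for cls in package.iter("class"):
            for counter in cls.findall("counter"):
                if counter.get("type") == "LINE" and int(counter.get("covered", "0")) > 0:
                    yield cls.get("name").replace("/", ".")
                    break

if __name__ == "__main__":
    for name in covered_classes():
        print(name)
```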

Mentor

Oscar Barrios (@srbarrios)

Skills Required

  • Ruby (for test framework integration).
  • Machine Learning Basics (feature engineering, model training).
  • Python + Scikit-Learn/Pandas (for ML model development).
  • GitHub Actions (to integrate test selection into pipelines).
  • Cucumber/Selenium Testing (understanding of automated tests).

Skill Level

  • Medium – Requires knowledge of Ruby (for test integration) and basic ML concepts (training and using models).
  • Prior experience with CI/CD and automated testing is a plus.

Project Size

Medium-Sized Project (160 hours)

Get Started

Data Sources:

  • Uyuni Git Commit History: Includes code changes, affected files, and commit messages.
  • Test Execution Logs: Stores past test results, including failures, execution time, and errors. This content will be made publicly available through a web server hosted on AWS; for now, only the Cucumber reports are published there.
  • Code Coverage Data: Tracks which tests touch specific parts of the codebase. This is available in a Redis database and is already used by our GitHub Actions (a query sketch follows this list).
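
The issue does not describe the Redis schema, so the snippet below is only a sketch of how a file-to-tests mapping might be queried with `redis-py`; the key pattern, the `tests_covering` helper, and the example path are all placeholders.

```python
import redis

# Assumption: one Redis set per source file whose members are the tests that
# exercise it. The actual schema used by the Uyuni GitHub Actions may differ.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def tests_covering(source_file):
    """Return the tests recorded against a changed source file."""
    return r.smembers(f"coverage:{source_file}")  # placeholder key pattern

for test in sorted(tests_covering("java/code/src/example/SomeClass.java")):  # hypothetical path
    print(test)
```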

Steps

  • Data Collection & Preprocessing
    • Extract Commit History
    • Extract Test Execution History
    • Extract Code Coverage Data
  • Use Python + scikit-learn to train a model on all the collected data (a minimal training and selection sketch follows this list).
    • Prepare Training Data
    • Train a Classification Model
  • Integrate it as a GitHub Action that runs on every Pull Request.
  • Enhance the current GitHub Action using the ML model.
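
A minimal end-to-end sketch of the training and selection steps, assuming a prepared feature table like the one illustrated earlier; the feature names, the threshold, and the choice of Random Forest are illustrative, not the project's final design.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

FEATURES = ["files_changed", "touches_covered_code",
            "past_failure_rate", "days_since_last_failure"]

def train(history: pd.DataFrame) -> RandomForestClassifier:
    """Fit a classifier on historical (PR, test) rows labelled with past outcomes."""
    X_train, X_test, y_train, y_test = train_test_split(
        history[FEATURES], history["label_should_run"],
        test_size=0.2, random_state=42)
    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test)))
    return model

def select_tests(model, candidates: pd.DataFrame, threshold=0.3):
    """Return the test ids whose predicted failure probability exceeds the threshold."""
    probs = model.predict_proba(candidates[FEATURES])[:, 1]
    return candidates.loc[probs >= threshold, "test_id"].tolist()
```

A GitHub Actions step could then call `select_tests` for the current Pull Request and pass the resulting list to the Cucumber runner, which is the integration described in the last two steps.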

Useful links

srbarrios (Author) commented

@ddemaio can you add the Uyuni label on this issue? Thanks!

@srbarrios srbarrios changed the title AI-Driven Test Selection for Faster CI Pipelines AI-Driven Test Selection on Pull Request acceptance tests Feb 17, 2025
@ddemaio ddemaio added Uyuni Medium Sized Project Medium sized project is 175 hours AI Has elements of AI development associated with the project labels Feb 17, 2025