This repository was archived by the owner on Oct 14, 2024. It is now read-only.

docs: add initial finding RFC proposal
ramizpolic committed Jan 16, 2024
1 parent e5cbfe4 commit a7298bb
Showing 1 changed file with 41 additions and 8 deletions.
49 changes: 41 additions & 8 deletions rfc/group-assets-per-finding.md
@@ -1,29 +1,62 @@
# [RFC] Enable per-asset grouping for findings
# [RFC] Aggregated summaries for findings

*Note: this RFC template follows the HashiCorp RFC format described [here](https://works.hashicorp.com/articles/rfc-template)*


| | |
|---------------|---------------------------------------------|
| **Created** | 2023-12-14 |
| **Status** | **WIP** \| InReview \| Approved \| Obsolete |
| **Owner** | *ramizpolic* |
| **Approvers** | *TODO* |
| | |
|---------------|-------------------------------------------|
| **Created** | 2023-12-14 |
| **Status** | WIP \| **InReview** \| Approved \| Obsolete |
| **Owner** | *ramizpolic* |
| **Approvers** | *TODO* |

---

This RFC proposes the assets to be grouped by unique findings to simplify views and extend API operations.
This RFC proposes aggregating assets per finding to improve the related API operations and simplify the UI.

## Background

The current iteration of the `Finding` model does not use any kind of aggregation to express its relationship with related models such as scans and assets.
Instead, a finding is expressed as a combination of its data and its dependencies, that is, `finding = ({findingInfo}, {asset}, {scan})`.
As a result, each (new) finding is considered unique because it is defined by its dependencies, for example:

```
findingInfo = {packageName="pkg",...} # same underlying package
finding_1 = (findingInfo, {assetID=1}, {scanID=1})
finding_2 = (findingInfo, {assetID=2}, {scanID=2})
```
In summary, this makes the API return the same finding once for each asset and scan, which:
- duplicates findings data in the database and adds unnecessary overhead to both speed and memory
- makes findings difficult to understand and work with at both the API and the UI level
- assumes that each finding is tied to a specific asset and scan
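
The duplication described above can be illustrated with a minimal sketch. The Python types below are hypothetical, simplified stand-ins for the models named in this RFC and are not the actual API definitions; they only assume that a finding is identified by its data together with its asset and scan.

```python
from collections import Counter
from typing import NamedTuple


class FindingInfo(NamedTuple):
    # Hypothetical, simplified stand-in for the finding data.
    package_name: str


class Finding(NamedTuple):
    # Current shape: a finding is identified by its data AND its dependencies.
    info: FindingInfo
    asset_id: int
    scan_id: int


info = FindingInfo(package_name="pkg")  # same underlying package
findings = [
    Finding(info, asset_id=1, scan_id=1),
    Finding(info, asset_id=2, scan_id=2),
]

# The two records compare as distinct even though they describe the same package.
distinct_records = len(set(findings))

# Number of stored records that carry each piece of finding data.
copies_per_info = Counter(f.info for f in findings)
```

In this sketch, the same `FindingInfo` is stored once per asset and scan pair, so the number of records grows with every asset and scan that reports the same package.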

## Proposal

If we _**assume that each finding's data is truly unique**_, independent of its dependencies such as the asset it was discovered on or the scan that found it, we get:
- Each finding is uniquely described by its findings data alone, with the related assets and scans (dependencies) aggregated under it
- The API and UI become easier to consume, with lower database overhead and improved efficiency

Therefore, we can define and treat findings similarly to scans, that is:
- A finding is unique and is expressed by the `FindingInfo` model
- Related dependencies are aggregated in `FindingSummary`, implemented similarly to `ScanSummary`
- References to dependencies can be added to `assetIDs` and `scanIDs`
- TODO(ramizpolic): can we make any simplifications/assumptions here?
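
A rough sketch of what the aggregated shape could look like is given below. It is illustrative only: the field names `asset_ids` and `scan_ids` mirror the `assetIDs` and `scanIDs` mentioned above, while the contents of `FindingSummary` (simple counters) and the `aggregate` helper are assumptions made for this example, not part of the proposal or the existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class FindingInfo:
    # Hypothetical, simplified finding data; frozen so it can be used as a key.
    package_name: str


@dataclass
class FindingSummary:
    # Assumed summary fields (counters); the RFC does not define them.
    total_assets: int = 0
    total_scans: int = 0


@dataclass
class Finding:
    # A finding is unique per FindingInfo; dependencies are only referenced.
    finding_info: FindingInfo
    asset_ids: Set[int] = field(default_factory=set)
    scan_ids: Set[int] = field(default_factory=set)

    @property
    def summary(self) -> FindingSummary:
        return FindingSummary(len(self.asset_ids), len(self.scan_ids))


def aggregate(records: Iterable[Tuple[FindingInfo, int, int]]) -> List[Finding]:
    """Group (finding info, asset ID, scan ID) records by finding info."""
    by_info: Dict[FindingInfo, Finding] = {}
    for info, asset_id, scan_id in records:
        if info not in by_info:
            by_info[info] = Finding(info)
        by_info[info].asset_ids.add(asset_id)
        by_info[info].scan_ids.add(scan_id)
    return list(by_info.values())


records = [
    (FindingInfo("pkg"), 1, 1),
    (FindingInfo("pkg"), 2, 2),
]
aggregated = aggregate(records)
```

With this shape, the two records from the Background example collapse into a single finding that references both assets and both scans, and the summary can be derived from those references rather than stored per record.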


### Abandoned Ideas (Optional)

---

## Implementation

1. Extend the `Finding` API to use the `Scan` approach for expressing the relationship between dependencies
2. Update the related UI models to make use of the updated API

## UX

This RFC has no visible impact on the UX.

## UI

This RFC changes how findings data is shown in the UI, as the findings views need to display aggregated results.
