docs: add initial finding RFC proposal

openclarity · Jan 16, 2024 · a7298bb · a7298bb
1 parent e5cbfe4
commit a7298bb
Showing 1 changed file with 41 additions and 8 deletions.
diff --git a/rfc/group-assets-per-finding.md b/rfc/group-assets-per-finding.md
@@ -1,29 +1,62 @@
-# [RFC] Enable per-asset grouping for findings
+# [RFC] Aggregated summaries for findings
 
 *Note: this RFC template follows the HashiCorp RFC format described [here](https://works.hashicorp.com/articles/rfc-template)*
 
 
-|               |                                             |
-|---------------|---------------------------------------------|
-| **Created**   | 2023-12-14                                  |
-| **Status**    | **WIP** \| InReview \| Approved \| Obsolete |
-| **Owner**     | *ramizpolic*                                |
-| **Approvers** | *TODO*                                      |
+|               |                                           |
+|---------------|-------------------------------------------|
+| **Created**   | 2023-12-14                                |
+| **Status**    | WIP \| **InReview** \| Approved \| Obsolete |
+| **Owner**     | *ramizpolic*                              |
+| **Approvers** | *TODO*                                    |
 
 ---
 
-This RFC proposes the assets to be grouped by unique findings to simplify views and extend API operations.
+This RFC proposes aggregating related assets within findings to improve the related API operations and simplify the UI.
 
 ## Background
 
+The current iteration of the `Finding` model does not use any kind of aggregation to express its relationship with related models such as scans and assets.
+Instead, a finding is expressed as a tuple of its data and its dependencies: `finding = ({findingInfo}, {asset}, {scan})`.
+As a result, each (new) finding is considered unique because it is defined by its dependencies, for example:
+
+```
+findingInfo = {packageName="pkg",...} # same underlying package
+
+finding_1 = (findingInfo, {assetID=1}, {scanID=1})
+finding_2 = (findingInfo, {assetID=2}, {scanID=2})
+```
+In summary, the API returns the same finding once for every asset and scan combination, which:
+- duplicates findings data in the database and adds unnecessary overhead in both speed and memory
+- makes findings difficult to understand and work with at both the API and the UI level
+- assumes that each finding is tied to a specific asset and scan
+
 ## Proposal
 
+If we _**assume that each finding's data is truly unique**_, independent of dependencies such as the asset it was discovered on or the scan that found it, we get:
+- Each finding is uniquely described by its finding data alone, with the related assets and scans (dependencies) aggregated under it
+- The API and UI become easier to consume, with less database overhead and better efficiency
+
+Therefore, we can define and treat findings similarly to scans, that is:
+- A finding is unique and is expressed by the `FindingInfo` model
+- Related dependencies are aggregated in `FindingSummary`, implemented similarly to `ScanSummary`
+- References to dependencies can be added to `assetIDs` and `scanIDs`
+  - TODO(ramizpolic): can we make any simplifications/assumptions here? 
+
+
 ### Abandoned Ideas (Optional)
 
 ---
 
 ## Implementation
 
+1. Extend the `Finding` API to use the same approach as `Scan` for expressing relationships between dependencies
+2. Update the related UI models to make use of the updated API
+
 ## UX
 
+This RFC has no visible impact on the UX.
+
 ## UI
+
+This RFC changes how findings data is presented, as the UI needs to display aggregated results.
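
To illustrate the aggregation the RFC describes, here is a minimal Python sketch rather than the project's implementation. It assumes a finding can be identified by its `FindingInfo` data alone. The `AggregatedFinding` class, the `aggregate` helper, and the `asset_ids`/`scan_ids` field names are hypothetical; they only mirror the `assetIDs` and `scanIDs` references mentioned in the Proposal section.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class FindingInfo:
    """Finding data only (hypothetical stand-in for the RFC's FindingInfo)."""
    package_name: str


@dataclass
class AggregatedFinding:
    """One unique finding plus references to its dependencies (assumed shape)."""
    finding_info: FindingInfo
    asset_ids: Set[int] = field(default_factory=set)
    scan_ids: Set[int] = field(default_factory=set)


def aggregate(
    records: Iterable[Tuple[FindingInfo, int, int]],
) -> List[AggregatedFinding]:
    """Group (finding_info, asset_id, scan_id) tuples by finding_info."""
    grouped: Dict[FindingInfo, AggregatedFinding] = {}
    for info, asset_id, scan_id in records:
        if info not in grouped:
            grouped[info] = AggregatedFinding(info)
        grouped[info].asset_ids.add(asset_id)
        grouped[info].scan_ids.add(scan_id)
    return list(grouped.values())


# The two per-asset findings from the Background example share the same data,
# so they collapse into a single aggregated finding.
records = [
    (FindingInfo(package_name="pkg"), 1, 1),
    (FindingInfo(package_name="pkg"), 2, 2),
]
findings = aggregate(records)
```

Keying on a frozen dataclass means equality is decided by the finding data itself, which is the uniqueness assumption the Proposal relies on. If real findings differ in fields that should not affect identity, the key would need to be narrowed accordingly.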