
Expires entity cache for related cases #2944

Closed

Conversation

shubham1g5 (Contributor) commented Jan 21, 2025

Summary

Spec

Expires the cache for all related cases in order to support caching XPath calculation results for related cases, such as instance('casedb')/casedb/case[@case_type='mother'][@case_id=current()/index/parent]/case_name. Without expiring related cases from the cache, these kinds of expressions may surface stale data to the user.

Feature Flag

CACHE_AND_INDEX

PR Checklist

  • If I think the PR is high risk, the "High Risk" label is set
  • I have confidence that this PR will not introduce a regression for the reasons below
  • Do we need to enhance manual QA test coverage? If yes, the "QA Note" label is set correctly
  • Does the PR introduce any major changes worth communicating? If yes, the "Release Note" label is set and a "Release Note" is specified in the PR description.

Automated test coverage

PR adds a test to demonstrate desired cache expiration.

Safety story

To remove the cache for related cases, we do a lookup in the case index table every time a case changes, which may have a performance impact. We currently index the case index table only on the case_id and index_name columns, not on the target case. I am wondering whether we should add an index on the target case column to make these lookups much faster?
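To make the question concrete: the new lookup filters the case index table on the target column alone, and because the existing composite indexes lead with other columns, SQLite cannot use them for this query. The index being asked about would look roughly like this (table and column names here are illustrative, not the actual schema constants):

```sql
-- Hypothetical sketch: a single-column index covering lookups by index target,
-- so getCasesWithTarget-style queries avoid a full scan of the case index table.
CREATE INDEX IF NOT EXISTS case_index_target ON case_index_storage (target);
```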

cross-request: dimagi/commcare-core#1456

coderabbitai bot commented Jan 21, 2025

📝 Walkthrough

This pull request introduces several changes across multiple files in the CommCare Android project, focusing on enhancing case indexing, cache management, and testing capabilities. The modifications include adding the Mockito testing library to the project's dependencies, implementing new methods for case index retrieval and cache invalidation in AndroidCaseIndexTable and AndroidCaseXmlParser classes, and updating the unit test infrastructure. The changes aim to improve the robustness of case-related operations, particularly in handling case relationships and cache management during XML parsing and data restoration. The modifications also extend the testing framework to support more comprehensive and parameterized testing scenarios for entity list caching and case indexing.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Parser as AndroidCaseXmlParser
    participant Cache as EntityCache
    participant IndexTable as AndroidCaseIndexTable
    participant Case as Case

    Parser->>Case: Commit case
    Parser->>Parser: clearEntityCache(case)
    Parser->>IndexTable: getCasesWithTarget(caseId)
    IndexTable-->>Parser: Related case IDs
    Parser->>Parser: Recursively clear related cases
    Parser->>Cache: invalidateCaches(recordIdsToWipe)
```

This sequence diagram illustrates the new cache invalidation process introduced in the AndroidCaseXmlParser. When a case is committed, the clearEntityCache method is called, which recursively finds and clears caches for related cases using the AndroidCaseIndexTable to identify connected cases, and then invalidates the appropriate cache entries.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java (2)

205-224: LGTM! The implementation is robust and consistent with existing patterns.

The new getCasesWithTarget method is well-implemented with:

  • Proper cursor handling and resource cleanup
  • Debug logging support
  • Consistent query pattern

However, consider these potential improvements:

  1. Add cursor null check for defensive programming
  2. Consider adding index on COL_INDEX_TARGET column for query optimization
```diff
 @Override
 public LinkedHashSet<Integer> getCasesWithTarget(String targetValue) {
     String[] args = new String[]{targetValue};
     if (SqlStorage.STORAGE_OUTPUT_DEBUG) {
         String query = String.format("SELECT %s FROM %s WHERE %s = ?", COL_CASE_RECORD_ID, TABLE_NAME, COL_INDEX_TARGET);
         DbUtil.explainSql(db, query, args);
     }
     Cursor c = db.query(TABLE_NAME, new String[]{COL_CASE_RECORD_ID}, COL_INDEX_TARGET + " =  ?", args, null,
             null, null);
+    if (c == null) {
+        return new LinkedHashSet<>();
+    }
     LinkedHashSet<Integer> ret = new LinkedHashSet<>();
     SqlStorage.fillIdWindow(c, COL_CASE_RECORD_ID, ret);
     return ret;
 }
```

Line range hint 52-53: Consider adding an index for the new query pattern.

Since we now have a new query pattern that filters only on COL_INDEX_TARGET, consider adding a corresponding index in the createIndexes method.

```diff
 public static void createIndexes(SQLiteDatabase db) {
     String recordFirstIndexId = "RECORD_NAME_ID_TARGET";
     String recordFirstIndex = COL_CASE_RECORD_ID + ", " + COL_INDEX_NAME + ", " + COL_INDEX_TARGET;
     db.execSQL(DatabaseIndexingUtils.indexOnTableCommand(recordFirstIndexId, TABLE_NAME, recordFirstIndex));

     String typeFirstIndexId = "NAME_TARGET_RECORD";
     String typeFirstIndex = COL_INDEX_NAME + ", " + COL_CASE_RECORD_ID + ", " + COL_INDEX_TARGET;
     db.execSQL(DatabaseIndexingUtils.indexOnTableCommand(typeFirstIndexId, TABLE_NAME, typeFirstIndex));
+
+    // Index for getCasesWithTarget queries
+    String targetFirstIndexId = "TARGET_ONLY";
+    String targetFirstIndex = COL_INDEX_TARGET;
+    db.execSQL(DatabaseIndexingUtils.indexOnTableCommand(targetFirstIndexId, TABLE_NAME, targetFirstIndex));
 }
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a031f9 and dddccd8.

📒 Files selected for processing (7)
  • app/build.gradle (1 hunks)
  • app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java (1 hunks)
  • app/src/org/commcare/xml/AndroidBulkCaseXmlParser.java (4 hunks)
  • app/src/org/commcare/xml/AndroidCaseXmlParser.java (4 hunks)
  • app/unit-tests/resources/commcare-apps/case_list_lookup/restore.xml (3 hunks)
  • app/unit-tests/resources/commcare-apps/index_and_cache_test/incremental_restore.xml (1 hunks)
  • app/unit-tests/src/org/commcare/android/tests/caselist/EntityListCacheIndexTest.java (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Lint Code Base
🔇 Additional comments (13)
app/unit-tests/src/org/commcare/android/tests/caselist/EntityListCacheIndexTest.java (3)

3-26: LGTM! Well-structured test setup.

The imports are properly organized, including all necessary testing utilities for assertions, mocking, and parameterized testing.


40-49: Good use of parameterized testing.

The test is well-designed to verify cache behavior under different bulk processing configurations.


59-84: Comprehensive test coverage for cache expiration.

The test effectively verifies that:

  1. Cases are initially cached correctly
  2. Cache is properly invalidated for related cases after modifications
  3. Behavior is consistent across bulk processing configurations
app/unit-tests/resources/commcare-apps/index_and_cache_test/incremental_restore.xml (1)

1-18: Well-structured test data for cache invalidation.

The XML correctly models a case update scenario with:

  1. Proper namespace declarations
  2. User group fixture for context
  3. Case update with all required fields
app/unit-tests/resources/commcare-apps/case_list_lookup/restore.xml (1)

17-19: Well-designed case relationship structure.

The case relationships are properly modeled using index elements, creating a hierarchical structure that's ideal for testing cache invalidation propagation:

  • pat (parent) <- stan (middle) <- ellen (child)

This structure allows thorough testing of cache invalidation across related cases.

Also applies to: 33-35

app/build.gradle (1)

45-45: LGTM! Appropriate test dependency.

The Mockito dependency is:

  1. Correctly scoped to testImplementation
  2. Using a recent version (5.5.0)
  3. Required for the static mocking in EntityListCacheIndexTest
app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java (1)

205-224: Verify database migration and performance impact.

Since we're adding a new query pattern and potentially a new index:

  1. Ensure that database migration scripts are updated if needed
  2. Verify the performance impact of the new index on write operations
✅ Verification successful

No migration or performance concerns identified.

The getCasesWithTarget method is using an existing table structure with established migration paths and proper indexing. The implementation follows the codebase's standard query patterns and includes built-in performance monitoring through SqlStorage.STORAGE_OUTPUT_DEBUG.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

```shell
#!/bin/bash
# Check for existing migration scripts and measure query performance
echo "Checking for migration scripts..."
fd -e java -e sql migration
echo "Checking for existing usages of getCasesWithTarget..."
rg "getCasesWithTarget" -A 5
```

Length of output: 1979


Script:

```shell
#!/bin/bash
# Check git history for this method and table schema
echo "Checking git history for getCasesWithTarget..."
git log -L :getCasesWithTarget:app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java

echo "Checking for table creation and schema..."
rg "CREATE TABLE.*${TABLE_NAME}" --type java
rg "TABLE_NAME.*=.*case_index" --type java -A 2
```

Length of output: 1871


Script:

```shell
#!/bin/bash
# Check table schema details
echo "Checking table schema..."
rg "CREATE TABLE.*case_index_storage" -A 10 --type java

echo "Checking for index definitions..."
rg "CREATE.*INDEX.*case_index" --type java
```

Length of output: 238


Script:

```shell
#!/bin/bash
# Get full implementation details
echo "Checking full table implementation..."
ast-grep --pattern 'class AndroidCaseIndexTable {
  $$$
}'

echo "Checking for database upgrade patterns..."
rg "onUpgrade|upgradeTable|upgrade.*Database" --type java
```

Length of output: 9312

app/src/org/commcare/xml/AndroidBulkCaseXmlParser.java (3)

3-3: LGTM! Good code reuse through static import.

The static import of clearEntityCacheHelper promotes code reuse between the parser classes.

Also applies to: 20-20


80-90: LGTM! Improved cache invalidation with case tracking.

The changes maintain transactional integrity while improving cache invalidation by tracking the actual cases being processed.


104-112: Consider monitoring performance impact of recursive cache clearing.

While the implementation is correct, recursive cache clearing through related cases could impact performance for large case hierarchies.

Consider adding performance monitoring to track:

  1. Depth of case relationships being traversed
  2. Time taken for cache invalidation
  3. Number of cases processed in each bulk operation
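A minimal sketch of what such instrumentation could look like, in plain Java; the class name and call sites are hypothetical, not an existing CommCare utility:

```java
// Hypothetical helper for instrumenting recursive cache clearing:
// tracks traversal depth, number of cases visited, and elapsed time.
public class CacheClearStats {
    private long startNanos;
    private int maxDepth;
    private int casesVisited;

    public void begin() {
        startNanos = System.nanoTime();
        maxDepth = 0;
        casesVisited = 0;
    }

    // Call once per case visited, passing the current recursion depth.
    public void visit(int depth) {
        casesVisited++;
        if (depth > maxDepth) {
            maxDepth = depth;
        }
    }

    public int getCasesVisited() { return casesVisited; }

    public int getMaxDepth() { return maxDepth; }

    public String summary() {
        long millis = (System.nanoTime() - startNanos) / 1_000_000;
        return "cleared " + casesVisited + " cases, max depth " + maxDepth
                + ", " + millis + " ms";
    }

    public static void main(String[] args) {
        CacheClearStats stats = new CacheClearStats();
        stats.begin();
        stats.visit(1);
        stats.visit(2);
        stats.visit(2);
        System.out.println(stats.summary());
    }
}
```

The summary string could then be logged once per bulk operation rather than per case, keeping the overhead of the monitoring itself negligible.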
app/src/org/commcare/xml/AndroidCaseXmlParser.java (3)

12-12: LGTM! Required imports added for new functionality.

The new imports support the cache invalidation logic and case relationship handling.

Also applies to: 32-34


103-103: LGTM! Consistent cache invalidation approach.

The change aligns with the bulk parser implementation while maintaining transactional integrity.


112-120: LGTM! Robust null handling and clean encapsulation.

The implementation properly handles edge cases and encapsulates the cache invalidation logic.

Comment on lines +122 to +150
```java
public static void clearEntityCacheHelper(Case aCase, Set<Integer> visited, AndroidCaseIndexTable caseIndexTable,
        IStorageUtilityIndexed<Case> storage) {
    int caseRecordId = aCase.getID();

    // Prevent infinite recursion by protecting against cyclic relationships
    if (visited.contains(caseRecordId)) {
        return;
    }
    visited.add(caseRecordId);

    // Recursively clear cache for related cases
    if (aCase.getIndices() != null) {
        for (CaseIndex ci : aCase.getIndices()) {
            Case relatedCase = storage.getRecordForValue(Case.INDEX_CASE_ID, ci.getTarget());
            if (relatedCase != null) {
                clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
            }
        }
    }

    // We also need to clear the cache of cases that have an index to our case
    LinkedHashSet<Integer> relatedCases = caseIndexTable.getCasesWithTarget(aCase.getCaseId());
    for (Integer relatedCaseRecordId : relatedCases) {
        Case relatedCase = storage.read(relatedCaseRecordId);
        if (relatedCase != null) {
            clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
        }
    }
}
```

🛠️ Refactor suggestion

Add error handling for database operations.

While the implementation thoroughly handles case relationships and prevents cycles, it should handle potential database errors when:

  1. Retrieving related cases through indices
  2. Reading cases by record ID

Consider wrapping database operations in try-catch blocks:

```diff
 public static void clearEntityCacheHelper(Case aCase, Set<Integer> visited, AndroidCaseIndexTable caseIndexTable,
         IStorageUtilityIndexed<Case> storage) {
     int caseRecordId = aCase.getID();

     if (visited.contains(caseRecordId)) {
         return;
     }
     visited.add(caseRecordId);

     if (aCase.getIndices() != null) {
         for (CaseIndex ci : aCase.getIndices()) {
+            try {
                 Case relatedCase = storage.getRecordForValue(Case.INDEX_CASE_ID, ci.getTarget());
                 if (relatedCase != null) {
                     clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
                 }
+            } catch (Exception e) {
+                Logger.log(LogTypes.TYPE_ERROR_STORAGE,
+                        "Error retrieving related case: " + ci.getTarget() + " Error: " + e.getMessage());
+            }
         }
     }

     LinkedHashSet<Integer> relatedCases = caseIndexTable.getCasesWithTarget(aCase.getCaseId());
     for (Integer relatedCaseRecordId : relatedCases) {
+        try {
             Case relatedCase = storage.read(relatedCaseRecordId);
             if (relatedCase != null) {
                 clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
             }
+        } catch (Exception e) {
+            Logger.log(LogTypes.TYPE_ERROR_STORAGE,
+                    "Error reading case: " + relatedCaseRecordId + " Error: " + e.getMessage());
+        }
     }
 }
```

Committable suggestion skipped: line range outside the PR's diff.

@shubham1g5 shubham1g5 requested review from ctsims and avazirna January 23, 2025 10:53
@ctsims ctsims (Member) left a comment

Added a couple of initial thoughts, but I think my feedback here will need to be based on the overall direction of the approach for the feature, i.e.: what level of confidence are you targeting from the spec?

```java
    mEntityCache.invalidateCaches(recordsToWipe);
}

public static void clearEntityCacheHelper(Case aCase, Set<Integer> visited, AndroidCaseIndexTable caseIndexTable,
```
ctsims (Member):

Instead of being recursive, I would update this methodology to use the existing case index map functionality, similar to how we generate the DAG for the case purge algorithm, and then just run the purge in a loop with the list of cases.

That would let us keep logic (and especially optimizations) for doing these kinds of case tree walks in one place, and keep the same guarantees about the shapes of the graphs.
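A rough illustration of the non-recursive shape this could take, using a plain adjacency map in place of the real case index map machinery (all names here are hypothetical, not the actual CommCare DAG utilities):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: collect the connected case subgraph iteratively
// from a prebuilt index map, then invalidate the whole batch in one loop
// instead of recursing case by case.
public class CaseSubgraphWalk {

    // edges: case id -> ids of cases related to it in either direction
    // (child -> parent via the case's indices, and parent -> child via
    // a reverse lookup like getCasesWithTarget).
    public static Set<String> relatedCases(String start, Map<String, Set<String>> edges) {
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String caseId = queue.remove();
            if (!visited.add(caseId)) {
                continue; // already seen; this also guards against cycles
            }
            queue.addAll(edges.getOrDefault(caseId, Set.of()));
        }
        return visited;
    }

    public static void main(String[] args) {
        // pat (parent) <- stan (middle) <- ellen (child), as in the test restore
        Map<String, Set<String>> edges = Map.of(
                "ellen", Set.of("stan"),
                "stan", Set.of("pat", "ellen"),
                "pat", Set.of("stan"));
        System.out.println(relatedCases("ellen", edges));
    }
}
```

Invalidation would then be a single pass over the returned set, which keeps the graph-walk logic (and its optimizations) in one place.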

```java
if (mEntityCache != null) {
    Set<Integer> recordIdsToWipe = new HashSet<>();
    for (int i = 0; i < casesToWipe.size(); i++) {
        clearEntityCacheHelper(casesToWipe.get(i), recordIdsToWipe, mCaseIndexTable, storage);
```
ctsims (Member):

The performance characteristics of these methods need a lot more consideration, I think.

It gets really problematic for CommCare if saving a case takes a long time, since there are lots of contexts where we might update a bunch of cases in a row, and this code introduces a lot of potential reads (and writes) that could end up making what currently takes milliseconds into something that snowballs. There are definitely contexts where it's probably faster to wipe the entire cache than it is to update it record by record.

There's some chance I'm overthinking it, but having worked a lot with case subgraphs in the purge code, it is really expensive to do even one walk over the full case graph. If we were going to make a change like this I'd want to see how it affects performance under high case loads in both sparse cases (1 parent, 1 child) and dense cases (for instance, 1 person case with 100 visit cases).

As an alternative, I wonder if instead of updating the invalidation synchronously, we instead added the "cases needing invalidation" to a queue that can process them async, and block on the queue clearing before showing a case list, which would mean no delay at all in the fast cases.

This has two other nice advantages:
1. With the queue of invalidations you could process them in chunks. If, for instance, there were 5 updates pending, they could share an index walk to accumulate related cases, and then submit all of the cache updates in one transaction.
2. Seeing the spool size would let us be smarter about large backlogs, i.e. if there was a bulk update and 5,000 of the 5,500 cases on the device were changed, instead of invalidating them one by one we could just wipe the cache.
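A sketch of what such a queue could look like (a hypothetical class, not existing CommCare code): case saves mark records dirty in constant time, an async worker drains the backlog in batches, and a threshold decides when a full cache wipe is cheaper than row-by-row invalidation.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of an async invalidation queue for the entity cache.
public class EntityCacheInvalidationQueue {
    private final Set<Integer> pending = new LinkedHashSet<>();
    private final int wipeThreshold;

    public EntityCacheInvalidationQueue(int wipeThreshold) {
        this.wipeThreshold = wipeThreshold;
    }

    // Called synchronously from the case save path: constant time, no graph walk.
    public synchronized void markDirty(int caseRecordId) {
        pending.add(caseRecordId);
    }

    // Called by the async worker (or blocked on before showing a case list).
    // Draining in one batch lets pending cases share a single index walk
    // and one cache-update transaction.
    public synchronized Set<Integer> drain() {
        Set<Integer> batch = new LinkedHashSet<>(pending);
        pending.clear();
        return batch;
    }

    // Large backlogs (e.g. a bulk update touching most cases on the device)
    // are cheaper to handle by wiping the whole cache.
    public synchronized boolean shouldWipeEntireCache() {
        return pending.size() > wipeThreshold;
    }

    public synchronized int backlogSize() {
        return pending.size();
    }

    public static void main(String[] args) {
        EntityCacheInvalidationQueue queue = new EntityCacheInvalidationQueue(1000);
        queue.markDirty(1);
        queue.markDirty(2);
        queue.markDirty(2); // duplicate updates collapse into one pending entry
        System.out.println(queue.drain());
    }
}
```

Because the set deduplicates, repeated saves of the same case before a drain cost nothing extra, which addresses the "bunch of cases in a row" concern.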

shubham1g5 (Contributor, Author):

> "cases needing invalidation" to a queue that can process them async

@ctsims The alternative approach you mentioned here makes a lot of sense to me. One question: how do you feel about adding a new dirty field to the existing cache table as a manner of implementing this queue? We can mark the cache as dirty for the specific cases synchronously and then asynchronously re-validate these records along with their case relation graph?

ctsims (Member):

Ah, I think that something like that could be a really good approach! It reminds me a lot of something Emord was proposing a while ago about switching pillows from an event bus model to a stale flag processor. The fact that we're never going to have significant parallelization of the processing lends itself well (I think) to flagging this way.

The only tricky thing is that in this case the "Case -> Case subgraph" extraction process is the time-consuming step that should get onto the queue (i.e. determining which cases are adjacent to the one that needs to get updated). I don't think we could just put the flag onto the Cache tables, since it won't be clear at that point which of the rows to actually tag.

shubham1g5 (Contributor, Author):

Closing for #2955

@shubham1g5 shubham1g5 closed this Feb 18, 2025
@shubham1g5 shubham1g5 deleted the entityCacheExpiration branch February 18, 2025 15:32
@coderabbitai coderabbitai bot mentioned this pull request Mar 12, 2025