Expires entity cache for related cases #2944
Conversation
📝 Walkthrough

This pull request introduces several changes across multiple files in the CommCare Android project, focusing on enhancing case indexing, cache management, and testing capabilities. The modifications include adding the Mockito testing library to the project's dependencies and implementing new methods for case index retrieval and cache invalidation.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Parser as AndroidCaseXmlParser
    participant Cache as EntityCache
    participant IndexTable as AndroidCaseIndexTable
    participant Case as Case
    Parser->>Case: Commit case
    Parser->>Parser: clearEntityCache(case)
    Parser->>IndexTable: getCasesWithTarget(caseId)
    IndexTable-->>Parser: Related case IDs
    Parser->>Parser: Recursively clear related cases
    Parser->>Cache: invalidateCaches(recordIdsToWipe)
```

This sequence diagram illustrates the new cache invalidation process introduced in this change.
Actionable comments posted: 1
🧹 Nitpick comments (2)
app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java (2)
205-224: LGTM! The implementation is robust and consistent with existing patterns. The new `getCasesWithTarget` method is well-implemented with:
- Proper cursor handling and resource cleanup
- Debug logging support
- Consistent query pattern
However, consider these potential improvements:
- Add a cursor null check for defensive programming
- Consider adding an index on the `COL_INDEX_TARGET` column for query optimization

```diff
 @Override
 public LinkedHashSet<Integer> getCasesWithTarget(String targetValue) {
     String[] args = new String[]{targetValue};
     if (SqlStorage.STORAGE_OUTPUT_DEBUG) {
         String query = String.format("SELECT %s FROM %s WHERE %s = ?",
                 COL_CASE_RECORD_ID, TABLE_NAME, COL_INDEX_TARGET);
         DbUtil.explainSql(db, query, args);
     }
     Cursor c = db.query(TABLE_NAME, new String[]{COL_CASE_RECORD_ID},
             COL_INDEX_TARGET + " = ?", args, null, null, null);
+    if (c == null) {
+        return new LinkedHashSet<>();
+    }
     LinkedHashSet<Integer> ret = new LinkedHashSet<>();
     SqlStorage.fillIdWindow(c, COL_CASE_RECORD_ID, ret);
     return ret;
 }
```
Line range hint 52-53: Consider adding an index for the new query pattern. Since we now have a new query pattern that filters only on `COL_INDEX_TARGET`, consider adding a corresponding index in the `createIndexes` method.

```diff
 public static void createIndexes(SQLiteDatabase db) {
     String recordFirstIndexId = "RECORD_NAME_ID_TARGET";
     String recordFirstIndex = COL_CASE_RECORD_ID + ", " + COL_INDEX_NAME + ", " + COL_INDEX_TARGET;
     db.execSQL(DatabaseIndexingUtils.indexOnTableCommand(recordFirstIndexId, TABLE_NAME, recordFirstIndex));

     String typeFirstIndexId = "NAME_TARGET_RECORD";
     String typeFirstIndex = COL_INDEX_NAME + ", " + COL_CASE_RECORD_ID + ", " + COL_INDEX_TARGET;
     db.execSQL(DatabaseIndexingUtils.indexOnTableCommand(typeFirstIndexId, TABLE_NAME, typeFirstIndex));
+
+    // Index for getCasesWithTarget queries
+    String targetFirstIndexId = "TARGET_ONLY";
+    String targetFirstIndex = COL_INDEX_TARGET;
+    db.execSQL(DatabaseIndexingUtils.indexOnTableCommand(targetFirstIndexId, TABLE_NAME, targetFirstIndex));
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- app/build.gradle (1 hunks)
- app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java (1 hunks)
- app/src/org/commcare/xml/AndroidBulkCaseXmlParser.java (4 hunks)
- app/src/org/commcare/xml/AndroidCaseXmlParser.java (4 hunks)
- app/unit-tests/resources/commcare-apps/case_list_lookup/restore.xml (3 hunks)
- app/unit-tests/resources/commcare-apps/index_and_cache_test/incremental_restore.xml (1 hunks)
- app/unit-tests/src/org/commcare/android/tests/caselist/EntityListCacheIndexTest.java (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Lint Code Base
🔇 Additional comments (13)
app/unit-tests/src/org/commcare/android/tests/caselist/EntityListCacheIndexTest.java (3)
3-26: LGTM! Well-structured test setup. The imports are properly organized, including all necessary testing utilities for assertions, mocking, and parameterized testing.
40-49: Good use of parameterized testing. The test is well-designed to verify cache behavior under different bulk processing configurations.
59-84: Comprehensive test coverage for cache expiration. The test effectively verifies that:
- Cases are initially cached correctly
- Cache is properly invalidated for related cases after modifications
- Behavior is consistent across bulk processing configurations
app/unit-tests/resources/commcare-apps/index_and_cache_test/incremental_restore.xml (1)
1-18: Well-structured test data for cache invalidation. The XML correctly models a case update scenario with:
- Proper namespace declarations
- User group fixture for context
- Case update with all required fields
app/unit-tests/resources/commcare-apps/case_list_lookup/restore.xml (1)
17-19: Well-designed case relationship structure. The case relationships are properly modeled using index elements, creating a hierarchical structure that's ideal for testing cache invalidation propagation:
- pat (parent) <- stan (middle) <- ellen (child)
This structure allows thorough testing of cache invalidation across related cases.
Also applies to: 33-35
app/build.gradle (1)
45-45: LGTM! Appropriate test dependency. The Mockito dependency is:
- Correctly scoped to `testImplementation`
- Using a recent version (5.5.0)
- Required for the static mocking in EntityListCacheIndexTest
app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java (1)
205-224: Verify database migration and performance impact. Since we're adding a new query pattern and potentially a new index:
- Ensure that database migration scripts are updated if needed
- Verify the performance impact of the new index on write operations
✅ Verification successful
No migration or performance concerns identified.
The `getCasesWithTarget` method uses an existing table structure with established migration paths and proper indexing. The implementation follows the codebase's standard query patterns and includes built-in performance monitoring through `SqlStorage.STORAGE_OUTPUT_DEBUG`.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Check for existing migration scripts and measure query performance
echo "Checking for migration scripts..."
fd -e java -e sql migration

echo "Checking for existing usages of getCasesWithTarget..."
rg "getCasesWithTarget" -A 5
```

Length of output: 1979

Script:

```shell
#!/bin/bash
# Check git history for this method and table schema
echo "Checking git history for getCasesWithTarget..."
git log -L :getCasesWithTarget:app/src/org/commcare/models/database/user/models/AndroidCaseIndexTable.java

echo "Checking for table creation and schema..."
rg "CREATE TABLE.*${TABLE_NAME}" --type java
rg "TABLE_NAME.*=.*case_index" --type java -A 2
```

Length of output: 1871

Script:

```shell
#!/bin/bash
# Check table schema details
echo "Checking table schema..."
rg "CREATE TABLE.*case_index_storage" -A 10 --type java

echo "Checking for index definitions..."
rg "CREATE.*INDEX.*case_index" --type java
```

Length of output: 238

Script:

```shell
#!/bin/bash
# Get full implementation details
echo "Checking full table implementation..."
ast-grep --pattern 'class AndroidCaseIndexTable { $$$ }'

echo "Checking for database upgrade patterns..."
rg "onUpgrade|upgradeTable|upgrade.*Database" --type java
```

Length of output: 9312
app/src/org/commcare/xml/AndroidBulkCaseXmlParser.java (3)
3-3: LGTM! Good code reuse through static import. The static import of `clearEntityCacheHelper` promotes code reuse between the parser classes.
Also applies to: 20-20
80-90: LGTM! Improved cache invalidation with case tracking. The changes maintain transactional integrity while improving cache invalidation by tracking the actual cases being processed.
104-112: Consider monitoring performance impact of recursive cache clearing. While the implementation is correct, recursive cache clearing through related cases could impact performance for large case hierarchies.
Consider adding performance monitoring to track:
- Depth of case relationships being traversed
- Time taken for cache invalidation
- Number of cases processed in each bulk operation
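The monitoring suggested above could be collected with a small metrics holder threaded through the invalidation walk. This is a hypothetical sketch, not part of the CommCare codebase; all names here (`InvalidationMetrics`, `recordVisit`) are illustrative.

```java
import java.util.concurrent.TimeUnit;

public class InvalidationMetrics {
    // Hypothetical sketch of the suggested monitoring: track how deep a
    // relationship walk goes, how many cases a bulk operation touches,
    // and how long the invalidation pass takes overall.
    int maxDepth = 0;
    int casesProcessed = 0;
    long startNanos;

    void begin() {
        startNanos = System.nanoTime();
    }

    // Called once per case visited, with the current recursion depth
    void recordVisit(int depth) {
        casesProcessed++;
        if (depth > maxDepth) {
            maxDepth = depth;
        }
    }

    long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }
}
```

Logging these three numbers per bulk operation would give the data needed to decide whether the recursive approach scales to dense case graphs.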
app/src/org/commcare/xml/AndroidCaseXmlParser.java (3)
12-12: LGTM! Required imports added for new functionality. The new imports support the cache invalidation logic and case relationship handling.
Also applies to: 32-34
103-103: LGTM! Consistent cache invalidation approach. The change aligns with the bulk parser implementation while maintaining transactional integrity.
112-120: LGTM! Robust null handling and clean encapsulation. The implementation properly handles edge cases and encapsulates the cache invalidation logic.
```java
public static void clearEntityCacheHelper(Case aCase, Set<Integer> visited, AndroidCaseIndexTable caseIndexTable,
        IStorageUtilityIndexed<Case> storage) {
    int caseRecordId = aCase.getID();

    // Prevent infinite recursion by protecting against cyclic relationships
    if (visited.contains(caseRecordId)) {
        return;
    }
    visited.add(caseRecordId);

    // Recursively clear cache for related cases
    if (aCase.getIndices() != null) {
        for (CaseIndex ci : aCase.getIndices()) {
            Case relatedCase = storage.getRecordForValue(Case.INDEX_CASE_ID, ci.getTarget());
            if (relatedCase != null) {
                clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
            }
        }
    }

    // We also need to clear the cache of cases that hold an index to our case
    LinkedHashSet<Integer> relatedCases = caseIndexTable.getCasesWithTarget(aCase.getCaseId());
    for (Integer relatedCaseRecordId : relatedCases) {
        Case relatedCase = storage.read(relatedCaseRecordId);
        if (relatedCase != null) {
            clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
        }
    }
}
```
🛠️ Refactor suggestion
Add error handling for database operations.
While the implementation thoroughly handles case relationships and prevents cycles, it should handle potential database errors when:
- Retrieving related cases through indices
- Reading cases by record ID
Consider wrapping database operations in try-catch blocks:
```diff
 public static void clearEntityCacheHelper(Case aCase, Set<Integer> visited, AndroidCaseIndexTable caseIndexTable,
         IStorageUtilityIndexed<Case> storage) {
     int caseRecordId = aCase.getID();
     if (visited.contains(caseRecordId)) {
         return;
     }
     visited.add(caseRecordId);
     if (aCase.getIndices() != null) {
         for (CaseIndex ci : aCase.getIndices()) {
+            try {
                 Case relatedCase = storage.getRecordForValue(Case.INDEX_CASE_ID, ci.getTarget());
                 if (relatedCase != null) {
                     clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
                 }
+            } catch (Exception e) {
+                Logger.log(LogTypes.TYPE_ERROR_STORAGE,
+                        "Error retrieving related case: " + ci.getTarget() + " Error: " + e.getMessage());
+            }
         }
     }
     LinkedHashSet<Integer> relatedCases = caseIndexTable.getCasesWithTarget(aCase.getCaseId());
     for (Integer relatedCaseRecordId : relatedCases) {
+        try {
             Case relatedCase = storage.read(relatedCaseRecordId);
             if (relatedCase != null) {
                 clearEntityCacheHelper(relatedCase, visited, caseIndexTable, storage);
             }
+        } catch (Exception e) {
+            Logger.log(LogTypes.TYPE_ERROR_STORAGE,
+                    "Error reading case: " + relatedCaseRecordId + " Error: " + e.getMessage());
+        }
     }
 }
```
Committable suggestion skipped: line range outside the PR's diff.
Added a couple of initial thoughts, but I think my feedback here will need to be based on the overall direction of approach for the feature, IE: what level of confidence are you targeting from the spec?
```java
        mEntityCache.invalidateCaches(recordsToWipe);
    }

    public static void clearEntityCacheHelper(Case aCase, Set<Integer> visited, AndroidCaseIndexTable caseIndexTable,
```
Instead of being recursive, I would update this methodology to use the existing case index map functionality, similar to how we generate the DAG for the case purge algorithm, and then just run the purge in a loop with the list of cases.
That would let us keep logic (and especially optimizations) for doing these kinds of case tree walks in one place, and keep the same guarantees about the shapes of the graphs.
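The non-recursive shape of this suggestion could look roughly like the following. This is a hypothetical sketch over a toy adjacency map, not CommCare's case index map API: `collectRelated` and the `edges` structure are stand-ins for the real index-map/DAG utilities the reviewer references.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CacheInvalidationWalk {
    // Hypothetical sketch: collect every case reachable from a starting case
    // through index edges (stored in both directions) with an explicit
    // worklist instead of recursion. The visited set doubles as cycle
    // protection, matching the guarantee of the recursive helper.
    static Set<String> collectRelated(String start, Map<String, List<String>> edges) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.push(start);
        while (!frontier.isEmpty()) {
            String current = frontier.pop();
            if (!visited.add(current)) {
                continue; // already handled; guards against cyclic relationships
            }
            for (String neighbor : edges.getOrDefault(current, Collections.emptyList())) {
                frontier.push(neighbor);
            }
        }
        return visited;
    }
}
```

Once the full related set is accumulated in one walk, the cache invalidation can be submitted as a single batch, which keeps the graph-walking logic (and its optimizations) in one place as suggested.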
```java
        if (mEntityCache != null) {
            Set<Integer> recordIdsToWipe = new HashSet<>();
            for (int i = 0; i < casesToWipe.size(); i++) {
                clearEntityCacheHelper(casesToWipe.get(i), recordIdsToWipe, mCaseIndexTable, storage);
```
The performance characteristics of these methods need a lot more consideration, I think.
It gets really problematic for CommCare if saving a case takes a long time, since there are lots of contexts where we might update a bunch of cases in a row, and this code introduces a lot of potential reads (and writes) that could end up turning what currently takes milliseconds into something that snowballs. There are definitely contexts where it's probably faster to wipe the entire cache than it is to update it record by record.
There's some chance I'm overthinking it, but having worked a lot with case subgraphs in the purge code, it is really expensive to do even one walk over the full case graph. If we were going to make a change like this, I'd want to see how it affects performance under high case loads in both sparse (1 parent, 1 child) cases and dense cases (for instance, 1 person case with 100 visit cases).
As an alternative, I wonder if instead of updating the invalidation synchronously, we instead added the "cases needing invalidation" to a queue that can process them async, and block on the queue clearing before showing a case list, which would be no delay at all in the fast cases.
This has two other nice advantages:
1 - With the queue of invalidations you could process them in chunks. If, for instance, there were 5 updates pending, they could share an index walk to accumulate related cases, and then submit all of the cache updates in one transaction.
2 - Seeing the spool size would let us be smarter about large backlogs. IE: if there was a bulk update and 5,000 of the 5,500 cases on the device were changed, instead of invalidating them one by one we could just wipe the cache.
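The queue model described above could be sketched as follows. This is a hypothetical illustration, not an implementation proposal for the actual codebase: `InvalidationQueue`, `markDirty`, `drain`, and `WIPE_THRESHOLD` are all invented names, and the shared index walk is stubbed out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class InvalidationQueue {
    // Hypothetical sketch of the proposed async model: case record ids
    // needing invalidation are enqueued cheaply at save time, and a worker
    // drains the queue in chunks so several pending updates can share one
    // index walk and one cache transaction.
    private final BlockingQueue<Integer> pending = new LinkedBlockingQueue<>();
    private static final int WIPE_THRESHOLD = 1000; // assumed cutoff, not from the source

    // Synchronous and effectively free; called from the case save path
    public void markDirty(int recordId) {
        pending.add(recordId);
    }

    // Async pass: drain everything queued. Returns the record ids to
    // invalidate, or null to signal that the backlog is so large that
    // wiping the entire cache is the cheaper option.
    public Set<Integer> drain() {
        List<Integer> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (batch.size() > WIPE_THRESHOLD) {
            return null; // wipe the whole cache instead of record-by-record work
        }
        // In a real version, one shared index walk over the batch would
        // expand this set to include related cases before returning.
        return new LinkedHashSet<>(batch);
    }

    public int backlogSize() {
        return pending.size();
    }
}
```

A case-list screen would then block on `drain()` completing before rendering, which costs nothing in the common fast path, while the `null` return covers the bulk-update scenario in point 2.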
"cases needing invalidation" to a queue that can process them async

@ctsims The alternative approach you mentioned here makes a lot of sense to me. One question: how do you feel about adding a new `dirty` field to the existing Cache table as a manner of implementing this queue? We could mark the cache as dirty for the specific cases synchronously and then asynchronously re-validate these records along with their case relation graph.
Ah, I think that something like that could be a really good approach! It reminds me a lot of something Emord was proposing a while ago about switching pillows from an event bus model to a stale flag processor. The fact that we're never going to have significant parallelization of the processing lends itself well (I think) to flagging this way.
Only tricky thing is that in this case I think the time-consuming step that should go onto the queue is the "Case -> Case subgraph" extraction process (IE: determining which cases are adjacent to the one that needs to get updated). I don't think we could just put the flag onto the Cache tables, since at that point it won't be clear which of the rows to actually tag.
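The stale-flag model being discussed could be sketched roughly as below. This is a hypothetical illustration only: `StaleFlagCache` and its methods are invented names, the flags live in an in-memory map rather than a real cache table column, and the expensive subgraph extraction noted above is left as a stub.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class StaleFlagCache {
    // Hypothetical sketch of the "dirty flag" queue: saves mark affected
    // case ids dirty synchronously, and a later async pass picks up the
    // flagged ids, would run the (expensive) subgraph extraction for them,
    // and clears the flags once re-validated.
    private final Map<String, Boolean> dirty = new LinkedHashMap<>();

    // Cheap, synchronous step performed at case save time
    public void markDirty(String caseId) {
        dirty.put(caseId, Boolean.TRUE);
    }

    // Async pass: collect flagged ids, reset their flags, and return the
    // set that needs re-validation (subgraph expansion omitted here).
    public Set<String> revalidate() {
        Set<String> needsWork = new LinkedHashSet<>();
        for (Map.Entry<String, Boolean> e : dirty.entrySet()) {
            if (Boolean.TRUE.equals(e.getValue())) {
                needsWork.add(e.getKey());
                e.setValue(Boolean.FALSE);
            }
        }
        return needsWork;
    }
}
```

Because processing is never significantly parallel (as noted above), a single flag-and-sweep pass like this avoids the coordination problems an event-bus model would bring.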
Closing for #2955
Summary
Spec
Expires the cache for all related cases in order to support caching XPath calculation results for related cases like
`instance('casedb')/casedb/case[@case_type='mother'][@case_id=current()/index/parent]/case_name`.
Without expiring related cases from the cache, these kinds of expressions may surface obsolete data to the user.

Feature Flag
CACHE_AND_INDEX
PR Checklist
Automated test coverage
PR adds a test to demonstrate desired cache expiration.
Safety story
To remove the cache for related cases, we do a lookup in the case index table every time a case gets changed, which may have a performance impact. We currently index the case index table only on the case_id and index_name columns, but not on the target case, and I am wondering whether we should add an index on the target case to make these lookups much faster?
cross-request: dimagi/commcare-core#1456