Correct matching against description and memo#2185
Correct matching against description and memo#2185code-gnucash-org merged 3 commits intoGnucash:stablefrom
Conversation
jralls
left a comment
There was a problem hiding this comment.
Please separate uncommenting the DEBUG statements and fixing the bug into separate commits.
And since you created a companion bug (no need to do that), please title the bug fix commit with the bug number and title$mdash;"[Bug 799745] - Import matcher doesn't handle zero length memo and description properly" so that the release note script categorizes it correctly.
| else if (strlen(match_memo) > 1 && (strncasecmp(memo, match_memo, | ||
| strlen(match_memo) / 2) == 0)) |
There was a problem hiding this comment.
strlen and strncasecmp need to be null-protected. I'd also go for a bit more than a more restrictive match if one of the strings is substantially longer than the other, maybe test 1/2 the length of whichever is longer.
There was a problem hiding this comment.
I was trying for the minimal set of changes to fix the issue. The change to the length of the match would be a larger change in behavior. I can make the larger change if you want though.
As far as the companion bug, I was just following the developer guidelines.
There was a problem hiding this comment.
Changes are in place.
There was a problem hiding this comment.
As far as the companion bug, I was just following the developer guidelines.
Where did you find that?
There was a problem hiding this comment.
I was reading https://wiki.gnucash.org/wiki/Git#Pull_Requests and saw the requirement for bug reports under patches and assumed it was for all changes. I've had other projects refuse PRs for not opening a bug when I see such language.
There was a problem hiding this comment.
strlenandstrncasecmpneed to be null-protected. I'd also go for a bit more than a more restrictive match if one of the strings is substantially longer than the other, maybe test 1/2 the length of whichever is longer.
@jralls I've updated the PR do use this algorithm. Let me know what you think.
| /* If it does, abort the process for this transaction, since | ||
| it is already in the system. */ | ||
| DEBUG("%s", "Transaction with same online ID exists, destroying current transaction"); | ||
| DEBUG("Transaction with same online ID exists, destroying current transaction: %s", qof_print_date(xaccTransGetDate(trans))); |
There was a problem hiding this comment.
This is a useful change, but will leak qof_print_date. Options: g_free later
char* date = qof_print_date(...);
DEBUG("...%s",date);
g_free(date);Alternatively just show the pointer DEBUG("...%p", trans);.
Or even better, also debug-log the online_id from within gnc_import_exists_online_id:
modified gnucash/import-export/import-backend.cpp
@@ -1052,6 +1052,10 @@ gboolean gnc_import_exists_online_id (Transaction *trans, GHashTable* acct_id_ha
}
auto online_id_exists = g_hash_table_contains (online_id_hash, source_online_id);
+
+ if (online_id_exists)
+ DEBUG ("transaction with online id %s already exists", source_online_id);
+
g_free (source_online_id);
return online_id_exists;
}There was a problem hiding this comment.
THank you, I had forgotten about freeing the pointers. 2 questions:
- any thoughts on using shared pointers to avoid issues with free?
- is it all right to have the fix as a separate commit or do you want me to rebase the changes into the previous commits?
There was a problem hiding this comment.
std::shared_ptr carries a fair amount of overhead for instance-counting so you only want to use it if you're actually sharing ownership of the pointer. std::unique_ptr has less overhead and is intended for automatic freeing of pointers where ownership isn't shared, so that's what you'd use here. But both of them use operator delete by default so if you use them on pointers that aren't allocated with operator new you must provide a custom deleter; that can be a simple lambda that calls free. g_free is just a null-protected wrapper around free.
I don't think that using std::unique_ptr here would get you anything. You can't be sure when using it inline like
DEBUG("Transaction with same online ID exists, destroying current transaction: %s", (std::unique_ptr<char*, (void(*)(char*)>(qof_print_date(xaccTransGetDate(trans)[](char* p){ g_free(p); })));
that the temporary will survive long enough for the printf buried in the DEBUG macro to use it before it's freed so you would still need to create a local variable. Plus you're creating a lot of extra work for yourself and the compiler. The problem for most of us is noticing that we need to free the rv from a function like qof_print_date and therefore can't use it as a temporary.
You should fix the first commit. You can force-push your branch and the PR will update automatically.
There was a problem hiding this comment.
Thank you for the explanation about shared and unique pointers. I don't see why there would be any issue with the compiler executing the free before the printf gets the pointer. I expect that would be a bug in the compiler.
That all said, I'm using g_free in my updated commits.
There was a problem hiding this comment.
This is a useful change, but will leak
qof_print_date. Options: g_free later
@christopherlam I've updated the PR with the freeing of the memory. I've also added the debug statement that you suggested in gnc_import_exists_online_id.
There was a problem hiding this comment.
I don't see why there would be any issue with the compiler executing the free before the printf gets the pointer. I expect that would be a bug in the compiler.
I got distracted by the deleter lambda and forgot to get() the char* from the unique_ptr in my example. Here's the corrected one:
DEBUG("Transaction with same online ID exists, destroying current transaction: %s", (std::unique_ptr<char*, (void(*)(char*)>(qof_print_date(xaccTransGetDate(trans)[](char* p){ g_free(p); }))))->get());
It's complicated and I think implementation dependent because the actual temporary is the char* from the get, though that might qualify as creating a reference that extends the lifetime of the unique_ptr. As it happens both gcc and clang call the deleter after the printf, see compiler explorer.
Enhance the debugging to make it easier to track why some transactions aren't considered possible matches. Uncomment debug statements so that users dont' need to build their own binary to debug.
Check that strings aren't null before using them. Factor out access to strings to avoid extra calls to the xacc* methods.
…cription properly When doing a fuzzy check on strings, check half the length of the longest string and only check if both strings are longer than 1 character.
|
Thanks! |
When the description or the memo is of length zero or one, doing a fuzzy match causes the the check to be against zero characters and this always comes back true. Because of this, skip the match when the string is too short.
Also enhance the debugging to make it easier to track why some transactions aren't considered possible matches.
This is a fix for https://bugs.gnucash.org/show_bug.cgi?id=799745