-
Notifications
You must be signed in to change notification settings - Fork 13
Wipe refactoring + Toc/Remote Wipe #158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…k-md and --check-full.
Set default for WipeElementType type_ Co-authored-by: Metin Cakircali <[email protected]>
…s. Abort list polling if too many retries.
simondsmart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added far too many comments to this PR. And I think that it could get lost in the weeds. My overall comments:
- The breakdown of Catalogue/store functionality looks good.
- Clearly we need to have a mutable Catalogue object through this process. This probably means creating a base EntryVisitor class with a non-const, private catalogue_entry, accessible to two derived classes MutableEntryVisitor and ImmutableEntryVisitor (or similar) which access a protected catalogue() which returns a reference differing in constness.
- This wipe runs fully on the catalogue, dispatching to the stores. But this does not work in all cases. Consider a local disk-based store, or OpenFAM store - the remote catalogue has neither access, nor filesystem permissions to make modifications to the store. As such, control here needs to be routed through the client process. This may require the catalogue to sign change requests to the store (or at least, to have a placeholder for doing this now).
- I'm not sure that WipeElements is the correct structure to be storing for state to be communicated around this process. That is for communicating back to clients through the API - and it feels a bit forced in places.
- I can see why we had a second-level visitor pattern, as there is state which needs to be held. Pushing this all to be persistent state in the Catalogue object is extremely ugly, and makes our thread-nastiness even worse. I think that we need, fundamentally, some type of class-specific context object. Probably a friend object. Which stores that persistent state through the wipe process. Should be easy to extract.
| #ifndef fdb5_ListIterator_H | ||
| #define fdb5_ListIterator_H | ||
|
|
||
| #include <unordered_map> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I'm very surprised to see this being added to this header when the list functionality should be essentially untouched here...
| } | ||
|
|
||
| size_t encodeSizeString(const std::string& s) { | ||
| return (1 + 4 + s.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the 1 and the 4 for?
| size_t WipeElement::encodeSize() const { | ||
| size_t aux = 0; | ||
| for (const auto& l : uris_) { | ||
| aux += 256; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
256? wot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely this should add the length of the URI? Or something dependent on that.
|
|
||
| void WipeElement::print(std::ostream& out) const { | ||
|
|
||
| LOG_DEBUG_LIB(LibFdb5) << "Wipe(type=" << type_ << ",msg=" << msg_ << ",uris=["; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof. This is unpleasant inside ::print(), as this method is typically called in the middle of ostream output. Which results in info/error/debug output becoming interleaved on the terminal. A but unpleasant.
|
|
||
| // If the catalogue cannot be wiped, then we don't proceed. | ||
| // This is a safeguard against wiping a catalogue that is not in a state to be wiped. | ||
| return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entry visitors can recurse all the way down to elements. If you return false from visitDatabase it says to the visit mechanism not to visit the indexes or the entries explicitly. i.e. stop here.
| void deselectIndex() override { NOTIMP; } | ||
|
|
||
| std::vector<eckit::PathName> metadataPaths() const override { NOTIMP; } | ||
| // std::vector<eckit::PathName> metadataPaths() const override { NOTIMP; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not needed, delete.
| const metkit::mars::MarsRequest& request, std::ostream& out, bool doit, bool porcelain, | ||
| bool unsafeWipeAll) : | ||
| WipeVisitor(request, out, doit, porcelain, unsafeWipeAll), catalogue_(catalogue), store_(store), dbKvName_("") {} | ||
| DaosWipeVisitor::DaosWipeVisitor(const DaosCatalogue& catalogue, const metkit::mars::MarsRequest& request, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This worries me.
Most of the other WipeVisitors have been removed from the code. I think correctly.
But here, we seem to have just commented most of the functionality out. So I suspect that this is just broken.
| return true; | ||
| } | ||
| bool DaosStore::doWipe(const std::vector<eckit::URI>& unknownURIs) const { | ||
| return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doWipe is really a NOP?
|
|
||
| eckit::URI locationURI{index.location().uri()}; | ||
|
|
||
| if (!locationURI.path().dirName().sameAs(basePath())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
| return forwardApiCall(StatsHelper(), request); | ||
| } | ||
|
|
||
| WipeIterator RemoteFDB::wipe(const FDBToolRequest& request, bool doit, bool porcelain, bool unsafeWipeAll) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of this is that it implies that wipe runs fully on the Catalogue (none of the communication is via the client).
What happens in the case that we have a remote Catalogue, but local stores (e.g. stores on Lustre). The Catalogue may not have the right filesystem permissions to wipe data which has been written by the user.
I'm not sure that this design is functionally correct.
Description
The aim is to unify the WipeVisitor and expose some catalogue and store methods responsible for the backend-specific checks and removal actions.
Contributor Declaration
By opening this pull request, I affirm the following:
🌈🌦️📖🚧 Documentation 🚧📖🌦️🌈
https://sites.ecmwf.int/docs/dev-section/fdb/pull-requests/PR-158