-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYS-6179] Add MultiReadAsync file API. #290
Conversation
86c1e00
to
3e0c7de
Compare
3e0c7de
to
89ef005
Compare
89ef005
to
0e5cdb9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm accepting this PR, but at the same time want to discuss an alternative. This PR reaches fairly deep into RocksDB and will introduce merge conflicts as we try to keep up with upstream. What do you think about leaving RocksDB as is and introducing request batching at our layer?
I'm thinking ReadAsync
implementation on our end could wait for a very short period of time, and then batch up any requests that come along in that wait period. That way we might be able to even batch requests from multiple threads if we get lucky. What do you think?
0e5cdb9
to
114bfa1
Compare
I am not a fan of splitting multi part entities and then trying to reassemble them later, this always seem to cause some kind of tricky issue on top of being inefficient. However, this is just my personal bias -- TAO did something similar in recent years with DB missies and was quite successful at it. Anyways, for my taste I would go with what we have here and suffer the merge conflicts if they show up, however I don't have any experience of actually suffering through them, so LMK if you feel I am misguided here. |
Sounds good, let's ship this. |
2351d5d
to
79e24b3
Compare
1b9a4fa
to
6792b57
Compare
1d6d20f
to
6700da7
Compare
f9284f3
to
a1b8d6e
Compare
a1b8d6e
to
d1af15c
Compare
d1af15c
to
5bddc59
Compare
db/version_set.cc
Outdated
@@ -5683,7 +5683,7 @@ Status VersionSet::LogAndApplyHelper(ColumnFamilyData* cfd, | |||
} | |||
|
|||
if (edit->HasReplicationSequence()) { | |||
replication_sequence_ = edit->GetReplicationSequence(); | |||
replication_sequence_ = edit->GetReplicationSequence(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might conflict with @seckcoder 's changes, and since it's only a formatting change, can we revert it pls?
file/random_access_file_reader.cc
Outdated
std::function<void(const FSReadRequest*, size_t, void*)> cb_; | ||
void* cb_arg_; | ||
uint64_t start_time_; | ||
#ifndef ROCKSDB_LITE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ROCKSDB_LITE is removed in 8.0, which I plan to upgrade to very soon, so let's remove the guard here.
file/random_access_file_reader.cc
Outdated
} else if (!req.status.IsAborted()) { | ||
RecordTick(stats_, ASYNC_READ_ERROR_COUNT, 1); | ||
} | ||
#ifndef ROCKSDB_LITE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above
// Returns CloudManifest file name for a given db. | ||
virtual std::string CloudManifestFile(const std::string& dbname) = 0; | ||
|
||
virtual CloudManifest* GetCloudManifest() = 0; | ||
|
||
// TODO(wei): this function is used to temporarily support open db and switch | ||
// cookie. Change it to use the cookie from options once migration complete. | ||
virtual IOStatus CreateCloudManifest(const std::string& local_dbname, | ||
const std::string& cookie) = 0; | ||
|
||
virtual IOStatus LoadCloudManifest(const std::string& local_dbname, | ||
bool read_only) = 0; | ||
|
||
virtual IOStatus UploadCloudManifest(const std::string& local_dbname, | ||
const std::string& cookie) const = 0; | ||
|
||
// Prepare a local directory for use as a clone of the cloud storage | ||
virtual IOStatus SanitizeLocalDirectory(const DBOptions& options, | ||
const std::string& local_name, | ||
bool read_only) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these changes unrelated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to cleaning up the APIs so that we can implement DisaggFileSystem in Rockset.
include/rocksdb/file_system.h
Outdated
virtual IOStatus Poll(std::vector<void*>& io_handles, | ||
size_t /*min_completions*/) { | ||
// RocksDB-Cloud contribution begin | ||
if (!io_handles.empty()) { | ||
// Default implementations on async operations do not return any | ||
// io_handles. If you override them, you need to override Poll as well. | ||
return IOStatus::NotSupported("Poll"); | ||
} | ||
// RocksDB-Cloud contribution end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just for added safety? I prefer less safety if that means less merge conflicts in the future :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okayyy.
include/rocksdb/file_system.h
Outdated
// | ||
// When the read request is completed, callback function specified in cb | ||
// should be called with arguments cb_arg and the result populated in | ||
// FSReadRequest with result and status fileds updated by FileSystem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/fileds/fields
Extending the file API to provide an option to implement asynchronous multi read operation. The default implementation delegates to AsyncRead. Updated the AsyncFileReader class implementation to use MultiReadAsync.
Expanidng the CloudFileSystem API to include methods for handling CloudManifest. This allows us to better hide CloudFileSystemImpl and allow different implementations of CloudFileSystem.
5bddc59
to
2d5d260
Compare
Extending the file API to provide an option to implement asynchronous
multi read operation. The default implementation delegates to AsyncRead.
Updated the AsyncFileReader class implementation to use MultiReadAsync.