fix: allow read-only operations when all masters are down #497
a489998 to 427fc6a
427fc6a to c93121e
c93121e to 3dd8fbe
function utils.get_spaces(vshard_router, timeout, replica_id)
local replicasets, replicaset, replicaset_id, master
local function find_any_healthy_replica_conn(replicasets)
vshard checks connections like this: https://github.com/tarantool/vshard/blob/0fd36939/vshard/replicaset.lua#L341-L362
Used vshard's replica:is_connected: https://github.com/tarantool/vshard/blob/5a03168d11e717e52a89aca9b28c453e3f737be4/vshard/replicaset.lua#L1441
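For illustration, the health check discussed here could look roughly like the sketch below. It is not the code from this PR: the `replicas` and `conn` fields and the `mock_replica` helper are assumptions made for the example, and only the `replica:is_connected()` call is taken from the linked vshard source.

```lua
-- Sketch: return the connection of the first replica that reports
-- itself as connected, or nil if no replica in any replicaset does.
local function find_any_healthy_replica_conn(replicasets)
    for _, replicaset in pairs(replicasets) do
        for _, replica in pairs(replicaset.replicas or {}) do
            if replica:is_connected() then
                return replica.conn
            end
        end
    end
    return nil
end

-- Hypothetical mock standing in for a vshard replica object.
local function mock_replica(name, connected)
    return {
        conn = {name = name},
        is_connected = function(self) return connected end,
    }
end

local replicasets = {
    rs1 = {replicas = {
        mock_replica('rs1-master', false),
        mock_replica('rs1-replica', true),
    }},
}

local conn = find_any_healthy_replica_conn(replicasets)
print(conn and conn.name)
```

With the master mocked as disconnected, the helper falls through to the connected replica.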
end
local results, err = call.map(vshard_router, CRUD_LEN_FUNC_NAME, {space_name}, {
mode = 'write',
Otherwise len() will fail if masters are unavailable
> Otherwise `len()` will fail if masters are unavailable

Maybe that's OK? We need to know the exact number of tuples on the master. No masters -> no result.
You could open a ticket for a flag that lets the `len` method get the length from replicas when the master is absent.
Added the read-only options (`mode`, `request_timeout`, `prefer_replica`, `balance`) to `len` so that users can explicitly fetch the length from replicas if needed.
The default behavior is unchanged (`mode = 'write'`), so `len` will still fail without masters to guarantee exact data.
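A minimal sketch of the defaulting described above, assuming the options are simply copied from the caller's table with `mode` falling back to `'write'`. The `resolve_len_opts` helper is hypothetical and is not the function used in crud.

```lua
local DEFAULT_MODE = 'write'

-- Hypothetical helper: fill in the default mode and pass the
-- replica-related options through unchanged.
local function resolve_len_opts(opts)
    opts = opts or {}
    return {
        mode = opts.mode or DEFAULT_MODE,
        prefer_replica = opts.prefer_replica,
        balance = opts.balance,
        request_timeout = opts.request_timeout,
    }
end

-- No options: behaves as before and targets masters.
local default_opts = resolve_len_opts()

-- Explicit opt-in to reading the length from replicas.
local replica_opts = resolve_len_opts({
    mode = 'read',
    prefer_replica = true,
    request_timeout = 2,
})

print(default_opts.mode, replica_opts.mode)
```

On a real cluster the opt-in would presumably be written as `crud.len('customers', {mode = 'read', prefer_replica = true})`, where `customers` is a placeholder space name.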
function utils.get_space(space_name, vshard_router, timeout, replica_id)
local spaces, err, schema_version = utils.get_spaces(vshard_router, timeout, replica_id)
function utils.get_space(space_name, vshard_router, timeout, replica_id, read_only)
Maybe combine the `timeout`, `replica_id`, and `read_only` args into a single `opts` table argument?
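The suggested signature might look like the sketch below. This is a simplified stand-in, not the PR's code: `utils.get_spaces` is stubbed, the `fake_router` table and its `master_available` field are invented for the example, and the `schema_version` return value is omitted.

```lua
local utils = {}

-- Stub for the schema fetch: fails when no master is available
-- unless the caller passes opts.read_only.
function utils.get_spaces(vshard_router, opts)
    opts = opts or {}
    if not vshard_router.master_available and not opts.read_only then
        return nil, 'master is unavailable'
    end
    return vshard_router.spaces
end

-- timeout, replica_id and read_only travel in a single opts table,
-- so adding another option does not change the signature.
function utils.get_space(space_name, vshard_router, opts)
    local spaces, err = utils.get_spaces(vshard_router, opts)
    if spaces == nil then
        return nil, err
    end
    return spaces[space_name]
end

-- Hypothetical router with no master and one known space.
local fake_router = {
    master_available = false,
    spaces = {customers = {name = 'customers'}},
}

local space = utils.get_space('customers', fake_router, {timeout = 2, read_only = true})
local missing, err = utils.get_space('customers', fake_router)
print(space.name, missing, err)
```

The trade-off is that every existing positional call site has to be updated once, after which new options no longer change the signature.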
Read-only operations (`get`, `select`, `pairs`, `count`, `len`, `min`, `max`) used to fail with connection errors if all masters in the cluster were unavailable, even when healthy replicas were up and failover hadn't been processed yet.

To resolve this, the following improvements were made:

- Introduced a `read_only` flag to `utils.get_space[s]` to fetch cluster schema from any healthy replica if masters are down.
- Updated `get`, `select`, `pairs`, `count`, `len`, `min`, `max` to use this new flag.
- Rewrote `call.any` to iterate through all replicasets and utilize vshard's `callro` instead of `call` to fetch metadata from replicas.
- Added support for `mode`, `balance`, `prefer_replica`, and `request_timeout` options in `crud.len`. The default mode remains `write` to preserve backward compatibility.
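The `call.any` change can be pictured with the sketch below, which tries `callro` on each replicaset until one succeeds. It is an assumption-laden illustration, not the PR's implementation: the `call_any` name, the mock replicasets, and the `_crud.get_schema` function name are all hypothetical, and only the `replicaset:callro(func, args, opts)` call shape is borrowed from vshard.

```lua
-- Sketch: ask each replicaset in turn via callro and return the
-- first successful result, or the last error if all of them fail.
local function call_any(replicasets, func_name, func_args, opts)
    local last_err
    for _, replicaset in pairs(replicasets) do
        local res, err = replicaset:callro(func_name, func_args, opts)
        if err == nil then
            return res
        end
        last_err = err
    end
    return nil, last_err or 'no replicasets available'
end

-- Hypothetical mocks: one replicaset is unreachable, the other
-- answers from a replica.
local replicasets = {
    rs1 = {callro = function(self) return nil, 'connection refused' end},
    rs2 = {callro = function(self, func_name) return 'result of ' .. func_name end},
}

local res, err = call_any(replicasets, '_crud.get_schema', {}, {timeout = 1})
print(res, err)
```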
3dd8fbe to 6a1fada
Closes TNTP-7102