Ignore shardId updates from replica nodes (#13877)

Close https://github.com/redis/redis/issues/13868

This bug was introduced by https://github.com/redis/redis/pull/13468

## Issue
To stay compatible with older versions that do not support shardid,
when a replica reports a shardid we also update its master’s shardid
to match.

However, this causes a problem when both the master and the replica
support shardid. When the master reports a shardid, we update the master
and all its replicas to that shardid. If a replica then reports a
different shardid, we update the master’s shardid again, so the shardid
keeps flipping back and forth.
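
The flip-flop described above can be reproduced with a small stand-alone model. This is only a sketch of the pre-fix rule, not the Redis code: `toy_node`, `NAMELEN`, `old_update_shard_id` and `old_flap_demo` are hypothetical names, the 8-byte ids are made up, and config saving is reduced to counting writes.

```c
#include <assert.h>
#include <string.h>

#define NAMELEN 8        /* hypothetical stand-in for CLUSTER_NAMELEN */
#define MAX_REPLICAS 4   /* hypothetical fixed capacity for the toy model */

typedef struct toy_node {
    char shard_id[NAMELEN];
    struct toy_node *master;                 /* NULL when the node is a master */
    struct toy_node *replicas[MAX_REPLICAS];
    int num_replicas;
} toy_node;

/* Pre-fix rule (simplified): accept whatever id the sender reports, then
 * force the other side of the shard to match it.
 * Returns the number of shard-id writes performed. */
static int old_update_shard_id(toy_node *n, const char *id) {
    assert(n != NULL && id != NULL);
    if (memcmp(n->shard_id, id, NAMELEN) == 0) return 0;

    int writes = 0;
    memcpy(n->shard_id, id, NAMELEN);
    writes++;
    if (n->master == NULL) {
        /* Sender is a master: push the id down to its replicas. */
        for (int i = 0; i < n->num_replicas; i++) {
            toy_node *r = n->replicas[i];
            if (memcmp(r->shard_id, id, NAMELEN) != 0) {
                memcpy(r->shard_id, id, NAMELEN);
                writes++;
            }
        }
    } else if (memcmp(n->master->shard_id, id, NAMELEN) != 0) {
        /* Sender is a replica: overwrite the master's id as well. */
        memcpy(n->master->shard_id, id, NAMELEN);
        writes++;
    }
    return writes;
}

/* Alternate two conflicting reports and count the resulting writes. */
static int old_flap_demo(int rounds) {
    toy_node m = {0}, r = {0};
    memcpy(m.shard_id, "AAAAAAAA", NAMELEN);
    memcpy(r.shard_id, "AAAAAAAA", NAMELEN);
    r.master = &m;
    m.replicas[0] = &r;
    m.num_replicas = 1;

    int total = 0;
    for (int i = 0; i < rounds; i++) {
        total += old_update_shard_id(&r, "BBBBBBBB"); /* replica's report */
        total += old_update_shard_id(&m, "AAAAAAAA"); /* master's report */
    }
    return total;
}
```

In this model every round rewrites both nodes twice, so the write count grows with the number of rounds instead of settling.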

## Solution
In every case, we now keep the replica’s shardid consistent with its
master’s shardid: a shardid reported by a replica that differs from its
master’s is ignored.
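
A minimal sketch of the new rule, under the same simplifications as the commit's logic but outside Redis: `toy_node`, `NAMELEN`, `new_update_shard_id` and the two demo helpers are hypothetical names, and persisting the config is reduced to counting writes.

```c
#include <assert.h>
#include <string.h>

#define NAMELEN 8        /* hypothetical stand-in for CLUSTER_NAMELEN */
#define MAX_REPLICAS 4   /* hypothetical fixed capacity for the toy model */

typedef struct toy_node {
    char shard_id[NAMELEN];
    struct toy_node *master;                 /* NULL when the node is a master */
    struct toy_node *replicas[MAX_REPLICAS];
    int num_replicas;
} toy_node;

/* Post-fix rule (simplified): the master's id always wins.
 * Returns the number of shard-id writes performed. */
static int new_update_shard_id(toy_node *n, const char *id) {
    assert(n != NULL && id != NULL);
    if (memcmp(n->shard_id, id, NAMELEN) == 0) return 0;

    int writes = 0;
    if (n->master == NULL) {
        /* Master update: apply it and propagate to every replica. */
        memcpy(n->shard_id, id, NAMELEN);
        writes++;
        for (int i = 0; i < n->num_replicas; i++) {
            toy_node *r = n->replicas[i];
            if (memcmp(r->shard_id, id, NAMELEN) != 0) {
                memcpy(r->shard_id, id, NAMELEN);
                writes++;
            }
        }
    } else if (memcmp(n->master->shard_id, id, NAMELEN) == 0) {
        /* Replica update: accepted only when it agrees with the master. */
        memcpy(n->shard_id, id, NAMELEN);
        writes++;
    }
    /* Otherwise the replica's conflicting id is discarded. */
    return writes;
}

/* Alternate two conflicting reports; with the new rule nothing changes. */
static int new_flap_demo(int rounds) {
    toy_node m = {0}, r = {0};
    memcpy(m.shard_id, "AAAAAAAA", NAMELEN);
    memcpy(r.shard_id, "AAAAAAAA", NAMELEN);
    r.master = &m;
    m.replicas[0] = &r;
    m.num_replicas = 1;

    int total = 0;
    for (int i = 0; i < rounds; i++) {
        total += new_update_shard_id(&r, "BBBBBBBB"); /* ignored */
        total += new_update_shard_id(&m, "AAAAAAAA"); /* already equal */
    }
    return total;
}

/* A master update still reaches its replica. Returns 1 if both match. */
static int new_master_update_demo(void) {
    toy_node m = {0}, r = {0};
    memcpy(m.shard_id, "AAAAAAAA", NAMELEN);
    memcpy(r.shard_id, "AAAAAAAA", NAMELEN);
    r.master = &m;
    m.replicas[0] = &r;
    m.num_replicas = 1;

    new_update_shard_id(&m, "CCCCCCCC");
    return memcmp(m.shard_id, "CCCCCCCC", NAMELEN) == 0 &&
           memcmp(r.shard_id, "CCCCCCCC", NAMELEN) == 0;
}
```

Because the replica branch never writes to the master, conflicting reports can no longer trigger a chain of updates; only a report from the master itself changes the shard's id.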
Jason 2025-03-30 03:15:04 -04:00 committed by GitHub
parent 057f039c4b
commit aa8e2d1712
1 changed files with 11 additions and 17 deletions

@@ -911,30 +911,24 @@ static void assignShardIdToNode(clusterNode *node, const char *shard_id, int fla
 static void updateShardId(clusterNode *node, const char *shard_id) {
     if (shard_id && memcmp(node->shard_id, shard_id, CLUSTER_NAMELEN) != 0) {
-        assignShardIdToNode(node, shard_id, CLUSTER_TODO_SAVE_CONFIG);
-
-        /* If the replica or master does not support shard-id (old version),
-         * we still need to make our best effort to keep their shard-id consistent.
+        /* We always make our best effort to keep the shard-id consistent
+         * between the master and its replicas:
          *
-         * 1. Master supports but the replica does not.
-         *    We might first update the replica's shard-id to the master's randomly
-         *    generated shard-id. Then, when the master's shard-id arrives, we must
-         *    also update all its replicas.
-         * 2. If the master does not support but the replica does.
-         *    We also need to synchronize the master's shard-id with the replica.
-         * 3. If neither of master and replica supports it.
-         *    The master will have a randomly generated shard-id and will update
-         *    the replica to match the master's shard-id. */
+         * 1. When updating the master's shard-id, we simultaneously update the
+         *    shard-id of all its replicas to ensure consistency.
+         * 2. When updating replica's shard-id, if it differs from its master's shard-id,
+         *    we discard this replica's shard-id and continue using master's shard-id.
+         *    This applies even if the master does not support shard-id, in which
+         *    case we rely on the master's randomly generated shard-id. */
         if (node->slaveof == NULL) {
+            assignShardIdToNode(node, shard_id, CLUSTER_TODO_SAVE_CONFIG);
             for (int i = 0; i < clusterNodeNumSlaves(node); i++) {
                 clusterNode *slavenode = clusterNodeGetSlave(node, i);
                 if (memcmp(slavenode->shard_id, shard_id, CLUSTER_NAMELEN) != 0)
                     assignShardIdToNode(slavenode, shard_id, CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_FSYNC_CONFIG);
             }
-        } else {
-            clusterNode *masternode = node->slaveof;
-            if (memcmp(masternode->shard_id, shard_id, CLUSTER_NAMELEN) != 0)
-                assignShardIdToNode(masternode, shard_id, CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_FSYNC_CONFIG);
+        } else if (memcmp(node->slaveof->shard_id, shard_id, CLUSTER_NAMELEN) == 0) {
+            assignShardIdToNode(node, shard_id, CLUSTER_TODO_SAVE_CONFIG);
         }
     }
 }