redis/tests/unit
Eduardo Semprebon 91d0c758e5
Replica keep serving data during repl-diskless-load=swapdb for better availability (#9323)
For diskless replication in swapdb mode, considering we already spend replica memory
having a backup of current db to restore in case of failure, we can have the following benefits
by instead swapping database only in case we succeeded in transferring db from master:

- Avoid `LOADING` response during failed and successful synchronization for cases where the
  replica is already up and running with data.
- Faster total time of diskless replication, because now we're moving from Transfer + Flush + Load
  time to Transfer + Load only. Flushing the tempDb is done asynchronously after swapping.
- This could be implemented also for disk replication with similar benefits if consumers are willing
  to spend the extra memory usage.

General notes:
- The concept of `backupDb` becomes `tempDb` for clarity.
- Async loading mode will only kick in if the replica is syncing from a master that has the same
  repl-id the one it had before. i.e. the data it's getting belongs to a different time of the same timeline. 
- New property in INFO: `async_loading` to differentiate from the blocking loading
- Slot to Key mapping is now a field of `redisDb` as it's more natural to access it from both server.db
  and the tempDb that is passed around.
- Because this is affecting replicas only, we assume that if they are not readonly and write commands
  during replication, they are lost after SYNC same way as before, but we're still denying CONFIG SET
  here anyways to avoid complications.

Considerations for review:
- We have many cases where server.loading flag is used and even though I tried my best, there may
  be cases where async_loading should be checked as well and cases where it shouldn't (would require
  very good understanding of whole code)
- Several places that had different behavior depending on the loading flag where actually meant to just
  handle commands coming from the AOF client differently than ones coming from real clients, changed
  to check CLIENT_ID_AOF instead.

**Additional for Release Notes**
- Bugfix - server.dirty was not incremented for any kind of diskless replication, as effect it wouldn't
  contribute on triggering next database SAVE
- New flag for RM_GetContextFlags module API: REDISMODULE_CTX_FLAGS_ASYNC_LOADING
- Deprecated RedisModuleEvent_ReplBackup. Starting from Redis 7.0, we don't fire this event.
  Instead, we have the new RedisModuleEvent_ReplAsyncLoad holding 3 sub-events: STARTED,
  ABORTED and COMPLETED.
- New module flag REDISMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD for RedisModule_SetModuleOptions
  to allow modules to declare they support the diskless replication with async loading (when absent, we fall
  back to disk-based loading).

Co-authored-by: Eduardo Semprebon <edus@saxobank.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2021-11-04 10:46:50 +02:00
..
moduleapi Replica keep serving data during repl-diskless-load=swapdb for better availability (#9323) 2021-11-04 10:46:50 +02:00
type Fixes LPOP/RPOP wrong replies when count is 0 (#9692) 2021-11-04 09:43:08 +02:00
acl.tcl Treat subcommands as commands (#9504) 2021-10-20 11:52:57 +03:00
aofrw.tcl Replace all usage of ziplist with listpack for t_zset (#9366) 2021-09-09 18:18:53 +03:00
auth.tcl Prevent unauthenticated client from easily consuming lots of memory (CVE-2021-32675) (#9588) 2021-10-04 12:10:31 +03:00
bitfield.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
bitops.tcl bitpos/bitcount add bit index (#9324) 2021-09-12 11:31:22 +03:00
client-eviction.tcl Client eviction ci issues (#9549) 2021-09-26 17:45:02 +03:00
cluster.tcl fix new cluster tests issues (#9657) 2021-10-20 15:40:28 +03:00
dump.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
expire.tcl Add NX/XX/GT/LT options to EXPIRE command group (#2795) 2021-08-02 08:57:49 +03:00
geo.tcl GEO* STORE with empty src key delete the dest key and return 0, not empty array (#9271) 2021-08-01 19:32:24 +03:00
hyperloglog.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
info.tcl Treat subcommands as commands (#9504) 2021-10-20 11:52:57 +03:00
introspection-2.tcl Fix COMMAND GETKEYS on EVAL without keys (#9733) 2021-11-03 14:38:26 +02:00
introspection.tcl Make Cluster-bus port configurable with new cluster-port config (#9389) 2021-10-18 22:28:27 -07:00
keyspace.tcl Replace all usage of ziplist with listpack for t_zset (#9366) 2021-09-09 18:18:53 +03:00
latency-monitor.tcl Treat subcommands as commands (#9504) 2021-10-20 11:52:57 +03:00
lazyfree.tcl attempt to fix tracking test issue with external tests due to lazy free (#9722) 2021-11-02 16:42:53 +02:00
limits.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
maxmemory.tcl Replication backlog and replicas use one global shared replication buffer (#9166) 2021-10-25 09:24:31 +03:00
memefficiency.tcl fix defrag test looking at the wrong latency metric (#9723) 2021-11-02 15:52:56 +02:00
multi.tcl Fail EXEC command in case a watched key is expired (#9194) 2021-07-11 13:17:23 +03:00
networking.tcl Pre-test bind-source-addr before running test. (#9214) 2021-07-11 09:54:07 +03:00
obuf-limits.tcl Better error handling for updateClientOutputBufferLimit. (#9308) 2021-08-29 15:03:05 +03:00
oom-score-adj.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
other.tcl Test fails when flushdb triggers a bgsave (#9535) 2021-10-06 11:50:47 +03:00
pause.tcl Fix wrong offset when replica pause (#9448) 2021-09-08 16:07:25 +08:00
pendingquerybuf.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
printver.tcl Print version info before running the test 2011-05-20 11:44:54 +02:00
protocol.tcl argv mem leak during multi command execution. (#9598) 2021-10-05 12:17:36 +03:00
pubsub.tcl Add test verifying PUBSUB NUMPAT behavior (#9209) 2021-09-03 15:52:39 -07:00
querybuf.tcl Ignore resize threshold on idle qbuf resizing (#9322) 2021-08-06 20:50:34 +03:00
quit.tcl Add tests for OK on QUIT 2010-10-15 12:54:53 +02:00
scan.tcl Replace all usage of ziplist with listpack for t_zset (#9366) 2021-09-09 18:18:53 +03:00
scripting.tcl fix valgrind issues with long double module test (#9709) 2021-11-01 13:41:35 +02:00
shutdown.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
slowlog.tcl slowlog get command supports passing in -1 to get all logs. (#9018) 2021-06-14 16:46:45 +03:00
sort.tcl Add SORT_RO command (#9299) 2021-08-09 09:40:29 +03:00
tls.tcl Add support for reading encrypted keyfiles. (#8644) 2021-03-22 13:27:46 +02:00
tracking.tcl Solve issues with tracking test in external mode (#9726) 2021-11-02 16:07:51 -07:00
violations.tcl Fix ziplist and listpack overflows and truncations (CVE-2021-32627, CVE-2021-32628) (#9589) 2021-10-04 12:11:02 +03:00
wait.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00