redis/tests/integration
Oran Agra 0ecf6cdc0a
fix handshake timeout replication test race (#11640)
Test on ARM + TLS often fail with this error:
```
*** [err]: Slave is able to detect timeout during handshake in tests/integration/replication.tcl
Replica is not able to detect timeout
```
https://github.com/redis/redis-extra-ci/actions/runs/3727554226/jobs/6321797837

The replica logs show that in this case the replica got timeout before even getting a response to the PING command (instead of the SYNC command).

it should have shown these:
```
* MASTER <-> REPLICA sync started
* REPLICAOF 127.0.0.1:22112 enabled ....
### Starting test Slave enters handshake in tests/integration/replication.tcl
* Non blocking connect for SYNC fired the event.
```
then:
```
* Master replied to PING, replication can continue...
* Trying a partial resynchronization (request 50da9eff70d774f4e6cb723eb4b091440f215772:1).
```
and then hang for 5 seconds:
```
# Timeout connecting to the MASTER...
* Reconnecting to MASTER 127.0.0.1:21112 after failure
```

but instead it got this (looks like it disconnected too early, and then tried to re-connect):
```
10890:M 19 Dec 2022 01:32:54.794 * Ready to accept connections tls
10890:M 19 Dec 2022 01:32:54.809 - Accepted 127.0.0.1:41047
10890:M 19 Dec 2022 01:32:54.878 - Reading from client: error:0A000126:SSL routines::unexpected eof while reading
10890:M 19 Dec 2022 01:32:54.925 - Accepted 127.0.0.1:39207
10890:S 19 Dec 2022 01:32:55.463 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
10890:S 19 Dec 2022 01:32:55.463 * Connecting to MASTER 127.0.0.1:24126
10890:S 19 Dec 2022 01:32:55.463 * MASTER <-> REPLICA sync started
10890:S 19 Dec 2022 01:32:55.463 * REPLICAOF 127.0.0.1:24126 enabled (user request from 'id=4 addr=127.0.0.1:39207 laddr=127.0.0.1:24125 fd=8 name= age=1 idle=0 flags=N db=9 sub=0 psub=0 ssub=0 multi=-1 qbuf=43 qbuf-free=20431 argv-mem=21 multi-mem=0 rbs=1024 rbp=5 obl=0 oll=0 omem=0 tot-mem=22317 events=r cmd=slaveof user=default redir=-1 resp=2')
### Starting test Slave enters handshake in tests/integration/replication.tcl
10890:S 19 Dec 2022 01:32:55.476 * Non blocking connect for SYNC fired the event.
10890:S 19 Dec 2022 01:33:00.701 # Failed to read response from the server: (null)         <- note this!!
10890:S 19 Dec 2022 01:33:00.701 # Master did not respond to command during SYNC handshake
10890:S 19 Dec 2022 01:33:01.002 * Connecting to MASTER 127.0.0.1:24126
10890:S 19 Dec 2022 01:33:01.002 * MASTER <-> REPLICA sync started
### Starting test Slave is able to detect timeout during handshake in tests/integration/replication.tcl
10890:S 19 Dec 2022 01:33:05.497 * Non blocking connect for SYNC fired the event.
10890:S 19 Dec 2022 01:33:05.500 * Master replied to PING, replication can continue...
10890:S 19 Dec 2022 01:33:05.510 * Trying a partial resynchronization (request 947e1956372a0e6c819cfec51c42cc7979b0c221:1).
10890:S 19 Dec 2022 01:34:05.833 # Failed to read response from the server: error:0A000126:SSL routines::unexpected eof while reading
10890:S 19 Dec 2022 01:34:05.833 # Master did not reply to PSYNC, will try later
```

This PR sets enables the 5 seconds timeout at a later stage to try and prevent the early disconnection.
2023-01-04 10:56:09 +02:00
..
aof-multi-part.tcl Fix bug when AOF enabled after startup. put the new incr file in the manifest only when AOFRW is done. (#10616) 2022-04-26 16:31:19 +03:00
aof-race.tcl Re-enable aof-race integration tests (#10972) 2022-08-04 11:13:29 +03:00
aof.tcl FLUSHDB and FLUSHALL add call forceCommandPropagation / FLUSHALL reset dirty counter to 0 if we enable save (#10691) 2022-05-11 11:21:16 +03:00
block-repl.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
convert-ziplist-hash-on-load.tcl Replace all usage of ziplist with listpack for t_hash (#8887) 2021-08-10 09:18:49 +03:00
convert-ziplist-zset-on-load.tcl Replace all usage of ziplist with listpack for t_zset (#9366) 2021-09-09 18:18:53 +03:00
convert-zipmap-hash-on-load.tcl Replace all usage of ziplist with listpack for t_hash (#8887) 2021-08-10 09:18:49 +03:00
corrupt-dump-fuzzer.tcl Fix set with duplicate elements causes sdiff to hang (#11530) 2022-11-22 11:20:24 +02:00
corrupt-dump.tcl Fix set with duplicate elements causes sdiff to hang (#11530) 2022-11-22 11:20:24 +02:00
dismiss-mem.tcl Add external test that runs without debug command (#9964) 2021-12-19 17:41:51 +02:00
failover.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
logging.tcl Avoid ASAN test errors on crash report tests (#11605) 2022-12-11 18:20:42 +02:00
psync2-master-restart.tcl Fix race in PSYNC2 partial resync test (#11653) 2022-12-22 14:23:14 +02:00
psync2-pingoff.tcl Fix race condition in psync2-pingoff test (#9712) 2021-11-01 16:07:08 +02:00
psync2-reg.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
psync2.tcl Try to fix a race in psync2 test (#11553) 2022-11-30 22:03:23 +02:00
rdb.tcl Add listpack encoding for list (#11303) 2022-11-16 20:29:46 +02:00
redis-benchmark.tcl fix benchmark failure in daily test with TLS (#10896) 2022-06-23 18:19:36 +03:00
redis-cli.tcl Functions: Move library meta data to be part of the library payload. (#10500) 2022-04-05 10:27:24 +03:00
replication-2.tcl Set repl-diskless-sync to yes by default, add repl-diskless-sync-max-replicas (#10092) 2022-01-17 14:11:11 +02:00
replication-3.tcl Remove EVAL script verbatim replication, propagation, and deterministic execution logic (#9812) 2021-12-21 08:32:42 +02:00
replication-4.tcl Fix replication on expired key test timing issue, give it more chances (#11548) 2022-11-28 13:03:55 +02:00
replication-buffer.tcl Set repl-diskless-sync to yes by default, add repl-diskless-sync-max-replicas (#10092) 2022-01-17 14:11:11 +02:00
replication-psync.tcl Improve test suite to handle external servers better. (#9033) 2021-06-09 15:13:24 +03:00
replication.tcl fix handshake timeout replication test race (#11640) 2023-01-04 10:56:09 +02:00
shutdown.tcl Wait for replicas when shutting down (#9872) 2022-01-02 09:50:15 +02:00