redis

Commit Graph

Author	SHA1	Message	Date
Madelyn Olson	c0e064ef16	Optimize the performance of cluster slots for non-continuous slots (#11745 ) This change improves the performance of cluster slots by removing the deferring lengths that are used. Deferring lengths are used in two contexts, the first is for determining the number of replicas that serve a slot (Added in 6.2 as part of a different performance improvement) and the second is for determining the extra networking options for each node (Added in 7.0). For continuous slots, (e.g. 0-8196) this improvement is negligible, however it becomes more significant when slots are not continuous (e.g. 0 2 4 6 etc) which can happen in production for various users. The `cluster slots` command is deprecated in favor of `cluster shards`, but since most clients don't support the new command yet I think it's important to not degrade performance here. Benchmarking shows about 2x improvement, however I wasn't able to get a coherent TPS number since the benchmark process was being saturated long before Redis was, so had to run with multiple benchmarks and merge results. If needed I can add this to our memtier framework. Instead the next section shows the number of usec per call from the benchmark results, which shows significant improvement as well as having a more coherent response in the CoB. \| \| New Code \| Old Code \| % Improvements \|----\|----\|----- \|----- \| Uniform slots\| usec_per_call=10.46 \| usec_per_call=11.03 \| 5.7% \| Worst case (Only even slots)\| usec_per_call=963.80 \| usec_per_call=2950.99 \| 307% This change also removes some extra white space that I added when making a code change for adding hostnames. (cherry picked from commit `e74a1f3bd9`)	2023-02-28 18:32:34 +02:00
guybe7	3a6f00329a	Call postExecutionUnitOperations in active-expire of writable replicas (#11615 ) We need to honor the post-execution-unit API and call it after each KSN Note that this is an edge case that only happens in case volatile keys were created directly on a writable replica, and that anyway nothing is propagated to sub-replicas Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `df327b8bd5`)	2023-02-28 18:32:34 +02:00
Wen Hui	4d5a4e4b36	Fix command BITFIELD_RO and BITFIELD argument json file, add some test cases for them (#11445 ) According to the source code, the commands can be executed with only key name, and no GET/SET/INCR operation arguments. change the docs to reflect that by marking these arguments as optional. also add tests. (cherry picked from commit `fea9bbbe0f`)	2023-02-28 18:32:34 +02:00
Oran Agra	1c75ab062d	Redis 7.0.8	2023-01-16 18:40:35 +02:00
Oran Agra	3f1f02034c	Fix range issues in ZRANDMEMBER and HRANDFIELD (CVE-2023-22458) Missing range check in ZRANDMEMBER and HRANDFIELD leading to panic due to protocol limitations	2023-01-16 18:40:35 +02:00
Oran Agra	6c25c6b7da	Avoid integer overflows in SETRANGE and SORT (CVE-2022-35977) Authenticated users issuing specially crafted SETRANGE and SORT(_RO) commands can trigger an integer overflow, resulting with Redis attempting to allocate impossible amounts of memory and abort with an OOM panic.	2023-01-16 18:40:35 +02:00
Oran Agra	4537830ea1	Obuf limit, exit during loop in RAND commands and KEYS Related to the hang reported in #11671 Currently, redis can disconnect a client due to reaching output buffer limit, it'll also avoid feeding that output buffer with more data, but it will keep running the loop in the command (despite the client already being marked for disconnection) This PR is an attempt to mitigate the problem, specifically for commands that are easy to abuse, specifically: KEYS, HRANDFIELD, SRANDMEMBER, ZRANDMEMBER. The RAND family of commands can take a negative COUNT argument (which is not bound to the number of elements in the key), so it's enough to create a key with one field, and then these commands can be used to hang redis. For KEYS the caller can use the existing keyspace in redis (if big enough).	2023-01-16 18:40:35 +02:00
knggk	5fa7d9a272	Add minimum version information to new xsetid arguments (#11694 ) the metadata for the new arguments of XSETID, entries-added and max-deleted-id, which have been added in Redis 7.0 was missing. (cherry picked from commit `44c6770372`)	2023-01-16 18:40:35 +02:00
Oran Agra	3e82bdf738	Make sure that fork child doesn't do incremental rehashing (#11692 ) Turns out that a fork child calling getExpire while persisting keys (and possibly also a result of some module fork tasks) could cause dictFind to do incremental rehashing in the child process, which is both a waste of time, and also causes COW harm. (cherry picked from commit `2bec254d89`)	2023-01-16 18:40:35 +02:00
Gabi Ganam	574a49b96c	Blocking command with a 0.001 seconds timeout blocks indefinitely (#11688 ) Any value in the range of [0-1) turns to 0 when being cast from double to long long. This change rounds up instead of down for values that can't be stored precisely as long doubles. (cherry picked from commit `eef29b68a2`)	2023-01-16 18:40:35 +02:00
Oran Agra	61a1d4540d	Fix potential issue with Lua argv caching, module command filter and libc realloc (#11652 ) TLDR: solve a problem introduced in Redis 7.0.6 (#11541) with RM_CommandFilterArgInsert being called from scripts, which can lead to memory corruption. Libc realloc can return the same pointer even if the size was changed. The code in freeLuaRedisArgv had an assumption that if the pointer didn't change, then the allocation didn't change, and the cache can still be reused. However, if rewriteClientCommandArgument or RM_CommandFilterArgInsert were used, it could be that we realloced the argv array, and the pointer didn't change, then a consecutive command being executed from Lua can use that argv cache reaching beyond its size. This was actually only possible with modules, since the decision to realloc was based on argc, rather than argv_len. (cherry picked from commit `c8052122a2`)	2023-01-16 18:40:35 +02:00
judeng	f9f48ef674	Optimize the performance of msetnx command by call lookupkey only once (#11594 ) This is a small addition to #9640 It improves performance by avoiding double lookup of the key. (cherry picked from commit `884ca601b2`)	2023-01-16 18:40:35 +02:00
sundb	b7b78a2db2	Remove unnecessary updateClientMemUsageAndBucket() when feeding monitors (#11657 ) This call is introduced in #8687, but became irrelevant in #11348, and is currently a no-op. The fact is that #11348 had an unintended side effect, which is that even if the client eviction config is enabled, there are certain types of clients for which memory consumption is not accurately tracked, and so unlike normal clients, their memory isn't reported correctly in INFO. (cherry picked from commit `af0a4fe207`)	2023-01-16 18:40:35 +02:00
Moti Cohen	fcfb046d91	Fix sentinel issue if replica changes IP (#11590 ) As Sentinel supports dynamic IP only when using hostnames, there are a few leftover address comparisons that don't take into account that the IP might change. Co-authored-by: moticless <moticless@github.com> (cherry picked from commit `4a27aa4875`)	2023-01-16 18:40:35 +02:00
Oran Agra	b73de0d9c2	Redis 7.0.7	2022-12-16 12:52:57 +02:00
filipe oliveira	41bff94b43	Fixed small distance replies on GEODIST and GEO commands WITHDIST (#11631 ) Fixes a regression introduced by #11552 in 7.0.6. it causes replies in the GEO commands to contain garbage when the result is a very small distance (less than 1) Includes test to confirm indeed with junk in buffer now we properly reply (cherry picked from commit `d7b4c9175e`)	2022-12-16 12:52:57 +02:00
Oran Agra	c0924a8361	Redis 7.0.6	2022-12-12 17:36:34 +02:00
Binbin	e2665c6fc0	Fix replication on expired key test timing issue, give it more chances (#11548 ) In replica, the key expired before master's `INCR` arrived, so INCR creates a new key in the replica and the test failed. ``` *** [err]: Replication of an expired key does not delete the expired key in tests/integration/replication-4.tcl Expected '0' to be equal to '1' (context: type eval line 13 cmd {assert_equal 0 [$slave exists k]} proc ::test) ``` This test is very likely to produce a false positive if the `wait_for_ofs_sync` takes longer than the expiration time, so give it a few more chances. The test was introduced in #9572. (cherry picked from commit `06b577aad0`)	2022-12-12 17:36:34 +02:00
Binbin	3c525fab6a	Fix redis-cli cluster add-node race in cli.tcl (#11349 ) There is a race condition in the test: ``` *** [err]: redis-cli --cluster add-node with cluster-port in tests/unit/cluster/cli.tcl Expected '5' to be equal to '4' {assert_equal 5 [CI 0 cluster_known_nodes]} proc ::test) ``` When using cli to add node, there can potentially be a race condition in which all nodes presenting cluster state o.k even though the added node did not yet meet all cluster nodes. This comment and the fix were taken from #11221. Also apply it in several other similar places. (cherry picked from commit `a549b78c48`)	2022-12-12 17:36:34 +02:00
ranshid	e1557e6c7a	fix test Migrate the last slot away from a node using redis-cli (#11221 ) When using cli to add node, there can potentially be a race condition in which all nodes presenting cluster state o.k even though the added node did not yet meet all cluster nodes. this adds another utility function to wait until all cluster nodes see the same cluster size (cherry picked from commit `c0ce97facc`)	2022-12-12 17:36:34 +02:00
Binbin	b5784feabd	Fix timing issue in cluster test (#11008 ) A timing issue like this was reported in freebsd daily CI: ``` *** [err]: Sanity test push cmd after resharding in tests/unit/cluster/cli.tcl Expected 'CLUSTERDOWN The cluster is down' to match 'MOVED' ``` We additionally wait for each node to reach a consensus on the cluster state in wait_for_condition to avoid the cluster down error. The fix is just like #10495, quoting madolson's comment: Cluster check just verifies the config state is self-consistent, waiting for cluster_state to be okay is an independent check that all the nodes actually believe each other are healthy. At the same time I noticed that unit/moduleapi/cluster.tcl has an exact same test, which may have the same problem, so I also modified it. (cherry picked from commit `5ce64ab010`)	2022-12-12 17:36:34 +02:00
Binbin	a43c51b297	Fix CLUSTERDOWN issue in cluster reshard unblock test (#11139 ) change the cluster-node-timeout from 1 to 1000 (cherry picked from commit `3a16ad30b7`)	2022-12-12 17:36:34 +02:00
David CARLIER	d86408b7b9	Fixes build warning when CACHE_LINE_SIZE is already defined. (#11389 ) * Fixes build warning when CACHE_LINE_SIZE is already defined * Fixes wrong CACHE_LINE_SIZE on some FreeBSD systems where it could be set to 128 (e.g. on MIPS) * Fixes wrong CACHE_LINE_SIZE on Apple M1 (use 128 instead of 64) Wrong cache line size in that case can cause some false sharing of array elements between threads, see #10892 (cherry picked from commit `871cc200a0`)	2022-12-12 17:36:34 +02:00
Oran Agra	dca63da432	Avoid ASAN test errors on crash report tests Clang Address Sanitizer tests started reporting unknown-crash on these tests due to the memcheck, disable the memcheck to avoid that noise. (cherry picked from commit `18ff6a3269`)	2022-12-12 17:36:34 +02:00
Oran Agra	4a5aba16c9	Try to fix a race in psync2 test (#11553 ) This test sets the master ping interval to 1 hour, in order to avoid pings in the replication stream incrementing the replication offset, however, it didn't increase the repl-timeout so on slow machines where the test took more than 60 seconds, the replicas would drop and reconnect. ``` *** [err]: PSYNC2: Partial resync after restart using RDB aux fields in tests/integration/psync2.tcl Replica didn't partial sync ``` The test would detect 4 additional partial syncs where it expects only one. (cherry picked from commit `b0250b4508`)	2022-12-12 17:36:34 +02:00
Binbin	7275935641	Bump vmactions/freebsd-vm to 0.3.0 to fix FreeBSD daily (#11476 ) Our FreeBSD daily has been failing recently: ``` Config file: freebsd-13.1.conf cd: /Users/runner/work/redis/redis: No such file or directory gmake: *** No targets specified and no makefile found. Stop. ``` Upgrading vmactions/freebsd-vm to the latest version (0.3.0) makes it work. I've tested it; I don't know why it helps, but let's fix it first. (cherry picked from commit `5246bf4544`)	2022-12-12 17:36:34 +02:00
Binbin	98410d0da5	Fix bgsaveerr issue in psync wrong offset test (#11043 ) The kill above is sometimes successful and sometimes already too late. The PING in psync wrong offset test got rejected by bgsaveerr because lastbgsave_status is C_ERR. In theory, using diskless can avoid PING being affected, because when the replica is dropped, we will kill the child with SIGUSR1, and this will not affect lastbgsave_status. Anyway, this kill is not particularly needed here, dropping the kill is the best option, since we do have the waitForBgsave, so just let it take care of the bgsave. No need for fast termination. (cherry picked from commit `e7144693e2`)	2022-12-12 17:36:34 +02:00
filipe oliveira	9ccb8dfd47	Reduce rewriteClientCommandVector usage on EXPIRE command (#11602 ) There is overhead on Redis 7.0 EXPIRE command that is not present on 6.2.7. We could see that on the unstable profile there are around 7% of CPU cycles spent on rewriteClientCommandVector that are not present on 6.2.7. This was introduced in #8474. This PR reduces the overhead by using 2X rewriteClientCommandArgument instead of rewriteClientCommandVector. In this scenario rewriteClientCommandVector creates 4 arguments. the above usage of rewriteClientCommandArgument reduces the overhead in half. This PR should also improve PEXPIREAT performance by avoiding at all rewriteClientCommandArgument usage. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `c3fb48da8b`)	2022-12-12 17:36:34 +02:00
Harkrishn Patro	05c9378b3f	Optimize client memory usage tracking operation while client eviction is disabled (#11348 ) ## Issue During the client input/output buffer processing, the memory usage is incrementally updated to keep track of clients going beyond a certain threshold `maxmemory-clients` to be evicted. However, this additional tracking activity leads to unnecessary CPU cycles wasted when no client-eviction is required. It is applicable in two cases. * `maxmemory-clients` is set to `0` which equates to no client eviction (applicable to all clients) * `CLIENT NO-EVICT` flag is set to `ON` which equates to a particular client not applicable for eviction. ## Solution * Disable client memory usage tracking during the read/write flow when `maxmemory-clients` is set to `0` or `client no-evict` is `on`. The memory usage is tracked only during the `clientCron` i.e. it gets periodically updated. * Cleanup the clients from the memory usage bucket when client eviction is disabled. * When the maxmemory-clients config is enabled or disabled at runtime, we immediately update the memory usage buckets for all clients (tested scanning 80000 took some 20ms) Benchmarks show that this can improve performance by about 5% in certain situations. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `c0267b3fa5`)	2022-12-12 17:36:34 +02:00
Binbin	b322a77e99	Fix command line startup --sentinel problem (#11591 ) There is an issue with --sentinel: ``` [root]# src/redis-server sentinel.conf --sentinel --loglevel verbose * FATAL CONFIG FILE ERROR (Redis 255.255.255) * Reading the configuration file, at line 352 >>> 'sentinel "--loglevel" "verbose"' Unrecognized sentinel configuration statement ``` This is because the `--` prefix change in #10660 (Redis 7.0.1) broke it. In this PR, we will handle `--sentinel` the same as we did for `--save` in #10866. i.e. it's a pseudo config option with no value. (cherry picked from commit `8f13ac10b4`)	2022-12-12 17:36:34 +02:00
filipe oliveira	5e02338e5e	GEOSEARCH BYBOX: Simplified haversine distance formula when longitude diff is 0 (#11579 ) This is take 2 of `GEOSEARCH BYBOX` optimizations based on haversine distance formula when longitude diff is 0. The first one was in #11535 . - Given longitude diff is 0 the asin(sqrt(a)) on the haversine is asin(sin(abs(u))). - arcsin(sin(x)) equal to x when x ∈[−𝜋/2,𝜋/2]. - Given latitude is between [−𝜋/2,𝜋/2] we can simplify arcsin(sin(x)) to x. On the sample dataset with 60M datapoints, we've measured 55% increase in the achievable ops/sec. (cherry picked from commit `e48ac075c0`)	2022-12-12 17:36:34 +02:00
filipe oliveira	6a6d8806a4	Reintroduce lua argument cache in luaRedisGenericCommand removed in v7.0 (#11541 ) This mechanism aims to reduce calls to malloc and free when preparing the arguments the script sends to redis commands. This mechanism was originally implemented in `48c49c4` and `4f68655`, and was removed in #10220 (thinking it's not needed and that it has no impact), but it now turns out it was wrong, and it indeed provides some 5% performance improvement. The implementation is a little bit too simplistic, it assumes consecutive calls use the same size in the same arg index, but that's arguably sufficient since it's only aimed at caching very small things. We could even consider always pre-allocating args to the full LUA_CMD_OBJCACHE_MAX_LEN (64 bytes) rather than the right size for the argument, that would increase the chance they'll be able to be re-used. But in some way this is already happening since we're using sdsalloc, which in turn uses s_malloc_usable and takes ownership of the full size of the allocation, so we are padded to the allocator bucket size. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: sundb <sundbcn@gmail.com> (cherry picked from commit `2d80cd7840`)	2022-12-12 17:36:34 +02:00
filipe oliveira	77570e3965	Speedup GEODIST with fixedpoint_d2string as an optimized version of snprintf %.4f (#11552 ) GEODIST used snprintf("%.4f") for the reply using addReplyDoubleDistance, which was slow. This PR optimizes it without breaking compatibility by following the approach of ll2string with some changes to match the use case of distance and precision. I.e. we multiply it by 10000 format it as an integer, and then add a decimal point. This can achieve about 35% increase in the achievable ops/sec. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `61c85a2b20`)	2022-12-12 17:36:34 +02:00
Yossi Gottlieb	60ef6d2896	Improve TLS error handling. (#11563 ) * Remove duplicate code, propagating SSL errors into connection state. * Add missing error handling in synchronous IO functions. * Fix connection error reporting in some replication flows. (cherry picked from commit `155acef51a`)	2022-12-12 17:36:34 +02:00
filipe oliveira	669a8dba37	changing addReplySds and sdscat to addReplyStatusLength() within luaReplyToRedisReply() (#11556 ) Profiling EVALSHA, we see that luaReplyToRedisReply takes 8.73% out of the 56.90% of luaCallFunction CPU cycles. Using addReplyStatusLength instead of directly composing the protocol to avoid sdscatprintf and addReplySds (which imply multiple sdslen calls). The new approach drops luaReplyToRedisReply CPU cycles to 3.77% (cherry picked from commit `68e87eb088`)	2022-12-12 17:36:34 +02:00
filipe oliveira	7c4f6e179b	Reduce eval related overhead introduced in v7.0 by evalCalcFunctionName (#11521 ) As being discussed in #10981 we see a degradation in performance between v6.2 and v7.0 of Redis on the EVAL command. After profiling the current unstable branch we can see that we call the expensive function evalCalcFunctionName twice. The current "fix" is to basically avoid calling evalCalcFunctionName and even dictFind(lua_scripts) twice for the same command. Instead we cache the current script's dictEntry (for both Eval and Functions) in the current client so we don't have to repeat these calls. The exception would be when doing an EVAL on a new script that's not yet in the script cache. in that case we will call evalCalcFunctionName (and even evalExtractShebangFlags) twice. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `7dfd7b9197`)	2022-12-12 17:36:34 +02:00
zhaozhao.zz	fba49b107d	benchmark getRedisConfig exit only when meet NOAUTH error (#11096 ) redis-benchmark: when trying to get the CONFIG before benchmark, avoid printing any warning on most errors (e.g. NOPERM error). avoid aborting the benchmark on NOPERM. keep the warning only when we abort the benchmark on a NOAUTH error (cherry picked from commit `f0005b5328`)	2022-12-12 17:36:34 +02:00
filipe oliveira	3d206f0fcf	Simplified geoAppendIfWithinShape() and removed spurious calls to sdsdup and sdsfree (#11522 ) In scenarios in which we have large datasets and the elements are not contained within the range we do spurious calls to sdsdup and sdsfree. I.e. instead of pre-creating an sds before we know if we're gonna use it or not, change the role of geoAppendIfWithinShape to just do geoWithinShape, and let the caller create the string only when needed. Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `376b689b03`)	2022-12-12 17:36:34 +02:00
DevineLiu	81e72e8e29	[BUG] Fix announced ports not updating on local node when updated at runtime (#10745 ) The cluster-announce-port/cluster-announce-bus-port/cluster-announce-tls-port should take effect at runtime Co-authored-by: Madelyn Olson <madelyneolson@gmail.com> (cherry picked from commit `25ffa79b64`)	2022-12-12 17:36:34 +02:00
filipe oliveira	cffabb641f	GEOSEARCH BYBOX: Reduce wasteful computation on geohashGetDistanceIfInRectangle and geohashGetDistance (#11535 ) Optimize geohashGetDistanceIfInRectangle when there are many misses. It calls 3x geohashGetDistance. The first 2 times we call them to produce intermediate results. This PR focuses on optimizing for those 2 intermediate results. 1 Reduce expensive computation on intermediate geohashGetDistance with same long 2 Avoid expensive lon_distance calculation if lat_distance fails beforehand Co-authored-by: Oran Agra <oran@redislabs.com> (cherry picked from commit `ae1de54900`)	2022-12-12 17:36:34 +02:00
sundb	4972f5768c	Ignore -Wstringop-overread warning for SHA1Transform() on GCC 12 (#11538 ) Fix compile warning for SHA1Transform() method under alpine with GCC 12. Warning: ``` In function 'SHA1Update', inlined from 'SHA1Final' at sha1.c:187:9: sha1.c:144:13: error: 'SHA1Transform' reading 64 bytes from a region of size 0 [-Werror=stringop-overread] 144 \| SHA1Transform(context->state, &data[i]); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sha1.c:144:13: note: referencing argument 2 of type 'const unsigned char[64]' sha1.c: In function 'SHA1Final': sha1.c:56:6: note: in a call to function 'SHA1Transform' 56 \| void SHA1Transform(uint32_t state[5], const unsigned char buffer[64]) \| ^~~~~~~~~~~~~ ``` This warning is a false positive because it has been determined in the loop judgment that there must be 64 chars after position `i` ```c for ( ; i + 63 < len; i += 64) { SHA1Transform(context->state, &data[i]); } ``` Reference: `e1d7d3e40a` (cherry picked from commit `fd80818552`)	2022-12-12 17:36:34 +02:00
Wen Hui	d81c2a5f33	Add explicit error log message for AOF_TRUNCATED status when server load AOF file (#11484 ) Now, according to the comments, if the truncated file is not the last file, it will be considered as a fatal error. And the return code will be updated to AOF_FAILED, then server will exit without any error message to the client. Similar to other error situations, this PR adds an explicit error message for this case and makes the client know clearly what happens. (cherry picked from commit `6e9724cb6a`)	2022-12-12 17:36:34 +02:00
Madelyn Olson	a26ac7ebbc	Explicitly send function commands to monitor (#11510 ) Both functions and eval are marked as "no-monitor", since we want to explicitly feed in the script command before the commands generated by the script. Note that we want this behavior generally, so that commands can redact arguments before being added to the monitor. (cherry picked from commit `d136bf2830`)	2022-12-12 17:36:34 +02:00
uriyage	7ad786db2e	Module CLIENT_CHANGE, Fix crash on free blocked client with DB!=0 (#11500 ) In moduleFireServerEvent we change the real client DB to 0 on freeClient in case the event is REDISMODULE_EVENT_CLIENT_CHANGE. It results in a crash if the client is blocked on a key on other than DB 0. The DB change is not necessary even for module-client, as we set its DB to 0 on either createClient or moduleReleaseTempClient. Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com> Co-authored-by: Binbin <binloveplay1314@qq.com> (cherry picked from commit `e4eb18b303`)	2022-12-12 17:36:34 +02:00
Oran Agra	abf83bc682	fixes for fork child exit and test: #11463 (#11499 ) Fix a few issues with the recent #11463 * use exitFromChild instead of exit * test should ignore defunct process since that's what we expect to happen for these child processes when the parent dies. * fix typo Co-authored-by: Binbin <binloveplay1314@qq.com> (cherry picked from commit `4c54528f0f`)	2022-12-12 17:36:34 +02:00
Oran Agra	7c956d5cdf	diskless master, avoid bgsave child hung when fork parent crashes (#11463 ) During a diskless sync, if the master main process crashes, the child would have hung in `write`. This fix closes the read fd on the child side, so that if the parent crashes, the child will get a write error and exit. This change also fixes disk-based replication, BGSAVE and AOFRW. In that case the child wouldn't have hung, it would have just kept running until done which may be pointless. There is a certain degree of risk here. In case there's a BGSAVE child that could maybe succeed and the parent dies for some reason, the old code would have let the child keep running and maybe succeed and avoid data loss. On the other hand, if the parent is restarted, it would have loaded an old rdb file (or none), and then the child could reach the end and rename the rdb file (data conflicting with what the parent has), or also have a race with another BGSAVE child that the new parent started. Note that I removed a comment saying a write error will be ignored in the child and handled by the parent (this comment was very old and I don't think relevant). (cherry picked from commit `ccaef5c923`)	2022-12-12 17:36:34 +02:00
Moti Cohen	19d01b62ca	Fix sentinel function that compares hostnames (if failed resolve) (#11419 ) Function sentinelAddrEqualsHostname() of sentinel makes a DNS resolve and based on it determines if two IP addresses are equal. Now, if the DNS resolve command fails, the function simply returns 0, even if the hostnames are identical. This might become an issue in case of failover such that sentinel might receive from a Redis instance a response to the regular INFO query it sent, and wrongly decide that the instance is pointing to a different leader than the one recorded because of this function, yet hostnames are identical. In turn sentinel disconnects the connection between sentinel and a valid slave, which leads to -failover-abort-no-good-slave. See issue #11241. I managed to reproduce only part of the flow in which the function returns the wrong result and triggers +fix-slave-config. The fix is that even if the function failed to resolve, it then compares based on hostnames. That is our best effort as long as the server is unavailable for some reason. It is fine since a Redis instance cannot have multiple hostnames for a given setup (cherry picked from commit `bd23b15ad7`)	2022-12-12 17:36:34 +02:00
sundb	9fc20f4f9d	Fix crash due to reuse of iterator entry after list deletion in module (#11383 ) In the module, we will reuse the list iterator entry for RM_ListDelete, but `listTypeDelete` will only update `quicklistEntry->zi` but not `quicklistEntry->node`, which will result in `quicklistEntry->node` pointing to a freed memory address if the quicklist node is deleted. This PR syncs `key->u.list.index` and `key->u.list.entry` to list iterator after `RM_ListDelete`. This PR also optimizes the release code of the original list iterator. Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se> (cherry picked from commit `6dd213558b`)	2022-12-12 17:36:34 +02:00
C Charles	f95af7785b	MIGRATE with AUTH that contains "keys" is getting wrong key names in migrateGetKeys, leads to ACL errors (#11253 ) When using MIGRATE with a destination Redis that has the user name or password set to the string "keys", Redis would determine the wrong set of key names the command is gonna access. This led to ACL returning a wrong authentication result. Destination instance: ``` 127.0.0.1:6380> acl setuser default >keys OK 127.0.0.1:6380> acl setuser keys on nopass ~* &* +@all OK ``` Source instance: ``` 127.0.0.1:6379> set a 123 OK 127.0.0.1:6379> acl setuser cc on nopass ~a* +@all OK 127.0.0.1:6379> auth cc 1 OK 127.0.0.1:6379> migrate 127.0.0.1 6380 "" 0 1000 auth keys keys a (error) NOPERM this user has no permissions to access one of the keys used as arguments 127.0.0.1:6379> migrate 127.0.0.1 6380 "" 0 1000 auth2 keys pswd keys a (error) NOPERM this user has no permissions to access one of the keys used as arguments ``` Using `acl dryrun` we know that the parameters of `auth` and `auth2` are mistaken for the `keys` option. ``` 127.0.0.1:6379> acl dryrun cc migrate whatever whatever "" 0 1000 auth keys keys a "This user has no permissions to access the 'keys' key" 127.0.0.1:6379> acl dryrun cc migrate whatever whatever "" 0 1000 auth2 keys pswd keys a "This user has no permissions to access the 'pswd' key" ``` Fix the bug by editing db.c/migrateGetKeys function, which finds the `keys` option and all the keys following. (cherry picked from commit `9ab873d9d3`)	2022-12-12 17:36:34 +02:00
Oran Agra	92ad0b5c8b	Improve linux overcommit check and warning (#11357 ) 1. show the overcommit warning when overcommit is disabled (2), not just when it is set to heuristic (0). 2. improve warning text to mention the issue with jemalloc causing VM mapping fragmentation when set to 2. (cherry picked from commit `dd60c6c8d3`)	2022-12-12 17:36:34 +02:00
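
Note on commit `574a49b96c` (blocking timeout rounding): the message says that a value in [0, 1) becomes 0 when cast from double to long long, and that the fix rounds up instead. The following is a minimal sketch of that idea, not the code from the commit. The helper names are hypothetical, and the sketch assumes the timeout is given in seconds and stored as whole milliseconds, with 0 reserved to mean "block forever".

```c
#include <assert.h>

/* Hypothetical helper for illustration; not the function Redis uses.
 * Converts a non-negative timeout in seconds to whole milliseconds
 * (an assumed internal unit), rounding any fractional millisecond up
 * so that a tiny positive timeout never collapses to 0. */
long long timeout_seconds_to_ms(long double seconds) {
    assert(seconds >= 0);
    long double scaled = seconds * 1000.0L;
    long long ms = (long long)scaled;    /* the cast truncates toward zero */
    if ((long double)ms < scaled) ms++;  /* round up when a fraction was dropped */
    return ms;
}

/* The truncating variant, shown only for contrast. */
long long timeout_seconds_to_ms_truncating(long double seconds) {
    return (long long)(seconds * 1000.0L);
}
```

With these helpers, a timeout of half a millisecond becomes 1 under the rounding-up variant and 0 under the truncating one; under the assumption above, that 0 would be read as "no timeout".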
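
Note on commit `77570e3965` (GEODIST formatting): the message describes multiplying the distance by 10000, formatting it as an integer, and then adding a decimal point. The sketch below illustrates that fixed-point approach under stated assumptions; it is not Redis's `fixedpoint_d2string`. The function name, the range limit, and the round-half-up rule are all choices made for this example, and results at exact rounding ties may differ from `snprintf("%.4f")`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: format a non-negative double with exactly four
 * fractional digits without calling snprintf. Returns the string length,
 * or 0 if the value is out of the supported range or dst is too small. */
int format_fixed4(char *dst, size_t dstlen, double value) {
    if (!(value >= 0) || value > 1e13) return 0;  /* keeps scaled value in range */
    unsigned long long scaled = (unsigned long long)(value * 10000.0 + 0.5);
    char tmp[32];
    int n = 0;
    /* Emit digits least-significant first, inserting the decimal point
     * after the fourth digit and forcing one digit before the point. */
    do {
        tmp[n++] = (char)('0' + scaled % 10);
        scaled /= 10;
        if (n == 4) tmp[n++] = '.';
    } while (scaled != 0 || n < 6);
    if ((size_t)n + 1 > dstlen) return 0;
    for (int i = 0; i < n; i++) dst[i] = tmp[n - 1 - i];  /* reverse into dst */
    dst[n] = '\0';
    return n;
}
```

The digit loop mirrors the general shape of integer-to-string routines: all arithmetic after the initial scaling is integer arithmetic, which is where the savings over a floating-point `printf` conversion would come from.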
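
Note on commit `5e02338e5e` (haversine with zero longitude difference): the three bullet points in the message can be written out as a short derivation. The symbols below (latitudes \varphi_1, \varphi_2, longitude difference \Delta\lambda, sphere radius R, and u for half the latitude difference) are notation chosen for this note, using the standard haversine formula; the commit message itself only names `a` and `u`.

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
   + \cos\varphi_1 \cos\varphi_2 \,\sin^2\!\left(\frac{\Delta\lambda}{2}\right),
   \qquad d = 2R \arcsin\!\left(\sqrt{a}\right) \\[4pt]
\Delta\lambda = 0 \;&\Rightarrow\; a = \sin^2 u,
   \qquad u = \frac{\varphi_2 - \varphi_1}{2} \\[4pt]
d &= 2R \arcsin\!\left(\lvert \sin u \rvert\right)
   = 2R \arcsin\!\left(\sin \lvert u \rvert\right) \\[4pt]
\varphi_1, \varphi_2 \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right]
   \;&\Rightarrow\; \lvert u \rvert \in \left[0, \tfrac{\pi}{2}\right]
   \;\Rightarrow\; d = 2R \lvert u \rvert = R \,\lvert \varphi_2 - \varphi_1 \rvert
\end{aligned}
```

In other words, when the two points share a longitude, the trigonometric calls drop out and the distance reduces to the radius times the absolute latitude difference in radians.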


11432 Commits
