redis

Commit Graph

Author	SHA1	Message	Date
antirez	e3243819ef	Don't mess with node attributes without protection. The background VSIMs use the node attributes (via the callback) so we can't modify them without waiting for the background operations to terminate.	2025-03-26 23:36:14 +01:00
antirez	a6c8a15cad	VADD: fix leak on thread creation failure.	2025-03-26 22:50:47 +01:00
antirez	3e2649f1f1	hnsw_insert() should never fail in practice. We pass our aborting allocation function to the HNSW lib, the only other reason for it to fail is pthread mutex locking failing but this is also practically impossible AFAIK in modern systems, and if it happens (for kernel reosurces shortage) anyway to abort is the best thing to do: otherwise we would have to return that we could not complete the operation for some reason, which is not uniform with everything Redis does. In Redis under normal conditions writes must succeed if they are semantically correct, or the server crash for OOM.	2025-03-26 22:46:00 +01:00
Ozan Tezcan	a0da8390a2	Fix use-after-free when diskless load config is not swapdb (#13887) When the diskless load configuration is set to on-empty-db, we retain a pointer to the function library context. When emptyData() is called, it frees this function library context pointer, leading to a use-after-free situation. I refactored code to ensure that emptyData() is called first, followed by retrieving the valid pointer to the function library context. Refactored code should not introduce any runtime implications. Bug introduced by https://github.com/redis/redis/pull/13495 (Redis 8.0) Co-authored-by: Oran Agra <oran@redislabs.com>	2025-03-26 21:50:10 +03:00
antirez	8dfc501fb8	VSIM: fix double free if thread creation fails.	2025-03-26 19:43:59 +01:00
antirez	9d4325ee25	VSIM NOTHREAD, mainly for testing goals.	2025-03-26 16:52:28 +01:00
antirez	707c132392	Count threaded exec time in stats.	2025-03-26 16:48:02 +01:00
antirez	08e3f958fa	README: remove no longer valid RP issue. now the projection matrix is deterministic.	2025-03-26 11:33:32 +01:00
antirez	23b3e21817	README: suggest using FP32 vs VALUES.	2025-03-26 11:28:05 +01:00
Cong Chen	981aa5c12f	Fix timing issue in HEXPIREAT test (#13873) This fixes an error that occurs in the job [test-valgrind-no-malloc-usable-size-test](https://github.com/redis/redis/actions/runs/13912357739/job/38929051397) of the Daily workflow: ``` *** [err]: HEXPIREAT - Set time and then get TTL (listpackex) in tests/unit/type/hash-field-expire.tcl Expected '999' to be between to '1000' and '2000' (context: type eval line 6 cmd {assert_range [r hpttl myhash FIELDS 1 field1] 1000 2000} proc ::test) ```	2025-03-26 10:00:38 +08:00
antirez	16e3c5a8f9	Locks error checking improved.	2025-03-24 19:10:28 +01:00
antirez	adfd2dc7c0	Remove useless OOM checks, but handle mutex creation failure.	2025-03-24 12:54:41 +01:00
antirez	8bf9b8abc1	Use Hadamard-based projection. Works better and being deterministic (only relative to the projection size) the replicas will have the same matrix automatically.	2025-03-24 12:48:04 +01:00
Oran Agra	2a189709e0	avoid possible use-after-free with module KSN changes (#13875) in #13505, we changed the code to use the string value of the key rather than the integer value on the stack, but we have a test in unit/moduleapi/keyspace_events that uses keyspace notification hook to modify the value with RM_StringDMA, which can cause this value to be released before used. the reason it didn't happen so far is because we were using shared integers, so releasing the object doesn't free it.	2025-03-24 12:24:52 +02:00
antirez	958ebee091	README: specify how to add REDUCE in VADD.	2025-03-24 09:55:45 +01:00
Yuan Wang	319bbcc1a7	Fix sdscatprintf error of the in output of `info stats` (#13871) CI failed: https://github.com/redis/redis/actions/runs/13981749993/job/39148249096, since i don't reassign `info` after `sdscatprintf(info, xxx)` Thanks to @sundb for spotting this introduced in https://github.com/redis/redis/pull/13846	2025-03-24 09:17:58 +08:00
debing.sun	87b7c3ac1a	Fix rax node defragmentaion being skipped (#13847) First, when we do `raxSeek()` and then call raxNext, we will get the `RAX_ITER_JUST_SEEKED` flag and return success directly. We always set the node defrag callback after `raxSeek()`, which means that when we break from defragmentation, the first node that comes in again will never be defragged. In this PR, we save the last as the next node to be processed, not the last node to be completed. This way we defrag the next node when we exit to avoid it being skipped on the next resume. --------- Co-authored-by: oranagra <oran@redislabs.com>	2025-03-24 08:57:08 +08:00
antirez	8007ccd51b	Use RESP3-friendly bool replies.	2025-03-23 20:14:40 +01:00
antirez	9cc750fd66	Test: projection regression test fixed.	2025-03-23 15:04:58 +01:00
antirez	aa92b37589	VINFO: use a single field for random projection info.	2025-03-23 14:49:52 +01:00
antirez	8f479b22b9	Tests: replication test.	2025-03-23 14:45:34 +01:00
Salvatore Sanfilippo	854c7fdddb	Merge pull request #6 from rowantrollope/main Fix possible crash with random projection	2025-03-23 14:44:53 +01:00
Rowan Trollope	31bc07955c	Fix possible crash with random projection	2025-03-22 09:11:20 -07:00
antirez	f330d6175a	Clarify HNSW_MAX_THREADS vs one thread per request.	2025-03-20 15:42:11 +01:00
Benson-li	427c36888e	Fix potential infinite loop of RANDOMKEY during client pause (#13863) The bug mentioned in this [#13862](https://github.com/redis/redis/issues/13862) has been fixed. --------- Signed-off-by: li-benson <1260437731@qq.com> Signed-off-by: youngmore1024 <youngmore1024@outlook.com> Co-authored-by: youngmore1024 <youngmore1024@outlook.com>	2025-03-20 21:32:12 +08:00
debing.sun	cb02bd190b	Fix timing issue in module defrag test (#13870) After #13840, the data we populate becomes more complex and slower, we always wait for a defragmentation cycle to end before verifying that the test is okay. However, in some slow environments, an entire defragmentation cycle can exceed 5 seconds, and in my local test using 'taskset -c 0' it can reach 6 seconds, so increase the threshold to avoid test failures.	2025-03-20 21:22:47 +08:00
Yuan Wang	951ec79654	Cluster compatibility check (#13846) ### Background The program runs normally in standalone mode, but migrating to cluster mode may cause errors, this is because some cross slot commands can not run in cluster mode. We should provide an approach to detect this issue when running in standalone mode, and need to expose a metric which indicates the usage of no incompatible commands. ### Solution To avoid perf impact, we introduce a new config `cluster-compatibility-sample-ratio` which define the sampling ratio (0-100) for checking command compatibility in cluster mode. When a command is executed, it is sampled at the specified ratio to determine if it complies with Redis cluster constraints, such as cross-slot restrictions. A new metric is exposed: `cluster_incompatible_ops` in `info stats` output. The following operations will be considered incompatible operations.
- cross-slot command If a command has multiple cross slot keys, it is incompatible - `swap, copy, move, select` command These commands involve multi databases in some cases, we don't allow multiple DB in cluster mode, so there are not compatible - Module command with `no-cluster` flag If a module command has `no-cluster` flag, we will encounter an error when loading module, leading to fail to load module if cluster is enabled, so this is incompatible. - Script/function with `no-cluster` flag Similar with module command, if we declare `no-cluster` in shebang of script/function, we also can not run it in cluster mode - `sort` command by/get pattern When `sort` command has `by/get` pattern option, we must ask that the pattern slot is equal with the slot of keys, otherwise it is incompatible in cluster mode. - The script/function command accesses the keys and declared keys have different slots For the script/function command, we not only check the slot of declared keys, but only check the slot the accessing keys, if they are different, we think it is incompatible. Besides, commands like `keys, scan, flushall, script/function flush`, that in standalone mode iterate over all data to perform the operation, are only valid for the server that executes the command in cluster mode and are not broadcasted. However, this does not lead to errors, so we do not consider them as incompatible commands. ### Performance impact test cross slot test Below are the test commands and results. When using MSET with 8 keys, performance drops by approximately 3%. single key test It may be due to the overhead of the sampling function, and single-key commands could cause a 1-2% performance drop.	2025-03-20 10:35:53 +08:00
Filipe Oliveira (Redis)	3e012c9260	Fix string2d usage in case of hexadecimal strings parsing and overflow (#13845) Since https://github.com/redis/redis/pull/11884, what was previously accepted as a valid input (hexadecimal string) before 8.0 returned an error. This PR addresses it. To avoid performance penalties if hints the compiler that the fallbacks are not likely to happen. Furthermore, we were ignoring std::result_out_of_range outputs from fast_float. This PR addresses it as well and includes tests for both identified scenarios. --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2025-03-19 20:08:45 +08:00
antirez	758e963a4e	VRANDMEMBER documentation.	2025-03-19 09:02:15 +01:00
debing.sun	26dcec4812	Fix messed-up unblocked clients in flush command (#13865) Fix https://github.com/redis/redis/pull/13853#pullrequestreview-2675227138 This PR ensures that the client's current command is not reset by unblockClient(), while still needing to be handled after `unblockclient()`. The FLUSH command still requires reprocessing (update the replication offset) after unblockClient(). Therefore, we mark such blocked clients with the CLIENT_PENDING_COMMAND flag to prevent the command from being reset during unblockClient().	2025-03-19 10:22:47 +08:00
antirez	3424757f4d	Test: added another threading stress test. This access pattern triggered the bug fixed about VADD and CAS in `70ffa8c`.	2025-03-18 23:18:26 +01:00
antirez	70ffa8ce5c	Fix VADD_CASReply() NULL reference on ID mismatch. This bug was fixed thanks to the kind help of Dvir Dukhan (@DvirDukhan) that found it and provided useful context.	2025-03-18 21:37:06 +01:00
antirez	99176b3e04	Test: VRANDMEMBER test added.	2025-03-18 16:49:27 +01:00
antirez	22ce9f3fad	VRANDMEMBER command implemented.	2025-03-17 23:52:15 +01:00
debing.sun	a5a3afd923	Fix crash during SLAVEOF when clients are blocked on lazyfree (#13853) After https://github.com/redis/redis/pull/13167, when a client calls `FLUSHDB` command, we still async empty database, and the client was blocked until the lazyfree completes. 1) If another client calls `SLAVEOF` command during this time, the server will unblock all blocked clients, including those blocked by the lazyfree. However, when unblocking a lazyfree blocked client, we forgot to call `updateStatsOnUnblock()`, which ultimately triggered the following assertion. 2) If a client blocked by Lazyfree is unblocked midway, and at this point the `bio_comp_list` has already received the completion notification for the bio, we might end up processing a client that has already been unblocked in `flushallSyncBgDone()`. Therefore, we need to filter it out. --------- Co-authored-by: oranagra <oran@redislabs.com>	2025-03-17 20:27:05 +08:00
kei-nan	752576ce47	Use Search v7.99.5 (#13859)	2025-03-16 10:00:51 +02:00
antirez	706721f8c8	HSNW: random node.	2025-03-16 00:08:43 +01:00
antirez	8a5cf17cb2	HNSW: cursor fixes and thread safety.	2025-03-15 23:31:24 +01:00
antirez	a363e5fe6d	README: memory usage section.	2025-03-15 23:16:28 +01:00
antirez	6e434bcaaf	HNSW: use node max link property. This is both more correct in formal terms, and in practical terms as well, as we could over-allocate nodes sometimes.	2025-03-15 10:30:14 +01:00
antirez	68d3067125	w2v test: fix recall EF usage.	2025-03-15 10:24:20 +01:00
antirez	d94058fad9	w2v test: recall histograms + configurable M.	2025-03-15 09:46:42 +01:00
antirez	c1c7eeaa69	Document VADD M parameter.	2025-03-15 09:28:55 +01:00
antirez	542736ce25	w2v test: proper recall test added.	2025-03-15 00:24:10 +01:00
antirez	13a0a63bef	Copyright Sanfilipo -> Redis Ltd.	2025-03-14 23:06:22 +01:00
antirez	d996eb82ef	VADD: make M configurable at creation time.	2025-03-13 16:58:55 +01:00
antirez	4e57d3f76f	README: grammar.	2025-03-13 15:56:05 +01:00
antirez	2fcf389f2a	README: troubleshooting and understandability.	2025-03-13 13:25:48 +01:00
antirez	9500539c55	HNSW: implement last resort node reallocation.	2025-03-13 11:30:07 +01:00
antirez	095842a748	README: scaling information.	2025-03-12 22:58:33 +01:00

12574 Commits All Branches Search
