redis

Commit Graph

Author	SHA1	Message	Date
debing.sun	08d714d0e5	Fix crash due to cron argv release (#13725) Introduced by https://github.com/redis/redis/issues/13521. If the client's argv was released due to a timeout before the complete command was sent, `argv_len` is reset to 0. When argv is parsed again and resized, requesting a length of 0 may leave argv NULL, which then leads to a crash. Also fix a bug where `argv_len` was not updated correctly in `replaceClientCommandVector()`. --------- Co-authored-by: ShooterIT <wangyuancode@163.com> Co-authored-by: meiravgri <109056284+meiravgri@users.noreply.github.com>	2025-01-08 09:57:23 +08:00
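The failure mode described in 08d714d0e5 can be sketched in isolation. This is a minimal illustration, not the Redis fix itself: `demo_client` and `demo_ensure_argv` are hypothetical names, and the only point shown is that a resize should never request zero slots and should keep the recorded length in sync with the allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the client fields discussed in the commit. */
typedef struct {
    char **argv;   /* parsed argument vector */
    int argv_len;  /* number of allocated slots in argv */
} demo_client;

/* Make sure argv can hold at least `needed` entries.
 * Returns 0 on success, -1 on allocation failure. */
int demo_ensure_argv(demo_client *c, int needed) {
    assert(c != NULL && needed >= 0);
    /* Never request zero slots: realloc(ptr, 0) may return NULL. */
    int want = needed > 0 ? needed : 1;
    if (c->argv != NULL && c->argv_len >= want) return 0;
    char **p = realloc(c->argv, sizeof(char *) * (size_t)want);
    if (p == NULL) return -1;
    c->argv = p;
    c->argv_len = want;  /* keep the recorded length in sync */
    return 0;
}
```

Calling `demo_ensure_argv(&c, 0)` on a client whose argv was released leaves `c.argv` non-NULL with one slot, so later code that indexes argv does not dereference NULL.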
RQfreefly	4a12291765	Fix typos in multiple Redis source files (#13716)	2025-01-07 15:35:47 +08:00
Yuan Wang	8e9f5146dd	Add reads/writes metrics for IO threads (#13703) The main job of an IO thread is to read queries and write replies, so reads/writes metrics reflect the workload of the IO threads. We now report the `io_threaded_reads/writes_processed` metrics in detail for each IO thread. To avoid breaking changes, the aggregate `io_threaded_reads/writes_processed` is still there. Before the async IO threads commit, it could include the IO done by the main thread while IO threads were active; now it only sums the IO done by IO threads. The threads section of the `info` command output now looks as follows: ``` # Threads io_thread_0:clients=0,reads=0,writes=0 io_thread_1:clients=54,reads=6546940,writes=6546919 io_thread_2:clients=54,reads=6513650,writes=6513625 io_thread_3:clients=54,reads=6396571,writes=6396525 io_thread_4:clients=53,reads=6511120,writes=6511097 io_thread_5:clients=53,reads=6539302,writes=6539280 io_thread_6:clients=53,reads=6502269,writes=6502248 ```	2025-01-06 15:59:02 +08:00
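For anyone consuming the per-thread lines shown in 8e9f5146dd, a small parser could look like the sketch below. It assumes the line format is exactly the one printed in the commit message; the struct and function names are hypothetical and not part of Redis.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for one "io_thread_N:..." line. */
typedef struct {
    int id;
    long long clients, reads, writes;
} io_thread_stats;

/* Parse one line of the "# Threads" section.
 * Returns 1 if all four fields were read, 0 otherwise. */
int parse_io_thread_line(const char *line, io_thread_stats *out) {
    assert(line != NULL && out != NULL);
    int n = sscanf(line, "io_thread_%d:clients=%lld,reads=%lld,writes=%lld",
                   &out->id, &out->clients, &out->reads, &out->writes);
    return n == 4;
}
```

Lines that do not match, such as the `# Threads` header itself, make the function return 0, so a caller can skip them.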
raffertyyu	04f63d4af7	Fix index error of CRLF when replying with integer-encoded strings (#13711) Closes #13709. Fix the index error of the CRLF characters for integer-encoded strings in the `addReplyBulk` function. --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-12-31 21:41:10 +08:00
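To make the CRLF placement in 04f63d4af7 concrete, here is a minimal sketch of building a RESP bulk reply for an integer value by hand. It is not the `addReplyBulk` code; `format_int_bulk` is a hypothetical helper that only shows where the trailing `\r\n` has to land, namely directly after the digits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "$<ndigits>\r\n<digits>\r\n" for `value` into buf.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or -1 if buf is too small. */
int format_int_bulk(long long value, char *buf, size_t buflen) {
    char digits[32];
    int dlen = snprintf(digits, sizeof(digits), "%lld", value);
    assert(dlen > 0 && (size_t)dlen < sizeof(digits));

    /* Header: "$<dlen>\r\n". snprintf reports the length it needs. */
    int hlen = snprintf(buf, buflen, "$%d\r\n", dlen);
    if (hlen < 0 || (size_t)hlen + (size_t)dlen + 3 > buflen) return -1;

    size_t pos = (size_t)hlen;
    memcpy(buf + pos, digits, (size_t)dlen);
    pos += (size_t)dlen;
    buf[pos++] = '\r';   /* CRLF starts at header length + digit count */
    buf[pos++] = '\n';
    buf[pos] = '\0';
    return (int)pos;
}
```

For the value 123 the buffer holds `$3\r\n123\r\n`, nine bytes in total.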
Yuan Wang	dc57ee03b1	Do security attack check only when command not found to reduce the critical path (#13702) This PR is based on the commits from PR https://github.com/valkey-io/valkey/pull/1212. When we explored the cycle distribution of the main thread with io-threads enabled, we found that this security attack check takes significant time: ~3% of cycles were spent on the command security check in the main thread. This patch tries to avoid doing it in the hot path entirely. We do it only after looking up the command and finding that it does not exist, just before calling `commandCheckExistence`. --------- Co-authored-by: Lipeng Zhu <lipeng.zhu@intel.com> Co-authored-by: Wangyang Guo <wangyang.guo@intel.com>	2024-12-26 12:51:44 +08:00
Thalia Archibald	8144019a13	Check length before reading in `stringmatchlen` (#13690) Fixes four cases where `stringmatchlen` could overrun the pattern if it is not terminated with NUL. These commits are cherry-picked from my [fork](https://github.com/thaliaarchi/antirez-stringmatch), which extracts `stringmatch` as a library and compares it to other projects by antirez that use the same matcher.	2024-12-26 12:37:23 +08:00
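The idea behind 8144019a13, checking the remaining pattern length before every read, can be shown with a much smaller matcher. The sketch below is not `stringmatchlen`: it is a hypothetical `glob_match_len` that supports only `*`, `?`, and `\` escapes, and it treats a trailing lone backslash as a literal, which is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the length-delimited string s matches the
 * length-delimited pattern p, else 0. Neither needs a NUL terminator. */
int glob_match_len(const char *p, size_t plen, const char *s, size_t slen) {
    assert(p != NULL && s != NULL);
    while (plen > 0) {
        if (p[0] == '*') {
            /* Collapse runs of '*'; test plen before reading p[1]. */
            while (plen > 1 && p[1] == '*') { p++; plen--; }
            if (plen == 1) return 1;  /* trailing '*' matches the rest */
            for (size_t i = 0; i <= slen; i++) {
                if (glob_match_len(p + 1, plen - 1, s + i, slen - i)) return 1;
            }
            return 0;
        }
        if (slen == 0) return 0;
        if (p[0] != '?') {
            /* Treat '\' as an escape only if another pattern byte follows. */
            if (p[0] == '\\' && plen >= 2) { p++; plen--; }
            if (p[0] != s[0]) return 0;
        }
        p++; plen--; s++; slen--;
    }
    return slen == 0;
}
```

Because every access to `p[1]` or to the byte after a backslash is guarded by a length test, the function can be called on a pattern slice that is not NUL-terminated.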
Yuan Wang	7665bdc91a	Offload `lookupCommand` into IO threads when threaded IO is enabled (#13696) From the flame graph, we can see that `lookupCommand` costs a lot of CPU in the main thread, so we let IO threads perform `lookupCommand`. To avoid race conditions among multiple IO threads, we made the following changes: - Pause all IO threads when registering or unregistering commands - Force a full rehashing of the command table dict when resizing	2024-12-25 16:03:22 +08:00
Yuan Wang	64a40b20d9	Async IO Threads (#13695 ) ## Introduction Redis introduced IO Thread in 6.0, allowing IO threads to handle client request reading, command parsing and reply writing, thereby improving performance. The current IO thread implementation has a few drawbacks. - The main thread is blocked during IO thread read/write operations and must wait for all IO threads to complete their current tasks before it can continue execution. In other words, the entire process is synchronous. This prevents the efficient utilization of multi-core CPUs for parallel processing. - When the number of clients and requests increases moderately, it causes all IO threads to reach full CPU utilization due to the busy wait mechanism used by the IO threads. This makes it challenging for us to determine which part of Redis has reached its bottleneck. - When IO threads are enabled with TLS and io-threads-do-reads, a disconnection of a connection with pending data may result in it being assigned to multiple IO threads simultaneously. This can cause race conditions and trigger assertion failures. Related issue: redis#12540 Therefore, we designed an asynchronous IO threads solution. The IO threads adopt an event-driven model, with the main thread dedicated to command processing, meanwhile, the IO threads handle client read and write operations in parallel. ## Implementation ### Overall As before, we did not change the fact that all client commands must be executed on the main thread, because Redis was originally designed to be single-threaded, and processing commands in a multi-threaded manner would inevitably introduce numerous race and synchronization issues. But now each IO thread has independent event loop, therefore, IO threads can use a multiplexing approach to handle client read and write operations, eliminating the CPU overhead caused by busy-waiting. 
the execution process can be briefly described as follows: the main thread assigns clients to IO threads after accepting connections, IO threads will notify the main thread when clients finish reading and parsing queries, then the main thread processes queries from IO threads and generates replies, IO threads handle writing reply to clients after receiving clients list from main thread, and then continue to handle client read and write events. ### Each IO thread has independent event loop We now assign each IO thread its own event loop. This approach eliminates the need for the main thread to perform the costly `epoll_wait` operation for handling connections (except for specific ones). Instead, the main thread processes requests from the IO threads and hands them back once completed, fully offloading read and write events to the IO threads. Additionally, all TLS operations, including handling pending data, have been moved entirely to the IO threads. This resolves the issue where io-threads-do-reads could not be used with TLS. ### Event-notified client queue To facilitate communication between the IO threads and the main thread, we designed an event-notified client queue. Each IO thread and the main thread have two such queues to store clients waiting to be processed. These queues are also integrated with the event loop to enable handling. We use pthread_mutex to ensure the safety of queue operations, as well as data visibility and ordering, and race conditions are minimized, as each IO thread and the main thread operate on independent queues, avoiding thread suspension due to lock contention. And we implemented an event notifier based on `eventfd` or `pipe` to support event-driven handling. ### Thread safety Since the main thread and IO threads can execute in parallel, we must handle data race issues carefully. client->flags The primary tasks of IO threads are reading and writing, i.e. `readQueryFromClient` and `writeToClient`. 
However, IO threads and the main thread may concurrently modify or access `client->flags`, leading to potential race conditions. To address this, we introduced an io-flags variable to record operations performed by IO threads, thereby avoiding race conditions on `client->flags`. Pause IO thread In the main thread, we may want to operate data of IO threads, maybe uninstall event handler, access or operate query/output buffer or resize event loop, we need a clean and safe context to do that. We pause IO thread in `IOThreadBeforeSleep`, do some jobs and then resume it. To avoid thread suspended, we use busy waiting to confirm the target status. Besides we use atomic variable to make sure memory visibility and ordering. We introduce these functions to pause/resume IO Threads as below. ``` pauseIOThread, resumeIOThread pauseAllIOThreads, resumeAllIOThreads pauseIOThreadsRange, resumeIOThreadsRange ``` Testing has shown that `pauseIOThread` is highly efficient, allowing the main thread to execute nearly 200,000 operations per second during stress tests. Similarly, `pauseAllIOThreads` with 8 IO threads can handle up to nearly 56,000 operations per second. But operations performed between pausing and resuming IO threads must be quick; otherwise, they could cause the IO threads to reach full CPU utilization. freeClient and freeClientAsync The main thread may need to terminate a client currently running on an IO thread, for example, due to ACL rule changes, reaching the output buffer limit, or evicting a client. In such cases, we need to pause the IO thread to safely operate on the client. maxclients and maxmemory-clients updating When adjusting `maxclients`, we need to resize the event loop for all IO threads. Similarly, when modifying `maxmemory-clients`, we need to traverse all clients to calculate their memory usage. To ensure safe operations, we pause all IO threads during these adjustments. 
Client info reading The main thread may need to read a client’s fields to generate a descriptive string, such as for the `CLIENT LIST` command or logging purposes. In such cases, we need to pause the IO thread handling that client. If information for all clients needs to be displayed, all IO threads must be paused. Tracking redirect Redis supports the tracking feature and can even send invalidation messages to a connection with a specified ID. But the target client may be running on IO thread, directly manipulating the client’s output buffer is not thread-safe, and the IO thread may not be aware that the client requires a response. In such cases, we pause the IO thread handling the client, modify the output buffer, and install a write event handler to ensure proper handling. clientsCron In the `clientsCron` function, the main thread needs to traverse all clients to perform operations such as timeout checks, verifying whether they have reached the soft output buffer limit, resizing the output/query buffer, or updating memory usage. To safely operate on a client, the IO thread handling that client must be paused. If we were to pause the IO thread for each client individually, the efficiency would be very low. Conversely, pausing all IO threads simultaneously would be costly, especially when there are many IO threads, as clientsCron is invoked relatively frequently. To address this, we adopted a batched approach for pausing IO threads. At most, 8 IO threads are paused at a time. The operations mentioned above are only performed on clients running in the paused IO threads, significantly reducing overhead while maintaining safety. ### Observability In the current design, the main thread always assigns clients to the IO thread with the least clients. To clearly observe the number of clients handled by each IO thread, we added the new section in INFO output. The `INFO THREADS` section can show the client count for each IO thread. 
``` # Threads io_thread_0:clients=0 io_thread_1:clients=2 io_thread_2:clients=2 ``` Additionally, in the `CLIENT LIST` output, we also added a field to indicate the thread to which each client is assigned. `id=244 addr=127.0.0.1:41870 laddr=127.0.0.1:6379 ... resp=2 lib-name= lib-ver= io-thread=1` ## Trade-off ### Special Clients For certain special types of clients, keeping them running on IO threads would result in severe race issues that are difficult to resolve. Therefore, we chose not to offload these clients to the IO threads. For replica, monitor, subscribe, and tracking clients, main thread may directly write them a reply when conditions are met. Race issues are difficult to resolve, so we have them processed in the main thread. This includes the Lua debug clients as well, since we may operate connection directly. For blocking client, after the IO thread reads and parses a command and hands it over to the main thread, if the client is identified as a blocking type, it will be remained in the main thread. Once the blocking operation completes and the reply is generated, the client is transferred back to the IO thread to send the reply and wait for event triggers. ### Clients Eviction To support client eviction, it is necessary to update each client’s memory usage promptly during operations such as read, write, or command execution. However, when a client operates on an IO thread, it is not feasible to update the memory usage immediately due to the risk of data races. As a result, memory usage can only be updated either in the main thread while processing commands or in the `ClientsCron` periodically. The downside of this approach is that updates might experience a delay of up to one second, which could impact the precision of memory management for eviction. To avoid incorrectly evicting clients. 
We adopted a best-effort compensation solution, when we decide to eviction a client, we update its memory usage again before evicting, if the memory used by the client does not decrease or memory usage bucket is not changed, then we will evict it, otherwise, not evict it. However, we have not completely solved this problem. Due to the delay in memory usage updates, it may lead us to make incorrect decisions about the need to evict clients. ### Defragment In the majority of cases we do NOT use the data from argv directly in the db. 1. key names We store a copy that we allocate in the main thread, see `sdsdup()` in `dbAdd()`. 2. hash key and value We store key as hfield and store value as sds, see `hfieldNew()` and `sdsdup()` in `hashTypeSet()`. 3. other datatypes They don't even use SDS, so there is no reference issues. But in some cases client the data from argv may be retain by the main thread. As a result, during fragmentation cleanup, we need to move allocations from the IO thread’s arena to the main thread’s arena. We always allocate new memory in the main thread’s arena, but the memory released by IO threads may not yet have been reclaimed. This ultimately causes the fragmentation rate to be higher compared to creating and allocating entirely within a single thread. The following cases below will lead to memory allocated by the IO thread being kept by the main thread. 1. string related command: `append`, `getset`, `mset` and `set`. If `tryObjectEncoding()` does not change argv, we will keep it directly in the main thread, see the code in `tryObjectEncoding()`(specifically `trimStringObjectIfNeeded()`) 2. block related command. the key names will be kept in `c->db->blocking_keys`. 3. watch command the key names will be kept in `c->db->watched_keys`. 4. [s]subscribe command channel name will be kept in `serverPubSubChannels`. 5. script load command script will be kept in `server.lua_scripts`. 7. 
some module API: `RM_RetainString`, `RM_HoldString` Those issues will be handled in other PRs. ## Testing ### Functional Testing The commit with enabling IO Threads has passed all TCL tests, but we did some changes: Client query buffer: In the original code, when using a reusable query buffer, ownership of the query buffer would be released after the command was processed. However, with IO threads enabled, the client transitions from an IO thread to the main thread for processing. This causes the ownership release to occur earlier than the command execution. As a result, when IO threads are enabled, the client's information will never indicate that a shared query buffer is in use. Therefore, we skip the corresponding query buffer tests in this case. Defragment: Add a new defragmentation test to verify the effect of io threads on defragmentation. Command delay: For deferred clients in TCL tests, due to clients being assigned to different threads for execution, delays may occur. To address this, we introduced conditional waiting: the process proceeds to the next step only when the `client list` contains the corresponding commands. ### Sanitizer Testing The commit passed all TCL tests and reported no errors when compiled with the `fsanitizer=thread` and `fsanitizer=address` options enabled. But we made the following modifications: we suppressed the sanitizer warnings for clients with watched keys when updating `client->flags`, we think IO threads read `client->flags`, but never modify it or read the `CLIENT_DIRTY_CAS` bit, main thread just only modifies this bit, so there is no actual data race. ## Others ### IO thread number In the new multi-threaded design, the main thread is primarily focused on command processing to improve performance. Typically, the main thread does not handle regular client I/O operations but is responsible for clients such as replication and tracking clients. To avoid breaking changes, we still consider the main thread as the first IO thread. 
When the io-threads configuration is set to a low value (e.g., 2), performance does not show a significant improvement compared to a single-threaded setup for simple commands (such as SET or GET), as the main thread does not consume much CPU for these simple operations. This results in underutilized multi-core capacity. However, for more complex commands, having a low number of IO threads may still be beneficial. Therefore, it’s important to adjust the `io-threads` based on your own performance tests. Additionally, you can clearly monitor the CPU utilization of the main thread and IO threads using `top -H -p $redis_pid`. This allows you to easily identify where the bottleneck is. If the IO thread is the bottleneck, increasing the `io-threads` will improve performance. If the main thread is the bottleneck, the overall performance can only be scaled by increasing the number of shards or replicas. --------- Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: oranagra <oran@redislabs.com>	2024-12-23 14:16:40 +08:00
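Two of the policies described in 64a40b20d9 are simple enough to sketch: assigning a new client to the IO thread with the fewest clients, and pausing IO threads in batches of at most 8 during `clientsCron`. The code below is an illustration under stated assumptions, not the Redis implementation: the function names are hypothetical, and it assumes index 0 stands for the main thread (as in the `io_thread_0:clients=0` output) and is never chosen for regular clients.

```c
#include <assert.h>
#include <stddef.h>

#define PAUSE_BATCH 8  /* "At most, 8 IO threads are paused at a time" */

/* Pick the IO thread with the fewest clients. clients[0] is assumed to
 * be the main thread and is skipped. Returns the index, or -1 if there
 * is no IO thread besides the main thread. Ties go to the lowest index. */
int pick_io_thread(const int *clients, size_t nthreads) {
    assert(clients != NULL);
    int best = -1;
    for (size_t i = 1; i < nthreads; i++) {
        if (best < 0 || clients[i] < clients[best]) best = (int)i;
    }
    return best;
}

/* For IO threads numbered 1..nthreads-1, return the last thread id
 * (inclusive) of the pause batch that starts at `start`. */
int pause_batch_end(int start, int nthreads) {
    assert(start >= 1 && start < nthreads);
    int end = start + PAUSE_BATCH - 1;
    return end < nthreads - 1 ? end : nthreads - 1;
}
```

With 16 IO threads plus the main thread (`nthreads == 17`), the batches are threads 1 to 8 and then 9 to 16, so the cron job never holds more than 8 threads paused at once.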
Moti Cohen	08c2b276fb	Optimize dict `no_value` also for even addresses (#13683 ) This pull request enhances the no_value flag option in the dict implementation, which is used to store keys without associated values. Previously, when a key had an odd memory address and was the only item in a table entry, it could be directly stored as a pointer without requiring an intermediate dictEntry. With this update, the optimization has been extended to also handle keys with even memory addresses in the same manner.	2024-12-22 14:10:07 +02:00
debing.sun	1f09a55eba	Avoid importing memory aligned malloc (#13693 ) This PR is based on the commits from PR https://github.com/valkey-io/valkey/pull/1442. We deprecate the usage of classic malloc and free, but under certain circumstances they might get imported from intrinsics. The original thought is we should just override malloc and free to use zmalloc and zfree, but I think we should continue to deprecate it to avoid accidental imports of allocations. --------- Co-authored-by: Madelyn Olson <matolson@amazon.com>	2024-12-20 09:39:14 +08:00
Nugine	684077682e	Fix bug in PFMERGE command (#13672) The bug was introduced in #13558. When merging dense hll structures, `hllDenseCompress` writes to the wrong location and the result will be zero. The unit tests didn't cover this case. This PR + fixes the bug + adds `PFDEBUG SIMD (ON|OFF)` for unit tests + adds a new TCL test to cover these cases Synchronized from https://github.com/valkey-io/valkey/pull/1293 --------- Signed-off-by: Xuyang Wang <xuyangwang@link.cuhk.edu.cn> Co-authored-by: debing.sun <debing.sun@redis.com>	2024-12-18 14:41:04 +08:00
Filipe Oliveira (Redis)	f8942f93a6	Avoid unnecessary hfield Creation/Deletion on updates in hashTypeSet. HSET updates improvement of ~10% (#13655 ) This PR eliminates unnecessary creation and destruction of hfield objects, ensuring only required updates or insertions are performed. This reduces overhead and improves performance by streamlining field management in hash dictionaries, particularly in scenarios involving frequent updates, like the benchmarks in: - [memtier_benchmark-100Kkeys-load-hash-50-fields-with-100B-values](https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-100Kkeys-load-hash-50-fields-with-100B-values.yml) - [memtier_benchmark-10Mkeys-load-hash-5-fields-with-100B-values-pipeline-10](https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-10Mkeys-load-hash-5-fields-with-100B-values-pipeline-10.yml) To test it we can simply focus on the hfield related tests ``` tclsh tests/test_helper.tcl --single unit/type/hash-field-expire tclsh tests/test_helper.tcl --single unit/type/hash tclsh tests/test_helper.tcl --dump-logs --single unit/other ``` Extra check on full CI: - [x] https://github.com/filipecosta90/redis/actions/runs/12225788759 ## microbenchmark results 16.7% improvement (drop in time) in dictAddNonExistingRaw vs dictAddRaw ``` make REDIS_CFLAGS="-g -fno-omit-frame-pointer -O3 -DREDIS_TEST" -j $ ./src/redis-server test dict --accurate (...) Inserting via dictAddRaw() non existing: 5000000 items in 2592 ms (...) Inserting via dictAddNonExistingRaw() non existing: 5000000 items in 2160 ms ``` 8% improvement (drop in time) in find (non existing) and adding via `dictGetHash()+dictFindWithHash()+dictAddNonExistingRaw()` vs `dictFind()+dictAddRaw()` ``` make REDIS_CFLAGS="-g -fno-omit-frame-pointer -O3 -DREDIS_TEST" -j $ ./src/redis-server test dict --accurate (...) 
Find() and inserting via dictFind()+dictAddRaw() non existing: 5000000 items in 2983 ms Find() and inserting via dictGetHash()+dictFindWithHash()+dictAddNonExistingRaw() non existing: 5000000 items in 2740 ms ``` ## benchmark results To benchmark: ``` pip3 install redis-benchmarks-specification==0.1.250 taskset -c 0 ./src/redis-server --save '' --protected-mode no --daemonize yes redis-benchmarks-spec-client-runner --tests-regexp ".load-hash." --flushall_on_every_test_start --flushall_on_every_test_end --cpuset_start_pos 2 --override-memtier-test-time 60 ``` Improvements on achievable throughput in: test \| ops/sec unstable (`59953d2df6`) \| ops/sec this PR (`24af7190fd`) \| % change -- \| -- \| -- \| -- memtier_benchmark-1key-load-hash-1K-fields-with-5B-values \| 4097 \| 5032 \| 22.8% memtier_benchmark-100Kkeys-load-hash-50-fields-with-100B-values \| 37658 \| 44688 \| 18.7% memtier_benchmark-100Kkeys-load-hash-50-fields-with-1000B-values \| 14736 \| 17350 \| 17.7% memtier_benchmark-1Mkeys-load-hash-5-fields-with-1000B-values-pipeline-10 \| 131848 \| 143485 \| 8.8% memtier_benchmark-1Mkeys-load-hash-hmset-5-fields-with-1000B-values \| 82071 \| 85681 \| 4.4% memtier_benchmark-1Mkeys-load-hash-5-fields-with-1000B-values \| 82882 \| 86336 \| 4.2% memtier_benchmark-10Mkeys-load-hash-5-fields-with-100B-values-pipeline-10 \| 262502 \| 273376 \| 4.1% memtier_benchmark-10Kkeys-load-hash-50-fields-with-10000B-values \| 2821 \| 2936 \| 4.1% --------- Co-authored-by: Moti Cohen <moticless@gmail.com>	2024-12-12 19:41:08 +02:00
Moti Cohen	c51c96656b	modules API: Add test for ACL check of empty prefix (#13678) - Add empty string test for the new API `RedisModule_ACLCheckKeyPrefixPermissions`. - Fix order of checks: `(pattern[patternLen - 1] != '*' || patternLen == 0)` --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-12-10 09:16:30 +02:00
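Why the order of the two checks in c51c96656b matters can be seen in a standalone helper. This sketch assumes the intent is for the length test to run before the pattern is indexed; `pattern_lacks_trailing_star` is a hypothetical name, not a function in the Redis source.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the pattern does not end in '*'. An empty pattern
 * counts as not ending in '*'. */
int pattern_lacks_trailing_star(const char *pattern, size_t patternLen) {
    assert(pattern != NULL);
    /* Length test first: || short-circuits, so pattern[patternLen - 1]
     * is never evaluated when patternLen == 0. */
    return patternLen == 0 || pattern[patternLen - 1] != '*';
}
```

With the operands the other way round, an empty pattern would read `pattern[-1]` before the length is ever looked at.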
Moti Cohen	0dd057222b	Modules API: new HashFieldMinExpire(). Add flag REDISMODULE_HASH_EXPIRE_TIME to HashGet(). (#13676 ) This PR introduces API to query Expiration time of hash fields. # New `RedisModule_HashFieldMinExpire()` For a given hash, retrieves the minimum expiration time across all fields. If no fields have expiration or if the key is not a hash then return `REDISMODULE_NO_EXPIRE` (-1). ``` mstime_t RM_HashFieldMinExpire(RedisModuleKey *hash); ``` # Extension to `RedisModule_HashGet()` Adds a new flag, `REDISMODULE_HASH_EXPIRE_TIME`, to retrieve the expiration time of a specific hash field. If the field does not exist or has no expiration, returns `REDISMODULE_NO_EXPIRE`. It is fully backward-compatible (RM_HashGet retains its original behavior unless the new flag is used). Example: ``` mstime_t expiry1, expiry2; RedisModule_HashGet(mykey, REDISMODULE_HASH_EXPIRE_TIME, "field1", &expiry1, NULL); RedisModule_HashGet(mykey, REDISMODULE_HASH_EXPIRE_TIME, "field1", &expiry1, "field2", &expiry2, NULL); ```	2024-12-05 11:14:52 +02:00
Filipe Oliveira (Redis)	59953d2df6	Improve listpack Handling and Decoding Efficiency: 16.3% improvement on LRANGE command (#13652 ) This PR focused on refining listpack encoding/decoding functions and optimizing reply handling mechanisms related to it. Each commit has the measured improvement up until the last accumulated improvement of 16.3% on [memtier_benchmark-1key-list-100-elements-lrange-all-elements-pipeline-10](https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-1key-list-100-elements-lrange-all-elements-pipeline-10.yml) benchmark. Connection mode \| CE Baseline (Nov 14th) `701f06657d` \| CE PR #13652 \| CE PR vs CE Unstable -- \| -- \| -- \| -- TCP \| 155696 \| 178874 \| 14.9% Unix socket \| 169743 \| 197428 \| 16.3% To test it we can simply focus on the scan.tcl ``` tclsh tests/test_helper.tcl --single unit/replybufsize ``` ### Commit details: - `2e58d048fd` + `29c6c86c6b` : Eliminate an indirect memory access on lpCurrentEncodedSizeBytes and completely avoid passing p* fully to lpCurrentEncodedSizeBytes + Add lpNextWithBytes helper function and optimize addListListpackRangeReply - Improvement of 3.1%, from 168969.88 ops/sec to 174239.75 ops/sec - `af52aacff8` Refactor lpDecodeBacklen for loop-based decoding, improving readability and branch efficiency. - NO CHANGE. 
REVERTED in 09f6680ba0d0b5acabca537c651008f0c8ec061b - `048bfe4eda` + `03e8ff3af7` : reducing condition checks in _addReplyToBuffer, inlining it, and avoid entering it when there are there already entries in the reply list and check if the reply length exceeds available buffer space before calling _addReplyToBuffer - accumulated Improvement of 12.4%, from 168969.88 ops/sec to 189726.81 ops/sec - 9a63d4d6a9fa946505e31ecce4c7796845fc022c: always update the buf_peak on _addReplyToBufferOrList - accumulated Improvement of 14.2%, from 168969.88 ops/sec to 193887 ops/sec - b544ade67628a1feaf714d6cfd114930e0c7670b: Introduce lpEncodeBacklenBytes to avoid any indirect memory access on previous usage of lpEncodeBacklen(NULL,...). inline lpEncodeBacklenBytes(). - accumulated Improvement of 16.3%, from 168969.88 ops/sec to 197427.70 ops/sec --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-12-04 18:04:37 +08:00
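The last item in 59953d2df6 separates "how many bytes would the back-length take" from actually encoding it. A size-only helper of that kind can be sketched as below. This is not the `lpEncodeBacklenBytes` source: it assumes the listpack convention of 7 payload bits per back-length byte and a maximum of 5 bytes, and `backlen_bytes` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes needed to store `len` when each byte carries
 * 7 payload bits, capped at 5 bytes (both are assumptions here). */
unsigned backlen_bytes(uint64_t len) {
    unsigned n = 1;
    while (len > 127 && n < 5) {
        len >>= 7;  /* drop the 7 bits covered by the current byte */
        n++;
    }
    assert(n >= 1 && n <= 5);
    return n;
}
```

Since the helper takes only the length and writes nothing, a caller that just needs the size does not have to pass a buffer pointer at all.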
Filipe Oliveira (Redis)	ddafac4c6c	Optimize dictFind with prefetching and branch prediction hints (#13646 ) This pull request optimizes the `dictFind` function by adding software prefetching and branch prediction hints to improve cache efficiency and reduce memory latency. It introduces 2 prefetch hints (read/write) that became no-ops in case the compiler does not support it. Baseline profiling with Intel VTune indicated that dictFind was significantly back-end bound, with memory latency accounting for 59.6% of clockticks, with frequent stalls from DRAM-bound operations due to cache misses during hash table lookups. ![microarch](https://github.com/user-attachments/assets/9e3cf334-ae6b-4767-b568-713a4ac24e87) --------- Co-authored-by: Yuan Wang <wangyuancode@163.com>	2024-12-04 17:16:14 +08:00
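The two techniques named in ddafac4c6c, prefetch hints that compile to no-ops on unsupported compilers and branch prediction hints, can be sketched on a plain linked chain. The macros and the `demo_` names are hypothetical and the code is not `dictFind`; `__builtin_prefetch` and `__builtin_expect` are the GCC/Clang builtins.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define demo_prefetch_read(addr)  __builtin_prefetch((addr), 0, 3)
#define demo_prefetch_write(addr) __builtin_prefetch((addr), 1, 3)
#define demo_likely(x)   __builtin_expect(!!(x), 1)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)
#else
/* Fallbacks: the hints become no-ops on other compilers. */
#define demo_prefetch_read(addr)  ((void)(addr))
#define demo_prefetch_write(addr) ((void)(addr))
#define demo_likely(x)   (x)
#define demo_unlikely(x) (x)
#endif

typedef struct demo_entry {
    unsigned long key;
    int value;
    struct demo_entry *next;
} demo_entry;

/* Walk one bucket chain and return the matching entry, or NULL. */
demo_entry *demo_chain_find(demo_entry *head, unsigned long key) {
    for (demo_entry *e = head; e != NULL; e = e->next) {
        /* Start loading the next node while the current one is compared. */
        if (e->next != NULL) demo_prefetch_read(e->next);
        if (demo_likely(e->key == key)) return e;
    }
    return NULL;
}
```

The hints change only how the compiler and CPU schedule the work; the lookup result is identical with or without them, which is what makes the no-op fallback safe.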
Ozan Tezcan	2af69a931a	Do not call _dictClear()'s callback for the first 65k items (#13674) In https://github.com/redis/redis/pull/13495, we introduced a feature to reply -LOADING while flushing a large db on a replica. While `_dictClear()` is in progress, it calls a callback every 65k items, and we yield back to the event loop to reply -LOADING. This change made some tests unstable, as those tests don't expect the new -LOADING reply. One observation: inside `_dictClear()`, we call the callback even if the db has only a few keys. Most tests run with a small number of keys, so each replication and cluster test now has to handle a potential -LOADING reply. This PR changes this behavior and skips calling the callback when `i=0` to stabilize replication tests. The callback will be called only after the first 65k items. Most tests use fewer than 65k keys, so they won't get a -LOADING reply.	2024-12-03 09:26:19 +03:00
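The behavior change in 2af69a931a comes down to one extra condition in the clearing loop. The sketch below simulates that loop and counts how often the callback would fire; it assumes the check is a power-of-two mask on the loop index, and `count_clear_callbacks` is a hypothetical name rather than Redis code.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate clearing `size` entries and return how many times the
 * progress callback would be invoked. */
size_t count_clear_callbacks(size_t size) {
    size_t calls = 0;
    for (size_t i = 0; i < size; i++) {
        /* Fire once every 65536 entries, but skip the very first
         * iteration (i == 0), which also satisfies the mask test. */
        if (i != 0 && (i & 65535) == 0) calls++;
        /* ... entry i would be released here ... */
    }
    return calls;
}
```

Without the `i != 0` test the counter would already be 1 for a table holding a single entry, which is the case that made small test datasets see a -LOADING reply.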
Moti Cohen	06b144aa09	Modules API: Add RedisModule_ACLCheckKeyPrefixPermissions (#13666) This PR introduces a new API function to the Redis Module API: ``` int RedisModule_ACLCheckKeyPrefixPermissions(RedisModuleUser *user, RedisModuleString *prefix, int flags); ``` Purpose: The function checks whether a given user has access permissions to any key that matches a specific prefix. This validation is based on the user’s ACL permissions and the specified flags. Note that this prefix-based API may fail to detect prefixes that are individually uncovered but collectively covered by the patterns. For example, the prefix `ID-*` is not fully included in pattern `ID-[0]*` and is not fully included in pattern `ID-[^0]*`, but it is fully included in the set of patterns `{ID-[0]*, ID-[^0]*}`.	2024-11-28 18:33:58 +02:00
Vitah Lin	db33b67d37	Deprecate ubuntu lunar and macos-12 in workflows (#13669 ) 1. Ubuntu Lunar reached End of Life on January 25, 2024, so upgrade the ubuntu version to plucky in action `test-ubuntu-jemalloc-fortify` to pass the daily CI 2. The macOS-12 environment is deprecated so upgrade macos-12 to macos-13 in daily CI --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-11-28 21:59:43 +08:00
Filipe Oliveira (Redis)	a106198878	Optimize addReplyBulk on sds/int encoded strings: 2.2% to 4% reduction of CPU Time on GET high pipeline use-cases (#13644) ### Summary By profiling the 1KiB 100% GET use-case with a high pipeline, we can see that addReplyBulk and its inner calls take 8.30% of the CPU cycles. This PR reduces the CPU time spent on addReplyBulk by 2.2 to 4 percentage points. Specifically for GET use-cases, we saw an improvement of 2.7% to 9.1% in the achievable ops/sec ### Improvement By reducing the duplicate work we can improve by around 2.7% on sds encoded strings, and around 9% on int encoded strings. This PR does the following: - Avoid duplicate sdslen on addReplyBulk() for sds encoded objects - Avoid duplicate sdigits10() call on int encoded objects on addReplyBulk() - Avoid final "\r\n" addReplyProto() in the OBJ_ENCODING_INT type on addReplyBulk Altogether these changes result in the following improvement in the achievable ops/sec: Encoding \| unstable (commit `9906daf5c9`) \| this PR \| % improvement -- \| -- \| -- \| -- 1KiB Values string SDS encoded \| 1478081.88 \| 1517635.38 \| 2.7% Values string "1" OBJ_ENCODING_INT \| 1521139.36 \| 1658876.59 \| 9.1% ### CPU Time: Total of addReplyBulk Encoding \| unstable (commit `9906daf5c9`) \| this PR \| reduction of CPU Time: Total -- \| -- \| -- \| -- 1KiB Values string SDS encoded \| 8.30% \| 6.10% \| 2.2% Values string "1" OBJ_ENCODING_INT \| 7.20% \| 3.20% \| 4.0% ### To reproduce Run redis with unix socket enabled ``` taskset -c 0 /root/redis/src/redis-server --unixsocket /tmp/1.socket --save '' --enable-debug-command local ``` #### 1KiB Values string SDS encoded Load data ``` taskset -c 2-5 memtier_benchmark --ratio 1:0 -n allkeys --key-pattern P:P --key-maximum 1000000 --hide-histogram --pipeline 10 -S /tmp/1.socket ``` Benchmark ``` taskset -c 2-6 memtier_benchmark --ratio 0:1 -c 1 -t 5 --test-time 60 --hide-histogram -d 1000 --pipeline 500 -S /tmp/1.socket --key-maximum 1000000 
--json-out-file results.json ``` #### Values string "1" OBJ_ENCODING_INT Load data ``` $ taskset -c 2-5 memtier_benchmark --command "SET __key__ 1" -n allkeys --command-key-pattern P --key-maximum 1000000 --hide-histogram -c 1 -t 1 --pipeline 100 -S /tmp/1.socket # confirm we have the expected reply and format $ redis-cli get memtier-1 "1" $ redis-cli debug object memtier-1 Value at:0x7f14cec57570 refcount:2147483647 encoding:int serializedlength:2 lru:2861503 lru_seconds_idle:8 ``` Benchmark ``` taskset -c 2-6 memtier_benchmark --ratio 0:1 -c 1 -t 5 --test-time 60 --hide-histogram -d 1000 --pipeline 500 -S /tmp/1.socket --key-maximum 1000000 --json-out-file results.json ```	2024-11-26 16:11:01 +08:00
Ali	05b99c8f4c	Fix typo in redis.conf (#12634 ) Remove an unnecessary, repetitive "OR"	2024-11-22 20:29:17 +08:00
Ozan Tezcan	9ebf80a28c	Fix memory leak of jemalloc tcache on function flush command (#13661 ) Starting from https://github.com/redis/redis/pull/13133, we allocate a jemalloc thread cache and use it for the lua vm. In certain cases, like the `script flush` or `function flush` command, we free the existing thread cache and create a new one. For `function flush`, though, we were not actually destroying the existing thread cache itself. Each call creates a new thread cache on jemalloc and we leak the previous thread cache instances. Jemalloc allows a maximum of 4096 thread cache instances. If we reach this limit, Redis prints the "Failed creating the lua jemalloc tcache" log and aborts. There are other cases that can cause this memory leak, including replication scenarios when emptyData() is called. The implication is that redis `used_memory` looks low, but `allocator_allocated` and RSS remain high. Co-authored-by: debing.sun <debing.sun@redis.com>	2024-11-21 14:12:58 +03:00
Moti Cohen	155634502d	modules API: Support register unprefixed config parameters (#13656 ) PR #10285 introduced support for modules to register four types of configurations — Bool, Numeric, String, and Enum. Accessible through the Redis config file and the CONFIG command. With this PR, it will be possible to register configuration parameters without automatically prefixing the parameter names. This provides greater flexibility in configuration naming, enabling, for instance, both `bf-initial-size` or `initial-size` to be defined in the module without automatically prefixing with `<MODULE-NAME>.`. In addition it will also be possible to create a single additional alias via the same API. This brings us another step closer to integrate modules into redis core. Example: Register a configuration parameter `bf-initial-size` with an alias `initial-size` without the automatic module name prefix, set with new `REDISMODULE_CONFIG_UNPREFIXED` flag: ``` RedisModule_RegisterBoolConfig(ctx, "bf-initial-size\|initial-size", default_val, optflags \| REDISMODULE_CONFIG_UNPREFIXED, getfn, setfn, applyfn, privdata); ``` # API changes Related functions that now support unprefixed configuration flag (`REDISMODULE_CONFIG_UNPREFIXED`) along with optional alias: ``` RedisModule_RegisterBoolConfig RedisModule_RegisterEnumConfig RedisModule_RegisterNumericConfig RedisModule_RegisterStringConfig ``` # Implementation Details: `config.c`: On load server configuration, at function `loadServerConfigFromString()`, it collects all unknown configurations into `module_configs_queue` dictionary. These may include valid module configurations or invalid ones. They will be validated later by `loadModuleConfigs()` against the configurations declared by the loaded module(s). 
`module.c`: The `ModuleConfig` structure now stores: (1) Full configuration name (2) Alias (3) Unprefixed flag status - ensuring that configurations retain their original registration format when triggered in notifications. Added error printout: This change introduces an error printout for unresolved configurations, detailing each unresolved parameter detected during startup. The last line in the output existed prior to this change and has been retained because some systems rely on it: ``` 595011:M 18 Nov 2024 08:26:23.616 # Unresolved Configuration(s) Detected: 595011:M 18 Nov 2024 08:26:23.616 # >>> 'bf-initiel-size 8' 595011:M 18 Nov 2024 08:26:23.616 # >>> 'search-sizex 32' 595011:M 18 Nov 2024 08:26:23.616 # Module Configuration detected without loadmodule directive or no ApplyConfig call: aborting ``` # Backward Compatibility: Existing modules will function without modification, as the new functionality only applies if REDISMODULE_CONFIG_UNPREFIXED is explicitly set. # Module vs. Core API Conflict Behavior The new API allows the same module configuration name, or the same configuration alias, to be set more than once on load, just as redis core configuration does (i.e. the user sets two configs with different values, but the two are actually the same config). Unlike redis core, given a name and its alias, it does not allow both to be set on load. Supporting that would require modifying the `module_configs_queue` data structure to reflect the order of loading and later, during `loadModuleConfigs()`, resolving pairs of names and aliases to determine which one is the last to apply. "Relaxing" this limitation can be deferred to a future update if necessary, but for now, we error in this case.	2024-11-21 09:55:02 +02:00
Oran Agra	79fd255828	Add Lua VM memory to memory overhead, now that it's part of zmalloc (#13660 ) To complement the work done in #13133, which made the script VMs' memory count as part of zmalloc; that means it should also be counted as part of the non-value overhead. This commit contains some refactoring to make variable names and function names less confusing. It also adds a new field named `script.VMs` to the `MEMORY STATS` command. Additionally, clear scripts and stats between tests in external mode (which is related to how this issue was discovered)	2024-11-21 08:22:17 +02:00
nafraf	5b84dc9678	Fix module loadex command crash due to invalid config (#13653 ) Fix to https://github.com/redis/redis/issues/13650 providing an invalid config to a module with datatype crashes when redis tries to unload the module due to the invalid config --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-11-21 14:14:14 +08:00
debing.sun	701f06657d	Reuse c->argv after command execution to reduce memory allocation overhead (#13521 ) inspired by https://github.com/redis/redis/pull/12730 Before this PR, we allocated new memory to store the user command arguments for every command. However, if the current `c->argv` is already large enough for the current command, we can reuse the previously allocated argv and avoid a new allocation. We free `c->argv` in client cron when the client has been idle for 2 seconds. --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>	2024-11-14 20:35:31 +08:00
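The reuse policy described in this commit can be sketched roughly as below. This is an illustration under assumptions, not the Redis implementation: `client_sketch`, `reserve_argv` and `release_idle_argv` are hypothetical names, and the real client structure holds object pointers and much more state.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the client state. */
typedef struct {
    void **argv;   /* argument vector, kept across commands */
    int argc;      /* arguments used by the current command */
    int argv_len;  /* allocated capacity of argv */
} client_sketch;

/* Make room for `needed` arguments. Reuses the existing vector when it
 * is already large enough; otherwise grows it. Never requests a
 * zero-length allocation, so argv is non-NULL on success.
 * Returns 0 on success, -1 on allocation failure. */
static int reserve_argv(client_sketch *c, int needed) {
    if (c->argv != NULL && needed <= c->argv_len) {
        c->argc = 0;
        return 0;
    }
    int new_len = needed > 0 ? needed : 1;
    void **p = realloc(c->argv, sizeof(void *) * (size_t)new_len);
    if (p == NULL) return -1;
    c->argv = p;
    c->argv_len = new_len;
    c->argc = 0;
    return 0;
}

/* What a periodic job might do for a client that has been idle:
 * drop the cached vector and reset the capacity with it. */
static void release_idle_argv(client_sketch *c) {
    free(c->argv);
    c->argv = NULL;
    c->argv_len = 0;
    c->argc = 0;
}
```

Keeping the capacity field in sync with the pointer is the important design point: once the vector is released, the next command has to take the allocation path again rather than trust a stale length.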
Moti Cohen	cf83803880	CRC64 perf improvements (#13638 ) Improve the performance of crc64 for large batches by processing large number of bytes in parallel and combining the results. --------- Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Josiah Carlson <josiah.carlson@gmail.com>	2024-11-12 09:21:22 +02:00
Ozan Tezcan	54038811c0	Print command tokens on a crash when hide-user-data-from-log is enabled (#13639 ) If the `hide-user-data-from-log` config is enabled, we don't print client argv in the crashlog to avoid leaking user info. However, debugging a crash becomes harder as we don't see the command arguments causing the crash. With this PR, we'll be printing command tokens to the log. As we have command tokens defined in the json schema for each command, we can use this data to find tokens in the client argv. e.g. `SET key value GET EX 10` ---> we'll print `SET * * GET EX *` in the log. Modules should introduce their command structure via `RM_SetCommandInfo()`. Then, on a crash, we'll be able to know module command tokens.	2024-11-11 09:34:18 +03:00
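The masking in the `SET key value GET EX 10` example can be approximated with a short sketch. It assumes a flat, hypothetical list of token names per command and matches them case-insensitively; the commit itself describes deriving tokens from the per-command json schema, which this sketch does not model. `is_token` and `mask_command` are illustrative names only.

```c
#include <stdio.h>
#include <stddef.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Returns 1 if arg equals one of the known token names, ignoring case. */
static int is_token(const char *arg, const char *const *tokens, size_t ntokens) {
    for (size_t i = 0; i < ntokens; i++) {
        if (strcasecmp(arg, tokens[i]) == 0) return 1;
    }
    return 0;
}

/* Writes the command into `out` with every argument that is not the
 * command name or a known token replaced by "*". Output is truncated
 * if it does not fit in `cap` bytes. */
static void mask_command(char *out, size_t cap,
                         const char *const *argv, int argc,
                         const char *const *tokens, size_t ntokens) {
    size_t used = 0;
    if (cap == 0) return;
    out[0] = '\0';
    for (int i = 0; i < argc; i++) {
        const char *shown =
            (i == 0 || is_token(argv[i], tokens, ntokens)) ? argv[i] : "*";
        int n = snprintf(out + used, cap - used, "%s%s", i ? " " : "", shown);
        if (n < 0 || (size_t)n >= cap - used) break;  /* out of space */
        used += (size_t)n;
    }
}
```

A purely name-based match like this would also reveal a user value that happens to equal a token name, which is one reason a schema-driven, position-aware approach is the safer design.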
Nugine	fdeb97629e	Optimize PFCOUNT, PFMERGE command by SIMD acceleration (#13558 ) This PR optimizes the performance of HyperLogLog commands (PFCOUNT, PFMERGE) by adding AVX2 fast paths. Two AVX2 functions are added for conversion between raw representation and dense representation. They are 15 ~ 30 times faster than scalar implementation. Note that sparse representation is not accelerated. AVX2 fast paths are enabled when the CPU supports AVX2 (checked at runtime) and the hyperloglog configuration is default (HLL_REGISTERS == 16384 && HLL_BITS == 6). When merging 3 dense hll structures, the benchmark shows a 12x speedup compared to the scalar version. ``` pfcount key1 key2 key3 pfmerge keyall key1 key2 key3 ``` ``` ====================================================================================================== Type Ops/sec Avg. Latency p50 Latency p99 Latency p99.9 Latency KB/sec ------------------------------------------------------------------------------------------------------ PFCOUNT-scalar 5570.09 35.89060 32.51100 65.27900 69.11900 299.17 PFCOUNT-avx2 72604.92 2.82072 2.73500 5.50300 7.13500 3899.68 ------------------------------------------------------------------------------------------------------ PFMERGE-scalar 7879.13 25.52156 24.19100 46.33500 48.38300 492.45 PFMERGE-avx2 126448.64 1.58120 1.53500 3.08700 4.89500 7903.04 ------------------------------------------------------------------------------------------------------ scalar: redis:unstable `9906daf5c9` avx2: Nugine:hll-simd `02e09f85ac` CPU: 13th Gen Intel® Core™ i9-13900H × 20 Memory: 32.0 GiB OS: Ubuntu 22.04.5 LTS ``` Experiment repo: https://github.com/Nugine/redis-hyperloglog Benchmark script: https://github.com/Nugine/redis-hyperloglog/blob/main/scripts/memtier.sh Algorithm: https://github.com/Nugine/redis-hyperloglog/blob/main/cpp/bench.cpp resolves #13551 --------- Co-authored-by: Yuan Wang <wangyuancode@163.com> Co-authored-by: debing.sun <debing.sun@redis.com>	2024-11-08 15:19:38 +08:00
David Dougherty	9906daf5c9	Update old links for modules-api-ref.md (#13479 ) This PR replaces old .../topics/... links with current links, specifically for the modules-api-ref.md file and the new automation that Paolo Lazzari is working on. A few of the topics links have redirects, but some don't. Best to use updated links.	2024-11-04 18:18:22 +02:00
guybe7	ded8d993b7	Modules: defrag CB should take robj, not sds (#13627 ) Added a log of the keyname in the test modules to reproduce the problem (tests crash without the fix)	2024-10-30 17:32:51 +08:00
Moti Cohen	6437d07b03	Fix memory leak on rdbload error (#13626 ) On RDB load error, if an invalid `expireAt` value is read, `dupSearchDict` is not released.	2024-10-30 10:03:31 +02:00
debing.sun	4b29be3f36	Avoid redundant lpGet to boost quicklistCompare (#11533 ) `lpCompare()` in `quicklistCompare()` calls `lpGet()` again, which is wasted work. The change gives a boost to all commands that use `quicklistCompare()`, including `linsert`, `lpos` and `lrem`.	2024-10-30 08:45:25 +08:00
Moti Cohen	2ec78d262d	Add KEYSIZES section to INFO (#13592 ) This PR adds a new section to the `INFO` command output, called `keysizes`. This section provides detailed statistics on the distribution of key sizes for each data type (strings, lists, sets, hashes and zsets) within the dataset. The distribution is tracked using a base-2 logarithmic histogram. # Motivation Currently, Redis lacks a built-in feature to track key sizes and item sizes per data type at a granular level. Understanding the distribution of key sizes is critical for monitoring memory usage and optimizing performance, particularly in large datasets. This enhancement will allow users to inspect the size distribution of keys directly from the `INFO` command, assisting with performance analysis and capacity planning. # Changes New Section in `INFO` Command: A new section called `keysizes` has been added to the `INFO` command output. This section reports a per-database, per-type histogram of key sizes. It provides insights into how many keys fall into specific size ranges (represented in powers of 2). Example output: ``` 127.0.0.1:6379> INFO keysizes # Keysizes db0_distrib_strings_sizes:1=19,2=655,512=100899,1K=31,2K=29,4K=23,8K=16,16K=3,32K=2 db0_distrib_lists_items:1=5784492,32=3558,64=1047,128=676,256=533,512=218,4K=1,8K=42 db0_distrib_sets_items:1=735564=50612,8=21462,64=1365,128=974,2K=292,4K=154,8K=89, db0_distrib_hashes_items:2=1,4=544,32=141169,64=207329,128=4349,256=136226,1K=1 ``` ## Future Use Cases: The key size distribution is collected per slot as well, laying the groundwork for future enhancements related to Redis Cluster.	2024-10-29 13:07:26 +02:00
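A base-2 logarithmic histogram of the kind described for `keysizes` can be sketched in a few lines. The sketch assumes each bucket is labelled by the largest power of two not exceeding the measured size, which is consistent with the labels in the example output (`1`, `2`, `512`, `1K`, ...) but is not confirmed by the commit text; `size_hist`, `bucket_index` and `hist_add` are hypothetical names.

```c
#include <stdint.h>

#define HIST_BUCKETS 64

/* Hypothetical histogram: counts[i] holds the number of samples n
 * with 2^i <= n < 2^(i+1); zero-sized samples are counted apart. */
typedef struct {
    uint64_t counts[HIST_BUCKETS];
    uint64_t zeros;
} size_hist;

/* floor(log2(n)) for n >= 1, computed by shifting. */
static int bucket_index(uint64_t n) {
    int i = 0;
    while (n >>= 1) i++;
    return i;
}

/* Lower bound of bucket idx, i.e. the label printed for it. */
static uint64_t bucket_lower_bound(int idx) {
    return (uint64_t)1 << idx;
}

/* Record one key of size n (bytes for strings, items for collections). */
static void hist_add(size_hist *h, uint64_t n) {
    if (n == 0) {
        h->zeros++;
        return;
    }
    h->counts[bucket_index(n)]++;
}
```

With this layout a string of 700 bytes lands in the `512` bucket and a list of 5 items in the `4` bucket, so 64 counters per type cover the whole 64-bit size range.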
Shockingly Good	611c950293	Fix crash in RM_GetCurrentUserName() when the user isn't accessible (#13619 ) The crash happens whenever the user isn't accessible, for example, it isn't set for the context (when it is temporary) or in some other cases like `notifyKeyspaceEvent`. To properly check for the ACL compliance, we need to get the user name and the user to invoke other APIs. However, it is not possible if it crashes, and it is impossible to work that around in the code since we don't know (and shouldn't know!) when it is available and when it is not.	2024-10-28 21:26:29 +08:00
opt-m	0a8e546957	Fix get # option in sort command (#13608 ) From 7.4, Redis allows `GET` options in cluster mode when the pattern maps to the same slot as the key, but the `GET #` pattern, which represents the key itself, was missed. This commit fixes that (bug report #13607). --------- Co-authored-by: Yuan Wang <yuan.wang@redis.com>	2024-10-22 09:55:00 +08:00
debing.sun	4f8cdc2a1e	Fix compilation on compilers that do not support target attribute (#13609 ) introduced by https://github.com/redis/redis/pull/13359 failure CI on ARM64: https://github.com/redis/redis-extra-ci/actions/runs/11377893230/job/31652773710 --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com> Co-authored-by: ShooterIT <wangyuancode@163.com>	2024-10-18 09:11:23 +08:00
hanhui365	3788a055fe	Optimize bitcount command by using popcnt (#13359 ) Nowadays the popcnt instruction is supported by almost all x86 machines. It computes the "Hamming weight" and can bring a significant performance boost to the redis bitcount command. --------- Signed-off-by: hanhui365(hanhui@hygon.cn) Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: oranagra <oran@redislabs.com> Co-authored-by: Nugine <nugine@foxmail.com>	2024-10-17 09:13:19 +08:00
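As a rough illustration of counting set bits with a population-count primitive, the sketch below uses the GCC/Clang builtins `__builtin_popcountll` and `__builtin_popcount`. Whether these compile to the hardware popcnt instruction depends on the compiler and target flags, and the function name `count_set_bits` is hypothetical; this is not the code from the commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the set bits in a byte buffer, eight bytes at a time. */
static size_t count_set_bits(const unsigned char *p, size_t len) {
    size_t bits = 0;
    while (len >= 8) {
        uint64_t w;
        memcpy(&w, p, 8);  /* safe unaligned load */
        bits += (size_t)__builtin_popcountll(w);
        p += 8;
        len -= 8;
    }
    /* Handle the remaining 0..7 bytes one at a time. */
    while (len--) bits += (size_t)__builtin_popcount(*p++);
    return bits;
}
```

For the six bytes of the string `foobar` this returns 26, matching the well-known `BITCOUNT` example from the Redis documentation.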
Yuan Wang	b71a610f5c	Clean up .rediscli_history_test temporary file (#13601 ) After running the tests locally, a file named `.rediscli_history_test` is left behind. It is not listed in `.gitignore`, so git reports the code base as changed, which is a little annoying. This commit cleans up the temporary file. We should delete `.rediscli_history_test` at the end since the second server's tests also write to it; to match, `set ::env(REDISCLI_HISTFILE) ".rediscli_history_test"` is placed at the beginning. Maybe we can also add this file to `.gitignore`	2024-10-17 09:12:11 +08:00
YaacovHazan	efcfffc528	Update modules with latest version (#13606 ) Update redisbloom, redisjson and redistimeseries versions to 7.99.1 Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>	2024-10-15 19:58:42 +03:00
paoloredis	99d09c824c	Only run redis_docs_sync.yaml on latest release (#13603 ) We only want to trigger the workflow on the documentation repository for the latest release	2024-10-15 16:02:11 +03:00
YaacovHazan	6c5e263d7b	Temporarily hide the new SFLUSH command by marking it as experimental (#13600 ) - Add a new 'EXPERIMENTAL' command flag, which causes the command generator to skip over the command and makes it unavailable for execution - Skip experimental tests by default - Move the SFLUSH tests from the old framework to the new one --------- Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>	2024-10-15 11:02:51 +03:00
debing.sun	3fc7ef8f81	Fix race in stream-cgroups test (#13593 ) failed CI: https://github.com/redis/redis/actions/runs/11171608362/job/31056659165 https://github.com/redis/redis/actions/runs/11226025974/job/31205787575	2024-10-12 09:23:19 +08:00
guybe7	a38c29b6c8	Cleanups related to expiry/eviction (#13591 ) 1. `dbRandomKey`: excessive call to `dbFindExpires` (will always return 1 if `allvolatile` + anyway called inside `expireIfNeeded`) 2. Add `deleteKeyAndPropagate` that is used by both expiry/eviction 3. Change the order of calls in `expireIfNeeded` to save redundant calls to `keyIsExpired` 4. `expireIfNeeded`: move `OBJ_STATIC_REFCOUNT` to `deleteKeyAndPropagate` 5. `performEvictions` now uses `deleteEvictedKeyAndPropagate` 6. active-expire: moved `postExecutionUnitOperations` inside `activeExpireCycleTryExpire` 7. `activeExpireCycleTryExpire`: less indentation + expire a key if `now == t` 8. rename `lazy_expire_disabled` to `allow_access_expired`	2024-10-10 16:58:52 +08:00
Oran Agra	472d8a0df5	Prevent pattern matching abuse (CVE-2024-31228)	2024-10-08 20:55:44 +03:00
Oran Agra	8ec5da785b	Fix ACL SETUSER Read/Write key pattern selector (CVE-2024-31227) The '%' rule must contain one or both of R/W	2024-10-08 20:55:44 +03:00
Oran Agra	3a2669e8ae	Fix lua bit.tohex (CVE-2024-31449) INT_MIN value must be explicitly checked, and cannot be negated.	2024-10-08 20:55:44 +03:00
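The note that INT_MIN "cannot be negated" refers to a general C pitfall: in two's complement, `-INT_MIN` does not fit in an `int`, so negating it is undefined behavior. A minimal sketch of an explicit guard follows; `safe_abs` is a hypothetical helper and clamping to `INT_MAX` is one possible policy, not necessarily the one used in the fix.

```c
#include <limits.h>

/* Absolute value that never evaluates -INT_MIN.
 * INT_MIN is clamped to INT_MAX because |INT_MIN| is not representable. */
static int safe_abs(int n) {
    if (n == INT_MIN) return INT_MAX;
    return n < 0 ? -n : n;
}
```

Any code that treats a negative argument as a flag and then negates it (for example, a digit count whose sign selects upper-case output) needs this kind of check before the negation.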
alonre24	f39e51178e	Update target module in search (#13578 ) Update search target path and version from M02	2024-10-08 13:58:28 +03:00
chx9	5f7d7ce8b0	fix typo in test_helper.tcl (#13576 ) fix typo in test_helper.tcl: even driven => event driven	2024-10-08 14:15:48 +08:00
Moti Cohen	d092d64d7a	Add new SFLUSH command to cluster for slot-based FLUSH (#13564 ) This PR introduces a new `SFLUSH` command to cluster mode that allows partial flushing of nodes based on specified slot ranges. The current implementation is designed to flush all slots of a shard, but future extensions could allow for more granular flushing. Command Usage: `SFLUSH <start-slot> <end-slot> [<start-slot> <end-slot>]* [SYNC\|ASYNC]` This command removes all data from the specified slots, either synchronously or asynchronously depending on the optional SYNC/ASYNC argument. Functionality: The current implementation of `SFLUSH` verifies that the provided slot ranges are valid and cover all of the node's slots before proceeding. If slots are partially or incorrectly specified, the command fails and returns an error, so the flush proceeds only when all slots of the node are covered. The command supports both synchronous (default) and asynchronous flushing. In addition, if possible, SFLUSH SYNC will be run as blocking ASYNC as an optimization.	2024-09-29 09:13:21 +03:00
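The coverage check described above can be sketched as follows. The sketch assumes the requested ranges must match the node's owned slots exactly (the commit text only states that they must be valid and cover all of the node's slots), and it uses a plain boolean array for ownership; `slot_range` and `ranges_cover_owned` are hypothetical names. The 16384-slot keyspace is standard for Redis Cluster.

```c
#include <stdbool.h>
#include <string.h>

#define CLUSTER_SLOTS 16384

typedef struct {
    int start;
    int end;  /* inclusive */
} slot_range;

/* Returns true only if every range is well-formed and the union of the
 * ranges is exactly the set of slots marked in `owned`.
 * Assumption: slots the node does not own must not be requested. */
static bool ranges_cover_owned(const slot_range *r, int nranges,
                               const bool owned[CLUSTER_SLOTS]) {
    bool requested[CLUSTER_SLOTS];
    memset(requested, 0, sizeof requested);

    for (int i = 0; i < nranges; i++) {
        if (r[i].start < 0 || r[i].end >= CLUSTER_SLOTS ||
            r[i].start > r[i].end) {
            return false;  /* malformed range */
        }
        for (int s = r[i].start; s <= r[i].end; s++) requested[s] = true;
    }
    for (int s = 0; s < CLUSTER_SLOTS; s++) {
        if (requested[s] != owned[s]) return false;  /* partial or extra */
    }
    return true;
}
```

Rejecting partial coverage up front keeps the command all-or-nothing, which fits the stated current design of flushing every slot of a shard.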

12308 Commits

All Branches