redis

Commit Graph

Author	SHA1	Message	Date
Yuan Wang	033abd6f57	Async IO threads (#13665 ) ## Introduction Redis introduced IO Thread in 6.0, allowing IO threads to handle client request reading, command parsing and reply writing, thereby improving performance. The current IO thread implementation has a few drawbacks. - The main thread is blocked during IO thread read/write operations and must wait for all IO threads to complete their current tasks before it can continue execution. In other words, the entire process is synchronous. This prevents the efficient utilization of multi-core CPUs for parallel processing. - When the number of clients and requests increases moderately, it causes all IO threads to reach full CPU utilization due to the busy wait mechanism used by the IO threads. This makes it challenging for us to determine which part of Redis has reached its bottleneck. - When IO threads are enabled with TLS and io-threads-do-reads, a disconnection of a connection with pending data may result in it being assigned to multiple IO threads simultaneously. This can cause race conditions and trigger assertion failures. Related issue: https://github.com/redis/redis/issues/12540 Therefore, we designed an asynchronous IO threads solution. The IO threads adopt an event-driven model, with the main thread dedicated to command processing, meanwhile, the IO threads handle client read and write operations in parallel. ## Implementation ### Overall As before, we did not change the fact that all client commands must be executed on the main thread, because Redis was originally designed to be single-threaded, and processing commands in a multi-threaded manner would inevitably introduce numerous race and synchronization issues. But now each IO thread has independent event loop, therefore, IO threads can use a multiplexing approach to handle client read and write operations, eliminating the CPU overhead caused by busy-waiting. 
the execution process can be briefly described as follows: the main thread assigns clients to IO threads after accepting connections, IO threads will notify the main thread when clients finish reading and parsing queries, then the main thread processes queries from IO threads and generates replies, IO threads handle writing reply to clients after receiving clients list from main thread, and then continue to handle client read and write events. ### Each IO thread has independent event loop We now assign each IO thread its own event loop. This approach eliminates the need for the main thread to perform the costly `epoll_wait` operation for handling connections (except for specific ones). Instead, the main thread processes requests from the IO threads and hands them back once completed, fully offloading read and write events to the IO threads. Additionally, all TLS operations, including handling pending data, have been moved entirely to the IO threads. This resolves the issue where io-threads-do-reads could not be used with TLS. ### Event-notified client queue To facilitate communication between the IO threads and the main thread, we designed an event-notified client queue. Each IO thread and the main thread have two such queues to store clients waiting to be processed. These queues are also integrated with the event loop to enable handling. We use pthread_mutex to ensure the safety of queue operations, as well as data visibility and ordering, and race conditions are minimized, as each IO thread and the main thread operate on independent queues, avoiding thread suspension due to lock contention. And we implemented an event notifier based on `eventfd` or `pipe` to support event-driven handling. ### Thread safety Since the main thread and IO threads can execute in parallel, we must handle data race issues carefully. client->flags The primary tasks of IO threads are reading and writing, i.e. `readQueryFromClient` and `writeToClient`. 
However, IO threads and the main thread may concurrently modify or access `client->flags`, leading to potential race conditions. To address this, we introduced an io-flags variable to record operations performed by IO threads, thereby avoiding race conditions on `client->flags`. Pause IO thread In the main thread, we may want to operate data of IO threads, maybe uninstall event handler, access or operate query/output buffer or resize event loop, we need a clean and safe context to do that. We pause IO thread in `IOThreadBeforeSleep`, do some jobs and then resume it. To avoid thread suspended, we use busy waiting to confirm the target status. Besides we use atomic variable to make sure memory visibility and ordering. We introduce these functions to pause/resume IO Threads as below. ``` pauseIOThread, resumeIOThread pauseAllIOThreads, resumeAllIOThreads pauseIOThreadsRange, resumeIOThreadsRange ``` Testing has shown that `pauseIOThread` is highly efficient, allowing the main thread to execute nearly 200,000 operations per second during stress tests. Similarly, `pauseAllIOThreads` with 8 IO threads can handle up to nearly 56,000 operations per second. But operations performed between pausing and resuming IO threads must be quick; otherwise, they could cause the IO threads to reach full CPU utilization. freeClient and freeClientAsync The main thread may need to terminate a client currently running on an IO thread, for example, due to ACL rule changes, reaching the output buffer limit, or evicting a client. In such cases, we need to pause the IO thread to safely operate on the client. maxclients and maxmemory-clients updating When adjusting `maxclients`, we need to resize the event loop for all IO threads. Similarly, when modifying `maxmemory-clients`, we need to traverse all clients to calculate their memory usage. To ensure safe operations, we pause all IO threads during these adjustments. 
Client info reading The main thread may need to read a client’s fields to generate a descriptive string, such as for the `CLIENT LIST` command or logging purposes. In such cases, we need to pause the IO thread handling that client. If information for all clients needs to be displayed, all IO threads must be paused. Tracking redirect Redis supports the tracking feature and can even send invalidation messages to a connection with a specified ID. But the target client may be running on IO thread, directly manipulating the client’s output buffer is not thread-safe, and the IO thread may not be aware that the client requires a response. In such cases, we pause the IO thread handling the client, modify the output buffer, and install a write event handler to ensure proper handling. clientsCron In the `clientsCron` function, the main thread needs to traverse all clients to perform operations such as timeout checks, verifying whether they have reached the soft output buffer limit, resizing the output/query buffer, or updating memory usage. To safely operate on a client, the IO thread handling that client must be paused. If we were to pause the IO thread for each client individually, the efficiency would be very low. Conversely, pausing all IO threads simultaneously would be costly, especially when there are many IO threads, as clientsCron is invoked relatively frequently. To address this, we adopted a batched approach for pausing IO threads. At most, 8 IO threads are paused at a time. The operations mentioned above are only performed on clients running in the paused IO threads, significantly reducing overhead while maintaining safety. ### Observability In the current design, the main thread always assigns clients to the IO thread with the least clients. To clearly observe the number of clients handled by each IO thread, we added the new section in INFO output. The `INFO THREADS` section can show the client count for each IO thread. 
``` # Threads io_thread_0:clients=0 io_thread_1:clients=2 io_thread_2:clients=2 ``` Additionally, in the `CLIENT LIST` output, we also added a field to indicate the thread to which each client is assigned. `id=244 addr=127.0.0.1:41870 laddr=127.0.0.1:6379 ... resp=2 lib-name= lib-ver= io-thread=1` ## Trade-off ### Special Clients For certain special types of clients, keeping them running on IO threads would result in severe race issues that are difficult to resolve. Therefore, we chose not to offload these clients to the IO threads. For replica, monitor, subscribe, and tracking clients, main thread may directly write them a reply when conditions are met. Race issues are difficult to resolve, so we have them processed in the main thread. This includes the Lua debug clients as well, since we may operate connection directly. For blocking client, after the IO thread reads and parses a command and hands it over to the main thread, if the client is identified as a blocking type, it will be remained in the main thread. Once the blocking operation completes and the reply is generated, the client is transferred back to the IO thread to send the reply and wait for event triggers. ### Clients Eviction To support client eviction, it is necessary to update each client’s memory usage promptly during operations such as read, write, or command execution. However, when a client operates on an IO thread, it is not feasible to update the memory usage immediately due to the risk of data races. As a result, memory usage can only be updated either in the main thread while processing commands or in the `ClientsCron` periodically. The downside of this approach is that updates might experience a delay of up to one second, which could impact the precision of memory management for eviction. To avoid incorrectly evicting clients. 
We adopted a best-effort compensation solution: when we decide to evict a client, we update its memory usage again before evicting. If the client's memory usage has not decreased, or its memory usage bucket has not changed, we evict it; otherwise we do not. However, we have not completely solved this problem. Because of the delay in memory usage updates, we may still make incorrect decisions about whether a client needs to be evicted. ### Defragment In the majority of cases we do NOT use the data from argv directly in the db. 1. key names: we store a copy that we allocate in the main thread, see `sdsdup()` in `dbAdd()`. 2. hash key and value: we store the key as hfield and the value as sds, see `hfieldNew()` and `sdsdup()` in `hashTypeSet()`. 3. other datatypes: they don't even use SDS, so there are no reference issues. But in some cases the data from argv may be retained by the main thread. As a result, during fragmentation cleanup, we need to move allocations from the IO thread’s arena to the main thread’s arena. We always allocate new memory in the main thread’s arena, but the memory released by IO threads may not yet have been reclaimed. This ultimately causes the fragmentation rate to be higher compared to creating and allocating entirely within a single thread. The following cases lead to memory allocated by the IO thread being kept by the main thread. 1. string related commands: `append`, `getset`, `mset` and `set`. If `tryObjectEncoding()` does not change argv, we keep it directly in the main thread, see the code in `tryObjectEncoding()` (specifically `trimStringObjectIfNeeded()`). 2. block related commands: the key names will be kept in `c->db->blocking_keys`. 3. watch command: the key names will be kept in `c->db->watched_keys`. 4. [s]subscribe command: the channel name will be kept in `serverPubSubChannels`. 5. script load command: the script will be kept in `server.lua_scripts`. 6. 
some module API: `RM_RetainString`, `RM_HoldString` Those issues will be handled in other PRs. ## Testing ### Functional Testing With IO threads enabled, the commit passes all TCL tests, but we made some changes: Client query buffer: In the original code, when using a reusable query buffer, ownership of the query buffer would be released after the command was processed. However, with IO threads enabled, the client transitions from an IO thread to the main thread for processing. This causes the ownership release to occur earlier than the command execution. As a result, when IO threads are enabled, the client's information will never indicate that a shared query buffer is in use. Therefore, we skip the corresponding query buffer tests in this case. Defragment: Add a new defragmentation test to verify the effect of IO threads on defragmentation. Command delay: For deferred clients in TCL tests, delays may occur because clients are assigned to different threads for execution. To address this, we introduced conditional waiting: the process proceeds to the next step only when the `client list` output contains the corresponding commands. ### Sanitizer Testing The commit passed all TCL tests and reported no errors when compiled with the `-fsanitize=thread` and `-fsanitize=address` options enabled. But we made the following modification: we suppressed the sanitizer warnings for clients with watched keys when updating `client->flags`. IO threads read `client->flags` but never modify it or read the `CLIENT_DIRTY_CAS` bit, and the main thread only modifies this bit, so we believe there is no actual data race. ## Others ### IO thread number In the new multi-threaded design, the main thread is primarily focused on command processing to improve performance. Typically, the main thread does not handle regular client I/O operations but is responsible for clients such as replication and tracking clients. To avoid breaking changes, we still consider the main thread as the first IO thread. 
When the io-threads configuration is set to a low value (e.g., 2), performance does not show a significant improvement compared to a single-threaded setup for simple commands (such as SET or GET), as the main thread does not consume much CPU for these simple operations. This results in underutilized multi-core capacity. However, for more complex commands, having a low number of IO threads may still be beneficial. Therefore, it’s important to adjust the `io-threads` based on your own performance tests. Additionally, you can clearly monitor the CPU utilization of the main thread and IO threads using `top -H -p $redis_pid`. This allows you to easily identify where the bottleneck is. If the IO thread is the bottleneck, increasing the `io-threads` will improve performance. If the main thread is the bottleneck, the overall performance can only be scaled by increasing the number of shards or replicas. --------- Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: oranagra <oran@redislabs.com>	2024-12-22 19:30:37 +08:00
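The event-notified client queue described in the Async IO threads commit above can be illustrated with a minimal Python sketch. This is not the Redis implementation (which is C and uses `pthread_mutex` with `eventfd` or `pipe`); the class and function names here are hypothetical, and the sketch assumes a POSIX system where a pipe can be registered with a selector. It only shows the idea: a lock-protected queue whose producer wakes the consumer's event loop by writing to a pipe.

```python
import os
import selectors
import threading
from collections import deque


class EventNotifiedQueue:
    """Lock-protected FIFO whose producer wakes the consumer through a pipe.

    Hypothetical illustration; Redis uses pthread_mutex plus eventfd/pipe in C.
    """

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()
        self._read_fd, self._write_fd = os.pipe()
        os.set_blocking(self._read_fd, False)

    def fileno(self):
        # Lets a selector (the consumer's event loop) watch the queue directly.
        return self._read_fd

    def push(self, item):
        with self._lock:
            was_empty = not self._items
            self._items.append(item)
        if was_empty:
            # One notification per empty -> non-empty transition is enough.
            os.write(self._write_fd, b"\x01")

    def drain(self):
        # Consume pending notifications first, then take everything queued.
        try:
            os.read(self._read_fd, 4096)
        except BlockingIOError:
            pass
        with self._lock:
            items = list(self._items)
            self._items.clear()
        return items

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)


def io_thread(queue, clients):
    """Stand-in for an IO thread handing parsed clients to the main thread."""
    for client in clients:
        queue.push(client)


def main_thread_collect(queue, expected, timeout=5.0):
    """Stand-in for the main thread's event loop: sleep until notified."""
    sel = selectors.DefaultSelector()
    sel.register(queue, selectors.EVENT_READ)
    received = []
    while len(received) < expected:
        if not sel.select(timeout):
            break  # timed out; give up rather than spin
        received.extend(queue.drain())
    sel.close()
    return received
```

Because the consumer blocks in `select` instead of polling, an idle queue costs no CPU, which is the property the commit message contrasts with the earlier busy-wait design.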
Yuan Wang	779af3ab92	Dynamic event loop binding for connection structure (#13642 ) The IO thread has an independent event loop, so we can no longer hard-code the event loop to the connection; instead, we should dynamically select the event loop for the connection. - configure the event loop during connection creation. - add a new interface to allow dynamic event loop binding. For TLS connections, we need to check for any pending data on the connection and handle it accordingly when moving a connection between an IO thread and the main thread. This commit doesn't handle it; @sundb will add overall support for TLS connections later. --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-11-08 15:58:10 +08:00
guybe7	ded8d993b7	Modules: defrag CB should take robj, not sds (#13627 ) Added a log of the keyname in the test modules to reproduce the problem (tests crash without the fix)	2024-10-30 17:32:51 +08:00
Moti Cohen	6437d07b03	Fix memory leak on rdbload error (#13626 ) On RDB load error, if an invalid `expireAt` value is read, `dupSearchDict` is not released.	2024-10-30 10:03:31 +02:00
debing.sun	4b29be3f36	Avoid redundant lpGet to boost quicklistCompare (#11533 ) `lpCompare()` in `quicklistCompare()` will call `lpGet()` again, which is wasteful. The change will result in a boost for all commands that use `quicklistCompare()`, including `linsert`, `lpos` and `lrem`.	2024-10-30 08:45:25 +08:00
Moti Cohen	2ec78d262d	Add KEYSIZES section to INFO (#13592 ) This PR adds a new section to the `INFO` command output, called `keysizes`. This section provides detailed statistics on the distribution of key sizes for each data type (strings, lists, sets, hashes and zsets) within the dataset. The distribution is tracked using a base-2 logarithmic histogram. # Motivation Currently, Redis lacks a built-in feature to track key sizes and item sizes per data type at a granular level. Understanding the distribution of key sizes is critical for monitoring memory usage and optimizing performance, particularly in large datasets. This enhancement will allow users to inspect the size distribution of keys directly from the `INFO` command, assisting with performance analysis and capacity planning. # Changes New Section in `INFO` Command: A new section called `keysizes` has been added to the `INFO` command output. This section reports a per-database, per-type histogram of key sizes. It provides insights into how many keys fall into specific size ranges (represented in powers of 2). Example output: ``` 127.0.0.1:6379> INFO keysizes # Keysizes db0_distrib_strings_sizes:1=19,2=655,512=100899,1K=31,2K=29,4K=23,8K=16,16K=3,32K=2 db0_distrib_lists_items:1=5784492,32=3558,64=1047,128=676,256=533,512=218,4K=1,8K=42 db0_distrib_sets_items:1=735564=50612,8=21462,64=1365,128=974,2K=292,4K=154,8K=89, db0_distrib_hashes_items:2=1,4=544,32=141169,64=207329,128=4349,256=136226,1K=1 ``` ## Future Use Cases: The key size distribution is collected per slot as well, laying the groundwork for future enhancements related to Redis Cluster.	2024-10-29 13:07:26 +02:00
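To make the base-2 logarithmic histogram from the KEYSIZES commit above concrete, here is a small Python sketch. It assumes each bucket is labelled by its lower bound (so sizes 2 and 3 both land in bucket `2`) and that sizes of zero go to a `0` bucket; both are assumptions of this sketch, and the function names are hypothetical rather than taken from the Redis source.

```python
from collections import Counter


def bucket_for(size):
    """Return the power-of-two lower bound of the bucket holding `size`."""
    if size <= 0:
        return 0  # assumption: empty values get their own bucket
    return 1 << (size.bit_length() - 1)


def format_bucket(bucket):
    """Render a bucket label using K/M/G suffixes, as in the sample output."""
    for unit, suffix in ((1 << 30, "G"), (1 << 20, "M"), (1 << 10, "K")):
        if bucket >= unit:
            return f"{bucket // unit}{suffix}"
    return str(bucket)


def histogram_line(name, sizes):
    """Build one INFO-style line such as 'name:1=3,2=7,1K=2'."""
    counts = Counter(bucket_for(s) for s in sizes)
    fields = ",".join(f"{format_bucket(b)}={counts[b]}" for b in sorted(counts))
    return f"{name}:{fields}"


print(histogram_line("db0_distrib_strings_sizes", [1, 2, 3, 700, 1024, 1500]))
```

Only non-empty buckets are printed, which matches the sparse look of the example output in the commit message.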
Shockingly Good	611c950293	Fix crash in RM_GetCurrentUserName() when the user isn't accessible (#13619 ) The crash happens whenever the user isn't accessible, for example, when it isn't set for the context (when the context is temporary) or in some other cases like `notifyKeyspaceEvent`. To properly check for ACL compliance, we need to get the user name and the user to invoke other APIs. However, that is not possible if it crashes, and it is impossible to work around that in the code since we don't know (and shouldn't know!) when the user is available and when it is not.	2024-10-28 21:26:29 +08:00
opt-m	0a8e546957	Fix get # option in sort command (#13608 ) From 7.4, Redis allows `GET` options in cluster mode when the pattern maps to the same slot as the key, but the `GET #` pattern, which represents the key itself, was missed. This commit resolves it; see bug report #13607. --------- Co-authored-by: Yuan Wang <yuan.wang@redis.com>	2024-10-22 09:55:00 +08:00
debing.sun	4f8cdc2a1e	Fix compilation on compilers that do not support target attribute (#13609 ) introduced by https://github.com/redis/redis/pull/13359 failure CI on ARM64: https://github.com/redis/redis-extra-ci/actions/runs/11377893230/job/31652773710 --------- Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com> Co-authored-by: ShooterIT <wangyuancode@163.com>	2024-10-18 09:11:23 +08:00
hanhui365	3788a055fe	Optimize bitcount command by using popcnt (#13359 ) Nowadays the popcnt instruction, which calculates the "Hamming weight", is supported by almost all x86 machines; it can bring a significant performance boost to the Redis bitcount command. --------- Signed-off-by: hanhui365(hanhui@hygon.cn) Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: oranagra <oran@redislabs.com> Co-authored-by: Nugine <nugine@foxmail.com>	2024-10-17 09:13:19 +08:00
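The "Hamming weight" that the bitcount commit above computes with popcnt is simply the number of set bits in a buffer. The Python sketch below is an illustration only and says nothing about how the C code is structured; the function names are hypothetical. It contrasts a per-byte lookup table with a loop that counts 8 bytes at a time, which is roughly the shape a word-wide popcount loop takes.

```python
# Number of set bits for every possible byte value (0..255).
_BITS_IN_BYTE = [bin(b).count("1") for b in range(256)]


def bitcount_table(data: bytes) -> int:
    """Count set bits one byte at a time using a lookup table."""
    return sum(_BITS_IN_BYTE[b] for b in data)


def bitcount_words(data: bytes) -> int:
    """Count set bits 8 bytes per step, then finish the tail byte-wise."""
    total = 0
    full = len(data) - len(data) % 8
    for i in range(0, full, 8):
        word = int.from_bytes(data[i:i + 8], "little")
        total += bin(word).count("1")  # stand-in for a hardware popcount
    return total + bitcount_table(data[full:])


print(bitcount_words(b"foobar"))
```

Both functions return the same result for any input; the word-wide version just does fewer iterations, which is where a hardware instruction pays off.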
Yuan Wang	b71a610f5c	Clean up .rediscli_history_test temporary file (#13601 ) After running tests locally, a file named `.rediscli_history_test` is left behind. It is not in the `.gitignore` file, so git treats it as a change to the code base, which is a little annoying. This commit just cleans up the temporary file. We should delete `.rediscli_history_test` at the end, since the second server's tests also write something into it; to match that, I put `set ::env(REDISCLI_HISTFILE) ".rediscli_history_test"` at the beginning. Maybe we can also add this file to `.gitignore`.	2024-10-17 09:12:11 +08:00
YaacovHazan	efcfffc528	Update modules with latest version (#13606 ) Update redisbloom, redisjson and redistimeseries versions to 7.99.1 Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>	2024-10-15 19:58:42 +03:00
paoloredis	99d09c824c	Only run redis_docs_sync.yaml on latest release (#13603 ) We only want to trigger the workflow on the documentation repository for the latest release	2024-10-15 16:02:11 +03:00
YaacovHazan	6c5e263d7b	Temporarily hide the new SFLUSH command by marking it as experimental (#13600 ) - Add a new 'EXPERIMENTAL' command flag, which causes the command generator to skip over the command and makes it unavailable for execution - Skip experimental tests by default - Move the SFLUSH tests from the old framework to the new one --------- Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>	2024-10-15 11:02:51 +03:00
debing.sun	3fc7ef8f81	Fix race in stream-cgroups test (#13593 ) failed CI: https://github.com/redis/redis/actions/runs/11171608362/job/31056659165 https://github.com/redis/redis/actions/runs/11226025974/job/31205787575	2024-10-12 09:23:19 +08:00
guybe7	a38c29b6c8	Cleanups related to expiry/eviction (#13591 ) 1. `dbRandomKey`: excessive call to `dbFindExpires` (will always return 1 if `allvolatile` + anyway called inside `expireIfNeeded`) 2. Add `deleteKeyAndPropagate` that is used by both expiry/eviction 3. Change the order of calls in `expireIfNeeded` to save redundant calls to `keyIsExpired` 4. `expireIfNeeded`: move `OBJ_STATIC_REFCOUNT` to `deleteKeyAndPropagate` 5. `performEvictions` now uses `deleteEvictedKeyAndPropagate` 6. active-expire: moved `postExecutionUnitOperations` inside `activeExpireCycleTryExpire` 7. `activeExpireCycleTryExpire`: less indentation + expire a key if `now == t` 8. rename `lazy_expire_disabled` to `allow_access_expired`	2024-10-10 16:58:52 +08:00
Oran Agra	472d8a0df5	Prevent pattern matching abuse (CVE-2024-31228)	2024-10-08 20:55:44 +03:00
Oran Agra	8ec5da785b	Fix ACL SETUSER Read/Write key pattern selector (CVE-2024-31227) The '%' rule must contain one or both of R/W	2024-10-08 20:55:44 +03:00
Oran Agra	3a2669e8ae	Fix lua bit.tohex (CVE-2024-31449) INT_MIN value must be explicitly checked, and cannot be negated.	2024-10-08 20:55:44 +03:00
alonre24	f39e51178e	Update target module in search (#13578 ) Update search target path and version from M02	2024-10-08 13:58:28 +03:00
chx9	5f7d7ce8b0	fix typo in test_helper.tcl (#13576 ) fix typo in test_helper.tcl: even driven => event driven	2024-10-08 14:15:48 +08:00
Moti Cohen	d092d64d7a	Add new SFLUSH command to cluster for slot-based FLUSH (#13564 ) This PR introduces a new `SFLUSH` command to cluster mode that allows partial flushing of nodes based on specified slot ranges. The current implementation is designed to flush all slots of a shard, but future extensions could allow for more granular flushing. Command Usage: `SFLUSH <start-slot> <end-slot> [<start-slot> <end-slot>]* [SYNC|ASYNC]` This command removes all data from the specified slots, either synchronously or asynchronously depending on the optional SYNC/ASYNC argument. Functionality: The current implementation of the `SFLUSH` command verifies that the provided slot ranges are valid and cover all of the node's slots before proceeding. If slots are partially or incorrectly specified, the command fails and returns an error, so a flush proceeds only when all slots of a node are fully covered. The command supports both synchronous (default) and asynchronous flushing. In addition, if possible, SFLUSH SYNC will be run as blocking ASYNC as an optimization.	2024-09-29 09:13:21 +03:00
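The validation step described in the SFLUSH commit above (ranges must be valid and must cover all of the node's slots) might look roughly like the Python sketch below. The function name is hypothetical, and the sketch assumes the requested slots must match the node's owned slots exactly; the commit message only states that all of the node's slots must be covered, so how extra slots are treated is a guess.

```python
TOTAL_SLOTS = 16384  # number of hash slots in a Redis Cluster


def covers_all_owned_slots(requested_ranges, owned_slots):
    """Return True if the ranges are well-formed and match the owned slots.

    requested_ranges: iterable of (start, end) pairs, both ends inclusive.
    owned_slots: iterable of slot numbers served by this node.
    Assumption of this sketch: exact match is required.
    """
    requested = set()
    for start, end in requested_ranges:
        if not (0 <= start <= end < TOTAL_SLOTS):
            return False  # malformed or out-of-range pair
        requested.update(range(start, end + 1))
    return requested == set(owned_slots)


owned = range(0, 5461)
print(covers_all_owned_slots([(0, 5460)], owned))
print(covers_all_owned_slots([(0, 100)], owned))
```

Rejecting partial coverage up front is what lets the current implementation treat the command as a whole-shard flush.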
Ozan Tezcan	99c40ab53d	Use hashtable as the default type of temp set object during sunion/sdiff (#13567 ) This PR is based on https://github.com/valkey-io/valkey/pull/996 Currently, for operations like SUNION or SDIFF, the temporary set object can be an intset or a listpack. Search operations are costly for these encodings. This patch sets the temporary set object to a hash table by default. It also tries to determine the correct encoding for the temporary set object to reduce unnecessary conversions. This change is expected to give a performance boost for tests like: - [memtier_benchmark-2keys-set-10-100-elements-sdiff](https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-2keys-set-10-100-elements-sdiff.yml) 66.2% IMPROVEMENT - [memtier_benchmark-2keys-set-10-100-elements-sunion](https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-2keys-set-10-100-elements-sunion.yml) 126.5% IMPROVEMENT ------- Co-authored-by: Lipeng Zhu <lipeng.zhu@intel.com> Co-authored-by: Wangyang Guo <wangyang.guo@intel.com>	2024-09-25 12:41:17 +03:00
Moti Cohen	26ef28467a	Optimize ZUNION[STORE] by avoiding redundant temporary dict usage (#13566 ) This PR is based on valkey-io/valkey#829 Previously, ZUNION and ZUNIONSTORE commands used a temporary accumulator dict and at the end copied it as-is to dstzset->dict. This PR removes accumulator and directly stores into dstzset->dict, eliminating the extra copy. Co-authored-by: Rayacoo zisong.cw@alibaba-inc.com	2024-09-25 11:55:00 +03:00
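The idea in the ZUNION[STORE] commit above, accumulating scores directly in the destination dict instead of in a temporary one that is copied at the end, can be sketched in Python as follows. This is only an illustration with a hypothetical function name; it models the default SUM aggregation and uses a plain dict in place of `dstzset->dict`.

```python
def zunion_sum(inputs, weights=None):
    """Union several member->score mappings, summing weighted scores.

    Scores are accumulated straight into `dst`, the mapping that is
    returned, so no intermediate accumulator has to be copied afterwards.
    """
    if weights is None:
        weights = [1.0] * len(inputs)
    dst = {}  # plays the role of the destination sorted set's dict
    for zset, weight in zip(inputs, weights):
        for member, score in zset.items():
            dst[member] = dst.get(member, 0.0) + score * weight
    return dst


result = zunion_sum([{"a": 1, "b": 2}, {"b": 3, "c": 4}], weights=[1, 2])
print(sorted(result.items(), key=lambda kv: (kv[1], kv[0])))
```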
Moti Cohen	5f28bd96db	Fix race in HFE tests (#13563 ) Test 1 - give more time for expiration Test 2 - Evaluate expiration time boundaries [+1,+2] before setting expiration [+1] Test 3 - Avoid race on test HFEs propagated to replica	2024-09-23 10:30:29 +03:00
debing.sun	438cfed70a	Replace wrongly free with zfree in redis-cli (#13560 ) #13258 Incorrect use of free instead of zfree	2024-09-23 09:40:47 +08:00
Moti Cohen	3a3cacfefa	Extend modules API to read also expired keys and subkeys (#13526 ) The PR extends `RedisModule_OpenKey`'s flags to include `REDISMODULE_OPEN_KEY_ACCESS_EXPIRED`, which allows access to expired keys. It also allows access to expired subkeys; this is currently relevant only for hash fields and affects `RM_HashGet` and `RM_Scan`.	2024-09-19 20:47:00 +03:00
debing.sun	617909e943	Align the offset in ASCII logo (#13557 ) Since `\\` is only one character, we need to add an extra space to the right.	2024-09-18 14:42:32 +08:00
adamiBs	e9cbfccec6	Support `musl` Rust Installation in Modules Makefile (#13549 ) This PR introduces the installation of the `musl`-based version of Rust, in order to support alpine-based runtime environments (Rust is used by [RedisJSON](https://github.com/RedisJSON/RedisJSON)).	2024-09-15 20:23:05 +03:00
Filipe Oliveira (Redis)	7b69183a8d	Replace usage of _addReplyLongLongWithPrefix with specific bulk/mbulk functions to reduce condition checks in hotpath. (#13520 ) Instead of adding runtime logic to decide which prefix/shared object to use when doing the reply we can simply use an inline method to avoid runtime overhead of condition checks, and also keep the code change small. Preliminary data show improvements on commands that heavily rely on bulk/mbulk replies (example of LRANGE). --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-09-15 21:40:09 +08:00
Filipe Oliveira (Personal)	af7fca797a	Using fast_float library for faster parsing of 64 decimal strings. (#11884 ) Fixes #8825 We're using the fast_float library[1] in our (compiled-in) floating-point fast_float_strtod implementation for faster and more portable parsing of 64 decimal strings. The single file fast_float.h is an amalgamation of the entire library, which can be (re)generated with the amalgamate.py script (from the fast_float repository) via the command: ``` python3 ./script/amalgamate.py --license=MIT > $REDIS_SRC/deps/fast_float/fast_float.h ``` [1]: https://github.com/fastfloat/fast_float The used commit from fast_float library was the one from https://github.com/fastfloat/fast_float/releases/tag/v3.10.1 --------- Co-authored-by: fcostaoliveira <filipe@redis.com>	2024-09-15 21:37:29 +08:00
Filipe Oliveira (Redis)	9146ac050b	Optimize HSCAN/ZSCAN command in case of listpack encoding: avoid the usage of intermediate list (#13531 ) Similar to #13530 , applied to HSCAN and ZSCAN in case of listpack encoding. Preliminary benchmark results showcase an improvement of 108% on the achievable ops/sec for ZSCAN and 65% for HSCAN. --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-09-13 20:36:19 +08:00
debing.sun	ef3a5f58a8	Fix missing initialization of EbucketsIterator->isRax (#13545 ) in https://github.com/redis/redis/pull/13519, when `eb` is empty, `isRax` is not correctly initialized to 0, which can lead to `ebStop()` potentially entering the wrong rax branch.	2024-09-13 17:12:27 +08:00
Filipe Oliveira (Redis)	f2f85ba354	Optimize SSCAN command in case of listpack or intset encoding: avoid the usage of intermediate list. From 2N to N iterations (#13530 ) On SSCAN, in case of listpack and intset encoding we actually reply with the entire set, and always reply with the cursor 0. For those cases, we don't need to accumulate the replies in a list and can completely avoid the overhead of list appending and then iterating over the list again -- meaning we do N iterations instead of 2N iterations over the SET and save intermediate memory as well. Preliminary benchmarks, `SSCAN set:100 0`, showcased an improvement of 60% on a SET with 100 string elements (listpack encoded).	2024-09-12 22:36:54 +08:00
Moti Cohen	c115c5230e	Add iterator capability to ebuckets DS (#13519 ) Add basic iterator API for ebuckets of start, next, nextBucket and stop.	2024-09-12 15:02:32 +03:00
Moti Cohen	65a87cb773	Correct spelling error at t_hash.c comment (#13540 ) spell check error : ./src/t_hash.c:1141: RESOTRE ==> RESTORE	2024-09-12 12:50:44 +03:00
Moti Cohen	9a89e32a95	HFE - Fix key ref by the hash on RENAME/MOVE/SWAPDB/RESTORE (#13539 ) If the hash previously had HFEs (hash-fields with expiration) but later no longer does, the key ref in the hash might become outdated after a MOVE, COPY, RENAME or RESTORE operation. These commands maintain the key ref only if HFEs are present. That is, we can only be sure that key ref is valid as long as the hash has HFEs.	2024-09-12 12:40:12 +03:00
Oran Agra	610eb26c11	RED-129256, Fix TOUCH command from script in no-touch mode (#13512 ) When a client in no-touch mode issues a TOUCH command on a key, the key’s access time should be updated, but in scripts, and module's RM_Call, it isn’t updated. Command proc should be matched to the executing client, not the current client. Co-authored-by: Udi Ron <udi@speedb.io>	2024-09-12 11:33:26 +03:00
Steve	d265a61438	Avoid cluster.nodes load corruption due to shard-id generation (#13468 ) PR #13428 doesn't fully resolve an issue where corruption errors can still occur on loading of cluster.nodes file - seen on upgrade where there were no shard_ids (from old Redis), 7.2.5 loading generated new random ones, and persisted them to the file before gossip/handshake could propagate the correct ones (or some other nodes unreachable). This results in a primary/replica having differing shard_id in the cluster.nodes and then the server cannot startup - reports corruption. This PR builds on #13428 by simply ignoring the replica's shard_id in cluster.nodes (if it exists), and uses the replica's primary's shard_id. Additional handling was necessary to cover the case where the replica appears before the primary in cluster.nodes, where it will first use a generated shard_id for the primary, and then correct after it loads the primary cluster.nodes entry. --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-09-12 14:01:09 +08:00
debing.sun	2dd4cca363	Increment kvstore's non_empty_dicts only on first insert (#13528 ) Found by @oranagra Currently, when the size of dict becomes 1, we do not check whether `delta` is positive or negative. As a result, `non_empty_dicts` is still incremented when the size of dict changes from 2 to 1. We should only increment `non_empty_dicts` when `delta` is positive, as this indicates the first time an element is inserted into the dict. --------- Co-authored-by: oranagra <oran@redislabs.com>	2024-09-11 09:36:01 +08:00
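The counting bug fixed by the kvstore commit above is easy to reproduce in a few lines. The Python sketch below uses a hypothetical class, not the kvstore code, and assumes sizes change by one element at a time. It applies the corrected rule: `non_empty_dicts` goes up only when a dict reaches size 1 through an insertion, never when it shrinks from 2 to 1.

```python
class KvStoreCounters:
    """Tracks per-dict sizes and how many dicts are non-empty (illustration)."""

    def __init__(self, num_dicts):
        self.sizes = [0] * num_dicts
        self.non_empty_dicts = 0

    def on_size_change(self, idx, delta):
        """Apply a size change of +1 or -1 to the dict at `idx`."""
        self.sizes[idx] += delta
        size = self.sizes[idx]
        if size == 1 and delta > 0:
            # First element inserted: the dict just became non-empty.
            self.non_empty_dicts += 1
        elif size == 0 and delta < 0:
            # Last element removed: the dict just became empty.
            self.non_empty_dicts -= 1


counters = KvStoreCounters(num_dicts=4)
counters.on_size_change(0, +1)  # 0 -> 1, counted
counters.on_size_change(0, +1)  # 1 -> 2
counters.on_size_change(0, -1)  # 2 -> 1, must NOT be counted again
print(counters.non_empty_dicts)
```

Without the `delta > 0` check, the third call would bump the counter a second time for the same dict.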
Filipe Oliveira (Redis)	bcae770819	Optimize LREM, LPOS, LINSERT, LINDEX: Avoid N-1 sdslen() calls on listTypeEqual (#13529 ) This is a very easy optimization that avoids duplicate computation of the object length for LREM, LPOS, LINSERT and LINDEX. We can see that sdslen takes 7.7% of the total CPU cycles of the benchmarks. Function Stack \| CPU Time: Total \| CPU Time: Self \| Module \| Function (Full) \| Source File \| Start Address -- \| -- \| -- \| -- \| -- \| -- \| -- listTypeEqual \| 15.50% \| 2.346s \| redis-server \| listTypeEqual \| t_list.c \| 0x845dd sdslen \| 7.70% \| 2.300s \| redis-server \| sdslen \| sds.h \| 0x845e4 Preliminary data shows a 4% improvement in the achievable ops/sec of LPOS on string elements, and 2% on int elements.	2024-09-10 20:26:36 +08:00
YaacovHazan	bf802b0764	Add the option to build Redis with modules (#13524 ) A new BUILD_WITH_MODULES flag was added to the Makefile to control building the module directory. The new module directory includes a general Makefile that iterates over each module, fetches a specific version, and builds it. Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>	2024-09-09 15:47:02 +03:00
Ozan Tezcan	ac03e3721d	Fix flaky replication tests (#13518 ) #13495 introduced a change to reply -LOADING while flushing the existing db on a replica. Some of our tests are sensitive to this change and do not expect a -LOADING reply. Fixing a couple of tests that fail from time to time.	2024-09-08 12:54:01 +03:00
Filipe Oliveira (Redis)	31227f4faf	Optimize client type check on reply hot code paths (#13516 ) ## Proposed improvement This PR introduces the static inlined function `clientTypeIsSlave`, which does only 1 condition check vs the 3 checks of `getClientType`, and also uses `unlikely` to tell the compiler that the most common outcome is for the client not to be a slave. Preliminary data show a 3% improvement in the achievable ops/sec on the specific LRANGE benchmark. After running the entire suite we see up to 5% improvement in 2 tests. https://github.com/redis/redis/pull/13516#issuecomment-2331326052 ## Context This optimization effort comes from analyzing the profile info from the [memtier_benchmark-1key-list-1K-elements-lrange-all-elements](https://github.com/redis/redis-benchmarks-specification/blob/main/redis_benchmarks_specification/test-suites/memtier_benchmark-1key-list-1K-elements-lrange-all-elements.yml) benchmark. By going over it, we can see that `getClientType` consumes 2% of the CPU time, strictly to check if the client is a slave ( https://github.com/redis/redis/blob/unstable/src/networking.c#L397 , and https://github.com/redis/redis/blob/unstable/src/networking.c#L1254 ) Function \| CPU Time: Total \| CPU Time: Self \| Module \| Function (Full) -- \| -- \| -- \| -- \| -- _addReplyToBufferOrList->getClientType \| 1.20% \| 0.728s \| redis-server \| getClientType clientHasPendingReplies->getClientType \| 0.80% \| 0.482s \| redis-server \| getClientType --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-09-06 10:24:30 +08:00
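The idea behind #13516 (one flag test wrapped in a branch hint, instead of a general client-type lookup) might look roughly like the sketch below. The names `toy_client`, `TOY_FLAG_SLAVE`, `toy_unlikely` and `toy_client_is_slave` are hypothetical, the exact condition used by the real `clientTypeIsSlave` is not reproduced here, and `__builtin_expect` is a GCC/Clang builtin, so this assumes one of those compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Branch hint: tells the compiler the condition is expected to be false.
 * Relies on the GCC/Clang builtin __builtin_expect. */
#define toy_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical flag bit, standing in for the real client flags. */
#define TOY_FLAG_SLAVE (1ULL << 0)

typedef struct {
    uint64_t flags;
} toy_client;

/* Single bit test on the hot path; returns 1 for a slave client, else 0. */
static inline int toy_client_is_slave(const toy_client *c) {
    assert(c != 0);
    return toy_unlikely(c->flags & TOY_FLAG_SLAVE);
}
```

The hint does not change the result, only how the compiler lays out the branches, which is why it only pays off on hot paths where the common outcome is known.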
Max Malekzadeh	f6f11f3ef1	Remove outdated "Try Redis" link in README.md (#13498 )	2024-09-05 22:04:49 +08:00
Moti Cohen	569584d463	HFE - Simplify logic of HGETALL command (#13425 )	2024-09-05 12:48:44 +03:00
debing.sun	ea3e8b79a1	Introduce reusable query buffer for client reads (#13488 ) This PR is based on the commits from PR https://github.com/valkey-io/valkey/pull/258, https://github.com/valkey-io/valkey/pull/593, https://github.com/valkey-io/valkey/pull/639 This PR optimizes client query buffer handling in Redis by introducing a reusable query buffer that is used by default for client reads. This reduces memory usage by ~20KB per client by avoiding allocations for most clients using short (<16KB) complete commands. For larger or partial commands, the client still gets its own private buffer. The primary changes are: * Adding a reusable query buffer `thread_shared_qb` that clients use by default. * Modifying client querybuf initialization and reset logic. * Freeing idle client query buffers when empty to allow reuse of the reusable query buffer. * Master client query buffers are kept private as their contents need to be preserved for the replication stream. * When nested commands are executed, only the first user uses the reusable buffer, and subsequent users will still use a private buffer. In addition to the memory savings, this change shows a 3% improvement in latency and throughput when running with 1000 active clients. The memory reduction may also help reduce the need to evict clients when reaching the max memory limit, as the query buffer is the main memory consumer per client. This PR is different from https://github.com/valkey-io/valkey/pull/258 1. While a client is between acquiring the reusable buffer and returning it, regardless of whether the query buffer has changed (expanded), we do not update the reusable query buffer in the middle, but return the reusable query buffer (expanded or with data remaining) or reset it at the end. 2. Adding a new thread variable `thread_shared_qb_used` to avoid multiple clients acquiring the reusable query buffer at the same time. 
--------- Signed-off-by: Uri Yagelnik <uriy@amazon.com> Signed-off-by: Madelyn Olson <matolson@amazon.com> Co-authored-by: Uri Yagelnik <uriy@amazon.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: oranagra <oran@redislabs.com>	2024-09-04 19:10:40 +08:00
debing.sun	74609d44cd	Fix set with invalid length causes smembers to hang (#13515 ) After https://github.com/redis/redis/pull/13499, if the length set by `addReplySetLen()` does not match the actual number of elements in the reply, the protocol breaks and the client hangs.	2024-09-04 17:35:46 +08:00
Ozan Tezcan	ea05c6ac47	Fix RM_RdbLoad() to enable AOF after loading is completed (#13510 ) RM_RdbLoad() disables AOF temporarily while loading RDB. Afterwards, it does not re-enable AOF because it checks the AOF state (disabled by then) rather than the AOF config parameter. Added a change to restart AOF according to the config parameter.	2024-09-04 11:11:04 +03:00
Filipe Oliveira (Redis)	05aed4cab9	Optimize SET/INCR/DECR/SETRANGE/APPEND by reducing duplicate computation (#13505 ) - Avoid calling addReplyLongLong (which converts back to a string) on a value we already have as a robj, by using addReplyProto + addReply - Avoid doing dbFind twice for the same dictEntry on INCR/DECR/SETRANGE/APPEND commands. - Avoid multiple sdslen calls with the same input on setrangeCommand and appendCommand - Introduce setKeyWithDictEntry, which is like setKey(), but accepts an optional dictEntry input: avoids the second dictFind in the SET command --------- Co-authored-by: debing.sun <debing.sun@redis.com>	2024-09-04 14:51:21 +08:00

12280 Commits