2410 Commits

Author SHA1 Message Date
Mincho Paskalev 86c8be6368
Add new KSN types - overwritten and type_changed (#14141)
## What

Add new keyspace notification event types
- OVERWRITTEN - emitted when the value of a key is completely
overwritten
- TYPE_CHANGED - when the value of a key's type changes

Used in Pub/Sub KSN mechanism. Also added module hooks for the new
types.

## Motivation

Many commands overwrite the value of a key. For example, SET completely
overwrites the value of any key, even its type. Other commands that have
the REPLACE parameter also do so.

This commit gives finer granularity for following such events.
The specific use case at hand was a module that subscribed to string events
for the sole purpose of checking whether hash keys get converted to strings
via the `SET` command. Subscribing to the `type_changed` event not only
removes the need to subscribe to string events but is also more correct,
as `SET` is not the only command that can change the type of a key.

## List of commands emitting the new events

* SET
* MSET
* COPY
* RESTORE
* RENAME
* BITOP

Each type with STORE operation:
* SORT
* S*STORE
* Z*STORE
* GEORADIUS
* GEOSEARCHSTORE

## Usage example

### pub-sub

Emit overwritten and type-changed events...
```
$ redis-server --notify-keyspace-events KEoc
```

Generate an overwritten event that also changes the type of a key...
```
$ redis-cli
127.0.0.1:6379> lpush l 1 2 3
(integer) 3
127.0.0.1:6379> set l x
OK
```

Subscribe to the events...
```
$ ./src/redis-cli
127.0.0.1:6379> psubscribe *
1) "psubscribe"
2) "*"
3) (integer) 1
1) "pmessage"
2) "*"
3) "__keyspace@0__:l"
4) "overwritten"
1) "pmessage"
2) "*"
3) "__keyevent@0__:overwritten"
4) "l"
1) "pmessage"
2) "*"
3) "__keyspace@0__:l"
4) "type_changed"
1) "pmessage"
2) "*"
3) "__keyevent@0__:type_changed"
4) "l"
```
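
The rule behind the two events in the session above can be sketched in a few lines of Python. This is only a model of the behaviour described in this commit message, not the server code: the function name `events_on_overwrite` and the string type names are hypothetical, and events that already existed before this commit (such as `set`) are deliberately left out.

```python
from typing import List, Optional


def events_on_overwrite(old_type: Optional[str], new_type: str) -> List[str]:
    """Return the new KSN event names a full overwrite would emit.

    old_type is None when the key did not exist before the write.
    Hypothetical helper: it only models the two events added here.
    """
    if old_type is None:
        return []                      # nothing existed, so nothing was overwritten
    events = ["overwritten"]           # the existing value was fully replaced
    if old_type != new_type:
        events.append("type_changed")  # e.g. a list replaced by a string
    return events


print(events_on_overwrite("list", "string"))    # LPUSH l 1 2 3, then SET l x
print(events_on_overwrite("string", "string"))  # SET k a, then SET k b
print(events_on_overwrite(None, "string"))      # SET on a missing key
```

The first call corresponds to the `lpush l 1 2 3` / `set l x` sequence shown above, where both `overwritten` and `type_changed` are published for `l`.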

### Modules

As with any other KSN type, subscribe to the appropriate events:
```
RedisModule_SubscribeToKeyspaceEvents(
      ctx,
      REDISMODULE_NOTIFY_OVERWRITTEN | REDISMODULE_NOTIFY_TYPE_CHANGED | ...,
      notificationCallback
);
```

## Implementation notes

Most cases are handled in `setKeyByLink`, but for some commands
overwriting had to be checked manually. Specifically, `RESTORE`, `COPY`
and `RENAME` call `dbAddInternal` themselves.
2025-07-07 13:29:14 +03:00
debing.sun 4322cebc17
Ensure empty error tables in scripts don't crash (#14163)
This PR is based on: https://github.com/valkey-io/valkey/pull/2229

When calling the command `EVAL error{} 0`, Redis crashes with the
following stack trace. This patch ensures we never leave the
`err_info.msg` field null when we fail to extract a proper error
message.

---------

Signed-off-by: Fusl <fusl@meo.ws>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Fusl <fusl@meo.ws>
Co-authored-by: Binbin <binloveplay1314@qq.com>
2025-07-07 10:12:51 +08:00
Mincho Paskalev 15706f2e82
Module set/get config API (#14051)
# Problem

Some Redis modules need to call the `CONFIG GET/SET` commands. The server may be
run with `rename-command CONFIG ""` (or something similar), which leaves
the module unable to access the config.

# Solution

Added new API functions for use by modules:
```
RedisModuleConfigIterator* RedisModule_GetConfigIterator(RedisModuleCtx *ctx, const char *pattern);
void RedisModule_ReleaseConfigIterator(RedisModuleCtx *ctx, RedisModuleConfigIterator *iter);
const char *RedisModule_ConfigIteratorNext(RedisModuleConfigIterator *iter);
int RedisModule_GetConfigType(const char *name, RedisModuleConfigType *res);
int RedisModule_GetBoolConfig(RedisModuleCtx *ctx, const char *name, int *res);
int RedisModule_GetConfig(RedisModuleCtx *ctx, const char *name, RedisModuleString **res);
int RedisModule_GetEnumConfig(RedisModuleCtx *ctx, const char *name, RedisModuleString **res);
int RedisModule_GetNumericConfig(RedisModuleCtx *ctx, const char *name, long long *res);
int RedisModule_SetBoolConfig(RedisModuleCtx *ctx, const char *name, int value, RedisModuleString **err);
int RedisModule_SetConfig(RedisModuleCtx *ctx, const char *name, RedisModuleString *value, RedisModuleString **err);
int RedisModule_SetEnumConfig(RedisModuleCtx *ctx, const char *name, RedisModuleString *value, RedisModuleString **err);
int RedisModule_SetNumericConfig(RedisModuleCtx *ctx, const char *name, long long value, RedisModuleString **err);
```

## Implementation

The work is mostly done inside `config.c`, as I didn't want to expose the
config dict outside of it. That means each of these module functions has
a corresponding function in `config.c` that actually does the job. For example,
`RedisModule_SetEnumConfig` calls `moduleSetEnumConfig`, which is
implemented in `config.c`.

## Notes

Also, refactored the `configSetCommand` and `restoreBackupConfig` functions
for the following reasons:
- The code and logic are now much clearer in `configSetCommand`. The only
caveat is that, in favour of code clarity, an optimization that skipped
apply functions that had already run was removed.
- Both functions needlessly separated the logic for module configs and
normal configs, whereas no such separation is needed. This also had the
side effect of removing some allocations.
- `restoreBackupConfig` now has a clearer interface and can be reused with
ease. One of the places I reused it is in the individual
`moduleSet*Config` functions, each of which needs the restoration
functionality but for a single config only.

## Future

Additionally, a couple of considerations were made for potentially
extending the API in the future:
- If need be, an API for atomically setting multiple config values can be
added: `RedisModule_SetConfigsTranscationStart/End` or similar, which can
be put around `RedisModule_Set*Config` calls.
- If performance is an issue, an API such as
`RedisModule_GetConfigIteratorNextWithTypehint` may be added
in order not to incur the additional cost of calling
`RedisModule_GetConfigType`.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2025-07-03 13:46:33 +03:00
Mincho Paskalev ad8c7de3fe
Fix assertion in updateClientMemUsageAndBucket (#14152)
## Description

`updateClientMemUsageAndBucket` is called from the main thread to update
the memory usage and memory bucket of a client. That's why it asserts
that it's being called by the main thread.

But it may also be called from a thread spawned by a module,
specifically when a module calls `RedisModule_Call`, which in turn calls
`call`->`replicationFeedMonitors`->`updateClientMemUsageAndBucket`.
This is generally safe, as module calls inside a spawned thread should be
guarded by a call to `ThreadSafeContextLock`, i.e., the module is holding
the GIL at this point.

This commit fixes the assertion inside `updateClientMemUsageAndBucket`
so that it also covers that case. Generally, calls from
module-spawned threads are safe to operate on clients that are not
running on IO threads when the module is holding the GIL.
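
The reasoning above can be written as a small predicate. This is a sketch of the argument only, not the actual assertion in the C source: the function name and its three boolean parameters are hypothetical stand-ins for whatever state the server really checks.

```python
def may_update_client_mem_usage(on_main_thread: bool,
                                module_holds_gil: bool,
                                client_on_io_thread: bool) -> bool:
    """Model of when the relaxed assertion is expected to hold.

    Hypothetical predicate: the real check lives in C and may differ.
    """
    if on_main_thread:
        # The case the original assertion already accepted.
        return True
    # Newly accepted case: a module-spawned thread that holds the GIL and
    # operates on a client that is not running on an IO thread.
    return module_holds_gil and not client_on_io_thread


print(may_update_client_mem_usage(True, False, False))
print(may_update_client_mem_usage(False, True, False))
print(may_update_client_mem_usage(False, False, False))
```

A module thread that has not taken the lock, or that touches a client owned by an IO thread, still fails the check in this model.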

---------

Co-authored-by: Yuan Wang <wangyuancode@163.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
2025-07-02 11:55:57 +03:00
Slavomir Kaslev 0d8e750883
Add CLUSTER SLOT-STATS command (#14039)
Add the CLUSTER SLOT-STATS command, which currently reports key count, CPU
time and network IO per slot.

The command has the following syntax

    CLUSTER SLOT-STATS SLOTSRANGE start-slot end-slot

or

    CLUSTER SLOT-STATS ORDERBY metric [LIMIT limit] [ASC/DESC]

where metric can currently be one of the following

    key-count -- Number of keys in a given slot
    cpu-usec -- Amount of CPU time (in microseconds) spent on a given slot
    network-bytes-in -- Amount of network ingress (in bytes) received for a given slot
    network-bytes-out -- Amount of network egress (in bytes) sent out for a given slot
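
As an illustration of the two forms, here is a minimal Python model over an in-memory table of per-slot counters. It is a sketch under assumptions: the helper names `slots_range` and `order_slots` are hypothetical, the range is assumed to be inclusive on both ends, and the sort direction is passed explicitly because the default order is not stated above.

```python
from typing import Dict, List, Optional, Tuple

SlotStats = Dict[str, int]  # metric name -> value for one slot


def slots_range(stats: Dict[int, SlotStats], start: int, end: int) -> Dict[int, SlotStats]:
    """SLOTSRANGE form: stats for slots start..end (assumed inclusive)."""
    return {slot: stats[slot] for slot in range(start, end + 1) if slot in stats}


def order_slots(stats: Dict[int, SlotStats], metric: str, descending: bool,
                limit: Optional[int] = None) -> List[Tuple[int, int]]:
    """ORDERBY form: (slot, value) pairs sorted by one metric, optionally cut."""
    ranked = sorted(((slot, s[metric]) for slot, s in stats.items()),
                    key=lambda pair: pair[1], reverse=descending)
    return ranked if limit is None else ranked[:limit]


stats = {
    0: {"key-count": 5, "cpu-usec": 120},
    1: {"key-count": 9, "cpu-usec": 40},
    2: {"key-count": 1, "cpu-usec": 300},
}
print(order_slots(stats, "key-count", descending=True, limit=2))
print(sorted(slots_range(stats, 1, 2)))
```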

This PR is based on:
    valkey-io/valkey#351
    valkey-io/valkey#709
    valkey-io/valkey#710
    valkey-io/valkey#720
    valkey-io/valkey#840

Co-authored-by: Kyle Kim <kimkyle@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>

---------

Co-authored-by: Kyle Kim <kimkyle@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2025-07-01 20:26:51 +03:00
debing.sun fa040a72c0
Add XDELEX and XACKDEL commands for stream (#14130)
## Summary and detailed design for new stream command

## XDELEX

### Syntax
```
XDELEX key [KEEPREF | DELREF | ACKED] IDS numids id [id ...]
```

### Description
The `XDELEX` command extends the Redis Streams `XDEL` command, offering
enhanced control over message entry deletion with respect to consumer
groups. It accepts optional `DELREF` or `ACKED` parameters to modify its
behavior:

- **KEEPREF:** Deletes the specified entries from the stream, but
preserves existing references to these entries in all consumer groups'
PEL. This behavior is similar to XDEL.
- **DELREF:** Deletes the specified entries from the stream and also
removes all references to these entries from all consumer groups'
pending entry lists, effectively cleaning up all traces of the messages.
- **ACKED:** Only trims entries that were read and acknowledged by all
consumer groups.

**Note:** The `IDS` block can appear at any position in the command,
consistent with other commands.

### Reply
Array reply, for each `id`:
- `-1`: No such `id` exists in the provided stream `key`.
- `1`: Entry was deleted from the stream.
- `2`: Entry was not deleted because there are still dangling references.
(ACKED option)
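
The three options and the reply codes can be modelled with a toy in-memory stream. This is a sketch of the semantics described above, not the server implementation: the `Stream` and `Group` classes and the `xdelex` function are hypothetical, entry IDs are simplified to integers, and "read by a group" is assumed to mean that the ID is not greater than the group's last delivered ID.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Group:
    last_id: int = 0                            # last delivered entry ID
    pel: Set[int] = field(default_factory=set)  # delivered but not yet acked


@dataclass
class Stream:
    entries: Set[int] = field(default_factory=set)
    groups: Dict[str, Group] = field(default_factory=dict)


def xdelex(stream: Stream, mode: str, ids: List[int]) -> List[int]:
    """Return one reply code per id: -1 missing, 1 deleted, 2 kept (ACKED)."""
    if mode not in ("KEEPREF", "DELREF", "ACKED"):
        raise ValueError("unknown option: " + mode)
    reply = []
    for entry_id in ids:
        if entry_id not in stream.entries:
            reply.append(-1)
            continue
        if mode == "ACKED":
            # Delete only if every group has read and acknowledged the entry.
            done = all(entry_id <= g.last_id and entry_id not in g.pel
                       for g in stream.groups.values())
            if not done:
                reply.append(2)
                continue
        elif mode == "DELREF":
            # Also drop the entry from every group's pending entries list.
            for g in stream.groups.values():
                g.pel.discard(entry_id)
        # KEEPREF falls through: the entry goes away, PEL references stay.
        stream.entries.discard(entry_id)
        reply.append(1)
    return reply


s = Stream(entries={1, 2, 3},
           groups={"a": Group(last_id=2, pel={2}), "b": Group(last_id=3)})
print(xdelex(s, "ACKED", [1, 2, 3, 9]))
```

In the example, entry 1 is deleted, entry 2 is still pending in group `a`, entry 3 was never delivered to group `a`, and entry 9 does not exist.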

## XACKDEL

### Syntax
```
XACKDEL key group [KEEPREF | DELREF | ACKED] IDS numids id [id ...]
```

### Description
The `XACKDEL` command combines `XACK` and `XDEL` functionalities in
Redis Streams. It acknowledges specified message IDs in the given
consumer group and attempts to delete corresponding stream entries. It
accepts optional `DELREF` or `ACKED` parameters:

- **KEEPREF:** Acknowledges the messages in the specified consumer group
and deletes the entries from the stream, but preserves existing
references to these entries in all consumer groups' PEL.
- **DELREF:** Acknowledges the messages in the specified consumer group,
deletes the entries from the stream, and also removes all references to
these entries from all consumer groups' pending entry lists, effectively
cleaning up all traces of the messages.
- **ACKED:** Acknowledges the messages in the specified consumer group
and only trims entries that were read and acknowledged by all consumer
groups.


### Reply
Array reply, for each `id`:
- `-1`: No such `id` exists in the provided stream `key`.
- `1`: Entry was acknowledged and deleted from the stream.
- `2`: Entry was acknowledged but not deleted because there are still
dangling references. (ACKED option)

# Redis Streams Commands Extension

## XTRIM

### Syntax
```
XTRIM key <MAXLEN | MINID> [= | ~] threshold [LIMIT count] [KEEPREF | DELREF | ACKED]
```

### Description
The `XTRIM` command trims a stream by removing entries based on
specified criteria, extended to include optional `DELREF` or `ACKED`
parameters for consumer group handling:

- **KEEPREF:** Trims the stream according to the specified strategy
(MAXLEN or MINID) regardless of whether entries are referenced by any
consumer groups, but preserves existing references to these entries in
all consumer groups' PEL.
- **DELREF:** Trims the stream according to the specified strategy and
also removes all references to the trimmed entries from all consumer
groups' PEL.
- **ACKED:** Only trims entries that were read and acknowledged by all
consumer groups.

### Reply
No change.

## XADD

### Syntax
```
XADD key [NOMKSTREAM] [<MAXLEN | MINID> [= | ~] threshold [LIMIT count]] [KEEPREF | DELREF | ACKED] <* | id> field value [field value ...]
```

### Description
The `XADD` command appends a new entry to a stream and optionally trims
it in the same operation, extended to include optional `DELREF` or
`ACKED` parameters for trimming behavior:

- **KEEPREF:** When trimming, removes entries from the stream according
to the specified strategy (MAXLEN or MINID), regardless of whether they
are referenced by any consumer groups, but preserves existing references
to these entries in all consumer groups' PEL.
- **DELREF:** When trimming, removes entries from the stream according
to the specified strategy and also removes all references to these
entries from all consumer groups' PEL.
- **ACKED:** When trimming, only removes entries that were read and
acknowledged by all consumer groups. Note that if the number of
referenced entries is bigger than MAXLEN, we will still stop.

### Reply
No change.

## Key implementation

Since we currently have no simple way to track the association between
an entry and consumer groups without iterating over all groups, we
introduce two mechanisms to establish this link. This allows us to
determine whether an entry has been seen by all consumer groups, and to
identify which groups are referencing it. With these links, we can break
the association when the entry is either acknowledged or deleted.

1) Added reference tracking between stream messages and consumer groups
using `cgroups_ref`.
The `cgroups_ref` is implemented as a rax that maps stream message IDs to
lists of consumer groups that reference those messages, and streamNACK
stores the corresponding nodes of this list, so that the corresponding
groups can be deleted during `ACK`.
In this way, we can determine whether an entry has been seen but not
acknowledged.
2) Store a cached minimum last_id in the stream structure.
The reason for doing this is that an entry may never have been seen by a
consumer group. In this case, we consider the entry not consumed
either, so with the `ACKED` option we cannot directly delete it.
When a consumer group updates its last_id, we don’t immediately update
the cached minimum last_id. Instead, we check whether the group’s
previous last_id was equal to the current minimum, or whether the new
last_id is smaller than the current minimum (when using `XGROUP SETID`).
If either is true, we mark the cached minimum last_id as invalid, and
defer the actual update until the next time it’s needed.
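
The lazy invalidation of the cached minimum last_id can be sketched as follows. This is a simplified model, not the stream code: the class name `MinLastIdCache` is hypothetical, IDs are plain integers, and invalidating the cache when a previously unknown group appears is an assumption that the text above does not cover.

```python
from typing import Dict, Optional


class MinLastIdCache:
    """Toy model of a lazily recomputed minimum over group last_ids."""

    def __init__(self) -> None:
        self.last_ids: Dict[str, int] = {}
        self._cached_min: Optional[int] = None
        self._valid = False
        self.recomputations = 0  # counts full scans, for illustration only

    def set_last_id(self, group: str, new_id: int) -> None:
        prev = self.last_ids.get(group)
        self.last_ids[group] = new_id
        if not self._valid:
            return  # already marked invalid, nothing more to do
        # Invalidate if this group held the minimum, or if the new ID moved
        # below it (possible with XGROUP SETID). A new group also invalidates
        # the cache here, which is an assumption of this sketch.
        if prev is None or prev == self._cached_min or new_id < self._cached_min:
            self._valid = False

    def min_last_id(self) -> Optional[int]:
        if not self._valid:
            # Deferred update: scan all groups only when the value is needed.
            self._cached_min = min(self.last_ids.values(), default=None)
            self._valid = True
            self.recomputations += 1
        return self._cached_min


cache = MinLastIdCache()
cache.set_last_id("a", 5)
cache.set_last_id("b", 10)
print(cache.min_last_id())   # first use triggers one scan
cache.set_last_id("b", 12)   # "b" was not the minimum, so the cache stays valid
print(cache.min_last_id(), cache.recomputations)
```

The point of the design is that the common case, a group that is not the slowest one advancing its last_id, costs no scan at all.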

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: moticless <moticless@github.com>
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Slavomir Kaslev <slavomir.kaslev@gmail.com>
Co-authored-by: Yuan Wang <yuan.wang@redis.com>
2025-07-01 21:00:42 +08:00
debing.sun 5ff81f68a3
Fix XPENDING reply schema for empty reply (#14129)
When the PEL is empty, the reply of `XPENDING` without `start` option
will be:
```
1) (integer) 0
2) (nil)
3) (nil)
4) (nil)
```

It is not an empty array, so we need to create an individual reply
schema for it.
2025-07-01 17:35:09 +08:00
itayTziv 64ae81d37c
New config: lazyexpire-nested-arbitrary-keys (#14149)
In this PR we added a hidden config, `lazyexpire-nested-arbitrary-keys`,
which can take:
* yes - the default. Produce and propagate lazy-expire DELs as usual.
* no - avoid lazy-expire from commands that touch arbitrary keys (such
as SCAN, RANDOMKEY) if generated within a transaction (MULTI/EXEC,
LUA). This ensures such commands won't induce CROSSSLOT on a remote proxy,
as happened when replicating one db into another (possibly sharded
differently).
Since the issue is relevant only in replicated servers (RE's replica-of
mode or CRDT), it was added to the core as a hidden config.

Please note that this config will always apply to read-only commands
(see the EXPIRE_FORCE_DELETE_EXPIRED flag), since write commands may
require key expiration to operate correctly.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2025-07-01 15:28:13 +08:00
Oran Agra 96930663b4
Make Active defrag big list test much faster (#14157)
The test aims to create listpacks of 500k, but did that with 5 insertions of
100k each. Instead, do that in one insertion, reducing the need for
gradual listpack growth and reducing the number of commands we send.
Apparently there are some stalls when reading the replies of the commands,
specifically in GH Actions, and reducing the number of commands seems to
eliminate them.
2025-06-30 16:56:17 +03:00
wclmxxs ca6145b18c
Reduce the main thread blocking in clients cron (#13900)
The main thread needs to check clients in every cron iteration. During
this check, the corresponding I/O threads must not operate on these
clients to avoid data races. As a result, the main thread is blocked
until the I/O threads finish processing and are suspended, allowing the
main thread to proceed with client checks.

Since the main thread's resources are more valuable than those of I/O
threads in Redis, this blocking behavior should be avoided. To address
this, the I/O threads check during their cron whether any of their
maintained clients need to be inspected by the main thread. If so, the
I/O threads send those clients to the main thread for processing, then
the main thread runs cron jobs for these clients.

In addition, an always-active client might not be in thread->clients, so
before processing the client’s command, we also check whether the client
has skipped running its cron job for over 1 second. If it has, we run
the cron job for the client.

The main thread no longer needs to actively pause the IO threads, thus
avoiding potential blocking behavior. Fixes
https://github.com/redis/redis/issues/13885

Besides, this approach also lets all clients run their cron task within a
second. Previously, we paused IO threads in multiple batches when there
were more than 8 IO threads, which could cause some clients not to be
processed within a second.

---------

Co-authored-by: Yuan Wang <yuan.wang@redis.com>
2025-06-30 09:37:17 +08:00
Yuan Wang 4313d7ff23
Stabilize tests for IO threading (#14138)
- tests/unit/maxmemory.tcl
If multithreaded, we need to give the IO threads a chance to write out the
output buffer, to avoid the next commands causing eviction. After eviction
is performed, the next command becomes ready immediately in the IO threads,
and now we enqueue the client to be processed in the main thread’s
beforeSleep without notification. However, invalidation messages generated
by eviction may not have been fully delivered by that time. As a result,
executing the command in beforeSleep of the event loop (running eviction)
can cause additional keys to be evicted.
```
Expected '73' to be between to '200' and '300' (context: type source line 473 file 
redis/tests/unit/maxmemory.tcl cmd {assert_range [r dbsize] 200 300} proc ::test)
```
The reason CI doesn't catch this issue is that we skip this test with
`tsan:skip`, as below:
`start_server {tags {"maxmemory external:skip tsan:skip"}}`, so remove
this tag.

- tests/integration/aof.tcl
Because the IO threads and the main thread now work in better parallelism
without notification, the main thread may not have written the AOF buffer
to the file by the time the IO thread writes the reply, so clients receive
the reply before the AOF file is changed.
We should use the `appendfsync always` policy to make sure the command is
written to the AOF file by the time the reply is received.
```
Expected '0' to be equal to '54' (context: type source line 249 file
redis/tests/integration/aof.tcl cmd {assert_equal $before $after} proc ::test)
```

#13969 makes these scenarios more likely to appear.
2025-06-25 15:36:40 +08:00
Ozan Tezcan 03816c15f7
Fix short read of hfe key that causes exit() on replica (#14143)
If a replica detects a broken connection while reading the min expiration
time of an hfe key, it calls exit().
Fixed it to handle the error gracefully without calling exit().

To reproduce the issue, the short-read test was modified to generate
many small hfe keys, increasing the likelihood of a connection drop
while reading min expiration time:

```tcl
for {set k 0} {$k < 50000} {incr k} {
  for {set i 0} {$i < 1} {incr i} {
    r hsetex "$k hfe_small" EX [expr {int(rand()*10)}] FIELDS 1 [string repeat A [expr {int(rand()*10)}]] 0[string repeat A [expr {int(rand()*10)}]]
  }
}
```

We can't have the test use only hfe keys, so a few were added alongside
other data. I couldn't reproduce the issue this way but with the test's
randomization, it should hit this scenario in one of the runs.
2025-06-23 07:41:30 +03:00
Stav-Levi 51239f75d0
Record the time a replica attempts to connect with master (#13990)
Merge fork counters with https://github.com/redis/redis/pull/12957
* repl_current_sync_attempts - Total number of attempts to connect to a
master since the last time we disconnected from a good connection (or a
configuration change). Any number greater than 1 (even if the link is
currently up) indicates an issue.
* repl_total_sync_attempts - Number of times, in the current configuration,
the replica attempted to sync to a master (doesn't reset on master
reconnect).
* repl_total_disconnect_time - Total cumulative time we've been
disconnected as a replica, visible when the link is up too.
* master_link_up_since_seconds - Number of seconds since the link came up,
just to maintain symmetry with master_link_down_since_seconds.
2025-06-22 09:19:26 +03:00
yzc-yzc 117424f85c
Fix negative offset issue for ZRANGEBY[SCORE|LEX] command (#14043)
Fix #13952

This PR ensures that a ZRANGEBY[SCORE|LEX] command with a negative offset
returns an empty result.
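
The intended behaviour can be sketched as a small Python model of the `LIMIT offset count` step. It is not the server code: `range_by_score` is a hypothetical helper, and treating a negative count as "everything from the offset onward" is an assumption of this sketch.

```python
from typing import List, Tuple


def range_by_score(members: List[Tuple[str, float]], lo: float, hi: float,
                   offset: int = 0, count: int = -1) -> List[str]:
    """Members with lo <= score <= hi, after applying LIMIT offset count."""
    if offset < 0:
        return []  # the fix: a negative offset yields an empty result
    ordered = sorted(members, key=lambda pair: (pair[1], pair[0]))
    in_range = [name for name, score in ordered if lo <= score <= hi]
    if count < 0:
        return in_range[offset:]  # assumed: negative count means "no limit"
    return in_range[offset:offset + count]


zset = [("a", 1.0), ("b", 2.0), ("c", 3.0)]
print(range_by_score(zset, 1, 3, offset=-1, count=2))
print(range_by_score(zset, 1, 3, offset=1, count=1))
```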
2025-06-20 13:51:52 +08:00
yzc-yzc 61fa8bb06f
Record peak memory time (#14067)
resolve #14049

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2025-06-20 13:49:20 +08:00
lerman25 94aebb7324
Add config base to vector-sets and hnsw thread config (#14082)
This PR introduces the initial configuration infrastructure for
vector-sets, along with a new option:
`vset-force-single-threaded-execution`. When enabled, it applies the
`NOTHREAD` flag to VSIM and disables the `CAS` option for VADD, thereby
enforcing single-threaded execution.
Note: This mode is not optimized for single-threaded performance.

---------

Co-authored-by: GuyAv46 <47632673+GuyAv46@users.noreply.github.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
2025-06-16 10:06:43 +08:00
debing.sun 2467eff59a
Fix db->expires can't be defragged due to incorrect comparison in the expires stage (#14092)
This bug was introduced by https://github.com/redis/redis/issues/13814

When defragmenting `db->expires`, if the process exits early and
`db->expires` was modified in the meantime (e.g., FLUSHDB), we need to
check whether the previously defragmented expires is still the same as
the current one when resuming. If they differ, we should abort the
current defragmentation of expires.

However, in https://github.com/redis/redis/issues/13814, I mistakenly used
`db->keys` and `db->expires` in that comparison, so expires would never be
defragged.
2025-06-05 21:52:33 +08:00
Yuan Wang 70a079db5e
Improve multithreaded performance with memory prefetching (#14017)
This PR is based on: https://github.com/valkey-io/valkey/pull/861

> ### Memory Access Amortization
> (Designed and implemented by [dan
touitou](https://github.com/touitou-dan))
> 
> Memory Access Amortization (MAA) is a technique designed to optimize
the performance of dynamic data structures by reducing the impact of
memory access latency. It is applicable when multiple operations need to
be executed concurrently. The principle behind it is that for certain
dynamic data structures, executing operations in a batch is more
efficient than executing each one separately.
> 
> Rather than executing operations sequentially, this approach
interleaves the execution of all operations. This is done in such a way
that whenever a memory access is required during an operation, the
program prefetches the necessary memory and transitions to another
operation. This ensures that when one operation is blocked awaiting
memory access, other memory accesses are executed in parallel, thereby
reducing the average access latency.
> 
> We applied this method in the development of dictPrefetch, which takes
as parameters a vector of keys and dictionaries. It ensures that all
memory addresses required to execute dictionary operations for these
keys are loaded into the L1-L3 caches when executing commands.
Essentially, dictPrefetch is an interleaved execution of dictFind for
all the keys.

### Implementation of Redis
When the main thread processes clients with ready-to-execute commands
(i.e., clients for which the IO thread has parsed the commands), a batch
of up to 16 commands is created. Initially, the command's argv, which
were allocated by the IO thread, is prefetched to the main thread's L1
cache. Subsequently, all the dict entries and values required for the
commands are prefetched from the dictionary before the command
execution.

#### Memory prefetching for main hash table
As shown in the picture, after https://github.com/redis/redis/pull/13806
the key and value are unified into one kv object and the dict uses the
no_value optimization, so memory prefetching has 4 steps:

1. prefetch the bucket of the hash table
2. prefetch the entry associated with the given key's hash
3. prefetch the kv object of the entry
4. prefetch the value data of the kv object

We also need to handle the case where the dict entry is itself the pointer
to the kv object; in that case, just skip step 3.

MAA can improve single-threaded memory access efficiency by
interleaving the execution of multiple independent operations, allowing
memory-level parallelism and better CPU utilization. Its key point is
batch-wise interleaved execution. Split a batch of independent
operations (such as multiple key lookups) into multiple state machines,
and interleave their progress within a single thread to hide the memory
access latency of individual requests.

The difference between serial execution and interleaved execution:
**naive serial execution**
```
key1: step1 → wait → step2 → wait → done
key2: step1 → wait → step2 → wait → done
```
**interleaved execution**
```
key1: step1   → step2   → done
key2:   step1 → step2   → done
key3:     step1 → step2 → done
         ↑ While waiting for key1’s memory, progress key2/key3
```
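
The interleaving itself can be simulated with Python generators, one small state machine per key, advanced round-robin. This only illustrates the scheduling order: Python cannot issue hardware prefetches, the `yield` merely stands in for "issue a prefetch and switch to another key", and the names `lookup`, `run_interleaved` and `STEPS` are hypothetical.

```python
from typing import Generator, List

# The four prefetch steps of the main hash table, in order.
STEPS = ["bucket", "entry", "kv object", "value"]


def lookup(key: str, trace: List[str]) -> Generator[None, None, None]:
    """One state machine per key: record a step, then yield control."""
    for step in STEPS:
        trace.append(f"{key}: prefetch {step}")
        yield  # stand-in for "prefetch issued, go work on another key"
    trace.append(f"{key}: done")


def run_interleaved(keys: List[str]) -> List[str]:
    """Advance every pending lookup by one step per round (round-robin)."""
    trace: List[str] = []
    pending = [lookup(k, trace) for k in keys]
    while pending:
        still_running = []
        for machine in pending:
            try:
                next(machine)
                still_running.append(machine)
            except StopIteration:
                pass  # this key has finished all of its steps
        pending = still_running
    return trace


for line in run_interleaved(["key1", "key2", "key3"]):
    print(line)
```

The printed trace shows every key's bucket step before any key's entry step, which is the ordering that lets the memory latency of one key overlap with work on the others.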

#### New configuration
This PR involves a new configuration `prefetch-batch-max-size`, but we
think it is a low level optimization, so we hide this config:
When multiple commands are parsed by the I/O threads and ready for
execution, we take advantage of knowing the next set of commands and
prefetch their required dictionary entries in a batch. This reduces
memory access costs. The optimal batch size depends on the specific
workflow of the user. The default batch size is 16, which can be
modified using the 'prefetch-batch-max-size' config.
When the config is set to 0, prefetching is disabled.

---------

Co-authored-by: Uri Yagelnik <uriy@amazon.com>
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
2025-06-05 08:57:43 +08:00
Slavomir Kaslev b7c6755b1b
Add thread sanitizer run to daily CI (#13964)
Add thread sanitizer run to daily CI.

A few tests are skipped in tsan runs, for two reasons:
* Stack trace producing tests (oom, `unit/moduleapi/crash`, etc) are
tagged `tsan:skip` because redis calls `backtrace()` in a signal handler,
which turns out to be signal-unsafe since it might allocate memory (e.g.
glibc 2.39 does it through a call to `_dl_map_object_deps()`).
* A few tests become flaky with thread sanitizer builds and don't finish
within expected deadlines because of the additional tsan overhead. Instead
of skipping those tests, this can be improved in the future by allowing
more iterations when waiting for tsan builds.

Deadlock detection is disabled for now because of a tsan limitation where
at most 64 locks can be taken at once.

There is one outstanding (false-positive?) race in jemalloc which is
suppressed in `tsan.sup`.

Fix a few races reported by thread sanitizer that have to do with writes
from signal handlers. Since in a multi-threaded setting signal handlers might
be called on any thread (modulo pthread_sigmask) while the main thread
is running, the `volatile sig_atomic_t` type is not sufficient and atomics
are used instead.
2025-06-02 10:13:23 +03:00
Ozan Tezcan 7f60945bc6
Fix short read issue that causes exit() on replica (#14085)
When `repl-diskless-load` is enabled on a replica, and it is in the
process of loading an RDB file, a broken connection detected by the main
channel may trigger a call to rioAbort(). This sets a flag to cause the
rdb channel to fail on the next rioRead() call, allowing it to perform
necessary cleanup.

However, there are specific scenarios where the error is checked using
rioGetReadError(), which does not account for the RIO_ABORT flag (see
[source](79b37ff535/src/rdb.c (L3098))).
As a result, the error goes undetected. The code then proceeds to
validate a module type, fails to find a match, and calls
rdbReportCorruptRDB() which logs the following error and exits the
process:

```
The RDB file contains module data I can't load: no matching module type '_________'
```

To fix this issue, the RIO_ABORT flag has been removed. Now, rioAbort()
sets both read and write error flags, so that subsequent operations and
error checks properly detect the failure.
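
The effect of the fix can be illustrated with a tiny Python model. It is a sketch only: the class and function names mirror the C names mentioned above but are Python stand-ins, and `load_module_value` is a hypothetical simplification of the loading step that used to reach rdbReportCorruptRDB().

```python
class Rio:
    """Toy stand-in for the rio object: only the two error flags."""

    def __init__(self) -> None:
        self.read_error = False
        self.write_error = False


def rio_abort(rio: Rio) -> None:
    # After the fix: aborting sets both error flags directly.
    rio.read_error = True
    rio.write_error = True


def rio_get_read_error(rio: Rio) -> bool:
    return rio.read_error


def load_module_value(rio: Rio, module_type_known: bool) -> str:
    """Hypothetical loading step: check for IO errors before validating."""
    if rio_get_read_error(rio):
        return "io-error"      # handled gracefully, the caller cleans up
    if not module_type_known:
        return "corrupt-rdb"   # the path that logs the error and exits
    return "ok"


aborted = Rio()
rio_abort(aborted)
print(load_module_value(aborted, module_type_known=False))
```

Because the abort is now visible through the ordinary read-error check, the aborted load in the example is reported as an IO error instead of falling through to the corrupt-RDB path.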

Additional keys were added to the short read test. It reproduces the
issue with this change. We hit that problematic line once per key. My
guess is that with many smaller keys, the likelihood of the connection
being killed at just the right moment increases.
2025-05-28 12:43:59 +03:00
Moti Cohen 79b37ff535
Fix RESTORE with TTL (#14071)
restoreCommand() creates a key-value object (kv) with a TTL in two steps.
During the second step, setExpire() may reallocate the kv object. To ensure
correct behavior, kv must be updated after this call, as it might be used later
in the function.
2025-05-28 08:02:10 +03:00
guybe7 6349a7c4f9
Add GETRANGE tests with negative indices (#13950)
Inspired by https://github.com/redis/redis/pull/12272
2025-05-27 09:41:28 +08:00
Hüseyin Açacak 645858d518
Add size_t cast for RM_call() in module tests (#14061)
This PR addresses a potential misalignment issue when using `va_args`.
Without this fix,
[argument](9a9aa921bc/src/module.c (L6249-L6264))
values may occasionally become incorrect due to stack alignment
inconsistencies.
2025-05-23 10:10:11 +08:00
Salvatore Sanfilippo 871d4c4004
Test: check always for memory leaks on MacOS. (#14060)
When running the Redis test on MacOS, the test detects that the
operating system is able to use "leaks" to test for memory leaks and
executes this check after every spawned server is terminated.

While we have the ability to run the test in environments able to detect
memory issues, the fact that it is possible to check for leaks at every run
basically for free is very valuable, and allows fixing leaks
immediately on your laptop before submitting a PR.

However, the feature avoided running leaks when no test was run: this
check was added in the early stages of Redis, when all the tests were
like:

server {
   test { ... }
}

So the check counts the number of tests run, and if no test is
executed, no leaks detection is performed. However, we now have certain
tests that are in the form:

test {
    server { ... }
}

For instance just loading a corrupted RDB or alike. In this case, the
leaks test is not executed. This commit removes the check so that the
leaks test is always executed.
2025-05-20 17:46:56 +08:00
Mincho Paskalev 8dfb823c51
Implement DIFF, DIFF1, ANDOR and ONE for BITOP (#13898)
This PR adds 4 new operators to the `BITOP` command - `DIFF`, `DIFF1`,
`ANDOR` and `ONE`. They enable redis clients to atomically do
non-trivial logical operations that are useful for checking membership
of a bitmap against a group of bitmaps.
 
* **DIFF**
    `BITOP DIFF dest srckey1 srckey2 [key...]`

    **Description**
DIFF(*X*, *A1*, *A2*, *...*, *AN*) = *X* ∧ ¬(*A1* ∨ *A2* ∨ *...* ∨
*AN*), i.e., the set bits of *X* that are not set in any of *A1*, *A2*,
*…*, *AN*

    **NOTE**
    Command expects at least 2 source keys.

* **DIFF1**
    `BITOP DIFF1 dest srckey1 srckey2 [key...]`

    **Description**
DIFF1(*X*, *A1*, *A2*, *...*, *AN*) = ¬*X* ∧ (*A1* ∨ *A2* ∨ *...* ∨
*AN*), i.e., the bits set in one or more of *A1*, *A2*, *…*, *AN* that are
not set in *X*

    **NOTE**
    Command expects at least 2 source keys.

* **ANDOR**
    `BITOP ANDOR dest srckey1 srckey2 [key...]`

    **Description**
ANDOR(*X*, *A1*, *A2*, *...*, *AN*) = *X* ∧ (*A1* ∨ *A2* ∨ *...* ∨
*AN*), i.e., the set bits of *X* that are also set in one or more of *A1*,
*A2*, *…*, *AN*

    **NOTE**
    Command expects at least 2 source keys.

* **ONE**
    `BITOP ONE dest key [key...]`

    **Description**
    ONE(*A1*, *A2*, *...*, *AN*) = *X*, where,
if *X[i]* is the *i*-th bit of *X*, then *X[i] = 1* if and only if there
is an *m* such that *Am[i] = 1* and *An[i] = 0* for all *n != m*, i.e. bit
*X[i]* is set only if it is set in exactly one of *A1*, *A2*, *...*, *AN*

**Return value**
As with all other `BITOP` operators, the return value of the new ones is
the number of bytes of the longest key.
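
The four definitions above can be sketched on a single byte as follows. This is a minimal illustration and not the code from the PR: the helper names (`or_all`, `bitop_diff`, `bitop_diff1`, `bitop_andor`, `bitop_one`) are hypothetical, and the sketch ignores key lengths, padding, and the AVX2 path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* OR together the n source bytes A1..AN. */
static uint8_t or_all(const uint8_t *a, size_t n) {
    uint8_t acc = 0;
    for (size_t i = 0; i < n; i++) acc |= a[i];
    return acc;
}

/* DIFF: bits of x that are not set in any of A1..AN. */
static uint8_t bitop_diff(uint8_t x, const uint8_t *a, size_t n) {
    assert(n >= 1);
    return (uint8_t)(x & ~or_all(a, n));
}

/* DIFF1: bits set in one or more of A1..AN that are not set in x. */
static uint8_t bitop_diff1(uint8_t x, const uint8_t *a, size_t n) {
    assert(n >= 1);
    return (uint8_t)(~x & or_all(a, n));
}

/* ANDOR: bits of x that are also set in at least one of A1..AN. */
static uint8_t bitop_andor(uint8_t x, const uint8_t *a, size_t n) {
    assert(n >= 1);
    return (uint8_t)(x & or_all(a, n));
}

/* ONE: bits set in exactly one of A1..AN. `seen` collects bits set at
 * least once, `dup` collects bits set at least twice. */
static uint8_t bitop_one(const uint8_t *a, size_t n) {
    uint8_t seen = 0, dup = 0;
    for (size_t i = 0; i < n; i++) {
        dup |= (uint8_t)(seen & a[i]);
        seen |= a[i];
    }
    return (uint8_t)(seen & ~dup);
}
```

Applying the same functions byte by byte over the source strings would give the whole-bitmap result under these simplified assumptions.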

EDIT:
Besides adding the new operators, a couple more changes were made:
- Added an AVX2 path for more optimized computation of the BITOP operations
(including the new ones)
- Removed the hard limit of max 16 source keys for the fast path to be
used - now, no matter the number of keys, we can enter the fast path
provided the keys are long enough.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2025-05-20 10:45:50 +03:00
Moti Cohen 51ad2f8d00
Fix keysizes - SPOP with count (case 3) and SETRANGE (#14028)
This commit addresses issues with the keysizes histogram tracking in two
Redis commands:

**SPOP with count (case 3)**
In the spopWithCountCommand function, when handling case 3 (where the
number of elements to return is very large, approaching the size of the
set itself), the keysizes histogram was not being properly updated. This
PR adds the necessary call to updateKeysizesHist() to ensure the
histogram accurately reflects the changes in set size after the
operation.

**SETRANGE command**
Fixed an issue in the setrangeCommand function where the keysizes
histogram wasn't being properly updated when modifying strings. The PR
ensures that the histogram correctly tracks the old and new lengths of
the string after a SETRANGE operation.

Added tests accordingly.
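
As a rough illustration of what "tracking the old and new lengths" means, the sketch below moves a key from the histogram bin of its old length to the bin of its new length. It is not the Redis implementation: `hist_bin` and `hist_update` are hypothetical names, the power-of-two binning is an assumption, and so is the convention that a length of -1 stands for "the key does not exist".

```c
#include <assert.h>
#include <stdint.h>

#define HIST_BINS 64

/* Assumed binning: 0 -> bin 0, 1 -> bin 1, 2..3 -> bin 2, 4..7 -> bin 3, ... */
static int hist_bin(uint64_t len) {
    int bin = 0;
    while (len) {
        bin++;
        len >>= 1;
    }
    return bin;
}

/* Move one key from the bin of old_len to the bin of new_len.
 * In this sketch, -1 means "no key": old_len == -1 on creation,
 * new_len == -1 on deletion. */
static void hist_update(int64_t hist[HIST_BINS], int64_t old_len, int64_t new_len) {
    if (old_len >= 0) {
        int bin = hist_bin((uint64_t)old_len);
        assert(hist[bin] > 0); /* the key must have been counted before */
        hist[bin]--;
    }
    if (new_len >= 0) hist[hist_bin((uint64_t)new_len)]++;
}
```

In this model, a command that changes a value's length without calling the update leaves the key counted in its old bin, which is the kind of drift the fixes above address.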
2025-05-19 16:59:21 +03:00
debing.sun 5d0d64b062
Add support to defrag ebuckets incrementally (#13842)
In PR #13229, we introduced the ebucket for HFE.
Before this PR, when updating eitems stored in ebuckets, the lack of
incremental defragmentation support for non-kvstore data structures (until
PR #13814) meant that we had to reverse lookup the position of the eitem
in the ebucket and then perform the update.
This approach was inefficient as it often required frequent traversals
of the segment list to locate and update the item.

To address this issue, this PR implements incremental
defragmentation for hash dict ebuckets and server.hexpires.
By incrementally defragging the ebuckets, we also defragment
the associated items, which eliminates the need for frequent traversals of
the segment list when defragging an eitem.

---------

Co-authored-by: Moti Cohen <moticless@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-18 12:38:53 +08:00
Moti Cohen 3b19c7919b
Fix test INFO overhead for 32bit architecture (#14035)
This PR fixes the `tests/unit/info.tcl` test to properly handle 32-bit 
architectures by dynamically determining the pointer size based on the 
architecture instead of hardcoding it to 8 bytes.
2025-05-15 12:35:36 +03:00
debing.sun ae0bb6e82a
Fix internal-secret test flakiness under slow environment (#14024)
In the original test, we started a cluster with 20 instances (10 masters +
10 replicas), which led to frequent disconnections of instances in a
slow environment, so the cluster could not reach a consistent state.

This PR reduces the number of instances from 20 to 6.
2025-05-14 16:31:41 +08:00
Vitah Lin 232f2fb077
Include missing getchannels.tcl in moduleapi tests and fix incorrect assertions (#14037)
2025-05-14 08:57:01 +08:00
Ozan Tezcan a0b22576b8
Fix flaky replication test (#14034)
- Fix flaky replication test which checks memory usage on master
- Fix comments in another replication test
2025-05-13 13:29:27 +03:00
Salvatore Sanfilippo 65e164caff
[Vector sets] More rdb loading fixes (#14032)
Hi all, this PR fixes two things:

1. An assertion that prevented RDB loading from recovering if there
was a quantization type mismatch (with regression test).
2. Two code paths that just returned NULL without proper cleanup during
RDB loading.
2025-05-12 21:57:38 +03:00
Moti Cohen e1789e4368
keyspace - Unify key and value & use dict no_value=1 (#13806)
The idea of packing the key (`sds`), value (`robj`) and optionally TTL
into a single struct in memory was mentioned a few times in the past by
the community in various flavors. This approach improves memory
efficiency, reduces pointer dereferences for faster lookups, and
simplifies expiration management by keeping all relevant data in one
place. This change goes along with setting the keyspace's dict to
no_value=1, which saves a considerable amount of memory.

Two more motivations that align well with this unification are:

- Prepare the groundwork for replacing the scan-based EXPIRE implementation
and evaluating instead the new `ebuckets` data structure that was introduced
as part of the [Hash Field Expiration
feature](https://redis.io/blog/hash-field-expiration-architecture-and-benchmarks/).
Using this data structure requires embedding the ExpireMeta structure
within each object.
- Consider replacing dict with a more space-efficient open-addressing
hash table that might rely on keeping a single pointer to the
object.

Before this PR, I built a proof of concept of an open-addressing hash
table variant and was surprised to find that dict with no_value could
actually provide a good balance between performance, memory efficiency,
and simplicity. This realization prompted separating the unification step
from the evaluation of a new hash table, to avoid introducing too many
changes at once and to evaluate its impact independently before
considering replacing the existing hash table. In an earlier
[commit](https://github.com/redis/redis/pull/13683) I extended the dict
no_value optimization (which avoids keeping a dictEntry where possible) to
also cover objects with even addresses in memory. Combining it
with this unification saves a considerable amount of memory for the
keyspace.

# kvobj
This PR adopts Valkey’s
[packing](3eb8314be6)
layout and logic for key, value, and TTL. However, unlike the Valkey
implementation, which retained a common `robj` throughout the project,
this PR distinguishes between the general-purpose, overused `robj` and
the new `kvobj`, which embeds both the key and value and is used by the
keyspace. Conceptually, `robj` serves as a base class, while `kvobj`
acts as a derived class.

Two new flags are introduced into the Redis object, `iskvobj` and `expirable`:
```
struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    unsigned iskvobj : 1;             /* new flag */
    unsigned expirable : 1;           /* new flag */
    unsigned refcount : 30;           /* modified: 32bits->30bits */
    void *ptr;
};

typedef struct redisObject robj;
typedef struct redisObject kvobj;
```
When the `iskvobj` flag is set, the object also includes the key, which
is appended to the end of the object. If the `expirable` flag is set, an
additional 8 bytes are added to the object. If the object is of type
string, and the string is rather short, then it will be embedded as
well.

As a result, all keys in the keyspace are promoted to be of type
`kvobj`. This term attempts to align with the existing Redis object,
robj, and the kvstore data structure.

# EXPIRE Implementation
As `kvobj` embeds the expiration time as well, looking up expiration times
is now an O(1) operation. The EXPIRE hash table is now also set to
`no_value` mode, directly referencing `kvobj` entries, which in turn
saves memory.

Next, I plan to evaluate replacing the EXPIRE implementation with the
[ebuckets](https://github.com/redis/redis/blob/unstable/src/ebuckets.h)
data structure, which would eliminate keyspace scans for expired keys.
This requires embedding `ExpireMeta` within each `kvobj` of each key
with expiration. In such an implementation, the `expirable` flag will be
repurposed to indicate whether `ExpireMeta` is attached.


# Implementation notes

## Manipulating keyspace (find, modify, insert)
Initially, unifying the key and value into a single object and storing
it in dict with `no_value` optimization seemed like a quick win.
However, it (quickly) became clear that this change required deeper
modifications to how keys are manipulated. The challenge was handling
cases where a dictEntry is optimized out due to the no_value optimization.
In such cases, many of the APIs that return the dictEntry from a lookup
become insufficient, as the result might just be the key itself. To
address this issue, a new-old approach was adopted: returning a "link" to
the looked-up key's `dictEntry` instead of the `dictEntry` itself. The
term `link` was already somewhat present in the dict API, and aligns well
with the new dictEntLink declaration:
```
typedef dictEntry **dictEntLink;
```
This PR adds two new functions to the dict API that leverage the link
returned from a search:
```
dictEntLink dictFindLink(dict *d, const void *key, dictEntLink *bucket);
void dictSetKeyAtLink(dict *d, void *key, dictEntLink *link, int newItem);
```
After calling `link = dictFindLink(...)`, any necessary updates must be
performed immediately afterwards by calling `dictSetKeyAtLink()`, without
any intervening operations on the given dict. Otherwise, `dictEntLink` may
become invalid. Example:
```
/* replace existing key */
link = dictFindLink(d, key, &bucket);
// ... Do something, but don't modify the dict ...
// assert(link != NULL);
dictSetKeyAtLink(d, kv, &link, 0);
     
/* Add new value (If no space for the new key, dict will be expanded and 
   bucket will be looked up again.) */  
link = dictFindLink(d, key, &bucket);
// ... Do something, but don't modify the dict ...
// assert(link == NULL);
dictSetKeyAtLink(d, kv, &bucket, 1);
```
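
The idea behind a "link" (a pointer to the pointer that references an entry) can be illustrated with a toy singly linked chain. This is only a sketch of the pointer-to-pointer technique, not the dict code: `node`, `node_link`, `find_link` and `set_at_link` are hypothetical names, and the real dict additionally has to handle buckets, rehashing, and entries that are stored without a `dictEntry`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct node {
    const char *key;
    struct node *next;
} node;

/* A "link" is the address of the pointer that refers to a node. */
typedef node **node_link;

/* Return the link of the node holding `key`, or NULL if it is absent.
 * When absent and `tail` is non-NULL, *tail receives the link where a
 * new node can be attached. */
static node_link find_link(node **head, const char *key, node_link *tail) {
    node_link link = head;
    while (*link) {
        if (strcmp((*link)->key, key) == 0) return link;
        link = &(*link)->next;
    }
    if (tail) *tail = link;
    return NULL;
}

/* Replace the node at `link` (is_new == 0) or attach a new node at
 * `link` (is_new == 1), without searching the chain again. */
static void set_at_link(node_link link, node *n, int is_new) {
    assert(link != NULL && n != NULL);
    n->next = is_new ? *link : (*link)->next;
    *link = n;
}
```

Because the link points into the container itself, it is only valid until the container is modified, which mirrors the rule above that no other operation may intervene between the lookup and the update.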
## dict.h 
- The dict API has become cluttered with many unused functions. I have
removed these from dict.h.
- Additionally, APIs specifically related to hash maps (no_value=0),
primarily those handling key-value access, have been gathered and
isolated.
- Entirely removed the internal functions ending with “*ByHash()” that were
originally added for optimization and are no longer required.
- A few other legacy dict functions were adapted at the API level to work
with the term dictEntLink as well.
- Simplified and generalized an optimization related to comparing
the lengths of keys of type string.

## Hash Field Expiration
Until now, each hash object with expiration on fields needed to maintain
a reference to its key-name (of the hash object), so that if it
was active-expired, it would be possible to resolve the key-name
for the sake of the notification. This is no longer needed.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2025-05-12 10:15:17 +03:00
Mincho Paskalev 4d9b4d6e51
Input output traffic stats and command process count for each client. (#13944)
2025-05-09 16:55:47 +03:00
Mincho Paskalev fdbf88032c
Add MSan and integrate it with CI (#13916)
## Description
Memory sanitizer (MSAN) is used to detect use-of-uninitialized memory
issues. While Address Sanitizer catches a wide range of memory safety
issues, it doesn't specifically detect uninitialized memory usage.
Therefore, Memory Sanitizer complements Address Sanitizer. This PR adds
an MSAN run to the daily build, with the possibility of incorporating it
into the ci.yml workflow in the future if needed.

Changes in source files fix false-positive issues and they should not
introduce any runtime implications.

Note: Valgrind performs similar checks to both ASAN and MSAN but
sanitizers run significantly faster.

## Limitations
- Memory sanitizer is only supported by Clang.
- MSAN documentation states that all dependencies, including the
standard library, must be compiled with MSAN. However, it also mentions
there are interceptors for common libc functions, so compiling the
standard library with the MSAN flag is not strictly necessary.
Therefore, we are not compiling libc with MSAN.

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
2025-05-09 11:44:54 +03:00
Moti Cohen 30d5f05637
Fix various KEYSIZES enumeration issues (#13923)
There are several issues with maintaining histogram counters.

Ideally, the hooks would be placed in the low-level datatype
implementations. However, this logic is triggered in various contexts
and doesn’t always map directly to a stored DB key. As a result, the
hooks sit closer to the high-level commands layer. It’s a bit messy, but
the right way to ensure histogram counters behave correctly is through
broad test coverage.

* Fix inaccuracies around deletion scenarios.
* Fix inaccuracies around module calls. Added corresponding tests.
* The info-keysizes.tcl test has been extended to operate on meaningful
datasets
* Validate histogram correctness in edge cases involving collection
deletions.
* Add new macro debugServerAssert(). Effective only if compiled with
DEBUG_ASSERTIONS.
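
A debug-only assertion macro of this kind is commonly written along the following lines. This is a generic sketch and not the definition added by the PR: `debug_assert_sketch`, `debug_assertions_enabled` and `counter_decrement` are hypothetical names, and the sketch uses the standard `assert()` where Redis would use its own assertion machinery.

```c
#include <assert.h>

#ifdef DEBUG_ASSERTIONS
/* Checked build: behave like a regular assertion. */
#define debug_assert_sketch(e) assert(e)
static int debug_assertions_enabled(void) { return 1; }
#else
/* Regular build: compile to nothing, so the check has no runtime cost. */
#define debug_assert_sketch(e) ((void)0)
static int debug_assertions_enabled(void) { return 0; }
#endif

/* Hypothetical caller: a counter that must never drop below zero. */
static long long counter_decrement(long long counter) {
    debug_assert_sketch(counter > 0);
    return counter - 1;
}
```

With this pattern, relatively expensive invariants (such as histogram counters never going negative) can be checked in test builds only.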

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2025-05-08 10:59:12 +03:00
Yuan Wang 6a436b6f72
Redis-cli gets RDB by RDB channel (#13809)
Now that we have the RDB channel from https://github.com/redis/redis/pull/13732,
a child process can transfer the RDB in the background, instead of it being
handled by the main thread. So when redis-cli gets the RDB from the server,
we can adopt this approach to reduce the main thread load.

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
2025-05-08 08:47:29 +08:00
Salvatore Sanfilippo a46624e10e
[Vector sets] RDB IO errors handling (#13978)
This PR adds support for REDISMODULE_OPTIONS_HANDLE_IO_ERRORS
and tests for short read and corrupted RESTORE payload.

Please note that I also removed the comment about async loading support
since we should be already covered. There is no manipulation of global data
structures in Vector Sets, other than the unique ID used to create new
vector sets with different IDs.
2025-05-07 21:49:00 +03:00
debing.sun ac0bef15b5
Correctly update kvstore overhead after emptying or releasing dict (#13984)
Close #13973

This PR fixed two bugs.
1)  `overhead_hashtable_lut` isn't updated correctly
    This bug was introduced by https://github.com/redis/redis/pull/12913
We only update `overhead_hashtable_lut` at the beginning and end of
rehashing, but we forgot to update it when a dict is emptied or
released.

This PR introduces a new `bucketChanged` callback to track
changes in the bucket size.
Now, the `rehashingStarted` and `rehashingCompleted` callbacks are no longer
responsible for bucket changes, which are handled entirely by
`bucketChanged`. This also avoids the need to register three
callbacks to track the change of bucket size; now only one is needed.

In most cases, it will be triggered together with `rehashingStarted` or
`rehashingCompleted`,
except when a dict is being emptied or released, in these cases, even if
the dict is not rehashing, we still need to subtract its current size.

On the other hand, `overhead_hashtable_lut` was duplicated with
`bucket_count`, so we remove `overhead_hashtable_lut` and use
`bucket_count` instead.

Note that this bug only happens with cluster mode, because we don't use
KVSTORE_FREE_EMPTY_DICTS without cluster.

2) The size of `dict_size_index` is counted twice in terms of memory
usage.
`dict_size_index` is created at startup, so its memory usage has already been
counted in `used_memory_startup`.
However, when we count the overhead, we count it again,
which may cause the overhead to exceed the total memory usage.
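
For the first bug, the single-callback design can be sketched with a toy container that reports every change in its bucket count as a signed delta, including when it is released. All names here (`toy_dict`, `on_bucket_changed`, `toy_dict_resize`, `toy_dict_release`) are hypothetical and the real kvstore and dict code differ; the sketch only shows why one delta callback is enough to keep a running total correct.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_dict toy_dict;
typedef void (*bucket_changed_fn)(toy_dict *d, long long delta);

struct toy_dict {
    size_t buckets;                   /* current number of buckets */
    bucket_changed_fn bucket_changed; /* notified on every size change */
    long long *total;                 /* owner's running bucket count */
};

/* Owner-side callback: accumulate the delta into the running total. */
static void on_bucket_changed(toy_dict *d, long long delta) {
    assert(d != NULL && d->total != NULL);
    *d->total += delta;
}

/* Any change of the table size goes through the same notification. */
static void toy_dict_resize(toy_dict *d, size_t new_buckets) {
    long long delta = (long long)new_buckets - (long long)d->buckets;
    d->buckets = new_buckets;
    if (d->bucket_changed && delta != 0) d->bucket_changed(d, delta);
}

/* Emptying or releasing subtracts the current size, even when no
 * rehashing is in progress. */
static void toy_dict_release(toy_dict *d) {
    toy_dict_resize(d, 0);
}
```

In this model, forgetting the notification in the release path would leave the owner's total permanently too high, which is the shape of the bug described above.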

---------

Co-authored-by: Yuan Wang <yuan.wang@redis.com>
2025-05-07 16:45:23 +08:00
Vitah Lin 47505c3533
Fix 'Client output buffer hard limit is enforced' test causing infinite loop (#13934)
This PR fixes an issue in the CI test for client-output-buffer-limit,
which was causing an infinite loop when running on macOS 15.4.

### Problem

This test starts two clients, R and R1:
```c
R1 subscribe foo
R publish foo bar
```

When R executes `PUBLISH foo bar`, the server first stores the message
`bar` in R1's buf. Only when the space in buf is insufficient does it
call `_addReplyProtoToList`.
Inside this function, `closeClientOnOutputBufferLimitReached` is invoked
to check whether client R1's output buffer has reached its
configured limit.
On macOS 15.4, because the server writes to the client at a high speed,
R1's buf never gets full. As a result,
`closeClientOnOutputBufferLimitReached` is never triggered in the test,
causing the test to never exit and fall into an infinite loop.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2025-05-06 10:44:16 +08:00
Pieter Cailliau d65102861f
Adding AGPLv3 as a license option to Redis! (#13997)
Read more about [the new license option](http://redis.io/blog/agplv3/)
and [the Redis 8 release](http://redis.io/blog/redis-8-ga/).
2025-05-01 14:04:22 +01:00
YaacovHazan de16bee70a
Limiting output buffer for unauthenticated client (CVE-2025-21605) (#13993)
For unauthenticated clients, the output buffer is limited to prevent them
from abusing it by not reading the replies.
2025-04-30 09:58:51 +03:00
Yuan Wang 14dd59ab12
Remove io-threads-do-reads from normal config list (#13987)
Since https://github.com/redis/redis/pull/13695, the
`io-threads-do-reads` config has been deprecated. It should have been
removed from the normal config list and kept only in the deprecated config
list, but this was overlooked. This PR fixes that.

Thanks @YaacovHazan for reporting this.
2025-04-28 12:55:47 +03:00
Vitah Lin 9f99dd5f6d
Fix tls port update not reflected in CLUSTER SLOTS (#13966)
### Problem 

A previous PR (https://github.com/redis/redis/pull/13932) fixed the TCP
port issue in CLUSTER SLOTS, but it seems the handling of the TLS port
was overlooked.

There is this comment in the `addNodeToNodeReply` function in the
`cluster.c` file:
```c
    /* Report TLS ports to TLS client, and report non-TLS port to non-TLS client. */
    addReplyLongLong(c, clusterNodeClientPort(node, shouldReturnTlsInfo()));
    addReplyBulkCBuffer(c, clusterNodeGetName(node), CLUSTER_NAMELEN);
```

### Fixed 

This PR fixes the TLS port issue and adds relevant tests.
2025-04-24 09:36:45 +08:00
nesty92 8468ded667
Fix incorrect lag due to trimming stream via XTRIM or XADD command (#13958)
This PR fixes the lag calculation by ensuring that when a consumer group's last_id
is behind the first entry, the consumer group's entries-read count is considered
invalid and is recalculated from the start of the stream.

Supplement to PR #13473 

Close #13957

Signed-off-by: Ernesto Alejandro Santana Hidalgo <ernesto.alejandrosantana@gmail.com>
2025-04-22 10:11:10 +08:00
Stav-Levi a257b6b4ba
Fix port update not reflected in CLUSTER SLOTS (#13932)
Close https://github.com/redis/redis/issues/13892 
The `CONFIG SET port` command updates server.port, while `CLUSTER SLOTS`
retrieves information about cluster slots and their associated nodes. The
fix updates this information when `CONFIG SET port` is executed, so that
`CLUSTER SLOTS` returns the right value.
2025-04-21 17:13:55 +08:00
Oran Agra 725c71b87a
fix flaky replication test (#13945)
From the master's perspective, the replica can become online before it's
actually done loading the rdb file.
This was always the case in disk-based replication, and is thus also fine
with diskless and rdb channel replication.
In this test, because all the keys are added before the backlog is
created, the replication offset is 0, so the test proceeds and could get
a LOADING error when trying to run the function.
2025-04-15 17:18:00 +03:00
Oran Agra d3a0d95dab
Avoid using debug log level in tests that produce many keys (#13942)
If the test fails and there are per-key log prints, this can flood the
CI output due to --dump-logs.
2025-04-14 14:15:50 +03:00
Ozan Tezcan c9be4fbd72
Fix order of KSN for hgetex command (#13931)
If HGETEX command deletes the only field due to lazy expiry, Redis
currently sends `del` KSN (Keyspace Notification) first, followed by
`hexpired` KSN. The order should be reversed, `hexpired` should be sent
first and `del` later.

Additional changes: More test coverage for HGETDEL KSN

---------

Co-authored-by: hristosko <hristosko.chaushev@redis.com>
2025-04-14 13:31:31 +03:00
debing.sun b33a405bf1
Fix timing issue in lazyfree test (#13926)
This test was introduced by https://github.com/redis/redis/issues/13853
The test checks whether the client is in blocked status, but if the async
flushdb completes before the blocked status is checked, the test fails.
So modify the test to only check that `lazyfree_pending_objects` is
correct, which ensures that flushdb is async, that is, the client must be
blocked.
2025-04-13 20:32:16 +08:00