Commit Graph

2373 Commits

Author SHA1 Message Date
Moti Cohen c18ff05665
Hash Field Expiration - Basic support
- Add ebuckets & mstr data structures
- Integrate active & lazy expiration
- Add most of the commands 
- Add support for dict (listpack is missing)
TODOs:  RDB, notification, listpack, HSET, HGETF, defrag, aof
2024-04-18 16:06:30 +03:00
Binbin 804110a487
Allocate Lua VM code with jemalloc instead of libc, and count it used memory (#13133)
## Background
1. Currently Lua memory control does not pass through Redis's zmalloc.c.
Redis maxmemory cannot limit memory problems caused by users abusing lua
since these lua VM memory is not part of used_memory.

2. Since jemalloc is much better (fragmentation and speed), and also we
know it and trust it. we are
going to use jemalloc instead of libc to allocate the Lua VM code and
count it used memory.

## Process:
In this PR, we will use jemalloc in lua. 
1. Create an arena for all lua vm (script and function), which is
shared, in order to avoid blocking defragger.
2. Create a bound tcache for the lua VM, since the lua VM and the main
thread are by default in the same tcache, and if there is no isolated
tcache, lua may request memory from the tcache which has just been freed
by main thread, and vice versa
On the other hand, since lua vm might be release in bio thread, but
tcache is not thread-safe, we need to recreate
    the tcache every time we recreate the lua vm.
3. Remove lua memory statistics from memory fragmentation statistics to
avoid the effects of lua memory fragmentation

## Other
Add the following new fields to `INFO DEBUG` (we may promote them to
INFO MEMORY some day)
1. allocator_allocated_lua: total number of bytes allocated of lua arena
2. allocator_active_lua: total number of bytes in active pages allocated
in lua arena
3. allocator_resident_lua: maximum number of bytes in physically
resident data pages mapped in lua arena
4. allocator_frag_bytes_lua: fragment bytes in lua arena

This is oranagra's idea, and i got some help from sundb.

This solves the third point in #13102.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2024-04-16 12:43:33 +03:00
debing.sun 4581d43230
Fix daylight race condition and some thread leaks (#13191)
fix some issues that come from sanitizer thread report.

1. when the main thread is updating daylight_active, other threads (bio,
module thread) may be writing logs at the same time.
```
WARNING: ThreadSanitizer: data race (pid=661064)
  Read of size 4 at 0x55c9a4d11c70 by thread T2:
    #0 serverLogRaw /home/sundb/data/redis_fork/src/server.c:116 (redis-server+0x8d797) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #1 _serverLog.constprop.2 /home/sundb/data/redis_fork/src/server.c:146 (redis-server+0x2a3b14) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #2 bioProcessBackgroundJobs /home/sundb/data/redis_fork/src/bio.c:329 (redis-server+0x1c24ca) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)

  Previous write of size 4 at 0x55c9a4d11c70 by main thread (mutexes: write M0, write M1, write M2, write M3):
    #0 updateCachedTimeWithUs /home/sundb/data/redis_fork/src/server.c:1102 (redis-server+0x925e7) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #1 updateCachedTimeWithUs /home/sundb/data/redis_fork/src/server.c:1087 (redis-server+0x925e7)
    #2 updateCachedTime /home/sundb/data/redis_fork/src/server.c:1118 (redis-server+0x925e7)
    #3 afterSleep /home/sundb/data/redis_fork/src/server.c:1811 (redis-server+0x925e7)
    #4 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:389 (redis-server+0x85ae0) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #5 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:342 (redis-server+0x85ae0)
    #6 aeMain /home/sundb/data/redis_fork/src/ae.c:477 (redis-server+0x85ae0)
    #7 main /home/sundb/data/redis_fork/src/server.c:7211 (redis-server+0x7168c) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
```

2. thread leaks in module tests
```
WARNING: ThreadSanitizer: thread leak (pid=668683)
  Thread T13 (tid=670041, finished) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x3d179) (BuildId: 28a9f70061dbb2dfa2cef661d3b23aff4ea13536)
    #1 HelloBlockNoTracking_RedisCommand /home/sundb/data/redis_fork/tests/modules/blockonbackground.c:200 (blockonbackground.so+0x97fd) (BuildId: 9cd187906c57e88cdf896d121d1d96448b37a136)
    #2 HelloBlockNoTracking_RedisCommand /home/sundb/data/redis_fork/tests/modules/blockonbackground.c:169 (blockonbackground.so+0x97fd)
    #3 call /home/sundb/data/redis_fork/src/server.c:3546 (redis-server+0x9b7fb) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #4 processCommand /home/sundb/data/redis_fork/src/server.c:4176 (redis-server+0xa091c) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #5 processCommandAndResetClient /home/sundb/data/redis_fork/src/networking.c:2468 (redis-server+0xd2b8e) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #6 processInputBuffer /home/sundb/data/redis_fork/src/networking.c:2576 (redis-server+0xd2b8e)
    #7 readQueryFromClient /home/sundb/data/redis_fork/src/networking.c:2722 (redis-server+0xd358f) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #8 callHandler /home/sundb/data/redis_fork/src/connhelpers.h:58 (redis-server+0x288a7b) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #9 connSocketEventHandler /home/sundb/data/redis_fork/src/socket.c:277 (redis-server+0x288a7b)
    #10 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:417 (redis-server+0x85b45) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
    #11 aeProcessEvents /home/sundb/data/redis_fork/src/ae.c:342 (redis-server+0x85b45)
    #12 aeMain /home/sundb/data/redis_fork/src/ae.c:477 (redis-server+0x85b45)
    #13 main /home/sundb/data/redis_fork/src/server.c:7211 (redis-server+0x7168c) (BuildId: dca0b1945ba30010e36129bdb296e488dd2b32d0)
```
2024-04-04 13:49:51 +03:00
Moti Cohen 4df037962d
Change FLUSHALL/FLUSHDB SYNC to run as blocking ASYNC (#13167)
# Overview
Users utilize the `FLUSHDB SYNC` and `FLUSHALL SYNC` commands for a variety of 
reasons. The main issue with this command is that if the database becomes 
substantial in size, the server will be unresponsive for an extended period. 
Other than freezing application traffic, this may also lead some clients making 
incorrect judgments about the server's availability. For instance, a watchdog may 
erroneously decide to terminate the process, resulting in potential adverse 
outcomes. While a `FLUSH* ASYNC` can address these issues, it might not be used 
for two reasons: firstly, it's not the default, and secondly, in some cases, the 
client issuing the flush wants to wait for its completion before repopulating the 
database.

Between the option of triggering FLUSH* asynchronously in the background without 
indication for completion versus running it synchronously in the foreground by 
the main thread, there is another more appealing option. We can block the
client that requested the flush, execute the flush command in the background, and 
once done, unblock the client and return notification for completion. This approach 
ensures the server remains responsive to other clients, and the blocked client 
receives the expected response only after the flush operation has been successfully 
carried out.

# Implementation details
Instead of defining yet another flavor to the flush command, we can modify
`FLUSHALL SYNC` and `FLUSHDB SYNC` always run in this new mode.

## Extending BIO Threads capabilities
Today jobs that are carried out by BIO threads don't have the capability to 
indicate completion to the main thread. We can add this infrastructure by having
an additional dummy job, coined as completion-job, that eventually will be written 
by BIO threads to a response-queue. The main thread will take care to consume items
from the response-queue and call the provided callback function of each 
completion-job.

## FLUSH* SYNC to run as blocking ASYNC
Command `FLUSH* SYNC` will be modified to create one or more async jobs to flush
DB(s) and afterward will push additional completion-job request. By sending the
completion job request only at the end, the main thread will be called back only
after all the preceding jobs completed their task in the background. During that
time, the client of the command is suspended and marked as `BLOCKED_LAZYFREE`
whereas any other client will be able to communicate with the server without any
issue.
2024-04-02 15:09:52 +03:00
Pieter Cailliau 0b34396924
Change license from BSD-3 to dual RSALv2+SSPLv1 (#13157)
[Read more about the license change
here](https://redis.com/blog/redis-adopts-dual-source-available-licensing/)
Live long and prosper 🖖
2024-03-20 22:38:24 +00:00
Yanqi Lv bad33f8738
fix wrong data type conversion in zrangeResultBeginStore (#13148)
In `beginResultEmission`, -1 means the result length is not known in
advance. But after #12185, if we pass -1 to `zrangeResultBeginStore`, it
will convert to SIZE_MAX in `zsetTypeCreate` and try to `dictExpand`.
Although `dictExpand` won't succeed because the size overflows, I think
we'd better to avoid this wrong conversion.

This bug can be triggered when the source of `zrangestore` doesn't exist
or we use `zrangestore` command with `byscore` or `bylex`.
The impact is that dst keys will be converted to use skiplist instead of
listpack.
2024-03-19 08:52:55 +02:00
Binbin e04d41d78d
Prevent lua error_reply abuse from causing errorstats to become larger (#13141)
Users who abuse lua error_reply will generate a new error object on each
error call, which can make server.errors get bigger and bigger. This
will
cause the server to block when calling INFO (we also return errorstats
by
default).

To prevent the damage it can cause, when a misuse is detected, we will
print a warning log and disable the errorstats to avoid adding more new
errors. It can be re-enabled via CONFIG RESETSTAT.

Because server.errors may be very large (it may be better now since we
have the limit), config resetstat may block for a while. So in
resetErrorTableStats, we will try to lazyfree server.errors.

See the related discussion at the end of #8217.
2024-03-19 08:18:22 +02:00
Binbin 7b070423b8
Fix dictionary use-after-free in active expire and make kvstore iter to respect EMPTY flag (#13135)
After #13072, there is an use-after-free error. In expireScanCallback, we
will delete the dict, and then in dictScan we will continue to use the dict,
like we will doing `dictResumeRehashing(d)` in the end, this casued an error.

In this PR, in freeDictIfNeeded, if the dict's pauserehash is set, don't
delete the dict yet, and then when scan returns try to delete it again.

At the same time, we noticed that there will be similar problems in iterator.
We may also delete elements during the iteration process, causing the dict
to be deleted, so the part related to iter in the PR has also been modified.
dictResetIterator was also missing from the previous kvstoreIteratorNextDict,
we currently have no scenario that elements will be deleted in kvstoreIterator
process, deal with it together to avoid future problems. Added some simple
tests to verify the changes.

In addition, the modification in #13072 omitted initTempDb and emptyDbAsync,
and they were also added. This PR also remove the slow flag from the expire
test (consumes 1.3s) so that problems can be found in CI in the future.
2024-03-18 17:41:54 +02:00
Binbin ad28d222ed
Lua eval scripts first in first out LRU eviction (#13108)
In some cases, users will abuse lua eval. Each EVAL call generates
a new lua script, which is added to the lua interpreter and cached
to redis-server, consuming a large amount of memory over time.

Since EVAL is mostly the one that abuses the lua cache, and these
won't have pipeline issues (i.e. the script won't disappear
unexpectedly,
and cause errors like it would with SCRIPT LOAD and EVALSHA),
we implement a plain FIFO LRU eviction only for these (not for
scripts loaded with SCRIPT LOAD).

### Implementation notes:
When not abused we'll probably have less than 100 scripts, and when
abused we'll have many thousands. So we use a hard coded value of 500
scripts. And considering that we don't have many scripts, then unlike
keys, we don't need to worry about the memory usage of keeping a true
sorted LRU linked list. We compute the SHA of each script anyway,
and put the script in a dict, we can store a listNode there, and use
it for quick removal and re-insertion into an LRU list each time the
script is used.

### New interfaces:
At the same time, a new `evicted_scripts` field is added to
INFO, which represents the number of evicted eval scripts. Users
can check it to see if they are abusing EVAL.

### benchmark:
`./src/redis-benchmark -P 10 -n 1000000 -r 10000000000 eval "return
__rand_int__" 0`

The simple abuse of eval benchmark test that will create 1 million EVAL
scripts. The performance has been improved by 50%, and the max latency
has dropped from 500ms to 13ms (this may be caused by table expansion
inside Lua when the number of scripts is large). And in the INFO memory,
it used to consume 120MB (server cache) + 310MB (lua engine), but now
it only consumes 70KB (server cache) + 210KB (lua_engine) because of
the scripts eviction.

For non-abusive case of about 100 EVAL scripts, there's no noticeable
change in performance or memory usage.

### unlikely potentially breaking change:
in theory, a user can maybe load a
script with EVAL and then use EVALSHA to call it (by calculating the
SHA1 value on the client side), it could be that if we read the docs
carefully we'll realized it's a valid scenario, but we suppose it's
extremely rare. So it may happen that EVALSHA acts on a script created
by EVAL, and the script is evicted and EVALSHA returns a NOSCRIPT error.
that is if you have more than 500 scripts being used in the same
transaction / pipeline.

This solves the second point in #13102.
2024-03-13 08:27:41 +02:00
Ronen Kalish a8e745117f
Xread last entry in stream (#7388) (#13117)
Allow using `+` as a special ID for last item in stream on XREAD
command.

This would allow to iterate on a stream with XREAD starting with the
last available message instead of the next one which `$` is used for.
I.e. the caller can use `BLOCK` and `+` on the first call, and change to
`$` on the next call.

Closes #7388

---------

Co-authored-by: Felipe Machado <462154+felipou@users.noreply.github.com>
2024-03-13 08:23:32 +02:00
Viktor Söderqvist 9efc6ad6a6
Add API RedisModule_ClusterKeySlot and RedisModule_ClusterCanonicalKeyNameInSlot (#13069)
Sometimes it's useful to compute a key's cluster slot in a module.

This API function is just like the command CLUSTER KEYSLOT (but faster).

A "reverse" API is also added:
`RedisModule_ClusterCanonicalKeyNameInSlot`. Given a slot, it returns a
short string that we can call a canonical key for the slot.
2024-03-12 09:26:12 -07:00
Binbin da727ad445
Fix redis-check-aof incorrectly considering data in manifest format as MP-AOF (#12958)
The check in fileIsManifest misjudged the manifest file. For example,
if resp aof contains "file", it will be considered a manifest file and
the check will fail:
```
*3
$3
set
$4
file
$4
file
```

In #12951, if the preamble aof also contains it, it will also fail.
Fixes #12951.

the bug was happening if the the word "file" is mentioned
in the first 1024 lines of the AOF. and now as soon as it finds
a non-comment line it'll break (if it contains "file" or doesn't)
2024-03-12 08:47:43 +02:00
Matthew Douglass 5fdaa53d20
Fix conversion of numbers in lua args to redis args (#13115)
Since lua_Number is not explicitly an integer or a double, we need to
make an effort
to convert it as an integer when that's possible, since the string could
later be used
in a context that doesn't support scientific notation (e.g. 1e9 instead
of 100000000).

Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`,
which ever is shorter,
this would break if we try to pass a long integer number to a command
that takes integer.
we'll get an implicit conversion to string in Lua, and then the parsing
in getLongLongFromObjectOrReply will fail.

```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```

Switch to using ll2string if the number can be safely represented as a
long long.

The problem was introduced in #10587 (Redis 7.2).
closes #13113.

---------

Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2024-03-10 08:46:49 +02:00
debing.sun 9738ba9841
Check user's oom_score_adj write permission for oom-score-adj test (#13111)
`CONFIG SET oom-score-adj handles configuration failures` test failed in
some CI jobs today.
Failed CI: https://github.com/redis/redis/actions/runs/8152519326

Not sure why the github action's docker image perssions have changed,
but the issue is similar to #12887,
where we can't assume the range of oom_score_adj that a user can change.

## Solution:
Modify the way of determining whether the current user has no privileges
or not,
instead of relying on whether the user id is 0 or not.
2024-03-05 14:42:28 +02:00
debing.sun ad12730333
Implement defragmentation for pubsub kvstore (#13058)
After #13013

### This PR make effort to defrag the pubsub kvstore in the following
ways:

1. Till now server.pubsub(shard)_channels only share channel name obj
with the first subscribed client, now change it so that the clients and
the pubsub kvstore share the channel name robj.
This would save a lot of memory when there are many subscribers to the
same channel.
It also means that we only need to defrag the channel name robj in the
pubsub kvstore, and then update
all client references for the current channel, avoiding the need to
iterate through all the clients to do the same things.
    
2. Refactor the code to defragment pubsub(shard) in the same way as
defragment of keys and EXPIRES, with the exception that we only
defragment pubsub(without shard) when slot is zero.


### Other
Fix an overlook in #11695, if defragment doesn't reach the end time, we
should wait for the current
db's keys and expires, pubsub and pubsubshard to finish before leaving,
now it's possible to exit
early when the keys are defragmented.

---------

Co-authored-by: oranagra <oran@redislabs.com>
2024-03-04 16:56:50 +02:00
Chen Tianjie 4cae99e785
Add overhead of all DBs and rehashing dict count to info. (#12913)
Sometimes we need to make fast judgement about why Redis is suddenly
taking more memory. One of the reasons is main DB's dicts doing
rehashing.

We may use `MEMORY STATS` to monitor the overhead memory of each DB, but
there still lacks a total sum to show an overall trend. So this PR adds
the total overhead of all DBs to `INFO MEMORY` section, together with
the total count of rehashing DB dicts, providing some intuitive metrics
about main dicts rehashing.

This PR adds the following metrics to INFO MEMORY
* `mem_overhead_db_hashtable_rehashing` - only size of ht[0] in
dictionaries we're rehashing (i.e. the memory that's gonna get released
soon)

and a similar ones to MEMORY STATS:
* `overhead.db.hashtable.lut` (complements the existing
`overhead.hashtable.main` and `overhead.hashtable.expires` which also
counts the `dictEntry` structs too)
* `overhead.db.hashtable.rehashing` - temporary rehashing overhead.
* `db.dict.rehashing.count` - number of top level dictionaries being
rehashed.

---------

Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2024-03-01 13:41:24 +08:00
Binbin f17381a38d
Fix propagation of entries_read by calling streamPropagateGroupID unconditionally (#12898)
In XREADGROUP ACK, because streamPropagateXCLAIM does not propagate
entries-read, entries-read will be inconsistent between master and
replicas.
I.e. if no entries were claimed, it would have propagated correctly, but
if some
were claimed, then the entries-read field would be inconsistent on the
replica.

The fix was suggested by guybe7, call streamPropagateGroupID
unconditionally,
so that we will normalize entries_read on the replicas. In the past, we
would
only set propagate_last_id when NOACK was specified. And in #9127,
XCLAIM did
not propagate entries_read in ACK, which would cause entries_read to be
inconsistent between master and replicas.

Another approach is add another arg to XCLAIM and let it propagate
entries_read,
but we decided not to use it. Because we want minimal damage in case
there's an
old target and new source (in the worst case scenario, the new source
doesn't
recognize XGROUP SETID ... ENTRIES READ and the lag is lost. If we
change XCLAIM,
the damage is much more severe).

In this patch, now if the user uses XREADGROUP .. COUNT 1 there will be
an additional
overhead of MULTI, EXEC and XGROUPSETID. We assume the extra command in
case of
COUNT 1 (4x factor, changing from one XCLAIM to
MULTI+XCLAIM+XSETID+EXEC), is probably
ok since reading just one entry is in any case very inefficient (a
client round trip
per record), so we're hoping it's not a common case.

Issue was introduced in #9127.
2024-02-29 09:48:20 +02:00
debing.sun 4a265554ae
Expose lua os.clock() api (#12971)
Implement #12699

This PR exposing Lua os.clock() api for getting the elapsed time of Lua
code execution.

Using:
```lua
local start = os.clock()
...
do something
...
local elpased = os.clock() - start
```

---------

Co-authored-by: Meir Shpilraien (Spielrein) <meir@redis.com>
Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>
2024-02-22 11:29:52 +02:00
debing.sun 165afc5f2a
Determine the large limit of the quicklist node based on fill (#12659)
Following #12568

In issue #9357, when inserting an element larger than 1GB, we currently
store it in a plain node instead of a listpack.
Presently, when we insert an element that exceeds the maximum size of a
packed node, it cannot be accommodated in any other nodes, thus ending
up isolated like a large element.
I.e. it's a node with only one element, but it's listpack encoded rather
than a plain buffer.

This PR lowers the threshold for considering an element as 'large' from
1GB to the maximum size of a node.
While this change doesn't completely resolve the bug mentioned in the
previous PR, it does mitigate its potential impact.

As a result of this change, we can now only use LSET to replace an
element with another element that falls below the maximum size
threshold.
In the worst-case scenario, with a fill of -5, the largest packed node
we can create is 2GB (32k * 64k):
* 32k: The smallest element in a listpack is 2 bytes, which allows us to
store up to 32k elements.
* 64k: This is the maximum size for a single quicklist node.

## Others
To fully fix #9357, we need more work, as discussed in #12568, when we
insert an element into a quicklistNode, it may be created in a new node,
put into another node, or merged, and we can't correctly delete the node
that was supposed to be deleted.
I'm not sure it's worth it, since it involves a lot of modifications.
2024-02-22 10:02:38 +02:00
Binbin ca5cac998e
xinfo-stream add minimum to seen-time, skip logreqres in fuzzer (#13056)
Recently I saw in CI that reply-schemas-validator fails here:
```
Failed validating 'minimum' in schema[1]['properties']['groups']['items']['properties']['consumers']['items']['properties']['active-time']:
    {'description': 'Last time this consumer was active (successful '
                    'reading/claiming).',
     'minimum': 0,
     'type': 'integer'}

On instance['groups'][0]['consumers'][0]['active-time']:
    -1729380548878722639
```

The reason is that in fuzzer, we may restore corrupted active-time,
which will cause the reply schema CI to fail.

The fuzzer can cause corrupt the state in many places, which will
bugs that mess up the reply, so we decided to skip logreqres.

Also, seen-time is the same type as active-time, adding the minimum.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-02-20 12:21:10 +02:00
Binbin 3c2ea1ea95
Fix wathced client test timing issue caused by late close (#13062)
There is a timing issue in the test, close may arrive late, or in
freeClientAsync we will free the client in async way, which will
lead to errors in watching_clients statistics, since we will only
unwatch all keys when we truly freeClient.

Add a wait here to avoid this problem. Also fixed some outdated
comments i saw. The test was introduced in #12966.
2024-02-20 11:12:19 +02:00
Binbin 4e3be944fc
Fix timing issue in blockedclient test (#13071)
We can see that the past time here happens to be busy_time_limit,
causing the test to fail:
```
[err]: RM_Call from blocked client in tests/unit/moduleapi/blockedclient.tcl
Expected '50' to be more than '50' (context: type eval line 26 cmd {assert_morethan [expr [clock clicks -milliseconds]-$start] $busy_time_limit} proc ::test)
```

It is reasonable for them to be equal, so equal is added here.
It should be noted that in the previous `Busy module command` test,
we also used assert_morethan_equal, so this should have been missed
at the time.
2024-02-20 08:43:13 +02:00
zhaozhao.zz 50d6fe8c4b
Add metrics for WATCH (#12966)
Redis has some special commands that mark the client's state, such as
`subscribe` and `blpop`, which mark the client as `CLIENT_PUBSUB` or
`CLIENT_BLOCKED`, and we have metrics for the special use cases.

However, there are also other special commands, like `WATCH`, which
although do not have a specific flags, and should also be considered
stateful client types. For stateful clients, in many scenarios, the
connections cannot be shared in "connection pool", meaning connection
pool cannot be used. For example, whenever the `WATCH` command is
executed, a new connection is required to put the client into the "watch
state" because the watched keys are stored in the client.

If different business logic requires watching different keys, separate
connections must be used; otherwise, there will be contamination. This
also means that if a user's business heavily relies on the `WATCH`
command, a large number of connections will be required.

Recently we have encountered this situation in our platform, where some
users consume a significant number of connections when using Redis
because of `WATCH`.

I hope we can have a way to observe these special use cases and special
client connections. Here I add a few monitoring metrics:

1. `watching_clients` in `INFO` reply: The number of clients currently
in the "watching" state.
2. `total_watched_keys` in `INFO` reply: The total number of keys being
watched.
3. `watch` in `CLIENT LIST` reply: The number of keys each client is
currently watching.
2024-02-18 10:36:41 +02:00
Binbin 32f44da510
Increase tolerance range to block reprocess tests to avoid timing issues (#13053)
These tests have all failed in daily CI:
```
*** [err]: Blocking XREADGROUP for stream key that has clients blocked on stream - reprocessing command in tests/unit/type/stream-cgroups.tcl
Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)

*** [err]: BLPOP unblock but the key is expired and then block again - reprocessing command in tests/unit/type/list.tcl
Expected '1101' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)

*** [err]: BZPOPMIN unblock but the key is expired and then block again - reprocessing command in tests/unit/type/zset.tcl
Expected '1103' to be between to '1000' and '1100' (context: type eval line 23 cmd {assert_range [expr $end-$start] 1000 1100} proc ::test)
```

Increase the range to avoid failures, and improve the comment to be
clearer.
tests was introduced in #13004.
2024-02-15 10:44:49 +02:00
Binbin 8eeece4ab3
Fix CLIENAT KILL MAXAGE test timing issue (#13047)
This test fails occasionally:
```
*** [err]: CLIENT KILL maxAGE will kill old clients in tests/unit/introspection.tcl
Expected 2 == 1 (context: type eval line 14 cmd {assert {$res == 1}} proc ::test)
```

This test is very likely to do a false positive if the execute time
takes longer than the max age, for example, if the execution time
between sleep and kill exceeds 1s, rd2 will also be killed due to
the max age.

The test can adjust the order of execution statements to increase
the probability of passing, but this is still will be a timing issue
in some slow machines, so decided give it a few more chances.

The test was introduced in #12299.
2024-02-12 08:11:33 +02:00
Binbin 493e31e3ad
Add new DEBUG dict-resizing command to disable the dict resize (#13043)
The test fails here and there:
```
*** [err]: expire scan should skip dictionaries with lot's of empty buckets in tests/unit/expire.tcl
scan didn't handle slot skipping logic.
```

There are two case:
1. In the case of passing the test, we use child process to avoid the
dict resize, but it can not completely limit it, since in the dictDelete
we still have chance to trigger the resize (hit the force radio). The
reason why our test passed before is because the expire dict is still
in the rehashing process, so the dictDelete, the dictShrinkIfNeeded can
not trigger the resize.

2. In the case of failing the test, the expire dict finished the
rehashing,
so the last dictDelete, the dictShrinkIfNeeded trigger the dict resize
since it hit the force radio, so the skipping logic fail.

This PR add a new DEBUG command to disbale the dict resize.
2024-02-08 16:39:58 +02:00
Binbin 813327b231
Fix SORT STORE quicklist with the right options (#13042)
We forgot to call quicklistSetOptions after createQuicklistObject,
in the sort store scenario, we will create a quicklist with default
fill or compress options.

This PR adds fill and depth parameters to createQuicklistObject to
specify that options need to be set after creating a quicklist.

This closes #12871.

release notes:
> Fix lists created by SORT STORE to respect list compression and
packing configs.
2024-02-08 14:36:11 +02:00
debing.sun 1e8dc1da0d
Fix crash due to merge of quicklist node introduced by #12955 (#13040)
Fix two crash introducted by #12955

When a quicklist node can't be inserted and split, we eventually merge
the current node with its neighboring
nodes after inserting, and compress the current node and its siblings.

1. When the current node is merged with another node, the current node
may become invalid and can no longer be used.

   Solution: let `_quicklistMergeNodes()` return the merged nodes.

3. If the current node is a LZF quicklist node, its recompress will be
1. If the split node can be merged with a sibling node to become head or
tail, recompress may cause the head and tail to be compressed, which is
not allowed.

    Solution: always recompress to 0 after merging.
2024-02-08 14:29:16 +02:00
Binbin 886b117031
Fix dict don't rehash when there is child test (#13035)
The reason is the same as #13016. The reason is that in #12819,
in cron, in addition to trying to shrink, we will also tyring
to expand. The dict was expanded by cron before we trigger the
bgsave since we do have the enough keys (4096) to hit the radio.

Before the bgsave, we only add 4095 keys to avoid this issue.
2024-02-07 09:19:18 +02:00
debing.sun 1f00c951c2
Prevent LSET command from causing quicklist plain node size to exceed 4GB (#12955)
Fix #12864

The main reason for this crash is that when replacing a element of a
quicklist packed node with lpReplace() method,
if the final size is larger than 4GB, lpReplace() will fail and returns
NULL, causing `node->entry` to be incorrectly set to NULL.

Since the inserted data is not a large element, we can't just replace it
like a large element, first quicklistInsertAfter()
and then quicklistDelIndex(), because the current node may be merged and
invalidated in quicklistInsertAfter().

The solution of this PR:
When replacing a node fails (listpack exceeds 4GB), split the current
node, create a new node to put in the middle, and try to merge them.
This is the same as inserting a large element.
In the worst case, its size will not exceed 4GB.
2024-02-06 18:21:28 +02:00
Binbin 13bd3643c2
Re-compute active_defrag_running after adjusting defrag configurations (#13020)
Currently, once active defrag starts, we can not adjust
active_defrag_running
downwards. This is because active_defrag_running will be dynamically
compute
based on the fragmentation, we think we should not lower the effort when
the
fragmentation drops.

However, we need to note that active_defrag_running will also be
dynamically
computed based on configurations. In this case, we are not respecting
cycle-min
or cycle-max. Some people may realize halfway through that defrag
consumes a
lot and want to adjust it.

Previously we could only turn off activedefrag and then turn it on again
to
adjust active_defrag_running downwards. So in this PR, when a active
defrag
configuration change is made, we will re-compute it.

These configuration items are:
- active-defrag-cycle-min
- active-defrag-cycle-max
- active-defrag-threshold-upper
2024-02-06 13:39:07 +02:00
guybe7 8cd62f82ca
Refactor the per-slot dict-array db.c into a new kvstore data structure (#12822)
# Description
Gather most of the scattered `redisDb`-related code from the per-slot
dict PR (#11695) and turn it to a new data structure, `kvstore`. i.e.
it's a class that represents an array of dictionaries.

# Motivation
The main motivation is code cleanliness, the idea of using an array of
dictionaries is very well-suited to becoming a self-contained data
structure.
This allowed cleaning some ugly code, among others: loops that run twice
on the main dict and expires dict, and duplicate code for allocating and
releasing this data structure.

# Notes
1. This PR reverts the part of https://github.com/redis/redis/pull/12848
where the `rehashing` list is global (handling rehashing `dict`s is
under the responsibility of `kvstore`, and should not be managed by the
server)
2. This PR also replaces the type of `server.pubsubshard_channels` from
`dict**` to `kvstore` (original PR:
https://github.com/redis/redis/pull/12804). After that was done,
server.pubsub_channels was also chosen to be a `kvstore` (with only one
`dict`, which seems odd) just to make the code cleaner by making it the
same type as `server.pubsubshard_channels`, see
`pubsubtype.serverPubSubChannels`
3. the keys and expires kvstores are currenlty configured to allocate
the individual dicts only when the first key is added (unlike before, in
which they allocated them in advance), but they won't release them when
the last key is deleted.

Worth mentioning that due to the recent change the reply of DEBUG
HTSTATS changed, in case no keys were ever added to the db.

before:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
```

after:
```
127.0.0.1:6379> DEBUG htstats 9
[Dictionary HT]
[Expires HT]
```
2024-02-05 17:21:35 +02:00
Binbin f20774eced
Fix active expire timeout when db done the scanning (#13030)
When db->expires_cursor==0, it means the DB is done the scanning,
we should exit the loop to avoid the useless scanning.

It is easy to see the active expire timeout in the modified test,
for example, let's assume that there is only 1 expired key in the
DB, and the size / buckets ratio is less than 1%, which means that
we will skip it in isExpiryDictValidForSamplingCb, and the return
value of expires_cursor is 0.

Because `data.sampled == 0` is always true, so `repeat` is also
always true, we will keep scanning the DB, but every time it is
skipped by the previous judgment (expires_cursor = 0), until the
timelimit is finally exhausted.
2024-02-05 16:56:46 +02:00
Binbin 9a7d311855
Fix dict resize allow test (#13016)
Ci report this failure:
```
*** [err]: Don't rehash if used memory exceeds maxmemory after rehash in tests/unit/maxmemory.tcl
Expected '4098' to equal or match '4002'

WARNING: the new maxmemory value set via CONFIG SET (1176088) is smaller than the current memory usage (1231083)
```

It can be seen from the log that used_memory changed before we set
maxmemory.
The reason is that in #12819, in cron, in addition to trying to shrink,
we will
also tyring to expand. The dict was expanded by cron before we set
maxmemory,
causing the test to fail.

Before setting maxmemory, we only add 4095 keys to avoid triggering
resize.
2024-01-31 13:11:52 +02:00
Binbin 6016973ac0
Fix module assertion crash when timer and timeout are unlocked in the same event loop (#13015)
When we use a timer to unblock a client in module, if the timer
period and the block timeout are very close, they will unblock the
client in the same event loop, and it will trigger the assertion.
The reason is that in moduleBlockedClientTimedOut we will protect
against re-processing, so we don't actually call updateStatsOnUnblock
(see #12817), so we are not able to reset the c->duration. 

The reason is unblockClientOnTimeout() didn't realize that bc had
been unblocked. We add a function to the module to determine if bc
is blocked, and then use it in unblockClientOnTimeout() to exit.

There is the stack:
```
beforeSleep
blockedBeforeSleep
handleBlockedClientsTimeout
checkBlockedClientTimeout
unblockClientOnTimeout
unblockClient
resetClient
-- assertion, crash the server
'c->duration == 0' is not true
```
2024-01-31 13:10:19 +02:00
Binbin 74a6e48a3d
Fix module unblock crash due to no timeout_callback (#13017)
The block timeout is passed in the test case, but we do not pass
in the timeout_callback, and it will crash when unlocking. In this
case, in moduleBlockedClientTimedOut we will check timeout_callback.
There is the stack:
```
beforeSleep
blockedBeforeSleep
handleBlockedClientsTimeout
checkBlockedClientTimeout
unblockClientOnTimeout
replyToBlockedClientTimedOut
moduleBlockedClientTimedOut
-- timeout_callback is NULL, invalidFunctionWasCalled
bc->timeout_callback(&ctx,(void**)c->argv,c->argc);
```
2024-01-31 09:28:50 +02:00
Chen Tianjie f469dd8ca6
Add novalues option to command HSCAN. (#12765)
Add a way to HSCAN a hash key, and get only the filed names.
Command syntax is now:
```
HSCAN key cursor [MATCH pattern] [COUNT count] [NOVALUES]
```
when `NOVALUES` is on, the command will only return keys in the hash.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-01-30 20:32:58 +02:00
Slava Koyfman 24f6d08b3f
Implement `CLIENT KILL MAXAGE <maxage>` (#12299)
Adds an ability to kill clients older than a specified age.

Also, fixed the age calculation in `catClientInfoString` to use
`commandTimeSnapshot`
instead of the old `server.unixtime`, and added missing documentation
for
`CLIENT KILL ID` to output of `CLIENT help`.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-01-30 20:24:36 +02:00
Oran Agra 7c9f41b52b
fix dict rehash tests introduced by #12802 broken by #12819 (#13009)
tests consistently fail on timeout (sleep that's too short).
it now takes more time because in #12819 we iterate on all dicts, not
just non-empty ones.
it passed the PR's CI because it skips the `slow` tag, which might have
been misplaced, but now it is probably required.
with the fix, the tests take quite a lot of time:
```
[ok]: Redis can trigger resizing (1860 ms)
[ok]: Redis can rewind and trigger smaller slot resizing (744 ms)
```
before #12819:
```
[ok]: Redis can trigger resizing (309 ms)
[ok]: Redis can rewind and trigger smaller slot resizing (295 ms)
```

failure:
https://github.com/redis/redis/actions/runs/7704158180/job/20995931735
```
*** [err]: expire scan should skip dictionaries with lot's of empty buckets in tests/unit/expire.tcl
scan didn't handle slot skipping logic.
*** [err]: Redis can trigger resizing in tests/unit/other.tcl
Expected '[Dictionary HT]
Hash table 0 stats (main hash table):
 table size: 128
 number of elements: 5
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
' to match '*table size: 8*' (context: type eval line 29 cmd {assert_match "*table size: 8*" [r debug HTSTATS 0]} proc ::test) 
*** [err]: Redis can rewind and trigger smaller slot resizing in tests/unit/other.tcl
Expected '[Dictionary HT]
Hash table 0 stats (main hash table):
 table size: 256
 number of elements: 10
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
' to match '*table size: 16*' (context: type eval line 27 cmd {assert_match "*table size: 16*" [r debug HTSTATS 0]} proc ::test) 
```
2024-01-30 14:32:38 +02:00
Binbin 492021db95
Fix blocking commands timeout is reset due to re-processing command (#13004)
In #11012, we will reprocess command when client is unblocked on keys,
in some blocking commands, for example, in the XREADGROUP BLOCK
scenario,
because of the re-processing command, we will recalculate the block
timeout,
causing the blocking time to be reset.

This commit add a new CLIENT_REPROCESSING_COMMAND clent flag, explicitly
let the command know that it is being re-processed, later in
blockForKeys
we will not reset the timeout.

Affected BLOCK cases: 
- list / zset / stream, added test cases for each.

Unaffected cases:
- module (never re-process the commands).
- WAIT / WAITAOF (never re-process the commands).

Fixes #12998.
2024-01-30 11:32:59 +02:00
Chen Tianjie af7ceeb765
Optimize resizing hash table to resize not only non-empty dicts. (#12819)
The function `tryResizeHashTables` only attempts to shrink the dicts
that has keys (change from #11695), this was a serious problem until the
change in #12850 since it meant if all keys are deleted, we won't shrink
the dick.
But still, both dictShrink and dictExpand may be blocked by a fork child
process, therefore, the cron job needs to perform both dictShrink and
dictExpand, for not just non-empty dicts, but all dicts in DBs.

What this PR does:

1. Try to resize all dicts in DBs (not just non-empty ones, as it was
since #12850)
2. handle both shrink and expand (not just shrink, as it was since
forever)
3. Refactor some APIs about dict resizing (get rid of `htNeedsShrink`
`htNeedsShrink` `dictShrinkToFit`, and expose `dictShrinkIfNeeded`
`dictExpandIfNeeded` which already contains all the code of those
functions we get rid of, to make APIs more neat)
4. In the `Don't rehash if redis has child process` test, now that cron
would do resizing, we no longer need to write to DB after the child
process got killed, and can wait for the cron to expand the hash table.
2024-01-29 21:02:07 +02:00
Ozan Tezcan c5273cae18
Add RM_TryCalloc() and RM_TryRealloc() (#12985)
Modules may want to handle allocation failures gracefully. Adding
RM_TryCalloc() and RM_TryRealloc() for it.
RM_TryAlloc() was added before:
https://github.com/redis/redis/pull/10541
2024-01-29 20:56:03 +02:00
Roshan Khatri 5358bd7cdd
Reduce performance impact of dict rehashing and make it shorter. (#12899)
#### Problem Statement:
For any read/update operation during rehashing, we're doing ~10+ random
DRAM lookups to do the rehashing, as we are using the `rehashidx` to
rehash 10 buckets, whose dict entries most likely aren't cached in the
CPU or near the bucket we are operating on. If these random bucket are
empty, the rehashing process during that command execution is skipped.

#### Implementation:
For reducing the performance recession while dict is rehashing, we
determine the index at which the key would be stored in the 0th HT, we
check if that index has already been rehashed, if not we will rehash the
bucket containing the key and the bucket will be moved from 0th HT to
the 1st HT.

If the key has already been rehashed, we perform the random access
bucket rehash (using `rehashidx`) and we again verify if rehashing is
still ongoing and look up the key in the respective HT.

This ensures rehashing is not skipped in any command call and that we
rehash a particular bucket or random bucket in each call.

#### Changes in this PR:
- Added a new method `dictBucketRehash` to perform rehash on a single
bucket.
- Helper function `moveKeysInBucketOldtoNew` for `dictRehash` and
`dictBucketRehash` to move all the keys in a bucket from the old to the
new hash HT.
- Helper function `verifyMoreRehashRequired` for `dictRehash` and
`dictBucketRehash` to check if we have already rehashed the whole table
and if more rehashing is required.

### Benchmark:
- This PR still shows **~13%** improvement in the latency during
rehashing.

- Rehashing is now **~2%** faster for this PR when compared to unstable.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>
2024-01-27 11:11:53 +02:00
Wen Hui 685409139b
Add INCR type command against wrong argument test cases. (#12836)
We have test cases for incr related commands with no key exist and
spaces in key and wrong type of key. However, we dont have test cases
covered for INCRBY INCRBYFLOAT DECRBY INCR DECR HINCRBY HINCRBYFLOAT
ZINCRBY with valid key and invalid value as argument, and float value to
incrby and decrby. So added test cases for the scenarios in incr.tcl.

Thank you!
2024-01-23 15:39:38 +02:00
Binbin 85c31e0cff
Allow running WAITAOF in scripts, remove NOSCRIPT flag (#12977)
In #11568 we removed the NOSCRIPT flag from commands, e.g. removing
NOSCRIPT flag from WAIT. Aiming to allow them in scripts and let them
implicitly behave in the non-blocking way.

This PR remove NOSCRIPT flag from WAITAOF just like WAIT (to be
symmetrical)).
And this PR also add BLOCKING flag for WAIT and WAITAOF.
2024-01-23 15:19:41 +02:00
Binbin 628c0dea1b
Some cleanups around function (#12940)
This PR did some cleanups around function:
- drop the comment about Libraries Ctx, since we do have comment
  in functionsLibCtx, no need to maintain multiple copies.
- remove outdated comment about the dropped Library description.
- remove unused desc and code vars in functionExtractLibMetaData.
- fix engines_nemory typo, changed it to engines_memory.
- remove outdated comment about FUNCTION CREATE and FUNCTION INFO,
  FUNCTION CREATE was renamed to FUNCTION LOAD.
- Check in initServer whether the return of functionsInit is OK.
2024-01-23 14:26:33 +02:00
Harkrishn Patro 2bce71b5ff
Exit early if slowlog/acllog max len set to zero (#12965)
Currently slowlog gets disabled if slowlog-log-slower-than is set to less than zero. I think we should also disable it if slowlog-max-len is set to zero. We apply the same logic to acllog-max-len.
2024-01-22 16:01:04 -08:00
Yanqi Lv b07174afc2
Change the threshold of dict expand, shrink and rehash (#12948)
Before this change (most recently modified in
https://github.com/redis/redis/pull/12850#discussion_r1421406393), The
trigger for normal expand threshold was 100% utilization and the trigger
for normal shrink threshold was 10% (HASHTABLE_MIN_FILL).
While during fork (DICT_RESIZE_AVOID), when we want to avoid rehash, the
trigger thresholds were multiplied by 5 (`dict_force_resize_ratio`),
meaning 500% for expand and 2% (100/10/5) for shrink.

However, in `dictRehash` (the incremental rehashing), the rehashing
threshold for shrinking during fork (DICT_RESIZE_AVOID) was 20% by
mistake.
This meant that if a shrinking is triggered when `dict_can_resize` is
`DICT_RESIZE_ENABLE` which the threshold is 10%, the rehashing can
continue when `dict_can_resize` is `DICT_RESIZE_AVOID`.
This would cause unwanted CopyOnWrite damage.

It'll make sense to change the thresholds of the rehash trigger and the
thresholds of the incremental rehashing the same, however, in one we
compare the size of the hash table to the number of records, and in the
other we compare the size of ht[0] to the size of ht[1], so the formula
is not exactly the same.

to make things easier we change all the thresholds to powers of 2, so
the normal shrinking threshold is changed from 100/10 (i.e. 10%) to
100/8 (i.e. 12.5%), and we change the threshold during forks from 5 to
4, i.e. from 500% to 400% for expand, and from 2% (100/10/5) to 3.125%
(100/8/4)
2024-01-19 17:00:43 +02:00
debing.sun d0640029dc
Fix race condition issues between the main thread and module threads (#12817)
Fix #12785 and other race condition issues.
See the following isolated comments.

The following report was obtained using SANITIZER thread.
```sh
make SANITIZER=thread
./runtest-moduleapi --config io-threads 4 --config io-threads-do-reads yes --accurate
```

1. Fixed thread-safe issue in RM_UnblockClient()
Related discussion:
https://github.com/redis/redis/pull/12817#issuecomment-1831181220
* When blocking a client in a module using `RM_BlockClientOnKeys()` or
`RM_BlockClientOnKeysWithFlags()`
with a timeout_callback, calling RM_UnblockClient() in module threads
can lead to race conditions
     in `updateStatsOnUnblock()`.

     - Introduced: 
        Version: 6.2
        PR: #7491

     - Touch:
`server.stat_numcommands`, `cmd->latency_histogram`, `server.slowlog`,
and `server.latency_events`
     
     - Harm Level: High
Potentially corrupts the memory data of `cmd->latency_histogram`,
`server.slowlog`, and `server.latency_events`

     - Solution:
Differentiate whether the call to moduleBlockedClientTimedOut() comes
from the module or the main thread.
Since we can't know if RM_UnblockClient() comes from module threads, we
always assume it does and
let `updateStatsOnUnblock()` asynchronously update the unblock status.
     
* When error reply is called in timeout_callback(), ctx is not
thread-safe, eventually lead to race conditions in `afterErrorReply`.

     - Introduced: 
        Version: 6.2
        PR: #8217

     - Touch
       `server.stat_total_error_replies`, `server.errors`, 

     - Harm Level: High
       Potentially corrupts the memory data of `server.errors`
   
      - Solution: 
Make the ctx in `timeout_callback()` with `REDISMODULE_CTX_THREAD_SAFE`,
and asynchronously reply errors to the client.

2. Made RM_Reply*() family API thread-safe
Related discussion:
https://github.com/redis/redis/pull/12817#discussion_r1408707239
Call chain: `RM_Reply*()` -> `_addReplyToBufferOrList()` -> touch
server.current_client

    - Introduced: 
       Version: 7.2.0
       PR: #12326

   - Harm Level: None
Since the module fake client won't have the `CLIENT_PUSHING` flag, even
if we touch server.current_client,
     we can still exit after `c->flags & CLIENT_PUSHING`.

   - Solution
      Checking `c->flags & CLIENT_PUSHING` earlier.

3. Made freeClient() thread-safe
    Fix #12785

    - Introduced: 
       Version: 4.0
Commit:
3fcf959e60

    - Harm Level: Moderate
       * Trigger assertion
It happens when the module thread calls freeClient while the io-thread
is in progress,
which just triggers an assertion, and doesn't make any race condiaions.

* Touch `server.current_client`, `server.stat_clients_type_memory`, and
`clientMemUsageBucket->clients`.
It happens between the main thread and the module threads, may cause
data corruption.
1. Error reset `server.current_client` to NULL, but theoretically this
won't happen,
because the module has already reset `server.current_client` to old
value before entering freeClient.
2. corrupts `clientMemUsageBucket->clients` in
updateClientMemUsageAndBucket().
3. Causes server.stat_clients_type_memory memory statistics to be
inaccurate.
    
    - Solution:
* No longer counts memory usage on fake clients, to avoid updating
`server.stat_clients_type_memory` in freeClient.
* No longer resetting `server.current_client` in unlinkClient, because
the fake client won't be evicted or disconnected in the mid of the
process.
* Judgment assertion `io_threads_op == IO_THREADS_OP_IDLE` only if c is
not a fake client.

4. Fixed free client args without GIL
Related discussion:
https://github.com/redis/redis/pull/12817#discussion_r1408706695
When freeing retained strings in the module thread (refcount decr), or
using them in some way (refcount incr), we should do so while holding
the GIL,
otherwise, they might be simultaneously freed while the main thread is
processing the unblock client state.

    - Introduced: 
       Version: 6.2.0
       PR: #8141

   - Harm Level: Low
     Trigger assertion or double free or memory leak. 

   - Solution:
Documenting that module API users need to ensure any access to these
retained strings is done with the GIL locked

5. Fix adding fake client to server.clients_pending_write
    It will incorrectly log the memory usage for the fake client.
Related discussion:
https://github.com/redis/redis/pull/12817#issuecomment-1851899163

    - Introduced: 
       Version: 4.0
Commit:
9b01b64430

    - Harm Level: None
      Only result in NOP

    - Solution:
       * Don't add fake client into server.clients_pending_write
* Add c->conn assertion for updateClientMemUsageAndBucket() and
updateClientMemoryUsage() to avoid same
         issue in the future.
So now it will be the responsibility of the caller of both of them to
avoid passing in fake client.

6. Fix calling RM_BlockedClientMeasureTimeStart() and
RM_BlockedClientMeasureTimeEnd() without GIL
    - Introduced: 
       Version: 6.2
       PR: #7491

   - Harm Level: Low
Causes inaccuracies in command latency histogram and slow logs, but does
not corrupt memory.

   - Solution:
Module API users, if know that non-thread-safe APIs will be used in
multi-threading, need to take responsibility for protecting them with
their own locks instead of the GIL, as using the GIL is too expensive.

### Other issue
1. RM_Yield is not thread-safe, fixed via #12905.

### Summarize
1. Fix thread-safe issues for `RM_UnblockClient()`, `freeClient()` and
`RM_Yield`, potentially preventing memory corruption, data disorder, or
assertion.
2. Updated docs and module test to clarify module API users'
responsibility for locking non-thread-safe APIs in multi-threading, such
as RM_BlockedClientMeasureTimeStart/End(), RM_FreeString(),
RM_RetainString(), and RM_HoldString().

### About backpot to 7.2
1. The implement of (1) is not too satisfying, would like to get more
eyes.
2. (2), (3) can be safely for backport
3. (4), (6) just modifying the module tests and updating the
documentation, no need for a backpot.
4. (5) is harmless, no need for a backpot.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-01-19 15:12:49 +02:00
Binbin 29e6245a05
Fix unexpected resize causing test failure (#12960)
Before #12850, we will only try to shrink the dict in serverCron,
which we can control by using a child process, but now every time
we delete a key, the shrink check will be called.

In these test (added in #12802), we meant to disable the resizing,
but druing the delete, the dict will meet the force shrink, like
2 / 128 = 0.015 < 0.2, the delete will trigger a force resize and
will cause the test to fail.

In this commit, we try to keep the load factor at 3 / 128 = 0.023,
that is, do not meet the force shrink.
2024-01-18 11:19:29 +02:00
Binbin 131d95f203
Fix race in slot dict resize test (#12942)
The test have a race:
```
*** [err]: Redis can rewind and trigger smaller slot resizing in tests/unit/other.tcl
Expected '[Dictionary HT]
Hash table 0 stats (main hash table):
 table size: 12
 number of elements: 2
[Expires HT]
Hash table 0 stats (main hash table):
No stats available for empty dictionaries
' to match '*table size: 8*' (context: type eval line 12 cmd {assert_match "*table size: 8*" [r debug HTSTATS 0]} proc ::test)
```

When `r del "{alice}$j"` is executed in the loop, when the key is
deleted to [9, 12], the load factor has meet HASHTABLE_MIN_FILL,
if serverCron happens to trigger slot dict resize, then the test
will fail. Because there is not way to meet HASHTABLE_MIN_FILL in
the subsequent dels.

The solution is to avoid triggering the resize in advance. We can
use multi to delete them at once, or we can disable the resize.
Since we disabled resize in the previous test, the fix also uses
the method of disabling resize.

The test is introduced in #12802.
2024-01-17 08:46:09 +02:00
Yanqi Lv e2b7932b34
Shrink dict when deleting dictEntry (#12850)
When we insert entries into dict, it may autonomously expand if needed.
However, when we delete entries from dict, it doesn't shrink to the
proper size. If there are few entries in a very large dict, it may cause
huge waste of memory and inefficiency when iterating.

The main keyspace dicts (keys and expires), are shrinked by cron
(`tryResizeHashTables` calls `htNeedsResize` and `dictResize`),
And some data structures such as zset and hash also do that (call
`htNeedsResize`) right after a loop of calls to `dictDelete`,
But many other dicts are completely missing that call (they can only
expand).

In this PR, we provide the ability to automatically shrink the dict when
deleting. The conditions triggering the shrinking is the same as
`htNeedsResize` used to have. i.e. we expand when we're over 100%
utilization, and shrink when we're below 10% utilization.

Additionally:
* Add `dictPauseAutoResize` so that flows that do mass deletions, will
only trigger shrinkage at the end.
* Rename `dictResize` to `dictShrinkToFit` (same logic as it used to
have, but better name describing it)
* Rename `_dictExpand` to `_dictResize` (same logic as it used to have,
but better name describing it)
 
related to discussion
https://github.com/redis/redis/pull/12819#discussion_r1409293878

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
2024-01-15 08:20:53 +02:00
zhaozhao.zz bb2b6e2927
fix scripts access wrong slot if they disagree with pre-declared keys (#12906)
Regarding how to obtain the hash slot of a key, there is an optimization
in `getKeySlot()`, it is used to avoid redundant hash calculations for
keys: when the current client is in the process of executing a command,
it can directly use the slot of the current client because the slot to
access has already been calculated in advance in `processCommand()`.

However, scripts are a special case where, in default mode or with
`allow-cross-slot-keys` enabled, they are allowed to access keys beyond
the pre-declared range. This means that the keys they operate on may not
belong to the slot of the pre-declared keys. Currently, when the
commands in a script are executed, the slot of the original client
(i.e., the current client) is not correctly updated, leading to
subsequent access to the wrong slot.

This PR fixes the above issue. When checking the cluster constraints in
a script, the slot to be accessed by the current command is set for the
original client (i.e., the current client). This ensures that
`getKeySlot()` gets the correct slot cache.

Additionally, the following modifications are made:

1. The 'sort' and 'sort_ro' commands use `getKeySlot()` instead of
`c->slot` because the client could be an engine client in a script and
can lead to potential bug.
2. `getKeySlot()` is also used in pubsub to obtain the slot for the
channel, standardizing the way slots are retrieved.
2024-01-15 09:57:12 +08:00
bentotten b3aaa0a136
When one shard, sole primary node marks potentially failed replica as FAIL instead of PFAIL (#12824)
Fixes issue where a single primary cannot mark a replica as failed in a
single-shard cluster.
2024-01-11 15:48:19 -08:00
Binbin b351a04b1e
Add announced-endpoints test to all_tests and fix tls related tests (#12927)
The test was introduced in #10745, but we forgot to add it to the
test_helper.tcl, so our CI did not actually run it. This PR adds it
and ensures it passes CI tests.
2024-01-09 18:18:59 -08:00
Madelyn Olson 8bb9a2895e
Address some failures with new tests for improving debug report (#12915)
Fix a daily test failure because alpine doesn't support stack traces and
add in an extra assertion related to making sure the stack trace was
printed twice.
2024-01-08 17:56:06 -08:00
Madelyn Olson 068051e378
Handle recursive serverAsserts and provide more information for recursive segfaults (#12857)
This change is trying to make two failure modes a bit easier to deep dive:
1. If a serverPanic or serverAssert occurs during the info (or module)
printing, it will recursively panic, which is a lot of fun as it will
just keep recursively printing. It will eventually stack overflow, but
will generate a lot of text in the process.
2. When a segfault happens during the segfault handler, no information
is communicated other than it happened. This can be problematic because
`info` may help diagnose the real issue, but without fixing the
recursive crash it might be hard to get at that info.
2024-01-02 18:20:22 -08:00
Chen Tianjie 8527959598
Replace slots_to_channels radix tree with slot specific dictionaries for shard channels. (#12804)
We have achieved replacing `slots_to_keys` radix tree with key->slot
linked list (#9356), and then replacing the list with slot specific
dictionaries for keys (#11695).

Shard channels behave just like keys in many ways, and we also need a
slots->channels mapping. Currently this is still done by using a radix
tree. So we should split `server.pubsubshard_channels` into 16384 dicts
and drop the radix tree, just like what we did to DBs.

Some benefits (basically the benefits of what we've done to DBs):
1. Optimize counting channels in a slot. This is currently used only in
removing channels in a slot. But this is potentially more useful:
sometimes we need to know how many channels there are in a specific slot
when doing slot migration. Counting is now implemented by traversing the
radix tree, and with this PR it will be as simple as calling `dictSize`,
from O(n) to O(1).
2. The radix tree in the cluster has been removed. The shard channel
names no longer require additional storage, which can save memory.
3. Potentially useful in slot migration, as shard channels are logically
split by slots, thus making it easier to migrate, remove or add as a
whole.
4. Avoid rehashing a big dict when there is a large number of channels.

Drawbacks:
1. Takes more memory than using radix tree when there are relatively few
shard channels.

What this PR does:
1. in cluster mode, split `server.pubsubshard_channels` into 16384
dicts, in standalone mode, still use only one dict.
2. drop the `slots_to_channels` radix tree.
3. to save memory (to solve the drawback above), all 16384 dicts are
created lazily, which means only when a channel is about to be inserted
to the dict will the dict be initialized, and when all channels are
deleted, the dict would delete itself.
5. use `server.shard_channel_count` to keep track of the number of all
shard channels.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2023-12-27 17:40:45 +08:00
sundb bef5715374
Fix oom-score-adj test due to no permission (#12887)
Fix #12792

On ubuntu 23(lunar), non-root users will not be allowed to change the
oom_score_adj of a process to a value that is too low.
Since terminal's default oom_score_adj is 200, if we run the test on
terminal, we won't be able to set the oom_score_adj of the redis process
to 9 or 22, which is too low.

Reproduction on ubuntu 23(lunar) terminal:
```sh
$ cat /proc/`pgrep redis-server`/oom_score_adj
200
$ echo 100 > /proc/`pgrep redis-server`/oom_score_adj
# success without error
$ echo 99 > /proc/`pgrep redis-server`/oom_score_adj
echo: write error: Permission denied
```

As from the output above, we can only set the minimum oom score of redis
processes to 100.
By modifying the test, make oom_score_adj only increase upwards and not
decrease.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2023-12-27 08:42:46 +02:00
Slava Koyfman 20214b26a4
Don't disconnect all clients in ACL LOAD (#12171)
Previous implementation would disconnect _all_ clients when running `ACL
LOAD`, which wasn't very useful.

This change brings the behavior in line with that of `ACL SETUSER`, `ACL
DELUSER`, in that only clients whose user is deleted or clients
subscribed to channels which they no longer have access to will be
disconnected.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>
2023-12-24 11:56:44 +02:00
Binbin 09e0d338f5
redis-cli adds -4 / -6 options to determine IPV4 / IPV6 priority in DNS lookup (#11315)
This PR, we added -4 and -6 options to redis-cli to determine
IPV4 / IPV6 priority in DNS lookup.
This was mentioned in
https://github.com/redis/redis/pull/11151#issuecomment-1231570651

For now it's only used in CLUSTER MEET.

The options also made it possible to reliably test dns lookup in CI,
using this option, we can add some localhost tests for #11151.

The commit was cherry-picked from #11151, back then we decided to split
the PR.

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2023-12-24 10:40:34 +02:00
Wen Hui 5dc631d880
Add missing test cases for hash commands (#12851)
We dont have test for hgetall against key doesnot exist so added the
test in test suite and along with this, added wrong type cases for other
missing commands.
2023-12-17 14:02:53 +02:00
Chen Tianjie e95a5d4831
Support by/get options for sort(_ro) in cluster mode when pattern implies slot. (#12728)
The by/get options of sort/sort_ro command used to be forbidden in
cluster mode, since we are not sure which slot the pattern may be in.

As the optimization done in #12536, patterns now can be mapped to slots,
we should allow by/get options in cluster mode when the pattern maps to
the same slot as the key.
2023-12-13 21:16:36 +02:00
Binbin 3c0fd25201
Redact ACL username information and mark *-key-file-pass configs as sensitive (#12860)
In #11489, we consider acl username to be sensitive information,
and consider the ACL GETUSER a sensitive command and remove it
from redis-cli historyfile.

This PR redact username information in ACL GETUSER and ACL DELUSER
from SLOWLOG, and also remove ACL DELUSER from redis-cli historyfile.

This PR also mark tls-key-file-pass and tls-client-key-file-pass
as sensitive config, will redact it from SLOWLOG and also
remove them from redis-cli historyfile.
2023-12-13 15:28:13 +02:00
Chen Tianjie f9cc25c1dd
Add metric to INFO CLIENTS: pubsub_clients. (#12849)
In INFO CLIENTS section, we already have blocked_clients and
tracking_clients. We should add a new metric showing the number of
pubsub connections, which helps performance monitoring and trouble
shooting.
2023-12-13 13:44:13 +08:00
Binbin c85a9b7896
Fix delKeysInSlot server events are not executed inside an execution unit (#12745)
This is a follow-up fix to #12733. We need to apply the same changes to
delKeysInSlot. Refer to #12733 for more details.

This PR contains some other minor cleanups / improvements to the test
suite and docs.
It uses the postnotifications test module in a cluster mode test which
revealed a leak in the test module (fixed).
2023-12-11 20:15:19 +02:00
Chen Tianjie 991aff1c0f
Optimize KEYS when pattern includes hashtag and implies a single slot. (#12754)
in #12536 we made a similar optimization for SCAN, now that hashtags in
patterns. When we can make sure all keys matching the pettern will be in
the same slot, we can limit the iteration to run only one one.
2023-12-05 16:21:50 +02:00
sundb 91309f7981
Fix compilation warning in KeySpace_ServerEventCallback and add CFLAGS=-Werror flag for module CI (#12786)
Warning:
```
postnotifications.c:216:77: warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
        RedisModule_Log(ctx, "warning", "Got an unexpected subevent '%ld'", subevent);
                                                                     ~~~    ^~~~~~~~
                                                                     %llu
```

CI:
https://github.com/redis/redis/actions/runs/6937308713/job/18871124342#step:6:115

## Other
Add `CFLAGS=-Werror` flag for module CI.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2023-11-30 17:41:00 +02:00
zhaozhao.zz 3431b1f156
format cpu config as redis style (#7351)
The following four configurations are renamed to align with Redis style:

1. server_cpulist renamed to server-cpulist
2. bio_cpulist renamed to bio-cpulist
3. aof_rewrite_cpulist renamed to aof-rewrite-cpulist
4. bgsave_cpulist renamed to bgsave-cpulist

The original names are retained as aliases to ensure compatibility with
old configuration files. We recommend users to gradually transition to
using the new configuration names to maintain consistency in style.
2023-11-29 13:40:06 +08:00
zhaozhao.zz a1c5171c1d
Fix resize hash tables stuck on the last non-empty slot (#12802)
Introduced in #11695 .

The tryResizeHashTables function gets stuck on the last non-empty slot
while iterating through dictionaries. It does not restart from the
beginning. The reason for this issue is a problem with the usage of
dbIteratorNextDict:

/* Returns next dictionary from the iterator, or NULL if iteration is complete. */
dict *dbIteratorNextDict(dbIterator *dbit) {
    if (dbit->next_slot == -1) return NULL;
    dbit->slot = dbit->next_slot;
    dbit->next_slot = dbGetNextNonEmptySlot(dbit->db, dbit->slot, dbit->keyType);
    return dbGetDictFromIterator(dbit);
}

When iterating to the last non-empty slot, next_slot is set to -1,
causing it to loop indefinitely on that slot. We need to modify the code
to ensure that after iterating to the last non-empty slot, it returns to
the first non-empty slot.

BTW, function tryResizeHashTables is actually iterating over slots
that have keys. However, in its implementation, it leverages the
dbIterator (which is a key iterator) to obtain slot and dictionary
information. While this approach works fine, but it is not very
intuitive. This PR also improves readability by changing the iteration
to directly iterate over slots, thereby enhancing clarity.
2023-11-28 18:50:16 +08:00
Binbin d6f19539d2
Un-register notification and server event when RedisModule_OnLoad fails (#12809)
When we register notification or server event in RedisModule_OnLoad, but
RedisModule_OnLoad eventually fails, triggering notification or server
event
will cause the server to crash.

If the loading fails on a later stage of moduleLoad, we do call
moduleUnload
which handles all un-registration, but when it fails on the
RedisModule_OnLoad
call, we only un-register several specific things and these were
missing:

- moduleUnsubscribeNotifications
- moduleUnregisterFilters
- moduleUnsubscribeAllServerEvents

Refactored the code to reuse the code from moduleUnload.

Fixes #12808.
2023-11-27 17:26:33 +02:00
meiravgri 2e854bccc6
Fix async safety in signal handlers (#12658)
see discussion from after https://github.com/redis/redis/pull/12453 was
merged
----
This PR replaces signals that are not considered async-signal-safe
(AS-safe) with safe calls.

#### **1. serverLog() and serverLogFromHandler()**
`serverLog` uses unsafe calls. It was decided that we will **avoid**
`serverLog` calls by the signal handlers when:
* The signal is not fatal, such as SIGALRM. In these cases, we prefer
using `serverLogFromHandler` which is the safe version of `serverLog`.
Note they have different prompts:
`serverLog`: `62220:M 26 Oct 2023 14:39:04.526 # <msg>`
`serverLogFromHandler`: `62220:signal-handler (1698331136) <msg>`
* The code was added recently. Calls to `serverLog` by the signal
handler have been there ever since Redis exists and it hasn't caused
problems so far. To avoid regression, from now we should use
`serverLogFromHandler`

#### **2. `snprintf` `fgets` and `strtoul`(base = 16) -------->
`_safe_snprintf`, `fgets_async_signal_safe`, `string_to_hex`**
The safe version of `snprintf` was taken from
[here](8cfc4ca5e7/src/mc_util.c (L754))

#### **3. fopen(), fgets(), fclose() --------> open(), read(), close()**

#### **4. opendir(), readdir(), closedir() --------> open(),
syscall(SYS_getdents64), close()**

#### **5. Threads_mngr sync mechanisms**
* waiting for the thread to generate stack trace: semaphore -------->
busy-wait
* `globals_rw_lock` was removed: as we are not using malloc and the
semaphore anymore we don't need to protect `ThreadsManager_cleanups`.

#### **6. Stacktraces buffer**
The initial problem was that we were not able to safely call malloc
within the signal handler.
To solve that we created a buffer on the stack of `writeStacktraces` and
saved it in a global pointer, assuming that under normal circumstances,
the function `writeStacktraces` would complete before any thread
attempted to write to it. However, **if threads lag behind, they might
access this global pointer after it no longer belongs to the
`writeStacktraces` stack, potentially corrupting memory.**
To address this, various solutions were discussed
[here](https://github.com/redis/redis/pull/12658#discussion_r1390442896)
Eventually, we decided to **create a pipe** at server startup that will
remain valid as long as the process is alive.
We chose this solution due to its minimal memory usage, and since
`write()` and `read()` are atomic operations. It ensures that stack
traces from different threads won't mix.

**The stacktraces collection process is now as  follows:**
* Cleaning the pipe to eliminate writes of late threads from previous
runs.
* Each thread writes to the pipe its stacktrace
* Waiting for all the threads to mark completion or until a timeout (2
sec) is reached
* Reading from the pipe to print the stacktraces.

#### **7. Changes that were considered and eventually were dropped**
* replace watchdog timer with a POSIX timer: 
according to [settimer man](https://linux.die.net/man/2/setitimer)

> POSIX.1-2008 marks getitimer() and setitimer() obsolete, recommending
the use of the POSIX timers API
([timer_gettime](https://linux.die.net/man/2/timer_gettime)(2),
[timer_settime](https://linux.die.net/man/2/timer_settime)(2), etc.)
instead.

However, although it is supposed to conform to POSIX std, POSIX timers
API is not supported on Mac.
You can take a look here at the Linux implementation:

[here](c7562ee135)
To avoid messing up the code, and uncertainty regarding compatibility,
it was decided to drop it for now.

* avoid using sds (uses malloc) in logConfigDebugInfo
It was considered to print config info instead of using sds, however
apparently, `logConfigDebugInfo` does more than just print the sds, so
it was decided this fix is out of this issue scope.

#### **8. fix Signal mask check**
The check `signum & sig_mask` intended to indicate whether the signal is
blocked by the thread was incorrect. Actually, the bit position in the
signal mask corresponds to the signal number. We fixed this by changing
the condition to: `sig_mask & (1L << (sig_num - 1))`

#### **9. Unrelated changes**
both `fork.tcl `and `util.tcl` implemented a function called
`count_log_message` expecting different parameters. This caused
confusion when trying to run daily tests with additional test parameters
to run a specific test.
The `count_log_message` in `fork.tcl` was removed and the calls were
replaced with calls to `count_log_message` located in `util.tcl`

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2023-11-23 13:22:20 +02:00
Wen Hui 5a1f4b9aec
Adding missing SWAPDB related test cases. (#12769)
We have some test cases of swapdb with watchkey but missing seperate
basic swapdb test cases, unhappy path and flushdb after swapdb. So added
the test cases in keyspace.tcl.
2023-11-19 12:44:48 +02:00
Binbin 3d9c427f8c
Fix timing issue in CLUSTER SLAVE / REPLICAS consistent test (#12774)
CI reports that this test failed, the reason is because during
the command processing, the node processed PING/PONG, resulting
in ping_sent or pong_received mismatch.

Change to use MULTI to avoid timing issue. The test was introduced
in #12224.
2023-11-19 11:09:33 +02:00
Binbin fe36306340
Fix DB iterator not resetting pauserehash causing dict being unable to rehash (#12757)
When using DB iterator, it will use dictInitSafeIterator to init a old safe
dict iterator. When dbIteratorNext is used, it will jump to the next slot db
dict when we are done a dict. During this process, we do not have any calls to
dictResumeRehashing, which causes the dict's pauserehash to always be > 0.

And at last, it will be returned directly in dictRehashMilliseconds, which causes
us to have slot dict in a state where rehash cannot be completed.

In the "expire scan should skip dictionaries with lot's of empty buckets" test,
adding a `keys *` can reproduce the problem stably. `keys *` will call dbIteratorNext
to trigger a traversal of all slot dicts.

Added dbReleaseIterator and dbIteratorInitNextSafeIterator methods to call dictResetIterator.
Issue was introduced in #11695.
2023-11-14 14:28:46 +02:00
Harkrishn Patro 9ca8490315
Increase timeout for expiry cluster tests (#12752)
Test recently added fails on timeout in valgrind in GH actions.

Locally with valgrind the test finishes within 1.5 sec(s). Couldn't find
any issue due to lack of reproducibility. Increasing the timeout and
adding an additional log to the test to understand how many keys
were left at the end.
2023-11-11 12:01:04 +02:00
Meir Shpilraien (Spielrein) 0ffb9d2ea9
Before evicted and before expired server events are not executed inside an execution unit. (#12733)
Redis 7.2 (#9406) introduced a new modules event, `RedisModuleEvent_Key`.
This new event allows the module to read the key data just before it is removed
from the database (either deleted, expired, evicted, or overwritten).

When the key is removed from the database, either by active expire or eviction.
The new event was not called as part of an execution unit. This can cause an
issue if the module registers a post notification job inside the event. This job will
not be executed atomically with the expiration/eviction operation and will not
replicated inside a Multi/Exec. Moreover, the post notification job will be executed
right after the event where it is still not safe to perform any write operation, this will
violate the promise that post notification job will be called atomically with the
operation that triggered it and **only when it is safe to write**.

This PR fixes the issue by wrapping each expiration/eviction of a key with an execution
unit. This makes sure the entire operation will run atomically and all the post notification
jobs will be executed at the end where it is safe to write.

Tests were modified to verify the fix.
2023-11-08 09:28:22 +02:00
Yossi Gottlieb 6223355cf3
Use cross-platform-actions for FreeBSD support. (#12732)
This change overcomes many stability issues experienced with the
vmactions action.

We need to limit VMs to 8GB for better stability, as the 13GB default
seems to hang them occasionally.

Shell code has been simplified since this action seem to use `bash -e`
which will abort on non-zero exit codes anyway.
2023-11-06 18:07:14 +02:00
Roshan Khatri 15a048d4f0
re-enable defrag tests in cluster mode. (#12710)
Reverts the skipping defrag tests in cluster mode (done in #12672.
instead it skips only some defrag tests that are relevant for cluster modes.
The test now run well after investigating and making the changes in #12674 and #12694.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-11-02 13:55:48 +02:00
Viktor Söderqvist 8878817d89
Optimize SCAN with MATCH when pattern implies cluster slot (#12536)
Optimize the performance of SCAN commands when a match pattern can only contain keys from a 
single slot in cluster mode. This can happen when the pattern contains a hash tag before any 
wildcard matchers or when the key contains no matchers.
2023-11-01 00:06:49 -07:00
Harkrishn Patro 3fac869f02
Fix test, disable expiration until empty buckets are formed (#12689)
Test failure on freebsd CI:
```
*** [err]: expire scan should skip dictionaries with lot's of empty buckets in tests/unit/expire.tcl
  scan didn't handle slot skipping logic.
```

Observation:

expiry of keys might happen before the empty buckets are formed and won't help with the expiry skip logic validation.

Solution:

Disable expiration until the empty buckets are formed.
2023-10-24 11:29:40 +03:00
Harkrishn Patro 26eb4ce397
Fix defrag test (#12674)
Fixing issues started after #11695 when the defrag tests are being executed in cluster mode too.
For some reason, it looks like the defragmentation is over too quickly, before the test is able to
detect that it's running.
so now instead of waiting to see that it's active, we wait to see that it did some work
```
[err]: Active defrag big list: cluster in tests/unit/memefficiency.tcl
defrag not started.
[err]: Active defrag big keys: cluster in tests/unit/memefficiency.tcl
defrag didn't stop.
```
2023-10-22 11:56:45 +03:00
Harkrishn Patro becd50d0da
Disable flaky defrag tests affecting daily run (#12672)
Temporarily disabling few of the defrag tests in cluster mode to make the daily run stable:

Active defrag eval scripts
Active defrag big keys
Active defrag big list
Active defrag edge case
2023-10-19 21:12:58 +03:00
Harkrishn Patro f3bf8485d8
Fix resize hash table dictionary iterator (#12660)
Dictionary iterator logic in the `tryResizeHashTables` method is picking the next
(incorrect) dictionary while the cursor is at a given slot. This could lead to some
dictionary/slot getting skipped from resizing.

Also stabilize the test.

problem introduced recently in #11695
2023-10-19 13:58:32 +03:00
Vitaly 0270abda82
Replace cluster metadata with slot specific dictionaries (#11695)
This is an implementation of https://github.com/redis/redis/issues/10589 that eliminates 16 bytes per entry in cluster mode, that are currently used to create a linked list between entries in the same slot.  Main idea is splitting main dictionary into 16k smaller dictionaries (one per slot), so we can perform all slot specific operations, such as iteration, without any additional info in the `dictEntry`. For Redis cluster, the expectation is that there will be a larger number of keys, so the fixed overhead of 16k dictionaries will be The expire dictionary is also split up so that each slot is logically decoupled, so that in subsequent revisions we will be able to atomically flush a slot of data.

## Important changes
* Incremental rehashing - one big change here is that it's not one, but rather up to 16k dictionaries that can be rehashing at the same time, in order to keep track of them, we introduce a separate queue for dictionaries that are rehashing. Also instead of rehashing a single dictionary, cron job will now try to rehash as many as it can in 1ms.
* getRandomKey - now needs to not only select a random key, from the random bucket, but also needs to select a random dictionary. Fairness is a major concern here, as it's possible that keys can be unevenly distributed across the slots. In order to address this search we introduced binary index tree). With that data structure we are able to efficiently find a random slot using binary search in O(log^2(slot count)) time.
* Iteration efficiency - when iterating dictionary with a lot of empty slots, we want to skip them efficiently. We can do this using same binary index that is used for random key selection, this index allows us to find a slot for a specific key index. For example if there are 10 keys in the slot 0, then we can quickly find a slot that contains 11th key using binary search on top of the binary index tree.
* scan API - in order to perform a scan across the entire DB, the cursor now needs to not only save position within the dictionary but also the slot id. In this change we append slot id into LSB of the cursor so it can be passed around between client and the server. This has interesting side effect, now you'll be able to start scanning specific slot by simply providing slot id as a cursor value. The plan is to not document this as defined behavior, however. It's also worth nothing the SCAN API is now technically incompatible with previous versions, although practically we don't believe it's an issue.
* Checksum calculation optimizations - During command execution, we know that all of the keys are from the same slot (outside of a few notable exceptions such as cross slot scripts and modules). We don't want to compute the checksum multiple multiple times, hence we are relying on cached slot id in the client during the command executions. All operations that access random keys, either should pass in the known slot or recompute the slot. 
* Slot info in RDB - in order to resize individual dictionaries correctly, while loading RDB, it's not enough to know total number of keys (of course we could approximate number of keys per slot, but it won't be precise). To address this issue, we've added additional metadata into RDB that contains number of keys in each slot, which can be used as a hint during loading.
* DB size - besides `DBSIZE` API, we need to know size of the DB in many places want, in order to avoid scanning all dictionaries and summing up their sizes in a loop, we've introduced a new field into `redisDb` that keeps track of `key_count`. This way we can keep DBSIZE operation O(1). This is also kept for O(1) expires computation as well.

## Performance
This change improves SET performance in cluster mode by ~5%, most of the gains come from us not having to maintain linked lists for keys in slot, non-cluster mode has same performance. For workloads that rely on evictions, the performance is similar because of the extra overhead for finding keys to evict. 

RDB loading performance is slightly reduced, as the slot of each key needs to be computed during the load.

## Interface changes
* Removed `overhead.hashtable.slot-to-keys` to `MEMORY STATS`
* Scan API will now require 64 bits to store the cursor, even on 32 bit systems, as the slot information will be stored.
* New RDB version to support the new op code for SLOT information. 

---------

Co-authored-by: Vitaly Arbuzov <arvit@amazon.com>
Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: Roshan Khatri <rvkhatri@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
2023-10-14 23:58:26 -07:00
Oran Agra f0c1c730d4
test suite: clean server pids after server crashed (#12639)
when a server in the test suite crashes and is restarted by redstart_server, we didn't clean it's pid from the list.
we can see that when the corrupt-dump-fuzzer hangs, it has a long list of servers to lean, but in fact they're all already dead.
2023-10-13 16:28:52 +03:00
Harkrishn Patro b784c5375e
Unsubscribe all clients from replica for shard channel if the master ownership changes (#12577)
Unsubscribe all clients from replica for shard channel if the master ownership changes
2023-10-12 20:48:27 -07:00
zhaozhao.zz 77a65e82b2
support XREAD[GROUP] with BLOCK option in scripts (#12596)
In #11568 we removed the NOSCRIPT flag from commands and keep the BLOCKING flag.
Aiming to allow them in scripts and let them implicitly behave in the non-blocking way.

In that sense, the old behavior was to allow LPOP and reject BLPOP, and the new behavior,
is to allow BLPOP too, and fail it only in case it ends up blocking.
So likewise, so far we allowed XREAD and rejected XREAD BLOCK, and we will now allow
that too, and only reject it if it ends up blocking.
2023-10-12 10:54:50 +03:00
Oran Agra b810384c62
dump server longs on hang corrupt dump fuzzer test
recently there are some incidents of hanged tests in the CI
when we try to reproduce them, we get an assertion, not a hang.
maybe the server logs will reveal some info.
2023-10-08 16:19:31 +03:00
YaacovHazan 2cf50ddbad
Fix 'load corrupted rdb with no CRC' test (#12629)
After the change in #12626 (2e0f6724e), the is_alive proc gets pid and not
server config.

This PR aligns it in 'load corrupted rdb with no CRC' test.
2023-10-03 11:09:25 +03:00
meiravgri 4ba9e18ef0
fix crash in crash-report and other improvements (#12623)
## Crash fix
### Current behavior
We might crash if we fail to collect some of the threads' output. If it exceeds timeout for example.

The threads mngr API guarantees that the output array length will be `tids_len`, however, some
indices can be NULL, in case it fails to collect some of the threads' outputs.

When we use the threads mngr to collect the threads' stacktraces, we rely on this and skip NULL
entries. Since the output array was allocated with malloc, instead of NULL, it contained garbage,
so we got a segmentation fault when trying to read this garbage. (in debug.c:writeStacktraces() )

### fix
Allocate the global output array with zcalloc.

### To reproduce the bug, you'll have to change the code:
**in threadsmngr:ThreadsManager_runOnThreads():**
make sure the g_output_array allocation is initialized with garbage and not 0s 
(add `memset(g_output_array, 2, sizeof(void*) * tids_len);` below the allocation).

Force one of the threads to write to the array:
add a global var: `static redisAtomic size_t return_now = 0;` 
add to `invoke_callback()` before writing to the output array:
```
    size_t i_return;
    atomicGetIncr(return_now, i_return, 1);
    if(i_return == 1) return;
```
compile, start the server with `--enable-debug-command local` and run `redis-cli debug assert`
The assertion triggers the the stacktrace collection. 
Expect to get 2 prints of the stack trace - since we get the segmentation fault after we return from
the threads mngr, it can be safely triggered again.

## Added global variables r/w lock in ThreadsManager
To avoid a situation where the main thread runs `ThreadsManager_cleanups` while threads are still
invoking the signal handler, we use a r/w lock.
For cleanups, we will acquire the write lock.
The threads will acquire the read lock to enable them to write simultaneously.
If we fail to acquire the read lock, it means cleanups are in progress and we return immediately.
After acquiring the lock we can safely check that the global output array wasn't nullified and proceed
to write to it.
This way we ensure the threads are not modifying the global variables/ trying to write to the output
array after they were zeroed/nullified/destroyed(the semaphore).

## other minor logging change
1. removed logging if the semaphore times out because the threads can still write to the output array
  after this check. Instead, we print the total number of printed stacktraces compared to the exacted
  number (len_tids).
2. use noinline attribute to make sure the uplevel number of ignored stack trace entries stays correct.
3. improve testing

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-10-02 20:02:02 +03:00
YaacovHazan 2e0f6724e0
Stabilization and improvements around aof tests (#12626)
In some tests, the code manually searches for a log message, and it
uses tail -1 with a delay of 1 second, which can miss the expected line.

Also, because the aof tests use start_server_aof and not start_server,
the test name doesn't log into the server log.

To fix the above, I made the following changes:
- Change the start_server_aof to wrap the start_server.
  This will add the created aof server to the servers list, and make
  srv() and wait_for_log_messages() available for the tests.

- Introduce a new option for start_server.
  'wait_ready' - an option to let the caller start the test code without
  waiting for the server to be ready. useful for tests on a server that
  is expected to exit on startup.

- Create a new start_server_aof_ex.
  The new proc also accept options as argument and make use of the
  new 'short_life' option for tests that are expected to exit on startup
  because of some error in the aof file(s).

Because of the above, I had to change many lines and replace every
local srv variable (a server config) usage with the srv().
2023-10-02 08:20:53 +03:00
guybe7 c2a4b78491
WAITAOF: Update fsynced_reploff_pending even if there's nothing to fsync (#12622)
The problem is that WAITAOF could have hang in case commands were
propagated only to replicas.
This can happen if a module uses RM_Call with the REDISMODULE_ARGV_NO_AOF flag.
In that case, master_repl_offset would increase, but there would be nothing to fsync, so
in the absence of other traffic, fsynced_reploff_pending would stay the static, and WAITAOF can hang.

This commit updates fsynced_reploff_pending to the latest offset in flushAppendOnlyFile in case
there's nothing to fsync. i.e. in case it's behind because of the above mentions case it'll be refreshed
and release the WAITAOF.

Other changes:
Fix a race in wait.tcl (client getting blocked vs. the fsync thread)
2023-09-28 17:19:20 +03:00
guybe7 bfa3931a04
WAITAOF: Update fsynced_reploff_pending just before starting the initial AOFRW fork (#12620)
If we set `fsynced_reploff_pending` in `startAppendOnly`, and the fork doesn't start
immediately (e.g. there's another fork active at the time), any subsequent commands
will increment `server.master_repl_offset`, but will not cause a fsync (given they were
executed before the fork started, they just ended up in the RDB part of it)
Therefore, any WAITAOF will wait on the new master_repl_offset, but it will time out
because no fsync will be executed.

Release notes:
```
WAITAOF could timeout in the absence of write traffic in case a new AOF is created and
an AOFRW can't immediately start.
This can happen by the appendonly config is changed at runtime, but also after FLUSHALL,
and replica full sync.
```
2023-09-28 17:05:53 +03:00
Binbin 9fe63bdc80
Dump server logs when corrupt fuzzer reports crash (#12612)
Recently we found some signal crashes, but unable to reproduce them.
It is a good idea to dump the server logs when a failure happens.
2023-09-27 09:08:18 +03:00
meiravgri cc2be63997
Print stack trace from all threads in crash report (#12453)
In this PR we are adding the functionality to collect all the process's threads' backtraces.

## Changes made in this PR

### **introduce threads mngr API**
The **threads mngr API** which has 2 abilities:
* `ThreadsManager_init() `- register to SIGUSR2. called on the server start-up.
* ` ThreadsManager_runOnThreads()` - receives a list of a pid_t and a callback, tells every
  thread in the list to invoke the callback, and returns the output collected by each invocation.
**Elaborating atomicvar API**
* `atomicIncrGet(var,newvalue_var,count) `-- Increment and get the atomic counter new value
* `atomicFlagGetSet` -- Get and set the atomic counter value to 1

### **Always set SIGALRM handler**
SIGALRM handler prints the process's stacktrace to the log file. Up until now, it was set only if the
`server.watchdog_period` > 0. This can be also useful if debugging is needed. However, in situations
where the server can't get requests, (a deadlock, for example) we weren't able to change the signal handler.
To make it available at run time we set SIGALRM handler on server startup. The signal handler name was
changed to a more general `sigalrmSignalHandler`.

### **Print all the process' threads' stacktraces**

`logStackTrace()` now calls `writeStacktraces()`, instead of logging the current thread stacktrace.
`writeStacktraces()`:
* On Linux systems we use the threads manager API to collect the backtraces of all the process' threads.
  To get the `tids` list (threads ids) we read the `/proc/<redis-server-pid>/tasks` file which includes a list of directories.
  Each directory name corresponds to one tid (including the main thread). For each thread, we also need to check if it
  can get the signal from the threads manager (meaning it is not blocking/ignoring that signal). We send the threads
  manager this tids list and `collect_stacktrace_data()` callback, which collects the thread's backtrace addresses,
  its name, and tid.
* On other systems, the behavior remained as it was (writing only the current thread stacktrace to the log file).

## compatibility notes
1. **The threads mngr API is only supported in linux.** 
2. glibc earlier than 2.3 We use `syscall(SYS_gettid)` and `syscall(SYS_tgkill...)` because their dedicated
  alternatives (`gettid()` and `tgkill`) were added in glibc 2.3.

## Output example

Each thread backtrace will have the following format:
`<tid> <thread_name> [additional_info]`
* **tid**: as read from the `/proc/<redis-server-pid>/tasks` file
* **thread_name**: the tread name as it is registered in the os/
* **additional_info**: Sometimes we want to add specific information about one of the threads. currently.
  it is only used to mark the thread that handles the backtraces collection by adding "*".
  In case of crash - this also indicates which thread caused the crash. The handling thread in won't
  necessarily appear first.

```
------ STACK TRACE ------
EIP:
/lib/aarch64-linux-gnu/libc.so.6(epoll_pwait+0x9c)[0xffffb9295ebc]

67089 redis-server *
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0xffffb9437790]
/lib/aarch64-linux-gnu/libc.so.6(epoll_pwait+0x9c)[0xffffb9295ebc]
redis-server *:6379(+0x75e0c)[0xaaaac2fe5e0c]
redis-server *:6379(aeProcessEvents+0x18c)[0xaaaac2fe6c00]
redis-server *:6379(aeMain+0x24)[0xaaaac2fe7038]
redis-server *:6379(main+0xe0c)[0xaaaac3001afc]
/lib/aarch64-linux-gnu/libc.so.6(+0x273fc)[0xffffb91d73fc]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0x98)[0xffffb91d74cc]
redis-server *:6379(_start+0x30)[0xaaaac2fe0370]

67093 bio_lazy_free
/lib/aarch64-linux-gnu/libc.so.6(+0x79dfc)[0xffffb9229dfc]
/lib/aarch64-linux-gnu/libc.so.6(pthread_cond_wait+0x208)[0xffffb922c8fc]
redis-server *:6379(bioProcessBackgroundJobs+0x174)[0xaaaac30976e8]
/lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8)[0xffffb922d5c8]
/lib/aarch64-linux-gnu/libc.so.6(+0xe5d1c)[0xffffb9295d1c]

67091 bio_close_file
/lib/aarch64-linux-gnu/libc.so.6(+0x79dfc)[0xffffb9229dfc]
/lib/aarch64-linux-gnu/libc.so.6(pthread_cond_wait+0x208)[0xffffb922c8fc]
redis-server *:6379(bioProcessBackgroundJobs+0x174)[0xaaaac30976e8]
/lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8)[0xffffb922d5c8]
/lib/aarch64-linux-gnu/libc.so.6(+0xe5d1c)[0xffffb9295d1c]

67092 bio_aof
/lib/aarch64-linux-gnu/libc.so.6(+0x79dfc)[0xffffb9229dfc]
/lib/aarch64-linux-gnu/libc.so.6(pthread_cond_wait+0x208)[0xffffb922c8fc]
redis-server *:6379(bioProcessBackgroundJobs+0x174)[0xaaaac30976e8]
/lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8)[0xffffb922d5c8]
/lib/aarch64-linux-gnu/libc.so.6(+0xe5d1c)[0xffffb9295d1c]
67089:signal-handler (1693824528) --------
```
2023-09-24 09:47:23 +03:00
Binbin 96e9dec419
Bump codespell from 2.2.4 to 2.2.5 (#12557)
and adjustments.
2023-09-08 16:10:17 +03:00
alonre24 044e29dd34
redis-benchmark - add the support for binary strings (#9414)
Recently, the option of sending an argument from stdin using `-x` flag
was added to redis-benchmark (this option is available in redis-cli as well).
However, using the `-x` option for sending a blobs that contains null-characters
doesn't work as expected - the argument is trimmed in the first occurrence of
`\X00` (unlike in redis-cli).  
This PR aims to fix this issue and add the support for every binary string input,
by sending arguments length to `redisFormatCommandArgv` when processing
redis-benchmark command, so we won't treat the arguments as C-strings.

Additionally, we add a simple test coverage for `-x` (without binary strings, and
also remove an excessive server started in tests, and make sure to select db 0
so that `r` and the benchmark work on the same db.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-09-02 15:37:04 +03:00
Binbin 4ba144a4eb
Add logreqres:skip flag to new INFO obuf limit test (#12537)
The new test added in #12476 causes reply-schemas-validator to fail.
When doing `catch {r get key}`, the req-res output is:
```
3
get
3
key
12
__argv_end__
$100000
aaaaaaaaaaaaaaaaaaaa...4
info
5
stats
12
__argv_end__
=1670
txt:# Stats
...
```

And we can see the link after `$100000`, there is a 4 in the last,
it break the req-res-log-validator script since the format is wrong.

The reason i guess is after the client reconnection (after the output
buf limit), we will not add newlines, but append args directly.
Since obuf-limits.tcl is doing the same thing, and it had the logreqres:skip
flag, so this PR is following it.
2023-09-01 14:15:11 +03:00
Chen Tianjie b26e8e3213
Optimize ZRANGE offset location from linear search to skiplist jump. (#12450)
ZRANGE BYSCORE/BYLEX with [LIMIT offset count] option was
using every level in skiplist to jump to the first/last node in range,
but only use level[0] in skiplist to locate the node at offset, resulting
in sub-optimal performance using LIMIT:
```
while (ln && offset--) {
    if (reverse) {
        ln = ln->backward;
    } else {
        ln = ln->level[0].forward;
    }
}
```
It could be slow when offset is very big. We can get the total rank of
the offset location and use skiplist to jump to it. It is an improvement
from O(offset) to O(log rank).

Below shows how this is implemented (if the offset is positve):

Use the skiplist to seach for the first element in the range, record its
rank `rank_0`, so we can have the rank of the target node `rank_t`.
Meanwhile we record the last node we visited which has zsl->level-1
levels and its rank `rank_1`. Then we start from the zsl->level-1 node,
use skiplist to go forward `rank_t-rank_1` nodes to reach the target node.

It is very similiar when the offset is reversed.

Note that if `rank_t` is very close to `rank_0`, we just start from the first
element in range and go node by node, this for the case when zsl->level-1
node is to far away and it is quicker to reach the target node by node.

Here is a test using a random generated zset including 10000 elements
(with different positive scores), doing a bench mark which compares how
fast the `ZRANGE` command is exucuted before and after the optimization. 

The start score is set to 0 and the count is set to 1 to make sure that
most of the time is spent on locating the offset.
```
memtier_benchmark -h 127.0.0.1 -p 6379 --command="zrange test 0 +inf byscore limit <offset> 1"
```
| offset | QPS(unstable) | QPS(optimized) |
|--------|--------|--------|
| 10 | 73386.02 | 74819.82 |
| 1000 | 48084.96 | 73177.73 |
| 2000 | 31156.79 | 72805.83 |
| 5000 | 10954.83 | 71218.21 |

With the result above, we can see that the original code is greatly
slowed down when offset gets bigger, and with the optimization the
speed is almost not affected.

Similiar results are generated when testing reversed offset:
```
memtier_benchmark -h 127.0.0.1 -p 6379 --command="zrange test +inf 0 byscore rev limit <offset> 1"
```
| offset | QPS(unstable) | QPS(optimized) |
|--------|--------|--------|
| 10 | 74505.14 | 71653.67 |
| 1000 | 46829.25 | 72842.75 |
| 2000 | 28985.48 | 73669.01 |
| 5000 | 11066.22 | 73963.45 | 

And the same conclusion is drawn from the tests of ZRANGE BYLEX.
2023-08-31 14:42:08 +03:00
Binbin 9ce8c54d74
Update sort_ro reply_schema to mention the null reply (#12534)
Also added a test to cover this case, so this can
cover the reply schemas check.
2023-08-31 06:36:35 +03:00
Roshan Khatri 7519960527
Allows modules to declare new ACL categories. (#12486)
This PR adds a new Module API int RM_AddACLCategory(RedisModuleCtx *ctx, const char *category_name) to add a new ACL command category.

Here, we initialize the ACLCommandCategories array by allocating space for 64 categories and duplicate the 21 default categories from the predefined array 'ACLDefaultCommandCategories' into the ACLCommandCategories array while ACL initialization. Valid ACL category names can only contain alphanumeric characters, underscores, and dashes.

The API when called, checks for the onload flag, category name validity, and for duplicate category name if present. If the conditions are satisfied, the API adds the new category to the trailing end of the ACLCommandCategories array and assigns the acl_categories flag bit according to the index at which the category is added.

If any error is encountered the errno is set accordingly by the API.

---------

Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2023-08-30 13:01:24 -07:00
bodong.ybd b59f53efb3
Fix sort_ro get-keys function return wrong key number (#12522)
Before:
```
127.0.0.1:6379> command getkeys sort_ro key
(empty array)
127.0.0.1:6379>
```
After:
```
127.0.0.1:6379> command getkeys sort_ro key
1) "key"
127.0.0.1:6379>
```
2023-08-30 22:00:02 +03:00
Chen Tianjie e3d4b30d09
Add two stats to count client input and output buffer oom. (#12476)
Add these INFO metrics:
* client_query_buffer_limit_disconnections
* client_output_buffer_limit_disconnections

Sometimes it is useful to monitor whether clients reaches size limit of
query buffer and output buffer, to decide whether we need to adjust the
buffer size limit or reduce client query payload.
2023-08-30 21:51:14 +03:00
Binbin e792653753
Add printing for LATENCY related tests (#12514)
This test failed several times:
```
*** [err]: LATENCY GRAPH can output the event graph in tests/unit/latency-monitor.tcl
Expected '478' to be more than or equal to '500' (context: type eval
line 8 cmd {assert_morethan_equal $high 500} proc ::test)
```

Not sure why, adding some verbose printing that'll print the command
result on the next time.
2023-08-27 11:42:55 +03:00
Binbin 1407ac1f3e
BITCOUNT and BITPOS with non-existing key and illegal arguments should return error, not 0 (#11734)
BITCOUNT and BITPOS with non-existing key will return 0 even the
arguments are error, before this commit:
```
> flushall
OK
> bitcount s 0
(integer) 0
> bitpos s 0 0 1 hello
(integer) 0

> set s 1
OK
> bitcount s 0
(error) ERR syntax error
> bitpos s 0 0 1 hello
(error) ERR syntax error
```

The reason is that we judged non-existing before parameter checking and
returned. This PR fixes it, and after this commit:
```
> flushall
OK
> bitcount s 0
(error) ERR syntax error
> bitpos s 0 0 1 hello
(error) ERR syntax error
```

Also BITPOS made the same fix as #12394, check for wrong argument, before
checking for key.
```
> lpush mylist a b c
(integer) 3                                                                                    
> bitpos mylist 1 a b
(error) WRONGTYPE Operation against a key holding the wrong kind of value
```
2023-08-21 19:48:30 +03:00
Wen Hui 45d3310694
BITCOUNT: check for argument, before checking for key (#12394)
Generally, In any command we first check for  the argument and then check if key exist.

Some of the examples are

```
127.0.0.1:6379> getrange no-key invalid1 invalid2
(error) ERR value is not an integer or out of range
127.0.0.1:6379> setbit no-key 1 invalid
(error) ERR bit is not an integer or out of range
127.0.0.1:6379> xrange no-key invalid1 invalid2
(error) ERR Invalid stream ID specified as stream command argument
```

**Before change** 
```
bitcount no-key invalid1 invalid2
0
```

**After change**
```
bitcount no-key invalid1 invalid2
(error) ERR value is not an integer or out of range
```
2023-08-21 12:53:46 +03:00
Wen Hui e532c95dfc
Added tests for Client commands (#10276)
In our test case, now we missed some test coverage for client sub-commands.
This pr goal is to add some test coverage cases of the following commands:

Client caching
Client kill
Client no-evict
Client pause
Client reply
Client tracking
Client setname

At the very least, this is useful to make sure there are no leaks and crashes in these code paths.
2023-08-20 19:17:51 +03:00
Oran Agra 2b8cde71bb
Update supported version list. (#12488)
Add 7.2, drop 6.0 as per https://redis.io/docs/about/releases/
Also replace a few concordances of the `’` char, with standard `'`
2023-08-16 08:36:40 +03:00
Madelyn Olson 7c179f9bf4
Fixed a bug where sequential matching ACL rules weren't compressed (#12472)
When adding a new ACL rule was added, an attempt was made to remove
any "overlapping" rules. However, there when a match was found, the search
was not resumed at the right location, but instead after the original position of
the original command.

For example, if the current rules were `-config +config|get` and a rule `+config`
was added. It would identify that `-config` was matched, but it would skip over
`+config|get`, leaving the compacted rule `-config +config`. This would be evaluated
safely, but looks weird.

This bug can only be triggered with subcommands, since that is the only way to
have sequential matching rules. Resolves #12470. This is also only present in 7.2.
I think there was also a minor risk of removing another valid rule, since it would start
the search of the next command at an arbitrary point. I couldn't find a valid offset that
would have cause a match using any of the existing commands that have subcommands
with another command.
2023-08-10 09:58:53 +03:00
Binbin 6abfda54c3
Fix flaky SENTINEL RESET test (#12437)
After SENTINEL RESET, sometimes the sentinel can
sense the master again, causing the test to fail.
Here we give it a few more chances.
2023-08-10 08:58:52 +03:00
zhaozhao.zz 8226f39fb2
do not call handleClientsBlockedOnKeys inside yielding command (#12459)
Fix the assertion when a busy script (timeout) signal ready keys (like LPUSH),
and then an arbitrary client's `allow-busy` command steps into `handleClientsBlockedOnKeys`
try wake up clients blocked on keys (like BLPOP).

Reproduction process:
1. start a redis with aof
    `./redis-server --appendonly yes`
2. exec blpop
    `127.0.0.1:6379> blpop a 0`
3. use another client call a busy script and this script push the blocked key
    `127.0.0.1:6379> eval "redis.call('lpush','a','b') while(1) do end" 0`
4. user a new client call an allow-busy command like auth
    `127.0.0.1:6379> auth a`

BTW, this issue also break the atomicity of script.

This bug has been around for many years, the old versions only have the
atomic problem, only 7.0/7.2 has the assertion problem.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-08-05 09:52:03 +03:00
sundb da9c2804a5
Avoid mostly harmless integer overflow in cjson (#12456)
This PR mainly fixes a possible integer overflow in `json_append_string()`.
When we use `cjson.encoding()` to encode a string larger than 2GB, at specific
compilation flags, an integer overflow may occur leading to truncation, resulting
in the part of the string larger than 2GB not being encoded.
On the other hand, this overflow doesn't cause any read or write out-of-range or segment fault.

1) using -O0 for lua_cjson (`make LUA_DEBUG=yes`)
    In this case, `i` will overflow and leads to truncation.
    When `i` reaches `INT_MAX+1` and overflows to INT_MIN, when compared to
    len, `i` (1000000..00) is expanded to 64 bits signed integer (1111111.....000000) .
    At this point i will be greater than len and jump out of the loop, so `for (i = 0; i < len; i++)`
    will loop up to 2^31 times, and the part of larger than 2GB will be truncated.

```asm
`i` => -0x24(%rbp)
<+253>:   addl   $0x1,-0x24(%rbp)       ; overflow if i large than 2^31
<+257>:   mov    -0x24(%rbp),%eax
<+260>:   movslq %eax,%rdx	            ; move a 32-bit value with sign extension into a 64-bit signed
<+263>:   mov    -0x20(%rbp),%rax
<+267>:   cmp    %rax,%rdx              ; check `i < len`
<+270>:   jb     0x212600 <json_append_string+148>
```
   
2) using -O2/-O3 for lua_cjson (`make LUA_DEBUG=no`, **the default**)
    In this case, because singed integer overflow is an undefined behavior, `i` will not overflow.
   `i` will be optimized by the compiler and use 64-bit registers for all subsequent instructions.

```asm
<+180>:   add    $0x1,%rbx           ; Using 64-bit register `rbx` for i++
<+184>:   lea    0x1(%rdx),%rsi
<+188>:   mov    %rsi,0x10(%rbp)
<+192>:   mov    %al,(%rcx,%rdx,1)
<+195>:   cmp    %rbx,(%rsp)         ; check `i < len`
<+199>:   ja     0x20b63a <json_append_string+154>
```

3) using 32bit
    Because `strbuf_ensure_empty_length()` preallocates memory of length (len * 6 + 2),
    in 32-bit `cjson.encode()` can only handle strings smaller than ((2 ^ 32) - 3 ) / 6.
    So 32bit is not affected.

Also change `i` in `strbuf_append_string()` to `size_t`.
Since its second argument `str` is taken from the `char2escape` string array which is never
larger than 6, so `strbuf_append_string()` is not at risk of overflow (the bug was unreachable).
2023-08-05 07:57:06 +03:00
zhaozhao.zz 90ab91f00b
fix false success and a memory leak for ACL selector with bad parenthesis combination (#12452)
When doing merge selector, we should check whether the merge
has started (i.e., whether open_bracket_start is -1) every time.
Otherwise, encountering an illegal selector pattern could succeed
and also cause memory leaks, for example:

```
acl setuser test1 (+PING (+SELECT (+DEL )
```

The above would leak memory and succeed with only DEL being applied,
and would now error after the fix.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-08-02 10:46:06 +03:00
Diego Lopez Recas b653c759cd
Fix race condition in tests/unit/auth.tcl (#12444)
Changing the masterauth while turning into a replica is racy.

Turn into replica after changing the masterauth instead.
2023-08-01 18:03:33 +03:00
DarrenJiang13 6abb3c4038
change log match to line match in tcl sanitizer_errors_from_file. (#12446)
In the tcl foreach loop, the function should compare line rather than the whole file.
2023-07-30 08:48:29 +03:00
Harkrishn Patro 42985b00ea
Test coverage for incr/decr operation on robj encoding type optimization (#12435)
Additional test coverage for incr/decr operation.

integer number could be present in raw encoding format due to operation like append. A incr/decr operation following it optimize the string to int encoding format.
2023-07-25 16:43:31 -07:00
Harkrishn Patro 34b95f752c
Add test case for APPEND command usage on integer value (#12429)
Add test coverage to validate object encoding update on APPEND command usage on a integer value
2023-07-24 18:25:50 -07:00
Makdon 2495b90a64
redis-cli: use previous hostip when not provided by redis cluster server (#12273)
When the redis server cluster running on cluster-preferred-endpoint-type unknown-endpoint mode, and receive a request that should be redirected to another redis server node, it does not reply the hostip, but a empty host like MOVED 3999 :6381.

The redis-cli would try to connect to an address without a host, which cause the issue:
```
127.0.0.1:7002> set bar bar
-> Redirected to slot [5061] located at :7000
Could not connect to Redis at :7000: No address associated with hostname
Could not connect to Redis at :7000: No address associated with hostname
not connected> exit
```

In this case, the redis-cli should use the previous hostip when there's no host provided by the server.

---------

Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Madelyn Olson <madelynolson@gmail.com>
2023-07-20 15:31:06 -07:00
Oran Agra 936cfa464f
Lua cjson and cmsgpack integer overflow issues (CVE-2022-24834) (#12398)
* Fix integer overflows due to using wrong integer size.
* Add assertions / panic when overflow still happens.
* Deletion of dead code to avoid need to maintain it
* Some changes are not because of bugs, but rather paranoia.
* Improve cmsgpack and cjson test coverage.

Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>
2023-07-10 10:26:09 +03:00
Sankar 1190f25ca7
Process loss of slot ownership in cluster bus (#12344)
Process loss of slot ownership in cluster bus

When a node no longer owns a slot, it clears the bit corresponding
to the slot in the cluster bus messages. The receiving nodes
currently don't record the fact that the sender stopped claiming
a slot until some other node in the cluster starts claiming the slot.
This can cause a slot to go missing during slot migration when subjected
to inopportune race with addition of new shards or a failover.
This fix forces the receiving nodes to process the loss of ownership
to avoid spreading wrong information.
2023-07-05 17:46:23 -07:00
Binbin d56a7d9b05
Fix and increase tollerance of event loop test, add verbose logs (#12385)
The test fails on freebsd CI:
```
*** [err]: stats: eventloop metrics in tests/unit/info.tcl
Expected '31777' to be less than '16183' (context: type eval line 17 cmd
{assert_lessthan $el_sum2 [expr $el_sum1+10000] } proc ::test)
```

The test added in #11963, fails on freebsd CI which is slow,
increase tollerance and also add some verbose logs, now we can
see these logs in verbose mode (for better views):
```
eventloop metrics cycle1: 12, cycle2: 15
eventloop metrics el_sum1: 315, el_sum2: 411
eventloop metrics cmd_sum1: 126, cmd_sum2: 137
[ok]: stats: eventloop metrics (111 ms)
instantaneous metrics instantaneous_eventloop_cycles_per_sec: 8
instantaneous metrics instantaneous_eventloop_duration_usec: 55
[ok]: stats: instantaneous metrics (1603 ms)
[ok]: stats: debug metrics (112 ms)
```
2023-07-05 09:32:30 +03:00
Lior Lahav b7559d9f3b
Fix possible crash in command getkeys (#12380)
When getKeysUsingKeySpecs processes a command with more than one key-spec,
and called with a total of more than 256 keys, it'll call getKeysPrepareResult again,
but since numkeys isn't updated, getKeysPrepareResult will not bother to copy key
names from the old result (leaving these slots uninitialized). Furthermore, it did not
consider the keys it already found when allocating more space.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-07-03 12:45:18 +03:00
Binbin 6b57241fa8
Revert zrangeGenericCommand negative offset check (#12377)
The negative offset check was added in #9052, we realized
that this is a non-mandatory breaking change and we would
like to add it only in 8.0.

This reverts PR #9052, will be re-introduced later in 8.0.
2023-07-02 20:38:30 +03:00
judeng 07ed0eafa9
improve performance for scan command when matching pattern or data type (#12209)
Optimized the performance of the SCAN command in a few ways:
1. Move the key filtering (by MATCH pattern) in the scan callback,
  so as to avoid collecting them for later filtering.
2. Reduce a many memory allocations and copying (use a reference
  to the original sds, instead of creating an robj, an excessive 2 mallocs
  and one string duplication)
3. Compare TYPE filter directly (as integers), instead of inefficient string
  compare per key.
4. fixed a small bug: when scan zset and hash types, maxiterations uses
  a more accurate number to avoid wrong double maxiterations.

Changes **postponed** for a later version (8.0):
1. Prepare to move the TYPE filtering to the scan callback as well. this was
  put on hold since it has side effects that can be considered a breaking
  change, which is that we will not attempt to do lazy expire (delete) a key
  that was filtered by not matching the TYPE (changing it would mean TYPE filter
  starts behaving the same as MATCH filter already does in that respect). 
2. when the specified key TYPE filter is an unknown type, server will reply a error
  immediately instead of doing a full scan that comes back empty handed. 

Benchmark result:
For different scenarios, we obtained about 30% or more performance improvement.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-27 16:43:46 +03:00
Chen Tianjie 22a29935ff
Support TLS service when "tls-cluster" is not enabled and persist both plain and TLS port in nodes.conf (#12233)
Originally, when "tls-cluster" is enabled, `port` is set to TLS port. In order to support non-TLS clients, `pport` is used to propagate TCP port across cluster nodes. However when "tls-cluster" is disabled, `port` is set to TCP port, and `pport` is not used, which means the cluster cannot provide TLS service unless "tls-cluster" is on.
```
typedef struct {
    // ...
    uint16_t port;  /* Latest known clients port (TLS or plain). */
    uint16_t pport; /* Latest known clients plaintext port. Only used if the main clients port is for TLS. */
    // ...
} clusterNode;
```
```
typedef struct {
    // ...
    uint16_t port;   /* TCP base port number. */
    uint16_t pport;  /* Sender TCP plaintext port, if base port is TLS */
    // ...
} clusterMsg;
```
This PR renames `port` and `pport` in `clusterNode` to `tcp_port` and `tls_port`, to record both ports no matter "tls-cluster" is enabled or disabled.

This allows to provide TLS service to clients when "tls-cluster" is disabled: when displaying cluster topology, or giving `MOVED` error, server can provide TLS or TCP port according to client's connection type, no matter what type of connection cluster bus is using.

For backwards compatibility, `port` and `pport` in `clusterMsg` are preserved, when "tls-cluster" is enabled, `port` is set to TLS port and `pport` is set to TCP port, when "tls-cluster" is disabled, `port` is set to TCP port and `pport` is set to TLS port (instead of 0).

Also, in the nodes.conf file, a new aux field displaying an extra port is added to complete the persisted info. We may have `tls_port=xxxxx` or `tcp_port=xxxxx` in the aux field, to complete the cluster topology, while the other port is stored in the normal `<ip>:<port>` field. The format is shown below.
```
<node-id> <ip>:<tcp_port>@<cport>,<hostname>,shard-id=...,tls-port=6379 myself,master - 0 0 0 connected 0-1000
```
Or we can switch the position of two ports, both can be correctly resolved.
```
<node-id> <ip>:<tls_port>@<cport>,<hostname>,shard-id=...,tcp-port=6379 myself,master - 0 0 0 connected 0-1000
```
2023-06-26 07:43:38 -07:00
Meir Shpilraien (Spielrein) 153f8f082e
Fix use after free on blocking RM_Call. (#12342)
blocking RM_Call was introduced on: #11568, It allows a module to perform
blocking commands and get the reply asynchronously.If the command gets
block, a special promise CallReply is returned that allow to set the unblock
handler. The unblock handler will be called when the command invocation
finish and it gets, as input, the command real reply.

The issue was that the real CallReply was created using a stack allocated
RedisModuleCtx which is no longer available after the unblock handler finishes.
So if the module keeps the CallReply after the unblock handler finished, the
CallReply holds a pointer to invalid memory and will try to access it when the
CallReply will be released.

The solution is to create the CallReply with a NULL context to make it totally
detached and can be freed freely when the module wants.

Test was added to cover this case, running the test with valgrind before the
fix shows the use after free error. With the fix, there are no valgrind errors.

unrelated: adding a missing `$rd close` in many tests in that file.
2023-06-25 14:12:27 +03:00
guybe7 3230199920
Modules: Unblock from within a timer coverage (#12337)
Apart from adding the missing coverage, this PR also adds `blockedBeforeSleep`
that gathers all block-related functions from `beforeSleep`

The order inside `blockedBeforeSleep` is different: now `handleClientsBlockedOnKeys`
(which may unblock clients) is called before `processUnblockedClients` (which handles
unblocked clients).
It makes sense to have this order.

There are no visible effects of the wrong ordering, except some cleanups of the now-unblocked
client would have  happen in the next `beforeSleep` (will now happen in the current one)

The reason we even got into it is because i triggers an assertion in logresreq.c (breaking
the assumption that `unblockClient` is called **before** actually flushing the reply to the socket):
`handleClientsBlockedOnKeys` is called, then it calls `moduleUnblockClientOnKey`, which calls
`moduleUnblockClient`, which adds the client to `moduleUnblockedClients` back to `beforeSleep`,
we call `handleClientsWithPendingWritesUsingThreads`, it writes the data of buf to the client, so
`client->bufpos` became 0
On the next `beforeSleep`, we call `moduleHandleBlockedClients`, which calls `unblockClient`,
which calls `reqresAppendResponse`, triggering the assert. (because the `bufpos` is 0) - see https://github.com/redis/redis/pull/12301#discussion_r1226386716
2023-06-22 23:15:16 +03:00
Madelyn Olson 73cf0243df
Make nodename test more consistent (#12330)
To determine when everything was stable, we couldn't just query the nodename since they aren't API visible by design. Instead, we were using a proxy piece of information which was bumping the epoch and waiting for everyone to observe that. This works for making source Node 0 and Node 1 had pinged, and Node 0 and Node 2 had pinged, but did not guarantee that Node 1 and Node 2 had pinged. Although unlikely, this can cause this failure message. To fix it I hijacked hostnames and used its validation that it has been propagated, since we know that it is stable.

I also noticed while stress testing this sometimes the test took almost 4.5 seconds to finish, which is really close to the current 5 second limit of the log check, so I bumped that up as well just to make it a bit more consistent.
2023-06-20 18:00:55 -07:00
guybe7 20fa156067
Align RM_ReplyWithErrorFormat with RM_ReplyWithError (#12321)
Introduced by https://github.com/redis/redis/pull/11923 (Redis 7.2 RC2)

It's very weird and counterintuitive that `RM_ReplyWithError` requires the error-code
**without** a hyphen while `RM_ReplyWithErrorFormat` requires either the error-code
**with** a hyphen or no error-code at all
```
RedisModule_ReplyWithError(ctx, "BLA bla bla");
```
vs.
```
RedisModule_ReplyWithErrorFormat(ctx, "-BLA %s", "bla bla");
```

This commit aligns RM_ReplyWithErrorFormat to behvae like RM_ReplyWithError.
it's a breaking changes but it's done before 7.2 goes GA.
2023-06-20 20:44:43 +03:00
Oran Agra 8ad8f0f9d8
Fix broken protocol when PUBLISH emits local push inside MULTI (#12326)
When a connection that's subscribe to a channel emits PUBLISH inside MULTI-EXEC,
the push notification messes up the EXEC response.

e.g. MULTI, PING, PUSH foo bar, PING, EXEC
the EXEC's response will contain: PONG, {message foo bar}, 1. and the second PONG
will be delivered outside the EXEC's response.

Additionally, this PR changes the order of responses in case of a plain PUBLISH (when
the current client also subscribed to it), by delivering the push after the command's
response instead of before it.
This also affects modules calling RM_PublishMessage in a similar way, so that we don't
run the risk of getting that push mixed together with the module command's response.
2023-06-20 20:41:41 +03:00
Binbin d9c2ef8a72
zrangeGenericCommand add check for negative offset (#9052)
Now we will check the offset in zrangeGenericCommand.
With a negative offset, we will throw an error and return.

This also resolve the issue of zeroing the destination key
in case of the "store" variant when we input a negative offset.
```
127.0.0.1:6379> set key value
OK
127.0.0.1:6379> zrangestore key myzset 0 10 byscore limit -1 10
(integer) 0
127.0.0.1:6379> exists key
(integer) 0
```

This change affects the following commands:
- ZRANGE / ZRANGESTORE / ZRANGEBYLEX / ZRANGEBYSCORE
- ZREVRANGE / ZREVRANGEBYSCORE / ZREVRANGEBYLEX
2023-06-20 14:33:17 +03:00
Wen Hui 66ea178cb6
adding geo command edge cases tests (#12274)
For geosearch and georadius we have already test coverage for wrong type, but we dont have for geodist, geohash, geopos commands. So adding the wrong type test cases for geodist, geohash, geopos commands.

Existing code, we have verify_geo_edge_response_bymember function for wrong type test cases which has member as an option. But the function is being called in other test cases where the output is not inline with these commnds(geodist, geohash, geopos). So I could not include these commands(geodist, geohash, geopos) as part of existing function, hence implemented a new function verify_geo_edge_response_generic and called from the test case.
2023-06-20 12:50:03 +03:00
Binbin d306d86146
Fix ZRANK/ZREVRANK reply_schema description (#12331)
The parameter name is WITHSCORE instead of WITHSCORES.
2023-06-20 11:15:40 +03:00
Wen Hui 813924b4c3
Sanitizer reported memory leak for '--invalid' option or port number is missed cases to redis-server. (#12322)
Observed that the sanitizer reported memory leak as clean up is not done
before the process termination in negative/following cases:

**- when we passed '--invalid' as option to redis-server.**

```
 -vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments

=================================================================
==865778==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7f0985f65867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x558ec86686ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    #2 0x558ec86686ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    #3 0x558ec86686ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    #4 0x558ec86686ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    #5 0x558ec865dd7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    #6 0x558ec87a1be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #7 0x558ec87a13b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #8 0x558ec85e6f15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #9 0x7f09856e5d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).

```

**- when we pass '--port' as option and missed to add port number to redis-server.**

```
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments

=================================================================
==865846==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1f867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x557e8b04f6ec in ztrymalloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:117
    #2 0x557e8b04f6ec in ztrymalloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:135
    #3 0x557e8b04f6ec in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:276
    #4 0x557e8b04f6ec in zrealloc /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:327
    #5 0x557e8b044d7e in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1172
    #6 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #7 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #8 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #9 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Indirect leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x7fdcdbb1fc18 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
    #1 0x557e8b04f9aa in ztryrealloc_usable_internal /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:287
    #2 0x557e8b04f9aa in ztryrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:317
    #3 0x557e8b04f9aa in zrealloc_usable /home/ubuntu/mem-leak-issue/redis/src/zmalloc.c:342
    #4 0x557e8b033f90 in _sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:271
    #5 0x557e8b033f90 in sdsMakeRoomFor /home/ubuntu/mem-leak-issue/redis/src/sds.c:295
    #6 0x557e8b033f90 in sdscatlen /home/ubuntu/mem-leak-issue/redis/src/sds.c:486
    #7 0x557e8b044e1f in sdssplitargs /home/ubuntu/mem-leak-issue/redis/src/sds.c:1165
    #8 0x557e8b188be7 in loadServerConfigFromString /home/ubuntu/mem-leak-issue/redis/src/config.c:472
    #9 0x557e8b1883b3 in loadServerConfig /home/ubuntu/mem-leak-issue/redis/src/config.c:718
    #10 0x557e8afcdf15 in main /home/ubuntu/mem-leak-issue/redis/src/server.c:7258
    #11 0x7fdcdb29fd8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: 18 byte(s) leaked in 2 allocation(s).

```

As part analysis found that the sdsfreesplitres is not called when this condition checks are being hit.

Output after the fix:


```
vm:~/mem-leak-issue/redis$ ./src/redis-server --invalid

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'invalid'
Bad directive or wrong number of arguments
vm:~/mem-leak-issue/redis$

===========================================
vm:~/mem-leak-issue/redis$ ./src/redis-server --jdhg

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'jdhg'
Bad directive or wrong number of arguments

---------------------------------------------------------------------------
vm:~/mem-leak-issue/redis$ ./src/redis-server --port

*** FATAL CONFIG FILE ERROR (Redis 255.255.255) ***
Reading the configuration file, at line 2
>>> 'port'
wrong number of arguments
```

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-20 10:07:29 +03:00
Shaya Potter 07316f16f0
Add ability for modules to know which client is being cmd filtered (#12219)
Adds API
- RedisModule_CommandFilterGetClientId()

Includes addition to commandfilter test module to validate that it works
by performing the same command from 2 different clients
2023-06-20 09:03:52 +03:00
Wen Hui 070453eef3
Cluster human readable nodename feature (#9564)
This PR adds a human readable name to a node in clusters that are visible as part of error logs. This is useful so that admins and operators of Redis cluster have better visibility into failures without having to cross-reference the generated ID with some logical identifier (such as pod-ID or EC2 instance ID). This is mentioned in #8948. Specific nodenames can be set by using the variable cluster-announce-human-nodename. The nodename is gossiped using the clusterbus extension in #9530.

Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2023-06-17 21:16:51 -07:00
sundb b00a235186
Use Reservoir Sampling for random sampling of dict, and fix hang during fork (#12276)
## Issue:
When a dict has a long chain or the length of the chain is longer than
the number of samples, we will never be able to sample the elements
at the end of the chain using dictGetSomeKeys().
This could mean that SRANDMEMBER can be hang in and endless loop.
The most severe case, is the pathological case of when someone uses SCAN+DEL
or SSCAN+SREM creating an unevenly distributed dict.
This was amplified by the recent change in #11692 which prevented a
down-sizing rehashing while there is a fork.

## Solution
1. Before, we will stop sampling when we reach the maximum number
  of samples, even if there is more data after the current chain.
  Now when we reach the maximum we use the Reservoir Sampling
  algorithm to fairly sample the end of the chain that cannot be sampled
2. Fix the rehashing code, so that the same as it allows rehashing for up-sizing
  during fork when the ratio is extreme, it will allow it for down-sizing as well.

Issue was introduced (or became more severe) by #11692

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-16 19:13:07 +03:00
Binbin 439b0315c8
Fix SPOP/RESTORE propagation when doing lazy free (#12320)
In SPOP, when COUNT is greater than or equal to set's size,
we will remove the set. In dbDelete, we will do DEL or UNLINK
according to the lazy flag. This is also required for propagate.

In RESTORE, we won't store expired keys into the db, see #7472.
When used together with REPLACE, it should emit a DEL or UNLINK
according to the lazy flag.

This PR also adds tests to cover the propagation. The RESTORE
test will also cover #7472.
2023-06-16 08:14:11 -07:00
YaacovHazan 5da9eecdb8
Removing duplicated tests (#12318)
In 4ba47d2d2 the following tests added in both tracking.tcl and introspection.tcl

- Coverage: Basic CLIENT CACHING
- Coverage: Basic CLIENT REPLY
- Coverage: Basic CLIENT TRACKINGINFO
- Coverage: Basic CLIENT GETREDIR
2023-06-16 15:55:24 +03:00
Binbin 9dc6f93e9d
Add command being unblocked cause another command to get unblocked execution order test (#12324)
* Add command being unblocked cause another command to get unblocked execution order test

In #12301, we observed that if the
`while(listLength(server.ready_keys) != 0)`
in handleClientsBlockedOnKeys is changed to
`if(listLength(server.ready_keys) != 0)`,
the order of command execution will change.

It is wrong to change that. It means that if a command
being unblocked causes another command to get unblocked
(like a BLMOVE would do), then the new unblocked command
will wait for later to get processed rather than right away.

It'll not have any real implication if we change that since
we do call handleClientsBlockedOnKeys in beforeSleep again,
and redis will still behave correctly, but we don't change that.

An example:
1. $rd1 blmove src{t} dst{t} left right 0
2. $rd2 blmove dst{t} src{t} right left 0
3. $rd3 set key1{t}, $rd3 lpush src{t}, $rd3 set key2{t} in a pipeline

The correct order would be:
1. set key1{t}
2. lpush src{t}
3. lmove src{t} dst{t} left right
4. lmove dst{t} src{t} right left
5. set key2{t}

The wrong order would be:
1. set key1{t}
2. lpush src{t}
3. lmove src{t} dst{t} left right
4. set key2{t}
5. lmove dst{t} src{t} right left

This PR adds corresponding test to cover it.

* Add comment near while(listLength(server.ready_keys) != 0)
2023-06-16 15:39:00 +03:00
Binbin 254475537a
Optimize SET PXAT to reduce calls of rewriteClientCommandVector (#12316)
In PXAT case, there is no need to do the rewriteClientCommandVector,
a simply benchmark show we gain a improvement of 10%.
2023-06-15 10:07:47 +03:00
Wen Hui 1946083410
Removing the duplicate test case (#12310)
Looks like the Zadd test case was copied to create Zincrby test case ,but missed to change the command.
2023-06-14 10:03:33 +03:00
Harkrishn Patro a9e32767f7
Allow cluster slots/shards api to respond during loading (#12269)
It would be helpful for clients to get cluster slots/shards information during a node failover and is loading data.
2023-06-13 18:16:32 +03:00
Binbin e7129e43e0
Fix XREADGROUP BLOCK stuck in endless loop (#12301)
For the XREADGROUP BLOCK > scenario, there is an endless loop.
Due to #11012, it keep going, reprocess command -> blockForKeys -> reprocess command

The right fix is to avoid an endless loop in handleClientsBlockedOnKey and handleClientsBlockedOnKeys,
looks like there was some attempt in handleClientsBlockedOnKeys but maybe not sufficiently good,
and it looks like using a similar trick in handleClientsBlockedOnKey is complicated.
i.e. stashing the list on the stack and iterating on it after creating a fresh one for future use,
is problematic since the code keeps accessing the global list.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-13 13:27:05 +03:00
Oran Agra f228ec1ea5
flushSlavesOutputBuffers should not write to replicas scheduled to drop (#12242)
This will increase the size of an already large COB (one already passed
the threshold for disconnection)

This could also mean that we'll attempt to write that data to the socket
and the replica will manage to read it, which will result in an
undesired partial sync (undesired for the test)
2023-06-12 14:05:34 +03:00
YaacovHazan 0bfb6d5582
Reset command duration for rejected command. (#12247)
In 7.2, After 971b177fa we make sure (assert) that
the duration has been recorded when resetting the client.

This is not true for rejected commands.
The use case I found is a blocking command that an ACL rule changed before
it was unblocked, and while reprocessing it, the command rejected and triggered the assert.

The PR reset the command duration inside rejectCommand / rejectCommandSds.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-06-11 23:12:05 +03:00
Chen Tianjie 2d7d3911da
Allow bigger tolerance in eventloop duration test. (#12179)
In #11963, some new tests about eventloop duration were added, which includes time measurement in TCL scripts. This has caused some unexpected CI failures, such as #12169 and #12177, due to slow test servers or some performance jittering.
2023-06-11 09:02:41 +03:00
Wen Hui d412269ff8
Adding missing test cases for Addslot Command (#12288)
Added missing test case coverage for below scenarios:

1. The command only works if all the specified slots are, from
  the point of view of the node receiving the command, currently
  not assigned. A node will refuse to take ownership for slots that
  already belong to some other node (including itself).
2. The command fails if the same slot is specified multiple times.
2023-06-11 08:36:26 +03:00
Binbin da8f7428fa
Try to fix SENTINEL SIMULATE-FAILURE test by re-source init-tests before each test (#12194)
This test was introduced in #12079, it works well most of the time, but
occasionally fails:
```
00:34:45> SENTINEL SIMULATE-FAILURE crash-after-election works: OK
00:34:45> SENTINEL SIMULATE-FAILURE crash-after-promotion works: FAILED: Sentinel set crash-after-promotion but did not exit
```

Don't know the reason, it may be affected by the exit of the previous
crash-after-election test. Because it doesn't really make much sense to
go deeper into it now, we re-source init-tests to get a clean environment
before each test, to try to fix this.

After applying this change, we found a new error:
```
16:39:33> SENTINEL SIMULATE-FAILURE crash-after-election works: FAILED: caught an error in the test couldn't open socket: connection refused
couldn't open socket: connection refused
```

I am guessing the sentinel triggers failover and exits before SENTINEL FAILOVER,
added a new || condition in wait_for_condition to fix it.
2023-05-29 13:43:26 +03:00
Oran Agra 2764dc3768
Optimize MSETNX to avoid double lookup (#11944)
This is a redo of #11594 which got reverted in #11940
It improves performance by avoiding double lookup of the the key.
2023-05-28 10:58:29 +03:00
Oran Agra 6117f28822
Fix WAIT for clients being blocked in a module command (#12220)
So far clients being blocked and unblocked by a module command would
update the c->woff variable and so WAIT was ineffective and got released
without waiting for the command actions to propagate.

This seems to have existed since forever, but not for RM_BlockClientOnKeys.

It is problematic though to know if the module did or didn't propagate
anything in that command, so for now, instead of adding an API, we'll
just update the woff to the latest offset when unblocking, this will
cause the client to possibly wait excessively, but that's not that bad.
2023-05-28 10:10:52 +03:00
Wen Hui 1a188e4ed6
[BUG] Incorrect error msg for XREAD command (#12238)
XREAD only supports a special ID of $ and XREADGROUP only supports ^.
make sure not to suggest the wrong one when rerunning an error about unbalanced ID arguments

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-05-28 08:37:32 +03:00
judeng d71478a889
postpone the initialization of oject's lru&lfu until it is added to the db as a value object (#11626)
This pr can get two performance benefits:
1. Stop redundant initialization when most robj objects are created
2. LRU_CLOCK will no longer be called in io threads, so we can avoid the `atomicGet`

Another code optimization:
deleted the redundant judgment in dbSetValue, no matter in LFU or LRU, the lru field inold
robj is always the freshest (it is always updated in lookupkey), so we don't need to judge if in LFU
2023-05-24 09:40:11 +03:00
Wen Hui d664889992
Adding test case for hvals, hkeys, hexists against wrong type (#12198)
HVALS, HKEYS and HEXISTS commands wrong type test cases were not covered so added the test cases.
2023-05-24 09:34:13 +03:00
Binbin d0994c5bca
Sync the new loglevel nothing to sentinel (#12223)
We add a new loglevel 'nothing' to disable logging in #12133.
This PR syncs that config change to sentinel. Because in #11214
we support modifying loglevel in runtime.

Although I think sentinel doesn't need this nothing config,
it's better to be consistent.
2023-05-24 09:32:39 +03:00
Binbin ec5721d6ca
Add dummy CLUSTER SLAVES call tests to fix reply ci (#12224)
In #12166, we removed a call to CLUSTER SLAVES, which
then caused reply-schemas ci to fail:
```
WARNING! The following commands were not hit at all:
  cluster|slaves
  ERROR! at least one command was not hit by the tests
```

Because we already have command output that cover CLUSTER REPLICAS
elsewhere, here we simply add some dummy tests to fix the ci.
2023-05-24 09:28:38 +03:00
Ping Xie 4c74dd986f
Exclude aux fields from "cluster nodes" and "cluster replicas" output (#12166)
This commit excludes aux fields from the output of the `cluster nodes` and `cluster replicas` command.
We may decide to re-introduce them in some form or another in the future, but not in v7.2.
2023-05-23 18:32:37 +03:00
binfeng-xin 38e284f106
optimize spopwithcount propagation (#12082)
A single SPOP with command with count argument resulted in many SPOP
commands being propagated to the replica.
This is inefficient because the key name is repeated many times, and is also
being looked-up many times.
also it results in high QPS metrics on the replica.
To solve that, we flush batches of 1024 fields per SPOP command.

Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
2023-05-22 10:27:14 +03:00
Binbin 48757934ff
Performance improvement to ZADD and ZRANGESTORE, convert to skiplist and expand dict in advance (#12185)
For zsets that will eventually be stored as the skiplist encoding (has a dict),
we can convert it to skiplist ahead of time. This change checks the number
of arguments in the ZADD command, and converts the data-structure
if the number of new entries exceeds the listpack-max-entries configuration.
This can cause us to over-allocate memory if there are duplicate entries in the
input, which is unexpected.

For ZRANGESTORE, we know the size of the zset, so we can expand
the dict in advance, to avoid the temporary dict from being rehashed
while it grows.

Simple benchmarks shows it provides some 4% improvement in ZADD and 20% in ZRANGESTORE
2023-05-18 15:24:46 +03:00
Hanna Fadida 37cf1984b9
Add BITFIELD_RO basic tests for non-repl use cases (#12187)
Current tests for BITFIELD_RO command are skipped in the external mode,
and therefore reply-schemas-validator reports a coverage error.
This PR adds basic tests to increase coverage.
2023-05-18 12:16:46 +03:00
Wen Hui df1890ef7f
Allow SENTINEL CONFIG SET and SENTINEL CONFIG GET to handle multiple parameters. (#10362)
Extend SENTINEL CONFIG SET and SENTINEL CONFIG GET to be
compatible with variadic CONFIG SET and CONFIG GET and allow multiple
parameters to be modified in a single call atomically.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-05-17 10:26:02 +03:00
Binbin fd566f4050
Fix for set max entries edge case in setTypeCreate / setTypeMaybeConvert (#12183)
In the judgment in setTypeCreate, we should judge size_hint <= max_entries.

This results in the following inconsistencies:
```
127.0.0.1:6379> config set set-max-intset-entries 5 set-max-listpack-entries 5
OK

127.0.0.1:6379> sadd intset_set1 1 2 3 4 5
(integer) 5
127.0.0.1:6379> object encoding intset_set1
"hashtable"
127.0.0.1:6379> sadd intset_set2 1 2 3 4
(integer) 4
127.0.0.1:6379> sadd intset_set2 5
(integer) 1
127.0.0.1:6379> object encoding intset_set2
"intset"

127.0.0.1:6379> sadd listpack_set1 a 1 2 3 4
(integer) 5
127.0.0.1:6379> object encoding listpack_set1
"hashtable"
127.0.0.1:6379> sadd listpack_set2 a 1 2 3
(integer) 4
127.0.0.1:6379> sadd listpack_set2 4
(integer) 1
127.0.0.1:6379> object encoding listpack_set2
"listpack"
```

This was introduced in #12019, added corresponding tests.
2023-05-16 11:32:21 -07:00
Wen Hui e45272884e
Adding missing test case for smembers scard commands (#12148)
Minor missing test case addition.

SMEMBERS SCARD against non set
SMEMBERS SCARD against non existing key
2023-05-16 09:26:49 -07:00
Oran Agra 2ffde15a1d
increase tollerance of new event loop test, fails on freebsd CI (#12169)
new test added in #11963, fails on freebsd CI which is slow.
2023-05-14 17:40:29 +03:00
JJ Lu 11cf5cbdcc
Fix bug: LPOS RANK LONG_ MIN causes overflow (#12167)
Limit the range of RANK to -LONG_ MAX ~ LONG_ MAX.
Without this limit, passing -9223372036854775808 would effectively
be the same as passing -1.
2023-05-14 09:04:33 +03:00
Chen Tianjie 29ca87955e
Add basic eventloop latency measurement. (#11963)
The measured latency(duration) includes the list below, which can be shown by `INFO STATS`.
```
eventloop_cycles  // ever increasing counter
eventloop_duration_sum // cumulative duration of eventloop in microseconds
eventloop_duration_cmd_sum  // cumulative duration of executing commands in microseconds
instantaneous_eventloop_cycles_per_sec  // average eventloop count per second in recent 1.6s
instantaneous_eventloop_duration_usec  // average single eventloop duration in recent 1.6s
```

Also added some experimental metrics, which are shown only when `INFO DEBUG` is called.
This section isn't included in the default INFO, or even in `INFO ALL` and the fields in this section
can change in the future without considering backwards compatibility.
```
eventloop_duration_aof_sum  // cumulative duration of writing AOF
eventloop_duration_cron_sum  // cumulative duration cron jobs (serverCron, beforeSleep excluding IO and AOF)
eventloop_cmd_per_cycle_max  // max number of commands executed in one eventloop
eventloop_duration_max  // max duration of one eventloop
```

All of these are being reset by CONFIG RESETSTAT
2023-05-12 20:13:15 +03:00
Leibale Eidelman e04ebdb8d3
fix `GEORADIUS[BYMEMBER]` `STORE` & `STOREDIST` args spec (#12151)
in GEO commands, `STORE` and `STOREDIST` are mutually exclusive.
use `oneof` block to contain them
2023-05-09 14:24:37 +03:00
Binbin 6ab2174d37
EXPIRE precision test more attempts and more tolerant (#12150)
The test failed on MacOS:
```
*** [err]: EXPIRE precision is now the millisecond in tests/unit/expire.tcl
Expected 'somevalue {}' to equal or match '{} {}'
```

`set a [r get x]`, even though we tried 10 times, sometimes we
still get {}, this is a time-sensitive test.

In this PR, we add the following changes:
1. More attempts, change it from 10 to 30.
2. More tolerant, change the `after 900` to `after 800`.

In addition, we judging $a in advance and changing `after 1100`
to `after 300`, this will save us some times.
2023-05-09 14:14:22 +03:00
cui fliter 03d50e0c30
Remove several instances of duplicate "the" in comments (#12144)
Remove several instances of duplicate "the" in comments
2023-05-08 16:12:44 -07:00
Wen Hui 42dd98ec19
adding missing test cases GET and GETEX (#12125)
adding test case of expired key or not exist for GET and GETEX.
for better test coverage.
2023-05-07 11:46:11 +03:00
sundb ce5f4ea3a9
Delete empty key if fails after moduleCreateEmptyKey() in module (#12129)
When `RM_ZsetAdd()`/`RM_ZsetIncrby()`/`RM_StreamAdd()` fails, if a new key happens to 
be created using `moduleCreateEmptyKey()`, we should clean up the empty key.

## Test
1) Add new module commands(`zset.add` and `zset.incrby`) to cover  `RM_ZsetAdd()`/`RM_ZsetIncrby()`.
2) Add a large-memory test to cover `RM_StreamAdd()`.
2023-05-07 10:13:19 +03:00
zhaozhao.zz b0dd7b3245
Free backlog only if rsi is invalid when master reboot (#12088)
When master reboot from RDB, if rsi in RDB is valid we should not free replication backlog, even if master_repl_offset or repl-offset is 0.

Since if master doesn't send any data to replicas master_repl_offset is 0, it's a valid number.

A clear example:

1. start a master and apply some write commands, the master's master_repl_offset is 0 since it has no replicas.
2. stop write commands on master, and start another instance and replicaof the master, trigger an FULLRESYNC
3. the master's master_repl_offset is still 0 (set a large number for repl-ping-replica-period), do BGSAVE and restart the master
4. master load master_repl_offset from RDB's rsi and it's still 0, and we should make sure replica can partially resync with master.
2023-05-06 11:53:28 +08:00
Binbin e49c2a5292
Pause cron to prevent premature shrinking in querybuf test (#12126)
Tests occasionally fail since #12000:
```
*** [err]: query buffer resized correctly when not idle in tests/unit/querybuf.tcl
Expected 0 > 32768 (context: type eval line 11 cmd {assert {$orig_test_client_qbuf > 32768}} proc ::test)

*** [err]: query buffer resized correctly with fat argv in tests/unit/querybuf.tcl
query buffer should not be resized when client idle time smaller than 2s
```

The reason may be because we set hz to 100, querybuf shrinks before we count
client_query_buffer. We avoid this problem by setting pause-cron to 1.
2023-05-04 13:02:08 +03:00
guybe7 857c09b04d
multi.tcl: reset readraw at the end of the test (#12123)
1. reset the readraw mode after a test that uses it. undetected since the
  only test after that on the same server didn't read any replies.
2. fix a cross slot issue that was undetected in cluster mode because
  readraw doesn't throw exceptions on errors.
2023-05-04 11:58:31 +03:00
Madelyn Olson 5e3be1be09
Remove prototypes with empty declarations (#12020)
Technically declaring a prototype with an empty declaration has been deprecated since the early days of C, but we never got a warning for it. C2x will apparently be introducing a breaking change if you are using this type of declarator, so Clang 15 has started issuing a warning with -pedantic. Although not apparently a problem for any of the compiler we build on, if feels like the right thing is to properly adhere to the C standard and use (void).
2023-05-02 17:31:32 -07:00
Wen Hui f32d1817e3
Updating missing test cases for Hash commands (#12116)
Adding missing test case against wrong type for HRANDFIELD HGET HGETALL HDEL HINCRBY HINCRBYFLOAT HSTRLEN.
2023-05-01 21:00:07 +03:00
Binbin d659c73456
Add missing reply schema and coverage tests (#12079)
The change in #12018 break the CI (fixed by #12083).
There are quite a few sentinel commands that are missing both test coverage and also schema.

PR added reply-schema to the following commands:
- sentinel debug
- sentinel info-cache
- sentinel pendding-scripts
- sentinel reset
- sentinel simulate-failure

Added some very basic tests for other sentinel commands, just so that they have some coverage.
- sentinel help
- sentinel masters
- sentinel myid
- sentinel sentinels
- sentinel slaves

These tests should be improved / replaced in a followup PR.
2023-04-27 09:32:14 +03:00
judeng 9b588f3820
minor optimization for slowlog get (#12103)
We can always know the array length of the response, so there is no need to
use addReplyDeferredLen which may introduce some additional overheads.
2023-04-25 10:17:21 +03:00
Wen Hui 091412cf62
add test cases for decr decrby on missing key (#12070)
Minor test case addition for DECR and DECRBY.

Currently DECR and DECRBY do not have test case coverage for the
scenarios where they run on a non-existing key.
2023-04-19 09:55:56 +03:00
Binbin 20533cc1d7
Tests: Do not save an RDB by default and add a SIGTERM default AOFRW test (#12064)
In order to speed up tests, avoid saving an RDB (mostly notable on shutdown),
except for tests that explicitly test the RDB mechanism

In addition, use `shutdown-on-sigterm force` to prevetn shutdown from failing
in case the server is in the middle of the initial AOFRW

Also a a test that checks that the `shutdown-on-sigterm default` is to refuse
shutdown if there's an initial AOFRW

Co-authored-by: Guy Benoish <guy.benoish@redislabs.com>
2023-04-18 16:14:26 +03:00
sundb 42c8c61813
Fix some compile warnings and errors when building with gcc-12 or clang (#12035)
This PR is to fix the compilation warnings and errors generated by the latest
complier toolchain, and to add a new runner of the latest toolchain for daily CI.

## Fix various compilation warnings and errors

1) jemalloc.c

COMPILER: clang-14 with FORTIFY_SOURCE

WARNING:
```
src/jemalloc.c:1028:7: warning: suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma? [-Wstring-concatenation]
                    "/etc/malloc.conf",
                    ^
src/jemalloc.c:1027:3: note: place parentheses around the string literal to silence warning
                "\"name\" of the file referenced by the symbolic link named "
                ^
```

REASON:  the compiler to alert developers to potential issues with string concatenation
that may miss a comma,
just like #9534 which misses a comma.

SOLUTION: use `()` to tell the compiler that these two line strings are continuous.

2) config.h

COMPILER: clang-14 with FORTIFY_SOURCE

WARNING:
```
In file included from quicklist.c:36:
./config.h:319:76: warning: attribute declaration must precede definition [-Wignored-attributes]
char *strcat(char *restrict dest, const char *restrict src) __attribute__((deprecated("please avoid use of unsafe C functions. prefer use of redis_strlcat instead")));
```

REASON: Enabling _FORTIFY_SOURCE will cause the compiler to use `strcpy()` with check,
it results in a deprecated attribute declaration after including <features.h>.

SOLUTION: move the deprecated attribute declaration from config.h to fmacro.h before "#include <features.h>".

3) networking.c

COMPILER: GCC-12

WARNING: 
```
networking.c: In function ‘addReplyDouble.part.0’:
networking.c:876:21: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
  876 |         dbuf[start] = '$';
      |                     ^
networking.c:868:14: note: at offset -5 into destination object ‘dbuf’ of size 5152
  868 |         char dbuf[MAX_LONG_DOUBLE_CHARS+32];
      |              ^
networking.c:876:21: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
  876 |         dbuf[start] = '$';
      |                     ^
networking.c:868:14: note: at offset -6 into destination object ‘dbuf’ of size 5152
  868 |         char dbuf[MAX_LONG_DOUBLE_CHARS+32];
```

REASON: GCC-12 predicts that digits10() may return 9 or 10 through `return 9 + (v >= 1000000000UL)`.

SOLUTION: add an assert to let the compiler know the possible length;

4) redis-cli.c & redis-benchmark.c

COMPILER: clang-14 with FORTIFY_SOURCE

WARNING:
```
redis-benchmark.c:1621:2: warning: embedding a directive within macro arguments has undefined behavior [-Wembedded-directive] #ifdef USE_OPENSSL
redis-cli.c:3015:2: warning: embedding a directive within macro arguments has undefined behavior [-Wembedded-directive] #ifdef USE_OPENSSL
```

REASON: when _FORTIFY_SOURCE is enabled, the compiler will use the print() with
check, which is a macro. this may result in the use of directives within the macro, which
is undefined behavior.

SOLUTION: move the directives-related code out of `print()`.

5) server.c

COMPILER: gcc-13 with FORTIFY_SOURCE

WARNING:
```
In function 'lookupCommandLogic',
    inlined from 'lookupCommandBySdsLogic' at server.c:3139:32:
server.c:3102:66: error: '*(robj **)argv' may be used uninitialized [-Werror=maybe-uninitialized]
 3102 |     struct redisCommand *base_cmd = dictFetchValue(commands, argv[0]->ptr);
      |                                                              ~~~~^~~
```

REASON: The compiler thinks that the `argc` returned by `sdssplitlen()` could be 0,
resulting in an empty array of size 0 being passed to lookupCommandLogic.
this should be a false positive, `argc` can't be 0 when strings are not NULL.

SOLUTION: add an assert to let the compiler know that `argc` is positive.

6) sha1.c

COMPILER: gcc-12

WARNING:
```
In function ‘SHA1Update’,
    inlined from ‘SHA1Final’ at sha1.c:195:5:
sha1.c:152:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
  152 |             SHA1Transform(context->state, &data[i]);
      |             ^
sha1.c:152:13: note: referencing argument 2 of type ‘const unsigned char[64]’
sha1.c: In function ‘SHA1Final’:
sha1.c:56:6: note: in a call to function ‘SHA1Transform’
   56 | void SHA1Transform(uint32_t state[5], const unsigned char buffer[64])
      |      ^
In function ‘SHA1Update’,
    inlined from ‘SHA1Final’ at sha1.c:198:9:
sha1.c:152:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
  152 |             SHA1Transform(context->state, &data[i]);
      |             ^
sha1.c:152:13: note: referencing argument 2 of type ‘const unsigned char[64]’
sha1.c: In function ‘SHA1Final’:
sha1.c:56:6: note: in a call to function ‘SHA1Transform’
   56 | void SHA1Transform(uint32_t state[5], const unsigned char buffer[64])
```

REASON: due to the bug[https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80922], when
enable LTO, gcc-12 will not see `diagnostic ignored "-Wstringop-overread"`, resulting in a warning.

SOLUTION: temporarily set SHA1Update to noinline to avoid compiler warnings due
to LTO being enabled until the above gcc bug is fixed.

7) zmalloc.h

COMPILER: GCC-12

WARNING: 
```
In function ‘memset’,
    inlined from ‘moduleCreateContext’ at module.c:877:5,
    inlined from ‘RM_GetDetachedThreadSafeContext’ at module.c:8410:5:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:59:10: warning: ‘__builtin_memset’ writing 104 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
   59 |   return __builtin___memset_chk (__dest, __ch, __len,
```

REASON: due to the GCC-12 bug [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503],
GCC-12 cannot see alloc_size, which causes GCC to think that the actual size of memory
is 0 when checking with __glibc_objsize0().

SOLUTION: temporarily set malloc-related interfaces to `noinline` to avoid compiler warnings
due to LTO being enabled until the above gcc bug is fixed.

## Other changes
1) Fixed `ps -p [pid]`  doesn't output `<defunct>` when using procps 4.x causing `replication
  child dies when parent is killed - diskless` test to fail.
2) Add a new fortify CI with GCC-13 and ubuntu-lunar docker image.
2023-04-18 09:53:51 +03:00
Wen Hui d2db4aa753
Added getrange missing testcase (#12061)
Minor test case addition.
Currently GETRANGE command does not have the test case coverage for the scenarios:
An error is returned when key exists but of different type
Added missing test cases for getrange command.
2023-04-18 08:32:10 +03:00
judeng e7f18432b8
avoid incorrect shrinking of querybuf when client is reading a big argv (#12000)
this pr fix two wrongs:
1. When client’s querybuf is pre-allocated for a fat argv, we need to update the
  querybuf_peak of the client immediately to completely avoid the unexpected
  shrinking of querybuf in the next clientCron (before data arrives to set the peak).
2. the protocol's bulklen does not include `\r\n`, but the allocation and the data we
  read does. so in `clientsCronResizeQueryBuffer`, the `resize` or `querybuf_peak`
  should add these 2 bytes.

the first bug is likely to hit us on large payloads over slow connections, in which case
transferring the payload can take longer and a cron event will be triggered (specifically
if there are not a lot of clients)
2023-04-16 15:49:26 +03:00
Wen Hui 4375b01cc7
Adding missing test cases for substring (#12039)
There is are some missing test cases for SUBSTR command.
These might already be covered by GETRANGE, but no harm in adding them since they're simple.

Added 3 test case.

* start > stop
* start and stop both greater than string length
* when no key is present.
2023-04-13 21:48:26 +03:00
Wen Hui bc82309ceb
Adding missing test cases for linsert command (#12040)
Currently LINSERT command does not have the test case coverage for following scenarios.
1. When key does not exist, it is considered an empty list and no operation is performed.
2. An error is returned when key exists but does not hold a list value.

Added above two missing test cases for linsert command.
2023-04-13 19:05:41 +03:00
Binbin bfec2d700b
Add RM_ReplyWithErrorFormat that can support format (#11923)
* Add RM_ReplyWithErrorFormat that can support format

Reply with the error create from a printf format and arguments.

If the error code is already passed in the string 'fmt', the error
code provided is used, otherwise the string "-ERR " for the generic
error code is automatically added.

The usage is, for example:
    RedisModule_ReplyWithErrorFormat(ctx, "An error: %s", "foo");
    RedisModule_ReplyWithErrorFormat(ctx, "-WRONGTYPE Wrong Type: %s", "foo");

The function always returns REDISMODULE_OK.
2023-04-12 10:11:29 +03:00
Oran Agra 997fa41e99
Attempt to solve MacOS CI issues in GH Actions (#12013)
The MacOS CI in github actions often hangs without any logs. GH argues that
it's due to resource utilization, either running out of disk space, memory, or CPU
starvation, and thus the runner is terminated.

This PR contains multiple attempts to resolve this:
1. introducing pause_process instead of SIGSTOP, which waits for the process
  to stop before resuming the test, possibly resolving race conditions in some tests,
  this was a suspect since there was one test that could result in an infinite loop in that
 case, in practice this didn't help, but still a good idea to keep.
2. disable the `save` config in many tests that don't need it, specifically ones that use
  heavy writes and could create large files.
3. change the `populate` proc to use short pipeline rather than an infinite one.
4. use `--clients 1` in the macos CI so that we don't risk running multiple resource
  demanding tests in parallel.
5. enable `--verbose` to be repeated to elevate verbosity and print more info to stdout
  when a test or a server starts.
2023-04-12 09:19:21 +03:00
Binbin 45b8eea19f
Add ZREMRANGEBYLEX basics tests to fix reply-schemas daily (#12021)
We do have ZREMRANGEBYLEX tests, but it is a stress test
marked with slow tag and then skipped in reply-schemas daily.

In the past, we were able to succeed on a daily, i guess
it was because there were some random command executions,
such as corrupt-dump-fuzzy, which might call it.

These test examples are taken from ZRANGEBYLEX basics test.
2023-04-11 11:14:16 +03:00
Ozan Tezcan e55568edb5
Add RM_RdbLoad and RM_RdbSave module API functions (#11852)
Add `RM_RdbLoad()` and `RM_RdbSave()` to load/save RDB files from the module API. 

In our use case, we have our clustering implementation as a module. As part of this
implementation, the module needs to trigger RDB save operation at specific points.
Also, this module delivers RDB files to other nodes (not using Redis' replication).
When a node receives an RDB file, it should be able to load the RDB. Currently,
there is no module API to save/load RDB files. 


This PR adds four new APIs:
```c
RedisModuleRdbStream *RM_RdbStreamCreateFromFile(const char *filename);
void RM_RdbStreamFree(RedisModuleRdbStream *stream);

int RM_RdbLoad(RedisModuleCtx *ctx, RedisModuleRdbStream *stream, int flags);
int RM_RdbSave(RedisModuleCtx *ctx, RedisModuleRdbStream *stream, int flags);
```

The first step is to create a `RedisModuleRdbStream` object. This PR provides a function to
create RedisModuleRdbStream from the filename. (You can load/save RDB with the filename).
In the future, this API can be extended if needed: 
e.g., `RM_RdbStreamCreateFromFd()`, `RM_RdbStreamCreateFromSocket()` to save/load
RDB from an `fd` or a `socket`. 


Usage:
```c
/* Save RDB */
RedisModuleRdbStream *stream = RedisModule_RdbStreamCreateFromFile("example.rdb");
RedisModule_RdbSave(ctx, stream, 0);
RedisModule_RdbStreamFree(stream);

/* Load RDB */
RedisModuleRdbStream *stream = RedisModule_RdbStreamCreateFromFile("example.rdb");
RedisModule_RdbLoad(ctx, stream, 0);
RedisModule_RdbStreamFree(stream);
```
2023-04-09 12:07:32 +03:00
Slava Koyfman f38aa6bfb7
Disconnect pub-sub subscribers when revoking `allchannels` permission (#11992)
The existing logic for killing pub-sub clients did not handle the `allchannels`
permission correctly. For example, if you:

    ACL SETUSER foo allchannels

Have a client authenticate as the user `foo` and subscribe to a channel, and then:

    ACL SETUSER foo resetchannels

The subscribed client would not be disconnected, though new clients under that user
would be blocked from subscribing to any channels.

This was caused by an incomplete optimization in `ACLKillPubsubClientsIfNeeded`
checking whether the new channel permissions were a strict superset of the old ones.
2023-04-02 16:18:28 +03:00
Jason Elbaum 1f76bb17dd
Reimplement cli hints based on command arg docs (#10515)
Now that the command argument specs are available at runtime (#9656), this PR addresses
#8084 by implementing a complete solution for command-line hinting in `redis-cli`.

It correctly handles nearly every case in Redis's complex command argument definitions, including
`BLOCK` and `ONEOF` arguments, reordering of optional arguments, and repeated arguments
(even when followed by mandatory arguments). It also validates numerically-typed arguments.
It may not correctly handle all possible combinations of those, but overall it is quite robust.

Arguments are only matched after the space bar is typed, so partial word matching is not
supported - that proved to be more confusing than helpful. When the user's current input
cannot be matched against the argument specs, hinting is disabled.

Partial support has been implemented for legacy (pre-7.0) servers that do not support
`COMMAND DOCS`, by falling back to a statically-compiled command argument table.
On startup, if the server does not support `COMMAND DOCS`, `redis-cli` will now issue
an `INFO SERVER` command to retrieve the server version (unless `HELLO` has already
been sent, in which case the server version will be extracted from the reply to `HELLO`).
The server version will be used to filter the commands and arguments in the command table,
removing those not supported by that version of the server. However, the static table only
includes core Redis commands, so with a legacy server hinting will not be supported for
module commands. The auto generated help.h and the scripts that generates it are gone.

Command and argument tables for the server and CLI use different structs, due primarily
to the need to support different runtime data. In order to generate code for both, macros
have been added to `commands.def` (previously `commands.c`) to make it possible to
configure the code generation differently for different use cases (one linked with redis-server,
and one with redis-cli).

Also adding a basic testing framework for the command hints based on new (undocumented)
command line options to `redis-cli`: `--test_hint 'INPUT'` prints out the command-line hint for
a given input string, and `--test_hint_file <filename>` runs a suite of test cases for the hinting
mechanism. The test suite is in `tests/assets/test_cli_hint_suite.txt`, and it is run from
`tests/integration/redis-cli.tcl`.

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2023-03-30 19:03:56 +03:00
Itamar Haber 0c3b8b7e90
Overhauls command summaries and man pages. (#11942)
This is an attempt to normalize/formalize command summaries.

Main actions performed:

* Starts with the continuation of the phrase "The XXXX command, when called, ..." for user commands.
* Starts with "An internal command...", "A container command...", etc... when applicable.
* Always uses periods.
* Refrains from referring to other commands. If this is needed, backquotes should be used for command names.
* Tries to be very clear about the data type when applicable.
* Tries to mention additional effects, e.g. "The key is created if it doesn't exist" and "The set is deleted if the last member is removed."
* Prefers being terse over verbose.
* Tries to be consistent.
2023-03-29 20:48:59 +03:00
Binbin cb17178658
Fix fork done handler wrongly update fsync metrics and enhance AOF_ FSYNC_ALWAYS (#11973)
This PR fix several unrelated bugs that were discovered by the same set of tests
(WAITAOF tests in #11713), could make the `WAITAOF` test hang. 

The change in `backgroundRewriteDoneHandler` is about MP-AOF.
That leftover / old code assumes that we started a new AOF file just now
(when we have a new base into which we're gonna incrementally write), but
the fact is that with MP-AOF, the fork done handler doesn't really affect the
incremental file being maintained by the parent process, there's no reason to
re-issue `SELECT`, and no reason to update any of the fsync variables in that flow.
This should have been deleted with MP-AOF (introduced in #9788, 7.0).
The damage is that the update to `aof_fsync_offset` will cause us to miss an fsync
in `flushAppendOnlyFile`, that happens if we stop write commands in `AOF_FSYNC_EVERYSEC`
while an AOFRW is in progress. This caused a new `WAITAOF` test to sometime hang forever.

Also because of MP-AOF, we needed to change `aof_fsync_offset` to `aof_last_incr_fsync_offset`
and match it to `aof_last_incr_size` in `flushAppendOnlyFile`. This is because in the past we compared
`aof_fsync_offset` and `aof_current_size`, but with MP-AOF it could be the total AOF file will be
smaller after AOFRW, and the (already existing) incr file still has data that needs to be fsynced.

The change in `flushAppendOnlyFile`, about the `AOF_FSYNC_ALWAYS`, it is follow #6053
(the details is in #5985), we also check `AOF_FSYNC_ALWAYS` to handle a case where
appendfsync is changed from everysec to always while there is data that's written but not yet fsynced.
2023-03-29 15:17:05 +03:00
Binbin aa2403ca98
Fix redis-cli cluster test timing issue (#11887)
This test fails sporadically:
```
*** [err]: Migrate the last slot away from a node using redis-cli in tests/unit/cluster/cli.tcl
cluster size did not reach a consistent size 4
```

I guess the time (5s) of wait_for_cluster_size is not enough,
usually, the waiting time for our other tests for cluster
consistency is 50s, so also changing it to 50s.
2023-03-26 08:46:58 +03:00
Binbin 2cc99c692c
Add COMMAND COUNT test to cover reply-schemas-validator test (#11971)
Since we remove the COMMAND COUNT call in sentinel test in #11950,
reply-schemas-validator started reporting this error:
```
WARNING! The following commands were not hit at all:
  command|count
  ERROR! at least one command was not hit by the tests
```

This PR add a COMMAND COUNT test to cover it and also fix some
typos in req-res-log-validator.py
2023-03-26 08:39:04 +03:00
Ozan Tezcan 99e6855453
Add needs:reset for the test (#11959)
Added missing needs:reset tag.

Introduced by #11758
2023-03-23 10:48:45 +02:00
Oran Agra d38df59a3f
fix CLIENT SETINFO to use error replies instead of status replies (#11952) 2023-03-22 14:32:36 +02:00
Binbin 9c4c90c1bf
Replcae sentinel commands sanity check with infrastructure work test (#11950)
The sanity check test intention was to detect that when a command is
added to sentinel it is on purpose. This test is easily broken, like
CLIENT SETINFO introduced by #11758.

We replace it with a test that validates that a few specific commands
are either there or missing (to test the infrastructure works correctly).
2023-03-22 12:18:03 +02:00
Igor Malinovskiy c3b9f2fbd9
Allow clients to report name and version (#11758)
This PR allows clients to send information about the client library to redis
to be displayed in CLIENT LIST and CLIENT INFO.

Currently supports:
`CLIENT [lib-name | lib-ver] <value>`
Client libraries are expected to pipeline these right after AUTH, and ignore
the failure in case they're talking to an older version of redis.

These will be shown in CLIENT LIST and CLIENT INFO as:
* `lib-name` - meant to hold the client library name.
* `lib-ver` - meant to hold the client library version.

The values cannot contain spaces, newlines and any wild ASCII characters,
but all other normal chars are accepted, e.g `.`, `=` etc (same as CLIENT NAME).

The RESET command does NOT clear these, but they can be cleared to the
default by sending a command with a blank string.

Co-authored-by: Oran Agra <oran@redislabs.com>
2023-03-22 08:17:20 +02:00