This PR is based on the commits from PR
https://github.com/valkey-io/valkey/pull/670.
## Description
While exploring hotspots by profiling some benchmark workloads, we
noticed the high cycle ratio of `prepareClientToWrite`, which takes about 9%
of the CPU for the `smembers` and `lrange` commands. After a deep dive into the
code logic, we found we can gain performance by reducing the redundant
calls to `prepareClientToWrite` when addReply* is called several times in a row.
For example: In
https://github.com/valkey-io/valkey/blob/unstable/src/networking.c#L1080-L1082,
`prepareClientToWrite` is called three times in a row.
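As an illustration, a minimal sketch of the idea (`addWritePreparedReplyBulkCBuffer` is a hypothetical helper name assumed for this sketch):
```c
/* Current pattern: every addReply*() call re-runs prepareClientToWrite(). */
addReplyBulkCBuffer(c, buf1, len1); /* calls prepareClientToWrite() */
addReplyBulkCBuffer(c, buf2, len2); /* calls it again */
addReplyBulkCBuffer(c, buf3, len3); /* ... and again */

/* Sketch: run the check once, then emit the consecutive replies through a
 * variant that assumes the client is already prepared to write. */
if (prepareClientToWrite(c) != C_OK) return;
addWritePreparedReplyBulkCBuffer(c, buf1, len1); /* hypothetical helper */
addWritePreparedReplyBulkCBuffer(c, buf2, len2);
addWritePreparedReplyBulkCBuffer(c, buf3, len3);
```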
---------
Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>
Co-authored-by: Lipeng Zhu <lipeng.zhu@intel.com>
Co-authored-by: Wangyang Guo <wangyang.guo@intel.com>
test failure:
```
[err]: Interactive CLI: should find second search result if user presses ctrl+s in tests/integration/redis-cli.tcl
Expected '1' to be equal to '0' (context: type eval line 10 cmd {assert_equal 1 [regexp {\(i-search\): \x1B\[0mk\x1B\[1mey\x1B\[0ms one} $result]} proc ::test)
```
this test (introduced in #12543) depends on the local history file, so
it can fail if there's some match there.
the fix is to use a different history file, and delete it before each
run.
128 is not enough chars when we're talking about commands like RESTORE.
Of course, it's impossible to find the perfect number, but 1024 is
better than 128, and it's not obscenely large.
If `config resetstat` is executed and a defrag is started after it, the
`total_active_defrag_time` will not be 0.
When we start the defrag again, we will skip the following steps:
1. waiting for the defrag to start. (total_active_defrag_time is equal to 0)
2. waiting for the test to complete. (active_defrag_running is equal to 0)
which results in the test failing.
---------
Co-authored-by: oranagra <oran@redislabs.com>
This commit reverts the deletion of the condition `!bc->blocked_on_keys`
that was accidentally introduced by
https://github.com/redis/redis/pull/12817.
In case a blocked-on-keys module client is unblocked, both
`moduleUnblockClientOnKey` and `moduleHandleBlockedClients` are called,
which resulted in `updateStatsOnUnblock` being called twice.
Now that `moduleHandleBlockedClients` doesn't call
`updateStatsOnUnblock` for unblocked module key-blocked clients,
in the unlikely event that the module decides to call `RM_UnblockClient`
on a key-blocked client, we need to call `updateStatsOnUnblock` from
within `moduleBlockedClientTimedOut`, but since
`moduleBlockedClientTimedOut` is not thread-safe we can't call it
directly from within `RM_UnblockClient`.
Added a new flag `blocked_on_keys_explicit_unblock` for that specific
case, which will cause `moduleBlockedClientTimedOut` to be called from
`moduleHandleBlockedClients` (which is only called from the main thread).
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
### Issue
The current implementation of the `FUNCTION FLUSH` command uses
`lua_unref()` to unreference script closures in the Lua VM. However,
invoking `lua_unref()` during lazy free (the `ASYNC` argument) is risky
since it is not thread-safe.
Another issue is that unreferencing with `lua_unref()` does
not trigger GC; this can leave a significant amount of garbage in the
Lua VM, which may never be cleaned up if it is not properly collected.
### Solution
The proposed solution is to completely rebuild the engines, resulting in
a brand new Lua VM.
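A minimal sketch of the rebuild idea, assuming the engine context owns a single `lua_State` (this must run on the main thread, not from lazy free):
```c
#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>

/* lua_close() frees every object owned by the VM in one shot, so no
 * per-closure lua_unref() (and no accumulated GC debt) is needed. */
static lua_State *rebuildLuaEngine(lua_State *old) {
    if (old) lua_close(old);   /* releases all scripts and garbage */
    lua_State *fresh = luaL_newstate();
    luaL_openlibs(fresh);      /* reload the standard libraries */
    return fresh;
}
```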
---------
Co-authored-by: meir <meir@redis.com>
This PR is based on the commits from PR #11747.
In the event of an assertion failure, hide command arguments from the
operator.
In some cases, private client information can be unintentionally exposed
when a redis instance crashes due to an assertion failure.
This commit prevents unintentional client info exposure.
Operators can still access the hidden data, but they must actively
request it.
The client info commands themselves remain unchanged.
### Config
Add a new config `hide-user-data-from-log` to turn this feature on and
off, default off.
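A usage sketch (a standard boolean config, per the on/off description above):
```
# redis.conf: hide user data (command arguments) from logs and crash reports
hide-user-data-from-log yes
```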
---------
Co-authored-by: naglera <anagler123@gmail.com>
Co-authored-by: naglera <58042354+naglera@users.noreply.github.com>
https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/
Due to GitHub removing support for CentOS 7 in GitHub Actions, all
actions utilizing CentOS 7 need to be upgraded; upgrade the centos
image from `centos:7` to `quay.io/centos/centos:stream9`, which is the
official Red Hat CentOS container.
Create some new actions named `old-chain` to verify support for gcc 4.8.
This PR also includes the upgrade of actions/checkout from version 3 to
version 4.
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
* Following this feature, Redis (ROF) may implement a flow that allows hashes to be
dumped directly from RDB to FLASH without parsing. In this scenario, it is still
essential to determine when to update hashes due to expired fields. By writing
and reading the next minimum hash-field expiration before serializing objects
to and from RDB, we can effectively track and expire hash fields without the need
to parse the hash during loading.
Before:
```c
#define RDB_TYPE_HASH_METADATA 22
#define RDB_TYPE_HASH_LISTPACK_EX 23
```
After:
```c
/* Hash with HFEs. Doesn't attach min TTL at start */
#define RDB_TYPE_HASH_METADATA_PRE_GA 22
/* Hash LP with HFEs. Doesn't attach min TTL at start */
#define RDB_TYPE_HASH_LISTPACK_EX_PRE_GA 23
/* Hash with HFEs. Attach min TTL at start */
#define RDB_TYPE_HASH_METADATA 24
/* Hash LP with HFEs. Attach min TTL at start */
#define RDB_TYPE_HASH_LISTPACK_EX 25
```
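A sketch of the write-side idea; `rdbSaveMillisecondTime` is the existing rdb.c helper, while the exact placement and the `minHashFieldExpire` variable are assumptions:
```c
/* For the new RDB types, persist the hash's minimum field-TTL right
 * before the fields, so a loader can track expiration without parsing
 * the whole hash. */
if (rdbSaveMillisecondTime(rdb, minHashFieldExpire) == -1) return -1;
/* ... then serialize the hash fields and their TTLs as before ... */
```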
* Manually tested loading an RDB file from before the change and verified the hash and its HFEs
are as expected.
* Added `subexpires` counter to `redis-check-rdb`
The following PR makes the following changes based on CPU profile
info. The `getNodeByQuery` function represents 8.2% of an overhead of
12.3% when comparing a single-shard cluster with standalone.
Proposed changes:
- inlining keyHashSlot to reduce the overhead of that function call
- reducing duplicate calls to getCommandFlags within getNodeByQuery
The above changes represent an improvement of approximately 5% in the achievable ops/sec.
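A sketch of the first change, e.g. marking the hot helper `static inline` in a header so the per-command call overhead disappears; the body mirrors the hashtag-aware CRC16 slot computation (`crc16()` is Redis's CRC16 from crc16.c):
```c
static inline unsigned int keyHashSlot(const char *key, int keylen) {
    int s, e; /* start and end of the {...} hashtag, if any */
    for (s = 0; s < keylen; s++)
        if (key[s] == '{') break;
    if (s == keylen) return crc16(key, keylen) & 0x3FFF;
    for (e = s + 1; e < keylen; e++)
        if (key[e] == '}') break;
    if (e == keylen || e == s + 1) return crc16(key, keylen) & 0x3FFF;
    /* Hash only the hashtag, so tagged keys land on the same slot. */
    return crc16(key + s + 1, e - s - 1) & 0x3FFF;
}
```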
Co-authored-by: filipecosta90 <filipecosta.90@gmail.com>
* INFO command: rename `hashes_with_expiry_fields` to `subexpiry`
* INFO command: rename `expired_hash_fields` to `expired_subkeys`
* Fix the `expired_subkeys` statistic to also count lazily expired fields
* Remove leftover TODO comments in TCL
* Fix potential flaky test of rdb load of hash-field-expiration
1. Update the macOS version in GitHub Actions due to the deprecation of macOS 11,
because the macOS 11 runner image will be removed by 06/28/2024.
2. CentOS has reached EOL, so we need to update the yum repo to the new url.
If we run FLUSHALL when the 'save' config is set, and there's a fork
child doing BGSAVE, there's a chance the child is already finished, and
the parent process is unaware of it. In that case the child will not get
the kill signal and will finish successfully, but the parent process
thinks it killed it and will reset the dirty counter to 0; then the
backgroundSaveDoneHandlerDisk method can set the dirty counter to a
negative value.
getKeysUsingKeySpecs had the range check AFTER the allocation of the
keys buffer, which could lead to an OOM panic when invalid arguments
trigger an overflow.
The allocated memory is only used after the range check, so there's no
risk of buffer overrun.
The OOM panic can happen on 32-bit builds, or 64-bit builds running on
systems with less than 4GB of RAM, and is reachable via COMMAND
GETKEYSANDFLAGS and ACL key name validation.
There was a wrong preliminary assumption that we can optionally provide a
vector of arguments larger than count.
This error-prone approach led to an actual error in that case.
This PR enforces that the vector of arguments matches count.
Also fixed a flaky HRANDFIELD test.
In certain situations, we might generate a large number of propagates
(e.g., multi/exec, Lua script, or a single command generating tons of
propagations) within an event loop.
During the process of propagating to a replica, if the replica is
disconnected (marked as CLIENT_CLOSE_ASAP) due to exceeding the output
buffer limit, we should remove its reference to the global replication
buffer to avoid the global replication buffer being unable to be
properly trimmed due to being referenced.
---------
Co-authored-by: oranagra <oran@redislabs.com>
H(P)EXPIREAT command might delete fields in case the absolute time is in the
past. Those HDELs need to be propagated as well.
In general, as we need to propagate H(P)EXPIRE(AT) command to the replica, each
field that is mentioned in the command should be categorized into one of the four
options:
1. Managed to update field’s expiration time - propagate it to replica as part
of the HPEXPIREAT command.
2. Deleted the field because the time is in the past - propagate also HDEL command
to delete the field and remove the field from the propagated HPEXPIREAT.
3. Condition not met for the field - Remove the field from the propagated
HPEXPIREAT command.
4. Field does not exist - Remove the field from the propagated HPEXPIREAT command.
If none of the provided fields match option number 1, then avoid also propagating
the HPEXPIREAT command to the replica.
This approach is aligned with the EXPIRE command:
if a given key has already expired, then DEL will be propagated instead of the
EXPIRE command. If the condition is not met, then the command will be rejected. Otherwise,
the EXPIRE command will be propagated for the given key.
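A hypothetical example combining the options above:
```
Master executes:  HPEXPIREAT myhash 123 FIELDS 3 f1 f2 f3
                  (time 123 is in the past; f1 exists, f2 does not exist,
                   f3 fails the NX/XX/GT/LT condition)
Replica receives: HDEL myhash f1
                  (HPEXPIREAT itself is dropped, since no field matched option 1)
```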
Considerations for the selected implementation of the HRANDFIELD & HFE feature:
HRANDFIELD might access any of the fields in the hash, and some of them
might be expired. So the implementation of HRANDFIELD along with HFEs
might take one of two options:
1. Expire hash-fields before diving into handling HRANDFIELD.
2. Refine HRANDFIELD cases to deal with expired fields.
Regarding the first option, as a reference, the RANDOMKEY command also
declares O(1) complexity, yet might be stuck in a very long (but not infinite)
loop trying to find non-expired keys. Furthermore, RANDOMKEY also evicts expired
keys along the way, even though it is categorized as a read-only command. Note
that the case of HRANDFIELD is more lightweight than RANDOMKEY, since
HFEs have a much more effective and aggressive active-expiration for fields behind the scenes.
The second option introduces additional implementation complexity to HRANDFIELD.
We could further refine HRANDFIELD cases to differentiate between scenarios
with many expired fields versus few expired fields, and adjust based on the
percentage of expired fields. However, this approach could still lead to long
loops or necessitate expiring fields before selecting them. For the “lightweight”
cases it is also expected to have a lightweight expiration.
Considering the pros and cons, and the fact that HRANDFIELD is an infrequent
command (particularly with HFEs) and the fact we have effective active-expiration
behind for hash-fields, it is better to keep it simple and choose option number 1.
Other changes:
* Don't mark the command dirty from within hashTypeExpire(). It caused the
read-only HRANDFIELD command to be accidentally propagated (this flag
should be indicated at a higher level, by the command functions).
* Align `hashTypeExpireIfNeeded()` and `hashTypeGetValue()` more closely
with the `expireIfNeeded()` logic of the keyspace.
Add two new debug commands for outputting scripts:
1. `DEBUG SCRIPT LIST`
Output all scripts.
2. `DEBUG SCRIPT <sha1>`
Output a specific script.
Closes #3846
Related to #12647
1. Make clear that `RM_Replicate` and `RM_ReplicateVerbatim` are
not thread-safe.
2. Make clear that `RM_Replicate` and `RM_ReplicateVerbatim` are always
wrapped into MULTI in any case.
As part of the HFE feature, the logic of rdbLoadObject() was wrongly
modified to treat an empty hash loaded from RDB as a hash whose
fields all got expired. Rolled back to the `emptykey` logic. This function
should blindly load all fields, expired or not. Manually verified.
A few more minor fixes:
- Remove the hash double-check of emptyKey
- Fix from `sds` to `hfield` in rdbLoadObject() (not really a bug; both
are of type char*)
- Revert rdbLoadObject() code to get dbid instead of db
Currently, HFE commands reply with an empty array if the key does not
exist. Though, a non-existing key and an empty key are the same thing:
it means the fields given in the command do not exist in the empty key.
So, replying with an array of 'no field' error codes (-2) suits
Redis logic better. e.g. Similarly, `hmget` returns an array of nulls if the
key does not exist.
After this PR:
```
127.0.0.1:6379> hpersist missingkey fields 2 a b
1) (integer) -2
2) (integer) -2
```
I reviewed the `XREAD` command syntax:
```
XREAD [COUNT count] [BLOCK milliseconds] STREAMS key [key ...] id [id ...]
```
Here’s the structure for `XREAD`:
```json
"arguments": [
{
"token": "COUNT",
"name": "count",
"type": "integer",
"optional": true
},
{
"token": "BLOCK",
"name": "milliseconds",
"type": "integer",
"optional": true
},
{
"name": "streams",
"token": "STREAMS",
"type": "block",
"arguments": [
{
"name": "key",
"type": "key",
"key_spec_index": 0,
"multiple": true
},
{
"name": "ID",
"type": "string",
"multiple": true
}
]
}
]
```
Now, consider the `HEXPIRE` syntax:
```
HEXPIRE key seconds [NX | XX | GT | LT] FIELDS numfields field [field ...]
```
Since the `FIELDS` token functions similarly to `STREAMS`, and given that `STREAMS` is defined as a block, I believe the `FIELDS` in `HEXPIRE` should also be defined as a block.
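Under that reading, a sketch of how the `FIELDS` portion of `HEXPIRE` could be declared, mirroring the `STREAMS` block above (the names here are illustrative, not the final schema):
```json
{
    "name": "fields",
    "token": "FIELDS",
    "type": "block",
    "arguments": [
        {
            "name": "numfields",
            "type": "integer"
        },
        {
            "name": "field",
            "type": "string",
            "multiple": true
        }
    ]
}
```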
When a hash field expires, we will send a new `hexpired` notification.
It mainly includes the following three cases:
1. When field expired by active expiration.
2. When field expired by lazy expiration.
3. When the user uses the `h(p)expire(at)` command, the user will also
get a `hexpired` notification if the field expires during the command.
## Improvement
1. Now if more than one field expires in the hmget command, we will
send only one `hexpired` notification.
2. When a field with TTL is deleted by commands like hdel without
updating the global DS, active expire will not send a notification.
---------
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Moti Cohen <moti.cohen@redis.com>
Reserve 2 bits out of hash-field expiration time (`EB_EXPIRE_TIME_MAX`)
for possible future lightweight indexing/categorizing of fields. It can
be achieved by hacking HFE as follows:
```
HPEXPIREAT key [ 2^47 + USER_INDEX ] FIELDS numfields field [field …]
```
Redis will also need to expose some kind of `HEXPIRESCAN` and `HEXPIRECOUNT`
commands for this idea. Yet to be better defined.
`HFE_MAX_ABS_TIME_MSEC` constraint must be enforced only at API level.
Internally, the expiration time can be up to `EB_EXPIRE_TIME_MAX` for
future readiness.
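A sketch of the two bounds; `EB_EXPIRE_TIME_MAX` is 2^48 - 1 (281474976710655), and deriving the API constant by shifting out the 2 reserved bits is an assumption of this sketch:
```c
#define EB_EXPIRE_TIME_MAX    ((1ULL << 48) - 1)          /* internal limit */
#define HFE_MAX_ABS_TIME_MSEC (EB_EXPIRE_TIME_MAX >> 2)   /* API-level limit */
```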
We need to be careful if called by modules, since the modules API allows
opening and closing a key handle. We don't want to invalidate the handle
underneath.
* hashTypeExists(), hashTypeGetValueObject() - will return the logical
state of the field. A flag will indicate noExpire.
* RM_HashGet() - Will get NULL if the field expired. Fields won’t be
deleted.
* RM_ScanKey() - might return 0 items if all fields got expired. Fields
won’t be deleted.
* RM_HashSet() - If set, then override expired field. If delete, we can
either delete or leave it to active-expiration. XX/NX - logically
correct (Verify with tests).
Nice to have (not implemented):
* RedisModule_CloseKey() - We could locally active-expire up to 100 items.
Note:
The length reported to modules, just like in Redis, will be wrong (it counts expired fields).
In the old test, we gave `hexpire` a very short expire time, which
caused the field to be deleted by the time the `hpersist` command was
executed. As a result, the `hpersist` command wasn't able to emit a
`hpersist` notification, leaving the test stuck.
fail CI:
https://github.com/redis/redis/actions/runs/9342175887/job/25709886471
1. Don't allow the HEXPIRE/HEXPIREAT/HPEXPIRE/HPEXPIREAT commands' expire
parameter to be negative.
2. Remove dead code reported by Coverity:
when `unit` is not `UNIT_SECONDS`, the second `if (expire > (long long)
EB_EXPIRE_TIME_MAX)` is dead code.
```c
# t_hash.c
2988 /* Check expire overflow */
cond_at_most: Condition expire > 281474976710655LL, taking false branch. Now the value of expire is at most 281474976710655.
2989 if (expire > (long long) EB_EXPIRE_TIME_MAX) {
2990 addReplyErrorExpireTime(c);
2991 return;
2992 }
2994 if (unit == UNIT_SECONDS) {
2995 if (expire > (long long) EB_EXPIRE_TIME_MAX / 1000) {
2996 addReplyErrorExpireTime(c);
2997 return;
2998 }
2999 expire *= 1000;
3000 } else {
at_most: At condition expire > 281474976710655LL, the value of expire must be at most 281474976710655.
dead_error_condition: The condition expire > 281474976710655LL cannot be true.
3001 if (expire > (long long) EB_EXPIRE_TIME_MAX) {
CID 494223: (#1 of 1): Logically dead code (DEADCODE)
dead_error_begin: Execution cannot reach this statement: addReplyErrorExpireTime(c);.
3002 addReplyErrorExpireTime(c);
3003 return;
3004 }
3005 }
```
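A sketch of fix #1, reusing the `addReplyErrorExpireTime` reply seen in the excerpt above (the exact placement of the check is an assumption):
```c
/* Reject negative expiration parameters up front. */
if (expire < 0) {
    addReplyErrorExpireTime(c);
    return;
}
```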
---------
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
In #13224, we found a crash during cluster slot migration but didn't know
why. So I checked all the `return C_OK` sites in processCommand to see if we are
missing some duration reset, and found this.
This fix is like #12247: when we reject the command, we should reset the
duration. I tested it and verified it can fix #13224.
So the reason may be that we are using stream blocking, and then during the
slot migration, it got a redirect and then crashed the server.
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
RM_ScanKey() was overlooked while introducing hash field expiration.
An assert is triggered when it is called on a hash key with
OBJ_ENCODING_LISTPACK_EX encoding.
I've changed the code to handle the listpackex encoding properly.
In the `ebuckets` structure, on `ebExpire()`, if the callback indicated to update
the item's expiration time and return it back to ebuckets (`ACT_UPDATE_EXP_ITEM`),
then the returned value `nextExpireTime` should be updated, if needed.
The invalid value of `nextExpireTime` was changed from 0 to `EB_EXPIRE_TIME_INVALID`.
The crash happens when the user that triggers the permission changes
should themselves be affected (and eventually disconnected).
To handle such a scenario, we should use the
`CLIENT_CLOSE_AFTER_COMMAND` flag.
This commit encapsulates all the places that should be handled in the
same way in `deauthenticateAndCloseClient`.
Also:
* bugfix: during the ACL LOAD we ignore clients that are marked as
`CLIENT MASTER`
**Related issue**
https://github.com/redis/redis/issues/13219
**Motivation**
Currently we have to manually update the all_tests variable when
introducing new test files.
**Modification**
I have modified it to list test files dynamically. Instead of adding
all test files, I have modified it to only add
test files from the following 4 paths:
- unit
- unit/type
- unit/cluster
- integration
so that it doesn't deviate too much from what we already do
**Result**
- dynamically list test files to all_tests variable
- close issue https://github.com/redis/redis/issues/13219
**Additional information**
- removed the `list-common.tcl` file and added a
`generate_largevalue_test_array` proc in `util.tcl`, because
`list-common.tcl` is not a test file
- There is an order dependency, so I added code to the "Is a ziplist
encoded Hash promoted on big payload?" test that resets
hash-max-listpack-value to the default (64).
---------
Signed-off-by: jonghoonpark <dev@jonghoonpark.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
## Background
This PR introduces support for field-level expiration in Redis hashes. Previously, Redis supported expiration only at the key level, but this enhancement allows setting expiration times for individual fields within a hash.
## New commands
* HEXPIRE
* HEXPIREAT
* HEXPIRETIME
* HPERSIST
* HPEXPIRE
* HPEXPIREAT
* HPEXPIRETIME
* HPTTL
* HTTL
## Short example
from @moticless
```sh
127.0.0.1:6379> hset myhash f1 v1 f2 v2 f3 v3
(integer) 3
127.0.0.1:6379> hpexpire myhash 10000 NX fields 2 f2 f3
1) (integer) 1
2) (integer) 1
127.0.0.1:6379> hpttl myhash fields 3 f1 f2 f3
1) (integer) -1
2) (integer) 9997
3) (integer) 9997
127.0.0.1:6379> hgetall myhash
1) "f3"
2) "v3"
3) "f2"
4) "v2"
5) "f1"
6) "v1"
... after 10 seconds ...
127.0.0.1:6379> hgetall myhash
1) "f1"
2) "v1"
127.0.0.1:6379>
```
## Expiration strategy
1. Active expiration
Redis periodically performs active expiration and deletion of hash keys that contain expired fields, with a maximum attempt limit.
2. Lazy expiration
When a client touches fields within a hash, Redis checks if the fields are expired. If a field is expired, it will be deleted. However, we do not delete expired fields during a traversal; we implicitly skip over them.
## RDB changes
Add two new RDB types, `RDB_TYPE_HASH_METADATA` and `RDB_TYPE_HASH_LISTPACK_EX`.
## Notification
1. Add `hpersist` notification for the `HPERSIST` command.
2. Add `hexpire` notification for the `HEXPIRE`, `HEXPIREAT`, `HPEXPIRE` and `HPEXPIREAT` commands.
## Internal
1. Add a new data structure, `ebuckets`, which is used to store TTLs and keys, enabling quick retrieval of keys based on TTL.
2. Add a new sds-like data structure, `mstr`, which is used to store a string with TTL.
This work was done by @moticless, @tezc, @ronen-kalish, @sundb; I just released it.
* For the replica's sake, rewrite the commands `H*EXPIRE*`, `HSETF`, `HGETF` to
use absolute unix time in msec.
* On active-expiration of field, propagate HDEL to replica
(`propagateHashFieldDeletion()`)
* On lazy-expiration, propagate HDEL to replica (`hashTypeGetValue()`
now calls `hashTypeDelete()`. It also takes care to call
`propagateHashFieldDeletion()`).
* Fix the `H*EXPIRE*` commands such that if they get the flag `LT` and the field
doesn't have any expiration, then it will be considered a valid
condition.
Note: replicas don't do any active expiration, and should avoid lazy
expiration. On `hashTypeGetValue()` they don't check expiration (as long
as the master didn't request to delete the field, it is valid).
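A sketch of the rewrite-before-propagation pattern, mirroring what EXPIRE does (the TTL sitting at argument index 2, and `ttl_ms` being in scope, are assumptions):
```c
/* Replace the relative TTL argument with an absolute unix time in msec,
 * so replicas and the AOF see a deterministic command. */
robj *abs = createStringObjectFromLongLong(mstime() + ttl_ms);
rewriteClientCommandArgument(c, 2, abs);
decrRefCount(abs);
```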
TODO:
* Attach `dbid` to HASH metadata. See
[here](https://github.com/redis/redis/pull/13209#discussion_r1593385850)
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
In the last step of hscan, while replying to the client, we assume all items
in the result list are keys, which are mstr instances. Though, there
might be values, which are sds instances.
Added a check to avoid calling mstrlen() for value objects.
To reproduce:
```
127.0.0.1:6379> hset myhash1 a 11111111111111111111111111111111111111111111111111111111111111111
(integer) 0
127.0.0.1:6379> hscan myhash1 0
1) "0"
2) 1) "a"
2) "11111111111111111111111111111111111111111111111111111111111111111\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
```
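A sketch of the added check; the loop shape and the `isField` toggle are assumptions, the point being that only fields are mstr instances while values must be measured with `sdslen()`:
```c
/* Reply entries alternate field, value, field, value, ... */
listIter li;
listNode *node;
listRewind(keys, &li);
int isField = 1;
while ((node = listNext(&li)) != NULL) {
    sds item = listNodeValue(node);
    size_t len = isField ? mstrlen(item) : sdslen(item);
    addReplyBulkCBuffer(c, item, len);
    isField = !isField;
}
```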
Added `--keystats` and `--keystats-samples <n>` commands.
The new commands combine memkeys and bigkeys with additional
distribution data.
We often run memkeys and bigkeys one after the other; it is
convenient to just have one command.
Distribution and top 10 key sizes are useful when we have multiple keys
taking a lot of memory.
Like for memkeys and bigkeys, we can use `-i <n>` and Ctrl-C to
interrupt the scan loop. It will still show the statistics on the keys
sampled so far.
We can use two new optional parameters:
- `--cursor <n>` to continue from the last cursor after an interruption
of the scan.
- `--top <n>` to change the number of top key sizes (10 by default).
Implemented a fix for the `--memkeys-samples 0` issue in order to use
`--keystats-samples 0`.
Used hdr_histogram for the key size distribution.
For key length, hdr_histogram seemed overkill, so preset ranges were
used.
The memory used by keystats with hdr_histogram is around 7MB (compared
to 3MB for memkeys or bigkeys).
Execution time is roughly equivalent to the memkeys plus bigkeys
time. Each scan loop runs more commands (key type, key size, key
length/cardinality).
We can redirect the output to a file. In that case, no color or text
refresh will happen.
Limitation:
- As the information printed during the loop is refreshed (moving cursor
up), stderr and information not fitting in the terminal window (width or
height) might create some refresh issues.
Comments:
- config.top_sizes_limit could be used globally like config.count, but
it is passed as parameter to be consistent with config.memkeys_samples.
- Not sure if we should move some utility functions to cli-common.c.
Got some tips and help from @ofirluzon.
---------
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: debing.sun <debing.sun@redis.com>
Added hashes_with_expiry_fields.
Optimally it would be better to have a statistic that counts all fields
with expiry. But that requires careful logic and computation, and
deep dives into listpacks and hashes. This statistic is trivial to achieve
and is reflected by the global HFE DS, which has built-in enumeration of all the
hashes that are registered in it.
On active-expire of buckets, at function `ebExpire()` ->
`ebSegExpire()`, if the callback `onExpireItem()` indicated to stop, the
iterator (iter) was wrongly already updated to point to the next item.
In turn, the segment would be updated without the current item.
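A sketch of the corrected loop shape (names other than `onExpireItem()` are assumptions): advance the iterator only after the callback decides to continue, so a stop verdict leaves `iter` at the unconsumed item:
```c
while (iter != NULL) {
    ExpireAction act = onExpireItem(iter->item, userCtx);
    if (act == ACT_STOP) break;  /* keep the current item in the segment */
    iter = iter->next;           /* advance only when continuing */
}
```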