Commit Graph

1475 Commits

Author SHA1 Message Date
Oran Agra 472d8a0df5 Prevent pattern matching abuse (CVE-2024-31228) 2024-10-08 20:55:44 +03:00
Oran Agra 3a2669e8ae Fix lua bit.tohex (CVE-2024-31449)
INT_MIN value must be explicitly checked, and cannot be negated.
2024-10-08 20:55:44 +03:00
Moti Cohen 5f28bd96db
Fix race in HFE tests (#13563)
Test 1 - give more time for expiration
Test 2 - Evaluate expiration time boundaries [+1,+2] before setting expiration [+1]
Test 3 - Avoid race on test HFEs propagated to replica
2024-09-23 10:30:29 +03:00
Moti Cohen 3a3cacfefa
Extend modules API to read also expired keys and subkeys (#13526)
The PR extends `RedisModule_OpenKey`'s flags to include
`REDISMODULE_OPEN_KEY_ACCESS_EXPIRED`, which allows to access expired
keys.

It also allows to access expired subkeys. Currently relevant only for
hash fields
and has its impact on `RM_HashGet` and `RM_Scan`.
2024-09-19 20:47:00 +03:00
Moti Cohen 9a89e32a95
HFE - Fix key ref by the hash on RENAME/MOVE/SWAPDB/RESTORE (#13539)
If the hash previously had HFEs (hash-fields with expiration) but later no longer
does, the key ref in the hash might become outdated after a MOVE, COPY,
RENAME or RESTORE operation. These commands maintain the key ref only
if HFEs are present. That is, we can only be sure that key ref is valid as long as the
hash has HFEs.
2024-09-12 12:40:12 +03:00
Oran Agra 610eb26c11
RED-129256, Fix TOUCH command from script in no-touch mode (#13512)
When a client in no-touch mode issues a TOUCH command on a key, the
key’s access time should be updated, but in scripts, and module's
RM_Call, it isn’t updated.
Command proc should be matched to the executing client, not the current
client.

Co-authored-by: Udi Ron <udi@speedb.io>
2024-09-12 11:33:26 +03:00
Ozan Tezcan ac03e3721d
Fix flaky replication tests (#13518)
#13495 introduced a change to reply -LOADING while flushing existing db on a replica. Some of our tests are
 sensitive to this change and do no expect -LOADING reply.

Fixing a couple of tests that fail time to time.
2024-09-08 12:54:01 +03:00
Moti Cohen 569584d463
HFE - Simplify logic of HGETALL command (#13425) 2024-09-05 12:48:44 +03:00
debing.sun ea3e8b79a1
Introduce reusable query buffer for client reads (#13488)
This PR is based on the commits from PR
https://github.com/valkey-io/valkey/pull/258,
https://github.com/valkey-io/valkey/pull/593,
https://github.com/valkey-io/valkey/pull/639

This PR optimizes client query buffer handling in Redis by introducing
a reusable query buffer that is used by default for client reads. This
reduces memory usage by ~20KB per client by avoiding allocations for
most clients using short (<16KB) complete commands. For larger or
partial commands, the client still gets its own private buffer.

The primary changes are:

* Adding a reusable query buffer `thread_shared_qb` that clients use by
default.
* Modifying client querybuf initialization and reset logic.
* Freeing idle client query buffers when empty to allow reuse of the
reusable query buffer.
* Master client query buffers are kept private as their contents need to
be preserved for replication stream.
* When nested commands is executed, only the first user uses the reuse
buffer, and subsequent users will still use the private buffer.

In addition to the memory savings, this change shows a 3% improvement in
latency and throughput when running with 1000 active clients.

The memory reduction may also help reduce the need to evict clients when
reaching max memory limit, as the query buffer is the main memory
consumer per client.

This PR is different from https://github.com/valkey-io/valkey/pull/258
1. When a client is in the mid of requiring a reused buffer and
returning it, regardless of whether the query buffer has changed
(expanded), we do not update the reused query buffer in the middle, but
return the reused query buffer (expanded or with data remaining) or
reset it at the end.
2. Adding a new thread variable `thread_shared_qb_used` to avoid
multiple clients requiring the reusable query buffer at the same time.

---------

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Madelyn Olson <matolson@amazon.com>
Co-authored-by: Uri Yagelnik <uriy@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: oranagra <oran@redislabs.com>
2024-09-04 19:10:40 +08:00
Ozan Tezcan ea05c6ac47
Fix RM_RdbLoad() to enable AOF after loading is completed (#13510)
RM_RdbLoad() disables AOF temporarily while loading RDB.
Later, it does not enable it back as it checks AOF state (disabled by then) 
rather than AOF config parameter.

Added a change to restart AOF according to config parameter.
2024-09-04 11:11:04 +03:00
Meir Shpilraien (Spielrein) d3d94ccf2e
Added new defrag API to allocate and free raw memory. (#13509)
All the defrag allocations API expects to get a value and replace it, leaving the old value untouchable.
In some cases a value might be shared between multiple keys, in such cases we can not simply replace
it when the defrag callback is called.

To allow support such use cases, the PR adds two new API's to the defrag API:

1. `RM_DefragAllocRaw` - allocate memory base on a given size.
2. `RM_DefragFreeRaw` - Free the given pointer.

Those API's avoid using tcache so they operate just like `RM_DefragAlloc` but allows the user to split
the allocation and the memory free operations into two stages and control when those happen.

In addition the PR adds new API to allow the module to receive notifications when defrag start and end: `RM_RegisterDefragCallbacks`
Those callbacks are the same as `RM_RegisterDefragFunc` but promised to be called and the start
and the end of the defrag process.
2024-09-03 15:03:19 +03:00
Zihao Lin 6ceadfb580
Improve GETRANGE command behavior (#12272)
Fixed the issue about GETRANGE and SUBSTR command
return unexpected result caused by the `start` and `end` out of
definition range of string.

---
## break change
Before this PR, when negative `end` was out of range (i.e., end <
-strlen), we would fix it to 0 to get the substring, which also resulted
in the first character still being returned for this kind of out of
range.
After this PR, we ensure that `GETRANGE` returns an empty bulk when the
negative end index is out of range.

Closes #11738

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2024-08-20 12:34:43 +08:00
judeng 7f0a7f0a69
improve performance for scan command when matching data type (#12395)
Move the TYPE filtering to the scan callback so that avoided the
`lookupKey` operation. This is the follow-up to #12209 . In this thread
we introduced two breaking changes:
1. we will not attempt to do lazy expire (delete) a key that was
filtered by not matching the TYPE (like we already do for MATCH
pattern).
2. when the specified key TYPE filter is an unknown type, server will
reply a error immediately instead of doing a full scan that comes back
empty handed.
2024-08-20 10:47:51 +08:00
debing.sun 2b88db90aa
Fix incorrect lag due to trimming stream via XTRIM command (#13473)
## Describe
When using the `XTRIM` command to trim a stream, it does not update the
maximal tombstone (`max_deleted_entry_id`). This leads to an issue where
the lag calculation incorrectly assumes that there are no tombstones
after the consumer group's last_id, resulting in an inaccurate lag.

The reason XTRIM doesn't need to update the maximal tombstone is that it
always trims from the beginning of the stream. This means that it
consistently changes the position of the first entry, leading to the
following scenarios:

1) First entry trimmed after maximal tombstone:
If the first entry is trimmed to a position after the maximal tombstone,
all tombstones will be before the first entry, so they won't affect the
consumer group's lag.

2) First entry trimmed before maximal tombstone:
If the first entry is trimmed to a position before the maximal
tombstone, the maximal tombstone will not be updated.

## Solution
Therefore, this PR optimizes the lag calculation by ensuring that when
both the consumer group's last_id and the maximal tombstone are behind
the first entry, the consumer group's lag is always equal to the number
of remaining elements in the stream.

Supplement to PR https://github.com/redis/redis/pull/13338
2024-08-16 23:13:31 +08:00
debing.sun b94b714f81
Fix error message for XREAD command with wrong parameter (#13474)
Fixed a missing from #13117.
When the number of streams is incorrect, the error message for `XREAD`
needs to include the '+' symbol.
2024-08-14 21:40:43 +08:00
Moti Cohen 806459f481
On HDEL last field with expiry, update global HFE DS (#13470)
Hash field expiration is optimized to avoid frequent update global HFE DS for
each field deletion. Eventually active-expiration will run and update or remove
the hash from global HFE DS gracefully. Nevertheless, statistic "subexpiry"
might reflect wrong number of hashes with HFE to the user if HDEL deletes
the last field with expiration in hash (yet there are more fields without expiration).

Following this change, if HDEL the last field with expiration in the hash then
take care to remove the hash from global HFE DS as well.
2024-08-11 16:39:03 +03:00
debing.sun 6f0ddc9d92
Pass extensions to node if extension processing is handled by it (#13465)
This PR is based on the commits from PR
https://github.com/valkey-io/valkey/pull/52.
Ref: https://github.com/redis/redis/pull/12760
Close https://github.com/redis/redis/issues/13401
This PR will replace https://github.com/redis/redis/pull/13449

Fixes compatibilty of Redis cluster (7.2 - extensions enabled by
default) with older Redis cluster (< 7.0 - extensions not handled) .

With some of the extensions enabled by default in 7.2 version, new nodes
running 7.2 and above start sending out larger clusterbus message
payload including the ping extensions. This caused an incompatibility
with node running engine versions < 7.0. Old nodes (< 7.0) would receive
the payload from new nodes (> 7.2) would observe a payload length
(totlen) > (estlen) and would perform an early exit and won't process
the message.

This fix does the following things:
1. Always set `CLUSTERMSG_FLAG0_EXT_DATA`, because during the meet
phase, we do not know whether the connected node supports ext data, we
need to make sure that it knows and send back its ext data if it has.
2. If another node does not support ext data, we will not send it ext
data to avoid the handshake failure due to the incorrect payload length.

Note: A successful `PING`/`PONG` is required as a sender for a given
node to be marked as `CLUSTERMSG_FLAG0_EXT_DATA` and then extensions
message
will be sent to it. This could cause a slight delay in receiving the
extensions message(s).

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
2024-08-08 10:48:03 +08:00
debing.sun e750c619b2
Fix some test failures caused by key being deleted due to premature expiration (#13453)
1. Fix fuzzer test failure when the key was deleted due to expiration
before sending random traffic for the key.

After HFE, when all fields in a hash are expired, the hash might be
deleted due to expiration.

If the key was expired in the mid of `RESTORE` command and sending rand
trafic, `fuzzer` test will fail in the following code because the 'TYPE
key' will return `none` and then throw an exception because it cannot be
found in `$commands`

94b9072e44/tests/support/util.tcl (L712-L713)

This PR adds a `None` check for the reply of `KEY TYPE` command and adds
a print of `err` to avoid false positives in the future.

failed CI:
https://github.com/redis/redis/actions/runs/10127334121/job/28004985388

2. Fix the issue where key was deleted due to expiration before the
`scan.scan_key` command could execute, caused by premature enabling of
`set-active-expire`.

failed CI:
https://github.com/redis/redis/actions/runs/10153722715/job/28077610552

---------

Co-authored-by: oranagra <oran@redislabs.com>
2024-07-31 08:15:39 +08:00
debing.sun 93fb83b4cb
Fix incorrect lag field in XINFO when tombstone is after the last_id of consume group (#13338)
Fix #13337

Ths PR fixes fixed two bugs that caused lag calculation errors.
1. When the latest tombstone is before the first entry, the tombstone
may stil be after the last id of consume group.
2. When a tombstone is after the last id of consume group, the group's
counter will be invalid, we should caculate the entries_read by using
estimates.
2024-07-30 22:31:31 +08:00
Oran Agra 447ce11a64
solve race conditions in tests (#13433)
[exception]: Executing test client: ERR FAILOVER target replica is not
online.. ERR FAILOVER target replica is not online.
    while executing
"$node_0 failover to $node_1_host $node_1_port"
    ("uplevel" body line 16)
    invoked from within
"uplevel 1 $code"
    (procedure "test" line 58)
    invoked from within
"test {failover command to specific replica works} {

[err]: client evicted due to percentage of maxmemory in
tests/unit/client-eviction.tcl
Expected 33622 >= 220200 && 33622 < 440401 (context: type eval line 17
cmd {assert {$tot_mem >= $n && $tot_mem < $maxmemory_clients_actual}}
proc ::test)
2024-07-22 10:11:56 +03:00
Oran Agra a331978583
Fix external test hang in redis-cli test when run in a certain order (#13423)
When the tests are run against an external server in this order:
`--single unit/introspection --single unit/moduleapi/blockonbackground
--single integration/redis-cli`
the test would hang when the "ASK redirect test" test attempts to create
a listening socket (it fails, and then redis-cli itself hangs waiting
for a non-responsive socket created by the introspection test).

the reasons are:
1. the blockedbackground test includes util.tcl and resets the
`::last_port_attempted` variable
2. the test in introspection didn't close the listening server, so it's
still alive.
3. find_available_port doesn't properly detect the busy port, and it
thinks that the port is free even though it's busy.

fixing all 3 of these problems, even though fixing just one would be
enough to let the test pass.
2024-07-17 15:42:28 +03:00
debing.sun 88af96c7a2
Trigger Lua GC after script loading (#13407)
Nowdays we do not trigger LUA GC after loading lua script. This means
that when a large number of scripts are loaded, such as when functions
are propagating from the master to the replica, if the LUA scripts are
never touched on the replica, the garbage might remain there
indefinitely.

Before this PR, we would share a gc_count between scripts and functions.
This means that, under certain circumstances, the GC trigger for scripts
and functions was not fair.
For example, loading a large number of scripts followed by a small
number of functions could result in the functions triggering GC.
In this PR, we assign a unique `gc_count` to each of them, so the GC
triggers between them will no longer affect each other.

on the other hand, this PR will to bring regession for script loading
commands(`FUNCTION LOAD` and `SCRIPT LOAD`), but they are not hot path,
we can ignore it, and it will be replaced
https://github.com/redis/redis/pull/13375 in the future.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
2024-07-16 09:28:47 +08:00
debing.sun d39548c854
Avoid starting defrag after config resetstat for defrag test (#13399)
If `config resetstat` is executed and a defrag is started after it, the
`total_active_defrag_time` will not be 0.
When we start the defrag again, we will skip the following steps:
1. waiting for the defrag to start. (s total_active_defrag_time is equal
0)
2. waiting for the test to complete. (active_defrag_running is euqal 0)
which result in the test failed.

---------

Co-authored-by: oranagra <oran@redislabs.com>
2024-07-12 10:07:09 +08:00
debing.sun ad9277238f
Disable aof for `Scripts do not block on waitaof` test (#13409)
The`Scripts do not block on waitaof` test keeps failing in the external
CI:
https://github.com/redis/redis/actions/runs/9875670156/job/27272955900?pr=13407

The main reason is that if AOF happens to be enabled, we get an acklocal
of 1.
2024-07-11 20:25:57 +08:00
guybe7 915a9e4b93
KSN: HEXPIRE-like commands should emit hdel if expire-time is in the past (#13408)
To be more similar to EXPIRE-like commands, which emit a "del"
notification if the expire-time is in the past
2024-07-11 16:24:58 +08:00
debing.sun ffff7fea7c
Rebuild function engines for function flush command (#13383)
### Issue
The current implementation of `FUNCTION FLUSH` command uses
`lua_unref()` to unreference script closures in Lua vm. However,
invoking `lua_unref()` during lazy free (`ASYNC` argument) is risky
since it is not thread-safe.

Another issue is that using `lua_unref()` to unreference references does
not trigger GC, This can result in the Lua VM leaves a significant
amount of garbage, which may never be cleaned up if not properly GC.

### Solution
The proposed solution is to completely rebuild the engines, resulting in
a brand new Lua VM.

---------

Co-authored-by: meir <meir@redis.com>
2024-07-10 17:35:36 +08:00
Moti Cohen a84cc20aef
HFE - Fix statistic to count also lazy expired and rename INFO params (#13372)
* INFO command : rename `hashes_with_expiry_fields` to `subexpiry`
* INFO command : rename `expired_hash_fields` to `expired_subkeys`
* Fix statistic of `expired_subkeys` to count also lazy expired
* Remove TODOs comments leftover in TCL
* Fix potential flaky test of rdb load of hash-field-expiration
2024-07-02 18:22:10 +03:00
Oran Agra 799c5e5f51
Solve a race between BGSAVE and FLUSHALL messing up the dirty counter (#13361)
If we run FLUSHALL when the 'save' config is set, and there's a fork
child ding BGSAVE, there's a chance the child is already finished, and
the parent process is unaware of it. in that case the child will not get
the kill signal and will finish successfully, but the parent process
thinks it killed it and will reset the dirty counter to 0, then the
backgroundSaveDoneHandlerDisk method can set the dirty counter to a
negative value.
2024-07-01 09:04:52 +03:00
Oran Agra 69b7137d32
Fix possible crash due to OOM panic on invalid command (#13380)
getKeysUsingKeySpece had the range check AFTER the allocation, of the
keys buffer, which could lead to an OOM panic when invalid arguments are
provided, leading to an overflow.
The allocated memory is only used after the range check, so there's no
risk of buffer overrun.
The OOM panic can happen on 32bit builds, or 64 builds running on
systems with less than 4GB of RAM, and is reachable via the COMMAND
GETKEYSANDFLAGS, and ACL key name validation.
2024-07-01 09:04:20 +03:00
Moti Cohen a9267137ee
HFE - count in command must match actual number of fields (#13369)
There was wrong preliminary assumption that we can optionally provide
vector of arguments more than count.
This is error-prone approach that leaded to actual error in that case.
This PR enforce that vector of argument match count.

Also fixed flaky HRANDFIELD test.
2024-06-26 14:12:06 +03:00
debing.sun 52e12d8bac
Don't keep global replication buffer reference for replicas marked CLIENT_CLOSE_ASAP (#13363)
In certain situations, we might generate a large number of propagates
(e.g., multi/exec, Lua script, or a single command generating tons of
propagations) within an event loop.
During the process of propagating to a replica, if the replica is
disconnected(marked as CLIENT_CLOSE_ASAP) due to exceeding the output
buffer limit, we should remove its reference to the global replication
buffer to avoid the global replication buffer being unable to be
properly trimmed due to being referenced.

---------

Co-authored-by: oranagra <oran@redislabs.com>
2024-06-26 08:26:23 +08:00
Moti Cohen 5eac99c312
Fix H(P)EXPIREAT command to propagate HDEL as well (#13364)
H(P)EXPIREAT command might delete fields in case the absolute time is in the 
past. Those HDELs need to be propagated as well.
 
In general, as we need to propagate H(P)EXPIRE(AT) command to the replica, each 
field that is mentioned in the command should be categorized into one of the four
options:
1. Managed to update field’s expiration time - propagate it to replica as part 
   of the HPEXPIREAT command.
2. Deleted the field because the time is in the past - propagate also HDEL command
   to delete the field and remove the field from the propagated HPEXPIREAT.
3. Condition not met for the field - Remove the field from the propagated
   HPEXPIREAT command.
4. Field does not exists - Remove the field from the propagated HPEXPIREAT command.

If none of the provided fields match option number 1, then avoid also propagating 
the HPEXPIREAT command to the replica.

This approach is aligned with the EXPIRE command:
If a given key has already expired, then DEL will be propagated instead of
EXPIRE command. If condition not met, then command will be rejected. Otherwise, 
EXPIRE command will be propagated for given key.
2024-06-25 13:15:09 +03:00
Moti Cohen e26ea35cd4
Adapt HRANDFIELD to HFE feature (#13348)
Considerations for the selected imp of HRANDFIELD & HFE feature:

HRANDFIELD might access any of the fields in the hash as some of them
might be expired. And so the Implementation of HRANDFIELD along with HFEs
might be one of the two options:
1. Expire hash-fields before diving into handling HRANDFIELD.
2. Refine HRANDFIELD cases to deal with expired fields.

Regarding the first option, as reference, the command RANDOMKEY also
declareson O(1) complexity, yet might be stuck on a very long (but not infinite)
loop trying to find non-expired keys. Furthermore RANDOMKEY also evicts expired 
keys along the way even though it is categorized as a read-only command. Note
that the case of HRANDFIELD is more lightweight versus RANDOMKEY since 
HFEs have much more effective and aggressive active-expiration for fields behind.

The second option introduces additional implementation complexity to HRANDFIELD.
We could further refine HRANDFIELD cases to differentiate between scenarios
with many expired fields versus few expired fields, and adjust based on the
percentage of expired fields. However, this approach could still lead to long
loops or necessitate expiring fields before selecting them. For the “lightweight”
cases it is also expected to have a lightweight expiration.

Considering the pros and cons, and the fact that HRANDFIELD is an infrequent
command (particularly with HFEs) and the fact we have effective active-expiration
behind for hash-fields, it is better to keep it simple and choose option number 1.

Other changes:
* Don't mark command dirty by internal hashTypeExpire(). It causes to read 
  only command of HRANDFIELD to be accidently propagated (This flag
  should be indicated at higher level, by the command functions).
* Align `hashTypeExpireIfNeeded()` and `hashTypeGetValue()` to be more
  aligned with `expireIfNeeded()` logic of keyspace.
2024-06-24 18:11:53 +03:00
Ozan Tezcan 4aa25d042c
Reply with array of return codes if the key does not exist for HFE commands (#13343)
Currently, HFE commands reply with empty array if the key does not
exist. Though, non-existing key and empty key is the same thing. 
It means fields given in the command do not exist in the empty key. 
So, replying with an array of 'no field' error codes (-2) suits better 
to Redis logic. e.g. Similarly, `hmget` returns array of nulls if the 
key does not exist.

After this PR:
```
127.0.0.1:6379> hpersist missingkey fields 2 a b
1) (integer) -2
2) (integer) -2
```
2024-06-14 09:35:05 +03:00
debing.sun ed10f737b8
Add new hexpired notification for HFE (#13329)
When the hash field expired, we will send a new `hexpired` notification.
It mainly includes the following three cases:
1. When field expired by active expiration.
2. When field expired by lazy expiration.
3. When the user uses the `h(p)expire(at)` command, the user will also
get a `hexpired` notification if the field expires during the command.

## Improvement
1. Now if more than one field expires in the hmget command, we will only
send a `hexpired` notification.
2. When a field with TTL is deleted by commands like hdel without
updating the global DS, active expire will not send a notification.

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
Co-authored-by: Moti Cohen <moti.cohen@redis.com>
2024-06-11 21:42:34 +08:00
Moti Cohen f01fdc3960
Reserve 2 bits out of EB_EXPIRE_TIME_MAX for possible future use (#13331)
Reserve 2 bits out of hash-field expiration time (`EB_EXPIRE_TIME_MAX`)
for possible future lightweight indexing/categorizing of fields. It can
be achieved by hacking HFE as follows:
```
HPEXPIREAT key [ 2^47 + USER_INDEX ] FIELDS numfields field [field …]
```

Redis will also need to expose kind of `HEXPIRESCAN` and `HEXPIRECOUNT`
for this idea. Yet to be better defined.

`HFE_MAX_ABS_TIME_MSEC` constraint must be enforced only at API level.
Internally, the expiration time can be up to `EB_EXPIRE_TIME_MAX` for
future readiness.
2024-06-10 16:57:26 +03:00
Moti Cohen ce121b9261
HFE - Avoid lazy expire if called by modules + cleanup (#13326)
Need to be carefull if called by modules since modules API allow to open
and close key handler. We don't want to invalidate the handler
underneath.

* hashTypeExists(), hashTypeGetValueObject() - will return the logical
state of the field. A flag will indicate noExpire.
* RM_HashGet() - Will get NULL if the field expired. Fields won’t be
deleted.
* RM_ScanKey() - might return 0 items if all fields got expired. Fields
won’t be deleted.
* RM_HashSet() - If set, then override expired field. If delete, we can
either delete or leave it to active-expiration. XX/NX - logically
correct (Verify with tests).

Nice to have (not implemented):
* RedisModule_CloseKey() - We can local active-expire up-to 100 items. 

Note:
Length will be wrong to modules just like redis (Count expired fields).
2024-06-10 11:24:26 +03:00
debing.sun 690ef36330
Fix HFE notification test timing caused by field expiration in hexpire command (#13325)
In the old test, we give the `hexpire` a very short expire time, which
caused the filed to be deleted by the time `hpersist` command was
executed. As a result, the `hpersist` command won't be able to give a
`hpersist` notification, leading to test stuck.

fail CI:
https://github.com/redis/redis/actions/runs/9342175887/job/25709886471
2024-06-05 14:49:45 +08:00
debing.sun 9a2c6ba4e7
Prevent negative expire parameter in HEXPIRE/HEXPIREAT/HPEXPIRE/HPEXPIREAT commands (#13310)
1. Don't allow HEXPIRE/HEXPIREAT/HPEXPIRE/HPEXPIREAT command expire
parameters is negative

2. Remove a dead code reported from Coverity.
when `unit` is not `UNIT_SECONDS`, the second `if (expire > (long long)
EB_EXPIRE_TIME_MAX)` will be dead code.
```c
# t_hash.c
2988    /* Check expire overflow */
      	cond_at_most: Condition expire > 281474976710655LL, taking false branch. Now the value of expire is at most 281474976710655.
2989    if (expire > (long long) EB_EXPIRE_TIME_MAX) {
2990        addReplyErrorExpireTime(c);
2991        return;
2992    }

2994    if (unit == UNIT_SECONDS) {
2995        if (expire > (long long) EB_EXPIRE_TIME_MAX / 1000) {
2996            addReplyErrorExpireTime(c);
2997            return;
2998        }
2999        expire *= 1000;
3000    } else {
      	at_most: At condition expire > 281474976710655LL, the value of expire must be at most 281474976710655.
      	dead_error_condition: The condition expire > 281474976710655LL cannot be true.
3001        if (expire > (long long) EB_EXPIRE_TIME_MAX) {
      	
CID 494223: (#1 of 1): Logically dead code (DEADCODE)
dead_error_begin: Execution cannot reach this statement: addReplyErrorExpireTime(c);.
3002            addReplyErrorExpireTime(c);
3003            return;
3004        }
3005    }
```

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
2024-06-04 20:35:26 +08:00
Ozan Tezcan 44352beefa
Fix crash in RM_ScanKey() when used with hexpire (#13320)
RM_ScanKey() was overlooked while introducing hash field expiration. 
An assert is triggered when it is called on a hash key with
OBJ_ENCODING_LISTPACK_EX encoding.

I've changed to code to handle listpackex encoding properly.
2024-06-04 11:42:02 +03:00
Valentino Geron 50569a906c
Free current client asynchronously after user permissions changes (#13274)
The crash happens when the user that triggers the permission changes
should be affected (and should be disconnected eventually).

To handle such a scenario, we should use the
`CLIENT_CLOSE_AFTER_COMMAND` flag.

This commit encapsulates all the places that should be handled in the
same way in `deauthenticateAndCloseClient`

Also:
* bugfix: during the ACL LOAD we ignore clients that are marked as
`CLIENT MASTER`
2024-05-30 22:09:30 +08:00
jonghoonpark 5a3534f9b5
dynamically list test files (#13220)
**Related issue**
https://github.com/redis/redis/issues/13219

**Motivation**
Currently we have to manually update the all_tests variable when
introducing new test files.

**Modification**
I have modified it to list test files dynamically, but instead of
modifying it to add all test files, I have modified it to only add only
test files from the following 4 paths

- unit
- unit/type
- unit/cluster
- integration

so that it doesn't deviate too much from what we already do

**Result**
- dynamically list test files to all_tests variable
- close issue https://github.com/redis/redis/issues/13219

**Additional information**
- removed `list-common.tcl` file and added
`generate_largevalue_test_array` proc in `util.tcl`. because
`list-common.tcl` is not a test file
- There is an order dependency. So I added a code to the "Is a ziplist
encoded Hash promoted on big payload?" test that resets
hash-max-listpack-value to the default (64).

---------

Signed-off-by: jonghoonpark <dev@jonghoonpark.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
2024-05-30 19:09:26 +08:00
debing.sun 7b9e960690
Hash Field Expiration (#13303)
## Background

This PR introduces support for field-level expiration in Redis hashes. Previously, Redis supported expiration only at the key level, but this enhancement allows setting expiration times for individual fields within a hash.

## New commands
* HEXPIRE
* HEXPIREAT
* HEXPIRETIME
* HPERSIST
* HPEXPIRE
* HPEXPIREAT
* HPEXPIRETIME
* HPTTL
* HTTL

## Short example
from @moticless
```sh
127.0.0.1:6379>  hset myhash f1 v1 f2 v2 f3 v3                                                   
(integer) 3
127.0.0.1:6379>  hpexpire myhash 10000 NX fields 2 f2 f3                                         
1) (integer) 1
2) (integer) 1
127.0.0.1:6379>  hpttl myhash fields 3 f1 f2 f3                                                                                                                                                                         
1) (integer) -1
2) (integer) 9997
3) (integer) 9997
127.0.0.1:6379>  hgetall myhash  
1) "f3"
2) "v3"
3) "f2"
4) "v2"
5) "f1"
6) "v1"

... after 10 seconds ...

127.0.0.1:6379>  hgetall myhash  
1) "f1"
2) "v1"
127.0.0.1:6379>
```

## Expiration strategy
1. Integrate active
    Redis periodically performs active expiration and deletion of hash keys that contain expired fields, with a maximum attempt limit.
3. Lazy expiration
    When a client touches fields within a hash, Redis checks if the fields are expired. If a field is expired, it will be deleted. However, we do not delete expired fields during a traversal, we implicitly skip over them.

## RDB changes
Add two new rdb type s`RDB_TYPE_HASH_METADATA` and `RDB_TYPE_HASH_LISTPACK_EX`.

## Notification
1. Add `hpersist` notification for `HPERSIST` command.
5. Add `hexpire` notification for `HEXPIRE`, `HEXPIREAT`, `HPEXPIRE` and `HPEXPIREAT` commands.

## Internal
1. Add new data structure `ebuckets`, which is used to store TTL and keys, enabling quick retrieval of keys based on TTL.
2. Add new data structure `mstr` like sds, which is used to store a string with TTL.

This work was done by @moticless, @tezc, @ronen-kalish, @sundb, I just release it.
2024-05-30 15:26:19 +08:00
Moti Cohen 33fc0fbfae
HFE to support AOF and replicas (#13285)
* For replica sake, rewrite commands `H*EXPIRE*` , `HSETF`, `HGETF` to
have absolute unix time in msec.
* On active-expiration of field, propagate HDEL to replica
(`propagateHashFieldDeletion()`)
* On lazy-expiration, propagate HDEL to replica (`hashTypeGetValue()`
now calls `hashTypeDelete()`. It also takes care to call
`propagateHashFieldDeletion()`).
* Fix `H*EXPIRE*` command such that if it gets flag `LT` and it doesn’t
have any expiration on the field then it will considered as valid
condition.

Note, replicas doesn’t make any active expiration, and should avoid lazy
expiration. On `hashTypeGetValue()` it doesn't check expiration (As long
as the master didn’t request to delete the field, it is valid)

TODO: 
* Attach `dbid` to HASH metadata. See
[here](https://github.com/redis/redis/pull/13209#discussion_r1593385850)

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
2024-05-29 19:47:48 +08:00
Ozan Tezcan 6a11d458be
Fix hscan return value (#13297)
In the last step of hscan, while replying to client, we assume all items
in the result list are keys which are mstr instances. Though, there 
might be values which are sds instances. 

Added a check to avoid calling mstrlen() for value objects.

To reproduce:
```
127.0.0.1:6379> hset myhash1 a 11111111111111111111111111111111111111111111111111111111111111111
(integer) 0
127.0.0.1:6379> hscan myhash1 0
1) "0"
2) 1) "a"
   2) "11111111111111111111111111111111111111111111111111111111111111111\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
```
2024-05-28 11:59:57 +03:00
Ozan Tezcan 2f34f6f0b9
Delete hsetf and hgetf (#13291)
Changes:
- Delete hsetf and hgetf commands
- Hfe commands will return empty array instead of nil.
---------

Co-authored-by: Moti Cohen <moticless@gmail.com>
2024-05-26 13:30:45 +03:00
Moti Cohen 60e1582ddb
Fix statistics test (#13293) 2024-05-26 03:34:15 +03:00
Moti Cohen f34f2ade85
Add Statistics hashes_with_expiry_fields to INFO (#13275)
Added hashes_with_expiry_fields.
Optimially it would better to have statistic of that counts all fields
with expiry. But it requires careful logic and computation to follow and
deep dive listpacks and hashes. This statistics is trivial to achieve
and reflected by global HFE DS that has builtin enumeration of all the
hashes that are registered in it.
2024-05-23 17:29:45 +03:00
debing.sun 95cbe879e5
sanitize dump payload for HFE (#13278)
Add the following validations:
1. Get TTL using the lpGetIntegerValue() method instead of lpGetValue(),
Ref https://github.com/redis/redis/pull/13209#discussion_r1602569422
2. The TTL of listpackex is a number in the valid range
(0~EB_EXPIRE_TIME_MAX) and ordered.
3. The TTL fields of listpackex are ordered. 
4. The TTL of hashtable is within the valid range
(0~EB_EXPIRE_TIME_MAX).

Other:
Fix the missing of handling OBJ_ENCODING_LISTPACK_EX in
dismissHashObject().

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
2024-05-22 10:53:30 +08:00
debing.sun 9ffc35c98e
Have consistent behavior of SPUBLISH within multi/exec like regular command (#13276)
This PR is based on the commits from PR #12944.

Allow SPUBLISH command within multi/exec on replica

Behavior on unstable:

```
127.0.0.1:6380> CLUSTER NODES
39ce8aa20f1f0d91f1a88d976ee1926dfefcdf1a 127.0.0.1:6380@16380 myself,slave 8b0feb120b68aac489d6a5af9c77dc40d71bc792 0 0 0 connected
8b0feb120b68aac489d6a5af9c77dc40d71bc792 127.0.0.1:6379@16379 master - 0 1705091681202 0 connected 0-16383
127.0.0.1:6380> SPUBLISH hello world
(integer) 0
127.0.0.1:6380> MULTI
OK
127.0.0.1:6380(TX)> SPUBLISH hello world
QUEUED
127.0.0.1:6380(TX)> EXEC
(error) MOVED 866 127.0.0.1:6379
```

With this change:

```
127.0.0.1:6380> SPUBLISH hello world
(integer) 0
127.0.0.1:6380> MULTI
OK
127.0.0.1:6380(TX)> SPUBLISH hello world
QUEUED
127.0.0.1:6380(TX)> EXEC
1) (integer) 0
```

---------

Co-authored-by: Harkrishn Patro <harkrisp@amazon.com>
Co-authored-by: oranagra <oran@redislabs.com>
2024-05-21 09:25:13 +08:00