redis

Commit Graph

Author	SHA1	Message	Date
sundb	f106beebfa	Fix the missing server.dirty increment and redundant signalModifiedKey in serveClientBlockedOnList (#11326 ) Mainly fix two minor bug 1. When handle BL*POP/BLMOVE commands with blocked clients, we should increment server.dirty. 2. `listPopRangeAndReplyWithKey()` in `serveClientBlockedOnList()` should not repeat calling `signalModifiedKey()` which has been called when an element was pushed on the list. (was skipped in all bpop commands, other than blmpop) Other optimization add `signal` param for `listElementsRemoved` to skip `signalModifiedKey()` to unify all pop operation. Unifying all pop operations also prepares us for #11303, so that we can avoid having to deal with the conversion from quicklist to listpack() in various places when the list shrinks.	2022-09-28 21:07:38 +03:00
guybe7	3330ea1864	RM_CreateCommand should not set CMD_KEY_VARIABLE_FLAGS automatically (#11320 ) The original idea behind auto-setting the default (first,last,step) spec was to use the most "open" flags when the user didn't provide any key-spec flags information. While the above idea is a good approach, it really makes no sense to set CMD_KEY_VARIABLE_FLAGS if the user didn't provide the getkeys-api flag: in this case there's not way to retrieve these variable flags, so what's the point? Internally in redis there was code to ignore this already, so this fix doesn't change redis's behavior, it only affects the output of COMMAND command.	2022-09-28 14:15:07 +03:00
Ozan Tezcan	18920813a9	Ignore RM_Call deny-oom flag if maxmemory is zero (#11319 ) If a command gets an OOM response and then if we set maxmemory to zero to disable the limit, server.pre_command_oom_state never gets updated and it stays true. As RM_Call() calls with "respect deny-oom" flag checks server.pre_command_oom_state, all calls will fail with OOM. Added server.maxmemory check in RM_Call() to process deny-oom flag only if maxmemory is configured.	2022-09-26 10:03:45 +03:00
Shaya Potter	6e993a5dfa	Add RM_SetContextUser to support acl validation in RM_Call (and scripts) (#10966 ) Adds a number of user management/ACL validaiton/command execution functions to improve a Redis module's ability to enforce ACLs correctly and easily. * RM_SetContextUser - sets a RedisModuleUser on the context, which RM_Call will use to both validate ACLs (if requested and set) as well as assign to the client so that scripts executed via RM_Call will have proper ACL validation. * RM_SetModuleUserACLString - Enables one to pass an entire ACL string, not just a single OP and have it applied to the user * RM_GetModuleUserACLString - returns a stringified version of the user's ACL (same format as dump and list). Contains an optimization to cache the stringified version until the underlying ACL is modified. * Slightly re-purpose the "C" flag to RM_Call from just being about ACL check before calling the command, to actually running the command with the right user, so that it also affects commands inside EVAL scripts. see #11231	2022-09-22 16:29:00 +03:00
Oran Agra	6d21560190	Fix heap overflow vulnerability in XAUTOCLAIM (CVE-2022-35951) (#11301 ) Executing an XAUTOCLAIM command on a stream key in a specific state, with a specially crafted COUNT argument may cause an integer overflow, a subsequent heap overflow, and potentially lead to remote code execution. The problem affects Redis versions 7.0.0 or newer.	2022-09-22 11:55:53 +03:00
Binbin	bb6513cbba	ACL default newly created user set USER_FLAG_SANITIZE_PAYLOAD flag (#11279 ) Starting from 6.2, after ACL SETUSER user reset, the user will carry the sanitize-payload flag. It was added in #7807, and then ACL SETUSER reset is inconsistent with default newly created user which missing sanitize-payload flag. Same as `off` and `on` these two bits are mutually exclusive, the default created user needs to have sanitize-payload flag. Adds USER_FLAG_SANITIZE_PAYLOAD flag to ACLCreateUser. Note that the bug don't have any real implications, since the code in rdb.c (rdbLoadObject) checks for `USER_FLAG_SANITIZE_PAYLOAD_SKIP`, so the fact that `USER_FLAG_SANITIZE_PAYLOAD` is missing doesn't really matters. Added tests to make sure it won't be broken in the future, and updated the comment in ACLSetUser and redis.conf	2022-09-22 09:13:39 +03:00
Shay Fadida	eedb8b1724	Fix missing sections for INFO ALL with module (#11291 ) When using `INFO ALL <section>`, when `section` is a specific module section. Redis will not print the additional section(s). The fix in this case, will search the modules info sections if the user provided additional sections to `ALL`. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-09-21 08:10:03 +03:00
sundb	13d25dd95e	Fix crash due to delete entry from compress quicklistNode and wrongly split quicklistNode (#11242 ) This PR mainly deals with 2 crashes introduced in #9357, and fix the QUICKLIST-PACKED-THRESHOLD mess in external test mode. 1. Fix crash due to deleting an entry from a compress quicklistNode When inserting a large element, we need to create a new quicklistNode first, and then delete its previous element, if the node where the deleted element is located is compressed, it will cause a crash. Now add `dont_compress` to quicklistNode, if we want to use a quicklistNode after some operation, we can use this flag like following: ```c node->dont_compress = 1; /* Prevent to be compressed / some_operation(node); / This operation might try to compress this node / some_other_operation(node); / We can use this node without decompress it / node->dont_compress = 0; / Re-able compression */ quicklistCompressNode(node); ``` Perhaps in the future, we could just disable the current entry from being compressed during the iterator loop, but that would require more work. 2. Fix crash due to wrongly split quicklist before #9357, the offset param of _quicklistSplitNode() will not negative. For now, when offset is negative, the split extent will be wrong. following example: ```c int orig_start = after ? offset + 1 : 0; int orig_extent = after ? -1 : offset; int new_start = after ? 0 : offset; int new_extent = after ? offset + 1 : -1; # offset: -2, after: 1, node->count: 2 # current wrong range: [-1,-1] [0,-1] # correct range: [1,-1] [0, 1] ``` Because only `_quicklistInsert()` splits the quicklistNode and only `quicklistInsertAfter()`, `quicklistInsertBefore()` call _quicklistInsert(), so `quicklistReplaceEntry()` and `listTypeInsert()` might occur this crash. But the iterator of `listTypeInsert()` is alway from head to tail(iter->offset is always positive), so it is not affected. The final conclusion is this crash only occur when we insert a large element with negative index into a list, that affects `LSET` command and `RM_ListSet` module api. 3. In external test mode, we need to restore quicklist packed threshold after when the end of test. 4. Show `node->count` in quicklistRepr(). 5. Add new tcl proc `config_get_set` to support restoring config in tests.	2022-09-19 09:47:52 +03:00
ranshid	c0ce97facc	fix test Migrate the last slot away from a node using redis-cli (#11221 ) When using cli to add node, there can potentially be a race condition in which all nodes presenting cluster state o.k even though the added node did not yet meet all cluster nodes. this adds another utility function to wait until all cluster nodes see the same cluster size	2022-09-06 16:54:24 -07:00
chendianqiang	e42d98ed27	Correctly handle scripts with shebang (not read-only) on a cluster replica (#11223 ) EVAL scripts are by default not considered `write` commands, so they were allowed on a replica. But when adding a shebang, they become `write` command (unless the `no-writes` flag is added). With this change we'll handle them as write commands, and reply with MOVED instead of READONLY when executed on a redis cluster replica. Co-authored-by: chendianqiang <chendianqiang@meituan.com>	2022-09-05 16:59:14 +03:00
Shaya Potter	87e7973c7e	Add a dry run flag to RM_Call execution (#11158 ) Add a new "D" flag to RM_Call which runs whatever verification the user requests, but returns before the actual execution of the command. It automatically enables returning error messages as CallReply objects to distinguish success (NULL) from failure (CallReply returned).	2022-09-05 16:19:32 +03:00
Oran Agra	c3b7bde914	fix false valgrind error on new hash test (#11200 ) New test fails on valgrind because strtold("+inf") with valgrind returns a non-inf result same thing is done in incr.tcl.	2022-08-29 10:25:24 +03:00
Shaya Potter	bed6d759bc	Improve cmd_flags for script/functions in RM_Call (#11159 ) When RM_Call was used with `M` (reject OOM), `W` (reject writes), as well as `S` (rejecting stale or write commands in "Script mode"), it would have only checked the command flags, but not the declared script flag in case it's a command that runs a script. Refactoring: extracts out similar code in server.c's processCommand to be usable in RM_Call as well.	2022-08-28 13:10:10 +03:00
weimeng	8945067544	bugfix:del keys in slot replicate to replica, and trigger other invalidations (#11084 ) Bugfix: with the scenario if we force assigned a slot to other master, old master will lose the slot ownership, then old master will call the function delKeysInSlot() to delete all keys which in the slot. These delete operations should replicate to replicas, avoid the data divergence issue in master and replicas. Additionally, in this case, we now call: * signalModifiedKey (to invalidate WATCH) * moduleNotifyKeyspaceEvent (key space notification for modules) * dirty++ (to signal that the persistence file may be outdated) Co-authored-by: weimeng <weimeng@didiglobal.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2022-08-28 11:37:26 +03:00
chendianqiang	bc7fe41e58	fix hincrbyfloat not to create a key if the new value is invalid (#11149 ) Check the validity of the value before performing the create operation, prevents new data from being generated even if the request fails to execute. Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: chendianqiang <chendianqiang@meituan.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2022-08-28 11:33:41 +03:00
Oran Agra	c789fb0aa7	Fix assertion when a key is lazy expired during cluster key migration (#11176 ) Redis 7.0 has #9890 which added an assertion when the propagation queue was not flushed and we got to beforeSleep. But it turns out that when processCommands calls getNodeByQuery and decides to reject the command, it can lead to a key that was lazy expired and is deleted without later flushing the propagation queue. This change prevents lazy expiry from deleting the key at this stage (not as part of a command being processed in `call`)	2022-08-24 19:39:15 +03:00
Binbin	78259826cd	Bump codespell from 2.1.0 to 2.2.1 in /.codespell (#11184 ) add a few terms to the white list, and fix a few newly detected typos	2022-08-24 15:07:43 +03:00
Meir Shpilraien (Spielrein)	3603f19496	fix test timeout wait command (#11181 ) Fix `Test replication with lazy expire` test to not timeout the wait command. This fix will allow the test to pass on slow environments and when running with valgrind.	2022-08-24 13:38:55 +03:00
Meir Shpilraien (Spielrein)	c1bd61a4a5	Reverts most of the changes of #10969 (#11178 ) The PR reverts the changes made on #10969. The reason for revert was trigger because of occasional test failure that started after the PR was merged. The issue is that if there is a lazy expire during the command invocation, the `del` command is added to the replication stream after the command placeholder. So the logical order on the primary is: * Delete the key (lazy expiration) * Command invocation But the replication stream gets it the other way around: * Command invocation (because the command is written into the placeholder) * Delete the key (lazy expiration) So if the command write to the key that was just lazy expired we will get inconsistency between primary and replica. One solution we considered is to add another lazy expire replication stream and write all the lazy expire there. Then when replicating, we will replicate the lazy expire replication stream first. This will solve this specific test failure but we realize that the issues does not ends here and the more we dig the more problems we find.One of the example we thought about (that can actually crashes Redis) is as follow: * User perform SINTERSTORE * When Redis tries to fetch the second input key it triggers lazy expire * The lazy expire trigger a module logic that deletes the first input key * Now Redis hold the robj of the first input key that was actually freed We believe we took the wrong approach and we will come up with another PR that solve the problem differently, for now we revert the changes so we will not have the tests failure. Notice that not the entire code was revert, some parts of the PR are changes that we would like to keep. The changes that was reverted are: * Saving a placeholder for replication at the beginning of the command (`call` function) * Order of the replication stream on active expire and eviction (we will decide how to handle it correctly on follow up PR) * `Spop` changes are no longer needed (because we reverted the placeholder code) Changes that was not reverted: * On expire/eviction, wrap the `del` and the notification effect in a multi exec. * `PropagateNow` function can still accept a special dbid, -1, indicating not to replicate select. * Keep optimisation for reusing the `alsoPropagate` array instead of allocating it each time. Tests: * All tests was kept and only few tests was modify to work correctly with the changes * Test was added to verify that the revert fixes the issues.	2022-08-24 12:51:36 +03:00
Oran Agra	4faddf18ca	Build TLS as a loadable module * Support BUILD_TLS=module to be loaded as a module via config file or command line. e.g. redis-server --loadmodule redis-tls.so * Updates to redismodule.h to allow it to be used side by side with server.h by defining REDISMODULE_CORE_MODULE * Changes to server.h, redismodule.h and module.c to avoid repeated type declarations (gcc 4.8 doesn't like these) * Add a mechanism for non-ABI neutral modules (ones who include server.h) to refuse loading if they detect not being built together with redis (release.c) * Fix wrong signature of RedisModuleDefragFunc, this could break compilation of a module, but not the ABI * Move initialization of listeners in server.c to be after loading the modules * Config TLS after initialization of listeners * Init cluster after initialization of listeners * Add TLS module to CI * Fix a test suite race conditions: Now that the listeners are initialized later, it's not sufficient to wait for the PID message in the log, we need to wait for the "Server Initialized" message. * Fix issues with moduleconfigs test as a result from start_server waiting for "Server Initialized" * Fix issues with modules/infra test as a result of an additional module present Notes about Sentinel: Sentinel can't really rely on the tls module, since it uses hiredis to initiate connections and depends on OpenSSL (won't be able to use any other connection modules for that), so it was decided that when TLS is built as a module, sentinel does not support TLS at all. This means that it keeps using redis_tls_ctx and redis_tls_client_ctx directly. Example code of config in redis-tls.so(may be use in the future): RedisModuleString tls_cfg = NULL; void tlsInfo(RedisModuleInfoCtx ctx, int for_crash_report) { UNUSED(for_crash_report); RedisModule_InfoAddSection(ctx, ""); RedisModule_InfoAddFieldLongLong(ctx, "var", 42); } int tlsCommand(RedisModuleCtx ctx, RedisModuleString argv, int argc) { if (argc != 2) return RedisModule_WrongArity(ctx); return RedisModule_ReplyWithString(ctx, argv[1]); } RedisModuleString getStringConfigCommand(const char name, void privdata) { REDISMODULE_NOT_USED(name); REDISMODULE_NOT_USED(privdata); return tls_cfg; } int setStringConfigCommand(const char name, RedisModuleString new, void privdata, RedisModuleString err) { REDISMODULE_NOT_USED(name); REDISMODULE_NOT_USED(err); REDISMODULE_NOT_USED(privdata); if (tls_cfg) RedisModule_FreeString(NULL, tls_cfg); RedisModule_RetainString(NULL, new); tls_cfg = new; return REDISMODULE_OK; } int RedisModule_OnLoad(void ctx, RedisModuleString **argv, int argc) { .... if (RedisModule_CreateCommand(ctx,"tls",tlsCommand,"",0,0,0) == REDISMODULE_ERR) return REDISMODULE_ERR; if (RedisModule_RegisterStringConfig(ctx, "cfg", "", REDISMODULE_CONFIG_DEFAULT, getStringConfigCommand, setStringConfigCommand, NULL, NULL) == REDISMODULE_ERR) return REDISMODULE_ERR; if (RedisModule_LoadConfigs(ctx) == REDISMODULE_ERR) { if (tls_cfg) { RedisModule_FreeString(ctx, tls_cfg); tls_cfg = NULL; } return REDISMODULE_ERR; } ... } Co-authored-by: zhenwei pi <pizhenwei@bytedance.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2022-08-23 12:37:56 +03:00
Wen Hui	c3a0253bc8	Add 2 test cases for XDEL and XGROUP CREATE command (#11137 ) This PR includes 2 missed test cases of XDEL and XGROUP CREATE command 1. one test case: XDEL delete multiply id once 2. 3 test cases: XGROUP CREATE has ENTRIESREAD parameter, which equal 0 (special positive number), 3 and negative value. Co-authored-by: Ubuntu <lucas.guang.yang1@huawei.com> Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2022-08-21 07:52:57 +03:00
Binbin	3a16ad30b7	Fix CLUSTERDOWN issue in cluster reshard unblock test (#11139 ) change the cluster-node-timeout from 1 to 1000	2022-08-18 09:18:18 -07:00
guybe7	223046ec9a	Repurpose redisCommandArg's name as the unique ID (#11051 ) This PR makes sure that "name" is unique for all arguments in the same level (i.e. all args of a command and all args within a block/oneof). This means several argument with identical meaning can be referred to together, but also if someone needs to refer to a specific one, they can use its full path. In addition, the "display_text" field has been added, to be used by redis.io in order to render the syntax of the command (for the vast majority it is identical to "name" but sometimes we want to use a different string that is not "name") The "display" field is exposed via COMMAND DOCS and will be present for every argument, except "oneof" and "block" (which are container arguments) Other changes: 1. Make sure we do not have any container arguments ("oneof" or "block") that contain less than two sub-args (otherwise it doesn't make sense) 2. migrate.json: both AUTH and AUTH2 should not be "optional" 3. arg names cannot contain underscores, and force the usage of hyphens (most of these were a result of the script that generated the initial json files from redis.io commands.json).	2022-08-18 15:09:36 +03:00
Binbin	fc3956e8f4	Fix memory leak in moduleFreeCommand (#11147 ) Currently, we call zfree(cmd->args), but the argument array needs to be freed recursively (there might be sub-args). Also fixed memory leaks on cmd->tips and cmd->history. Fixes #11145	2022-08-18 12:36:01 +03:00
Meir Shpilraien (Spielrein)	508a138885	Fix replication inconsistency on modules that uses key space notifications (#10969 ) Fix replication inconsistency on modules that uses key space notifications. ### The Problem In general, key space notifications are invoked after the command logic was executed (this is not always the case, we will discuss later about specific command that do not follow this rules). For example, the `set x 1` will trigger a `set` notification that will be invoked after the `set` logic was performed, so if the notification logic will try to fetch `x`, it will see the new data that was written. Consider the scenario on which the notification logic performs some write commands. for example, the notification logic increase some counter, `incr x{counter}`, indicating how many times `x` was changed. The logical order by which the logic was executed is has follow: ``` set x 1 incr x{counter} ``` The issue is that the `set x 1` command is added to the replication buffer at the end of the command invocation (specifically after the key space notification logic was invoked and performed the `incr` command). The replication/aof sees the commands in the wrong order: ``` incr x{counter} set x 1 ``` In this specific example the order is less important. But if, for example, the notification would have deleted `x` then we would end up with primary-replica inconsistency. ### The Solution Put the command that cause the notification in its rightful place. In the above example, the `set x 1` command logic was executed before the notification logic, so it should be added to the replication buffer before the commands that is invoked by the notification logic. To achieve this, without a major code refactoring, we save a placeholder in the replication buffer, when finishing invoking the command logic we check if the command need to be replicated, and if it does, we use the placeholder to add it to the replication buffer instead of appending it to the end. To be efficient and not allocating memory on each command to save the placeholder, the replication buffer array was modified to reuse memory (instead of allocating it each time we want to replicate commands). Also, to avoid saving a placeholder when not needed, we do it only for WRITE or MAY_REPLICATE commands. #### Additional Fixes * Expire and Eviction notifications: * Expire/Eviction logical order was to first perform the Expire/Eviction and then the notification logic. The replication buffer got this in the other way around (first notification effect and then the `del` command). The PR fixes this issue. * The notification effect and the `del` command was not wrap with `multi-exec` (if needed). The PR also fix this issue. * SPOP command: * On spop, the `spop` notification was fired before the command logic was executed. The change in this PR would have cause the replication order to be change (first `spop` command and then notification `logic`) although the logical order is first the notification logic and then the `spop` logic. The right fix would have been to move the notification to be fired after the command was executed (like all the other commands), but this can be considered a breaking change. To overcome this, the PR keeps the current behavior and changes the `spop` code to keep the right logical order when pushing commands to the replication buffer. Another PR will follow to fix the SPOP properly and match it to the other command (we split it to 2 separate PR's so it will be easy to cherry-pick this PR to 7.0 if we chose to). #### Unhanded Known Limitations * key miss event: * On key miss event, if a module performed some write command on the event (using `RM_Call`), the `dirty` counter would increase and the read command that cause the key miss event would be replicated to the replication and aof. This problem can also happened on a write command that open some keys but eventually decides not to perform any action. We decided not to handle this problem on this PR because the solution is complex and will cause additional risks in case we will want to cherry-pick this PR. We should decide if we want to handle it in future PR's. For now, modules writers is advice not to perform any write commands on key miss event. #### Testing * We already have tests to cover cases where a notification is invoking write commands that are also added to the replication buffer, the tests was modified to verify that the replica gets the command in the correct logical order. * Test was added to verify that `spop` behavior was kept unchanged. * Test was added to verify key miss event behave as expected. * Test was added to verify the changes do not break lazy expiration. #### Additional Changes * `propagateNow` function can accept a special dbid, -1, indicating not to replicate `select`. We use this to replicate `multi/exec` on `propagatePendingCommands` function. The side effect of this change is that now the `select` command will appear inside the `multi/exec` block on the replication stream (instead of outside of the `multi/exec` block). Tests was modified to match this new behavior.	2022-08-18 10:16:32 +03:00
Valentino Geron	6a9cc20d94	Tests: Add missing key declaration in scripts (#11134 ) Make sure the script calls in the tests declare the keys they intend to use. Do that with minimal changes to existing lines (so many scripts still have a hard coded key names) Co-authored-by: Valentino Geron <valentino@redis.com>	2022-08-16 22:04:22 +03:00
guybe7	1189680edd	Rename offset and xsetid tags (#11103 ) There's really no point in having dedicated flags to test these features (why shouldn't all commands/features get their own tag?)	2022-08-14 18:25:01 +03:00
sundb	8aad2ac352	Add missing lua_pop in luaGetFromRegistry (#11097 ) This pr mainly has the following four changes: 1. Add missing lua_pop in `luaGetFromRegistry`. This bug affects `redis.register_function`, where `luaGetFromRegistry` in `luaRegisterFunction` will return null when we call `redis.register_function` nested. .e.g ``` FUNCTION LOAD "#!lua name=mylib \n local lib=redis \n lib.register_function('f2', function(keys, args) lib.register_function('f1', function () end) end)" fcall f2 0 ```` But since we exit when luaGetFromRegistry returns null, it does not cause the stack to grow indefinitely. 3. When getting `REGISTRY_RUN_CTX_NAME` from the registry, use `serverAssert` instead of error return. Since none of these lua functions are registered at the time of function load, scriptRunCtx will never be NULL. 4. Add `serverAssert` for `luaLdbLineHook`, `luaEngineLoadHook`. 5. Remove `luaGetFromRegistry` from `redis_math_random` and `redis_math_randomseed`, it looks like they are redundant.	2022-08-14 11:50:18 +03:00
DarrenJiang13	44859a41ee	fix the client type in trackingInvalidateKey() (#11052 ) Fix bug with scripts ignoring client tracking NOLOOP and send an invalidation message anyway.	2022-08-10 11:58:54 +03:00
Valentino Geron	3270f2d54e	Tests: improve skip tags around resp3 (#11090 ) some skip tags were missing on some tests avoid using HELLO if denytags has resp3 (target server may not support it) Co-authored-by: Valentino Geron <valentino@redis.com>	2022-08-07 16:32:31 +03:00
Huang Zhw	ec5034a2e3	acl: bitfield with get and set\|incrby can be executed with readonly permission (#11086 ) `bitfield` with `get` may not be readonly. ``` 127.0.0.1:6384> acl setuser hello on nopass %R~* +@all OK 127.0.0.1:6384> auth hello 1 OK 127.0.0.1:6384> bitfield hello set i8 0 1 (error) NOPERM this user has no permissions to access one of the keys used as arguments 127.0.0.1:6384> bitfield hello set i8 0 1 get i8 0 1) (integer) 0 2) (integer) 1 ``` Co-authored-by: Oran Agra <oran@redislabs.com>	2022-08-07 09:21:19 +03:00
Binbin	6a7dd00cdd	Re-enable aof-race integration tests (#10972 ) This is the history of aof-race related changes: 1. added in `3aa4b00970` 2. disabled in `dcdfd005a0` 3. enabled in `5c63922691` 4. disabled in `53a2af3941` This PR refreshes the aof-race test, re-enable it. Closes #10971	2022-08-04 11:13:29 +03:00
Valentino Geron	dcafee55a5	Fix acl tests to support `--singledb` flag (#11077 ) * some of the tests don't clean the key the use * marked tests with `{singledb:skip}` if they use SELECT Co-authored-by: Valentino Geron <valentino@redis.com>	2022-08-03 12:11:32 +03:00
Wen Hui	beb9746a9f	Fix function load error message (#10964 ) Update error messages for function load	2022-08-02 18:19:53 -07:00
Binbin	9f0f533bc8	Solve usleep compilation warning in keyspace_events.c (#11073 ) There is a -Wimplicit-function-declaration warning in here: ``` keyspace_events.c: In function ‘KeySpace_NotificationGeneric’: keyspace_events.c:67:9: warning: implicit declaration of function ‘usleep’; did you mean ‘sleep’? [-Wimplicit-function-declaration] 67 \| usleep(1); \| ^~~~~~ \| sleep ```	2022-08-02 18:00:11 +03:00
Binbin	e13b681874	Tests (cluster / sentinel): add --stop and--loop options (#11070 ) --stop: Blocks once the first test fails. --loop: Execute the specified set of tests forever. It is useful when we debug some test failures.	2022-08-01 10:12:27 +03:00
Huang Zhw	61451b02cb	tracking pending invalidation message of flushdb sent by (#11068 ) trackingHandlePendingKeyInvalidations should use proto.	2022-07-31 16:14:39 +03:00
Binbin	e7144693e2	Fix bgsaveerr issue in psync wrong offset test (#11043 ) The kill above is sometimes successful and sometimes already too late. The PING in pysnc wrong offset test got rejected by bgsaveerr because lastbgsave_status is C_ERR. In theory, using diskless can avoid PING being affected, because when the replica is dropped, we will kill the child with SIGUSR1, and this will not affect lastbgsave_status. Anyway, this kill is not particularly needed here, dropping the kill is the best one, since we do have the waitForBgsave, so just let it take care of the bgsave. No need for fast termination.	2022-07-27 14:58:25 +03:00
guybe7	45c99d7092	Adds RM_Microseconds and RM_CachedMicroseconds (#11016 ) RM_Microseconds Return the wall-clock Unix time, in microseconds RM_CachedMicroseconds Returns a cached copy of the Unix time, in microseconds. It is updated in the server cron job and before executing a command. It is useful for complex call stacks, such as a command causing a key space notification, causing a module to execute a RedisModule_Call, causing another notification, etc. It makes sense that all these callbacks would use the same clock.	2022-07-27 14:40:05 +03:00
Huang Zhw	6f0a27e38e	When client tracking is on, invalidation message of flushdb in a (#11038 ) When FLUSHDB / FLUSHALL / SWAPDB is inside MULTI / EXEC, the client side tracking invalidation message was interleaved with transaction response.	2022-07-26 13:28:37 +03:00
Meir Shpilraien (Spielrein)	020e046b42	Fix #11030 , use lua_rawget to avoid triggering metatables and crash. (#11032 ) Fix #11030, use lua_rawget to avoid triggering metatables. #11030 shows how return `_G` from the Lua script (either function or eval), cause the Lua interpreter to Panic and the Redis processes to exit with error code 1. Though return `_G` only panic on Redis 7 and 6.2.7, the underline issue exists on older versions as well (6.0 and 6.2). The underline issue is returning a table with a metatable such that the metatable raises an error. The following example demonstrate the issue: ``` 127.0.0.1:6379> eval "local a = {}; setmetatable(a,{__index=function() foo() end}) return a" 0 Error: Server closed the connection ``` ``` PANIC: unprotected error in call to Lua API (user_script:1: Script attempted to access nonexistent global variable 'foo') ``` The Lua panic happened because when returning the result to the client, Redis needs to introspect the returning table and transform the table into a resp. In order to scan the table, Redis uses `lua_gettable` api which might trigger the metatable (if exists) and might raise an error. This code is not running inside `pcall` (Lua protected call), so raising an error causes the Lua to panic and exit. Notice that this is not a crash, its a Lua panic that exit with error code 1. Returning `_G` panics on Redis 7 and 6.2.7 because on those versions `_G` has a metatable that raises error when trying to fetch a none existing key. ### Solution Instead of using `lua_gettable` that might raise error and cause the issue, use `lua_rawget` that simply return the value from the table without triggering any metatable logic. This is promised not to raise and error. The downside of this solution is that it might be considered as breaking change, if someone rely on metatable in the returned value. An alternative solution is to wrap this entire logic with `pcall` (Lua protected call), this alternative require a much bigger refactoring. ### Back Porting The same fix will work on older versions as well (6.2, 6.0). Notice that on those version, the issue can cause Redis to crash if inside the metatable logic there is an attempt to accesses Redis (`redis.call`). On 7.0, there is not crash and the `redis.call` is executed as if it was done from inside the script itself. ### Tests Tests was added the verify the fix	2022-07-26 10:33:50 +03:00
Viktor Söderqvist	5032de50f2	Gossip forgotten nodes on `CLUSTER FORGET` (#10869 ) Gossip the cluster node blacklist in ping and pong messages. This means that CLUSTER FORGET doesn't need to be sent to all nodes in a cluster. It can be sent to one or more nodes and then be propagated to the rest of them. For each blacklisted node, its node id and its remaining blacklist TTL is gossiped in a cluster bus ping extension (introduced in #9530).	2022-07-26 10:28:13 +03:00
Binbin	5ce64ab010	Fix timing issue in cluster test (#11008 ) A timing issue like this was reported in freebsd daily CI: ``` *** [err]: Sanity test push cmd after resharding in tests/unit/cluster/cli.tcl Expected 'CLUSTERDOWN The cluster is down' to match 'MOVED' ``` We additionally wait for each node to reach a consensus on the cluster state in wait_for_condition to avoid the cluster down error. The fix just like #10495, quoting madolson's comment: Cluster check just verifies the the config state is self-consistent, waiting for cluster_state to be okay is an independent check that all the nodes actually believe each other are healthy. At the same time i noticed that unit/moduleapi/cluster.tcl has an exact same test, may have the same problem, also modified it.	2022-07-18 20:35:13 -07:00
Oran Agra	2825b6057b	Fix heap overflow corruption in XAUTOCLAIM (CVE-2022-31144) (#11002 ) The temporary array for deleted entries reply of XAUTOCLAIM was insufficient, but also in fact the COUNT argument should be used to control the size of the reply, so instead of terminating the loop by only counting the claimed entries, we'll count deleted entries as well. Fix #10968 Addresses CVE-2022-31144	2022-07-18 11:36:19 +03:00
ranshid	eacca729a5	Avoid using unsafe C functions (#10932 ) replace use of: sprintf --> snprintf strcpy/strncpy --> redis_strlcpy strcat/strncat --> redis_strlcat why are we making this change? Much of the code uses some unsafe variants or deprecated buffer handling functions. While most cases are probably not presenting any issue on the known path programming errors and unterminated strings might lead to potential buffer overflows which are not covered by tests. As part of this PR we change 1. added implementation for redis_strlcpy and redis_strlcat based on the strl implementation: https://linux.die.net/man/3/strl 2. change all occurrences of use of sprintf with use of snprintf 3. change occurrences of use of strcpy/strncpy with redis_strlcpy 4. change occurrences of use of strcat/strncat with redis_strlcat 5. change the behavior of ll2string/ull2string/ld2string so that it will always place null termination ('\0') on the output buffer in the first index. this was done in order to make the use of these functions more safe in cases were the user will not check the output returned by them (for example in rdbRemoveTempFile) 6. we added a compiler directive to issue a deprecation error in case a use of sprintf/strcpy/strcat is found during compilation which will result in error during compile time. However keep in mind that since the deprecation attribute is not supported on all compilers, this is expected to fail during push workflows. NOTE: while this is only an initial milestone. We might also consider using the *_s implementation provided by the C11 Extensions (however not yet widly supported). I would also suggest to start looking at static code analyzers to track unsafe use cases. For example LLVM clang checker supports security.insecureAPI.DeprecatedOrUnsafeBufferHandling which can help locate unsafe function usage. https://clang.llvm.org/docs/analyzer/checkers.html#security-insecureapi-deprecatedorunsafebufferhandling-c The main reason not to onboard it at this stage is that the alternative excepted by clang is to use the C11 extensions which are not always supported by stdlib.	2022-07-18 10:56:26 +03:00
Madelyn Olson	3abdec9969	Fix cluster hostnames test causing failover while running valgrind (#10991 ) In the newly added cluster hostnames test, the primary is failing over during the reboot for valgrind so we are validating the wrong node. This change just sets the replica to prevent taking over, which seems to fix the test. We could have also set the timeout higher, but it slows down the test.	2022-07-17 09:57:34 +03:00
Valentino Geron	847cdca151	Add an option to specify multiple skip files using `--skipfile` (#10975 ) `--skipfile` can be repeated. For example: ./runtests --skipfile file1.txt --skipfile file2.txt Co-authored-by: Valentino Geron <valentino@redis.com>	2022-07-17 08:47:35 +03:00
Oran Agra	599e59ebc5	Avoid valgrind fishy value warning on corrupt restore payloads (#10937 ) The corrupt dump fuzzer uncovered a valgrind warning saying: ``` ==76370== Argument 'size' of function malloc has a fishy (possibly negative) value: -3744781444216323815 ``` This allocation would have failed (returning NULL) and being handled properly by redis (even before this change), but we also want to silence the valgrind warnings (which are checking that casting to ssize_t produces a non-negative value). The solution i opted for is to explicitly fail these allocations (returning NULL), before even reaching `malloc` (which would have failed and return NULL too). The implication is that we will not be able to support a single allocation of more than 2GB on a 32bit system (which i don't think is a realistic scenario). i.e. i do think we could be facing cases were redis consumes more than 2gb on a 32bit system, but not in a single allocation. The byproduct of this, is that i dropped the overflow assertions, since these will now lead to the same OOM panic we have for failed allocations.	2022-07-13 09:14:38 +03:00
Madelyn Olson	8a4e3bcd8d	Cluster test improvements (#10920 ) * Restructured testing to allow running cluster tests easily as part of the normal testing	2022-07-12 10:41:29 -07:00
Binbin	693acc0114	Trying to fix cluster test (#10963 ) #10942 break the new test added in #10449 ``` Testing unit: 29-slot-migration-response.tcl Cluster Join and auto-discovery test: FAILED: Cluster failed to join into a full mesh. ``` It looks like we need to wait for the cluster in 28 to become stable.	2022-07-11 15:21:35 +03:00
Binbin	35e8ae3eb5	Add cluster-port support to redis-cli --cluster (#10344 ) In #9389, we add a new `cluster-port` config and make cluster bus port configurable, and currently redis-cli --cluster create/add-node doesn't support with a configurable `cluster-port` instance. Because redis-cli uses the old way (port + 10000) to send the `CLUSTER MEET` command. Now we add this support on redis-cli `--cluster`, note we don't need to explicitly pass in the `cluster-port` parameter, we can get the real `cluster-port` of the node in `clusterManagerNodeLoadInfo`, so the `--cluster create` and `--cluster add-node` interfaces have not changed. We will use the `cluster-port` when we are doing `CLUSTER MEET`, also note that `CLUSTER MEET` bus-port parameter was added in 4.0, so if the bus_port (the one in redis-cli) is 0, or equal (port + 10000), we just call `CLUSTER MEET` with 2 arguments, using the old form. Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>	2022-07-11 11:23:31 +03:00
Madelyn Olson	e6a1b2ea95	Fix crash during handshake and cluster shards call (#10942 ) * Fix an engine crash when there are nodes in handshaking and a user calls cluster shards	2022-07-10 22:00:44 -07:00
Wen Hui	f620e6ac73	Add tests for error messages during slot migrations (#10449 ) * Add tests for error messages during slot migrations Co-authored-by: Ubuntu <lucas.guang.yang1@huawei.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>	2022-07-04 10:31:12 -05:00
Harkrishn Patro	0ab885a685	Account sharded pubsub channels memory consumption (#10925 ) Account sharded pubsub channels memory consumption in client memory usage computation to accurately evict client based on the set threshold for `maxmemory-clients`.	2022-07-04 09:18:57 +03:00
Yossi Gottlieb	69d5576832	Fix TLS tests on newer tcl-tls/OpenSSL. (#10910 ) Before this commit, TLS tests on Ubuntu 22.04 would fail as dropped connections result with an ECONNABORTED error thrown instead of an empty read.	2022-07-03 13:34:14 +03:00
Binbin	35e836c26d	Add SENTINEL command flag to CLIENT/COMMANDS subcommands (#10904 ) This was harmless because we marked the parent command with SENTINEL flag. So the populateCommandTable was ok. And we also don't show the flag (SENTINEL and ONLY-SENTNEL) in COMMAND INFO. In this PR, we also add the same CMD_SENTINEL and CMD_ONLY_SENTINEL flags check when populating the sub-commands. so that in the future it'll be possible to add some sub-commands to sentinel or sentinel-only but not others.	2022-06-30 16:32:40 +03:00
Wen Hui	51da5c3dde	Fix CLUSTER RESET command argument number issue (#10898 ) Fix regression of CLUSTER RESET command in redis 7.0. cluster reset command format is: CLUSTER RESET [ HARD \| SOFT] According to the cluster reset command doc and codes, the third argument is optional, so the arity in json file should be -2 instead of 3. Add test to verify future regressions with RESET and RESET SOFT that were not covered. Co-authored-by: Ubuntu <lucas.guang.yang1@huawei.com> Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2022-06-29 08:17:00 +03:00
jonnyomerredis	35c2ee8716	Add sharded pubsub keychannel count to client info (#10895 ) When calling CLIENT INFO/LIST, and in various debug prints, Redis is printing the number of pubsub channels / patterns the client is subscribed to. With the addition of sharded pubsub, it would be useful to print the number of keychannels the client is subscribed to as well.	2022-06-28 10:11:17 +03:00
Viktor Söderqvist	6af021007a	Add missing REDISMODULE_CLIENTINFO_INITIALIZER (#10885 ) The module API docs mentions this macro, but it was not defined (so no one could have used it). Instead of adding it as is, we decided to add a _V1 macro, so that if / when we some day extend this struct, modules that use this API and don't need the extra fields, will still use the old version and still be compatible with older redis version (despite being compiled with newer redismodule.h)	2022-06-27 08:29:05 +03:00
RinChanNOW!	2854637385	Support conversion between `RedisModuleString` and `unsigned long long` (#10889 ) Since the ranges of `unsigned long long` and `long long` are different, we cannot read an `unsigned long long` integer from a `RedisModuleString` by `RedisModule_StringToLongLong` . So I added two new Redis Module APIs to support the conversion between these two types: * `RedisModule_StringToULongLong` * `RedisModule_CreateStringFromULongLong` Signed-off-by: RinChanNOWWW <hzy427@gmail.com>	2022-06-26 15:02:52 +03:00
Binbin	d443e312ad	redis-server command line arguments allow passing config name and value in the same arg (#10866 ) This commit has two topics. ## Passing config name and value in the same arg In #10660 (Redis 7.0.1), when we supported the config values that can start with `--` prefix (one of the two topics of that PR), we broke another pattern: `redis-server redis.config "name value"`, passing both config name and it's value in the same arg, see #10865 This wasn't a intended change (i.e we didn't realize this pattern used to work). Although this is a wrong usage, we still like to fix it. Now we support something like: ``` src/redis-server redis.conf "--maxmemory '700mb'" "--maxmemory-policy volatile-lru" --proc-title-template --my--title--template --loglevel verbose ``` ## Changes around --save Also in this PR, we undo the breaking change we made in #10660 on purpose. 1. `redis-server redis.conf --save --loglevel verbose` (missing `save` argument before anotehr argument). In 7.0.1, it was throwing an wrong arg error. Now it will work and reset the save, similar to how it used to be in 7.0.0 and 6.2.x. 3. `redis-server redis.conf --loglevel verbose --save` (missing `save` argument as last argument). In 6.2, it did not reset the save, which was a bug (inconsistent with the previous bullet). Now we will make it work and reset the save as well (a bug fix).	2022-06-26 14:36:39 +03:00
Viktor Söderqvist	6272ca609e	Add RM_SetClientNameById and RM_GetClientNameById (#10839 ) Adding Module APIs to let the module read and set the client name of an arbitrary connection.	2022-06-26 14:34:59 +03:00
judeng	d2405b9b6b	fix benchmark failure in daily test with TLS (#10896 ) The new test added in #10891 can fail with a different error. see comment in networking.c saying ```c /* That's a best effort error message, don't check write errors. * Note that for TLS connections, no handshake was done yet so nothing * is written and the connection will just drop. */ ```	2022-06-23 18:19:36 +03:00
judeng	49876158cc	fix redis-benchmark's bug: check if clients are created successfully in idle mode (#10891 ) my maxclients config: ``` redis-cli config get maxclients 1) "maxclients" 2) "4064" ``` Before this bug was fixed, creating 4065 clients appeared to be successful, but only 4064 were actually created``` ``` ./redis-benchmark -c 4065 -I Creating 4065 idle connections and waiting forever (Ctrl+C when done) cients: 4065 ``` now : ``` ./redis-benchmark -c 4065 -I Creating 4065 idle connections and waiting forever (Ctrl+C when done) Error from server: ERR max number of clients reached ./redis-benchmark -c 4064 -I Creating 4064 idle connections and waiting forever (Ctrl+C when done) clients: 4064 ```	2022-06-22 19:30:22 +03:00
Meir Shpilraien (Spielrein)	61baabd8d5	Fix crash on RM_Call with script mode. (#10886 ) The PR fixes 2 issues: ### RM_Call crash on script mode `RM_Call` can potentially be called from a background thread where `server.current_client` are not set. In such case we get a crash on `NULL` dereference. The fix is to check first if `server.current_client` is `NULL`, if it does we should verify disc errors and readonly replica as we do to any normal clients (no masters nor AOF). ### RM_Call block OOM commands when not needed Again `RM_Call` can be executed on a background thread using a `ThreadSafeCtx`. In such case `server.pre_command_oom_state` can be irrelevant and should not be considered when check OOM state. This cause OOM commands to be blocked when not necessarily needed. In such case, check the actual used memory (and not the cached value). Notice that in order to know if the cached value can be used, we check that the ctx that was used on the `RM_Call` is a ThreadSafeCtx. Module writer can potentially abuse the API and use ThreadSafeCtx on the main thread. We consider this as a API miss used.	2022-06-21 10:01:13 +03:00
Moti Cohen	4c72a09b78	Fix sentinel acl change test. Timing issue. (#10868 ) Co-authored-by: moticless <moticless@github.com>	2022-06-19 09:45:16 +03:00
Oran Agra	2189100383	optimize zset conversion on large ZRANGESTORE (#10789 ) when we know the size of the zset we're gonna store in advance, we can check if it's greater than the listpack encoding threshold, in which case we can create a skiplist from the get go, and avoid converting the listpack to skiplist later after it was already populated.	2022-06-14 21:12:45 +03:00
Oran Agra	8ef4f1dbad	Script that made modification will not break with unexpected NOREPLICAS error (#10855 ) If a script made a modification and then was interrupted for taking too long. there's a chance redis will detect that a replica dropped and would like to reject write commands with NOREPLICAS due to insufficient good replicas. returning an error on a command in this case breaks the script atomicity. The same could in theory happen with READONLY, MISCONF, but i don't think these state changes can happen during script execution.	2022-06-14 21:09:50 +03:00
Oran Agra	ffa0077041	Allow ECHO in loading and stale modes (#10853 ) I noticed that scripting.tcl uses INFO from within a script and thought it's an overkill and concluded it's nicer to use another CMD_STALE command, decided to use ECHO, and then noticed it's not at all allowed in stale mode. probably overlooked at #6843	2022-06-14 08:48:08 +03:00
Binbin	92fb4f4f61	Fixed SET and BITFIELD commands being wrongly marked movablekeys (#10837 ) The SET and BITFIELD command were added `get_keys_function` in #10148, causing them to be wrongly marked movablekeys in `populateCommandMovableKeys`. This was an unintended side effect introduced in #10148 (7.0 RC1) which could cause some clients an extra round trip for these commands in cluster mode. Since we define movablekeys as a way to determine if the legacy range [first, last, step] doesn't find all keys, then we need a completely different approach. The right approach should be to check if the legacy range covers all key-specs, and if none of the key-specs have the INCOMPLETE flag. This way, we don't need to look at getkeys_proc of VARIABLE_FLAG at all. Probably with the exception of modules, who may still not be using key-specs. In this PR, we removed `populateCommandMovableKeys` and put its logic in `populateCommandLegacyRangeSpec`. In order to properly serve both old and new modules, we must probably keep relying CMD_MODULE_GETKEYS, but do that only for modules that don't declare key-specs. For ones that do, we need to take the same approach we take with native redis commands. This approach was proposed by Oran. Fixes #10833 Co-authored-by: Oran Agra <oran@redislabs.com>	2022-06-12 08:22:18 +03:00
Christian Krieg	032619b82b	Fixing test to consider statically linked binaries (#10835 ) The test calls `ldd` on `redis-server` in order to find out whether the binary was linked against `libmusl`; However, `ldd` returns a value different from `0` when statically linking the binaries agains libc-musl, because `redis-server` is not a dynamic executable (as given by the exception thrown by the failing test), and `make test` terminates with an error:: $ ldd src/redis-server not a dynamic executable $ echo $? 1 This commit fixes the test by ignoring such failures. Co-authored-by: Yossi Gottlieb <yossigo@gmail.com>	2022-06-09 12:59:33 +03:00
Petr Vaněk	f22bfe86b6	Update musl libc detection pattern (#10826 ) This change fixes failing `integration/logging.tcl` test in Gentoo with musl libc, where `ldd` returns ``` libc.so => /lib/ld-musl-x86_64.so.1 (0x7f9d5f171000) ``` unlike Alpine's ``` libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f82cfa16000) ``` The solution is to extend matching pattern introduced in #8532.	2022-06-07 18:47:01 +03:00
zhaozhao.zz	a18c91d642	rewrite alias config to original name (#10811 ) Redis 7 adds some new alias config like `hash-max-listpack-entries` alias `hash-max-ziplist-entries`. If a config file contains both real name and alias like this: ``` hash-max-listpack-entries 20 hash-max-ziplist-entries 20 ``` after set `hash-max-listpack-entries` to 100 and `config rewrite`, the config file becomes to: ``` hash-max-listpack-entries 100 hash-max-ziplist-entries 20 ``` we can see that the alias config is not modified, and users will get wrong config after restart. 6.0 and 6.2 doesn't have this bug, since they only have the `slave` word alias. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-06-02 14:03:47 +03:00
zhugezy	cf3323dba4	Fix bugs in CONFIG REWRITE, omitting rename-command and include lines, and inserting comments around module and acl configs (#10761 ) A regression from #10285 (redis 7.0). CONFIG REWRITE would put lines with: `include`, `rename-command`, `user`, `loadmodule`, and any module specific config in a comment. For ACL `user`, `loadmodule` and module specific configs would be re-inserted at the end (instead of updating existing lines), so the only implication is a messy config file full of comments. But for `rename-command` and `include`, the implication would be that they're now missing, so a server restart would lose them. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-06-02 08:36:55 +03:00
Oran Agra	df55861838	Expose script flags to processCommand for better handling (#10744 ) The important part is that read-only scripts (not just EVAL_RO and FCALL_RO, but also ones with `no-writes` executed by normal EVAL or FCALL), will now be permitted to run during CLIENT PAUSE WRITE (unlike before where only the _RO commands would be processed). Other than that, some errors like OOM, READONLY, MASTERDOWN are now handled by processCommand, rather than the command itself affects the error string (and even error code in some cases), and command stats. Besides that, now the `may-replicate` commands, PFCOUNT and PUBLISH, will be considered `write` commands in scripts and will be blocked in all read-only scripts just like other write commands. They'll also be blocked in EVAL_RO (i.e. even for scripts without the `no-writes` shebang flag. This commit also hides the `may_replicate` flag from the COMMAND command output. this is a breaking change. background about may_replicate: We don't want to expose a no-may-replicate flag or alike to scripts, since we consider the may-replicate thing an internal concern of redis, that we may some day get rid of. In fact, the may-replicate flag was initially introduced to flag EVAL: since we didn't know what it's gonna do ahead of execution, before function-flags existed). PUBLISH and PFCOUNT, both of which because they have side effects which may some day be fixed differently. code changes: The changes in eval.c are mostly code re-ordering: - evalCalcFunctionName is extracted out of evalGenericCommand - evalExtractShebangFlags is extracted luaCreateFunction - evalGetCommandFlags is new code	2022-06-01 14:09:40 +03:00
Oran Agra	b2061de2e7	Fix broken protocol in MISCONF error, RM_Yield bugs, RM_Call(EVAL) OOM check bug, and new RM_Call checks. (#10786 ) * Fix broken protocol when redis can't persist to RDB (general commands, not modules), excessive newline. regression of #10372 (7.0 RC3) * Fix broken protocol when Redis can't persist to AOF (modules and scripts), missing newline. * Fix bug in OOM check of EVAL scripts called from RM_Call. set the cached OOM state for scripts before executing module commands too, so that it can serve scripts that are executed by modules. i.e. in the past EVAL executed by RM_Call could have either falsely fail or falsely succeeded because of a wrong cached OOM state flag. * Fix bugs with RM_Yield: 1. SHUTDOWN should only accept the NOSAVE mode 2. Avoid eviction during yield command processing. 3. Avoid processing master client commands while yielding from another client * Add new two more checks to RM_Call script mode. 1. READONLY You can't write against a read only replica 2. MASTERDOWN Link with MASTER is down and `replica-serve-stale-data` is set to `no` * Add new RM_Call flag to let redis automatically refuse `deny-oom` commands while over the memory limit. * Add tests to cover various errors from Scripts, Modules, Modules calling scripts, and Modules calling commands in script mode. Add tests: * Looks like the MISCONF error was completely uncovered by the tests, add tests for it, including from scripts, and modules * Add tests for NOREPLICAS from scripts * Add tests for the various errors in module RM_Call, including RM_Call that calls EVAL, and RM_call in "eval mode". that includes: NOREPLICAS, READONLY, MASTERDOWN, MISCONF	2022-06-01 13:04:22 +03:00
Harkrishn Patro	4065b4f27e	Sharded pubsub publish messagebulk as smessage (#10792 ) To easily distinguish between sharded channel message and a global channel message, introducing `smessage` (instead of `message`) as message bulk for sharded channel publish message. This is gonna be a breaking change in 7.0.1! Background: Sharded pubsub introduced in redis 7.0, but after the release we quickly realized that the fact that it's problematic that the client can't distinguish between normal (global) pubsub messages and sharded ones. This is important because the same connection can subscribe to both, but messages sent to one pubsub system are not propagated to the other (they're completely separate), so if one connection is used to subscribe to both, we need to assist the client library to know which message it got so it can forward it to the correct callback.	2022-05-31 08:03:59 +03:00
Madelyn Olson	ed29d634b3	Add readonly flag to EVAL_RO, EVALSHA_RO and FCALL_RO (#10728 ) * Add readonly flag to EVAL_RO, EVALSHA_RO and FCALL_RO * Require users to explicitly declare @scripting to get access to lua scripting.	2022-05-29 23:42:56 -07:00
Binbin	1013cbeae2	Fix sentinel disconnect test timing issue after auth-pass change (#10784 ) There is a timing issue reported in test-sanitizer-address (gcc): ``` Sentinels (re)connection following SENTINEL SET mymaster auth-pass: FAILED: Expected to be disconnected from master due to wrong password ``` The reason we reach it, is because the test is fast enough to modify auth-pass and test sentinel connection status with the server, before its scheduled operation got the chance to update connection status with the server. We need to wait for `sentinelTimer` to kick in, and then update the connection status. Replace condition with wait_for_condition on the check. Fix just like #10480 did	2022-05-29 08:38:38 +03:00
Vitaly	6461f09f43	Fix ZRANGESTORE crash when zset_max_listpack_entries is 0 (#10767 ) When `zrangestore` is called container destination object is created. Before this PR we used to create a listpack based object even if `zset-max-ziplist-entries` or equivalent`zset-max-listpack-entries` was set to 0. This triggered immediate conversion of the listpack into a skiplist in `zrangestore`, which hits an assertion resulting in an engine crash. Added a TCL test that reproduces this issue.	2022-05-27 22:34:00 +03:00
Binbin	6f7c1a8ce6	Fix outdated comment about flags in moduleCreateArgvFromUserFormat (#10781 ) Clearly more than one flag exists, also fixed some typos. Fixes #10776	2022-05-26 17:34:17 +03:00
Valentino Geron	9eb97b5d94	Fix regex support in --only, --skipfile and --skiptest (#10741 ) The regex support was added in: * https://github.com/redis/redis/pull/9352 * https://github.com/redis/redis/pull/9555 * https://github.com/redis/redis/pull/10212 These commits break backword compatiblity with older versions. This fix keeps the test suite infra compatible with old versions by default. However, if you want regex, the string must start with `/`	2022-05-25 18:25:38 +03:00
Binbin	450c88f368	Fix BZMPOP gets unblocked by non-key args and returns them (#10764 ) This bug was introduced in #9484 (7.0.0). It result that BZMPOP blocked on non-key arguments. Like `bzmpop 0 1 myzset min count 10`, this command will additionally block in these keys (except for the first and the last argument) and can return their values: - 0: timeout value - 1: numkeys value - min: min/max token - count: count token	2022-05-23 14:15:54 +03:00
Oran Agra	b0e18f804d	Scripts that declare the `no-writes` flag are implicitly `allow-oom` too. (#10699 ) Scripts that have the `no-writes` flag, cannot execute write commands, and since all `deny-oom` commands are write commands, we now act as if the `allow-oom` flag is implicitly set for scripts that set the `no-writes` flag. this also implicitly means that the EVAL_RO and FCALL_RO commands can never fails with OOM error. Note about a bug that's no longer relevant: There was an issue with EVAL_RO using shebang not being blocked correctly in OOM state: When an EVAL script declares a shebang, it was by default not allowed to run in OOM state. but this depends on a flag that is updated before the command is executed, which was not updated in case of the `_RO` variants. the result is that if the previous cached state was outdated (either true or false), the script will either unjustly fail with OOM, or unjustly allowed to run despite the OOM state. It doesn't affect scripts without a shebang since these depend on the actual commands they run, and since these are only read commands, they don't care for that cached oom state flag. it did affect scripts with shebang and no allow-oom flag, bug after the change in this PR, scripts that are run with eval_ro would implicitly have that flag so again the cached state doesn't matter. p.s. this isn't a breaking change since all it does is allow scripts to run when they should / could rather than blocking them.	2022-05-22 16:02:59 +03:00
Wen Hui	135998ed8d	Update comments on command args, and a misleading error reply (#10645 ) Updated the comments for: info command lmpopCommand and blmpopCommand sinterGenericCommand Fix the missing "key" words in the srandmemberCommand function For LPOS command, when rank is 0, prompt user that rank could be positive number or negative number, and add a test for it	2022-05-13 17:55:49 +03:00
Binbin	586a16ad79	Fix race in module fork kill test (#10717 ) The purpose of the test is to kill the child while it is running. From the last two lines we can see the child exits before being killed. ``` - Module fork started pid: 56998 * <fork> fork child started - Killing running module fork child: 56998 * <fork> fork child exiting signal-handler (1652267501) Received SIGUSR1 in child, exiting now. ``` In this commit, we pass an argument to `fork.create` indicating how long it should sleep. For the fork kill test, we use a longer time to avoid the child exiting before being killed. Other changes: use wait_for_condition instead of hardcoded `after 250`. Unify the test for failing fork with the one for killing it (save time)	2022-05-12 20:10:38 +03:00
Binbin	bfbb15f75d	redis-server command line arguments support take one bulk string with spaces for MULTI_ARG configs parsing. And allow options value to use the -- prefix (#10660 ) ## Take one bulk string with spaces for MULTI_ARG configs parsing Currently redis-server looks for arguments that start with `--`, and anything in between them is considered arguments for the config. like: `src/redis-server --shutdown-on-sigint nosave force now --port 6380` MULTI_ARG configs behave differently for CONFIG command, vs the command line argument for redis-server. i.e. CONFIG command takes one bulk string with spaces in it, while the command line takes an argv array with multiple values. In this PR, in config.c, if `argc > 1` we can take them as is, and if the config is a `MULTI_ARG` and `argc == 1`, we will split it by spaces. So both of these will be the same: ``` redis-server --shutdown-on-sigint nosave force now --shutdown-on-sigterm nosave force redis-server --shutdown-on-sigint nosave "force now" --shutdown-on-sigterm nosave force redis-server --shutdown-on-sigint nosave "force now" --shutdown-on-sigterm "nosave force" ``` ## Allow options value to use the `--` prefix Currently it decides to switch to the next config, as soon as it sees `--`, even if there was not a single value provided yet to the last config, this makes it impossible to define a config value that has `--` prefix in it. For instance, if we want to set the logfile to `--my--log--file`, like `redis-server --logfile --my--log--file --loglevel verbose`, current code will handle that incorrectly. In this PR, now we allow a config value that has `--` prefix in it. But note that something like `redis-server --some-config --config-value1 --config-value2 --loglevel debug` would not work, because if you want to pass a value to a config starting with `--`, it can only be a single value. like: `redis-server --some-config "--config-value1 --config-value2" --loglevel debug` An example (using `--` prefix config value): ``` redis-server --logfile --my--log--file --loglevel verbose redis-cli config get logfile loglevel 1) "loglevel" 2) "verbose" 3) "logfile" 4) "--my--log--file" ``` ### Potentially breaking change `redis-server --save --loglevel verbose` used to work the same as `redis-server --save "" --loglevel verbose` now, it'll error!	2022-05-11 11:33:35 +03:00
Binbin	783b210db4	FLUSHDB and FLUSHALL add call forceCommandPropagation / FLUSHALL reset dirty counter to 0 if we enable save (#10691 ) ## FLUSHALL We used to restore the dirty counter after `rdbSave` zeroed it if we enable save. Otherwise FLUSHALL will not be replicated nor put into the AOF. And then we do increment it again below. Without that extra dirty++, when db was already empty, FLUSHALL will not be replicated nor put into the AOF. We now gonna replace all that dirty counter magic with a call to forceCommandPropagation (REPL and AOF), instead of all the messing around with the dirty counter. Added tests to cover three part (dirty counter, REPL, AOF). One benefit other than cleaner code is that the `rdb_changes_since_last_save` is correct in this case. ## FLUSHDB FLUSHDB was not replicated nor put into the AOF when db was already empty. Unlike DEL on a non-existing key, FLUSHDB always does something, and that's to call the module hook. So basically FLUSHDB is never a NOP, and thus it should always be propagated. Not doing that, could mean that if a module does something in that hook, and wants to avoid issues of that hook being missing on the replica if the db is empty, it'll need to do complicated things. So now FLUSHDB add call forceCommandPropagation, we will always propagate FLUSHDB. Always propagating FLUSHDB seems like a safe approach that shouldn't have any drawbacks (other than looking odd) This was mentioned in #8972 ## Test section: We actually found it while solving a race condition in the BGSAVE test (other.tcl). It was found in extra_ci Daily Arm64 (test-libc-malloc). ``` [exception]: Executing test client: ERR Background save already in progress. ERR Background save already in progress ``` It look like `r flushdb` trigger (schedule) a bgsave right after `waitForBgsave r` and before `r save`. Changing flushdb to flushall, FLUSHALL will do a foreground save and then set the dirty counter to 0.	2022-05-11 11:21:16 +03:00
Meir Shpilraien (Spielrein)	442e73ea09	Fix #10705 , avoid relinking the same library twice. (#10706 ) Set `old_li` to NULL to avoid linking it again on error. Before the fix, loading an already existing library will cause the existing library to be added again. This cause not harm other then wrong statistics. The statistics that are effected by the issue are: * `libraries_count` and `functions_count` returned by `function stats` command * `used_memory_functions` returned on `info memory` command * `functions.caches` returned on `memory stats` command	2022-05-10 11:47:45 +03:00
Oran Agra	2bcd890d8a	Fix --save command line regression in redis 7.0.0 (#10690 ) Unintentional change in #9644 (since RC1) meant that an empty `--save ""` config from command line, wouldn't have clear any setting from the config file Added tests to cover that, and improved test infra to take additional command line args for redis-server	2022-05-09 13:37:49 +03:00
Oran Agra	eb915a82a5	Bug fixes for enum configs with overlapping bit flags (module API) (#10661 ) If we want to support bits that can be overlapping, we need to make sure that: 1. we don't use the same bit for two return values. 2. values should be sorted so that prefer ones (matching more bits) come first.	2022-05-09 13:36:53 +03:00
Lu JJ	87131a5fa6	fast path when SDIFF command has the same key as the first key (#10663 ) When user uses the same input key for SDIFF as the first one, the result must be empty, so we don't need to process the elements to test. This method is like the one done in zset‘s `zsetChooseDiffAlgorithm` Co-authored-by: Oran Agra <oran@redislabs.com>	2022-05-02 16:18:11 +03:00
meir	efa162bcd7	Protect any table which is reachable from globals and added globals white list. The white list is done by setting a metatable on the global table before initializing any library. The metatable set the `__newindex` field to a function that check the white list before adding the field to the table. Fields which is not on the white list are simply ignored. After initialization phase is done we protect the global table and each table that might be reachable from the global table. For each table we also protect the table metatable if exists.	2022-04-27 00:37:40 +03:00
meir	3731580b6b	Protect globals of both evals scripts and functions. Use the new `lua_enablereadonlytable` Lua API to protect the global tables of both evals scripts and functions. For eval scripts, the implemetation is easy, We simply call `lua_enablereadonlytable` on the global table to turn it into a readonly table. On functions its more complecated, we want to be able to switch globals between load run and function run. To achieve this, we create a new empty table that acts as the globals table for function, we control the actual globals using metatable manipulation. Notice that even if the user gets a pointer to the original tables, all the tables are set to be readonly (using `lua_enablereadonlytable` Lua API) so he can not change them. The following inlustration better explain the solution: ``` Global table {} <- global table metatable {.__index = __real_globals__} ``` The `__real_globals__` is set depends on the run context (function load or function call). Why this solution is needed and its not enough to simply switch globals? When we run in the context of function load and create our functions, our function gets the current globals that was set when they were created. Replacing the globals after the creation will not effect them. This is why this trick it mandatory.	2022-04-27 00:37:40 +03:00
Oran Agra	8192625458	Add module API flag for using enum configs as bit flags (#10643 ) Enables registration of an enum config that'll let the user pass multiple keywords that will be combined with `\|` as flags into the integer config value. ``` const char *enum_vals[] = {"none", "one", "two", "three"}; const int int_vals[] = {0, 1, 2, 4}; if (RedisModule_RegisterEnumConfig(ctx, "flags", 3, REDISMODULE_CONFIG_DEFAULT \| REDISMODULE_CONFIG_BITFLAGS, enum_vals, int_vals, 4, getFlagsConfigCommand, setFlagsConfigCommand, NULL, NULL) == REDISMODULE_ERR) { return REDISMODULE_ERR; } ``` doing: `config set moduleconfigs.flags "two three"` will result in 6 being passed to`setFlagsConfigCommand`.	2022-04-26 20:29:20 +03:00
chenyang8094	46ec6ad98e	Fix bug when AOF enabled after startup. put the new incr file in the manifest only when AOFRW is done. (#10616 ) Changes: - When AOF is enabled after startup, the data accumulated during `AOF_WAIT_REWRITE` will only be stored in a temp INCR AOF file. Only after the first AOFRW is successful, we will add it to manifest file. Before this fix, the manifest referred to the temp file which could cause a restart during that time to load it without it's base. - Add `aof_rewrites_consecutive_failures` info field for aofrw limiting implementation. Now we can guarantee that these behaviors of MP-AOF are the same as before (past redis releases): - When AOF is enabled after startup, the data accumulated during `AOF_WAIT_REWRITE` will only be stored in a visible place. Only after the first AOFRW is successful, we will add it to manifest file. - When disable AOF, we did not delete the AOF file in the past so there's no need to change that behavior now (yet). - When toggling AOF off and then on (could be as part of a full-sync), a crash or restart before the first rewrite is completed, would result with the previous version being loaded (might not be right thing, but that's what we always had).	2022-04-26 16:31:19 +03:00
Eduardo Semprebon	3a1d14259d	Allow configuring signaled shutdown flags (#10594 ) The SHUTDOWN command has various flags to change it's default behavior, but in some cases establishing a connection to redis is complicated and it's easier for the management software to use signals. however, so far the signals could only trigger the default shutdown behavior. Here we introduce the option to control shutdown arguments for SIGTERM and SIGINT. New config options: `shutdown-on-sigint [nosave \| save] [now] [force]` `shutdown-on-sigterm [nosave \| save] [now] [force]` Implementation: Support MULTI_ARG_CONFIG on createEnumConfig to support multiple enums to be applied as bit flags. Co-authored-by: Oran Agra <oran@redislabs.com>	2022-04-26 14:34:04 +03:00
Madelyn Olson	6fa8e4f7af	Set replicas to panic on disk errors, and optionally panic on replication errors (#10504 ) * Till now, replicas that were unable to persist, would still execute the commands they got from the master, now they'll panic by default, and we add a new `replica-ignore-disk-errors` config to change that. * Till now, when a command failed on a replica or AOF-loading, it only logged a warning and a stat, we add a new `propagation-error-behavior` config to allow panicking in that state (may become the default one day) Note that commands that fail on the replica can either indicate a bug that could cause data inconsistency between the replica and the master, or they could be in some cases (specifically in previous versions), a result of a command (e.g. EVAL) that failed on the master, but still had to be propagated to fail on the replica as well.	2022-04-26 13:25:33 +03:00
Madelyn Olson	efcd1bf394	By default prevent cross slot operations in functions and scripts with # (#10615 ) Adds the `allow-cross-slot-keys` flag to Eval scripts and Functions to allow scripts to access keys from multiple slots. The default behavior is now that they are not allowed to do that (unlike before). This is a breaking change for 7.0 release candidates (to be part of 7.0.0), but not for previous redis releases since EVAL without shebang isn't doing this check. Note that the check is done on both the keys declared by the EVAL / FCALL command arguments, and also the ones used by the script when making a `redis.call`. A note about the implementation, there seems to have been some confusion about allowing access to non local keys. I thought I missed something in our wider conversation, but Redis scripts do block access to non-local keys. So the issue was just about cross slots being accessed.	2022-04-26 12:09:21 +03:00
guybe7	df787764e3	Fix regression not aborting transaction on error, and re-edit some error responses (#10612 ) 1. Disk error and slave count checks didn't flag the transactions or counted correctly in command stats (regression from #10372 , 7.0 RC3) 2. RM_Call will reply the same way Redis does, in case of non-exisitng command or arity error 3. RM_WrongArtiy will consider the full command name 4. Use lowercase 'u' in "unknonw subcommand" (to align with "unknown command") Followup work of #10127	2022-04-25 13:08:13 +03:00

1 2 3 4 5 ...

1949 Commits