4613 Commits

Author SHA1 Message Date
antirez cdf2271c5b cluster.tcl: saner error handling.
Better handling of connection errors in order to update the table and
recover. Populate the startup nodes table after fetching the list of
nodes.

More work remains: it is still not as reliable as the
redis-rb-cluster implementation, which is the minimal reference
implementation for Redis Cluster clients.
2014-05-14 00:15:52 +02:00
antirez bae30479fb redis.tcl: return I/O error message when peer closes connection. 2014-05-14 00:14:35 +02:00
antirez 832a298005 Cluster: fixed data_age computation / check integer overflow. 2014-05-12 17:46:15 +02:00
Matt Stancliff 7c4decb101 Fix lack of strtold under Cygwin
Renaming strtold to strtod then casting
the result is the standard way of dealing with
no strtold in Cygwin.
2014-05-12 11:11:09 -04:00
Matt Stancliff 3e0e51dd9f Fix lack of SA_ONSTACK under Cygwin
Fixes #232
2014-05-12 11:10:24 -04:00
antirez 2692339138 Cluster: forced failover implemented.
Using CLUSTER FAILOVER FORCE it is now possible to failover a master in
a forced way, which means:

1) No check to understand if the master is up is performed.
2) No data age of the slave is checked. Even a slave with very old data
   can manually failover a master in this way.
3) No chat with the master is attempted to reach its replication offset:
   the master can just be down.
2014-05-12 16:34:20 +02:00
antirez 005f564eb3 Cluster: bypass data_age check for manual failovers.
Automatic failovers only happen in Redis Cluster if the slave trying to
be elected was disconnected from its master for no more than 10 times
the node-timeout value. However there should be no such check for
manual failovers, since these are initiated by the sysadmin who, in
theory, knows what she is doing when a slave is selected to be promoted.
2014-05-12 16:12:12 +02:00
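The rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the Redis implementation: the function name, its parameters, and the constant name are hypothetical, and only the "10 times the node-timeout" bound and the manual bypass come from the message above.

```python
# Hypothetical sketch; names are illustrative and not taken from the Redis source.
VALIDITY_MULT = 10  # "no more than 10 times the node-timeout value"


def slave_can_failover(data_age_ms: int, node_timeout_ms: int, manual: bool) -> bool:
    """Return True if a slave may try to get elected under the rule above."""
    if manual:
        # Manual failovers are started by the sysadmin: skip the data age check.
        return True
    # Automatic failovers require reasonably fresh data.
    return data_age_ms <= VALIDITY_MULT * node_timeout_ms
```

With a 15 second node timeout, an automatic failover would be refused in this sketch once the slave has been disconnected for more than 150 seconds, while a manual one is always allowed.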
Akos Vandra b252fab06c Fixed possible buffer overflow bug if RDB file is corrupted.
(Note: commit message modified by @antirez for clarity).
2014-05-12 11:48:14 +02:00
Akos Vandra 433e835d3e fixed possible buffer overflow error 2014-05-12 11:19:07 +02:00
antirez 658ad301cc redis-trib create: use CLUSTER SET-CONFIG-EPOCH before joining the cluster.
This way there is no need for the conflict resolution algo to be used in
order to start with a cluster where each node has a different
configEpoch.
2014-05-12 11:06:37 +02:00
antirez 63d1f9e570 Sentinel: Add "dir /tmp" directive in example sentinel.conf. 2014-05-12 10:46:25 +02:00
antirez 715a6d3a78 redis-trib import: trap MIGRATE errors. 2014-05-12 10:36:33 +02:00
antirez 939c586ef7 redis-trib.rb: MIGRATE hardcoded timeout set to 15 sec.
Will be configurable / adaptive at some point but let's start with a
saner value compared to 1 sec which is not a good idea for big data
structures stored into a single key.
2014-05-12 10:22:24 +02:00
antirez 5c78f87666 RESTORE: reply with -BUSYKEY special error code.
The error when the target key is busy was a generic one, while it makes
sense to be able to distinguish between the target key busy error and
the others easily.
2014-05-12 10:01:59 +02:00
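A client can use the dedicated error code to tell the busy-key case apart from other failures. The helper below is a hypothetical sketch: the function name is invented here, and it only assumes that the error code is the first word of the error reply.

```python
def is_busykey_error(reply: str) -> bool:
    """Return True if an error reply carries the BUSYKEY error code."""
    # On the wire an error reply starts with '-'; the first word is the code.
    code = reply.lstrip("-").split(" ", 1)[0]
    return code == "BUSYKEY"
```

A caller could, for instance, decide to skip or overwrite the key when this returns True, and treat every other error as fatal.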
antirez 2a48bd4a37 Cluster: initial ability to import data from standalone instance. 2014-05-10 17:59:31 +02:00
antirez 71d0e7e0ea CLUSTER MEET: better error messages when address is invalid.
Fixes issue #1734.
2014-05-09 16:36:59 +02:00
antirez 74435aba47 redis-trib: allow support for mandatory options. 2014-05-09 16:11:11 +02:00
antirez 72ff03346f DEBUG POPULATE: call dictExpand() to avoid useless rehashing. 2014-05-09 15:02:29 +02:00
antirez 8a170c817d Cluster: bulk-accept new nodes connections.
The same change was operated for normal client connections. This is
important for Cluster as well, since when a node rejoins the cluster,
when a partition heals or after a restart, it gets flooded with new
connection attempts by all the other nodes trying to form a full
mesh again.
2014-05-09 11:52:59 +02:00
antirez 3625b52791 Cluster: clusterAcceptHandler() comments updated to match the code. 2014-05-09 11:44:46 +02:00
antirez 2102778606 Sentinel: log when a failover will be attempted again.
When a Sentinel performs a failover (successful or not), or when a
Sentinel votes for a different Sentinel trying to start a failover, it
sets a min delay before it will try to get elected for a failover.

While not strictly needed, because if multiple Sentinels will try
to failover the same master at the same time, only one configuration
will eventually win, this serialization is practically very useful.
Normal failovers are cleaner: one Sentinel starts to failover, the
others update their config when the Sentinel performing the failover
is able to get the selected slave to move from the role of slave to the
one of master.

However currently this timeout was implicit, so users could see
Sentinels not reacting, after a failed failover, for some time, without
giving any feedback in the logs to the poor sysadmin waiting for clues.

This commit makes Sentinels more verbose about the delay: when a master
is down and a failover attempt is not performed because the delay has
still not elapsed, something like this will be logged:

    Next failover delay: I will not start a failover
    before Thu May  8 16:48:59 2014
2014-05-08 16:38:53 +02:00
antirez 931beae9b0 Sentinel: generate +config-update-from event when a new config is received.
This event makes clear, before the switch-master event is generated,
that a Sentinel received a configuration update from another Sentinel.
2014-05-08 15:59:34 +02:00
antirez 0b0f872f3f REDIS_ENCODING_EMBSTR_SIZE_LIMIT set to 39.
The new value is the limit for the robj + SDS header + string +
null-term to stay inside the 64-byte Jemalloc arena on 64-bit
systems.
2014-05-07 17:05:09 +02:00
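The arithmetic behind the value 39 can be checked with a few lines. The component sizes below are assumptions about the 64-bit layout at the time of this commit (a 16-byte object header and an 8-byte SDS header); they are not stated in the commit message itself.

```python
# Sizes are assumptions about the 64-bit layout at the time of this commit.
ROBJ_SIZE = 16        # object header: 4 bytes of bit fields, 4-byte refcount, 8-byte pointer
SDS_HEADER_SIZE = 8   # two 4-byte integers (len, free)
NULL_TERM = 1         # trailing '\0' after the string
ARENA_SIZE = 64       # the 64-byte allocation mentioned in the message

# Whatever is left after the fixed overhead is available for the string itself.
limit = ARENA_SIZE - ROBJ_SIZE - SDS_HEADER_SIZE - NULL_TERM
print(limit)  # prints 39
```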
antirez 76c31d425e Scripting test: check that Lua can call commands rewriting argv.
SPOP, tested in the new test, is among the commands rewriting the
client->argv argument vector (it gets rewritten as SREM) for command
replication purposes.

Because of recent optimizations to client->argv caching in the context
of the Lua internal Redis client, it is important to test for SPOP to be
callable from Lua without bad effects to the other commands.
2014-05-07 16:12:32 +02:00
antirez 088b9eadc4 Test: handle new osx 'leaks' error.
Sometimes the process is still there but no longer in a state that can
be checked (after being killed). This used to happen after a call to
SHUTDOWN NOSAVE in the scripting unit, causing a false positive.
2014-05-07 16:12:32 +02:00
antirez 4f686555ce Scripting: objects caching for Lua c->argv creation.
Reusing small objects when possible is a major speedup under certain
conditions, since it is able to avoid the malloc/free pattern that
otherwise is performed for every argument in the client command vector.
2014-05-07 16:12:32 +02:00
antirez 1e4ba6e7e6 Scripting: Use faster API for Lua client c->argv creation.
Replace the three calls to Lua API lua_tostring, lua_strlen,
and lua_isstring, with a single call to lua_tolstring.

~ 5% consistent speed gain measured.
2014-05-07 16:12:32 +02:00
antirez 76fda9f8e1 Scripting: don't call lua_gc() after Lua script run.
Calling lua_gc() after every script execution is too expensive, and
apparently does not make the execution smoother: the same peak latency
was measured before and after the commit.

This change accounts for a script execution speedup on the order of 10%.
2014-05-07 16:12:32 +02:00
antirez 48c49c4851 Scripting: cache argv in luaRedisGenericCommand().
~ 4% consistently measured speed improvement.
2014-05-07 16:12:32 +02:00
antirez 3318b74705 Fixed missing c->bufpos reset in luaRedisGenericCommand().
Bug introduced when adding a fast path to avoid copying the reply buffer
for small replies that fit into the client static buffer.
2014-05-07 16:12:32 +02:00
antirez c49955fd77 Scripting: replace tolower() with faster code in evalGenericCommand().
The function showed up consuming a non trivial amount of time in the
profiler output. After this change benchmarking gives a 6% speed
improvement that can be consistently measured.
2014-05-07 16:12:32 +02:00
antirez 0ef4f44c5a Scripting: luaRedisGenericCommand() fast path for buffer-only replies.
When the reply is only contained in the client static output buffer, use
a fast path avoiding the dynamic allocation of an SDS string to
concatenate the client reply objects.
2014-05-07 16:12:32 +02:00
antirez 8226be61ec Define HAVE_ATOMIC for clang. 2014-05-07 16:12:32 +02:00
antirez 40abeb1f40 Scripting: simpler reply buffer creation in luaRedisGenericCommand().
It is faster to just create the string with a single sdsnewlen() call.
If c->bufpos is zero, the call will simply be like sdsempty().
2014-05-07 16:12:32 +02:00
antirez 1c130c6b03 Test: cluster/base, check that we can write/read from cluster. 2014-05-02 16:37:12 +02:00
antirez 3bc119c155 Cluster: Tcl cluster client: handle MOVED/ASK. 2014-05-02 15:35:08 +02:00
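For readers unfamiliar with the redirections named in this commit, here is a small sketch of how a client might parse them. It is written in Python rather than Tcl and is not the code from the commit; the `Redirect` type and `parse_redirect` function are hypothetical names. It assumes the usual error layout of a kind, a slot number, and a `host:port` pair.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str   # "MOVED" or "ASK"
    slot: int
    host: str
    port: int


def parse_redirect(error: str) -> Optional[Redirect]:
    """Parse 'MOVED <slot> <host>:<port>' or 'ASK <slot> <host>:<port>'."""
    parts = error.lstrip("-").split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None  # not a redirection error
    # Split on the last ':' so the host part is kept whole.
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))
```

On MOVED a client would typically update its slots map and retry against the new node; on ASK it would retry once against the indicated node, preceded by ASKING, without touching the map.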
antirez fcd2065f8e Cluster: Tcl cluster client: slots-nodes map and close method.
Now the client is able to actually run commands in a Redis Cluster
assuming the slots->nodes map is stable.
2014-05-02 10:56:02 +02:00
antirez 5344357f80 Cluster: Tcl cluster client: build nodes representation. 2014-05-02 10:19:28 +02:00
antirez 8b7e23bdde Cluster: Tcl cluster client: get nodes description. 2014-05-02 09:55:27 +02:00
antirez bc8ea04a7d Cluster: Tcl cluster client key -> hashslot. 2014-04-30 18:55:28 +02:00
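The key to hash slot mapping this commit refers to can be sketched as below, following the rule in the Redis Cluster specification (CRC16 of the key modulo 16384, with hash tag support). This is a Python illustration and not the Tcl code from the commit; `key_hash_slot` is a hypothetical name, and the sketch relies on `binascii.crc_hqx` with an initial value of 0 computing the same CRC16 variant.

```python
import binascii


def key_hash_slot(key: bytes) -> int:
    """Map a key to one of the 16384 hash slots."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}'
        # is hashed in place of the whole key.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with initial value 0 is the CRC16/XMODEM variant.
    return binascii.crc_hqx(key, 0) % 16384
```

Keys sharing a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot under this rule.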
antirez e8357d0f85 Cluster test: Tcl cluster library initial skeleton. 2014-04-30 15:47:19 +02:00
antirez 1db45ba58c Cluster test: check for state=ok after slot allocation. 2014-04-30 09:29:03 +02:00
antirez 11d9ecb71d CLUSTER SET-CONFIG-EPOCH implemented.
Initially Redis Cluster accepted that after cluster creation all the
nodes were at configEpoch 0, evolving from zero as failovers happen.

However, the semantics were later made stricter in order to make sure
that all the master nodes in a cluster always have a different
configEpoch, which is more robust in some corner cases (especially
those resulting from errors by the system administrator).

To assign different configEpochs to different nodes at startup was a
task performed naturally by the config conflicts resolution algorithm
(see the Cluster specification). However this works well only for small
clusters or when there are actually just a few collisions, since it is
designed for exceptional cases.

When a large cluster is created hundreds of nodes can be at epoch 0, so
the conflict resolution code is slow to provide a unique config to each
node. For this reason this new command was introduced. It can be called
only when a node is totally fresh: no other nodes known, and configEpoch
set to zero, so it is safe even against misuses.

redis-trib will use the new command in order to start the cluster
already setting an incremental unique config to every node.
2014-04-29 19:15:16 +02:00
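The behaviour described above can be modelled with a toy sketch. Everything here is hypothetical: the `Node` class and `assign_epochs` helper are invented for illustration, and starting the numbering at 1 is an assumption. Only the two preconditions (no other nodes known, configEpoch still zero) and the idea of handing out incremental unique epochs come from the commit message.

```python
class Node:
    """Toy model of a cluster node; not the Redis data structure."""

    def __init__(self, name: str):
        self.name = name
        self.config_epoch = 0
        self.known_nodes = set()

    def set_config_epoch(self, epoch: int) -> None:
        # Allowed only on a totally fresh node, so misuse is rejected.
        if self.known_nodes or self.config_epoch != 0:
            raise RuntimeError("node is not fresh")
        self.config_epoch = epoch


def assign_epochs(nodes) -> None:
    """Give every fresh node a distinct, incremental epoch before joining."""
    for epoch, node in enumerate(nodes, start=1):  # starting at 1 is an assumption
        node.set_config_epoch(epoch)
```

Calling `assign_epochs` a second time on the same nodes raises an error in this model, mirroring the "only when a node is totally fresh" restriction.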
antirez 7b5ce1ffb1 Cluster test: slots allocation. 2014-04-29 18:40:43 +02:00
antirez 4a3db25504 Cluster test: use 20 instances.
This makes tests a bit slower, but it is better to test things at a
decent scale instead of using just a few nodes, and for a few tests we
actually need so many nodes.
2014-04-29 16:20:43 +02:00
antirez e8631a6991 Cluster / Sentinel test: instances count moved to run.tcl. 2014-04-29 16:17:15 +02:00
antirez 9e422f74a6 Cluster test: config epoch conflict resolution. 2014-04-29 15:39:59 +02:00
antirez 2c55622333 Cluster test: auto-discovery to form full mesh. 2014-04-29 15:00:11 +02:00
antirez 2555b2f4bd Cluster test: check that every node has a different ID. 2014-04-29 10:42:32 +02:00
antirez e1b129811a Cluster test: basic cluster nodes info access functions. 2014-04-29 10:42:17 +02:00