Commit Graph

1855 Commits

Author SHA1 Message Date
antirez c2595500ac Cluster: request failover authorization, log if we have quorum.
However the failover is yet not really performed.
2013-03-14 16:39:02 +01:00
antirez 7fa42b801d Cluster: clusterSendFailoverAuth() implementation. 2013-03-14 16:31:57 +01:00
NanXiao 79a13b46fb Update config.c
Fix bug in configGetCommand function: get correct masterauth value.
2013-03-14 13:52:43 +08:00
antirez f59ff6fe61 Cluster: clusterSendFailoverAuthIfNeeded() work in progress. 2013-03-13 19:08:03 +01:00
antirez 44f6fdab60 Cluster: handle FAILOVER_AUTH_REQUEST in clusterProcessPacket().
However currently the control is passed to a function doing nothing at
all.
2013-03-13 18:38:08 +01:00
antirez ece95b2dea Cluster: sanity check FAILOVER_AUTH_REQUEST messages for proper length. 2013-03-13 17:31:26 +01:00
antirez 66144337bf Cluster: use 'else if' for mutually exclusive conditionals. 2013-03-13 17:27:06 +01:00
antirez db7c17e969 Cluster: FAILOVER_AUTH_REQUEST message type introduced.
This message is sent by a slave that is ready to failover its master to
other nodes to get the authorization from the majority of masters.
2013-03-13 17:21:20 +01:00
antirez 575cbc9990 Cluster: clusterHandleSlaveFailover() stub. 2013-03-13 13:10:49 +01:00
antirez 1902a9c532 Replication: master_link_down_since_seconds initial value should be huge.
server.repl_down_since used to be initialized to the current time at
startup. This is wrong since the replication never started. Clients
testing this filed to check if data is uptodate should never believe
data is recent if we never ever connected to our master.
2013-03-13 12:54:48 +01:00
antirez 3d448bda39 Cluster: call clusterHandleSlaveFailover() when our master is down. 2013-03-13 12:44:02 +01:00
antirez 79a6844e44 rdbLoad(): rework code to save vertical space. 2013-03-12 19:46:33 +01:00
Salvatore Sanfilippo 9925c7c670 Merge pull request #1001 from djanowski/fatal-errors-rdb-load
Abort when opening the RDB file results in an error other than ENOENT.
2013-03-12 11:40:36 -07:00
Damian Janowski 4178a80282 Abort when opening the RDB file results in an error other than ENOENT.
This fixes cases where the RDB file does exist but can't be accessed for
any reason. For instance, when the Redis process doesn't have enough
permissions on the file.
2013-03-12 14:37:50 -03:00
antirez 215bfaea16 Set default for stop_writes_on_bgsave_err in initServerConfig().
It was placed for error in initServer() that's called after the
configuation is already loaded, causing issue #1000.
2013-03-12 18:34:08 +01:00
antirez 91d3b487e7 redis-cli --bigkeys: don't crash with empty DBs. 2013-03-12 09:58:00 +01:00
antirez 2d851333a6 activeExpireCycle() smarter with many DBs and under expire pressure.
activeExpireCycle() tries to test just a few DBs per iteration so that
it scales if there are many configured DBs in the Redis instance.
However this commit makes it a bit smarter when one a few of those DBs
are under expiration pressure and there are many many keys to expire.

What we do is to remember if in the last iteration had to return because
we ran out of time. In that case the next iteration we'll test all the
configured DBs so that we are sure we'll test again the DB under
pressure.

Before of this commit after some mass-expire in a given DB the function
tested just a few of the next DBs, possibly empty, a few per iteration,
so it took a long time for the function to reach again the DB under
pressure. This resulted in a lot of memory being used by already expired
keys and never accessed by clients.
2013-03-11 11:10:33 +01:00
antirez 08b107e405 In databasesCron() never test more DBs than we have. 2013-03-11 10:51:03 +01:00
antirez 4b1ccdfd49 Make comment name match var name in activeExpireCycle(). 2013-03-11 10:42:14 +01:00
antirez 1f7d2c1e27 Optimize inner loop of activeExpireCycle() for no-expires case. 2013-03-09 11:48:54 +01:00
antirez 5f5aa487f9 REDIS_DBCRON_DBS_PER_SEC -> REDIS_DBCRON_DBS_PER_CALL 2013-03-09 11:44:20 +01:00
antirez db29d71a30 activeExpireCycle(): process only a small number of DBs per iteration.
This small number of DBs is set to 16 so actually in the default
configuraiton Redis should behave exactly like in the past.
However the difference is that when the user configures a very large
number of DBs we don't do an O(N) operation, consuming a non trivial
amount of CPU per serverCron() iteration.
2013-03-08 17:48:58 +01:00
antirez 40a2da159c Use unsigned integers for DB ids, for defined wrap-to-zero. 2013-03-08 17:41:20 +01:00
antirez 7ac3b3a486 Only resize/rehash a few databases per cron iteration.
This is the first step to lower the CPU usage when many databases are
configured. The other is to also process a limited number of DBs per
call in the active expire cycle.
2013-03-08 14:01:12 +01:00
antirez dfd732dff3 Actually call databasesCron() inside serverCron(). 2013-03-08 13:59:50 +01:00
antirez cd9dcd1835 Move Redis databases background processing to databasesCron(). 2013-03-08 12:34:05 +01:00
antirez f0b807cd47 Cluster: update cluster state on PFAIL flag set/cleared on nodes. 2013-03-07 15:40:53 +01:00
antirez 299b8f76c2 Cluster: mark cluster state as fail of majority of masters is unreachable. 2013-03-07 15:36:59 +01:00
antirez abf06fd5ff Cluster: log global cluster state change. 2013-03-07 15:22:32 +01:00
antirez 3dad8196b7 Cluster: clusterUpdateState() function simplified.
Also the NEEDHELP Cluster state was removed as it will no longer be
used by Redis Cluster.
2013-03-06 18:25:40 +01:00
Gengliang Wang 042ed270c8 Removed useless "return" statements in pubsub.c
(original commit message edited)
2013-03-06 16:49:20 +01:00
antirez 7b190a08cf API to lookup commands with their original name.
A new server.orig_commands table was added to the server structure, this
contains a copy of the commant table unaffected by rename-command
statements in redis.conf.

A new API lookupCommandOrOriginal() was added that checks both tables,
new first, old later, so that rewriteClientCommandVector() and friends
can lookup commands with their new or original name in order to fix the
client->cmd pointer when the argument vector is renamed.

This fixes the segfault of issue #986, but does not fix a wider range of
problems resulting from renaming commands that actually operate on data
and are registered into the AOF file or propagated to slaves... That is
command renaming should be handled with care.
2013-03-06 16:28:26 +01:00
antirez bfa25441e7 Handle a non-impossible empty argv in loadServerConfigFromString().
Usually this does not happens since we trim for " \t\r\n", but if there
are other chars that return true with isspace(), we may end with an
empty argv. Better to handle the condition in an explicit way.
2013-03-06 12:40:48 +01:00
antirez 8c193af696 redis-cli: use sdsfreesplitres() instead of hand-coding it. 2013-03-06 12:38:32 +01:00
antirez 011fa89ac9 Cluster: sdssplitargs_free() -> sdsfreesplitres(). 2013-03-06 12:38:06 +01:00
antirez 729a3432ba sds.c: sdssplitargs_free() removed as it was a duplicate. 2013-03-06 12:38:06 +01:00
antirez cf4d7737bb More specific error message in loadServerConfigFromString(). 2013-03-06 12:24:12 +01:00
antirez 4ea89e64c0 sdssplitargs(): on error set *argc to 0.
This makes programs not checking the return value for NULL much safer
since with this change:

1) It is still possible to iterate the zero-length result without
crashes.
2) sdssplitargs_free will work against NULL and 0 count.
2013-03-06 12:21:31 +01:00
antirez 5cabae84e6 sdssplitargs(): now returns NULL only on error.
An empty input string also resulted into the function returning NULL
making it harder for the caller to distinguish between error and empty
string without checking the original input string length.
2013-03-06 12:21:21 +01:00
charsyam 1303f02be6 Don't segfault on unbalanced quotes. 2013-03-06 11:54:02 +01:00
antirez 304ef5e283 Allow AUTH while loading the DB in memory.
While Redis is loading the AOF or RDB file in memory only a subset of
commands are allowed. This commit adds AUTH to this subset.
2013-03-06 11:50:38 +01:00
antirez 1025dd7786 Cluster: connect to our master ASAP after startup if we are a slave node. 2013-03-05 16:12:08 +01:00
antirez bac57ad14b Cluster: more robust FAIL flag cleaup.
If we have a master in FAIL state that's reachable again, and apparently
no one is going to serve its slots, clear the FAIL flag and let the
cluster continue with its operations again.
2013-03-05 15:05:32 +01:00
antirez 1a02b7440a Cluster: new node field fail_time.
This is the unix time at which we set the FAIL flag for the node.
It is only valid if FAIL is set.

The idea is to use it in order to make the cluster more robust, for
instance in order to revert a FAIL state if it is long-standing but
still slots are assigned to this node, that is, no one is going to fix
these slots apparently.
2013-03-05 13:15:05 +01:00
antirez d3b4662347 Cluster: don't check keys hash slots when the source is our master.
Usually we redirect clients to the right hash slot, however we don't
want to do that with our master, we want just to mirror it.
2013-03-05 13:02:44 +01:00
antirez 31ac376051 Cluster: slaveof not allowed in redis.conf as well. 2013-03-05 12:58:22 +01:00
antirez b7d085fc0d Cluster: SLAVEOF command not allowed in cluster mode. 2013-03-05 12:39:41 +01:00
antirez e4b481a5f6 Cluster: A comment updated in clusterCron(). 2013-03-05 12:17:30 +01:00
antirez d728ec6dee Cluster: send a ping to every node we never contacted in timeout/2 seconds.
Usually we try to send just 1 ping every second, however when we detect
we are going to have unreliable failure detection because we can't ping
some node in time, send an additional ping.

This should only happen with very large clusters or when the the node
timeout is set to a very low value.
2013-03-05 12:16:02 +01:00
antirez e7628be2a7 Cluster: set node->slaveof correctly when a node state is updated. 2013-03-05 11:50:11 +01:00
antirez d6457577d4 Cluster: don't perform startup slots sanity check for slaves.
If we are a cluster node the DB content will not match our configured
slots. Don't do the check at all.
2013-03-04 19:47:00 +01:00
antirez d334897e80 Cluster: fix maximum line length when loading config.
There are pathological cases where the line can be even longer a single
node may contain all the slots in importing/migrating state.
2013-03-04 19:45:36 +01:00
antirez 3be893123f Make sure replicationSetMaster() works when ip argument is not an sds. 2013-03-04 15:39:55 +01:00
antirez b8a28bf442 Cluster: actually setup replication in CLUSTER REPLICATE. 2013-03-04 15:27:58 +01:00
antirez 7bead003e2 SLAVEOF command refactored into a proper API.
We now have replicationSetMaster() and replicationUnsetMaster() that can
be called in other contexts (for instance Redis Cluster).
2013-03-04 13:22:21 +01:00
antirez 0c01088b51 Cluster: REPLICATE subcommand and stub for clusterSetMaster(). 2013-03-04 13:15:09 +01:00
charsyam bc84c399f8 adding check error code
adding check error code
2013-03-04 11:20:11 +01:00
antirez 3b3974410e redis-cli: use keepalive socket option.
This should improve things in two ways:

1) Prevent timeouts caused by the execution of long commands.
2) Improve detection of real connection errors.

This is mostly effective only on Linux because of the bogus default
keepalive settings. In Linux we have OS-specific calls to set the
keepalive interval to reasonable values.
2013-03-04 11:14:32 +01:00
Salvatore Sanfilippo 174e51cb75 Merge pull request #967 from 0x20h/fix-git-diff
suppress external diff program when using git diff.
2013-03-04 01:57:01 -08:00
antirez caf9b24a7d Cluster: don't set the slot as unassigned because of PONG info.
As stated in the comment this is usually due to a resharding in progress
so the client should be still redirected to the old node that will
handle the redirection elsewhere.
2013-02-28 15:54:29 +01:00
antirez 0d77440b26 Cluster: better handling of slots changes in PONG packets.
The new code makes sure that the node slots bitmap is always consistent
with the cluster->slots array.
2013-02-28 15:41:54 +01:00
antirez 5f8fd27ace Cluster: refactoring of clusterNode*Bit to use helper bitmap functions. 2013-02-28 15:23:09 +01:00
antirez d21d6b666f Cluster: use node->numslots instead of popcount() where possible. 2013-02-28 15:13:32 +01:00
antirez 4521115b17 Cluster: new field in cluster node structure, "numslots".
Before a relatively slow popcount() operation was needed every time we
needed to get the number of slots served by a given cluster node.
Now we just need to check an integer that is taken in sync with the
bitmap.
2013-02-28 15:11:05 +01:00
antirez a2566d6618 Cluster: don't gossip about nodes that are not useful to the cluster. 2013-02-28 15:00:09 +01:00
antirez bc922dc688 redis-trib: skip nodes without slots when creating the config signature. 2013-02-28 13:12:56 +01:00
antirez 64942fca01 redis-trib help. 2013-02-27 18:02:22 +01:00
antirez d45d184118 Cluster: CLUSTER FORGET implemented. 2013-02-27 17:55:59 +01:00
antirez d2b8281b3f Cluster: added a missing return on CLUSTER SETSLOT. 2013-02-27 17:53:48 +01:00
antirez 7ddc0fe652 redis-trib: skip noaddr and disconnected nodes while loading cluster info. 2013-02-27 17:23:11 +01:00
antirez d20dea3eb7 Cluster: blank node address when flagging it as NOADDR. 2013-02-27 17:09:33 +01:00
antirez 2dcb5ab72b Cluster: add comments in sub-sections of CLUSTER command. 2013-02-27 16:12:59 +01:00
antirez 96dd210970 redis-trib: initial implementation of addnode command. 2013-02-27 15:58:41 +01:00
antirez f7dac639a9 Remove warning when printing redisBuildId(). 2013-02-27 12:33:27 +01:00
antirez e4682c82e3 Remove too agressive/spamming log in rdb.c. 2013-02-27 12:29:20 +01:00
antirez f9b5ca29fd Use GCC printf format attribute for redisLog().
This commit also fixes redisLog() statements producing warnings.
2013-02-27 12:27:15 +01:00
antirez c35b065a64 Better panic message for failed time event creation. 2013-02-27 12:00:11 +01:00
Stam He e431a97660 add a check for aeCreateTimeEvent
1) Add a check for aeCreateTimeEvent in function initServer.
2013-02-27 11:57:35 +01:00
Stam He 9c8be6cab9 Set proctitle: avoid the use of __attribute__((constructor)).
This cased a segfault in some Linux system and was GCC-specific.

Commit modified by @antirez:

1) Stripped away the part to set the proc title via config for now.
2) Handle initialization of setproctitle only when the replacement
   is used.
3) Don't require GCC now that the attribute constructor is no
   longer used.
2013-02-27 11:50:35 +01:00
antirez deb1f4d841 setproctitle.c: declar tmp as static so valgrind will not detect a leak. 2013-02-26 15:51:33 +01:00
antirez d0992d6e8b Cluster: a few random fixes to the new failure detection. 2013-02-26 15:15:44 +01:00
antirez f288b07563 Cluster: log the event when we clear the FAIL flag. 2013-02-26 15:03:38 +01:00
antirez 97ffcd351b Cluster: use the failure report API to reimplement failure detection.
The new system detects a failure only when there is quorum from masters.
2013-02-26 14:58:39 +01:00
antirez 6356cf6808 Set process name in ps output to make operations safer.
This commit allows Redis to set a process name that includes the binding
address and the port number in order to make operations simpler.

Redis children processes doing AOF rewrites or RDB saving change the
name into redis-aof-rewrite and redis-rdb-bgsave respectively.

This in general makes harder to kill the wrong process because of an
error and makes simpler to identify saving children.

This feature was suggested by Arnaud GRANAL in the Redis Google Group,
Arnaud also pointed me to the setproctitle.c implementation includeed in
this commit.

This feature should work on all the Linux, OSX, and all the three major
BSD systems.
2013-02-26 11:52:12 +01:00
antirez 1b1b3f6c06 Cluster: invert two functions declarations in more natural order. 2013-02-26 11:19:48 +01:00
antirez d5e8b0a47f Cluster: cleanup idle failure reports every time we remove one.
This is not very important as anyway when the function counting the
number of reports is called the cleanup is performed. However with this
change if only part of the nodes that reported the failure will report
the node is back ok, we'll cleanup the older entries ASAP. In complex
split net split scenarios, and when we are dealing with clusters having
nodes in the order of ~ 1000, this can save some CPU.
2013-02-26 11:15:18 +01:00
antirez 9cb578ced0 Cluster: new function clusterNodeDelFailureReport() for failure reports.
This is the missing part of the API that will be used to reimplement
failure detection of Cluster nodes.
2013-02-25 19:13:22 +01:00
Arnaud Granal 3d588b37ce Fix error "repl-backlog-size must be 1 or greater"
The parameter repl-backlog-size is not parsed correctly in the configuration file. argv[0] is parsed instead of argv[1].
2013-02-25 19:25:48 +02:00
antirez 18f537083a Cluster: no limits for the count parameter of CLUSTER GETKEYSINSLOT.
Not sure why I set a limit to 1 million keys, there is no reason for
this artificial limit, and anyway this is s a stupid limit because it is
already high enough to create latency issues. So let's the users shoot
on their feet because maybe they just actually know what they are doing.
2013-02-25 12:41:13 +01:00
antirez 544bbe5387 Cluster: validate slot number in CLUSTER COUNTKEYSINSLOT. 2013-02-25 12:40:32 +01:00
antirez 996a643752 Cluster: use O(log(N)) algo for countKeysInSlot(). 2013-02-25 12:37:50 +01:00
antirez d4fa40655d Cluster: new sub-command CLUSTER COUNTKEYSINSLOT.
The new sub-command uses the new countKeysInSlot() API and allows a
cluster client to get the number of keys for a given hashslot.
2013-02-25 12:04:31 +01:00
antirez a517c89321 Cluster: verifyClusterConfigWithData() implemented. 2013-02-25 11:43:49 +01:00
antirez 9947b0d96d A comment in main() clarified. 2013-02-25 11:40:21 +01:00
antirez d2154254be Cluster: fix case for getKeysInSlot() and countKeysInSlot().
Redis functions start in low case. A few functions about cluster were
capitalized the wrong way.
2013-02-25 11:25:40 +01:00
antirez c2eb4a606f Cluster: use CountKeysInSlot() when we just need the count. 2013-02-25 11:23:04 +01:00
antirez ad3bca1fdf Cluster: added stub for verifyClusterConfigWithData().
See the top-comment for the function in this commit for details about
what the function is supposed to do.
2013-02-25 11:20:17 +01:00
antirez 1abce14611 Cluster: added new API countKeysInSlot().
This is similar to getKeysInSlot() but just returns the number of keys
found in a given hash slot, without returning the keys.
2013-02-25 11:15:03 +01:00
0x20h 977035b7b5 suppress external diff program when using git diff. 2013-02-24 18:17:46 +01:00
antirez 825e07f2fd Cluster: if no previous config exists, create the myself node as master. 2013-02-22 19:24:01 +01:00
antirez f4093753e4 Cluster: add cluster_size field in CLUSTER INFO output. 2013-02-22 19:20:38 +01:00
antirez d218a4e244 Cluster: new state information, cluster size.
The definition of cluster size is: the number of known nodes in the
cluster that are masters and serving at least an hash slot.
2013-02-22 19:18:30 +01:00
antirez 5c55ed9388 Cluster: remove warning adding clusterNodeSetSlotBit() prototype. 2013-02-22 17:45:49 +01:00
antirez 974929770b Cluster: introduced a failure reports system.
A §Redis Cluster node used to mark a node as failing when itself
detected a failure for that node, and a single acknowledge was received
about the possible failure state.

The new API will be used in order to possible to require that N other
nodes have a PFAIL or FAIL state for a given node for a node to set it
as failing.
2013-02-22 17:43:35 +01:00
antirez 36af851550 redis-trib: check that all the nodes agree about the slots configuration. 2013-02-22 12:25:16 +01:00
antirez 51b5058d04 redis-trib: skeleton of coverage fix for "keys in multiple nodes" case. 2013-02-22 11:33:10 +01:00
antirez a81c598f95 redis-trib: handle slot coverage fix in the "no nodes with keys" case. 2013-02-22 10:23:53 +01:00
antirez 392e0fa7eb Cluster: fix case of slotToKey...() functions. 2013-02-22 10:16:21 +01:00
antirez d04770988d Cluster: empty the internal sorted set mapping keys to slots on FLUSHDB/ALL. 2013-02-22 10:15:32 +01:00
antirez efe51dfff5 redis-trib: specify single node address when fixing coverage. 2013-02-22 10:05:07 +01:00
antirez 915e81335e redis-trib: ability to fix uncovered slots for the trivial case. 2013-02-21 18:10:06 +01:00
antirez 619d3945f8 redis-trib: fixed typo in method name. 2013-02-21 16:58:27 +01:00
antirez 07b6322735 Cluster: more correct update of slots state when PONG is received. 2013-02-21 16:52:06 +01:00
antirez c6da9d9fac Call clusterUpdateState() after CLUSTER SETSLOT too. 2013-02-21 16:31:22 +01:00
antirez 3a99d1228a Aesthetic change to make a line more 80-cols friendly. 2013-02-21 16:24:48 +01:00
antirez 5fd9f701da redis-trib: move instance vars in the right class. 2013-02-21 13:06:59 +01:00
antirez 7898bf4b7e redis-trib: some refactoring and skeleton of the "fix" command. 2013-02-21 13:00:41 +01:00
antirez dc4af60628 Cluster: clusterAddSlot() was not doing what stated in the comment. 2013-02-21 11:51:17 +01:00
antirez fdb57233e2 Cluster: always use cluster(Add|Del)Slot to modify the cluster slots table. 2013-02-21 11:44:58 +01:00
antirez 786b8d6c87 Use RESTORE-ASKING for MIGRATE when in cluster mode. 2013-02-20 17:36:54 +01:00
antirez ea7fc82a4a Cluster: new command flag forcing implicit ASKING.
Also using this new flag the RESTORE-ASKING command was implemented that
will be used by MIGRATE.
2013-02-20 17:28:35 +01:00
antirez 9ec1b709f5 Cluster: ASKING command fixed, state was not retained. 2013-02-20 17:07:52 +01:00
antirez b8d8b9ec41 redis-trib: set the migrating slot in the correct way when resharding. 2013-02-20 15:29:53 +01:00
antirez 9a04e12cc0 Cluster: I/O errors are now logged at DEBUG level. 2013-02-20 13:18:51 +01:00
antirez 917dd53216 redis-trib: make a few comments 80-cols friendly. 2013-02-15 17:11:55 +01:00
antirez 455da35c7f Cluster: specific error code for cluster down condition. 2013-02-15 16:53:24 +01:00
antirez 522c3255db Merge branch 'unstable' of github.com:antirez/redis into unstable 2013-02-15 16:45:04 +01:00
antirez 02796ba7a7 Cluster: sanity checks on the cluster bus message length. 2013-02-15 16:44:39 +01:00
antirez 8853698a6f Removed useless newlines from hashTypeCurrentObject(). 2013-02-15 13:12:55 +01:00
antirez 6b9c661838 Cluster: make valgrind happy initializing all the bytes of the node IP. 2013-02-15 12:58:35 +01:00
antirez 7371d5e248 Remove wrong decrRefCount() from getNodeByQuery().
This fixes issue #607.
2013-02-15 11:57:53 +01:00
antirez 20f52b5b78 Top comment for getNodeByQuery() improved. 2013-02-15 11:50:54 +01:00
antirez 6b641f3aeb redis-cli: update prompt on cluster redirection. 2013-02-14 18:49:08 +01:00
antirez e0e15bd06d Cluster: with 16384 slots we need bigger buffers. 2013-02-14 15:36:33 +01:00
antirez 1a32d99b28 Cluster: move cluster config file out of config state.
This makes us able to avoid allocating the cluster state structure if
cluster is not enabled, but still we can handle the configuration
directive that sets the cluster config filename.
2013-02-14 15:20:02 +01:00
antirez 1649e509c3 Cluster: the cluster state structure is now heap allocated. 2013-02-14 13:20:56 +01:00
antirez 9dfd11c3da Cluster: Initialize ip and port in createClusterNode(). 2013-02-14 13:01:28 +01:00
antirez a26690e8b5 Cluster: redis-trib updated to use 16384 hash slots. 2013-02-14 12:55:34 +01:00
antirez ebd666db47 Cluster: from 4096 to 16384 hash slots. 2013-02-14 12:49:16 +01:00
antirez 072c91fe13 PSYNC: another change to unexpected reply from PSYNC. 2013-02-13 18:43:40 +01:00
antirez 0e1be5347b PSYNC: More robust handling of unexpected reply to PSYNC. 2013-02-13 18:33:33 +01:00
antirez 7404b95833 Avoid compiler warning by casting to match printf() specifier. 2013-02-13 13:38:20 +01:00
antirez 3419c8ce70 Replication: more strict error checking for master PING reply. 2013-02-12 16:53:27 +01:00
antirez dc24a6b132 Return a specific NOAUTH error if authentication is required. 2013-02-12 16:25:41 +01:00
antirez 24f258360b Replication: added new stats counting full and partial resynchronizations. 2013-02-12 15:33:54 +01:00
antirez 04bdb3a2a4 Add missing bracket removed for error after rebase of PSYNC. 2013-02-12 12:56:32 +01:00
antirez 3af478e9ef PSYNC: debugging printf() calls are now logs at DEBUG level. 2013-02-12 12:52:22 +01:00
antirez 89b48f0825 Remove harmless warning in slaveTryPartialResynchronization(). 2013-02-12 12:52:21 +01:00
antirez 0ed6daa48b PSYNC: don't use the client buffer to send +CONTINUE and +FULLRESYNC.
When we are preparing an handshake with the slave we can't touch the
connection buffer as it'll be used to accumulate differences between
the sent RDB file and what arrives next from clients.

So in short we can't use addReply() family functions.

However we just use write(2) because we know that the socket buffer is
empty, since a prerequisite for SYNC to work is that the static buffer
and the output list are empty, and in general it is not expected that a
client SYNCs after doing some heavy I/O with the master.

However a short write connection is explicitly handled to avoid
fragility (we simply close the connection and the slave will retry).
2013-02-12 12:52:21 +01:00
antirez d2a0348a49 SYNC not allowed with pending data on the static output buffer. 2013-02-12 12:52:21 +01:00
antirez da315d3325 Log the unexpected string received in place of the SYNC payload length. 2013-02-12 12:52:21 +01:00
antirez 41d64a7516 After SLAVEOF <newslave> don't allow chained slaves to PSYNC. 2013-02-12 12:52:21 +01:00
antirez 078882025e PSYNC: work in progress, preview #2, rebased to unstable. 2013-02-12 12:52:21 +01:00
antirez e34a35a511 Use the new unified protocol to send SELECT to slaves.
SELECT was still transmitted to slaves using the inline protocol, that
is conceived mostly for humans to type into telnet sessions, and is
notably not understood by redis-cli --slave.

Now the new protocol is used instead.
2013-02-12 12:50:28 +01:00
antirez 4b83ad4e1f Use replicationFeedSlaves() to send PING to slaves.
A Redis master sends PING commands to slaves from time to time: doing
this ensures that even if absence of writes, the master->slave channel
remains active and the slave can feel the master presence, instead of
closing the connection for timeout.

This commit changes the way PINGs are sent to slaves in order to use the
standard interface used to replicate all the other commands, that is,
the function replicationFeedSlaves().

With this change the stream of commands sent to every slave is exactly
the same regardless of their exact state (Transferring RDB for first
synchronization or slave already online). With the previous
implementation the PING was only sent to online slaves, with the result
that the output stream from master to slaves was not identical for all
the slaves: this is a problem if we want to implement partial resyncs in
the future using a global replication stream offset.

TL;DR: this commit should not change the behaviour in practical terms,
but is just something in preparation for partial resynchronization
support.
2013-02-12 12:50:28 +01:00
antirez 7465ac7ab1 Emit SELECT to slaves in a centralized way.
Before this commit every Redis slave had its own selected database ID
state. This was not actually useful as the emitted stream of commands
is identical for all the slaves.

Now the the currently selected database is a global state that is set to
-1 when a new slave is attached, in order to force the SELECT command to
be re-emitted for all the slaves.

This change is useful in order to implement replication partial
resynchronization in the future, as makes sure that the stream of
commands received by slaves, including SELECT commands, are exactly the
same for every slave connected, at any time.

In this way we could have a global offset that can identify a specific
piece of the master -> slaves stream of commands.
2013-02-12 12:50:28 +01:00
antirez 1a27d41156 Makefile: valgrind target added (forces -O0 and libc malloc). 2013-02-11 12:11:28 +01:00
antirez 76a5b21c3a Tcp keep-alive: send three probes before detectin an error.
Otherwise we end with less reliable connections because it's too easy
that a single packet gets lost.
2013-02-08 17:06:01 +01:00
antirez 124a635bc5 Set SO_KEEPALIVE on client sockets if configured to do so. 2013-02-08 16:40:59 +01:00
antirez ee21c18e5d Add SO_KEEPALIVE support to anet.c. 2013-02-08 16:30:26 +01:00
antirez a6c2f9012f Make all WATCHers dirty when the slave reloads the DB. 2013-02-08 10:26:19 +01:00
antirez 46dd4c62b3 LASTSAVE is a "random" command. 2013-02-07 19:13:00 +01:00
antirez b70b459b0e TCP_NODELAY after SYNC: changes to the implementation. 2013-02-05 12:04:30 +01:00
charsyam c85647f354 Turn off TCP_NODELAY on the slave socket after SYNC.
Further details from @antirez:

It was reported by @StopForumSpam on Twitter that the Redis replication
link was strangely using multiple TCP packets for multiple commands.
This wastes a lot of bandwidth and is due to the TCP_NODELAY option we
enable on the socket after accepting a new connection.

However the master -> slave channel is a one-way channel since Redis
replication is asynchronous, so there is no point in trying to reduce
the latency, we should aim to reduce the bandwidth. For this reason this
commit introduces the ability to disable the nagle algorithm on the
socket after a successful SYNC.

This feature is off by default because the delay can be up to 40
milliseconds with normally configured Linux kernels.
2013-02-05 12:04:25 +01:00
Rock Li 8063155cd0 retval doesn't initalized
If each if conditions are all fail, variable retval will under uninitlized
2013-02-05 15:56:04 +08:00
Gengliang Wang 002747336a Fix a bug in srandmemberWithCountCommand()
In CASE 2, the call sunionDiffGenericCommand will involve the string "srandmember" 
> sadd foo one
(integer 1)
> sadd srandmember two
(integer 2)
> srandmember foo 3
1)"one"
2)"two"
2013-02-04 14:01:08 +08:00
antirez 089cbe643f Sentinel: advertise the promoted slave address only after successful setup. 2013-01-31 17:19:21 +01:00
Salvatore Sanfilippo aca005c246 Merge pull request #914 from catwell/unstable
fix comments forgotten in #285 (zipmap -> ziplist)
2013-01-31 03:37:48 -08:00
antirez ad297a1a67 Z*STORE event fixed: generate del only if resulting sorted set is empty. 2013-01-29 13:50:01 +01:00
antirez e41d1d77e3 Generate del events when S*STORE commands delete the destination key. 2013-01-29 13:43:13 +01:00
antirez 4dfb5752e0 Send 'expired' events when a key expires by lookup. 2013-01-28 13:15:19 +01:00
antirez 562b2bd6a7 Keyspace notifications: fixed a leak and a bug introduced in the latest commit. 2013-01-28 13:15:16 +01:00
antirez fce016d31b Keyspace events: it is now possible to select subclasses of events.
When keyspace events are enabled, the overhead is not sever but
noticeable, so this commit introduces the ability to select subclasses
of events in order to avoid to generate events the user is not
interested in.

The events can be selected using redis.conf or CONFIG SET / GET.
2013-01-28 13:15:12 +01:00
antirez 1c0c551776 decrRefCount -> decrRefCountVoid in list constructor. 2013-01-28 13:15:08 +01:00
antirez da04e6ed44 Keyspace events added for more commands. 2013-01-28 13:14:56 +01:00
antirez 8766e81079 Fix decrRefCount() prototype from void to robj pointer.
decrRefCount used to get its argument as a void* pointer in order to be
used as destructor where a 'void free_object(void*)' prototype is
expected. However this made simpler to introduce bugs by freeing the
wrong pointer. This commit fixes the argument type and introduces a new
wrapper called decrRefCountVoid() that can be used when the void*
argument is needed.
2013-01-28 13:14:53 +01:00
antirez 2d20e68fe4 notifyKeyspaceEvent(): release channel names using the right pointers. 2013-01-28 13:14:49 +01:00
antirez 5b9357a6b3 Initial test events for the new keyspace notification API. 2013-01-28 13:14:46 +01:00
antirez 2ea9518a53 Fixed over-80-cols comment in db.c 2013-01-28 13:14:42 +01:00
antirez 08d8bb57df Two fixes to initial keyspace notifications API. 2013-01-28 13:14:39 +01:00
antirez 4cdbce341e Keyspace events notification API. 2013-01-28 13:14:36 +01:00
Pierre Chapuis 50d43a9823 fix comments forgotten in #285 (zipmap -> ziplist) 2013-01-28 11:07:17 +01:00
antirez 88015b89a0 redis-cli --bigkeys output is now simpler to understand. 2013-01-21 19:15:58 +01:00
antirez 2039f1a38a UNSUBSCRIBE and PUNSUBSCRIBE: always provide a reply.
UNSUBSCRIBE and PUNSUBSCRIBE commands are designed to mass-unsubscribe
the client respectively all the channels and patters if called without
arguments.

However when these functions are called without arguments, but there are
no channels or patters we are subscribed to, the old behavior was to
don't reply at all.

This behavior is broken, as every command should always reply.
Also it is possible that we are no longer subscribed to a channels but we
are subscribed to patters or the other way around, and the client should
be notified with the correct number of subscriptions.

Also it is not pretty that sometimes we did not receive a reply at all
in a redis-cli session from these commands, blocking redis-cli trying
to read the reply.

This fixes issue #714.
2013-01-21 19:02:26 +01:00
antirez 93f61bb2a4 Fixed a bug in memtest progress bar, that had no actual effects.
This closes issue #859, thanks to @erbenmo.
2013-01-21 12:34:22 +01:00
Salvatore Sanfilippo 442235fe40 Merge pull request #869 from bilalhusain/patch-2
s/adiacent/adjacent/
2013-01-21 03:19:02 -08:00
antirez cd892d015d Not every __sun has backtrace().
I don't know how to test for Open Solaris that has support for
backtrace() so for now removing the #ifdef that breaks compilation under
other Solaris flavors.
2013-01-21 12:06:54 +01:00
antirez e50cdbe461 Additionally two typos fixed thanks to @jodal 2013-01-19 13:46:14 +01:00
antirez 79a0ef62db Whitelist SIGUSR1 to avoid auto-triggering errors.
This commit fixes issue #875 that was caused by the following events:

1) There is an active child doing BGSAVE.
2) flushall is called (or any other condition that makes Redis killing
the saving child process).
3) An error is sensed by Redis as the child exited with an error (killed
by a singal), that stops accepting write commands until a BGSAVE happens
to be executed with success.

Whitelisting SIGUSR1 and making sure Redis always uses this signal in
order to kill its own children fixes the issue.
2013-01-19 13:30:38 +01:00
antirez ab247fc176 Clear server.shutdown_asap on failed shutdown.
When a SIGTERM is received Redis schedules a shutdown. However if it
fails to perform the shutdown it must be clear the shutdown_asap flag
otehrwise it will try again and again possibly making the server
unusable.
2013-01-19 13:19:41 +01:00
antirez 08d200baeb Slowlog: don't log EXEC but just the executed commands.
The Redis Slow Log always used to log the slow commands executed inside
a MULTI/EXEC block. However also EXEC was logged at the end, which is
perfectly useless.

Now EXEC is no longer logged and a test was added to test this behavior.

This fixes issue #759.
2013-01-19 12:53:21 +01:00
guiquanz 9d09ce3981 Fixed many typos. 2013-01-19 10:59:44 +01:00
Salvatore Sanfilippo 61dfc2e521 Merge pull request #887 from charsyam/redis-cli-prompt
redis-cli prompt bug fix
2013-01-19 01:32:28 -08:00
Salvatore Sanfilippo 74f137308e Merge pull request #895 from badboy/catch_con_error
redis-cli: always exit if connection fails.
2013-01-19 01:27:56 -08:00
bitterb c2dc172a9d Fix an error reply for CLIENT command 2013-01-19 14:11:33 +09:00
Jan-Erik Rediger 59046a7373 Always exit if connection fails.
This avoids unnecessary core dumps. Fixes antirez/redis#894
2013-01-18 10:13:10 +01:00
Nathan Parry 1920cab3bc redis-cli --rdb fails if server sends a ping
Redis pings slaves in "pre-synchronization stage" with newlines. (See
https://github.com/antirez/redis/blob/2.6.9/src/replication.c#L814)
However, redis-cli does not expect this - it sees the newline as the end
of the bulk length line, and ends up returning 0 as bulk the length.
This manifests as the following when running redis-cli:

    $ ./src/redis-cli --rdb some_file
    SYNC sent to master, writing 0 bytes to 'some_file'
    Transfer finished with success.

With this commit, we just ignore leading newlines while reading the bulk
length line.

To reproduce the problem, load enough data into Redis so that the
preparation of the RDB snapshot takes long enough for a ping to occur
while redis-cli is waiting for the data.
2013-01-18 00:10:58 -05:00
charsyam e396236651 redis-cli prompt bug fix 2013-01-16 17:20:54 -08:00
antirez a0c24821e2 redis-cli: save an RDB dump from remote server to local file. 2013-01-16 19:44:37 +01:00
antirez 9b89ab06c4 Typo fixed, ASCI -> ASCII. 2013-01-15 13:34:13 +01:00
antirez 1971740f0c CLIENT GETNAME and CLIENT SETNAME introduced.
Sometimes it is much simpler to debug complex Redis installations if it
is possible to assign clients a name that is displayed in the CLIENT
LIST output.

This is the case, for example, for "leaked" connections. The ability to
provide a name to the client makes it quite trivial to understand what
is the part of the code implementing the client not releasing the
resources appropriately.

Behavior:

    CLIENT SETNAME: set a name for the client, or remove the current
                    name if an empty name is set.
    CLIENT GETNAME: get the current name, or a nil.
    CLIENT LIST: now displays the client name if any.

Thanks to Mark Gravell for pushing this idea forward.
2013-01-15 13:34:10 +01:00
antirez ef99e146a8 Undo slave-master handshake when SLAVEOF sets a new slave.
Issue #828 shows how Redis was not correctly undoing a non-blocking
connection attempt with the previous master when the master was set to a
new address using the SLAVEOF command.

This was also a result of lack of refactoring, so now there is a
function to cancel the non blocking handshake with the master.
The new function is now used when SLAVEOF NO ONE is called or when
SLAVEOF is used to set the master to a different address.
2013-01-15 13:33:24 +01:00
antirez a33869c330 Makefile.dep updated. 2013-01-11 23:50:32 +01:00
antirez f5fa6824db Comment in the call() function clarified a bit. 2013-01-10 11:19:40 +01:00
antirez baee565029 Multiple fixes for EVAL (issue #872).
1) The event handler was no restored after a timeout condition if the
   command was eventually executed with success.
2) The command was not converted to EVAL in case of errors in the middle
   of the execution.
3) Terrible duplication of code without any apparent reason.
2013-01-10 10:46:05 +01:00
Bilal Husain 717e5ffb45 s/adiacent/adjacent/
fixed typo in a comment (step 2 memcheck)
2013-01-09 21:46:58 +05:30
antirez d7740fc8f3 Better error reporting when fd event creation fails. 2013-01-03 14:29:34 +01:00
antirez e9261b1e2d ae.c: set errno when error is not a failing syscall.
In this way the caller is able to perform better error checking or to
use strerror() without the risk of meaningless error messages being
displayed.
2013-01-03 14:29:20 +01:00
antirez 4dc1e0dd30 Fix overflow in mstime() in redis-cli and benchmark.
The problem does not exist in the Redis server implementation of mstime()
but is only limited to redis-cli and redis-benchmark.

Thix fixes issue #839.
2012-12-20 15:20:55 +01:00
antirez f1481d4a03 serverCron() frequency is now a runtime parameter (was REDIS_HZ).
REDIS_HZ is the frequency our serverCron() function is called with.
A more frequent call to this function results into less latency when the
server is trying to handle very expansive background operations like
mass expires of a lot of keys at the same time.

Redis 2.4 used to have an HZ of 10. This was good enough with almost
every setup, but the incremental key expiration algorithm was working a
bit better under *extreme* pressure when HZ was set to 100 for Redis
2.6.

However for most users a latency spike of 30 milliseconds when million
of keys are expiring at the same time is acceptable, on the other hand a
default HZ of 100 in Redis 2.6 was causing idle instances to use some
CPU time compared to Redis 2.4. The CPU usage was in the order of 0.3%
for an idle instance, however this is a shame as more energy is consumed
by the server, if not important resources.

This commit introduces HZ as a runtime parameter, that can be queried by
INFO or CONFIG GET, and can be modified with CONFIG SET. At the same
time the default frequency is set back to 10.

In this way we default to a sane value of 10, but allows users to
easily switch to values up to 500 for near real-time applications if
needed and if they are willing to pay this small CPU usage penalty.
2012-12-14 17:10:40 +01:00
Salvatore Sanfilippo a4d68dc541 Merge pull request #824 from ptjm/unstable
Define _XOPEN_SOURCE appropriately on NetBSD.
2012-12-12 09:34:41 -08:00
Patrick TJ McPhee 289942b625 Define _XOPEN_SOURCE appropriately on NetBSD. 2012-12-12 10:49:12 -05:00
antirez 705874e31d Fix config.h endianess detection to work on Linux / PPC64.
Config.h performs endianess detection including OS-specific headers to
define the endianess macros, or when this is not possible, checking the
processor type via ifdefs.

Sometimes when the OS-specific macro is included, only __BYTE_ORDER is
defined, while BYTE_ORDER remains undefined. There is code at the end of
config.h endianess detection in order to define the macros without the
underscore, but it was not working correctly.

This commit fixes endianess detection fixing Redis on Linux / PPC64 and
possibly other systems.
2012-12-11 17:01:00 +01:00
antirez ab2924cff3 Memory leak fixed: release client's bpop->keys dictionary.
Refactoring performed after issue #801 resolution (see commit
2f87cf8b01) introduced a memory leak that
is fixed by this commit.

I simply forgot to free the new allocated dictionary in the client
structure trusting the output of "make test" on OSX.

However due to changes in the "leaks" utility the test was no longer
testing memory leaks. This problem was also fixed.

Fortunately the CI test running at ci.redis.io spotted the bug in the
valgrind run.

The leak never ended into a stable release.
2012-12-03 12:12:53 +01:00
antirez 2f87cf8b01 Blocking POP: use a dictionary to store keys clinet side.
To store the keys we block for during a blocking pop operation, in the
case the client is blocked for more data to arrive, we used a simple
linear array of redis objects, in the blockingState structure:

    robj **keys;
    int count;

However in order to fix issue #801 we also use a dictionary in order to
avoid to end in the blocked clients queue for the same key multiple
times with the same client.

The dictionary was only temporary, just to avoid duplicates, but since
we create / destroy it there is no point in doing this duplicated work,
so this commit simply use a dictionary as the main structure to store
the keys we are blocked for. So instead of the previous fields we now
just have:

    dict *keys;

This simplifies the code and reduces the work done by the server during
a blocking POP operation.
2012-12-02 20:43:15 +01:00
antirez 4e6dd7bc86 Client should not block multiple times on the same key.
Sending a command like:

BLPOP foo foo foo foo 0

Resulted into a crash before this commit since the client ended being
inserted in the waiting list for this key multiple times.
This resulted into the function handleClientsBlockedOnLists() to fail
because we have code like that:

    if (de) {
        list *clients = dictGetVal(de);
        int numclients = listLength(clients);

        while(numclients--) {
            listNode *clientnode = listFirst(clients);

            /* server clients here... */
        }
    }

The code to serve clients used to remove the served client from the
waiting list, so if a client is blocking multiple times, eventually the
call to listFirst() will return NULL or worse will access random memory
since the list may no longer exist as it is removed by the function
unblockClientWaitingData() if there are no more clients waiting for this
list.

To avoid making the rest of the implementation more complex, this commit
modifies blockForKeys() so that a client will be put just a single time
into the waiting list for a given key.

Since it is Saturday, I hope this fixes issue #801.
2012-12-02 20:43:07 +01:00
antirez f50e658455 SDIFF is now able to select between two algorithms for speed.
SDIFF used an algorithm that was O(N) where N is the total number
of elements of all the sets involved in the operation.

The algorithm worked like that:

ALGORITHM 1:

1) For the first set, add all the members to an auxiliary set.
2) For all the other sets, remove all the members of the set from the
auxiliary set.

So it is an O(N) algorithm where N is the total number of elements in
all the sets involved in the diff operation.

Cristobal Viedma suggested to modify the algorithm to the following:

ALGORITHM 2:

1) Iterate all the elements of the first set.
2) For every element, check if the element also exists in all the other
remaining sets.
3) Add the element to the auxiliary set only if it does not exist in any
of the other sets.

The complexity of this algorithm on the worst case is O(N*M) where N is
the size of the first set and M the total number of sets involved in the
operation.

However when there are elements in common, with this algorithm we stop
the computation for a given element as long as we find a duplicated
element into another set.

I (antirez) added an additional step to algorithm 2 to make it faster,
that is to sort the set to subtract from the biggest to the
smallest, so that it is more likely to find a duplicate in a larger sets
that are checked before the smaller ones.

WHAT IS BETTER?

None of course, for instance if the first set is much larger than the
other sets the second algorithm does a lot more work compared to the
first algorithm.

Similarly if the first set is much smaller than the other sets, the
original algorithm will less work.

So this commit makes Redis able to guess the number of operations
required by each algorithm, and select the best at runtime according
to the input received.

However, since the second algorithm has better constant times and can do
less work if there are duplicated elements, an advantage is given to the
second algorithm.
2012-11-30 16:36:42 +01:00
antirez b4abbaf755 redis-benchmark: seed the PRNG with time() at startup. 2012-11-30 15:41:09 +01:00
antirez 2f62c9663c Introduced the Build ID in INFO and --version output.
The idea is to be able to identify a build in a unique way, so for
instance after a bug report we can recognize that the build is the one
of a popular Linux distribution and perform the debugging in the same
environment.
2012-11-29 14:20:08 +01:00
antirez b1b602a928 On crash memory test rewrote so that it actaully works.
1) We no longer test location by location, otherwise the CPU write cache
completely makes our business useless.
2) We still need a memory test that operates in steps from the first to
the last location in order to never hit the cache, but that is still
able to retain the memory content.

This was tested using a Linux box containing a bad memory module with a
zingle bit error (always zero).

So the final solution does has an error propagation step that is:

1) Invert bits at every location.
2) Swap adiacent locations.
3) Swap adiacent locations again.
4) Invert bits at every location.
5) Swap adiacent locations.
6) Swap adiacent locations again.

Before and after these steps, and after step 4, a CRC64 checksum is computed.
If the three CRC64 checksums don't match, a memory error was detected.
2012-11-29 10:24:35 +01:00
charsyam dee0b939fc Remove unnecessary condition in _dictExpandIfNeeded (dict.c) 2012-11-28 11:44:39 +01:00
antirez c87a40897c Merge remote-tracking branch 'origin/unstable' into unstable 2012-11-28 11:41:27 +01:00
Matt Arsenault 504e5072eb It's a watchdog, not a watchdong. 2012-11-28 11:35:19 +01:00
charsyam d7c7ac4a57 remove compile warning bioKillThreads 2012-11-23 05:52:39 +08:00
antirez 95f68f7b0f EVALSHA is now case insensitive.
EVALSHA used to crash if the SHA1 was not lowercase (Issue #783).
Fixed using a case insensitive dictionary type for the sha -> script
map used for replication of scripts.
2012-11-22 15:50:00 +01:00
antirez cceb0c5b4a Fix integer overflow in zunionInterGenericCommand().
This fixes issue #761.
2012-11-22 15:28:28 +01:00
antirez 3d1391272a Safer handling of MULTI/EXEC on errors.
After the transcation starts with a MULIT, the previous behavior was to
return an error on problems such as maxmemory limit reached. But still
to execute the transaction with the subset of queued commands on EXEC.

While it is true that the client was able to check for errors
distinguish QUEUED by an error reply, MULTI/EXEC in most client
implementations uses pipelining for speed, so all the commands and EXEC
are sent without caring about replies.

With this change:

1) EXEC fails if at least one command was not queued because of an
error. The EXECABORT error is used.
2) A generic error is always reported on EXEC.
3) The client DISCARDs the MULTI state after a failed EXEC, otherwise
pipelining multiple transactions would be basically impossible:
After a failed EXEC the next transaction would be simply queued as
the tail of the previous transaction.
2012-11-22 10:32:07 +01:00
antirez 7536991726 Make bio.c threads killable ASAP if needed.
We use this new bio.c feature in order to stop our I/O threads if there
is a memory test to do on crash. In this case we don't want anything
else than the main thread to run, otherwise the other threads may mess
with the heap and the memory test will report a false positive.
2012-11-22 10:12:11 +01:00
antirez 5a9e3f5842 Fast memory test on Redis crash. 2012-11-21 13:24:44 +01:00
antirez 3cb432837c Use more fine grained HAVE macros instead of HAVE_PROCFS. 2012-11-21 13:17:38 +01:00
charsyam 52b52a3508 fix randstring bug 2012-11-20 02:50:31 +08:00
antirez 49b6452351 Children creating AOF or RDB files now report memory used by COW.
Finally Redis is able to report the amount of memory used by
copy-on-write while saving an RDB or writing an AOF file in background.

Note that this information is currently only logged (at NOTICE level)
and not shown in INFO because this is less trivial (but surely doable
with some minor form of interprocess communication).

The reason we can't capture this information on the parent before we
call wait3() is that the Linux kernel will release the child memory
ASAP, and only retain the minimal state for the process that is useful
to report the child termination to the parent.

The COW size is obtained by summing all the Private_Dirty fields found
in the "smap" file inside the proc filesystem for the process.

All this is Linux specific and is not available on other systems.
2012-11-19 12:02:08 +01:00
antirez 3bfeb9c1a7 zmalloc_get_private_dirty() function added (Linux only).
For non Linux systmes it just returns 0.

This function is useful to estimate copy-on-write because of childs
saving stuff on disk.
2012-11-19 11:47:35 +01:00
antirez af0b220756 zmalloc: kill unused __size parameter in update_zmalloc_stat_alloc() macro. 2012-11-14 12:52:38 +01:00
antirez a779b7e901 Merge branch 'migrate-cache' into unstable 2012-11-14 12:21:23 +01:00
antirez 2feef47aa1 MIGRATE: retry one time on I/O error.
Now that we cache connections, a retry attempt makes sure that the
operation don't fail just because there is an existing connection error
on the socket, like the other end closing the connection.

Unfortunately this condition is not detectable using
getsockopt(SO_ERROR), so the only option left is to retry.

We don't retry on timeouts.
2012-11-14 11:30:24 +01:00
antirez aa2bf6ba8b TTL API change: TTL returns -2 for non existing keys.
The previous behavior was to return -1 if:

1) Existing key but without an expire set.
2) Non existing key.

Now the second case is handled in a different, and TTL will return -2
if the key does not exist at all.

PTTL follows the same behavior as well.
2012-11-12 23:04:36 +01:00
antirez 05705bc8bb MIGRATE: fix default timeout to 1000 milliseconds.
When a timeout <= 0 is provided we set a default timeout of 1 second.
It was set to 1 millisecond for an error resulting from a recent change.
2012-11-12 18:54:35 +01:00
antirez c8852ebf19 MIGRATE count of cached sockets in INFO output. 2012-11-12 14:01:56 +01:00
antirez 149b527a74 MIGRATE timeout should be in milliseconds.
While it is documented that the MIGRATE timeout is in milliseconds, it
was in seconds instead. This commit fixes the problem.
2012-11-12 14:01:02 +01:00
antirez e23d281e48 MIGRATE TCP connections caching.
By caching TCP connections used by MIGRATE to chat with other Redis
instances a 5x performance improvement was measured with
redis-benchmark against small keys.

This can dramatically speedup cluster resharding and other processes
where an high load of MIGRATE commands are used.
2012-11-12 00:47:24 +01:00
antirez 4365e5b2d3 BSD license added to every C source and header file. 2012-11-08 18:31:32 +01:00
antirez 1237d71c4e COPY and REPLACE options for MIGRATE.
With COPY now MIGRATE does not remove the key from the source instance.
With REPLACE it uses RESTORE REPLACE on the target host so that even if
the key already eixsts in the target instance it will be overwritten.

The options can be used together.
2012-11-07 15:32:27 +01:00
antirez e5b5763f56 REPLACE option for RESTORE.
The REPLACE option deletes an existing key with the same name (if any)
and materializes the new one. The default behavior without RESTORE is to
return an error if a key already exists.
2012-11-07 10:57:23 +01:00
antirez c4b0b6854e Type mismatch errors are now prefixed with WRONGTYPE.
So instead to reply with a generic error like:

-ERR ... wrong kind of value ...

now it replies with:

-WRONGTYPE ... wrong kind of value ...

This makes this particular error easy to check without resorting to
(fragile) pattern matching of the error string (however the error string
used to be consistent already).

Client libraries should return a specific exeption type for this error.

Most of the commit is about fixing unit tests.
2012-11-06 20:25:34 +01:00
Salvatore Sanfilippo 06851a93de Merge pull request #741 from Run/typo
fix a typo in redis.h line 595 comment
2012-11-02 04:10:47 -07:00
antirez 05d8e2c938 More robust handling of AOF rewrite child.
After the wait3() syscall we used to do something like that:

    if (pid == server.rdb_child_pid) {
        backgroundSaveDoneHandler(exitcode,bysignal);
    } else {
        ....
    }

So the AOF rewrite was handled in the else branch without actually
checking if the pid really matches. This commit makes the check explicit
and logs at WARNING level if the pid returned by wait3() does not match
neither the RDB or AOF rewrite child.
2012-11-01 22:39:39 +01:00
Yecheng Fu f0266532fc fix typo in comments (redis.c, networking.c) 2012-11-01 22:26:46 +01:00
antirez 2ea41242f6 Unix socket clients properly displayed in MONITOR and CLIENT LIST.
This also fixes issue #745.
2012-11-01 22:10:45 +01:00
Runzhen Wang c23c657cdd fix a typo in redis.h line 595 comment 2012-11-01 02:14:22 +08:00