Commit Graph

3468 Commits

Author SHA1 Message Date
antirez 278ea9d16b replicationHandleMasterDisconnection() belongs to replication.c. 2015-07-28 14:36:50 +02:00
antirez 54c71f2d96 RDMF: Redis -> Server in adjustOpenFilesLimit(). 2015-07-28 11:19:20 +02:00
antirez 813ff7fdde Avoid magic "0" argument to prepareForShutdown().
Backported from Disque.
2015-07-28 11:10:42 +02:00
antirez 5cfb792777 RDMF: dictRedisObjectDestructor -> dictObjectDestructor. 2015-07-28 11:03:01 +02:00
antirez a83e79b176 Use mstime_t as return value of mstime(). 2015-07-28 10:14:33 +02:00
antirez 02b1d5213d RDMF: use representClusterNodeFlags() generic name. 2015-07-27 15:08:58 +02:00
antirez 3325a9b11f RDMF: more names updated. 2015-07-27 15:03:10 +02:00
antirez 32f80e2f1b RDMF: More consistent define names. 2015-07-27 14:37:58 +02:00
antirez 40eb548a80 RDMF: REDIS_OK REDIS_ERR -> C_OK C_ERR. 2015-07-26 23:17:55 +02:00
antirez 2d9e3eb107 RDMF: redisAssert -> serverAssert. 2015-07-26 15:29:53 +02:00
antirez 14ff572482 RDMF: OBJ_ macros for object related stuff. 2015-07-26 15:28:00 +02:00
antirez 554bd0e7bd RDMF: use client instead of redisClient, like Disque. 2015-07-26 15:20:52 +02:00
antirez 424fe9afd9 RDMF: redisLog -> serverLog. 2015-07-26 15:17:43 +02:00
antirez cef054e868 RDMF (Redis/Disque merge friendliness) refactoring WIP 1. 2015-07-26 15:17:18 +02:00
antirez c6333def13 SDS: Copyright updated further. 2015-07-25 17:41:56 +02:00
antirez cb2782c314 SDS: changes to unify Redis SDS with antirez/sds repo. 2015-07-25 17:25:44 +02:00
antirez 9894495c5a SDS: Copyright notice updated. 2015-07-25 17:08:44 +02:00
antirez 11425c89cf SDS: sdsjoinsds() call ported from antirez/sds fork. 2015-07-25 17:05:20 +02:00
Rogerio Goncalves ef29748d0d Check args before run ckquorum. Fix issue #2635 2015-07-24 14:08:50 +02:00
antirez 64fcd0e6ff SDS: avoid compiler warning in sdsIncrLen(). 2015-07-24 09:39:12 +02:00
antirez 935251259f Merge branch 'sds' into unstable 2015-07-24 08:49:23 +02:00
antirez ea9bd243ec SDS: use type 8 if we are likely to append to the string.
When empty strings are created, or when sdsMakeRoomFor() is called, we
are likely in an appending pattern. Use at least type 8 SDS strings
since TYPE 5 does not remember the free allocation size and requires
calling sdsMakeRoomFor() for every new piece appended.
2015-07-23 16:10:08 +02:00
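The type choice described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in sds.c: the `sds_kind` enum, the function names, and the size thresholds are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical header classes, named after the "type N" wording above. */
enum sds_kind { KIND_5, KIND_8, KIND_16, KIND_32, KIND_64 };

/* Pick the smallest header class able to represent `len`.
 * The limits used here are assumed for illustration. */
static enum sds_kind smallest_kind(uint64_t len) {
    if (len < (1u << 5)) return KIND_5;
    if (len < (1u << 8)) return KIND_8;
    if (len < (1u << 16)) return KIND_16;
    if (len < (1ull << 32)) return KIND_32;
    return KIND_64;
}

/* When the string is expected to grow, skip the one class that cannot
 * record its free space, so that appends do not reallocate every time. */
static enum sds_kind pick_sds_type(uint64_t len, int likely_append) {
    enum sds_kind kind = smallest_kind(len);
    if (kind == KIND_5 && likely_append) kind = KIND_8;
    return kind;
}
```

Under these assumptions an empty string created for appending gets `KIND_8`, while a short string that is not expected to grow keeps the smallest header.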
antirez cf68f4ee6a Fix SDS type 5 sdsIncrLen() bug and added test.
Thanks to @oranagra for spotting this error.
2015-07-20 16:18:08 +02:00
Salvatore Sanfilippo bcb4d09123 Merge pull request #2636 from badboy/cluster-lock-fix
Cluster lock fix
2015-07-17 11:00:44 +02:00
Salvatore Sanfilippo 29391002f6 Merge pull request #2644 from MOON-CLJ/command_info_fix
pfcount support multi keys
2015-07-17 10:55:58 +02:00
Yongyue Sun 427794d845 bugfix: errno might change before logging
Signed-off-by: Yongyue Sun <abioy.sun@gmail.com>
2015-07-17 10:47:32 +02:00
Tom Kiemes 6142ddc6eb Fix: aof_delayed_fsync is not reset
aof_delayed_fsync was not set to 0 when calling CONFIG RESETSTAT
2015-07-17 10:39:36 +02:00
Salvatore Sanfilippo f049cfdb0d Merge pull request #2676 from july2993/unstable
config tcp-keepalive should be numerical field not bool
2015-07-17 10:34:43 +02:00
antirez 25e1cb3f04 Client timeout handling improved.
The previous attempt to process each client at least once every ten
seconds was not a good idea, because:

1. Usually, because the minimum number of iterations was set to 50, you
get a much better processing period most of the time.

2. However when there are many clients and a normal setting for
server.hz, the edge case is triggered, and waiting 10 seconds for a
BLPOP that asked for 1 second is not ok.

3. Moreover, because of the high min-iterations limit of 50, when HZ
was set to a high value, the actual behavior was to process a lot of
clients per second.

Also the function checking for timeouts called gettimeofday() at each
iteration which can be costly.

The new implementation will try to process each client once per second,
gets the current time as argument, and does not attempt to process more
than 5 clients per iteration if not needed.

So now:

1. The CPU usage of an idle Redis process is the same or better.
2. The CPU usage of a busy Redis process is the same or better.
3. However a non trivial amount of work may be performed per iteration
when there are many many clients. In this particular case the user may
want to raise the "HZ" value if needed.

Btw with 4000 clients it was still not possible to notice any actual
latency created by processing 400 clients per second, since the work
performed for each client is pretty small.
2015-07-16 10:54:18 +02:00
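The per-iteration budget described in the commit above could look roughly like this. It is a sketch under stated assumptions, not the code in clientsCron(): the function name and the constant name are hypothetical, and `hz` is assumed to be greater than zero.

```c
/* Assumed floor, taken from the "5 clients per iteration" in the message. */
#define MIN_CLIENTS_PER_ITERATION 5

/* Number of clients to examine in one cron call so that, with `hz` calls
 * per second, every client is visited about once per second.
 * Assumes hz > 0. */
static int clients_per_iteration(int numclients, int hz) {
    int iterations = numclients / hz;
    if (iterations < MIN_CLIENTS_PER_ITERATION) {
        /* Never examine more clients than actually exist. */
        iterations = (numclients < MIN_CLIENTS_PER_ITERATION)
                         ? numclients
                         : MIN_CLIENTS_PER_ITERATION;
    }
    return iterations;
}
```

With 4000 clients and an assumed `hz` of 10 this yields 400 clients per call; with only 3 clients it yields 3.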
Jiahao Huang 92c146dfd3 config tcp-keepalive should be numerical field not bool 2015-07-16 15:53:44 +08:00
antirez e0bb454a16 Clarify a comment in clientsCron(). 2015-07-16 09:26:36 +02:00
antirez 3da97ea67f Add sdshdr5 to DEBUG structsize. 2015-07-16 09:14:39 +02:00
antirez 0ab27a4594 SDS: New sds type 5 implemented.
This is an attempt to use the refcount feature of the sds.c fork
provided in the Pull Request #2509. A new type, SDS_TYPE_5 is introduced
having a one byte header with just the string length, without
information about the available additional length at the end of the
string (this means that sdsMakeRoomFor() will be required each time
we want to append something, since the string will always report to have
0 bytes available).

More work is needed so that common SDS functions do not pay the
cost of this type. For example both sdscatprintf() and sdscatfmt()
should try to upgrade to SDS_TYPE_8 ASAP when appending chars.
2015-07-15 12:24:49 +02:00
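A one byte header of the kind described above might be packed as in the sketch below. The bit layout (a 3 bit type tag in the low bits, the length in the remaining 5 bits) and all names are assumptions of this example; the commit message itself only states that the header is one byte and carries the length but no free-space information.

```c
#include <stddef.h>
#include <stdint.h>

#define TYPE_BITS 3            /* assumed: low bits hold the type tag */
#define TYPE_MASK 7
#define TYPE_5 0               /* assumed tag value for the small type */
#define TYPE_5_MAX_LEN 31      /* largest length that fits in 5 bits */

/* Pack the length into the header byte.
 * Returns 0 on success, -1 if the length does not fit. */
static int type5_pack(size_t len, uint8_t *flags) {
    if (len > TYPE_5_MAX_LEN) return -1;
    *flags = (uint8_t)(TYPE_5 | (len << TYPE_BITS));
    return 0;
}

/* Recover the string length from the header byte. */
static size_t type5_len(uint8_t flags) {
    return (size_t)(flags >> TYPE_BITS);
}

/* There is no room to store spare capacity, so it always reads as 0. */
static size_t type5_avail(uint8_t flags) {
    (void)flags;
    return 0;
}
```

Because `type5_avail()` can only report 0, every append on such a string has to go through a reallocation step, which is the cost the message refers to.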
antirez 056a0ca199 Fix redis-benchmark sds binding.
Same as redis-cli, redis-benchmark now needs to use the hiredis sds copy,
since it differs from the memory optimized fork of Redis
sds.
2015-07-14 17:33:30 +02:00
antirez a76b380e06 Fix DEBUG structsize output. 2015-07-14 17:17:06 +02:00
Oran Agra f15df8ba5d sds size classes - memory optimization 2015-07-14 17:17:06 +02:00
antirez 0f64080dcb DEBUG HTSTATS <dbid> added.
The command reports information about the hash table internal state
representing the specified database ID.

This can be used in order to investigate rehashings, memory usage issues
and for other debugging purposes.
2015-07-14 17:15:37 +02:00
antirez 4c7ee0d584 EXISTS is now variadic.
The new return value is the number of keys existing, among the ones
specified in the command line, counting the same key multiple times if
given multiple times (and if it exists).

See PR #2667.
2015-07-13 18:09:41 +02:00
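The counting rule in the commit above (a key given twice is counted twice if it exists) can be illustrated with a small sketch. The flat array of key names stands in for the real keyspace, and both function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Linear lookup in a toy keyspace: returns 1 if `key` is present. */
static int key_exists(const char *const *db, size_t dbsize, const char *key) {
    for (size_t i = 0; i < dbsize; i++) {
        if (strcmp(db[i], key) == 0) return 1;
    }
    return 0;
}

/* Count how many of the requested keys exist. Repeated keys are checked
 * (and counted) once per occurrence, with no deduplication. */
static long count_existing(const char *const *db, size_t dbsize,
                           const char *const *keys, size_t nkeys) {
    long count = 0;
    for (size_t j = 0; j < nkeys; j++) {
        count += key_exists(db, dbsize, keys[j]);
    }
    return count;
}
```

For a keyspace holding `a` and `b`, asking about `a a c b` gives 3 under this rule: `a` twice, `b` once, and nothing for the missing `c`.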
antirez 5c4fcaf3fe Geo: fix command table keys position indexes for three commands.
GEOHASH, GEOPOS and GEODIST were declared as commands not accepting
keys, so the Redis Cluster redirection did not work.

Close #2671.
2015-07-13 15:30:11 +02:00
antirez b96af595a5 GEOENCODE / GEODECODE commands removed.
Rationale:

1. The commands look like internals exposed without a real strong use
case.
2. If there is a use case, the client would implement the
commands client side instead of paying RTT just to use a simple to
reimplement library.
3. They add complexity to an otherwise quite straightforward API.

So for now KILLED ;-)
2015-07-09 17:42:59 +02:00
antirez 1e12784259 Geo: -Ofast breaks builds on older GCCs. 2015-07-09 11:25:29 +02:00
antirez 5e04189887 Geo: validate long,lat passed by user via API 2015-07-06 18:39:25 +02:00
antirez 5254c2d3c3 Removed useless tryObjectEncoding() call from ZRANK. 2015-07-03 09:47:08 +02:00
antirez 4160bf0448 Geo: sync faster decoding from krtm that synched from Ardb.
Instead of successive divisions in iteration the new code uses bitwise
magic to interleave / deinterleave two 32bit values into a 64bit one.
All tests still passing and is measurably faster, so worth it.
2015-07-01 16:12:08 +02:00
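The kind of bitwise interleaving mentioned above is commonly written with shift-and-mask steps. The sketch below shows that well-known technique with hypothetical function names; it is not necessarily the exact code that was synced into the Geo helpers.

```c
#include <stdint.h>

/* Spread the 32 bits of v over the even bit positions of a 64-bit word. */
static uint64_t spread_bits(uint32_t v) {
    uint64_t x = v;
    x = (x | (x << 16)) & 0x0000FFFF0000FFFFULL;
    x = (x | (x << 8))  & 0x00FF00FF00FF00FFULL;
    x = (x | (x << 4))  & 0x0F0F0F0F0F0F0F0FULL;
    x = (x | (x << 2))  & 0x3333333333333333ULL;
    x = (x | (x << 1))  & 0x5555555555555555ULL;
    return x;
}

/* Inverse of spread_bits(): gather the even bit positions into 32 bits. */
static uint32_t gather_bits(uint64_t x) {
    x &= 0x5555555555555555ULL;
    x = (x | (x >> 1))  & 0x3333333333333333ULL;
    x = (x | (x >> 2))  & 0x0F0F0F0F0F0F0F0FULL;
    x = (x | (x >> 4))  & 0x00FF00FF00FF00FFULL;
    x = (x | (x >> 8))  & 0x0000FFFF0000FFFFULL;
    x = (x | (x >> 16)) & 0x00000000FFFFFFFFULL;
    return (uint32_t)x;
}

/* `even` lands on bits 0, 2, 4, ... and `odd` on bits 1, 3, 5, ... */
static uint64_t interleave64(uint32_t even, uint32_t odd) {
    return spread_bits(even) | (spread_bits(odd) << 1);
}

/* Split an interleaved word back into its two 32-bit halves. */
static void deinterleave64(uint64_t bits, uint32_t *even, uint32_t *odd) {
    *even = gather_bits(bits);
    *odd = gather_bits(bits >> 1);
}
```

Each direction takes a fixed number of shift and mask operations regardless of the input, which is where the gain over a per-bit loop of divisions comes from.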
antirez d308cadc8a Geo: added my copyright notice in modified files. 2015-06-29 16:34:02 +02:00
antirez 69c5b27273 Geo: support units only in abbreviated form.
I'm not a strong believer in multiple syntax for the same stuff, so
now units can be specified only as m, km, ft, mi.
2015-06-29 16:02:33 +02:00
antirez 083acbebc8 Geo: remove static declarations.
Stack traces produced by Redis on crash are the most useful tool we
have to fix non easily reproducible crashes, or even easily reproducible
ones where the user just posts a bug report and does not collaborate
further.

By declaring functions "static" they no longer show up in the stack
trace.
2015-06-29 15:57:17 +02:00
antirez f108c687ad Geo: GEODIST and tests. 2015-06-29 12:44:34 +02:00
antirez a12192f5ff Geo: command function names converted to lowercase, as elsewhere.
In Redis MULTIWORDCOMMANDNAME are mapped to functions where the command
name is all lowercase: multiwordcommandnameCommand().
2015-06-29 12:07:18 +02:00
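The naming convention stated above can be expressed as a small helper. This is only an illustration of the rule; the helper itself is hypothetical and nothing like it is implied to exist in the Redis sources.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build "<lowercased command>Command" into dst.
 * Returns 0 on success, -1 if dst is too small. */
static int command_function_name(const char *cmd, char *dst, size_t dstlen) {
    static const char suffix[] = "Command";
    size_t n = strlen(cmd);
    /* sizeof(suffix) already accounts for the terminating NUL. */
    if (n + sizeof(suffix) > dstlen) return -1;
    for (size_t i = 0; i < n; i++) {
        dst[i] = (char)tolower((unsigned char)cmd[i]);
    }
    memcpy(dst + n, suffix, sizeof(suffix)); /* copies the NUL as well */
    return 0;
}
```

Given `GEOADD` and a large enough buffer, the helper produces `geoaddCommand`.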
antirez aae0a1f9cc Geo: GEOPOS command and tests. 2015-06-29 10:47:07 +02:00
antirez ddc7b85c5f Geo: GEOENCODE: fix command arity check. 2015-06-29 09:39:34 +02:00
antirez 6a8e108e2d Geo: GEOENCODE now returns score ranges.
If GEOENCODE must be our door to enter the Geocoding implementation
details and do fancy things client side, then return the scores as well
so that we can query the sorted sets directly if we wish to do the same
search multiple times, or want to compute the boxes in the client side
to refine our search needs.
2015-06-29 09:34:05 +02:00
antirez 1884bff12d Geo: fix comment indentation. 2015-06-29 09:24:22 +02:00
antirez db3df44184 Geo: debugging printf calls removed. 2015-06-29 09:21:31 +02:00
antirez 6d21027a23 Geo: GEOADD form using radius removed.
Can't imagine how this is useful in the context of the API exported by
Redis, and we are always in time to add more bloat if needed, later.
2015-06-29 09:20:07 +02:00
antirez 7d59e0a8c3 Geo: commands top comment as in other Redis code. 2015-06-29 09:16:27 +02:00
antirez a3b07b1718 Geo: COUNT option for GEORADIUS. 2015-06-27 10:23:58 +02:00
antirez cd91beea1c Geo: only one way to specify any given option. 2015-06-27 09:43:47 +02:00
antirez 710c05ac2a Geo: remove useless variable. geoRadiusGeneric() top comment improved. 2015-06-27 09:38:56 +02:00
MOON_CLJ c232235734 pfcount support multi keys 2015-06-26 17:58:45 +08:00
antirez fa9d62d34f Geo: from lat,lon API to lon,lat API according to GIS standard
The GIS standard and all the major DBs implementing GIS related
functions take coordinates as x,y that is longitude,latitude.
It was a bad start for Redis to do things differently, so even if this
means that existing users of the Geo module will be required to change
their code, Redis now conforms to the standard.

Usually Redis is very backward compatible, but this is not an exception
to this rule, since this is the first Geo implementation entering the
official Redis source code. It is not wise to try to be backward
compatible with code forks... :-)

Close #2637.
2015-06-26 10:58:27 +02:00
antirez 03ce189628 Geo: explain increment magic in membersOfGeoHashBox(). 2015-06-24 17:37:20 +02:00
antirez 87521f4455 Geo: GEOHASH command added, returning standard geohash strings. 2015-06-24 16:34:07 +02:00
Jan-Erik Rediger c7462ca9ff Don't include sysctl header
It's not needed (anymore) and is not available on Solaris.
2015-06-24 14:57:20 +02:00
Jan-Erik Rediger d28c51d166 Do not attempt to lock on Solaris 2015-06-24 14:57:15 +02:00
antirez 55c4a365d7 Geo: Fix geohashEstimateStepsByRadius() step underestimation.
The returned step was in some case not enough towards normal
coordinates (for example when our search position was already near the
margin of the central area, and we had to match, using the east or west
neighbor, a very far point). Example:

    geoadd points 67.575457940146066 -62.001317572780565 far
    geoadd points 66.685439060295664 -58.925040587282297 center
    georadius points 66.685439060295664 -58.925040587282297 200 km

In the above case the code failed to find a match (happens at smaller
latitudes too) even if far and center are at less than 200km.

Another fix introduced by this commit is a progressively larger area
towards the poles, since meridians are much closer together there, so we need
to compensate for this.

The current implementation works comparably to the Tcl brute-force
stress tester implemented in the fuzzy test in the geo.tcl unit for
latitudes between -70 and 70, and is pretty accurate over +/-80 too,
with sporadic false negatives.

A more mathematically clean implementation is possible by computing the
meridian distance at the specified latitude and computing the step
according to it.
2015-06-24 10:42:16 +02:00
antirez 8d5ad19d15 Geo: return REDIS_* where appropriate, improve commenting 2015-06-23 10:27:48 +02:00
antirez bb3284563c Geo: GEOADD implementation improved, replication fixed
1. We no longer use a fake client but just rewriting.
2. We group all the inserts into a single ZADD dispatch (big speed win).
3. As a side effect of the correct implementation, replication works.
4. The return value of the command is now correct.
2015-06-23 10:20:14 +02:00
antirez ae5fd11563 Geo: more x,y renamed lat,lon 2015-06-23 09:35:43 +02:00
antirez a3018a215f Geo: rename x,y to lat,lon for clarity 2015-06-23 09:30:14 +02:00
antirez 51b4a4724b Geo: use the high level API to decode in geoAppendIfWithinRadius() 2015-06-23 09:03:56 +02:00
antirez 0b93139048 Geo: big refactoring of geo.c, zset.[ch] removed.
This commit simplifies the implementation in a few ways:

1. zsetScore implementation improved a bit and moved into t_zset.c where
   it is now also used to implement the ZSCORE command.

2. Range extraction from the sorted set remains a separated
   implementation from the one in t_zset.c, but was hyper-specialized in
   order to avoid accumulating results into a list and remove the ones
   outside the radius.

3. A new type is introduced: geoArray, which can accumulate geoPoint
   structures in a vector with power of two expansion policy. This is
   useful since we have to call qsort() against it before returning the
   result to the user.

4. As a result of 1, 2, 3, the two files zset.c and zset.h are now
   removed, including the function to merge two lists (now handled with
   functions that can add elements to existing geoArray arrays) and
   the machinery used in order to pass zset results.

5. geoPoint structure simplified because of the general code structure
   simplification, so we no longer need to take references to objects.

6. Not counting the JSON removal the refactoring removes 200 lines of
   code for the same functionalities, with a simpler to read
   implementation.

7. GEORADIUS is now 2.5 times faster testing with 10k elements and a
   radius resulting in 124 elements returned. However this is mostly a
   side effect of the refactoring and simplification. More speed gains
   can be achieved by trying to optimize the code.
2015-06-23 08:42:57 +02:00
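Point 3 of the commit above describes a vector that grows by powers of two and is then sorted with qsort(). A minimal sketch of that idea follows. The struct layout, the names, and the initial capacity of 8 are assumptions of this example and are not taken from geo.c.

```c
#include <stdlib.h>

/* Hypothetical element: only the field needed for sorting is shown. */
typedef struct { double dist; } point;

typedef struct {
    point *items;    /* backing storage */
    size_t buckets;  /* allocated slots */
    size_t used;     /* slots in use */
} point_array;

/* Return a pointer to a fresh slot, doubling the storage when it is full.
 * Returns NULL if the allocation fails (the array is left unchanged). */
static point *point_array_append(point_array *pa) {
    if (pa->used == pa->buckets) {
        size_t newcap = (pa->buckets == 0) ? 8 : pa->buckets * 2;
        point *p = realloc(pa->items, newcap * sizeof(*p));
        if (p == NULL) return NULL;
        pa->items = p;
        pa->buckets = newcap;
    }
    return &pa->items[pa->used++];
}

/* qsort() comparator: ascending distance. */
static int by_distance(const void *a, const void *b) {
    double da = ((const point *)a)->dist;
    double db = ((const point *)b)->dist;
    return (da > db) - (da < db);
}

static void point_array_sort(point_array *pa) {
    if (pa->used > 1) qsort(pa->items, pa->used, sizeof(point), by_distance);
}

static void point_array_free(point_array *pa) {
    free(pa->items);
    pa->items = NULL;
    pa->buckets = pa->used = 0;
}
```

Keeping the results in one contiguous block is what makes a single qsort() call possible before the reply is built, as opposed to accumulating them in a linked list first.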
antirez 3d9031eda4 Geo: compile again with optimizations
For some reason the Geo PR included disabling the fact that Redis is
compiled with optimizations. Apparently it was just @mattsta's attempt to
speed up the modify-compile-test iteration and there are no other
reasons.
2015-06-22 17:28:48 +02:00
antirez 9fc47ddf0b Geo: zsetScore refactoring
Now used both in geo.c and t_zset to provide ZSCORE.
2015-06-22 17:26:36 +02:00
antirez 2f66550729 Geo: Pub/Sub feature removed
This feature apparently is not going to be very useful, to send a
GEOADD+PUBLISH combo is exactly the same. One that would make a ton of
difference is the ability to subscribe to a position and a radius, and
get the updates in terms of objects entering/exiting the area.
2015-06-22 14:18:18 +02:00
antirez fc03d08ee0 Geo: addReplyDoubleDistance() precision set to 4 digits
Also:
1. The function was renamed.
2. A useless initialization of a buffer was removed.
2015-06-22 13:08:52 +02:00
antirez b18c68aa7f Geo: JSON features removed
The command can only return data in the normal Redis protocol. It is up
to the caller to translate to JSON if needed.
2015-06-22 12:03:44 +02:00
antirez f193b3caa8 Geo: removed bool usage from Geo code inside Redis 2015-06-22 11:24:58 +02:00
Matt Stancliff 7f4ac3d19c [In-Progress] Add Geo Commands
Current todo:
  - replace functions in zset.{c,h} with a new unified Redis
    zset access API.

Once we get the zset interface fixed, we can squash
relevant commits in this branch and have one nice commit
to merge into unstable.

This commit adds:
  - Geo commands
  - Tests; runnable with: ./runtest --single unit/geo
  - Geo helpers in deps/geohash-int/
  - src/geo.{c,h} and src/geojson.{c,h} implementing geo commands
  - Updated build configurations to get everything working
  - TEMPORARY: src/zset.{c,h} implementing zset score and zset
    range reading without writing to client output buffers.
  - Modified linkage of one t_zset.c function for use in zset.c

Conflicts:
	src/Makefile
	src/redis.c
2015-06-22 09:07:13 +02:00
antirez 821a986643 Sentinel: fix bug in config rewriting during failover
We have a check to rewrite the config properly when a failover is in
progress, in order to add the current (already failed over) master as
slave, and don't include in the slave list the promoted slave itself.

However there was an issue, the variable with the right address was
computed but never used when the code was modified, and no tests are
available for this feature for two reasons:

1. The Sentinel unit test currently does not test Sentinel ability to
persist its state at all.
2. It is a very hard to trigger state since it lasts for little time in
the context of the testing framework.

However this feature should be covered in the test in some way.

The bug was found by @badboy using the clang static analyzer.

Effects of the bug on safety of Sentinel
===

This bug results in severe issues in the following case:

1. A Sentinel is elected leader.
2. During the failover, it persists a wrong config with a known-slave
entry listing the master address.
3. The Sentinel crashes and restarts, reading invalid configuration from
disk.
4. It sees that the slave now does not obey the logical configuration
(should replicate from the current master), so it sends a SLAVEOF
command to the master (since the slave master is the same) creating a
replication loop (attempt to replicate from itself) which Redis is
currently unable to detect.
5. This means that the master is no longer available because of the bug.

However the lack of availability should be only transient (at least
in my tests, but other states could be possible where the problem
is not recovered automatically) because:

6. Sentinels treat masters reporting to be slaves as failing.
7. A new failover is triggered, and a slave is promoted to master.

Bug lifetime
===

The bug is there forever. Commit 16237d78 actually tried to fix the bug
but in the wrong way (the computed variable was never used! My fault).
So this bug is there basically since the start of Sentinel.

Since the bug is hard to trigger, I remember few reports matching
this condition, but I remember at least a few. Also in automated tests
where instances were stopped and restarted multiple times automatically
I remember hitting this issue, however I was not able to reproduce nor
to determine with the information I had at the time what was causing the
issue.
2015-06-12 18:36:17 +02:00
Salvatore Sanfilippo 4b5a0f0376 Merge pull request #2614 from linfangrong/patch-1
Update t_zset.c
2015-06-11 15:15:22 +02:00
antirez 8366907bed Use best effort address binding to connect to the master
We usually want to reach the master using the address of the interface
Redis is bound to (via the "bind" config option). That's useful since
the master will get (and publish) the slave address getting the peer
name of the incoming socket connection from the slave.

However, when this is not possible, for example because the slave is
bound to the loopback interface but replicates from a master accessed via
an external interface, we want to still connect with the master even
from a different interface: in this case it is not really important that
the master will provide any other address, while it is vital to be able
to replicate correctly.

Related to issues #2609 and #2612.
2015-06-11 14:34:38 +02:00
antirez a017b7ec0e anet.c: new API anetTcpNonBlockBestEffortBindConnect()
This performs a best effort source address binding attempt. If it is
possible to bind the local address and still have a successful
connect(), then this socket is returned. Otherwise the call is retried
without source address binding attempt.

Related to issues #2609 and #2612.
2015-06-11 14:34:38 +02:00
antirez 8fa8b251a9 anetTcpGenericConnect(), jump to error not end on error
Two code paths jumped to the "ok, return the socket to the user" code
path to handle error conditions.

Related to issues #2609 and #2612.
2015-06-11 14:34:38 +02:00
antirez a401a84eb2 Don't try to bind the source address for MIGRATE
Related to issues #2609 and #2612.
2015-06-11 14:34:38 +02:00
Ben Murphy ffd6637e90 hide access to debug table 2015-06-03 13:33:28 +02:00
linfangrong 0dc6a5d497 Update t_zset.c 2015-06-02 18:12:57 +08:00
antirez 28a250d9e4 Merge branch 'zaddnx' into unstable 2015-05-29 12:26:27 +02:00
antirez d8a8dca7fd ZADD RETCH option renamed CH
From Twitter:

    "@antirez that’s an awfully-named command :(
     http://en.wikipedia.org/wiki/Retching"
2015-05-29 11:32:22 +02:00
antirez c043a4e6f4 ZADD RETCH option: Return number of elements added or updated
Normally ZADD only returns the number of elements added to a sorted
set, using the RETCH option it returns the sum of elements added or
for which the score was updated.
2015-05-29 11:22:03 +02:00
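The difference in return value described above can be shown with a small sketch. The per-element outcome enum and the function name are hypothetical; the real command works on a sorted set rather than on a precomputed list of outcomes.

```c
#include <stddef.h>

/* Hypothetical outcome of processing one score/member pair. */
enum zadd_outcome { ZADD_UNCHANGED, ZADD_ADDED, ZADD_UPDATED };

/* Compute the integer reply: only new elements by default, or new plus
 * updated elements when the "count changed" option is given. */
static long zadd_reply_count(const enum zadd_outcome *outcomes, size_t n,
                             int count_changed) {
    long added = 0, updated = 0;
    for (size_t i = 0; i < n; i++) {
        if (outcomes[i] == ZADD_ADDED) added++;
        else if (outcomes[i] == ZADD_UPDATED) updated++;
    }
    return count_changed ? added + updated : added;
}
```

For two additions, one score update and one untouched element, the reply is 2 without the option and 3 with it.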
antirez 5d32abbb9e ZADD NX and XX options 2015-05-29 09:59:42 +02:00
antirez 382a943414 ZADD implementation able to take options. 2015-05-28 18:10:51 +02:00
Salvatore Sanfilippo a391c36324 Merge pull request #2586 from huachaohuang/patch-1
Update anet.c
2015-05-28 15:10:25 +02:00
Salvatore Sanfilippo c3297a7292 Merge pull request #2587 from itamarhaber/patch-5
Removed incorrect suggestion
2015-05-28 15:09:51 +02:00
Salvatore Sanfilippo 4082c38a60 Merge pull request #2571 from therealbill/sentinel-flushconfig-command
adding a sentinel command: "flushconfig" per RCP4
2015-05-25 12:06:25 +02:00
antirez 20700fe566 Sentinel: clarify effect of resetting failover_start_time. 2015-05-25 10:32:28 +02:00
antirez 5080f2d699 Sentinel: help subcommand in simulate-failure command 2015-05-25 10:24:27 +02:00
antirez fb3af75f74 Sentinel: initial failure simulator implemented
This commit adds the SENTINEL simulate-failure, that sets specific
hooks inside the state machine that will crash Sentinel, for testing
purposes.
2015-05-22 11:49:11 +02:00
Itamar Haber 575eeb1a1c Removed incorrect suggestion
DEL/INCR/DECR and others could be NTH but apparently never made it to the implementation of SORT
2015-05-21 13:24:51 +03:00
Huachao Huang 8c423c0bd6 Update anet.c 2015-05-21 17:40:17 +08:00
antirez c54de703f2 Sentinel: fix sentinelTryConnectionSharing() by checking for no match
Trivial omission of the obvious no-match case.
2015-05-20 09:59:55 +02:00
antirez 164b6bbab5 Merge branch 'sentinel-32' into unstable 2015-05-19 12:26:57 +02:00
antirez abc65e8987 Sentinel: SENTINEL CKQUORUM command
A way for monitoring systems to check that Sentinel is technically able
to reach the quorum and failover, using the currently visible Sentinels.
2015-05-18 12:57:47 +02:00
antirez eb138f1511 Rewrite smoveCommand test with ternary operator 2015-05-15 17:38:48 +02:00
Salvatore Sanfilippo cb9a5a7821 Merge pull request #2529 from gnethercutt/issue_2517
Issue #2517, smove contract violation
2015-05-15 17:36:18 +02:00
antirez b43431ac25 Sentinel: port address update code to shared links logic 2015-05-15 09:47:05 +02:00
antirez 4dee18cb66 Sentinel: config-rewrite unique ID just one time 2015-05-14 17:45:09 +02:00
antirez f9e942d4ae Sentinel: remove debugging message from releaseInstanceLink() 2015-05-14 14:12:45 +02:00
antirez b44c37482c Sentinel: fix access to NULL link->cc in releaseInstanceLink() 2015-05-14 14:08:23 +02:00
antirez 87b6013adb Sentinel: remove SHARED! debugging printf 2015-05-14 13:40:23 +02:00
antirez 5a0516b5b9 Sentinel: rewrite callback chain removing instances with shared links
Otherwise pending commands callbacks will fire with a reference that no
longer exists.
2015-05-14 13:39:26 +02:00
antirez 05dbc82005 Sentinel: debugging code removed from sentinelSendPing() 2015-05-14 10:52:32 +02:00
antirez 58d2bb951a Sentinel: use active/last time for ping logic
The PING trigger was improved again by using two fields instead of a
single one to remember when the last ping was sent:

1. The "active" ping is the time at which we sent the last ping that
still received no reply. However we continue to ping non replying
instances even if they have an old active ping: the link may be
disconnected and reconnected in the meantime so the older pings may get
lost even if it's a TCP socket.

2. The "last" ping is the time at which we really sent the last ping
on the wire, and this is used in order to throttle the amount of pings
we send during failures (when no pong is received).

All in all the failure detector effectiveness should be identical but we
avoid flooding instances with pings during failures or when they are
slow.
2015-05-14 09:56:23 +02:00
antirez 3ab49895b4 Sentinel: limit reconnection frequency to the ping period 2015-05-13 14:23:57 +02:00
antirez 0eb0b55ff0 Sentinel: PING trigger improved
It's ok to ping as soon as the ping period has elapsed since we received
the last PONG, but it's not good that we ping again if there is a
pending ping... With this change we'll send a new ping if there is one
pending only if two times the ping period elapsed since the ping which
is still pending was sent.
2015-05-12 17:03:53 +02:00
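The trigger rule in the commit above can be written as a small predicate. This is a sketch under assumptions: the names are hypothetical, times are in milliseconds, and a pending-ping time of 0 is taken to mean that no ping is waiting for a reply.

```c
typedef long long ms_t; /* hypothetical millisecond timestamp type */

/* Decide whether a new PING should be sent now.
 * last_pong:    when the last PONG was received
 * pending_ping: when the still-unanswered ping was sent, 0 if none
 * period:       the configured ping period */
static int should_send_ping(ms_t now, ms_t last_pong, ms_t pending_ping,
                            ms_t period) {
    /* Too early: the ping period has not elapsed since the last PONG. */
    if (now - last_pong < period) return 0;
    /* A ping is already pending: wait until twice the period has passed
     * since it was sent before pinging again. */
    if (pending_ping != 0 && now - pending_ping <= 2 * period) return 0;
    return 1;
}
```

With a period of 1000 ms, an instance that last replied 2000 ms ago is pinged immediately if nothing is pending, but not if a ping was sent 500 ms ago.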
antirez 9d5e2ed392 Sentinel: same-Sentinel link sharing across masters 2015-05-12 17:03:00 +02:00
antirez e0a5246f06 Sentinel: add sentinelGetInstanceTypeString() function
This is useful for debugging and logging activities: given a
sentinelRedisInstance object returns a C string representing the
instance type: master, slave, sentinel.
2015-05-12 12:12:25 +02:00
Jungtaek Lim 6b953a2681 protocol error log should be seen debug/verbose level 2015-05-12 10:04:52 +09:00
antirez d6e1347869 Sentinel: add link refcount to instance description 2015-05-11 23:49:19 +02:00
therealbill 4e8ccbe7ea adding a sentinel command: "flushconfig"
This new command triggers a config flush to save the in-memory config to
disk. This is useful for cases of a configuration management system or a
package manager wiping out your sentinel config while the process is
still running - and has not yet been restarted. It can also be useful
for scripting a backup and migrate or clone of a running sentinel.
2015-05-11 14:08:57 -05:00
antirez 1029276c0d Sentinel: connection sharing WIP #1 2015-05-11 13:15:26 +02:00
antirez 611283f743 Sentinel: suppress warnings for not used args. 2015-05-08 17:17:59 +02:00
antirez 3eca0752a6 Sentinel: generate +sentinel again, removed in prev commit. 2015-05-08 17:16:48 +02:00
antirez b91434cab1 Sentinel: Use privdata instead of c->data in sentinelReceiveHelloMessages()
This way we may later share the hiredis link "c" among the same Sentinel
instance referenced multiple times for multiple masters.
2015-05-08 17:16:39 +02:00
antirez b849886a0d Sentinel: clarify arguments of SENTINEL IS-MASTER-DOWN-BY-ADDR 2015-05-08 17:16:00 +02:00
antirez a0cd75cd1b Sentinel: don't detect duplicated Sentinels, just address switch
Since with a previous commit Sentinels now persist their unique ID, we
no longer need to detect duplicated Sentinels and re-add them. We remove
and re-add back using different events only in the case of address
switch of the same Sentinel, without generating a new +sentinel event.
2015-05-07 10:07:47 +02:00
antirez 794fc4c9a8 Sentinel: persist its unique ID across restarts.
Previously Sentinels always changed unique ID across restarts, relying
on the server.runid field. This is not a good idea, and forced Sentinel
to rely on detection of duplicated Sentinels and a potentially dangerous
clean-up and re-add operation of the Sentinel instance that was
rebooted.

Now the ID is generated at the first start and persisted in the
configuration file, so that a given Sentinel will have its unique
ID forever (unless the configuration is manually deleted or there is a
filesystem corruption).
2015-05-06 16:19:14 +02:00
Salvatore Sanfilippo 0610cb6296 Merge pull request #2564 from charsyam/feature/compile-error-freebsd-1
fix compile error for struct msghdr in FreeBSD 10
2015-05-05 18:44:46 +02:00
antirez 23e304e313 Substitute DISQUE to REDIS after merge from Disque
Probably this stuff should be called CLIENT_* in order to cross merge
more easily.
2015-05-05 16:36:35 +02:00
antirez 2bc1527a95 processUnblockedClients: don't process clients that blocked again 2015-05-05 16:35:44 +02:00
antirez f7bd816bbb Don't put clients into unblocked list multiple times 2015-05-05 16:32:53 +02:00
clark.kang 8d18692018 fix compile error for struct msghdr 2015-05-05 22:51:27 +09:00
Salvatore Sanfilippo 8af99d0c09 Merge pull request #2530 from FuGangqiang/unstable
fix sds.c
2015-05-04 13:00:02 +02:00
therealbill cc799d253f Making sentinel flush config on +slave
Originally, only the +slave event which occurs when a slave is
reconfigured during sentinelResetMasterAndChangeAddress triggers a flush
of the config to disk.  However, newly discovered slaves don't
apparently trigger this flush but do trigger the +slave event issuance.

So if you start up a sentinel, add a master, then add a slave to the
master (as a way to reproduce it) you'll see the +slave event issued,
but the sentinel config won't be updated with the known-slave entry.

This change makes sentinel do the flush of the config if a new slave is
detected in sentinelRefreshInstanceInfo.
2015-05-04 12:54:13 +02:00
antirez 99c93f34a7 Sentinel: remove useless sentinelFlushConfig() call
To rewrite the config in the loop that adds slaves back after a master
reset, in order to handle switching to another master, is useless: it
just adds latency since there is an fsync call in the inner loop,
without providing any additional guarantee, but the contrary, since if
after the first loop iteration the server crashes we end with just a
single slave entry losing all the other information.

It is wiser to rewrite the config at the end when the full new
state is configured.
2015-05-04 12:50:44 +02:00
Salvatore Sanfilippo 22d00d80ce Merge pull request #2542 from yossigo/lua_client_buffer_crash
Fix Redis server crash when Lua command exceeds client output buffer limit.
2015-05-04 12:19:44 +02:00
Salvatore Sanfilippo 827d07f005 Merge pull request #2551 from charsyam/feature/sentinel-memory-leak-1
fix sentinel memory leak
2015-05-04 12:17:41 +02:00
antirez 9e7f39d29d Add header guard for ziplist.h
As suggested in #2543.
2015-04-29 10:33:21 +02:00
clark.kang eff212ea95 fix sentinel memory leak 2015-04-29 00:05:26 +09:00
antirez 1b25757f41 sha1.c: use standard uint32_t. 2015-04-27 12:07:49 +02:00
Yossi Gottlieb 49c1b60bd8 Fix Redis server crash when Lua command exceeds client output buffer
limit.
2015-04-26 12:04:16 +03:00
FuGangqiang 26a1a08fc7 sdsfree x and y 2015-04-20 23:03:34 +08:00
FuGangqiang 239494db64 fix doc example 2015-04-20 21:46:48 +08:00
FuGangqiang 42b36c5ce9 fix typo 2015-04-19 23:42:27 +08:00
Glenn Nethercutt 626b4f6907 uphold the smove contract to return 0 when the element is not a member of the source set, even if source=dest 2015-04-17 09:27:54 -04:00
antirez 6c60526db9 Net: improve prepareClientToWrite() error handling and comments.
When we fail to set up the write handler it does not make sense to keep
the client around, since it is missing writes: whether it is a client or a slave,
the connection should be terminated ASAP.

Moreover what the function does exactly with its return value, and in
which case the write handler is installed on the socket, was not clear,
so the function's comments are improved to make the goals of the function
more obvious.

Also related to #2485.
2015-04-01 10:07:45 +02:00
Oran Agra 159875b5a3 fixes to diskless replication.
The master was closing the connection if the RDB transfer took a long time,
and it also sent PINGs to the slave before it got the initial ACK, in which case the slave wouldn't be able to find the EOF marker.
2015-03-31 23:42:08 +03:00
antirez 66f9393ee4 Fix setTypeNext call assuming NULL can be passed.
Segfault introduced during a refactoring / warning suppression a few
commits away. This particular call assumed that it is safe to pass NULL
to the object pointer argument when we are sure the set has a given
encoding. This can't be assumed and is now guaranteed to segfault
because of the new API of setTypeNext().
2015-03-31 15:26:35 +02:00
antirez 7f330b16f9 Set: setType*() API more defensive initializing both values.
This change fixes several warnings when compiling at -O3 level with GCC
4.8.2, and at the same time, in case of misuse of the API, we have the
pointer initialized to NULL or the integer initialized to the value
-123456789, which is easy to spot with the naked eye.
2015-03-30 12:24:57 +02:00
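A minimal sketch of the defensive pattern described in the commit above. The function and the encoding constants are hypothetical and much simpler than the real setType*() API; only the idea of always initializing both output values, using NULL and -123456789 as in the commit message, is taken from the source.

```c
#include <stddef.h>

#define ENCODING_STR 0   /* Hypothetical encoding tags. */
#define ENCODING_INT 1

/* Hypothetical iterator step. Only one of the two outputs is meaningful
 * for a given encoding, but both are always initialized, so reading the
 * wrong one yields NULL or an easily recognizable integer instead of
 * uninitialized memory. */
static int next_element(int encoding, const char **strele, long long *intele) {
    *strele = NULL;
    *intele = -123456789;
    if (encoding == ENCODING_STR) {
        *strele = "member";
    } else {
        *intele = 42;
    }
    return encoding;
}
```

A caller that mistakenly reads the integer output of a string-encoded element sees -123456789, which stands out immediately in a debugger or a log.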
antirez 34460dd6ee Check bio.c job type at thread startup.
Another one just to avoid a warning. Slightly more defensive code
anyway.
2015-03-30 12:17:46 +02:00
antirez 221d2932b5 Ensure array index is in range in addReplyLongLongWithPrefix().
Change done in order to remove a warning and improve code robustness. No
actual bug here.
2015-03-30 11:54:49 +02:00
antirez 068d3c9737 dict.c: convert types to unsigned long where appropriate.
No semantic changes, since to make dict.c truly able to scale over the
32 bit table size limit, the hash functions and other internals
related to hash function output should be 64 bit ready.
2015-03-27 10:14:52 +01:00
antirez 9cd8333ed2 dict.c: add casting to avoid compilation warning.
rehashidx is always positive in the two code paths, since the only
negative value it could have is -1 when there is no rehashing in
progress, and the condition is explicitly checked.
2015-03-27 10:12:25 +01:00
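The argument in the commit above, that the cast is safe because -1 is the only negative value and it is checked first, can be illustrated with a reduced example. The struct and function names here are hypothetical and are not the dict.c definitions.

```c
/* Hypothetical reduced struct: only the fields needed for the example. */
struct table_state {
    long rehashidx;      /* -1 when no rehashing is in progress. */
    unsigned long size;  /* Number of buckets in the table. */
};

/* Returns 1 if an incremental rehash still has buckets to visit. */
static int rehash_pending(const struct table_state *t) {
    /* The only negative value is -1, and it is checked explicitly. */
    if (t->rehashidx == -1) return 0;
    /* rehashidx is known to be >= 0 here, so the cast cannot change its
     * value, and the comparison with the unsigned size no longer mixes
     * signed and unsigned operands. */
    return (unsigned long)t->rehashidx < t->size;
}
```

Without the explicit -1 check, the cast would turn -1 into a huge unsigned value, so the order of the two statements matters.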
antirez c3ad70901f Replication: disconnect blocked clients when switching to slave role.
Bug as old as Redis and blocking operations. It's hard to trigger since it
only happens on instance role switch, but the results are quite bad
since an inconsistency between master and slave is created.

How to trigger the bug is a good description of the bug itself.

1. Client does "BLPOP mylist 0" in master.
2. Master is turned into slave, that replicates from New-Master.
3. Client does "LPUSH mylist foo" in New-Master.
4. New-Master propagates write to slave.
5. Slave receives the LPUSH, the blocked client gets served.

Now Master "mylist" key has "foo", Slave "mylist" key is empty.

Highlights:

* At step "2" above, the client remains attached, basically escaping any
  check performed during command dispatch: read only slave, in that case.
* At step "5" the slave (that was the master), serves the blocked client
  consuming a list element, which is not consumed on the master side.

This scenario is technically likely to happen during failovers, however
since Redis Sentinel already disconnects clients using the CLIENT
command when changing the role of the instance, the bug is avoided in
Sentinel deployments.

Closes #2473.
2015-03-24 16:00:09 +01:00
antirez 9b7f8b1c9b Cluster: redirection refactoring + handling of blocked clients.
There was a bug in Redis Cluster caused by clients blocked in a blocking
list pop operation, for keys no longer handled by the instance, or
in a condition where the cluster became down after the client blocked.

A typical situation is:

1) BLPOP <somekey> 0
2) <somekey> hash slot is resharded to another master.

The client will block forever in this case.

A symmetrical non-cluster-specific bug happens when an instance is
turned from master to slave. In that case it is more serious since this
will desynchronize data between slaves and masters. This other bug was
discovered as a side effect of thinking about the bug explained and
fixed in this commit, but will be fixed in a separate commit.
2015-03-24 11:56:24 +01:00
antirez 2f4240b9d9 Cluster: fix Lua scripts replication to slave nodes. 2015-03-22 22:24:08 +01:00
antirez 94030fa4d7 Two cluster.c comments improved. 2015-03-21 12:12:23 +01:00
antirez 2950824ab6 Cluster: TAKEOVER option for manual failover. 2015-03-21 11:54:32 +01:00
antirez d544600aa5 Fix typo in beforeSleep() comment. 2015-03-21 09:19:08 +01:00
antirez 2b278a3394 Net: processUnblockedClients() and clientsArePaused() minor changes.
1. No need to set btype in processUnblockedClients(), since clients
   flagged REDIS_UNBLOCKED should have it already cleared.
2. When putting clients in the unblocked clients list, clientsArePaused()
   should flag them with REDIS_UNBLOCKED. Not strictly needed with the
   current code but is more coherent.
2015-03-21 09:13:29 +01:00
antirez 5fe4a23131 Net: clientsArePaused() should not touch blocked clients.
When the list of unblocked clients was processed, btype was set to
blocking type none, but the client remained flagged with REDIS_BLOCKED.
When timeout is reached (or when the client disconnects), unblocking it
will trigger an assertion.

There is no need to process pending requests from blocked clients, so
now clientsArePaused() just avoids touching blocked clients.

Close #2467.
2015-03-21 09:04:38 +01:00
antirez a7010ae208 Cluster: non-conditional steps of slave failover refactored into a function. 2015-03-20 17:56:21 +01:00
antirez 230d141420 Cluster: separate unknown master check from the rest.
In no case should we attempt a failover if myself->slaveof is
NULL.
2015-03-20 16:56:59 +01:00
antirez 4f2555aa17 Cluster: refactoring around configEpoch handling.
This commit moves the process of generating a new config epoch without
consensus out of the clusterCommand() implementation, in order to make
it reusable for other reasons (current target is to have a CLUSTER
FAILOVER option forcing the failover when no master majority is
reachable).

Moreover the commit moves other functions which are similarly related to
config epochs in a new logical section of the cluster.c file, just for
clarity.
2015-03-20 16:42:52 +01:00
antirez 25c0f5ac63 Cluster: better cluster state transition handling.
Before we relied on the global cluster state to make sure all the hash
slots are linked to some node, when getNodeByQuery() is called. So
finding the hash slot unbound was checked with an assertion. However
this is fragile. The cluster state is often updated in the
clusterBeforeSleep() function, and not ASAP on state change, so it may
happen to process clients with a cluster state that is 'ok' but yet
certain hash slots set to NULL.

With this commit the condition is also checked in getNodeByQuery() and
reported with an identical error code of -CLUSTERDOWN but a slightly
different error message so that we have more debugging clues in the
future.

Root cause of issue #2288.
2015-03-20 09:59:28 +01:00
antirez 2ecb5edf34 Cluster: move clusterBeforeSleep() call before unblocked clients processing.
Related to issue #2288.
2015-03-20 09:47:54 +01:00
antirez 438a1a84e8 Cluster: more robust slave check in CLUSTER REPLICATE.
There are rare conditions where node->slaveof may be NULL even if the
node is a slave. To check by flag is much more robust.
2015-03-18 12:10:14 +01:00
Salvatore Sanfilippo 61fb441c8c Merge pull request #2386 from inkel/sentinel-add-client-command
Support CLIENT commands in Redis Sentinel
2015-03-13 18:23:36 +01:00
antirez 93b1320fac Cluster: fix CLUSTER NODES optimization error in 'j' increment. 2015-03-13 13:16:35 +01:00
antirez e1b6c9dd18 Cluster: CLUSTER NODES speedup. 2015-03-13 11:26:04 +01:00
antirez b2e8eca70d Config: improve loglevel message error. 2015-03-12 14:43:07 +01:00
antirez 792c531688 CONFIG GET syslog-facility added.
Was missing for some reason. Trivial to add after config.c refactoring.
2015-03-12 09:59:10 +01:00
antirez 50b41b6ad3 CONFIG SET refactoring: use enums in more places. 2015-03-11 23:21:04 +01:00
antirez 535b295f96 Net: better Unix socket error. Issue #2449. 2015-03-11 17:24:55 +01:00
antirez 4cd4910f26 Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-03-11 17:05:14 +01:00
antirez 8e219224b9 CONFIG refactoring: configEnum abstraction.
Still many things to convert inside config.c in the next commits.
Some const safety in String objects creation and addReply() family
functions.
2015-03-11 17:00:13 +01:00
antirez 4a2a0d9e9d CONFIG SET: memory and special field macros. 2015-03-11 09:02:04 +01:00
Michel Martens 6201eb0c55 Add command CLUSTER MYID 2015-03-10 16:43:19 +00:00
antirez 3da7408359 CONFIG SET: additional 2 numerical fields refactored. 2015-03-10 13:00:36 +01:00
antirez d68f28a367 CONFIG SET refactoring of bool and value fields.
Not perfect since The Solution IMHO is to have a DSL with a table of
configuration functions with type, limits, and aux functions to handle
the odd ones. However this hacky macro solution is already better and
forces us to put limits on the range of numerical fields.

More field types to be refactored in the next commits hopefully.
2015-03-10 12:37:39 +01:00
antirez a664040eb7 Config: activerehashing option support in CONFIG SET. 2015-03-08 15:33:42 +01:00
antirez 509a6cc1e8 Fix iterator for issue #2438.
Iterator misuse due to analyzeLatencyForEvent() accessing the
dictionary during the iteration, without the iterator being
declared as safe.
2015-03-04 11:48:19 -08:00
antirez c77081a45a Migrate: replace conditional with pre-computed value. 2015-02-27 22:33:54 +01:00
antirez 4f56f035a7 String: use new sdigits10() API in stringObjectLen().
Should be much faster, and regardless, the code is more obvious now
compared to generating a string just to get the return value of the
ll2string() function.
2015-02-27 16:09:17 +01:00
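To illustrate why computing the length directly is cheaper than formatting a string and measuring it, here is a minimal sketch of a radix 10 length function for signed integers. It is a hypothetical helper written for this example, not the Redis sdigits10() implementation, and it uses a plain division loop with no further optimization.

```c
#include <stdint.h>

/* Hypothetical helper: number of characters needed to print v in
 * radix 10, including the leading minus sign for negative values. */
static uint32_t decimal_len(int64_t v) {
    uint32_t len = 0;
    uint64_t u;
    if (v < 0) {
        len = 1;  /* Room for the '-' sign. */
        /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
        u = (uint64_t)0 - (uint64_t)v;
    } else {
        u = (uint64_t)v;
    }
    /* Count digits; the do/while form makes 0 count as one digit. */
    do {
        len++;
        u /= 10;
    } while (u != 0);
    return len;
}
```

No buffer is written at any point, so the function can be called on an integer-encoded value without producing a temporary string.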
antirez 0e5e8ca9e6 Utils: Include stdint.h and fix signedness in sdigits10(). 2015-02-27 16:03:02 +01:00
antirez 0ace1e6d04 Hash: HSTRLEN crash fixed when getting len of int-encoded value 2015-02-27 15:37:04 +01:00
antirez 4e54b85a19 Hash: HSTRLEN (was HVSTRLEN) improved.
1. HVSTRLEN -> HSTRLEN. It's unlikely one needs the length of the key,
   not clear how the API would work (by value does not make sense) and
   there will be better names anyway.
2. Default is to return 0 when field is missing.
3. Default is to return 0 when key is missing.
4. The implementation was slower than needed, and produced unnecessary COW.

Related issue #2415.
2015-02-27 15:31:55 +01:00
antirez 8855b8161f Merge branch 'unstable' of github.com:/antirez/redis into unstable 2015-02-27 15:24:25 +01:00
Salvatore Sanfilippo b49c00a79c Merge pull request #2415 from landmime/unstable
added a new hvstrlen command
2015-02-27 15:24:04 +01:00
antirez d8f8b0575f Hash: API to get value string len by field name. 2015-02-27 15:22:49 +01:00
antirez c95507881a Utils: added function to get radix 10 string length of signed integer. 2015-02-27 15:22:10 +01:00
antirez 7e6b4ea67b server.current_client fix and minor refactoring.
Thanks to @codeslinger (Toby DiPasquale) for identifying the issue.

Related to issue #2409.
2015-02-27 14:17:46 +01:00
antirez 832b0c7cce Improvements to PR #2425
1. Remove useless "cs" initialization.
2. Add a "select" var to capture a condition checked multiple times.
3. Avoid duplication of the same if (!copy) conditional.
4. Don't increment dirty if copy is given (no deletion is performed),
   otherwise we propagate MIGRATE when not needed.
2015-02-26 10:27:56 +01:00
Tommy Wang 7fda935ad3 Add last_dbid to migrateCachedSocket to avoid redundant SELECT
Avoid redundant SELECT calls when continuously migrating keys to
the same dbid within a target Redis instance.
2015-02-26 10:18:43 +01:00
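The caching idea in the commit above can be sketched as follows. Only the `last_dbid` field name comes from the commit subject; the struct, the `prepare_db()` function and the counter are hypothetical, and the SELECT is simulated rather than written to a socket.

```c
/* Hypothetical cached connection: remembers the DB selected last. */
struct cached_conn {
    int last_dbid;  /* -1 when no SELECT was sent on this connection yet. */
};

static int select_sent = 0;  /* Counts simulated SELECT commands. */

/* Prepares the connection for a write into dbid. A SELECT is emitted
 * only when the target DB differs from the one selected last.
 * Returns 1 if a SELECT was emitted, 0 if it was skipped. */
static int prepare_db(struct cached_conn *c, int dbid) {
    if (c->last_dbid == dbid) return 0;
    select_sent++;        /* Stand-in for writing "SELECT <dbid>". */
    c->last_dbid = dbid;
    return 1;
}
```

Migrating many keys in a row to the same DB then costs one SELECT for the whole run instead of one per key.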
antirez 27c30b0e84 Cast sentlen to int before comparison with bufpos.
This is safe since bufpos is small, inside the range of the local
client buffer.
2015-02-25 10:33:37 +01:00
Salvatore Sanfilippo 9454f7b3db Merge pull request #2050 from mattsta/bitops-no-overalloc
Bitops: Stop overallocating storage space on set
2015-02-25 10:18:07 +01:00
Salvatore Sanfilippo e00cb78f67 Merge pull request #2054 from mattsta/fix-set-sentinel-quorum
Sentinel: Add initial quorum bounds check
2015-02-25 10:09:40 +01:00
Matt Stancliff 47ab570441 Fix types broken by previous type cleanup
Revert some size_t back to off_t
reply_bytes needs to be 64 bits everywhere
Revert bufpos to int since it's a max of 16k into buf[]
2015-02-24 17:39:59 +01:00
Salvatore Sanfilippo d83c810265 Merge pull request #2301 from mattsta/fix/lengths
Improve type correctness
2015-02-24 17:22:53 +01:00
Salvatore Sanfilippo 46bd13b806 Merge pull request #1966 from mattsta/fix-sentinel-info
Sentinel: Improve INFO command behavior
2015-02-24 17:20:09 +01:00