## Introduction

Redis introduced IO threads in 6.0, allowing them to handle client request reading, command parsing and reply writing, thereby improving performance. The current IO thread implementation has a few drawbacks.

- The main thread is blocked during IO thread read/write operations and must wait for all IO threads to complete their current tasks before it can continue execution. In other words, the entire process is synchronous, which prevents the efficient utilization of multi-core CPUs for parallel processing.
- Even a moderate increase in the number of clients and requests drives all IO threads to full CPU utilization, because of the busy-wait mechanism they use. This makes it hard to determine which part of Redis has reached its bottleneck.
- When IO threads are enabled together with TLS and io-threads-do-reads, a disconnection of a connection with pending data may result in it being assigned to multiple IO threads simultaneously. This can cause race conditions and trigger assertion failures. Related issue: redis#12540

Therefore, we designed an asynchronous IO threads solution. The IO threads adopt an event-driven model: the main thread is dedicated to command processing, while the IO threads handle client read and write operations in parallel.

## Implementation

### Overall

As before, we did not change the fact that all client commands must be executed on the main thread, because Redis was originally designed to be single-threaded, and processing commands in a multi-threaded manner would inevitably introduce numerous race and synchronization issues. But now each IO thread has an independent event loop, so IO threads can use a multiplexing approach to handle client read and write operations, eliminating the CPU overhead caused by busy-waiting.

The execution flow can be briefly described as follows: the main thread assigns clients to IO threads after accepting connections; the IO threads notify the main thread when clients finish reading and parsing queries; the main thread then processes the queries from the IO threads and generates replies; the IO threads write the replies to clients after receiving the clients list from the main thread, and then continue handling client read and write events.

### Each IO thread has an independent event loop

We now assign each IO thread its own event loop. This approach eliminates the need for the main thread to perform the costly `epoll_wait` operation for handling connections (except for specific ones). Instead, the main thread processes requests from the IO threads and hands them back once completed, fully offloading read and write events to the IO threads.

Additionally, all TLS operations, including handling pending data, have been moved entirely to the IO threads. This resolves the issue where io-threads-do-reads could not be used with TLS.

### Event-notified client queue

To facilitate communication between the IO threads and the main thread, we designed an event-notified client queue. Each IO thread and the main thread have two such queues to store clients waiting to be processed, and the queues are integrated with the event loop so they can be handled there. We use a pthread mutex to ensure the safety of queue operations as well as data visibility and ordering. Race conditions are minimized because each IO thread and the main thread operate on independent queues, avoiding thread suspension due to lock contention. We also implemented an event notifier based on `eventfd` or `pipe` to support event-driven handling.
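As a rough, self-contained sketch of this mechanism (not the actual Redis code; all names are illustrative), a producer pushes a client onto a mutex-protected list and signals an `eventfd`, and the consumer's event loop drains the whole queue once the notifier becomes readable:

```
/* Illustrative sketch of an event-notified client queue: a producer thread
 * pushes items under a mutex and signals an eventfd; the consumer's event
 * loop polls the eventfd and then drains the whole queue in one pass.
 * Compile with: gcc -pthread queue_sketch.c */
#include <pthread.h>
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <unistd.h>

typedef struct io_queue {
    pthread_mutex_t lock; /* protects items/len/cap, provides visibility and ordering */
    void **items;
    size_t len, cap;
    int notifier;         /* eventfd; a pipe works where eventfd is unavailable */
} io_queue;

static void queue_init(io_queue *q) {
    pthread_mutex_init(&q->lock, NULL);
    q->items = NULL; q->len = q->cap = 0;
    q->notifier = eventfd(0, EFD_NONBLOCK);
}

/* Producer side, e.g. the main thread handing a client over to an IO thread. */
static void queue_push(io_queue *q, void *client) {
    pthread_mutex_lock(&q->lock);
    if (q->len == q->cap) {
        q->cap = q->cap ? q->cap * 2 : 16;
        q->items = realloc(q->items, q->cap * sizeof(void *));
    }
    q->items[q->len++] = client;
    pthread_mutex_unlock(&q->lock);
    uint64_t one = 1;
    (void)write(q->notifier, &one, sizeof(one)); /* wake up the consumer's event loop */
}

/* Consumer side, registered as a read handler for `notifier` in the event loop. */
static void queue_drain(io_queue *q, void (*handle)(void *client)) {
    uint64_t count;
    (void)read(q->notifier, &count, sizeof(count)); /* clear the notification */
    pthread_mutex_lock(&q->lock);
    void **batch = q->items;
    size_t n = q->len;
    q->items = NULL; q->len = q->cap = 0; /* detach the batch to keep the lock short */
    pthread_mutex_unlock(&q->lock);
    for (size_t i = 0; i < n; i++) handle(batch[i]);
    free(batch);
}

static void handle_client(void *client) { printf("processing %s\n", (char *)client); }

int main(void) {
    io_queue q;
    queue_init(&q);
    queue_push(&q, "client-1");
    queue_push(&q, "client-2");

    /* Stand-in for one event-loop iteration: poll the notifier, then drain. */
    struct pollfd pfd = { .fd = q.notifier, .events = POLLIN };
    if (poll(&pfd, 1, 100) > 0) queue_drain(&q, handle_client);
    return 0;
}
```

Because each IO thread and the main thread own their own queues, producers and consumers rarely contend on the same lock.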
### Thread safety

Since the main thread and IO threads can execute in parallel, we must handle data races carefully.

**client->flags**

The primary tasks of IO threads are reading and writing, i.e. `readQueryFromClient` and `writeToClient`. However, IO threads and the main thread may concurrently modify or access `client->flags`, leading to potential race conditions. To address this, we introduced an io-flags variable to record operations performed by IO threads, thereby avoiding races on `client->flags`.

**Pause IO thread**

In the main thread, we sometimes need to operate on an IO thread's data, for example to uninstall an event handler, access or modify the query/output buffer, or resize the event loop; we need a clean and safe context to do that. We pause the IO thread in `IOThreadBeforeSleep`, do the work, and then resume it. To avoid suspending threads, we use busy waiting to confirm the target status, and we use atomic variables to ensure memory visibility and ordering. We introduced the following functions to pause/resume IO threads:

```
pauseIOThread, resumeIOThread
pauseAllIOThreads, resumeAllIOThreads
pauseIOThreadsRange, resumeIOThreadsRange
```

Testing has shown that `pauseIOThread` is highly efficient, allowing the main thread to execute nearly 200,000 operations per second during stress tests. Similarly, `pauseAllIOThreads` with 8 IO threads can handle up to nearly 56,000 operations per second. However, operations performed between pausing and resuming IO threads must be quick; otherwise, they could cause the IO threads to reach full CPU utilization.

**freeClient and freeClientAsync**

The main thread may need to terminate a client currently running on an IO thread, for example due to ACL rule changes, reaching the output buffer limit, or evicting a client. In such cases, we need to pause the IO thread to safely operate on the client.

**maxclients and maxmemory-clients updating**

When adjusting `maxclients`, we need to resize the event loop for all IO threads. Similarly, when modifying `maxmemory-clients`, we need to traverse all clients to calculate their memory usage. To ensure safe operations, we pause all IO threads during these adjustments.

**Client info reading**

The main thread may need to read a client's fields to generate a descriptive string, such as for the `CLIENT LIST` command or for logging purposes. In such cases, we need to pause the IO thread handling that client. If information for all clients needs to be displayed, all IO threads must be paused.

**Tracking redirect**

Redis supports the tracking feature and can even send invalidation messages to a connection with a specified ID. But the target client may be running on an IO thread; directly manipulating the client's output buffer is not thread-safe, and the IO thread may not be aware that the client requires a response. In such cases, we pause the IO thread handling the client, modify the output buffer, and install a write event handler to ensure proper handling.

**clientsCron**

In the `clientsCron` function, the main thread needs to traverse all clients to perform operations such as timeout checks, verifying whether they have reached the soft output buffer limit, resizing the output/query buffer, or updating memory usage. To safely operate on a client, the IO thread handling that client must be paused. Pausing the IO thread for each client individually would be very inefficient; conversely, pausing all IO threads simultaneously would be costly, especially when there are many IO threads, as `clientsCron` is invoked relatively frequently. To address this, we adopted a batched approach for pausing IO threads: at most 8 IO threads are paused at a time, and the operations mentioned above are performed only on clients running in the paused IO threads, significantly reducing overhead while maintaining safety.
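The handshake can be illustrated with a standalone sketch (a simplified model, not the actual `pauseIOThread`/`IOThreadBeforeSleep` code): the main thread sets an atomic pause request and busy-waits for the IO thread to acknowledge it from its before-sleep hook, and the IO thread then spins until it is resumed.

```
/* Standalone sketch of a pause/resume handshake between a "main" thread and
 * one "IO" thread, using C11 atomics and busy waiting on both sides.
 * Compile with: gcc -pthread pause_sketch.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

typedef enum { IO_RUNNING, IO_PAUSE_REQUESTED, IO_PAUSED } io_state;

typedef struct {
    _Atomic io_state state;
    _Atomic int stop;
} io_thread_ctx;

/* Called by the IO thread at the top of its event-loop iteration
 * (the equivalent of a before-sleep hook). */
static void io_thread_check_pause(io_thread_ctx *ctx) {
    if (atomic_load(&ctx->state) != IO_PAUSE_REQUESTED) return;
    atomic_store(&ctx->state, IO_PAUSED);          /* acknowledge the pause request */
    while (atomic_load(&ctx->state) == IO_PAUSED)  /* spin until resumed */
        ;
}

/* Called by the main thread when it needs a safe window to touch the
 * IO thread's data (event handlers, buffers, ...). */
static void pause_io_thread(io_thread_ctx *ctx) {
    atomic_store(&ctx->state, IO_PAUSE_REQUESTED);
    while (atomic_load(&ctx->state) != IO_PAUSED)  /* busy-wait for the ack */
        ;
}

static void resume_io_thread(io_thread_ctx *ctx) {
    atomic_store(&ctx->state, IO_RUNNING);
}

static void *io_thread_main(void *arg) {
    io_thread_ctx *ctx = arg;
    while (!atomic_load(&ctx->stop)) {
        io_thread_check_pause(ctx);
        /* ... handle read/write events here ... */
    }
    return NULL;
}

int main(void) {
    io_thread_ctx ctx = { IO_RUNNING, 0 };
    pthread_t tid;
    pthread_create(&tid, NULL, io_thread_main, &ctx);

    pause_io_thread(&ctx);
    printf("IO thread paused: safe to touch its clients\n");
    /* Keep this window short; while paused, the IO thread just burns CPU. */
    resume_io_thread(&ctx);

    atomic_store(&ctx.stop, 1);
    pthread_join(tid, NULL);
    return 0;
}
```

Because both sides spin instead of blocking on a lock, neither thread is ever suspended by the kernel, which is also why the window between pause and resume has to stay short.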
### Observability

In the current design, the main thread always assigns clients to the IO thread with the fewest clients. To clearly observe the number of clients handled by each IO thread, we added a new section to the INFO output. The `INFO THREADS` section shows the client count for each IO thread.

```
# Threads
io_thread_0:clients=0
io_thread_1:clients=2
io_thread_2:clients=2
```

Additionally, in the `CLIENT LIST` output, we added a field to indicate the thread to which each client is assigned.

`id=244 addr=127.0.0.1:41870 laddr=127.0.0.1:6379 ... resp=2 lib-name= lib-ver= io-thread=1`

## Trade-off

### Special Clients

For certain special types of clients, keeping them running on IO threads would result in severe race issues that are difficult to resolve. Therefore, we chose not to offload these clients to the IO threads.

For replica, monitor, subscribe, and tracking clients, the main thread may write a reply to them directly when conditions are met. The resulting race issues are difficult to resolve, so these clients are processed in the main thread. This includes Lua debug clients as well, since we may operate on the connection directly.

For blocking clients, after the IO thread reads and parses a command and hands it over to the main thread, if the client is identified as a blocking type, it remains in the main thread. Once the blocking operation completes and the reply is generated, the client is transferred back to the IO thread to send the reply and wait for event triggers.

### Clients Eviction

To support client eviction, it is necessary to update each client's memory usage promptly during operations such as read, write, or command execution. However, when a client operates on an IO thread, it is not feasible to update the memory usage immediately due to the risk of data races. As a result, memory usage can only be updated either in the main thread while processing commands or periodically in `clientsCron`. The downside of this approach is that updates might be delayed by up to one second, which could impact the precision of memory management for eviction.

To avoid incorrectly evicting clients, we adopted a best-effort compensation solution: when we decide to evict a client, we update its memory usage again before evicting it; if the memory used by the client has not decreased, or its memory usage bucket has not changed, we evict it; otherwise, we do not. However, we have not completely solved this problem: due to the delay in memory usage updates, we may still make incorrect eviction decisions.
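The compensation check itself is simple; the following self-contained toy illustrates it with hypothetical types and helpers (the real logic lives in the client-eviction code path and differs in detail):

```
/* Toy illustrating the compensation check: the tracked usage may be up to a
 * second stale, so recompute it before evicting and skip the eviction if the
 * client actually shrank into a smaller bucket. Hypothetical types/helpers. */
#include <stddef.h>
#include <stdio.h>

typedef struct { size_t tracked_usage; size_t real_usage; } toy_client;

/* Rough log2-style size class, standing in for the real usage buckets. */
static int usage_bucket(size_t usage) {
    int b = 0;
    while (usage > 1024) { usage >>= 1; b++; }
    return b;
}

static size_t recompute_usage(toy_client *c) { return c->real_usage; }

static int should_evict(toy_client *c) {
    size_t stale = c->tracked_usage;
    size_t fresh = recompute_usage(c);
    /* Evict only if the fresh measurement does not contradict the stale one:
     * usage did not decrease, or the client stayed in the same bucket. */
    return fresh >= stale || usage_bucket(fresh) == usage_bucket(stale);
}

int main(void) {
    toy_client big_and_stable = { 1 << 20, 1 << 20 };
    toy_client shrank_a_lot   = { 1 << 20, 2048 };
    printf("stable client evicted? %d\n", should_evict(&big_and_stable)); /* 1 */
    printf("shrunk client evicted? %d\n", should_evict(&shrank_a_lot));   /* 0 */
    return 0;
}
```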
### Defragment

In the majority of cases, we do NOT use the data from argv directly in the db:

1. Key names: we store a copy that we allocate in the main thread, see `sdsdup()` in `dbAdd()`.
2. Hash keys and values: we store the key as an hfield and the value as an sds, see `hfieldNew()` and `sdsdup()` in `hashTypeSet()`.
3. Other datatypes: they don't even use SDS, so there are no reference issues.

But in some cases the data from argv may be retained by the main thread. As a result, during fragmentation cleanup, we need to move allocations from the IO thread's arena to the main thread's arena. We always allocate new memory in the main thread's arena, but the memory released by IO threads may not yet have been reclaimed. This ultimately causes the fragmentation rate to be higher compared to creating and allocating entirely within a single thread.

The following cases lead to memory allocated by an IO thread being kept by the main thread:

1. String-related commands: `append`, `getset`, `mset` and `set`. If `tryObjectEncoding()` does not change argv, we keep it directly in the main thread, see the code in `tryObjectEncoding()` (specifically `trimStringObjectIfNeeded()`).
2. Blocking-related commands: the key names will be kept in `c->db->blocking_keys`.
3. `watch` command: the key names will be kept in `c->db->watched_keys`.
4. `[s]subscribe` commands: the channel name will be kept in `serverPubSubChannels`.
5. `script load` command: the script will be kept in `server.lua_scripts`.
6. Some module APIs: `RM_RetainString`, `RM_HoldString`.

Those issues will be handled in other PRs.

## Testing

### Functional Testing

The commit enabling IO threads has passed all TCL tests, but we made some changes:

**Client query buffer**: In the original code, when using a reusable query buffer, ownership of the query buffer would be released after the command was processed. However, with IO threads enabled, the client transitions from an IO thread to the main thread for processing, which causes the ownership release to occur before command execution. As a result, when IO threads are enabled, the client's information will never indicate that a shared query buffer is in use. Therefore, we skip the corresponding query buffer tests in this case.

**Defragment**: We added a new defragmentation test to verify the effect of IO threads on defragmentation.

**Command delay**: For deferred clients in TCL tests, delays may occur because clients are assigned to different threads for execution. To address this, we introduced conditional waiting: the process proceeds to the next step only when the `client list` output contains the corresponding commands.

### Sanitizer Testing

The commit passed all TCL tests and reported no errors when compiled with the `-fsanitize=thread` and `-fsanitize=address` options enabled. But we made the following modification: we suppressed the sanitizer warnings for clients with watched keys when updating `client->flags`. IO threads read `client->flags` but never modify it or read the `CLIENT_DIRTY_CAS` bit, and only the main thread modifies this bit, so there is no actual data race.

## Others

### IO thread number

In the new multi-threaded design, the main thread is primarily focused on command processing to improve performance. Typically, the main thread does not handle regular client I/O operations but is responsible for clients such as replication and tracking clients. To avoid breaking changes, we still consider the main thread as the first IO thread.

When the io-threads configuration is set to a low value (e.g., 2), performance does not show a significant improvement compared to a single-threaded setup for simple commands (such as SET or GET), as the main thread does not consume much CPU for these simple operations. This results in underutilized multi-core capacity. However, for more complex commands, even a low number of IO threads may still be beneficial. Therefore, it's important to tune `io-threads` based on your own performance tests.
Additionally, you can monitor the CPU utilization of the main thread and IO threads using `top -H -p $redis_pid`, which makes it easy to identify where the bottleneck is. If the IO threads are the bottleneck, increasing `io-threads` will improve performance. If the main thread is the bottleneck, overall performance can only be scaled by increasing the number of shards or replicas.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: oranagra <oran@redislabs.com>
README.md
This README is just a fast *quick start* document. You can find more detailed documentation at redis.io.
What is Redis?
Redis is often referred to as a data structures server. What this means is that Redis provides access to mutable data structures via a set of commands, which are sent using a server-client model with TCP sockets and a simple protocol. So different processes can query and modify the same data structures in a shared way.
Data structures implemented into Redis have a few special properties:
- Redis takes care to store them on disk, even if they are always served and modified in the server memory. This means that Redis is fast, but also non-volatile.
- The implementation of data structures emphasizes memory efficiency, so data structures inside Redis will likely use less memory compared to the same data structure modelled using a high-level programming language.
- Redis offers a number of features that are natural to find in a database, like replication, tunable levels of durability, clustering, and high availability.
Another good example is to think of Redis as a more complex version of memcached, where the operations are not just SETs and GETs, but operations that work with complex data types like Lists, Sets, ordered data structures, and so forth.
If you want to know more, this is a list of selected starting points:
- Introduction to Redis data types: https://redis.io/docs/latest/develop/data-types/
- The full list of Redis commands: https://redis.io/commands
- There is much more inside the official Redis documentation: https://redis.io/documentation
What is Redis Community Edition?
Redis OSS was renamed Redis Community Edition (CE) with the v7.4 release.
Redis Ltd. also offers Redis Software, a self-managed software with additional compliance, reliability, and resiliency for enterprise scaling, and Redis Cloud, a fully managed service integrated with Google Cloud, Azure, and AWS for production-ready apps.
Read more about the differences between Redis Community Edition and Redis here.
Building Redis
Redis can be compiled and used on Linux, OSX, OpenBSD, NetBSD, FreeBSD. We support big endian and little endian architectures, and both 32 bit and 64 bit systems.
It may compile on Solaris derived systems (for instance SmartOS) but our support for this platform is best effort and Redis is not guaranteed to work as well as in Linux, OSX, and *BSD.
It is as simple as:
% make
To build with TLS support, you'll need OpenSSL development libraries (e.g. libssl-dev on Debian/Ubuntu) and run:
% make BUILD_TLS=yes
To build with systemd support, you'll need systemd development libraries (such as libsystemd-dev on Debian/Ubuntu or systemd-devel on CentOS) and run:
% make USE_SYSTEMD=yes
To append a suffix to Redis program names, use:
% make PROG_SUFFIX="-alt"
You can build a 32 bit Redis binary using:
% make 32bit
After building Redis, it is a good idea to test it using:
% make test
If TLS is built, you can run the tests with TLS enabled (you will need tcl-tls installed):
% ./utils/gen-test-certs.sh
% ./runtest --tls
Fixing build problems with dependencies or cached build options
Redis has some dependencies which are included in the deps
directory.
make
does not automatically rebuild dependencies even if something in
the source code of dependencies changes.
When you update the source code with git pull
or when code inside the
dependencies tree is modified in any other way, make sure to use the following
command in order to really clean everything and rebuild from scratch:
% make distclean
This will clean: jemalloc, lua, hiredis, linenoise and other dependencies.
Also if you force certain build options like 32bit target, no C compiler
optimizations (for debugging purposes), and other similar build time options,
those options are cached indefinitely until you issue a make distclean
command.
Fixing problems building 32 bit binaries
If after building Redis with a 32 bit target you need to rebuild it
with a 64 bit target, or the other way around, you need to perform a
make distclean
in the root directory of the Redis distribution.
In case of build errors when trying to build a 32 bit binary of Redis, try the following steps:
- Install the package libc6-dev-i386 (also try g++-multilib).
- Try using the following command line instead of make 32bit:
% make CFLAGS="-m32 -march=native" LDFLAGS="-m32"
Allocator
Selecting a non-default memory allocator when building Redis is done by setting
the MALLOC
environment variable. Redis is compiled and linked against libc
malloc by default, with the exception of jemalloc being the default on Linux
systems. This default was picked because jemalloc has proven to have fewer
fragmentation problems than libc malloc.
To force compiling against libc malloc, use:
% make MALLOC=libc
To compile against jemalloc on Mac OS X systems, use:
% make MALLOC=jemalloc
Monotonic clock
By default, Redis will build using the POSIX clock_gettime function as the monotonic clock source. On most modern systems, the internal processor clock can be used to improve performance. Cautions can be found here: http://oliveryang.net/2015/09/pitfalls-of-TSC-usage/
To build with support for the processor's internal instruction clock, use:
% make CFLAGS="-DUSE_PROCESSOR_CLOCK"
Verbose build
Redis will build with a user-friendly colorized output by default. If you want to see a more verbose output, use the following:
% make V=1
Running Redis
To run Redis with the default configuration, just type:
% cd src
% ./redis-server
If you want to provide your redis.conf, you have to run it using an additional parameter (the path of the configuration file):
% cd src
% ./redis-server /path/to/redis.conf
It is possible to alter the Redis configuration by passing parameters directly as options using the command line. Examples:
% ./redis-server --port 9999 --replicaof 127.0.0.1 6379
% ./redis-server /etc/redis/6379.conf --loglevel debug
All the options in redis.conf are also supported as options using the command line, with exactly the same name.
Running Redis with TLS
Please consult the TLS.md file for more information on how to use Redis with TLS.
Playing with Redis
You can use redis-cli to play with Redis. Start a redis-server instance, then in another terminal try the following:
% cd src
% ./redis-cli
redis> ping
PONG
redis> set foo bar
OK
redis> get foo
"bar"
redis> incr mycounter
(integer) 1
redis> incr mycounter
(integer) 2
redis>
You can find the list of all the available commands at https://redis.io/commands.
Installing Redis
In order to install Redis binaries into /usr/local/bin, just use:
% make install
You can use make PREFIX=/some/other/directory install
if you wish to use a
different destination.
make install
will just install binaries in your system, but will not configure
init scripts and configuration files in the appropriate place. This is not
needed if you just want to play a bit with Redis, but if you are installing
it the proper way for a production system, we have a script that does this
for Ubuntu and Debian systems:
% cd utils
% ./install_server.sh
Note: install_server.sh
will not work on Mac OSX; it is built for Linux only.
The script will ask you a few questions and will setup everything you need to run Redis properly as a background daemon that will start again on system reboots.
You'll be able to stop and start Redis using the script named /etc/init.d/redis_<portnumber>, for instance /etc/init.d/redis_6379.
Code contributions
By contributing code to the Redis project in any form, including sending a pull request via GitHub, a code fragment or patch via private email or public discussion groups, you agree to release your code under the terms of the Redis Software Grant and Contributor License Agreement. Redis software contains contributions to the original Redis core project, which are owned by their contributors and licensed under the 3BSD license. Any copy of that license in this repository applies only to those contributions. Redis releases all Redis Community Edition versions from 7.4.x and thereafter under the RSALv2/SSPL dual-license as described in the LICENSE.txt file included in the Redis Community Edition source distribution.
Please see the CONTRIBUTING.md file in this source distribution for more information. For security bugs and vulnerabilities, please see SECURITY.md.
Redis Trademarks
The purpose of a trademark is to identify the goods and services of a person or company without causing confusion. As the registered owner of its name and logo, Redis accepts certain limited uses of its trademarks but it has requirements that must be followed as described in its Trademark Guidelines available at: https://redis.com/legal/trademark-guidelines/.
Redis internals
If you are reading this README you are likely in front of a GitHub page or you just untarred the Redis distribution tar ball. In both the cases you are basically one step away from the source code, so here we explain the Redis source code layout, what is in each file as a general idea, the most important functions and structures inside the Redis server and so forth. We keep all the discussion at a high level without digging into the details since this document would be huge otherwise and our code base changes continuously, but a general idea should be a good starting point to understand more. Moreover most of the code is heavily commented and easy to follow.
Source code layout
The Redis root directory just contains this README, the Makefile which
calls the real Makefile inside the src
directory and an example
configuration for Redis and Redis Sentinel. You can find a few shell
scripts that are used in order to execute the Redis, Redis Cluster and
Redis Sentinel unit tests, which are implemented inside the tests
directory.
Inside the root are the following important directories:
- `src`: contains the Redis implementation, written in C.
- `tests`: contains the unit tests, implemented in Tcl.
- `deps`: contains libraries Redis uses. Everything needed to compile Redis is inside this directory; your system just needs to provide `libc`, a POSIX compatible interface and a C compiler. Notably `deps` contains a copy of `jemalloc`, which is the default allocator of Redis under Linux. Note that under `deps` there are also things which started with the Redis project, but for which the main repository is not `redis/redis`.
There are a few more directories but they are not very important for our goals here. We'll focus mostly on `src`, where the Redis implementation is contained, exploring what there is inside each file. The order in which files are exposed is the logical one to follow in order to disclose different layers of complexity incrementally.
Note: lately Redis was refactored quite a bit. Function names and file names have been changed, so you may find that this documentation reflects the `unstable` branch more closely. For instance, in Redis 3.0 the `server.c` and `server.h` files were named `redis.c` and `redis.h`. However the overall structure is the same. Keep in mind that all the new developments and pull requests should be performed against the `unstable` branch.
server.h
The simplest way to understand how a program works is to understand the data structures it uses. So we'll start from the main header file of Redis, which is `server.h`.
All the server configuration and in general all the shared state is defined in a global structure called `server`, of type `struct redisServer`.
A few important fields in this structure are:
- `server.db` is an array of Redis databases, where data is stored.
- `server.commands` is the command table.
- `server.clients` is a linked list of clients connected to the server.
- `server.master` is a special client, the master, if the instance is a replica.
There are tons of other fields. Most fields are commented directly inside the structure definition.
Another important Redis data structure is the one defining a client. In the past it was called `redisClient`, now just `client`. The structure has many fields; here we'll just show the main ones:

```
struct client {
    int fd;
    sds querybuf;
    int argc;
    robj **argv;
    redisDb *db;
    int flags;
    list *reply;
    // ... many other fields ...
    char buf[PROTO_REPLY_CHUNK_BYTES];
};
```
The client structure defines a connected client:
- The `fd` field is the client socket file descriptor.
- `argc` and `argv` are populated with the command the client is executing, so that functions implementing a given Redis command can read the arguments.
- `querybuf` accumulates the requests from the client, which are parsed by the Redis server according to the Redis protocol and executed by calling the implementations of the commands the client is executing.
- `reply` and `buf` are dynamic and static buffers that accumulate the replies the server sends to the client. These buffers are incrementally written to the socket as soon as the file descriptor is writable.
As you can see in the client structure above, arguments in a command are described as `robj` structures. The following is the full `robj` structure, which defines a Redis object:

```
struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
};
```
Basically this structure can represent all the basic Redis data types like strings, lists, sets, sorted sets and so forth. The interesting thing is that it has a `type` field, so that it is possible to know what type a given object has, and a `refcount`, so that the same object can be referenced in multiple places without allocating it multiple times. Finally the `ptr` field points to the actual representation of the object, which might vary even for the same type, depending on the `encoding` used.
Redis objects are used extensively in the Redis internals, however in order to avoid the overhead of indirect accesses, recently in many places we just use plain dynamic strings not wrapped inside a Redis object.
server.c
This is the entry point of the Redis server, where the main()
function
is defined. The following are the most important steps in order to startup
the Redis server.
- `initServerConfig()` sets up the default values of the `server` structure.
- `initServer()` allocates the data structures needed to operate, sets up the listening socket, and so forth.
- `aeMain()` starts the event loop which listens for new connections.

There are two special functions called periodically by the event loop:

- `serverCron()` is called periodically (according to `server.hz` frequency), and performs tasks that must be performed from time to time, like checking for timed out clients.
- `beforeSleep()` is called every time the event loop fired, Redis served a few requests, and is returning back into the event loop.
Inside server.c you can find code that handles other vital things of the Redis server:
- `call()` is used in order to call a given command in the context of a given client.
- `activeExpireCycle()` handles eviction of keys with a time to live set via the `EXPIRE` command.
- `performEvictions()` is called when a new write command should be performed but Redis is out of memory according to the `maxmemory` directive.
- The global variable `redisCommandTable` defines all the Redis commands, specifying the name of the command, the function implementing the command, the number of arguments required, and other properties of each command.
commands.c
This file is auto-generated by utils/generate-command-code.py; its content is based on the JSON files in the src/commands folder.
These are meant to be the single source of truth about the Redis commands, and all the metadata about them.
These JSON files are not meant to be used by anyone directly, instead that metadata can be obtained via the COMMAND
command.
networking.c
This file defines all the I/O functions with clients, masters and replicas (which in Redis are just special clients):
- `createClient()` allocates and initializes a new client.
- The `addReply*()` family of functions are used by command implementations in order to append data to the client structure, that will be transmitted to the client as a reply for a given command executed.
- `writeToClient()` transmits the data pending in the output buffers to the client and is called by the writable event handler `sendReplyToClient()`.
- `readQueryFromClient()` is the readable event handler and accumulates data read from the client into the query buffer.
- `processInputBuffer()` is the entry point in order to parse the client query buffer according to the Redis protocol. Once commands are ready to be processed, it calls `processCommand()` which is defined inside `server.c` in order to actually execute the command.
- `freeClient()` deallocates, disconnects and removes a client.
aof.c and rdb.c
As you can guess from the names, these files implement the RDB and AOF
persistence for Redis. Redis uses a persistence model based on the fork()
system call in order to create a process with the same (shared) memory
content of the main Redis process. This secondary process dumps the content
of the memory on disk. This is used by rdb.c
to create the snapshots
on disk and by aof.c
in order to perform the AOF rewrite when the
append only file gets too big.
The implementation inside aof.c
has additional functions in order to
implement an API that allows commands to append new commands into the AOF
file as clients execute them.
The call()
function defined inside server.c
is responsible for calling
the functions that in turn will write the commands into the AOF.
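As a minimal standalone illustration of this fork-based pattern (not the actual rdb.c code), the child below inherits a copy-on-write view of the parent's memory and persists it while the parent keeps mutating its own copy:

```
/* Minimal illustration of the fork()-based snapshot pattern used by RDB and
 * AOF rewrite: the child gets a copy-on-write view of the parent's memory and
 * persists it, while the parent keeps serving writes. Not Redis code.
 * Compile with: gcc fork_snapshot.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* Stand-in for the in-memory dataset. */
    int dataset[4] = { 1, 2, 3, 4 };

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: sees the dataset as it was at fork() time and dumps it. */
        FILE *fp = fopen("snapshot.txt", "w");
        if (!fp) exit(1);
        for (int i = 0; i < 4; i++) fprintf(fp, "%d\n", dataset[i]);
        fclose(fp);
        exit(0);
    }

    /* Parent: keeps mutating; copy-on-write keeps the child's view intact. */
    for (int i = 0; i < 4; i++) dataset[i] *= 10;

    int status;
    waitpid(pid, &status, 0);
    printf("snapshot written by the child while the parent kept changing data\n");
    return 0;
}
```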
db.c
Certain Redis commands operate on specific data types; others are general.
Examples of generic commands are `DEL` and `EXPIRE`. They operate on keys and not on their values specifically. All those generic commands are defined inside `db.c`.
Moreover db.c
implements an API in order to perform certain operations
on the Redis dataset without directly accessing the internal data structures.
The most important functions inside db.c
which are used in many command
implementations are the following:
- `lookupKeyRead()` and `lookupKeyWrite()` are used in order to get a pointer to the value associated to a given key, or `NULL` if the key does not exist.
- `dbAdd()` and its higher level counterpart `setKey()` create a new key in a Redis database.
- `dbDelete()` removes a key and its associated value.
- `emptyData()` removes an entire single database or all the databases defined.
The rest of the file implements the generic commands exposed to the client.
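As a sketch of how these helpers are typically used, here is a hypothetical command in the style of the foobarCommand example later in this document; it assumes Redis's internal server.h and is not a standalone program:

```
#include "server.h"  /* Redis internal header; this sketch is not standalone */

/* Hypothetical GET-like command: look the key up for reading and reply with
 * its value, a null reply if it is missing, or a type error otherwise. */
void mygetCommand(client *c) {
    robj *o = lookupKeyRead(c->db, c->argv[1]);
    if (o == NULL) {
        addReplyNull(c);
        return;
    }
    if (checkType(c, o, OBJ_STRING)) return; /* checkType already sent the error */
    addReplyBulk(c, o);
}
```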
object.c
The robj
structure defining Redis objects was already described. Inside
object.c
there are all the functions that operate with Redis objects at
a basic level, like functions to allocate new objects, handle the reference
counting and so forth. Notable functions inside this file:
- `incrRefCount()` and `decrRefCount()` are used in order to increment or decrement an object reference count. When it drops to 0 the object is finally freed.
- `createObject()` allocates a new object. There are also specialized functions to allocate string objects having a specific content, like `createStringObjectFromLongLong()` and similar functions.
This file also implements the OBJECT
command.
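A small hypothetical snippet (again assuming Redis's internal server.h, not a standalone program) showing the reference-counting discipline these functions implement:

```
#include "server.h"  /* Redis internal header; this sketch is not standalone */

/* Every place that keeps a pointer to the object takes a reference and
 * releases it when done; the object is freed when the last reference drops. */
void shareObjectExample(void) {
    robj *o = createStringObject("hello", 5); /* refcount == 1 */

    incrRefCount(o); /* a second owner (e.g. a reply list) now holds it */
    decrRefCount(o); /* that owner is done: refcount back to 1 */

    decrRefCount(o); /* last reference dropped: the object is freed here */
}
```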
replication.c
This is one of the most complex files inside Redis, it is recommended to approach it only after getting a bit familiar with the rest of the code base. In this file there is the implementation of both the master and replica role of Redis.
One of the most important functions inside this file is replicationFeedSlaves()
that writes commands to the clients representing replica instances connected
to our master, so that the replicas can get the writes performed by the clients:
this way their data set will remain synchronized with the one in the master.
This file also implements both the SYNC
and PSYNC
commands that are
used in order to perform the first synchronization between masters and
replicas, or to continue the replication after a disconnection.
Script
The script unit is composed of the following units:

- `script.c` - integration of scripts with Redis (commands execution, set replication/resp, ...)
- `script_lua.c` - responsible to execute Lua code, uses `script.c` to interact with Redis from within the Lua code.
- `function_lua.c` - contains the Lua engine implementation, uses `script_lua.c` to execute the Lua code.
- `functions.c` - contains Redis Functions implementation (`FUNCTION` command), uses `functions_lua.c` if the function it wants to invoke needs the Lua engine.
- `eval.c` - contains the `eval` implementation using `script_lua.c` to invoke the Lua code.
Other C files
- `t_hash.c`, `t_list.c`, `t_set.c`, `t_string.c`, `t_zset.c` and `t_stream.c` contain the implementation of the Redis data types. They implement both an API to access a given data type, and the client command implementations for these data types.
- `ae.c` implements the Redis event loop, it's a self contained library which is simple to read and understand.
- `sds.c` is the Redis string library, check https://github.com/antirez/sds for more information.
- `anet.c` is a library to use POSIX networking in a simpler way compared to the raw interface exposed by the kernel.
- `dict.c` is an implementation of a non-blocking hash table which rehashes incrementally.
- `cluster.c` implements the Redis Cluster. Probably a good read only after being very familiar with the rest of the Redis code base. If you want to read `cluster.c` make sure to read the Redis Cluster specification.
Anatomy of a Redis command
All the Redis commands are defined in the following way:
```
void foobarCommand(client *c) {
    printf("%s",c->argv[1]->ptr); /* Do something with the argument. */
    addReply(c,shared.ok); /* Reply something to the client. */
}
```
The command function is referenced by a JSON file, together with its metadata, see commands.c
described above for details.
The command flags are documented in the comment above the `struct redisCommand` in `server.h`.
For other details, please refer to the COMMAND
command. https://redis.io/commands/command/
After the command operates in some way, it returns a reply to the client, usually using `addReply()` or a similar function defined inside `networking.c`.
There are tons of command implementations inside the Redis source code that can serve as examples of actual commands implementations (e.g. pingCommand). Writing a few toy commands can be a good exercise to get familiar with the code base.
There are also many other files not described here, but it is useless to cover everything. We just want to help you with the first steps. Eventually you'll find your way inside the Redis code base :-)
Enjoy!