redis

Commit Graph

Author	SHA1	Message	Date
Wang Yuan	d4bca53cd9	Use madvise(MADV_DONTNEED) to release memory to reduce COW (#8974) ## Background As we know, after `fork`, one process will copy pages when writing data to these pages(CoW), and another process still keep old pages, they totally cost more memory. For redis, we suffered that redis consumed much memory when the fork child is serializing key/values, even that maybe cause OOM. But actually we find, in redis fork child process, the child process don't need to keep some memory and parent process may write or update that, for example, child process will never access the key-value that is serialized but users may update it in parent process. So we think it may reduce COW if the child process release memory that it is not needed. ## Implementation For releasing key value in child process, we may think we call `decrRefCount` to free memory, but i find the fork child process still use much memory when we don't write any data to redis, and it costs much more time that slows down bgsave. Maybe because memory allocator doesn't really release memory to OS, and it may modify some inner data for this free operation, especially when we free small objects. Moreover, CoW is based on pages, so it is an easy way that we only free the memory bulk that is not less than kernel page size. madvise(MADV_DONTNEED) can quickly release specified region pages to OS bypassing memory allocator, and allocator still consider that this memory still is used and don't change its inner data. There are some buffers we can release in the fork child process: - Serialized key-values the fork child process never access serialized key-values, so we try to free them. Because we only can release big bulk memory, and it is time consumed to iterate all items/members/fields/entries of complex data type. So we decide to iterate them and try to release them only when their average size of item/member/field/entry is more than page size of OS. - Replication backlog Because replication backlog is a cycle buffer, it will be changed quickly if redis has heavy write traffic, but in fork child process, we don't need to access that. - Client buffers If clients have requests during having the fork child process, clients' buffer also be changed frequently. The memory includes client query buffer, output buffer, and client struct used memory. To get child process peak private dirty memory, we need to count peak memory instead of last used memory, because the child process may continue to release memory (since COW used to only grow till now, the last was equivalent to the peak). Also we're adding a new `current_cow_peak` info variable (to complement the existing `current_cow_size`) Co-authored-by: Oran Agra <oran@redislabs.com>	2021-08-04 23:01:46 +03:00
sundb	95d6297db8	Add run all test support with define REDIS_TEST (#8570) 1. Add `redis-server test all` support to run all tests. 2. Add redis test to daily ci. 3. Add `--accurate` option to run slow tests for more iterations (so that by default we run less cycles (shorter time, and less prints)). 4. Move dict benchmark to REDIS_TEST. 5. fix some leaks in tests 6. make quicklist tests run on a specific fill set of options rather than huge ranges 7. move some prints in quicklist test outside their loops to reduce prints 8. removing sds.h from dict.c since it is now used in both redis-server and redis-cli (uses hiredis sds)	2021-03-10 09:13:11 +02:00
Yossi Gottlieb	e8e6ca6309	Fix FreeBSD <12.x builds. (#8603)	2021-03-07 14:14:23 +02:00
Yossi Gottlieb	3ea4c43add	Cleanup usage of malloc_usable_size. (#8554) * Add better control of malloc_usable_size() usage. * Use malloc_usable_size on alpine libc daily job. * Add no-malloc-usable-size daily jobs. * Fix zmalloc(0) when HAVE_MALLOC_SIZE is undefined. In order to align with the jemalloc behavior, this should never return NULL or OOM panic.	2021-02-25 09:24:41 +02:00
Yossi Gottlieb	ae7d5bf617	Use malloc_usable_size() on FreeBSD. (#8545)	2021-02-24 09:48:04 +02:00
Oran Agra	7ca00d694d	Sanitize dump payload: fail RESTORE if memory allocation fails When RDB input attempts to make a huge memory allocation that fails, RESTORE should fail gracefully rather than die with panic	2020-12-06 14:54:34 +02:00
Oran Agra	3945a32177	performance and memory reporting improvement - sds take control of its internal frag (#7875) This commit has two aspects: 1) improve memory reporting for all the places that use sdsAllocSize to compute memory used by a string, in this case it'll include the internal fragmentation. 2) reduce the need for realloc calls by making the sds implicitly take over the internal fragmentation of the block it allocated.	2020-10-02 08:19:44 +03:00
antirez	4092a75d85	Avoid collision with MacOS LIST_HEAD macro after #6384.	2019-12-02 09:13:29 +01:00
Salvatore Sanfilippo	e5b5f9a2f6	Merge pull request #6384 from devnexen/apple_smaps_impl Getting region date per process in Darwin	2019-12-02 09:02:08 +01:00
David Carlier	819a661be5	Getting region date per process in Darwin	2019-09-15 14:05:00 +01:00
Oran Agra	09f99c2a92	make redis purge jemalloc after flush, and enable background purging thread jemalloc 5 doesn't immediately release memory back to the OS, instead there's a decaying mechanism, which doesn't work when there's no traffic (no allocations). this is most evident if there's no traffic after flushdb, the RSS will remain high. 1) enable jemalloc background purging 2) explicitly purge in flushdb	2019-06-02 15:33:14 +03:00
Bruce Merry	8fd1031b10	Fix incorrect memory usage accounting in zrealloc When HAVE_MALLOC_SIZE is false, each call to zrealloc causes used_memory to increase by PREFIX_SIZE more than it should, due to mis-matched accounting between the original zmalloc (which includes PREFIX size in its increment) and zrealloc (which misses it from its decrement). I've also supplied a command-line test to easily demonstrate the problem. It's not wired into the test framework, because I don't know TCL so I'm not sure how to automate it.	2018-09-30 11:49:03 +02:00
Oran Agra	bf680b6f8c	slave buffers were wasteful and incorrectly counted causing eviction A) slave buffers didn't count internal fragmentation and sds unused space, this caused them to induce eviction although we didn't mean for it. B) slave buffers were consuming about twice the memory of what they actually needed. - this was mainly due to sdsMakeRoomFor growing to twice as much as needed each time but networking.c not storing more than 16k (partially fixed recently in `237a38737`). - besides it wasn't able to store half of the new string into one buffer and the other half into the next (so the above mentioned fix helped mainly for small items). - lastly, the sds buffers had up to 30% internal fragmentation that was wasted, consumed but not used. C) inefficient performance due to starting from a small string and reallocing many times. what i changed: - creating dedicated buffers for reply list, counting their size with zmalloc_size - when creating a new reply node from, preallocate it to at least 16k. - when appending a new reply to the buffer, first fill all the unused space of the previous node before starting a new one. other changes: - expose mem_not_counted_for_evict info field for the benefit of the test suite - add a test to make sure slave buffers are counted correctly and that they don't cause eviction	2018-07-16 16:43:42 +03:00
Oran Agra	482785ac62	add malloc_usable_size for libc malloc this reduces the extra 8 bytes we save before each pointer. but more importantly maybe, it makes the valgrind runs to be more similar to our normal runs. note: the change in malloc_stats struct in server.h is to eliminate a name conflict. structs that are not typedefed are resolved from a separate name space.	2018-06-19 18:18:23 +03:00
Oran Agra	806736cdf9	Adding real allocator fragmentation to INFO and MEMORY command + active defrag test other fixes / improvements: - LUA script memory isn't taken from zmalloc (taken from libc malloc) so it can cause high fragmentation ratio to be displayed (which is false) - there was a problem with "fragmentation" info being calculated from RSS and used_memory sampled at different times (now sampling them together) other details: - adding a few more allocator info fields to INFO and MEMORY commands - improve defrag test to measure defrag latency of big keys - increasing the accuracy of the defrag test (by looking at real frag info) this way we can use an even lower threshold and still avoid false positives - keep the old (total) "fragmentation" field unchanged, but add new ones for specific things - add these the MEMORY DOCTOR command - deduct LUA memory from the rss in case of non jemalloc allocator (one for which we don't "allocator active/used") - reduce sampling rate of the rss and allocator info	2018-03-12 15:08:52 +02:00
antirez	6eb51bf1ec	zmalloc.c: remove thread safe mode, it's the default way.	2017-05-09 16:59:51 +02:00
antirez	173d692bc2	Defrag: activate it only if running modified version of Jemalloc. This commit also includes minor aesthetic changes like removal of trailing spaces.	2017-01-10 11:25:39 +01:00
oranagra	7aa9e6d2ae	active memory defragmentation	2016-12-30 03:37:52 +02:00
antirez	945a2f948e	zmalloc: zmalloc_get_smap_bytes_by_field() modified to work for any PID. The goal is to get copy-on-write amount of the child from the parent.	2016-09-19 10:28:42 +02:00
antirez	615f6923d5	getMemorySize() moved into zmalloc.c with other low level mem utils. See issue #2218.	2014-12-17 17:11:20 +01:00
antirez	3ef0876b95	THP detection / reporting functions added.	2014-11-12 10:43:32 +01:00
antirez	93253c2762	Sample and cache RSS in serverCron(). Obtaining the RSS (Resident Set Size) info is slow in Linux and OSX. This slowed down the generation of the INFO 'memory' section. Since the RSS does not require to be a real-time measurement, we now sample it with server.hz frequency (10 times per second by default) and use this value both to show the INFO rss field and to compute the fragmentation ratio. Practically this does not make any difference for memory profiling of Redis but speeds up the INFO call significantly.	2014-03-24 12:00:20 +01:00
antirez	3bfeb9c1a7	zmalloc_get_private_dirty() function added (Linux only). For non Linux systems it just returns 0. This function is useful to estimate copy-on-write because of childs saving stuff on disk.	2012-11-19 11:47:35 +01:00
antirez	6fdc635447	Better Out of Memory handling. The previous implementation of zmalloc.c was not able to handle out of memory in an application-specific way. It just logged an error on standard error, and aborted. The result was that in the case of an actual out of memory in Redis where malloc returned NULL (In Linux this actually happens under specific overcommit policy settings and/or with no or little swap configured) the error was not properly logged in the Redis log. This commit fixes this problem, fixing issue #509. Now the out of memory is properly reported in the Redis log and a stack trace is generated. The approach used is to provide a configurable out of memory handler to zmalloc (otherwise the default one logging the event on the standard output is used).	2012-08-24 12:55:37 +02:00
antirez	ad4c0b4117	Jemalloc updated to 3.0.0. Full changelog here: http://www.canonware.com/cgi-bin/gitweb.cgi?p=jemalloc.git;a=blob_plain;f=ChangeLog;hb=master Notable improvements from the point of view of Redis: 1) Bugfixing. 2) Support for Valgrind. 3) Support for OSX Lion, FreeBSD.	2012-05-16 11:09:45 +02:00
Premysl Hruby	ebba7b3c92	future-proof version comparison	2012-04-05 10:41:28 +02:00
antirez	23c0cdd2ad	Produce the watchlog warning log in a way that is safer from a signal handler. Fix a memory leak in the backtrace generation function.	2012-03-27 15:24:33 +02:00
antirez	442246dde2	Precision of getClientOutputBufferMemoryUsage() greatly improved, see issue #327 for more information.	2012-02-07 13:05:36 +01:00
antirez	6504634019	no more allocation stats info in INFO, useless now that we have jemalloc.	2011-07-02 10:31:16 +02:00
antirez	29d04257b0	forward-ported changes in zmalloc.c/h to support jemalloc build	2011-06-20 11:34:04 +02:00
antirez	67a1810b32	allocation stats in INFO	2011-01-09 15:56:50 +01:00
antirez	92e282288f	zmalloc functions to get RSS and fragmentation refactored into two separated functions	2010-11-02 10:51:09 +01:00
antirez	eddb388ef9	memory fragmentation ratio in INFO output	2010-09-02 10:34:39 +02:00
Benjamin Kramer	399f2f401c	Add zcalloc and use it where appropriate calloc is more efficient than malloc+memset when the system uses mmap to allocate memory. mmap always returns zeroed memory so the memset can be avoided. The threshold to use mmap is 16k in osx libc and 128k in bsd libc and glibc. The kernel can lazily allocate the pages, this reduces memory usage when we have a page table or hash table that is mostly empty. This change is most visible when you start a new redis instance with vm enabled. You'll see no increased memory usage no matter how big your page table is.	2010-07-25 00:11:20 +02:00
antirez	e2641e09cc	redis.c split into many different C files. networking related stuff moved into networking.c moved more code more work on layout of source code SDS instantaneous memory saving. By Pieter and Salvatore at VMware ;) cleanly compiling again after the first split, now splitting it in more C files moving more things around... work in progress split replication code splitting more Sets split Hash split replication split even more splitting more splitting minor change	2010-07-01 14:38:51 +02:00

35 Commits