linux

Commit Graph

Author	SHA1	Message	Date
Herbert Xu	b824478b21	rhashtable: Add multiple rehash support This patch adds the missing bits to allow multiple rehashes. The read-side as well as remove already handle this correctly. So it's only the rehasher and insertion that need modification to handle this. Note that this patch doesn't actually enable it so for now rehashing is still only performed by the worker thread. This patch also disables the explicit expand/shrink interface because the table is meant to expand and shrink automatically, and continuing to export these interfaces unnecessarily complicates the life of the rehasher since the rehash process is now composed of two parts. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:52 -04:00
Herbert Xu	18093d1c0d	rhashtable: Shrink to fit This patch changes rhashtable_shrink to shrink to the smallest size possible rather than halving the table. This is needed because with multiple rehashing we will defer shrinking until all other rehashing is done, meaning that when we do shrink we may be able to shrink a lot. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:52 -04:00
Herbert Xu	31ccde2dac	rhashtable: Allow hashfn to be unset Since every current rhashtable user uses jhash as their hash function, the fact that jhash is an inline function causes each user to generate a copy of its code. This function provides a solution to this problem by allowing hashfn to be unset. In which case rhashtable will automatically set it to jhash. Furthermore, if the key length is a multiple of 4, we will switch over to jhash2. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:51 -04:00
Herbert Xu	d88252f9bb	rhashtable: Add barrier to ensure we see new tables in walker The walker is a lockless reader so it too needs an smp_rmb before reading the future_tbl field in order to see any new tables that may contain elements that we should have walked over. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:51 -04:00
Herbert Xu	dc0ee268d8	rhashtable: Rip out obsolete out-of-line interface Now that all rhashtable users have been converted over to the inline interface, this patch removes the unused out-of-line interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	02fd97c3d4	rhashtable: Allow hash/comparison functions to be inlined This patch deals with the complaint that we make indirect function calls on the fast paths unnecessarily in rhashtable. We resolve it by moving the fast paths into inline functions that take struct rhashtable_param (which obviously must be the same set of parameters supplied to rhashtable_init) as an argument. The only remaining indirect call is to obj_hashfn (or key_hashfn if obj_hashfn is unset) on the rehash as well as the insert-during-rehash slow path. This patch also extends the support of variable-length keys to include those where the key is fixed but scattered in the object. For example, in netlink we want to key off the namespace and the portid but they're not next to each other. This patch does this by directly using the object hash function as the indicator of whether the key is accessible or not. It also adds a new function obj_cmpfn to compare a key against an object. This means that the caller no longer needs to supply explicit compare functions. All this is done in a backwards compatible manner so no existing users are affected until they convert to the new interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	488fb86ee9	rhashtable: Make rhashtable_init params argument const This patch marks the rhashtable_init params argument const as there is no reason to modify it since we will always make a copy of it in the rhashtable. This patch also fixes a bug where we don't actually round up the value of min_size unless it is less than HASH_MIN_SIZE. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Thomas Graf	a998f712f7	rhashtable: Round up/down min/max_size to ensure we respect limit Round up min_size respectively round down max_size to the next power of two to make sure we always respect the limit specified by the user. This is required because we compare the table size against the limit before we expand or shrink. Also fixes a minor bug where we modified min_size in the params provided instead of the copy stored in struct rhashtable. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 21:02:23 -04:00
Herbert Xu	e2e21c1c58	rhashtable: Remove max_shift and min_shift Now that nobody uses max_shift and min_shift, we can safely remove them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 12:46:41 -04:00
Herbert Xu	c2e213cff7	rhashtable: Introduce max_size/min_size This patch adds the parameters max_size and min_size which are meant to replace max_shift and min_shift. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 12:46:40 -04:00
Herbert Xu	6aebd94084	rhashtable: Remove shift from bucket_table Keeping both size and shift is silly. We only need one. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 12:46:40 -04:00
Thomas Graf	617011e7d5	rhashtable: Avoid calculating hash again to unlock Caching the lock pointer avoids having to hash on the object again to unlock the bucket locks. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 17:14:34 -04:00
Thomas Graf	db4374f48a	rhashtable: Annotate RCU locking of walkers Fixes the following sparse warnings: lib/rhashtable.c:767:5: warning: context imbalance in 'rhashtable_walk_start' - wrong count at exit lib/rhashtable.c:849:6: warning: context imbalance in 'rhashtable_walk_stop' - unexpected unlock Fixes: `f2dba9c6ff` ("rhashtable: Introduce rhashtable_walk_*") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-16 16:24:13 -04:00
Herbert Xu	565e86404e	rhashtable: Fix rhashtable_remove failures The commit `9d901bc051` ("rhashtable: Free bucket tables asynchronously after rehash") causes gratuitous failures in rhashtable_remove. The reason is that it inadvertently introduced multiple rehashing from the perspective of readers. IOW it is now possible to see more than two tables during a single RCU critical section. Fortunately the other reader rhashtable_lookup already deals with this correctly thanks to `c4db8848af` ("rhashtable: Move future_tbl into struct bucket_table") so only rhashtable_remove is broken by this change. This patch fixes this by looping over every table from the first one to the last or until we find the element that we were trying to delete. Incidentally the simple test for detecting rehashing to prevent starting another shrinking no longer works. Since it isn't needed anyway (the work queue and the mutex serve as a natural barrier to unnecessary rehashes) I've simply killed the test. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 22:22:08 -04:00
Herbert Xu	963ecbd41a	rhashtable: Fix use-after-free in rhashtable_walk_stop The commit `c4db8848af` ("rhashtable: Move future_tbl into struct bucket_table") introduced a use-after-free bug in rhashtable_walk_stop because it dereferences tbl after dropping the RCU read lock. This patch fixes it by moving the RCU read unlock down to the bottom of rhashtable_walk_stop. In fact this was how I had it originally but it got dropped while rearranging patches because this one depended on the async freeing of bucket_table. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 22:22:08 -04:00
Herbert Xu	c4db8848af	rhashtable: Move future_tbl into struct bucket_table This patch moves future_tbl to open up the possibility of having multiple rehashes on the same table. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 01:35:34 -04:00
Herbert Xu	63d512d0cf	rhashtable: Add rehash counter to bucket_table This patch adds a rehash counter to bucket_table to indicate the last bucket that has been rehashed. This serves two purposes: 1. Any bucket that has been rehashed can never gain a new object. 2. If the rehash counter reaches the size of the table, the table will forever remain empty. This patch also downsizes bucket_table->size to an unsigned int since we do not support sizes greater than 32 bits yet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 01:35:34 -04:00
Herbert Xu	9d901bc051	rhashtable: Free bucket tables asynchronously after rehash There is in fact no need to wait for an RCU grace period in the rehash function, since all insertions are guaranteed to go into the new table through spin locks. This patch uses call_rcu to free the old/rehashed table at our leisure. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 01:35:34 -04:00
Herbert Xu	5269b53da4	rhashtable: Move seed init into bucket_table_alloc It seems that I have already made every rehash redo the random seed even though my commit message indicated otherwise :) Since we have already taken that step, this patch goes one step further and moves the seed initialisation into bucket_table_alloc. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 01:35:34 -04:00
Herbert Xu	8f2484bdb5	rhashtable: Use SINGLE_DEPTH_NESTING We only nest one level deep, so there is no need to roll our own subclasses. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 01:35:34 -04:00
Herbert Xu	eddee5ba34	rhashtable: Fix walker behaviour during rehash Previously whenever the walker encountered a resize it simply snaps back to the beginning and starts again. However, this only works if the rehash started and completed while the walker was idle. If the walker attempts to restart while the rehash is still ongoing, we may miss objects that we shouldn't have. This patch fixes this by making the walker walk the old table followed by the new table just like all other readers. If a rehash is detected we will still signal our caller of the fact so they can prepare for duplicates but we will simply continue the walk onto the new table after the old one is finished either by us or by the rehasher. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-15 01:35:34 -04:00
Herbert Xu	393619474e	rhashtable: Fix read-side crash during rehash This patch fixes a typo in rhashtable_lookup_compare where we fail to recompute the hash when looking up the new table. This causes elements to be missed and potentially a crash during a resize. Reported-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 23:02:30 -04:00
Daniel Borkmann	a5b6846f9e	rhashtable: kill ht->shift atomic operations Commit `c0c09bfdc4` ("rhashtable: avoid unnecessary wakeup for worker queue") changed ht->shift to be atomic, which is actually unnecessary. Instead of leaving the current shift in the core rhashtable structure, it can be cached inside the individual bucket tables. There, it will only be initialized once during a new table allocation in the shrink/expansion slow path, and from then onward it stays immutable for the rest of the bucket table lifetime. That allows shift to be non-atomic. The patch also moves hash_rnd management into the table setup. The rhashtable structure now consumes 3 instead of 4 cachelines. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 23:02:30 -04:00
Herbert Xu	9497df88ab	rhashtable: Fix reader/rehash race There is a potential race condition between readers and the rehasher. In particular, the rehasher could have started a rehash while the reader finishes a scan of the old table but fails to see the new table pointer. This patch closes this window by adding smp_wmb/smp_rmb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 23:02:30 -04:00
Herbert Xu	ec9f71c59e	rhashtable: Remove obj_raw_hashfn Now that the only caller of obj_raw_hashfn is head_hashfn, we can simply kill it and fold it into the latter. This patch also moves the common shift from head_hashfn/key_hashfn into rht_bucket_index. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:35:30 -04:00
Herbert Xu	cffaa9cb92	rhashtable: Remove key length argument to key_hashfn key_hashfn has only one caller and it doesn't really need to supply the key length as an extra parameter. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:35:30 -04:00
Herbert Xu	eca8493330	rhashtable: Use head_hashfn instead of obj_raw_hashfn Now that we don't have cross-table hashes, we no longer need to keep the entire hash value so all users of obj_raw_hashfn can use head_hashfn instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:35:30 -04:00
Herbert Xu	8d2b18793d	rhashtable: Move masking back into key_hashfn This patch reverts commit `c88455ce50` ("rhashtable: key_hashfn() must return full hash value") because the only user of it always masks the hash value. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:35:30 -04:00
Herbert Xu	84ed82b74d	rhashtable: Add annotation to nested lock Commit `aa34a6cb04` ("rhashtable: Add arbitrary rehash function") killed the annotation on the nested lock which leads to bitching from lockdep. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 23:53:40 -04:00
Herbert Xu	aa34a6cb04	rhashtable: Add arbitrary rehash function This patch adds a rehash function that supports the use of any hash function for the new table. This is needed to support changing the random seed value during the lifetime of the hash table. However for now the random seed value is still constant and the rehash function is simply used to replace the existing expand/shrink functions. [ ASSERT_BUCKET_LOCK() and thus debug_dump_table() + debug_dump_buckets() are no longer used, so delete them entirely. -DaveM ] Signed-off-by: Herbert Xu <herbert.xu@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:36:21 -04:00
Herbert Xu	988dfbd795	rhashtable: Move hash_rnd into bucket_table Currently hash_rnd is a parameter that users can set. However, no existing users set this parameter. It is also something that people are unlikely to want to set directly since it's just a random number. In preparation for allowing the reseeding/rehashing of rhashtable, this patch moves hash_rnd into bucket_table so that it's now an internal state rather than a parameter. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-11 16:28:25 -04:00
Eric Dumazet	5beb5c90c1	rhashtable: use cond_resched() If a hash table has 128 slots and 16384 elems, expand to 256 slots takes more than one second. For larger sets, a soft lockup is detected. Holding cpu for that long, even in a work queue is a show stopper for non preemptable kernels. cond_resched() at strategic points to allow process scheduler to reschedule us. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 17:55:14 -05:00
Daniel Borkmann	4c4b52d9b2	rhashtable: remove indirection for grow/shrink decision functions Currently, all real users of rhashtable default their grow and shrink decision functions to rht_grow_above_75() and rht_shrink_below_30(), so that there's currently no need to have this explicitly selectable. It can/should be generic and private inside rhashtable until a real use case pops up. Since we can make this private, we'll save us this additional indirection layer and can improve insertion/deletion time as well. Reference: http://patchwork.ozlabs.org/patch/443040/ Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:06:02 -05:00
Daniel Borkmann	8331de75cb	rhashtable: unconditionally grow when max_shift is not specified While commit `c0c09bfdc4` ("rhashtable: avoid unnecessary wakeup for worker queue") rightfully moved part of the decision making of whether we should expand or shrink from the expand/shrink functions themselves into insert/delete functions in order to avoid unnecessary worker wake-ups, it however introduced a regression by doing so. Before that change, if no max_shift was specified (= 0) on rhashtable initialization, rhashtable_expand() would just grow unconditionally and let the available memory be the limiting factor. After that change, if no max_shift was specified, there would be _no_ expansion step at all. Given that netlink and tipc have a max_shift specified, it was not visible there, but Josh Hunt reported that nft, which starts out with a default element hint of 3 if not otherwise provided, would slow e.g. inserts down tremendously as it cannot grow larger to relax table occupancy. Given that the test case verifies shrinks/expands manually, we also must remove pointer to the helper functions to explicitly avoid parallel resizing on insertions/deletions. test_bucket_stats() and test_rht_lookup() could also be wrapped around rhashtable mutex to explicitly synchronize a walk from resizing, but I think that defeats the actual test case which intended to have explicit test steps, i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object verification after each stage. Reported-by: Josh Hunt <johunt@akamai.com> Fixes: `c0c09bfdc4` ("rhashtable: avoid unnecessary wakeup for worker queue") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Cc: Josh Hunt <johunt@akamai.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-27 16:06:02 -05:00
Sasha Levin	71bb0012c3	rhashtable: initialize all rhashtable walker members Commit `f2dba9c6ff` ("rhashtable: Introduce rhashtable_walk_*") forgot to initialize the members of struct rhashtable_walker after allocating it, which caused an undefined value for 'resize' which is used later on. Fixes: `f2dba9c6ff` ("rhashtable: Introduce rhashtable_walk_*") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-23 15:23:19 -05:00
Daniel Borkmann	eb6d1abf1b	rhashtable: better high order allocation attempts When trying to allocate future tables via bucket_table_alloc(), it seems overkill on large table shifts that we probe for kzalloc() unconditionally first, as it's likely to fail. Only probe with kzalloc() for more reasonable table sizes and use vzalloc() either as a fallback on failure or directly in case of large table sizes. Fixes: `7e1e77636e` ("lib: Resizable, Scalable, Concurrent Hash Table") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 17:38:09 -05:00
Daniel Borkmann	342100d937	rhashtable: don't test for shrink on insert, expansion on delete Restore pre `54c5b7d311` behaviour and only probe for expansions on inserts and shrinks on deletes. Currently, it will happen that on initial inserts into a sparse hash table, we may e.g. shrink it first simply because it's not fully populated yet, only to later realize that we need to grow again. This however is counterintuitive, e.g. an initial default size of 64 elements is already small enough, and in case an elements size hint is given to the hash table by a user, we should avoid unnecessary expansion steps, so a shrink is clearly unintended here. Fixes: `54c5b7d311` ("rhashtable: introduce rhashtable_wakeup_worker helper function") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ying Xue <ying.xue@windriver.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-20 17:38:09 -05:00
Stephen Rothwell	61d7b09773	rhashtable: using ERR_PTR requires linux/err.h Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-08 21:52:24 -08:00
Thomas Graf	020219a69d	rhashtable: Fix remove logic to avoid cross references between buckets The remove logic properly searched the remaining chain for a matching entry with an identical hash but it did this while searching from both the old and new table. Instead in order to not leave stale references behind we need to: 1. When growing and searching from the new table: Search remaining chain for entry with same hash to avoid having the new table directly point to an entry with a different hash. 2. When shrinking and searching from the old table: Check if the element after the removed would create a cross reference and avoid it if so. These bugs were present from the beginning in nft_hash. Also, both insert functions calculated the hash based on the mask of the new table. This worked while growing. While shrinking, the mask of the new table is smaller than the mask of the old table. This led to a bit not being taken into account when selecting the bucket lock and thus caused the wrong bucket to be locked eventually. Fixes: `7e1e77636e` ("lib: Resizable, Scalable, Concurrent Hash Table") Fixes: `97defe1ecf` ("rhashtable: Per bucket locks & deferred expansion/shrinking") Reported-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 15:19:17 -08:00
Thomas Graf	cf52d52f9c	rhashtable: Avoid bucket cross reference after removal During a resize, when two buckets in the larger table map to a single bucket in the smaller table and the new table has already been (partially) linked to the old table, removal of an element may result in the bucket in the larger table pointing to entries which all hash to a different value than the bucket index, thus causing two buckets to point to the same sub chain after unzipping. This is not illegal during the resize phase but after it has completed. Keep the old table around until all of the unzipping is done to allow the removal code to only search for matching hashed entries during this special period. Reported-by: Ying Xue <ying.xue@windriver.com> Fixes: `97defe1ecf` ("rhashtable: Per bucket locks & deferred expansion/shrinking") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 15:18:35 -08:00
Thomas Graf	7cd10db8de	rhashtable: Add more lock verification Catch hash miscalculations which result in hard to track down race conditions. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 15:18:34 -08:00
Thomas Graf	a03eaec0df	rhashtable: Dump bucket tables on locking violation under PROVE_LOCKING This simplifies debugging of locking violations if compiled with CONFIG_PROVE_LOCKING. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 15:18:34 -08:00
Thomas Graf	2af4b52988	rhashtable: Wait for RCU readers after final unzip work We need to wait for all RCU readers to complete after the last bit of unzipping has been completed. Otherwise the old table is freed up prematurely. Fixes: `7e1e77636e` ("lib: Resizable, Scalable, Concurrent Hash Table") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 15:18:34 -08:00
Thomas Graf	a5ec68e3b8	rhashtable: Use a single bucket lock for sibling buckets rhashtable currently allows to use a bucket lock per bucket. This requires multiple levels of complicated nested locking because when resizing, a single bucket of the smaller table will map to two buckets in the larger table. So far rhashtable has explicitly locked both buckets in the larger table. By excluding the highest bit of the hash from the bucket lock map and thus only allowing locks to buckets in a ratio of 1:2, the locking can be simplified a lot without losing the benefits of multiple locks. Larger tables which benefit from multiple locks will not have a single lock per bucket anyway. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 15:18:34 -08:00
Thomas Graf	c88455ce50	rhashtable: key_hashfn() must return full hash value The value computed by key_hashfn() is used by rhashtable_lookup_compare() to traverse both tables during a resize. key_hashfn() must therefore return the hash value without the buckets mask applied so it can be masked to the size of each individual table. Fixes: `97defe1ecf` ("rhashtable: Per bucket locks & deferred expansion/shrinking") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-06 15:18:34 -08:00
Herbert Xu	f2dba9c6ff	rhashtable: Introduce rhashtable_walk_* Some existing rhashtable users get too intimate with it by walking the buckets directly. This prevents us from easily changing the internals of rhashtable. This patch adds the helpers rhashtable_walk_init/exit/start/next/stop which will replace these custom walkers. They are meant to be usable for both procfs seq_file walks as well as walking by a netlink dump. The iterator structure should fit inside a netlink dump cb structure, with at least one element to spare. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:34:52 -08:00
Herbert Xu	28134a53d6	rhashtable: Fix potential crash on destroy in rhashtable_shrink The current being_destroyed check in rhashtable_expand is not enough since if we start a shrinking process after freeing all elements in the table that's also going to crash. This patch adds a being_destroyed check to the deferred worker thread so that we bail out as soon as we take the lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 20:34:52 -08:00
Geert Uytterhoeven	9d6dbe1bba	rhashtable: Make selftest modular Allow the selftest on the resizable hash table to be built modular, just like all other tests that do not depend on DEBUG_KERNEL. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-30 18:06:33 -08:00
Thomas Graf	fe6a043c53	rhashtable: rhashtable_remove() must unlink in both tbl and future_tbl As removals can occur during resizes, entries may be referred to from both tbl and future_tbl when the removal is requested. Therefore rhashtable_remove() must unlink the entry in both tables if this is the case. The existing code did search both tables but stopped when it hit the first match. Failing to unlink in both tables resulted in use after free. Fixes: `97defe1ecf` ("rhashtable: Per bucket locks & deferred expansion/shrinking") Reported-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 11:56:34 -08:00
Ying Xue	57699a40b4	rhashtable: Fix race in rhashtable_destroy() and use regular work_struct When we put our declared work task in the global workqueue with schedule_delayed_work(), its delay parameter is always zero. Therefore, we should define a regular work in rhashtable structure instead of a delayed work. By the way, we add a condition to check whether resizing functions are NULL before cancelling the work, avoiding to cancel an uninitialized work. Lastly, while we wait for all work items we submitted before to run to completion with cancel_delayed_work(), ht->mutex has been taken in rhashtable_destroy(). Moreover, cancel_delayed_work() doesn't return until all work items are accomplished, and when work items are scheduled, the work's function - rht_deferred_worker() will be called. However, as rht_deferred_worker() also needs to acquire the lock, deadlock might happen at the moment as the lock is already held before. So if the cancel work function is moved out of the lock covered scope, this will avoid the deadlock. Fixes: `97defe1` ("rhashtable: Per bucket locks & deferred expansion/shrinking") Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-16 01:18:51 -05:00
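
Note on `a998f712f7` ("rhashtable: Round up/down min/max_size to ensure we respect limit"): the rounding described there can be illustrated with a small sketch. This is Python, not kernel code, and it is only a model of the idea under stated assumptions: table sizes are powers of two, the helper names mirror the kernel's `roundup_pow_of_two`/`rounddown_pow_of_two`, `normalize_limits` is a hypothetical name, and treating a `max_size` of 0 as "no limit" is an assumption made for this example.

```python
def roundup_pow_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n - 1).bit_length()


def rounddown_pow_of_two(n: int) -> int:
    """Return the largest power of two that is <= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n.bit_length() - 1)


def normalize_limits(min_size: int, max_size: int) -> tuple[int, int]:
    """Hypothetical helper: snap user limits onto power-of-two table sizes.

    min_size is rounded UP so the table never shrinks below the requested
    minimum; max_size is rounded DOWN so it never grows past the requested
    maximum. max_size == 0 is kept as 0 (assumed here to mean "no limit").
    The results are returned as new values; the caller's inputs are not
    modified, matching the point about working on a copy of the params.
    """
    lo = roundup_pow_of_two(max(min_size, 1))
    hi = rounddown_pow_of_two(max_size) if max_size else 0
    return lo, hi


if __name__ == "__main__":
    print(normalize_limits(5, 100))
```

With limits of 5 and 100, the usable power-of-two range becomes 8 to 64; rounding the other way (4 and 128) would let the table violate both limits.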


84 Commits