linux

Commit Graph

Author	SHA1	Message	Date
Marcel Holtmann	0151e426b1	Bluetooth: Restrict BNEP flags to only valid ones The BNEP flags should be clearly restricted to valid ones. So this puts extra checks in place to ensure this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:44:02 +03:00
Marcel Holtmann	5f5da99f1d	Bluetooth: Restrict HIDP flags to only valid ones The HIDP flags should be clearly restricted to valid ones. So this puts extra checks in place to ensure this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:43:11 +03:00
Marcel Holtmann	8bf17a3619	Bluetooth: Restrict CMTP flags to only valid ones The CMTP flags should be clearly restricted to valid ones. So this puts extra checks in place to ensure this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:42:21 +03:00
Marcel Holtmann	c3370de64d	Bluetooth: Expose current Device ID information via debugfs For debugging purposes it is good to be able to read the current configured Device ID details. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-04-02 08:40:35 +03:00
Simon Horman	05e8bb860b	pkt_sched: fq: correct spelling of locally Correct spelling of locally. Also remove extra space before tab character in struct fq_flow. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-01 22:52:29 -04:00
Felix Fietkau	ba8c3d6f16	mac80211: add an intermediate software queue implementation This allows drivers to request per-vif and per-sta-tid queues from which they can pull frames. This makes it easier to keep the hardware queues short, and to improve fairness between clients and vifs. The task of scheduling packet transmission is left up to the driver - queueing is controlled by mac80211. Drivers can only dequeue packets by calling ieee80211_tx_dequeue. This makes it possible to add active queue management later without changing drivers using this code. This can also be used as a starting point to implement A-MSDU aggregation in a way that does not add artificially induced latency. Signed-off-by: Felix Fietkau <nbd@openwrt.org> [resolved minor context conflict, minor changes, endian annotations] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:34 +02:00
John Linville	cef2fc1ce4	mac80211: reduce log spam from ieee80211_handle_pwr_constr This changes a couple of messages from sdata_info to sdata_dbg. This should reduce some log spam, as reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1206468 Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:34 +02:00
Thomas Huehn	5f919abc76	mac80211: add standard deviation to Minstrel stats This patch adds the statistical descriptor "standard deviation" to better describe the current properties of Minstrel and Minstrel-HTs success probability distribution. The standard deviation (SD) is calculated as exponential weighted moving standard deviation (EWMSD) and its current value is added as new column in all rc_stats (in debugfs). Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:33 +02:00
Thomas Huehn	ade6d4a2ec	mac80211: reduce calculation costs of EWMA This patch reduces the calculation costs of the EWMA macro from "2x multiplication and 1 addition" down to "1x multiplication and 2x additions". This slightly improves performance depending on the CPU architecture. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:33 +02:00
Thomas Huehn	50e55a8ea7	mac80211: add max lossless throughput per rate This patch adds the new statistic "maximum possible lossless throughput" to Minstrels and Minstrel-HTs rc_stats (in debugfs). This enables comprehensive comparison between current per-rate throughput and max. achievable per-rate throughput. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:32 +02:00
Thomas Huehn	6a27b2c40b	mac80211: restructure per-rate throughput calculation into function This patch moves Minstrels and Minstrel-HTs per-rate throughput calculation (EWMA(thr)) into a dedicated function to be called. Therefore the variable "unsigned int cur_tp" within struct "minstrel_rate_stats" becomes obsolete. and is removed to free up its space. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:32 +02:00
Thomas Huehn	9134073bc6	mac80211: improve Minstrel variable & function naming This patch ensures a consistent usage of variable names for type "minstrel_rate_stats" to be used as "mrs" and from type minstrel_rate as "mr" across both Minstrel & Minstrel-HT. In addition some variable and function names got changed to more meaningful ones. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:31 +02:00
Thomas Huehn	f62838bcc5	mac80211: unify Minstrel & Minstrel-HTs calculation of rate statistics This patch unifies the calculation of Minstrels and Minstrel-HTs per-rate statistic. The new common function minstrel_calc_rate_stats() is called when a statistic update is performed. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:31 +02:00
Thomas Huehn	2cae0b6a70	mac80211: add new Minstrel-HT statistic output via csv This patch adds a new debugfs file "rc_stats_csv" to output Minstrel-HTs statistics in a common csv format that is easy to parse. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> [remove printing current time of day] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:44:26 +02:00
Thomas Huehn	6d48851779	mac80211: add new Minstrel statistic output via csv This patch adds a new debugfs file "rc_stats_csv" to output Minstrels statistics in a common csv format that is easy to parse. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> [remove printing current time of day] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 20:41:43 +02:00
David S. Miller	af3e09e666	This contains just a single fix for a crash I happened to randomly run into today during testing. It's clearly been around for a while, but is pretty hard to trigger, even when I tried explicitly (and modified the code to make it more likely) it rarely did. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVG/WSAAoJEDBSmw7B7bqrzmgQALygsyo0GrmHHorg0wkO+PBK l6kVknRwlsMil+vmB6mPTnwgUaGnEWJeRXl4zPJeA4Z+Xr58K1lyTFcMC0nu2MVb MNcTJycRmF4Lqyycd52zFaA+1vMsgG7AZb6vXLYppchUFTmbLVNX/IWhJGNOAsKZ HjYdvLr2kbJunIRofc/0PCC/8J4qFQj2ZphF5WfMckhNrh8+SQjIqmscFdRIqcgj R+LtyxscyKWspQ97J6OdoTsKpxTKSZ8mi9IvohdJhkJVnzBEkJx+Krf6PNs+Xnh1 Z4x1Dkn1RoM8+7cakq2tTwwxokEw7de/3v/s90W+yVuZWNKsaiKHcnU+KGHu6t0J 2766PXv6hC3KSWN6XrfTxYP7CBsT46Vf5+7FSlDsck1tUbn3W55c2kPFMBKk79yi CHjz1O82wzx4bAVNdaKMpR8rz6bSyhZijmduMuYxxrvnKkl5BSHype9IAUo+etVz KxPnwGq9yJjf0RYdr9tttxiwXJaADD6/R/bO21SIi1JeKa5sCyoAA5nLDyJfXwyt KtlrzzM9NWqoQUi2SGmurHHEIBBmgg9RBBWvq+MNM0Ik7d9kIawCBlvPerVc4IH4 bUvIXEnrQNOEwxmY/9nsUShQvkzuQbkwR3rpEYA3XemO0qW1t0Tkdp/3UY3TY6Sq EOTi9LkOquZnKd9GB7yl =Imm5 -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== This contains just a single fix for a crash I happened to randomly run into today during testing. It's clearly been around for a while, but is pretty hard to trigger, even when I tried explicitly (and modified the code to make it more likely) it rarely did. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-01 14:19:22 -04:00
Linus Torvalds	1e848913f0	Merge branch 'for-4.0' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Two main issues: - We found that turning on pNFS by default (when it's configured at build time) was too aggressive, so we want to switch the default before the 4.0 release. - Recent client changes to increase open parallelism uncovered a serious bug lurking in the server's open code. Also fix a krb5/selinux regression. The rest is mainly smaller pNFS fixes" * 'for-4.0' of git://linux-nfs.org/~bfields/linux: sunrpc: make debugfs file creation failure non-fatal nfsd: require an explicit option to enable pNFS NFSD: Fix bad update of layout in nfsd4_return_file_layout NFSD: Take care the return value from nfsd4_encode_stateid NFSD: Printk blocklayout length and offset as format 0x%llx nfsd: return correct lockowner when there is a race on hash insert nfsd: return correct openowner when there is a race to put one in the hash NFSD: Put exports after nfsd4_layout_verify fail NFSD: Error out when register_shrinker() fail NFSD: Take care the return value from nfsd4_decode_stateid NFSD: Check layout type when returning client layouts NFSD: restore trace event lost in mismerge	2015-04-01 09:45:47 -07:00
David Howells	44ba06987c	RxRPC: Handle VERSION Rx protocol packets Handle VERSION Rx protocol packets. We should respond to a VERSION packet with a string indicating the Rx version. This is a maximum of 64 characters and is padded out to 65 chars with NUL bytes. Note that other AFS clients use the version request as a NAT keepalive so we need to handle it rather than returning an abort. The standard formulation seems to be: <project> <version> built <yyyy>-<mm>-<dd> for example: " OpenAFS 1.6.2 built 2013-05-07 " (note the three extra spaces) as obtained with: rxdebug grand.mit.edu -version from the openafs package. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 16:31:26 +01:00
David Howells	382d7974de	RxRPC: Use iov_iter_count() in rxrpc_send_data() instead of the len argument Use iov_iter_count() in rxrpc_send_data() to get the remaining data length instead of using the len argument as the len argument is now redundant. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 15:49:26 +01:00
David Howells	aab94830a7	RxRPC: Don't call skb_add_data() if there's no data to copy Don't call skb_add_data() in rxrpc_send_data() if there's no data to copy and also skip the calculations associated with it in such a case. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 15:48:00 +01:00
David Howells	3af6878eca	RxRPC: Fix the conversion to iov_iter This commit: commit `af2b040e47` Author: Al Viro <viro@zeniv.linux.org.uk> Date: Thu Nov 27 21:44:24 2014 -0500 Subject: rxrpc: switch rxrpc_send_data() to iov_iter primitives incorrectly changes a do-while loop into a while loop in rxrpc_send_data(). Unfortunately, at least one pass through the loop is required - even if there is no data - so that the packet the closes the send phase can be sent if MSG_MORE is not set. Signed-off-by: David Howells <dhowells@redhat.com>	2015-04-01 14:06:00 +01:00
Johannes Berg	788211d81b	mac80211: fix RX A-MPDU session reorder timer deletion There's an issue with the way the RX A-MPDU reorder timer is deleted that can cause a kernel crash like this: * tid_rx is removed - call_rcu(ieee80211_free_tid_rx) * station is destroyed * reorder timer fires before ieee80211_free_tid_rx() runs, accessing the station, thus potentially crashing due to the use-after-free The station deletion is protected by synchronize_net(), but that isn't enough -- ieee80211_free_tid_rx() need not have run when that returns (it deletes the timer.) We could use rcu_barrier() instead of synchronize_net(), but that's much more expensive. Instead, to fix this, add a field tracking that the session is being deleted. In this case, the only re-arming of the timer happens with the reorder spinlock held, so make that code not rearm it if the session is being deleted and also delete the timer after setting that field. This ensures the timer cannot fire after ___ieee80211_stop_rx_ba_session() returns, which fixes the problem. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 14:35:01 +02:00
Pablo Neira Ayuso	c5035c77f8	netfilter: nft_meta: fix cgroup matching We have to stop iterating on the rule expressions if the cgroup mismatches. Moreover, make sure a non-full socket from the input path leads us to a crash. Fixes: `ce67417` ("netfilter: nft_meta: add cgroup support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:33:00 +02:00
Oliver Hartkopp	a5581ef4c2	can: introduce new raw socket option to join the given CAN filters The CAN_RAW socket can set multiple CAN identifier specific filters that lead to multiple filters in the af_can.c filter processing. These filters are indenpendent from each other which leads to logical OR'ed filters when applied. This socket option joines the given CAN filters in the way that only CAN frames are passed to user space that matched all given CAN filters. The semantic for the applied filters is therefore changed to a logical AND. This is useful especially when the filterset is a combination of filters where the CAN_INV_FILTER flag is set in order to notch single CAN IDs or CAN ID ranges from the incoming traffic. As the raw_rcv() function is executed from NET_RX softirq the introduced variables are implemented as per-CPU variables to avoid extensive locking at CAN frame reception time. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-04-01 11:28:22 +02:00
Oliver Hartkopp	514ac99c64	can: fix multiple delivery of a single CAN frame for overlapping CAN filters The CAN_RAW socket can set multiple CAN identifier specific filters that lead to multiple filters in the af_can.c filter processing. These filters are indenpendent from each other which leads to logical OR'ed filters when applied. This patch makes sure that every CAN frame which is filtered for a specific socket is only delivered once to the user space. This is independent from the number of matching CAN filters of this socket. As the raw_rcv() function is executed from NET_RX softirq the introduced variables are implemented as per-CPU variables to avoid extensive locking at CAN frame reception time. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-04-01 11:27:41 +02:00
Daniel Borkmann	afb7718016	netfilter: x_tables: fix cgroup matching on non-full sks While originally only being intended for outgoing traffic, commit `a00e76349f` ("netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks") enabled xt_cgroups for the NF_INET_LOCAL_IN hook as well, in order to allow for nfacct accounting. Besides being currently limited to early demuxes only, commit `a00e76349f` forgot to add a check if we deal with full sockets, i.e. in this case not with time wait sockets. TCP time wait sockets do not have the same memory layout as full sockets, a lower memory footprint and consequently also don't have a sk_classid member; probing for sk_classid member there could potentially lead to a crash. Fixes: `a00e76349f` ("netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks") Cc: Alexey Perevalov <a.perevalov@samsung.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:26:42 +02:00
Thomas Huehn	9c00bb7210	mac80211: enhance readability of Minstrel-HTs rc_stats output This patch restructures the rc_stats debugfs table of Minstrel-HT in order to achieve better human readability. A new layout of the statistics and a new header is added. In addition to the old layout there are two new columns of information added: idx - representing the rate index of each rate in mac80211 which can be used to set specific rates as fixed rate via debugfs airtime - the tx-time in micro seconds that a 1200 Byte packet takes to be transmitted over the air at the given rate The old layout of rc_stats: type rate tpt eprob prob ret ok(*cum) ok( cum) HT20/LGI MCS0 5.6 100.0 100.0 1 0( 0) 1( 1) HT20/LGI B MCS1 10.5 100.0 100.0 0 0( 0) 1( 1) HT20/LGI A MCS2 14.8 100.0 100.0 0 0( 0) 1( 1) ... is changed into this new layout: best ________rate______ __statistics__ ________last_______ ______sum-of________ mode guard # rate [name idx airtime] [ ø(tp) ø(prob)] [prob.\|retry\|suc\|att] [#success \| #attempts] HT20 LGI 1 MCS0 0 1480 0.0 0.0 0.0 1 0 0 0 0 HT20 LGI 1 B MCS1 1 740 10.5 100.0 100.0 0 0 0 1 1 HT20 LGI 1 A MCS2 2 496 14.8 100.0 100.0 0 0 0 1 1 ... Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 11:22:39 +02:00
Thomas Huehn	e161f7f6c4	mac80211: enhance readability of Minstrels rc_stats output This patch restructures the rc_stats debugfs table of Minstrel in order to achieve better human readability. A new layout of the statistics and a new header is added. In addition to the old layout there are two new columns of information added: idx - representing the rate index of each rate in mac80211 which can be used to set specific rates as fixed rate via debugfs airtime - the tx-time in micro seconds that a 1200 Byte packet takes to be transmitted over the air at the given rate The old layout of rc_stats: rate tpt eprob prob ret ok(*cum) ok( cum) DP 1 0.9 93.5 100.0 1 0( 0) 2( 2) 2 0.4 40.0 100.0 0 0( 0) 4( 10) 5.5 0.0 0.0 0.0 0 0( 0) 0( 0) ... is changed into this new layout: best _______rate_____ __statistics__ ________last_______ ______sum-of________ rate [name idx tx-time] [ ø(tp) ø(prob)] [prob.\|retry\|suc\|att] [#success \| #attempts] DP 1 0 9738 0.9 93.5 100.0 1 1 1 2 2 2 1 4922 0.4 40.0 100.0 1 0 0 4 10 5.5 2 1858 0.0 0.0 0.0 2 0 0 0 0 ... Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Stefan Venz <ikstream86@gmail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 11:22:39 +02:00
Ilan peer	c37722bd19	cfg80211: Stop calling crda if it is not responsive Patch `eeca9fce1d` (cfg80211: Schedule timeout for all CRDA call) introduced a regression, where in case that crda is not installed (or not configured properly etc.), the regulatory core will needlessly continue to call it, polluting the log with the following log: "cfg80211: Calling CRDA to update world regulatory domain" Fix this by limiting the number of continuous CRDA request failures. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 11:22:38 +02:00
Patrick McHardy	9d0982927e	netfilter: nft_hash: add support for timeouts Add support for element timeouts to nft_hash. The lookup and walking functions are changed to ignore timed out elements, a periodic garbage collection task cleans out expired entries. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:49 +02:00
Patrick McHardy	6908665826	netfilter: nf_tables: add GC synchronization helpers GC is expected to happen asynchrously to the netlink interface. In the netlink path, both insertion and removal of elements consist of two steps, insertion followed by activation or deactivation followed by removal, during which the element must not be freed by GC. The synchronization helpers use an unused bit in the genmask field to atomically mark an element as "busy", meaning it is either currently being handled through the netlink API or by GC. Elements being processed by GC will never survive, netlink will simply ignore them. Elements being currently processed through netlink will be skipped by GC and reprocessed during the next run. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:29 +02:00
Patrick McHardy	cfed7e1b1f	netfilter: nf_tables: add set garbage collection helpers Add helpers for GC batch destruction: since element destruction needs a RCU grace period for all set implementations, add some helper functions for asynchronous batch destruction. Elements are collected in a batch structure, which is asynchronously released using RCU once its full. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:29 +02:00
Patrick McHardy	c3e1b005ed	netfilter: nf_tables: add set element timeout support Add API support for set element timeouts. Elements can have a individual timeout value specified, overriding the sets' default. Two new extension types are used for timeouts - the timeout value and the expiration time. The timeout value only exists if it differs from the default value. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:28 +02:00
Patrick McHardy	761da2935d	netfilter: nf_tables: add set timeout API support Add set timeout support to the netlink API. Sets with timeout support enabled can have a default timeout value and garbage collection interval specified. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-04-01 11:17:28 +02:00
Johannes Berg	7bedd0cfad	mac80211: use rhashtable for station table We currently have a hand-rolled table with 256 entries and are using the last byte of the MAC address as the hash. This hash is obviously very fast, but collisions are easily created and we waste a lot of space in the common case of just connecting as a client to an AP where we just have a single station. The other common case of an AP is also suboptimal due to the size of the hash table and the ease of causing collisions. Convert all of this to use rhashtable with jhash, which gives us the advantage of a far better hash function (with random perturbation to avoid hash collision attacks) and of course that the hash table grows and shrinks dynamically with chain length, improving both cases above. Use a specialised hash function (using jhash, but with fixed length) to achieve better compiler optimisation as suggested by Sergey Ryazanov. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-04-01 10:06:26 +02:00
Ying Xue	7e43690578	tipc: fix a slab object leak When remove TIPC module, there is a warning to remind us that a slab object is leaked like: root@localhost:~# rmmod tipc [ 19.056226] ============================================================================= [ 19.057549] BUG TIPC (Not tainted): Objects remaining in TIPC on kmem_cache_close() [ 19.058736] ----------------------------------------------------------------------------- [ 19.058736] [ 19.060287] INFO: Slab 0xffffea0000519a00 objects=23 used=1 fp=0xffff880014668b00 flags=0x100000000004080 [ 19.061915] INFO: Object 0xffff880014668000 @offset=0 [ 19.062717] kmem_cache_destroy TIPC: Slab cache still has objects This is because the listening socket of TIPC topology server is not closed before TIPC proto handler is unregistered with proto_unregister(). However, as the socket is closed in tipc_exit_net() which is called by unregister_pernet_subsys() during unregistering TIPC namespace operation, the warning can be eliminated if calling unregister_pernet_subsys() is moved before calling proto_unregister(). Fixes: `e05b31f4bf` ("tipc: make tipc socket support net namespace") Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 23:10:08 -04:00
David S. Miller	7b6249bba9	Lots of updates for net-next; along with the usual flurry of small fixes, cleanups and internal features we have: * VHT support for TDLS and IBSS (conditional on drivers though) * first TX performance improvements (the biggest will come later) * many suspend/resume (race) fixes * name_assign_type support from Tom Gundersen -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJVGWlNAAoJEDBSmw7B7bqr5s8P/R9+Q6y4Ixice9dDOJYynl/d dMEUEfCBWUyDaQD1bNQIED2mc0sM5+Ax8OVIVx9fdrLGPxaISBqDJKE1USoTNZzm q+U3dM4Q45SfQSsgaY4FtxTlPWPtUKsNMXY/CxLR+IqVeA3+30rX+hv1f3SAqBj0 68IwW/uUIHb71IZ+hz2Mwudt4JVW8KRg9VlZ0UY6EEvC4m5QD2YkwQQo/hJ2WF+/ wAJbku02L/Vy4J7E6hNcKYWXokht4fVYphjl/1ZDd/+8L8SUv9mC88n1Jzxa428p 1PmbtwzbpOrtTcC2BCPDA04IyfMc7k9DlLKw/h2KLPbHZXheD9eVmo/Am5vz+uH6 926f+FK339SzoJnZ5wBBDiZ8W8TLYNc8ImxtcxjnrtGfr1CKiuh23P1CWyOlKJCO BYFJqkCOqWOHYnN0embaj7JqM/LmQI5ZoBZHZhD2KQXIXpTsjjIMPfJvip5D+tsV +iXIlQwdeK6rbjxdonBgn7n57XIeSVMAYeyDCbzIShfibjHbUZPn+RsZCtv8RWv8 EaZu8PerU5ZDKwdX940+lWrtf/TMDJBYQpAIBRuiZK4DTNWCt3BrDlvb1FXGgA+X vQJnr32vjJ/pLDxNLHQlkKWC4I/CYtG47OgcJN9AQXrig1zApd+C29zy3aqch3ea wxV9dFfheTqZFjtZfSsH =O/cf -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2015-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Lots of updates for net-next; along with the usual flurry of small fixes, cleanups and internal features we have: * VHT support for TDLS and IBSS (conditional on drivers though) * first TX performance improvements (the biggest will come later) * many suspend/resume (race) fixes * name_assign_type support from Tom Gundersen ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:39:04 -04:00
Jiri Pirko	fbcb217059	net: rename dev to orig_dev in deliver_ptype_list_skb Unlike other places, this function uses name "dev" for what should be "orig_dev", which might be a bit confusing. So fix this. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:37:43 -04:00
Eugene Crosser	ed4ac42217	af_iucv: fix AF_IUCV sendmsg() errno When sending over AF_IUCV socket, errno was incorrectly set to ENOMEM even when other values where appropriate, notably EAGAIN. With this patch, error indicator returned by sock_alloc_send_skb() is passed to the caller, rather than being overwritten with ENOMEM. Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com> Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:06:50 -04:00
Thomas Graf	fa2d8ff4e3	openvswitch: Return vport module ref before destruction Return module reference before invoking the respective vport ->destroy() function. This is needed as ovs_vport_del() is not invoked inside an RCU read side critical section so the kfree can occur immediately before returning to ovs_vport_del(). Returning the module reference before ->destroy() is safe because the module unregistration is blocked on ovs_lock which we hold while destroying the datapath. Fixes: `62b9c8d037` ("ovs: Turn vports with dependencies into separate modules") Reported-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 15:59:50 -04:00
Jeff Layton	f9c72d10d6	sunrpc: make debugfs file creation failure non-fatal We currently have a problem that SELinux policy is being enforced when creating debugfs files. If a debugfs file is created as a side effect of doing some syscall, then that creation can fail if the SELinux policy for that process prevents it. This seems wrong. We don't do that for files under /proc, for instance, so Bruce has proposed a patch to fix that. While discussing that patch however, Greg K.H. stated: "No kernel code should care / fail if a debugfs function fails, so please fix up the sunrpc code first." This patch converts all of the sunrpc debugfs setup code to be void return functins, and the callers to not look for errors from those functions. This should allow rpc_clnt and rpc_xprt creation to work, even if the kernel fails to create debugfs files for some reason. Symptoms were failing krb5 mounts on systems using gss-proxy and selinux. Fixes: `388f0c7767` "sunrpc: add a debugfs rpc_xprt directory..." Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2015-03-31 14:15:08 -04:00
Jiri Benc	67b61f6c13	netlink: implement nla_get_in_addr and nla_get_in6_addr Those are counterparts to nla_put_in_addr and nla_put_in6_addr. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Jiri Benc	930345ea63	netlink: implement nla_put_in_addr and nla_put_in6_addr IP addresses are often stored in netlink attributes. Add generic functions to do that. For nla_put_in_addr, it would be nicer to pass struct in_addr but this is not used universally throughout the kernel, in way too many places __be32 is used to store IPv4 address. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Jiri Benc	15e318bdc6	xfrm: simplify xfrm_address_t use In many places, the a6 field is typecasted to struct in6_addr. As the fields are in union anyway, just add in6_addr type to the union and get rid of the typecasting. Modifying the uapi header is okay, the union has still the same size. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Jiri Benc	8f55db4860	tcp: simplify inetpeer_addr_base use In many places, the a6 field is typecasted to struct in6_addr. As the fields are in union anyway, just add in6_addr type to the union and get rid of the typecasting. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:58:35 -04:00
Ian Morris	53b24b8f94	ipv6: coding style: comparison for inequality with NULL The ipv6 code uses a mixture of coding styles. In some instances check for NULL pointer is done as x != NULL and sometimes as x. x is preferred according to checkpatch and this patch makes the code consistent by adopting the latter form. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:51:54 -04:00
Ian Morris	63159f29be	ipv6: coding style: comparison for equality with NULL The ipv6 code uses a mixture of coding styles. In some instances check for NULL pointer is done as x == NULL and sometimes as !x. !x is preferred according to checkpatch and this patch makes the code consistent by adopting the latter form. No changes detected by objdiff. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:51:54 -04:00
Alexander Duyck	6e47d6caff	fib_trie: Cleanup ip_fib_net_exit code path While fixing a recent issue I noticed that we are doing some unnecessary work inside the loop for ip_fib_net_exit. As such I am pulling out the initialization to NULL for the locally stored fib_local, fib_main, and fib_default. In addition I am restoring the original code for flushing the table as there is no need to split up the fib_table_flush and hlist_del work since the code for packing the tnodes with multiple key vectors was dropped. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:18:56 -04:00
Alexander Duyck	ad88d05136	fib_trie: Fix warning on fib4_rules_exit This fixes the following warning: BUG: sleeping function called from invalid context at mm/slub.c:1268 in_atomic(): 1, irqs_disabled(): 0, pid: 6, name: kworker/u8:0 INFO: lockdep is turned off. CPU: 3 PID: 6 Comm: kworker/u8:0 Tainted: G W 4.0.0-rc5+ #895 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: netns cleanup_net 0000000000000006 ffff88011953fa68 ffffffff81a203b6 000000002c3a2c39 ffff88011952a680 ffff88011953fa98 ffffffff8109daf0 ffff8801186c6aa8 ffffffff81fbc9e5 00000000000004f4 0000000000000000 ffff88011953fac8 Call Trace: [<ffffffff81a203b6>] dump_stack+0x4c/0x65 [<ffffffff8109daf0>] ___might_sleep+0x1c3/0x1cb [<ffffffff8109db70>] __might_sleep+0x78/0x80 [<ffffffff8117a60e>] slab_pre_alloc_hook+0x31/0x8f [<ffffffff8117d4f6>] __kmalloc+0x69/0x14e [<ffffffff818ed0e1>] ? kzalloc.constprop.20+0xe/0x10 [<ffffffff818ed0e1>] kzalloc.constprop.20+0xe/0x10 [<ffffffff818ef622>] fib_trie_table+0x27/0x8b [<ffffffff818ef6bd>] fib_trie_unmerge+0x37/0x2a6 [<ffffffff810b06e1>] ? arch_local_irq_save+0x9/0xc [<ffffffff818e9793>] fib_unmerge+0x2d/0xb3 [<ffffffff818f5f56>] fib4_rule_delete+0x1f/0x52 [<ffffffff817f1c3f>] ? fib_rules_unregister+0x30/0xb2 [<ffffffff817f1c8b>] fib_rules_unregister+0x7c/0xb2 [<ffffffff818f64a1>] fib4_rules_exit+0x15/0x18 [<ffffffff818e8c0a>] ip_fib_net_exit+0x23/0xf2 [<ffffffff818e91f8>] fib_net_exit+0x32/0x36 [<ffffffff817c8352>] ops_exit_list+0x45/0x57 [<ffffffff817c8d3d>] cleanup_net+0x13c/0x1cd [<ffffffff8108b05d>] process_one_work+0x255/0x4ad [<ffffffff8108af69>] ? process_one_work+0x161/0x4ad [<ffffffff8108b4b1>] worker_thread+0x1cd/0x2ab [<ffffffff8108b2e4>] ? process_scheduled_works+0x2f/0x2f [<ffffffff81090686>] kthread+0xd4/0xdc [<ffffffff8109ec8f>] ? local_clock+0x19/0x22 [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83 [<ffffffff81a2c0c8>] ret_from_fork+0x58/0x90 [<ffffffff810905b2>] ? __kthread_parkme+0x83/0x83 The issue was that as a part of exiting the default rules were being deleted which resulted in the local trie being unmerged. By moving the freeing of the FIB tables up we can avoid the unmerge since there is no local table left when we call the fib4_rules_exit function. Fixes: `0ddcf43d5d` ("ipv4: FIB Local/MAIN table collapse") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 13:18:56 -04:00
David S. Miller	89a3f3c55b	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-03-27 Here's another set of Bluetooth & 802.15.4 patches for 4.1: - New API to control LE advertising data (i.e. peripheral role) - mac802154 & at86rf230 cleanups - Support for toggling quirks from debugfs (useful for testing) - Memory leak fix for LE scanning - Extra version info reading support for Broadcom controllers Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 12:45:58 -04:00
Chuck Lever	d654788e98	xprtrdma: Make rpcrdma_{un}map_one() into inline functions These functions are called in a loop for each page transferred via RDMA READ or WRITE. Extract loop invariants and inline them to reduce CPU overhead. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	e46ac34c3c	xprtrdma: Handle non-SEND completions via a callout Allow each memory registration mode to plug in a callout that handles the completion of a memory registration operation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	3968cb5850	xprtrdma: Add "open" memreg op The open op determines the size of various transport data structures based on device capabilities and memory registration mode. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	4561f347d4	xprtrdma: Add "destroy MRs" memreg op Memory Region objects associated with a transport instance are destroyed before the instance is shutdown and destroyed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	31a701a947	xprtrdma: Add "reset MRs" memreg op This method is invoked when a transport instance is about to be reconnected. Each Memory Region object is reset to its initial state. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:53 -04:00
Chuck Lever	91e70e70e4	xprtrdma: Add "init MRs" memreg op This method is used when setting up a new transport instance to create a pool of Memory Region objects that will be used to register memory during operation. Memory Regions are not needed for "physical" registration, since ->prepare and ->release are no-ops for that mode. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	6814baead8	xprtrdma: Add a "deregister_external" op for each memreg mode There is very little common processing among the different external memory deregistration functions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	9c1b4d775f	xprtrdma: Add a "register_external" op for each memreg mode There is very little common processing among the different external memory registration functions. Have rpcrdma_create_chunks() call the registration method directly. This removes a stack frame and a switch statement from the external registration path. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	1c9351ee0e	xprtrdma: Add a "max_payload" op for each memreg mode The max_payload computation is generalized to ensure that the payload maximum is the lesser of RPC_MAX_DATA_SEGS and the number of data segments that can be transmitted in an inline buffer. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	a0ce85f595	xprtrdma: Add vector of ops for each memory registration strategy Instead of employing switch() statements, let's use the typical Linux kernel idiom for handling behavioral variation: virtual functions. Start by defining a vector of operations for each supported memory registration mode, and by adding a source file for each mode. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	41f9702896	xprtrdma: Prevent infinite loop in rpcrdma_ep_create() If a provider advertizes a zero max_fast_reg_page_list_len, FRWR depth detection loops forever. Instead of just failing the mount, try other memory registration modes. Fixes: `0fc6c4e7bb` ("xprtrdma: mind the device's max fast . . .") Reported-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	805272406a	xprtrdma: Byte-align FRWR registration The RPC/RDMA transport's FRWR registration logic registers whole pages. This means areas in the first and last pages that are not involved in the RDMA I/O are needlessly exposed to the server. Buffered I/O is typically page-aligned, so not a problem there. But for direct I/O, which can be byte-aligned, and for reply chunks, which are nearly always smaller than a page, the transport could expose memory outside the I/O buffer. FRWR allows byte-aligned memory registration, so let's use it as it was intended. Reported-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	e23779451e	xprtrdma: Perform a full marshal on retransmit Commit `6ab59945f2` ("xprtrdma: Update rkeys after transport reconnect" added logic in the ->send_request path to update the chunk list when an RPC/RDMA request is retransmitted. Note that rpc_xdr_encode() resets and re-encodes the entire RPC send buffer for each retransmit of an RPC. The RPC send buffer is not preserved from the previous transmission of an RPC. Revert `6ab59945f2`, and instead, just force each request to be fully marshaled every time through ->send_request. This should preserve the fix from `6ab59945f2`, while also performing pullup during retransmits. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Chuck Lever	0dd39cae26	xprtrdma: Display IPv6 addresses and port numbers correctly Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2015-03-31 09:52:52 -04:00
Johan Hedberg	db6e3e8d01	Bluetooth: Refactor HCI request variables into own struct In order to shrink the size of bt_skb_cb, this patch moves the HCI request related variables into their own req_ctrl struct. Additionall the L2CAP and HCI request structs are placed inside the same union since they will never be used at the same time for the same skb. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-30 23:20:53 +02:00
Johan Hedberg	a4368ff3ed	Bluetooth: Refactor L2CAP variables into l2cap_ctrl We're getting very close to the maximum possible size of bt_skb_cb. To prepare to shrink the struct with the help of a union this patch moves all L2CAP related variables into the l2cap_ctrl struct. To later add other 'ctrl' structs the L2CAP one is renamed simple 'l2cap' instead of 'control'. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-30 23:20:53 +02:00
Johannes Berg	2c44be81f0	mac80211: set QoS capability before changing station state In the upcoming fast-xmit patch, changing station state will build a header cache based on the station's capabilities, and as the QoS capability (sta.wme) impacts the header, it needs to be set before. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 15:11:01 +02:00
Arik Nemtsov	070e176a75	mac80211: send HT/VHT IEs in TDLS discovery response These are mandated by IEEE802.11-2012 section 8.5.8.6 and IEEE802.11ac-2013 section 8.5.8.16. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:48:59 +02:00
Janusz.Dziedzic@tieto.com	abcff6ef01	mac80211: add VHT support for IBSS Add VHT support for IBSS. Drivers could activate this feature by setting NL80211_EXT_FEATURE_VHT_IBSS flag. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:48:26 +02:00
Janusz.Dziedzic@tieto.com	76bed0f43b	mac80211: IBSS fix scan request In case of wide bandwidth (wider than 20MHz) used by IBSS, scan all channels in chandef to be able to find neighboring IBSS netwqworks that use the same overall channels but a different control channel. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:56 +02:00
Johannes Berg	97ffe75791	mac80211: factor out station lookup from ieee80211_build_hdr() In order to look up the RA station earlier to implement a TX fastpath, factor out the lookup from ieee80211_build_hdr(). To always have a valid station pointer, also move some of the checks into the new function. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:19 +02:00
Johannes Berg	527871d720	mac80211: make sta.wme indicate whether QoS is used Indicating just the peer's capability is fairly pointless if the local device doesn't support it. Make the variable track both combined, and remove the 'local support' check in the TX path. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:16 +02:00
Johannes Berg	a73f8e21f3	mac80211: send AP probe as unicast again Louis reported that a static checker was complaining that the 'dst' variable was set (multiple times) but not used. This is due to a previous commit having removed the usage (apparently erroneously), so add it back. Fixes: `a344d6778a` ("mac80211: allow drivers to support NL80211_SCAN_FLAG_RANDOM_ADDR") Reported-by: Louis Langholtz <lou_langholtz@me.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:47:14 +02:00
Johannes Berg	07862e13e6	mac80111: aes_gcm: clean up ieee80211_aes_gcm_key_setup_encrypt() This code is written using an anti-pattern called "success handling" which makes it hard to read, especially if you are used to normal kernel style. It should instead be written as a list of directives in a row with branches for error handling. (Basically copied from Dan's previous patch for CCM) Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:42:02 +02:00
Dan Carpenter	45fd63293a	mac80111: aes_ccm: cleanup ieee80211_aes_key_setup_encrypt() This code is written using an anti-pattern called "success handling" which makes it hard to read, especially if you are used to normal kernel style. It should instead be written as a list of directives in a row with branches for error handling. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:40:03 +02:00
Jouni Malinen	82ca6ef686	mac80211: Fix misplaced return in AES-GMAC key setup Commit `8ade538bf3` ("mac80111: Add BIP-GMAC-128 and BIP-GMAC-256 ciphers") had the success return in incorrect place before the crypto_aead_setauthsize() call which practically ended up skipping that call unconditionally. The missing call did not actually change any functionality since GMAC_MIC_LEN (16) is identical to the maxauthsize in gcm(aes) and as such, the default value used for the authsize parameter. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:38:07 +02:00
Tom Gundersen	6bab2e19c5	cfg80211: pass name_assign_type to rdev_add_virtual_intf() This will expose in /sys whether the ifname of a device is set by userspace or generated by the kernel. The latter kind (wlanX, etc) is not deterministic, so userspace needs to rename these devices to names that are guaranteed to stay the same between reboots. The former, however should never be renamed, so userspace needs to be able to reliably tell the difference. Similar functionality was introduced for the rtnetlink core in commit `5517750f05` ("net: rtnetlink - make create_link take name_assign_type") Signed-off-by: Tom Gundersen <teg@jklm.no> Cc: Kalle Valo <kvalo@qca.qualcomm.com> Cc: Brett Rudley <brudley@broadcom.com> Cc: Arend van Spriel <arend@broadcom.com> Cc: Franky (Zhenhui) Lin <frankyl@broadcom.com> Cc: Hante Meuleman <meuleman@broadcom.com> Cc: Johannes Berg <johannes@sipsolutions.net> [reformat changelog to fit 72 cols] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:36:17 +02:00
Michael Braun	6a8b4adb47	mac80211: fix typo in debug output Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:34:13 +02:00
Johannes Berg	8f9c77fc1e	mac80211: reject aggregation sessions with non-HT peers If a peer or some local agent (rate control, ...) decides to start an aggregation session but doesn't support HT (which also implies QoS), reject it. This is mostly a corner case as such peers normally won't try to use block-ack sessions and rate control wouldn't start them, but technically QoS stations could request it according to the spec. However, since drivers don't really support such non-HT sessions it's better to reject them. Also, while at it, move the tracing for TX sessions earlier so it captures the error cases as well. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:33:19 +02:00
Arik Nemtsov	a38700dd48	cfg/mac80211: add regulatory classes IE during TDLS setup Seems Broadcom TDLS peers (Nexus 5, Xperia Z3) refuse to allow TDLS connection when channel-switching is supported but the regulatory classes IE is missing from the setup request. Add a chandef to reg-class translation function to cfg80211 and use it to add the required IE during setup. For now add only the current regulatory class as supported - it is enough to resolve the compatibility issue. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:26:36 +02:00
David Spinadel	7d830a1986	mac80211: stop scan before connection Stop scan before authentication or association to make sure that nothing interferes with connection flow. Currently mac80211 defers RX auth and assoc packets (among other ones) until after the scan is complete, so auth during scan is likely to fail if scan took too much time. Signed-off-by: David Spinadel <david.spinadel@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:19:13 +02:00
Luciano Coelho	21fea56731	nl80211: add net-detect delay to wowlan info Pass the initial net-detect delay (NL80211_ATTR_SCHED_SCAN_DELAY) attribute in the WoWLAN info response. Additionally, remove a bogus TODO comment. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:18:12 +02:00
Emmanuel Grumbach	a90faa9d64	mac80211: notify the driver about deauth This can allow the driver to take action based on the reason of the deauth. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:12 +02:00
Emmanuel Grumbach	d0d1a12f9c	mac80211: notify the driver about association status This can allow the driver to take action based on the success / failure of the association. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:11 +02:00
Emmanuel Grumbach	a9409093d2	mac80211: notify the driver about authentication status This can allow the driver to take action based on the success / failure of the authentication. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:10 +02:00
Emmanuel Grumbach	a818292952	mac80211: convert rssi_callback() to event_callback() We will be able to add more events, such as MLME events and others. The low level driver may be interested in knowing about these events to dump firmware data upon failures, or to change parameters in case connection attempts fail etc... Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 10:17:09 +02:00
Johannes Berg	2c158887f1	mac80211: agg-tx: avoid sending DelBA with sta->lock held The rate control locking caused a potential deadlock here due to the locks being acquired in different orders, so that change cannot yet be applied. However, there's no fundamental reason for this code to hold the sta->lock while transmitting frames. Clearly it's better not to hold the lock for longer periods of time, which can happen here since we call all the way down to the driver. Change the code a bit to not hold it while doing that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-30 09:46:52 +02:00
Jon Paul Maloy	d482994fca	tipc: fix two bugs in secondary destination lookup A message sent to a node after a successful name table lookup may still find that the destination socket has disappeared, because distribution of name table updates is non-atomic. If so, the message will be rejected back to the sender with error code TIPC_ERR_NO_PORT. If the source socket of the message has disappeared in the meantime, the message should be dropped. However, in the currrent code, the message will instead be subject to an unwanted tertiary lookup, because the function tipc_msg_lookup_dest() doesn't check if there is an error code present in the message before performing the lookup. In the worst case, the message may now find the old destination again, and be redirected once more, instead of being dropped directly as it should be. A second bug in this function is that the "prev_node" field in the message is not updated after successful lookup, something that may have unpredictable consequences. The problems arising from those bugs occur very infrequently. The third change in this function; the test on msg_reroute_msg_cnt() is purely cosmetic, reflecting that the returned value never can be negative. This commit corrects the two bugs described above. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:47:36 -07:00
Alexey Kodanev	4ad19de877	net: tcp6: fix double call of tcp_v6_fill_cb() tcp_v6_fill_cb() will be called twice if socket's state changes from TCP_TIME_WAIT to TCP_LISTEN. That can result in control buffer data corruption because in the second tcp_v6_fill_cb() call it's not copying IP6CB(skb) anymore, but 'seq', 'end_seq', etc., so we can get weird and unpredictable results. Performance loss of up to 1200% has been observed in LTP/vxlan03 test. This can be fixed by copying inet6_skb_parm to the beginning of 'cb' only if xfrm6_policy_check() and tcp_v6_fill_cb() are going to be called again. Fixes: `2dc49d1680` ("tcp6: don't move IP6CB before xfrm6_policy_check()") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:36:05 -07:00
Toshiaki Makita	e38f30256b	net: Introduce passthru_features_check As there are a number of (especially virtual) devices that don't need the multiple vlan check, introduce passthru_features_check() for convenience. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:22 -07:00
Toshiaki Makita	8cb65d0008	net: Move check for multiple vlans to drivers To allow drivers to handle the features check for multiple tags, move the check to ndo_features_check(). As no drivers currently handle multiple tagged TSO, introduce dflt_features_check() and call it if the driver does not have ndo_features_check(). Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:22 -07:00
Toshiaki Makita	f5a7fb88e1	vlan: Introduce helper functions to check if skb is tagged Separate the two checks for single vlan and multiple vlans in netif_skb_features(). This allows us to move the check for multiple vlans to another function later. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:22 -07:00
Toshiaki Makita	8d463504c1	vlan: Add features for stacked vlan device Stacked vlan devices curretly have few features (GRO, HIGHDMA, LLTX). Since we have software fallbacks in case the NIC can not handle some features for multiple vlans, we can add the same features as the lower vlan devices for stacked vlan devices. This allows stacked vlan devices to create large (GSO) packets and not to segment packets. Those packets will be segmented by software on the real device, or even can be segmented by the NIC once TSO for multiple vlans becomes enabled by the following patches. The exception is those related to FCoE, which does not have a software fallback. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:21 -07:00
Alexei Starovoitov	608cd71a9c	tc: bpf: generalize pedit action existing TC action 'pedit' can munge any bits of the packet. Generalize it for use in bpf programs attached as cls_bpf and act_bpf via bpf_skb_store_bytes() helper function. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:26:54 -07:00
Guenter Roeck	339d82626d	net: dsa: Add basic framework to support ndo_fdb functions Provide callbacks for ndo_fdb_add, ndo_fdb_del, and ndo_fdb_dump. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:23:54 -07:00
Nicolas Dichtel	4217291e59	netns: don't clear nsid too early on removal With the current code, ids are removed too early. Suppose you have an ipip interface that stands in the netns foo and its link part in the netns bar (so the netns bar has an nsid into the netns foo). Now, you remove the netns bar: - the bar nsid into the netns foo is removed - the netns exit method of ipip is called, thus our ipip iface is removed: => a netlink message is sent in the netns foo to advertise this deletion => this netlink message requests an nsid for bar, thus a new nsid is allocated for bar and never removed. We must remove nsids when we are sure that nobody will refer to netns currently cleaned. Fixes: `0c7aecd4bd` ("netns: add rtnl cmd to add and get peer netns ids") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:58:21 -07:00
David S. Miller	4ef295e047	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. Basically, nf_tables updates to add the set extension infrastructure and finish the transaction for sets from Patrick McHardy. More specifically, they are: 1) Move netns to basechain and use recently added possible_net_t, from Patrick McHardy. 2) Use LOGLEVEL_<FOO> from nf_log infrastructure, from Joe Perches. 3) Restore nf_log_trace that was accidentally removed during conflict resolution. 4) nft_queue does not depend on NETFILTER_XTABLES, starting from here all patches from Patrick McHardy. 5) Use raw_smp_processor_id() in nft_meta. Then, several patches to prepare ground for the new set extension infrastructure: 6) Pass object length to the hash callback in rhashtable as needed by the new set extension infrastructure. 7) Cleanup patch to restore struct nft_hash as wrapper for struct rhashtable 8) Another small source code readability cleanup for nft_hash. 9) Convert nft_hash to rhashtable callbacks. And finally... 10) Add the new set extension infrastructure. 11) Convert the nft_hash and nft_rbtree sets to use it. 12) Batch set element release to avoid several RCU grace period in a row and add new function nft_set_elem_destroy() to consolidate set element release. 13) Return the set extension data area from nft_lookup. 14) Refactor existing transaction code to add some helper functions and document it. 15) Complete the set transaction support, using similar approach to what we already use, to activate/deactivate elements in an atomic fashion. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:43:43 -07:00
Ying Xue	8a0f6ebe84	tipc: involve reference counter for node structure TIPC node hash node table is protected with rcu lock on read side. tipc_node_find() is used to look for a node object with node address through iterating the hash node table. As the entire process of what tipc_node_find() traverses the table is guarded with rcu read lock, it's safe for us. However, when callers use the node object returned by tipc_node_find(), there is no rcu read lock applied. Therefore, this is absolutely unsafe for callers of tipc_node_find(). Now we introduce a reference counter for node structure. Before tipc_node_find() returns node object to its caller, it first increases the reference counter. Accordingly, after its caller used it up, it decreases the counter again. This can prevent a node being used by one thread from being freed by another thread. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:40:28 -07:00
Ying Xue	b952b2befb	tipc: fix potential deadlock when all links are reset [ 60.988363] ====================================================== [ 60.988754] [ INFO: possible circular locking dependency detected ] [ 60.989152] 3.19.0+ #194 Not tainted [ 60.989377] ------------------------------------------------------- [ 60.989781] swapper/3/0 is trying to acquire lock: [ 60.990079] (&(&n_ptr->lock)->rlock){+.-...}, at: [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.990743] [ 60.990743] but task is already holding lock: [ 60.991106] (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.991738] [ 60.991738] which lock already depends on the new lock. [ 60.991738] [ 60.992174] [ 60.992174] the existing dependency chain (in reverse order) is: [ 60.992174] -> #1 (&(&bclink->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] [<ffffffffa0000f57>] tipc_bclink_add_node+0x97/0xf0 [tipc] [ 60.992174] [<ffffffffa0011815>] tipc_node_link_up+0xf5/0x110 [tipc] [ 60.992174] [<ffffffffa0007783>] link_state_event+0x2b3/0x4f0 [tipc] [ 60.992174] [<ffffffffa00193c0>] tipc_link_proto_rcv+0x24c/0x418 [tipc] [ 60.992174] [<ffffffffa0008857>] tipc_rcv+0x827/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] -> #0 (&(&n_ptr->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a8f7d>] __lock_acquire+0x163d/0x1ca0 [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.992174] [<ffffffffa0001e11>] tipc_bclink_rcv+0x611/0x640 [tipc] [ 60.992174] [<ffffffffa0008646>] tipc_rcv+0x616/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] [ 60.992174] other info that might help us debug this: [ 60.992174] [ 60.992174] Possible unsafe locking scenario: [ 60.992174] [ 60.992174] CPU0 CPU1 [ 60.992174] ---- ---- [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] [ 60.992174] * DEADLOCK * [ 60.992174] [ 60.992174] 3 locks held by swapper/3/0: [ 60.992174] #0: (rcu_read_lock){......}, at: [<ffffffff81646791>] __netif_receive_skb_core+0x71/0x980 [ 60.992174] #1: (rcu_read_lock){......}, at: [<ffffffffa0002c35>] tipc_l2_rcv_msg+0x5/0xd0 [tipc] [ 60.992174] #2: (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] The correct the sequence of grabbing n_ptr->lock and bclink->lock should be that the former is first held and the latter is then taken, which exactly happened on CPU1. But especially when the retransmission of broadcast link is failed, bclink->lock is first held in tipc_bclink_rcv(), and n_ptr->lock is taken in link_retransmit_failure() called by tipc_link_retransmit() subsequently, which is demonstrated on CPU0. As a result, deadlock occurs. If the order of holding the two locks happening on CPU0 is reversed, the deadlock risk will be relieved. Therefore, the node lock taken in link_retransmit_failure() originally is moved to tipc_bclink_rcv() so that it's obtained before bclink lock. But the precondition of the adjustment of node lock is that responding to bclink reset event must be moved from tipc_bclink_unlock() to tipc_node_unlock(). Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:40:27 -07:00
Eric Dumazet	41d25fe092	tcp: tcp_syn_flood_action() can be static After commit `1fb6f159fd` ("tcp: add tcp_conn_request"), tcp_syn_flood_action() is no longer used from IPv6. We can make it static, by moving it above tcp_conn_request() Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:17:18 -07:00
WANG Cong	f243e5a785	ipmr,ip6mr: call ip6mr_free_table() on failure path Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:13:54 -07:00
WANG Cong	85b9909272	fib6: install fib6 ops in the last step We should not commit the new ops until we finish all the setup, otherwise we have to NULL it on failure. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:12:37 -07:00
Marcel Holtmann	20fa110a54	Bluetooth: Remove superfluous extra empty line between functions Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-29 07:57:03 +03:00
Marcel Holtmann	57b0d3e8e7	Bluetooth: Fix error returns for Read Local OOB Extended Data commands The Read Local OOB Extended Data commands are required to return the address type and the data length at least. However currently the error returns only the address type. To fix this and avoid any extra allocations or stack memory, rearrange the code so that the same path can be used for error returns. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-29 07:57:02 +03:00
Marcel Holtmann	efcd8c98e0	Bluetooth: Move memory location outside of hci_dev lock Taking the hci_dev lock for just a memory allocation seems a bit too much and not really needed. So instead try to allocate the memory first and then take the lock. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-29 07:57:00 +03:00
Arman Uguray	880897d4c9	Bluetooth: Update adv. parameters when conn. setting changes This patch fixes a bug where the advertising parameters weren't updated after a call to "Set Connectable" if the HCI_ADVERTISING_INSTANCE setting was set. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 21:31:57 +01:00
Arman Uguray	c7d4883b06	Bluetooth: Use ADV_SCAN_IND for adv. instances With this patch, ADV_SCAN_IND will be used for advertising instances that have non-zero scan response data while the global "connectable" setting is "off". Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 21:31:57 +01:00
Arman Uguray	faccb950f7	Bluetooth: Fix using global connectable settings for adv This patch fixes a bug where ADV_NONCONN_IND was being used for advertising instances >0 while the global connectable setting was set to "on". Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 21:31:57 +01:00
Johan Hedberg	600b21507e	Bluetooth: Fix race condition with HCI_RESET flag During the HCI init phase a completed request might be the last part of the setup procedure after which the actual init procedure starts. The init procedure begins with a call to hci_reset_req() which sets the HCI_RESET flag. The purpose of this flag is to make us ignore any updates to ncmd/cmd_cnt as long as we haven't received the command complete event for the HCI_Reset. There's a potential race with this however: hci_req_cmd_complete(hdev, opcode, status); if (ev->ncmd && !test_bit(HCI_RESET, &hdev->flags)) { atomic_set(&hdev->cmd_cnt, 1); if (!skb_queue_empty(&hdev->cmd_q)) queue_work(hdev->workqueue, &hdev->cmd_work); } Since the hci_req_cmd_complete() will trigger the completion of the setup stage, it's possible that hci_reset_req() gets called before we try to read ev->ncmd and the HCI_RESET flag. Because of this the cmd_cnt would never be updated and the hci_reset_req() in practice ends up blocking itself. This patch fixes the issue by updating cmd_cnt before notifying the request completion, and then reading it again to determine whether the cmd_work should be queued or not. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-28 20:05:11 +01:00
Alexander Aring	8bf9538a5d	mac802154: cleanup concurrent check This patch cleanups the checking of different mac phy depended values by handling depended mac settings per hw support flag in one condition. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-27 19:18:50 +01:00
Trond Myklebust	0695314ef0	SUNRPC: Fix a regression when reconnecting If the task needs to give up the socket lock in order to allow a reconnect to occur, then it must also clear the 'rq_bytes_sent' field so that when it retransmits, it knows to start from the beginning. Fixes: `718ba5b873` ("SUNRPC: Add helpers to prevent socket create from racing") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2015-03-27 12:24:36 -04:00
Patrick McHardy	cc02e457bb	netfilter: nf_tables: implement set transaction support Set elements are the last object type not supporting transaction support. Implement similar to the existing rule transactions: The global transaction counter keeps track of two generations, current and next. Each element contains a bitmask specifying in which generations it is inactive. New elements start out as inactive in the current generation and active in the next. On commit, the previous next generation becomes the current generation and the element becomes active. The bitmask is then cleared to indicate that the element is active in all future generations. If the transaction is aborted, the element is removed from the set before it becomes active. When removing an element, it gets marked as inactive in the next generation. On commit the next generation becomes active and the therefor the element inactive. It is then taken out of then set and released. On abort, the element is marked as active for the next generation again. Lookups ignore elements not active in the current generation. The current set types (hash/rbtree) both use a field in the extension area to store the generation mask. This (currently) does not require any additional memory since we have some free space in there. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:35 +01:00
Patrick McHardy	ea4bd995b0	netfilter: nf_tables: add transaction helper functions Add some helper functions for building the genmask as preparation for set transactions. Also add a little documentation how this stuff actually works. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:35 +01:00
Patrick McHardy	b2832dd662	netfilter: nf_tables: return set extensions from ->lookup() Return the extension area from the ->lookup() function to allow to consolidate common actions. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:34 +01:00
Patrick McHardy	61edafbb47	netfilter: nf_tables: consolide set element destruction With the conversion to set extensions, it is now possible to consolidate the different set element destruction functions. The set implementations' ->remove() functions are changed to only take the element out of their internal data structures. Elements will be freed in a batched fashion after the global transaction's completion RCU grace period. This reduces the amount of grace periods required for nft_hash from N to zero additional ones, additionally this guarantees that the set elements' extensions of all implementations can be used under RCU protection. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-26 11:09:34 +01:00
Clément Perrochaud	25af01ed18	NFC: nci: Add firmware download support A simple forward for firmware download (i.e. sending a new firmware to the NFC adapter) from the NFC subsystem to the drivers. This feature is required to update the firmware of NXP-NCI NFC controllers but can be used by any NCI driver. This feature has been present in the HCI subsystem since 9a695d. Signed-off-by: Clément Perrochaud <clement.perrochaud@effinnov.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2015-03-26 10:56:20 +01:00
Arman Uguray	fdf51784cd	Bluetooth: Unify advertising data code paths This patch simplifies the code paths for assembling the advertising data used by advertising instances 0 and 1. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	089fa8c09e	Bluetooth: Update supported_flags for AD features This patch updates the "supported_flags" parameter returned from the "Read Advertising Features" command. Add Advertising will now return an error if an unsupported flag is provided. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	5507e35811	Bluetooth: Support the "tx-power" adv flag This patch adds support for the "tx-power" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	67e0c0cd8f	Bluetooth: Support the "managed-flags" adv flag This patch adds support for the "managed-flags" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	807ec772bf	Bluetooth: Support the "limited-discoverable" adv flag This patch adds support for the "limited-discoverable" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:29 +01:00
Arman Uguray	b44133ff03	Bluetooth: Support the "discoverable" adv flag This patch adds support for the "discoverable" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:28 +01:00
Arman Uguray	e7a685d316	Bluetooth: Support the "connectable mode" adv flag This patch adds support for the "connectable mode" flag of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-26 03:30:28 +01:00
Marcel Holtmann	08dc0e987e	Bluetooth: Fix minor typo in comment for static address setting Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-25 19:09:45 -07:00
Christoph Hellwig	e2e40f2c1e	fs: move struct kiocb to fs.h struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-25 20:28:11 -04:00
Hannes Frederic Sowa	5a352dd0a3	ipv6: hash net ptr into fragmentation bucket selection As namespaces are sometimes used with overlapping ip address ranges, we should also use the namespace as input to the hash to select the ip fragmentation counter bucket. Cc: Eric Dumazet <edumazet@google.com> Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:07:04 -04:00
Hannes Frederic Sowa	b6a7719aed	ipv4: hash net ptr into fragmentation bucket selection As namespaces are sometimes used with overlapping ip address ranges, we should also use the namespace as input to the hash to select the ip fragmentation counter bucket. Cc: Eric Dumazet <edumazet@google.com> Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:07:04 -04:00
Jon Paul Maloy	8b4ed8634f	tipc: eliminate race condition at dual link establishment Despite recent improvements, the establishment of dual parallel links still has a small glitch where messages can bypass each other. When the second link in a dual-link configuration is established, part of the first link's traffic will be steered over to the new link. Although we do have a mechanism to ensure that packets sent before and after the establishment of the new link arrive in sequence to the destination node, this is not enough. The arriving messages will still be delivered upwards in different threads, something entailing a risk of message disordering during the transition phase. To fix this, we introduce a synchronization mechanism between the two parallel links, so that traffic arriving on the new link cannot be added to its input queue until we are guaranteed that all pre-establishment messages have been delivered on the old, parallel link. This problem seems to always have been around, but its occurrence is so rare that it has not been noticed until recent intensive testing. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Jon Paul Maloy	3127a0200d	tipc: clean up handling of link congestion After the recent changes in message importance handling it becomes possible to simplify handling of messages and sockets when we encounter link congestion. We merge the function tipc_link_cong() into link_schedule_user(), and simplify the code of the latter. The code should now be easier to follow, especially regarding return codes and handling of the message that caused the situation. In case the scheduling function is unable to pre-allocate a wakeup message buffer, it now returns -ENOBUFS, which is a more correct code than the previously used -EHOSTUNREACH. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Jon Paul Maloy	1f66d161ab	tipc: introduce starvation free send algorithm Currently, we only use a single counter; the length of the backlog queue, to determine whether a message should be accepted to the queue or not. Each time a message is being sent, the queue length is compared to a threshold value for the message's importance priority. If the queue length is beyond this threshold, the message is rejected. This algorithm implies a risk of starvation of low importance senders during very high load, because it may take a long time before the backlog queue has decreased enough to accept a lower level message. We now eliminate this risk by introducing a counter for each importance priority. When a message is sent, we check only the queue level for that particular message's priority. If that is ok, the message can be added to the backlog, irrespective of the queue level for other priorities. This way, each level is guaranteed a certain portion of the total bandwidth, and any risk of starvation is eliminated. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Guenter Roeck	b06b107a4c	net: dsa: Handle non-bridge master change Master change notifications may occur other than when joining or leaving a bridge, for example when being added to or removed from a bond or Open vSwitch. In that case, do nothing instead of asking the switch driver to remove a port from a bridge that it didn't join. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:04:57 -04:00
Patrick McHardy	fe2811ebeb	netfilter: nf_tables: convert hash and rbtree to set extensions The set implementations' private struct will only contain the elements needed to maintain the search structure, all other elements are moved to the set extensions. Element allocation and initialization is performed centrally by nf_tables_api instead of by the different set implementations' ->insert() functions. A new "elemsize" member in the set ops specifies the amount of memory to reserve for internal usage. Destruction will also be moved out of the set implementations by a following patch. Except for element allocation, the patch is a simple conversion to using data from the extension area. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:35 +01:00
Patrick McHardy	3ac4c07a24	netfilter: nf_tables: add set extensions Add simple set extension infrastructure for maintaining variable sized and optional per element data. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:34 +01:00
Patrick McHardy	bfd6e327e1	netfilter: nft_hash: convert to use rhashtable callbacks A following patch will convert sets to use so called set extensions, where the key is not located in a fixed position anymore. This will require rhashtable hashing and comparison callbacks to be used. As preparation, convert nft_hash to use these callbacks without any functional changes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:34 +01:00
Patrick McHardy	45d84751fb	netfilter: nft_hash: indent rhashtable parameters Improve readability by indenting the parameter initialization. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:34 +01:00
Patrick McHardy	745f5450d5	netfilter: nft_hash: restore struct nft_hash Following patches will add new private members, restore struct nft_hash as preparation. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:33 +01:00
Patrick McHardy	49f7b33e63	rhashtable: provide len to obj_hashfn nftables sets will be converted to use so called setextensions, moving the key to a non-fixed position. To hash it, the obj_hashfn must be used, however it so far doesn't receive the length parameter. Pass the key length to obj_hashfn() and convert existing users. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 17:18:33 +01:00
Ying Xue	bc14b8d6a9	tipc: fix a link reset issue due to retransmission failures When a node joins a cluster while we are transmitting a fragment stream over the broadcast link, it's missing the preceding fragments needed to build a meaningful message. As a result, the node has to drop it. However, as the fragment message is not acknowledged to its sender before it's dropped, it accidentally causes link reset of retransmission failure on the node. Reported-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:43:32 -04:00
D.S. Ljungmark	6fd99094de	ipv6: Don't reduce hop limit for an interface A local route may have a lower hop_limit set than global routes do. RFC 3756, Section 4.2.7, "Parameter Spoofing" > 1. The attacker includes a Current Hop Limit of one or another small > number which the attacker knows will cause legitimate packets to > be dropped before they reach their destination. > As an example, one possible approach to mitigate this threat is to > ignore very small hop limits. The nodes could implement a > configurable minimum hop limit, and ignore attempts to set it below > said limit. Signed-off-by: D.S. Ljungmark <ljungmark@modio.se> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:41:08 -04:00
Ying Xue	7e3ea6d5c4	sctp: avoid to repeatedly declare external variables Move the declaration for external variables to sctp.h file avoiding to repeatedly declare them with extern keyword. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:40:16 -04:00
Patrick McHardy	14d14a5d29	netfilter: nft_meta: use raw_smp_processor_id() Using smp_processor_id() triggers warnings with PREEMPT_RCU. There is no point in disabling preemption since we only collect the numeric value, so use raw_smp_processor_id() instead. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:40 +01:00
Patrick McHardy	d95797252a	netfilter: nf_tables: nft_queue does not depend on x_tables Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:39 +01:00
Pablo Neira Ayuso	fce1528ef6	netfilter: nf_tables: restore nf_log_trace() in nf_tables_core.c As described by `4017a7e` ("netfilter: restore rule tracing via nfnetlink_log"), this accidentally slipped through during conflict resolution in `d5c1d8c`. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:39 +01:00
Joe Perches	a81b2ce850	netfilter: Use LOGLEVEL_<FOO> defines Use the #defines where appropriate. Miscellanea: Add explicit #include <linux/kernel.h> where it was not previously used so that these #defines are a bit more explicitly defined instead of indirectly included via: module.h->moduleparam.h->kernel.h Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:39 +01:00
Patrick McHardy	5ebb335dcb	netfilter: nf_tables: move struct net pointer to base chain The network namespace is only needed for base chains to get at the gencursor. Also convert to possible_net_t. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-25 12:09:38 +01:00
Eric Dumazet	0144a81ccc	tcp: fix ipv4 mapped request socks ss should display ipv4 mapped request sockets like this : tcp SYN-RECV 0 0 ::ffff:192.168.0.1:8080 ::ffff:192.0.2.1:35261 and not like this : tcp SYN-RECV 0 0 192.168.0.1:8080 192.0.2.1:35261 We should init ireq->ireq_family based on listener sk_family, not the actual protocol carried by SYN packet. This means we can set ireq_family in inet_reqsk_alloc() Fixes: `3f66b083a5` ("inet: introduce ireq_family") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 00:57:48 -04:00
Marcel Holtmann	99c679acce	Bluetooth: Filter list of supported commands/events for untrusted users When the user of the management interface is not trusted, then it only has access to a limited set of commands and events. When providing the list of supported commands and events take the trusted vs untrusted status of the user into account and return different lists. This way the untrusted user knows exactly which commands it can execute and which events it can receive. So no guesswork needed. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-24 18:37:42 -07:00
Eric Dumazet	fd3a154a00	tcp: md5: get rid of tcp_v[46]_reqsk_md5_lookup() With request socks convergence, we no longer need different lookup methods. A request socket can use generic lookup function. Add const qualifier to 2nd tcp_v[46]_md5_lookup() parameter. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:30 -04:00
Eric Dumazet	39f8e58e53	tcp: md5: remove request sock argument of calc_md5_hash() Since request and established sockets now have same base, there is no need to pass two pointers to tcp_v4_md5_hash_skb() or tcp_v6_md5_hash_skb() Also add a const qualifier to their struct tcp_md5sig_key argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:30 -04:00
Eric Dumazet	ff74e23f7e	tcp: md5: input path is run under rcu protected sections It is guaranteed that both tcp_v4_rcv() and tcp_v6_rcv() run from rcu read locked sections : ip_local_deliver_finish() and ip6_input_finish() both use rcu_read_lock() Also align tcp_v6_inbound_md5_hash() on tcp_v4_inbound_md5_hash() by returning a boolean. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Eric Dumazet	0980c1e308	tcp: use C99 initializers in new_state[] Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Eric Dumazet	80f03e27a3	tcp: md5: fix rcu lockdep splat While timer handler effectively runs a rcu read locked section, there is no explicit rcu_read_lock()/rcu_read_unlock() annotations and lockdep can be confused here : net/ipv4/tcp_ipv4.c-906- /* caller either holds rcu_read_lock() or socket lock */ net/ipv4/tcp_ipv4.c:907: md5sig = rcu_dereference_check(tp->md5sig_info, net/ipv4/tcp_ipv4.c-908- sock_owned_by_user(sk) \|\| net/ipv4/tcp_ipv4.c-909- lockdep_is_held(&sk->sk_lock.slock)); Let's explicitely acquire rcu_read_lock() in tcp_make_synack() Before commit `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer"), we were holding listener lock so lockdep was happy. Fixes: `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric DUmazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Thomas Graf	6b6f302ced	rhashtable: Add rhashtable_free_and_destroy() rhashtable_destroy() variant which stops rehashes, iterates over the table and calls a callback to release resources. Avoids need for nft_hash to embed rhashtable internals and allows to get rid of the being_destroyed flag. It also saves a 2nd mutex lock upon destruction. Also fixes an RCU lockdep splash on nft set destruction due to calling rht_for_each_entry_safe() without holding bucket locks. Open code this loop as we need know that no mutations may occur in parallel. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:40 -04:00
Thomas Graf	b5e2c150ac	rhashtable: Disable automatic shrinking by default Introduce a new bool automatic_shrinking to require the user to explicitly opt-in to automatic shrinking of tables. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:40 -04:00
Michal Sekletar	27cd545247	filter: introduce SKF_AD_VLAN_TPID BPF extension If vlan offloading takes place then vlan header is removed from frame and its contents, both vlan_tci and vlan_proto, is available to user space via TPACKET interface. However, only vlan_tci can be used in BPF filters. This commit introduces a new BPF extension. It makes possible to load the value of vlan_proto (vlan TPID) to register A. Support for classic BPF and eBPF is being added, analogous to skb->protocol. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Michal Sekletar <msekleta@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:25:15 -04:00
Hannes Frederic Sowa	ff40217e73	ipv6: fix sparse warnings in privacy stable addresses generation Those warnings reported by sparse endianness check (via kbuild test robot) are harmless, nevertheless fix them up and make the code a little bit easier to read. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `622c81d57b` ("ipv6: generation of stable privacy addresses for link-local and autoconf") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:21:35 -04:00
Ying Xue	ed3e852aa5	tipc: fix compile error when IPV6=m and TIPC=y When IPV6=m and TIPC=y, below error will appear during building kernel image: net/tipc/udp_media.c:196: undefined reference to `ip6_dst_lookup' make: *** [vmlinux] Error 1 As ip6_dst_lookup() is implemented in IPV6 and IPV6 is compiled as module, ip6_dst_lookup() is not built-in core kernel image. As a result, compiler cannot find 'ip6_dst_lookup' reference while compiling TIPC code into core kernel image. But with the method introduced by commit `5f81bd2e5d` ("ipv6: export a stub for IPv6 symbols used by vxlan"), we can avoid the compile error through "ipv6_stub" pointer to access ip6_dst_lookup(). Fixes: `d0f91938be` ("tipc: add ip/udp media type") Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:20:29 -04:00
WANG Cong	66400d5430	net: allow to delete a whole device group With dev group, we can change a batch of net devices, so we should allow to delete them together too. Group 0 is not allowed to be deleted since it is the default group. Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:00:01 -04:00
WANG Cong	d079535d5e	net: use for_each_netdev_safe() in rtnl_group_changelink() In case we move the whole dev group to another netns, we should call for_each_netdev_safe(), otherwise we get a soft lockup: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ip:798] irq event stamp: 255424 hardirqs last enabled at (255423): [<ffffffff81a2aa95>] restore_args+0x0/0x30 hardirqs last disabled at (255424): [<ffffffff81a2ad5a>] apic_timer_interrupt+0x6a/0x80 softirqs last enabled at (255422): [<ffffffff81079ebc>] __do_softirq+0x2c1/0x3a9 softirqs last disabled at (255417): [<ffffffff8107a190>] irq_exit+0x41/0x95 CPU: 0 PID: 798 Comm: ip Not tainted 4.0.0-rc4+ #881 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800d1b88000 ti: ffff880119530000 task.ti: ffff880119530000 RIP: 0010:[<ffffffff810cad11>] [<ffffffff810cad11>] debug_lockdep_rcu_enabled+0x28/0x30 RSP: 0018:ffff880119533778 EFLAGS: 00000246 RAX: ffff8800d1b88000 RBX: 0000000000000002 RCX: 0000000000000038 RDX: 0000000000000000 RSI: ffff8800d1b888c8 RDI: ffff8800d1b888c8 RBP: ffff880119533778 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 000000000000b5c2 R12: 0000000000000246 R13: ffff880119533708 R14: 00000000001d5a40 R15: ffff88011a7d5a40 FS: 00007fc01315f740(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f367a120988 CR3: 000000011849c000 CR4: 00000000000007f0 Stack: ffff880119533798 ffffffff811ac868 ffffffff811ac831 ffffffff811ac828 ffff8801195337c8 ffffffff811ac8c9 ffff8801195339b0 ffff8801197633e0 0000000000000000 ffff8801195339b0 ffff8801195337d8 ffffffff811ad2d7 Call Trace: [<ffffffff811ac868>] rcu_read_lock+0x37/0x6e [<ffffffff811ac831>] ? rcu_read_unlock+0x5f/0x5f [<ffffffff811ac828>] ? rcu_read_unlock+0x56/0x5f [<ffffffff811ac8c9>] __fget+0x2a/0x7a [<ffffffff811ad2d7>] fget+0x13/0x15 [<ffffffff811be732>] proc_ns_fget+0xe/0x38 [<ffffffff817c7714>] get_net_ns_by_fd+0x11/0x59 [<ffffffff817df359>] rtnl_link_get_net+0x33/0x3e [<ffffffff817df3d7>] do_setlink+0x73/0x87b [<ffffffff810b28ce>] ? trace_hardirqs_off+0xd/0xf [<ffffffff81a2aa95>] ? retint_restore_args+0xe/0xe [<ffffffff817e0301>] rtnl_newlink+0x40c/0x699 [<ffffffff817dffe0>] ? rtnl_newlink+0xeb/0x699 [<ffffffff81a29246>] ? _raw_spin_unlock+0x28/0x33 [<ffffffff8143ed1e>] ? security_capable+0x18/0x1a [<ffffffff8107da51>] ? ns_capable+0x4d/0x65 [<ffffffff817de5ce>] rtnetlink_rcv_msg+0x181/0x194 [<ffffffff817de407>] ? rtnl_lock+0x17/0x19 [<ffffffff817de407>] ? rtnl_lock+0x17/0x19 [<ffffffff817de44d>] ? __rtnl_unlock+0x17/0x17 [<ffffffff818327c6>] netlink_rcv_skb+0x4d/0x93 [<ffffffff817de42f>] rtnetlink_rcv+0x26/0x2d [<ffffffff81830f18>] netlink_unicast+0xcb/0x150 [<ffffffff8183198e>] netlink_sendmsg+0x501/0x523 [<ffffffff8115cba9>] ? might_fault+0x59/0xa9 [<ffffffff817b5398>] ? copy_from_user+0x2a/0x2c [<ffffffff817b7b74>] sock_sendmsg+0x34/0x3c [<ffffffff817b7f6d>] ___sys_sendmsg+0x1b8/0x255 [<ffffffff8115c5eb>] ? handle_pte_fault+0xbd5/0xd4a [<ffffffff8100a2b0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109e94b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109eb9c>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cadbf>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811ac1d8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac946>] ? __fget_light+0x2d/0x52 [<ffffffff817b8adc>] __sys_sendmsg+0x42/0x60 [<ffffffff817b8b0c>] SyS_sendmsg+0x12/0x1c [<ffffffff81a29e32>] system_call_fastpath+0x12/0x17 Fixes: `e7ed828f10` ("netlink: support setting devgroup parameters") Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 13:02:32 -04:00
Sasha Levin	610600c8c5	tipc: validate length of sockaddr in connect() for dgram/rdm Commit `f2f8036` ("tipc: add support for connect() on dgram/rdm sockets") hasn't validated user input length for the sockaddr structure which allows a user to overwrite kernel memory with arbitrary input. Fixes: `f2f8036` ("tipc: add support for connect() on dgram/rdm sockets") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 12:50:39 -04:00
Michal Kubeček	d0c294c53a	tcp: prevent fetching dst twice in early demux code On s390x, gcc 4.8 compiles this part of tcp_v6_early_demux() struct dst_entry *dst = sk->sk_rx_dst; if (dst) dst = dst_check(dst, inet6_sk(sk)->rx_dst_cookie); to code reading sk->sk_rx_dst twice, once for the test and once for the argument of ip6_dst_check() (dst_check() is inline). This allows ip6_dst_check() to be called with null first argument, causing a crash. Protect sk->sk_rx_dst access by READ_ONCE() both in IPv4 and IPv6 TCP early demux code. Fixes: `41063e9dd1` ("ipv4: Early TCP socket demux.") Fixes: `c7109986db` ("ipv6: Early TCP socket demux") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:38:24 -04:00
David S. Miller	d5c1d8c567	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/netfilter/nf_tables_core.c The nf_tables_core.c conflict was resolved using a conflict resolution from Stephen Rothwell as a guide. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:22:43 -04:00
Hannes Frederic Sowa	1855b7c3e8	ipv6: introduce idgen_delay and idgen_retries knobs This is specified by RFC 7217. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	5f40ef77ad	ipv6: do retries on stable privacy addresses If a DAD conflict is detected, we want to retry privacy stable address generation up to idgen_retries (= 3) times with a delay of idgen_delay (= 1 second). Add the logic to addrconf_dad_failure. By design, we don't clean up dad failed permanent addresses. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	8e8e676d0b	ipv6: collapse state_lock and lock Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	64236f3f3d	ipv6: introduce IFA_F_STABLE_PRIVACY flag We need to mark appropriate addresses so we can do retries in case their DAD failed. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa	622c81d57b	ipv6: generation of stable privacy addresses for link-local and autoconf This patch implements the stable privacy address generation for link-local and autoconf addresses as specified in RFC7217. RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key) is the RID (random identifier). As the hash function F we chose one round of sha1. Prefix will be either the link-local prefix or the router advertised one. As Net_Iface we use the MAC address of the device. DAD_Counter and secret_key are implemented as specified. We don't use Network_ID, as it couples the code too closely to other subsystems. It is specified as optional in the RFC. As Net_Iface we only use the MAC address: we simply have no stable identifier in the kernel we could possibly use: because this code might run very early, we cannot depend on names, as they might be changed by user space early on during the boot process. A new address generation mode is introduced, IN6_ADDR_GEN_MODE_STABLE_PRIVACY. With iproute2 one can switch back to none or eui64 address configuration mode although the stable_secret is already set. We refuse writes to ipv6/conf/all/stable_secret but only allow ipv6/conf/default/stable_secret and the interface specific file to be written to. The default stable_secret is used as the parameter for the namespace, the interface specific can overwrite the secret, e.g. when switching a network configuration from one system to another while inheriting the secret. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:08 -04:00
Hannes Frederic Sowa	3d1bec9932	ipv6: introduce secret_stable to ipv6_devconf This patch implements the procfs logic for the stable_address knob: The secret is formatted as an ipv6 address and will be stored per interface and per namespace. We track initialized flag and return EIO errors until the secret is set. We don't inherit the secret to newly created namespaces. Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:12:08 -04:00
Herbert Xu	6d02294981	tipc: Use default rhashtable hashfn This patch removes the explicit jhash value for the hashfn parameter of rhashtable. The default is now jhash so removing the setting makes no difference apart from making one less copy of jhash in the kernel. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:51 -04:00
Herbert Xu	11b58ba146	netlink: Use default rhashtable hashfn This patch removes the explicit jhash value for the hashfn parameter of rhashtable. As the key length is a multiple of 4, this means that we will actually end up using jhash2. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:07:51 -04:00
David S. Miller	40451fd013	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next. Basically, more incremental updates for br_netfilter from Florian Westphal, small nf_tables updates (including one fix for rb-tree locking) and small two-liner to add extra validation for the REJECT6 target. More specifically, they are: 1) Use the conntrack status flags from br_netfilter to know that DNAT is happening. Patch for Florian Westphal. 2) nf_bridge->physoutdev == NULL already indicates that the traffic is bridged, so let's get rid of the BRNF_BRIDGED flag. Also from Florian. 3) Another patch to prepare voidization of seq_printf/seq_puts/seq_putc, from Joe Perches. 4) Consolidation of nf_tables_newtable() error path. 5) Kill nf_bridge_pad used by br_netfilter from ip_fragment(), from Florian Westphal. 6) Access rb-tree root node inside the lock and remove unnecessary locking from the get path (we already hold nfnl_lock there), from Patrick McHardy. 7) You cannot use a NFT_SET_ELEM_INTERVAL_END when the set doesn't support interval, also from Patrick. 8) Enforce IP6T_F_PROTO from ip6t_REJECT to make sure the core is actually restricting matches to TCP. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:02:46 -04:00
Alexander Drozdov	682f048bd4	af_packet: pass checksum validation status to the user Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the af_packet user that at least the transport header checksum has been already validated. For now, the flag may be set for incoming packets only. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:01:28 -04:00
Alexander Drozdov	68c2e5de36	af_packet: make tpacket_rcv to not set status value before run_filter It is just an optimization. We don't need the value of status variable if the packet is filtered. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 22:00:36 -04:00
Fan Du	c69736696c	inet: fix double request socket freeing Eric Hugne reported following error : I'm hitting this warning on latest net-next when i try to SSH into a machine with eth0 added to a bridge (but i think the problem is older than that) Steps to reproduce: node2 ~ # brctl addif br0 eth0 [ 223.758785] device eth0 entered promiscuous mode node2 ~ # ip link set br0 up [ 244.503614] br0: port 1(eth0) entered forwarding state [ 244.505108] br0: port 1(eth0) entered forwarding state node2 ~ # [ 251.160159] ------------[ cut here ]------------ [ 251.160831] WARNING: CPU: 0 PID: 3 at include/net/request_sock.h:102 tcp_v4_err+0x6b1/0x720() [ 251.162077] Modules linked in: [ 251.162496] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.0.0-rc3+ #18 [ 251.163334] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 251.164078] ffffffff81a8365c ffff880038a6ba18 ffffffff8162ace4 0000000000009898 [ 251.165084] 0000000000000000 ffff880038a6ba58 ffffffff8104da85 ffff88003fa437c0 [ 251.166195] ffff88003fa437c0 ffff88003fa74e00 ffff88003fa43bb8 ffff88003fad99a0 [ 251.167203] Call Trace: [ 251.167533] [<ffffffff8162ace4>] dump_stack+0x45/0x57 [ 251.168206] [<ffffffff8104da85>] warn_slowpath_common+0x85/0xc0 [ 251.169239] [<ffffffff8104db65>] warn_slowpath_null+0x15/0x20 [ 251.170271] [<ffffffff81559d51>] tcp_v4_err+0x6b1/0x720 [ 251.171408] [<ffffffff81630d03>] ? _raw_read_lock_irq+0x3/0x10 [ 251.172589] [<ffffffff81534e20>] ? inet_del_offload+0x40/0x40 [ 251.173366] [<ffffffff81569295>] icmp_socket_deliver+0x65/0xb0 [ 251.174134] [<ffffffff815693a2>] icmp_unreach+0xc2/0x280 [ 251.174820] [<ffffffff8156a82d>] icmp_rcv+0x2bd/0x3a0 [ 251.175473] [<ffffffff81534ea2>] ip_local_deliver_finish+0x82/0x1e0 [ 251.176282] [<ffffffff815354d8>] ip_local_deliver+0x88/0x90 [ 251.177004] [<ffffffff815350f0>] ip_rcv_finish+0xf0/0x310 [ 251.177693] [<ffffffff815357bc>] ip_rcv+0x2dc/0x390 [ 251.178336] [<ffffffff814f5da3>] __netif_receive_skb_core+0x713/0xa20 [ 251.179170] [<ffffffff814f7fca>] __netif_receive_skb+0x1a/0x80 [ 251.179922] [<ffffffff814f97d4>] process_backlog+0x94/0x120 [ 251.180639] [<ffffffff814f9612>] net_rx_action+0x1e2/0x310 [ 251.181356] [<ffffffff81051267>] __do_softirq+0xa7/0x290 [ 251.182046] [<ffffffff81051469>] run_ksoftirqd+0x19/0x30 [ 251.182726] [<ffffffff8106cc23>] smpboot_thread_fn+0x153/0x1d0 [ 251.183485] [<ffffffff8106cad0>] ? SyS_setgroups+0x130/0x130 [ 251.184228] [<ffffffff8106935e>] kthread+0xee/0x110 [ 251.184871] [<ffffffff81069270>] ? kthread_create_on_node+0x1b0/0x1b0 [ 251.185690] [<ffffffff81631108>] ret_from_fork+0x58/0x90 [ 251.186385] [<ffffffff81069270>] ? kthread_create_on_node+0x1b0/0x1b0 [ 251.187216] ---[ end trace c947fc7b24e42ea1 ]--- [ 259.542268] br0: port 1(eth0) entered forwarding state Remove the double calls to reqsk_put() [edumazet] : I got confused because reqsk_timer_handler() _has_ to call reqsk_put(req) after calling inet_csk_reqsk_queue_drop(), as the timer handler holds a reference on req. Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Erik Hugne <erik.hugne@ericsson.com> Fixes: `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 21:40:48 -04:00
Arman Uguray	912098a630	Bluetooth: Add support for adv instance timeout This patch implements support for the timeout parameter of the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:47 +01:00
Arman Uguray	4117ed70a5	Bluetooth: Add support for instance scan response This patch implements setting the Scan Response data provided as part of an advertising instance through the Add Advertising command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:47 +01:00
Arman Uguray	da929335f2	Bluetooth: Implement the Remove Advertising command This patch implements the "Remove Advertising" mgmt command. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:47 +01:00
Arman Uguray	24b4f38fc9	Bluetooth: Implement the Add Advertising command This patch adds the most basic implementation for the "Add Advertisement" command. All state updates between the various HCI settings (POWERED, ADVERTISING, ADVERTISING_INSTANCE, and LE_ENABLED) has been implemented. The command currently supports only setting the advertising data fields, with no flags and no scan response data. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:46 +01:00
Arman Uguray	203fea0178	Bluetooth: Add data structure for advertising instance This patch introduces a new data structure to represent advertising instances that were added using the "Add Advertising" mgmt command. Initially an hci_dev structure will support only one of these instances at a time, so the current instance is simply stored as a direct member of hci_dev. Signed-off-by: Arman Uguray <armansito@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-24 01:53:46 +01:00
Alexander Duyck	b6f15f828d	fib_trie: Fix regression in handling of inflate/halve failure When I updated the code to address a possible null pointer dereference in resize I ended up reverting an exception handling fix for the suffix length in the event that inflate or halve failed. This change is meant to correct that by reverting the earlier fix and instead simply getting the parent again after inflate has been completed to avoid the possible null pointer issue. Fixes: `ddb4b9a13` ("fib_trie: Address possible NULL pointer dereference in resize") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:58:32 -04:00
YOSHIFUJI Hideaki/吉藤英明	443b5991a7	net: Move the comment about unsettable socket-level options to default clause and update its reference. We implement the SO_SNDLOWAT etc not to be settable and return ENOPROTOOPT per 1003.1g 7. Move the comment to appropriate position and update the reference. Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:54:34 -04:00
Eric Dumazet	52036a4305	ipv6: dccp: handle ICMP messages on DCCP_NEW_SYN_RECV request sockets dccp_v6_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	85645bab57	ipv4: dccp: handle ICMP messages on DCCP_NEW_SYN_RECV request sockets dccp_v4_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. New dccp_req_err() helper is exported so that we can use it in IPv6 in following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	2215089b22	ipv6: tcp: handle ICMP messages on TCP_NEW_SYN_RECV request sockets tcp_v6_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	26e3736090	ipv4: tcp: handle ICMP messages on TCP_NEW_SYN_RECV request sockets tcp_v4_err() can restrict lookups to ehash table, and not to listeners. Note this patch creates the infrastructure, but this means that ICMP messages for request sockets are ignored until complete conversion. New tcp_req_err() helper is exported so that we can use it in IPv6 in following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	b282705336	net: convert syn_wait_lock to a spinlock This is a low hanging fruit, as we'll get rid of syn_wait_lock eventually. We hold syn_wait_lock for such small sections, that it makes no sense to use a read/write lock. A spin lock is simply faster. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	8b929ab12f	inet: remove some sk_listener dependencies listener can be source of false sharing. request sock has some useful information like : ireq->ir_iif, ireq->ir_num, ireq->ireq_net This patch does not solve the major problem of having to read sk->sk_protocol which is sharing a cache line with sk->sk_wmem_alloc. (This same field is read later in ip_build_and_send_pkt()) One idea would be to move sk_protocol close to sk_family (using 8 bits instead of 16 for sk_family seems enough) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:26 -04:00
Eric Dumazet	42cb80a235	inet: remove sk_listener parameter from syn_ack_timeout() It is not needed, and req->sk_listener points to the listener anyway. request_sock argument can be const. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:25 -04:00
Eric Dumazet	2b41fab70f	inet: cache listen_sock_qlen() and read rskq_defer_accept once Cache listen_sock_qlen() to limit false sharing, and read rskq_defer_accept once as it might change under us. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:52:25 -04:00
Roopa Prabhu	558d51fa2f	switchdev: fix stp update API to work with layered netdevices make it same as the netdev_switch_port_bridge_setlink/dellink api (ie traverse lowerdevs to get to the switch port). removes "WARN_ON(!ops->ndo_switch_parent_id_get)" because direct bridge ports can be stacked netdevices (like bonds and team of switch ports) which may not implement this ndo. v2 to v3: - remove changes to bond and team. Bring back the transparently following lowerdevs like i initially had for setlink/getlink (http://www.spinics.net/lists/netdev/msg313436.html) dave and scott feldman also seem to prefer it be that way and move to non-transparent way of doing things if we see a problem down the lane. v3 to v4: - fix ret initialization v4 to v5: - return err on first failure (scott feldman) v5 to v6: - change variable name (err) and initialize to -EOPNOTSUPP (scott feldman). Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:44:56 -04:00
WANG Cong	08b4b8ea79	net: clear skb->priority when forwarding to another netns skb->priority can be set for two purposes: 1) With respect to IP TOS field, which is computed by a mask. Ususally used for priority qdisc's (pfifo, prio etc.), on TX side (we only have ingress qdisc on RX side). 2) Used as a classid or flowid, works in the same way with tc classid. What's more, this can even override the classid of tc filters. For case 1), it has been respected within its netns, I don't see any point of keeping it for another netns, especially when packets will be forwarded to Rx path (no matter from TX path or RX path). For case 2) we care, our applications run inside a netns, and we classify the packets by our own filters outside, If some application sets this priority, it could bypass our filters, therefore clear it when moving out of a netns, it makes no sense to bypass tc filters out of its netns. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:43:08 -04:00
tadeusz.struk@intel.com	0345f93138	net: socket: add support for async operations Add support for async operations. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-23 16:41:36 -04:00
David S. Miller	c0e41fa76c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix missing initialization of tuple structure in nfnetlink_cthelper to avoid mismatches when looking up to attach userspace helpers to flows, from Ian Wilson. 2) Fix potential crash in nft_hash when we hit -EAGAIN in nft_hash_walk(), from Herbert Xu. 3) We don't need to indicate the hook information to update the basechain default policy in nf_tables. 4) Restore tracing over nfnetlink_log due to recent rework to accomodate logging infrastructure into nf_tables. 5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY. 6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and REJECT6 from xt over nftables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-22 16:57:07 -04:00
Pablo Neira Ayuso	e35158e401	netfilter: ip6t_REJECT: check for IP6T_F_PROTO Make sure IP6T_F_PROTO is set to enforce layer 4 protocol matching from the ip6_tables core. Suggested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 20:02:46 +01:00
Patrick McHardy	55df35d22f	netfilter: nf_tables: reject NFT_SET_ELEM_INTERVAL_END flag for non-interval sets Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 19:50:35 +01:00
Patrick McHardy	16c45eda96	netfilter: nft_rbtree: fix locking Fix a race condition and unnecessary locking: * the root rb_node must only be accessed under the lock in nft_rbtree_lookup() * the lock is not needed in lookup functions in netlink context Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 19:49:09 +01:00
Florian Westphal	8d0451638a	netfilter: bridge: kill nf_bridge_pad The br_netfilter frag output function calls skb_cow_head() so in case it needs a larger headroom to e.g. re-add a previously stripped PPPOE or VLAN header things will still work (at cost of reallocation). We can then move nf_bridge_encap_header_len to br_netfilter. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-22 19:45:55 +01:00
Pablo Neira Ayuso	749177ccc7	netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set ip6tables extensions check for this flag to restrict match/target to a given protocol. Without this flag set, SYNPROXY6 returns an error. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net>	2015-03-22 19:32:05 +01:00
Johan Hedberg	baf880a968	Bluetooth: Fix memory leak in le_scan_disable_work_complete() The hci_request in le_scan_disable_work_complete() was being initialized in a general context but only used in a specific branch in the function (when simultaneous discovery is not supported). This patch moves the usage to be limited to the branch where hci_req_run() is actually called. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-22 08:03:54 +01:00
Dominique Martinet	f569d3ef82	net/9p: add a privport option for RDMA transport. RDMA can use the same kind of weak security as TCP by checking the client can bind to a privileged port, which is better than nothing if TAUTH isn't implemented. Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-21 19:32:33 -07:00
Dominique Martinet	b99baa43e5	net/9p: Initialize opts->privport as it should be. We're currently using an uninitialized value if option privport is not set, thus (almost) always using a privileged port. Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-21 19:27:22 -07:00
Herbert Xu	8f2ddaac30	netlink: Remove netlink_compare_arg.trailer Instead of computing the offset from trailer, this patch computes netlink_compare_arg_len from the offset of portid and then adds 4 to it. This allows trailer to be removed. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-21 00:16:39 -04:00
YOSHIFUJI Hideaki/吉藤英明	8da86466b8	net: neighbour: Add mcast_resolicit to configure the number of multicast resolicitations in PROBE state. We send unicast neighbor (ARP or NDP) solicitations ucast_probes times in PROBE state. Zhu Yanjun reported that some implementation does not reply against them and the entry will become FAILED, which is undesirable. We had been dealt with such nodes by sending multicast probes mcast_ solicit times after unicast probes in PROBE state. In 2003, I made a change not to send them to improve compatibility with IPv6 NDP. Let's introduce per-protocol per-interface sysctl knob "mcast_ reprobe" to configure the number of multicast (re)solicitation for reconfirmation in PROBE state. The default is 0, since we have been doing so for 10+ years. Reported-by: Zhu Yanjun <Yanjun.Zhu@windriver.com> CC: Ulf Samuelsson <ulf.samuelsson@ericsson.com> Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 21:47:40 -04:00
Mathieu Olivari	c6f15070e7	net: dsa: make NET_DSA manually selectable from the config Change `bd76a11670` made all DSA drivers depend on NET_DSA rather than selecting them. However, as the only way to select this option was to actually select a driver, it made DSA impossible to enable at all. This patch adds an explicit entry which the user will have to enable prior selecting a driver. Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 21:37:58 -04:00
Eric Dumazet	d3593b5cef	Revert "selinux: add a skb_owned_by() hook" This reverts commit `ca10b9e9a8`. No longer needed after commit `eb8895debe` ("tcp: tcp_make_synack() should use sock_wmalloc") When under SYNFLOOD, we build lot of SYNACK and hit false sharing because of multiple modifications done on sk_listener->sk_wmem_alloc Since tcp_make_synack() uses sock_wmalloc(), there is no need to call skb_set_owner_w() again, as this adds two atomic operations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 21:36:53 -04:00
Daniel Borkmann	a8cb5f556b	act_bpf: add initial eBPF support for actions This work extends the "classic" BPF programmable tc action by extending its scope also to native eBPF code! Together with commit `e2e9b6541d` ("cls_bpf: add initial eBPF support for programmable classifiers") this adds the facility to implement fully flexible classifier and actions for tc that can be implemented in a C subset in user space, "safely" loaded into the kernel, and being run in native speed when JITed. Also, since eBPF maps can be shared between eBPF programs, it offers the possibility that cls_bpf and act_bpf can share data 1) between themselves and 2) between user space applications. That means that, f.e. customized runtime statistics can be collected in user space, but also more importantly classifier and action behaviour could be altered based on map input from the user space application. For the remaining details on the workflow and integration, see the cls_bpf commit `e2e9b6541d`. Preliminary iproute2 part can be found under [1]. [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf-act Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 19:10:44 -04:00
Daniel Borkmann	94caee8c31	ebpf: add sched_act_type and map it to sk_filter's verifier ops In order to prepare eBPF support for tc action, we need to add sched_act_type, so that the eBPF verifier is aware of what helper function act_bpf may use, that it can load skb data and read out currently available skb fields. This is bascially analogous to `96be4325f4` ("ebpf: add sched_cls_type and map it to sk_filter's verifier ops"). BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT need to be separate since both will have a different set of functionality in future (classifier vs action), thus we won't run into ABI troubles when the point in time comes to diverge functionality from the classifier. The future plan for act_bpf would be that it will be able to write into skb->data and alter selected fields mirrored in struct __sk_buff. For an initial support, it's sufficient to map it to sk_filter_ops. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 19:10:44 -04:00
David S. Miller	0fa74a4be4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the real actual conflict only existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easily to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 18:51:09 -04:00
Al Viro	4de930efc2	net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:38:06 -04:00
Catalin Marinas	91edd096e2	net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour Commit `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) introduced the clamping of msg_namelen when the unsigned value was larger than sizeof(struct sockaddr_storage). This caused a msg_namelen of -1 to be valid. The native code was subsequently fixed by commit `dbb490b965` (net: socket: error on a negative msg_namelen). In addition, the native code sets msg_namelen to 0 when msg_name is NULL. This was done in commit (`6a2a2b3ae0` net:socket: set msg_namelen to 0 if msg_name is passed as NULL in msghdr struct from userland) and subsequently updated by `08adb7dabd` (fold verify_iovec() into copy_msghdr_from_user()). This patch brings the get_compat_msghdr() in line with copy_msghdr_from_user(). Fixes: `db31c55a6f` (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:31:09 -04:00
Herbert Xu	6cca7289d5	tipc: Use inlined rhashtable interface This patch converts tipc to the inlined rhashtable interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	fa3773211e	netfilter: Convert nft_hash to inlined rhashtable This patch converts nft_hash to the inlined rhashtable interface. This patch also replaces the call to rhashtable_lookup_compare with a straight rhashtable_lookup_fast because it's simply doing a memcmp (in fact nft_hash_lookup already uses memcmp instead of nft_data_cmp). Furthermore, the compare function is only meant to compare, it is not supposed to have side-effects. The current side-effect code can simply be moved into the nft_hash_get. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Herbert Xu	c428ecd1a2	netlink: Move namespace into hash key Currently the name space is a de facto key because it has to match before we find an object in the hash table. However, it isn't in the hash value so all objects from different name spaces with the same port ID hash to the same bucket. This is bad as the number of name spaces is unbounded. This patch fixes this by using the namespace when doing the hash. Because the namespace field doesn't lie next to the portid field in the netlink socket, this patch switches over to the rhashtable interface without a fixed key. This patch also uses the new inlined rhashtable interface where possible. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 16:16:24 -04:00
Daniel Borkmann	0b8c707ddf	ebpf, filter: do not convert skb->protocol to host endianess during runtime Commit `c249739579` ("bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields") has added support for accessing protocol, vlan_present and vlan_tci into the skb offset map. As referenced in the below discussion, accessing skb->protocol from an eBPF program should be converted without handling endianess. The reason for this is that an eBPF program could simply do a check more naturally, by f.e. testing skb->protocol == htons(ETH_P_IP), where the LLVM compiler resolves htons() against a constant automatically during compilation time, as opposed to an otherwise needed run time conversion. After all, the way of programming both from a user perspective differs quite a lot, i.e. bpf_asm ["ld proto"] versus a C subset/LLVM. Reference: https://patchwork.ozlabs.org/patch/450819/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 15:24:26 -04:00
Johannes Berg	5d8325ecb9	cfg80211: add vlan to station add/change tracing This helps debug issues with VLAN modifications that are otherwise not really visible in any tracing/debugging. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 19:56:41 +01:00
Jakub Pawlowski	b55d1abf56	Bluetooth: Expose quirks through debugfs This patch expose controller quirks through debugfs. It would be useful for BlueZ tests using vhci. Currently there is no way to test quirk dependent behaviour. It might be also useful for manual testing. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-20 19:47:01 +01:00
Marcelo Ricardo Leitner	c4a6853d8f	ipv6: invert join/leave anycast rtnl/socket locking order Commit `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") missed to update two setsockopt options, IPV6_JOIN_ANYCAST and IPV6_LEAVE_ANYCAST, causing a lock inverstion regarding to the updated ones. As ipv6_sock_ac_join and ipv6_sock_ac_leave are only called from do_ipv6_setsockopt, we are good to just move the rtnl lock upper. Fixes: `baf606d9c9` ("ipv4,ipv6: grab rtnl before locking the socket") Reported-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:32:38 -04:00
Josh Hunt	d22e153718	tcp: fix tcp fin memory accounting tcp_send_fin() does not account for the memory it allocates properly, so sk_forward_alloc can be negative in cases where we've sent a FIN: ss example output (ss -amn \| grep -B1 f4294): tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080 skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 13:18:52 -04:00
Steven Barth	73ba57bfae	ipv6: fix backtracking for throw routes for throw routes to trigger evaluation of other policy rules EAGAIN needs to be propagated up to fib_rules_lookup similar to how its done for IPv4 A simple testcase for verification is: ip -6 rule add lookup 33333 priority 33333 ip -6 route add throw 2001:db8::1 ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333 ip route get 2001:db8::1 Signed-off-by: Steven Barth <cyrus@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:57:23 -04:00
Sabrina Dubroca	8e199dfd82	ipv6: call ipv6_proxy_select_ident instead of ipv6_select_ident in udp6_ufo_fragment Matt Grant reported frequent crashes in ipv6_select_ident when udp6_ufo_fragment is called from openvswitch on a skb that doesn't have a dst_entry set. ipv6_proxy_select_ident generates the frag_id without using the dst associated with the skb. This approach was suggested by Vladislav Yasevich. Fixes: `0508c07f5e` ("ipv6: Select fragment id during UFO segmentation if not set.") Cc: Vladislav Yasevich <vyasevic@redhat.com> Reported-by: Matt Grant <matt@mattgrant.net.nz> Tested-by: Matt Grant <matt@mattgrant.net.nz> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:56:11 -04:00
Eric Dumazet	becb74f0ac	net: increase sk_[max_]ack_backlog sk_ack_backlog & sk_max_ack_backlog were 16bit fields, meaning listen() backlog was limited to 65535. It is time to increase the width to allow much bigger backlog, if admins change /proc/sys/net/core/somaxconn & /proc/sys/net/ipv4/tcp_max_syn_backlog default values. Tested: echo 5000000 >/proc/sys/net/core/somaxconn echo 5000000 >/proc/sys/net/ipv4/tcp_max_syn_backlog Ran a SYNFLOOD test against a listener using listen(fd, 5000000) myhost~# grep request_sock_TCP /proc/slabinfo request_sock_TCP 4185642 4411940 304 13 1 : tunables 54 27 8 : slabdata 339380 339380 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	fa76ce7328	inet: get rid of central tcp/dccp listener timer One of the major issue for TCP is the SYNACK rtx handling, done by inet_csk_reqsk_queue_prune(), fired by the keepalive timer of a TCP_LISTEN socket. This function runs for awful long times, with socket lock held, meaning that other cpus needing this lock have to spin for hundred of ms. SYNACK are sent in huge bursts, likely to cause severe drops anyway. This model was OK 15 years ago when memory was very tight. We now can afford to have a timer per request sock. Timer invocations no longer need to lock the listener, and can be run from all cpus in parallel. With following patch increasing somaxconn width to 32 bits, I tested a listener with more than 4 million active request sockets, and a steady SYNFLOOD of ~200,000 SYN per second. Host was sending ~830,000 SYNACK per second. This is ~100 times more what we could achieve before this patch. Later, we will get rid of the listener hash and use ehash instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Eric Dumazet	52452c5425	inet: drop prev pointer handling in request sock When request sock are put in ehash table, the whole notion of having a previous request to update dl_next is pointless. Also, following patch will get rid of big purge timer, so we want to delete a request sock without holding listener lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 12:40:25 -04:00
Johannes Berg	7c10770f99	mac80211: avoid duplicate TX path station lookup Instead of looking up the destination station twice in the TX path (first to build the header, and then for control processing), save it when building the header and use it later in the TX path. To avoid having to look up the station in the many callers, allow those to pass %NULL which keeps the existing lookup. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:36 +01:00
Johannes Berg	e33f5569aa	mac80211: mesh: avoid pointless station lookup In ieee80211_build_hdr(), the station is looked up to build the header correctly (QoS field) and to check for authorization. For mesh, authorization isn't checked here, and QoS capability is mandatory, so the station lookup can be avoided. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:36 +01:00
Johannes Berg	a8d15ff005	mac80211: drop 4-addr VLAN frames earlier if not connected If there's no station on the 4-addr VLAN interface, then frames cannot be transmitted. Drop such frames earlier, before setting up all the information for them. We should keep the old check though since that code might be used for other internally-generated frames. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:35 +01:00
Johannes Berg	5041006c42	mac80211: don't look up destination station twice There's no need to look up the destination station twice while building the 802.11 header for a given frame if the frame will actually be transmitted to the station we initially looked up. This happens for 4-addr VLAN interfaces and TDLS connections, which both directly send the frame to the station they looked up, though in the case of TDLS some station conditions need to be checked. To avoid that, add a variable indicating that we've looked up the station that the frame is going to be transmitted to, and avoid the lookup/flag checking if it already has been done. In the TDLS case, also move the authorized/wme_sta flag assignment to the correct place, i.e. only when that station is really used. Before this change, the new lookup should always have succeeded so that the potentially erroneous data would be overwritten. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 16:27:34 +01:00
Andrey Ryabinin	179a5bc4b8	net/9p: use memcpy() instead of snprintf() in p9_mount_tag_show() p9_mount_tag_show() uses '%s' format string to print non-NULL terminated chan->tag string. This leads to out of bounds memory read, because format '%s' implies that string is NULL-terminated. The length of string is know here, so its simpler and safer to use memcpy instead of snprintf(). Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-20 07:34:43 -07:00
Kirill A. Shutemov	6250a8badb	9p: use unsigned integers for nwqid/count As specification says, all integers in messages are unsigned. Let's fix behaviour of p9pdu_vreadf()/p9pdu_vwritef() accordingly. Fix for p9pdu_vreadf() is critical. If server replies with Rwalk, where nwqid > SHRT_MAX, the value will be interpreted as negative. kmalloc, in its order, will cast the value to (very big) size_t. It should never happen in normal situation: we never submit Twalk with nwname > 16, but malicious or broken server can still produce problematic Rwalk. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-20 07:34:42 -07:00
Fabian Frederick	9bfc52c109	9p: remove unused variable in p9_fd_create() p is initialized but unused. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2015-03-20 07:34:41 -07:00
Pablo Neira Ayuso	3d8c6dce53	netfilter: xt_TPROXY: fix invflags check in tproxy_tg6_check() We have to check for IP6T_INV_PROTO in invflags, instead of flags. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Balazs Scheidler <bazsi@balabit.hu>	2015-03-20 14:35:33 +01:00
Marcel Holtmann	dc5d82a9fe	Bluetooth: Use HCI_MAX_AD_LENGTH constant instead hardcoded value Using the HCI_MAX_AD_LENGTH for the max advertising data and max scan response data length makes more sense than hardcoding the value. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-20 14:08:32 +02:00
Marcel Holtmann	e7844ee599	Bluetooth: Gracefully response to enabling LE on LE only devices Currently the enabling of LE on LE only devices causes an error. This is a bit difference from other commands where trying to set the same existing settings causes a positive response. Fix this behavior for this single corner case. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2015-03-20 14:05:27 +02:00
Johannes Berg	e8f4fb7c7c	mac80211: remove drop_unencrypted code This mechanism was historic, and only ever used by IBSS, which also doesn't need to have it as it properly manages station's 802.1X PAE state (or, with WEP, always has a key.) Remove the mechanism to clean up the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2015-03-20 11:37:36 +01:00
Marcelo Ricardo Leitner	446981e5fc	tipc: fix build issue when building without IPv6 We can't directly call ipv6_sock_mc_join() but should use the stub instead and protect it around IS_ENABLED. Fixes: `d0f91938be` ("tipc: add ip/udp media type") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 16:06:27 -04:00
David S. Miller	970282d0e1	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-03-19 This wont the last 4.1 bluetooth-next pull request, but we've piled up enough patches in less than a week that I wanted to save you from a single huge "last-minute" pull somewhere closer to the merge window. The main changes are: - Simultaneous LE & BR/EDR discovery support for HW that can do it - Complete LE OOB pairing support - More fine-grained mgmt-command access control (normal user can now do harmless read-only operations). - Added RF power amplifier support in cc2520 ieee802154 driver - Some cleanups/fixes in ieee802154 code Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 15:18:04 -04:00
Linus Torvalds	47226fe1b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix packet header offset calculation in _decode_session6(), from Hajime Tazaki. 2) Fix route leak in error paths of xfrm_lookup(), from Huaibin Wang. 3) Be sure to clear state properly when scans fail in iwlwifi mvm code, from Luciano Coelho. 4) iwlwifi tries to stop scans that aren't actually running, also from Luciano Coelho. 5) mac80211 should drop mesh frames that are not encrypted, fix from Bob Copeland. 6) Add new device ID to b43 wireless driver for BCM432228 chips, from Rafał Miłecki. 7) Fix accidental addition of members after variable sized array in struct tc_u_hnode, from WANG Cong. 8) Don't re-enable interrupts until after we call napi_complete() in ibmveth and WIZnet drivers, frm Yongbae Park. 9) Fix regression in vlan tag handling of fec driver, from Fugang Duan. 10) If a network namespace change fails during rtnl_newlink(), we don't unwind the device registry properly. 11) Fix two TCP regressions, from Neal Cardwell: - Don't allow snd_cwnd_cnt to accumulate huge values due to missing test in tcp_cong_avoid_ai(). - Restore CUBIC back to advancing cwnd by 1.5x packets per RTT. 12) Fix performance regression in xne-netback involving push TX notifications, from David Vrabel. 13) __skb_tstamp_tx() can be called with a NULL sk pointer, do not dereference blindly. From Willem de Bruijn. 14) Fix potential stack overflow in RDS protocol stack, from Arnd Bergmann. 15) VXLAN_VID_MASK used incorrectly in new remote checksum offload support of VXLAN driver. Fix from Alexey Kodanev. 16) Fix too small netlink SKB allocation in inet_diag layer, from Eric Dumazet. 17) ieee80211_check_combinations() does not count interfaces correctly, from Andrei Otcheretianski. 18) Hardware feature determination in bxn2x driver references a piece of software state that actually isn't initialized yet, fix from Michal Schmidt. 19) inet_csk_wait_for_connect() needs a sched_annotate_sleep() annoation, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits) Revert "net: cx82310_eth: use common match macro" net/mlx4_en: Set statistics bitmap at port init IB/mlx4: Saturate RoCE port PMA counters in case of overflow net/mlx4_en: Fix off-by-one in ethtool statistics display IB/mlx4: Verify net device validity on port change event act_bpf: allow non-default TC_ACT opcodes as BPF exec outcome Revert "smc91x: retrieve IRQ and trigger flags in a modern way" inet: Clean up inet_csk_wait_for_connect() vs. might_sleep() ip6_tunnel: fix error code when tunnel exists netdevice.h: fix ndo_bridge_* comments bnx2x: fix encapsulation features on 57710/57711 mac80211: ignore CSA to same channel nl80211: ignore HT/VHT capabilities without QoS/WMM mac80211: ask for ECSA IE to be considered for beacon parse CRC mac80211: count interfaces correctly for combination checks isdn: icn: use strlcpy() when parsing setup options rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() caif: fix MSG_OOB test in caif_seqpkt_recvmsg() bridge: reset bridge mtu after deleting an interface can: kvaser_usb: Fix tx queue start/stop race conditions ...	2015-03-19 11:19:44 -07:00
Erik Hugne	f2f8036e39	tipc: add support for connect() on dgram/rdm sockets Following the example of ip4_datagram_connect, we store the address in the socket structure for dgram/rdm sockets and use that as the default destination for subsequent send() calls. It is allowed to connect to any address types, and the behaviour of send() will be the same as a normal sendto() with this address provided. Binding to an AF_UNSPEC address clears the association. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 12:25:54 -04:00
Erik Hugne	3bd88ee7a2	tipc: do not report -EHOSTUNREACH for failed local delivery Since commit `1186adf7df` ("tipc: simplify message forwarding and rejection in socket layer") -EHOSTUNREACH is propagated back to the sending process if we fail to deliver the message to another socket local to the node. This is wrong, host unreachable should only be reported when the destination port/name does not exist in the cluster, and that check is always done before sending the message. Also, this introduces inconsistent sendmsg() behavior for local/remote destinations. Errors occurring on the receiving side should not trickle up to the sender. If message delivery fails TIPC should either discard the packet or reject it back to the sender based on the destination droppable option. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 12:25:54 -04:00
Erik Hugne	18d6c58415	tipc: remove redundant call to tipc_node_remove_conn tipc_node_remove_conn may be called twice if shutdown() is called on a socket that have messages in the receive queue. Calling this function twice does no harm, but is unnecessary and we remove the redundant call. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-19 12:25:54 -04:00
Nicolas Iooss	ea6edfbcef	mac802154: fix typo in header guard Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Fixes: `b6eea9ca35` ("mac802154: introduce driver-ops header") Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2015-03-19 15:32:22 +01:00
Pablo Neira Ayuso	4017a7ee69	netfilter: restore rule tracing via nfnetlink_log Since `fab4085` ("netfilter: log: nf_log_packet() as real unified interface"), the loginfo structure that is passed to nf_log_packet() is used to explicitly indicate the logger type you want to use. This is a problem for people tracing rules through nfnetlink_log since packets are always routed to the NF_LOG_TYPE logger after the aforementioned patch. We can fix this by removing the trace loginfo structures, but that still changes the log level from 4 to 5 for tracing messages and there may be someone relying on this outthere. So let's just introduce a new nf_log_trace() function that restores the former behaviour. Reported-by: Markus Kötter <koetter@rrzn.uni-hannover.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2015-03-19 11:14:48 +01:00
Jörg Thalheim	af615762e9	bridge: add ageing_time, stp_state, priority over netlink Signed-off-by: Jörg Thalheim <joerg@higgsboson.tk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 23:21:06 -04:00
David S. Miller	99c4a26a15	net: Fix high overhead of vlan sub-device teardown. When a networking device is taken down that has a non-trivial number of VLAN devices configured under it, we eat a full synchronize_net() for every such VLAN device. This is because of the call chain: NETDEV_DOWN notifier --> vlan_device_event() --> dev_change_flags() --> __dev_change_flags() --> __dev_close() --> __dev_close_many() --> dev_deactivate_many() --> synchronize_net() This is kind of rediculous because we already have infrastructure for batching doing operation X to a list of net devices so that we only incur one sync. So make use of that by exporting dev_close_many() and adjusting it's interfaace so that the caller can fully manage the batch list. Use this in vlan_device_event() and all the overhead goes away. Reported-by: Salam Noureddine <noureddine@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:52:56 -04:00
Eric Dumazet	738e6d30d3	inet: add a schedule point in inet_twsk_purge() On a large hash table, we can easily spend seconds to walk over all entries. Add a cond_resched() to yield cpu if necessary. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:38:13 -04:00
David Ahern	db24a9044e	net: add support for phys_port_name Similar to port id allow netdevices to specify port names and export the name via sysfs. Drivers can implement the netdevice operation to assist udev in having sane default names for the devices using the rule: $ cat /etc/udev/rules.d/80-net-setup-link.rules SUBSYSTEM=="net", ACTION=="add", ATTR{phys_port_name}!="", NAME="$attr{phys_port_name}" Use of phys_name versus phys_id was suggested-by Jiri Pirko. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:30:35 -04:00
Marcelo Ricardo Leitner	54ff9ef36b	ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop} in favor of their inner __ ones, which doesn't grab rtnl. As these functions need to operate on a locked socket, we can't be grabbing rtnl by then. It's too late and doing so causes reversed locking. So this patch: - move rtnl handling to callers instead while already fixing some reversed locking situations, like on vxlan and ipvs code. - renames __ ones to not have the __ mark: __ip_mc_{join,leave}_group -> ip_mc_{join,leave}_group __ipv6_sock_mc_{join,drop} -> ipv6_sock_mc_{join,drop} Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:05:09 -04:00
Marcelo Ricardo Leitner	baf606d9c9	ipv4,ipv6: grab rtnl before locking the socket There are some setsockopt operations in ipv4 and ipv6 that are grabbing rtnl after having grabbed the socket lock. Yet this makes it impossible to do operations that have to lock the socket when already within a rtnl protected scope, like ndo dev_open and dev_stop. We normally take coarse grained locks first but setsockopt inverted that. So this patch invert the lock logic for these operations and makes setsockopt grab rtnl if it will be needed prior to grabbing socket lock. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:05:09 -04:00
Eric Dumazet	08d2cc3b26	inet: request sock should init IPv6/IPv4 addresses In order to be able to use sk_ehashfn() for request socks, we need to initialize their IPv6/IPv4 addresses. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:35 -04:00
Eric Dumazet	b4d6444ea3	inet: get rid of last __inet_hash_connect() argument We now always call __inet_hash_nolisten(), no need to pass it as an argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 22:00:35 -04:00

... 3 4 5 6 7 ...

37347 Commits