linux

Commit Graph

Author	SHA1	Message	Date
Johannes Berg	c6f219dc83	mac80211: don't send delBA on addBA failure There's no reason to send a delBA when the peer refused our addBA, so change that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-21 16:14:14 +02:00
Johannes Berg	582bb505b6	mac80211: don't send delBA when removing stations When a station is removed and we stop the aggregation sessions, it's not useful to send delBA since this is due to us or the station disassociating or dropping the connection in some other way, so change that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-21 16:14:14 +02:00
Johannes Berg	7f1611469b	mac80211: don't send delBA before disassoc When we disassociate, it's not really useful to send delBA action frames since we're going to send disassoc/deauth anyway, so change that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-21 16:14:13 +02:00
Jan Engelhardt	2cbc78a29e	netfilter: combine ipt_REDIRECT and ip6t_REDIRECT Combine more modules since the actual code is so small anyway that the kmod metadata and the module in its loaded state totally outweighs the combined actual code size. IP_NF_TARGET_REDIRECT becomes a compat option; IP6_NF_TARGET_REDIRECT is completely eliminated since it has not see a release yet. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 12:12:05 +02:00
Jan Engelhardt	b3d54b3e40	netfilter: combine ipt_NETMAP and ip6t_NETMAP Combine more modules since the actual code is so small anyway that the kmod metadata and the module in its loaded state totally outweighs the combined actual code size. IP_NF_TARGET_NETMAP becomes a compat option; IP6_NF_TARGET_NETMAP is completely eliminated since it has not see a release yet. Signed-off-by: Jan Engelhardt <jengelh@inai.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 12:11:08 +02:00
Ulrich Weber	136251d02f	netfilter: nf_nat: remove obsolete rcu_read_unlock call hlist walk in find_appropriate_src() is not protected anymore by rcu_read_lock(), so rcu_read_unlock() is unnecessary if in_range() matches. This bug was added in (`c7232c9` netfilter: add protocol independent NAT core). Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 12:09:25 +02:00
Patrick McHardy	b0cdb1d9a9	netfilter: nf_nat: fix oops when unloading protocol modules When unloading a protocol module nf_ct_iterate_cleanup() is used to remove all conntracks using the protocol from the bysource hash and clean their NAT sections. Since the conntrack isn't actually killed, the NAT callback is invoked twice, once for each direction, which causes an oops when trying to delete it from the bysource hash for the second time. The same oops can also happen when removing both an L3 and L4 protocol since the cleanup function doesn't check whether the conntrack has already been cleaned up. Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM RIP: 0010:[<ffffffffa002c303>] [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat] RSP: 0018:ffff88007808fe18 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0 RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208 RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000 R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00 R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88 FS: 00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0) Stack: ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3 ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00 ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0 Call Trace: [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat] [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170 [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat] [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0 [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4] ... To fix this, - check whether the conntrack has already been cleaned up in nf_nat_proto_clean - change nf_ct_iterate_cleanup() to only invoke the callback function once for each conntrack (IP_CT_DIR_ORIGINAL). The second change doesn't affect other callers since when conntracks are actually killed, both directions are removed from the hash immediately and the callback is already only invoked once. If it is not killed, the second callback invocation will always return the same decision not to kill it. Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 11:35:18 +02:00
Pablo Neira Ayuso	b0041d1b8e	netfilter: fix IPv6 NAT dependencies in Kconfig * NF_NAT_IPV6 requires IP6_NF_IPTABLES * IP6_NF_TARGET_MASQUERADE, IP6_NF_TARGET_NETMAP, IP6_NF_TARGET_REDIRECT and IP6_NF_TARGET_NPT require NF_NAT_IPV6. This change just mirrors what IPv4 does in Kconfig, for consistency. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-21 11:33:19 +02:00
Ed Cashin	c0d680e577	net: do not disable sg for packets requiring no checksum A change in a series of VLAN-related changes appears to have inadvertently disabled the use of the scatter gather feature of network cards for transmission of non-IP ethernet protocols like ATA over Ethernet (AoE). Below is a reference to the commit that introduces a "harmonize_features" function that turns off scatter gather when the NIC does not support hardware checksumming for the ethernet protocol of an sk buff. commit `f01a5236bd` Author: Jesse Gross <jesse@nicira.com> Date: Sun Jan 9 06:23:31 2011 +0000 net offloading: Generalize netif_get_vlan_features(). The can_checksum_protocol function is not equipped to consider a protocol that does not require checksumming. Calling it for a protocol that requires no checksum is inappropriate. The patch below has harmonize_features call can_checksum_protocol when the protocol needs a checksum, so that the network layer is not forced to perform unnecessary skb linearization on the transmission of AoE packets. Unnecessary linearization results in decreased performance and increased memory pressure, as reported here: http://www.spinics.net/lists/linux-mm/msg15184.html The problem has probably not been widely experienced yet, because only recently has the kernel.org-distributed aoe driver acquired the ability to use payloads of over a page in size, with the patchset recently included in the mm tree: https://lkml.org/lkml/2012/8/28/140 The coraid.com-distributed aoe driver already could use payloads of greater than a page in size, but its users generally do not use the newest kernels. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 22:23:40 -04:00
Mathias Krause	e3ac104d41	xfrm_user: don't copy esn replay window twice for new states The ESN replay window was already fully initialized in xfrm_alloc_replay_state_esn(). No need to copy it again. Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:40 -04:00
Mathias Krause	ecd7918745	xfrm_user: ensure user supplied esn replay window is valid The current code fails to ensure that the netlink message actually contains as many bytes as the header indicates. If a user creates a new state or updates an existing one but does not supply the bytes for the whole ESN replay window, the kernel copies random heap bytes into the replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL netlink attribute. This leads to following issues: 1. The replay window has random bits set confusing the replay handling code later on. 2. A malicious user could use this flaw to leak up to ~3.5kB of heap memory when she has access to the XFRM netlink interface (requires CAP_NET_ADMIN). Known users of the ESN replay window are strongSwan and Steffen's iproute2 patch (<http://patchwork.ozlabs.org/patch/85962/>). The latter uses the interface with a bitmap supplied while the former does not. strongSwan is therefore prone to run into issue 1. To fix both issues without breaking existing userland allow using the XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a fully specified one. For the former case we initialize the in-kernel bitmap with zero, for the latter we copy the user supplied bitmap. For state updates the full bitmap must be supplied. To prevent overflows in the bitmap length calculation the maximum size of bmp_len is limited to 128 by this patch -- resulting in a maximum replay window of 4096 packets. This should be sufficient for all real life scenarios (RFC 4303 recommends a default replay window size of 64). Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Martin Willi <martin@revosec.ch> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:40 -04:00
Mathias Krause	1f86840f89	xfrm_user: fix info leak in copy_to_user_tmpl() The memory used for the template copy is a local stack variable. As struct xfrm_user_tmpl contains multiple holes added by the compiler for alignment, not initializing the memory will lead to leaking stack bytes to userland. Add an explicit memset(0) to avoid the info leak. Initial version of the patch by Brad Spengler. Cc: Brad Spengler <spender@grsecurity.net> Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:40 -04:00
Mathias Krause	7b789836f4	xfrm_user: fix info leak in copy_to_user_policy() The memory reserved to dump the xfrm policy includes multiple padding bytes added by the compiler for alignment (padding bytes in struct xfrm_selector and struct xfrm_userpolicy_info). Add an explicit memset(0) before filling the buffer to avoid the heap info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:39 -04:00
Mathias Krause	f778a63671	xfrm_user: fix info leak in copy_to_user_state() The memory reserved to dump the xfrm state includes the padding bytes of struct xfrm_usersa_info added by the compiler for alignment (7 for amd64, 3 for i386). Add an explicit memset(0) before filling the buffer to avoid the info leak. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:39 -04:00
Mathias Krause	4c87308bde	xfrm_user: fix info leak in copy_to_user_auth() copy_to_user_auth() fails to initialize the remainder of alg_name and therefore discloses up to 54 bytes of heap memory via netlink to userland. Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name with null bytes. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 18:08:39 -04:00
Andrey Vagin	bc26ccd8fc	tcp: restore rcv_wscale in a repair mode (v2) rcv_wscale is a symetric parameter with snd_wscale. Both this parameters are set on a connection handshake. Without this value a remote window size can not be interpreted correctly, because a value from a packet should be shifted on rcv_wscale. And one more thing is that wscale_ok should be set too. This patch doesn't break a backward compatibility. If someone uses it in a old scheme, a rcv window will be restored with the same bug (rcv_wscale = 0). v2: Save backward compatibility on big-endian system. Before the first two bytes were snd_wscale and the second two bytes were rcv_wscale. Now snd_wscale is opt_val & 0xFFFF and rcv_wscale >> 16. This approach is independent on byte ordering. Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> CC: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 17:49:58 -04:00
Alan Cox	4308fc58dc	tcp: Document use of undefined variable. Both tcp_timewait_state_process and tcp_check_req use the same basic construct of struct tcp_options received tmp_opt; tmp_opt.saw_tstamp = 0; then call tcp_parse_options However if they are fed a frame containing a TCP_SACK then tbe code behaviour is undefined because opt_rx->sack_ok is undefined data. This ought to be documented if it is intentional. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 17:29:36 -04:00
Christoph Paasch	bb68b64724	ipv4: Don't add TCP-code in inet_sock_destruct Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Acked-by: H.K. Jerry Chu <hkchu@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 17:12:27 -04:00
Sylvain Roger Rieunier	2514ec8653	mac80211: fix IBSS auth TX debug message In the IBSS auth TX debug message the BSSID and DA address are reversed, fix that. Signed-off-by: Sylvain Roger Rieunier <sylvain.roger.rieunier@gmail.com> [reword commit message and make it fit 72 cols] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-20 10:31:34 +02:00
Trond Myklebust	a519fc7a70	SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT Instead of doing a shutdown() call, we need to do an actual close(). Ditto if/when the server is sending us junk RPC headers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Simon Kirby <sim@hostway.ca> Cc: stable@vger.kernel.org	2012-09-19 18:16:10 -04:00
Li RongQing	8ea853fd0b	net/core: fix comment in skb_try_coalesce It should be the skb which is not cloned Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:29:13 -04:00
Amerigo Wang	6b102865e7	ipv6: unify fragment thresh handling code Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	d4915c087f	ipv6: make ip6_frag_nqueues() and ip6_frag_mem() static inline Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	b836c99fd6	ipv6: unify conntrack reassembly expire code with standard one Two years ago, Shan Wei tried to fix this: http://patchwork.ozlabs.org/patch/43905/ The problem is that RFC2460 requires an ICMP Time Exceeded -- Fragment Reassembly Time Exceeded message should be sent to the source of that fragment, if the defragmentation times out. " If insufficient fragments are received to complete reassembly of a packet within 60 seconds of the reception of the first-arriving fragment of that packet, reassembly of that packet must be abandoned and all the fragments that have been received for that packet must be discarded. If the first fragment (i.e., the one with a Fragment Offset of zero) has been received, an ICMP Time Exceeded -- Fragment Reassembly Time Exceeded message should be sent to the source of that fragment. " As Herbert suggested, we could actually use the standard IPv6 reassembly code which follows RFC2460. With this patch applied, I can see ICMP Time Exceeded sent from the receiver when the sender sent out 3/4 fragmented IPv6 UDP packet. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: netfilter-devel@vger.kernel.org Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	c038a767cd	ipv6: add a new namespace for nf_conntrack_reasm As pointed by Michal, it is necessary to add a new namespace for nf_conntrack_reasm code, this prepares for the second patch. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Kubeček <mkubecek@suse.cz> Cc: David Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: netfilter-devel@vger.kernel.org Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:23:28 -04:00
Amerigo Wang	8c4c49df5c	netpoll: call ->ndo_select_queue() in tx path In netpoll tx path, we miss the chance of calling ->ndo_select_queue(), thus could cause problems when bonding is involved. This patch makes dev_pick_tx() extern (and rename it to netdev_pick_tx()) to let netpoll call it in netpoll_send_skb_on_dev(). Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 17:19:09 -04:00
stephen hemminger	6b6e27255f	netdev: make address const in device address management The internal functions for add/deleting addresses don't change their argument. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 16:35:22 -04:00
Paolo Valente	7126195697	pkt_sched: fix virtual-start-time update in QFQ If the old timestamps of a class, say cl, are stale when the class becomes active, then QFQ may assign to cl a much higher start time than the maximum value allowed. This may happen when QFQ assigns to the start time of cl the finish time of a group whose classes are characterized by a higher value of the ratio max_class_pkt/weight_of_the_class with respect to that of cl. Inserting a class with a too high start time into the bucket list corrupts the data structure and may eventually lead to crashes. This patch limits the maximum start time assigned to a class. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 16:23:53 -04:00
Michal Kubeček	15c041759b	tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero If recv() syscall is called for a TCP socket so that - IOAT DMA is used - MSG_WAITALL flag is used - requested length is bigger than sk_rcvbuf - enough data has already arrived to bring rcv_wnd to zero then when tcp_recvmsg() gets to calling sk_wait_data(), receive window can be still zero while sk_async_wait_queue exhausts enough space to keep it zero. As this queue isn't cleaned until the tcp_service_net_dma() call, sk_wait_data() cannot receive any data and blocks forever. If zero receive window and non-empty sk_async_wait_queue is detected before calling sk_wait_data(), process the queue first. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 16:07:58 -04:00
Linus Lüssing	dbd6b11e15	batman-adv: make batadv_test_bit() return 0 or 1 only On some architectures test_bit() can return other values than 0 or 1: With a generic x86 OpenWrt image in a kvm setup (batadv_)test_bit() frequently returns -1 for me, leading to batadv_iv_ogm_update_seqnos() wrongly signaling a protected seqno window. This patch tries to fix this issue by making batadv_test_bit() return 0 or 1 only. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:49:53 -04:00
Eric Dumazet	6b78f16e4b	gre: add GSO support Add GSO support to GRE tunnels. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:40:15 -04:00
Eric Dumazet	2c60db0370	net: provide a default dev->ethtool_ops Instead of forcing device drivers to provide empty ethtool_ops or tweak net/core/ethtool.c again, we could provide a generic ethtool_ops. This occurred to me when I wanted to add GSO support to GRE tunnels. ethtool -k support should be generic for all drivers. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Maciej Żenczykowski <maze@google.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:40:15 -04:00
Gao feng	828de4f6bf	net: dev: fix incorrect getting net device's name When moving a nic from net namespace A to net namespace B, in dev_change_net_namesapce,we call __dev_get_by_name to decide if the netns B has the device has the same name. if the netns B already has the same named device,we call dev_get_valid_name to try to get a valid name for this nic in the netns B,but net_device->nd_net still point to netns A now. this patch fix it. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:37:01 -04:00
Li RongQing	3fd91fb358	ipv6: recursive check rt->dst.from when call rt6_check_expired If dst cache dst_a copies from dst_b, and dst_b copies from dst_c, check if dst_a is expired or not, we should not end with dst_a->dst.from, dst_b, we should check dst_c. CC: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:35:33 -04:00
Eric Dumazet	b40863c667	net: more accurate network taps in transmit path dev_queue_xmit_nit() should be called right before ndo_start_xmit() calls or we might give wrong packet contents to taps users : Packet checksum can be changed, or packet can be linearized or segmented, and segments partially sent for the later case. Also a memory allocation can fail and packet never really hit the driver entry point. Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-19 15:32:42 -04:00
Johannes Berg	552bff0c2f	cfg80211: constify name parameter to add_virtual_intf The name can't be modified by the driver, make it const. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-19 09:32:59 +02:00
Johannes Berg	2ad4814fb6	mac80211: make reset debugfs depend on CONFIG_PM The suspend/resume code depends on CONFIG_PM, so the reset debugfs file can only be made available if that is enabled. Fengguang Wu's zero-day build testing found this. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-19 08:20:24 +02:00
Johan Hedberg	23b3b1330a	Bluetooth: Update management interface revision For each kernel release where commands or events are added to the management interface, the revision field should be increment by one. The increment should only happen once per kernel release and not for every command/event that gets added. The revision value is for informational purposes only, but this simple policy would make any future debugging a lot simple. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 22:27:30 -03:00
Johan Hedberg	92a25256f1	Bluetooth: mgmt: Implement support for passkey notification This patch adds support for Secure Simple Pairing with devices that have KeyboardOnly as their IO capability. Such devices will cause a passkey notification on our side and optionally also keypress notifications. Without this patch some keyboards cannot be paired using the mgmt interface. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 22:27:29 -03:00
Luis R. Rodriguez	a85d0d7f34	cfg80211: fix possible circular lock on reg_regdb_search() When call_crda() is called we kick off a witch hunt search for the same regulatory domain on our internal regulatory database and that work gets kicked off on a workqueue, this is done while the cfg80211_mutex is held. If that workqueue kicks off it will first lock reg_regdb_search_mutex and later cfg80211_mutex but to ensure two CPUs will not contend against cfg80211_mutex the right thing to do is to have the reg_regdb_search() wait until the cfg80211_mutex is let go. The lockdep report is pasted below. cfg80211: Calling CRDA to update world regulatory domain ====================================================== [ INFO: possible circular locking dependency detected ] 3.3.8 #3 Tainted: G O ------------------------------------------------------- kworker/0:1/235 is trying to acquire lock: (cfg80211_mutex){+.+...}, at: [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] but task is already holding lock: (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (reg_regdb_search_mutex){+.+...}: [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<81645778>] is_world_regdom+0x9f8/0xc74 [cfg80211] -> #1 (reg_mutex#2){+.+...}: [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<8164539c>] is_world_regdom+0x61c/0xc74 [cfg80211] -> #0 (cfg80211_mutex){+.+...}: [<800a77b8>] __lock_acquire+0x10d4/0x17bc [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] other info that might help us debug this: Chain exists of: cfg80211_mutex --> reg_mutex#2 --> reg_regdb_search_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(reg_regdb_search_mutex); lock(reg_mutex#2); lock(reg_regdb_search_mutex); lock(cfg80211_mutex); * DEADLOCK * 3 locks held by kworker/0:1/235: #0: (events){.+.+..}, at: [<80089a00>] process_one_work+0x230/0x460 #1: (reg_regdb_work){+.+...}, at: [<80089a00>] process_one_work+0x230/0x460 #2: (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211] stack backtrace: Call Trace: [<80290fd4>] dump_stack+0x8/0x34 [<80291bc4>] print_circular_bug+0x2ac/0x2d8 [<800a77b8>] __lock_acquire+0x10d4/0x17bc [<800a8384>] lock_acquire+0x60/0x88 [<802950a8>] mutex_lock_nested+0x54/0x31c [<816468a4>] set_regdom+0x78c/0x808 [cfg80211] Reported-by: Felix Fietkau <nbd@openwrt.org> Tested-by: Felix Fietkau <nbd@openwrt.org> Cc: stable@vger.kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-09-18 20:43:23 -04:00
Vinicius Costa Gomes	78c04c0bf5	Bluetooth: Fix not removing power_off delayed work For example, when a usb reset is received (I could reproduce it running something very similar to this[1] in a loop) it could be that the device is unregistered while the power_off delayed work is still scheduled to run. Backtrace: WARNING: at lib/debugobjects.c:261 debug_print_object+0x7c/0x8d() Hardware name: To Be Filled By O.E.M. ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x26 Modules linked in: nouveau mxm_wmi btusb wmi bluetooth ttm coretemp drm_kms_helper Pid: 2114, comm: usb-reset Not tainted 3.5.0bt-next #2 Call Trace: [<ffffffff8124cc00>] ? free_obj_work+0x57/0x91 [<ffffffff81058f88>] warn_slowpath_common+0x7e/0x97 [<ffffffff81059035>] warn_slowpath_fmt+0x41/0x43 [<ffffffff8124ccb6>] debug_print_object+0x7c/0x8d [<ffffffff8106e3ec>] ? __queue_work+0x259/0x259 [<ffffffff8124d63e>] ? debug_check_no_obj_freed+0x6f/0x1b5 [<ffffffff8124d667>] debug_check_no_obj_freed+0x98/0x1b5 [<ffffffffa00aa031>] ? bt_host_release+0x10/0x1e [bluetooth] [<ffffffff810fc035>] kfree+0x90/0xe6 [<ffffffffa00aa031>] bt_host_release+0x10/0x1e [bluetooth] [<ffffffff812ec2f9>] device_release+0x4a/0x7e [<ffffffff8123ef57>] kobject_release+0x11d/0x154 [<ffffffff8123ed98>] kobject_put+0x4a/0x4f [<ffffffff812ec0d9>] put_device+0x12/0x14 [<ffffffffa009472b>] hci_free_dev+0x22/0x26 [bluetooth] [<ffffffffa0280dd0>] btusb_disconnect+0x96/0x9f [btusb] [<ffffffff813581b4>] usb_unbind_interface+0x57/0x106 [<ffffffff812ef988>] __device_release_driver+0x83/0xd6 [<ffffffff812ef9fb>] device_release_driver+0x20/0x2d [<ffffffff813582a7>] usb_driver_release_interface+0x44/0x7b [<ffffffff81358795>] usb_forced_unbind_intf+0x45/0x4e [<ffffffff8134f959>] usb_reset_device+0xa6/0x12e [<ffffffff8135df86>] usbdev_do_ioctl+0x319/0xe20 [<ffffffff81203244>] ? avc_has_perm_flags+0xc9/0x12e [<ffffffff812031a0>] ? avc_has_perm_flags+0x25/0x12e [<ffffffff81050101>] ? do_page_fault+0x31e/0x3a1 [<ffffffff8135eaa6>] usbdev_ioctl+0x9/0xd [<ffffffff811126b1>] vfs_ioctl+0x21/0x34 [<ffffffff81112f7b>] do_vfs_ioctl+0x408/0x44b [<ffffffff81208d45>] ? file_has_perm+0x76/0x81 [<ffffffff8111300f>] sys_ioctl+0x51/0x76 [<ffffffff8158db22>] system_call_fastpath+0x16/0x1b [1] http://cpansearch.perl.org/src/DPAVLIN/Biblio-RFID-0.03/examples/usbreset.c Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:13:02 -03:00
Andrei Emeltchenko	aad3d0e343	Bluetooth: Fix freeing uninitialized delayed works When releasing L2CAP socket which is in BT_CONFIG state l2cap_chan_close invokes l2cap_send_disconn_req which cancel delayed works which are only set in BT_CONNECTED state with l2cap_ertm_init. Add state check before cancelling those works. ... [ 9668.574372] [21085] l2cap_sock_release: sock cd065200, sk f073e800 [ 9668.574399] [21085] l2cap_sock_shutdown: sock cd065200, sk f073e800 [ 9668.574411] [21085] l2cap_chan_close: chan f073ec00 state BT_CONFIG sk f073e800 [ 9668.574421] [21085] l2cap_send_disconn_req: chan f073ec00 conn ecc16600 [ 9668.574441] INFO: trying to register non-static key. [ 9668.574443] the code is fine but needs lockdep annotation. [ 9668.574446] turning off the locking correctness validator. [ 9668.574450] Pid: 21085, comm: obex-client Tainted: G O 3.5.0+ #57 [ 9668.574452] Call Trace: [ 9668.574463] [<c10a64b3>] __lock_acquire+0x12e3/0x1700 [ 9668.574468] [<c10a44fb>] ? trace_hardirqs_on+0xb/0x10 [ 9668.574476] [<c15e4f60>] ? printk+0x4d/0x4f [ 9668.574479] [<c10a6e38>] lock_acquire+0x88/0x130 [ 9668.574487] [<c1059740>] ? try_to_del_timer_sync+0x60/0x60 [ 9668.574491] [<c1059790>] del_timer_sync+0x50/0xc0 [ 9668.574495] [<c1059740>] ? try_to_del_timer_sync+0x60/0x60 [ 9668.574515] [<f8aa1c23>] l2cap_send_disconn_req+0xe3/0x160 [bluetooth] ... Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:07:04 -03:00
Andrzej Kaczmarek	562fcc246e	Bluetooth: mgmt: Fix enabling LE while powered off When new BT USB adapter is plugged in it's configured while still being powered off (HCI_AUTO_OFF flag is set), thus Set LE will only set dev_flags but won't write changes to controller. As a result it's not possible to start device discovery session on LE controller as it uses interleaved discovery which requires LE Supported Host flag in extended features. This patch ensures HCI Write LE Host Supported is sent when Set Powered is called to power on controller and clear HCI_AUTO_OFF flag. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Cc: stable@vger.kernel.org Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:07:03 -03:00
Andrzej Kaczmarek	3d1cbdd6ae	Bluetooth: mgmt: Fix enabling SSP while powered off When new BT USB adapter is plugged in it's configured while still being powered off (HCI_AUTO_OFF flag is set), thus Set SSP will only set dev_flags but won't write changes to controller. As a result remote devices won't use Secure Simple Pairing with our device due to SSP Host Support flag disabled in extended features and may also reject SSP attempt from our side (with possible fallback to legacy pairing). This patch ensures HCI Write Simple Pairing Mode is sent when Set Powered is called to power on controller and clear HCI_AUTO_OFF flag. Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com> Cc: stable@vger.kernel.org Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 20:07:03 -03:00
Li RongQing	433a195480	xfrm: fix a read lock imbalance in make_blackhole if xfrm_policy_get_afinfo returns 0, it has already released the read lock, xfrm_policy_put_afinfo should not be called again. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:30:15 -04:00
Eric Dumazet	1d57f19539	tcp: fix regression in urgent data handling Stephan Springl found that commit `1402d36601` "tcp: introduce tcp_try_coalesce" introduced a regression for rlogin It turns out problem comes from TCP urgent data handling and a change in behavior in input path. rlogin sends two one-byte packets with URG ptr set, and when next data frame is coalesced, we lack sk_data_ready() calls to wakeup consumer. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Stephan Springl <springl-k@lar.bfw.de> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:26:27 -04:00
Michael S. Tsirkin	0e698bf662	net: fix memory leak on oom with zerocopy If orphan flags fails, we don't free the skb on receive, which leaks the skb memory. Return value was also wrong: netif_receive_skb is supposed to return NET_RX_DROP, not ENOMEM. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:24:00 -04:00
Mathias Krause	c254637225	xfrm_user: return error pointer instead of NULL #2 When dump_one_policy() returns an error, e.g. because of a too small buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns NULL instead of an error pointer. But its caller expects an error pointer and therefore continues to operate on a NULL skbuff. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:13:46 -04:00
Mathias Krause	864745d291	xfrm_user: return error pointer instead of NULL When dump_one_state() returns an error, e.g. because of a too small buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL instead of an error pointer. But its callers expect an error pointer and therefore continue to operate on a NULL skbuff. This could lead to a privilege escalation (execution of user code in kernel context) if the attacker has CAP_NET_ADMIN and is able to map address 0. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:13:45 -04:00
Peter Senna Tschudin	adccff34de	net/tipc/name_table.c: Remove unecessary semicolon Found by http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:08:19 -04:00
Peter Senna Tschudin	a2bf91b5b8	net/openvswitch/vport.c: Remove unecessary semicolon Found by http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:08:19 -04:00
Peter Senna Tschudin	4c835019a6	net/ieee802154/6lowpan.c: Remove unecessary semicolon Found by http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 16:08:19 -04:00
Nicolas Dichtel	2c20cbd7e3	ipv6: use DST_* macro to set obselete field Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:04 -04:00
Nicolas Dichtel	6f3118b571	ipv6: use net->rt_genid to check dst validity IPv6 dst should take care of rt_genid too. When a xfrm policy is inserted or deleted, all dst should be invalidated. To force the validation, dst entries should be created with ->obsolete set to DST_OBSOLETE_FORCE_CHK. This was already the case for all functions calling ip6_dst_alloc(), except for ip6_rt_copy(). As a consequence, we can remove the specific code in inet6_connection_sock. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Nicolas Dichtel	ee8372dd19	xfrm: invalidate dst on policy insertion/deletion When a policy is inserted or deleted, all dst should be recalculated. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Nicolas Dichtel	b42664f898	netns: move net->ipv4.rt_genid to net->rt_genid This commit prepares the use of rt_genid by both IPv4 and IPv6. Initialization is left in IPv4 part. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Eric Dumazet	2885da7296	net: rt_cache_flush() cleanup We dont use jhash anymore since route cache removal, so we can get rid of get_random_bytes() calls for rt_genid changes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:54:19 -04:00
Nicolas Dichtel	bafa6d9d89	ipv4/route: arg delay is useless in rt_cache_flush() Since route cache deletion (`89aef8921b`), delay is no more used. Remove it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:44:34 -04:00
Pandiyarajan Pitchaimuthu	ed44a951c7	cfg80211/nl80211: Notify connection request failure in AP mode In AP mode, when a station requests connection to an AP and if the request is failed for particular reason, userspace is notified about the failure through NL80211_CMD_CONN_FAILED command. Reason for the failure is sent through the attribute NL80211_ATTR_CONN_FAILED_REASON. Signed-off-by: Pandiyarajan Pitchaimuthu <c_ppitch@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-18 19:54:06 +02:00
Alan Cox	f3baed51f4	wireless: remove unreachable code The only case where intersected_rd can become non NULL is within an if. All paths from that if return, so the end chunk has therefore squawked its last and is no more. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-18 19:54:05 +02:00
Eric W. Biederman	e1760bd5ff	userns: Convert the audit loginuid to be a kuid Always store audit loginuids in type kuid_t. Print loginuids by converting them into uids in the appropriate user namespace, and then printing the resulting uid. Modify audit_get_loginuid to return a kuid_t. Modify audit_set_loginuid to take a kuid_t. Modify /proc/<pid>/loginuid on read to convert the loginuid into the user namespace of the opener of the file. Modify /proc/<pid>/loginud on write to convert the loginuid rom the user namespace of the opener of the file. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Paul Moore <paul@paul-moore.com> ? Cc: David Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-17 18:08:54 -07:00
Simon Derr	584a8c13d5	9P: Fix race in p9_write_work() See previous commit about p9_read_work() for details. This fixes a similar race between p9_write_work() and p9_poll_mux() Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2012-09-17 14:54:11 -05:00
Simon Derr	1957b3a86f	9P: fix test at the end of p9_write_work() At the end of p9_write_work() we want to test if there is still data to send. This means: - either the current request still has data to send (wsize != 0) - or there are requests in the unsent queue Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2012-09-17 14:54:11 -05:00
Simon Derr	0462194d35	9P: Fix race in p9_read_work() Race scenario between p9_read_work() and p9_poll_mux() Data arrive, Rworksched is set, p9_read_work() is called. thread A thread B p9_read_work() . reads data . checks if new data ready. No. . gets preempted . More data arrive, p9_poll_mux() is called. . . . p9_poll_mux() . . if (!test_and_set_bit(Rworksched, . &m->wsched)) { . schedule_work(&m->rq); . } . . -> does not schedule work because . Rworksched is set . . clear_bit(Rworksched, &m->wsched); return; No work has been scheduled, and yet data are waiting. Currently p9_read_work() checks if there is data to read, and if not, it clears Rworksched. I think it should clear Rworksched first, and then check if there is data to read. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2012-09-17 14:54:11 -05:00
David S. Miller	b4516a288e	llc: Remove stray reference to sysctl_llc_station_ack_timeout. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:13:24 -04:00
Ben Hutchings	12ebc8b9af	llc2: Collapse remainder of state machine into simple if-else if-statement Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:19 -04:00
Ben Hutchings	da31888018	llc2: Remove explicit indexing of state action arrays These arrays are accessed by iteration in llc_exec_station_trans_actions(). There must not be any zero-filled gaps in them, so the explicit indices are pointless. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:19 -04:00
Ben Hutchings	5ecf9eea26	llc2: Remove the station send queue We only ever put one skb on the send queue, and then immediately send it. Remove the queue and call dev_queue_xmit() directly. This leaves struct llc_station empty, so remove that as well. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Ben Hutchings	04d191c259	llc2: Collapse the station event receive path We only ever put one skb on the event queue, and then immediately process it. Remove the queue and fold together the related functions, removing several blatantly false comments. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Ben Hutchings	025e363325	llc2: Remove dead code for state machine The initial state is UP and there is no way to enter the other states as the required event type is never generated. Delete all states, event types, and other dead code. The only thing left is handling of the XID and TEST commands. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Ben Hutchings	cc6328dfe4	llc2: Remove pointless indirection through llc_stat_state_trans_end Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:04:18 -04:00
Alan Cox	e04dae8408	af_unix: old_cred is surplus Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:00:13 -04:00
Joe Perches	666f355f38	device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit Convert direct calls of vprintk_emit and printk_emit to the dev_ equivalents. Make create_syslog_header static. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-17 06:10:05 -07:00
Joe Perches	c2c5a7051c	netdev_printk/netif_printk: Remove a superfluous logging colon netdev_printk originally called dev_printk with %pV. This style emitted the complete dev_printk header with a colon followed by the netdev_name prefix followed by a colon. Now that netdev_printk does not call dev_printk, the extra colon is superfluous. Remove it. Example: old: sky2 0000:02:00.0: eth0: Link is up at 100 Mbps, full duplex, flow control both new: sky2 0000:02:00.0 eth0: Link is up at 100 Mbps, full duplex, flow control both Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-17 06:10:05 -07:00
Joe Perches	b004ff4972	netdev_printk/dynamic_netdev_dbg: Directly call printk_emit A lot of stack is used in recursive printks with %pV. Using multiple levels of %pV (a logging function with %pV that calls another logging function with %pV) can consume more stack than necessary. Avoid excessive stack use by not calling dev_printk from netdev_printk and dynamic_netdev_dbg. Duplicate the logic and form of dev_printk instead. Make __netdev_printk static. Remove EXPORT_SYMBOL(__netdev_printk) Whitespace and brace style neatening. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-17 06:08:30 -07:00
David S. Miller	ba01dfe182	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This is another batch of updates intended for the 3.7 stream. There are not a lot of large items, but iwlwifi, mwifiex, rt2x00, ath9k, and brcmfmac all get some attention. Wei Yongjun also provides a series of small maintenance fixes. This also includes a pull of the wireless tree in order to satisfy some prerequisites for later patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 00:57:32 -04:00
Greg Kroah-Hartman	7ac3c93e5d	Merge 3.6-rc6 into tty-next This pulls in the fixes in 3.6-rc6 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-16 17:31:36 -07:00
David S. Miller	b48b63a1f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-15 11:43:53 -04:00
Linus Torvalds	a1362d504e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Use after free and new device IDs in bluetooth from Andre Guedes, Yevgeniy Melnichuk, Gustavo Padovan, and Henrik Rydberg. 2) Fix crashes with short packet lengths and VLAN in pktgen, from Nishank Trivedi. 3) mISDN calls flush_work_sync() with locks held, fix from Karsten Keil. 4) Packet scheduler gred parameters are reported to userspace improperly scaled, and WRED idling is not performed correctly. All from David Ward. 5) Fix TCP socket refcount problem in ipv6, from Julian Anastasov. 6) ibmveth device has RX queue alignment requirements which are not being explicitly met resulting in sporadic failures, fix from Santiago Leon. 7) Netfilter needs to take care when interpreting sockets attached to socket buffers, they could be time-wait minisockets. Fix from Eric Dumazet. 8) sock_edemux() has the same issue as netfilter did in #7 above, fix from Eric Dumazet. 9) Avoid infinite loops in CBQ scheduler with some configurations, from Eric Dumazet. 10) Deal with "Reflection scan: an Off-Path Attack on TCP", from Jozsef Kadlecsik. 11) SCTP overcharges socket for TX packets, fix from Thomas Graf. 12) CODEL packet scheduler should not reset it's state every time it builds a new flow, fix from Eric Dumazet. 13) Fix memory leak in nl80211, from Wei Yongjun. 14) NETROM doesn't check skb_copy_datagram_iovec() return values, from Alan Cox. 15) l2tp ethernet was using sizeof(ETH_HLEN) instead of plain ETH_HLEN, oops. From Eric Dumazet. 16) Fix selection of ath9k chips on which PA linearization and AM2PM predistoration are used, from Felix Fietkau. 17) Flow steering settings in mlx4 driver need to be validated properly, from Hadar Hen Zion. 18) bnx2x doesn't show the correct link duplex setting, from Yaniv Rosner. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits) pktgen: fix crash with vlan and packet size less than 46 bnx2x: Add missing afex code bnx2x: fix registers dumped bnx2x: correct advertisement of pause capabilities bnx2x: display the correct duplex value bnx2x: prevent timeouts when using PFC bnx2x: fix stats copying logic bnx2x: Avoid sending multiple statistics queries net: qmi_wwan: call subdriver with control intf only net_sched: gred: actually perform idling in WRED mode net_sched: gred: fix qave reporting via netlink net_sched: gred: eliminate redundant DP prio comparisons net_sched: gred: correct comment about qavg calculation in RIO mode mISDN: Fix wrong usage of flush_work_sync while holding locks netfilter: log: Fix log-level processing net-sched: sch_cbq: avoid infinite loop net: qmi_wwan: fix Gobi device probing for un2430 net: fix net/core/sock.c build error ixp4xx_hss: fix build failure due to missing linux/module.h inclusion caif: move the dereference below the NULL test ...	2012-09-14 15:34:07 -07:00
Tejun Heo	8c7f6edbda	cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them Currently, cgroup hierarchy support is a mess. cpu related subsystems behave correctly - configuration, accounting and control on a parent properly cover its children. blkio and freezer completely ignore hierarchy and treat all cgroups as if they're directly under the root cgroup. Others show yet different behaviors. These differing interpretations of cgroup hierarchy make using cgroup confusing and it impossible to co-mount controllers into the same hierarchy and obtain sane behavior. Eventually, we want full hierarchy support from all subsystems and probably a unified hierarchy. Users using separate hierarchies expecting completely different behaviors depending on the mounted subsystem is deterimental to making any progress on this front. This patch adds cgroup_subsys.broken_hierarchy and sets it to %true for controllers which are lacking in hierarchy support. The goal of this patch is two-fold. * Move users away from using hierarchy on currently non-hierarchical subsystems, so that implementing proper hierarchy support on those doesn't surprise them. * Keep track of which controllers are broken how and nudge the subsystems to implement proper hierarchy support. For now, start with a single warning message. We can whine louder later on. v2: Fixed a typo spotted by Michal. Warning message updated. v3: Updated memcg part so that it doesn't generate warning in the cases where .use_hierarchy=false doesn't make the behavior different from root.use_hierarchy=true. Fixed a typo spotted by Glauber. v4: Check ->broken_hierarchy after cgroup creation is complete so that ->create() can affect the result per Michal. Dropped unnecessary memcg root handling per Michal. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2012-09-14 12:01:16 -07:00
John W. Linville	9316f0e3c6	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-09-14 13:53:49 -04:00
Daniel Wagner	8a8e04df47	cgroup: Assign subsystem IDs during compile time WARNING: With this change it is impossible to load external built controllers anymore. In case where CONFIG_NETPRIO_CGROUP=m and CONFIG_NET_CLS_CGROUP=m is set, corresponding subsys_id should also be a constant. Up to now, net_prio_subsys_id and net_cls_subsys_id would be of the type int and the value would be assigned during runtime. By switching the macro definition IS_SUBSYS_ENABLED from IS_BUILTIN to IS_ENABLED, all _subsys_id will have constant value. That means we need to remove all the code which assumes a value can be assigned to net_prio_subsys_id and net_cls_subsys_id. A close look is necessary on the RCU part which was introduces by following patch: commit `f845172531` Author: Herbert Xu <herbert@gondor.apana.org.au> Mon May 24 09:12:34 2010 Committer: David S. Miller <davem@davemloft.net> Mon May 24 09:12:34 2010 cls_cgroup: Store classid in struct sock Tis code was added to init_cgroup_cls() / We can't use rcu_assign_pointer because this is an int. */ smp_wmb(); net_cls_subsys_id = net_cls_subsys.subsys_id; respectively to exit_cgroup_cls() net_cls_subsys_id = -1; synchronize_rcu(); and in module version of task_cls_classid() rcu_read_lock(); id = rcu_dereference(net_cls_subsys_id); if (id >= 0) classid = container_of(task_subsys_state(p, id), struct cgroup_cls_state, css)->classid; rcu_read_unlock(); Without an explicit explaination why the RCU part is needed. (The rcu_deference was fixed by exchanging it to rcu_derefence_index_check() in a later commit, but that is a minor detail.) So here is my pondering why it was introduced and why it safe to remove it now. Note that this code was copied over to net_prio the reasoning holds for that subsystem too. The idea behind the RCU use for net_cls_subsys_id is to make sure we get a valid pointer back from task_subsys_state(). task_subsys_state() is just blindly accessing the subsys array and returning the pointer. Obviously, passing in -1 as id into task_subsys_state() returns an invalid value (out of lower bound). So this code makes sure that only after module is loaded and the subsystem registered, the id is assigned. Before unregistering the module all old readers must have left the critical section. This is done by assigning -1 to the id and issuing a synchronized_rcu(). Any new readers wont call task_subsys_state() anymore and therefore it is safe to unregister the subsystem. The new code relies on the same trick, but it looks at the subsys pointer return by task_subsys_state() (remember the id is constant and therefore we allways have a valid index into the subsys array). No precautions need to be taken during module loading module. Eventually, all CPUs will get a valid pointer back from task_subsys_state() because rebind_subsystem() which is called after the module init() function will assigned subsys[net_cls_subsys_id] the newly loaded module subsystem pointer. When the subsystem is about to be removed, rebind_subsystem() will called before the module exit() function. In this case, rebind_subsys() will assign subsys[net_cls_subsys_id] a NULL pointer and then it calls synchronize_rcu(). All old readers have left by then the critical section. Any new reader wont access the subsystem anymore. At this point we are safe to unregister the subsystem. No synchronize_rcu() call is needed. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:43 -07:00
Daniel Wagner	51e4e7faba	cgroup: net_prio: Do not define task_netpioidx() when not selected task_netprioidx() should not be defined in case the configuration is CONFIG_NETPRIO_CGROUP=n. The reason is that in a following patch the net_prio_subsys_id will only be defined if CONFIG_NETPRIO_CGROUP!=n. When net_prio is not built at all any callee should only get an empty task_netprioidx() without any references to net_prio_subsys_id. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:28 -07:00
Daniel Wagner	8fb974c937	cgroup: net_cls: Do not define task_cls_classid() when not selected task_cls_classid() should not be defined in case the configuration is CONFIG_NET_CLS_CGROUP=n. The reason is that in a following patch the net_cls_subsys_id will only be defined if CONFIG_NET_CLS_CGROUP!=n. When net_cls is not built at all a callee should only get an empty task_cls_classid() without any references to net_cls_subsys_id. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:25 -07:00
Chun-Yeow Yeoh	9385d04f28	mac80211: allow re-open the blocked peer link in mesh Peer link which is blocked using the "iw mesh0 station set <MAC addr> plink_action block" is previously not able to re-open using "iw mesh0 station set <MAC addr> plink_action open". This patch is intended to solve this. If the station plink state remains at OPN_SNT once open, try block and open again should solve this problem. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-14 14:25:16 +02:00
Johannes Berg	5d8e4237d2	mac80211: change locking around ieee80211_recalc_smps Make the function acquire the necessary mutex itself to simplify the callers. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-14 14:09:25 +02:00
Johannes Berg	04b7b2ff50	mac80211: handle power constraint/country IE better Currently, mac80211 uses the power constraint IE, and reduces the regulatory max TX power by it. This can cause issues if the AP is advertising a large power constraint value matching a high TX power in its country IE, for example in this case: ... Country: US Environment: Indoor/Outdoor ... Channels [157 - 157] @ 30 dBm ... Power constraint: 13 dB ... What happened here is that our local regulatory TX power is 15 dBm, and gets reduced by 13 dB so we end up with only 2 dBm effective TX power, which is way too low. Instead, handle the country IE/power constraint IE combined and restrict our TX power to the max of the regulatory power and the maximum power advertised by the AP, in this case 17 dBm (= 30 dBm - 13 dB). Also print a message when this happens to let the user know and help us debug issues with it. Reported-by: Carl A. Cook <CACook@quantum-equities.com> Tested-by: Carl A. Cook <CACook@quantum-equities.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-14 14:06:51 +02:00
Eric W. Biederman	c6089735e7	userns: net: Call key_alloc with GLOBAL_ROOT_UID, GLOBAL_ROOT_GID instead of 0, 0 In net/dns_resolver/dns_key.c and net/rxrpc/ar-key.c make them work with user namespaces enabled where key_alloc takes kuids and kgids. Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID instead of bare 0's. Cc: Sage Weil <sage@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: linux-afs@lists.infradead.org Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 18:28:04 -07:00
Nishank Trivedi	6af773e786	pktgen: fix crash with vlan and packet size less than 46 If vlan option is being specified in the pktgen and packet size being requested is less than 46 bytes, despite being illogical request, pktgen should not crash the kernel. BUG: unable to handle kernel paging request at ffff88021fb82000 Process kpktgend_0 (pid: 1184, threadinfo ffff880215f1a000, task ffff880218544530) Call Trace: [<ffffffffa0637cd2>] ? pktgen_finalize_skb+0x222/0x300 [pktgen] [<ffffffff814f0084>] ? build_skb+0x34/0x1c0 [<ffffffffa0639b11>] pktgen_thread_worker+0x5d1/0x1790 [pktgen] [<ffffffffa03ffb10>] ? igb_xmit_frame_ring+0xa30/0xa30 [igb] [<ffffffff8107ba20>] ? wake_up_bit+0x40/0x40 [<ffffffff8107ba20>] ? wake_up_bit+0x40/0x40 [<ffffffffa0639540>] ? spin+0x240/0x240 [pktgen] [<ffffffff8107b4e3>] kthread+0x93/0xa0 [<ffffffff81615de4>] kernel_thread_helper+0x4/0x10 [<ffffffff8107b450>] ? flush_kthread_worker+0x80/0x80 [<ffffffff81615de0>] ? gs_change+0x13/0x13 The root cause of why pktgen is not able to handle this case is due to comparison of signed (datalen) and unsigned data (sizeof), which eventually passes a huge number to skb_put(). Signed-off-by: Nishank Trivedi <nistrive@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 17:10:00 -04:00
Li RongQing	5744dd9b71	ipv6: replace write lock with read lock when get route info geting route info does not write rt->rt6i_table, so replace write lock with read lock Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:53:46 -04:00
Eric Dumazet	fb0af4c74f	ipv6: route templates can be const We kmemdup() templates, so they can be const. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:52:55 -04:00
YOSHIFUJI Hideaki / 吉藤英明	91b4b04ff8	ipv6: Compare addresses only bits up to the prefix length (RFC6724). Compare bits up to the source address's prefix length only to allows DNS load balancing to continue to be used as a tie breaker. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:34:03 -04:00
YOSHIFUJI Hideaki / 吉藤英明	417962a02b	ipv6: Add labels for site-local and 6bone testing addresses (RFC6724) Added labels for site-local addresses (fec0::/10) and 6bone testing addresses (3ffe::/16) in order to depreference them. Note that the RFC introduced new rows for Teredo, ULA and 6to4 addresses in the default policy table. Some of them have different labels from ours. For backward compatibility, we do not change the "default" labels. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:34:03 -04:00
Srivatsa S. Bhat	f05ba7fccf	netprio_cgroup: Use memcpy instead of the for-loop to copy priomap Replace the current (inefficient) for-loop with memcpy, to copy priomap. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:18:40 -04:00
Srivatsa S. Bhat	d530d6df96	netprio_cgroup: Remove update_netdev_tables() since it is unnecessary The update_netdev_tables() function appears to be unnecessary, since the write_update_netdev_table() function will adjust the priomaps as and when required anyway. So drop the usage of update_netdev_tables() entirely. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:18:40 -04:00
David Ward	ba1bf474ea	net_sched: gred: actually perform idling in WRED mode gred_dequeue() and gred_drop() do not seem to get called when the queue is empty, meaning that we never start idling while in WRED mode. And since qidlestart is not stored by gred_store_wred_set(), we would never stop idling while in WRED mode if we ever started. This messes up the average queue size calculation that influences packet marking/dropping behavior. Now, we start WRED mode idling as we are removing the last packet from the queue. Also we now actually stop WRED mode idling when we are enqueuing a packet. Cc: Bruce Osler <brosler@cisco.com> Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:10:13 -04:00
David Ward	1fe37b106b	net_sched: gred: fix qave reporting via netlink q->vars.qavg is a Wlog scaled value, but q->backlog is not. In order to pass q->vars.qavg as the backlog value, we need to un-scale it. Additionally, the qave value returned via netlink should not be Wlog scaled, so we need to un-scale the result of red_calc_qavg(). This caused artificially high values for "Average Queue" to be shown by 'tc -s -d qdisc', but did not affect the actual operation of GRED. Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:10:13 -04:00
David Ward	c22e464022	net_sched: gred: eliminate redundant DP prio comparisons Each pair of DPs only needs to be compared once when searching for a non-unique prio value. Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:10:13 -04:00
David Ward	e29fe837bf	net_sched: gred: correct comment about qavg calculation in RIO mode Signed-off-by: David Ward <david.ward@ll.mit.edu> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 16:10:13 -04:00
David S. Miller	b0e61d98c6	Merge branch 'master' of git://1984.lsi.us.es/nf-next Pablo Neira Ayuso says: ==================== The following patchset contains four Netfilter updates, mostly targeting to fix issues added with IPv6 NAT, and one little IPVS update for net-next: * Remove unneeded conditional free of skb in nfnetlink_queue, from Wei Yongjun. * One semantic path from coccinelle detected the use of list_del + INIT_LIST_HEAD, instead of list_del_init, again from Wei Yongjun. * Fix out-of-bound memory access in the NAT address selection, from Florian Westphal. This was introduced with the IPv6 NAT patches. * Two fixes for crashes that were introduced in the recently merged IPv6 NAT support, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 14:24:31 -04:00
David S. Miller	930521695c	Merge branch 'master' of git://1984.lsi.us.es/nf Pablo Neira Ayuso say: ==================== The following patchset contains four updates for your net tree, they are: * Fix crash on timewait sockets, since the TCP early demux was added, in nfnetlink_log, from Eric Dumazet. * Fix broken syslog log-level for xt_LOG and ebt_log since printk format was converted from <.> to a 2 bytes pattern using ASCII SOH, from Joe Perches. * Two security fixes for the TCP connection tracking targeting off-path attacks, from Jozsef Kadlecsik. The problem was discovered by Jan Wrobel and it is documented in: http://mixedbit.org/reflection_scan/reflection_scan.pdf. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-13 13:53:06 -04:00
Linus Torvalds	22b4e63ebe	NFS client bugfixes for Linux 3.6 - Final (hopefully) fix for the range checking code in NFSv4 getacl. This should fix the Oopses being seen when the acl size is close to PAGE_SIZE. - Fix a regression with the legacy binary mount code - Fix a regression in the readdir cookieverf initialisation - Fix an RPC over UDP regression - Ensure that we report all errors in the NFSv4 open code - Ensure that fsync() reports all relevant synchronisation errors. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQUN+KAAoJEGcL54qWCgDyHGcQAKj7MYVDIjhdmsVGGNWXUCnf X0LVg/ajh+vjusK+hmquzcJikZqgce5IU5DW4vcFr1X8BgP+R51UVvU0KksByD5H ourV2JVCztAQzQ4WWOsZAGqN0tooJUjyEjl4lEiDsQCF4Nk1HWbuCHeYuX74OToZ jrgedj0EZ6zb7TOizvbgU/7lI+FKu3Hlw6+u27M9phtSuefJdYSHZHYVMOX81qPh k0zgZ4tuLIaDuBB84iCrPwNt9icnevq6cIc+AGluI6xhDw+foPvUaUR+OUI420IZ tunNzP2So+nNoyjEiyMVENaCdEyA75XAmmGHTUUdBiVOsMV4HF/TqvTtSsjk2mN1 FbZVvtjD6srjsQaKdVmqMIZBdhY9LSMLIQVqb4H2rYP6Mwq06WTuyCxf5YhzFfoy 2tai7JuqBkTAWfKB8ESWywV6Qk/MkUWRAOBO6ksS66gAwpcFDj6nfeAdwaEmoYKc uzLUIRZaclPMZf661cs1fWeFV5XOnCL7je4owgTRGs7MHooWHPcC3273fEJqnhFz 5MkC7nfmUiGcdO1v0mfYTEtMj9Pp9icBoZcVTGn4eZIHzvhhZOx//8LhyBfS+jll bKjaLZ1rErvIqwnSGcB7PK2yBYY9P6ZaxWjOrAAncZmiOxfhN0hvCo54jNOr/VZ+ atsDEAuqSTeK7ouBqyO4 =e5yE -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Final (hopefully) fix for the range checking code in NFSv4 getacl. This should fix the Oopses being seen when the acl size is close to PAGE_SIZE. - Fix a regression with the legacy binary mount code - Fix a regression in the readdir cookieverf initialisation - Fix an RPC over UDP regression - Ensure that we report all errors in the NFSv4 open code - Ensure that fsync() reports all relevant synchronisation errors. * tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: fsync() must exit with an error if page writeback failed SUNRPC: Fix a UDP transport regression NFS: return error from decode_getfh in decode open NFSv4: Fix buffer overflow checking in __nfs4_get_acl_uncached NFSv4: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl NFS: Fix a problem with the legacy binary mount code NFS: Fix the initialisation of the readdir 'cookieverf' array	2012-09-13 09:04:13 +08:00
Pablo Neira Ayuso	c7cbb9173d	netfilter: ctnetlink: fix module auto-load in ctnetlink_parse_nat (`c7232c9` netfilter: add protocol independent NAT core) added incorrect locking for the module auto-load case in ctnetlink_parse_nat. That function is always called from ctnetlink_create_conntrack which requires no locking. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-12 17:27:37 +02:00
Joe Perches	16af511a66	netfilter: log: Fix log-level processing auto75914331@hushmail.com reports that iptables does not correctly output the KERN_<level>. $IPTABLES -A RULE_0_in -j LOG --log-level notice --log-prefix "DENY in: " result with linux 3.6-rc5 Sep 12 06:37:29 xxxxx kernel: <5>DENY in: IN=eth0 OUT= MAC=....... result with linux 3.5.3 and older: Sep 9 10:43:01 xxxxx kernel: DENY in: IN=eth0 OUT= MAC...... commit `04d2c8c83d` ("printk: convert the format for KERN_<LEVEL> to a 2 byte pattern") updated the syslog header style but did not update netfilter uses. Do so. Use KERN_SOH and string concatenation instead of "%c" KERN_SOH_ASCII as suggested by Eric Dumazet. Signed-off-by: Joe Perches <joe@perches.com> cc: auto75914331@hushmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-12 17:17:35 +02:00
Eric Dumazet	bdfc87f7d1	net-sched: sch_cbq: avoid infinite loop Its possible to setup a bad cbq configuration leading to an infinite loop in cbq_classify() DEV_OUT=eth0 ICMP="match ip protocol 1 0xff" U32="protocol ip u32" DST="match ip dst" tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 \ bandwidth 100mbit tc class add dev $DEV_OUT parent 1: classid 1:1 cbq \ rate 512kbit allot 1500 prio 5 bounded isolated tc filter add dev $DEV_OUT parent 1: prio 3 $U32 \ $ICMP $DST 192.168.3.234 flowid 1: Reported-by: Denys Fedoryschenko <denys@visp.net.lb> Tested-by: Denys Fedoryschenko <denys@visp.net.lb> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-11 22:20:43 -04:00
Johannes Berg	3a6a0d8ee8	mac80211: remove unneeded CONFIG_PM ifdef The functions are only called if CONFIG_PM is set as the callers are under an ifdef, so there's no need to also define no-op functions. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-11 14:43:43 +02:00
Randy Dunlap	1c463e57b3	net: fix net/core/sock.c build error Fix net/core/sock.c build error when CONFIG_INET is not enabled: net/built-in.o: In function `sock_edemux': (.text+0xd396): undefined reference to `inet_twsk_put' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 16:44:45 -04:00
Amerigo Wang	fdd6681d92	ipv6: remove some useless RCU read lock After this commit: commit `97cac0821a` Author: David S. Miller <davem@davemloft.net> Date: Mon Jul 2 22:43:47 2012 -0700 ipv6: Store route neighbour in rt6_info struct. we no longer use RCU to protect route neighbour. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 16:31:18 -04:00
Wei Yongjun	566f26aa70	caif: move the dereference below the NULL test The dereference should be moved below the NULL test. spatch with a semantic match is used to found this. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 16:13:31 -04:00
Eric Dumazet	b6069a9570	filter: add MOD operation Add a new ALU opcode, to compute a modulus. Commit `ffe06c17af` used an ancillary to implement XOR_X, but here we reserve one of the available ALU opcode to implement both MOD_X and MOD_K Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: George Bakos <gbakos@alpinista.org> Cc: Jay Schulist <jschlst@samba.org> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 15:44:56 -04:00
Eric W. Biederman	c6bb8136c9	xfrm: Report user triggered expirations against the users socket When a policy expiration is triggered from user space the request travels through km_policy_expired and ultimately into xfrm_exp_policy_notify which calls build_polexpire. build_polexpire uses the netlink port passed to km_policy_expired as the source port for the netlink message it builds. When a state expiration is triggered from user space the request travles through km_state_expired and ultimately into xfrm_exp_state_notify which calls build_expire. build_expire uses the netlink port passed to km_state_expired as the source port for the netlink message it builds. Pass nlh->nlmsg_pid from the user generated netlink message that requested the expiration to km_policy_expired and km_state_expired instead of current->pid which is not a netlink port number. Cc: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 15:34:30 -04:00
Eric W. Biederman	15e473046c	netlink: Rename pid to portid to avoid confusion It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 15:30:41 -04:00
Felix Fietkau	1bad538243	mac80211: validate skb->dev in the tx status path skb->dev might contain a stale reference to a device that was already deleted, and using it unchecked can lead to invalid pointer accesses. Since this is only used for nl80211 tx, iterate over active interfaces to find a match for skb->dev, and discard the tx status if the device is gone. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-10 18:44:58 +02:00
J. Bruce Fields	eccf50c129	nfsd: remove unused listener-removal interfaces You can use nfsd/portlist to give nfsd additional sockets to listen on. In theory you can also remove listening sockets this way. But nobody's ever done that as far as I can tell. Also this was partially broken in 2.6.25, by `a217813f90` "knfsd: Support adding transports by writing portlist file". (Note that we decide whether to take the "delfd" case by checking for a digit--but what's actually expected in that case is something made by svc_one_sock_name(), which won't begin with a digit.) So, let's just rip out this stuff. Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 10:55:19 -04:00
Eliad Peller	b22cfcfcae	mac80211: use call_rcu() on sta deletion mac80211 calls synchronize_rcu() on sta deletion, which increase the roaming time significantly. Convert it into a call_rcu() mechanism, in order to avoid blocking. Since some of the cleanup functions might sleep, schedule from the call_rcu callback a new work that will do the actual cleanup. In order to make sure the cleanup occurs before the interface went down, flush local->workqueue on ieee80211_do_stop(). Signed-off-by: Yoni Divinsky <yoni.divinsky@ti.com> Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-10 12:44:17 +02:00
Johannes Berg	e548c49e6d	mac80211: add key flag for management keys Mark keys that might be used to receive management frames so drivers can fall back on software crypto for them if they don't support hardware offload. As the new flag is only set correctly for RX keys and the existing IEEE80211_KEY_FLAG_SW_MGMT flag can only affect TX, also rename the latter to IEEE80211_KEY_FLAG_SW_MGMT_TX. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-10 11:29:17 +02:00
Wei Yongjun	0edd94887d	ipvs: use list_del_init instead of list_del/INIT_LIST_HEAD Using list_del_init() instead of list_del() + INIT_LIST_HEAD(). spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-10 09:48:55 +02:00
Jozsef Kadlecsik	4a70bbfaef	netfilter: Validate the sequence number of dataless ACK packets as well We spare nothing by not validating the sequence number of dataless ACK packets and enabling it makes harder off-path attacks. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-09 22:13:49 +02:00
Jozsef Kadlecsik	64f509ce71	netfilter: Mark SYN/ACK packets as invalid from original direction Clients should not send such packets. By accepting them, we open up a hole by wich ephemeral ports can be discovered in an off-path attack. See: "Reflection scan: an Off-Path Attack on TCP" by Jan Wrobel, http://arxiv.org/abs/1201.2074 Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-09 22:13:30 +02:00
Wei Yongjun	a67299556e	netfilter: nfnetlink_queue: remove pointless conditional before kfree_skb() Remove pointless conditional before kfree_skb(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-09 20:33:57 +02:00
Florian Westphal	5693d68df6	netfilter: nf_nat: fix out-of-bounds access in address selection include/linux/jhash.h:138:16: warning: array subscript is above array bounds [jhash2() expects the number of u32 in the key] Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-09 20:18:55 +02:00
Pablo Neira Ayuso	9f00d9776b	netlink: hide struct module parameter in netlink_kernel_create This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-08 18:46:30 -04:00
Pablo Neira Ayuso	9785e10aed	netlink: kill netlink_set_nonroot Replace netlink_set_nonroot by one new field `flags' in struct netlink_kernel_cfg that is passed to netlink_kernel_create. This patch also renames NL_NONROOT_* to NL_CFG_F_NONROOT_* since now the flags field in nl_table is generic (so we can add more flags if needed in the future). Also adjust all callers in the net-next tree to use these flags instead of netlink_set_nonroot. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-08 18:45:27 -04:00
Chema Gonzalez	6862234238	net: small bug on rxhash calculation In the current rxhash calculation function, while the sorting of the ports/addrs is coherent (you get the same rxhash for packets sharing the same 4-tuple, in both directions), ports and addrs are sorted independently. This implies packets from a connection between the same addresses but crossed ports hash to the same rxhash. For example, traffic between A=S:l and B=L:s is hashed (in both directions) from {L, S, {s, l}}. The same rxhash is obtained for packets between C=S:s and D=L:l. This patch ensures that you either swap both addrs and ports, or you swap none. Traffic between A and B, and traffic between C and D, get their rxhash from different sources ({L, S, {l, s}} for A<->B, and {L, S, {s, l}} for C<->D) The patch is co-written with Eric Dumazet <edumazet@google.com> Signed-off-by: Chema Gonzalez <chema@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-08 18:41:48 -04:00
Andrei Emeltchenko	e71dfabab0	Bluetooth: AMP: Add Read Data Block Size to amp_init Add Read Data Block Size HCI cmd to AMP initialization, then it makes possible to send data. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-08 18:14:11 -03:00
Andrei Emeltchenko	93f71941c6	Bluetooth: trivial: Remove empty line Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-08 17:27:27 -03:00
Andrei Emeltchenko	9472007c62	Bluetooth: trivial: Make hci_chan_del return void Return code is not needed in hci_chan_del Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-08 17:27:18 -03:00
Andrei Emeltchenko	6b536b5e5e	Bluetooth: Remove unneeded zero init hdev is allocated with kzalloc so zero initialization is not needed. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-08 16:53:48 -03:00
Eric Dumazet	ba8bd0ea98	net: rt_cache_flush() cleanup We dont use jhash anymore since route cache removal, so we can get rid of get_random_bytes() calls for rt_genid changes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 17:23:17 -04:00
John W. Linville	fac805f8c1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2012-09-07 15:07:55 -04:00
John W. Linville	4a3e12fd7a	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2012-09-07 14:49:46 -04:00
Nicolas Dichtel	4ccfe6d410	ipv4/route: arg delay is useless in rt_cache_flush() Since route cache deletion (`89aef8921b`), delay is no more used. Remove it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:44:08 -04:00
Eric W. Biederman	dbe9a4173e	scm: Don't use struct ucred in NETLINK_CB and struct scm_cookie. Passing uids and gids on NETLINK_CB from a process in one user namespace to a process in another user namespace can result in the wrong uid or gid being presented to userspace. Avoid that problem by passing kuids and kgids instead. - define struct scm_creds for use in scm_cookie and netlink_skb_parms that holds uid and gid information in kuid_t and kgid_t. - Modify scm_set_cred to fill out scm_creds by heand instead of using cred_to_ucred to fill out struct ucred. This conversion ensures userspace does not get incorrect uid or gid values to look at. - Modify scm_recv to convert from struct scm_creds to struct ucred before copying credential values to userspace. - Modify __scm_send to populate struct scm_creds on in the scm_cookie, instead of just copying struct ucred from userspace. - Modify netlink_sendmsg to copy scm_creds instead of struct ucred into the NETLINK_CB. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:42:05 -04:00
John W. Linville	777bf135b7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem John W. Linville says: ==================== Please pull these fixes intended for 3.6. There are more commits here than I would like -- I got a bit behind while I was stalking Steven Rostedt in San Diego last week... I'll slow it down after this! There are a couple of pulls here. One is from Johannes: "Please pull (according to the below information) to get a few fixes. * a fix to properly disconnect in the driver when authentication or association fails * a fix to prevent invalid information about mesh paths being reported to userspace * a memory leak fix in an nl80211 error path" The other comes via Gustavo: "A few updates for the 3.6 kernel. There are two btusb patches to add more supported devices through the new USB_VENDOR_AND_INTEFACE_INFO() macro and another one that add a new device id for a Sony Vaio laptop, one fix for a user-after-free and, finally, two patches from Vinicius to fix a issue in SMP pairing." Along with those... Arend van Spriel provides a fix for a use-after-free bug in brcmfmac. Daniel Drake avoids a hang by not trying to touch the libertas hardware duing suspend if it is already powered-down. Felix Fietkau provides a batch of ath9k fixes that adress some potential problems with power settings, as well as a fix to avoid a potential interrupt storm. Gertjan van Wingerde provides a register-width fix for rt2x00, and a rt2x00 fix to prevent incorrectly detecting the rfkill status. He also provides a device ID patch. Hante Meuleman gives us three brcmfmac fixes, one that properly initializes a command structure, one that fixes a race condition that could lose usb requests, and one that removes some log spam. Marc Kleine-Budde offers an rt2x00 fix for a voltage setting on some specific devices. Mohammed Shafi Shajakhan sent an ath9k fix to avoid a crash related to using timers that aren't allocated when 2 wire bluetooth coexistence hardware is in use. Sergei Poselenov changes rt2800usb to do some validity checking for received packets, avoiding crashes on an ARM Soc. Stone Piao gives us an mwifiex fix for an incorrectly set skb length value for a command buffer. All of these are localized to their specific drivers, and relatively small. The power-related patches from Felix are bigger than I would like, but I merged them in consideration of their isolation to ath9k and the sensitive nature of power settings in wireless devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:38:50 -04:00
Eric Dumazet	d679c5324d	igmp: avoid drop_monitor false positives igmp should call consume_skb() for all correctly processed packets, to avoid false dropwatch/drop_monitor false positives. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:17:10 -04:00
Nicolas Dichtel	b4949ab269	ipv6: fix handling of throw routes It's the same problem that previous fix about blackhole and prohibit routes. When adding a throw route, it was handled like a classic route. Moreover, it was only possible to add this kind of routes by specifying an interface. Before the patch: $ ip route add throw 2001::2/128 RTNETLINK answers: No such device $ ip route add throw 2001::2/128 dev eth0 $ ip -6 route \| grep 2001::2 2001::2 dev eth0 metric 1024 After: $ ip route add throw 2001::2/128 $ ip -6 route \| grep 2001::2 throw 2001::2 dev lo metric 1024 error -11 Reported-by: Markus Stenberg <markus.stenberg@iki.fi> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:17:10 -04:00
Eric Dumazet	979402b16c	udp: increment UDP_MIB_INERRORS if copy failed In UDP recvmsg(), we miss an increase of UDP_MIB_INERRORS if the copy of skb to userspace failed for whatever reason. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 12:56:00 -04:00
Trond Myklebust	f39c1bfb5a	SUNRPC: Fix a UDP transport regression Commit `43cedbf0e8` (SUNRPC: Ensure that we grab the XPRT_LOCK before calling xprt_alloc_slot) is causing hangs in the case of NFS over UDP mounts. Since neither the UDP or the RDMA transport mechanism use dynamic slot allocation, we can skip grabbing the socket lock for those transports. Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA case. Note that the NFSv4.1 back channel assigns the slot directly through rpc_run_bc_task, so we can ignore that case. Reported-by: Dick Streefland <dick.streefland@altium.nl> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [>= 3.1]	2012-09-07 11:43:49 -04:00
Antonio Quartulli	2cc59e784b	mac80211: reply to AUTH with DEAUTH if sta allocation fails in IBSS Whenever a host gets an AUTH frame it first allocates a new station and then replies with another AUTH frame. However, if sta allocations fails the host should send a DEAUTH frame instead to tell the other end that something went wrong. Signed-off-by: Antonio Quartulli <ordex@autistici.org> [reword commit message a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-07 13:50:47 +02:00
Antonio Quartulli	6ae16775d6	mac80211: move ieee80211_send_deauth_disassoc outside mlme code Move ieee80211_send_deauth_disassoc() to util.c to make it available for the rest of the mac80211 code. Signed-off-by: Antonio Quartulli <ordex@autistici.org> [reword commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-07 13:50:34 +02:00
Peter Senna Tschudin	316b6b5df7	net/mac80211/scan.c: removes unnecessary semicolon removes unnecessary semicolon Found by Coccinelle: http://coccinelle.lip6.fr/ Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-07 13:30:06 +02:00
Simon Derr	43def35c10	net/9p: Check errno validity While working on a modified server I had the Linux clients crash a few times. This lead me to find this: Some error codes are directly extracted from the server replies. A malformed server reply could contain an invalid error code, with a very large value. If this value is then passed to ERR_PTR() it will not be properly detected as an error code by IS_ERR() and as a result the kernel will dereference an invalid pointer. This patch tries to avoid this. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2012-09-06 13:54:55 -05:00
Eric Dumazet	7ab4551f3b	tcp: fix TFO regression Fengguang Wu reported various panics and bisected to commit `8336886f78` (tcp: TCP Fast Open Server - support TFO listeners) Fix this by making sure socket is a TCP socket before accessing TFO data structures. [ 233.046014] kfree_debugcheck: out of range ptr ea6000000bb8h. [ 233.047399] ------------[ cut here ]------------ [ 233.048393] kernel BUG at /c/kernel-tests/src/stable/mm/slab.c:3074! [ 233.048393] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 233.048393] Modules linked in: [ 233.048393] CPU 0 [ 233.048393] Pid: 3929, comm: trinity-watchdo Not tainted 3.6.0-rc3+ #4192 Bochs Bochs [ 233.048393] RIP: 0010:[<ffffffff81169653>] [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d [ 233.048393] RSP: 0018:ffff88000facbca8 EFLAGS: 00010092 [ 233.048393] RAX: 0000000000000031 RBX: 0000ea6000000bb8 RCX: 00000000a189a188 [ 233.048393] RDX: 000000000000a189 RSI: ffffffff8108ad32 RDI: ffffffff810d30f9 [ 233.048393] RBP: ffff88000facbcb8 R08: 0000000000000002 R09: ffffffff843846f0 [ 233.048393] R10: ffffffff810ae37c R11: 0000000000000908 R12: 0000000000000202 [ 233.048393] R13: ffffffff823dbd5a R14: ffff88000ec5bea8 R15: ffffffff8363c780 [ 233.048393] FS: 00007faa6899c700(0000) GS:ffff88001f200000(0000) knlGS:0000000000000000 [ 233.048393] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 233.048393] CR2: 00007faa6841019c CR3: 0000000012c82000 CR4: 00000000000006f0 [ 233.048393] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 233.048393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 233.048393] Process trinity-watchdo (pid: 3929, threadinfo ffff88000faca000, task ffff88000faec600) [ 233.048393] Stack: [ 233.048393] 0000000000000000 0000ea6000000bb8 ffff88000facbce8 ffffffff8116ad81 [ 233.048393] ffff88000ff588a0 ffff88000ff58850 ffff88000ff588a0 0000000000000000 [ 233.048393] ffff88000facbd08 ffffffff823dbd5a ffffffff823dbcb0 ffff88000ff58850 [ 233.048393] Call Trace: [ 233.048393] [<ffffffff8116ad81>] kfree+0x5f/0xca [ 233.048393] [<ffffffff823dbd5a>] inet_sock_destruct+0xaa/0x13c [ 233.048393] [<ffffffff823dbcb0>] ? inet_sk_rebuild_header +0x319/0x319 [ 233.048393] [<ffffffff8231c307>] __sk_free+0x21/0x14b [ 233.048393] [<ffffffff8231c4bd>] sk_free+0x26/0x2a [ 233.048393] [<ffffffff825372db>] sctp_close+0x215/0x224 [ 233.048393] [<ffffffff810d6835>] ? lock_release+0x16f/0x1b9 [ 233.048393] [<ffffffff823daf12>] inet_release+0x7e/0x85 [ 233.048393] [<ffffffff82317d15>] sock_release+0x1f/0x77 [ 233.048393] [<ffffffff82317d94>] sock_close+0x27/0x2b [ 233.048393] [<ffffffff81173bbe>] __fput+0x101/0x20a [ 233.048393] [<ffffffff81173cd5>] ____fput+0xe/0x10 [ 233.048393] [<ffffffff810a3794>] task_work_run+0x5d/0x75 [ 233.048393] [<ffffffff8108da70>] do_exit+0x290/0x7f5 [ 233.048393] [<ffffffff82707415>] ? retint_swapgs+0x13/0x1b [ 233.048393] [<ffffffff8108e23f>] do_group_exit+0x7b/0xba [ 233.048393] [<ffffffff8108e295>] sys_exit_group+0x17/0x17 [ 233.048393] [<ffffffff8270de10>] tracesys+0xdd/0xe2 [ 233.048393] Code: 59 01 5d c3 55 48 89 e5 53 41 50 0f 1f 44 00 00 48 89 fb e8 d4 b0 f0 ff 84 c0 75 11 48 89 de 48 c7 c7 fc fa f7 82 e8 0d 0f 57 01 <0f> 0b 5f 5b 5d c3 55 48 89 e5 0f 1f 44 00 00 48 63 87 d8 00 00 [ 233.048393] RIP [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d [ 233.048393] RSP <ffff88000facbca8> Reported-by: Fengguang Wu <wfg@linux.intel.com> Tested-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: "H.K. Jerry Chu" <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: H.K. Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-06 14:21:10 -04:00
Michal Kazior	23a85b45cf	mac80211: refactor set_channel_type Split functionality for further reuse. Will prevent code duplication when channel context channel_type merging is introduced. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 17:55:00 +02:00
Eliad Peller	964b19f977	mac80211: use synchronize_net() on key destroying __ieee80211_key_destroy() calls synchronize_rcu() in order to sync the tx path before destroying the key. However, synching the tx path can be done with synchronize_net() as well, which is usually faster (the timing might be important for roaming scenarios). Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 17:29:23 +02:00
Johannes Berg	761a48d260	mac80211: check power constraint IE size when parsing The power constraint IE is always a single byte so check the size when parsing instead of later. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 17:11:00 +02:00
Johannes Berg	00b14825ee	Merge remote-tracking branch 'wireless-next/master' into mac80211-next	2012-09-06 17:05:28 +02:00
Johannes Berg	882a7c69d3	mac80211: disconnect if channel switch fails Disconnect from the AP if channel switching in the driver failed or if the new channel is unavailable. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 17:05:24 +02:00
Johannes Berg	30dd3edf97	mac80211: don't hang on to sched_scan_ies There's no need to keep a copy of the scheduled scan IEs after the driver has been told, if it requires a copy it must make one. Therefore, we can move sched_scan_ies into the function. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 15:56:35 +02:00
Johannes Berg	944b9e375d	Merge remote-tracking branch 'mac80211/master' into mac80211-next Pull in mac80211.git to let the next patch apply without conflicts, also resolving a hwsim conflict. Conflicts: drivers/net/wireless/mac80211_hwsim.c Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 15:56:02 +02:00
Eric Dumazet	0626af3139	netfilter: take care of timewait sockets Sami Farin reported crashes in xt_LOG because it assumes skb->sk is a full blown socket. Since (`41063e9` ipv4: Early TCP socket demux), we can have skb->sk pointing to a timewait socket. Same fix is needed in nfnetlink_log. Diagnosed-by: Florian Westphal <fw@strlen.de> Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-06 14:28:18 +02:00
Wei Yongjun	a4ed53466a	mac80211: use list_move instead of list_del/list_add Using list_move() instead of list_del() + list_add(). spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 11:55:47 +02:00
Mikulas Patocka	ed6fe9d614	Fix order of arguments to compat_put_time[spec\|val] Commit `644595f896` ("compat: Handle COMPAT_USE_64BIT_TIME in net/socket.c") introduced a bug where the helper functions to take either a 64-bit or compat time[spec\|val] got the arguments in the wrong order, passing the kernel stack pointer off as a user pointer (and vice versa). Because of the user address range check, that in turn then causes an EFAULT due to the user pointer range checking failing for the kernel address. Incorrectly resuling in a failed system call for 32-bit processes with a 64-bit kernel. On odder architectures like HP-PA (with separate user/kernel address spaces), it can be used read kernel memory. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-05 18:34:13 -07:00
Nicolas Dichtel	ef2c7d7b59	ipv6: fix handling of blackhole and prohibit routes When adding a blackhole or a prohibit route, they were handling like classic routes. Moreover, it was only possible to add this kind of routes by specifying an interface. Bug already reported here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498498 Before the patch: $ ip route add blackhole 2001::1/128 RTNETLINK answers: No such device $ ip route add blackhole 2001::1/128 dev eth0 $ ip -6 route \| grep 2001 2001::1 dev eth0 metric 1024 After: $ ip route add blackhole 2001::1/128 $ ip -6 route \| grep 2001 blackhole 2001::1 dev lo metric 1024 error -22 v2: wrong patch v3: add a field fc_type in struct fib6_config to store RTN_* type Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 17:49:28 -04:00
Eric Dumazet	23d3b8bfb8	net: qdisc busylock needs lockdep annotations It seems we need to provide ability for stacked devices to use specific lock_class_key for sch->busylock We could instead default l2tpeth tx_queue_len to 0 (no qdisc), but a user might use a qdisc anyway. (So same fixes are probably needed on non LLTX stacked drivers) Noticed while stressing L2TPV3 setup : ====================================================== [ INFO: possible circular locking dependency detected ] 3.6.0-rc3+ #788 Not tainted ------------------------------------------------------- netperf/4660 is trying to acquire lock: (l2tpsock){+.-...}, at: [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] but task is already holding lock: (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(&sch->busylock)->rlock){+.-...}: [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffff817499fc>] _raw_spin_lock_irqsave+0x4c/0x60 [<ffffffff81074872>] __wake_up+0x32/0x70 [<ffffffff8136d39e>] tty_wakeup+0x3e/0x80 [<ffffffff81378fb3>] pty_write+0x73/0x80 [<ffffffff8136cb4c>] tty_put_char+0x3c/0x40 [<ffffffff813722b2>] process_echoes+0x142/0x330 [<ffffffff813742ab>] n_tty_receive_buf+0x8fb/0x1230 [<ffffffff813777b2>] flush_to_ldisc+0x142/0x1c0 [<ffffffff81062818>] process_one_work+0x198/0x760 [<ffffffff81063236>] worker_thread+0x186/0x4b0 [<ffffffff810694d3>] kthread+0x93/0xa0 [<ffffffff81753e24>] kernel_thread_helper+0x4/0x10 -> #0 (l2tpsock){+.-...}: [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10 [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50 [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70 [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290 [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00 [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890 [<ffffffff815db019>] ip_output+0x59/0xf0 [<ffffffff815da36d>] ip_local_out+0x2d/0xa0 [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680 [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60 [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30 [<ffffffff815f5300>] tcp_push_one+0x30/0x40 [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040 [<ffffffff81614495>] inet_sendmsg+0x125/0x230 [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0 [<ffffffff81579ece>] sys_sendto+0xfe/0x130 [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&sch->busylock)->rlock); lock(l2tpsock); lock(&(&sch->busylock)->rlock); lock(l2tpsock); * DEADLOCK * 5 locks held by netperf/4660: #0: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e581c>] tcp_sendmsg+0x2c/0x1040 #1: (rcu_read_lock){.+.+..}, at: [<ffffffff815da3e0>] ip_queue_xmit+0x0/0x680 #2: (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9ac5>] ip_finish_output+0x135/0x890 #3: (rcu_read_lock_bh){.+....}, at: [<ffffffff81595820>] dev_queue_xmit+0x0/0xe00 #4: (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00 stack backtrace: Pid: 4660, comm: netperf Not tainted 3.6.0-rc3+ #788 Call Trace: [<ffffffff8173dbf8>] print_circular_bug+0x1fb/0x20c [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10 [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0 [<ffffffff810a3f44>] ? __lock_acquire+0x2e4/0x1b10 [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50 [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70 [<ffffffff81594e0e>] ? dev_hard_start_xmit+0x5e/0xa70 [<ffffffff81595961>] ? dev_queue_xmit+0x141/0xe00 [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290 [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00 [<ffffffff81595820>] ? dev_hard_start_xmit+0xa70/0xa70 [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890 [<ffffffff815d9ac5>] ? ip_finish_output+0x135/0x890 [<ffffffff815db019>] ip_output+0x59/0xf0 [<ffffffff815da36d>] ip_local_out+0x2d/0xa0 [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680 [<ffffffff815da3e0>] ? ip_local_out+0xa0/0xa0 [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60 [<ffffffff815fa25e>] ? tcp_md5_do_lookup+0x18e/0x1a0 [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30 [<ffffffff815f5300>] tcp_push_one+0x30/0x40 [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040 [<ffffffff81614495>] inet_sendmsg+0x125/0x230 [<ffffffff81614370>] ? inet_create+0x6b0/0x6b0 [<ffffffff8157e6e2>] ? sock_update_classid+0xc2/0x3b0 [<ffffffff8157e750>] ? sock_update_classid+0x130/0x3b0 [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0 [<ffffffff81162579>] ? fget_light+0x3f9/0x4f0 [<ffffffff81579ece>] sys_sendto+0xfe/0x130 [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8174a0b0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810757e3>] ? finish_task_switch+0x83/0xf0 [<ffffffff810757a6>] ? finish_task_switch+0x46/0xf0 [<ffffffff81752cb7>] ? sysret_check+0x1b/0x56 [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 17:49:27 -04:00
Stephen Rothwell	2c96932229	netfilter: ipv6: using csum_ipv6_magic requires net/ip6_checksum.h Fixes this build error: net/ipv6/netfilter/nf_nat_l3proto_ipv6.c: In function 'nf_nat_ipv6_csum_recalc': net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:144:4: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration] Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 17:46:06 -04:00
Nikolay Aleksandrov	c6c13965f4	net: add unknown state to sysfs NIC duplex export Currently when the NIC duplex state is DUPLEX_UNKNOWN it is exported as full through sysfs, this patch adds support for DUPLEX_UNKNOWN. It is handled the same way as in ethtool. Signed-off-by: Nikolay Aleksandrov <naleksan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 17:40:07 -04:00
Julian Anastasov	d013ef2aba	tcp: fix possible socket refcount problem for ipv6 commit `144d56e910` ("tcp: fix possible socket refcount problem") is missing the IPv6 part. As tcp_release_cb is shared by both protocols we should hold sock reference for the TCP_MTU_REDUCED_DEFERRED bit. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 17:16:25 -04:00
Huang Shijie	f21ec3d2d4	serial: add a new helper function In most of the time, the driver needs to check if the cts flow control is enabled. But now, the driver checks the ASYNC_CTS_FLOW flag manually, which is not a grace way. So add a new wraper function to make the code tidy and clean. Signed-off-by: Huang Shijie <shijie8@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-05 13:28:35 -07:00
Julian Anastasov	d23ff70164	tcp: add generic netlink support for tcp_metrics Add support for genl "tcp_metrics". No locking is changed, only that now we can unlink and delete entries after grace period. We implement get/del for single entry and dump to support show/flush filtering in user space. Del without address attribute causes flush for all addresses, sadly under genl_mutex. v2: - remove rcu_assign_pointer as suggested by Eric Dumazet, it is not needed because there are no other writes under lock - move the flushing code in tcp_metrics_flush_all v3: - remove synchronize_rcu on flush as suggested by Eric Dumazet Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 15:15:02 -04:00
John W. Linville	785a7de9ee	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2012-09-05 14:48:15 -04:00
John W. Linville	518eefe1b6	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2012-09-05 14:46:30 -04:00
Pablo Neira Ayuso	00545bec94	netfilter: fix crash during boot if NAT has been compiled built-in (`c7232c9` netfilter: add protocol independent NAT core) introduced a problem that leads to crashing during boot due to NULL pointer dereference. It seems that xt_nat calls xt_register_target() before xt_init(): net/netfilter/x_tables.c:static struct xt_af xt; is NULL and we crash on xt_register_target(struct xt_target target) { u_int8_t af = target->family; int ret; ret = mutex_lock_interruptible(&xt[af].mutex); ... Fix this by changing the linking order, to make sure that x_tables comes before xt_nat. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-05 18:35:51 +02:00
Hila Gonen	768be59f30	cfg80211: fix indentation checkpatch pointed out an issue, fix it. Signed-off-by: Hila Gonen <hila.gonen@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-05 16:54:05 +02:00
Arend van Spriel	e5f5b2fb07	wext: include wireless event id when it has a size problem The wext code checks is the event data is within size limits. When this check fails a message is logged with violating size. This patch adds the event id to put us on the right track for resolving that violation. Reviewed-by: Hante Meuleman <meuleman@broadcom.com> Signed-off-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-05 16:12:44 +02:00
LEO Airwarosu Yoichi Shinoda	7ce8c7a343	mac80211: Various small fixes for cfg.c: mpath_set_pinfo() Various small fixes for net/mac80211/cfg.c:mpath_set_pinfo(): Initialize *pinfo before filling members in, handle MESH_PATH_RESOLVED correctly, and remove bogus assignment; result in correct display of FLAGS values and meaningful EXPTIME for expired paths in iw utility. Signed-off-by: Yoichi Shinoda <shinoda@jaist.ac.jp> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-05 16:06:05 +02:00
Johannes Berg	00ea6deb0c	mac80211: don't use kerneldoc for ieee80211_add_rx_radiotap_header Doing so creates warnings, but the function is internal and not part of the 802.11 docbooks, so it from kerneldoc. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-05 15:55:27 +02:00
Wei Yongjun	00a9ac4c01	cfg80211: use list_move_tail instead of list_del/list_add_tail Using list_move_tail() instead of list_del() + list_add_tail(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-05 15:39:37 +02:00
Eric Dumazet	c0cc88a762	l2tp: fix a typo in l2tp_eth_dev_recv() While investigating l2tp bug, I hit a bug in eth_type_trans(), because not enough bytes were pulled in skb head. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 15:54:55 -04:00
Masatake YAMATO	600e177920	net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries lsof reports some of socket descriptors as "can't identify protocol" like: [yamato@localhost]/tmp% sudo lsof \| grep dbus \| grep iden dbus-daem 652 dbus 6u sock ... 17812 can't identify protocol dbus-daem 652 dbus 34u sock ... 24689 can't identify protocol dbus-daem 652 dbus 42u sock ... 24739 can't identify protocol dbus-daem 652 dbus 48u sock ... 22329 can't identify protocol ... lsof cannot resolve the protocol used in a socket because procfs doesn't provide the map between inode number on sockfs and protocol type of the socket. For improving the situation this patch adds an extended attribute named 'system.sockprotoname' in which the protocol name for /proc/PID/fd/SOCKET is stored. So lsof can know the protocol for a given /proc/PID/fd/SOCKET with getxattr system call. A few weeks ago I submitted a patch for the same purpose. The patch was introduced /proc/net/sockfs which enumerates inodes and protocols of all sockets alive on a system. However, it was rejected because (1) a global lock was needed, and (2) the layout of struct socket was changed with the patch. This patch doesn't use any global lock; and doesn't change the layout of any structs. In this patch, a protocol name is stored to dentry->d_name of sockfs when new socket is associated with a file descriptor. Before this patch dentry->d_name was not used; it was just filled with empty string. lsof may use an extended attribute named 'system.sockprotoname' to retrieve the value of dentry->d_name. It is nice if we can see the protocol name with ls -l /proc/PID/fd. However, "socket:[#INODE]", the name format returned from sockfs_dname() was already defined. To keep the compatibility between kernel and user land, the extended attribute is used to prepare the value of dentry->d_name. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 15:52:13 -04:00
David S. Miller	cefd81cfec	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch	2012-09-04 15:22:28 -04:00
David S. Miller	9d148e39d1	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch	2012-09-04 15:17:52 -04:00
David S. Miller	798b2cbf92	net: Add INET dependency on aes crypto for the sake of TCP fastopen. Stephen Rothwell says: ==================== After merging the final tree, today's linux-next build (powerpc ppc44x_defconfig) failed like this: net/built-in.o: In function `tcp_fastopen_ctx_free': tcp_fastopen.c:(.text+0x5cc5c): undefined reference to `crypto_destroy_tfm' net/built-in.o: In function `tcp_fastopen_reset_cipher': (.text+0x5cccc): undefined reference to `crypto_alloc_base' net/built-in.o: In function `tcp_fastopen_reset_cipher': (.text+0x5cd6c): undefined reference to `crypto_destroy_tfm' Presumably caused by commit `1046716368` ("tcp: TCP Fast Open Server - header & support functions") from the net-next tree. I assume that some dependency on the CRYPTO infrastructure is missing. I have reverted commit `1bed966cc3` ("Merge branch 'tcp_fastopen_server'") for today. ==================== Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 14:20:14 -04:00
Wei Yongjun	54a2792423	sctp: use list_move_tail instead of list_del/list_add_tail Using list_move_tail() instead of list_del() + list_add_tail(). spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 14:16:13 -04:00
Steffen Klassert	3b59df46a4	xfrm: Workaround incompatibility of ESN and async crypto ESN for esp is defined in RFC 4303. This RFC assumes that the sequence number counters are always up to date. However, this is not true if an async crypto algorithm is employed. If the sequence number counters are not up to date on sequence number check, we may incorrectly update the upper 32 bit of the sequence number. This leads to a DOS. We workaround this by comparing the upper sequence number, (used for authentication) with the upper sequence number computed after the async processing. We drop the packet if these numbers are different. To do this, we introduce a recheck function that does this check in the ESN case. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 14:09:45 -04:00
Eric Dumazet	37159ef2c1	l2tp: fix a lockdep splat Fixes following lockdep splat : [ 1614.734896] ============================================= [ 1614.734898] [ INFO: possible recursive locking detected ] [ 1614.734901] 3.6.0-rc3+ #782 Not tainted [ 1614.734903] --------------------------------------------- [ 1614.734905] swapper/11/0 is trying to acquire lock: [ 1614.734907] (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.734920] [ 1614.734920] but task is already holding lock: [ 1614.734922] (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0 [ 1614.734932] [ 1614.734932] other info that might help us debug this: [ 1614.734935] Possible unsafe locking scenario: [ 1614.734935] [ 1614.734937] CPU0 [ 1614.734938] ---- [ 1614.734940] lock(slock-AF_INET); [ 1614.734943] lock(slock-AF_INET); [ 1614.734946] [ 1614.734946] * DEADLOCK * [ 1614.734946] [ 1614.734949] May be due to missing lock nesting notation [ 1614.734949] [ 1614.734952] 7 locks held by swapper/11/0: [ 1614.734954] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00 [ 1614.734964] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0 [ 1614.734972] #2: (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230 [ 1614.734982] #3: (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0 [ 1614.734989] #4: (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680 [ 1614.734997] #5: (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890 [ 1614.735004] #6: (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00 [ 1614.735012] [ 1614.735012] stack backtrace: [ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ #782 [ 1614.735018] Call Trace: [ 1614.735020] <IRQ> [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10 [ 1614.735033] [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0 [ 1614.735037] [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130 [ 1614.735042] [<ffffffff810a5df0>] lock_acquire+0x90/0x200 [ 1614.735047] [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.735051] [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10 [ 1614.735060] [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50 [ 1614.735065] [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.735069] [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core] [ 1614.735075] [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth] [ 1614.735079] [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70 [ 1614.735083] [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70 [ 1614.735087] [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00 [ 1614.735093] [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290 [ 1614.735098] [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00 [ 1614.735102] [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70 [ 1614.735106] [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0 [ 1614.735111] [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280 [ 1614.735117] [<ffffffff8160a7e2>] arp_xmit+0x22/0x60 [ 1614.735121] [<ffffffff8160a863>] arp_send+0x43/0x50 [ 1614.735125] [<ffffffff8160b82f>] arp_solicit+0x18f/0x450 [ 1614.735132] [<ffffffff8159d9da>] neigh_probe+0x4a/0x70 [ 1614.735137] [<ffffffff815a191a>] __neigh_event_send+0xea/0x300 [ 1614.735141] [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260 [ 1614.735146] [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890 [ 1614.735150] [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890 [ 1614.735154] [<ffffffff815dae79>] ip_output+0x59/0xf0 [ 1614.735158] [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0 [ 1614.735162] [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680 [ 1614.735165] [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0 [ 1614.735172] [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60 [ 1614.735177] [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620 [ 1614.735181] [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960 [ 1614.735185] [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0 [ 1614.735189] [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0 [ 1614.735194] [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230 [ 1614.735199] [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230 [ 1614.735203] [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230 [ 1614.735208] [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0 [ 1614.735213] [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0 [ 1614.735217] [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0 [ 1614.735221] [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0 [ 1614.735225] [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90 [ 1614.735229] [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730 [ 1614.735233] [<ffffffff815d425d>] ip_rcv+0x21d/0x300 [ 1614.735237] [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00 [ 1614.735241] [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00 [ 1614.735245] [<ffffffff81593368>] process_backlog+0xb8/0x180 [ 1614.735249] [<ffffffff81593cf9>] net_rx_action+0x159/0x330 [ 1614.735257] [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0 [ 1614.735264] [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30 [ 1614.735270] [<ffffffff8175419c>] call_softirq+0x1c/0x30 [ 1614.735278] [<ffffffff8100425d>] do_softirq+0x8d/0xc0 [ 1614.735282] [<ffffffff8104983e>] irq_exit+0xae/0xe0 [ 1614.735287] [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99 [ 1614.735291] [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80 [ 1614.735293] <EOI> [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10 [ 1614.735306] [<ffffffff81336f85>] ? intel_idle+0xf5/0x150 [ 1614.735310] [<ffffffff81336f7e>] ? intel_idle+0xee/0x150 [ 1614.735317] [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20 [ 1614.735321] [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630 [ 1614.735327] [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0 [ 1614.735333] [<ffffffff8173762e>] start_secondary+0x220/0x222 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 14:07:50 -04:00
Alan Cox	6cf5c95117	netrom: copy_datagram_iovec can fail Check for an error from this and if so bail properly. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 12:57:35 -04:00
Wei Yongjun	b4e4f47e94	nl80211: fix possible memory leak nl80211_connect() connkeys is malloced in nl80211_parse_connkeys() and should be freed in the error handling case, otherwise it will cause memory leak. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-04 18:06:00 +02:00
Eliad Peller	3d2abdfdf1	mac80211: clear bssid on auth/assoc failure ifmgd->bssid wasn't cleared properly in some auth/assoc failure cases, causing mac80211 and the low-level driver to go out of sync. Clear ifmgd->bssid on failure, and notify the driver. Cc: stable@kernel.org # 3.4+ Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-04 17:14:19 +02:00
Ilan Peer	0ef24e528f	mac80211: Do not check for valid hw_queues for P2P_DEVICE A P2P Device interface does not have a netdev, and is not expected to be used for transmitting data, so there is no need to assign hw queues for it. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-04 14:16:41 +02:00
Pravin B Shelar	15eac2a742	openvswitch: Increase maximum number of datapath ports. Use hash table to store ports of datapath. Allow 64K ports per switch. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2012-09-03 19:20:49 -07:00
Jesse Gross	c303aa94cd	openvswitch: Fix FLOW_BUFSIZE definition. The vlan encapsulation fields in the maximum flow defintion were never updated when the representation changed before upstreaming. In theory this could cause a kernel panic when a maximum length flow is used. In practice this has never happened (to my knowledge) because skb allocations are padded out to a cache line so you would need the right combination of flow and packet being sent to userspace. Signed-off-by: Jesse Gross <jesse@nicira.com>	2012-09-03 19:06:27 -07:00
David S. Miller	1e9f0207d3	Merge branch 'master' of git://1984.lsi.us.es/nf-next	2012-09-03 20:26:45 -04:00
Eric Dumazet	b379135c40	fq_codel: dont reinit flow state When fq_codel builds a new flow, it should not reset codel state. Codel algo needs to get previous values (lastcount, drop_next) to get proper behavior. Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Dave Taht <dave.taht@bufferbloat.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-03 14:36:50 -04:00
Yuchung Cheng	684bad1107	tcp: use PRR to reduce cwin in CWR state Use proportional rate reduction (PRR) algorithm to reduce cwnd in CWR state, in addition to Recovery state. Retire the current rate-halving in CWR. When losses are detected via ACKs in CWR state, the sender enters Recovery state but the cwnd reduction continues and does not restart. Rename and refactor cwnd reduction functions since both CWR and Recovery use the same algorithm: tcp_init_cwnd_reduction() is new and initiates reduction state variables. tcp_cwnd_reduction() is previously tcp_update_cwnd_in_recovery(). tcp_ends_cwnd_reduction() is previously tcp_complete_cwr(). The rate halving functions and logic such as tcp_cwnd_down(), tcp_min_cwnd(), and the cwnd moderation inside tcp_enter_cwr() are removed. The unused parameter, flag, in tcp_cwnd_reduction() is also removed. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-03 14:34:02 -04:00
Yuchung Cheng	fb4d3d1df3	tcp: move tcp_update_cwnd_in_recovery To prepare replacing rate halving with PRR algorithm in CWR state. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-03 14:34:02 -04:00
Yuchung Cheng	09484d1f6e	tcp: move tcp_enter_cwr() To prepare replacing rate halving with PRR algorithm in CWR state. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-03 14:34:02 -04:00
Thomas Graf	4c3a5bdae2	sctp: Don't charge for data in sndbuf again when transmitting packet SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be waken up by __sctp_write_space(). Buffer space charged via sctp_set_owner_w() is released in sctp_wfree() which calls __sctp_write_space() directly. Buffer space charged via skb_set_owner_w() is released via sock_wfree() which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set. sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets. Therefore if sctp_packet_transmit() manages to queue up more than sndbuf bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is interrupted by a signal. This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ... Charging for the data twice does not make sense in the first place, it leads to overcharging sndbuf by a factor 2. Therefore this patch only charges a single byte in wmem_alloc when transmitting an SCTP packet to ensure that the socket stays alive until the packet has been released. This means that control chunks are no longer accounted for in wmem_alloc which I believe is not a problem as skb->truesize will typically lead to overcharging anyway and thus compensates for any control overhead. Signed-off-by: Thomas Graf <tgraf@suug.ch> CC: Vlad Yasevich <vyasevic@redhat.com> CC: Neil Horman <nhorman@tuxdriver.com> CC: David Miller <davem@davemloft.net> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-03 13:24:13 -04:00
Eric Dumazet	e812347ccf	net: sock_edemux() should take care of timewait sockets sock_edemux() can handle either a regular socket or a timewait socket Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-03 13:22:43 -04:00
Pablo Neira Ayuso	ace1fe1231	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next This merges (`3f509c6` netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation) to Patrick McHardy's IPv6 NAT changes.	2012-09-03 15:34:51 +02:00
Jan Beulich	ce9f3f31ef	netfilter: properly annotate ipv4_netfilter_{init,fini}() Despite being just a few bytes of code, they should still have proper annotations. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:56:04 +02:00
Michael Wang	1c15b67709	netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_queue() Since 'list_for_each_continue_rcu' has already been replaced by 'list_for_each_entry_continue_rcu', pass 'list_head' to nf_queue() as a parameter can not benefit us any more. This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of nf_queue() and __nf_queue() to save code. Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:52:54 +02:00
Michael Wang	2a6decfd8a	netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_iterate() Since 'list_for_each_continue_rcu' has already been replaced by 'list_for_each_entry_continue_rcu', pass 'list_head' to nf_iterate() as a parameter can not benefit us any more. This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of nf_iterate() to save code. Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:52:44 +02:00
Cong Wang	965505015b	netfilter: remove xt_NOTRACK It was scheduled to be removed for a long time. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: netfilter@vger.kernel.org Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:36:40 +02:00
Pablo Neira Ayuso	84b5ee939e	netfilter: nf_conntrack: add nf_ct_timeout_lookup This patch adds the new nf_ct_timeout_lookup function to encapsulate the timeout policy attachment that is called in the nf_conntrack_in path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:33:03 +02:00
Pablo Neira Ayuso	236df00561	netfilter: xt_CT: refactorize xt_ct_tg_check This patch adds xt_ct_set_helper and xt_ct_set_timeout to reduce the size of xt_ct_tg_check. This aims to improve code mantainability by splitting xt_ct_tg_check in smaller chunks. Suggested by Eric Dumazet. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:32:48 +02:00
Pablo Neira Ayuso	6703aa74ad	netfilter: xt_socket: fix compilation warnings with gcc 4.7 This patch fixes compilation warnings in xt_socket with gcc-4.7. In file included from net/netfilter/xt_socket.c:22:0: net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:265:16: note: ‘sport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:265:9: note: ‘dport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:264:27: note: ‘saddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:264:19: note: ‘daddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: net/netfilter/xt_socket.c: In function ‘socket_match.isra.4’: include/net/netfilter/nf_tproxy_core.h:75:2: warning: ‘protocol’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:113:5: note: ‘protocol’ was declared here In file included from include/net/tcp.h:37:0, from net/netfilter/xt_socket.c:17: include/net/inet_hashtables.h:356:45: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:112:16: note: ‘sport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:106:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:112:9: note: ‘dport’ was declared here In file included from include/net/tcp.h:37:0, from net/netfilter/xt_socket.c:17: include/net/inet_hashtables.h:356:15: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:111:16: note: ‘saddr’ was declared here In file included from include/net/tcp.h:37:0, from net/netfilter/xt_socket.c:17: include/net/inet_hashtables.h:356:15: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:111:9: note: ‘daddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:268:16: note: ‘sport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:268:9: note: ‘dport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:267:27: note: ‘saddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:267:19: note: ‘daddr’ was declared here Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:31:39 +02:00
Joe Stringer	39855b5ba9	openvswitch: Fix typo Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Jesse Gross <jesse@nicira.com>	2012-09-02 12:18:25 -07:00
Linus Torvalds	0b1a34c992	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) NLA_PUT* --> nla_put_* conversion got one case wrong in nfnetlink_log, fix from Patrick McHardy. 2) Missed error return check in ipw2100 driver, from Julia Lawall. 3) PMTU updates in ipv4 were setting the expiry time incorrectly, fix from Eric Dumazet. 4) SFC driver erroneously reversed src and dst when reporting filters via ethtool. 5) Memory leak in CAN protocol and wrong setting of IRQF_SHARED in sja1000 can platform driver, from Alexey Khoroshilov and Sven Schmitt. 6) Fix multicast traffic scaling regression in ipv4_dst_destroy, only take the lock when we really need to. From Eric Dumazet. 7) Fix non-root process spoofing in netlink, from Pablo Neira Ayuso. 8) CWND reduction in TCP is done incorrectly during non-SACK recovery, fix from Yuchung Cheng. 9) Revert netpoll change, and fix what was actually a driver specific problem. From Amerigo Wang. This should cure bootup hangs with netconsole some people reported. 10) Fix xen-netfront invoking __skb_fill_page_desc() with a NULL page pointer. From Ian Campbell. 11) SIP NAT fix for expectiontation creation, from Pablo Neira Ayuso. 12) __ip_rt_update_pmtu() needs RCU locking, from Eric Dumazet. 13) Fix usbnet deadlock on resume, can't use GFP_KERNEL in this situation. From Oliver Neukum. 14) The davinci ethernet driver triggers an OOPS on removal because it frees an MDIO object before unregistering it. Fix from Bin Liu. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) net: qmi_wwan: add several new Gobi devices fddi: 64 bit bug in smt_add_para() net: ethernet: fix kernel OOPS when remove davinci_mdio module net/xfrm/xfrm_state.c: fix error return code net: ipv6: fix error return code net: qmi_wwan: new device: Foxconn/Novatel E396 usbnet: fix deadlock in resume cs89x0 : packet reception not working netfilter: nf_conntrack: fix racy timer handling with reliable events bnx2x: Correct the ndo_poll_controller call bnx2x: Move netif_napi_add to the open call ipv4: must use rcu protection while calling fib_lookup bnx2x: fix 57840_MF pci id net: ipv4: ipmr_expire_timer causes crash when removing net namespace e1000e: DoS while TSO enabled caused by link partner with small MSS l2tp: avoid to use synchronize_rcu in tunnel free function gianfar: fix default tx vlan offload feature flag netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX netfilter: nfnetlink_log: fix error return code in init path ...	2012-09-02 11:28:00 -07:00
Alan Ott	a2dc375e12	6lowpan: handle NETDEV_UNREGISTER event Before, it was impossible to remove a wpan device which had lowpan attached to it. Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@tempietto.lan>	2012-09-01 22:48:02 -04:00
Alan Ott	a437d2744b	6lowpan: Make a copy of skb's delivered to 6lowpan Since lowpan_process_data() modifies the skb (by calling skb_pull()), we need our own copy so that it doesn't affect the data received by other protcols (in this case, af_ieee802154). Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@tempietto.lan>	2012-09-01 22:48:01 -04:00
Anatol Pomozov	4907cb7b19	treewide: fix comment/printk/variable typos Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-09-01 10:33:05 -07:00
Jerry Chu	168a8f5805	tcp: TCP Fast Open Server - main code path This patch adds the main processing path to complete the TFO server patches. A TFO request (i.e., SYN+data packet with a TFO cookie option) first gets processed in tcp_v4_conn_request(). If it passes the various TFO checks by tcp_fastopen_check(), a child socket will be created right away to be accepted by applications, rather than waiting for the 3WHS to finish. In additon to the use of TFO cookie, a simple max_qlen based scheme is put in place to fend off spoofed TFO attack. When a valid ACK comes back to tcp_rcv_state_process(), it will cause the state of the child socket to switch from either TCP_SYN_RECV to TCP_ESTABLISHED, or TCP_FIN_WAIT1 to TCP_FIN_WAIT2. At this time retransmission will resume for any unack'ed (data, FIN,...) segments. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 20:02:19 -04:00
Jerry Chu	8336886f78	tcp: TCP Fast Open Server - support TFO listeners This patch builds on top of the previous patch to add the support for TFO listeners. This includes - 1. allocating, properly initializing, and managing the per listener fastopen_queue structure when TFO is enabled 2. changes to the inet_csk_accept code to support TFO. E.g., the request_sock can no longer be freed upon accept(), not until 3WHS finishes 3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg() if it's a TFO socket 4. properly closing a TFO listener, and a TFO socket before 3WHS finishes 5. supporting TCP_FASTOPEN socket option 6. modifying tcp_check_req() to use to check a TFO socket as well as request_sock 7. supporting TCP's TFO cookie option 8. adding a new SYN-ACK retransmit handler to use the timer directly off the TFO socket rather than the listener socket. Note that TFO server side will not retransmit anything other than SYN-ACK until the 3WHS is completed. The patch also contains an important function "reqsk_fastopen_remove()" to manage the somewhat complex relation between a listener, its request_sock, and the corresponding child socket. See the comment above the function for the detail. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 20:02:19 -04:00
Jerry Chu	1046716368	tcp: TCP Fast Open Server - header & support functions This patch adds all the necessary data structure and support functions to implement TFO server side. It also documents a number of flags for the sysctl_tcp_fastopen knob, and adds a few Linux extension MIBs. In addition, it includes the following: 1. a new TCP_FASTOPEN socket option an application must call to supply a max backlog allowed in order to enable TFO on its listener. 2. A number of key data structures: "fastopen_rsk" in tcp_sock - for a big socket to access its request_sock for retransmission and ack processing purpose. It is non-NULL iff 3WHS not completed. "fastopenq" in request_sock_queue - points to a per Fast Open listener data structure "fastopen_queue" to keep track of qlen (# of outstanding Fast Open requests) and max_qlen, among other things. "listener" in tcp_request_sock - to point to the original listener for book-keeping purpose, i.e., to maintain qlen against max_qlen as part of defense against IP spoofing attack. 3. various data structure and functions, many in tcp_fastopen.c, to support server side Fast Open cookie operations, including /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 20:02:18 -04:00
Sorin Dumitru	eb7e057596	ipv6: remove some deadcode __ipv6_regen_rndid no longer returns anything other than 0 so there's no point in verifying what it returns Signed-off-by: Sorin Dumitru <sdumitru@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 16:33:32 -04:00
Julia Lawall	599901c3e4	net/xfrm/xfrm_state.c: fix error return code Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 ($ret < 0\\|ret != 0$) { ... return ret; } \| ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 16:27:48 -04:00
Julia Lawall	48f125ce1c	net: ipv6: fix error return code Initialize return variable before exiting on an error path. The initial initialization of the return variable is also dropped, because that value is never used. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 ($ret < 0\\|ret != 0$) { ... return ret; } \| ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 16:27:48 -04:00
Rami Rosen	d1a53dfd11	net: fix documentation of skb_needs_linearize(). skb_needs_linearize() does not check highmem DMA as it does not call illegal_highdma() anymore, so there is no need to mention highmem DMA here. (Indeed, ~NETIF_F_SG flag, which is checked in skb_needs_linearize(), can be set when illegal_highdma() returns true, and we are assured that illegal_highdma() is invoked prior to skb_needs_linearize() as skb_needs_linearize() is a static method called only once. But ~NETIF_F_SG can be set not only there in this same invocation path. It can also be set when can_checksum_protocol() returns false). see commit `02932ce9e2`, Convert skb_need_linearize() to use precomputed features. Signed-off-by: Rami Rosen <rosenr@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 16:24:02 -04:00
Alexander Duyck	98d75c3724	ipv4: Minor logic clean-up in ipv4_mtu In ipv4_mtu there is some logic where we are testing for a non-zero value and a timer expiration, then setting the value to zero, and then testing if the value is zero we set it to a value based on the dst. Instead of bothering with the extra steps it is easier to just cleanup the logic so that we set it to the dst based value if it is zero or if the timer has expired. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>	2012-08-31 16:22:50 -04:00
Wanlong Gao	4a2c240691	net🏧fix up ENOIOCTLCMD error handling At commit `07d106d0`, Linus pointed out that ENOIOCTLCMD should be translated as ENOTTY to user mode. Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 16:14:33 -04:00
Wei Yongjun	80f0fd8a7f	openvswitch: using kfree_rcu() to simplify the code The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 15:55:38 -04:00
Xi Wang	fc61b928dc	af_unix: fix shutdown parameter checking Return -EINVAL rather than 0 given an invalid "mode" parameter. Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 15:55:37 -04:00
Xi Wang	46b66d7077	decnet: fix shutdown parameter checking The allowed value of "how" is SHUT_RD/SHUT_WR/SHUT_RDWR (0/1/2), rather than SHUTDOWN_MASK (3). Signed-off-by: Xi Wang <xi.wang@gmail.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 15:55:37 -04:00
David S. Miller	c32f38619a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge the 'net' tree to get the recent set of netfilter bug fixes in order to assist with some merge hassles Pablo is going to have to deal with for upcoming changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 15:14:18 -04:00
David S. Miller	0dcd5052c8	Merge branch 'master' of git://1984.lsi.us.es/nf	2012-08-31 13:06:37 -04:00
Pablo Neira Ayuso	5b423f6a40	netfilter: nf_conntrack: fix racy timer handling with reliable events Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-31 15:50:28 +02:00
Eric Dumazet	c5ae7d4192	ipv4: must use rcu protection while calling fib_lookup Following lockdep splat was reported by Pavel Roskin : [ 1570.586223] =============================== [ 1570.586225] [ INFO: suspicious RCU usage. ] [ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted [ 1570.586229] ------------------------------- [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage! [ 1570.586233] [ 1570.586233] other info that might help us debug this: [ 1570.586233] [ 1570.586236] [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0 [ 1570.586238] 2 locks held by Chrome_IOThread/4467: [ 1570.586240] #0: (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0 [ 1570.586253] #1: (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270 [ 1570.586260] [ 1570.586260] stack backtrace: [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98 [ 1570.586265] Call Trace: [ 1570.586271] [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130 [ 1570.586275] [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270 [ 1570.586278] [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0 [ 1570.586282] [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90 [ 1570.586285] [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80 [ 1570.586290] [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0 [ 1570.586293] [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0 [ 1570.586296] [<ffffffff814f2c35>] release_sock+0x55/0xa0 [ 1570.586300] [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50 [ 1570.586305] [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230 [ 1570.586308] [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40 [ 1570.586312] [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0 [ 1570.586315] [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0 [ 1570.586320] [<ffffffff814ec435>] do_sock_write+0xc5/0xe0 [ 1570.586323] [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80 [ 1570.586328] [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0 [ 1570.586332] [<ffffffff8114c5a5>] vfs_write+0x165/0x180 [ 1570.586335] [<ffffffff8114c805>] sys_write+0x45/0x90 [ 1570.586340] [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Pavel Roskin <proski@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-30 13:33:08 -04:00
Francesco Ruggeri	acbb219d5f	net: ipv4: ipmr_expire_timer causes crash when removing net namespace When tearing down a net namespace, ipv4 mr_table structures are freed without first deactivating their timers. This can result in a crash in run_timer_softirq. This patch mimics the corresponding behaviour in ipv6. Locking and synchronization seem to be adequate. We are about to kfree mrt, so existing code should already make sure that no other references to mrt are pending or can be created by incoming traffic. The functions invoked here do not cause new references to mrt or other race conditions to be created. Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive. Both ipmr_expire_process (whose completion we may have to wait in del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock or other synchronizations when needed, and they both only modify mrt. Tested in Linux 3.4.8. Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-30 12:51:32 -04:00
Eric Dumazet	ee13040901	netpoll: provide an IP ident in UDP frames Let's fill IP header ident field with a meaningful value, it might help some setups. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-30 12:32:14 -04:00
xeb@mail.ru	99469c32f7	l2tp: avoid to use synchronize_rcu in tunnel free function Avoid to use synchronize_rcu in l2tp_tunnel_free because context may be atomic. Signed-off-by: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-30 12:31:03 -04:00
Pablo Neira Ayuso	3f509c689a	netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation We're hitting bug while trying to reinsert an already existing expectation: kernel BUG at kernel/timer.c:895! invalid opcode: 0000 [#1] SMP [...] Call Trace: <IRQ> [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack] [<ffffffff812d423a>] ? in4_pton+0x72/0x131 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip] [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip] [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip] [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip] We have to remove the RTP expectation if the RTCP expectation hits EBUSY since we keep trying with other ports until we succeed. Reported-by: Rafal Fitt <rafalf@aplusc.com.pl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-30 18:27:14 +02:00
Gao feng	6549dd43c0	net: dev: fix the incorrect hold of net namespace's lo device When moving a net device from one net namespace to another net namespace,dev_change_net_namespace calls NETDEV_DOWN event,so the original net namespace's dst entries which beloned to this net device will be put into dst_garbage list. then dev_change_net_namespace will set this net device's net to the new net namespace. If we unregister this net device's driver, this will trigger the NETDEV_UNREGISTER_FINAL event, dst_ifdown will be called, and get this net device's dst entries from dst_garbage list, put these entries' dev to the new net namespace's lo device. It's not what we want,actually we need these dst entries hold the original net namespace's lo device,this incorrect device holding will trigger emg message like below. unregister_netdevice: waiting for lo to become free. Usage count = 1 so we should call NETDEV_UNREGISTER_FINAL event in dev_change_net_namespace too,in order to make sure dst entries already in the dst_garbage list, we need rcu_barrier before we call NETDEV_UNREGISTER_FINAL event. With help form Eric Dumazet. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-30 12:21:16 -04:00
Julia Lawall	6fc09f10f1	netfilter: nfnetlink_log: fix error return code in init path Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 ($ret < 0\\|ret != 0$) { ... return ret; } \| ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-30 03:29:58 +02:00
Julia Lawall	ef6acf68c2	netfilter: ctnetlink: fix error return code in init path Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 ($ret < 0\\|ret != 0$) { ... return ret; } \| ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-30 03:28:22 +02:00
Julia Lawall	0a54e939d8	ipvs: fix error return code Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 ($ret < 0\\|ret != 0$) { ... return ret; } \| ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-30 03:27:19 +02:00
Patrick McHardy	8a91bb0c30	netfilter: ip6tables: add stateless IPv6-to-IPv6 Network Prefix Translation target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:25 +02:00
Pablo Neira Ayuso	320ff567f2	netfilter: nf_nat: support IPv6 in TFTP NAT helper Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:24 +02:00
Pablo Neira Ayuso	5901b6be88	netfilter: nf_nat: support IPv6 in IRC NAT helper Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:23 +02:00
Patrick McHardy	9a66482106	netfilter: nf_nat: support IPv6 in SIP NAT helper Add IPv6 support to the SIP NAT helper. There are no functional differences to IPv4 NAT, just different formats for addresses. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:22 +02:00
Patrick McHardy	ee6eb96673	netfilter: nf_nat: support IPv6 in amanda NAT helper Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:21 +02:00
Patrick McHardy	d33cbeeb1a	netfilter: nf_nat: support IPv6 in FTP NAT helper Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:20 +02:00
Patrick McHardy	ed72d9e294	netfilter: ip6tables: add NETMAP target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:19 +02:00
Patrick McHardy	115e23ac78	netfilter: ip6tables: add REDIRECT target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:19 +02:00
Patrick McHardy	b3f644fc82	netfilter: ip6tables: add MASQUERADE target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:18 +02:00
Patrick McHardy	58a317f106	netfilter: ipv6: add IPv6 NAT support Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:17 +02:00
Patrick McHardy	2cf545e835	net: core: add function for incremental IPv6 pseudo header checksum updates Add inet_proto_csum_replace16 for incrementally updating IPv6 pseudo header checksums for IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: David S. Miller <davem@davemloft.net>	2012-08-30 03:00:16 +02:00
Patrick McHardy	0ad352cb43	netfilter: ipv6: expand skb head in ip6_route_me_harder after oif change Expand the skb headroom if the oif changed due to rerouting similar to how IPv4 packets are handled. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:15 +02:00
Patrick McHardy	c7232c9979	netfilter: add protocol independent NAT core Convert the IPv4 NAT implementation to a protocol independent core and address family specific modules. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:14 +02:00
Patrick McHardy	051966c0c6	netfilter: nf_nat: add protoff argument to packet mangling functions For mangling IPv6 packets the protocol header offset needs to be known by the NAT packet mangling functions. Add a so far unused protoff argument and convert the conntrack and NAT helpers to use it in preparation of IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:13 +02:00
Patrick McHardy	811927ccfe	netfilter: nf_conntrack: restrict NAT helper invocation to IPv4 The NAT helpers currently only handle IPv4 packets correctly. Restrict invocation of the helpers to IPv4 in preparation of IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:12 +02:00
Patrick McHardy	2b60af0178	netfilter: nf_conntrack_ipv6: fix tracking of ICMPv6 error messages containing fragments ICMPv6 error messages are tracked by extracting the conntrack tuple of the inner packet and looking up the corresponding conntrack entry. Tuple extraction uses the ->get_l4proto() callback, which in case of fragments returns NEXTHDR_FRAGMENT instead of the upper protocol, even for the first fragment when the entire next header is present, resulting in a failure to find the correct connection tracking entry. This patch changes ipv6_get_l4proto() to use ipv6_skip_exthdr() instead of nf_ct_ipv6_skip_exthdr() in order to skip fragment headers when the fragment offset is zero. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:11 +02:00
Patrick McHardy	4cdd34084d	netfilter: nf_conntrack_ipv6: improve fragmentation handling The IPv6 conntrack fragmentation currently has a couple of shortcomings. Fragmentes are collected in PREROUTING/OUTPUT, are defragmented, the defragmented packet is then passed to conntrack, the resulting conntrack information is attached to each original fragment and the fragments then continue their way through the stack. Helper invocation occurs in the POSTROUTING hook, at which point only the original fragments are available. The result of this is that fragmented packets are never passed to helpers. This patch improves the situation in the following way: - If a reassembled packet belongs to a connection that has a helper assigned, the reassembled packet is passed through the stack instead of the original fragments. - During defragmentation, the largest received fragment size is stored. On output, the packet is refragmented if required. If the largest received fragment size exceeds the outgoing MTU, a "packet too big" message is generated, thus behaving as if the original fragments were passed through the stack from an outside point of view. - The ipv6_helper() hook function can't receive fragments anymore for connections using a helper, so it is switched to use ipv6_skip_exthdr() instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the reassembled packets are passed to connection tracking helpers. The result of this is that we can properly track fragmented packets, but still generate ICMPv6 Packet too big messages if we would have before. This patch is also required as a precondition for IPv6 NAT, where NAT helpers might enlarge packets up to a point that they require fragmentation. In that case we can't generate Packet too big messages since the proper MTU can't be calculated in all cases (f.i. when changing textual representation of a variable amount of addresses), so the packet is transparently fragmented iff the original packet or fragments would have fit the outgoing MTU. IPVS parts by Jesper Dangaard Brouer <brouer@redhat.com>. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:10 +02:00
Jesper Dangaard Brouer	590e3f79a2	ipvs: IPv6 MTU checking cleanup and bugfix Cleaning up the IPv6 MTU checking in the IPVS xmit code, by using a common helper function __mtu_check_toobig_v6(). The MTU check for tunnel mode can also use this helper as ntohs(old_iph->payload_len) + sizeof(struct ipv6hdr) is qual to skb->len. And the 'mtu' variable have been adjusted before calling helper. Notice, this also fixes a bug, as the the MTU check in ip_vs_dr_xmit_v6() were missing a check for skb_is_gso(). This bug e.g. caused issues for KVM IPVS setups, where different Segmentation Offloading techniques are utilized, between guests, via the virtio driver. This resulted in very bad performance, due to the ICMPv6 "too big" messages didn't affect the sender. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-30 02:55:39 +02:00
Amerigo Wang	072a9c4860	netpoll: revert `6bdb7fe310` and fix be_poll() instead Against -net. In the patch "netpoll: re-enable irq in poll_napi()", I tried to fix the following warning: [100718.051041] ------------[ cut here ]------------ [100718.051048] WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7d/0xb0() (Not tainted) [100718.051049] Hardware name: ProLiant BL460c G7 ... [100718.051068] Call Trace: [100718.051073] [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0 [100718.051075] [<ffffffff8106b79a>] ? warn_slowpath_null+0x1a/0x20 [100718.051077] [<ffffffff810747ed>] ? local_bh_enable_ip+0x7d/0xb0 [100718.051080] [<ffffffff8150041b>] ? _spin_unlock_bh+0x1b/0x20 [100718.051085] [<ffffffffa00ee974>] ? be_process_mcc+0x74/0x230 [be2net] [100718.051088] [<ffffffffa00ea68c>] ? be_poll_tx_mcc+0x16c/0x290 [be2net] [100718.051090] [<ffffffff8144fe76>] ? netpoll_poll_dev+0xd6/0x490 [100718.051095] [<ffffffffa01d24a5>] ? bond_poll_controller+0x75/0x80 [bonding] [100718.051097] [<ffffffff8144fde5>] ? netpoll_poll_dev+0x45/0x490 [100718.051100] [<ffffffff81161b19>] ? ksize+0x19/0x80 [100718.051102] [<ffffffff81450437>] ? netpoll_send_skb_on_dev+0x157/0x240 by reenabling IRQ before calling ->poll, but it seems more problems are introduced after that patch: http://ozlabs.org/~akpm/stuff/IMG_20120824_122054.jpg http://marc.info/?l=linux-netdev&m=134563282530588&w=2 So it is safe to fix be2net driver code directly. This patch reverts the offending commit and fixes be_poll() by avoid disabling BH there, this is okay because be_poll() can be called either by poll_napi() which already disables IRQ, or by net_rx_action() which already disables BH. Reported-by: Andrew Morton <akpm@linux-foundation.org> Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Cc: Sylvain Munaut <s.munaut@whatever-company.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: Cong Wang <amwang@redhat.com> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-29 15:03:23 -04:00
Vinicius Costa Gomes	d8343f1257	Bluetooth: Fix sending a HCI Authorization Request over LE links In the case that the link is already in the connected state and a Pairing request arrives from the mgmt interface, hci_conn_security() would be called but it was not considering LE links. Reported-by: João Paulo Rechi Vita <jprvita@openbossa.org> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-27 08:11:51 -07:00
Vinicius Costa Gomes	cc110922da	Bluetooth: Change signature of smp_conn_security() To make it clear that it may be called from contexts that may not have any knowledge of L2CAP, we change the connection parameter, to receive a hci_conn. This also makes it clear that it is checking the security of the link. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-27 08:07:18 -07:00
Greg Kroah-Hartman	e372dc6c62	Merge 3.6-rc3 into tty-next This picks up all of the different fixes in Linus's tree that we also need here. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-27 07:13:33 -07:00
Patrick McHardy	5f2d04f1f9	ipv4: fix path MTU discovery with connection tracking IPv4 conntrack defragments incoming packet at the PRE_ROUTING hook and (in case of forwarded packets) refragments them at POST_ROUTING independent of the IP_DF flag. Refragmentation uses the dst_mtu() of the local route without caring about the original fragment sizes, thereby breaking PMTUD. This patch fixes this by keeping track of the largest received fragment with IP_DF set and generates an ICMP fragmentation required error during refragmentation if that size exceeds the MTU. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net>	2012-08-26 19:13:55 +02:00
Linus Torvalds	8497ae61d0	Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from J. Bruce Fields: "Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie Heilman for their testing and debugging help." * 'for-3.6' of git://linux-nfs.org/~bfields/linux: svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping svcrpc: sends on closed socket should stop immediately svcrpc: fix BUG() in svc_tcp_clear_pages nfsd4: fix security flavor of NFSv4.0 callback	2012-08-25 11:43:41 -07:00
David S. Miller	e6acb38480	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace This is an initial merge in of Eric Biederman's work to start adding user namespace support to the networking. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 18:54:37 -04:00
David S. Miller	85c21049fc	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This is a batch of updates intended for 3.7. The bulk of it is mac80211 changes, including some mesh work from Thomas Pederson and some multi-channel work from Johannes. A variety of driver updates and other bits are scattered in there as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 15:18:07 -04:00
David S. Miller	d05cebb915	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== This batch of fixes is intended for 3.6... Johannes Berg gives us a pair of iwlwifi fixes. One corrects some improperly defined ifdefs that lead to crashes and BUG_ONs. The other prevents attempts to read SRAM for devices that aren't actually started. Julia Lawall provides an ipw2100 fix to properly set the return code from a function call before testing it! :-) Thomas Huehn corrects the improper use of a constant related to a power setting in ath5k. Thomas Pedersen offers a mac80211 fix to properly handle destination addresses of unicast frames passing though a mesh gate. Vladimir Zapolskiy provides a brcmsmac fix to properly mark the interface state when the device goes down. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 15:15:10 -04:00
Yuchung Cheng	7c4a56fec3	tcp: fix cwnd reduction for non-sack recovery The cwnd reduction in fast recovery is based on the number of packets newly delivered per ACK. For non-sack connections every DUPACK signifies a packet has been delivered, but the sender mistakenly skips counting them for cwnd reduction. The fix is to compute newly_acked_sacked after DUPACKs are accounted in sacked_out for non-sack connections. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 13:48:58 -04:00
Jiri Pirko	9b361c13ce	vlan: add helper which can be called to see if device is used by vlan also, remove unused vlan_info definition from header CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 13:46:39 -04:00
Pablo Neira Ayuso	20e1db19db	netlink: fix possible spoofing from non-root processes Non-root user-space processes can send Netlink messages to other processes that are well-known for being subscribed to Netlink asynchronous notifications. This allows ilegitimate non-root process to send forged messages to Netlink subscribers. The userspace process usually verifies the legitimate origin in two ways: a) Socket credentials. If UID != 0, then the message comes from some ilegitimate process and the message needs to be dropped. b) Netlink portID. In general, portID == 0 means that the origin of the messages comes from the kernel. Thus, discarding any message not coming from the kernel. However, ctnetlink sets the portID in event messages that has been triggered by some user-space process, eg. conntrack utility. So other processes subscribed to ctnetlink events, eg. conntrackd, know that the event was triggered by some user-space action. Neither of the two ways to discard ilegitimate messages coming from non-root processes can help for ctnetlink. This patch adds capability validation in case that dst_pid is set in netlink_sendmsg(). This approach is aggressive since existing applications using any Netlink bus to deliver messages between two user-space processes will break. Note that the exception is NETLINK_USERSOCK, since it is reserved for netlink-to-netlink userspace communication. Still, if anyone wants that his Netlink bus allows netlink-to-netlink userspace, then they can set NL_NONROOT_SEND. However, by default, I don't think it makes sense to allow to use NETLINK_ROUTE to communicate two processes that are sending no matter what information that is not related to link/neighbouring/routing. They should be using NETLINK_USERSOCK instead for that. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 13:36:09 -04:00
Ben Hutchings	8f4cccbbd9	net: Set device operstate at registration time The operstate of a device is initially IF_OPER_UNKNOWN and is updated asynchronously by linkwatch after each change of carrier state reported by the driver. The default carrier state of a net device is on, and this will never be changed on drivers that do not support carrier detection, thus the operstate remains IF_OPER_UNKNOWN. For devices that do support carrier detection, the driver must set the carrier state to off initially, then poll the hardware state when the device is opened. However, we must not activate linkwatch for a unregistered device, and commit `b473001` ('net: Do not fire linkwatch events until the device is registered.') ensured that we don't. But this means that the operstate for many devices that support carrier detection remains IF_OPER_UNKNOWN when it should be IF_OPER_DOWN. The same issue exists with the dormant state. The proper initialisation sequence, avoiding a race with opening of the device, is: rtnl_lock(); rc = register_netdevice(dev); if (rc) goto out_unlock; netif_carrier_off(dev); /* or netif_dormant_on(dev) */ rtnl_unlock(); but it seems silly that this should have to be repeated in so many drivers. Further, the operstate seen immediately after opening the device may still be IF_OPER_UNKNOWN due to the asynchronous nature of linkwatch. Commit `22604c8` ('net: Fix for initial link state in 2.6.28') attempted to fix this by setting the operstate synchronously, but it was reverted as it could lead to deadlock. This initialises the operstate synchronously at registration time only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 12:46:13 -04:00
Neil Horman	3afa6d00fb	cls_cgroup: Allow classifier cgroups to have their classid reset to 0 The network classifier cgroup initalizes each cgroups instance classid value to 0. However, the sock_update_classid function only updates classid's in sockets if the tasks cgroup classid is not zero, and if it differs from the current classid. The later check is to prevent cache line dirtying, but the former is detrimental, as it prevents resetting a classid for a cgroup to 0. While this is not a common action, it has administrative usefulness (if the admin wants to disable classification of a certain group temporarily for instance). Easy fix, just remove the zero check. Tested successfully by myself Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 12:41:17 -04:00
John W. Linville	f20b6213f1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-08-24 12:25:30 -04:00
Eric Dumazet	78df76a065	ipv4: take rt_uncached_lock only if needed Multicast traffic allocates dst with DST_NOCACHE, but dst is not inserted into rt_uncached_list. This slowdown multicast workloads on SMP because rt_uncached_lock is contended. Change the test before taking the lock to actually check the dst was inserted into rt_uncached_list. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 11:47:48 -04:00
David S. Miller	e6e94e392f	Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Antonio Quartulli says: ==================== Included changes: - a set of codestyle rearrangements/fixes - new feature to early detect new joining (mesh-unaware) clients - a minor fix for the gw-feature - substitution of shift operations with the BIT() macro - reorganization of the main batman-adv structure (struct batadv_priv) - some more (very) minor cleanups and fixes =================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 11:30:50 -04:00
John W. Linville	e72615f6ab	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-08-24 11:16:58 -04:00
Fengguang Wu	a0dfb2634e	af_packet: match_fanout_group() can be static cc: Eric Leblond <eric@regit.org> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-23 09:27:12 -07:00
Eric Dumazet	748e2d9396	net: reinstate rtnl in call_netdevice_notifiers() Eric Biederman pointed out that not holding RTNL while calling call_netdevice_notifiers() was racy. This patch is a direct transcription his feedback against commit `0115e8e30d` (net: remove delay at device dismantle) Thanks Eric ! Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-23 09:24:42 -07:00
John W. Linville	c6b6eedc29	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2012-08-23 09:51:15 -04:00
John W. Linville	a4881cc45a	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2012-08-23 09:49:42 -04:00
Sven Eckelmann	fa4f0afcf4	batman-adv: Start new development cycle Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:23 +02:00
Antonio Quartulli	371351731e	batman-adv: change interface_rx to get orig node In order to understand where a broadcast packet is coming from and use this information to detect not yet announced clients, this patch modifies the interface_rx() function by passing a new argument: the orig node corresponding to the node that originated the received packet (if known). This new argument if not NULL for broadcast packets only (other packets does not have source field). Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:22 +02:00
Antonio Quartulli	30cfd02b60	batman-adv: detect not yet announced clients With the current TT mechanism a new client joining the network is not immediately able to communicate with other hosts because its MAC address has not been announced yet. This situation holds until the first OGM containing its joining event will be spread over the mesh network. This behaviour can be acceptable in networks where the originator interval is a small value (e.g. 1sec) but if that value is set to an higher time (e.g. 5secs) the client could suffer from several malfunctions like DHCP client timeouts, etc. This patch adds an early detection mechanism that makes nodes in the network able to recognise "not yet announced clients" by means of the broadcast packets they emitted on connection (e.g. ARP or DHCP request). The added client will then be confirmed upon receiving the OGM claiming it or purged if such OGM is not received within a fixed amount of time. Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:22 +02:00
Sven Eckelmann	c67893d17a	batman-adv: Reduce accumulated length of simple statements Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:21 +02:00
Sven Eckelmann	bbb1f90efb	batman-adv: Don't break statements after assignment operator Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:20 +02:00
Sven Eckelmann	8de47de575	batman-adv: Use BIT(x) macro to calculate bit positions Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:19 +02:00
Martin Hundebøll	74ee3634dc	batman-adv: Drop tt queries with foreign dest When enabling promiscuous mode, tt queries for other hosts might be received. Before this patch, "foreign" tt queries were processed like any other query and thus forwarded to its destination again and thereby causing a loop. This patch adds a check to drop foreign tt queries. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:19 +02:00
Martin Hundebøll	ff51fd70ad	batman-adv: Move batadv_check_unicast_packet() batadv_check_unicast_packet() is needed in batadv_recv_tt_query(), so move the former to before the latter. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:18 +02:00
Sven Eckelmann	807736f6e0	batman-adv: Split batadv_priv in sub-structures for features The structure batadv_priv grows everytime a new feature is introduced. It gets hard to find the parts of the struct that belongs to a specific feature. This becomes even harder by the fact that not every feature uses a prefix in the member name. The variables for bridge loop avoidence, gateway handling, translation table and visualization server are moved into separate structs that are included in the bat_priv main struct. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:20:13 +02:00
Simon Wunderlich	624463079e	batman-adv: check batadv_orig_hash_add_if() return code If this call fails, some of the orig_nodes spaces may have been resized for the increased number of interface, and some may not. If we would just continue with the larger number of interfaces, this would lead to access to not allocated memory later. We better check the return code, and don't add the interface if no memory is available. OTOH, keeping some of the orig_nodes with too much memory allocated should hurt no one (except for a few too many bytes allocated). Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:46 +02:00
Antonio Quartulli	a51fb9b2ac	batman-adv: fix typos in comments the word millisecond is misspelled in several comments. This patch fixes it. Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:45 +02:00
Antonio Quartulli	d657e621a0	batman-adv: add reference counting for type batadv_tt_orig_list_entry The batadv_tt_orig_list_entry structure didn't have any refcounting mechanism so far. This patch introduces it and makes the structure being usable in much more complex context. Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:44 +02:00
Jonathan Corbet	3a7f291bf6	batman-adv: remove a misleading comment As much as I'm happy to see LWN links sprinkled through the kernel by the dozen, this one in particular reflects a very old state of reality; the associated comment is now incorrect. So just delete it. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:44 +02:00
Marek Lindner	1c9b0550f4	batman-adv: convert remaining packet counters to per_cpu_ptr() infrastructure Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:43 +02:00
Simon Wunderlich	3eb8773e3a	batman-adv: rename bridge loop avoidance claim types for consistency reasons within the code and with the documentation, we should always call it "claim" and "unclaim". Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:42 +02:00
Simon Wunderlich	99e966fc96	batman-adv: correct comments in bridge loop avoidance Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:41 +02:00
Simon Wunderlich	536a23f119	batman-adv: Add the backbone gateway list to debugfs This is especially useful if there are no claims yet, but we still want to know which gateways are using bridge loop avoidance in the network. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:41 +02:00
Antonio Quartulli	c70437289c	batman-adv: move function arguments on one line Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2012-08-23 14:02:40 +02:00
Pavel Emelyanov	0fa7fa98db	packet: Protect packet sk list with mutex (v2) Change since v1: * Fixed inuse counters access spotted by Eric In patch `eea68e2f` (packet: Report socket mclist info via diag module) I've introduced a "scheduling in atomic" problem in packet diag module -- the socket list is traversed under rcu_read_lock() while performed under it sk mclist access requires rtnl lock (i.e. -- mutex) to be taken. [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002 [152363.820573] 4 locks held by crtools/12517: [152363.820581] #0: (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e [152363.820613] #1: (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a [152363.820644] #2: (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab [152363.820693] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af Similar thing was then re-introduced by further packet diag patches (fanount mutex and pgvec mutex for rings) :( Apart from being terribly sorry for the above, I propose to change the packet sk list protection from spinlock to mutex. This lock currently protects two modifications: * sklist * prot inuse counters The sklist modifications can be just reprotected with mutex since they already occur in a sleeping context. The inuse counters modifications are trickier -- the __this_cpu_-s are used inside, thus requiring the caller to handle the potential issues with contexts himself. Since packet sockets' counters are modified in two places only (packet_create and packet_release) we only need to protect the context from being preempted. BH disabling is not required in this case. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:58:27 -07:00
danborkmann@iogearbox.net	9e67030af3	af_packet: use define instead of constant Instead of using a hard-coded value for the status variable, it would make the code more readable to use its destined define from linux/if_packet.h. Signed-off-by: daniel.borkmann@tik.ee.ethz.ch Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:58:27 -07:00
Ying Xue	bfdc587c5a	rds: Don't disable BH on BH context Since we have already in BH context when _write_space(), _data_ready() as well as *_state_change() are called, it's unnecessary to disable BH. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:52:04 -07:00
Eric Dumazet	b87fb39e39	ipv6: gre: fix ip6gre_err() ip6gre_err() miscomputes grehlen (sizeof(ipv6h) is 4 or 8, not 40 as expected), and should take into account 'offset' parameter. Also uses pskb_may_pull() to cope with some fragged skbs Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:48:32 -07:00
Eric Dumazet	ef8531b64c	xfrm: fix RCU bugs This patch reverts commit `56892261ed` (xfrm: Use rcu_dereference_bh to deference pointer protected by rcu_read_lock_bh), and fixes bugs introduced in commit `418a99ac6a` ( Replace rwlock on xfrm_policy_afinfo with rcu ) 1) We properly use RCU variant in this file, not a mix of RCU/RCU_BH 2) We must defer some writes after the synchronize_rcu() call or a reader can crash dereferencing NULL pointer. 3) Now we use the xfrm_policy_afinfo_lock spinlock only from process context, we no longer need to block BH in xfrm_policy_register_afinfo() and xfrm_policy_unregister_afinfo() 4) Can use RCU_INIT_POINTER() instead of rcu_assign_pointer() in xfrm_policy_unregister_afinfo() 5) Remove a forward inline declaration (xfrm_policy_put_afinfo()), and also move xfrm_policy_get_afinfo() declaration. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Fan Du <fan.du@windriver.com> Cc: Priyanka Jain <Priyanka.Jain@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:39:46 -07:00
Eric Dumazet	0115e8e30d	net: remove delay at device dismantle I noticed extra one second delay in device dismantle, tracked down to a call to dst_dev_event() while some call_rcu() are still in RCU queues. These call_rcu() were posted by rt_free(struct rtable *rt) calls. We then wait a little (but one second) in netdev_wait_allrefs() before kicking again NETDEV_UNREGISTER. As the call_rcu() are now completed, dst_dev_event() can do the needed device swap on busy dst. To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called after a rcu_barrier(), but outside of RTNL lock. Use NETDEV_UNREGISTER_FINAL with care ! Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after IP cache removal. With help from Gao feng Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 21:50:36 -07:00
Eric Dumazet	9b04f35005	ipv4: properly update pmtu Sylvain Munault reported following info : - TCP connection get "stuck" with data in send queue when doing "large" transfers ( like typing 'ps ax' on a ssh connection ) - Only happens on path where the PMTU is lower than the MTU of the interface - Is not present right after boot, it only appears 10-20min after boot or so. (and that's inside the _same_ TCP connection, it works fine at first and then in the same ssh session, it'll get stuck) - Definitely seems related to fragments somehow since I see a router sending ICMP message saying fragmentation is needed. - Exact same setup works fine with kernel 3.5.1 Problem happens when the 10 minutes (ip_rt_mtu_expires) expiration period is over. ip_rt_update_pmtu() calls dst_set_expires() to rearm a new expiration, but dst_set_expires() does nothing because dst.expires is already set. It seems we want to set the expires field to a new value, regardless of prior one. With help from Julian Anastasov. Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: Eric Dumazet <edumazet@google.com> CC: Julian Anastasov <ja@ssi.bg> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 19:14:30 -07:00
David S. Miller	bf277b0cce	Merge git://1984.lsi.us.es/nf-next Pablo Neira Ayuso says: ==================== This is the first batch of Netfilter and IPVS updates for your net-next tree. Mostly cleanups for the Netfilter side. They are: * Remove unnecessary RTNL locking now that we have support for namespace in nf_conntrack, from Patrick McHardy. * Cleanup to eliminate unnecessary goto in the initialization path of several Netfilter tables, from Jean Sacren. * Another cleanup from Wu Fengguang, this time to PTR_RET instead of if IS_ERR then return PTR_ERR. * Use list_for_each_entry_continue_rcu in nf_iterate, from Michael Wang. * Add pmtu_disc sysctl option to disable PMTU in their tunneling transmitter, from Julian Anastasov. * Generalize application protocol registration in IPVS and modify IPVS FTP helper to use it, from Julian Anastasov. * update Kconfig. The IPVS FTP helper depends on the Netfilter FTP helper for NAT support, from Julian Anastasov. * Add logic to update PMTU for IPIP packets in IPVS, again from Julian Anastasov. * A couple of sparse warning fixes for IPVS and Netfilter from Claudiu Ghioc and Patrick McHardy respectively. Patrick's IPv6 NAT changes will follow after this batch, I need to flush this batch first before refreshing my tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 18:48:52 -07:00
Pravin B Shelar	46df7b8145	openvswitch: Add support for network namespaces. Following patch adds support for network namespace to openvswitch. Since it must release devices when namespaces are destroyed, a side effect of this patch is that the module no longer keeps a refcount but instead cleans up any state when it is unloaded. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2012-08-22 14:48:55 -07:00
David S. Miller	1304a7343b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-08-22 14:21:38 -07:00
Jean Sacren	90efbed18a	netfilter: remove unnecessary goto statement for error recovery Usually it's a good practice to use goto statement for error recovery when initializing the module. This approach could be an overkill if: 1) there is only one fail case; 2) success and failure use the same return statement. For a cleaner approach, remove the unnecessary goto statement and directly implement error recovery. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-22 19:17:38 +02:00
Michael Wang	6705e86724	netfilter: replace list_for_each_continue_rcu with new interface This patch replaces list_for_each_continue_rcu() with list_for_each_entry_continue_rcu() to allow removing list_for_each_continue_rcu(). Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-22 19:17:20 +02:00
Linus Torvalds	f753c4ec15	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fixes from Sage Weil: "Jim's fix closes a narrow race introduced with the msgr changes. One fix resolves problems with debugfs initialization that Yan found when multiple client instances are created (e.g., two clusters mounted, or rbd + cephfs), another one fixes problems with mounting a nonexistent server subdirectory, and the last one fixes a divide by zero error from unsanitized ioctl input that Dan Carpenter found." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: avoid divide by zero in __validate_layout() libceph: avoid truncation due to racing banners ceph: tolerate (and warn on) extraneous dentry from mds libceph: delay debugfs initialization until we learn global_id	2012-08-22 09:58:05 -07:00
Sujith Manoharan	aba4e6fff8	mac80211: Fix AP mode regression Commit mac80211: avoid using synchronize_rcu in ieee80211_set_probe_resp changed the return value when the probe response template is not present. Revert to the earlier value of 1 - this fixes AP mode for drivers like ath9k. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-22 10:55:04 +02:00
Thomas Pedersen	27f011243a	mac80211: fix DS to MBSS address translation The destination address of unicast frames forwarded through a mesh gate was being replaced with the broadcast address. Instead leave the original destination address as the mesh DA. If the nexthop address is not in the mpath table it will be resolved. If that fails, the frame will be forwarded to known mesh gates. Reported-by: Cedric Voncken <cedric.voncken@acksys.fr> Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-22 09:45:05 +02:00
Jim Schutt	6d4221b537	libceph: avoid truncation due to racing banners Because the Ceph client messenger uses a non-blocking connect, it is possible for the sending of the client banner to race with the arrival of the banner sent by the peer. When ceph_sock_state_change() notices the connect has completed, it schedules work to process the socket via con_work(). During this time the peer is writing its banner, and arrival of the peer banner races with con_work(). If con_work() calls try_read() before the peer banner arrives, there is nothing for it to do, after which con_work() calls try_write() to send the client's banner. In this case Ceph's protocol negotiation can complete succesfully. The server-side messenger immediately sends its banner and addresses after accepting a connect request, before actually attempting to read or verify the banner from the client. As a result, it is possible for the banner from the server to arrive before con_work() calls try_read(). If that happens, try_read() will read the banner and prepare protocol negotiation info via prepare_write_connect(). prepare_write_connect() calls con_out_kvec_reset(), which discards the as-yet-unsent client banner. Next, con_work() calls try_write(), which sends the protocol negotiation info rather than the banner that the peer is expecting. The result is that the peer sees an invalid banner, and the client reports "negotiation failed". Fix this by moving con_out_kvec_reset() out of prepare_write_connect() to its callers at all locations except the one where the banner might still need to be sent. [elder@inktak.com: added note about server-side behavior] Signed-off-by: Jim Schutt <jaschut@sandia.gov> Reviewed-by: Alex Elder <elder@inktank.com>	2012-08-21 15:55:27 -07:00

... 4 5 6 7 8 ...

25311 Commits