linux_old1

Commit Graph

Author	SHA1	Message	Date
fan.du	d27fc78208	sctp: Don't lookup dst if transport dst is still valid When sctp sits on IPv6, sctp_transport_dst_check pass cookie as ZERO, as a result ip6_dst_check always fail out. This behaviour makes transport->dst useless, because every sctp_packet_transmit must look for valid dst. Add a dst_cookie into sctp_transport, and set the cookie whenever we get new dst for sctp_transport. So dst validness could be checked against it. Since I have split genid for IPv4 and IPv6, also delete/add IPv6 address will also bump IPv6 genid. So issues we discussed in: http://marc.info/?l=linux-netdev&m=137404469219410&w=4 have all been sloved for this patch. Signed-off-by: Fan Du <fan.du@windriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-02 12:36:00 -07:00
John W. Linville	89b59bcd3a	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-08-02 14:54:19 -04:00
fan.du	439677d766	ipv6: bump genid when delete/add address Server Client 2001:1::803/64 <-> 2001:1::805/64 2001:2::804/64 <-> 2001:2::806/64 Server side fib binary tree looks like this: (2001:/64) / / ffff88002103c380 / \ (2) / \ (2001::803/128) ffff880037ac07c0 / \ / \ (3) ffff880037ac0640 (2001::806/128) / \ (1) / \ (2001::804/128) (2001::805/128) Delete 2001::804/64 won't cause prefix route deleted as well as rt in (3) destinate to 2001::806 with source address as 2001::804/64. That's because 2001::803/64 is still alive, which make onlink=1 in ipv6_del_addr, this is where the substantial difference between same prefix configuration and different prefix configuration :) So packet are still transmitted out to 2001::806 with source address as 2001::804/64. So bump genid will clear rt in (3), and up layer protocol will eventually find the right one for themselves. This problem arised from the discussion in here: http://marc.info/?l=linux-netdev&m=137404469219410&w=4 Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 16:33:32 -07:00
Ying Xue	c756891a4e	tipc: fix oops when creating server socket fails When creation of TIPC internal server socket fails, we get an oops with the following dump: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] PGD 13719067 PUD 12008067 PMD 0 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: tipc(+) CPU: 4 PID: 4340 Comm: insmod Not tainted 3.10.0+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff880014360000 ti: ffff88001374c000 task.ti: ffff88001374c000 RIP: 0010:[<ffffffffa0011f49>] [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP: 0018:ffff88001374dc98 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff880012ac09d8 RCX: 0000000000000000 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffff880014360000 RBP: ffff88001374dcb8 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0016fa0 R13: ffffffffa0017010 R14: ffffffffa0017010 R15: ffff880012ac09d8 FS: 0000000000000000(0000) GS:ffff880016600000(0063) knlGS:00000000f76668d0 CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b CR2: 0000000000000020 CR3: 0000000012227000 CR4: 00000000000006e0 Stack: ffff88001374dcb8 ffffffffa0016fa0 0000000000000000 0000000000000001 ffff88001374dcf8 ffffffffa0012922 ffff88001374dce8 00000000ffffffea ffffffffa0017100 0000000000000000 ffff8800134241a8 ffffffffa0017150 Call Trace: [<ffffffffa0012922>] tipc_server_stop+0xa2/0x1b0 [tipc] [<ffffffffa0009995>] tipc_subscr_stop+0x15/0x20 [tipc] [<ffffffffa00130f5>] tipc_core_stop+0x1d/0x33 [tipc] [<ffffffffa001f0d4>] tipc_init+0xd4/0xf8 [tipc] [<ffffffffa001f000>] ? 0xffffffffa001efff [<ffffffff8100023f>] do_one_initcall+0x3f/0x150 [<ffffffff81082f4d>] ? __blocking_notifier_call_chain+0x7d/0xd0 [<ffffffff810cc58a>] load_module+0x11aa/0x19c0 [<ffffffff810c8d60>] ? show_initstate+0x50/0x50 [<ffffffff8190311c>] ? retint_restore_args+0xe/0xe [<ffffffff810cce79>] SyS_init_module+0xd9/0x110 [<ffffffff8190dc65>] sysenter_dispatch+0x7/0x1f Code: 6c 24 70 4c 89 ef e8 b7 04 8f e1 8b 73 04 4c 89 e7 e8 7c 9e 32 e1 41 83 ac 24 b8 00 00 00 01 4c 89 ef e8 eb 0a 8f e1 48 8b 43 08 <4c> 8b 68 20 4d 8d a5 48 03 00 00 4c 89 e7 e8 04 05 8f e1 4c 89 RIP [<ffffffffa0011f49>] tipc_close_conn+0x59/0xb0 [tipc] RSP <ffff88001374dc98> CR2: 0000000000000020 ---[ end trace b02321f40e4269a3 ]--- We have the following call chain: tipc_core_start() ret = tipc_subscr_start() ret = tipc_server_start(){ server->enabled = 1; ret = tipc_open_listening_sock() } I.e., the server->enabled flag is unconditionally set to 1, whatever the return value of tipc_open_listening_sock(). This causes a crash when tipc_core_start() tries to clean up resources after a failed initialization: if (ret == failed) tipc_subscr_stop() tipc_server_stop(){ if (server->enabled) tipc_close_conn(){ NULL reference of con->sock-sk OOPS! } } To avoid this, tipc_server_start() should only set server->enabled to 1 in case of a succesful socket creation. In case of failure, it should release all allocated resources before returning. Problem introduced in commit `c5fa7b3cf3` ("tipc: introduce new TIPC server infrastructure") in v3.11-rc1. Note that it won't be seen often; it takes a module load under memory constrained conditions in order to trigger the failure condition. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 15:54:33 -07:00
Cong Wang	e0d1095ae3	net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL Eliezer renames several ll_poll to busy_poll, but forgets CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too. Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 15:11:17 -07:00
Jiri Benc	8a226b2cfa	ipv6: prevent race between address creation and removal There's a race in IPv6 automatic addess assignment. The address is created with zero lifetime when it's added to various address lists. Before it gets assigned the correct lifetime, there's a window where a new address may be configured. This causes the semi-initiated address to be deleted in addrconf_verify. This was discovered as a reference leak caused by concurrent run of __ipv6_ifa_notify for both RTM_NEWADDR and RTM_DELADDR with the same address. Fix this by setting the lifetime before the address is added to inet6_addr_lst. A few notes: 1. In addrconf_prefix_rcv, by setting update_lft to zero, the if (update_lft) { ... } condition is no longer executed for newly created addresses. This is okay, as the ifp fields are set in ipv6_add_addr now and ipv6_ifa_notify is called (and has been called) through addrconf_dad_start. 2. The removal of the whole block under ifp->lock in inet6_addr_add is okay, too, as tstamp is initialized to jiffies in ipv6_add_addr. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 14:16:20 -07:00
Jiri Pirko	3f8f52982a	ipv6: move peer_addr init into ipv6_add_addr() Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 14:16:20 -07:00
Michal Kubeček	49a18d86f6	ipv6: update ip6_rt_last_gc every time GC is run As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should hold the last time garbage collector was run so that we should update it whenever fib6_run_gc() calls fib6_clean_all(), not only if we got there from ip6_dst_gc(). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 14:16:20 -07:00
Michal Kubeček	2ac3ac8f86	ipv6: prevent fib6_run_gc() contention On a high-traffic router with many processors and many IPv6 dst entries, soft lockup in fib6_run_gc() can occur when number of entries reaches gc_thresh. This happens because fib6_run_gc() uses fib6_gc_lock to allow only one thread to run the garbage collector but ip6_dst_gc() doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc() returns. On a system with many entries, this can take some time so that in the meantime, other threads pass the tests in ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for the lock. They then have to run the garbage collector one after another which blocks them for quite long. Resolve this by replacing special value ~0UL of expire parameter to fib6_run_gc() by explicit "force" parameter to choose between spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with force=false if gc_thresh is reached but not max_size. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 14:16:20 -07:00
John W. Linville	7546ff9549	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2013-08-01 15:26:52 -04:00
John W. Linville	22e02a0272	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-08-01 14:30:59 -04:00
Simon Wunderlich	73da7d5bab	mac80211: add channel switch command and beacon callbacks The count field in CSA must be decremented with each beacon transmitted. This patch implements the functionality for drivers using ieee80211_beacon_get(). Other drivers must call back manually after reaching count == 0. This patch also contains the handling and finish worker for the channel switch command, and mac80211/chanctx code to allow to change a channel definition of an active channel context. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> [small cleanups, catch identical chandef] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-08-01 18:30:33 +02:00
Simon Wunderlich	16ef1fe272	nl80211/cfg80211: add channel switch command To allow channel switch announcements within beacons, add the channel switch command to nl80211/cfg80211. This is implementation is intended for AP and (later) IBSS mode. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-08-01 18:30:28 +02:00
Johannes Berg	7cf1f14ecf	mac80211: add debugfs for driver-buffered TID bitmap Add a per-station debugfs file indicating the TIDs (as a bitmap) that the driver has data buffered on. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-08-01 15:08:24 +02:00
J. Bruce Fields	7193bd17ea	svcrpc: set cr_gss_mech from gss-proxy as well as legacy upcall The change made to rsc_parse() in `0dc1531aca` "svcrpc: store gss mech in svc_cred" should also have been propagated to the gss-proxy codepath. This fixes a crash in the gss-proxy case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-01 08:42:01 -04:00
J. Bruce Fields	743e217129	svcrpc: fix kfree oops in gss-proxy code mech_oid.data is an array, not kmalloc()'d memory. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-01 08:41:29 -04:00
J. Bruce Fields	dc43376c26	svcrpc: fix gss-proxy xdr decoding oops Uninitialized stack data was being used as the destination for memcpy's. Longer term we'll just delete some of this code; all we're doing is skipping over xdr that we don't care about. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-01 08:40:42 -04:00
J. Bruce Fields	9f96392b0a	svcrpc: fix gss_rpc_upcall create error Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-01 08:39:30 -04:00
NeilBrown	447383d2ba	NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure. Since we enabled auto-tuning for sunrpc TCP connections we do not guarantee that there is enough write-space on each connection to queue a reply. If memory pressure causes the window to shrink too small, the request throttling in sunrpc/svc will not accept any requests so no more requests will be handled. Even when pressure decreases the window will not grow again until data is sent on the connection. This means we get a deadlock: no requests will be handled until there is more space, and no space will be allocated until a request is handled. This can be simulated by modifying svc_tcp_has_wspace to inflate the number of byte required and removing the 'svc_sock_setbufsize' calls in svc_setup_socket. I found that multiplying by 16 was enough to make the requirement exceed the default allocation. With this modification in place: mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt would block and eventually time out because the nfs server could not accept any requests. This patch relaxes the request throttling to always allow at least one request through per connection. It does this by checking both sk_stream_min_wspace() and xprt->xpt_reserved are zero. The first is zero when the TCP transmit queue is empty. The second is zero when there are no RPC requests being processed. When both of these are zero the socket is idle and so one more request can safely be allowed through. Applying this patch allows the above mount command to succeed cleanly. Tracing shows that the allocated write buffer space quickly grows and after a few requests are handled, the extra tests are no longer needed to permit further requests to be processed. The main purpose of request throttling is to handle the case when one client is slow at collecting replies and the send queue gets full of replies that the client hasn't acknowledged (at the TCP level) yet. As we only change behaviour when the send queue is empty this main purpose is still preserved. Reported-by: Ben Myers <bpm@sgi.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-01 08:39:16 -04:00
Pablo Neira Ayuso	a206bcb3b0	netfilter: xt_TCPOPTSTRIP: fix possible off by one access Fix a possible off by one access since optlen() touches opt[offset+1] unsafely when i == tcp_hdrlen(skb) - 1. This patch replaces tcp_hdrlen() by the local variable tcp_hdrlen that stores the TCP header length, to save some cycles. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-08-01 11:45:15 +02:00
Pablo Neira Ayuso	71ffe9c77d	netfilter: xt_TCPMSS: fix handling of malformed TCP header and options Make sure the packet has enough room for the TCP header and that it is not malformed. While at it, store tcph->doff*4 in a variable, as it is used several times. This patch also fixes a possible off by one in case of malformed TCP options. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-08-01 11:42:53 +02:00
Karl Beldan	a824131017	mac80211: report some VHT radiotap infos for tx status The radiotap VHT info is 12 bytes (required to be aligned on 2) : u16 known - IEEE80211_RADIOTAP_VHT_KNOWN_* u8 flags - IEEE80211_RADIOTAP_VHT_FLAG_* u8 bandwidth u8 mcs_nss[4] u8 coding u8 group_id u16 partial_aid ATM mac80211 can handle IEEE80211_RADIOTAP_VHT_KNOWN_{GI,BANDWIDTH} and mcs_nss[0] (i.e single user) in simple cases. This is more a placeholder to let sniffers give more clues for VHT, since we don't have yet the proper infrastructure/conventions in mac80211 for complete feedback (e.g consider dynamic BW). Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-08-01 10:49:04 +02:00
Antonio Quartulli	dad9defd28	mac80211: ibss - remove not authorized station earlier A station which is not authorized has to be purged earlier to give it a chance to re-try to establish an IBSS/RSN session soon. Set the timeout to 10 seconds. Some refactoring has also been done to allow the IBSS submodule to have its own expiring function. Reported-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-08-01 10:49:02 +02:00
Fabio Baltieri	e47f2509e5	mac80211: use oneshot blink API for LED triggers Change mac80211 LED trigger code to use the generic led_trigger_blink_oneshot() API for transmit and receive activity indication. This gives a better feedback to the user, as with the new API each activity event results in a visible blink, while a constant traffic results in a continuous blink at constant rate. Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com> [fix LED disabled build error] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-08-01 10:48:49 +02:00
Hannes Frederic Sowa	46b3a42190	ipv6: fib6_rules should return exact return value With the addition of the suppress operation (`7764a45a8f` ("fib_rules: add .suppress operation") we rely on accurate error reporting of the fib_rules.actions. fib6_rule_action always returned -EAGAIN in case we could not find a matching route and 0 if a rule was matched. This also included a match for blackhole or prohibited rule actions which could get suppressed by the new logic. So adapt fib6_rule_action to always return the correct error code as its counterpart fib4_rule_action does. This also fixes a possiblity of nullptr-deref where we don't find a table, thus rt == NULL. Because the condition rt != ip6_null_entry still holdes it seems we could later get a nullptr bug on dereference rt->dst. v2: a) Fixed a brain fart in the commit msg (the rule => a table, etc). No changes to the patch. Cc: Stefan Tomanek <stefan.tomanek@wertarbyte.de> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 00:26:22 -07:00
Linus Lüssing	b00589af3b	bridge: disable snooping if there is no querier If there is no querier on a link then we won't get periodic reports and therefore won't be able to learn about multicast listeners behind ports, potentially leading to lost multicast packets, especially for multicast listeners that joined before the creation of the bridge. These lost multicast packets can appear since `c5c2326059` ("bridge: Add multicast_querier toggle and disable queries by default") in particular. With this patch we are flooding multicast packets if our querier is disabled and if we didn't detect any other querier. A grace period of the Maximum Response Delay of the querier is added to give multicast responses enough time to arrive and to be learned from before disabling the flooding behaviour again. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-31 17:40:21 -07:00
Stefan Tomanek	7764a45a8f	fib_rules: add .suppress operation This change adds a new operation to the fib_rules_ops struct; it allows the suppression of routing decisions if certain criteria are not met by its results. The first implemented constraint is a minimum prefix length added to the structures of routing rules. If a rule is added with a minimum prefix length >0, only routes meeting this threshold will be considered. Any other (more general) routing table entries will be ignored. When configuring a system with multiple network uplinks and default routes, it is often convinient to reference the main routing table multiple times - but omitting the default route. Using this patch and a modified "ip" utility, this can be achieved by using the following command sequence: $ ip route add table secuplink default via 10.42.23.1 $ ip rule add pref 100 table main prefixlength 1 $ ip rule add pref 150 fwmark 0xA table secuplink With this setup, packets marked 0xA will be processed by the additional routing table "secuplink", but only if no suitable route in the main routing table can be found. By using a minimal prefixlength of 1, the default route (/0) of the table "main" is hidden to packets processed by rule 100; packets traveling to destinations with more specific routing entries are processed as usual. Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-31 17:27:17 -07:00
Dan Carpenter	8cb3b9c364	net_sched: info leak in atm_tc_dump_class() The "pvc" struct has a hole after pvc.sap_family which is not cleared. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-31 15:04:19 -07:00
Eric Dumazet	f2f872f927	netem: Introduce skb_orphan_partial() helper Commit `547669d483` ("tcp: xps: fix reordering issues") added unexpected reorders in case netem is used in a MQ setup for high performance test bed. ETH=eth0 tc qd del dev $ETH root 2>/dev/null tc qd add dev $ETH root handle 1: mq for i in `seq 1 32` do tc qd add dev $ETH parent 1:$i netem delay 100ms done As all tcp packets are orphaned by netem, TCP stack believes it can set skb->ooo_okay on all packets. In order to allow producers to send more packets, we want to keep sk_wmem_alloc from reaching sk_sndbuf limit. We can do that by accounting one byte per skb in netem queues, so that TCP stack is not fooled too much. Tested: With above MQ/netem setup, scaling number of concurrent flows gives linear results and no reorders/retransmits lpq83:~# for n in 1 10 20 30 40 50 60 70 80 90 100 do echo -n "n:$n " ; ./super_netperf $n -H 10.7.7.84; done n:1 198.46 n:10 2002.69 n:20 4000.98 n:30 6006.35 n:40 8020.93 n:50 10032.3 n:60 12081.9 n:70 13971.3 n:80 16009.7 n:90 17117.3 n:100 17425.5 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-31 14:59:49 -07:00
fan.du	ca4c3fc24e	net: split rt_genid for ipv4 and ipv6 Current net name space has only one genid for both IPv4 and IPv6, it has below drawbacks: - Add/delete an IPv4 address will invalidate all IPv6 routing table entries. - Insert/remove XFRM policy will also invalidate both IPv4/IPv6 routing table entries even when the policy is only applied for one address family. Thus, this patch attempt to split one genid for two to cater for IPv4 and IPv6 separately in a fine granularity. Signed-off-by: Fan Du <fan.du@windriver.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-31 14:56:36 -07:00
Linus Torvalds	06693f305e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix association failures not triggering a connect-failure event in cfg80211, from Johannes Berg. 2) Eliminate a potential NULL deref with older iptables tools when configuring xt_socket rules, from Eric Dumazet. 3) Missing RTNL locking in wireless regulatory code, from Johannes Berg. 4) Fix OOPS caused by firmware loading races in ath9k_htc, from Alexey Khoroshilov. 5) Fix usb URB leak in usb_8dev CAN driver, also from Alexey Khoroshilov. 6) VXLAN namespace teardown fails to unregister devices, from Stephen Hemminger. 7) Fix multicast settings getting dropped by firmware in qlcnic driver, from Sucheta Chakraborty. 8) Add sysctl range enforcement for tcp_syn_retries, from Michal Tesar. 9) Fix a nasty bug in bridging where an active timer would get reinitialized with a setup_timer() call. From Eric Dumazet. 10) Fix use after free in new mlx5 driver, from Dan Carpenter. 11) Fix freed pointer reference in ipv6 multicast routing on namespace cleanup, from Hannes Frederic Sowa. 12) Some usbnet drivers report TSO and SG in their feature set, but the usbnet layer doesn't really support them. From Eric Dumazet. 13) Fix crash on EEH errors in tg3 driver, from Gavin Shan. 14) Drop cb_lock when requesting modules in genetlink, from Stanislaw Gruszka. 15) Kernel stack leaks in cbq scheduler and af_key pfkey messages, from Dan Carpenter. 16) FEC driver erroneously signals NETDEV_TX_BUSY on transmit leading to endless loops, from Uwe Kleine-König. 17) Fix hangs from loading mvneta driver, from Arnaud Patard. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (84 commits) mlx5: fix error return code in mlx5_alloc_uuars() mvneta: Try to fix mvneta when compiled as module mvneta: Fix hang when loading the mvneta driver atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE af_key: more info leaks in pfkey messages net/fec: Don't let ndo_start_xmit return NETDEV_TX_BUSY without link net_sched: Fix stack info leak in cbq_dump_wrr(). igb: fix vlan filtering in promisc mode when not in VT mode ixgbe: Fix Tx Hang issue with lldpad on 82598EB genetlink: release cb_lock before requesting additional module net: fec: workaround stop tx during errata ERR006358 qlcnic: Fix diagnostic interrupt test for 83xx adapters. qlcnic: Fix setting Guest VLAN qlcnic: Fix operation type and command type. qlcnic: Fix initialization of work function. Revert "atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring" atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring net/tg3: Fix warning from pci_disable_device() net/tg3: Fix kernel crash ...	2013-07-31 12:56:18 -07:00
Johannes Berg	ddfe49b42d	mac80211: continue using disabled channels while connected In case the AP has different regulatory information than we do, it can happen that we connect to an AP based on e.g. the world roaming regulatory data, and then update our database with the AP's country information disables the channel the AP is using. If this happens on an HT AP, the bandwidth tracking code will hit the WARN_ON() and disconnect. Since that's not very useful, ignore the channel-disable flag in bandwidth tracking. Cc: stable@vger.kernel.org Reported-by: Chris Wright <chrisw@sous-sol.org> Tested-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-31 21:18:17 +02:00
Johannes Berg	74418edec9	cfg80211: fix P2P GO interface teardown When a P2P GO interface goes down, cfg80211 doesn't properly tear it down, leading to warnings later. Add the GO interface type to the enumeration to tear it down like AP interfaces. Otherwise, we leave it pending and mac80211's state can get very confused, leading to warnings later. Cc: stable@vger.kernel.org Reported-by: Ilan Peer <ilan.peer@intel.com> Tested-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-31 21:18:17 +02:00
Johannes Berg	5cdaed1e87	mac80211: ignore HT primary channel while connected While we're connected, the AP shouldn't change the primary channel in the HT information. We checked this, and dropped the connection if it did change it. Unfortunately, this is causing problems on some APs, e.g. on the Netgear WRT610NL: the beacons seem to always contain a bad channel and if we made a connection using a probe response (correct data) we drop the connection immediately and can basically not connect properly at all. Work around this by ignoring the HT primary channel information in beacons if we're already connected. Also print out more verbose messages in the other situations to help diagnose similar bugs quicker in the future. Cc: stable@vger.kernel.org [3.10] Acked-by: Andy Isaacson <adi@hexapodia.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-31 21:18:10 +02:00
Dmitry Popov	c0155b2da4	tcp: Remove unused tcpct declarations and comments Remove declaration, 4 defines and confusing comment that are no longer used since `1a2c6181c4` ("tcp: Remove TCPCT"). Signed-off-by: Dmitry Popov <dp@highloadlab.com> Acked-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-31 12:16:45 -07:00
Johannes Berg	cb236d2d71	mac80211: don't wait for TX status forever TX status notification can get lost, or the frames could get stuck on the queue, so don't wait for the callback from the driver forever and instead time out after half a second. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-31 21:16:17 +02:00
John W. Linville	11a45820d0	This is the second NFC fixes pull request for 3.11. We have: - A build failure fix for the NCI SPI transport layer due to a missing CRC_CCITT Kconfig dependency. - A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11 merge window but the typical terminology for loading a firmware to a target is firmware download rather than upload. In order to avoid any confusion in a file exported to userspace, we rename this command into CMD_FW_DOWNLOAD. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJR+E9jAAoJEIqAPN1PVmxKzkwP/3dy9wpwQG7f8FLv61IhbhhQ 8gVqY3BX1RJdez1vH5MvqkTK6U3SmlQvtJM8pIPyfPXyR1+Af5AxqQh3vjTP3xUG PuMtmQOlz5OJ6ErttxZtYERVtrhFkasMVmVqKrN9ptItPADfOmeC0/hyoEnoYsWQ HrhZn1lYsf98zmEbNS2KoRcZUVLClbg4xosTktTVaz56jIGVuM8MAch+FS+tJhhl av0MX/VZvAUllSnlWWDmt0Lh9isJOLOMtqIRj6PLBAp2ra9sPNO5TlZ4lz2og2gx zesVhBBLyiF9oluuQj/FJft+s5Khcm0R9W969raL5SvehWY77wHoY76ZqHMUE2Qv 7RPUvFRfOA5LvKJM8MduJ8fMf830mZWD7cByhIfUxtWQZumwPfn2Mbl3xNkPLFZB L2x13SwGjU+PdCo70+ybgr8zUvYIxiVULwq5xFynvXJNSpOujIe3nPdQb7QtK8C0 4d9OudAHmfHsW93PMBE+Zki8i8GDLTR3DOQoXIRi7oPR+EVL2JDsBQvnauXhdSap mp9iyuoqAYjgc6e2o8coVqViXWbKmBEa9n7NKrX3dPrI9e5F67WChAyehBCu9KV3 zZxruhEJBw6PLmIGDETk1XIVd9G6rfMBswnDfSJBjjG5PrUh6Xbfwa1y+KiRKqCh FG+IvbfHWZRmdeFX3U4P =p4r5 -----END PGP SIGNATURE----- Merge tag 'nfc-fixes-3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes Samuel Ortiz <sameo@linux.intel.com> says: 'This is the second NFC fixes pull request for 3.11. We have: - A build failure fix for the NCI SPI transport layer due to a missing CRC_CCITT Kconfig dependency. - A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11 merge window but the typical terminology for loading a firmware to a target is firmware download rather than upload. In order to avoid any confusion in a file exported to userspace, we rename this command into CMD_FW_DOWNLOAD." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-07-31 15:15:50 -04:00
Chris Wright	b56e4b857c	mac80211: fix infinite loop in ieee80211_determine_chantype Commit "3d9646d mac80211: fix channel selection bug" introduced a possible infinite loop by moving the out target above the chandef_downgrade while loop. When we downgrade to NL80211_CHAN_WIDTH_20_NOHT, we jump back up to re-run the while loop...indefinitely. Replace goto with break and carry on. This may not be sufficient to connect to the AP, but will at least keep the cpu from livelocking. Thanks to Derek Atkins as an extra pair of debugging eyes. Cc: stable@kernel.org Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-31 21:15:36 +02:00
John W. Linville	704278ccb5	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Conflicts: net/bluetooth/hci_core.c	2013-07-31 15:11:50 -04:00
Patrick McHardy	12e7ada385	netfilter: nf_nat: use per-conntrack locking for sequence number adjustments Get rid of the global lock and use per-conntrack locks for protecting the sequencen number adjustment data. Additionally saves one lock/unlock operation for every TCP packet. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 19:54:59 +02:00
Patrick McHardy	2d89c68ac7	netfilter: nf_nat: change sequence number adjustments to 32 bits Using 16 bits is too small, when many adjustments happen the offsets might overflow and break the connection. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 19:54:51 +02:00
Patrick McHardy	0658cdc8f3	netfilter: nf_nat: fix locking in nf_nat_seq_adjust() nf_nat_seq_adjust() needs to grab nf_nat_seqofs_lock to protect against concurrent changes to the sequence adjustment data. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 19:54:24 +02:00
Florian Westphal	02982c27ba	netfilter: nf_conntrack: remove duplicate code in ctnetlink ctnetlink contains copy-paste code from death_by_timeout. In order to avoid changing both places in upcoming event delivery patch, export death_by_timeout functionality and use it in the ctnetlink code. Based on earlier patch from Pablo Neira. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 18:51:23 +02:00
Florian Westphal	93742cf8af	netfilter: tproxy: remove nf_tproxy_core.h We've removed nf_tproxy_core.ko, so also remove its header. The lookup helpers are split and then moved to tproxy target/socket match. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 18:43:45 +02:00
Florian Westphal	fd158d79d3	netfilter: tproxy: remove nf_tproxy_core, keep tw sk assigned to skb The module was "permanent", due to the special tproxy skb->destructor. Nowadays we have tcp early demux and its sock_edemux destructor in networking core which can be used instead. Thanks to early demux changes the input path now also handles "skb->sk is tw socket" correctly, so this no longer needs the special handling introduced with commit `d503b30bd6` (netfilter: tproxy: do not assign timewait sockets to skb->sk). Thus: - move assign_sock function to where its needed - don't prevent timewait sockets from being assigned to the skb - remove nf_tproxy_core. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 16:39:40 +02:00
Florian Westphal	957bec3685	netfilter: nf_queue: relax NFQA_CT attribute check Allow modifying attributes of the conntrack associated with a packet without first requesting ct data via CFG_F_CONNTRACK or extra nfnetlink_conntrack socket. Also remove unneded rcu_read_lock; the entire function is already protected by rcu. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 16:39:29 +02:00
Florian Westphal	5813a8eb47	netfilter: connlabels: remove unneeded includes leftovers from the (never merged) v1 patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 16:39:18 +02:00
Patrick McHardy	312a0c16c1	netfilter: nf_conntrack: constify sk_buff argument to nf_ct_attach() Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 16:37:38 +02:00
Phil Oester	5774c94ace	netfilter: xt_addrtype: fix trivial typo Fix typo in error message. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-31 16:36:25 +02:00
Dan Carpenter	4299c8a94f	net: remove an unneeded check "ifa->ifa_label" is an array inside the in_ifaddr struct. It can never be NULL so we can remove this check. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 19:24:16 -07:00
Tom Herbert	b438f940d3	flow_dissector: add support for IPPROTO_IPV6 Support IPPROTO_IPV6 similar to IPPROTO_IPIP Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 19:16:52 -07:00
Tom Herbert	fca4189551	flow_dissector: clean up IPIP case Explicitly set proto to ETH_P_IP and jump directly to ip processing. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 19:16:51 -07:00
Jiri Pirko	ff80e519ab	net: export physical port id via sysfs Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 17:31:25 -07:00
Jiri Pirko	66cae9ed6b	rtnl: export physical port id via RT netlink Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 17:31:24 -07:00
Jiri Pirko	66b52b0dc8	net: add ndo to get id of physical port of the device This patch adds a ndo for getting physical port of the device. Driver which is aware of being virtual function of some physical port should implement this ndo. This is applicable not only for IOV, but for other solutions (NPAR, multichannel) as well. Basically if there is possible to have multiple netdevs on the single hw port. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 17:31:24 -07:00
Thomas Graf	ffd756b317	pktgen: Require CONFIG_INET due to use of IPv4 checksum function Unlike for IPv6, the IPv4 checksum functions are only available if CONFIG_INET is set. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 16:45:09 -07:00
Pablo Neira	e1ee3673a8	genetlink: fix usage of NLM_F_EXCL or NLM_F_REPLACE Currently, it is not possible to use neither NLM_F_EXCL nor NLM_F_REPLACE from genetlink. This is due to this checking in genl_family_rcv_msg: if (nlh->nlmsg_flags & NLM_F_DUMP) NLM_F_DUMP is NLM_F_MATCH\|NLM_F_ROOT. Thus, if NLM_F_EXCL or NLM_F_REPLACE flag is set, genetlink believes that you're requesting a dump and it calls the .dumpit callback. The solution that I propose is to refine this checking to make it stricter: if ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) And given the combination NLM_F_REPLACE and NLM_F_EXCL does not make sense to me, it removes the ambiguity. There was a patch that tried to fix this some time ago (`0ab03c2` netlink: test for all flags of the NLM_F_DUMP composite) but it tried to resolve this ambiguity in all existing netlink subsystems, not only genetlink. That patch was reverted since it broke iproute2, which is using NLM_F_ROOT to request the dump of the routing cache. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 16:43:19 -07:00
Dan Carpenter	ff862a4668	af_key: more info leaks in pfkey messages This is inspired by `a5cc68f3d6` "af_key: fix info leaks in notify messages". There are some struct members which don't get initialized and could disclose small amounts of private information. Acked-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 16:26:16 -07:00
Samuel Ortiz	9ea7187c53	NFC: netlink: Rename CMD_FW_UPLOAD to CMD_FW_DOWNLOAD Loading a firmware into a target is typically called firmware download, not firmware upload. So we rename the netlink API to NFC_CMD_FW_DOWNLOAD in order to avoid any terminology confusion from userspace. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-07-31 01:19:43 +02:00
Hannes Frederic Sowa	5ad37d5dee	tcp: add tcp_syncookies mode to allow unconditionally generation of syncookies \| If you want to test which effects syncookies have to your \| network connections you can set this knob to 2 to enable \| unconditionally generation of syncookies. Original idea and first implementation by Eric Dumazet. Cc: Florian Westphal <fw@strlen.de> Cc: David Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 16:15:18 -07:00
Andi Shyti	60ff779c4a	9p: client: remove unused code and any reference to "cancelled" function This patch reverts commit `80b45261a0` which was implementing a 'cancelled' functionality to notify that a cancelled request will not be replied. This implementation was not used anywhere and therefore removed. Signed-off-by: Andi Shyti <andi@etezian.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 15:54:28 -07:00
Johannes Berg	c319d50bfc	nl80211: fix another nl80211_fam.attrbuf race This is similar to the race Linus had reported, but in this case it's an older bug: nl80211_prepare_wdev_dump() uses the wiphy index in cb->args[0] as it is and thus parses the message over and over again instead of just once because 0 is the first valid wiphy index. Similar code in nl80211_testmode_dump() correctly offsets the wiphy_index by 1, do that here as well. Cc: stable@vger.kernel.org Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-30 22:40:34 +02:00
Kent Overstreet	d29c445b63	aio: Kill ki_dtor sock_aio_dtor() is dead code - and stuff that does need to do cleanup can simply do it before calling aio_complete(). Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>	2013-07-30 11:53:12 -04:00
Kent Overstreet	73a7075e3f	aio: Kill aio_rw_vect_retry() This code doesn't serve any purpose anymore, since the aio retry infrastructure has been removed. This change should be safe because aio_read/write are also used for synchronous IO, and called from do_sync_read()/do_sync_write() - and there's no looping done in the sync case (the read and write syscalls). Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>	2013-07-30 11:53:12 -04:00
David S. Miller	a0db856a95	net_sched: Fix stack info leak in cbq_dump_wrr(). Make sure the reserved fields, and padding (if any), are fully initialized. Based upon a patch by Dan Carpenter and feedback from Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-30 00:16:21 -07:00
Johan Hedberg	53e21fbc28	Bluetooth: Fix calling request callback more than once In certain circumstances, such as an HCI driver using __hci_cmd_sync_ev with HCI_EV_CMD_COMPLETE as the expected completion event there is the chance that hci_event_packet will call hci_req_cmd_complete twice (once for the explicitly looked after event and another time in the actual handler of cmd_complete). In the case of __hci_cmd_sync_ev this introduces a race where the first call wakes up the blocking __hci_cmd_sync_ev and lets it complete. However, by the time that a second __hci_cmd_sync_ev call is already in progress the second hci_req_cmd_complete call (from the previous operation) will wake up the blocking function prematurely and cause it to fail, as witnessed by the following log: [ 639.232195] hci_rx_work: hci0 Event packet [ 639.232201] hci_req_cmd_complete: opcode 0xfc8e status 0x00 [ 639.232205] hci_sent_cmd_data: hci0 opcode 0xfc8e [ 639.232210] hci_req_sync_complete: hci0 result 0x00 [ 639.232220] hci_cmd_complete_evt: hci0 opcode 0xfc8e [ 639.232225] hci_req_cmd_complete: opcode 0xfc8e status 0x00 [ 639.232228] __hci_cmd_sync_ev: hci0 end: err 0 [ 639.232234] __hci_cmd_sync_ev: hci0 [ 639.232238] hci_req_add_ev: hci0 opcode 0xfc8e plen 250 [ 639.232242] hci_prepare_cmd: skb len 253 [ 639.232246] hci_req_run: length 1 [ 639.232250] hci_sent_cmd_data: hci0 opcode 0xfc8e [ 639.232255] hci_req_sync_complete: hci0 result 0x00 [ 639.232266] hci_cmd_work: hci0 cmd_cnt 1 cmd queued 1 [ 639.232271] __hci_cmd_sync_ev: hci0 end: err 0 [ 639.232276] Bluetooth: hci0 sending Intel patch command (0xfc8e) failed (-61) Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-29 12:28:04 +01:00
Johan Hedberg	3f8e2d75c1	Bluetooth: Fix HCI init for BlueFRITZ! devices None of the BlueFRITZ! devices with manufacurer ID 31 (AVM Berlin) support HCI_Read_Local_Supported_Commands. It is safe to use the manufacturer ID (instead of e.g. a USB ID specific quirk) because the company never created any newer controllers. < HCI Command: Read Local Supported Comm.. (0x04\|0x0002) plen 0 [hci0] 0.210014 > HCI Event: Command Status (0x0f) plen 4 [hci0] 0.217361 Read Local Supported Commands (0x04\|0x0002) ncmd 1 Status: Unknown HCI Command (0x01) Reported-by: Jörg Esser <jackfritt@boh.de> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Jörg Esser <jackfritt@boh.de> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-29 12:12:27 +01:00
Stephen Rothwell	73d94e9481	pktgen: add needed include file Fixes this on PowerPC (at least): net/core/pktgen.c: In function 'fill_packet_ipv6': net/core/pktgen.c:2906:3: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration] udph->check = ~csum_ipv6_magic(&iph->saddr, &iph->daddr, udplen, IPPROTO_UDP, 0); ^ Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-29 00:47:14 -07:00
Hannes Frederic Sowa	9d4a031464	ipv4, ipv6: send igmpv3/mld packets with TC_PRIO_CONTROL v2: a) Also send ipv4 igmp messages with TC_PRIO_CONTROL Cc: William Manley <william.manley@youview.com> Cc: Lukas Tribus <luky-37@hotmail.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-28 11:13:55 -07:00
Stanislaw Gruszka	c74f2b2678	genetlink: release cb_lock before requesting additional module Requesting external module with cb_lock taken can result in the deadlock like showed below: [ 2458.111347] Showing all locks held in the system: [ 2458.111347] 1 lock held by NetworkManager/582: [ 2458.111347] #0: (cb_lock){++++++}, at: [<ffffffff8162bc79>] genl_rcv+0x19/0x40 [ 2458.111347] 1 lock held by modprobe/603: [ 2458.111347] #0: (cb_lock){++++++}, at: [<ffffffff8162baa5>] genl_lock_all+0x15/0x30 [ 2461.579457] SysRq : Show Blocked State [ 2461.580103] task PC stack pid father [ 2461.580103] NetworkManager D ffff880034b84500 4040 582 1 0x00000080 [ 2461.580103] ffff8800197ff720 0000000000000046 00000000001d5340 ffff8800197fffd8 [ 2461.580103] ffff8800197fffd8 00000000001d5340 ffff880019631700 7fffffffffffffff [ 2461.580103] ffff8800197ff880 ffff8800197ff878 ffff880019631700 ffff880019631700 [ 2461.580103] Call Trace: [ 2461.580103] [<ffffffff817355f9>] schedule+0x29/0x70 [ 2461.580103] [<ffffffff81731ad1>] schedule_timeout+0x1c1/0x360 [ 2461.580103] [<ffffffff810e69eb>] ? mark_held_locks+0xbb/0x140 [ 2461.580103] [<ffffffff817377ac>] ? _raw_spin_unlock_irq+0x2c/0x50 [ 2461.580103] [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [ 2461.580103] [<ffffffff81736398>] wait_for_completion_killable+0xe8/0x170 [ 2461.580103] [<ffffffff810b7fa0>] ? wake_up_state+0x20/0x20 [ 2461.580103] [<ffffffff81095825>] call_usermodehelper_exec+0x1a5/0x210 [ 2461.580103] [<ffffffff817362ed>] ? wait_for_completion_killable+0x3d/0x170 [ 2461.580103] [<ffffffff81095cc3>] __request_module+0x1b3/0x370 [ 2461.580103] [<ffffffff810e6b6d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [ 2461.580103] [<ffffffff8162c5c9>] ctrl_getfamily+0x159/0x190 [ 2461.580103] [<ffffffff8162d8a4>] genl_family_rcv_msg+0x1f4/0x2e0 [ 2461.580103] [<ffffffff8162d990>] ? genl_family_rcv_msg+0x2e0/0x2e0 [ 2461.580103] [<ffffffff8162da1e>] genl_rcv_msg+0x8e/0xd0 [ 2461.580103] [<ffffffff8162b729>] netlink_rcv_skb+0xa9/0xc0 [ 2461.580103] [<ffffffff8162bc88>] genl_rcv+0x28/0x40 [ 2461.580103] [<ffffffff8162ad6d>] netlink_unicast+0xdd/0x190 [ 2461.580103] [<ffffffff8162b149>] netlink_sendmsg+0x329/0x750 [ 2461.580103] [<ffffffff815db849>] sock_sendmsg+0x99/0xd0 [ 2461.580103] [<ffffffff810bb58f>] ? local_clock+0x5f/0x70 [ 2461.580103] [<ffffffff810e96e8>] ? lock_release_non_nested+0x308/0x350 [ 2461.580103] [<ffffffff815dbc6e>] ___sys_sendmsg+0x39e/0x3b0 [ 2461.580103] [<ffffffff810565af>] ? kvm_clock_read+0x2f/0x50 [ 2461.580103] [<ffffffff810218b9>] ? sched_clock+0x9/0x10 [ 2461.580103] [<ffffffff810bb2bd>] ? sched_clock_local+0x1d/0x80 [ 2461.580103] [<ffffffff810bb448>] ? sched_clock_cpu+0xa8/0x100 [ 2461.580103] [<ffffffff810e33ad>] ? trace_hardirqs_off+0xd/0x10 [ 2461.580103] [<ffffffff810bb58f>] ? local_clock+0x5f/0x70 [ 2461.580103] [<ffffffff810e3f7f>] ? lock_release_holdtime.part.28+0xf/0x1a0 [ 2461.580103] [<ffffffff8120fec9>] ? fget_light+0xf9/0x510 [ 2461.580103] [<ffffffff8120fe0c>] ? fget_light+0x3c/0x510 [ 2461.580103] [<ffffffff815dd1d2>] __sys_sendmsg+0x42/0x80 [ 2461.580103] [<ffffffff815dd222>] SyS_sendmsg+0x12/0x20 [ 2461.580103] [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b [ 2461.580103] modprobe D ffff88000f2c8000 4632 603 602 0x00000080 [ 2461.580103] ffff88000f04fba8 0000000000000046 00000000001d5340 ffff88000f04ffd8 [ 2461.580103] ffff88000f04ffd8 00000000001d5340 ffff8800377d4500 ffff8800377d4500 [ 2461.580103] ffffffff81d0b260 ffffffff81d0b268 ffffffff00000000 ffffffff81d0b2b0 [ 2461.580103] Call Trace: [ 2461.580103] [<ffffffff817355f9>] schedule+0x29/0x70 [ 2461.580103] [<ffffffff81736d4d>] rwsem_down_write_failed+0xed/0x1a0 [ 2461.580103] [<ffffffff810bb200>] ? update_cpu_load_active+0x10/0xb0 [ 2461.580103] [<ffffffff8137b473>] call_rwsem_down_write_failed+0x13/0x20 [ 2461.580103] [<ffffffff8173492d>] ? down_write+0x9d/0xb2 [ 2461.580103] [<ffffffff8162baa5>] ? genl_lock_all+0x15/0x30 [ 2461.580103] [<ffffffff8162baa5>] genl_lock_all+0x15/0x30 [ 2461.580103] [<ffffffff8162cbb3>] genl_register_family+0x53/0x1f0 [ 2461.580103] [<ffffffffa01dc000>] ? 0xffffffffa01dbfff [ 2461.580103] [<ffffffff8162d650>] genl_register_family_with_ops+0x20/0x80 [ 2461.580103] [<ffffffffa01dc000>] ? 0xffffffffa01dbfff [ 2461.580103] [<ffffffffa017fe84>] nl80211_init+0x24/0xf0 [cfg80211] [ 2461.580103] [<ffffffffa01dc000>] ? 0xffffffffa01dbfff [ 2461.580103] [<ffffffffa01dc043>] cfg80211_init+0x43/0xdb [cfg80211] [ 2461.580103] [<ffffffff810020fa>] do_one_initcall+0xfa/0x1b0 [ 2461.580103] [<ffffffff8105cb93>] ? set_memory_nx+0x43/0x50 [ 2461.580103] [<ffffffff810f75af>] load_module+0x1c6f/0x27f0 [ 2461.580103] [<ffffffff810f2c90>] ? store_uevent+0x40/0x40 [ 2461.580103] [<ffffffff810f82c6>] SyS_finit_module+0x86/0xb0 [ 2461.580103] [<ffffffff81741ad9>] system_call_fastpath+0x16/0x1b [ 2461.580103] Sched Debug Version: v0.10, 3.11.0-0.rc1.git4.1.fc20.x86_64 #1 Problem start to happen after adding net-pf-16-proto-16-family-nl80211 alias name to cfg80211 module by below commit (though that commit itself is perfectly fine): commit `fb4e156886` Author: Marcel Holtmann <marcel@holtmann.org> Date: Sun Apr 28 16:22:06 2013 -0700 nl80211: Add generic netlink module alias for cfg80211/nl80211 Reported-and-tested-by: Jeff Layton <jlayton@redhat.com> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Reviewed-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-27 22:19:20 -07:00
Thomas Graf	03c633e733	pktgen: Use ip_send_check() to compute checksum Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-27 22:18:00 -07:00
Thomas Graf	c26bf4a513	pktgen: Add UDPCSUM flag to support UDP checksums UDP checksums are optional, hence pktgen has been omitting them in favour of performance. The optional flag UDPCSUM enables UDP checksumming. If the output device supports hardware checksumming the skb is prepared and marked CHECKSUM_PARTIAL, otherwise the checksum is generated in software. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: Eric Dumazet <edumazet@google.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-27 22:16:36 -07:00
Asias He	82a54d0ebb	VSOCK: Move af_vsock.h and vsock_addr.h to include/net This is useful for other VSOCK transport implemented outside the net/vmw_vsock/ directory to use these headers. Signed-off-by: Asias He <asias@redhat.com> Acked-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-27 22:14:06 -07:00
Joe Stringer	024ec3deac	net/sctp: Refactor SCTP skb checksum computation This patch consolidates the SCTP checksum calculation code from various places to a single new function, sctp_compute_cksum(skb, offset). Signed-off-by: Joe Stringer <joe@wand.net.nz> Reviewed-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-27 20:07:15 -07:00
Greg Kroah-Hartman	7f4708abf1	net: ieee802154: convert class code to use dev_groups The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the ieee802154 class code to use the correct field. Acked-by: David S. Miller <davem@davemloft.net> Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-26 18:05:18 -07:00
Greg Kroah-Hartman	6be8aeef34	net: core: convert class code to use dev_groups The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the networking core class code to use the correct field. In order to do this in the "cleanest" way, some of the macros had to be changed to reflect the driver core format of naming show/store functions, which accounts for the majority of the churn in this file. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-26 18:05:18 -07:00
stephen hemminger	93d8bf9fb8	bridge: cleanup netpoll code This started out with fixing a sparse warning, then I realized that the wrapper function br_netpoll_info could just be collapsed away by rolling it into the enable code. Also, eliminate unnecessary goto's Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-26 15:24:32 -07:00
Francesco Fusco	555445cd11	neigh: prevent overflowing params in /proc/sys/net/ipv4/neigh/ Without this patch, the fields app_solicit, gc_thresh1, gc_thresh2, gc_thresh3, proxy_qlen, ucast_solicit, mcast_solicit could have assumed negative values when setting large numbers. Signed-off-by: Francesco Fusco <ffusco@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-26 14:22:10 -07:00
Greg Kroah-Hartman	e49df67df1	net: rfkill: convert class code to use dev_groups The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the rfkill class code to use the correct field. Cc: John W. Linville <linville@tuxdriver.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-25 16:34:40 -07:00
Greg Kroah-Hartman	f0bc99c843	net: wireless: convert class code to use dev_groups The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the networking wireless class code to use the correct field. Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-07-25 16:34:40 -07:00
Gustavo Padovan	fcee337704	Bluetooth: Fix race between hci_register_dev() and hci_dev_open() If hci_dev_open() is called after hci_register_dev() added the device to the hci_dev_list but before the workqueue are created we could run into a NULL pointer dereference (see below). This bug is very unlikely to happen, systems using bluetoothd to manage their bluetooth devices will never see this happen. BUG: unable to handle kernel NULL pointer dereference 0100 IP: [<ffffffff81077502>] __queue_work+0x32/0x3d0 (...) Call Trace: [<ffffffff81077be5>] queue_work_on+0x45/0x50 [<ffffffffa016e8ff>] hci_req_run+0xbf/0xf0 [bluetooth] [<ffffffffa01709b0>] ? hci_init2_req+0x720/0x720 [bluetooth] [<ffffffffa016ea06>] __hci_req_sync+0xd6/0x1c0 [bluetooth] [<ffffffff8108ee10>] ? try_to_wake_up+0x2b0/0x2b0 [<ffffffff8150e3f0>] ? usb_autopm_put_interface+0x30/0x40 [<ffffffffa016fad5>] hci_dev_open+0x275/0x2e0 [bluetooth] [<ffffffffa0182752>] hci_sock_ioctl+0x1f2/0x3f0 [bluetooth] [<ffffffff815c6050>] sock_do_ioctl+0x30/0x70 [<ffffffff815c75f9>] sock_ioctl+0x79/0x2f0 [<ffffffff811a8046>] do_vfs_ioctl+0x96/0x560 [<ffffffff811a85a1>] SyS_ioctl+0x91/0xb0 [<ffffffff816d989d>] system_call_fastpath+0x1a/0x1f Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 19:52:36 +01:00
Jaganath Kanakkassery	da9910ac4a	Bluetooth: Fix invalid length check in l2cap_information_rsp() The length check is invalid since the length varies with type of info response. This was introduced by the commit `cb3b3152b2` Because of this, l2cap info rsp is not handled and command reject is sent. > ACL data: handle 11 flags 0x02 dlen 16 L2CAP(s): Info rsp: type 2 result 0 Extended feature mask 0x00b8 Enhanced Retransmission mode Streaming mode FCS Option Fixed Channels < ACL data: handle 11 flags 0x00 dlen 10 L2CAP(s): Command rej: reason 0 Command not understood Cc: stable@vger.kernel.org Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Chan-Yeol Park <chanyeol.park@samsung.com> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 19:52:30 +01:00
Benjamin Tissoires	159d865f20	Bluetooth: hidp: remove wrong send_report at init The USB hid implementation does retrieve the reports during the start. However, this implementation does not call the HID command GET_REPORT (which would fetch the current status of each report), but use the DATA command, which is an Output Report (so transmitting data from the host to the device). The Wiimote controller is already guarded against this problem in the protocol, but it is not conformant to the specification to set all the reports to 0 on start. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 14:15:24 +01:00
Benjamin Tissoires	2583d706a1	Bluetooth: hidp: implement hidinput_input_event callback We can re-enable hidinput_input_event to allow the leds of bluetooth keyboards to be set. Now the callbacks uses hid core to retrieve the right HID report to send, so this version is safer. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 14:15:24 +01:00
Gustavo Padovan	1c244f79c0	Bluetooth: Add missing braces to an "else if" Trivial change in the coding style. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 14:15:23 +01:00
Mikel Astiz	a767631ad1	Bluetooth: Use defines instead of integer literals Replace the occurrences of integer literals in hci_event.c with the newly introduced macros in hci.h. Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 14:15:23 +01:00
Mikel Astiz	acabae96df	Bluetooth: Use defines in in hci_get_auth_req() Make the code in hci_get_auth_req() more readable by using the defined macros instead of inlining magic numbers. Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> Signed-off-by: Timo Mueller <timo.mueller@bmw-carit.de> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 14:15:22 +01:00
Marcel Holtmann	637b4caeed	Bluetooth: Fix simple whitespace vs tab style issue Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-07-25 14:15:21 +01:00
Arik Nemtsov	23df0b7319	regulatory: use correct regulatory initiator on wiphy register The current regdomain was not always set by the core. This causes cards with a custom regulatory domain to ignore user initiated changes if done before the card was registered. Signed-off-by: Arik Nemtsov <arik@wizery.com> Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-25 09:52:46 +02:00
David S. Miller	1df86b4cee	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== This is another batch of fixes intended for the 3.11 stream. FWIW, this is the first request with fixes from the mac80211 and iwlwifi trees as well. Regarding the mac80211 bits, Johannes says: "Here I have a fix for RSSI thresholds in mesh, two minstrel fixes from Felix, an nl80211 fix from Michal and four various fixes I did myself." As for the iwlwifi bits, Johannes says: "Here I have a fix for debugfs directory creation (causing a spurious error message), two scanning fixes from David Spinadel, an LED fix and two patches related to a BA session problem that eventually caused firmware crashes from Emmanuel and a small BT fix for older devices as well as a workaround for a firmware problem with APs with very small beacon intervals from myself." Along with those: Arend van Spriel addresses a lock-up and a NULL pointer dereference in brcmfmac. Daniel Drake fixes an unhandled interrupt during device tear down in mwifiex. Larry Finger corrects a wil6210 build error. Oleksij Rempel fixes two ath9k_htc problems related to keeping the driver and firmware in sync. Solomon Peachy gives us a cw1200 fix to avoid an oops in monitor mode. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-25 00:06:59 -07:00
Florian Fainelli	deceb4c062	net: fix comment above build_skb() build_skb() specifies that the data parameter must come from a kmalloc'd area, this is only true if frag_size equals 0, because then build_skb() will use kzsize(data) to figure out the actual data size. Update the comment to reflect that special condition. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:59:07 -07:00
Thomas Gleixner	18afa4b028	net: Make devnet_rename_seq static No users outside net/core/dev.c. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:57:26 -07:00
Eric Dumazet	c9bee3b7fd	tcp: TCP_NOTSENT_LOWAT socket option Idea of this patch is to add optional limitation of number of unsent bytes in TCP sockets, to reduce usage of kernel memory. TCP receiver might announce a big window, and TCP sender autotuning might allow a large amount of bytes in write queue, but this has little performance impact if a large part of this buffering is wasted : Write queue needs to be large only to deal with large BDP, not necessarily to cope with scheduling delays (incoming ACKS make room for the application to queue more bytes) For most workloads, using a value of 128 KB or less is OK to give applications enough time to react to POLLOUT events in time (or being awaken in a blocking sendmsg()) This patch adds two ways to set the limit : 1) Per socket option TCP_NOTSENT_LOWAT 2) A sysctl (/proc/sys/net/ipv4/tcp_notsent_lowat) for sockets not using TCP_NOTSENT_LOWAT socket option (or setting a zero value) Default value being UINT_MAX (0xFFFFFFFF), meaning this has no effect. This changes poll()/select()/epoll() to report POLLOUT only if number of unsent bytes is below tp->nosent_lowat Note this might increase number of sendmsg()/sendfile() calls when using non blocking sockets, and increase number of context switches for blocking sockets. Note this is not related to SO_SNDLOWAT (as SO_SNDLOWAT is defined as : Specify the minimum number of bytes in the buffer until the socket layer will pass the data to the protocol) Tested: netperf sessions, and watching /proc/net/protocols "memory" column for TCP With 200 concurrent netperf -t TCP_STREAM sessions, amount of kernel memory used by TCP buffers shrinks by ~55 % (20567 pages instead of 45458) lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols TCPv6 1880 2 45458 no 208 yes ipv6 y y y y y y y y y y y y y n y y y y y TCP 1696 508 45458 no 208 yes kernel y y y y y y y y y y y y y n y y y y y lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# (super_netperf 200 -t TCP_STREAM -H remote -l 90 &); sleep 60 ; grep TCP /proc/net/protocols TCPv6 1880 2 20567 no 208 yes ipv6 y y y y y y y y y y y y y n y y y y y TCP 1696 508 20567 no 208 yes kernel y y y y y y y y y y y y y n y y y y y Using 128KB has no bad effect on the throughput or cpu usage of a single flow, although there is an increase of context switches. A bonus is that we hold socket lock for a shorter amount of time and should improve latencies of ACK processing. lpq83:~# echo -1 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3 OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf. Local Remote Local Elapsed Throughput Throughput Local Local Remote Remote Local Remote Service Send Socket Recv Socket Send Time Units CPU CPU CPU CPU Service Service Demand Size Size Size (sec) Util Util Util Util Demand Demand Units Final Final % Method % Method 1651584 6291456 16384 20.00 17447.90 10^6bits/s 3.13 S -1.00 U 0.353 -1.000 usec/KB Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3': 412,514 context-switches 200.034645535 seconds time elapsed lpq83:~# echo 131072 >/proc/sys/net/ipv4/tcp_notsent_lowat lpq83:~# perf stat -e context-switches ./netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3 OMNI Send TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : +/-2.500% @ 99% conf. Local Remote Local Elapsed Throughput Throughput Local Local Remote Remote Local Remote Service Send Socket Recv Socket Send Time Units CPU CPU CPU CPU Service Service Demand Size Size Size (sec) Util Util Util Util Demand Demand Units Final Final % Method % Method 1593240 6291456 16384 20.00 17321.16 10^6bits/s 3.35 S -1.00 U 0.381 -1.000 usec/KB Performance counter stats for './netperf -H 7.7.7.84 -t omni -l 20 -c -i10,3': 2,675,818 context-switches 200.029651391 seconds time elapsed Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-By: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:54:48 -07:00
Eric Dumazet	64dc61306c	net: add sk_stream_is_writeable() helper Several call sites use the hardcoded following condition : sk_stream_wspace(sk) >= sk_stream_min_wspace(sk) Lets use a helper because TCP_NOTSENT_LOWAT support will change this condition for TCP sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:54:48 -07:00
Daniel Borkmann	91705c61b5	net: sctp: trivial: update mailing list address The SCTP mailing list address to send patches or questions to is linux-sctp@vger.kernel.org and not lksctp-developers@lists.sourceforge.net anymore. Therefore, update all occurences. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:53:38 -07:00
Hannes Frederic Sowa	905a6f96a1	ipv6: take rtnl_lock and mark mrt6 table as freed on namespace cleanup Otherwise we end up dereferencing the already freed net->ipv6.mrt pointer which leads to a panic (from Srivatsa S. Bhat): BUG: unable to handle kernel paging request at ffff882018552020 IP: [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6] PGD 290a067 PUD 207ffe0067 PMD 207ff1d067 PTE 8000002018552060 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: ebtable_nat ebtables nfs fscache nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables nfsd lockd nfs_acl exportfs auth_rpcgss autofs4 sunrpc 8021q garp bridge stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter +ip6_tables ipv6 vfat fat vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support cdc_ether usbnet mii microcode i2c_i801 i2c_core lpc_ich mfd_core shpchp ioatdma dca mlx4_core be2net wmi acpi_cpufreq mperf ext4 jbd2 mbcache dm_mirror dm_region_hash dm_log dm_mod CPU: 0 PID: 7 Comm: kworker/u33:0 Not tainted 3.11.0-rc1-ea45e-a #4 Hardware name: IBM -[8737R2A]-/00Y2738, BIOS -[B2E120RUS-1.20]- 11/30/2012 Workqueue: netns cleanup_net task: ffff8810393641c0 ti: ffff881039366000 task.ti: ffff881039366000 RIP: 0010:[<ffffffffa0366b02>] [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6] RSP: 0018:ffff881039367bd8 EFLAGS: 00010286 RAX: ffff881039367fd8 RBX: ffff882018552000 RCX: dead000000200200 RDX: 0000000000000000 RSI: ffff881039367b68 RDI: ffff881039367b68 RBP: ffff881039367bf8 R08: ffff881039367b68 R09: 2222222222222222 R10: 2222222222222222 R11: 2222222222222222 R12: ffff882015a7a040 R13: ffff882014eb89c0 R14: ffff8820289e2800 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff882018552020 CR3: 0000000001c0b000 CR4: 00000000000407f0 Stack: ffff881039367c18 ffff882014eb89c0 ffff882015e28c00 0000000000000000 ffff881039367c18 ffffffffa034d9d1 ffff8820289e2800 ffff882014eb89c0 ffff881039367c58 ffffffff815bdecb ffffffff815bddf2 ffff882014eb89c0 Call Trace: [<ffffffffa034d9d1>] rawv6_close+0x21/0x40 [ipv6] [<ffffffff815bdecb>] inet_release+0xfb/0x220 [<ffffffff815bddf2>] ? inet_release+0x22/0x220 [<ffffffffa032686f>] inet6_release+0x3f/0x50 [ipv6] [<ffffffff8151c1d9>] sock_release+0x29/0xa0 [<ffffffff81525520>] sk_release_kernel+0x30/0x70 [<ffffffffa034f14b>] icmpv6_sk_exit+0x3b/0x80 [ipv6] [<ffffffff8152fff9>] ops_exit_list+0x39/0x60 [<ffffffff815306fb>] cleanup_net+0xfb/0x1a0 [<ffffffff81075e3a>] process_one_work+0x1da/0x610 [<ffffffff81075dc9>] ? process_one_work+0x169/0x610 [<ffffffff81076390>] worker_thread+0x120/0x3a0 [<ffffffff81076270>] ? process_one_work+0x610/0x610 [<ffffffff8107da2e>] kthread+0xee/0x100 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70 [<ffffffff8162a99c>] ret_from_fork+0x7c/0xb0 [<ffffffff8107d940>] ? __init_kthread_worker+0x70/0x70 Code: 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 4c 8b 67 30 49 89 fd e8 db 3c 1e e1 49 8b 9c 24 90 08 00 00 48 85 db 74 06 <4c> 39 6b 20 74 20 bb f3 ff ff ff e8 8e 3c 1e e1 89 d8 4c 8b 65 RIP [<ffffffffa0366b02>] ip6mr_sk_done+0x32/0xb0 [ipv6] RSP <ffff881039367bd8> CR2: ffff882018552020 Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:02:13 -07:00
Jerry Snitselaar	f585a991e1	fib_trie: potential out of bounds access in trie_show_stats() With the <= max condition in the for loop, it will be always go 1 element further than needed. If the condition for the while loop is never met, then max is MAX_STAT_DEPTH, and for loop will walk off the end of nodesizes[]. Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 16:05:14 -07:00
Andi Shyti	59ea52dc46	net: trans_rdma: remove unused function This patch gets rid of the following warning: net/9p/trans_rdma.c:594:12: warning: ‘rdma_cancelled’ defined but not used [-Wunused-function] static int rdma_cancelled(struct p9_client client, struct p9_req_t req) The rdma_cancelled function is not called anywhere in the kernel Signed-off-by: Andi Shyti <andi@etezian.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 15:46:27 -07:00
fan.du	9225b23057	net: ipv6 eliminate parameter "int addrlen" in function fib6_add_1 The "int addrlen" in fib6_add_1 is rebundant, as we can get it from parameter "struct in6_addr *addr" once we modified its type. And also fix some coding style issues in fib6_add_1 Signed-off-by: Fan Du <fan.du@windriver.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 14:24:56 -07:00
fan.du	86a37def2b	net ipv6: Remove rebundant rt6i_nsiblings initialization Seting rt->rt6i_nsiblings to zero is rebundant, because above memset zeroed the rest of rt excluding the first dst memember. Signed-off-by: Fan Du <fan.du@windriver.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 14:24:56 -07:00
John W. Linville	18e1ccb6ca	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-07-24 11:50:38 -04:00
Amerigo Wang	b9959fd3b0	vti: switch to new ip tunnel code GRE tunnel and IPIP tunnel already switched to the new ip tunnel code, VTI tunnel can use it too. Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Saurabh Mohan <saurabh.mohan@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-23 17:27:32 -07:00
Rami Rosen	2b52c3ada8	ip6mr: change the prototype of ip6_mr_forward(). This patch changes the prototpye of the ip6_mr_forward() method to return void instead of int. The ip6_mr_forward() method always returns 0; moreover, the return value of this method is not checked anywhere. Signed-off-by: Rami Rosen <ramirose@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-23 17:20:37 -07:00
Rami Rosen	c4854ec8c4	ipmr: change the prototype of ip_mr_forward(). This patch changes the prototpye of the ip_mr_forward() method to return void instead of int. The ip_mr_forward() method always returns 0; moreover, the return value of this method is not checked anywhere. Signed-off-by: Rami Rosen <ramirose@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-23 17:01:05 -07:00
Jiri Pirko	4aa5dee4d9	net: convert resend IGMP to notifier event Until now, bond_resend_igmp_join_requests() looks for vlans attached to bonding device, bridge where bonding act as port manually. It does not care of other scenarios, like stacked bonds or team device above. Make this more generic and use netdev notifier to propagate the event to upper devices and to actually call ip_mc_rejoin_groups(). Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-23 16:52:47 -07:00
Jeff Layton	275448eb10	rpc_pipe: convert back to simple_dir_inode_operations Now that Al has fixed simple_lookup to account for the case where sb->s_d_op is set, there's no need to keep our own special lookup op. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-23 18:18:53 -04:00
Stanislaw Gruszka	cd34f647a7	mac80211: fix monitor interface suspend crash regression My commit: commit `12e7f51702` Author: Stanislaw Gruszka <sgruszka@redhat.com> Date: Thu Feb 28 10:55:26 2013 +0100 mac80211: cleanup generic suspend/resume procedures removed check for deleting MONITOR and AP_VLAN when suspend. That can cause a crash (i.e. in iwlagn_mac_remove_interface()) since we remove interface in the driver that we did not add before. Reference: http://marc.info/?l=linux-kernel&m=137391815113860&w=2 Bisected-by: Ortwin Glück <odi@odi.ch> Reported-and-tested-by: Ortwin Glück <odi@odi.ch> Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-23 14:02:08 +02:00
Yuchung Cheng	ed08495c31	tcp: use RTT from SACK for RTO If RTT is not available because Karn's check has failed or no new packet is acked, use the RTT measured from SACK to estimate the RTO. The sender can continue to estimate the RTO during loss recovery or reordering event upon receiving non-partial ACKs. This also changes when the RTO is re-armed. Previously it is only re-armed when some data is cummulatively acknowledged (i.e., SND.UNA advances), but now it is re-armed whenever RTT estimator is updated. This feature is particularly useful to reduce spurious timeout for buffer bloat including cellular carriers [1], and RTT estimation on reordering events. [1] "An In-depth Study of LTE: Effect of Network Protocol and Application Behavior on Performance", In Proc. of SIGCOMM 2013 Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-22 17:53:42 -07:00
Yuchung Cheng	59c9af4234	tcp: measure RTT from new SACK Take RTT sample if an ACK selectively acks some sequences that have never been retransmitted. The Karn's algorithm does not apply even if that ACK (s)acks other retransmitted sequences, because it must been generated by an original but perhaps out-of-order packet. There is no ambiguity. In case when multiple blocks are newly sacked because of ACK losses the earliest block is used to measure RTT, similar to cummulative ACKs. Such RTT samples allow the sender to estimate the RTO during loss recovery and packet reordering events. It is still useful even with TCP timestamps. That's because during these events the SND.UNA may not advance preventing RTT samples from TS ECR (thus the FLAG_ACKED check before calling tcp_ack_update_rtt()). Therefore this new RTT source is complementary to existing ACK and TS RTT mechanisms. This patch does not update the RTO. It is done in the next patch. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-22 17:53:42 -07:00
Yuchung Cheng	5b08e47caf	tcp: prefer packet timing to TS-ECR for RTT Prefer packet timings to TS-ecr for RTT measurements when both sources are available. That's because broken middle-boxes and remote peer can return packets with corrupted TS ECR fields. Similarly most congestion controls that require RTT signals favor timing-based sources as well. Also check for bad TS ECR values to avoid RTT blow-ups. It has happened on production Web servers. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-22 17:53:42 -07:00
Yuchung Cheng	375fe02c91	tcp: consolidate SYNACK RTT sampling The first patch consolidates SYNACK and other RTT measurement to use a central function tcp_ack_update_rtt(). A (small) bonus is now SYNACK RTT measurement happens after PAWS check, potentially reducing the impact of RTO seeding on bad TCP timestamps values. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-22 17:53:42 -07:00
Richard Cochran	cb820f8e4b	net: Provide a generic socket error queue delivery method for Tx time stamps. This patch moves the private error queue delivery function from the af_packet code to the core socket method. In this way, network layers only needing the error queue for transmit time stamping can share common code. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-22 14:58:19 -07:00
David S. Miller	7bd04bcf91	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for your net tree, they are: * Fix potential NULL dereference in the socket match if revision 0 is used, from Eric Dumazet. * Fix missing expectation NAT initialization that results in dumping the NAT part via ctnetlink, thus leading to problems in expectation synchronization through conntrackd, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-22 14:32:39 -07:00
Jiri Kosina	bc197eedef	HID: fix unused rsize usage `27ce4050` ("HID: fix data access in implement()") by mistake removed a setting of buffer size in hidp. Fix that by putting it back. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-07-22 17:11:44 +02:00
Jiri Kosina	27ce405039	HID: fix data access in implement() implement() is setting bytes in LE data stream. In case the data is not aligned to 64bits, it reads past the allocated buffer. It doesn't really change any value there (it's properly bitmasked), but in case that this read past the boundary hits a page boundary, pagefault happens when accessing 64bits of 'x' in implement(), and kernel oopses. This happens much more often when numbered reports are in use, as the initial 8bit skip in the buffer makes the whole process work on values which are not aligned to 64bits. This problem dates back to attempts in 2005 and 2006 to make implement() and extract() as generic as possible, and even back then the problem was realized by Adam Kroperlin, but falsely assumed to be impossible to cause any harm: http://www.mail-archive.com/linux-usb-devel@lists.sourceforge.net/msg47690.html I have made several attempts at fixing it "on the spot" directly in implement(), but the results were horrible; the special casing for processing last 64bit chunk and switching to different math makes it unreadable mess. I therefore took a path to allocate a few bytes more which will never make it into final report, but are there as a cushion for all the 64bit math operations happening in implement() and extract(). All callers of hid_output_report() are converted at the same time to allocate the buffer by newly introduced hid_alloc_report_buf() helper. Bruno noticed that the whole raw_size test can be dropped as well, as hid_alloc_report_buf() makes sure that the buffer is always of a proper size. Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-07-22 16:16:40 +02:00
Chun-Yeow Yeoh	40d18ff959	mac80211: prevent the buffering or frame transmission to non-assoc mesh STA This patch is intended to avoid the buffering to non-assoc mesh STA and also to avoid the triggering of frame to non-assoc mesh STA which could cause kernel panic in specific hw. One of the examples, is kernel panic happens to ath9k if user space inserts the mesh STA and not proceed with the SAE and AMPE, and later the same mesh STA is detected again. The sta_state of the mesh STA remains at IEEE80211_STA_NONE and if the ieee80211_sta_ps_deliver_wakeup is called and subsequently the ath_tx_aggr_wakeup, the kernel panic due to ath_tx_node_init is not called before to initialize the require data structures. This issue is reported by Cedric Voncken before. http://www.spinics.net/lists/linux-wireless/msg106342.html [<831ea6b4>] ath_tx_aggr_wakeup+0x44/0xcc [ath9k] [<83084214>] ieee80211_sta_ps_deliver_wakeup+0xb8/0x208 [mac80211] [<830b9824>] ieee80211_mps_sta_status_update+0x94/0x108 [mac80211] [<83099398>] ieee80211_sta_ps_transition+0xc94/0x34d8 [mac80211] [<8022399c>] nf_iterate+0x98/0x104 [<8309bb60>] ieee80211_sta_ps_transition+0x345c/0x34d8 [mac80211] Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-22 15:32:47 +02:00
Linus Torvalds	3be542d464	NFS client bugfixes for 3.11 - Fix a regression against NFSv4 FreeBSD servers when creating a new file - Fix another regression in rpc_client_register() -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR6ZKMAAoJEGcL54qWCgDyQX8P/19LKLNKcL+y2zVGjLbXMTq0 TpyWdBO0ux7QcqnPEDg+Jpvu62IowYiKTtaSOXtHb5BNjQMBo2RKw3B0eMBoCp/z 6gHmQRD2hMgqwBxBwHceV+dNwueCUiZW7GqaaNh6/3bpGQefegdONnLEifuPogEu oZmEuiVrGDfITEF7D4k5+shXCQN4eNH0LFuIQo4XXdCqmK6PwvOsidZ7YwHVC3Mg /Jzda2YsCxHj8kPi1xb9skPPAn6g4kdfYfyr/xSY7IviPixrkg/nEEK1b8xHU81e a0dd0Yx5kq6fR8LsBvQCHdj2m7doHM15jf5Np5G7VnnaWEjB2y+QftkxWc9lCNU3 t2fr9YVD7ZG/GGNSFePHAHmBY0OqDB1Htp4vcwEQfzX6CAR3Hel82WVvut62Z6m4 G5qHjwdqUFhmRN//SWlDpEqSn+pbeCvPhQS60ayN0TLivRsscm/I4yA75odAnn9b 4su1IcUpqeJGeV6yDyMUqbx4kYZFyCZg/DNkThXiTKOs47A7ogSS9ev2fTB/V+jd rroNHNd/U508ze9D6D4ai9vR78uUp4wKNSSBZMCkBtNh0uSApOTgyGVhertB1EKS vgAr4T1tc+9t+0qg1Sb+hbKyBM/KaS5zUrPn+APHPoBXPh5PSVBzeNJkpxHRw/V0 ZxkEgSQKLZSXYb5ab770 =XE+7 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Fix a regression against NFSv4 FreeBSD servers when creating a new file - Fix another regression in rpc_client_register() * tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix a regression against the FreeBSD server SUNRPC: Fix another issue with rpc_client_register()	2013-07-20 10:48:24 -07:00
Eric Dumazet	1faabf2aab	bridge: do not call setup_timer() multiple times commit `9f00b2e7cf` ("bridge: only expire the mdb entry when query is received") added a nasty bug as an active timer can be reinitialized. setup_timer() must be done once, no matter how many time mod_timer() is called. br_multicast_new_group() is the right place to do this. Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Diagnosed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-19 22:12:42 -07:00
Michal Tesar	651e92716a	sysctl net: Keep tcp_syn_retries inside the boundary Limit the min/max value passed to the /proc/sys/net/ipv4/tcp_syn_retries. Signed-off-by: Michal Tesar <mtesar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-19 17:36:03 -07:00
Dragos Foianu	aafee33423	net/irda: fixed style issues in irttp Applied error fixes suggested by checpatch.pl Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-19 17:34:40 -07:00
Frederic Danis	7427b370e0	NFC: Fix NCI over SPI build kbuild test robot found following error: net/built-in.o: In function `nci_spi_send': >> spi.c:(.text+0x19a76f): undefined reference to `crc_ccitt' Add CRC_CCITT module to Kconfig to fix it Reported-by: kbuild test robot. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-07-19 16:55:26 +02:00
Linus Torvalds	ecb2cf1a6b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "A couple interesting SKB fragment handling fixes, plus the usual small bits here and there: 1) Fix 64-bit divide build failure on 32-bit platforms in mlx5, from Tim Gardner. 2) Get rid of a stupid reimplementation on "%phC" in our sysfs MAC address printing helper. 3) Fix NETIF_F_SG capability advertisement in hyperv driver, if the device can't do checksumming offloads then it shouldn't say it can do SG either. From Haiyang Zhang. 4) bgmac needs to depend on PHYLIB, from Hauke Mehrtens. 5) Don't leak DMA mappings on mapping failures, from Neil Horman. 6) We need to reset the transport header of SKBs in ipv4 before we attempt to perform early socket demux, just like ipv6 does. From Eric Dumazet. 7) Add missing locking on vxlan device removal, from Stephen Hemminger. 8) xen-netfront has to make two passes over an SKB to prepare it for transfer. One pass calculates the number of slots needed, the second massages the SKB and fills the slots. Unfortunately, the first pass doesn't calculate the number of slots properly so we can end up trying to build a MAX_SKB_FRAGS + 1 SKB which doesn't work out so well. Fix from Jan Beulich with help and discussion with several others. 9) Fix a similar problem in tun and macvtap, which have to split up scatter-gather elements at PAGE_SIZE boundaries. Don't do zerocopy if it would result in a > MAX_SKB_FRAGS skb. Fixes from Jason Wang. 10) On receive, once we've decoded the VLAN state completely, clear skb->vlan_tci. Otherwise demuxed tunnels underneath can trigger the VLAN code again, corrupting the packet. Fix from Eric Dumazet" git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: vlan: fix a race in egress prio management vlan: mask vlan prio bits macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS tuntap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS pkt_sched: sch_qfq: remove a source of high packet delay/jitter xen-netfront: pull on receive skb may need to happen earlier vxlan: add necessary locking on device removal hyperv: Fix the NETIF_F_SG flag setting in netvsc net: Fix sysfs_format_mac() code duplication. be2net: Fix to avoid hardware workaround when not needed macvtap: do not assume 802.1Q when send vlan packets macvtap: fix the missing ret value of TUNSETQUEUE ipv4: set transport header earlier mlx5 core: Fix __udivdi3 when compiling for 32 bit arches bgmac: add dependency to phylib net/irda: fixed style issues in irlan_eth ethtool: fixed trailing statements in ethtool ndisc: bool initializations should use true and false atl1e: unmap partially mapped skb on dma error and free skb	2013-07-18 20:08:47 -07:00
Eric Dumazet	3e3aac4975	vlan: fix a race in egress prio management egress_priority_map[] hash table updates are protected by rtnl, and we never remove elements until device is dismantled. We have to make sure that before inserting an new element in hash table, all its fields are committed to memory or else another cpu could find corrupt values and crash. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-18 13:07:13 -07:00
Eric Dumazet	d4b812dea4	vlan: mask vlan prio bits In commit `48cc32d38a` ("vlan: don't deliver frames for unknown vlans to protocols") Florian made sure we set pkt_type to PACKET_OTHERHOST if the vlan id is set and we could find a vlan device for this particular id. But we also have a problem if prio bits are set. Steinar reported an issue on a router receiving IPv6 frames with a vlan tag of 4000 (id 0, prio 2), and tunneled into a sit device, because skb->vlan_tci is set. Forwarded frame is completely corrupted : We can see (8100:4000) being inserted in the middle of IPv6 source address : 16:48:00.780413 IP6 2001:16d8:8100:4000:ee1c:0:9d9:bc87 > 9f94:4d95:2001:67c:29f4::: ICMP6, unknown icmp6 type (0), length 64 0x0000: 0000 0029 8000 c7c3 7103 0001 a0ae e651 0x0010: 0000 0000 ccce 0b00 0000 0000 1011 1213 0x0020: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223 0x0030: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233 It seems we are not really ready to properly cope with this right now. We can probably do better in future kernels : vlan_get_ingress_priority() should be a netdev property instead of a per vlan_dev one. For stable kernels, lets clear vlan_tci to fix the bugs. Reported-by: Steinar H. Gunderson <sesse@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-18 13:05:23 -07:00
Paolo Valente	87f40dd6ce	pkt_sched: sch_qfq: remove a source of high packet delay/jitter QFQ+ inherits from QFQ a design choice that may cause a high packet delay/jitter and a severe short-term unfairness. As QFQ, QFQ+ uses a special quantity, the system virtual time, to track the service provided by the ideal system it approximates. When a packet is dequeued, this quantity must be incremented by the size of the packet, divided by the sum of the weights of the aggregates waiting to be served. Tracking this sum correctly is a non-trivial task, because, to preserve tight service guarantees, the decrement of this sum must be delayed in a special way [1]: this sum can be decremented only after that its value would decrease also in the ideal system approximated by QFQ+. For efficiency, QFQ+ keeps track only of the 'instantaneous' weight sum, increased and decreased immediately as the weight of an aggregate changes, and as an aggregate is created or destroyed (which, in its turn, happens as a consequence of some class being created/destroyed/changed). However, to avoid the problems caused to service guarantees by these immediate decreases, QFQ+ increments the system virtual time using the maximum value allowed for the weight sum, 2^10, in place of the dynamic, instantaneous value. The instantaneous value of the weight sum is used only to check whether a request of weight increase or a class creation can be satisfied. Unfortunately, the problems caused by this choice are worse than the temporary degradation of the service guarantees that may occur, when a class is changed or destroyed, if the instantaneous value of the weight sum was used to update the system virtual time. In fact, the fraction of the link bandwidth guaranteed by QFQ+ to each aggregate is equal to the ratio between the weight of the aggregate and the sum of the weights of the competing aggregates. The packet delay guaranteed to the aggregate is instead inversely proportional to the guaranteed bandwidth. By using the maximum possible value, and not the actual value of the weight sum, QFQ+ provides each aggregate with the worst possible service guarantees, and not with service guarantees related to the actual set of competing aggregates. To see the consequences of this fact, consider the following simple example. Suppose that only the following aggregates are backlogged, i.e., that only the classes in the following aggregates have packets to transmit: one aggregate with weight 10, say A, and ten aggregates with weight 1, say B1, B2, ..., B10. In particular, suppose that these aggregates are always backlogged. Given the weight distribution, the smoothest and fairest service order would be: A B1 A B2 A B3 A B4 A B5 A B6 A B7 A B8 A B9 A B10 A B1 A B2 ... QFQ+ would provide exactly this optimal service if it used the actual value for the weight sum instead of the maximum possible value, i.e., 11 instead of 2^10. In contrast, since QFQ+ uses the latter value, it serves aggregates as follows (easy to prove and to reproduce experimentally): A B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 A A A A A A A A A A B1 B2 ... B10 A A ... By replacing 10 with N in the above example, and by increasing N, one can increase at will the maximum packet delay and the jitter experienced by the classes in aggregate A. This patch addresses this issue by just using the above 'instantaneous' value of the weight sum, instead of the maximum possible value, when updating the system virtual time. After the instantaneous weight sum is decreased, QFQ+ may deviate from the ideal service for a time interval in the order of the time to serve one maximum-size packet for each backlogged class. The worst-case extent of the deviation exhibited by QFQ+ during this time interval [1] is basically the same as of the deviation described above (but, without this patch, QFQ+ suffers from such a deviation all the time). Finally, this patch modifies the comment to the function qfq_slot_insert, to make it coherent with the fact that the weight sum used by QFQ+ can now be lower than the maximum possible value. [1] P. Valente, "Extending WF2Q+ to support a dynamic traffic mix", Proceedings of AAA-IDEA'05, June 2005. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-18 13:02:00 -07:00
Linus Torvalds	3f334c2081	Merge branch 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull phase two of __cpuinit removal from Paul Gortmaker: "With the __cpuinit infrastructure removed earlier, this group of commits only removes the function/data tagging that was done with the various (now no-op) __cpuinit related prefixes. Now that the dust has settled with yesterday's v3.11-rc1, there hopefully shouldn't be any new users leaking back in tree, but I think we can leave the harmless no-op stubs there for a release as a courtesy to those who still have out of tree stuff and weren't paying attention. Although the commits are against the recent tag to allow for minor context refreshes for things like yesterday's v3.11-rc1~ slab content, the patches have been largely unchanged for weeks, aside from such trivial updates. For detail junkies, the largely boring and mostly irrelevant history of the patches can be viewed at: http://git.kernel.org/cgit/linux/kernel/git/paulg/cpuinit-delete.git If nothing else, I guess it does at least demonstrate the level of involvement required to shepherd such a treewide change to completion. This is the same repository of patches that has been applied to the end of the daily linux-next branches for the past several weeks" * 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (28 commits) block: delete __cpuinit usage from all block files drivers: delete __cpuinit usage from all remaining drivers files kernel: delete __cpuinit usage from all core kernel files rcu: delete __cpuinit usage from all rcu files net: delete __cpuinit usage from all net files acpi: delete __cpuinit usage from all acpi files hwmon: delete __cpuinit usage from all hwmon files cpufreq: delete __cpuinit usage from all cpufreq files clocksource+irqchip: delete __cpuinit usage from all related files x86: delete __cpuinit usage from all x86 files score: delete __cpuinit usage from all score files xtensa: delete __cpuinit usage from all xtensa files openrisc: delete __cpuinit usage from all openrisc files m32r: delete __cpuinit usage from all m32r files hexagon: delete __cpuinit usage from all hexagon files frv: delete __cpuinit usage from all frv files cris: delete __cpuinit usage from all cris files metag: delete __cpuinit usage from all metag files tile: delete __cpuinit usage from all tile files sh: delete __cpuinit usage from all sh files ...	2013-07-18 10:50:26 -07:00
Linus Torvalds	61f98b0fca	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from Bruce Fields: "Just three minor bugfixes" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: svcrdma: underflow issue in decode_write_list() nfsd4: fix minorversion support interface lockd: protect nlm_blocked access in nlmsvc_retry_blocked	2013-07-17 13:43:55 -07:00
David S. Miller	ae8e9c5a1a	net: Fix sysfs_format_mac() code duplication. It's just a duplicate implementation of "%*phC". Thanks to Joe Perches for showing that we had exactly this support in the lib/vsprintf.c code already. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-16 17:09:22 -07:00
Eric Dumazet	21d1196a35	ipv4: set transport header earlier commit `45f00f99d6` ("ipv4: tcp: clean up tcp_v4_early_demux()") added a performance regression for non GRO traffic, basically disabling IP early demux. IPv6 stack resets transport header in ip6_rcv() before calling IP early demux in ip6_rcv_finish(), while IPv4 does this only in ip_local_deliver_finish(), _after_ IP early demux. GRO traffic happened to enable IP early demux because transport header is also set in inet_gro_receive() Instead of reverting the faulty commit, we can make IPv4/IPv6 behave the same : transport_header should be set in ip_rcv() instead of ip_local_deliver_finish() ip_local_deliver_finish() can also use skb_network_header_len() which is faster than ip_hdrlen() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-16 12:59:28 -07:00
Dragos Foianu	b6a82dd233	net/irda: fixed style issues in irlan_eth Applied fixes suggested by checkpatch.pl Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-16 12:16:03 -07:00
Dragos Foianu	8a73125c36	ethtool: fixed trailing statements in ethtool Applied fixes suggested by checkpatch.pl. Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-16 12:14:51 -07:00
Daniel Baluta	f2f79cca13	ndisc: bool initializations should use true and false Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-16 12:13:31 -07:00
Felix Fietkau	5c9fc93bc9	mac80211/minstrel: fix NULL pointer dereference issue When priv_sta == NULL, mi->prev_sample is dereferenced too early. Move the assignment further down, after the rate_control_send_low call. Reported-by: Krzysztof Mazur <krzysiek@podlesie.net> Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 17:48:14 +03:00
Johannes Berg	c82b5a74cc	mac80211: make active monitor injection work w/ HW queue When a driver (like hwsim) uses HW queue control an active monitor vif needs to be used for the queues, make the code do that. Otherwise we'd bail out and drop the frames. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 09:58:19 +03:00
Chun-Yeow Yeoh	b60e527a72	mac80211: set forwarding in mesh capability info Set the Forwarding bit in Mesh Capability Info according to dot11MeshForwarding as defined in IEEE 802.11-2012 section 8.4.2.100.8. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 09:58:14 +03:00
Simon Wunderlich	93c75786e8	mac80211: fix off-by-one regression in ibss beacon generation There is an off-by-one error in the beacon generation for the ibss mode, falsely a rate the extended supported rates which was already added to supported rates, messing up the beacon. This was introduced by commit "mac80211: select and adjust bitrates according to channel mode". Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 09:58:13 +03:00
Simon Wunderlich	2ec9c1f67a	mac80211: fix regression when initializing ibss wmm params There appear to be two regressions in ibss.c when calling ieee80211_sta_def_wmm_params(): * the second argument should be a rate length, not a rate array. This was introduced by my commit "mac80211: select and adjust bitrates according to channel mode" * the third argument is not initialized (anymore), making further checks within this function useless. Since ieee80211_sta_def_wmm_params() is only used by ibss anyway, remove the function entirely and handle the operating mode decision immediately. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 09:58:12 +03:00
Simon Wunderlich	bf37264572	nl80211: allow 5 and 10 MHz channels for IBSS Whether the wiphy supports it or not is already checked, so what is left is to enable these channel types. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:11 +03:00
Simon Wunderlich	4b42aab1e8	mac80211: return if IBSS chandef can not be used This was originally designed to fail when a 40+/40- mode can not be used, but basic modes (such as 5/10/20 MHz) must be handled with an error. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:10 +03:00
Simon Wunderlich	7ca15a0ae8	mac80211: allow scanning for 5/10 MHz channels in IBSS Use a chandef instead of just the channel for scanning, and enable 5/10 Mhz scanning for IBSS mode. Also reporting is changed to the new inform_bss functions. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:09 +03:00
Simon Wunderlich	0430c88347	cfg80211/mac80211: use reduced txpower for 5 and 10 MHz Some regulations (like germany, but also FCC) express their transmission power limit in dBm/MHz or mW/MHz. To cope with that and be on the safe side, reduce the maximum power to half (10 MHz) or quarter (5 MHz) when operating on these reduced bandwidth channels. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:08 +03:00
Simon Wunderlich	74608aca4d	cfg80211/mac80211: get mandatory rates based on scan width Mandatory rates for 5 and 10 MHz are different from the rates used for 20 MHz in 2.4 GHz mode, as they use OFDM only. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:07 +03:00
Simon Wunderlich	2103dec147	mac80211: select and adjust bitrates according to channel mode The various components accessing the bitrates table must use consider the used channel bandwidth to select only available rates or calculate the bitrate correctly. There are some rates in reduced bandwidth modes which can't be represented as multiples of 500kbps, like 2.25 MBit/s in 5 MHz mode. The standard suggests to round up to the next multiple of 500kbps, just do that in mac80211 as well. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> [make rate unsigned in ieee80211_add_tx_radiotap_header(), squash fix] Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:06 +03:00
Simon Wunderlich	a5e70697d0	mac80211: add radiotap flag and handling for 5/10 MHz Wireshark already defines radiotap channel flags for 5 and 10 MHz, so just use them in Linux radiotap too. Furthermore, add rx status flags to allow drivers to report when they received data on 5 or 10 MHz channels. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:05 +03:00
Simon Wunderlich	438b61b770	mac80211: fix timing for 5 MHz and 10 MHz channels according to IEEE 802.11-2012 section 18, various timings change when using 5 MHz and 10 MHz. Reflect this by using a "shift" when calculating durations. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:04 +03:00
Simon Wunderlich	3de805cf96	mac80211/rc80211: add chandef to rate initialization 5 and 10 MHz support needs to know the current operating channel width, add the chandef to the rate control API. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:02 +03:00
Simon Wunderlich	dcd6eac1f3	nl80211: add scan width to bss and scan request structs To allow scanning and working with 5 MHz and 10 MHz BSS, extend the inform bss commands and add wrappers to take 5 and 10 MHz bss into account. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:01 +03:00
Amitkumar Karwar	be29b99a9b	cfg80211/nl80211: Add packet coalesce support In most cases, host that receives IPv4 and IPv6 multicast/broadcast packets does not do anything with these packets. Therefore the reception of these unwanted packets causes unnecessary processing and power consumption. Packet coalesce feature helps to reduce number of received interrupts to host by buffering these packets in firmware/hardware for some predefined time. Received interrupt will be generated when one of the following events occur. a) Expiration of hardware timer whose expiration time is set to maximum coalescing delay of matching coalesce rule. b) Coalescing buffer in hardware reaches it's limit. c) Packet doesn't match any of the configured coalesce rules. This patch adds set/get configuration support for packet coalesce. User needs to configure following parameters for creating a coalesce rule. a) Maximum coalescing delay b) List of packet patterns which needs to be matched c) Condition for coalescence. pattern 'match' or 'no match' Multiple such rules can be created. This feature needs to be advertised during driver initialization. Drivers are supposed to do required firmware/hardware settings based on user configuration. Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> [fix kernel-doc, change free function, fix copy/paste error] Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:58:00 +03:00
Johannes Berg	a144f378a4	mac80211: add per-chain signal information to radiotap When per-chain signal information is available, don't add the antenna field once but instead add a radiotap namespace for each chain containing the chain/antenna number and the signal strength on that chain. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:57:59 +03:00
Simon Wunderlich	822854b0b1	mac80211: enable HT overrides for ibss Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:57:57 +03:00
Simon Wunderlich	803768f54e	nl80211: enable HT overrides for ibss Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:57:56 +03:00
Amitkumar Karwar	50ac660784	cfg80211/nl80211: rename packet pattern related structures and enums Currently packet patterns and it's enum/structures are used only for WoWLAN feature. As we intend to reuse them for new feature packet coalesce, they are renamed in this patch. Older names are kept for backward compatibility purpose. Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:57:55 +03:00
Johannes Berg	6b0f32745d	mac80211: fix duplicate retransmission detection The duplicate retransmission detection code in mac80211 erroneously attempts to do the check for every frame, even frames that don't have a sequence control field or that don't use it (QoS-Null frames.) This is problematic because it causes the code to access data beyond the end of the SKB and depending on the data there will drop packets erroneously. Correct the code to not do duplicate detection for such frames. I found this error while testing AP powersave, it lead to retransmitted PS-Poll frames being dropped entirely as the data beyond the end of the SKB was always zero. Cc: stable@vger.kernel.org [all versions] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 09:55:59 +03:00
Chun-Yeow Yeoh	83374fe9de	nl80211: fix the setting of RSSI threshold value for mesh RSSI threshold value used for mesh peering should be in negative value. After range checks to mesh parameters is introduced, this is not allowed. Fix this. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:55:58 +03:00
Johannes Berg	e13bae4f80	mac80211: fix ethtool stats for non-station interfaces As reported in https://bugzilla.kernel.org/show_bug.cgi?id=60514, the station loop never initialises 'sinfo' and therefore adds up a stack values, leaking stack information (the number of times it adds values is easily obtained another way.) Fix this by initialising the sinfo for each station to add. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:55:57 +03:00
Felix Fietkau	1cd1585739	mac80211/minstrel_ht: fix cck rate sampling The CCK group needs special treatment to set the right flags and rate index. Add this missing check to prevent setting broken rates for tx packets. Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:55:56 +03:00
Johannes Berg	f77b86d7d3	regulatory: add missing rtnl locking restore_regulatory_settings() requires the RTNL to be held, add the missing locking in reg_timeout_work(). Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 09:55:55 +03:00
Johannes Berg	923a0e7dee	cfg80211: fix bugs in new SME implementation When splitting the SME implementation from the MLME code, I introduced a few bugs: * association failures no longer sent a connect-failure event * getting disassociated from the AP caused deauth to be sent but state wasn't cleaned up, leading to warnings * authentication failures weren't cleaned up properly, causing new connection attempts to warn and fail Fix these bugs. Signed-off-by: Johannes Berg <johannes@sipsolutions.net>	2013-07-16 09:55:54 +03:00
Michal Kazior	a0ec570f4f	nl80211: fix mgmt tx status and testmode reporting for netns These two events were sent to the default network namespace. This caused AP mode in a non-default netns to not work correctly. Mgmt tx status was multicasted to a different (default) netns instead of the one the AP was in. Cc: stable@vger.kernel.org Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-07-16 09:55:37 +03:00
Dan Carpenter	b2781e1021	svcrdma: underflow issue in decode_write_list() My static checker marks everything from ntohl() as untrusted and it complains we could have an underflow problem doing: return (u32 *)&ary->wc_array[nchunks]; Also on 32 bit systems the upper bound check could overflow. Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-15 11:46:23 -04:00
Trond Myklebust	1540c5d3cb	SUNRPC: Fix another issue with rpc_client_register() Fix the error pathway if rpcauth_create() fails. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-15 10:15:54 -04:00
Eric Dumazet	baf60efa58	netfilter: xt_socket: fix broken v0 support commit `681f130f39` ("netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag") added a potential NULL dereference if an old iptables package uses v0 of the match. Fix this by removing the test on @info in fast path. IPv6 can remove the test as well, as it uses v1 or v2. Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-15 11:15:21 +02:00
Pablo Neira Ayuso	f09eca8db0	netfilter: ctnetlink: fix incorrect NAT expectation dumping nf_ct_expect_alloc leaves unset the expectation NAT fields. However, ctnetlink_exp_dump_expect expects them to be zeroed in case they are not used, which may not be the case. This results in dumping the NAT tuple of the expectation when it should not. Fix it by zeroing the NAT fields of the expectation. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-07-15 11:14:51 +02:00
Rusty Russell	8c6ffba0ed	PTR_RET is now PTR_ERR_OR_ZERO(): Replace most. Sweep of the simple cases. Cc: netdev@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-kernel@lists.infradead.org Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-07-15 11:25:01 +09:30
Paul Gortmaker	013dbb325b	net: delete __cpuinit usage from all net files The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit `5e427ec2d0` ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the net/* uses of the __cpuinit macros from all C files. [1] https://lkml.org/lkml/2013/5/20/589 Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2013-07-14 19:36:58 -04:00
Linus Torvalds	41d9884c44	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs stuff from Al Viro: "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including making simple_lookup() usable for filesystems with non-NULL s_d_op, which allows us to get rid of quite a bit of ugliness" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sunrpc: now we can just set ->s_d_op cgroup: we can use simple_lookup() now efivarfs: we can use simple_lookup() now make simple_lookup() usable for filesystems that set ->s_d_op configfs: don't open-code d_alloc_name() __rpc_lookup_create_exclusive: pass string instead of qstr rpc_create_*_dir: don't bother with qstr llist: llist_add() can use llist_add_batch() llist: fix/simplify llist_add() and llist_add_batch() fput: turn "list_head delayed_fput_list" into llist_head fs/file_table.c:fput(): add comment Safer ABI for O_TMPFILE	2013-07-14 11:42:26 -07:00
Al Viro	dae3794fd6	sunrpc: now we can just set ->s_d_op Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-14 17:55:39 +04:00
Al Viro	d3db90b0a4	__rpc_lookup_create_exclusive: pass string instead of qstr ... and use d_hash_and_lookup() instead of open-coding it, for fsck sake... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-14 17:09:57 +04:00
Al Viro	a95e691f9c	rpc_create_*_dir: don't bother with qstr just pass the name Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-07-14 17:02:28 +04:00
Linus Torvalds	be9c6d9169	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Just a bunch of small fixes and tidy ups: 1) Finish the "busy_poll" renames, from Eliezer Tamir. 2) Fix RCU stalls in IFB driver, from Ding Tianhong. 3) Linearize buffers properly in tun/macvtap zerocopy code. 4) Don't crash on rmmod in vxlan, from Pravin B Shelar. 5) Spinlock used before init in alx driver, from Maarten Lankhorst. 6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry Kravkov. 7) Dummy and ifb driver load failure paths can oops, fixes from Tan Xiaojun and Ding Tianhong. 8) Correct MTU calculations in IP tunnels, from Alexander Duyck. 9) Account all TCP retransmits in SNMP stats properly, from Yuchung Cheng. 10) atl1e and via-rhine do not handle DMA mapping failures properly, from Neil Horman. 11) Various equal-cost multipath route fixes in ipv6 from Hannes Frederic Sowa" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits) ipv6: only static routes qualify for equal cost multipathing via-rhine: fix dma mapping errors atl1e: fix dma mapping warnings tcp: account all retransmit failures usb/net/r815x: fix cast to restricted __le32 usb/net/r8152: fix integer overflow in expression net: access page->private by using page_private net: strict_strtoul is obsolete, use kstrtoul instead drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe net/usb: add relative mii functions for r815x net/tipc: use %*phC to dump small buffers in hex form qlcnic: Adding Maintainers. gre: Fix MTU sizing check for gretap tunnels pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts pkt_sched: sch_qfq: improve efficiency of make_eligible gso: Update tunnel segmentation to support Tx checksum offload inet: fix spacing in assignment ifb: fix oops when loading the ifb failed ...	2013-07-13 17:42:22 -07:00
Hannes Frederic Sowa	307f2fb95e	ipv6: only static routes qualify for equal cost multipathing Static routes in this case are non-expiring routes which did not get configured by autoconf or by icmpv6 redirects. To make sure we actually get an ecmp route while searching for the first one in this fib6_node's leafs, also make sure it matches the ecmp route assumptions. v2: a) Removed RTF_EXPIRE check in dst.from chain. The check of RTF_ADDRCONF already ensures that this route, even if added again without RTF_EXPIRES (in case of a RA announcement with infinite timeout), does not cause the rt6i_nsiblings logic to go wrong if a later RA updates the expiration time later. v3: a) Allow RTF_EXPIRES routes to enter the ecmp route set. We have to do so, because an pmtu event could update the RTF_EXPIRES flag and we would not count this route, if another route joins this set. We now filter only for RTF_GATEWAY\|RTF_ADDRCONF\|RTF_DYNAMIC, which are flags that don't get changed after rt6_info construction. Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-12 16:29:54 -07:00
Yuchung Cheng	24ab6bec80	tcp: account all retransmit failures Change snmp RETRANSFAILS stat to include timeout retransmit failures in addition to other loss recoveries. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-12 16:15:56 -07:00
Sunghan Suh	40dadff265	net: access page->private by using page_private Signed-off-by: Sunghan Suh <sunghan.suh@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-12 16:10:34 -07:00
“Cosmin	92338dc2fb	net: strict_strtoul is obsolete, use kstrtoul instead patch found using checkpatch.pl Signed-off-by: Cosmin Stanescu <cosmin90stanescu@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-12 16:09:14 -07:00
Andy Shevchenko	d77e41e127	net/tipc: use %*phC to dump small buffers in hex form Instead of passing each byte by stack let's use nice specifier for that. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 17:03:36 -07:00
Alexander Duyck	8c91e162e0	gre: Fix MTU sizing check for gretap tunnels This change fixes an MTU sizing issue seen with gretap tunnels when non-gso packets are sent from the interface. In my case I was able to reproduce the issue by simply sending a ping of 1421 bytes with the gretap interface created on a device with a standard 1500 mtu. This fix is based on the fact that the tunnel mtu is already adjusted by dev->hard_header_len so it would make sense that any packets being compared against that mtu should also be adjusted by hard_header_len and the tunnel header instead of just the tunnel header. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Reported-by: Cong Wang <amwang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 16:12:03 -07:00
Paolo Valente	88d4f419a4	pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts This patch removes the forward declaration of qfq_update_agg_ts, by moving the definition of the function above its first call. This patch also removes a useless forward declaration of qfq_schedule_agg. Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 13:01:07 -07:00
Paolo Valente	87f1369d6e	pkt_sched: sch_qfq: improve efficiency of make_eligible In make_eligible, a mask is used to decide which groups must become eligible: the i-th group becomes eligible only if the i-th bit of the mask (from the right) is set. The mask is computed by left-shifting a 1 by a given number of places, and decrementing the result. The shift is performed on a ULL to avoid problems in case the number of places to shift is higher than 31. On a 32-bit machine, this is more costly than working on an UL. This patch replaces such a costly operation with two cheaper branches. The trick is based on the following fact: in case of a shift of at least 32 places, the resulting mask has at least the 32 less significant bits set, whereas the total number of groups is lower than 32. As a consequence, in this case it is enough to just set the 32 less significant bits of the mask with a cheaper ~0UL. In the other case, the shift can be safely performed on a UL. Reported-by: David S. Miller <davem@davemloft.net> Reported-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 13:01:07 -07:00
Alexander Duyck	cdbaa0bb26	gso: Update tunnel segmentation to support Tx checksum offload This change makes it so that the GRE and VXLAN tunnels can make use of Tx checksum offload support provided by some drivers via the hw_enc_features. Without this fix enabling GSO means sacrificing Tx checksum offload and this actually leads to a performance regression as shown below: Utilization Send Throughput local GSO 10^6bits/s % S state 6276.51 8.39 enabled 7123.52 8.42 disabled To resolve this it was necessary to address two items. First netif_skb_features needed to be updated so that it would correctly handle the Trans Ether Bridging protocol without impacting the need to check for Q-in-Q tagging. To do this it was necessary to update harmonize_features so that it used skb_network_protocol instead of just using the outer protocol. Second it was necessary to update the GRE and UDP tunnel segmentation offloads so that they would reset the encapsulation bit and inner header offsets after the offload was complete. As a result of this change I have seen the following results on a interface with Tx checksum enabled for encapsulated frames: Utilization Send Throughput local GSO 10^6bits/s % S state 7123.52 8.42 disabled 8321.75 5.43 enabled v2: Instead of replacing refrence to skb->protocol with skb_network_protocol just replace the protocol reference in harmonize_features to allow for double VLAN tag checks. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 12:18:49 -07:00
Linus Torvalds	1466b77a7b	NFS client updates for Linux 3.11 (part 2) Highlights include: - Fix an_rpc pipefs regression that causes a deadlock on mount - Readdir optimisations by Scott Mayhew and Jeff Layton - clean up the rpc_pipefs dentry operation setup -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR3vVIAAoJEGcL54qWCgDyBWEP/0blqSlJId4zZj4xDviRFqJ4 93C7b/Vn7LrAcNCgDQsPkkzTwAX5yTB1H5eNtMuyggAdGj89d4n0jXgBniIMHmqI Pjrr/XMQ65NddehrO491N01iJSfP9wE3CizJodnAv4VxMRO3xqiJG85lcnoLOFea V1FnEFUu9oi8e93cQt2fe6KdmTu/SuRqlqR7WPGyTFgS26x1l8nkp2OQgulit5Up lWuaxg4xbKOdj1jfUDXZhWUnDtkFjxyGxnKR63aA2X1DEGCUTJ6gB3tAl9pvnUb2 RTQF3GVj+Bm/E3gE6ULJvqOjhsgWYjLAZn6hDA3yNAIiFyV7aA6gwK4oKy/B47a6 tFEN2O1EupWzCqGyHhTArk+oEBLfUv/EgFyo7+Y0YIFV4sQTu5RbaZ0nQ2geY6LA 50q2GH57tkXTs859gtBPQgKzgRF1ulkF1FDY9EYQHyGiUbNxBfx+6/2OI04ubQt3 1gKUmm9w1WVzYGmHcHbxsXPT53NtAnHXW4ExcMgpaZ1YOPuIILm78ZuAw78XB/dd mvXRtbhVt/gs7qZAQQPp1iHIv+vnJ0KgjO62gbuTIRftw5jwWrpWcfYMUUZrMnyM kn326z3f4gn/vSDZI7J4tOfG1Uc7eNy+cJxStjtiNWTs3UzuWJKzJH0rZnoNZdei xAkLhjIUEybAqIpXJuGH =NqQf -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull second set of NFS client updates from Trond Myklebust: "This mainly contains some small readdir optimisations that had dependencies on Al Viro's readdir rewrite. There is also a fix for a nasty deadlock which surfaced earlier in this merge window. Highlights include: - Fix an_rpc pipefs regression that causes a deadlock on mount - Readdir optimisations by Scott Mayhew and Jeff Layton - clean up the rpc_pipefs dentry operation setup" * tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Fix a deadlock in rpc_client_register() rpc_pipe: rpc_dir_inode_operations can be static NFS: Allow nfs_updatepage to extend a write under additional circumstances NFS: Make nfs_readdir revalidate less often NFS: Make nfs_attribute_cache_expired() non-static rpc_pipe: set dentry operations at d_alloc time nfs: set verifier on existing dentries in nfs_prime_dcache	2013-07-11 12:11:35 -07:00
Camelia Groza	3b8ccd4473	inet: fix spacing in assignment Found using checkpatch.pl Signed-off-by: Camelia Groza <camelia.groza@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 12:02:39 -07:00
Hannes Frederic Sowa	afc154e978	ipv6: fix route selection if kernel is not compiled with CONFIG_IPV6_ROUTER_PREF This is a follow-up patch to `3630d40067` ("ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available"). Since the removal of rt->n in rt6_info we can end up with a dst == NULL in rt6_check_neigh. In case the kernel is not compiled with CONFIG_IPV6_ROUTER_PREF we should also select a route with unkown NUD state but we must not avoid doing round robin selection on routes with the same target. So introduce and pass down a boolean ``do_rr'' to indicate when we should update rt->rr_ptr. As soon as no route is valid we do backtracking and do a lookup on a higher level in the fib trie. v2: a) Improved rt6_check_neigh logic (no need to create neighbour there) and documented return values. v3: a) Introduce enum rt6_nud_state to get rid of the magic numbers (thanks to David Miller). b) Update and shorten commit message a bit to actualy reflect the source. Reported-by: Pierre Emeriaud <petrus.lt@gmail.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 11:51:10 -07:00
Sasha Levin	110ecd69a9	9p: fix off by one causing access violations and memory corruption p9_release_pages() would attempt to dereference one value past the end of pages[]. This would cause the following crashes: [ 6293.171817] BUG: unable to handle kernel paging request at ffff8807c96f3000 [ 6293.174146] IP: [<ffffffff8412793b>] p9_release_pages+0x3b/0x60 [ 6293.176447] PGD 79c5067 PUD 82c1e3067 PMD 82c197067 PTE 80000007c96f3060 [ 6293.180060] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 6293.180060] Modules linked in: [ 6293.180060] CPU: 62 PID: 174043 Comm: modprobe Tainted: G W 3.10.0-next-20130710-sasha #3954 [ 6293.180060] task: ffff8807b803b000 ti: ffff880787dde000 task.ti: ffff880787dde000 [ 6293.180060] RIP: 0010:[<ffffffff8412793b>] [<ffffffff8412793b>] p9_release_pages+0x3b/0x60 [ 6293.214316] RSP: 0000:ffff880787ddfc28 EFLAGS: 00010202 [ 6293.214316] RAX: 0000000000000001 RBX: ffff8807c96f2ff8 RCX: 0000000000000000 [ 6293.222017] RDX: ffff8807b803b000 RSI: 0000000000000001 RDI: ffffea001c7e3d40 [ 6293.222017] RBP: ffff880787ddfc48 R08: 0000000000000000 R09: 0000000000000000 [ 6293.222017] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001 [ 6293.222017] R13: 0000000000000001 R14: ffff8807cc50c070 R15: ffff8807cc50c070 [ 6293.222017] FS: 00007f572641d700(0000) GS:ffff8807f3600000(0000) knlGS:0000000000000000 [ 6293.256784] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 6293.256784] CR2: ffff8807c96f3000 CR3: 00000007c8e81000 CR4: 00000000000006e0 [ 6293.256784] Stack: [ 6293.256784] ffff880787ddfcc8 ffff880787ddfcc8 0000000000000000 ffff880787ddfcc8 [ 6293.256784] ffff880787ddfd48 ffffffff84128be8 ffff880700000002 0000000000000001 [ 6293.256784] ffff8807b803b000 ffff880787ddfce0 0000100000000000 0000000000000000 [ 6293.256784] Call Trace: [ 6293.256784] [<ffffffff84128be8>] p9_virtio_zc_request+0x598/0x630 [ 6293.256784] [<ffffffff8115c610>] ? wake_up_bit+0x40/0x40 [ 6293.256784] [<ffffffff841209b1>] p9_client_zc_rpc+0x111/0x3a0 [ 6293.256784] [<ffffffff81174b78>] ? sched_clock_cpu+0x108/0x120 [ 6293.256784] [<ffffffff84122a21>] p9_client_read+0xe1/0x2c0 [ 6293.256784] [<ffffffff81708a90>] v9fs_file_read+0x90/0xc0 [ 6293.256784] [<ffffffff812bd073>] vfs_read+0xc3/0x130 [ 6293.256784] [<ffffffff811a78bd>] ? trace_hardirqs_on+0xd/0x10 [ 6293.256784] [<ffffffff812bd5a2>] SyS_read+0x62/0xa0 [ 6293.256784] [<ffffffff841a1a00>] tracesys+0xdd/0xe2 [ 6293.256784] Code: 66 90 48 89 fb 41 89 f5 48 8b 3f 48 85 ff 74 29 85 f6 74 25 45 31 e4 66 0f 1f 84 00 00 00 00 00 e8 eb 14 12 fd 41 ff c4 49 63 c4 <48> 8b 3c c3 48 85 ff 74 05 45 39 e5 75 e7 48 83 c4 08 5b 41 5c [ 6293.256784] RIP [<ffffffff8412793b>] p9_release_pages+0x3b/0x60 [ 6293.256784] RSP <ffff880787ddfc28> [ 6293.256784] CR2: ffff8807c96f3000 [ 6293.256784] ---[ end trace 50822ee72cd360fc ]--- Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-11 11:36:02 -07:00
Linus Torvalds	19d2f8e0fb	Second round of 9p patches for the 3.11 merge window. Several of these patches were rebased in order to correct style issues. Only stylistic changes were made versus the patches which were in linux-next for two weeks. The rebases have been in linux-next for 3 days and have passed my regressions. The bulk of these are RDMA fixes and improvements. There's also some additions on the extended attributes front to support some additional namespaces and a new option for TCP to force allocation of mount requests from a priviledged port. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: GPGTools - http://gpgtools.org iQIcBAABAgAGBQJR3rWXAAoJEDZk62b0Tg6xabIP/12I+SkQ57wRN03EQy5fqUdX gK/YMHKQ9QuDnZPBvrZ2lypesQNqVU0KINay6VEA86JG1gwzPyUd2MnpQ7F0vV3N XwVD54IoflV/M74xUnrgGWB8YxaPcdacQQ8yazX+mEgOgYGdWmDAl7FHmAkdKAFB gSl25f3PNJX1Rjay0dssNVXrVPXuJY/fZXKnNQZKtRwXffRWKsWHd8FU0Eq7F30A kNQB8tmMSfHBBjP+tzR0My6/kQ09jzHdtZOkH9IgVpNzqrd8tfy0l6tEvFypxqGT 5oQFoxHHL/tUW05V0P3gYany2A7lEhSUifPKS6omqHO+vPlw+pDJw+xWlNq9fnDt 8S8znqVuEHhvqRQW7zFdb9ac2MZi8CHHhC2wGIZ7GYjNG2q5XwE8b/QhdXQeFin7 ibugvoW7+ZdcDewpQW27oO0g7B/8hRt8KC+1lc/8rITKIfGxbNJkGzTDl0F4Co7v IH7Ew5PHPe6ZiuU0QSdU+NBuvk8g8sWGxx04Xvzl3WicwOg7XvN3ivrKB9oN2U1x 50KZRnYpwQQv/9AxyhroYU+Ufje8SF4v++zsq1eMzUcHsC/C73eatw2m764t+X4S 8yMLrgqY1Nzif4nAMi/SDMnB/R1bXeuc8kXD9xT6XD9d2tf6e+zCHhQklVeC0tuK RiVRJqGrfanbKMnWIG0Y =n9rI -----END PGP SIGNATURE----- Merge tag 'for-linus-3.11-merge-window-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull second round of 9p patches from Eric Van Hensbergen: "Several of these patches were rebased in order to correct style issues. Only stylistic changes were made versus the patches which were in linux-next for two weeks. The rebases have been in linux-next for 3 days and have passed my regressions. The bulk of these are RDMA fixes and improvements. There's also some additions on the extended attributes front to support some additional namespaces and a new option for TCP to force allocation of mount requests from a priviledged port" * tag 'for-linus-3.11-merge-window-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: Remove the unused variable "err" in v9fs_vfs_getattr() 9P: Add cancelled() to the transport functions. 9P/RDMA: count posted buffers without a pending request 9P/RDMA: Improve error handling in rdma_request 9P/RDMA: Do not free req->rc in error handling in rdma_request() 9P/RDMA: Use a semaphore to protect the RQ 9P/RDMA: Protect against duplicate replies 9P/RDMA: increase P9_RDMA_MAXSIZE to 1MB 9pnet: refactor struct p9_fcall alloc code 9P/RDMA: rdma_request() needs not allocate req->rc 9P: Fix fcall allocation for rdma fs/9p: xattr: add trusted and security namespaces net/9p: add privport option to 9p tcp transport	2013-07-11 10:21:23 -07:00
Linus Torvalds	0ff08ba5d0	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from Bruce Fields: "Changes this time include: - 4.1 enabled on the server by default: the last 4.1-specific issues I know of are fixed, so we're not going to find the rest of the bugs without more exposure. - Experimental support for NFSv4.2 MAC Labeling (to allow running selinux over NFS), from Dave Quigley. - Fixes for some delicate cache/upcall races that could cause rare server hangs; thanks to Neil Brown and Bodo Stroesser for extreme debugging persistence. - Fixes for some bugs found at the recent NFS bakeathon, mostly v4 and v4.1-specific, but also a generic bug handling fragmented rpc calls" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: support minorversion 1 by default nfsd4: allow destroy_session over destroyed session svcrpc: fix failures to handle -1 uid's sunrpc: Don't schedule an upcall on a replaced cache entry. net/sunrpc: xpt_auth_cache should be ignored when expired. sunrpc/cache: ensure items removed from cache do not have pending upcalls. sunrpc/cache: use cache_fresh_unlocked consistently and correctly. sunrpc/cache: remove races with queuing an upcall. nfsd4: return delegation immediately if lease fails nfsd4: do not throw away 4.1 lock state on last unlock nfsd4: delegation-based open reclaims should bypass permissions svcrpc: don't error out on small tcp fragment svcrpc: fix handling of too-short rpc's nfsd4: minor read_buf cleanup nfsd4: fix decoding of compounds across page boundaries nfsd4: clean up nfs4_open_delegation NFSD: Don't give out read delegations on creates nfsd4: allow client to send no cb_sec flavors nfsd4: fail attempts to request gss on the backchannel nfsd4: implement minimal SP4_MACH_CRED ...	2013-07-11 10:17:13 -07:00
Hannes Frederic Sowa	1eb4f75828	ipv6: in case of link failure remove route directly instead of letting it expire We could end up expiring a route which is part of an ecmp route set. Doing so would invalidate the rt->rt6i_nsiblings calculations and could provoke the following panic: [ 80.144667] ------------[ cut here ]------------ [ 80.145172] kernel BUG at net/ipv6/ip6_fib.c:733! [ 80.145172] invalid opcode: 0000 [#1] SMP [ 80.145172] Modules linked in: 8021q nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables +snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer virtio_balloon snd soundcore i2c_piix4 i2c_core virtio_net virtio_blk [ 80.145172] CPU: 1 PID: 786 Comm: ping6 Not tainted 3.10.0+ #118 [ 80.145172] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 80.145172] task: ffff880117fa0000 ti: ffff880118770000 task.ti: ffff880118770000 [ 80.145172] RIP: 0010:[<ffffffff815f3b5d>] [<ffffffff815f3b5d>] fib6_add+0x75d/0x830 [ 80.145172] RSP: 0018:ffff880118771798 EFLAGS: 00010202 [ 80.145172] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011350e480 [ 80.145172] RDX: ffff88011350e238 RSI: 0000000000000004 RDI: ffff88011350f738 [ 80.145172] RBP: ffff880118771848 R08: ffff880117903280 R09: 0000000000000001 [ 80.145172] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011350f680 [ 80.145172] R13: ffff880117903280 R14: ffff880118771890 R15: ffff88011350ef90 [ 80.145172] FS: 00007f02b5127740(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000 [ 80.145172] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 80.145172] CR2: 00007f981322a000 CR3: 00000001181b1000 CR4: 00000000000006e0 [ 80.145172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 80.145172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 80.145172] Stack: [ 80.145172] 0000000000000001 ffff880100000000 ffff880100000000 ffff880117903280 [ 80.145172] 0000000000000000 ffff880119a4cf00 0000000000000400 00000000000007fa [ 80.145172] 0000000000000000 0000000000000000 0000000000000000 ffff88011350f680 [ 80.145172] Call Trace: [ 80.145172] [<ffffffff815eeceb>] ? rt6_bind_peer+0x4b/0x90 [ 80.145172] [<ffffffff815ed985>] __ip6_ins_rt+0x45/0x70 [ 80.145172] [<ffffffff815eee35>] ip6_ins_rt+0x35/0x40 [ 80.145172] [<ffffffff815ef1e4>] ip6_pol_route.isra.44+0x3a4/0x4b0 [ 80.145172] [<ffffffff815ef34a>] ip6_pol_route_output+0x2a/0x30 [ 80.145172] [<ffffffff81616077>] fib6_rule_action+0xd7/0x210 [ 80.145172] [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30 [ 80.145172] [<ffffffff81553026>] fib_rules_lookup+0xc6/0x140 [ 80.145172] [<ffffffff81616374>] fib6_rule_lookup+0x44/0x80 [ 80.145172] [<ffffffff815ef320>] ? ip6_pol_route_input+0x30/0x30 [ 80.145172] [<ffffffff815edea3>] ip6_route_output+0x73/0xb0 [ 80.145172] [<ffffffff815dfdf3>] ip6_dst_lookup_tail+0x2c3/0x2e0 [ 80.145172] [<ffffffff813007b1>] ? list_del+0x11/0x40 [ 80.145172] [<ffffffff81082a4c>] ? remove_wait_queue+0x3c/0x50 [ 80.145172] [<ffffffff815dfe4d>] ip6_dst_lookup_flow+0x3d/0xa0 [ 80.145172] [<ffffffff815fda77>] rawv6_sendmsg+0x267/0xc20 [ 80.145172] [<ffffffff815a8a83>] inet_sendmsg+0x63/0xb0 [ 80.145172] [<ffffffff8128eb93>] ? selinux_socket_sendmsg+0x23/0x30 [ 80.145172] [<ffffffff815218d6>] sock_sendmsg+0xa6/0xd0 [ 80.145172] [<ffffffff81524a68>] SYSC_sendto+0x128/0x180 [ 80.145172] [<ffffffff8109825c>] ? update_curr+0xec/0x170 [ 80.145172] [<ffffffff81041d09>] ? kvm_clock_get_cycles+0x9/0x10 [ 80.145172] [<ffffffff810afd1e>] ? __getnstimeofday+0x3e/0xd0 [ 80.145172] [<ffffffff8152509e>] SyS_sendto+0xe/0x10 [ 80.145172] [<ffffffff8164efd9>] system_call_fastpath+0x16/0x1b [ 80.145172] Code: fe ff ff 41 f6 45 2a 06 0f 85 ca fe ff ff 49 8b 7e 08 4c 89 ee e8 94 ef ff ff e9 b9 fe ff ff 48 8b 82 28 05 00 00 e9 01 ff ff ff <0f> 0b 49 8b 54 24 30 0d 00 00 40 00 89 83 14 01 00 00 48 89 53 [ 80.145172] RIP [<ffffffff815f3b5d>] fib6_add+0x75d/0x830 [ 80.145172] RSP <ffff880118771798> [ 80.387413] ---[ end trace 02f20b7a8b81ed95 ]--- [ 80.390154] Kernel panic - not syncing: Fatal exception in interrupt Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-10 19:45:39 -07:00
Eliezer Tamir	64b0dc517e	net: rename busy poll socket op and globals Rename LL_SO to BUSY_POLL_SO Rename sysctl_net_ll_{read,poll} to sysctl_busy_{read,poll} Fix up users of these variables. Fix documentation for sysctl. a patch for the socket.7 man page will follow separately, because of limitations of my mail setup. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-10 17:08:27 -07:00
Eliezer Tamir	8b80cda536	net: rename ll methods to busy-poll Rename ndo_ll_poll to ndo_busy_poll. Rename sk_mark_ll to sk_mark_napi_id. Rename skb_mark_ll to skb_mark_napi_id. Correct all useres of these functions. Update comments and defines in include/net/busy_poll.h Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-10 17:08:27 -07:00
Eliezer Tamir	076bb0c82a	net: rename include/net/ll_poll.h to include/net/busy_poll.h Rename the file and correct all the places where it is included. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-10 17:08:27 -07:00
Trond Myklebust	eeee245268	SUNRPC: Fix a deadlock in rpc_client_register() Commit `384816051c` (SUNRPC: fix races on PipeFS MOUNT notifications) introduces a regression when we call rpc_setup_pipedir() with RPCSEC_GSS as the auth flavour. By calling rpcauth_create() while holding the sn->pipefs_sb_lock, we end up deadlocking in gss_pipes_dentries_create_net(). Fix is to register the client and release the mutex before calling rpcauth_create(). Reported-by: Weston Andros Adamson <dros@netapp.com> Tested-by: Weston Andros Adamson <dros@netapp.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> # : 3848160: SUNRPC: fix races on PipeFS MOUNT Cc: <stable@vger.kernel.org> # : e73f4cc: SUNRPC: split client creation Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-10 15:58:55 -04:00
Fengguang Wu	4f8568cb52	rpc_pipe: rpc_dir_inode_operations can be static Hi Jeff, FYI, there are new sparse warnings show up in tree: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git nfs-for-next head: 296afe1f58d55fd56ed85daaafafcfee39f59ece commit: `76fa666579` [2/5] rpc_pipe: set dentry operations at d_alloc time >> net/sunrpc/rpc_pipe.c:496:31: sparse: symbol 'rpc_dir_inode_operations' was not declared. Should it be static? Please consider folding the attached diff :-) Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-09 21:35:27 -04:00
Linus Torvalds	496322bc91	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "This is a re-do of the net-next pull request for the current merge window. The only difference from the one I made the other day is that this has Eliezer's interface renames and the timeout handling changes made based upon your feedback, as well as a few bug fixes that have trickeled in. Highlights: 1) Low latency device polling, eliminating the cost of interrupt handling and context switches. Allows direct polling of a network device from socket operations, such as recvmsg() and poll(). Currently ixgbe, mlx4, and bnx2x support this feature. Full high level description, performance numbers, and design in commit `0a4db187a9` ("Merge branch 'll_poll'") From Eliezer Tamir. 2) With the routing cache removed, ip_check_mc_rcu() gets exercised more than ever before in the case where we have lots of multicast addresses. Use a hash table instead of a simple linked list, from Eric Dumazet. 3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski, Marek Puzyniak, Michal Kazior, and Sujith Manoharan. 4) Support reporting the TUN device persist flag to userspace, from Pavel Emelyanov. 5) Allow controlling network device VF link state using netlink, from Rony Efraim. 6) Support GRE tunneling in openvswitch, from Pravin B Shelar. 7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from Daniel Borkmann and Eric Dumazet. 8) Allow controlling of TCP quickack behavior on a per-route basis, from Cong Wang. 9) Several bug fixes and improvements to vxlan from Stephen Hemminger, Pravin B Shelar, and Mike Rapoport. In particular, support receiving on multiple UDP ports. 10) Major cleanups, particular in the area of debugging and cookie lifetime handline, to the SCTP protocol code. From Daniel Borkmann. 11) Allow packets to cross network namespaces when traversing tunnel devices. From Nicolas Dichtel. 12) Allow monitoring netlink traffic via AF_PACKET sockets, in a manner akin to how we monitor real network traffic via ptype_all. From Daniel Borkmann. 13) Several bug fixes and improvements for the new alx device driver, from Johannes Berg. 14) Fix scalability issues in the netem packet scheduler's time queue, by using an rbtree. From Eric Dumazet. 15) Several bug fixes in TCP loss recovery handling, from Yuchung Cheng. 16) Add support for GSO segmentation of MPLS packets, from Simon Horman. 17) Make network notifiers have a real data type for the opaque pointer that's passed into them. Use this to properly handle network device flag changes in arp_netdev_event(). From Jiri Pirko and Timo Teräs. 18) Convert several drivers over to module_pci_driver(), from Peter Huewe. 19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a O(1) calculation instead. From Eric Dumazet. 20) Support setting of explicit tunnel peer addresses in ipv6, just like ipv4. From Nicolas Dichtel. 21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet. 22) Prevent a single high rate flow from overruning an individual cpu during RX packet processing via selective flow shedding. From Willem de Bruijn. 23) Don't use spinlocks in TCP md5 signing fast paths, from Eric Dumazet. 24) Don't just drop GSO packets which are above the TBF scheduler's burst limit, chop them up so they are in-bounds instead. Also from Eric Dumazet. 25) VLAN offloads are missed when configured on top of a bridge, fix from Vlad Yasevich. 26) Support IPV6 in ping sockets. From Lorenzo Colitti. 27) Receive flow steering targets should be updated at poll() time too, from David Majnemer. 28) Fix several corner case regressions in PMTU/redirect handling due to the routing cache removal, from Timo Teräs. 29) We have to be mindful of ipv4 mapped ipv6 sockets in upd_v6_push_pending_frames(). From Hannes Frederic Sowa. 30) Fix L2TP sequence number handling bugs, from James Chapman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits) drivers/net: caif: fix wrong rtnl_is_locked() usage drivers/net: enic: release rtnl_lock on error-path vhost-net: fix use-after-free in vhost_net_flush net: mv643xx_eth: do not use port number as platform device id net: sctp: confirm route during forward progress virtio_net: fix race in RX VQ processing virtio: support unlocked queue poll net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit Documentation: Fix references to defunct linux-net@vger.kernel.org net/fs: change busy poll time accounting net: rename low latency sockets functions to busy poll bridge: fix some kernel warning in multicast timer sfc: Fix memory leak when discarding scattered packets sit: fix tunnel update via netlink dt:net:stmmac: Add dt specific phy reset callback support. dt:net:stmmac: Add support to dwmac version 3.610 and 3.710 dt:net:stmmac: Allocate platform data only if its NULL. net:stmmac: fix memleak in the open method ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available net: ipv6: fix wrong ping_v6_sendmsg return value ...	2013-07-09 18:24:39 -07:00
Jeff Layton	76fa666579	rpc_pipe: set dentry operations at d_alloc time Currently the way these get set is a little convoluted. If the dentry is allocated via lookup from userland, then it gets set by simple_lookup. If it gets allocated when the kernel is populating the directory, then it gets set via __rpc_lookup_create_exclusive, which has to check whether they might already be set. Between both of these, this ensures that all dentries have their d_op pointer set. Instead of doing that, just have them set at d_alloc time by pointing sb->s_d_op at them. With that change, we no longer want the lookup op to set them, so we must move to using our own lookup routine. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-07-09 17:16:39 -04:00
Linus Torvalds	899dd38885	Grab bag of little fixes and enhancements: * optional security enhancements * fix path coverage in MAINTAINERS * switch to using most used protocol and transport as default * clean up buffer dumps in trace code Held off on RDMA patches as they need to be cleaned up a bit, but will try to get the cleaned, checked, and pushed by mid-week. (attempt 2, hopefully this one won't screw up the history) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: GPGTools - http://gpgtools.org iQIcBAABAgAGBQJR2iZUAAoJEDZk62b0Tg6xsfQP/i3cYmkpf58lb++WoWDohQdh iH34P6Tv+5AKcF5SViBFDyXsdkE0D/Ixzl/E6jTsx+6OTSCA0eIw4OYyvPQpzFyp 1+RqnTyEq6v2SQaGZKW7k7NyXDiRhVypXBupuNq8eZpYKS8B3cKdnQ/WFSAXcxQ1 sbKWKUWnnqIZYnRNqNK4LTxz9cbLovXIQOYBhn0F+NoAFinC1ZQrWzuUVbct880i cSoukTivmJHb37Pt9AKluPc6GGa6XHXkomQewh0WOnBJ/9FR3YUHeRXR04cnAWAL zpGKagnIhYWtdaTJQXCzO2OMCQakhf9FiBWYGjfM9ysyzS4LDp1cknlyUPox97xF o9o6MfFF161c8+uC/RpK8Lp3vG6CFPEcMVxp73BydNNI4/1hzbfCs3WcGdpkvAg/ rRik/zyN7l3jEwtvU03Y1WEV79Ep/Q8cvPqi4XZB2L1XYi43fT4yze6zMM/cmQ5K DLTbFxtN5ILWg2LjQergORyn66WqQjproPqcgd9tVrvJ30Z5KPjIh+CBVcYPWp4V hxD0Pd0yTySpxUqV4Qx/BMZdWiD1wuBgidKgl+jNldTaCSFtPqQ52LYmTWNpneI1 lcc3SMFRNRhqWMOFhzpcX1xGuXKD5eRiOrQ+L1ecFxGFYVndY5nwa6Pn8gUrfGHW LEBmADtMsv2YQW2Kahk2 =ktVU -----END PGP SIGNATURE----- Merge tag 'for-linus-3.11-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull 9p update from Eric Van Hensbergen: "Grab bag of little fixes and enhancements: - optional security enhancements - fix path coverage in MAINTAINERS - switch to using most used protocol and transport as default - clean up buffer dumps in trace code Held off on RDMA patches as they need to be cleaned up a bit, but will try to get the cleaned, checked, and pushed by mid-week" * tag 'for-linus-3.11-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: Add rest of 9p files to MAINTAINERS entry 9p: trace: use %*ph to dump buffer net/9p: Handle error in zero copy request correctly for 9p2000.u net/9p: Use virtio transpart as the default transport net/9p: Make 9P2000.L the default protocol for 9p file system	2013-07-09 12:55:13 -07:00
Daniel Borkmann	8c2f414ad1	net: sctp: confirm route during forward progress This fix has been proposed originally by Vlad Yasevich. He says: When SCTP makes forward progress (receives a SACK that acks new chunks, renegs, or answeres 0-window probes) or when HB-ACK arrives, mark the route as confirmed so we don't unnecessarily send NUD probes. Having a simple SCTP client/server that exchange data chunks every 1sec, without this patch ARP requests are sent periodically every 40-60sec. With this fix applied, an ARP request is only done once right at the "session" beginning. Also, when clearing the related ARP cache entry manually during the session, a new request is correctly done. I have only "backported" this to net-next and tested that it works, so full credit goes to Vlad. Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-09 12:49:56 -07:00
Linus Torvalds	9a5889ae1c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "There is some follow-on RBD cleanup after the last window's code drop, a series from Yan fixing multi-mds behavior in cephfs, and then a sprinkling of bug fixes all around. Some warnings, sleeping while atomic, a null dereference, and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits) libceph: fix invalid unsigned->signed conversion for timespec encoding libceph: call r_unsafe_callback when unsafe reply is received ceph: fix race between cap issue and revoke ceph: fix cap revoke race ceph: fix pending vmtruncate race ceph: avoid accessing invalid memory libceph: Fix NULL pointer dereference in auth client code ceph: Reconstruct the func ceph_reserve_caps. ceph: Free mdsc if alloc mdsc->mdsmap failed. ceph: remove sb_start/end_write in ceph_aio_write. ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL. ceph: fix sleeping function called from invalid context. ceph: move inode to proper flushing list when auth MDS changes rbd: fix a couple warnings ceph: clear migrate seq when MDS restarts ceph: check migrate seq before changing auth cap ceph: fix race between page writeback and truncate ceph: reset iov_len when discarding cap release messages ceph: fix cap release race libceph: fix truncate size calculation ...	2013-07-09 12:39:10 -07:00
Linus Torvalds	be0c5d8c0b	NFS client updates for Linux 3.11 Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR2vsSAAoJEGcL54qWCgDyWBIP/AqlpBBAblxbNQ1Bl/0m1Pdb iKH961qgM4U1BzK0svGtHTZqkovpm4o/VbkbKBT5mQ4g6SbbsJ/AsS1plCyfnIZi bdnKNJyj6zg0NsAkJ3vKWqd4BTaP+icdSfEIlRKQxAPESewN7b5B3OWgY4KdYmnk q5BP25anC1ryxVycSY67ux8S2IKXVSRZeCZv+RO21rvZ2G0bV5y7t8Om28ztxEnU RKrHgQHgaaktR7i8QVO0sbiWq3iqLa3GPkUvFLwWGr8PQJtTkYY0QwYSrsV3N4rY hYpMRUZFHpZ8UG5YvBT6xyOy/XaGwMGKSfZjB9/YG4QVju+tTy50U1JbTil5PEWY GHWYF68aurIeUkXrhSv8AVnOnhir0mISx5ou/SV7p0QoAZ92V6kq+LkPrW520qlc z8ILh3j28pN3ZUCIEArcaZhYCt48uO2hwBi5TqevQyyGRsXFGbN1moD5jvHkllft Fi0XGuCBdvhrzFRZcsEl+PDq7fT8lXUK2BHe8oR5jz9PhUp+jpEl9m/eg3RsjJjN DuxsHye2U4chScdnRtLBQvpFtdINvWX/Gy8Bi7kdE5tsQySvOa+rdwuBc7h88PHC +4xI2iX3z4O1+GpsAe/T9+pjW689jEilS+eVDRVEGl6yHGn9q8PYOayjPjwbJHxS R2mLTRhKu1DKguTzO13f =wGjn -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation Please note that Labeled NFS does require some additional support from the security subsystem. The relevant changesets have all been reviewed and acked by James Morris." * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits) NFS: Set NFS_CS_MIGRATION for NFSv4 mounts NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs nfs: have NFSv3 try server-specified auth flavors in turn nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it nfs: move server_authlist into nfs_try_mount_request nfs: refactor "need_mount" code out of nfs_try_mount SUNRPC: PipeFS MOUNT notification optimization for dying clients SUNRPC: split client creation routine into setup and registration SUNRPC: fix races on PipeFS UMOUNT notifications SUNRPC: fix races on PipeFS MOUNT notifications NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize NFS: Improve legacy idmapping fallback NFSv4.1 end back channel session draining NFS: Apply v4.1 capabilities to v4.2 NFSv4.1: Clean up layout segment comparison helper names NFSv4.1: layout segment comparison helpers should take 'const' parameters NFSv4: Move the DNS resolver into the NFSv4 module rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set ...	2013-07-09 12:09:43 -07:00
Eliezer Tamir	cbf55001b2	net: rename low latency sockets functions to busy poll Rename functions in include/net/ll_poll.h to busy wait. Clarify documentation about expected power use increase. Rename POLL_LL to POLL_BUSY_LOOP. Add need_resched() testing to poll/select busy loops. Note, that in select and poll can_busy_poll is dynamic and is updated continuously to reflect the existence of supported sockets with valid queue information. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-08 19:25:45 -07:00
J. Bruce Fields	0979292bfa	svcrpc: fix failures to handle -1 uid's As of `f025adf191` "sunrpc: Properly decode kuids and kgids in RPC_AUTH_UNIX credentials" any rpc containing a -1 (0xffff) uid or gid would fail with a badcred error. Commit `afe3c3fd53` "svcrpc: fix failures to handle -1 uid's and gid's" fixed part of the problem, but overlooked the gid upcall--the kernel can request supplementary gid's for the -1 uid, but mountd's attempt write a response will get -EINVAL. Symptoms were nfsd failing to reply to the first attempt to use a newly negotiated krb5 context. Reported-by: Sven Geggus <lists@fuchsschwanzdomain.de> Tested-by: Sven Geggus <lists@fuchsschwanzdomain.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-08 17:27:23 -04:00
Simon Derr	80b45261a0	9P: Add cancelled() to the transport functions. RDMA needs to post a buffer for each incoming reply. Hence it needs to keep count of these and needs to be aware of whether a flushed request has received a reply or not. This patch adds the cancelled() callback to the transport modules. It is called when RFLUSH has been received and that the corresponding request will never receive a reply. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:18:18 -05:00
Simon Derr	1cff33069a	9P/RDMA: count posted buffers without a pending request In rdma_request(): If an error occurs between posting the recv and the send, there will be a reply context posted without a pending request. Since there is no way to "un-post" it, we remember it and skip post_recv() for the next request. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:04:36 -05:00
Simon Derr	2f52d07cb7	9P/RDMA: Improve error handling in rdma_request Most importantly: - do not free the recv context (rpl_context) after a successful post_recv() - but do free the send context (c) after a failed send. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:29 -05:00
Simon Derr	b530e252e2	9P/RDMA: Do not free req->rc in error handling in rdma_request() rdma_request() should never be in charge of freeing rc. When an error occurs: * Either the rc buffer has been recv_post()'ed. then kfree()'ing it certainly is a bad idea. * Or is has not, and in that case req->rc still points to it, hence it needs not be freed. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:29 -05:00
Simon Derr	fd453d0ed6	9P/RDMA: Use a semaphore to protect the RQ The current code keeps track of the number of buffers posted in the RQ, and will prevent it from overflowing. But it does so by simply dropping post requests (And leaking memory in the process). When this happens there will actually be too few buffers posted, and soon the 9P server will complain about 'RNR retry counter exceeded' errors. Instead, use a semaphore, and block until the RQ is ready for another buffer to be posted. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:29 -05:00
Simon Derr	47229ff85e	9P/RDMA: Protect against duplicate replies A well-behaved server would not send twice the reply to a request. But if it ever happens... This additional check prevents the kernel from leaking memory and possibly more nasty consequences in that unlikely event. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:28 -05:00
Simon Derr	3fcc62f4e8	9P/RDMA: increase P9_RDMA_MAXSIZE to 1MB The current value is too low to get good performance. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:28 -05:00
Simon Derr	5387320d48	9pnet: refactor struct p9_fcall alloc code Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:27 -05:00
Simon Derr	17b6fd9d6d	9P/RDMA: rdma_request() needs not allocate req->rc p9_tag_alloc() takes care of that. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:27 -05:00
Simon Derr	ea071aa136	9P: Fix fcall allocation for rdma The current code assumes that when a request in the request array does have a tc, it also has a rc. This is normally true, but not always : when using RDMA, req->rc will temporarily be set to NULL after the request has been sent. That is usually OK though, as when the reply arrives, req->rc will be reassigned to a sane value before the request is recycled. But there is a catch : if the request is flushed, the reply will never arrive, and req->rc will be NULL, but not req->tc. This patch fixes p9_tag_alloc to take this into account. Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 22:02:26 -05:00
Jim Garlick	2f28c8b31d	net/9p: add privport option to 9p tcp transport If the privport option is specified, the tcp transport binds local address to a reserved port before connecting to the 9p server. In some cases when 9P AUTH cannot be implemented, this is better than nothing. Signed-off-by: Jim Garlick <garlick@llnl.gov> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2013-07-07 21:59:54 -05:00
Cong Wang	c7e8e8a8f7	bridge: fix some kernel warning in multicast timer Several people reported the warning: "kernel BUG at kernel/timer.c:729!" and the stack trace is: #7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905 #8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge] #9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge] #10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge] #11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge] #12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc #13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6 #14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad #15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17 #16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68 #17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101 #18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8 #19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun] #20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun] #21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1 #22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe #23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f #24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1 #25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292 this is due to I forgot to check if mp->timer is armed in br_multicast_del_pg(). This bug is introduced by commit `9f00b2e7cf` (bridge: only expire the mdb entry when query is received). Same for __br_mdb_del(). Tested-by: poma <pomidorabelisima@gmail.com> Reported-by: LiYonghua <809674045@qq.com> Reported-by: Robert Hancock <hancockrwd@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-06 18:12:47 -07:00
Nicolas Dichtel	86bd68bfd7	sit: fix tunnel update via netlink The device can stand in another netns, hence we need to do the lookup in netns tunnel->net. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-04 14:55:47 -07:00
Linus Torvalds	3366dd9fa8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: - HID battery handling cleanup by David Herrmann - ELO 4000/4500 driver, which has been finally ported to be proper HID driver by Jiri Slaby - ps3remote driver functionality is now provided by generic sony driver, by Jiri Kosina - PS2/3 Buzz controllers support, by Colin Leitner - rework of wiimote driver including full extensions hotpluggin support, sub-device modularization and speaker support by David Herrmann * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (55 commits) HID: wacom: Intuos4 battery charging changes HID: i2c-hid: support sending HID output reports using the output register HID: kye: Add report fixup for Genius Gila Gaming mouse HID: wiimote: support Nintendo Wii U Pro Controller Input: make gamepad API keycodes more clear input: document gamepad API and add extra keycodes HID: explain out-of-range check better HID: fix false positive out of range values HID: wiimote: fix coccinelle warnings HID: roccat: check cdev_add return value HID: fold ps3remote driver into generic Sony driver HID: hyperv: convert alloc+memcpy to memdup HID: core: fix reporting of raw events HID: wiimote: discard invalid EXT data reports HID: wiimote: fix classic controller parsing HID: wiimote: init EXT/MP during device detection HID: wiimote: fix DRM debug-attr to correctly parse input HID: wiimote: add MP quirks HID: wiimote: remove old static extension support HID: wiimote: add "bboard_calib" attribute ...	2013-07-04 11:39:00 -07:00
Jiri Kosina	db58316892	Merge branches 'for-3.11/battery', 'for-3.11/elo', 'for-3.11/holtek' and 'for-3.11/i2c-hid-fixed' into for-linus	2013-07-04 15:01:01 +02:00
Hannes Frederic Sowa	3630d40067	ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available After the removal of rt->n we do not create a neighbour entry at route insertion time (rt6_bind_neighbour is gone). As long as no neighbour is created because of "useful traffic" we skip this routing entry because rt6_check_neigh cannot pick up a valid neighbour (neigh == NULL) and thus returns false. This change was introduced by commit `887c95cc1d` ("ipv6: Complete neighbour entry removal from dst_entry.") To quote RFC4191: "If the host has no information about the router's reachability, then the host assumes the router is reachable." and also: "A host MUST NOT probe a router's reachability in the absence of useful traffic that the host would have sent to the router if it were reachable." So, just assume the router is reachable and let's rt6_probe do the rest. We don't need to create a neighbour on route insertion time. If we don't compile with CONFIG_IPV6_ROUTER_PREF (RFC4191 support) a neighbour is only valid if its nud_state is NUD_VALID. I did not find any references that we should probe the router on route insertion time via the other RFCs. So skip this route in that case. v2: a) use IS_ENABLED instead of #ifdefs (thanks to Sergei Shtylyov) Reported-by: Pierre Emeriaud <petrus.lt@gmail.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 17:50:51 -07:00
Lorenzo Colitti	fbfe80c890	net: ipv6: fix wrong ping_v6_sendmsg return value ping_v6_sendmsg currently returns 0 on success. It should return the number of bytes written instead. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 17:42:05 -07:00
Lorenzo Colitti	a1bdc45580	net: ipv6: add missing lock in ping_v6_sendmsg Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 17:40:58 -07:00
Linus Torvalds	7f0ef0267e	Merge branch 'akpm' (updates from Andrew Morton) Merge first patch-bomb from Andrew Morton: - various misc bits - I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been distracted. There has been quite a bit of activity. - About half the MM queue - Some backlight bits - Various lib/ updates - checkpatch updates - zillions more little rtc patches - ptrace - signals - exec - procfs - rapidio - nbd - aoe - pps - memstick - tools/testing/selftests updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits) tools/testing/selftests: don't assume the x bit is set on scripts selftests: add .gitignore for kcmp selftests: fix clean target in kcmp Makefile selftests: add .gitignore for vm selftests: add hugetlbfstest self-test: fix make clean selftests: exit 1 on failure kernel/resource.c: remove the unneeded assignment in function __find_resource aio: fix wrong comment in aio_complete() drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode drivers/memstick/host/r592.c: convert to module_pci_driver drivers/memstick/host/jmb38x_ms: convert to module_pci_driver pps-gpio: add device-tree binding and support drivers/pps/clients/pps-gpio.c: convert to module_platform_driver drivers/pps/clients/pps-gpio.c: convert to devm_* helpers drivers/parport/share.c: use kzalloc Documentation/accounting/getdelays.c: avoid strncpy in accounting tool aoe: update internal version number to v83 aoe: update copyright date aoe: perform I/O completions in parallel ...	2013-07-03 17:12:13 -07:00
Eric Dumazet	36b7bfe09b	netem: fix possible NULL deref in netem_dequeue() commit `aec0a40a6f` ("netem: use rb tree to implement the time queue") added a regression if a child qdisc is attached to netem, as we perform a NULL dereference. Fix this by adding a temporary variable to cache netem_skb_cb(skb)->time_to_send. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 16:52:10 -07:00
Joe Stringer	4bc41b84e9	core: Copy inner_protocol in copy_skb_header() inner_protocol was added to struct sk_buff in `0d89d2035f` ("MPLS: Add limited GSO support"), which is scheduled to be included in v3.11. That patch did not update __copy_skb_header to copy the inner_protocol. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 16:52:10 -07:00
Kees Cook	f170168b9a	drivers: avoid parsing names as kthread_run() format strings Calling kthread_run with a single name parameter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:41 -07:00
Kees Cook	d8537548c9	drivers: avoid format strings in names passed to alloc_workqueue() For the workqueue creation interfaces that do not expect format strings, make sure they cannot accidently be parsed that way. Additionally, clean up calls made with a single parameter that would be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:41 -07:00
Jiang Liu	0ed5fd1385	mm: use totalram_pages instead of num_physpages at runtime The global variable num_physpages is scheduled to be removed, so use totalram_pages instead of num_physpages at runtime. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:35 -07:00
Yan, Zheng	61c5d6bf70	libceph: call r_unsafe_callback when unsafe reply is received We can't use !req->r_sent to check if OSD request is sent for the first time, this is because __cancel_request() zeros req->r_sent when OSD map changes. Rather than adding a new variable to struct ceph_osd_request to indicate if it's sent for the first time, We can call the unsafe callback only when unsafe OSD reply is received. If OSD's first reply is safe, just skip calling the unsafe callback. The purpose of unsafe callback is adding unsafe request to a list, so that fsync(2) can wait for the safe reply. fsync(2) doesn't need to wait for a write(2) that hasn't returned yet. So it's OK to add request to the unsafe list when the first OSD reply is received. (ceph_sync_write() returns after receiving the first OSD reply) Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-07-03 15:32:58 -07:00
Tyler Hicks	2cb33cac62	libceph: Fix NULL pointer dereference in auth client code A malicious monitor can craft an auth reply message that could cause a NULL function pointer dereference in the client's kernel. To prevent this, the auth_none protocol handler needs an empty ceph_auth_client_ops->build_request() function. CVE-2013-1059 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Reported-by: Chanam Park <chanam.park@hkpco.kr> Reviewed-by: Seth Arnold <seth.arnold@canonical.com> Reviewed-by: Sage Weil <sage@inktank.com> Cc: stable@vger.kernel.org	2013-07-03 15:32:55 -07:00
Yan, Zheng	ccca4e37b1	libceph: fix truncate size calculation check the "not truncated yet" case Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-07-03 15:32:45 -07:00
Yan, Zheng	eb845ff13a	libceph: fix safe completion handle_reply() calls complete_request() only if the first OSD reply has ONDISK flag. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-07-03 15:32:44 -07:00
Alex Elder	4974341eb9	libceph: print more info for short message header If an osd client response message arrives that has a front section that's too big for the buffer set aside to receive it, a warning gets reported and a new buffer is allocated. The warning says nothing about which connection had the problem. Add the peer type and number to what gets reported, to be a bit more informative. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-07-03 15:32:40 -07:00
Alex Elder	96e4dac66f	libceph: add lingering request reference when registered When an osd request is set to linger, the osd client holds onto the request so it can be re-submitted following certain osd map changes. The osd client holds a reference to the request until it is unregistered. This is used by rbd for watch requests. Currently, the reference is taken when the request is marked with the linger flag. This means that if an error occurs after that time but before the the request completes successfully, that reference is leaked. There's really no reason to take the reference until the request is registered in the the osd client's list of lingering requests, and that only happens when the lingering (watch) request completes successfully. So take that reference only when it gets registered following succesful completion, and drop it (as before) when the request gets unregistered. This avoids the reference problem on error in rbd. Rearrange ceph_osdc_unregister_linger_request() to avoid using the request pointer after it may have been freed. And hold an extra reference in kick_requests() while handling a linger request that has not yet been registered, to ensure it doesn't go away. This resolves: http://tracker.ceph.com/issues/3859 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-07-03 15:32:37 -07:00
David S. Miller	0c1072ae02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/freescale/fec_main.c drivers/net/ethernet/renesas/sh_eth.c net/ipv4/gre.c The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list) and the splitting of the gre.c code into seperate files. The FEC conflict was two sets of changes adding ethtool support code in an "!CONFIG_M5272" CPP protected block. Finally the sh_eth.c conflict was between one commit add bits set in the .eesr_err_check mask whilst another commit removed the .tx_error_check member and assignments. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 14:55:13 -07:00
Daniel Borkmann	c50cd35788	net: gre: move GSO functions to gre_offload Similarly to TCP/UDP offloading, move all related GRE functions to gre_offload.c to make things more explicit and similar to the rest of the code. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 14:37:39 -07:00
Linus Torvalds	f991fae5c6	Power management and ACPI updates for 3.11-rc1 - Hotplug changes allowing device hot-removal operations to fail gracefully (instead of crashing the kernel) if they cannot be carried out completely. From Rafael J Wysocki and Toshi Kani. - Freezer update from Colin Cross and Mandeep Singh Baines targeted at making the freezing of tasks a bit less heavy weight operation. - cpufreq resume fix from Srivatsa S Bhat for a regression introduced during the 3.10 cycle causing some cpufreq sysfs attributes to return wrong values to user space after resume. - New freqdomain_cpus sysfs attribute for the acpi-cpufreq driver to provide information previously available via related_cpus from Lan Tianyu. - cpufreq fixes and cleanups from Viresh Kumar, Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia, Arnd Bergmann, and Tang Yuantian. - Fix for an ACPICA regression causing suspend/resume issues to appear on some systems introduced during the 3.4 development cycle from Lv Zheng. - ACPICA fixes and cleanups from Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and Zhang Rui. - New cupidle driver for Xilinx Zynq processors from Michal Simek. - cpuidle fixes and cleanups from Daniel Lezcano. - Changes to make suspend/resume work correctly in Xen guests from Konrad Rzeszutek Wilk. - ACPI device power management fixes and cleanups from Fengguang Wu and Rafael J Wysocki. - ACPI documentation updates from Lv Zheng, Aaron Lu and Hanjun Guo. - Fix for the IA-64 issue that was the reason for reverting commit `9f29ab1` and updates of the ACPI scan code from Rafael J Wysocki. - Mechanism for adding CMOS RTC address space handlers from Lan Tianyu (to allow some EC-related breakage to be fixed on some systems). - Spec-compliant implementation of acpi_os_get_timer() from Mika Westerberg. - Modification of do_acpi_find_child() to execute _STA in order to to avoid situations in which a pointer to a disabled device object is returned instead of an enabled one with the same _ADR value. From Jeff Wu. - Intel BayTrail PCH (Platform Controller Hub) support for the ACPI Intel Low-Power Subsystems (LPSS) driver and modificaions of that driver to work around a couple of known BIOS issues from Mika Westerberg and Heikki Krogerus. - EC driver fix from Vasiliy Kulikov to make it use get_user() and put_user() instead of dereferencing user space pointers blindly. - Assorted ACPI code cleanups from Bjorn Helgaas, Nicholas Mazzuca and Toshi Kani. - Modification of the "runtime idle" helper routine to take the return values of the callbacks executed by it into account and to call rpm_suspend() if they return 0, which allows some code bloat reduction to be done, from Rafael J Wysocki and Alan Stern. - New trace points for PM QoS from Sahara <keun-o.park@windriver.com>. - PM QoS documentation update from Lan Tianyu. - Assorted core PM code cleanups and changes from Bernie Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan. - New devfreq driver for the Exynos5-bus device from Abhilash Kesavan. - Minor devfreq cleanups, fixes and MAINTAINERS update from MyungJoo Ham, Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun. - OMAP Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver updates from Andrii Tseglytskyi and Nishanth Menon. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJR0ZNOAAoJEKhOf7ml8uNsDLYP/0EU4rmvw0TWTITfp6RS1KDE 9GwBn96ZR4Q5bJd9gBCTPSqhHOYMqxWEUp99sn/M2wehG1pk/jw5LO56+2IhM3UZ g1HDcJ7te2nVT/iXsKiAGTVhU9Rk0aYwoVSknwk27qpIBGxW9w/s5tLX8pY3Q3Zq wL/7aTPjyL+PFFFEaxgH7qLqsl3DhbtYW5AriUBTkXout/tJ4eO1b7MNBncLDh8X VQ/0DNCKE95VEJfkO4rk9RKUyVp9GDn0i+HXCD/FS4IA5oYzePdVdNDmXf7g+swe CGlTZq8pB+oBpDiHl4lxzbNrKQjRNbGnDUkoRcWqn0nAw56xK+vmYnWJhW99gQ/I fKnvxeLca5po1aiqmC4VSJxZIatFZqLrZAI4dzoCLWY+bGeTnCKmj0/F8ytFnZA2 8IuLLs7/dFOaHXV/pKmpg6FAlFa9CPxoqRFoyqb4M0GjEarADyalXUWsPtG+6xCp R/p0CISpwk+guKZR/qPhL7M654S7SHrPwd2DPF0KgGsvk+G2GhoB8EzvD8BVp98Z 9siCGCdgKQfJQVI6R0k9aFmn/4gRQIAgyPhkhv9tqULUUkiaXki+/t8kPfnb8O/d zep+CA57E2G8MYLkDJfpFeKS7GpPD6TIdgFdGmOUC0Y6sl9iTdiw4yTx8O2JM37z rHBZfYGkJBrbGRu+Q1gs =VBBq -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "This time the total number of ACPI commits is slightly greater than the number of cpufreq commits, but Viresh Kumar (who works on cpufreq) remains the most active patch submitter. To me, the most significant change is the addition of offline/online device operations to the driver core (with the Greg's blessing) and the related modifications of the ACPI core hotplug code. Next are the freezer updates from Colin Cross that should make the freezing of tasks a bit less heavy weight. We also have a couple of regression fixes, a number of fixes for issues that have not been identified as regressions, two new drivers and a bunch of cleanups all over. Highlights: - Hotplug changes to support graceful hot-removal failures. It sometimes is necessary to fail device hot-removal operations gracefully if they cannot be carried out completely. For example, if memory from a memory module being hot-removed has been allocated for the kernel's own use and cannot be moved elsewhere, it's desirable to fail the hot-removal operation in a graceful way rather than to crash the kernel, but currenty a success or a kernel crash are the only possible outcomes of an attempted memory hot-removal. Needless to say, that is not a very attractive alternative and it had to be addressed. However, in order to make it work for memory, I first had to make it work for CPUs and for this purpose I needed to modify the ACPI processor driver. It's been split into two parts, a resident one handling the low-level initialization/cleanup and a modular one playing the actual driver's role (but it binds to the CPU system device objects rather than to the ACPI device objects representing processors). That's been sort of like a live brain surgery on a patient who's riding a bike. So this is a little scary, but since we found and fixed a couple of regressions it caused to happen during the early linux-next testing (a month ago), nobody has complained. As a bonus we remove some duplicated ACPI hotplug code, because the ACPI-based CPU hotplug is now going to use the common ACPI hotplug code. - Lighter weight freezing of tasks. These changes from Colin Cross and Mandeep Singh Baines are targeted at making the freezing of tasks a bit less heavy weight operation. They reduce the number of tasks woken up every time during the freezing, by using the observation that the freezer simply doesn't need to wake up some of them and wait for them all to call refrigerator(). The time needed for the freezer to decide to report a failure is reduced too. Also reintroduced is the check causing a lockdep warining to trigger when try_to_freeze() is called with locks held (which is generally unsafe and shouldn't happen). - cpufreq updates First off, a commit from Srivatsa S Bhat fixes a resume regression introduced during the 3.10 cycle causing some cpufreq sysfs attributes to return wrong values to user space after resume. The fix is kind of fresh, but also it's pretty obvious once Srivatsa has identified the root cause. Second, we have a new freqdomain_cpus sysfs attribute for the acpi-cpufreq driver to provide information previously available via related_cpus. From Lan Tianyu. Finally, we fix a number of issues, mostly related to the CPUFREQ_POSTCHANGE notifier and cpufreq Kconfig options and clean up some code. The majority of changes from Viresh Kumar with bits from Jacob Shin, Heiko Stübner, Xiaoguang Chen, Ezequiel Garcia, Arnd Bergmann, and Tang Yuantian. - ACPICA update A usual bunch of updates from the ACPICA upstream. During the 3.4 cycle we introduced support for ACPI 5 extended sleep registers, but they are only supposed to be used if the HW-reduced mode bit is set in the FADT flags and the code attempted to use them without checking that bit. That caused suspend/resume regressions to happen on some systems. Fix from Lv Zheng causes those registers to be used only if the HW-reduced mode bit is set. Apart from this some other ACPICA bugs are fixed and code cleanups are made by Bob Moore, Tomasz Nowicki, Lv Zheng, Chao Guan, and Zhang Rui. - cpuidle updates New driver for Xilinx Zynq processors is added by Michal Simek. Multidriver support simplification, addition of some missing kerneldoc comments and Kconfig-related fixes come from Daniel Lezcano. - ACPI power management updates Changes to make suspend/resume work correctly in Xen guests from Konrad Rzeszutek Wilk, sparse warning fix from Fengguang Wu and cleanups and fixes of the ACPI device power state selection routine. - ACPI documentation updates Some previously missing pieces of ACPI documentation are added by Lv Zheng and Aaron Lu (hopefully, that will help people to uderstand how the ACPI subsystem works) and one outdated doc is updated by Hanjun Guo. - Assorted ACPI updates We finally nailed down the IA-64 issue that was the reason for reverting commit `9f29ab11dd` ("ACPI / scan: do not match drivers against objects having scan handlers"), so we can fix it and move the ACPI scan handler check added to the ACPI video driver back to the core. A mechanism for adding CMOS RTC address space handlers is introduced by Lan Tianyu to allow some EC-related breakage to be fixed on some systems. A spec-compliant implementation of acpi_os_get_timer() is added by Mika Westerberg. The evaluation of _STA is added to do_acpi_find_child() to avoid situations in which a pointer to a disabled device object is returned instead of an enabled one with the same _ADR value. From Jeff Wu. Intel BayTrail PCH (Platform Controller Hub) support is added to the ACPI driver for Intel Low-Power Subsystems (LPSS) and that driver is modified to work around a couple of known BIOS issues. Changes from Mika Westerberg and Heikki Krogerus. The EC driver is fixed by Vasiliy Kulikov to use get_user() and put_user() instead of dereferencing user space pointers blindly. Code cleanups are made by Bjorn Helgaas, Nicholas Mazzuca and Toshi Kani. - Assorted power management updates The "runtime idle" helper routine is changed to take the return values of the callbacks executed by it into account and to call rpm_suspend() if they return 0, which allows us to reduce the overall code bloat a bit (by dropping some code that's not necessary any more after that modification). The runtime PM documentation is updated by Alan Stern (to reflect the "runtime idle" behavior change). New trace points for PM QoS are added by Sahara (<keun-o.park@windriver.com>). PM QoS documentation is updated by Lan Tianyu. Code cleanups are made and minor issues are addressed by Bernie Thompson, Bjorn Helgaas, Julius Werner, and Shuah Khan. - devfreq updates New driver for the Exynos5-bus device from Abhilash Kesavan. Minor cleanups, fixes and MAINTAINERS update from MyungJoo Ham, Abhilash Kesavan, Paul Bolle, Rajagopal Venkat, and Wei Yongjun. - OMAP power management updates Adaptive Voltage Scaling (AVS) SmartReflex voltage control driver updates from Andrii Tseglytskyi and Nishanth Menon." * tag 'pm+acpi-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits) cpufreq: Fix cpufreq regression after suspend/resume ACPI / PM: Fix possible NULL pointer deref in acpi_pm_device_sleep_state() PM / Sleep: Warn about system time after resume with pm_trace cpufreq: don't leave stale policy pointer in cdbs->cur_policy acpi-cpufreq: Add new sysfs attribute freqdomain_cpus cpufreq: make sure frequency transitions are serialized ACPI: implement acpi_os_get_timer() according the spec ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan ACPI: Add CMOS RTC Operation Region handler support ACPI / processor: Drop unused variable from processor_perflib.c cpufreq: tegra: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: s3c64xx: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: omap: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: imx6q: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: exynos: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: dbx500: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: davinci: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: arm-big-little: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: powernow-k8: call CPUFREQ_POSTCHANGE notfier in error cases cpufreq: pcc: call CPUFREQ_POSTCHANGE notfier in error cases ...	2013-07-03 14:35:40 -07:00
Linus Torvalds	790eac5640	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second set of VFS changes from Al Viro: "Assorted f_pos race fixes, making do_splice_direct() safe to call with i_mutex on parent, O_TMPFILE support, Jeff's locks.c series, ->d_hash/->d_compare calling conventions changes from Linus, misc stuff all over the place." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) Document ->tmpfile() ext4: ->tmpfile() support vfs: export lseek_execute() to modules lseek_execute() doesn't need an inode passed to it block_dev: switch to fixed_size_llseek() cpqphp_sysfs: switch to fixed_size_llseek() tile-srom: switch to fixed_size_llseek() proc_powerpc: switch to fixed_size_llseek() ubi/cdev: switch to fixed_size_llseek() pci/proc: switch to fixed_size_llseek() isapnp: switch to fixed_size_llseek() lpfc: switch to fixed_size_llseek() locks: give the blocked_hash its own spinlock locks: add a new "lm_owner_key" lock operation locks: turn the blocked_list into a hashtable locks: convert fl_link to a hlist_node locks: avoid taking global lock if possible when waking up blocked waiters locks: protect most of the file_lock handling with i_lock locks: encapsulate the fl_link list handling locks: make "added" in __posix_lock_file a bool ...	2013-07-03 09:10:19 -07:00
Pravin B Shelar	23a3647bc4	ip_tunnels: Use skb-len to PMTU check. In path mtu check, ip header total length works for gre device but not for gre-tap device. Use skb len which is consistent for all tunneling types. This is old bug in gre. This also fixes mtu calculation bug introduced by commit `c544193214` (GRE: Refactor GRE tunneling code). Reported-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 16:43:35 -07:00
James Chapman	a0dbd82227	l2tp: make datapath resilient to packet loss when sequence numbers enabled If L2TP data sequence numbers are enabled and reordering is not enabled, data reception stops if a packet is lost since the kernel waits for a sequence number that is never resent. (When reordering is enabled, data reception restarts when the reorder timeout expires.) If no reorder timeout is set, we should count the number of in-sequence packets after the out-of-sequence (OOS) condition is detected, and reset sequence number state after a number of such packets are received. For now, the number of in-sequence packets while in OOS state which cause the sequence number state to be reset is hard-coded to 5. This could be configurable later. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 16:33:25 -07:00
James Chapman	8a1631d588	l2tp: make datapath sequence number support RFC-compliant The L2TP datapath is not currently RFC-compliant when sequence numbers are used in L2TP data packets. According to the L2TP RFC, any received sequence number NR greater than or equal to the next expected NR is acceptable, where the "greater than or equal to" test is determined by the NR wrap point. This differs for L2TPv2 and L2TPv3, so add state in the session context to hold the max NR value and the NR window size in order to do the acceptable sequence number value check. These might be configurable later, but for now we derive it from the tunnel L2TP version, which determines the sequence number field size. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 16:33:24 -07:00
James Chapman	b6dc01a43a	l2tp: do data sequence number handling in a separate func This change moves some code handling data sequence numbers into a separate function to avoid too much indentation. This is to prepare for some changes to data sequence number handling in subsequent patches. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 16:33:24 -07:00
Yann Droneaud	8a59bd3e9b	sctp: use get_unused_fd_flags(0) instead of get_unused_fd() Macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) can be "unsafe": O_CLOEXEC must be used by default to not leak file descriptor across exec(). Instead of macro get_unused_fd(), functions anon_inode_getfd() or get_unused_fd_flags() should be used with flags given by userspace. If not possible, flags should be set to O_CLOEXEC to provide userspace with a default safe behavor. In a further patch, get_unused_fd() will be removed so that new code start using anon_inode_getfd() or get_unused_fd_flags() with correct flags. This patch replaces calls to get_unused_fd() with equivalent call to get_unused_fd_flags(0) to preserve current behavor for existing code. The hard coded flag value (0) should be reviewed on a per-subsystem basis, and, if possible, set to O_CLOEXEC. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 16:14:11 -07:00
Isaku Yamahata	06a23fe31c	core/dev: set pkt_type after eth_type_trans() in dev_forward_skb() The dev_forward_skb() assignment of pkt_type should be done after the call to eth_type_trans(). ip-encapsulated packets can be handled by localhost. But skb->pkt_type can be PACKET_OTHERHOST when packet comes via veth into ip tunnel device. In that case, the packet is dropped by ip_rcv(). Although this example uses gretap. l2tp-eth also has same issue. For l2tp-eth case, add dummy device for ip address and ip l2tp command. netns A \| root netns \| netns B veth<->veth=bridge=gretap <-loop back-> gretap=bridge=veth<->veth arp packet -> pkt_type BROADCAST------------>ip_rcv()------------------------> <- arp reply pkt_type ip_rcv()<-----------------OTHERHOST drop sample operations ip link add tapa type gretap remote 172.17.107.4 local 172.17.107.3 ip link add tapb type gretap remote 172.17.107.3 local 172.17.107.4 ip link set tapa up ip link set tapb up ip address add 172.17.107.3 dev tapa ip address add 172.17.107.4 dev tapb ip route get 172.17.107.3 > local 172.17.107.3 dev lo src 172.17.107.3 > cache <local> ip route get 172.17.107.4 > local 172.17.107.4 dev lo src 172.17.107.4 > cache <local> ip link add vetha type veth peer name vetha-peer ip link add vethb type veth peer name vethb-peer brctl addbr bra brctl addbr brb brctl addif bra tapa brctl addif bra vetha-peer brctl addif brb tapb brctl addif brb vethb-peer brctl show > bridge name bridge id STP enabled interfaces > bra 8000.6ea21e758ff1 no tapa > vetha-peer > brb 8000.420020eb92d5 no tapb > vethb-peer ip link set vetha-peer up ip link set vethb-peer up ip link set bra up ip link set brb up ip netns add a ip netns add b ip link set vetha netns a ip link set vethb netns b ip netns exec a ip address add 10.0.0.3/24 dev vetha ip netns exec b ip address add 10.0.0.4/24 dev vethb ip netns exec a ip link set vetha up ip netns exec b ip link set vethb up ip netns exec a arping -I vetha 10.0.0.4 ARPING 10.0.0.4 from 10.0.0.3 vetha ^CSent 2 probes (2 broadcast(s)) Received 0 response(s) Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Hong Zhiguo <honkiko@gmail.com> Cc: Rami Rosen <ramirose@gmail.com> Cc: Tom Parkin <tparkin@katalix.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Jesse Gross <jesse@nicira.com> Cc: dev@openvswitch.org Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 15:59:18 -07:00
Hannes Frederic Sowa	75a493e60a	ipv6: ip6_append_data_mtu did not care about pmtudisc and frag_size If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track of this when appending the second frame on a corked socket. This results in the following splat: [37598.993962] ------------[ cut here ]------------ [37598.994008] kernel BUG at net/core/skbuff.c:2064! [37598.994008] invalid opcode: 0000 [#1] SMP [37598.994008] Modules linked in: tcp_lp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media vfat fat usb_storage fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat +nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi +scsi_transport_iscsi rfcomm bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant arc4 iwldvm mac80211 snd_hda_intel acpi_cpufreq mperf coretemp snd_hda_codec microcode cdc_wdm cdc_acm [37598.994008] snd_hwdep cdc_ether snd_seq snd_seq_device usbnet mii joydev btusb snd_pcm bluetooth i2c_i801 e1000e lpc_ich mfd_core ptp iwlwifi pps_core snd_page_alloc mei cfg80211 snd_timer thinkpad_acpi snd tpm_tis soundcore rfkill tpm tpm_bios vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc +dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core wmi video [37598.994008] CPU 0 [37598.994008] Pid: 27320, comm: t2 Not tainted 3.9.6-200.fc18.x86_64 #1 LENOVO 27744PG/27744PG [37598.994008] RIP: 0010:[<ffffffff815443a5>] [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 [37598.994008] RSP: 0018:ffff88003670da18 EFLAGS: 00010202 [37598.994008] RAX: ffff88018105c018 RBX: 0000000000000004 RCX: 00000000000006c0 [37598.994008] RDX: ffff88018105a6c0 RSI: ffff88018105a000 RDI: ffff8801e1b0aa00 [37598.994008] RBP: ffff88003670da78 R08: 0000000000000000 R09: ffff88018105c040 [37598.994008] R10: ffff8801e1b0aa00 R11: 0000000000000000 R12: 000000000000fff8 [37598.994008] R13: 00000000000004fc R14: 00000000ffff0504 R15: 0000000000000000 [37598.994008] FS: 00007f28eea59740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000 [37598.994008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [37598.994008] CR2: 0000003d935789e0 CR3: 00000000365cb000 CR4: 00000000000407f0 [37598.994008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [37598.994008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [37598.994008] Process t2 (pid: 27320, threadinfo ffff88003670c000, task ffff88022c162ee0) [37598.994008] Stack: [37598.994008] ffff88022e098a00 ffff88020f973fc0 0000000000000008 00000000000004c8 [37598.994008] ffff88020f973fc0 00000000000004c4 ffff88003670da78 ffff8801e1b0a200 [37598.994008] 0000000000000018 00000000000004c8 ffff88020f973fc0 00000000000004c4 [37598.994008] Call Trace: [37598.994008] [<ffffffff815fc21f>] ip6_append_data+0xccf/0xfe0 [37598.994008] [<ffffffff8158d9f0>] ? ip_copy_metadata+0x1a0/0x1a0 [37598.994008] [<ffffffff81661f66>] ? _raw_spin_lock_bh+0x16/0x40 [37598.994008] [<ffffffff8161548d>] udpv6_sendmsg+0x1ed/0xc10 [37598.994008] [<ffffffff812a2845>] ? sock_has_perm+0x75/0x90 [37598.994008] [<ffffffff815c3693>] inet_sendmsg+0x63/0xb0 [37598.994008] [<ffffffff812a2973>] ? selinux_socket_sendmsg+0x23/0x30 [37598.994008] [<ffffffff8153a450>] sock_sendmsg+0xb0/0xe0 [37598.994008] [<ffffffff810135d1>] ? __switch_to+0x181/0x4a0 [37598.994008] [<ffffffff8153d97d>] sys_sendto+0x12d/0x180 [37598.994008] [<ffffffff810dfb64>] ? __audit_syscall_entry+0x94/0xf0 [37598.994008] [<ffffffff81020ed1>] ? syscall_trace_enter+0x231/0x240 [37598.994008] [<ffffffff8166a7e7>] tracesys+0xdd/0xe2 [37598.994008] Code: fe 07 00 00 48 c7 c7 04 28 a6 81 89 45 a0 4c 89 4d b8 44 89 5d a8 e8 1b ac b1 ff 44 8b 5d a8 4c 8b 4d b8 8b 45 a0 e9 cf fe ff ff <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 48 [37598.994008] RIP [<ffffffff815443a5>] skb_copy_and_csum_bits+0x325/0x330 [37598.994008] RSP <ffff88003670da18> [37599.007323] ---[ end trace d69f6a17f8ac8eee ]--- While there, also check if path mtu discovery is activated for this socket. The logic was adapted from ip6_append_data when first writing on the corked socket. This bug was introduced with commit `0c1833797a` ("ipv6: fix incorrect ipsec fragment"). v2: a) Replace IPV6_PMTU_DISC_DO with IPV6_PMTUDISC_PROBE. b) Don't pass ipv6_pinfo to ip6_append_data_mtu (suggestion by Gao feng, thanks!). c) Change mtu to unsigned int, else we get a warning about non-matching types because of the min()-macro type-check. Acked-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 12:44:18 -07:00
Hannes Frederic Sowa	8822b64a0f	ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <davej@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 12:44:18 -07:00
Linus Torvalds	fe3c22bd5c	Char/Misc merge for 3.11-rc1 Here's the big char/misc driver tree merge for 3.11-rc1 A variety of different driver patches here. All of these have been in linux-next for a while, and the networking patches were acked-by David Miller, as it made sense for those patches to come through this tree. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlHRqsQACgkQMUfUDdst+ykNlACgwnDHLav/u2NrAxoqxmw7Bcd8 qY0An3h0ZGI5PpDe6U0IyBDQIipHuOjG =vaRG -----END PGP SIGNATURE----- Merge tag 'char-misc-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc updates from Greg KH: "Here's the big char/misc driver tree merge for 3.11-rc1 A variety of different driver patches here. All of these have been in linux-next for a while, and the networking patches were acked-by David Miller, as it made sense for those patches to come through this tree" * tag 'char-misc-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (102 commits) Revert "char: misc: assign file->private_data in all cases" drivers: uio_pdrv_genirq: Use of_match_ptr() macro mei: check whether hw start has succeeded mei: check if the hardware reset succeeded mei: mei_cl_connect: don't multiply the timeout twice mei: do not override a client writing state when buffering mei: move mei_cl_irq_write_complete to client.c UIO: Fix concurrency issue drivers: uio_dmem_genirq: Use of_match_ptr() macro char: misc: assign file->private_data in all cases drivers: hv: allocate synic structures before hv_synic_init() drivers: hv: check interrupt mask before read_index vme: vme_tsi148.c: fix error return code in tsi148_probe() FMC: fix error handling in probe() function fmc: avoid readl/writel namespace conflict FMC: NULL dereference on allocation failure UIO: fix uio_pdrv_genirq with device tree but no interrupt UIO: allow binding uio_pdrv_genirq.c to devices using command line option FMC: add a char-device mezzanine driver FMC: add a driver to write mezzanine EEPROM ...	2013-07-02 11:43:33 -07:00
Cong Wang	3b7b514f44	ipip: fix a regression in ioctl This is a regression introduced by commit `fd58156e45` (IPIP: Use ip-tunneling code.) Similar to GRE tunnel, previously we only check the parameters for SIOCADDTUNNEL and SIOCCHGTUNNEL, after that commit, the check is moved for all commands. So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL. Also, the check for i_key, o_key etc. is suspicious too, which did not exist before, reset them before passing to ip_tunnel_ioctl(). Cc: Pravin B Shelar <pshelar@nicira.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 01:13:09 -07:00
Wei Yongjun	e1558a93b6	l2tp: add missing .owner to struct pppox_proto Add missing .owner of struct pppox_proto. This prevents the module from being removed from underneath its users. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 01:11:56 -07:00
Michal Schmidt	c590b5e2f0	ethtool: make .get_dump_data() harder to misuse by drivers As the patch "bnx2x: remove zeroing of dump data buffer" showed, it is too easy implement .get_dump_data incorrectly in a driver. Let's make sure drivers cannot get confused by userspace requesting a too big dump. Also WARN if the driver sets dump->len to something weird and make sure the length reported to userspace is the actual length of data copied to userspace. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 00:15:56 -07:00
Daniel Borkmann	e02010adee	net: sctp: get rid of SCTP_DBG_TSNS entirely After having reworked the debugging framework, Neil and Vlad agreed to get rid of the leftover SCTP_DBG_TSNS code for a couple of reasons: We can use systemtap scripts to investigate these things, we now have pr_debug() helpers that make life easier, and if we really need anything else besides those tools, we will be forced to come up with something better than we have there. Therefore, get rid of this ifdef debugging code entirely for now. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-02 00:08:03 -07:00
Amerigo Wang	8965779d2c	ipv6,mcast: always hold idev->lock before mca_lock dingtianhong reported the following deadlock detected by lockdep: ====================================================== [ INFO: possible circular locking dependency detected ] 3.4.24.05-0.1-default #1 Not tainted ------------------------------------------------------- ksoftirqd/0/3 is trying to acquire lock: (&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 but task is already holding lock: (&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mc->mca_lock){+.+...}: [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f691a>] rt_spin_lock+0x4a/0x60 [<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120 [<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60 [<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80 [<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0 [<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80 [<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20 [<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60 [<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80 [<ffffffff813d9360>] dev_change_flags+0x40/0x70 [<ffffffff813ea627>] do_setlink+0x237/0x8a0 [<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600 [<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310 [<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0 [<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40 [<ffffffff81403e20>] netlink_unicast+0x140/0x180 [<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380 [<ffffffff813c4252>] sock_sendmsg+0x112/0x130 [<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460 [<ffffffff813c5544>] sys_sendmsg+0x44/0x70 [<ffffffff814feab9>] system_call_fastpath+0x16/0x1b -> #0 (&ndev->lock){+.+...}: [<ffffffff810a798e>] check_prev_add+0x3de/0x440 [<ffffffff810a8027>] validate_chain+0x637/0x730 [<ffffffff810a8417>] __lock_acquire+0x2f7/0x500 [<ffffffff810a8734>] lock_acquire+0x114/0x150 [<ffffffff814f6c82>] rt_read_lock+0x42/0x60 [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120 [<ffffffff8149b036>] mld_newpack+0xb6/0x160 [<ffffffff8149b18b>] add_grhead+0xab/0xc0 [<ffffffff8149d03b>] add_grec+0x3ab/0x460 [<ffffffff8149d14a>] mld_send_report+0x5a/0x150 [<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0 [<ffffffff8105705a>] call_timer_fn+0xca/0x1d0 [<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0 [<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0 [<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0 [<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210 [<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0 [<ffffffff8106c7de>] kthread+0xae/0xc0 [<ffffffff814fff74>] kernel_thread_helper+0x4/0x10 actually we can just hold idev->lock before taking pmc->mca_lock, and avoid taking idev->lock again when iterating idev->addr_list, since the upper callers of mld_newpack() already take read_lock_bh(&idev->lock). Reported-by: dingtianhong <dingtianhong@huawei.com> Cc: dingtianhong <dingtianhong@huawei.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David S. Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Chen Weilong <chenweilong@huawei.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 23:39:21 -07:00
Cong Wang	ab6c7a0a43	vti: remove duplicated code to fix a memory leak vti module allocates dev->tstats twice: in vti_fb_tunnel_init() and in vti_tunnel_init(), this lead to a memory leak of dev->tstats. Just remove the duplicated operations in vti_fb_tunnel_init(). (candidate for -stable) Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Saurabh Mohan <saurabh.mohan@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 23:37:14 -07:00
Cong Wang	6c734fb859	gre: fix a regression in ioctl When testing GRE tunnel, I got: # ip tunnel show get tunnel gre0 failed: Invalid argument get tunnel gre1 failed: Invalid argument This is a regression introduced by commit `c544193214` ("GRE: Refactor GRE tunneling code.") because previously we only check the parameters for SIOCADDTUNNEL and SIOCCHGTUNNEL, after that commit, the check is moved for all commands. So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL. After this patch I got: # ip tunnel show gre0: gre/ip remote any local any ttl inherit nopmtudisc gre1: gre/ip remote 192.168.122.101 local 192.168.122.45 ttl inherit Cc: Pravin B Shelar <pshelar@nicira.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 23:35:22 -07:00
Daniel Borkmann	bb33381d0c	net: sctp: rework debugging framework to use pr_debug and friends We should get rid of all own SCTP debug printk macros and use the ones that the kernel offers anyway instead. This makes the code more readable and conform to the kernel code, and offers all the features of dynamic debbuging that pr_debug() et al has, such as only turning on/off portions of debug messages at runtime through debugfs. The runtime cost of having CONFIG_DYNAMIC_DEBUG enabled, but none of the debug statements printing, is negligible [1]. If kernel debugging is completly turned off, then these statements will also compile into "empty" functions. While we're at it, we also need to change the Kconfig option as it /now/ only refers to the ifdef'ed code portions in outqueue.c that enable further debugging/tracing of SCTP transaction fields. Also, since SCTP_ASSERT code was enabled with this Kconfig option and has now been removed, we transform those code parts into WARNs resp. where appropriate BUG_ONs so that those bugs can be more easily detected as probably not many people have SCTP debugging permanently turned on. To turn on all SCTP debugging, the following steps are needed: # mount -t debugfs none /sys/kernel/debug # echo -n 'module sctp +p' > /sys/kernel/debug/dynamic_debug/control This can be done more fine-grained on a per file, per line basis and others as described in [2]. [1] https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf [2] Documentation/dynamic-debug-howto.txt Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 23:22:13 -07:00
David S. Miller	936faf6c49	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/vxlan-next Stephen Hemminger says: ==================== Here is current updates for vxlan in net-next. It includes Mike's changes to handle multiple destinations and lots of little cosmetic stuff. This is a fresh vxlan-next repository which was forked from net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 23:17:19 -07:00
Dave Jones	4ccb93ce74	x25: Fix broken locking in ioctl error paths. Two of the x25 ioctl cases have error paths that break out of the function without unlocking the socket, leading to this warning: ================================================ [ BUG: lock held when returning to user space! ] 3.10.0-rc7+ #36 Not tainted ------------------------------------------------ trinity-child2/31407 is leaving the kernel with locks still held! 1 lock held by trinity-child2/31407: #0: (sk_lock-AF_X25){+.+.+.}, at: [<ffffffffa024b6da>] x25_ioctl+0x8a/0x740 [x25] Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 18:15:25 -07:00
Eric Dumazet	aec0a40a6f	netem: use rb tree to implement the time queue Following typical setup to implement a ~100 ms RTT and big amount of reorders has very poor performance because netem implements the time queue using a linked list. ----------------------------------------------------------- ETH=eth0 IFB=ifb0 modprobe ifb ip link set dev $IFB up tc qdisc add dev $ETH ingress 2>/dev/null tc filter add dev $ETH parent ffff: \ protocol ip u32 match u32 0 0 flowid 1:1 action mirred egress \ redirect dev $IFB ethtool -K $ETH gro off tso off gso off tc qdisc add dev $IFB root netem delay 50ms 10ms limit 100000 tc qd add dev $ETH root netem delay 50ms limit 100000 --------------------------------------------------------- Switch netem time queue to a rb tree, so this kind of setup can work at high speed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 18:07:15 -07:00
NeilBrown	0bebc633f1	sunrpc: Don't schedule an upcall on a replaced cache entry. When a cache entry is replaced, the "expiry_time" get set to zero by a call to "cache_fresh_locked(..., 0)" at the end of "sunrpc_cache_update". This low expiry time makes cache_check() think that the 'refresh_age' is negative, so the 'age' is comparatively large and a refresh is triggered. However refreshing a replaced entry it pointless, it cannot achieve anything useful. So teach cache_check to ignore a low refresh_age when expiry_time is zero. Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:53:28 -04:00
NeilBrown	7715cde868	net/sunrpc: xpt_auth_cache should be ignored when expired. commit `d202cce896` sunrpc: never return expired entries in sunrpc_cache_lookup moved the 'entry is expired' test from cache_check to sunrpc_cache_lookup, so that it happened early and some races could safely be ignored. However the ip_map (in svcauth_unix.c) has a separate single-item cache which allows quick lookup without locking. An entry in this case would not be subject to the expiry test and so could be used well after it has expired. This is not normally a big problem because the first time it is used after it is expired an up-call will be scheduled to refresh the entry (if it hasn't been scheduled already) and the old entry will then be invalidated. So on the second attempt to use it after it has expired, ip_map_cached_get will discard it. However that is subtle and not ideal, so replace the "!cache_valid" test with "cache_is_expired". In doing this we drop the test on the "CACHE_VALID" bit. This is unnecessary as the bit is never cleared, and an entry will only be cached if the bit is set. Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:53:28 -04:00
NeilBrown	013920eb5d	sunrpc/cache: ensure items removed from cache do not have pending upcalls. It is possible for a race to set CACHE_PENDING after cache_clean() has removed a cache entry from the cache. If CACHE_PENDING is still set when the entry is finally 'put', the cache_dequeue() will never happen and we can leak memory. So set a new flag 'CACHE_CLEANED' when we remove something from the cache, and don't queue any upcall if it is set. If CACHE_PENDING is set before CACHE_CLEANED, the call that cache_clean() makes to cache_fresh_unlocked() will free memory as needed. If CACHE_PENDING is set after CACHE_CLEANED, the test in sunrpc_cache_pipe_upcall will ensure that the memory is not allocated. Reported-by: <bstroesser@ts.fujitsu.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:53:28 -04:00
NeilBrown	2a1c7f53fd	sunrpc/cache: use cache_fresh_unlocked consistently and correctly. cache_fresh_unlocked() is called when a cache entry has been updated and ensures that if there were any pending upcalls, they are cleared. So every time we update a cache entry, we should call this, and this should be the only way that we try to clear pending calls (that sort of uniformity makes code sooo much easier to read). try_to_negate_entry() will (possibly) mark an entry as negative. If it doesn't, it is because the entry already is VALID. So the entry will be valid on exit, so it is appropriate to call cache_fresh_unlocked(). So tidy up try_to_negate_entry() to do that, and remove partial open-coded cache_fresh_unlocked() from the one call-site of try_to_negate_entry(). In the other branch of the 'switch(cache_make_upcall())', we again have a partial open-coded version of cache_fresh_unlocked(). Replace that with a real call. And again in cache_clean(), use a real call to cache_fresh_unlocked(). These call sites might previously have called cache_revisit_request() if CACHE_PENDING wasn't set. This is never necessary because cache_revisit_request() can only do anything if the item is in the cache_defer_hash, However any time that an item is added to the cache_defer_hash (setup_deferral), the code immediately tests CACHE_PENDING, and removes the entry again if it is clear. So all other places we only need to 'cache_revisit_request' if we've just cleared CACHE_PENDING. Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:53:28 -04:00
NeilBrown	f9e1aedc6c	sunrpc/cache: remove races with queuing an upcall. We currently queue an upcall after setting CACHE_PENDING, and dequeue after clearing CACHE_PENDING. So a request should only be present when CACHE_PENDING is set. However we don't combine the test and the enqueue/dequeue in a protected region, so it is possible (if unlikely) for a race to result in a request being queued without CACHE_PENDING set, or a request to be absent despite CACHE_PENDING. So: include a test for CACHE_PENDING inside the regions of enqueue and dequeue where queue_lock is held, and abort the operation if the value is not as expected. Also remove the early 'return' from cache_dequeue() to ensure that it always removes all entries: As there is no locking between setting CACHE_PENDING and calling sunrpc_cache_pipe_upcall it is not inconceivable for some other thread to clear CACHE_PENDING and then someone else to set it and call sunrpc_cache_pipe_upcall, both before the original threads completed the call. With this, it perfectly safe and correct to: - call cache_dequeue() if and only if we have just cleared CACHE_PENDING - call sunrpc_cache_pipe_upcall() (via cache_make_upcall) if and only if we have just set CACHE_PENDING. Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:42:53 -04:00
J. Bruce Fields	1f691b07c5	svcrpc: don't error out on small tcp fragment Though clients we care about mostly don't do this, it is possible for rpc requests to be sent in multiple fragments. Here we have a sanity check to ensure that the final received rpc isn't too small--except that the number we're actually checking is the length of just the final fragment, not of the whole rpc. So a perfectly legal rpc that's unluckily fragmented could cause the server to close the connection here. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:04 -04:00
J. Bruce Fields	cf3aa02cb4	svcrpc: fix handling of too-short rpc's If we detect that an rpc is too short, we abort and close the connection. Except, there's a bug here: we're leaving sk_datalen nonzero without leaving any pages in the sk_pages array. The most likely result of the inconsistency is a subsequent crash in svc_tcp_clear_pages. Also demote the BUG_ON in svc_tcp_clear_pages to a WARN. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:04 -04:00
J. Bruce Fields	0dc1531aca	svcrpc: store gss mech in svc_cred Store a pointer to the gss mechanism used in the rq_cred and cl_cred. This will make it easier to enforce SP4_MACH_CRED, which needs to compare the mechanism used on the exchange_id with that used on protected operations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:06 -04:00
J. Bruce Fields	4423406391	svcrpc: introduce init_svc_cred Common helper to zero out fields of the svc_cred. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:06 -04:00
J. Bruce Fields	0de934936b	Merge branch 'for-3.10' into 'for-3.11' Merge bugfixes into my for-3.11 branch.	2013-07-01 17:22:24 -04:00
Eric Dumazet	c9ab4d85de	neighbour: fix a race in neigh_destroy() There is a race in neighbour code, because neigh_destroy() uses skb_queue_purge(&neigh->arp_queue) without holding neighbour lock, while other parts of the code assume neighbour rwlock is what protects arp_queue Convert all skb_queue_purge() calls to the __skb_queue_purge() variant Use __skb_queue_head_init() instead of skb_queue_head_init() to make clear we do not use arp_queue.lock And hold neigh->lock in neigh_destroy() to close the race. Reported-by: Joe Jin <joe.jin@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 13:35:32 -07:00
Nicolas Dichtel	52bd4c0c15	ipv6: fix ecmp lookup when oif is specified There is no reason to skip ECMP lookup when oif is specified, but this implies to check oif given by user when selecting another route. When the new route does not match oif requirement, we simply keep the initial one. Spotted-by: dingzhi <zhi.ding@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 13:27:38 -07:00
Hannes Frederic Sowa	5c29fb12e8	ipv6: only apply anti-spoofing checks to not-pointopoint tunnels Because of commit `218774dc34` ("ipv6: add anti-spoofing checks for 6to4 and 6rd") the sit driver dropped packets for 2002::/16 destinations and sources even when configured to work as a tunnel with fixed endpoint. We may only apply the 6rd/6to4 anti-spoofing checks if the device is not in pointopoint mode. This was an oversight from me in the above commit, sorry. Thanks to Roman Mamedov for reporting this! Reported-by: Roman Mamedov <rm@romanrm.ru> Cc: David Miller <davem@davemloft.net> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 13:26:18 -07:00
David S. Miller	e62bc9e55f	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Yet one more pull request for wireless updates intended for 3.11... For the mac80211 bits, Johannes says: "Here we have a few memory leak fixes related to BSS struct handling mostly from Ben, including a fix for a more theoretical problem (associating while a BSS struct times out) from myself, a compilation warning fix from Arend, mesh fixes from Thomas, tracking the beacon bitrate (Alex), a bandwidth change event fix (Ilan) and some initial work for 5/10 MHz channels from Simon." Regarding the iwlwifi bits, Johannes says: "Emmanuel removed some unneeded/unsupported module parameters and adds a Bluetooth 1x1 lookup-table for some upcoming products. From Alex I have an older patch to add low-power receive support, this depended on a mac80211 commit that only just came in with the merge from wireless-next I did. Ilan made beacon timings better, and Eytan added some debug statements for thermal throttling. I have a few cleanups, a fix for a long-standing but rare warning, and, arguably the most important patch here, the firmware API version bump for the 7260/3160 devices." Also included is a Bluetooth pull -- Gustavo says: "Here goes a set of patches to 3.11. The biggest work here is from Andre Guedes on the move of the Discovery to use the new request framework. Other than that Johan provided a bunch of fixes to the L2CAP code. The rest are just small fixes and clean ups." On top of all that, there are a variety of updates and fixes to brcmfmac, rt2x00, wil6210, ath9k, ath10k, and a few others here and there. This also includes a pull of the wireless tree, in order to prevent some merge conflicts. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 13:21:17 -07:00
Pravin B Shelar	fb825a550a	openvswitch: Add Kconfig dependency on GRE-DEMUX. Openvswitch uses function from NET_IPGRE_DEMUX module. Add Kconfig dependency to fix following compilation errors: http://marc.info/?l=linux-netdev&m=137244035226634 CC: Jesse Gross <jesse@nicira.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Pravin Shelar <pshelar@nicira.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-01 13:19:43 -07:00
David S. Miller	4e144d3a80	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== The following batch contains Netfilter/IPVS updates for net-next, they are: * Enforce policy to several nfnetlink subsystem, from Daniel Borkmann. * Use xt_socket to match the third packet (to perform simplistic socket-based stateful filtering), from Eric Dumazet. * Avoid large timeout for picked up from the middle TCP flows, from Florian Westphal. * Exclude IPVS from struct net if IPVS is disabled and removal of unnecessary included header file, from JunweiZhang. * Release SCTP connection immediately under load, to mimic current TCP behaviour, from Julian Anastasov. * Replace and enhance SCTP state machine, from Julian Anastasov. * Add tweak to reduce sync traffic in the presence of persistence, also from Julian Anastasov. * Add tweak for the IPVS SH scheduler not to reject connections directed to a server, choose a new one instead, from Alexander Frolkin. * Add support for sloppy TCP and SCTP modes, that creates state information on any packet, not only initial handshake packets, from Alexander Frolkin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-30 17:35:13 -07:00
Florian Westphal	496e4ae7dc	netfilter: nf_queue: add NFQA_SKB_CSUM_NOTVERIFIED info flag The common case is that TCP/IP checksums have already been verified, e.g. by hardware (rx checksum offload), or conntrack. Userspace can use this flag to determine when the checksum has not been validated yet. If the flag is set, this doesn't necessarily mean that the packet has an invalid checksum, e.g. if NIC doesn't support rx checksum. Userspace that sucessfully enabled NFQA_CFG_F_GSO queue feature flag can infer that IP/TCP checksum has already been validated if either the SKB_INFO attribute is not present or the NFQA_SKB_CSUM_NOTVERIFIED flag is unset. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-30 18:15:48 +02:00
Al Viro	e77e430033	more open-coded file_inode() calls Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:21 +04:00
Timo Teräs	2ffae99d1f	ipv4: use next hop exceptions also for input routes Commit `d2d68ba9` (ipv4: Cache input routes in fib_info nexthops) assmued that "locally destined, and routed packets, never trigger PMTU events or redirects that will be processed by us". However, it seems that tunnel devices do trigger PMTU events in certain cases. At least ip_gre, ip6_gre, sit, and ipip do use the inner flow's skb_dst(skb)->ops->update_pmtu to propage mtu information from the outer flows. These can cause the inner flow mtu to be decreased. If next hop exceptions are not consulted for pmtu, IP fragmentation will not be done properly for these routes. It also seems that we really need to have the PMTU information always for netfilter TCPMSS clamp-to-pmtu feature to work properly. So for the time being, cache separate copies of input routes for each next hop exception. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-28 21:27:47 -07:00
Hannes Frederic Sowa	b173ee488d	ipv6: resend MLD report if a link-local address completes DAD RFC3590/RFC3810 specifies we should resend MLD reports as soon as a valid link-local address is available. We now use the valid_ll_addr_cnt to check if it is necessary to resend a new report. Changes since Flavio Leitner's version: a) adapt for valid_ll_addr_cnt b) resend first reports directly in the path and just arm the timer for mc_qrv-1 resends. Reported-by: Flavio Leitner <fleitner@redhat.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David Stevens <dlstevens@us.ibm.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-28 21:19:17 -07:00
Hannes Frederic Sowa	1ec047eb47	ipv6: introduce per-interface counter for dad-completed ipv6 addresses To reduce the number of unnecessary router solicitations, MLDv2 and IGMPv3 messages we need to track the number of valid (as in non-optimistic, no-dad-failed and non-tentative) link-local addresses. Therefore, this patch implements a valid_ll_addr_cnt in struct inet6_dev. We now only emit router solicitations if the first link-local address finishes duplicate address detection. The changes for MLDv2 and IGMPv3 are in a follow-up patch. While there, also simplify one if statement(one minor nit I made in one of my previous patches): if (!...) do(); else return; <<into>> if (...) return; do(); Cc: Flavio Leitner <fbl@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: David Stevens <dlstevens@us.ibm.com> Suggested-by: David Stevens <dlstevens@us.ibm.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-28 21:19:17 -07:00
Stanislav Kinsbursky	4f6bb246f6	SUNRPC: PipeFS MOUNT notification optimization for dying clients Not need to create pipes for dying client. So just skip them. Note: we can safely dereference the client structure, because notification caller is holding sn->pipefs_sb_lock. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-28 15:42:26 -04:00
Stanislav Kinsbursky	e73f4cc051	SUNRPC: split client creation routine into setup and registration This helper moves all "registration" code to the new rpc_client_register() helper. This helper will be used later in the series to synchronize against PipeFS MOUNT/UMOUNT events. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-28 15:42:16 -04:00
Stanislav Kinsbursky	adb6fa7ffe	SUNRPC: fix races on PipeFS UMOUNT notifications CPU#0 CPU#1 ----------------------------- ----------------------------- rpc_kill_sb sn->pipefs_sb = NULL rpc_release_client (UMOUNT_EVENT) rpc_free_auth rpc_pipefs_event rpc_get_client_for_event !atomic_inc_not_zero(cl_count) <skip the client> atomic_inc(cl_count) rpc_free_client rpc_clnt_remove_pipedir <skip client dir removing> To fix this, this patch does the following: 1) Calls RPC_PIPEFS_UMOUNT notification with sn->pipefs_sb_lock being held. 2) Removes SUNRPC client from the list AFTER pipes destroying. 3) Doesn't hold RPC client on notification: if client in the list, then it can't be destroyed while sn->pipefs_sb_lock in hold by notification caller. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-28 15:42:02 -04:00
Stanislav Kinsbursky	384816051c	SUNRPC: fix races on PipeFS MOUNT notifications Below are races, when RPC client can be created without PiepFS dentries CPU#0 CPU#1 ----------------------------- ----------------------------- rpc_new_client rpc_fill_super rpc_setup_pipedir mutex_lock(&sn->pipefs_sb_lock) rpc_get_sb_net == NULL (no per-net PipeFS superblock) sn->pipefs_sb = sb; notifier_call_chain(MOUNT) (client is not in the list) rpc_register_client (client without pipes dentries) To fix this patch: 1) makes PipeFS mount notification call with pipefs_sb_lock being held. 2) releases pipefs_sb_lock on new SUNRPC client creation only after registration. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-28 15:41:18 -04:00
John W. Linville	57ed5cd695	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: net/wireless/nl80211.c	2013-06-28 13:18:21 -04:00
Rafael J. Wysocki	207bc1181b	Merge branch 'freezer' * freezer: af_unix: use freezable blocking calls in read sigtimedwait: use freezable blocking call nanosleep: use freezable blocking call futex: use freezable blocking call select: use freezable blocking call epoll: use freezable blocking call binder: use freezable blocking calls freezer: add new freezable helpers using freezer_do_not_count() freezer: convert freezable helpers to static inline where possible freezer: convert freezable helpers to freezer_do_not_count() freezer: skip waking up tasks with PF_FREEZER_SKIP set freezer: shorten freezer sleep time using exponential backoff lockdep: check that no locks held at freeze time lockdep: remove task argument from debug_check_no_locks_held freezer: add unsafe versions of freezable helpers for CIFS freezer: add unsafe versions of freezable helpers for NFS	2013-06-28 13:00:53 +02:00
Pablo Neira	3a36515f72	netlink: fix splat in skb_clone with large messages Since (`c05cdb1` netlink: allow large data transfers from user-space), netlink splats if it invokes skb_clone on large netlink skbs since: * skb_shared_info was not correctly initialized. * skb->destructor is not set in the cloned skb. This was spotted by trinity: [ 894.990671] BUG: unable to handle kernel paging request at ffffc9000047b001 [ 894.991034] IP: [<ffffffff81a212c4>] skb_clone+0x24/0xc0 [...] [ 894.991034] Call Trace: [ 894.991034] [<ffffffff81ad299a>] nl_fib_input+0x6a/0x240 [ 894.991034] [<ffffffff81c3b7e6>] ? _raw_read_unlock+0x26/0x40 [ 894.991034] [<ffffffff81a5f189>] netlink_unicast+0x169/0x1e0 [ 894.991034] [<ffffffff81a601e1>] netlink_sendmsg+0x251/0x3d0 Fix it by: 1) introducing a new netlink_skb_clone function that is used in nl_fib_input, that sets our special skb->destructor in the cloned skb. Moreover, handle the release of the large cloned skb head area in the destructor path. 2) not allowing large skbuffs in the netlink broadcast path. I cannot find any reasonable use of the large data transfer using netlink in that path, moreover this helps to skip extra skb_clone handling. I found two more netlink clients that are cloning the skbs, but they are not in the sendmsg path. Therefore, the sole client cloning that I found seems to be the fib frontend. Thanks to Eric Dumazet for helping to address this issue. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-27 22:44:16 -07:00
Nicolas Dichtel	5e6700b3bf	sit: add support of x-netns This patch allows to switch the netns when packet is encapsulated or decapsulated. In other word, the encapsulated packet is received in a netns, where the lookup is done to find the tunnel. Once the tunnel is found, the packet is decapsulated and injecting into the corresponding interface which stands to another netns. When one of the two netns is removed, the tunnel is destroyed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-27 22:30:47 -07:00
Nicolas Dichtel	621e84d6f3	dev: introduce skb_scrub_packet() The goal of this new function is to perform all needed cleanup before sending an skb into another netns. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-27 22:29:05 -07:00
Hannes Frederic Sowa	77ecaace6c	ipv6: rearm router solicitaion timer when setting new tokenized address When a new tokenized address gets installed we send out just one router solicition. We should send out `rtr_solicits' in case one router advertisment got lost. So, rearm the timer as we do in addrconf_dad_complete. Cc: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 15:23:01 -07:00
Mathias Krause	a5cc68f3d6	af_key: fix info leaks in notify messages key_notify_sa_flush() and key_notify_policy_flush() miss to initialize the sadb_msg_reserved member of the broadcasted message and thereby leak 2 bytes of heap memory to listeners. Fix that. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 15:15:54 -07:00
Eric Dumazet	a963a37d38	ipv6: ip6_sk_dst_check() must not assume ipv6 dst It's possible to use AF_INET6 sockets and to connect to an IPv4 destination. After this, socket dst cache is a pointer to a rtable, not rt6_info. ip6_sk_dst_check() should check the socket dst cache is IPv6, or else various corruptions/crashes can happen. Dave Jones can reproduce immediate crash with trinity -q -l off -n -c sendmsg -c connect With help from Hannes Frederic Sowa Reported-by: Dave Jones <davej@redhat.com> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 15:13:47 -07:00
Nicolas Schichan	5dbe7c178d	net: fix kernel deadlock with interface rename and netdev name retrieval. When the kernel (compiled with CONFIG_PREEMPT=n) is performing the rename of a network interface, it can end up waiting for a workqueue to complete. If userland is able to invoke a SIOCGIFNAME ioctl or a SO_BINDTODEVICE getsockopt in between, the kernel will deadlock due to the fact that read_secklock_begin() will spin forever waiting for the writer process (the one doing the interface rename) to update the devnet_rename_seq sequence. This patch fixes the problem by adding a helper (netdev_get_name()) and using it in the code handling the SIOCGIFNAME ioctl and SO_BINDTODEVICE setsockopt. The netdev_get_name() helper uses raw_seqcount_begin() to avoid spinning forever, waiting for devnet_rename_seq->sequence to become even. cond_resched() is used in the contended case, before retrying the access to give the writer process a chance to finish. The use of raw_seqcount_begin() will incur some unneeded work in the reader process in the contended case, but this is better than deadlocking the system. Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 13:42:54 -07:00
Nicolas Dichtel	963b89e80d	sit: fix 4in4 + IPsec scenario Since commit `32b8a8e59c` "sit: add IPv4 over IPv4 support", tunnel->parms.iph.protocol is 0 when both 4in4 and 6in4 are setup, but xfrm_lookup() is called only when proto is != 0, thus we need to pass the real value. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 13:42:03 -07:00
David S. Miller	a77471ff70	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== Just one patch this time. 1) Drop packets when the matching SA is in larval state and add a statistic counter for that. From Fan Du. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-26 13:23:13 -07:00
John W. Linville	729d8d182b	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2013-06-26 12:01:42 -04:00
Julian Anastasov	4d0c875dcc	ipvs: add sync_persist_mode flag Add sync_persist_mode flag to reduce sync traffic by syncing only persistent templates. Signed-off-by: Julian Anastasov <ja@ssi.bg> Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-06-26 18:01:46 +09:00
Alexander Frolkin	eba3b5a787	ipvs: SH fallback and L4 hashing By default the SH scheduler rejects connections that are hashed onto a realserver of weight 0. This patch adds a flag to make SH choose a different realserver in this case, instead of rejecting the connection. The patch also adds a flag to make SH include the source port (TCP, UDP, SCTP) in the hash as well as the source address. This basically allows for deterministic round-robin load balancing (i.e., where any director in a cluster of directors with identical config will send the same packet the same way). The flags are service flags (IP_VS_SVC_F_SCHED*) so that these options can be set per service. They are set using a new option to ipvsadm. Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-06-26 18:01:46 +09:00
Julian Anastasov	acaac5d8bb	ipvs: drop SCTP connections depending on state Drop SCTP connections under load (dropentry context) depending on the protocol state, just like for TCP: INIT conns are dropped immediately, established are dropped randomly while connections in progress or shutdown are skipped. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-06-26 18:01:46 +09:00
Julian Anastasov	61e7c420b4	ipvs: replace the SCTP state machine Convert the SCTP state table, so that it is more readable. Change the states to be according to the diagram in RFC 2960 and add more states suitable for middle box. Still, such change in states adds incompatibility if systems in sync setup include this change and others do not include it. With this change we also have proper transitions in INPUT-ONLY mode (DR/TUN) where we see packets only from client. Now we should not switch to 10-second CLOSED state at a time when we should stay in ESTABLISHED state. The short names for states are because we have 16-char space in ipvsadm and 11-char limit for the connection list format. It is a sequence of the TCP implementation where the longest state name is ESTABLISHED. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-06-26 18:01:46 +09:00
Alexander Frolkin	c6c96c1883	ipvs: sloppy TCP and SCTP This adds support for sloppy TCP and SCTP modes to IPVS. When enabled (sysctls net.ipv4.vs.sloppy_tcp and net.ipv4.vs.sloppy_sctp), allows IPVS to create connection state on any packet, not just a TCP SYN (or SCTP INIT). This allows connections to fail over from one IPVS director to another mid-flight. Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-06-26 18:01:46 +09:00
Julian Anastasov	bba54de5bd	ipvs: provide iph to schedulers Before now the schedulers needed access only to IP addresses and it was easy to get them from skb by using ip_vs_fill_iph_addr_only. New changes for the SH scheduler will need the protocol and ports which is difficult to get from skb for the IPv6 case. As we have all the data in the iph structure, to avoid the same slow lookups provide the iph to schedulers. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-06-26 18:01:45 +09:00
Stephen Hemminger	3f5d6af094	Merge ../vxlan-x	2013-06-25 17:02:49 -07:00
Stephen Hemminger	537f7f8494	bridge: check for zero ether address in fdb add The check for all-zero ether address was removed from rtnetlink core, since Vxlan uses all-zero ether address to signify default address. Need to add check back in for bridge. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2013-06-25 16:59:27 -07:00
Eliezer Tamir	2d48d67fa8	net: poll/select low latency socket support select/poll busy-poll support. Split sysctl value into two separate ones, one for read and one for poll. updated Documentation/sysctl/net.txt Add a new poll flag POLL_LL. When this flag is set, sock_poll will call sk_poll_ll if possible. sock_poll sets this flag in its return value to indicate to select/poll when a socket that can busy poll is found. When poll/select have nothing to report, call the low-level sock_poll again until we are out of time or we find something. Once the system call finds something, it stops setting POLL_LL, so it can return the result to the user ASAP. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:35:52 -07:00
Daniel Borkmann	62208f1245	net: sctp: simplify sctp_get_port No need to have an extra ret variable when we directly can return the value of sctp_get_port_local(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:05 -07:00
Daniel Borkmann	0a2fbac197	net: sctp: decouple cleaning some socket data from endpoint Rather instead of having the endpoint clean the garbage from the socket, use a sk_destruct handler sctp_destruct_sock(), that does the job for that when there are no more references on the socket. At least do this for our crypto transform through crypto_free_hash() that is allocated when in listening state. Also, perform sctp_put_port() only when sk is valid. At a later point in time we can still determine if there's an option of placing this into sk_prot->unhash() or sctp_endpoint_free() without any races. For now, leave it in sctp_endpoint_destroy() though. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Daniel Borkmann	b527fe6933	net: sctp: minor: sctp_seq_dump_local_addrs add missing newline A trailing newline has been forgotten to add into the WARN(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Daniel Borkmann	52db882f3f	net: sctp: migrate cookie life from timeval to ktime Currently, SCTP code defines its own timeval functions (since timeval is rarely used inside the kernel by others), namely tv_lt() and TIMEVAL_ADD() macros, that operate on SCTP cookie expiration. We might as well remove all those, and operate directly on ktime structures for a couple of reasons: ktime is available on all archs; complexity of ktime calculations depending on the arch is less than (reduces to a simple arithmetic operations on archs with BITS_PER_LONG == 64 or CONFIG_KTIME_SCALAR) or equal to timeval functions (other archs); code becomes more readable; macros can be thrown out. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:33:04 -07:00
Hannes Frederic Sowa	2b9651d72d	ipv6: remove old token ipv6 address as soon as possible If the tokenized ip address is re-set on an interface we depend on the arrival of a new router advertisment to call addrconf_verify to clean up the old address (which valid_lft is now set to 0). Old addresses can linger around for a longer time if e.g. the source of router advertisments vanishes. So, call addrconf_verify immediately after setting the new tokenized address to get rid of the old tokenized addresses. Cc: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:28:24 -07:00
Hannes Frederic Sowa	dc8482926e	ipv6: check return value of ipv6_get_lladdr We should check the return value of ipv6_get_lladdr in inet6_set_iftoken. A possible situation, which could leave ll_addr unassigned is, when the user removed her link-local address but a global scoped address was already set. In this case the interface would still be IF_READY and not dead. In that case the RS source address is some value from the stack. v2: Daniel Borkmann noted a small indent inconstancy; no semantic changes. Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:27:28 -07:00
Hannes Frederic Sowa	876fd05ddb	ipv6: don't disable interface if last ipv6 address is removed The reason behind this change is that as soon as we delete the last ipv6 address of an interface we also lose the /proc/sys/net/ipv6/conf/<interface> directory. This seems to be a usability problem for me. I don't see any reason why we should shutdown ipv6 on that interface in such cases. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:23:03 -07:00
Hannes Frederic Sowa	b7b1bfce0b	ipv6: split duplicate address detection and router solicitation timer This patch splits the timers for duplicate address detection and router solicitations apart. The router solicitations timer goes into inet6_dev and the dad timer stays in inet6_ifaddr. The reason behind this patch is to reduce the number of unneeded router solicitations send out by the host if additional link-local addresses are created. Currently we send out RS for every link-local address on an interface. If the RS timer fires we pick a source address with ipv6_get_lladdr. This change could hurt people adding additional link-local addresses and specifying these addresses in the radvd clients section because we no longer guarantee that we use every ll address as source address in router solicitations. Cc: Flavio Leitner <fleitner@redhat.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: David Stevens <dlstevens@us.ibm.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:23:03 -07:00
Eric Dumazet	bd8a7036c0	gre: fix a possible skb leak commit `68c3316311` ("v4 GRE: Add TCP segmentation offload for GRE") added a possible skb leak, because it frees only the head of segment list, in case a skb_linearize() call fails. This patch adds a kfree_skb_list() helper to fix the bug. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:07:44 -07:00
David S. Miller	2b7a5db060	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== A few more late-breaking fixes hoping for 3.10... Regarding the Bluetooth fix, Gustavo says: "A important fix to 3.10, this patch fixes an issues that was preventing the l2cap info response command to be handled properly." Also for that Bluetooth fix, Johan adds: "Once the code gives up parsing this PDU it also gives up essential parts of the L2CAP connection creation process, i.e. without this patch the stack will fail to establish connections properly." Moving onto ath9k, Felix Fietkau fixes an RCU locking issue in the transmit path. As for ath9k_htc, Sujith Manoharan fixes some authentication timeouts by ensuring that a chip reset is done when IDLE is turned off. I think these are all micro-fixes that shouldn't cause any trouble. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:04:35 -07:00
YOSHIFUJI Hideaki / 吉藤英明	ab4eb3537e	ipv6: Process unicast packet with Router Alert by checking flag in skb. Router Alert option is marked in skb. Previously, IP6CB(skb)->ra was set to positive value for such packets. Since commit `dd3332bf` ("ipv6: Store Router Alert option in IP6CB directly."), IP6SKB_ROUTERALERT is set in IP6CB(skb)->flags, and the value of Router Alert option (in network byte order) is set to IP6CB(skb)->ra for such packets. Multicast forwarding path uses that flag and value, but unicast forwarding path does not use the flag and misuses IP6CB(skb)->ra value. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 14:47:22 -07:00
John W. Linville	9d5c34f568	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-06-25 13:24:12 -04:00
Mike Rapoport	f693dff710	rtnetlink: allow using zero MAC address in rtnl_fdb_{add,del} This is required for multiple default destinations management in VXLAN Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2013-06-25 09:31:39 -07:00
Eric Dumazet	6da334ee0c	ipv6: add include file to suppress sparse warnings commit `f88c91ddba` ("ipv6: statically link register_inet6addr_notifier()" added following sparse warnings : net/ipv6/addrconf_core.c:83:5: warning: symbol 'register_inet6addr_notifier' was not declared. Should it be static? net/ipv6/addrconf_core.c:89:5: warning: symbol 'unregister_inet6addr_notifier' was not declared. Should it be static? net/ipv6/addrconf_core.c:95:5: warning: symbol 'inet6addr_notifier_call_chain' was not declared. Should it be static? Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 02:44:05 -07:00
Daniel Borkmann	bcbde0d449	net: netlink: virtual tap device management Similarly to the networking receive path with ptype_all taps, we add the possibility to register netdevices that are for ARPHRD_NETLINK to the netlink subsystem, so that those can be used for netlink analyzers resp. debuggers. We do not offer a direct callback function as out-of-tree modules could do crap with it. Instead, a netdevice must be registered properly and only receives a clone, managed by the netlink layer. Symbols are exported as GPL-only. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 16:39:05 -07:00
David S. Miller	a3d9dd89b7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains five fixes for Netfilter/IPVS, they are: * A skb leak fix in fragmentation handling in case that helpers are in place, it occurs since the IPV6 NAT infrastructure, from Phil Oester. * Fix SCTP port mangling in ICMP packets for IPVS, from Julian Anastasov. * Fix event delivery in ctnetlink regarding the new connlabel infrastructure, from Florian Westphal. * Fix mangling in the SIP NAT helper, from Balazs Peter Odor. * Fix crash in ipt_ULOG introduced while adding netnamespace support, from Gao Feng. I'll take care of passing several of these patches to -stable once they hit Linus' tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 12:45:24 -07:00
John W. Linville	9fbdc75116	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2013-06-24 14:45:50 -04:00
John W. Linville	66ba271ab9	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2013-06-24 14:44:59 -04:00
John W. Linville	57bf74407b	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2013-06-24 13:53:15 -04:00
Gao feng	c8fc51cfa7	netfilter: ipt_ULOG: fix incorrect setting of ulog timer The parameter of setup_timer should be &ulog->nlgroup[i]. the incorrect parameter will cause kernel panic in ulog_timer. Bug introducted in commit `355430671a` "netfilter: ipt_ULOG: add net namespace support for ipt_ULOG" ebt_ULOG doesn't have this problem. [ I have mangled this patch to fix nlgroup != 0 case, we were also crashing there --pablo ] Tested-by: George Spelvin <linux@horizon.com> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-24 17:10:44 +02:00
Thomas Pedersen	6c7c4cbfd5	mac80211: initialize power mode for mesh STAs Previously the default mesh STA nonpeer power mode was UNKNOWN (0) make the default mesh STA power mode ACTIVE, to prevent unnecessary frame buffering while peering is not yet complete. Fixes a panic in ath9k_htc when adding stations from userspace, and mcast buffered frames are later released. Thanks to Bob Copeland for his help debugging this. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-24 15:59:20 +02:00
Thomas Pedersen	ac49e1a896	mac80211: allow self-protected frame tx without sta Useful for userspace mesh to authenticate and peer without a station entry, since both steps may fail anyway. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-24 15:59:20 +02:00
Arend van Spriel	a33d402610	cfg80211: fix compilation warning for cfg80211_leave_all() The following compilation issue popped up moving from v3.10-rc1 to v3.10-rc6 after merging wireless-testing. net/wireless/sysfs.c:86:13: error: 'cfg80211_leave_all' defined but not used [-Werror=unused-function] The function is only called when CONFIG_PM is enabled. Moving the function under CONFIG_PM as well. Signed-off-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-24 15:57:32 +02:00
Ben Greear	f9bef3df52	wireless: check for dangling wdev->current_bss pointer If it is still set when the netdev is being deleted, then we are about to leak a pointer. Warn and clean up in that case. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-24 15:55:36 +02:00
Ben Greear	0e3a39b562	wireless: add comments about bss refcounting Should help the next person that tries to understand the bss refcounting logic. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-24 15:54:45 +02:00
Ben Greear	6f390908e5	wireless: Make sure __cfg80211_connect_result always puts bss Otherwise, we can leak a bss reference. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-24 15:51:22 +02:00
Florian Westphal	797a7d66d2	netfilter: ctnetlink: send event when conntrack label was modified commit `0ceabd8387` (netfilter: ctnetlink: deliver labels to userspace) sets the event bit when we raced with another packet, instead of raising the event bit when the label bit is set for the first time. commit `9b21f6a909` (netfilter: ctnetlink: allow userspace to modify labels) forgot to update the event mask in the "conntrack already exists" case. Both issues result in CTA_LABELS attribute not getting included in the conntrack event. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-24 11:32:56 +02:00
Balazs Peter Odor	5aed93875c	netfilter: nf_nat_sip: fix mangling In (b20ab9c netfilter: nf_ct_helper: better logging for dropped packets) there were some missing brackets around the logging information, thus always returning drop. Closes https://bugzilla.kernel.org/show_bug.cgi?id=60061 Signed-off-by: Balazs Peter Odor <balazs@obiserver.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-24 11:32:40 +02:00
Wedson Almeida Filho	aeb193ea6c	net: Unmap fragment page once iterator is done Callers of skb_seq_read() are currently forced to call skb_abort_seq_read() even when consuming all the data because the last call to skb_seq_read (the one that returns 0 to indicate the end) fails to unmap the last fragment page. With this patch callers will be allowed to traverse the SKB data by calling skb_prepare_seq_read() once and repeatedly calling skb_seq_read() as originally intended (and documented in the original commit `677e90eda`), that is, only call skb_abort_seq_read() if the sequential read is actually aborted. Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 01:46:01 -07:00
David S. Miller	7e2f934dc5	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== I would guess that this is the last big wireless pull request before the 3.11 merge window... Regarding the mac80211 bits, Johannes says: "I have a number of mesh fixes and improvements from Colleen, Jacob, Ashok and Thomas, powersave fixes in mac80211 from Alex, improved management-TX from Antonio, and a few various things, including locking fixes, from others and myself. Overall though, nothing really stands out." As for the iwlwifi bits, Johannes says: "Emmanuel contributed two AP mode fixes, removed an unused field, fixed a comment and added a warning for something that shouldn't happen in practice, and I removed the declaration of a function that doesn't even exist and cleaned up a small include." "This time I have a number of cleanups, a small fix from Emmanuel and two performance improvements that combined reduce our driver's CPU utilisation as much as 75% in high TX-throughput scenarios." "These two patches fix two issues with using rfkill randomly during traffic, which would then cause our driver to stop working and not be able to recover at all." Regarding the ath6kl bits, Kalle says: "Here are few simple patches for ath6kl. We have a suspend crash fix for USB from Shafi, use of mac_pton(), a compiler warning fix and a fix for module initialisation error path." Kalle also sends the biggest single item of note, the new ath10k driver for Qualcomm Atheros 802.11ac CQA98xx devices. Included is an NFC pull, of which Samuel says: "These are the pending NFC patches for the 3.11 merge window. It contains the pending fixes that were on nfc-fixes (nfc-fixes-3.10-2), along with a few more for the pn544 and pn533 drivers, the LLCP disconnection path and an LLCP memory leak. Highlights for this one are: - An initial secure element API. NFC chipsets can carry an embedded secure element or get access to the SIM one. In both cases they control the secure elements and this API provides a way to discover, enable and disable the available SEs. It also exports that to userspace in order for SE focused middleware to actually do something with them (e.g. payments). - NCI over SPI support. SPI is the most complex NCI specified transport layer and we now have support for it in the kernel. The next step will be to implement drivers for NCI chipsets using this transport like e.g. bcm2079x. - NFC p2p hardware simulation driver. We now have an nfcsim driver that is mostly a loopback device between 2 NFC interfaces. It also implements the rest of the NFC core API like polling and target detection. This driver, with neard running on top of it, allows us to completely test the LLCP, SNEP and Handover implementation without physical hardware. - A Firmware update netlink API. Most (All ?) HCI chipsets have a special firmware update mode where applications can push a new firmware that will be flashed. We now have a netlink API for providing that mode to e.g. nfctool." On top of all that, there are a variety of updates to brcmfmac, iwlegacy, rtlwifi, wil6210, and the TI wl12xx drivers. As usual, the bcma and ssb busses get a little love as well, as do a handful of others here and there. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 00:31:02 -07:00
Pravin B Shelar	479b1a5825	openvswitch: Use correct config guard. This bug was introduced by commit `aa310701e7` (openvswitch: Add gre tunnel support.) Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 00:16:46 -07:00
Cong Wang	7c77602f57	bridge: fix a typo in comments Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 00:15:54 -07:00
Eric Dumazet	60877a32bc	net: allow large number of tx queues netif_alloc_netdev_queues() uses kcalloc() to allocate memory for the "struct netdev_queue *_tx" array. For large number of tx queues, kcalloc() might fail, so this patch does a fallback to vzalloc(). As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT to kzalloc() flags to do this fallback only when really needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-23 23:56:55 -07:00
Asias He	a49dd9dcb5	VSOCK: Fix VSOCK_HASH and VSOCK_CONN_HASH If we mod with VSOCK_HASH_SIZE -1, we get 0, 1, .... 249. Actually, we have vsock_bind_table[0 ... 250] and vsock_connected_table[0 .. 250]. In this case the last entry will never be used. We should mod with VSOCK_HASH_SIZE instead. Signed-off-by: Asias He <asias@redhat.com> Acked-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-23 23:51:48 -07:00
Asias He	0fc9324676	VSOCK: Remove unnecessary label Signed-off-by: Asias He <asias@redhat.com> Acked-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-23 23:51:48 -07:00
Asias He	dce1a28777	VSOCK: Return VMCI_ERROR_NO_MEM when fails to allocate skb vmci_transport_recv_dgram_cb always return VMCI_SUCESS even if we fail to allocate skb, return VMCI_ERROR_NO_MEM instead. Signed-off-by: Asias He <asias@redhat.com> Acked-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-23 23:51:48 -07:00
Asias He	b3a6dfe817	VSOCK: Introduce vsock_auto_bind helper This peace of code is called three times, let's have a helper for it. Signed-off-by: Asias He <asias@redhat.com> Acked-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-23 23:51:48 -07:00
Cong Wang	b33698e267	ipv6: remove a useless pr_info() in addrconf_gre_config() This is debug info, should at least be pr_debug(), but given that this code is in upstream for two years, there is no need to keep this debugging printk any more, so just remove it. Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-23 18:46:36 -07:00
Gustavo Padovan	b8f4e06800	Bluetooth: Improve comments on the HCI_Delete_Store_Link_Key issue Some Bluetooth controllers doesn't support this command so we first need to check for its support before sending it. This patch adds a lengthful commentary about this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 03:05:47 +01:00
Jaganath Kanakkassery	3f6fa3d489	Bluetooth: Fix invalid length check in l2cap_information_rsp() The length check is invalid since the length varies with type of info response. This was introduced by the commit `cb3b3152b2` Because of this, l2cap info rsp is not handled and command reject is sent. > ACL data: handle 11 flags 0x02 dlen 16 L2CAP(s): Info rsp: type 2 result 0 Extended feature mask 0x00b8 Enhanced Retransmission mode Streaming mode FCS Option Fixed Channels < ACL data: handle 11 flags 0x00 dlen 10 L2CAP(s): Command rej: reason 0 Command not understood Cc: stable@vger.kernel.org Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Chan-Yeol Park <chanyeol.park@samsung.com> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:24:58 +01:00
Chen Gang	673e1dd7ed	Bluetooth: hidp: using strlcpy instead of strncpy, also beautify code. For NULL terminated string, need always let it ended by zero. Since have already called memcpy() to initialize 'ci', so need not redundant initialization. Better use ''if(session->hid) {} else if(session->input) {}"" instead of ''if(session->hid) {}; if(session->input) {};'' Signed-off-by: Chen Gang <gang.chen@asianux.com> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:53 +01:00
Andrei Emeltchenko	0a804654af	Bluetooth: Remove unneeded flag Remove HCI_LINK_KEYS flag since using HCI_MGMT is enough for test that user space expects the kernel managing link keys. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:53 +01:00
Andrei Emeltchenko	034cbea093	Bluetooth: Use HCI_MGMT instead of HCI_LINK_KEYS flag Use HCI_MGMT flag instead of HCI_LINK_KEYS flag. There is a problem with HCI_LINK_KEYS flag since it is set only when link keys are loaded. Otherwise kernel assumes that old interface is used. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:52 +01:00
Andre Guedes	12602d0cc0	Bluetooth: Mgmt Device Found Event We only want to send Mgmt Device Found Events if we are running the Device Discovery procedure (started by the MGMT Start Discovery Command). Inquiry or LE scanning triggered by HCI raw interface (e.g. hcitool) or kernel internals should not send Mgmt Device Found Events. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:52 +01:00
Andre Guedes	8892d8beb3	Bluetooth: Remove empty event handler This patch removes the hci_cc_le_set_scan_param event handler. This handler became empty because failures of this event are now handled by start_discovery_complete function in mgmt.c. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:52 +01:00
Andre Guedes	b0434345f2	Bluetooth: Remove inquiry helpers This patch removes hci_do_inquiry and hci_cancel_inquiry helpers. We now use the HCI request framework in device discovery functionality and these helpers are no longer needed. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:52 +01:00
Andre Guedes	917eedc56c	Bluetooth: Remove LE scan helpers This patch removes the LE scan helpers hci_le_scan and hci_cancel_ le_scan and all code related to it. We now use the HCI request framework in device discovery functionality and these helpers are no longer needed. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:51 +01:00
Andre Guedes	3fd319b830	Bluetooth: Refactor hci_cc_le_set_scan_enable This patch does a trivial refactoring in hci_cc_le_set_scan_enable. Since start and stop discovery command failures are now handled in mgmt layer, the status check became empty. So, we can move it to outside the switch statement. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:51 +01:00
Andre Guedes	1183fdcad4	Bluetooth: Make mgmt_stop_discovery_failed static mgmt_stop_discovery_failed is now only used in mgmt.c so we can make it a local function. This patch also moves the mgmt_stop_ discovery_failed definition up in mgmt.c to avoid forward declaration. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:51 +01:00
Andre Guedes	82f4785ca7	Bluetooth: Remove stop discovery handling from hci_event.c Since all mgmt stop discovery command complete events are now handled in stop_discovery_complete callback in mgmt.c, we can remove this handling from hci_event.c. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:51 +01:00
Andre Guedes	0e05bba6f6	Bluetooth: Update stop_discovery to use HCI request This patch modifies the stop_discovery function so it uses the HCI request framework. The HCI request is built according to the current discovery state (inquiry, LE scanning or name resolving) and a complete callback is register to handle the command complete event for the stop discovery command. This way, we move all stop_discovery mgmt handling code spread in hci_event.c to a single place in mgmt.c. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:51 +01:00
Andre Guedes	4c87eaab01	Bluetooth: Use HCI request in interleaved discovery In order to have a better HCI error handling in interleaved discovery functionality, we should use the HCI request framework. This patch updates le_scan_disable_work function so it uses the HCI request framework instead of the hci_send_cmd helper. A complete callback is registered (le_scan_disable_work_complete function) so we are able to trigger the inquiry procedure (if we are running the interleaved discovery) or to stop the discovery procedure (if we are running LE-only discovery). This patch also removes the extra logic in hci_cc_le_set_scan_enable to trigger the inquiry procedure and the mgmt_interleaved_discovery function since they become useless. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:50 +01:00
Andre Guedes	0d8cc935e0	Bluetooth: Move discovery macros to hci_core.h Some of discovery macros will be used in hci_core so we need to define them in common place such as hci_core.h. Thus, this patch moves discovery macros to hci_core.h and also adds the DISCOV_ prefix to them. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:50 +01:00
Andre Guedes	41dc2bd6d1	Bluetooth: Make mgmt_start_discovery_failed static mgmt_start_discovery_failed is now only used in mgmt.c so we can make it a local function. This patch also moves the mgmt_start_ discovery_failed definition up in mgmt.c to avoid forward declaration. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:50 +01:00
Andre Guedes	fef5234a79	Bluetooth: Remove start discovery handling from hci_event.c Since all mgmt start discovery command complete events are now handled in start_discovery_complete callback in mgmt.c, we can remove this handling from hci_event.c. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:50 +01:00
Andre Guedes	7c3077207c	Bluetooth: Update start_discovery to use HCI request This patch modifies the start_discovery function so it uses the HCI request framework. We build the HCI request according to the discovery type (add inquiry or LE scan HCI commands) and run the HCI request. We also register the start_discovery_complete callback which handles mgmt command complete events for this command. This way, we move all start_ discovery mgmt handling code spread in hci_event.c to a single place in mgmt.c. This patch also merges the LE-only and interleaved discovery type cases since these cases are pretty much the same now. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:50 +01:00
Andre Guedes	1f9b9a5dc5	Bluetooth: Make inquiry_cache_flush non-static In order to use HCI request framework in start_discovery, we'll need to call inquiry_cache_flush in mgmt.c. Therefore, this patch adds the hci_ prefix to inquiry_cache_flush and makes it non-static. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:49 +01:00
Johan Hedberg	44f3b0fbaa	Bluetooth: Fix multiple LE socket handling The LE ATT server socket needs to be superseded by any ATT client sockets. Previously this was done by looking at the hcon->out variable (indicating whether the connection is outgoing or incoming) which is a too crude way of determining whether the server socket needs to be picked or not (an outgoing connection doesn't necessarily mean that an ATT client socket has triggered it). This patch extends the ATT server socket lookup function (l2cap_le_conn_ready) to be used for all LE connections (regardless of the hcon->out value) and adds an internal check into the function for the existence of any ATT client sockets (in which case the server socket should be skipped). For this to work reliably all lookups must be done while the l2cap_conn->chan_lock is held, meaning also that the call to l2cap_chan_add needs to be changed to its lockless __l2cap_chan_add counterpart. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:49 +01:00
Johan Hedberg	0cc59a72c7	Bluetooth: Remove useless hci_conn disc_timeout setting There's no need to reset disc_timeout in l2cap_le_conn_ready since HCI_DISCONN_TIMEOUT is the default when the hci_conn is created and there should be no way for it to get changed between creation and l2cap_le_conn_ready being called. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:49 +01:00
Johan Hedberg	5ee9891dd8	Bluetooth: Simplify hci_conn_hold/drop logic for L2CAP The L2CAP code has been incrementing the hci_conn reference for each l2cap_chan instance in the l2cap_conn list. Likewise, the reference is dropped each time an l2cap_chan is removed from the list. The reference counting policy with respect to removal has been clear and explicit in the l2cap_chan_del function, however for addition the function calling 2cap_chan_add has always had to do a separate hci_conn_hold call. What made the counting even more hard to follow is that the hci_connect() procedure increments the reference and the L2CAP layer making this call took advantage of it to use it as its own reference. This patch aims to clarify things by having the call to hci_conn_hold inside __l2cap_chan_add, thereby removing the need to do it in the functions calling __l2cap_chan_add. The reference count for hci_connect is still kept as it's necessary for users such as mgmt_pair_device, however for the L2CAP layer it means that an extra call to hci_conn_drop must be performed once l2cap_chan_add has been done. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:49 +01:00
Johan Hedberg	af1c01349e	Bluetooth: Remove unnecessary L2CAP channel state check In l2cap_att_channel() we're only interested in the BT_CONNECTED state so this state can directly be passed to l2cap_global_chan_by_scid(). This way there's no need to do any additional state check later. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:49 +01:00
Johan Hedberg	60bac184c9	Bluetooth: Remove useless sk variable in l2cap_le_conn_ready The sk variable is of quite little use since it's only used to simplify access in the two bt_sk() calls. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:48 +01:00
Johan Hedberg	97f57c0b14	Bluetooth: Fix duplicate call to l2cap_chan_ready() In l2cap_le_conn_ready() after doing l2cap_chann_add() the LE channel is part of the list which is subsequently iterated in l2cap_conn_ready() in this loop each channel will get l2cap_chan_ready() called which would result in trying to set the channel two times into BT_CONNECTED state. Instead it makes sense to just add the channel but not call chan_ready in l2cap_le_conn_ready, which is what this patch does. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:48 +01:00
Johan Hedberg	d8729922b4	Bluetooth: Add clarifying comment to l2cap_conn_ready() There is an extra call to smp_conn_security() for outgoing LE connections from l2cap_conn_ready() but the reason for this call is far from clear. After a bit of commit history research and using git blame I found out that this extra call is for socket-less pairing processes added by commit `160dc6ac1`. This patch adds a clarifying comment to the code for this. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:48 +01:00
Johan Hedberg	9f22398ce4	Bluetooth: Fix hardcoding ATT CID in __l2cap_chan_add() Since in the future more than the ATT CID may be permissible we should not be hardcoding it for all LE connections in __l2cap_chan_add(). Instead, the source ATT CID should only be set if the destination is also ATT, and in other cases we should just use the existing dynamic CID allocation function. Assigning scid based on dcid means that whenever __l2cap_chan_add() is called that chan->dcid is properly initialized. l2cap_le_conn_ready() wasn't initializing is properly so this is also taken care of in this patch. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:48 +01:00
Johan Hedberg	141d57065a	Bluetooth: Fix EBUSY condition test in l2cap_chan_connect The current test in l2cap_chan_connect is intended to protect against multiple conflicting connect attempts. However, it assumes that there will ever only be a single CID that is connected to, which is not true. We do need to check for conflicts with connect attempts to the same destination CID but this check is not in anyway specific to LE but can be applied to BR/EDR as well. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:47 +01:00
Johan Hedberg	f224ca5fc2	Bluetooth: Fix LE vs BR/EDR selection when connecting The choice between LE and BR/EDR should be made on the destination address type instead of the destination CID. This is particularly important when in the future more than one CID will be allowed for LE. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:47 +01:00
Johan Hedberg	073d1cf35f	Bluetooth: Rename L2CAP_CID_LE_DATA to L2CAP_CID_ATT In future Core Specification versions the ATT CID will be just one of many possible CIDs that can be used for data transfer. Therefore, it makes sense to rename the define for the ATT CID to something less ambigous. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:47 +01:00
Johan Hedberg	c5623556fc	Bluetooth: Handle LE L2CAP signalling in its own function The LE L2CAP signalling channel follows its own rules and will continue to evolve independently from the BR/EDR signalling channel. Therefore, it makes sense to have a clear split from BR/EDR by having a dedicated function for handling LE signalling commands. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-06-23 00:23:47 +01:00
John W. Linville	7d2a47aab2	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: net/wireless/nl80211.c	2013-06-21 15:42:30 -04:00
Eric Dumazet	681f130f39	netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag xt_socket module can be a nice replacement to conntrack module in some cases (SYN filtering for example) But it lacks the ability to match the 3rd packet of TCP handshake (ACK coming from the client). Add a XT_SOCKET_NOWILDCARD flag to disable the wildcard mechanism. The wildcard is the legacy socket match behavior, that ignores LISTEN sockets bound to INADDR_ANY (or ipv6 equivalent) iptables -I INPUT -p tcp --syn -j SYN_CHAIN iptables -I INPUT -m socket --nowildcard -j ACCEPT Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-20 20:28:49 +02:00
Phil Oester	142dcdd3c2	netfilter: nf_conntrack_ipv6: Plug sk_buff leak in fragment handling In commit `4cdd3408` ("netfilter: nf_conntrack_ipv6: improve fragmentation handling"), an sk_buff leak was introduced when dealing with reassembled packets by grabbing a reference to the original skb instead of the reassembled skb. At this point, the leak only impacted conntracks with an associated helper. In commit `58a317f1` ("netfilter: ipv6: add IPv6 NAT support"), the bug was expanded to include all reassembled packets with unconfirmed conntracks. Fix this by grabbing a reference to the proper reassembled skb. This closes netfilter bugzilla #823. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-20 12:01:24 +02:00
Florian Westphal	6547a22187	netfilter: nf_conntrack: avoid large timeout for mid-stream pickup When loose tracking is enabled (default), non-syn packets cause creation of new conntracks in established state with default timeout for established state (5 days). This causes the table to fill up with UNREPLIED when the 'new ack' packet happened to be the last-ack of a previous, already timed-out connection. Consider: A 192.168.x.52792 > 10.184.y.80: F, 426:426(0) ack 9237 win 255 B 10.184.y.80 > 192.168.x.52792: ., ack 427 win 123 <61 second pause> C 10.184.y.80 > 192.168.x.52792: F, 9237:9237(0) ack 427 win 123 D 192.168.x.52792 > 10.184.y.80: ., ack 9238 win 255 B moves conntrack to CLOSE_WAIT and will kill it after 60 second timeout, C is ignored (FIN set), but last packet (D) causes new ct with 5-days timeout. Use UNACK timeout (5 minutes) instead to get rid of these entries sooner when in ESTABLISHED state without having seen traffic in both directions. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-20 11:20:13 +02:00
Daniel Borkmann	130ffbc263	netfilter: check return code from nla_parse_tested These are the only calls under net/ that do not check nla_parse_nested() for its error code, but simply continue execution. If parsing of netlink attributes fails, we should return with an error instead of continuing. In nearly all of these calls we have a policy attached, that is being type verified during nla_parse_nested(), which we would miss checking for otherwise. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-06-20 11:20:13 +02:00
Rami Rosen	af92e5425e	inet: frag , remove an empty ifdef. This patch removes an empty ifdef from inet_frag_intern() in net/ipv4/inet_fragment.c. commit `b67bfe0d42` (hlist: drop the node parameter from iterators) removed hlist from net/ipv4/inet_fragment.c, but did not remove the enclosing ifdef command, which is now empty. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 23:06:52 -07:00
Eric Dumazet	c9364636dc	htb: refactor struct htb_sched fields for performance htb_sched structures are big, and source of false sharing on SMP. Every time a packet is queued or dequeue, many cache lines must be touched because structures are not lay out properly. By carefully splitting htb_sched in two parts, and define sub structures to increase data locality, we can improve performance dramatically on SMP. New htb_prio structure can also be used in htb_class to increase data locality. I got 26 % performance increase on a 24 threads machine, with 200 concurrent netperf in TCP_RR mode, using a HTB hierarchy of 4 classes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 23:06:52 -07:00
Cong Wang	bcefe17cff	tcp: introduce a per-route knob for quick ack In previous discussions, I tried to find some reasonable heuristics for delayed ACK, however this seems not possible, according to Eric: "ACKS might also be delayed because of bidirectional traffic, and is more controlled by the application response time. TCP stack can not easily estimate it." "ACK can be incredibly useful to recover from losses in a short time. The vast majority of TCP sessions are small lived, and we send one ACK per received segment anyway at beginning or retransmits to let the sender smoothly increase its cwnd, so an auto-tuning facility wont help them that much." and according to David: "ACKs are the only information we have to detect loss. And, for the same reasons that TCP VEGAS is fundamentally broken, we cannot measure the pipe or some other receiver-side-visible piece of information to determine when it's "safe" to stretch ACK. And even if it's "safe", we should not do it so that losses are accurately detected and we don't spuriously retransmit. The only way to know when the bandwidth increases is to "test" it, by sending more and more packets until drops happen. That's why all successful congestion control algorithms must operate on explicited tested pieces of information. Similarly, it's not really possible to universally know if it's safe to stretch ACK or not." It still makes sense to enable or disable quick ack mode like what TCP_QUICK_ACK does. Similar to TCP_QUICK_ACK option, but for people who can't modify the source code and still wants to control TCP delayed ACK behavior. As David suggested, this should belong to per-path scope, since different pathes may want different behaviors. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Rick Jones <rick.jones2@hp.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Graf <tgraf@suug.ch> CC: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 23:06:51 -07:00
Gao feng	a881ae1f62	ipv6: don't call addrconf_dst_alloc again when enable lo If we disable all of the net interfaces, and enable un-lo interface before lo interface, we already allocated the addrconf dst in ipv6_add_addr. So we shouldn't allocate it again when we enable lo interface. Otherwise the message below will be triggered. unregister_netdevice: waiting for sit1 to become free. Usage count = 1 This problem is introduced by commit `25fb6ca4ed` "net IPv6 : Fix broken IPv6 routing table after loopback down-up" Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 23:04:50 -07:00
Dave Jones	2c0740e4e1	sctp: Convert __list_for_each use to list_for_each Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 23:02:49 -07:00
Weiping Pan	9ef71e0c82	tcp:typo unset should be unsent Signed-off-by: Weiping Pan <wpan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 22:21:09 -07:00
Aydin Arik	c0353c7b5d	ipv4: Fixed MD5 key lookups when adding/ removing MD5 to/ from TCP sockets. MD5 key lookups on a given TCP socket were being performed incorrectly. This fix alters parameter inputs to the MD5 lookup function tcp_md5_do_lookup, which is called by functions tcp_md5_do_add and tcp_md5_do_del. Specifically, the change now inputs the correct address and address family required to make a proper lookup. Signed-off-by: Aydin Arik <aydin.arik@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 21:21:53 -07:00
Nicolas Dichtel	c2ff682a6f	sit: fix an oops when IFLA_IPTUN_PROTO is not set The use of this attribute has been added in `32b8a8e59c` (sit: add IPv4 over IPv4 support). It is optional, by default proto is IPPROTO_IPV6. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 21:18:17 -07:00
Gao feng	dc25c676f5	neigh: disallow un-init_net to change thresh of neigh thresh and interval are global resources, only init net can change them. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 21:13:24 -07:00
Gao feng	170d6f9954	neigh: only allow init_net to change the default neigh_parms Though we don't export the /proc/sys/net/ipv[4,6]/neigh/default/ directory to the un-init_net, but we can still use cmd such as "ip ntable change name arp_cache locktime 129" to change the locktime of default neigh_parms. This patch disallows the un-init_net to find out the neigh_table.parms. So the un-init_net will failed to influence the init_net. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 21:13:24 -07:00
Gao feng	cf89d6b280	neigh: no need to call lookup_neigh_parms in neigh_parms_alloc neigh_table.parms always exist and is initialized,kmemdup can use it to create new neigh_parms, actually lookup_neigh_parms here will return neigh_table.parms too. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 21:13:24 -07:00
Pravin B Shelar	aa310701e7	openvswitch: Add gre tunnel support. Add gre vport implementation. Most of gre protocol processing is pushed to gre module. It make use of gre demultiplexer therefore it can co-exist with linux device based gre tunnels. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:42 -07:00
Pravin B Shelar	a3e82996a8	openvswitch: Optimize flow key match for non tunnel flows. Following patch adds start offset for sw_flow-key, so that we can skip tunneling information in key for non-tunnel flows. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:41 -07:00
Pravin B Shelar	ffe3f43217	openvswitch: Expand action buffer size. MAX_ACTIONS_BUFSIZE limits action list size, set tunnel action needs extra space on action list, for now increase max actions list limit. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:41 -07:00
Pravin B Shelar	7d5437c709	openvswitch: Add tunneling interface. Add ovs tunnel interface for set tunnel action for userspace. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:41 -07:00
Pravin B Shelar	74f84a5726	openvswitch: Copy individual actions. Rather than validating actions and then copying all actiaons in one block, following patch does same operation in single pass. This validate and copy action one by one. This is required for ovs tunneling patch. This patch does not change any functionality. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:41 -07:00
Pravin B Shelar	3d7b46cd20	ip_tunnel: push generic protocol handling to ip_tunnel module. Process skb tunnel header before sending packet to protocol handler. this allows code sharing between gre and ovs gre modules. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:41 -07:00
Pravin B Shelar	0e6fbc5b6c	ip_tunnels: extend iptunnel_xmit() Refactor various ip tunnels xmit functions and extend iptunnel_xmit() so that there is more code sharing. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:41 -07:00
Pravin B Shelar	45f2e9976c	gre: export gre_handle_offloads() function. This is required for OVS GRE offloading. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:41 -07:00
Pravin B Shelar	752f36da68	gre: export gre_build_header() function. This is required for ovs gre module. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:40 -07:00
Pravin B Shelar	bda7bb4634	gre: Allow multiple protocol listener for gre protocol. Currently there is only one user is allowed to register for gre protocol. Following patch adds de-multiplexer. So that multiple modules can listen on gre protocol e.g. kernel gre devices and ovs. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:40 -07:00
Pravin B Shelar	20fd4d1f04	gre: Simplify gre protocol registration locking. Use cmpxchg() for atomic protocol registration which saves code and data space. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 18:07:40 -07:00
David S. Miller	d98cae64e4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/wireless/ath/ath9k/Kconfig drivers/net/xen-netback/netback.c net/batman-adv/bat_iv_ogm.c net/wireless/nl80211.c The ath9k Kconfig conflict was a change of a Kconfig option name right next to the deletion of another option. The xen-netback conflict was overlapping changes involving the handling of the notify list in xen_netbk_rx_action(). Batman conflict resolution provided by Antonio Quartulli, basically keep everything in both conflict hunks. The nl80211 conflict is a little more involved. In 'net' we added a dynamic memory allocation to nl80211_dump_wiphy() to fix a race that Linus reported. Meanwhile in 'net-next' the handlers were converted to use pre and post doit handlers which use a flag to determine whether to hold the RTNL mutex around the operation. However, the dump handlers to not use this logic. Instead they have to explicitly do the locking. There were apparent bugs in the conversion of nl80211_dump_wiphy() in that we were not dropping the RTNL mutex in all the return paths, and it seems we very much should be doing so. So I fixed that whilst handling the overlapping changes. To simplify the initial returns, I take the RTNL mutex after we try to allocate 'tb'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-19 16:49:39 -07:00
John W. Linville	4067c666f2	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-06-19 15:24:36 -04:00
John W. Linville	2f2a8846d5	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-06-19 14:50:44 -04:00
Johannes Berg	f1940c5730	cfg80211: hold BSS over association process This fixes the potential issue that the BSS struct that we use and later assign to wdev->current_bss is removed from the scan list while associating. Also warn when we don't have a BSS struct in connect_result unless it's from a driver that only has the connect() API. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-19 18:55:39 +02:00
Johannes Berg	959867fa55	cfg80211: require passing BSS struct back to cfg80211_assoc_timeout Doing so will allow us to hold the BSS (not just ref it) over the association process, thus ensuring that it doesn't time out and gets invisible to the user (e.g. in 'iw wlan0 link'.) This also fixes a leak in mac80211 where it doesn't always release the BSS struct properly in all cases where calling this function. This leak was reported by Ben Greear. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-19 18:55:39 +02:00
Johannes Berg	86e8cf98de	nl80211: use small state buffer for wiphy_dump Avoid parsing the original dump message again and again by allocating a small state struct that is used by the functions involved in the dump, storing this struct in cb->args[0]. This reduces the memory allocation size as well. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-19 18:55:38 +02:00
Johannes Berg	f93beba705	Merge remote-tracking branch 'mac80211/master' into HEAD Merge mac80211 to avoid conflicts with the nl80211 attrbuf changes. Conflicts: net/mac80211/iface.c net/wireless/nl80211.c Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-19 18:55:12 +02:00
Johannes Berg	3a5a423bb9	nl80211: fix attrbuf access race by allocating a separate one Since my commit `3713b4e364` ("nl80211: allow splitting wiphy information in dumps"), nl80211_dump_wiphy() uses the global nl80211_fam.attrbuf for parsing the incoming data. This wouldn't be a problem if it only did so on the first dump iteration which is locked against other commands in generic netlink, but due to space constraints in cb->args (the needed state doesn't fit) I decided to always parse the original message. That's racy though since nl80211_fam.attrbuf could be used by some other parsing in generic netlink concurrently. For now, fix this by allocating a separate parse buffer (it's a bit too big for the stack, currently 1448 bytes on 64-bit). For -next, I'll change the code to parse into the global buffer in the first round only and then allocate a smaller buffer to keep the data in cb->args. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-19 18:31:20 +02:00
Julian Anastasov	06f3d7f973	ipvs: SCTP ports should be writable in ICMP packets Make sure that SCTP ports are writable when embedded in ICMP from client, so that ip_vs_nat_icmp can translate them safely. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-06-19 09:53:52 +09:00
John W. Linville	9d1059c248	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2013-06-18 14:04:51 -04:00
Jeff Layton	e401452d92	rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set We had a report of a reproducible WARNING: [ 1360.039358] ------------[ cut here ]------------ [ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0() [ 1360.049880] Hardware name: HP Z200 Workstation [ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded: auth_rpcgss] [ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G I -------------- 3.9.0-0.55.el7.x86_64 #1 [ 1360.116771] Call Trace: [ 1360.119219] [<ffffffff810610c0>] warn_slowpath_common+0x70/0xa0 [ 1360.125208] [<ffffffff810611aa>] warn_slowpath_null+0x1a/0x20 [ 1360.131025] [<ffffffff811af46d>] d_set_d_op+0x8d/0xc0 [ 1360.136159] [<ffffffffa05a7d6f>] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc] [ 1360.143710] [<ffffffffa05a8cc6>] rpc_mkpipe_dentry+0x86/0x170 [sunrpc] [ 1360.150311] [<ffffffffa062a7b6>] nfs_idmap_new+0x96/0x130 [nfsv4] [ 1360.156475] [<ffffffffa062e7cd>] nfs4_init_client+0xad/0x2d0 [nfsv4] [ 1360.162902] [<ffffffff812f02df>] ? idr_get_empty_slot+0x16f/0x3c0 [ 1360.169062] [<ffffffff812f0582>] ? idr_mark_full+0x52/0x60 [ 1360.174615] [<ffffffff812f0699>] ? idr_alloc+0x79/0xe0 [ 1360.179826] [<ffffffffa0598081>] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc] [ 1360.187635] [<ffffffffa05980f3>] ? rpc_init_wait_queue+0x13/0x20 [sunrpc] [ 1360.194493] [<ffffffffa05d05da>] nfs_get_client+0x27a/0x350 [nfs] [ 1360.200666] [<ffffffffa062e438>] nfs4_set_client.isra.8+0x78/0x100 [nfsv4] [ 1360.207624] [<ffffffffa062f2f3>] nfs4_create_server+0xf3/0x3a0 [nfsv4] [ 1360.214222] [<ffffffffa06284be>] nfs4_remote_mount+0x2e/0x60 [nfsv4] [ 1360.220644] [<ffffffff8119ea79>] mount_fs+0x39/0x1b0 [ 1360.225691] [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20 [ 1360.231348] [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0 [ 1360.236822] [<ffffffffa0628396>] nfs_do_root_mount+0x86/0xc0 [nfsv4] [ 1360.243246] [<ffffffffa06287b4>] nfs4_try_mount+0x44/0xc0 [nfsv4] [ 1360.249410] [<ffffffffa05d1457>] ? get_nfs_version+0x27/0x80 [nfs] [ 1360.255659] [<ffffffffa05db985>] nfs_fs_mount+0x5c5/0xd10 [nfs] [ 1360.261650] [<ffffffffa05dc550>] ? nfs_clone_super+0x140/0x140 [nfs] [ 1360.268074] [<ffffffffa05da8e0>] ? param_set_portnr+0x60/0x60 [nfs] [ 1360.274406] [<ffffffff8119ea79>] mount_fs+0x39/0x1b0 [ 1360.279443] [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20 [ 1360.285088] [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0 [ 1360.290556] [<ffffffff811b9f5d>] do_mount+0x1fd/0xa00 [ 1360.295677] [<ffffffff81137dee>] ? __get_free_pages+0xe/0x50 [ 1360.301405] [<ffffffff811b9be6>] ? copy_mount_options+0x36/0x170 [ 1360.307479] [<ffffffff811ba7e3>] sys_mount+0x83/0xc0 [ 1360.312515] [<ffffffff8160ad59>] system_call_fastpath+0x16/0x1b [ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]--- The problem is that we're ending up in __rpc_lookup_create_exclusive with a negative dentry that already has d_op set. A little debugging has shown that when we hit this, the d_ops are already set to simple_dentry_operations. I believe that what's happening is that during a mount, idmapd is racing in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap. Before that dentry reference is released, the kernel races in to create that file and finds the new negative dentry, which already has the d_op set. This patch just avoids setting the d_op if it's already set. simple_dentry_operations and rpc_dentry_operations are functionally equivalent so it shouldn't matter which one it's set to. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-06-18 13:46:50 -04:00
Antonio Quartulli	52874a5e39	Revert "mac80211: in IBSS use the Auth frame to trigger STA reinsertion" This reverts commit `6d810f1032` In this way an IBSS station will not use the AUTH messages to trigger a state reinitialisation anymore. The behaviour was racy and was not working properly. It has been introduced to help wpa_supplicant to support IBSS/RSN, however all the logic is now getting moved into wpa_s itself which will also be in charge of handling the AUTH messages thanks to the mgmt frame registration. If userspace does not register for receiving AUTH frames then mac80211 will still reply by itself. At the same time, the auth frame registration counter can be removed since it is not needed anymore. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> [remove unused variable] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-18 16:36:59 +02:00
Simon Wunderlich	3aede78aad	mac80211: change IBSS channel state to chandef This should make some parts cleaner and is also required for handling 5/10 MHz properly. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-18 16:27:25 +02:00
Simon Wunderlich	0418a44583	mac80211: fix various components for the new 5 and 10 MHz widths This is a collection of minor fixes: * don't allow HT IEs in IBSS for 5/10 MHz * don't allow HT IEs in Mesh for 5/10 MHz * don't downgrade from/to 5 and 10 MHz channels * don't try HT rates for 5 and 10 MHz channels when selecting rates Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-18 16:17:11 +02:00
Simon Wunderlich	2f301ab29e	nl80211/cfg80211: add 5 and 10 MHz defines and wiphy flag Add defines for 5 and 10 MHz channel width and fix channel handling functions accordingly. Also check for and report the WIPHY_FLAG_SUPPORTS_5_10_MHZ capability. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> [fix spelling in comment] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-18 16:06:50 +02:00
Thomas Pedersen	f81a9dedaf	mac80211: update mesh beacon on workqueue Instead of updating the mesh beacon immediately when requested (which would require the sdata_lock()), defer it to the mac80211 workqueue. Fixes yet another deadlock on calling sta_info_flush() with the sdata_lock() held from ieee80211_stop_mesh(). We could just drop the sdata_lock() around the mesh_sta_cleanup() call, but this path is also taken from several non-locked error paths. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> [fix comment position] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-18 15:57:27 +02:00
Simon Wunderlich	a1193be83b	nl80211: use attributes to parse beacons only the attributes are required and not the whole netlink info, as the function accesses the attributes only anyway. This makes it easier to parse nested beacon IEs later. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-06-18 15:55:19 +02:00
Matthias Schiffer	33be081a81	ipv6: ndisc: fix ndisc_send_redirect writing to the wrong skb Since some refactoring in `5f5a011`, ndisc_send_redirect called ndisc_fill_redirect_hdr_option on the wrong skb, leading to data corruption or in the worst case a panic when the skb_put failed. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 22:59:12 -07:00
Linus Lüssing	32de868cbc	bridge: fix switched interval for MLD Query types General Queries (the one with the Multicast Address field set to zero / '::') are supposed to have a Maximum Response Delay of [Query Response Interval], while for Multicast-Address-Specific Queries it is [Last Listener Query Interval] - not the other way round. (see RFC2710, section 7.3+7.8) Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 17:11:22 -07:00
Fernando Luis Vazquez Cao	788dfcacac	vlan: restore ethtool ABI to control VLAN hardware acceleration As part of the push to add 802.1ad server provider tagging support to the kernel the VLAN features flags were renamed. Unfortunately the kernel name for the VLAN hardware acceleration features that the kernel shows user space was included in the rename, which broke ethtool (txvlan and rxvlan options do not work). This patch restores the original names, i.e. the original ABI. If we wanted to make clear to users that we are refering to CTAGs we can always change ethtool's short_name and long_name for these features (for example something along the lines of txvlan -> txvlan-ctag, tx-vlan-offload -> tx-vlan-ctag-offload). Cc: Patrick McHardy <kaber@trash.net> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 17:09:35 -07:00
Daniel Borkmann	dda9192851	net: sctp: remove SCTP_STATIC macro SCTP_STATIC is just another define for the static keyword. It's use is inconsistent in the SCTP code anyway and it was introduced in the initial implementation of SCTP in 2.5. We have a regression suite in lksctp-tools, but this is for user space only, so noone makes use of this macro anymore. The kernel test suite for 2.5 is incompatible with the current SCTP code anyway. So simply Remove it, to be more consistent with the rest of the kernel code. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 17:08:05 -07:00
Daniel Borkmann	939cfa75a0	net: sctp: get rid of t_new macro for kzalloc t_new rather obfuscates things where everyone else is using actual function names instead of that macro, so replace it with kzalloc, which is the function t_new wraps. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 17:08:04 -07:00
David S. Miller	c5b248dd05	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into wireless John W. Linville says: ==================== This will probably be the last batch of wireless fixes intended for 3.10. Many of these are one- or two-liners, and a couple of others are mostly relocating existing code to avoid races or to limit the code to effecting specific hardware, etc. The mac80211 fixes have a couple of exceptions to the above. Regarding those, Johannes says: "Following davem's complaint about my patch, here's a new pull request w/o the patch he was complaining about, but instead with the const fix rolled into the fix. I have a fix for radar detection, one for rate control and a workaround for broken HT APs which is a regression fix because we didn't rely on them to be correct before." Johannes also sends some iwlwifi fixes: "I picked up Nikolay's patch for the chain noise calibration bug that seems to have been there forever, a fix from Emmanuel for setting TX flags on BAR frames and a fix of my own to avoid printing request_module() errors if the kernel isn't even modular. We also have our own version of Stanislaw's fix for rate control." Along with those... Anderson Lizardo fixes a Bluetooth memory corruption bug when an MTU value is set to too small of a value. Arend van Spriel sends a revised brcmsmac bug that fixes a regression caused by a bad return value in an earlier patch. He also sends a brcmfmac fix to avoid an oops when loading the driver at boot. Daniel Drake fixes a race condition in btmrvl that causes hangs on suspend for OLPC hardware. Johan Hedberg adds a check to avoid sending a HCI_Delete_Stored_Link_Key command to devices that don't support them, avoiding some scary looking log spam. Stanislaw Gruszka gives us a fix for iwlegacy to be able to use rates higher than 1Mb/s on older wireless networks. He also sends an rt2x00 fix to reinstate older tx power handling behavior for some devices that didn't work well with the current code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 16:15:51 -07:00
David S. Miller	e00c7f1fad	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes. They are targeted to the TCP option targets, that have receive some scrinity in the last week. The changes are: * Fix TCPOPTSTRIP, it stopped working in the forward chain as tcp_hdr uses skb->transport_header, and we cannot use that in the forwarding case, from myself. * Fix default IPv6 MSS in TCPMSS in case of absence of TCP MSS options, from Phil Oester. * Fix missing fragmentation handling again in TCPMSS, from Phil Oester. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 16:13:45 -07:00
Ying Xue	2537af9dca	tipc: remove dev_base_lock use from enable_bearer Convert enable_bearer() to RCU locking with dev_get_by_name(). Based on a similar changeset in commit `840a185d` ["aoe: remove dev_base_lock use from aoecmd_cfg_pkts()"] -- quoting that: "dev_base_lock is the legacy way to lock the device list, and is planned to disappear. (writers hold RTNL, readers hold RCU lock)" Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Ying Xue	126c052464	tipc: fix wrong return value for link_send_sections_long routine When skb buffer cannot be allocated in link_send_sections_long(), -ENOMEM error code instead of -EFAULT should be returned to its caller. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Ying Xue	7410f967ba	tipc: make tipc_link_send_sections_fast exit earlier Once message build request function returns invalid code, the process of sending message cannot continue. So in case of message build failure, tipc_link_send_sections_fast() should return immediately. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Ying Xue	796c75d0d3	tipc: enhance priority of link protocol packet pfifo_fast is set as default traffic class queueing discipline. This queue has three so called "bands". Within each band, FIFO rules apply. However, as long as there are packets waiting in band 0, band 1 won't be processed. Now all kind of TIPC type packet priorities are never set, that is, their priorities are 0, so they are mapped to band 1 of pfifo_fast qdisc. But, especially during link congestion, if link protocol packet can be sent out as earlier as possible than other type of packets so that protocol packet can arrive at peer endpoint in time, the peer will timely reset its link timeout timer to keep the link alive. So enhancing the priority of link protocol packets can meet the specific demand to avoid unnecessary link reset due to a transient link congestion. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Paul Gortmaker	ae8509c420	tipc: cosmetic realignment of function arguments No runtime code changes here. Just a realign of the function arguments to start where the 1st one was, and fit as many args as can be put in an 80 char line. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Ying Xue	c0fee8aca7	tipc: save sock structure pointer instead of void pointer to tipc_port Directly save sock structure pointer instead of void pointer to avoid unnecessary cast conversions. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Ying Xue	28e5297281	tipc: convert config_lock from spinlock to mutex As the configuration server is now running under process context, it's unnecessary for us to have a spinlock serializing the TIPC configuration process. Instead, we replace it with a mutex lock, which gives us more freedom. For instance, we can now call pre-emptable functions within the protected area. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Ying Xue	3c5db8e4ec	tipc: rename tipc_createport_raw to tipc_createport After the removal of the native API, there is now only one way to to create a TIPC port instance -- the function tipc_createport_raw(). We make it more readable by renaming it to tipc_createport(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:01 -07:00
Ying Xue	f1733d7580	tipc: remove user_port instance from tipc_port structure After the native API has been completely removed, the 'user_port' field in struct tipc_port becomes unused, and can be removed. As a consequence, the "usrmem" argument in tipc_msg_build() is no longer needed, and so we remove that one too. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:00 -07:00
Ying Xue	198d73b82b	tipc: delete code orphaned by new server infrastructure Having completed the conversion of the topology server and configuration server to use the new server infrastructure, the following functions become unused, and can be deleted: - tipc_createport() - port_wakeup_sh() - port_dispatcher() - port_dispatcher_sigh() - tipc_send_buf_fast() - tipc_send_buf2port Additionally, the following variables become orphaned, and can be deleted: - tipc_msg_err_event - tipc_named_msg_err_event - tipc_conn_shutdown_event - tipc_msg_event - tipc_named_msg_event - tipc_conn_msg_event - tipc_continue_event - msg_queue_head - msg_queue_tail - queue_lock Deletion is done here in a separate commit in order to allow the actual conversion changes to be more easily viewed. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:00 -07:00
Ying Xue	7d0ab17b74	tipc: convert configuration server to use new server facility As the new socket-based TIPC server infrastructure has been introduced, we can now convert the configuration server to use it. Then we can take future steps to simplify the configuration server locking policy. Some minor reordering of initialization is done, due to the dependency on having tipc_socket_init completed. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:00 -07:00
Ying Xue	13a2e89873	tipc: convert topology server to use new server facility As the new TIPC server infrastructure has been introduced, we can now convert the TIPC topology server to it. We get two benefits from doing this: 1) It simplifies the topology server locking policy. In the original locking policy, we placed one spin lock pointer in the tipc_subscriber structure to reuse the lock of the subscriber's server port, controlling access to members of tipc_subscriber instance. That is, we only used one lock to ensure both tipc_port and tipc_subscriber members were safely accessed. Now we introduce another spin lock for tipc_subscriber structure only protecting themselves, to get a finer granularity locking policy. Moreover, the change will allow us to make the topology server code more readable and maintainable. 2) It fixes a bug where sent subscription events may be lost when the topology port is congested. Using the new service, the topology server now queues sent events into an outgoing buffer, and then wakes up a sender process which has been blocked in workqueue context. The process will keep picking events from the buffer and send them to their respective subscribers, using the kernel socket interface, until the buffer is empty. Even if the socket is congested during transmission there is no risk that events may be dropped, since the sender process may block when needed. Some minor reordering of initialization is done, since we now have a scenario where the topology server must be started after socket initialization has taken place, as the former depends on the latter. And overall, we see a simplification of the TIPC subscriber code in making this changeover. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:00 -07:00
Ying Xue	c5fa7b3cf3	tipc: introduce new TIPC server infrastructure TIPC has two internal servers, one providing a subscription service for topology events, and another providing the configuration interface. These servers have previously been running in BH context, accessing the TIPC-port (aka native) API directly. Apart from these servers, even the TIPC socket implementation is partially built on this API. As this API may simultaneously be called via different paths and in different contexts, a complex and costly lock policiy is required in order to protect TIPC internal resources. To eliminate the need for this complex lock policiy, we introduce a new, generic service API that uses kernel sockets for message passing instead of the native API. Once the toplogy and configuration servers are converted to use this new service, all code pertaining to the native API can be removed. This entails a significant reduction in code amount and complexity, and opens up for a complete rework of the locking policy in TIPC. The new service also solves another problem: As the current topology server works in BH context, it cannot easily be blocked when sending of events fails due to congestion. In such cases events may have to be silently dropped, something that is unacceptable. Therefore, the new service keeps a dedicated outbound queue receiving messages from BH context. Once messages are inserted into this queue, we will immediately schedule a work from a special workqueue. This way, messages/events from the topology server are in reality sent in process context, and the server can block if necessary. Analogously, there is a new workqueue for receiving messages. Once a notification about an arriving message is received in BH context, we schedule a work from the receive workqueue to do the job of receiving the message in process context. As both sending and receive messages are now finished in processes, subscribed events cannot be dropped any more. As of this commit, this new server infrastructure is built, but not actually yet called by the existing TIPC code, but since the conversion changes required in order to use it are significant, the addition is kept here as a separate commit. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:00 -07:00
Erik Hugne	5d21cb70db	tipc: allow implicit connect for stream sockets TIPC's implied connect feature, aka piggyback connect, allows applications to save one syscall and all SYN/SYN-ACK signalling overhead when setting up a connection. Until now, this has only been supported for SEQPACKET sockets. Here, we make it possible to use this feature even with stream sockets. At the connecting side, the connection is completed when the first data message arrives from the accepting peer. This means that we must allow the connecting user to call blocking recv() before the socket has reached state SS_CONNECTED. So we must must relax the state machine check at recv_stream(), and allow the recv() call even if socket is in state SS_CONNECTING. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:00 -07:00
Ying Xue	cc79dd1ba9	tipc: change socket buffer overflow control to respect sk_rcvbuf As per feedback from the netdev community, we change the buffer overflow protection algorithm in receiving sockets so that it always respects the nominal upper limit set in sk_rcvbuf. Instead of scaling up from a small sk_rcvbuf value, which leads to violation of the configured sk_rcvbuf limit, we now calculate the weighted per-message limit by scaling down from a much bigger value, still in the same field, according to the importance priority of the received message. To allow for administrative tunability of the socket receive buffer size, we create a tipc_rmem sysctl variable to allow the user to configure an even bigger value via sysctl command. It is a size of three (min/default/max) to be consistent with things like tcp_rmem. By default, the value initialized in tipc_rmem[1] is equal to the receive socket size needed by a TIPC_CRITICAL_IMPORTANCE message. This value is also set as the default value of sk_rcvbuf. Originally-by: Jon Maloy <jon.maloy@ericsson.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> [Ying: added sysctl variation to Jon's original patch] Signed-off-by: Ying Xue <ying.xue@windriver.com> [PG: don't compile sysctl.c if not config'd; add Documentation] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:53:00 -07:00
Eliezer Tamir	dafcc4380d	net: add socket option for low latency polling adds a socket option for low latency polling. This allows overriding the global sysctl value with a per-socket one. Unexport sysctl_net_ll_poll since for now it's not needed in modules. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:48:14 -07:00
Eliezer Tamir	89bf1b5a68	net: remove NET_LL_RX_POLL config menue Remove NET_LL_RX_POLL from the config menu. Change default to y. Busy polling still needs to be enabled at run time. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:48:14 -07:00
Eliezer Tamir	9a3c71aa80	net: convert low latency sockets to sched_clock() Use sched_clock() instead of get_cycles(). We can use sched_clock() because we don't care much about accuracy. Remove the dependency on X86_TSC Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:48:14 -07:00
Eliezer Tamir	eb6db62282	net: change sysctl_net_ll_poll into an unsigned int There is no reason for sysctl_net_ll_poll to be an unsigned long. Change it into an unsigned int. Fix the proc handler. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-17 15:48:13 -07:00
Greg Kroah-Hartman	0e496b8e84	Merge 3.10-rc6 into char-misc-next We want the fixes in here.	2013-06-17 11:54:25 -07:00
John W. Linville	722a300bf8	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-06-17 13:44:01 -04:00
Daniel Borkmann	2e0c9e7911	net: sctp: sctp_association_init: put refs in reverse order In case we need to bail out for whatever reason during assoc init, we call sctp_endpoint_put() and then sock_put(), however, we've hold both refs in reverse, non-symmetric order, so first sctp_endpoint_hold() and then sock_hold(). Reverse this, so that in an error case we have sock_put() and then sctp_endpoint_put(). Actually shouldn't matter too much, since both cleanup paths do the right thing, but that way, it is more consistent with the rest of the code. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-14 15:38:36 -07:00
Daniel Borkmann	c164b83814	net: sctp: minor: remove variable in sctp_init_sock It's only used at this one time, so we could remove it as well. This is valid and also makes it more explicit/obvious that in case of error the sp->ep is NULL here, i.e. for the sctp_destroy_sock() check that was recently added. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-14 15:38:36 -07:00
Daniel Borkmann	405426f6ca	net: sctp: sctp_sf_do_prm_asoc: do SCTP_CMD_INIT_CHOOSE_TRANSPORT first While this currently cannot trigger any NULL pointer dereference in sctp_seq_dump_local_addrs(), better change the order of commands to prevent a future bug to happen. Although we first add SCTP_CMD_NEW_ASOC and then set the SCTP_CMD_INIT_CHOOSE_TRANSPORT, it is okay for now, since this primitive is only called by sctp_connect() or sctp_sendmsg() with sctp_assoc_add_peer() set first. However, lets do this precaution and first set the transport and then add it to the association hashlist to prevent in future something to possibly triggering this. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-14 15:38:36 -07:00
Daniel Borkmann	f9e42b8535	net: sctp: sideeffect: throw BUG if primary_path is NULL This clearly states a BUG somewhere in the SCTP code as e.g. fixed once in `f28156335` ("sctp: Use correct sideffect command in duplicate cookie handling"). If this ever happens, throw a trace in the sideeffect engine where assocs clearly must have a primary_path assigned. When in sctp_seq_dump_local_addrs() also throw a WARN and bail out since we do not need to panic for printing this one asterisk. Also, it will avoid the not so obvious case when primary != NULL test passes and at a later point in time triggering a NULL ptr dereference caused by primary. While at it, also fix up the white space. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-14 15:38:36 -07:00
David S. Miller	09ce069dff	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== A few miscellaneous improvements and cleanups before the GRE tunnel integration series. Intended for net-next/3.11. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-14 15:31:22 -07:00
Pravin B Shelar	93d8fd1514	openvswitch: Simplify interface ovs_flow_metadata_from_nlattrs() This is not functional change, this is just code cleanup. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2013-06-14 15:09:12 -07:00
Pravin B Shelar	b34df5e805	openvswitch: make skb->csum consistent with rest of networking stack. Following patch keeps skb->csum correct across ovs. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2013-06-14 15:09:12 -07:00
Andy Hill	af7841636b	openvswitch: Fix misspellings in comments and docs. Flagged with: https://github.com/lyda/misspell-check Run with: git ls-files \| misspellings -f - Signed-off-by: Andy Hill <hillad@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2013-06-14 15:09:11 -07:00
Lorand Jakab	34d94f2102	openvswitch: fix variable names in comment Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2013-06-14 15:09:10 -07:00
Pravin B Shelar	91b7514cdf	openvswitch: Unify vport error stats handling. Following patch changes vport->send return type so that vport layer can do error accounting. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2013-06-14 15:09:10 -07:00

... 7 8 9 10 11 ...

29386 Commits