linux

Commit Graph

Author	SHA1	Message	Date
Linus Lüssing	a4deee1ad4	batman-adv: Add dummy soft-interface rx mode handler We do not actually need to set any rx filters for the virtual batman soft interface. However a dummy handler enables a user to set static multicast listeners for instance. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-12 17:17:12 +02:00
Antonio Quartulli	e8cf234a4e	batman-adv: make batadv_tt_save_orig_buffer() generic This is a simple batadv_tt_save_orig_buffer() refactoring aiming to make it more generic and avoid useless casts. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 17:17:11 +02:00
Antonio Quartulli	298e6e685b	batman-adv: implement batadv_tt_entries Implement batadv_tt_entries() to get the number of entries fitting in a given amount of bytes. This computation is done several times in the code and therefore it is useful to have a helper function. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 17:17:11 +02:00
Simon Wunderlich	56a5ca8409	batman-adv: remove useless find_router look up This is not used anymore with the new fragmentation, and it might actually mess up the bonding code because find_router() assumes it is only called once per packet. Signed-off-by: Simon Wunderlich <simon@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-12 17:17:10 +02:00
Marek Lindner	411d6ed93a	batman-adv: consider network coding overhead when calculating required mtu The module prints a warning when the MTU on the hard interface is too small to transfer payload traffic without fragmentation. The required MTU is calculated based on the encapsulation header size. If network coding is compiled into the module its header size is taken into account as well. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-12 17:17:09 +02:00
Antonio Quartulli	0bf84c160a	batman-adv: create common header for ICMP packets the icmp and the icmp_rr packets share the same initial fields since they use the same code to be processed and forwarded. Extract the common fields and put them into a separate struct so that future ICMP packets can be easily added without bloating the packet definition. However, keep the seqno field outside of the newly created common header because future ICMP types may require a bigger sequence number space. This change breaks compatibility due to fields reordering in the ICMP headers. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 17:17:09 +02:00
Antonio Quartulli	293e93385e	batman-adv: use htons when possible When comparing a network ordered value with a constant, it is better to convert the constant at compile time by means of htons() instead of converting the value at runtime using ntohs(). This refactoring may slightly improve the code performance. Moreover substitute __constant_htons() with htons() since the latter increases readability and is smart enough to be as efficient as the former. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>	2013-10-12 11:59:23 +02:00
Martin Hundebøll	ee75ed8887	batman-adv: Fragment and send skbs larger than mtu Non-broadcast packets larger than MTU are fragmented and sent with an encapsulating header. Up to 16 fragments are supported, which are sent in reverse order on the wire to allow minimal memory copying when creating fragments. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-12 11:58:35 +02:00
Martin Hundebøll	610bfc6bc9	batman-adv: Receive fragmented packets and merge Fragments arriving at their destination are buffered for later merge. Merged packets are passed to the main receive function as if they had never been fragmented. Fragments are forwarded without merging if the MTU of the outgoing interface is smaller than the size of the merged packet. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-12 11:58:34 +02:00
Martin Hundebøll	f097e25dbe	batman-adv: Remove old fragmentation code Remove the existing fragmentation code before adding the new version and delete unicast.{h,c}. batadv_unicast_send_skb() is moved to send.c and renamed to batadv_send_skb_unicast(). fragmentation entry in sysfs (bat_priv->fragmentation) is kept for use in the new fragmentation code. BATADV_UNICAST_FRAG packet type is renamed to BATADV_FRAG for use in the new fragmentation code. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-12 11:58:33 +02:00
Antonio Quartulli	2c598663e8	batman-adv: use VLAN_ETH_HLEN instead of sizeof(struct vlan_eth_hdr) Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 11:58:33 +02:00
Antonio Quartulli	f7f8ed5695	batman-adv: h_vlan_encapsulated_proto access refactoring In case of a VLAN tagged frame the ethhdr pointer is moved forward by 4 bytes so that the offset of h_proto in struct ethhdr matches the real h_vlan_encapsulated_proto address in the skb. While this trickery is correct it makes the code harder to understand and may lead to bugs in case of re-use of ethhdr for other purposes. This patch introduces a proto variable to make things cleaner and easier to understand. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 11:58:30 +02:00
Antonio Quartulli	2102605947	batman-adv: don't use call_rcu if not needed batadv_tt_global_entry_free_ref uses call_rcu to schedule a function which will only free the global entry itself. For this reason call_rcu is useless and kfree_rcu can be used to simplify the code. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 09:51:28 +02:00
Antonio Quartulli	d7ee88d048	batman-adv: remove batadv_tt_global_add_orig declaration batadv_tt_global_add_orig is neither used nor implemented anymore, therefore it is possible to remove its declaration Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 09:51:28 +02:00
Antonio Quartulli	1e5d49fce3	batman-adv: make tt_global_add static and return bool batadv_tt_global_add is not used anymore outside of the TT code thanks to the TVLV implementation. It can therefore be declared as static Last user has been removed by 3de4e64df0f1326db7cc0ef25f5af8522850252d ("batman-adv: tvlv - convert roaming adv packet to use tvlv unicast packets") Moreover make it return bool since its result can be either 0 or 1. Reported-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-12 09:51:27 +02:00
Simon Wunderlich	97dbc03b47	batman-adv: only add recordroute information to icmp request/reply Adding host information for record route is only required for ICMP requests and replies, and should not be added to just any (future?) packet type. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-12 09:51:26 +02:00
Oussama Ghorbel	bf58175954	ipv6: Initialize ip6_tnl.hlen in gre tunnel even if no route is found The ip6_tnl.hlen (gre and ipv6 headers length) is independent from the outgoing interface, so it would be better to initialize it even when no route is found, otherwise its value will be zero. While I'm not sure if this could happen in real life, doing that will avoid calling the skb_push function with a zero in the ip6gre_header function. Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Oussama Ghorbel <ou.ghorbel@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 17:50:59 -04:00
Eric Dumazet	ccdbb6e96b	tcp: tcp_transmit_skb() optimizations 1) We need to take a timestamp only for skb that should be cloned. Other skbs are not in write queue and no rtt estimation is done on them. 2) the unlikely() hint is wrong for receivers (they send pure ACK) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: MF Nowlan <fitz@cs.yale.edu> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-By: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 17:48:18 -04:00
stephen hemminger	ff704050f2	netem: free skb's in tree on reset Netem can leak memory because packets get stored in red-black tree and it is not cleared on reset. Reported by: Сергеев Сергей <adron@yapic.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 17:30:38 -04:00
stephen hemminger	638a52b801	netem: update backlog after drop When packet is dropped from rb-tree netem the backlog statistic should also be updated. Reported-by: Сергеев Сергей <adron@yapic.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 17:29:29 -04:00
Eric Dumazet	455cc32bf1	l2tp: must disable bh before calling l2tp_xmit_skb() François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit `6af88da14e` ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task 
ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? 
sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? 
raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? 
sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? 
kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <f.cachereul@alphalink.fr> Tested-by: François Cachereul <f.cachereul@alphalink.fr> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 16:51:37 -04:00
David S. Miller	29b67c39dc	Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - update emails for A. Quartulli and M. Lindner in MAINTAINERS - switch to the next on-the-wire protocol version - introduce the T(ype) V(ersion) L(ength) V(alue) framework - adjust the existing components to make them use the new TVLV code - make the TT component use CRC32 instead of CRC16 - totally remove the VIS functionality (has been moved to userspace) - reorder packet types and flags - add static checks on packet format - remove __packed from batadv_ogm_packet	2013-10-11 16:28:55 -04:00
Christophe Gouault	7263a5187f	vti: get rid of nf mark rule in prerouting This patch fixes and improves the use of vti interfaces (while lightly changing the way of configuring them). Currently: - it is necessary to identify and mark inbound IPsec packets destined to each vti interface, via netfilter rules in the mangle table at prerouting hook. - the vti module cannot retrieve the right tunnel in input since commit b9959fd3: vti tunnels all have an i_key, but the tunnel lookup is done with flag TUNNEL_NO_KEY, so there is no chance to retrieve them. - the i_key is used by the outbound processing as a mark to look up the right SP and SA bundle. This patch uses the o_key to store the vti mark (instead of i_key) and enables: - to avoid the need for previously marking the inbound skbuffs via a netfilter rule. - to properly retrieve the right tunnel in input, only based on the IPsec packet outer addresses. - to properly perform an inbound policy check (using the tunnel o_key as a mark). - to properly perform an outbound SPD and SAD lookup (using the tunnel o_key as a mark). - to keep the current mark of the skbuff. The skbuff mark is neither used nor changed by the vti interface. Only the vti interface o_key is used. SAs have a wildcard mark. SPs have a mark equal to the vti interface o_key. The vti interface must be created as follows (i_key = 0, o_key = mark): ip link add vti1 mode vti local 1.1.1.1 remote 2.2.2.2 okey 1 The SPs attached to vti1 must be created as follows (mark = vti1 o_key): ip xfrm policy add dir out mark 1 tmpl src 1.1.1.1 dst 2.2.2.2 \ proto esp mode tunnel ip xfrm policy add dir in mark 1 tmpl src 2.2.2.2 dst 1.1.1.1 \ proto esp mode tunnel The SAs are created with the default wildcard mark. There is no distinction between global vs. vti SAs. Just their addresses will possibly link them to a vti interface: ip xfrm state add src 1.1.1.1 dst 2.2.2.2 proto esp spi 1000 mode tunnel \ enc "cbc(aes)" "azertyuiopqsdfgh" ip xfrm state add src 2.2.2.2 dst 1.1.1.1 proto esp spi 2000 mode tunnel \ enc "cbc(aes)" "sqbdhgqsdjqjsdfh" To avoid matching "global" (not vti) SPs in vti interfaces, global SPs should not use the default wildcard mark, but explicitly match mark 0. To avoid a double SPD lookup in input and output (in global and vti SPDs), the NOPOLICY and NOXFRM options should be set on the vti interfaces: echo 1 > /proc/sys/net/ipv4/conf/vti1/disable_policy echo 1 > /proc/sys/net/ipv4/conf/vti1/disable_xfrm The outgoing traffic is steered to vti1 by a route via the vti interface: ip route add 192.168.0.0/16 dev vti1 The incoming IPsec traffic is steered to vti1 because its outer addresses match the vti1 tunnel configuration. Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 14:52:17 -04:00
Vlad Yasevich	f144febd93	bridge: update mdb expiration timer upon reports. commit `9f00b2e7cf` bridge: only expire the mdb entry when query is received changed the mdb expiration timer to be armed only when QUERY is received. However, this causes issues in an environment where the multicast server socket comes and goes very fast while a client is trying to send traffic to it. The root cause is a race where a sequence of LEAVE followed by REPORT messages can race against QUERY messages generated in response to LEAVE. The QUERY ends up starting the expiration timer, and that timer can potentially expire after the new REPORT message has been received signaling the new join operation. This leads to a significant drop in multicast traffic and possible complete stall. The solution is to have REPORT messages update the expiration timer on entries that already exist. CC: Cong Wang <xiyou.wangcong@gmail.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-11 00:56:47 -04:00
Eric Dumazet	b44084c2c8	inet: rename ir_loc_port to ir_num In commit `634fb979e8` ("inet: includes a sock_common in request_sock") I forgot that the two ports in sock_common do not have the same byte order: skc_dport is __be16 (network order), but skc_num is __u16 (host order). So sparse complains because ir_loc_port (mapped into skc_num) is considered as __u16 while it should be __be16. Let's rename ir_loc_port to ireq->ir_num (analogy with inet->inet_num), and perform appropriate htons/ntohs conversions. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 14:37:35 -04:00
John W. Linville	e9517fecf2	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2013-10-10 13:38:50 -04:00
John W. Linville	c380a1fda5	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-10-10 13:19:58 -04:00
Steffen Klassert	ed1efb2aef	ipv6: Add support for IPsec virtual tunnel interfaces This patch adds IPv6 support for IPsec virtual tunnel interfaces (vti). IPsec virtual tunnel interfaces provide a routable interface for IPsec tunnel endpoints. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-10-10 12:00:01 +02:00
Eric Dumazet	ba537427d7	tcp: use ACCESS_ONCE() in tcp_update_pacing_rate() sk_pacing_rate is read by sch_fq packet scheduler at any time, with no synchronization, so make sure we update it in a sensible way. ACCESS_ONCE() is how we instruct compiler to not do stupid things, like using the memory location as a temporary variable. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Eric Dumazet	634fb979e8	inet: includes a sock_common in request_sock TCP listener refactoring, part 5 : We want to be able to insert request sockets (SYN_RECV) into main ehash table instead of the per listener hash table to allow RCU lookups and remove listener lock contention. This patch includes the needed struct sock_common in front of struct request_sock This means there is no more inet6_request_sock IPv6 specific structure. Following inet_request_sock fields were renamed as they became macros to reference fields from struct sock_common. Prefix ir_ was chosen to avoid name collisions. loc_port -> ir_loc_port loc_addr -> ir_loc_addr rmt_addr -> ir_rmt_addr rmt_port -> ir_rmt_port iif -> ir_iif Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Eric Dumazet	8a29111c7c	net: gro: allow to build full sized skb skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb, typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags. It's relatively easy to extend the skb using frag_list to allow more frags to be appended into the last sk_buff. This still builds very efficient skbs, and allows reaching 45 MSS per skb. (45 MSS GRO packet uses one skb plus a frag_list containing 2 additional sk_buff) High speed TCP flows benefit from this extension by lowering TCP stack cpu usage (less packets stored in receive queue, less ACK packets processed) Forwarding setups could be hurt, as such skbs will need to be linearized, although it's not a new problem, as GRO could already provide skbs with a frag_list. We could make the 65536 bytes threshold a tunable to mitigate this. (First time we need to linearize skb in skb_needs_linearize(), we could lower the tunable to ~16*1460 so that following skb_gro_receive() calls build smaller skbs) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
baker.zhang	4c60f1d67f	fib_trie: only calc for the un-first node This is an enhancement. For the first node in fib_trie, newpos is 0 and bit is 1. Only a leaf or a node with an unmatched key needs pos to be calculated. Signed-off-by: baker.zhang <baker.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-10 00:08:07 -04:00
Simon Wunderlich	18c68d5960	batman-adv: reorder batadv_iv_flags The vis flag is not needed anymore, and since we do a compat bump we can start with the first bit again Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:35 +02:00
Simon Wunderlich	9284a47e8b	batman-adv: remove packed from batadv_ogm_packet As we decreased the struct size from 26 to 24 byte, we can remove __packed as the compiler will not add any more padding. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:34 +02:00
Simon Wunderlich	a1f1ac5c4d	batman-adv: reorder packet types Reordering the packet type numbers allows us to handle unicast packets in a general way - even if we don't know the specific packet type, we can still forward it. There was already code handling this for a couple of unicast packets, and this is the more generalized version to do that. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:34 +02:00
Simon Wunderlich	80067c8320	batman-adv: add build check macros for packet member offset Since we removed the __packed from most of the packets, we should make sure that the offset generated by the compiler are correct for sent/received data. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:33 +02:00
Simon Wunderlich	9f4980e68b	batman-adv: remove vis functionality This is replaced by a userspace program, we don't need this functionality to bloat the kernel. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:32 +02:00
Antonio Quartulli	0035f97e65	batman-adv: move BATADV_TT_CLIENT_TEMP to higher bit Client flags from bit 0 to 7 are sent over the wire. BATADV_TT_CLIENT_TEMP is a local flag and is not supposed to be sent to the network. Therefore it has to occupy a higher bit. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-09 21:22:32 +02:00
Antonio Quartulli	ced72933a5	batman-adv: use CRC32C instead of CRC16 in TT code CRC32C has to be preferred to CRC16 because of its possible HW native support and because of the reduced collision probability. With this change the Translation Table component now uses CRC32C to compute the local and global table checksum. Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>	2013-10-09 21:22:31 +02:00
Marek Lindner	122edaa059	batman-adv: tvlv - convert roaming adv packet to use tvlv unicast packets Instead of generating roaming specific packets the TVLV unicast API is used to send roaming information. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:30 +02:00
Marek Lindner	335fbe0f5d	batman-adv: tvlv - convert tt query packet to use tvlv unicast packets Instead of generating TT specific packets the TVLV unicast API is used to send translation table data. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:30 +02:00
Marek Lindner	e1bf0c1409	batman-adv: tvlv - convert tt data sent within OGMs The translation table meta data (version number, crc checksum, etc) as well as the translation table diff propagated within OGMs now uses the newly introduced tvlv infrastructure. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:29 +02:00
Marek Lindner	3f4841ffb3	batman-adv: tvlv - add network coding container Create network coding container to announce network coding capabilities (if enabled). Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:28 +02:00
Marek Lindner	17cf0ea455	batman-adv: tvlv - add distributed arp table container Create DAT container to announce DAT capabilities (if enabled). Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:27 +02:00
Marek Lindner	414254e342	batman-adv: tvlv - gateway download/upload bandwidth container Prior to this patch batman-adv read the advertised uplink bandwidth from userspace and compressed this information into a single byte called "gateway class". Now the download & upload bandwidth information is sent as-is. No userspace change is necessary since the sysfs API always allowed to specify a bandwidth. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Spyros Gasteratos <morfeas3000@gmail.com> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:27 +02:00
Marek Lindner	ef26157747	batman-adv: tvlv - basic infrastructure The goal is to provide the infrastructure for sending, receiving and parsing information 'containers' while preserving backward compatibility. TVLV (based on the commonly known Type Length Value technique) was chosen as the format for those containers. Even if a node does not know the tvlv type of a certain container it can simply skip the current container and proceed with the next. Past experience has shown features evolve over time, so a 'version' field was added right from the start to allow differentiating between feature variants - hence the name: T(ype) V(ersion) L(ength) V(alue). This patch introduces the basic TVLV infrastructure: * register / unregister tvlv containers to be sent with each OGM (on primary interfaces only) * register / unregister callback handlers to be called upon finding the corresponding tvlv type in a tvlv buffer * unicast tvlv send / receive API calls Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Spyros Gasteratos <morfeas3000@gmail.com> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-09 21:22:26 +02:00
Antonio Quartulli	60cf7981b7	batman-adv: switch to a new packet compatibility version With this change batman-adv is breaking compatibility with older versions and it is moving to compat-version 15. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-10-09 21:22:25 +02:00
David S. Miller	f606385068	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== 1) We used the wrong netlink attribute to verify the length of the replay window on async events. Fix this by using the right netlink attribute. 2) Policy lookups can not match the output interface on forwarding. Add the needed information to the flow information. 3) We update the pmtu when we receive an ICMPV6_DEST_UNREACH message on IPsec with ipv6. This is wrong and leads to strange fragmented packets, only ICMPV6_PKT_TOOBIG messages should update the pmtu. Fix this by removing the ICMPV6_DEST_UNREACH check from the IPsec protocol error handlers. 4) The legacy IPsec anti replay mechanism supports anti replay windows up to 32 packets. If a user requests a bigger anti replay window, we use 32 packets but pretend that we use the requested window size. Fix from Fan Du. 5) If asynchronous events are enabled and replay_maxdiff is set to zero, we generate an async event for every received packet instead of checking whether a timeout occurred. Fix from Thomas Egerer. 6) Policies need a refcount when the state resolution timer is armed. Otherwise the timer can fire after the policy is deleted. 7) We might dereference a NULL pointer if the hold_queue is empty, add a check to avoid this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 13:41:45 -04:00
Eric Dumazet	c2bb06db59	net: fix build errors if ipv6 is disabled CONFIG_IPV6=n is still a valid choice ;) It appears we can remove dead code. Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 13:04:03 -04:00
Fabio Estevam	cb03db9d0e	net: secure_seq: Fix warning when CONFIG_IPV6 and CONFIG_INET are not selected net_secret() is only used when CONFIG_IPV6 or CONFIG_INET are selected. Building a defconfig with both of these symbols unselected (Using the ARM at91sam9rl_defconfig, for example) leads to the following build warning: $ make at91sam9rl_defconfig # # configuration written to .config # $ make net/core/secure_seq.o scripts/kconfig/conf --silentoldconfig Kconfig CHK include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h make[1]: `include/generated/mach-types.h' is up to date. CALL scripts/checksyscalls.sh CC net/core/secure_seq.o net/core/secure_seq.c:17:13: warning: 'net_secret_init' defined but not used [-Wunused-function] Fix this warning by protecting the definition of net_secret() with these symbols. Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 12:59:57 -04:00
Emmanuel Grumbach	f38dd58ccc	cfg80211: don't add p2p device while in RFKILL Since a P2P device doesn't have a netdev associated with it, we cannot prevent the user from starting it when in RFKILL. So refuse to even add it when in RFKILL. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-09 18:40:16 +02:00
Emmanuel Grumbach	a754055a12	mac80211: correctly close cancelled scans __ieee80211_scan_completed is called from a worker. This means that the following flow is possible. * driver calls ieee80211_scan_completed * mac80211 cancels the scan (that is already complete) * __ieee80211_scan_completed runs When scan_work will finally run, it will see that the scan hasn't been aborted and might even trigger another scan on another band. This leads to a situation where cfg80211's scan is not done and no further scan can be issued. Fix this by setting a new flag when a HW scan is being cancelled so that no other scan will be triggered. Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-09 18:40:07 +02:00
Steffen Klassert	212e560112	ipv6: Add a receive path hook for vti6 in xfrm6_mode_tunnel. Add a receive path hook for the IPsec virtual tunnel interface. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-10-09 13:16:36 +02:00
Eric Dumazet	f69b923a75	udp: fix a typo in __udp4_lib_mcast_demux_lookup At this point sk might contain garbage. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 01:51:57 -04:00
Eric Dumazet	efe4208f47	ipv6: make lookups simpler and faster TCP listener refactoring, part 4 : To speed up inet lookups, we moved IPv4 addresses from inet to struct sock_common. Now it is time to do the same for IPv6, because it permits us to have fast lookups for all kinds of sockets, including upcoming SYN_RECV. Getting IPv6 addresses in TCP lookups currently requires two extra cache lines, plus a dereference (and memory stall). inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6 This patch is way bigger than its IPv4 counterpart, because for IPv4, we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6, it's not doable easily. inet6_sk(sk)->daddr becomes sk->sk_v6_daddr inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr And timewait sockets also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr at the same offset. We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic macro. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 00:01:25 -04:00
Eric Dumazet	05dbc7b594	tcp/dccp: remove twchain TCP listener refactoring, part 3 : Our goal is to hash SYN_RECV sockets into main ehash for fast lookup, and parallel SYN processing. Current inet_ehash_bucket contains two chains, one for ESTABLISH (and friend states) sockets, another for TIME_WAIT sockets only. As the hash table is sized to get at most one socket per bucket, it makes little sense to have separate twchain, as it makes the lookup slightly more complicated, and doubles hash table memory usage. If we make sure all socket types have the lookup keys at the same offsets, we can use a generic and faster lookup. It turns out TIME_WAIT and ESTABLISHED sockets already have common lookup fields for IPv4. [ INET_TW_MATCH() is no longer needed ] I'll provide a follow-up to factorize IPv6 lookup as well, to remove INET6_TW_MATCH() This way, SYN_RECV pseudo sockets will be supported the same. A new sock_gen_put() helper is added, doing either a sock_put() or inet_twsk_put() [ and will support SYN_RECV later ]. Note this helper should only be called in real slow path, when rcu lookup found a socket that was moved to another identity (freed/reused immediately), but could eventually be used in other contexts, like sock_edemux() Before patch : dmesg | grep "TCP established" TCP established hash table entries: 524288 (order: 11, 8388608 bytes) After patch : TCP established hash table entries: 524288 (order: 10, 4194304 bytes) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 23:19:24 -04:00
David S. Miller	53af53ae83	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: include/linux/netdevice.h net/core/sock.c Trivial merge issues. Removal of "extern" for functions declaration in netdevice.h at the same time "const" was added to an argument. Two parallel line additions in net/core/sock.c Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 23:07:53 -04:00
Eric Dumazet	7eec4174ff	pkt_sched: fq: fix non TCP flows pacing Steinar reported FQ pacing was not working for UDP flows. It looks like the initial sk->sk_pacing_rate value of 0 was a wrong choice. We should init it to ~0U (unlimited) Then, TCA_FQ_FLOW_DEFAULT_RATE should be removed because it makes no real sense. The default rate is really unlimited, and we need to avoid a zero divide. Reported-by: Steinar H. Gunderson <sesse@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 21:54:01 -04:00
Marc Kleine-Budde	c33a39c575	net: vlan: fix nlmsg size calculation in vlan_get_size() This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:32:41 -04:00
Eric Dumazet	ede869cd0f	pkt_sched: fq: fix typo for initial_quantum TCA_FQ_INITIAL_QUANTUM should set q->initial_quantum Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:32:41 -04:00
Oussama Ghorbel	0e719e3a53	ipv6: Fix the upper MTU limit in GRE tunnel Unlike ipv4, the struct member hlen holds the length of the GRE and ipv6 headers. This length is also counted in dev->hard_header_len. Perhaps it would be cleaner to modify hlen to count only the GRE header without the ipv6 header, as the variable name suggests, but the simple way to fix this without regression risk is simply to modify the calculation of the limit in the ip6gre_tunnel_change_mtu function. Verified in kernel version v3.11. Signed-off-by: Oussama Ghorbel <ou.ghorbel@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:32:40 -04:00
Gao feng	ff0bfad6a2	cgroup: cls: remove unnecessary task_cls_classid We can get classid through cgroup_subsys_state, which is more direct and efficient. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:27:34 -04:00
Gao feng	e1af5e445e	cgroup: netprio: remove unnecessary task_netprioidx Since the tasks have been migrated to the cgroup, there is no need to call task_netprioidx to get task's cgroup id. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:27:34 -04:00
Shawn Bohrer	fbf8866d65	net: ipv4 only populate IP_PKTINFO when needed Since the removal of the routing cache, computing fib_compute_spec_dst() does a fib_table lookup for each UDP multicast packet received. This has introduced a performance regression for some UDP workloads. This change skips populating the packet info for sockets that do not have IP_PKTINFO set. Benchmark results from a netperf UDP_RR test: Before 89789.68 transactions/s After 90587.62 transactions/s Benchmark results from a fio 1 byte UDP multicast pingpong test (Multicast one way unicast response): Before 12.63us RTT After 12.48us RTT Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:27:33 -04:00
Shawn Bohrer	421b3885bf	udp: ipv4: Add udp early demux The removal of the routing cache introduced a performance regression for some UDP workloads since a dst lookup must be done for each packet. This change caches the dst per socket in a similar manner to what we do for TCP by implementing early_demux. For UDP multicast we can only cache the dst if there is only one receiving socket on the host. Since caching only works when there is one receiving socket we do the multicast socket lookup using RCU. For UDP unicast we only demux sockets with an exact match in order to not break forwarding setups. Additionally since the hash chains may be long we only check the first socket to see if it is a match and not waste extra time searching the whole chain when we might not find an exact match. Benchmark results from a netperf UDP_RR test: Before 87961.22 transactions/s After 89789.68 transactions/s Benchmark results from a fio 1 byte UDP multicast pingpong test (Multicast one way unicast response): Before 12.97us RTT After 12.63us RTT Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:27:33 -04:00
Shawn Bohrer	005ec97433	udp: Only allow busy read/poll on connected sockets UDP sockets can receive packets from multiple endpoints and thus may be received on multiple receive queues. Since packets can arrive on multiple receive queues we should not mark the napi_id for all packets. This makes busy read/poll only work for connected UDP sockets. This additionally enables busy read/poll for UDP multicast packets as long as the socket is connected by moving the check into __udp_queue_rcv_skb(). Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:27:33 -04:00
Eric Dumazet	2c8c8e6f9d	net_sched: increment drop counters in qdisc_tree_decrease_qlen() qdisc_tree_decrease_qlen() is called when some packets are dropped on a qdisc, and we want to notify parents of qlen changes. We can also increment the parent qdiscs' qstats drop counters. This permits more accurate drop counters up to root qdisc. For example a graft operation typically resets a qdisc (drops all packets) and calls qdisc_tree_decrease_qlen() Note that callers are responsible for their drop counters. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 16:27:33 -04:00
David S. Miller	8d8a51e26a	l2tp: Fix build warning with ipv6 disabled. net/l2tp/l2tp_core.c: In function ‘l2tp_verify_udp_checksum’: net/l2tp/l2tp_core.c:499:22: warning: unused variable ‘tunnel’ [-Wunused-variable] Create a helper "l2tp_tunnel()" to facilitate this, and as a side effect get rid of a bunch of unnecessary void pointer casts. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 15:44:26 -04:00
Alan Ott	ab2d95df9c	6lowpan: Sync default hardware address of lowpan links to their wpan When a lowpan link to a wpan device is created, set the hardware address of the lowpan link to that of the wpan device. Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 15:28:37 -04:00
Alan Ott	7adac1ec81	6lowpan: Only make 6lowpan links to IEEE802154 devices Refuse to create 6lowpan links if the actual hardware interface is of any type other than ARPHRD_IEEE802154. Signed-off-by: Alan Ott <alan@signal11.us> Suggested-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-08 15:28:37 -04:00
Steffen Klassert	2bb53e2557	xfrm: check for a vaild skb in xfrm_policy_queue_process We might dereference a NULL pointer if the hold_queue is empty, so add a check to avoid this. Bug was introduced with git commit `a0073fe18` ("xfrm: Add a state resolution packet queue") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-10-08 10:49:51 +02:00
Steffen Klassert	e7d8f6cb2f	xfrm: Add refcount handling to queued policies We need to ensure that policies can't go away as long as the hold timer is armed, so take a refcount when we arm the timer and drop one if we delete it. Bug was introduced with git commit `a0073fe18` ("xfrm: Add a state resolution packet queue") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-10-08 10:49:45 +02:00
Eric W. Biederman	88ba09df23	net: Update the sysctl permissions handler to test effective uid/gid On Tue, 20 Aug 2013 11:40:04 -0500 Eric Sandeen <sandeen@redhat.com> wrote: > This was brought up in a Red Hat bug (which may be marked private, I'm sorry): > > Bug 987055 - open O_WRONLY succeeds on some root owned files in /proc for process running with unprivileged EUID > > "On RHEL7 some of the files in /proc can be opened for writing by an unprivileged EUID." > > The flaw existed upstream as well last I checked. > > This commit in kernel v3.8 caused the regression: > > commit `cff109768b` > Author: Eric W. Biederman <ebiederm@xmission.com> > Date: Fri Nov 16 03:03:01 2012 +0000 > > net: Update the per network namespace sysctls to be available to the network namespace owner > > - Allow anyone with CAP_NET_ADMIN rights in the user namespace of the > network namespace to change sysctls. > - Allow anyone with the uid of the user namespace root the same > permissions over the network namespace sysctls as the global root. > - Allow anyone with gid of the user namespace root group the same > permissions over the network namespace sysctl as the global root group. > > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > > because it changed /sys/net's special permission handler to test current_uid, not > current_euid; same for current_gid/current_egid. > > So in this case, root cannot drop privs via set[ug]id, and retains all privs > in this codepath. Modify the code to use current_euid(), and in_egroup_p, as is done in fs/proc/proc_sysctl.c:test_perm() Cc: stable@vger.kernel.org Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 15:57:56 -04:00
David S. Miller	7009deab19	Merge git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next Conflicts: drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h drivers/net/wireless/rtlwifi/rtl8188ee/phy.h drivers/net/wireless/rtlwifi/rtl8192ce/phy.h drivers/net/wireless/rtlwifi/rtl8192de/phy.h drivers/net/wireless/rtlwifi/rtl8723ae/phy.h Just some minor conflicts between the wireless-next changes and Joe Perches's "extern" removal from function prototypes in header files. John W. Linville says: ==================== Regarding the Bluetooth bits, Gustavo says: "The big work here is from Marcel and Johan. They did a lot of work in the L2CAP, HCI and MGMT layers. The most important ones are the addition of a new MGMT command to enable/disable LE advertisement and the introduction of the HCI user channel to allow applications to get directly and exclusive access to Bluetooth devices." As to the ath10k bits, Kalle says: "Bartosz dropped support for qca98xx hw1.0 hardware from ath10k, it's just too much to support it. Michal added support for the new firmware interface. Marek fixed WEP in AP and IBSS mode. Rest of the changes are minor fixes or cleanups." And also: "Major changes are: * throughput improvements including aligning the RX frames correctly and optimising HTT layer (Michal) * remove qca98xx hw1.0 support (Bartosz) * add support for firmware version 999.999.0.636 (Michal) * firmware htt statistics support (Kalle) * fix WEP in AP and IBSS mode (Marek) * fix a mutex unlock balance in debugfs file (Shafi) And of course there's a lot of smaller fixes and cleanup." For the wl12xx bits, Luca says: "Here are some patches intended for 3.13. Eliad is upstreaming a bunch of patches that have been pending in the internal tree. Mostly bugfixes and other small improvements." Along with that... Arend and friends bring us a batch of brcmfmac updates, Larry Finger offers some rtlwifi refactoring, and Sujith sends the usual batch of ath9k updates. 
As usual, there are a number of other small updates from a variety of players as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 15:40:44 -04:00
Jiri Benc	0a7e226090	ipv4: fix ineffective source address selection When sending out multicast messages, the source address in inet->mc_addr is ignored and rewritten by an autoselected one. This is caused by a typo in commit `813b3b5db8` ("ipv4: Use caller's on-stack flowi as-is in output route lookups"). Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 15:26:46 -04:00
Eric W. Biederman	5cde282938	net: Separate the close_list and the unreg_list v2 Separate the unreg_list and the close_list in dev_close_many preventing dev_close_many from permuting the unreg_list. The permutations of the unreg_list have resulted in cases where the loopback device is accessed after it has been freed in code such as dst_ifdown, resulting in subtle memory corruption. This is the second bug from sharing the storage between the close_list and the unreg_list. The issues that crop up with sharing are apparently too subtle to show up in normal testing or usage, so let's forget about being clever and use two separate lists. v2: Make all callers pass in a close_list to dev_close_many Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 15:23:14 -04:00
Alexei Starovoitov	d45ed4a4e3	net: fix unsafe set_memory_rw from softirq on x86 system with net.core.bpf_jit_enable = 1 sudo tcpdump -i eth1 'tcp port 22' causes the warning: [ 56.766097] Possible unsafe locking scenario: [ 56.766097] [ 56.780146] CPU0 [ 56.786807] ---- [ 56.793188] lock(&(&vb->lock)->rlock); [ 56.799593] <Interrupt> [ 56.805889] lock(&(&vb->lock)->rlock); [ 56.812266] [ 56.812266] * DEADLOCK * [ 56.812266] [ 56.830670] 1 lock held by ksoftirqd/1/13: [ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380 [ 56.849757] [ 56.849757] stack backtrace: [ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45 [ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 [ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007 [ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001 [ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001 [ 56.923006] Call Trace: [ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76 [ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208 [ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50 [ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150 [ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0 [ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50 [ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50 [ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90 [ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0 [ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50 [ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380 [ 57.016477] [<ffffffff8118f44c>] ? 
vm_unmap_aliases+0x8c/0x380 [ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460 [ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10 [ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0 [ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0 [ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40 [ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40 [ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30 [ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0 [ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0 [ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70 cannot reuse jited filter memory, since it's readonly, so use original bpf insns memory to hold work_struct defer kfree of sk_filter until jit completed freeing tested on x86_64 and i386 Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 15:16:45 -04:00
Oussama Ghorbel	582442d6d5	ipv6: Allow the MTU of ipip6 tunnel to be set below 1280 The (inner) MTU of a ipip6 (IPv4-in-IPv6) tunnel cannot be set below 1280, which is the minimum MTU in IPv6. However, there should be no IPv6 on the tunnel interface at all, so the IPv6 rules should not apply. More info at https://bugzilla.kernel.org/show_bug.cgi?id=15530 This patch allows to check the minimum MTU for ipv6 tunnel according to these rules: -In case the tunnel is configured with ipip6 mode the minimum MTU is 68. -In case the tunnel is configured with ip6ip6 or any mode the minimum MTU is 1280. Signed-off-by: Oussama Ghorbel <ou.ghorbel@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 12:32:26 -04:00
Michael S. Tsirkin	3573540caf	netif_set_xps_queue: make cpu mask const virtio wants to pass in cpumask_of(cpu), make parameter const to avoid build warnings. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-07 12:29:26 -04:00
Eric Dumazet	5e8a402f83	tcp: do not forget FIN in tcp_shifted_skb() Yuchung found following problem : There are bugs in the SACK processing code, merging part in tcp_shift_skb_data(), that incorrectly resets or ignores the sacked skbs FIN flag. When a receiver first SACK the FIN sequence, and later throw away ofo queue (e.g., sack-reneging), the sender will stop retransmitting the FIN flag, and hangs forever. Following packetdrill test can be used to reproduce the bug. $ cat sack-merge-bug.pkt `sysctl -q net.ipv4.tcp_fack=0` // Establish a connection and send 10 MSS. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +.000 bind(3, ..., ...) = 0 +.000 listen(3, 1) = 0 +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.001 < . 1:1(0) ack 1 win 1024 +.000 accept(3, ..., ...) = 4 +.100 write(4, ..., 12000) = 12000 +.000 shutdown(4, SHUT_WR) = 0 +.000 > . 1:10001(10000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 +.000 > FP. 10001:12001(2000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop> +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop> // SACK reneg +.050 < . 1:1(0) ack 12001 win 257 +0 %{ print "unacked: ",tcpi_unacked }% +5 %{ print "" }% First, a typo inverted left/right of one OR operation, then code forgot to advance end_seq if the merged skb carried FIN. Bug was added in 2.6.29 by commit `832d11c5cd` ("tcp: Try to restore large SKBs while SACK processing") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-04 14:16:36 -04:00
David S. Miller	d639feaaf3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter updates for your net-next tree, mostly ipset improvements and enhancements features, they are: * Don't call ip_nest_end needlessly in the error path from me, suggested by Pablo Neira Ayuso, from Jozsef Kadlecsik. * Fixed sparse warnings about shadowed variable and missing rcu annotation and fix of "may be used uninitialized" warnings, also from Jozsef. * Renamed simple macro names to avoid namespace issues, reported by David Laight, again from Jozsef. * Use fix sized type for timeout in the extension part, and cosmetic ordering of matches and targets separately in xt_set.c, from Jozsef. * Support package fragments for IPv4 protos without ports from Anders K. Pedersen. For example this allows a hash:ip,port ipset containing the entry 192.168.0.1,gre:0 to match all package fragments for PPTP VPN tunnels to/from the host. Without this patch only the first package fragment (with fragment offset 0) was matched. * Introduced a new operation to get both setname and family, from Jozsef. ip[6]tables set match and SET target need to know the family of the set in order to reject adding rules which refer to a set with a non-matching family. Currently such rules are silently accepted and then ignored instead of generating an error message to the user. * Reworked extensions support in ipset types from Jozsef. The approach of defining structures with all variations is not manageable as the number of extensions grows. Therefore a blob for the extensions is introduced, somewhat similar to conntrack. The support of extensions which need a per data destroy function is added as well. * When an element timed out in a list:set type of set, the garbage collector skipped the checking of the next element. So the purging was delayed to the next run of the gc, fixed by Jozsef. 
* A small Kconfig fix: NETFILTER_NETLINK cannot be selected and ipset requires it. * hash:net,net type from Oliver Smith. The type provides the ability to store pairs of subnets in a set. * Comment for ipset entries from Oliver Smith. This makes it possible to annotate entries in a set with comments, for example: ipset n foo hash:net,net comment ipset a foo 10.0.0.0/21,192.168.1.0/24 comment "office nets A and B" * Fix of hash types resizing with comment extension from Jozsef. * Fix of new extensions for list:set type when an element is added into a slot from where another element was pushed away from Jozsef. * Introduction of a common function for the listing of the element extensions from Jozsef. * Net namespace support for ipset from Vitaly Lavrov. * hash:net,port,net type from Oliver Smith, which makes it possible to store the triples of two subnets and a protocol, port pair in a set. * Get xt_TCPMSS working with net namespace, by Gao feng. * Use the proper net namespace to allocate skbs, also by Gao feng. * A couple of cleanups for the conntrack SIP helper, by Holger Eitzenberger. * Extend cttimeout to allow setting default conntrack timeouts via nfnetlink, so we can get rid of all our sysctl/proc interfaces in the future for timeout tuning, from me. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-04 13:26:38 -04:00
Eric Dumazet	96f817fede	tcp: shrink tcp6_timewait_sock by one cache line While working on tcp listener refactoring, I found that it would really make things easier if sock_common could include the IPv6 addresses needed in the lookups, instead of doing very complex games to get their values (depending on sock being SYN_RECV, ESTABLISHED, TIME_WAIT) For this to happen, I need to be sure that tcp6_timewait_sock and tcp_timewait_sock consume same number of cache lines. This is possible if we only use 32bits for tw_ttd, as we remove one 32bit hole in inet_timewait_sock inet_tw_time_stamp() is defined and used, even if its current implementation looks like tcp_time_stamp : We might need finer resolution for tcp_time_stamp in the future. Before patch : sizeof(struct tcp6_timewait_sock) = 0xc8 After patch : sizeof(struct tcp6_timewait_sock) = 0xc0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-03 17:43:39 -04:00
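The before/after sizes quoted in the commit above can be checked with a little arithmetic. The following is a minimal sketch, assuming 64-byte cache lines and a structure that starts on a cache-line boundary (the commit message states neither); `CACHE_LINE_BYTES` and `cache_lines` are illustrative names, not kernel symbols.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86-64 cache line size


def cache_lines(size_bytes, line_bytes=CACHE_LINE_BYTES):
    """Return how many cache lines an aligned object of size_bytes spans."""
    # Ceiling division without floating point.
    return -(-size_bytes // line_bytes)


before = 0xC8  # sizeof(struct tcp6_timewait_sock) before the patch
after = 0xC0   # sizeof(struct tcp6_timewait_sock) after the patch

print(f"before: {before} bytes, {cache_lines(before)} cache lines")
print(f"after:  {after} bytes, {cache_lines(after)} cache lines")
```

Under that assumption, the 8 bytes saved by narrowing `tw_ttd` and closing the 32-bit hole are what take the structure from four cache lines down to three.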
David S. Miller	7df9b48588	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Here is another batch of fixes intended for the 3.12 stream... For the mac80211 bits, Johannes says: "This time I have two fixes for IBSS (including one for wext, hah), a fix for extended rates IEs, an active monitor checking fix and a sysfs registration race fix." On top of those... Amitkumar Karwar brings an mwifiex fix for an interrupt loss issue w/ SDIO devices. The problem was due to a command timeout issue introduced by an earlier patch. Felix Fietkau fixes a stall in the ath9k driver. This patch fixes the regression introduced in the commit "ath9k: use software queues for un-aggregated data packets". Stanislaw Gruszka reverts an rt2x00 patch that was found to cause connection problems with some devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-03 16:28:34 -04:00
John W. Linville	0d4f55bc37	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2013-10-03 16:19:07 -04:00
Dan Carpenter	1661bf364a	net: heap overflow in __audit_sockaddr() We need to cap ->msg_namelen or it leads to a buffer overflow when we do the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <juri.aedla@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-03 16:05:14 -04:00
John W. Linville	1eea72f03a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-10-03 16:00:03 -04:00
David S. Miller	196896d4bb	Included change: - fix multi soft-interfaces setups with Network Coding enabled by registering the CODED packet type once only (instead of once per soft-if) Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-03 15:57:36 -04:00
Peter Senna Tschudin	34a6eda163	net: ipv4: Change variable type to bool The variable fully_acked is only assigned the values true and false. Change its type to bool. The simplified semantic patch that finds this problem is as follows (http://coccinelle.lip6.fr/): @exists@ type T; identifier b; @@ - T + bool b = ...; ... when any b = \\(true\\|false\\) Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-03 15:40:34 -04:00
Nikolay Aleksandrov	357afe9c46	flow_dissector: factor out the ports extraction in skb_flow_get_ports Factor out the code that extracts the ports from skb_flow_dissect and add a new function skb_flow_get_ports which can be re-used. Suggested-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-03 15:36:37 -04:00
Eric Dumazet	5080546682	inet: consolidate INET_TW_MATCH TCP listener refactoring, part 2 : We can use a generic lookup, sockets being in whatever state, if we are sure all relevant fields are at the same place in all socket types (ESTABLISH, TIME_WAIT, SYN_RECV) This patch removes these macros : inet_addrpair, inet_addrpair, tw_addrpair, tw_portpair And adds : sk_portpair, sk_addrpair, sk_daddr, sk_rcv_saddr Then, INET_TW_MATCH() is really the same as INET_MATCH() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-03 15:33:35 -04:00
Marcel Holtmann	4f3e219d95	Bluetooth: Only one command per L2CAP LE signalling is supported The Bluetooth specification makes it clear that only one command should be present in the L2CAP LE signalling packet. So tighten the checks here and restrict it to exactly one command. This is different from L2CAP BR/EDR signalling where multiple commands can be part of the same packet. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 16:09:59 +03:00
Marcel Holtmann	92381f5cd7	Bluetooth: Check minimum length of SMP packets When SMP packets are received, make sure they contain at least 1 byte header for the opcode. If not, drop the packet and disconnect the link. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 13:06:41 +03:00
Marcel Holtmann	b99707d7ee	Bluetooth: Drop packets on ATT fixed channel on BR/EDR The ATT fixed channel is only valid when using LE connections. On BR/EDR it is required to go through L2CAP connection oriented channel for ATT. Drop ATT packets when they are received on a BR/EDR connection. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 13:05:36 +03:00
Marcel Holtmann	ae4fd2d374	Bluetooth: L2CAP connectionless channels are only valid for BR/EDR When receiving connectionless packets on a LE connection, just drop the packet. There is no concept of connectionless channels for LE. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 10:13:30 +03:00
Marcel Holtmann	7b9899dbcf	Bluetooth: SMP packets are only valid on LE connections When receiving SMP packets on a BR/EDR connection, then just drop the packet and do not try to process it. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 10:09:12 +03:00
Marcel Holtmann	94b6a09b67	Bluetooth: Don't copy L2CAP LE signalling to raw sockets The L2CAP raw sockets are only used for BR/EDR signalling. Packets on LE links should not be forwarded there. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 10:07:58 +03:00
Marcel Holtmann	a28776296c	Bluetooth: Fix switch statement order for L2CAP fixed channels The switch statement for the various L2CAP fixed channel handlers is not really ordered. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 10:07:29 +03:00
Marcel Holtmann	6203fc9834	Bluetooth: Allow changing device class when BR/EDR is disabled Changing the device class when BR/EDR is disabled has no visible effect for remote devices. However to simplify the logic allow it as long as the controller supports BR/EDR operations. If it is not allowed, then the overall logic becomes rather complicated since the class of device values would need clearing or restoring when BR/EDR setting changes. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 10:05:27 +03:00
Marcel Holtmann	cf99ba1359	Bluetooth: Restrict loading of long term keys to LE capable controllers Loading long term keys into a BR/EDR only controller makes no sense. The kernel would never use any of these keys. So instead of allowing userspace to waste memory, reject such operation with a not supported error message. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 09:33:02 +03:00
Marcel Holtmann	9060d5cf52	Bluetooth: Restrict loading of link keys to BR/EDR capable controllers Loading link keys into a LE only controller makes no sense. The kernel would never use any of these keys. So instead of allowing userspace to waste memory, reject such operation with a not supported error message. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 09:32:57 +03:00
Marcel Holtmann	62af444319	Bluetooth: Allow setting static address even if LE is disabled Setting the static address does not depend on LE being enabled. It only depends on a controller with LE support. When depending on LE enabled this command becomes really complicated since in case LE gets disabled, it would be required to clear the static address and also its random address representation inside the controller. With future support for private addresses such complex setup should be avoided. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 09:29:38 +03:00
Marcel Holtmann	cdba5281b2	Bluetooth: Restrict SSP setting changes to BR/EDR enabled controllers Allow changing the SSP setting only when BR/EDR is supported and enabled. Just checking whether the hardware supports SSP is not enough, since BR/EDR might be disabled. If BR/EDR is disabled but SSP is supported by the controller, the not supported error message is now returned. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-03 09:20:37 +03:00
François Cachereul	e18503f41f	l2tp: fix kernel panic when using IPv4-mapped IPv6 addresses IPv4 mapped addresses cause kernel panic. The patch juste check whether the IPv6 address is an IPv4 mapped address. If so, use IPv4 API instead of IPv6. [ 940.026915] general protection fault: 0000 [#1] [ 940.026915] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppox ppp_generic slhc loop psmouse [ 940.026915] CPU: 0 PID: 3184 Comm: memcheck-amd64- Not tainted 3.11.0+ #1 [ 940.026915] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 940.026915] task: ffff880007130e20 ti: ffff88000737e000 task.ti: ffff88000737e000 [ 940.026915] RIP: 0010:[<ffffffff81333780>] [<ffffffff81333780>] ip6_xmit+0x276/0x326 [ 940.026915] RSP: 0018:ffff88000737fd28 EFLAGS: 00010286 [ 940.026915] RAX: c748521a75ceff48 RBX: ffff880000c30800 RCX: 0000000000000000 [ 940.026915] RDX: ffff88000075cc4e RSI: 0000000000000028 RDI: ffff8800060e5a40 [ 940.026915] RBP: ffff8800060e5a40 R08: 0000000000000000 R09: ffff88000075cc90 [ 940.026915] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88000737fda0 [ 940.026915] R13: 0000000000000000 R14: 0000000000002000 R15: ffff880005d3b580 [ 940.026915] FS: 00007f163dc5e800(0000) GS:ffffffff81623000(0000) knlGS:0000000000000000 [ 940.026915] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 940.026915] CR2: 00000004032dc940 CR3: 0000000005c25000 CR4: 00000000000006f0 [ 940.026915] Stack: [ 940.026915] ffff88000075cc4e ffffffff81694e90 ffff880000c30b38 0000000000000020 [ 940.026915] 11000000523c4bac ffff88000737fdb4 0000000000000000 ffff880000c30800 [ 940.026915] ffff880005d3b580 ffff880000c30b38 ffff8800060e5a40 0000000000000020 [ 940.026915] Call Trace: [ 940.026915] [<ffffffff81356cc3>] ? inet6_csk_xmit+0xa4/0xc4 [ 940.026915] [<ffffffffa0038535>] ? l2tp_xmit_skb+0x503/0x55a [l2tp_core] [ 940.026915] [<ffffffff812b8d3b>] ? pskb_expand_head+0x161/0x214 [ 940.026915] [<ffffffffa003e91d>] ? 
pppol2tp_xmit+0xf2/0x143 [l2tp_ppp] [ 940.026915] [<ffffffffa00292e0>] ? ppp_channel_push+0x36/0x8b [ppp_generic] [ 940.026915] [<ffffffffa00293fe>] ? ppp_write+0xaf/0xc5 [ppp_generic] [ 940.026915] [<ffffffff8110ead4>] ? vfs_write+0xa2/0x106 [ 940.026915] [<ffffffff8110edd6>] ? SyS_write+0x56/0x8a [ 940.026915] [<ffffffff81378ac0>] ? system_call_fastpath+0x16/0x1b [ 940.026915] Code: 00 49 8b 8f d8 00 00 00 66 83 7c 11 02 00 74 60 49 8b 47 58 48 83 e0 fe 48 8b 80 18 01 00 00 48 85 c0 74 13 48 8b 80 78 02 00 00 <48> ff 40 28 41 8b 57 68 48 01 50 30 48 8b 54 24 08 49 c7 c1 51 [ 940.026915] RIP [<ffffffff81333780>] ip6_xmit+0x276/0x326 [ 940.026915] RSP <ffff88000737fd28> [ 940.057945] ---[ end trace be8aba9a61c8b7f3 ]--- [ 940.058583] Kernel panic - not syncing: Fatal exception in interrupt Signed-off-by: François CACHEREUL <f.cachereul@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 17:09:22 -04:00
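The l2tp fix above hinges on recognising an IPv4-mapped IPv6 address (`::ffff:a.b.c.d`) and then using the IPv4 path. As a rough user-space illustration of that test only (not the kernel code), the sketch below checks for the `::ffff:0:0/96` prefix with Python's standard `ipaddress` module; the helper names `is_v4_mapped` and `embedded_ipv4` are made up for this example.

```python
import ipaddress

# ::ffff:0:0/96 prefix: 80 zero bits followed by 16 one bits.
V4MAPPED_PREFIX = bytes(10) + b"\xff\xff"


def is_v4_mapped(addr):
    """Return True if the IPv6 address text is an IPv4-mapped address."""
    packed = ipaddress.IPv6Address(addr).packed  # 16 bytes, network order
    return packed[:12] == V4MAPPED_PREFIX


def embedded_ipv4(addr):
    """Return the IPv4 address carried in the low 32 bits, as dotted quad."""
    packed = ipaddress.IPv6Address(addr).packed
    return str(ipaddress.IPv4Address(packed[12:]))


for text in ("::ffff:192.0.2.1", "2001:db8::1"):
    if is_v4_mapped(text):
        print(text, "-> IPv4 path, address", embedded_ipv4(text))
    else:
        print(text, "-> IPv6 path")
```

The standard library exposes the same check as `IPv6Address.ipv4_mapped`, which returns the embedded IPv4 address or `None`; the manual prefix comparison is spelled out here only to show what is being tested.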
Eric Dumazet	80ad1d61e7	net: do not call sock_put() on TIMEWAIT sockets commit `3ab5aee7fe` ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 17:05:54 -04:00
Joe Perches	d458cdf712	net:drivers/net: Miscellaneous conversions to ETH_ALEN Convert the memset/memcpy uses of 6 to ETH_ALEN where appropriate. Also convert some struct definitions and u8 array declarations of [6] to ETH_ALEN. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 17:04:45 -04:00
Eric Dumazet	6ae705323b	tcp: sndbuf autotuning improvements tcp_fixup_sndbuf() is underestimating initial send buffer requirements. It was not noticed because big GSO packets were escaping the limitation, but with smaller TSO packets (or TSO/GSO/SG off), application hits sk_sndbuf before having a chance to fill enough packets in socket write queue. - initial cwnd can be bigger than 10 for specific routes - SKB_TRUESIZE() is a bit under real needs in some cases, because of power-of-two rounding in kmalloc() - Fast Recovery (RFC 5681 3.2) : Cubic needs 70% factor - Extra cushion (application might react slowly to POLLOUT) tcp_v4_conn_req_fastopen() needs to call tcp_init_metrics() before calling tcp_init_buffer_space() Then we realize tcp_new_space() should call tcp_fixup_sndbuf() instead of duplicating this stuff. Rename tcp_fixup_sndbuf() to tcp_sndbuf_expand() to be more descriptive. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 16:45:17 -04:00
baker.zhang	bbe34cf8a1	fib_trie: avoid a redundant bit judgement in inflate Because 'node' is the i'st child of 'oldnode', thus, here 'i' equals tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits) we just get 1 more bit, and need not care the detail value of this bits. I apologize for the mistake. I generated the patch on a branch version, and did not notice the put_child has been changed. I have redone the test on HEAD version with my patch. two cases are used. case 1. inflate a node which has a leaf child node. case 2: inflate a node which has a an child node with skipped bits test env: ip link set eth0 up ip a add dev eth0 192.168.11.1/32 here, we just focus on route table(MAIN), so I use a "192.168.11.1/32" address to simplify the test case. call trace: + fib_insert_node + + trie_rebalance + + + resize + + + + inflate Test case 1: inflate a node which has a leaf child node. =========================================================== step 1. prepare a fib trie ------------------------------------------ ip r a 192.168.0.0/24 via 192.168.11.1 ip r a 192.168.1.0/24 via 192.168.11.1 we get a fib trie. root@baker:~# cat /proc/net/fib_trie Main: +-- 192.168.0.0/23 1 0 0 \|-- 192.168.0.0 /24 universe UNICAST \|-- 192.168.1.0 /24 universe UNICAST Local: ..... step 2. Add the third route ------------------------------------------ root@baker:~# ip r a 192.168.2.0/24 via 192.168.11.1 A fib_trie leaf will be inserted in fib_insert_node before trie_rebalance. For function 'inflate': 'inflate' is called with following trie. +-- 192.168.0.0/22 1 1 0 <=== tn node +-- 192.168.0.0/23 1 0 0 <== node a \|-- 192.168.0.0 /24 universe UNICAST \|-- 192.168.1.0 /24 universe UNICAST \|-- 192.168.2.0 <== leaf(node b) When process node b, which is a leaf. 
here: i is 1, node key "192.168.2.0" oldnode is (pos:22, bits:1) unpatch source: tkey_extract_bits(node->key, oldtnode->pos + oldtnode->bits, 1) it equals: tkey_extract_bits("192.168,2,0", 22 + 1, 1) thus got 0, and call put_child(tn, 2i, node); <== 2i=2. patched source: tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits + 1), tkey_extract_bits("192.168,2,0", 22, 1 + 1) <== get 2. Test case 2: inflate a node which has a an child node with skipped bits ========================================================================== step 1. prepare a fib trie. ip link set eth0 up ip a add dev eth0 192.168.11.1/32 ip r a 192.168.128.0/24 via 192.168.11.1 ip r a 192.168.0.0/24 via 192.168.11.1 ip r a 192.168.16.0/24 via 192.168.11.1 ip r a 192.168.32.0/24 via 192.168.11.1 ip r a 192.168.48.0/24 via 192.168.11.1 ip r a 192.168.144.0/24 via 192.168.11.1 ip r a 192.168.160.0/24 via 192.168.11.1 ip r a 192.168.176.0/24 via 192.168.11.1 check: root@baker:~# cat /proc/net/fib_trie Main: +-- 192.168.0.0/16 1 0 0 +-- 192.168.0.0/18 2 0 0 \|-- 192.168.0.0 /24 universe UNICAST \|-- 192.168.16.0 /24 universe UNICAST \|-- 192.168.32.0 /24 universe UNICAST \|-- 192.168.48.0 /24 universe UNICAST +-- 192.168.128.0/18 2 0 0 \|-- 192.168.128.0 /24 universe UNICAST \|-- 192.168.144.0 /24 universe UNICAST \|-- 192.168.160.0 /24 universe UNICAST \|-- 192.168.176.0 /24 universe UNICAST Local: ... step 2. add a route to trigger inflate. ip r a 192.168.96.0/24 via 192.168.11.1 This command will call serveral times inflate. In the first time, the fib_trie is: ________________________ +-- 192.168.128.0/(16, 1) <== tn node +-- 192.168.0.0/(17, 1) <== node a +-- 192.168.0.0/(18, 2) \|-- 192.168.0.0 \|-- 192.168.16.0 \|-- 192.168.32.0 \|-- 192.168.48.0 \|-- 192.168.96.0 +-- 192.168.128.0/(18, 2) <== node b. \|-- 192.168.128.0 \|-- 192.168.144.0 \|-- 192.168.160.0 \|-- 192.168.176.0 NOTE: node b is a interal node with skipped bits. 
here, i:1, node->key "192.168.128.0", oldnode:(pos:16, bits:1) so tkey_extract_bits(node->key, oldtnode->pos + oldtnode->bits, 1) it equals: tkey_extract_bits("192.168.128.0", 16 + 1, 1) <=== 0 tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits + 1) it equals: tkey_extract_bits("192.168.128.0", 16, 1+1) <=== 2 2*i + 0 == 2, so the result is the same. Signed-off-by: baker.zhang <baker.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 16:37:15 -04:00
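The two test cases in the fib_trie commit above both rest on one identity: extracting `bits + 1` bits at `pos` gives the same child slot as `2*i` plus the single next bit, where `i` is the node's index in the old tnode. The sketch below replays both cases in user space. It assumes a 32-bit key with bit offsets counted from the most significant bit, which is what the commit's numbers imply; this `tkey_extract_bits` is a Python stand-in written for this example, and `ip_key`, `old_slot` and `new_slot` are hypothetical helper names.

```python
KEYLENGTH = 32  # assumption: 32-bit IPv4 keys


def tkey_extract_bits(key, offset, bits):
    """Extract `bits` bits of `key`, starting `offset` bits from the MSB."""
    if offset >= KEYLENGTH:
        return 0
    return ((key << offset) & 0xFFFFFFFF) >> (KEYLENGTH - bits)


def ip_key(dotted):
    """Convert a dotted-quad string to a 32-bit integer key."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def old_slot(key, pos, bits):
    """Slot as computed before the patch: 2*i plus one separately extracted bit."""
    i = tkey_extract_bits(key, pos, bits)
    return 2 * i + tkey_extract_bits(key, pos + bits, 1)


def new_slot(key, pos, bits):
    """Slot as computed after the patch: one extraction of bits + 1 bits."""
    return tkey_extract_bits(key, pos, bits + 1)


# (node key, oldtnode->pos, oldtnode->bits) from test cases 1 and 2.
for dotted, pos, bits in (("192.168.2.0", 22, 1), ("192.168.128.0", 16, 1)):
    key = ip_key(dotted)
    print(dotted, old_slot(key, pos, bits), new_slot(key, pos, bits))
```

Both cases land in slot 2 either way, matching the values quoted in the commit message.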
Andi Kleen	5843ef4213	tcp: Always set options to 0 before calling tcp_established_options tcp_established_options assumes opts->options is 0 before calling, as it read modify writes it. For the tcp_current_mss() case the opts structure is not zeroed, so this can be done with uninitialized values. This is ok, because ->options is not read in this path. But it's still better to avoid the operation on the uninitialized field. This shuts up a static code analyzer, and presumably may help the optimizer. Cc: netdev@vger.kernel.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 16:32:43 -04:00
Marcel Holtmann	3b1662952e	Bluetooth: Fix memory leak with L2CAP signal channels L2CAP signalling packets that arrive on the wrong type of link (BR/EDR or LE) need to be dropped. When that happens the packet is dropped, but its memory is not freed. So actually free the memory as well. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-10-02 17:17:05 -03:00
Mathias Krause	6865d1e834	unix_diag: fix info leak When filling the netlink message we fail to wipe the pad field and therefore leak one byte of heap memory to userland. Fix this by setting pad to 0. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-02 16:08:24 -04:00
Arik Nemtsov	7578d57520	mac80211: implement STA CSA for drivers using channel contexts Limit the current implementation to a single channel context used by a single vif, thereby avoiding multi-vif/channel complexities. Reuse the main function from AP CSA code, but move a portion out in order to fit the STA scenario. Add a new mac80211 HW flag so we don't break devices that don't support channel switch with channel-contexts. The new behavior will be opt-in. Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-02 18:18:23 +02:00
Marcel Holtmann	9ab8cf3729	Bluetooth: Increment management interface revision This patch increments the management interface revision due to the various fixes, improvements and other changes that have gone in lately. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 16:24:03 +03:00
Johan Hedberg	11802b299f	Bluetooth: Fix advertising data flags with disabled BR/EDR We shouldn't include the simultaneous LE & BR/EDR flags in the LE advertising data if BR/EDR is disabled on a dual-mode controller. This patch fixes this issue and ensures that the create_ad function generates the correct flags when BR/EDR is disabled. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-10-02 06:18:18 -07:00
Johan Hedberg	e6fe798652	Bluetooth: Fix REJECTED vs NOT_SUPPORTED mgmt responses The REJECTED management response should mainly be used when the adapter is in a state where we cannot accept some command or a specific parameter value. The NOT_SUPPORTED response in turn means that the adapter really cannot support the command or parameter value. This patch fixes this distinction and adds two helper functions to easily get the appropriate LE or BR/EDR related status response. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-10-02 05:52:51 -07:00
Marcel Holtmann	d13eafce2c	Bluetooth: Add management command for setting static address On dual-mode BR/EDR/LE and LE only controllers it is possible to configure a random address. There are two types of random addresses, one is static and the other private. Since the random private addresses require a special privacy feature to be supported, the configuration of these two are kept separate. This command allows for setting the static random address. It is only supported on controllers with LE support. The static random address is supposed to be valid for the lifetime of the controller or at least until the next power cycle. To ensure such behavior, setting of the address is limited to when the controller is powered off. The special BDADDR_ANY address (00:00:00:00:00:00) can be used to disable the static address. This is also the default value. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 14:50:58 +03:00
Matthias Schiffer	6c519bad7b	batman-adv: set up network coding packet handlers during module init batman-adv saves its table of packet handlers as a global state, so handlers must be set up only once (and setting them up a second time will fail). The recently-added network coding support tries to set up its handler each time a new softif is registered, which obviously fails when more than one softif is used (and in consequence, the softif creation fails). Fix this by splitting up batadv_nc_init into batadv_nc_init (which is called only once) and batadv_nc_mesh_init (which is called for each softif); in addition batadv_nc_free is renamed to batadv_nc_mesh_free to keep naming consistent. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2013-10-02 13:46:19 +02:00
Marcel Holtmann	a0cdf960be	Bluetooth: Restrict disabling of HS when controller is powered off Disabling the high speed setting when the controller is powered on has too many side effects that are not taken care of. And in general it is not a useful operation anyway. So just make such a command fail with a rejection error message. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 13:51:50 +03:00
Johan Hedberg	0663ca2a03	Bluetooth: Add a new mgmt_set_bredr command This patch introduces a new mgmt command for enabling/disabling BR/EDR functionality. This can be convenient when one wants to make a dual-mode controller behave like a single-mode one. The command is only available for dual-mode controllers and requires that LE is enabled before using it. The BR/EDR setting can be enabled at any point, however disabling it requires the controller to be powered off (otherwise a "rejected" response will be sent). Disabling the BR/EDR setting will automatically disable all other BR/EDR related settings. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-10-02 03:48:28 -07:00
Johan Hedberg	56f8790102	Bluetooth: Introduce a new HCI_BREDR_ENABLED flag To allow treating dual-mode (BR/EDR/LE) controllers as single-mode ones (LE-only) we want to introduce a new HCI_BREDR_ENABLED flag to track whether BR/EDR is enabled or not (previously we simply looked at the feature bit with lmp_bredr_enabled). This patch adds the new flag and updates the relevant places to test against it instead of using lmp_bredr_enabled. The flag is by default enabled when registering an adapter and only cleared if necessary once the local features have been read during the HCI init procedure. We cannot completely block BR/EDR usage in case user space uses raw HCI sockets but the patch tries to block this in places where possible, such as the various BR/EDR specific ioctls. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-10-02 03:48:28 -07:00
cedric Voncken	c6ca5e28bc	cfg80211: vlan priority handling in WMM If the VLAN tci is set in skb->vlan_tci use the priority field to determine the WMM priority. Signed-off-by: cedric Voncken <cedric.voncken@acksys.fr> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-02 11:04:33 +02:00
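The cfg80211 commit above reads the priority field out of `skb->vlan_tci`. In an 802.1Q tag control information (TCI) word the priority occupies the top 3 bits, followed by one drop-eligible bit and a 12-bit VLAN ID. The sketch below shows only that bit extraction; how the kernel then turns the value into a WMM priority is not described in the commit message and is not modelled here. The mask and shift values follow the 802.1Q layout, and `vlan_priority` and `make_tci` are illustrative helper names.

```python
VLAN_PRIO_MASK = 0xE000  # top 3 bits: priority code point
VLAN_PRIO_SHIFT = 13
VLAN_VID_MASK = 0x0FFF   # low 12 bits: VLAN ID


def vlan_priority(tci):
    """Return the 3-bit priority carried in a 16-bit 802.1Q TCI value."""
    return (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT


def make_tci(priority, vid, dei=0):
    """Build a TCI value from priority (0-7), DEI bit (0-1) and VLAN ID (0-4095)."""
    return ((priority & 0x7) << VLAN_PRIO_SHIFT) | ((dei & 0x1) << 12) | (vid & VLAN_VID_MASK)


tci = make_tci(priority=5, vid=100)
print(hex(tci), vlan_priority(tci))
```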
Johan Hedberg	e1d08f4067	Bluetooth: Fix workqueue synchronization in hci_dev_open When hci_sock.c calls hci_dev_open it needs to ensure that there isn't pending work in progress, such as that which is scheduled for the initial setup procedure or the one for automatically powering off after the setup procedure. This adds the necessary calls to ensure that any previously scheduled work is completed before attempting to call hci_dev_do_open. This patch fixes a race with old user space versions where we might receive a HCIDEVUP ioctl before the setup procedure has been completed. When that happens the setup procedure's callback may fail early and leave the device in an inconsistent state, causing e.g. the setup callback to be (incorrectly) called more than once. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-10-01 23:27:08 -07:00
Johan Hedberg	cbed0ca137	Bluetooth: Refactor hci_dev_open to a separate hci_dev_do_open function The requirements of an external call to hci_dev_open from hci_sock.c are different to that from within hci_core.c. In the former case we want to flush any pending work in hdev->req_workqueue whereas in the latter we don't (since there we are already calling from within the workqueue itself). This patch does the necessary refactoring to a separate hci_dev_do_open function (analogous to hci_dev_do_close) but does not yet introduce the synchronizations relating to the workqueue usage. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2013-10-01 23:27:08 -07:00
Marcel Holtmann	922ca1dfc2	Bluetooth: Enable -D__CHECK_ENDIAN__ for sparse by default The Bluetooth protocol and hardware is pretty much all little endian and so when running sparse via "make C=2" for example, enable the endian checks by default. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 09:10:05 +03:00
Marcel Holtmann	10a8b86f57	Bluetooth: Require CAP_NET_ADMIN for HCI User Channel operation The HCI User Channel operation is an admin operation that puts the device into promiscuous mode for single use. It is more suitable to require CAP_NET_ADMIN than CAP_NET_RAW. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 09:10:04 +03:00
Marcel Holtmann	ee39269369	Bluetooth: Send new settings event when changing high speed option When enabling or disabling high speed setting it is required to send a new settings event to inform other management interface users about the changed settings. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 09:10:01 +03:00
Marcel Holtmann	848566b381	Bluetooth: Provide high speed configuration option Hiding the Bluetooth high speed support behind a module parameter is not really useful. This can be enabled and disabled at runtime via the management interface. This also has the advantage that this can now be changed per controller and not just global. This patch removes the module parameter and exposes the high speed setting of the management interface to all controllers. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 09:09:59 +03:00
Marcel Holtmann	60f2a3ed7b	Bluetooth: Use only 2 bits for controller type information The controller type is limited to BR/EDR/LE and AMP controllers. This can be easily encoded with just 2 bits and still leave enough room for future controller types. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2013-10-02 09:09:54 +03:00
David S. Miller	4fbef95af4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be.h drivers/net/usb/qmi_wwan.c drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h include/net/netfilter/nf_conntrack_synproxy.h include/net/secure_seq.h The conflicts are of two varieties: 1) Conflicts with Joe Perches's 'extern' removal from header file function declarations. Usually it's an argument signature change or a function being added/removed. The resolutions are trivial. 2) Some overlapping changes in qmi_wwan.c and be.h, one commit adds a new value, another changes an existing value. That sort of thing. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 17:06:14 -04:00
Linus Torvalds	c31eeaced2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking changes from David Miller: 1) Multiply in netfilter IPVS can overflow when calculating destination weight. From Simon Kirby. 2) Use after free fixes in IPVS from Julian Anastasov. 3) SFC driver bug fixes from Daniel Pieczko. 4) Memory leak in pcan_usb_core failure paths, from Alexey Khoroshilov. 5) Locking and encapsulation fixes to serial line CAN driver, from Andrew Naujoks. 6) Duplex and VF handling fixes to bnx2x driver from Yaniv Rosner, Eilon Greenstein, and Ariel Elior. 7) In lapb, if no other packets are outstanding, T1 timeouts actually stall things and no packet gets sent. Fix from Josselin Costanzi. 8) ICMP redirects should not make it to the socket error queues, from Duan Jiong. 9) Fix bugs in skge DMA mapping error handling, from Nikulas Patocka. 10) Fix setting of VLAN priority field on via-rhine driver, from Roget Luethi. 11) Fix TX stalls and VLAN promisc programming in be2net driver from Ajit Khaparde. 12) Packet padding doesn't get handled correctly in new usbnet SG support code, from Ming Lei. 13) Fix races in netdevice teardown wrt. network namespace closing. From Eric W. Biederman. 14) Fix potential missed initialization of net_secret if not TCP connections are openned. From Eric Dumazet. 15) Cinterion PLXX product ID in qmi_wwan driver is wrong, from Aleksander Morgado. 16) skb_cow_head() can change skb->data and thus packet header pointers, don't use stale ip_hdr reference in ip_tunnel code. 17) Backend state transition handling fixes in xen-netback, from Paul Durrant. 18) Packet offset for AH protocol is handled wrong in flow dissector, from Eric Dumazet. 19) Taking down an fq packet scheduler instance can leave stale packets in the queues, fix from Eric Dumazet. 20) Fix performance regressions introduced by TCP Small Queues. From Eric Dumazet. 21) IPV6 GRE tunneling code calculates max_headroom incorrectly, from Hannes Frederic Sowa. 
22) Multicast timer handlers in ipv4 and ipv6 can be the last and final reference to the ipv4/ipv6 specific network device state, so use the reference put that will check and release the object if the reference hits zero. From Salam Noureddine. 23) Fix memory corruption in ip_tunnel driver, and use skb_push() instead of __skb_push() so that similar bugs are less hard to find. From Steffen Klassert. 24) Add forgotten hookup of rtnl_ops in SIT and ip6tnl drivers, from Nicolas Dichtel. 25) fq scheduler doesn't accurately rate limit in certain circumstances, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits) pkt_sched: fq: rate limiting improvements ip6tnl: allow to use rtnl ops on fb tunnel sit: allow to use rtnl ops on fb tunnel ip_tunnel: Remove double unregister of the fallback device ip_tunnel_core: Change __skb_push back to skb_push ip_tunnel: Add fallback tunnels to the hash lists ip_tunnel: Fix a memory corruption in ip_tunnel_xmit qlcnic: Fix SR-IOV configuration ll_temac: Reset dma descriptors indexes on ndo_open skbuff: size of hole is wrong in a comment ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put ethernet: moxa: fix incorrect placement of __initdata tag ipv6: gre: correct calculation of max_headroom powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file Revert "powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file" bonding: Fix broken promiscuity reference counting issue tcp: TSQ can use a dynamic limit dm9601: fix IFF_ALLMULTI handling pkt_sched: fq: qdisc dismantle fixes ...	2013-10-01 12:58:48 -07:00
Eric Dumazet	0eab5eb7a3	pkt_sched: fq: rate limiting improvements FQ rate limiting suffers from two problems, reported by Steinar : 1) FQ enforces a delay when flow quantum is exhausted in order to reduce cpu overhead. But if packets are small, current delay computation is slightly wrong, and observed rates can be too high. Steinar had this problem because he disabled TSO and GSO, and default FQ quantum is 2*1514. (Of course, I wish recent TSO auto sizing changes will help to not having to disable TSO in the first place) 2) maxrate was not used for forwarded flows (skbs not attached to a socket) Tested: tc qdisc add dev eth0 root est 1sec 4sec fq maxrate 8Mbit netperf -H lpq84 -l 1000 & sleep 10 ; tc -s qdisc show dev eth0 qdisc fq 8003: root refcnt 32 limit 10000p flow_limit 100p buckets 1024 quantum 3028 initial_quantum 15140 maxrate 8000Kbit Sent 16819357 bytes 11258 pkt (dropped 0, overlimits 0 requeues 0) rate 7831Kbit 653pps backlog 7570b 5p requeues 0 44 flows (43 inactive, 1 throttled), next packet delay 2977352 ns 0 gc, 0 highprio, 5545 throttled lpq83:~# tcpdump -p -i eth0 host lpq84 -c 12 09:02:52.079484 IP lpq83 > lpq84: . 1389536928:1389538376(1448) ack 3808678021 win 457 <nop,nop,timestamp 961812 572609068> 09:02:52.079499 IP lpq83 > lpq84: . 1448:2896(1448) ack 1 win 457 <nop,nop,timestamp 961812 572609068> 09:02:52.079906 IP lpq84 > lpq83: . ack 2896 win 16384 <nop,nop,timestamp 572609080 961812> 09:02:52.082568 IP lpq83 > lpq84: . 2896:4344(1448) ack 1 win 457 <nop,nop,timestamp 961815 572609071> 09:02:52.082581 IP lpq83 > lpq84: . 4344:5792(1448) ack 1 win 457 <nop,nop,timestamp 961815 572609071> 09:02:52.083017 IP lpq84 > lpq83: . ack 5792 win 16384 <nop,nop,timestamp 572609083 961815> 09:02:52.085678 IP lpq83 > lpq84: . 5792:7240(1448) ack 1 win 457 <nop,nop,timestamp 961818 572609074> 09:02:52.085693 IP lpq83 > lpq84: . 7240:8688(1448) ack 1 win 457 <nop,nop,timestamp 961818 572609074> 09:02:52.086117 IP lpq84 > lpq83: . 
ack 8688 win 16384 <nop,nop,timestamp 572609086 961818> 09:02:52.088792 IP lpq83 > lpq84: . 8688:10136(1448) ack 1 win 457 <nop,nop,timestamp 961821 572609077> 09:02:52.088806 IP lpq83 > lpq84: . 10136:11584(1448) ack 1 win 457 <nop,nop,timestamp 961821 572609077> 09:02:52.089217 IP lpq84 > lpq83: . ack 11584 win 16384 <nop,nop,timestamp 572609090 961821> Reported-by: Steinar H. Gunderson <sesse@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 13:00:38 -04:00
Nicolas Dichtel	bb8140947a	ip6tnl: allow to use rtnl ops on fb tunnel rtnl ops were introduced by `c075b13098` ("ip6tnl: advertise tunnel param via rtnl"), but I forgot to assign rtnl ops to fb tunnels. Now that it is done, we must remove the explicit call to unregister_netdevice_queue(), because the fallback tunnel is added to the queue in ip6_tnl_destroy_tunnels() when checking rtnl_link_ops of all netdevices (this is valid since commit `0bd8762824` ("ip6tnl: add x-netns support")). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 12:55:53 -04:00
Nicolas Dichtel	205983c437	sit: allow to use rtnl ops on fb tunnel rtnl ops were introduced by `ba3e3f50a0` ("sit: advertise tunnel param via rtnl"), but I forgot to assign rtnl ops to fb tunnels. Now that it is done, we must remove the explicit call to unregister_netdevice_queue(), because the fallback tunnel is added to the queue in sit_destroy_tunnels() when checking rtnl_link_ops of all netdevices (this is valid since commit `5e6700b3bf` ("sit: add support of x-netns")). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 12:55:53 -04:00
Steffen Klassert	cfe4a53692	ip_tunnel: Remove double unregister of the fallback device When queueing the netdevices for removal, we queue the fallback device twice in ip_tunnel_destroy(). The first time when we queue all netdevices in the namespace and then again explicitly. Fix this by removing the explicit queueing of the fallback device. Bug was introduced when network namespace support was added with commit `6c742e714d` ("ipip: add x-netns support"). Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 12:42:16 -04:00
Steffen Klassert	78a3694d44	ip_tunnel_core: Change __skb_push back to skb_push Git commit `0e6fbc5b` ("ip_tunnels: extend iptunnel_xmit()") moved the IP header installation to iptunnel_xmit() and changed skb_push() to __skb_push(). This makes possible bugs hard to track down, so change it back to skb_push(). Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 12:42:16 -04:00
Steffen Klassert	6701328262	ip_tunnel: Add fallback tunnels to the hash lists Currently we can not update the tunnel parameters of the fallback tunnels because we don't find them in the hash lists. Fix this by adding them on initialization. Bug was introduced with commit `c544193214` ("GRE: Refactor GRE tunneling code.") Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 12:42:16 -04:00
Steffen Klassert	3e08f4a72f	ip_tunnel: Fix a memory corruption in ip_tunnel_xmit We might extend the used area of a skb beyond the total headroom when we install the ipip header. Fix this by calling skb_cow_head() unconditionally. Bug was introduced with commit `c544193214` ("GRE: Refactor GRE tunneling code.") Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 12:42:16 -04:00
David S. Miller	e024bdc051	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter/IPVS fixes for your net tree, they are: * Fix BUG_ON splat due to malformed TCP packets seen by synproxy, from Patrick McHardy. * Fix possible weight overflow in lblc and lblcr schedulers due to 32-bits arithmetics, from Simon Kirby. * Fix possible memory access race in the lblc and lblcr schedulers, introduced when it was converted to use RCU, two patches from Julian Anastasov. * Fix hard dependency on CPU 0 when reading per-cpu stats in the rate estimator, from Julian Anastasov. * Fix race that may lead to object use after release, when invoking ipvsadm -C && ipvsadm -R, introduced when adding RCU, from Julian Anastasov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-01 12:39:35 -04:00
Johannes Berg	131a19bc92	regulatory: enable channels 52-64 and 100-144 for world roaming If allowed in a country, these channels typically require DFS so mark them as such. Channel 144 is a bit special, it's coming into use now to allow more VHT 80 channels, but world roaming with passive scanning is acceptable anyway. It seems fairly unlikely that it'll be used as the control channel for a VHT AP, but it needs to be present to allow a full VHT connection to an AP that uses it as one of the secondary channels. Also enable VHT 160 on these channels, and also for channels 36-48 to be able to use VHT 160 there. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 14:14:02 +02:00
Pablo Neira Ayuso	91cb498e6a	netfilter: cttimeout: allow to set/get default protocol timeouts Default timeouts are currently set via proc/sysctl interface, the typical pattern is a file name like: /proc/sys/net/netfilter/nf_conntrack_PROTOCOL_timeout_STATE This results in one entry per default protocol state timeout. This patch simplifies this by allowing to set default protocol timeouts via cttimeout netlink interface. This should allow us to get rid of the existing proc/sysctl code in the midterm. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-01 13:17:39 +02:00
Simon Wunderlich	ff311bc11a	nl80211: allow CAC only if no operation is going on A CAC should fail if it is triggered while the interface is already running. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 13:05:59 +02:00
holger@eitzenberger.org	180cf72f56	netfilter: nf_ct_sip: consolidate NAT hook functions There are currently seven different NAT hooks used in both nf_conntrack_sip and nf_nat_sip, each of the hooks is exported in nf_conntrack_sip, then set from the nf_nat_sip NAT helper. And because each of them is exported there is quite some overhead introduced due to this. By introducing nf_nat_sip_hooks I am able to reduce both text/data somewhat. For nf_conntrack_sip e.g. I get text data bss dec old 15243 5256 32 20531 new 15010 5192 32 20234 Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-01 12:47:09 +02:00
Gao feng	afff14f608	netfilter: nfnetlink_log: use proper net to allocate skb Use proper net struct to allocate skb, otherwise netlink mmap will be of no effect. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-01 12:46:56 +02:00
Fred Zhou	1f4ffde845	mac80211: improve default WMM parameter setting Move the default setting for WMM parameters outside the for loop to avoid redundant assignment multiple times. Signed-off-by: Fred Zhou <fred.zy@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 12:24:29 +02:00
Michal Kazior	0cfcefef19	mac80211: support reporting A-MSDU subframes individually Some devices may not be able to report A-MSDUs in single buffers. Drivers for such devices were forced to re-assemble A-MSDUs which would then be eventually disassembled by mac80211. This could lead to CPU cache thrashing and poor performance. Since A-MSDU has a single sequence number all subframes share it. This was in conflict with retransmission/duplication recovery (IEEE802.11-2012: 9.3.2.10). Patch introduces a new flag that is meant to be set for all individually reported A-MSDU subframes except the last one. This ensures the last_seq_ctrl is updated after the last subframe is processed. If an A-MSDU is actually a duplicate transmission all reported subframes will be properly discarded. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> [johannes: add braces that were missing even before] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 12:22:03 +02:00
Fred Zhou	15e230abaa	mac80211: use exact-size allocation for authentication frame The authentication frame has a fixed size of 30 bytes (including header, algo num, trans seq num, and status) followed by a variable challenge text. Allocate using exact size, instead of over-allocation by sizeof(ieee80211_mgmt). Signed-off-by: Fred Zhou <fred.zy@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 12:20:38 +02:00
Gao feng	7433268783	netfilter: nfnetlink_queue: use proper net namespace to allocate skb Use proper net struct to allocate skb, otherwise netlink mmap will have no effect. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-10-01 12:20:31 +02:00
Janusz Dziedzic	f0823475d5	cfg80211: parse dfs region for internal regdb option Add support for parsing and setting the dfs region (ETSI, FCC, JP) when the internal regulatory database is used. Before this the DFS region was being ignored even if present on the used db.txt Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Reviewed-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 12:18:36 +02:00
Johannes Berg	55fff50113	mac80211: add explicit IBSS driver operations This can be useful for drivers if they have any failure cases when joining an IBSS. Also move setting the queue parameters to before this new call, in case the new driver op needs them already. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 12:17:45 +02:00
Eliad Peller	5eb7906b47	ieee80211: fix vht cap definitions VHT_CAP_BEAMFORMER_ANTENNAS cap is actually defined in the draft as VHT_CAP_BEAMFORMEE_STS_MAX, and its size is 3 bits long. VHT_CAP_SOUNDING_DIMENSIONS is also 3 bits long. Fix the definitions and change the cap masking accordingly. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 12:17:08 +02:00
Eliad Peller	f364ef99a8	mac80211: fix some snprintf misuses In some debugfs related functions snprintf was used while scnprintf should have been used instead. (blindly adding the return value of snprintf and supplying it to the next snprintf might result in buffer overflow when the input is too big) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-10-01 12:16:51 +02:00
Fan Du	f59bbdfa5c	xfrm: Simplify SA looking up when using wildcard source __xfrm4/6_state_addr_check is a four-step check, but all we need to do is check whether the destination address matches when looking up an SA using a wildcard source address. Passing saddr from the flow is the worst option, as the check needs to reach the fourth step while a single check would actually do the work. So, simplify this process by only checking the destination address when using a wildcard source address for looking up SAs. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-10-01 10:09:33 +02:00
Fan Du	6f1156383a	xfrm: Force SA to be lookup again if SA in acquire state If an SA is in the process of being acquired, this indicates that this SA is more promising and precise than the fallback option, i.e. using a wildcard source address to search for a less suitable SA. So, bail out here and try again. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-10-01 10:09:33 +02:00
Salam Noureddine	9260d3e101	ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put It is possible for the timer handlers to run after the call to ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the handler function in order to do proper cleanup when the refcnt reaches 0. Otherwise, the refcnt can reach zero without the inet6_dev being destroyed and we end up leaking a reference to the net_device and see messages like the following, unregister_netdevice: waiting for eth0 to become free. Usage count = 1 Tested on linux-3.4.43. Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 22:28:58 -07:00
Salam Noureddine	e2401654dd	ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put It is possible for the timer handlers to run after the call to ip_mc_down so use in_dev_put instead of __in_dev_put in the handler function in order to do proper cleanup when the refcnt reaches 0. Otherwise, the refcnt can reach zero without the in_device being destroyed and we end up leaking a reference to the net_device and see messages like the following, unregister_netdevice: waiting for eth0 to become free. Usage count = 1 Tested on linux-3.4.43. Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 22:28:56 -07:00
Hannes Frederic Sowa	3da812d860	ipv6: gre: correct calculation of max_headroom gre_hlen already accounts for sizeof(struct ipv6_hdr) + gre header, so initialize max_headroom to zero. Otherwise the if (encap_limit >= 0) { max_headroom += 8; mtu -= 8; } increments an uninitialized variable before max_headroom was reset. Found with coverity: 728539 Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 22:04:09 -07:00
Eric W. Biederman	0bbf87d852	net ipv4: Convert ipv4.ip_local_port_range to be per netns v3 - Move sysctl_local_ports from a global variable into struct netns_ipv4. - Modify inet_get_local_port_range to take a struct net, and update all of the callers. - Move the initialization of sysctl_local_ports into sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c v2: - Ensure indentation used tabs - Fixed ip.h so it applies cleanly to todays net-next v3: - Compile fixes of strange callers of inet_get_local_port_range. This patch now successfully passes an allmodconfig build. Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range Originally-by: Samya <samya@twitter.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 21:59:38 -07:00
stephen hemminger	56d7b53f47	ethernet: use likely() for common Ethernet encap Mark code paths likely/unlikely based on most common usage. * Very few devices use dsa tags. * Most traffic is Ethernet (not 802.2) * No sane person uses trailer type or Novell encapsulation Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 21:52:53 -07:00
stephen hemminger	12861b7bc2	ethernet: cleanup eth_type_trans Remove old legacy comment and weird if condition. The comment has outlived its stay and is a throwback to some early net code (before my time). Maybe Dave remembers what it meant. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 21:52:52 -07:00
Eric Dumazet	c9eeec26e3	tcp: TSQ can use a dynamic limit When TCP Small Queues was added, we used a sysctl to limit amount of packets queues on Qdisc/device queues for a given TCP flow. Problem is this limit is either too big for low rates, or too small for high rates. Now TCP stack has rate estimation in sk->sk_pacing_rate, and TSO auto sizing, it can better control number of packets in Qdisc/device queues. New limit is two packets or at least 1 to 2 ms worth of packets. Low rates flows benefit from this patch by having even smaller number of packets in queues, allowing for faster recovery, better RTT estimations. High rates flows benefit from this patch by allowing more than 2 packets in flight as we had reports this was a limiting factor to reach line rate. [ In particular if TX completion is delayed because of coalescing parameters ] Example for a single flow on 10Gbp link controlled by FQ/pacing 14 packets in flight instead of 2 $ tc -s -d qd qdisc fq 8001: dev eth0 root refcnt 32 limit 10000p flow_limit 100p buckets 1024 quantum 3028 initial_quantum 15140 Sent 1168459366606 bytes 771822841 pkt (dropped 0, overlimits 0 requeues 6822476) rate 9346Mbit 771713pps backlog `953820b` 14p requeues 6822476 2047 flow, 2046 inactive, 1 throttled, delay 15673 ns 2372 gc, 0 highprio, 0 retrans, 9739249 throttled, 0 flows_plimit Note that sk_pacing_rate is currently set to twice the actual rate, but this might be refined in the future when a flow is in congestion avoidance. Additional change : skb->destructor should be set to tcp_wfree(). A future patch (for linux 3.13+) might remove tcp_limit_output_bytes Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 20:41:57 -07:00
Li RongQing	fbadadd90c	ipv6: Not need to set fl6.flowi6_flags as zero setting fl6.flowi6_flags as zero after memset is redundant, Remove it. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 19:14:11 -04:00
John W. Linville	15214c2f6c	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-09-30 16:14:27 -04:00
Eric Dumazet	8d34ce10c5	pkt_sched: fq: qdisc dismantle fixes fq_reset() should drop all packets in queue, including throttled flows. This patch moves code from fq_destroy() to fq_reset() to do the cleaning. fq_change() must stop calling fq_dequeue() if all remaining packets are from throttled flows. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:51:23 -04:00
stephen hemminger	6459082a3c	qdisc: basic classifier - remove unnecessary initialization err is set once, then first code resets it. err = tcf_exts_validate(...) Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jamal Hadi Salim <hadi@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:47:43 -04:00
stephen hemminger	0c4e4020f0	qdisc: meta return ENOMEM on alloc failure Rather than returning earlier value (EINVAL), return ENOMEM if kzalloc fails. Found while reviewing to find another EINVAL condition. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:47:43 -04:00
Oliver Smith	7c3ad056ef	netfilter: ipset: Add hash:net,port,net module to kernel. This adds a new set that provides similar functionality to ip,port,net but permits arbitrary size subnets for both the first and last parameter. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:42:58 +02:00
Vitaly Lavrov	1785e8f473	netfiler: ipset: Add net namespace for ipset This patch adds netns support for ipset. Major changes were made in ip_set_core.c and ip_set.h. Global variables are moved to per net namespace. Added initialization code and the destruction of the network namespace ipset subsystem. In the prototypes of public functions ip_set_* added parameter "struct net". The remaining corrections related to the change prototypes of public functions ip_set_. The patch for git://git.netfilter.org/ipset.git commit 6a4ec96c0b8caac5c35474e40e319704d92ca347 Signed-off-by: Vitaly Lavrov <lve@guap.ru> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:42:52 +02:00
Jozsef Kadlecsik	3fd986b3d9	netfilter: ipset: Use a common function at listing the extensions Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:42:36 +02:00
Jozsef Kadlecsik	8ec81f9a4d	netfilter: ipset: For set:list types, replaced elements must be zeroed out The new extensions require zero initialization for the new element to be added into a slot from where another element was pushed away. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:29 +02:00
Jozsef Kadlecsik	80571a9ea4	netfilter: ipset: Fix hash resizing with comments The destroy function must take into account that resizing doesn't create new extensions so those cannot be destroyed at resize. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:29 +02:00
Oliver Smith	fda75c6d9e	netfilter: ipset: Support comments in hash-type ipsets. This provides kernel support for creating ipsets with comment support. This does incur a penalty to flushing/destroying an ipset since all entries are walked in order to free the allocated strings, this penalty is of course less expensive than the operation of listing an ipset to userspace, so for general-purpose usage the overall impact is expected to be little to none. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:29 +02:00
Oliver Smith	81b10bb4bd	netfilter: ipset: Support comments in the list-type ipset. This provides kernel support for creating list ipsets with the comment annotation extension. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:29 +02:00
Oliver Smith	b90cb8ba19	netfilter: ipset: Support comments in bitmap-type ipsets. This provides kernel support for creating bitmap ipsets with comment support. As is the case for hashes, this incurs a penalty when flushing or destroying the entire ipset as the entries must first be walked in order to free the comment strings. This penalty is of course far less than the cost of listing an ipset to userspace. Any set created without support for comments will be flushed/destroyed as before. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:28 +02:00
Oliver Smith	68b63f08d2	netfilter: ipset: Support comments for ipset entries in the core. This adds the core support for having comments on ipset entries. The comments are stored as standard null-terminated strings in dynamically allocated memory after being passed to the kernel. As a result of this, code has been added to the generic destroy function to iterate all extensions and call that extension's destroy task if the set has that extension activated, and if such a task is defined. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:28 +02:00
Oliver Smith	ea53ac5b63	netfilter: ipset: Add hash:net,net module to kernel. This adds a new set that provides the ability to configure pairs of subnets. A small amount of additional handling code has been added to the generic hash header file - this code is conditionally activated by a preprocessor definition. Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:28 +02:00
Jozsef Kadlecsik	d9628bbeca	netfilter: ipset: Kconfig: ipset needs NETFILTER_NETLINK Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:28 +02:00
Jozsef Kadlecsik	b91b396d5e	netfilter: ipset: list:set: make sure all elements are checked by the gc When an element timed out, the next one was skipped by the garbage collector, fixed. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik	40cd63bf33	netfilter: ipset: Support extensions which need a per data destroy function Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik	03c8b234e6	netfilter: ipset: Generalize extensions support Get rid of the structure based extensions and introduce a blob for the extensions. Thus we can support more extension types easily. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik	ca134ce864	netfilter: ipset: Move extension data to set structure Default timeout and extension offsets are moved to struct set, because all set types support all extensions, which makes it possible to generalize extension support. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik	f925f70569	netfilter: ipset: Rename extension offset ids to extension ids Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:27 +02:00
Jozsef Kadlecsik	a04d8b6bd9	netfilter: ipset: Prepare ipset to support multiple networks for hash types In order to support hash:net,net, hash:net,port,net etc. types, arrays are introduced for the book-keeping of existing cidr sizes and network numbers in a set. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:26 +02:00
Jozsef Kadlecsik	5e04c0c38c	netfilter: ipset: Introduce new operation to get both setname and family ip[6]tables set match and SET target need to know the family of the set in order to reject adding rules which refer to a set with a non-matching family. Currently such rules are silently accepted and then ignored instead of generating a clear error message to the user, which is not helpful. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:26 +02:00
Jozsef Kadlecsik	bd3129fc5e	netfilter: ipset: order matches and targets separatedly in xt_set.c Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:26 +02:00
Anders K. Pedersen	60b0fe3724	netfilter: ipset: Support package fragments for IPv4 protos without ports Enable ipset port set types to match IPv4 package fragments for protocols that don't have ports (or whose port information isn't supported by ipset). For example this allows a hash:ip,port ipset containing the entry 192.168.0.1,gre:0 to match all package fragments for PPTP VPN tunnels to/from the host. Without this patch only the first package fragment (with fragment offset 0) was matched, while subsequent fragments weren't. This is not possible for IPv6, where the protocol is in the fragmented part of the package unlike IPv4, where the protocol is in the IP header. IPPROTO_ICMPV6 is deliberately not included, because it isn't relevant for IPv4. Signed-off-by: Anders K. Pedersen <akp@surftown.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:26 +02:00
Jozsef Kadlecsik	20b2fab483	netfilter: ipset: Fix "may be used uninitialized" warnings Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik	35b8dcf8c3	netfilter: ipset: Rename simple macro names to avoid namespace issues. Reported-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik	a0f28dc754	netfilter: ipset: Fix sparse warnings due to missing rcu annotations Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik	b3aabd149c	netfilter: ipset: Sparse warning about shadowed variable fixed net/netfilter/ipset/ip_set_hash_ipportnet.c:275:20: warning: symbol 'cidr' shadows an earlier one Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:25 +02:00
Jozsef Kadlecsik	122ebbf24c	netfilter: ipset: Don't call ip_nest_end needlessly in the error path Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>	2013-09-30 21:33:25 +02:00
Eric Dumazet	b86783587b	net: flow_dissector: fix thoff for IPPROTO_AH In commit `8ed781668d` ("flow_keys: include thoff into flow_keys for later usage"), we missed that existing code was using nhoff as a temporary variable that could not always contain transport header offset. This is not a problem for TCP/UDP because port offset (@poff) is 0 for these protocols. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:32:05 -04:00
David S. Miller	7b77d161ce	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: include/net/xfrm.h Simple conflict between Joe Perches "extern" removal for function declarations in header files and the changes in Steffen's tree. Steffen Klassert says: ==================== Two patches that are left from the last development cycle. Manual merging of include/net/xfrm.h is needed. The conflict can be solved as it is currently done in linux-next. 1) We announce the creation of temporary acquire state via an async event, so the deletion should be announced too. From Nicolas Dichtel. 2) The VTI tunnels do not do real tunneling, they just provide a routable IPsec tunnel interface. So introduce and use xfrm_tunnel_notifier instead of xfrm_tunnel for xfrm tunnel mode callback. From Fan Du. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:24:57 -04:00
Nicolas Dichtel	991fb3f74c	dev: always advertise rx_flags changes via netlink When flags IFF_PROMISC and IFF_ALLMULTI are changed, netlink messages are not consistent. For example, if a multicast daemon is running (flag IFF_ALLMULTI set in dev->flags but not dev->gflags, ie not exported to userspace) and then a user sets it via netlink (flag IFF_ALLMULTI set in dev->flags and dev->gflags, ie exported to userspace), no netlink message is sent. Same for IFF_PROMISC and because dev->promiscuity is exported via IFLA_PROMISCUITY, we may send a netlink message after each change of this counter. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:08:13 -04:00
Nicolas Dichtel	a528c219df	dev: update __dev_notify_flags() to send rtnl msg This patch only prepares the next one, there is no functional change. Now, __dev_notify_flags() can also be used to notify flags changes via rtnetlink. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:08:12 -04:00
Paul Marks	c9d55d5bff	ipv6: Fix preferred_lft not updating in some cases Consider the scenario where an IPv6 router is advertising a fixed preferred_lft of 1800 seconds, while the valid_lft begins at 3600 seconds and counts down in realtime. A client should reset its preferred_lft to 1800 every time the RA is received, but a bug is causing Linux to ignore the update. The core problem is here: if (prefered_lft != ifp->prefered_lft) { Note that ifp->prefered_lft is an offset, so it doesn't decrease over time. Thus, the comparison is always (1800 != 1800), which fails to trigger an update. The most direct solution would be to compute a "stored_prefered_lft", and use that value in the comparison. But I think that trying to filter out unnecessary updates here is a premature optimization. In order for the filter to apply, both of these would need to hold: - The advertised valid_lft and preferred_lft are both declining in real time. - No clock skew exists between the router & client. So in this patch, I've set "update_lft = 1" unconditionally, which allows the surrounding code to be greatly simplified. Signed-off-by: Paul Marks <pmarks@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:06:19 -04:00
Pravin B Shelar	d4a71b155c	ip_tunnel: Do not use stale inner_iph pointer. While sending a packet, skb_cow_head() can change the skb header, which invalidates the inner_iph pointer to the skb header. The following patch avoids using it. Found by code inspection. This bug was introduced by commit `0e6fbc5b6c` (ip_tunnels: extend iptunnel_xmit()). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-30 15:05:07 -04:00
Patrick McHardy	f4a87e7bd2	netfilter: synproxy: fix BUG_ON triggered by corrupt TCP packets TCP packets hitting the SYN proxy through the SYNPROXY target are not validated by TCP conntrack. When th->doff is below 5, an underflow happens when calculating the options length, causing skb_header_pointer() to return NULL and triggering the BUG_ON(). Handle this case gracefully by checking for NULL instead of using BUG_ON(). Reported-by: Martin Topholm <mph@one.com> Tested-by: Martin Topholm <mph@one.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-09-30 12:44:38 +02:00
Jouni Malinen	22c4ceed01	mac80211: Run deferred scan if last roc_list item is not started mac80211 scan processing could get stuck if roc work was pending, but not started, when a scan request was deferred due to such a roc item. Normally the deferred scan would be started from ieee80211_start_next_roc(), but ieee80211_sw_roc_work() calls that only if the finished ROC was started. Fix this by calling ieee80211_run_deferred_scan() in the case the last ROC was not actually started. This issue was hit relatively easily in P2P find operations where Listen state (remain-on-channel) and Search state (scan) are repeated in a loop. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-09-30 12:36:56 +02:00
Felix Fietkau	0c5b93290b	mac80211: update sta->last_rx on acked tx frames When clients are idle for too long, hostapd sends nullfunc frames for probing. When those are acked by the client, the idle time needs to be updated. To make this work (and to avoid unnecessary probing), update sta->last_rx whenever an ACK was received for a tx packet. Only do this if the flag IEEE80211_HW_REPORTS_TX_ACK_STATUS is set. Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-09-30 12:34:09 +02:00
Felix Fietkau	03bb7f4276	mac80211: use sta_info_get_bss() for nl80211 tx and client probing This allows calls for clients in AP_VLANs (e.g. for 4-addr) to succeed Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-09-30 11:30:57 +02:00
Eric Dumazet	62748f32d5	net: introduce SO_MAX_PACING_RATE As mentioned in commit `afe4fd0624` ("pkt_sched: fq: Fair Queue packet scheduler"), this patch adds a new socket option. SO_MAX_PACING_RATE offers the application the ability to cap the rate computed by transport layer. Value is in bytes per second. u32 val = 1000000; setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val)); To be effectively paced, a flow must use FQ packet scheduler. Note that a packet scheduler takes into account the headers for its computations. The effective payload rate depends on MSS and retransmits if any. I chose to make this pacing rate a SOL_SOCKET option instead of a TCP one because this can be used by other protocols. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-28 15:35:41 -07:00


29862 Commits