linux_old1

Commit Graph

Author	SHA1	Message	Date
Eric Dumazet	b8dfe49877	netfilter: factorize ifname_compare() We use same not trivial helper function in four places. We can factorize it. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 17:31:52 +01:00
Eric Dumazet	78f3648601	netfilter: nf_conntrack: use hlist_add_head_rcu() in nf_conntrack_set_hashsize() Using hlist_add_head() in nf_conntrack_set_hashsize() is quite dangerous. Without any barrier, one CPU could see a loop while doing its lookup. Its true new table cannot be seen by another cpu, but previous table is still readable. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 17:24:34 +01:00
Patrick McHardy	a9a9adfe2f	netfilter: fix xt_LED build failure net/netfilter/xt_LED.c:40: error: field netfilter_led_trigger has incomplete type net/netfilter/xt_LED.c: In function led_timeout_callback: net/netfilter/xt_LED.c:78: warning: unused variable ledinternal net/netfilter/xt_LED.c: In function led_tg_check: net/netfilter/xt_LED.c:102: error: implicit declaration of function led_trigger_register net/netfilter/xt_LED.c: In function led_tg_destroy: net/netfilter/xt_LED.c:135: error: implicit declaration of function led_trigger_unregister Fix by adding a dependency on LED_TRIGGERS. Reported-by: Sachin Sant <sachinp@in.ibm.com> Tested-by: Subrata Modak <tosubrata@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-25 17:21:34 +01:00
Vlad Yasevich	b2f5e7cd3d	ipv6: Fix conflict resolutions during ipv6 binding The ipv6 version of bind_conflict code calls ipv6_rcv_saddr_equal() which at times wrongly identified intersections between addresses. It particularly broke down under a few instances and caused erroneous bind conflicts. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:11 -07:00
Vlad Yasevich	63d9950b08	ipv6: Make v4-mapped bindings consistent with IPv4 Binding to a v4-mapped address on an AF_INET6 socket should produce the same result as binding to an IPv4 address on AF_INET socket. The two are interchangable as v4-mapped address is really a portability aid. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:10 -07:00
Vlad Yasevich	0f8d3c7ac3	ipv6: Allow ipv4 wildcard binds after ipv6 address binds The IPv4 wildcard (0.0.0.0) address does not intersect in any way with explicit IPv6 addresses. These two should be permitted, but the IPv4 conflict code checks the ipv6only bit as part of the test. Since binding to an explicit IPv6 address restricts the socket to only that IPv6 address, the side-effect is that the socket behaves as v6-only. By explicitely setting ipv6only in this case, allows the 2 binds to succeed. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:10 -07:00
Vlad Yasevich	783ed5a783	ipv6: Disallow binding to v4-mapped address on v6-only socket. A socket marked v6-only, can not receive or send traffic to v4-mapped addresses. Thus allowing binding to v4-mapped address on such a socket makes no sense. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 19:49:09 -07:00
David S. Miller	c80dd2da73	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-03-24 16:38:53 -07:00
Jason Baron	e9d376f0fa	dynamic debug: combine dprintk and dynamic printk This patch combines Greg Bank's dprintk() work with the existing dynamic printk patchset, we are now calling it 'dynamic debug'. The new feature of this patchset is a richer /debugfs control file interface, (an example output from my system is at the bottom), which allows fined grained control over the the debug output. The output can be controlled by function, file, module, format string, and line number. for example, enabled all debug messages in module 'nf_conntrack': echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control to disable them: echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control A further explanation can be found in the documentation patch. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-03-24 16:38:26 -07:00
Cornelia Huck	ffa6a7054d	Driver core: Fix device_move() vs. dpm list ordering, v2 dpm_list currently relies on the fact that child devices will be registered after their parents to get a correct suspend order. Using device_move() however destroys this assumption, as an already registered device may be moved under a newly registered one. This patch adds a new argument to device_move(), allowing callers to specify how dpm_list should be adapted. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-03-24 16:38:26 -07:00
Pablo Neira Ayuso	38938bfe34	netlink: add NETLINK_NO_ENOBUFS socket flag This patch adds the NETLINK_NO_ENOBUFS socket flag. This flag can be used by unicast and broadcast listeners to avoid receiving ENOBUFS errors. Generally speaking, ENOBUFS errors are useful to notify two things to the listener: a) You may increase the receiver buffer size via setsockopt(). b) You have lost messages, you may be out of sync. In some cases, ignoring ENOBUFS errors can be useful. For example: a) nfnetlink_queue: this subsystem does not have any sort of resync method and you can decide to ignore ENOBUFS once you have set a given buffer size. b) ctnetlink: you can use this together with the socket flag NETLINK_BROADCAST_SEND_ERROR to stop getting ENOBUFS errors as you do not need to resync (packets whose event are not delivered are drop to provide reliable logging and state-synchronization). Moreover, the use of NETLINK_NO_ENOBUFS also reduces a "go up, go down" effect in terms of performance which is due to the netlink congestion control when the listener cannot back off. The effect is the following: 1) throughput rate goes up and netlink messages are inserted in the receiver buffer. 2) Then, netlink buffer fills and overruns (set on nlk->state bit 0). 3) While the listener empties the receiver buffer, netlink keeps dropping messages. Thus, throughput goes dramatically down. 4) Then, once the listener has emptied the buffer (nlk->state bit 0 is set off), goto step 1. This effect is easy to trigger with netlink broadcast under heavy load, and it is more noticeable when using a big receiver buffer. You can find some results in [1] that show this problem. [1] http://1984.lsi.us.es/linux/netlink/ This patch also includes the use of sk_drop to account the number of netlink messages drop due to overrun. This value is shown in /proc/net/netlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 16:37:55 -07:00
Eric Dumazet	35c7f6de73	arp_tables: ifname_compare() can assume 16bit alignment Arches without efficient unaligned access can still perform a loop assuming 16bit alignment in ifname_compare() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 14:15:22 -07:00
Jan Engelhardt	8dd1d0471b	netfilter: trivial Kconfig spelling fixes Supplements commit `67c0d57930`. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-24 13:35:27 -07:00
David S. Miller	b5bb14386e	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2009-03-24 13:24:36 -07:00
Eric Dumazet	1d45209d89	netfilter: nf_conntrack: Reduce conntrack count in nf_conntrack_free() We use RCU to defer freeing of conntrack structures. In DOS situation, RCU might accumulate about 10.000 elements per CPU in its internal queues. To get accurate conntrack counts (at the expense of slightly more RAM used), we might consider conntrack counter not taking into account "about to be freed elements, waiting in RCU queues". We thus decrement it in nf_conntrack_free(), not in the RCU callback. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Tested-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-24 14:26:50 +01:00
Vitaly Mayatskikh	30842f2989	udp: Wrong locking code in udp seq_file infrastructure Reading zero bytes from /proc/net/udp or other similar files which use the same seq_file udp infrastructure panics kernel in that way: ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- read/1985 is trying to release lock (&table->hash[i].lock) at: [<ffffffff81321d83>] udp_seq_stop+0x27/0x29 but there are no more locks to release! other info that might help us debug this: 1 lock held by read/1985: #0: (&p->lock){--..}, at: [<ffffffff810eefb6>] seq_read+0x38/0x348 stack backtrace: Pid: 1985, comm: read Not tainted 2.6.29-rc8 #9 Call Trace: [<ffffffff81321d83>] ? udp_seq_stop+0x27/0x29 [<ffffffff8106dab9>] print_unlock_inbalance_bug+0xd6/0xe1 [<ffffffff8106db62>] lock_release_non_nested+0x9e/0x1c6 [<ffffffff810ef030>] ? seq_read+0xb2/0x348 [<ffffffff8106bdba>] ? mark_held_locks+0x68/0x86 [<ffffffff81321d83>] ? udp_seq_stop+0x27/0x29 [<ffffffff8106dde7>] lock_release+0x15d/0x189 [<ffffffff8137163c>] _spin_unlock_bh+0x1e/0x34 [<ffffffff81321d83>] udp_seq_stop+0x27/0x29 [<ffffffff810ef239>] seq_read+0x2bb/0x348 [<ffffffff810eef7e>] ? seq_read+0x0/0x348 [<ffffffff8111aedd>] proc_reg_read+0x90/0xaf [<ffffffff810d878f>] vfs_read+0xa6/0x103 [<ffffffff8106bfac>] ? trace_hardirqs_on_caller+0x12f/0x153 [<ffffffff810d88a2>] sys_read+0x45/0x69 [<ffffffff8101123a>] system_call_fastpath+0x16/0x1b BUG: scheduling while atomic: read/1985/0xffffff00 INFO: lockdep is turned off. Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table dm_multipath kvm ppdev snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event arc4 snd_s eq ecb thinkpad_acpi snd_seq_device iwl3945 hwmon sdhci_pci snd_pcm_oss sdhci rfkill mmc_core snd_mixer_oss i2c_i801 mac80211 yenta_socket ricoh_mmc i2c_core iTCO_wdt snd_pcm iTCO_vendor_support rs rc_nonstatic snd_timer snd lib80211 cfg80211 soundcore snd_page_alloc video parport_pc output parport e1000e [last unloaded: scsi_wait_scan] Pid: 1985, comm: read Not tainted 2.6.29-rc8 #9 Call Trace: [<ffffffff8106b456>] ? __debug_show_held_locks+0x1b/0x24 [<ffffffff81043660>] __schedule_bug+0x7e/0x83 [<ffffffff8136ede9>] schedule+0xce/0x838 [<ffffffff810d7972>] ? fsnotify_access+0x5f/0x67 [<ffffffff810112d0>] ? sysret_careful+0xb/0x37 [<ffffffff8106be9c>] ? trace_hardirqs_on_caller+0x1f/0x153 [<ffffffff8137127b>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff810112f6>] sysret_careful+0x31/0x37 read[1985]: segfault at 7fffc479bfe8 ip 0000003e7420a180 sp 00007fffc479bfa0 error 6 Kernel panic - not syncing: Aiee, killing interrupt handler! udp_seq_stop() tries to unlock not yet locked spinlock. The lock was lost during splitting global udp_hash_lock to subsequent spinlocks. Signed-off by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-23 15:22:33 -07:00
David S. Miller	8be7cdccac	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ucc_geth.c	2009-03-23 13:35:04 -07:00
Linus Torvalds	d56ffd38a9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) ucc_geth: Fix oops when using fixed-link support dm9000: locking bugfix net: update dnet.c for bus_id removal dnet: DNET should depend on HAS_IOMEM dca: add missing copyright/license headers nl80211: Check that function pointer != NULL before using it sungem: missing net_device_ops be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle be2net: replenish when posting to rx-queue is starved in out of mem conditions bas_gigaset: correctly allocate USB interrupt transfer buffer smsc911x: reset last known duplex and carrier on open sh_eth: Fix mistake of the address of SH7763 sh_eth: Change handling of IRQ netns: oops in ip[6]_frag_reasm incrementing stats net: kfree(napi->skb) => kfree_skb net: fix sctp breakage ipv6: fix display of local and remote sit endpoints net: Document /proc/sys/net/core/netdev_budget tulip: fix crash on iface up with shirq debug virtio_net: Make virtio_net support carrier detection ...	2009-03-23 09:25:58 -07:00
Mark H. Weaver	534f81a506	netfilter: nf_conntrack_tcp: fix unaligned memory access in tcp_sack This patch fixes an unaligned memory access in tcp_sack while reading sequence numbers from TCP selective acknowledgement options. Prior to applying this patch, upstream linux-2.6.27.20 was occasionally generating messages like this on my sparc64 system: [54678.532071] Kernel unaligned access at TPC[6b17d4] tcp_packet+0xcd4/0xd00 Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-23 13:46:12 +01:00
Pablo Neira Ayuso	dd5b6ce6fd	nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlink This patch adds nfnetlink_set_err() to propagate the error to netlink broadcast listener in case of memory allocation errors in the message building. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-23 13:21:06 +01:00
Eric Leblond	176252746e	netfilter: sysctl support of logger choice This patchs adds support of modification of the used logger via sysctl. It can be used to change the logger to module that can not use the bind operation (ipt_LOG and ipt_ULOG). For this purpose, it creates a directory /proc/sys/net/netfilter/nf_log which contains a file per-protocol. The content of the file is the name current logger (NONE if not set) and a logger can be setup by simply echoing its name to the file. By echoing "NONE" to a /proc/sys/net/netfilter/nf_log/PROTO file, the logger corresponding to this PROTO is set to NULL. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-23 13:16:53 +01:00
John Dykstra	96e0bf4b51	tcp: Discard segments that ack data not yet sent Discard incoming packets whose ack field iincludes data not yet sent. This is consistent with RFC 793 Section 3.9. Change tcp_ack() to distinguish between too-small and too-large ack field values. Keep segments with too-large ack fields out of the fast path, and change slow path to discard them. Reported-by: Oliver Zheng <mailinglists+netdev@oliverzheng.com> Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-22 21:49:57 -07:00
Stephen Hemminger	d44c3a2e0e	netdev: expose net_device_ops compat as config option Now that most network device drivers in (all but one in x86_64 allmodconfig) support net_device_ops. Expose it as a configuration parameter. Still need to address even older 32 bit drivers, and other arch before compatiablity can be scheduled for removal in some future release. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 22:55:36 -07:00
Stephen Hemminger	9cc8ba783d	irlan: convert to net_device_ops Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:16 -07:00
Stephen Hemminger	92bcd4fe9a	irda: net_device_ops ioctl fix Need to reference net_device_ops not old pointer. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:14 -07:00
Stephen Hemminger	dde0975855	atm: convert clip driver to net_device_ops Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:12 -07:00
Stephen Hemminger	788dee0a95	atm: convert mpc device to using netdev_ops This converts the mpc device to using new netdevice_ops. Compile tested only, needs more than usual review since device was swaping pointers around etc. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:19:12 -07:00
Lennert Buytenhek	e84665c9cb	dsa: add switch chip cascading support The initial version of the DSA driver only supported a single switch chip per network interface, while DSA-capable switch chips can be interconnected to form a tree of switch chips. This patch adds support for multiple switch chips on a network interface. An example topology for a 16-port device with an embedded CPU is as follows: +-----+ +--------+ +--------+ \| \|eth0 10\| switch \|9 10\| switch \| \| CPU +----------+ +-------+ \| \| \| \| chip 0 \| \| chip 1 \| +-----+ +---++---+ +---++---+ \|\| \|\| \|\| \|\| \|\|1000baseT \|\|1000baseT \|\|ports 1-8 \|\|ports 9-16 This requires a couple of interdependent changes in the DSA layer: - The dsa platform driver data needs to be extended: there is still only one netdevice per DSA driver instance (eth0 in the example above), but each of the switch chips in the tree needs its own mii_bus device pointer, MII management bus address, and port name array. (include/net/dsa.h) The existing in-tree dsa users need some small changes to deal with this. (arch/arm) - The DSA and Ethertype DSA tagging modules need to be extended to use the DSA device ID field on receive and demultiplex the packet accordingly, and fill in the DSA device ID field on transmit according to which switch chip the packet is heading to. (net/dsa/tag_{dsa,edsa}.c) - The concept of "CPU port", which is the switch chip port that the CPU is connected to (port 10 on switch chip 0 in the example), needs to be extended with the concept of "upstream port", which is the port on the switch chip that will bring us one hop closer to the CPU (port 10 for both switch chips in the example above). - The dsa platform data needs to specify which ports on which switch chips are links to other switch chips, so that we can enable DSA tagging mode on them. (For inter-switch links, we always use non-EtherType DSA tagging, since it has lower overhead. The CPU link uses dsa or edsa tagging depending on what the 'root' switch chip supports.) This is done by specifying "dsa" for the given port in the port array. - The dsa platform data needs to be extended with information on via which port to reach any given switch chip from any given switch chip. This info is specified via the per-switch chip data struct ->rtable[] array, which gives the nexthop ports for each of the other switches in the tree. For the example topology above, the dsa platform data would look something like this: static struct dsa_chip_data sw[2] = { { .mii_bus = &foo, .sw_addr = 1, .port_names[0] = "p1", .port_names[1] = "p2", .port_names[2] = "p3", .port_names[3] = "p4", .port_names[4] = "p5", .port_names[5] = "p6", .port_names[6] = "p7", .port_names[7] = "p8", .port_names[9] = "dsa", .port_names[10] = "cpu", .rtable = (s8 []){ -1, 9, }, }, { .mii_bus = &foo, .sw_addr = 2, .port_names[0] = "p9", .port_names[1] = "p10", .port_names[2] = "p11", .port_names[3] = "p12", .port_names[4] = "p13", .port_names[5] = "p14", .port_names[6] = "p15", .port_names[7] = "p16", .port_names[10] = "dsa", .rtable = (s8 []){ 10, -1, }, }, }, static struct dsa_platform_data pd = { .netdev = &foo, .nr_switches = 2, .sw = sw, }; Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Gary Thomas <gary@mlbassoc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:54 -07:00
Lennert Buytenhek	076d3e10a5	dsa: add support for the Marvell 88E6095/6095F switch chips Add support for the Marvell 88E6095/6095F switch chips. These chips are similar to the 88e6131, so we can add the support to mv88e6131.c easily. Thanks to Gary Thomas <gary@mlbassoc.com> and Jesper Dangaard Brouer <hawk@diku.dk> for testing various patches. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Gary Thomas <gary@mlbassoc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:54 -07:00
Lennert Buytenhek	c084080151	dsa: set ->iflink on slave interfaces to the ifindex of the parent ..so that we can parse the DSA topology from 'ip link' output: 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000 4: lan1@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 5: lan2@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue 6: lan3@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue 7: lan4@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:53 -07:00
Stephen Hemminger	fa665ccf01	ipx: use constant for strings and desciptor Fix compiler warning about non-const format string. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:51 -07:00
Stephen Hemminger	7ca98fa234	snap: use const for descriptor Protocols should be able to use constant value for the descriptor. Minor whitespace cleanup as well Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 19:06:50 -07:00
Eric Dumazet	ed734a97c6	net: remove useless prefetch() call There is no gain using prefetch() in dev_hard_start_xmit(), since we already had to read ops->ndo_select_queue pointer in dev_pick_tx(), and both pointers are probably located in the same cache line. This prefetch call slows down fast path because of a stall in address computation. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:42:55 -07:00
Vlad Yasevich	8d2f9e8116	sctp: Clean up TEST_FRAME hacks. Remove 2 TEST_FRAME hacks that are no longer needed. These allowed sctp regression tests to compile before, but are no longer needed. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:41:09 -07:00
Stephen Hemminger	9247744e5e	skb: expose and constify hash primitives Some minor changes to queue hashing: 1. Use const on accessor functions 2. Export skb_tx_hash for use in drivers (see ixgbe) Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:39:26 -07:00
Stephen Hemminger	1f1900f935	atm: lec use dev_change_mtu Rather than calling device pointer directly (which is incorrect with net_device_ops), use the standard dev_change_mtu. Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:37:28 -07:00
Ilpo Järvinen	a0bffffc14	net/*: use linux/kernel.h swap() tcp_sack_swap seems unnecessary so I pushed swap to the caller. Also removed comment that seemed then pointless, and added include when not already there. Compile tested. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:36:17 -07:00
Bernard Pidoux	a3ac80a130	netrom: zero length frame filtering in NetRom A zero length frame filter was recently introduced in ROSE protocole. Previous commit makes the same at AX25 protocole level. This patch has the same purpose for NetRom protocole. The reason is that empty frames have no meaning in NetRom protocole. Signed-off-by: Bernard Pidoux <f6bvp@amsat.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:34:20 -07:00
Bernard Pidoux	f99bcff7a2	ax25: zero length frame filtering in AX25 In previous commit `244f46ae6e` was introduced a zero length frame filter for ROSE protocole. This patch has the same purpose at AX25 frame level for the same reason. Empty frames have no meaning in AX25 protocole. Signed-off-by: Bernard Pidoux <f6bvp@amsat.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:33:55 -07:00
Bernard Pidoux	60784427ab	ax25: SOCK_DEBUG message simplification This patch condenses two debug messages in one. Signed-off-by: Bernard Pidoux <f6bvp@amsat.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-21 13:33:18 -07:00
Jouni Malinen	f3f9258678	nl80211: Check that function pointer != NULL before using it NL80211_CMD_GET_MESH_PARAMS and NL80211_CMD_SET_MESH_PARAMS handlers did not verify whether a function pointer is NULL (not supported by the driver) before trying to call the function. The former nl80211 command is available for unprivileged users, too, so this can potentially allow normal users to kill networking (or worse..) if mac80211 is built without CONFIG_MAC80211_MESH=y. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-20 16:01:57 -04:00
David S. Miller	2b1c4354de	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/virtio_net.c	2009-03-20 02:27:41 -07:00
Tom Talpey	2e3c230bc7	SVCRDMA: fix recent printk format warnings. printk formats in prior commit were reversed/incorrect. Compiled without warning on x86 and x86_64, but detected on ppc. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:37 -04:00
Trond Myklebust	55420c24a0	SUNRPC: Ensure we close the socket on EPIPE errors too... As long as one task is holding the socket lock, then calls to xprt_force_disconnect(xprt) will not succeed in shutting down the socket. In particular, this would mean that a server initiated shutdown will not succeed until the lock is relinquished. In order to avoid the deadlock, we should ensure that xs_tcp_send_request() closes the socket on EPIPE errors too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:36 -04:00
Trond Myklebust	b61d59fffd	SUNRPC: xs_tcp_connect_worker{4,6}: merge common code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:35 -04:00
Trond Myklebust	25fe6142a5	SUNRPC: Add a sysctl to control the duration of the socket linger timeout Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:34 -04:00
Trond Myklebust	7d1e8255cf	SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets This fixes a regression against FreeBSD servers as reported by Tomas Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers don't ever react to the client closing the socket, and so commit `e06799f958` (SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket) causes the setup to hang forever whenever the client attempts to close and then reconnect. We break the deadlock by adding a 'linger2' style timeout to the socket, after which, the client will abort the connection using a TCP 'RST'. The default timeout is set to 15 seconds. A subsequent patch will put it under user control by means of a systctl. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:34 -04:00
Jorge Boncompte [DTI2]	2bad35b7c9	netns: oops in ip[6]_frag_reasm incrementing stats dev can be NULL in ip[6]_frag_reasm for skb's coming from RAW sockets. Quagga's OSPFD sends fragmented packets on a RAW socket, when netfilter conntrack reassembles them on the OUTPUT path you hit this code path. You can test it with something like "hping2 -0 -d 2000 -f AA.BB.CC.DD" With help from Jarek Poplawski. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 23:26:11 -07:00
Roel Kluin	e4a389a9b5	net: kfree(napi->skb) => kfree_skb struct sk_buff pointers should be freed with kfree_skb. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 23:12:13 -07:00
Al Viro	cb0dc77de0	net: fix sctp breakage broken by commit 5e739d1752aca4e8f3e794d431503bfca3162df4; AFAICS should be -stable fodder as well... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Aced-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 19:12:42 -07:00
Stephen Hemminger	4b704d59d6	tipc: fix non-const printf format arguments Fix warnings from current gcc about using non-const strings as printf args in TIPC. Compile tested only (not a TIPC user). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 19:11:29 -07:00
Bjørn Mork	1b1d8f73a4	ipv6: fix display of local and remote sit endpoints This fixes the regressions cause by commit `1326c3d5a4` (v2.6.28-rc6-461-g23a12b1) broke the display of local and remote addresses of an SIT tunnel in iproute2. nt->parms is used by ipip6_tunnel_init() and therefore need to be initialized first. Tracked as http://bugzilla.kernel.org/show_bug.cgi?id=12868 Reported-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 18:56:54 -07:00
Rami Rosen	beedad923a	tcp: remove parameter from tcp_recv_urg(). This patch removes an unused parameter (addr_len) from tcp_recv_urg() method in net/ipv4/tcp.c. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 18:50:09 -07:00
Brian Haley	9bdd8d40c8	ipv6: Fix incorrect disable_ipv6 behavior Fix the behavior of allowing both sysctl and addrconf_dad_failure() to set the disable_ipv6 parameter without any bad side-effects. If DAD fails and accept_dad > 1, we will still set disable_ipv6=1, but then instead of allowing an RA to add an address then immediately fail DAD, we simply don't allow the address to be added in the first place. This also lets the user set this flag and disable all IPv6 addresses on the interface, or on the entire system. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-18 18:22:48 -07:00
Olga Kornievskaia	47a14ef1af	svcrpc: take advantage of tcp autotuning Allow the NFSv4 server to make use of TCP autotuning behaviour, which was previously disabled by setting the sk_userlocks variable. Set the receive buffers to be big enough to receive the whole RPC request, and set this for the listening socket, not the accept socket. Remove the code that readjusts the receive/send buffer sizes for the accepted socket. Previously this code was used to influence the TCP window management behaviour, which is no longer needed when autotuning is enabled. This can improve IO bandwidth on networks with high bandwidth-delay products, where a large tcp window is required. It also simplifies performance tuning, since getting adequate tcp buffers previously required increasing the number of nfsd threads. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:46:59 -04:00
Greg Banks	03cf6c9f49	knfsd: add file to export stats about nfsd pools Add /proc/fs/nfsd/pool_stats to export to userspace various statistics about the operation of rpc server thread pools. This patch is based on a forward-ported version of knfsd-add-pool-thread-stats which has been shipping in the SGI "Enhanced NFS" product since 2006 and which was previously posted: http://article.gmane.org/gmane.linux.nfs/10375 It has also been updated thus: * moved EXPORT_SYMBOL() to near the function it exports * made the new struct struct seq_operations const * used SEQ_START_TOKEN instead of ((void )1) merged fix from SGI PV 990526 "sunrpc: use dprintk instead of printk in svc_pool_stats_()" by Harshula Jayasuriya. merged fix from SGI PV 964001 "Crash reading pool_stats before nfsds are started". Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:42 -04:00
Greg Banks	59a252ff8c	knfsd: avoid overloading the CPU scheduler with enormous load averages Avoid overloading the CPU scheduler with enormous load averages when handling high call-rate NFS loads. When the knfsd bottom half is made aware of an incoming call by the socket layer, it tries to choose an nfsd thread and wake it up. As long as there are idle threads, one will be woken up. If there are lot of nfsd threads (a sensible configuration when the server is disk-bound or is running an HSM), there will be many more nfsd threads than CPUs to run them. Under a high call-rate low service-time workload, the result is that almost every nfsd is runnable, but only a handful are actually able to run. This situation causes two significant problems: 1. The CPU scheduler takes over 10% of each CPU, which is robbing the nfsd threads of valuable CPU time. 2. At a high enough load, the nfsd threads starve userspace threads of CPU time, to the point where daemons like portmap and rpc.mountd do not schedule for tens of seconds at a time. Clients attempting to mount an NFS filesystem timeout at the very first step (opening a TCP connection to portmap) because portmap cannot wake up from select() and call accept() in time. Disclaimer: these effects were observed on a SLES9 kernel, modern kernels' schedulers may behave more gracefully. The solution is simple: keep in each svc_pool a counter of the number of threads which have been woken but have not yet run, and do not wake any more if that count reaches an arbitrary small threshold. Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16 synthetic client threads simulating an rsync (i.e. recursive directory listing) workload reading from an i386 RH9 install image (161480 regular files in 10841 directories) on the server. That tree is small enough to fill in the server's RAM so no disk traffic was involved. This setup gives a sustained call rate in excess of 60000 calls/sec before being CPU-bound on the server. The server was running 128 nfsds. Profiling showed schedule() taking 6.7% of every CPU, and __wake_up() taking 5.2%. This patch drops those contributions to 3.0% and 2.2%. Load average was over 120 before the patch, and 20.9 after. This patch is a forward-ported version of knfsd-avoid-nfsd-overload which has been shipping in the SGI "Enhanced NFS" product since 2006. It has been posted before: http://article.gmane.org/gmane.linux.nfs/10374 Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:41 -04:00
Patrick McHardy	0f5b3e85a3	netfilter: ctnetlink: fix rcu context imbalance Introduced by `7ec47496` (netfilter: ctnetlink: cleanup master conntrack assignation): net/netfilter/nf_conntrack_netlink.c:1275:2: warning: context imbalance in 'ctnetlink_create_conntrack' - different lock contexts for basic block Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-18 17:36:40 +01:00
Florian Westphal	711d60a9e7	netfilter: remove nf_ct_l4proto_find_get/nf_ct_l4proto_put users have been moved to __nf_ct_l4proto_find. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-18 17:30:50 +01:00
Florian Westphal	cd91566e4b	netfilter: ctnetlink: remove remaining module refcounting Convert the remaining refcount users. As pointed out by Patrick McHardy, the protocols can be accessed safely using RCU. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-18 17:28:37 +01:00
David S. Miller	af4330631c	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2009-03-17 15:04:31 -07:00
David S. Miller	2d6a5e9500	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/igb/igb_main.c drivers/net/qlge/qlge_main.c drivers/net/wireless/ath9k/ath9k.h drivers/net/wireless/ath9k/core.h drivers/net/wireless/ath9k/hw.c	2009-03-17 15:01:30 -07:00
David S. Miller	f10023a4ef	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2009-03-17 14:29:22 -07:00
David S. Miller	4ada8107f4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-03-17 13:12:47 -07:00
Herbert Xu	303c6a0251	gro: Fix legacy path napi_complete crash On the legacy netif_rx path, I incorrectly tried to optimise the napi_complete call by using __napi_complete before we reenable IRQs. This simply doesn't work since we need to flush the held GRO packets first. This patch fixes it by doing the obvious thing of reenabling IRQs first and then calling napi_complete. Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-17 13:11:29 -07:00
Herbert Xu	2ffb455819	gro: Fix vlan/netpoll check again Jarek Poplawski pointed out that my previous fix is broken for VLAN+netpoll as if netpoll is enabled we'd end up in the normal receive path instead of the VLAN receive path. This patch fixes it by calling the VLAN receive hook. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-17 13:10:52 -07:00
Luis R. Rodriguez	73d54c9e74	cfg80211: add regulatory netlink multicast group This allows us to send to userspace "regulatory" events. For now we just send an event when we change regulatory domains. We also notify userspace when devices are using their own custom world roaming regulatory domains. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:40 -04:00
Luis R. Rodriguez	7db90f4a25	cfg80211: move enum reg_set_by to nl80211.h We do this so we can later inform userspace who set the regulatory domain and provide details of the request. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:40 -04:00
Luis R. Rodriguez	0fee54cab7	cfg80211: remove REGDOM_SET_BY_INIT This is not used as we can always just assume the first regulatory domain set will _always_ be a static regulatory domain. REGDOM_SET_BY_CORE will be the first request from cfg80211 for a regdomain and that then populates the first regulatory request. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:39 -04:00
Herton Ronaldo Krzesinski	1a28c78b46	mac80211: deauth before flushing STA information Even after commit "mac80211: deauth when interface is marked down" (`e327b847` on Linus tree), userspace still isn't notified when interface goes down. There isn't a problem with this commit, but because of other code changes it doesn't work on kernels >= 2.6.28 (works if same/similar change applied on 2.6.27 for example). The issue is as follows: after commit "mac80211: restructure disassoc/deauth flows" in 2.6.28, the call to ieee80211_sta_deauthenticate added by commit `e327b847` will not work: because we do sta_info_flush(local, sdata) inside ieee80211_stop (iface.c), all stations in interface are cleared, so when calling ieee80211_sta_deauthenticate->ieee80211_set_disassoc (mlme.c), inside ieee80211_set_disassoc we have this in the beginning: sta = sta_info_get(local, ifsta->bssid); if (!sta) { The !sta check triggers, thus the function returns early and ieee80211_sta_send_apinfo(sdata, ifsta) later isn't called, so wpa_supplicant/userspace isn't notified with SIOCGIWAP. This commit moves deauthentication to before flushing STA info (sta_info_flush), thus the above can't happen and userspace is really notified when interface goes down. Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:39 -04:00
Helmut Schaa	af88b9078d	mac80211: handle failed scan requests in STA mode If cfg80211 requests a scan it awaits either a return code != 0 from the scan function or the cfg80211_scan_done to be called. In case of a STA mac80211's scan function ever returns 0 and queues the scan request. If ieee80211_sta_work is executed and ieee80211_start_scan fails for some reason cfg80211_scan_done will never be called but cfg80211 still thinks the scan was triggered successfully and will refuse any future scan requests due to drv->scan_req not being cleaned up. If a scan is triggered from within the MLME a similar problem appears. If ieee80211_start_scan returns an error, local->scan_req will not be reset and mac80211 will refuse any future scan requests. Hence, in both cases call ieee80211_scan_failed (which notifies cfg80211 and resets local->scan_req) if ieee80211_start_scan returns an error. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:38 -04:00
Luis R. Rodriguez	ec329acef9	cfg80211: fix max tx power for world regdom on 5 GHz to 20dBm This is the lowest value amongst countries which do enable 5 GHz operation. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:29 -04:00
Luis R. Rodriguez	611b6a82aa	cfg80211: Enable passive scan on channels 12-14 for world roaming Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:29 -04:00
Jouni Malinen	0eeb59fe2c	mac80211: Fix WMM ACM parsing and AC downgrade operation Incorrect local->wmm_acm bits were set for AC_BK and AC_BE. Fix this and add some comments to make it easier to understand the AC-to-UP(pair) mapping. Set the wmm_acm bits (and show WMM debug) even if the driver does not implement conf_tx() handler. In addition, fix the ACM-based AC downgrade code to not use the highest priority in error cases. We need to break the loop to get the correct AC_BK value (3) instead of returning 0 (which would indicate AC_VO). The comment here was not really very useful either, so let's provide somewhat more helpful description of the situation. Since it is very unlikely that the ACM flag would be set for AC_BK and AC_BE, these bugs are not likely to be seen in real life networks. Anyway, better do these things correctly should someone really use silly AP configuration (and to pass some functionality tests, too). Remove the TODO comment about handling ACM. Downgrading AC is perfectly valid mechanism for ACM. Eventually, we may add support for WMM-AC and send a request for a TS, but anyway, that functionality won't be here at the location of this TODO comment. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:09:27 -04:00
Jouni Malinen	055249d20d	mac80211: Fix panic on fragmentation with power saving It was possible to hit a kernel panic on NULL pointer dereference in dev_queue_xmit() when sending power save buffered frames to a STA that woke up from sleep. This happened when the buffered frame was requeued for transmission in ap_sta_ps_end(). In order to avoid the panic, copy the skb->dev and skb->iif values from the first fragment to all other fragments. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:01:59 -04:00
John W. Linville	6f16bf3bdb	lib80211: silence excessive crypto debugging messages When they were part of the now defunct ieee80211 component, these messages were only visible when special debugging settings were enabled. Let's mirror that with a new lib80211 debugging Kconfig option. Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-16 18:01:58 -04:00
Herbert Xu	d1c76af9e2	GRO: Move netpoll checks to correct location As my netpoll fix for net doesn't really work for net-next, we need this update to move the checks into the right place. As it stands we may pass freed skbs to netpoll_receive_skb. This patch also introduces a netpoll_rx_on function to avoid GRO completely if we're invoked through netpoll. This might seem paranoid but as netpoll may have an external receive hook it's better to be safe than sorry. I don't think we need this for 2.6.29 though since there's nothing immediately broken by it. This patch also moves the GRO_* return values to netdevice.h since VLAN needs them too (I tried to avoid this originally but alas this seems to be the easiest way out). This fixes a bug in VLAN where it continued to use the old return value 2 instead of the correct GRO_DROP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-16 10:50:02 -07:00
Pablo Neira Ayuso	0269ea4937	netfilter: xtables: add cluster match This patch adds the iptables cluster match. This match can be used to deploy gateway and back-end load-sharing clusters. The cluster can be composed of 32 nodes maximum (although I have only tested this with two nodes, so I cannot tell what is the real scalability limit of this solution in terms of cluster nodes). Assuming that all the nodes see all packets (see below for an example on how to do that if your switch does not allow this), the cluster match decides if this node has to handle a packet given: (jhash(source IP) % total_nodes) & node_mask For related connections, the master conntrack is used. The following is an example of its use to deploy a gateway cluster composed of two nodes (where this is the node 1): iptables -I PREROUTING -t mangle -i eth1 -m cluster \ --cluster-total-nodes 2 --cluster-local-node 1 \ --cluster-proc-name eth1 -j MARK --set-mark 0xffff iptables -A PREROUTING -t mangle -i eth1 \ -m mark ! --mark 0xffff -j DROP iptables -A PREROUTING -t mangle -i eth2 -m cluster \ --cluster-total-nodes 2 --cluster-local-node 1 \ --cluster-proc-name eth2 -j MARK --set-mark 0xffff iptables -A PREROUTING -t mangle -i eth2 \ -m mark ! --mark 0xffff -j DROP And the following commands to make all nodes see the same packets: ip maddr add 01:00:5e:00:01:01 dev eth1 ip maddr add 01:00:5e:00:01:02 dev eth2 arptables -I OUTPUT -o eth1 --h-length 6 \ -j mangle --mangle-mac-s 01:00:5e:00:01:01 arptables -I INPUT -i eth1 --h-length 6 \ --destination-mac 01:00:5e:00:01:01 \ -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27 arptables -I OUTPUT -o eth2 --h-length 6 \ -j mangle --mangle-mac-s 01:00:5e:00:01:02 arptables -I INPUT -i eth2 --h-length 6 \ --destination-mac 01:00:5e:00:01:02 \ -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27 In the case of TCP connections, pickup facility has to be disabled to avoid marking TCP ACK packets coming in the reply direction as valid. echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose BTW, some final notes: * This match mangles the skbuff pkt_type in case that it detects PACKET_MULTICAST for a non-multicast address. This may be done in a PKTTYPE target for this sole purpose. * This match supersedes the CLUSTERIP target. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 17:10:36 +01:00
Cyrill Gorcunov	1546000fe8	net: netfilter conntrack - add per-net functionality for DCCP protocol Module specific data moved into per-net site and being allocated/freed during net namespace creation/deletion. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 16:30:49 +01:00
Cyrill Gorcunov	81a1d3c31e	net: sysctl_net - use net_eq to compare nets Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 16:23:30 +01:00
Linus Torvalds	8e91f178a2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) r8169: revert "r8169: read MAC address from EEPROM on init (2nd attempt)" r8169: use hardware auto-padding. igb: remove ASPM L0s workaround netxen: remove old flash check. mv643xx_eth: fix unicast address filter corruption on mtu change xfrm: Fix xfrm_state_find() wrt. wildcard source address. emac: Fix clock control for 405EX and 405EXr chips ixgbe: fix multiple unicast address support via-velocity: Fix DMA mapping length errors on transmit. qlge: bugfix: Pad outbound frames smaller than 60 bytes. qlge: bugfix: Move netif_napi_del() to common call point. qlge: bugfix: Tell hw to strip vlan header. qlge: bugfix: Increase filter on inbound csum. dnet: replace obsolete netif_rx_ functions with napi_ net: Add be2net driver. dnet: Fix warnings on 64-bit. dnet: Dave DNET ethernet controller driver (updated) ipv6: Fix BUG when disabled ipv6 module is unloaded bnx2x: Using DMAE to initialize the chip bnx2x: Casting page alignment ...	2009-03-16 07:56:58 -07:00
Christoph Paasch	d1238d5337	netfilter: conntrack: check for NEXTHDR_NONE before header sanity checking NEXTHDR_NONE doesn't has an IPv6 option header, so the first check for the length will always fail and results in a confusing message "too short" if debugging enabled. With this patch, we check for NEXTHDR_NONE before length sanity checkings are done. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:52:11 +01:00
Christoph Paasch	ec8d540969	netfilter: conntrack: fix dropping packet after l4proto->packet() We currently use the negative value in the conntrack code to encode the packet verdict in the error. As NF_DROP is equal to 0, inverting NF_DROP makes no sense and, as a result, no packets are ever dropped. Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:51:29 +01:00
Pablo Neira Ayuso	626ba8fbac	netfilter: ctnetlink: fix crash during expectation creation This patch fixes a possible crash due to the missing initialization of the expectation class when nf_ct_expect_related() is called. Reported-by: BORBELY Zoltan <bozo@andrews.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:50:51 +01:00
Jan Engelhardt	acc738fec0	netfilter: xtables: avoid pointer to self Commit `784544739a` (netfilter: iptables: lock free counters) broke a number of modules whose rule data referenced itself. A reallocation would not reestablish the correct references, so it is best to use a separate struct that does not fall under RCU. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:35:29 +01:00
Jonathan Corbet	76398425bb	Move FASYNC bit handling to f_op->fasync() Removing the BKL from FASYNC handling ran into the challenge of keeping the setting of the FASYNC bit in filp->f_flags atomic with regard to calls to the underlying fasync() function. Andi Kleen suggested moving the handling of that bit into fasync(); this patch does exactly that. As a result, we have a couple of internal API changes: fasync() must now manage the FASYNC bit, and it will be called without the BKL held. As it happens, every fasync() implementation in the kernel with one exception calls fasync_helper(). So, if we make fasync_helper() set the FASYNC bit, we can avoid making any changes to the other fasync() functions - as long as those functions, themselves, have proper locking. Most fasync() implementations do nothing but call fasync_helper() - which has its own lock - so they are easily verified as correct. The BKL had already been pushed down into the rest. The networking code has its own version of fasync_helper(), so that code has been augmented with explicit FASYNC bit handling. Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Miller <davem@davemloft.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2009-03-16 08:32:27 -06:00
Scott James Remnant	95ba434f89	netfilter: auto-load ip_queue module when socket opened The ip_queue module is missing the net-pf-16-proto-3 alias that would causae it to be auto-loaded when a socket of that type is opened. This patch adds the alias. Signed-off-by: Scott James Remnant <scott@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:31:10 +01:00
Scott James Remnant	26c3b67806	netfilter: auto-load ip6_queue module when socket opened The ip6_queue module is missing the net-pf-16-proto-13 alias that would cause it to be auto-loaded when a socket of that type is opened. This patch adds the alias. Signed-off-by: Scott James Remnant <scott@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:30:14 +01:00
Pablo Neira Ayuso	f0a3c0869f	netfilter: ctnetlink: move event reporting for new entries outside the lock This patch moves the event reporting outside the lock section. With this patch, the creation and update of entries is homogeneous from the event reporting perspective. Moreover, as the event reporting is done outside the lock section, the netlink broadcast delivery can benefit of the yield() call under congestion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:28:09 +01:00
Pablo Neira Ayuso	e098360f15	netfilter: ctnetlink: cleanup conntrack update preliminary checkings This patch moves the preliminary checkings that must be fulfilled to update a conntrack, which are the following: * NAT manglings cannot be updated * Changing the master conntrack is not allowed. This patch is a cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:27:22 +01:00
Pablo Neira Ayuso	7ec4749675	netfilter: ctnetlink: cleanup master conntrack assignation This patch moves the assignation of the master conntrack to ctnetlink_create_conntrack(), which is where it really belongs. This patch is a cleanup. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:25:46 +01:00
Pablo Neira Ayuso	1db7a748df	netfilter: conntrack: increase drop stats if sequence adjustment fails This patch increases the statistics of packets drop if the sequence adjustment fails in ipv4_confirm(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:18:50 +01:00
Stephen Hemminger	67c0d57930	netfilter: Kconfig spelling fixes (trivial) Signed-off-by: Stephen Hemminger <sheminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:17:23 +01:00
Christoph Paasch	9d2493f88f	netfilter: remove IPvX specific parts from nf_conntrack_l4proto.h Moving the structure definitions to the corresponding IPvX specific header files. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 15:15:35 +01:00
Eric Leblond	c7a913cd55	netfilter: print the list of register loggers This patch modifies the proc output to add display of registered loggers. The content of /proc/net/netfilter/nf_log is modified. Instead of displaying a protocol per line with format: proto:logger it now displays: proto:logger (comma_separated_list_of_loggers) NONE is used as keyword if no logger is used. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 14:55:27 +01:00
Eric Leblond	ca735b3aaa	netfilter: use a linked list of loggers This patch modifies nf_log to use a linked list of loggers for each protocol. This list of loggers is read and write protected with a mutex. This patch separates registration and binding. To be used as logging module, a module has to register calling nf_log_register() and to bind to a protocol it has to call nf_log_bind_pf(). This patch also converts the logging modules to the new API. For nfnetlink_log, it simply switchs call to register functions to call to bind function and adds a call to nf_log_register() during init. For other modules, it just remove a const flag from the logger structure and replace it with a __read_mostly. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-03-16 14:54:21 +01:00
Ilpo Järvinen	afece1c658	tcp: make sure xmit goal size never becomes zero It's not too likely to happen, would basically require crafted packets (must hit the max guard in tcp_bound_to_half_wnd()). It seems that nothing that bad would happen as there's tcp_mems and congestion window that prevent runaway at some point from hurting all too much (I'm not that sure what all those zero sized segments we would generate do though in write queue). Preventing it regardless is certainly the best way to go. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:55 -07:00
Ilpo Järvinen	2a3a041c4e	tcp: cache result of earlier divides when mss-aligning things The results is very unlikely change every so often so we hardly need to divide again after doing that once for a connection. Yet, if divide still becomes necessary we detect that and do the right thing and again settle for non-divide state. Takes the u16 space which was previously taken by the plain xmit_size_goal. This should take care part of the tso vs non-tso difference we found earlier. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:55 -07:00
Ilpo Järvinen	0c54b85f28	tcp: simplify tcp_current_mss There's very little need for most of the callsites to get tp->xmit_goal_size updated. That will cost us divide as is, so slice the function in two. Also, the only users of the tp->xmit_goal_size are directly behind tcp_current_mss(), so there's no need to store that variable into tcp_sock at all! The drop of xmit_goal_size currently leaves 16-bit hole and some reorganization would again be necessary to change that (but I'm aiming to fill that hole with u16 xmit_goal_size_segs to cache the results of the remaining divide to get that tso on regression). Bring xmit_goal_size parts into tcp.c Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:54 -07:00
Ilpo Järvinen	72211e9050	tcp: don't check mtu probe completion in the loop It seems that no variables clash such that we couldn't do the check just once later on. Therefore move it. Also kill dead obvious comment, dead argument and add unlikely since this mtu probe does not happen too often. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:53 -07:00
Ilpo Järvinen	c887e6d2d9	tcp: consolidate paws check Wow, it was quite tricky to merge that stream of negations but I think I finally got it right: check & replace_ts_recent: (s32)(rcv_tsval - ts_recent) >= 0 => 0 (s32)(ts_recent - rcv_tsval) <= 0 => 0 discard: (s32)(ts_recent - rcv_tsval) > TCP_PAWS_WINDOW => 1 (s32)(ts_recent - rcv_tsval) <= TCP_PAWS_WINDOW => 0 I toggled the return values of tcp_paws_check around since the old encoding added yet-another negation making tracking of truth-values really complicated. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:52 -07:00
Ilpo Järvinen	c43d558a51	tcp: kill dead end_seq variable in clean_rtx_queue I've already forgotten what for this was necessary, anyway it's no longer used (if it ever was). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:51 -07:00
Ilpo Järvinen	5861f8e58d	tcp: remove pointless .dsack/.num_sacks code In the pure assignment case, the earlier zeroing is still in effect. David S. Miller raised concerns if the ifs are there to avoid dirtying cachelines. I came to these conclusions: > We'll be dirty it anyway (now that I check), the first "real" statement > in tcp_rcv_established is: > > tp->rx_opt.saw_tstamp = 0; > > ...that'll land on the same dword. :-/ > > I suppose the blocks are there just because they had more complexity > inside when they had to calculate the eff_sacks too (maybe it would > have been better to just remove them in that drop-patch so you would > have had less head-ache :-)). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:09:51 -07:00
Jarek Poplawski	7cd0a63872	pkt_sched: Change misleading code in class delete. While looking for a possible reason of bugzilla report on HTB oops: http://bugzilla.kernel.org/show_bug.cgi?id=12858 I found the code in htb_delete calling htb_destroy_class on zero refcount is very misleading: it can suggest this is a common path, and destroy is called under sch_tree_lock. Actually, this can never happen like this because before deletion cops->get() is done, and after delete a class is still used by tclass_notify. The class destroy is always called from cops->put(), so without sch_tree_lock. This doesn't mean much now (since 2.6.27) because all vulnerable calls were moved from htb_destroy_class to htb_delete, but there was a bug in older kernels. The same change is done for other classful scheds, which, it seems, didn't have similar locking problems here. Reported-by: m0sia <m0sia@m0sia.ru> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-15 20:00:19 -07:00
Roel Kluin	a2025b8b10	tcp: '< 0' test on unsigned promote 'cnt' to size_t, to match 'len'. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 16:05:14 -07:00
Roel Kluin	8db09f26f9	x25: '< 0' and '>= 0' test on unsigned skb->len is an unsigned int, so the test in x25_rx_call_request() always evaluates to true. len in x25_sendmsg() is unsigned as well. so -ERRORS returned by x25_output() are not noticed. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 16:04:12 -07:00
Denys Fedoryshchenko	73ce7b01b4	ipv4: arp announce, arp_proxy and windows ip conflict verification Windows (XP at least) hosts on boot, with configured static ip, performing address conflict detection, which is defined in RFC3927. Here is quote of important information: " An ARP announcement is identical to the ARP Probe described above, except that now the sender and target IP addresses are both set to the host's newly selected IPv4 address. " But it same time this goes wrong with RFC5227. " The 'sender IP address' field MUST be set to all zeroes; this is to avoid polluting ARP caches in other hosts on the same link in the case where the address turns out to be already in use by another host. " When ARP proxy configured, it must not answer to both cases, because it is address conflict verification in any case. For Windows it is just causing to detect false "ip conflict". Already there is code for RFC5227, so just trivially we just check also if source ip == target ip. Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 16:02:07 -07:00
David S. Miller	08ec9af1c0	xfrm: Fix xfrm_state_find() wrt. wildcard source address. The change to make xfrm_state objects hash on source address broke the case where such source addresses are wildcarded. Fix this by doing a two phase lookup, first with fully specified source address, next using saddr wildcarded. Reported-by: Nicolas Dichtel <nicolas.dichtel@dev.6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 14:22:40 -07:00
Yi Zou	1c8dbcf649	[SCSI] net: add NETIF_F_FCOE_CRC to can_checksum_protocol Add FC CRC offload check for ETH_P_FCOE. Signed-off-by: Yi Zou <yi.zou@intel.com> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-03-13 15:11:38 -05:00
Neil Horman	273ae44b9c	Network Drop Monitor: Adding Build changes to enable drop monitor Network Drop Monitor: Adding Build changes to enable drop monitor Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/Kbuild \| 1 + net/Kconfig \| 11 +++++++++++ net/core/Makefile \| 1 + 3 files changed, 13 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:29 -07:00
Neil Horman	9a8afc8d39	Network Drop Monitor: Adding drop monitor implementation & Netlink protocol Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/net_dropmon.h \| 56 +++++++++ net/core/drop_monitor.c \| 263 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 319 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:29 -07:00
Neil Horman	ead2ceb0ec	Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying end-of-line points for skbs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/skbuff.h \| 4 +++- net/core/datagram.c \| 2 +- net/core/skbuff.c \| 22 ++++++++++++++++++++++ net/ipv4/arp.c \| 2 +- net/ipv4/udp.c \| 2 +- net/packet/af_packet.c \| 2 +- 6 files changed, 29 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:28 -07:00
Neil Horman	4893d39e86	Network Drop Monitor: Add trace declaration for skb frees Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/trace/skb.h \| 8 ++++++++ net/core/Makefile \| 2 ++ net/core/net-traces.c \| 29 +++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 12:09:27 -07:00
malc	6fc791ee63	sctp: add Adaptation Layer Indication parameter only when it's set RFC5061 states: Each adaptation layer that is defined that wishes to use this parameter MUST specify an adaptation code point in an appropriate RFC defining its use and meaning. If the user has not set one - assume they don't want to sent the param with a zero Adaptation Code Point. Rationale - Currently the IANA defines zero as reserved - and 1 as the only valid value - so we consider zero to be unset - to save adding a boolean to the socket structure. Including this parameter unconditionally causes endpoints that do not understand it to report errors unnecessarily. Signed-off-by: Malcolm Lashley <mlashley@gmail.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:58 -07:00
Wei Yongjun	76595024ff	sctp: fix to send FORWARD-TSN chunk only if peer has such capable RFC3758 Section 3.3.1. Sending Forward-TSN-Supported param in INIT Note that if the endpoint chooses NOT to include the parameter, then at no time during the life of the association can it send or process a FORWARD TSN. If peer does not support PR-SCTP capable, don't send FORWARD-TSN chunk to peer. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:58 -07:00
Wei Yongjun	5ffad5aceb	sctp: fix to indicate ASCONF support in INIT-ACK only if peer has such capable This patch fix to indicate ASCONF support in INIT-ACK only if peer has such capable. This patch also fix to calc the chunk size if peer has no FWD-TSN capable. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:56 -07:00
Vlad Yasevich	5e8f3f703a	sctp: simplify sctp listening code sctp_inet_listen() call is split between UDP and TCP style. Looking at the code, the two functions are almost the same and can be merged into a single helper. This also fixes a bug that was fixed in the UDP function, but missed in the TCP function. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-13 11:37:56 -07:00
Rusty Russell	a70f730282	cpumask: replace node_to_cpumask with cpumask_of_node. Impact: cleanup node_to_cpumask (and the blecherous node_to_cpumask_ptr which contained a declaration) are replaced now everyone implements cpumask_of_node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-13 14:49:46 +10:30
Trond Myklebust	5e3771ce2d	SUNRPC: Ensure that xs_nospace return values are propagated If xs_nospace() finds that the socket has disconnected, it attempts to return ENOTCONN, however that value is then squashed by the callers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:01 -04:00
Trond Myklebust	8a2cec295f	SUNRPC: Delay, then retry on connection errors. Enforce the comment in xs_tcp_connect_worker4/xs_tcp_connect_worker6 that we should delay, then retry on certain connection errors. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:01 -04:00
Trond Myklebust	2a4919919a	SUNRPC: Return EAGAIN instead of ENOTCONN when waking up xprt->pending While we should definitely return socket errors to the task that is currently trying to send data, there is no need to propagate the same error to all the other tasks on xprt->pending. Doing so actually slows down recovery, since it causes more than one tasks to attempt socket recovery. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:00 -04:00
Trond Myklebust	482f32e65d	SUNRPC: Handle socket errors correctly Ensure that we pick up and handle socket errors as they occur. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:00 -04:00
Trond Myklebust	c8485e4d63	SUNRPC: Handle ECONNREFUSED correctly in xprt_transmit() If we get an ECONNREFUSED error, we currently go to sleep on the 'xprt->sending' wait queue. The problem is that no timeout is set there, and there is nothing else that will wake the task up later. We should deal with ECONNREFUSED in call_status, given that is where we also deal with -EHOSTDOWN, and friends. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:59 -04:00
Trond Myklebust	40d2549db5	SUNRPC: Don't disconnect if a connection is still in progress. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:58 -04:00
Trond Myklebust	670f945731	SUNRPC: Ensure we set XPRT_CLOSING only after we've sent a tcp FIN... ...so that we can distinguish between when we need to shutdown and when we don't. Also remove the call to xs_tcp_shutdown() from xs_tcp_connect(), since xprt_connect() makes the same test. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:58 -04:00
Trond Myklebust	15f081ca8d	SUNRPC: Avoid an unnecessary task reschedule on ENOTCONN If the socket is unconnected, and xprt_transmit() returns ENOTCONN, we currently give up the lock on the transport channel. Doing so means that the lock automatically gets assigned to the next task in the xprt->sending queue, and so that task needs to be woken up to do the actual connect. The following patch aims to avoid that unnecessary task switch. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:57 -04:00
Tom Talpey	441e3e2429	SUNRPC: dynamically load RPC transport modules on-demand Provide an api to attempt to load any necessary kernel RPC client transport module automatically. By convention, the desired module name is "xprt"+"transport name". For example, when NFS mounting with "-o proto=rdma", attempt to load the "xprtrdma" module. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:56 -04:00
Tom Talpey	b38ab40ad5	XPRTRDMA: correct an rpc/rdma inline send marshaling error Certain client rpc's which contain both lengthy page-contained metadata and a non-empty xdr_tail buffer require careful handling to avoid overlapped memory copying. Rearranging of existing rpcrdma marshaling code avoids it; this fixes an NFSv4 symlink creation error detected with connectathon basic/test8 to multiple servers. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:55 -04:00
Tom Talpey	b1e1e15877	SVCRDMA: remove faulty assertions in rpc/rdma chunk validation. Certain client-provided RPCRDMA chunk alignments result in an additional scatter/gather entry, which triggered nfs/rdma server assertions incorrectly. OpenSolaris nfs/rdma client connectathon testing was blocked by these in the special/locking section. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:55 -04:00
Chuck Lever	fe315e76fc	SUNRPC: Avoid spurious wake-up during UDP connect processing To clear out old state, the UDP connect workers unconditionally invoke xs_close() before proceeding with a new connect. Nowadays this causes a spurious wake-up of the task waiting for the connect to complete. This is a little racey, but usually harmless. The waiting task immediately retries the connect via a call_bind/call_connect sequence, which usually finds the transport already in the connected state because the connect worker has finished in the background. To avoid a spurious wake-up, factor the xs_close() logic that resets the underlying socket into a helper, and have the UDP connect workers call that helper instead of xs_close(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:10:21 -04:00
Trond Myklebust	01d37c428a	SUNRPC: xprt_connect() don't abort the task if the transport isn't bound If the transport isn't bound, then we should just return ENOTCONN, letting call_connect_status() and/or call_status() deal with retrying. Currently, we appear to abort all pending tasks with an EIO error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:09:39 -04:00
Trond Myklebust	fba91afbec	SUNRPC: Fix an Oops due to socket not set up yet... We can Oops in both xs_udp_send_request() and xs_tcp_send_request() if the call to xs_sendpages() returns an error due to the socket not yet being set up. Deal with that situation by returning a new error: ENOTSOCK, so that we know to avoid dereferencing transport->sock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:06:41 -04:00
Eric Dumazet	fc1ad92dfc	tcp: allow timestamps even if SYN packet has tsval=0 Some systems send SYN packets with apparently wrong RFC1323 timestamp option values [timestamp tsval=0 tsecr=0]. It might be for security reasons (http://www.secuobs.com/plugs/25220.shtml ) Linux TCP stack ignores this option and sends back a SYN+ACK packet without timestamp option, thus many TCP flows cannot use timestamps and lose some benefit of RFC1323. Other operating systems seem to not care about initial tsval value, and let tcp flows to negotiate timestamp option. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-11 09:23:57 -07:00
John Dykstra	ff8cf9a938	ipv6: Fix BUG when disabled ipv6 module is unloaded Do not try to "uninitialize" ipv6 if its initialization had been skipped because module parameter disable=1 had been specified. Reported-by: Thomas Backlund <tmb@mandriva.org> Signed-off-by: John Dykstra <john.dykstra1@gmail.com> Acked-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-11 09:22:51 -07:00
Ingo Molnar	78b020d035	Merge branches 'x86/cleanups', 'x86/kexec', 'x86/mce2' and 'linus' into x86/core	2009-03-11 10:49:15 +01:00
Trond Myklebust	eb9b55ab4d	SUNRPC: Tighten up the task locking rules in __rpc_execute() We should probably not be testing any flags after we've cleared the RPC_TASK_RUNNING flag, since rpc_make_runnable() is then free to assign the rpc_task to another workqueue, which may then destroy it. We can fix any races with rpc_make_runnable() by ensuring that we only clear the RPC_TASK_RUNNING flag while holding the rpc_wait_queue->lock that the task is supposed to be sleeping on (and then checking whether or not the task really is sleeping). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-10 20:33:16 -04:00
Stephen Hemminger	a2205472c3	net: fix warning about non-const string Since dev_set_name takes a printf style string, new gcc complains if arg is not const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-10 05:22:43 -07:00
Stephen Hemminger	7546dd97d2	net: convert usage of packet_type to read_mostly Protocols that use packet_type can be __read_mostly section for better locality. Elminate any unnecessary initializations of NULL. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-10 05:22:43 -07:00
David S. Miller	d5df2a1613	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x_main.c drivers/net/wireless/iwlwifi/iwl3945-base.c drivers/net/wireless/rt2x00/rt73usb.c	2009-03-10 05:04:16 -07:00
Roel Kluin	bd05f28e1a	cfg80211: test before subtraction on unsigned freq_diff is unsigned, so test before subtraction Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-06 15:54:32 -05:00
Sujith	707c1b4e68	mac80211: Update IBSS beacon timestamp properly In IBSS mode, the beacon timestamp has to be filled with the BSS's timestamp when joining, and set to zero when creating a new BSS. Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-05 14:39:40 -05:00
Vivek Natarajan	25c9c87528	mac80211: Always send a null data frame if TIM bit is set. If the AP thinks we are in power save state eventhough we are not truly in that state, it sets the TIM bit and does not send a data frame unless we send a null data frame to correct the state in the AP. This might happen if the null data frame for wake up is lost in the air after we disable power save. Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-05 14:39:38 -05:00
Sujith	e65c22633c	mac80211: Fix TKIP/WEP HT capability handling There is no need to parse the AP's HT capabilities if the STA uses TKIP/WEP cipher. This allows the rate control module to choose the correct(legacy) rate table. Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-05 14:39:37 -05:00
Johannes Berg	24776cfd55	mac80211: Fix quality reporting for wireless stats Since "mac80211/cfg80211: move iwrange handler to cfg80211", the results for link quality from "iwlist scan" and "iwconfig" commands have been very different. The results are now consistent. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Reported- and tested-by: Larry Finger <larry.finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-05 14:39:35 -05:00
Sujith	e31ae05083	mac80211: Notify the driver only when the beacon interval changes Currently, the driver is unconditionally notified of beacon interval. This is a problem in AP mode, because the driver has to know that the beacon interval has actualy changed to recalculate TBTT and reset the HW TSF. Fix this to make mac80211 notify the driver only when the beacon interval has been reconfigured to a new value. Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-03-05 14:39:32 -05:00
David S. Miller	508827ff0a	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/tokenring/tmspci.c drivers/net/ucc_geth_mii.c	2009-03-05 02:06:47 -08:00
David S. Miller	9d40bbda59	vlan: Fix vlan-in-vlan crashes. As analyzed by Patrick McHardy, vlan needs to reset it's netdev_ops pointer in it's ->init() function but this leaves the compat method pointers stale. Add a netdev_resync_ops() and call it from the vlan code. Any other driver which changes ->netdev_ops after register_netdevice() will need to call this new function after doing so too. With help from Patrick McHardy. Tested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-04 23:46:25 -08:00
David S. Miller	54acd0efab	net: Fix missing dev->neigh_setup in register_netdevice(). Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-04 23:01:02 -08:00
Jarek Poplawski	a883bf564e	pkt_sched: act_police: Fix a rate estimator test. A commit `c1b56878fb` "tc: policing requires a rate estimator" introduced a test which invalidates previously working configs, based on examples from iproute2: doc/actions/actions-general. This is too rigorous: a rate estimator is needed only when police's "avrate" option is used. Reported-by: Joao Correia <joaomiguelcorreia@gmail.com> Diagnosed-by: John Dykstra <john.dykstra1@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-04 17:38:10 -08:00
Brian Haley	fb13d9f9e4	SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails Change sctp_ctl_sock_init() to try IPv4 if IPv6 socket registration fails. Required if the IPv6 module is loaded with "disable=1", else SCTP will fail to load. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-04 03:20:26 -08:00
Brian Haley	fe7ca2e1e8	IPv6: add "disable" module parameter support to ipv6.ko Add "disable" module parameter support to ipv6.ko by specifying "disable=1" on module load. We just do the minimum of initializing inetsw6[] so calls from other modules to inet6_register_protosw() won't OOPs, then bail out. No IPv6 addresses or sockets can be created as a result, and a reboot is required to enable IPv6. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-04 03:19:08 -08:00
Eric Biederman	0c5c2d3089	neigh: Allow for user space users of the neighbour table Currently it is possible to do just about everything with the arp table from user space except treat an entry like you are using it. To that end implement and a flag NTF_USE that when set in a netwlink update request treats the neighbour table entry like the kernel does on the output path. This allows user space applications to share the kernel's arp cache. Signed-off-by: Eric Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-04 00:03:08 -08:00
Meelis Roos	4222474519	net: fix tokenring license Currently, modular tokenring ("tr") lacks a license and fails to load: tr: module license 'unspecified' taints kernel. tr: Unknown symbol proc_net_fops_create Beacuse of this, no tokenring driver can load if it depends on modular tr. Fix this by adding GPL module license as it is in the kernel. With this fix, tr module loads fine and tms380 driver also loads. Well, it does'nt work but that's a different bug. Signed-off-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 23:48:50 -08:00
Pablo Neira Ayuso	4843b93c96	netlink: invert error code in netlink_set_err() The callers of netlink_set_err() currently pass a negative value as parameter for the error code. However, sk->sk_err wants a positive error value. Without this patch, skb_recv_datagram() called by netlink_recvmsg() may return a positive value to report an error. Another choice to fix this is to change callers to pass a positive error value, but this seems a bit inconsistent and error prone to me. Indeed, the callers of netlink_set_err() assumed that the (usual) negative value for error codes was fine before this patch :). This patch also includes some documentation in docbook format for netlink_set_err() to avoid this sort of confusion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 23:37:30 -08:00
Geert Uytterhoeven	e9cc8bddae	netlink: Move netlink attribute parsing support to lib Netlink attribute parsing may be used even if CONFIG_NET is not set. Move it from net/netlink to lib and control its inclusion based on the new config symbol CONFIG_NLATTR, which is selected by CONFIG_NET. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 14:53:30 +08:00
Randy Dunlap	abb79972b4	rds: fix iband RDMA dependencies Fix RDS Infiniband dependencies for RDMA so that these build errors won't happen: ERROR: "rdma_accept" [net/rds/rds.ko] undefined! ERROR: "rdma_destroy_id" [net/rds/rds.ko] undefined! ERROR: "rdma_connect" [net/rds/rds.ko] undefined! ERROR: "rdma_destroy_qp" [net/rds/rds.ko] undefined! ERROR: "rdma_listen" [net/rds/rds.ko] undefined! ERROR: "rdma_notify" [net/rds/rds.ko] undefined! ERROR: "rdma_create_id" [net/rds/rds.ko] undefined! ERROR: "rdma_create_qp" [net/rds/rds.ko] undefined! ERROR: "rdma_bind_addr" [net/rds/rds.ko] undefined! ERROR: "rdma_resolve_route" [net/rds/rds.ko] undefined! ERROR: "rdma_disconnect" [net/rds/rds.ko] undefined! ERROR: "rdma_reject" [net/rds/rds.ko] undefined! ERROR: "rdma_resolve_addr" [net/rds/rds.ko] undefined! Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 21:39:40 -08:00
Ingo Molnar	91d75e209b	Merge branch 'x86/core' into core/percpu	2009-03-04 02:29:19 +01:00
Eric W. Biederman	17edde5209	netns: Remove net_alive It turns out that net_alive is unnecessary, and the original problem that led to it being added was simply that the icmp code thought it was a network device and wound up being unable to handle packets while there were still packets in the network namespace. Now that icmp and tcp have been fixed to properly register themselves this problem is no longer present and we have a stronger guarantee that packets will not arrive in a network namespace then that provided by net_alive in netif_receive_skb. So remove net_alive allowing packet reception run a little faster. Additionally document the strong reason why network namespace cleanup is safe so that if something happens again someone else will have a chance of figuring it out. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 01:14:27 -08:00
Eric W. Biederman	2f20d2e667	tcp: Like icmp use register_pernet_subsys To remove the possibility of packets flying around when network devices are being cleaned up use reisger_pernet_subsys instead of register_pernet_device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 01:14:21 -08:00
Eric W. Biederman	6eb0777228	netns: Fix icmp shutdown. Recently I had a kernel panic in icmp_send during a network namespace cleanup. There were packets in the arp queue that failed to be sent and we attempted to generate an ICMP host unreachable message, but failed because icmp_sk_exit had already been called. The network devices are removed from a network namespace and their arp queues are flushed before we do attempt to shutdown subsystems so this error should have been impossible. It turns out icmp_init is using register_pernet_device instead of register_pernet_subsys. Which resulted in icmp being shut down while we still had the possibility of packets in flight, making a nasty NULL pointer deference in interrupt context possible. Changing this to register_pernet_subsys fixes the problem in my testing. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 01:14:15 -08:00
Daniel Lezcano	176c39af29	netns: fix addrconf_ifdown kernel panic When a network namespace is destroyed the network interfaces are all unregistered, making addrconf_ifdown called by the netdevice notifier. In the other hand, the addrconf exit method does a loop on the network devices and does addrconf_ifdown on each of them. But the ordering of the netns subsystem is not right because it uses the register_pernet_device instead of register_pernet_subsys. If we handle the loopback as any network device, we can safely use register_pernet_subsys. But if we use register_pernet_subsys, the addrconf exit method will do exactly what was already done with the unregistering of the network devices. So in definitive, this code is pointless. I removed the netns addrconf exit method and moved the code to the addrconf cleanup function. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 01:06:45 -08:00
Stephen Hemminger	b325fddb7f	ipv6: Fix sysctl unregistration deadlock Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 00:47:47 -08:00
Stephen Hemminger	5a5990d309	net: Avoid race between network down and sysfs Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-03 00:47:46 -08:00
Vlad Yasevich	7e99013a50	sctp: Fix broken RTO-doubling for data retransmits Commit `faee47cdbf` (sctp: Fix the RTO-doubling on idle-link heartbeats) broke the RTO doubling for data retransmits. If the heartbeat was sent before the data T3-rtx time, the the RTO will not double upon the T3-rtx expiration. Distingish between the operations by passing an argument to the function. Additionally, Wei Youngjun pointed out that our treatment of requested HEARTBEATS and timer HEARTBEATS is the same wrt resetting congestion window. That needs to be separated, since user requested HEARTBEATS should not treat the link as idle. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:49:18 -08:00
Wei Yongjun	f61f6f82c9	sctp: use time_before or time_after for comparing jiffies The functions time_before or time_after are more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:49:18 -08:00
Wei Yongjun	c6db93a58f	sctp: fix the length check in sctp_getsockopt_maxburst() The code in sctp_getsockopt_maxburst() doesn't allow len to be larger then struct sctp_assoc_value, which is a common case where app writers just pass down the sizeof(buf) or something similar. This patch fix the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:49:17 -08:00
Wei Yongjun	d212318c9d	sctp: remove dup code in net/sctp/socket.c Remove dup check of "if (optlen < sizeof(int))". Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:49:16 -08:00
Wei Yongjun	906f8257ee	sctp: Add some missing types for debug message This patch add the type name "AUTH" and primitive type name "PRIMITIVE_ASCONF" for debug message. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:49:16 -08:00
Hantzis Fotis	ee7537b63a	tcp: tcp_init_wl / tcp_update_wl argument cleanup The above functions from include/net/tcp.h have been defined with an argument that they never use. The argument is 'u32 ack' which is never used inside the function body, and thus it can be removed. The rest of the patch involves the necessary changes to the function callers of the above two functions. Signed-off-by: Hantzis Fotis <xantzis@ceid.upatras.gr> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:42:02 -08:00
Wei Yongjun	3df2678737	sctp: fix kernel panic with ERROR chunk containing too many error causes If ERROR chunk is received with too many error causes in ESTABLISHED state, the kernel get panic. This is because sctp limit the max length of cmds to 14, but while ERROR chunk is received, one error cause will add around 2 cmds by sctp_add_cmd_sf(). So many error causes will fill the limit of cmds and panic. This patch fixed the problem. This bug can be test by SCTP Conformance Test Suite <http://networktest.sourceforge.net/>. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:27:39 -08:00
Vlad Yasevich	d1dd524785	sctp: fix crash during module unload An extra list_del() during the module load failure and unload resulted in a crash with a list corruption. Now sctp can be unloaded again. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 22:27:38 -08:00
Gerrit Renker	86739fb96e	dccp: Do not let initial option overhead shrink the MPS This fixes a problem caused by the overlap of the connection-setup and established-state phases of DCCP connections. During connection setup, the client retransmits Confirm Feature-Negotiation options until a response from the server signals that it can move from the half-established PARTOPEN into the OPEN state, whereupon the connection is fully established on both ends (RFC 4340, 8.1.5). However, since the client may already send data while it is in the PARTOPEN state, consequences arise for the Maximum Packet Size: the problem is that the initial option overhead is much higher than for the subsequent established phase, as it involves potentially many variable-length list-type options (server-priority options, RFC 4340, 6.4). Applying the standard MPS is insufficient here: especially with larger payloads this can lead to annoying, counter-intuitive EMSGSIZE errors. On the other hand, reducing the MPS available for the established phase by the added initial overhead is highly wasteful and inefficient. The solution chosen therefore is a two-phase strategy: If the payload length of the DataAck in PARTOPEN is too large, an Ack is sent to carry the options, and the feature-negotiation list is then flushed. This means that the server gets two Acks for one Response. If both Acks get lost, it is probably better to restart the connection anyway and devising yet another special-case does not seem worth the extra complexity. The result is a higher utilisation of the available packet space for the data transmission phase (established state) of a connection. The patch (over-)estimates the initial overhead to be 32*4 bytes -- commonly seen values were around 90 bytes for initial feature-negotiation options. It uses sizeof(u32) to mean "aligned units of 4 bytes". For consistency, another use of 4-byte alignment is adapted. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:07:23 -08:00
Gerrit Renker	361a5c1dd0	dccp: Minimise header option overhead in setting the MPS This patch resolves a long-standing FIXME to dynamically update the Maximum Packet Size depending on actual options usage. It uses the flags set by the feature-negotiation infrastructure to compute the required header option size. Most options are fixed-size, a notable exception are Ack Vectors (required currently only by CCID-2). These can have any length between 3 and 1020 bytes. As a result of testing, 16 bytes (2 bytes for type/length plus 14 Ack Vector cells) have been found to be sufficient for loss-free situations. There are currently no CCID-specific header options which may appear on data packets, thus it is not necessary to define a corresponding CCID field as suggested in the old comment. Further changes: ---------------- Adjusted the type of 'cur_mps' to match the unsigned return type of the function. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:07:23 -08:00
Ilpo Järvinen	9ce0146102	tcp: get rid of two unnecessary u16s in TCP skb flags copying I guess these fields were one day 16-bit in the struct but nowadays they're just using 8 bits anyway. This is just a precaution, didn't result any change in my case but who knows what all those varying gcc versions & options do. I've been told that 16-bit is not so nice with some cpus. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:17 -08:00
Ilpo Järvinen	0d6a775e27	tcp: in sendmsg/pages open code the real goto target copied was assigned zero right before the goto, so if (copied) cannot ever be true. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:16 -08:00
Ilpo Järvinen	cabeccbd17	tcp: kill eff_sacks "cache", the sole user can calculate itself Also fixes insignificant bug that would cause sending of stale SACK block (would occur in some corner cases). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:16 -08:00
Ilpo Järvinen	758ce5c8d1	tcp: add helper for AI algorithm It seems that implementation in yeah was inconsistent to what other did as it would increase cwnd one ack earlier than the others do. Size benefits: bictcp_cong_avoid \| -36 tcp_cong_avoid_ai \| +52 bictcp_cong_avoid \| -34 tcp_scalable_cong_avoid \| -36 tcp_veno_cong_avoid \| -12 tcp_yeah_cong_avoid \| -38 = -104 bytes total Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:15 -08:00
Ilpo Järvinen	571a5dd8d0	htcp: merge icsk_ca_state compare Similar to what is done elsewhere in TCP code when double state checks are being done. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:14 -08:00
Ilpo Järvinen	e6c7d08579	tcp: drop unnecessary local var in collapse Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:13 -08:00
Ilpo Järvinen	bc079e9ede	tcp: cleanup ca_state mess in tcp_timer Redundant checks made indentation impossible to follow. However, it might be useful to make this ca_state+is_sack indexed array. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:13 -08:00
Ilpo Järvinen	7363a5b233	tcp: separate timeout marking loop to it's own function Some comment about its current state added. So far I have seen very few cases where the thing is actually useful, usually just marginally (though admittedly I don't usually see top of window losses where it seems possible that there could be some gain), instead, more often the cases suffer from L-marking spike which is certainly not desirable (I'll bury improving it to my todo list, but on a low prio position). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:12 -08:00
Ilpo Järvinen	d0af4160d1	tcp: remove redundant code from tcp_mark_lost_retrans Arnd Hannemann <hannemann@nets.rwth-aachen.de> noticed and was puzzled by the fact that !tcp_is_fack(tp) leads to early return near the beginning and the later on tcp_is_fack(tp) was still used in an if condition. The later check was a left-over from RFC3517 SACK stuff (== !tcp_is_fack(tp) behavior nowadays) as there wasn't clear way how to handle this particular check cheaply in the spirit of RFC3517 (using only SACK blocks, not holes + SACK blocks as with FACK). I sort of left it there as a reminder but since it's confusing other people just remove it and comment the missing-feature stuff instead. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Arnd Hannemann <hannemann@nets.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:11 -08:00
Ilpo Järvinen	02276f3c96	tcp: fix corner case issue in segmentation during rexmitting If cur_mss grew very recently so that the previously G/TSOed skb now fits well into a single segment it would get send up in parts unless we calculate # of segments again. This corner-case could happen eg. after mtu probe completes or less than previously sack blocks are required for the opposite direction. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:11 -08:00
Ilpo Järvinen	d3d2ae4545	tcp: Don't clear hints when tcp_fragmenting 1) We didn't remove any skbs, so no need to handle stale refs. 2) scoreboard_skb_hint is trivial, no timestamps were changed so no need to clear that one 3) lost_skb_hint needs tweaking similar to that of tcp_sacktag_one(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:10 -08:00
Ilpo Järvinen	62ad27619c	tcp: deferring in middle of queue makes very little sense If skb can be sent right away, we certainly should do that if it's in the middle of the queue because it won't get more data into it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:10 -08:00
Ilpo Järvinen	59a08cba6a	tcp: fix lost_cnt_hint miscounts It is possible that lost_cnt_hint gets underflow in tcp_clean_rtx_queue because the cumulative ACK can cover the segment where lost_skb_hint points to only partially, which means that the hint is not cleared, opposite to what my (earlier) comment claimed. Also I don't agree what I ended up writing about non-trivial case there to be what I intented to say. It was not supposed to happen that the hint won't get cleared and we underflow in any scenario. In general, this is quite hard to trigger in practice. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:09 -08:00
Ilpo Järvinen	ac11ba753f	tcp: don't backtrack to sacked skbs Backtracking to sacked skbs is a horrible performance killer since the hint cannot be advanced successfully past them... ...And it's totally unnecessary too. In theory this is 2.6.27..28 regression but I doubt anybody can make .28 to have worse performance because of other TCP improvements. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 03:00:08 -08:00
David S. Miller	c3b3240450	rds: Fix build on powerpc. As reported by Stephen Rothwell. > Today's linux-next build (powerpc allyesconfig) failed like this: > > net/rds/cong.c: In function 'rds_cong_set_bit': > net/rds/cong.c:284: error: implicit declaration of function 'generic___set_le_bit' > net/rds/cong.c: In function 'rds_cong_clear_bit': > net/rds/cong.c:298: error: implicit declaration of function 'generic___clear_le_bit' > net/rds/cong.c: In function 'rds_cong_test_bit': > net/rds/cong.c:309: error: implicit declaration of function 'generic_test_le_bit' Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-02 01:49:28 -08:00
David S. Miller	aa4abc9bcc	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-tx.c net/8021q/vlan_core.c net/core/dev.c	2009-03-01 21:35:16 -08:00
Ilpo Järvinen	9ec06ff57a	tcp: fix retrans_out leaks There's conflicting assumptions in shifting, the caller assumes that dupsack results in S'ed skbs (or a part of it) for sure but never gave a hint to tcp_sacktag_one when dsack is actually in use. Thus DSACK retrans_out -= pcount was not taken and the counter became out of sync. Remove obstacle from that information flow to get DSACKs accounted in tcp_sacktag_one as expected. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-01 00:21:36 -08:00
Herbert Xu	4ead443163	netpoll: Add drop checks to all entry points The netpoll entry checks are required to ensure that we don't receive normal packets when invoked via netpoll. Unfortunately it only ever worked for the netif_receive_skb/netif_rx entry points. The VLAN (and subsequently GRO) entry point didn't have the check and therefore can trigger all sorts of weird problems. This patch adds the netpoll check to all entry points. I'm still uneasy with receiving at all under netpoll (which apparently is only used by the out-of-tree kdump code). The reason is it is perfectly legal to receive all data including headers into highmem if netpoll is off, but if you try to do that with netpoll on and someone gets a printk in an IRQ handler you're going to get a nice BUG_ON. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-03-01 00:11:52 -08:00
David S. Miller	8010dc306b	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2009-02-28 22:32:16 -08:00
Jouni Malinen	0bfbce18b9	nl80211: Avoid AP mode BUG_ON hang with invalid lock assert "cfg80211: add assert_cfg80211_lock() to ensure proper protection" added assert_cfg80211_lock() calls into various places. At least one of them, nl80211_send_wiphy(), should not have been there. That triggers the BUG_ON in assert_cfg80211_lock() and pretty much kills the kernel whenever someone runs hostapd.. Remove that call and make assert_cfg80211_lock() use WARN_ON instead of BUG_ON to be a bit more friendly to users. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:53:04 -05:00
Alina Friedrichsen	4a332a385a	mac80211: Give it some time to do the TSF sync Give slow hardware some time to do the TSF sync, to not run into an IBSS merging endless loop in some rarely situations. Signed-off-by: Alina Friedrichsen <x-alina@gmx.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:53:03 -05:00
Alina Friedrichsen	34e8f08231	mac80211: Don't merge with the same BSSID It was not a good idea to do a TSF reset on strange IBSS merges to the same BSSID. For example it will break the TSF sync of ath9k completely and it is unnecessary as all hardware I have tested do a TSF sync to a higher value automatically and IBSS merges are only done to higher TSF values. It only need a TSF reset to accept a lower value, when the IBSS network is changed manually. Signed-off-by: Alina Friedrichsen <x-alina@gmx.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:53:02 -05:00
Luis R. Rodriguez	2f92cd2e5f	cfg80211: pass the regulatory_request to ignore_request Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:53:00 -05:00
Luis R. Rodriguez	d951c1ddeb	cfg80211: do not kzalloc() again for a new request on __regulatory_hint Since we already have a regulatory request from the workqueue use that and avoid a new kzalloc() Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:53:00 -05:00
Luis R. Rodriguez	28da32d7ca	cfg80211: pass the regulatory_request struct in __regulatory_hint() We were passing value by value, lets just pass the struct. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:53:00 -05:00
Luis R. Rodriguez	d1c96a9a29	cfg80211: make __regulatory_hint() static Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:59 -05:00
Luis R. Rodriguez	e38f8a7a8b	cfg80211: Add AP beacon regulatory hints When devices are world roaming they cannot beacon or do active scan on 5 GHz or on channels 12, 13 and 14 on the 2 GHz band. Although we have a good regulatory API some cards may _always_ world roam, this is also true when a system does not have CRDA present. Devices doing world roaming can still passive scan, if they find a beacon from an AP on one of the world roaming frequencies we make the assumption we can do the same and we also remove the passive scan requirement. This adds support for providing beacon regulatory hints based on scans. This works for devices that do either hardware or software scanning. If a channel has not yet been marked as having had a beacon present on it we queue the beacon hint processing into the workqueue. All wireless devices will benefit from beacon regulatory hints from any wireless device on a system including new devices connected to the system at a later time. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:59 -05:00
Luis R. Rodriguez	3fc71f775a	cfg80211: enable 5 GHz world roaming channels The current static world regulatory domain is too restrictive, we can use some 5 GHz channels world wide so long as they do not touch frequencies which require DFS. The compromise is we must also enforce passive scanning and disallow usage of a mode of operation that beacons: (AP \| IBSS \| Mesh) Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:59 -05:00
Luis R. Rodriguez	68798a6263	cfg80211: enable active-scan / beaconing on Ch 1-11 for world regdom This enables active scan and beaconing on Channels 1 through 11 on the static world regulatory domain. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:58 -05:00
Luis R. Rodriguez	69b1572bd8	cfg80211: rename regdom_changed to regdom_changes() and use it Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:58 -05:00
Luis R. Rodriguez	fff32c04f6	cfg80211: allow drivers that agree on regulatory to agree This allows drivers that agree on regulatory to share their regulatory domain. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:58 -05:00
Luis R. Rodriguez	fb1fc7add5	cfg80211: comments style cleanup Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:57 -05:00
Luis R. Rodriguez	fe33eb3908	cfg80211: move all regulatory hints to workqueue All regulatory hints (core, driver, userspace and 11d) are now processed in a workqueue. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:57 -05:00
Luis R. Rodriguez	0441d6ffc7	cfg80211: free rd on unlikely event on 11d hint This was never happening but it was still wrong, so correct it. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:57 -05:00
Luis R. Rodriguez	915278e099	cfg80211: remove likely from an 11d hint case Truth of the matter this was confusing people so mark it as unlikely as that is the case now. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:56 -05:00
Luis R. Rodriguez	d335fe6391	cfg80211: protect first access of last_request on 11d hint under mutex We were not protecting last_request there is a small possible race between an 11d hint and another routine which calls reset_regdomains() which can prevent a valid country IE from being processed. This is not critical as it will still be procesed soon after but locking prior to it is correct. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:56 -05:00
Luis R. Rodriguez	806a9e3967	cfg80211: make regulatory_request use wiphy_idx instead of wiphy We do this so later on we can move the pending requests onto a workqueue. By using the wiphy_idx instead of the wiphy we can later easily check if the wiphy has disappeared or not. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:56 -05:00
Luis R. Rodriguez	761cf7ecff	cfg80211: add assert_cfg80211_lock() to ensure proper protection Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:56 -05:00
Luis R. Rodriguez	bcf4f99b7b	cfg80211: propagate -ENOMEM during regulatory_init() Calling kobject_uevent_env() can fail mainly due to out of memory conditions. We do not want to continue during such conditions so propagate that as well instead of letting cfg80211 load as if everything is peachy. Additionally lets clarify that when CRDA is not called during cfg80211's initialization _and_ if the error is not an -ENOMEM its because kobject_uevent_env() failed to call CRDA, not because CRDA failed. For those who want to find out why we also let you do so by enabling the kernel config CONFIG_CFG80211_REG_DEBUG -- you'll get an actual stack trace. So for now we'll treat non -ENOMEM kobject_uevent_env() failures as non fatal during cfg80211's initialization. CC: Greg KH <greg@kroah.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:55 -05:00
Luis R. Rodriguez	ba25c14142	cfg80211: add regulatory_hint_core() to separate the core reg hint This makes the core hint path more readable and allows for us to later make it obvious under what circumstances we need locking or not. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:55 -05:00
Luis R. Rodriguez	80778f18c0	nl80211: disallow user requests prior to regulatory_init() If cfg80211 is built into the kernel there is perhaps a small time window betwen nl80211_init() and regulatory_init() where cfg80211_regdomain hasn't yet been initialized to let the wireless core do its work. During that rare case and time frame (if its even possible) we don't allow user regulatory changes as cfg80211 is working on enabling its first regulatory domain. To check for cfg80211_regdomain we now contend the entire operation using the cfg80211_mutex. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:55 -05:00
Luis R. Rodriguez	a1794390f1	cfg80211: rename cfg80211_drv_mutex to cfg80211_mutex cfg80211_drv_mutex is protecting more than the driver list, this renames it and documents what its currently supposed to protect. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:55 -05:00
Luis R. Rodriguez	85fd129a72	cfg80211: add wiphy_idx_valid to check for wiphy_idx sanity This will later be used by others, for now make use of it in cfg80211_drv_by_wiphy_idx() to return early if an invalid wiphy_idx has been provided. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:54 -05:00
Luis R. Rodriguez	b5850a7a4f	cfg80211: rename cfg80211_registered_device's idx to wiphy_idx Makes it clearer to read when comparing to ifidx Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:54 -05:00
Alina Friedrichsen	79f6440c52	mac80211: Introduce a generic commit() to apply changes This patch introduces a generic commit() function which initiate a new network joining process. It should be called after some interface config changes, so that the changes get applied more cleanly. Currently set_ssid() and set_bssid() call it. Others can be added in future patches. In version 1 the header files was forgotten, sorry. Signed-off-by: Alina Friedrichsen <x-alina@gmx.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:54 -05:00
Michael Buesch	80e775bf08	mac80211: Add software scan notifiers This adds optional notifier functions for software scan. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:51 -05:00
Johannes Berg	4aa188e1a8	mac80211/cfg80211: move iwrange handler to cfg80211 The previous patch made cfg80211 generally aware of the signal type a given hardware will give, so now it can implement SIOCGIWRANGE itself, removing more wext stuff from mac80211. Might need to be a little more parametrized once we have more hardware using cfg80211 and new hardware capabilities. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:42 -05:00
Johannes Berg	77965c970d	cfg80211: clean up signal type It wasn't a good idea to make the signal type a per-BSS option, although then it is closer to the actual value. Move it to be a per-wiphy setting, update mac80211 to match. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:42 -05:00
Johannes Berg	630e64c487	nl80211: remove admin requirement from station get There's no particular reason to not let untrusted users see this information -- it's just the stations we're talking to, packet counters for them and possibly some mesh things. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:41 -05:00
Johannes Berg	0a16ec5f5e	mac80211: add missing kernel-doc Document the new shutdown member. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:41 -05:00
Johannes Berg	a77b855245	cfg80211/mac80211: fill qual.qual value/adjust max_qual.qual Due to various bugs in the software stack we end up having to fill qual.qual; level should be used, but wpa_supplicant doesn't properly ignore qual.qual, NM should use qual.level regardless of that because qual.qual is 0 but doesn't handle IW_QUAL_DBM right now. So fill qual.qual with the qual.level value clamped to -110..-40 dBm or just the regular 'unspecified' signal level. This requires a mac80211 change to properly announce the max_qual.qual and avg_qual.qual values. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:40 -05:00
Dan Williams	cb3a8eec0e	cfg80211: age scan results on resume Scanned BSS entries are timestamped with jiffies, which doesn't increment across suspend and hibernate. On resume, every BSS in the scan list looks like it was scanned within the last 10 seconds, irregardless of how long the machine was actually asleep. Age scan results on resume with the time spent during sleep so userspace has a clue how old they really are. Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:40 -05:00
Jouni Malinen	98c8a60a04	nl80211: Provide access to STA TX/RX packet counters The TX/RX packet counters are needed to fill in RADIUS Accounting attributes Acct-Output-Packets and Acct-Input-Packets. We already collect the needed information, but only the TX/RX bytes were previously exposed through nl80211. Allow applications to fetch the packet counters, too, to provide more complete support for accounting. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:39 -05:00
Jouni Malinen	70692ad292	nl80211: Optional IEs into scan request This extends the NL80211_CMD_TRIGGER_SCAN command to allow applications to specify a set of information element(s) to be added into Probe Request frames with NL80211_ATTR_IE. This provides support for the MLME-SCAN.request primitive parameter VendorSpecificInfo and can be used, e.g., to implement WPS scanning. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:38 -05:00
Randy Dunlap	13e967b292	wireless: fix for CONFIG_NL80211=n Add empty function for case of CONFIG_NL80211=n: net/wireless/scan.c:35: error: implicit declaration of function 'nl80211_send_scan_aborted' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:52:35 -05:00
Sujith	81cb7623ad	mac80211: Extend the rate control API with an update callback The AP can switch dynamically between 20/40 Mhz channel width, in which case we switch the local operating channel, but the rate control algorithm is not notified. This patch adds a new callback to indicate such changes to the RC algorithm. Currently, HT channel width change is notified, but this callback can be used to indicate any new requirements that might come up later on. Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:51:45 -05:00
Johannes Berg	469002983f	mac80211: split IBSS/managed code This patch splits out the ibss code and data from managed (station) mode. The reason to do this is to better separate the state machines, and have the code be contained better so it gets easier to determine what exactly a given change will affect, that in turn makes it easier to understand. This is quite some churn, especially because I split sdata->u.sta into sdata->u.mgd and sdata->u.ibss, but I think it's easier to maintain that way. I've also shuffled around some code -- null function sending is only applicable to managed interfaces so put that into that file, some other functions are needed from various places so put them into util, and also rearranged the prototypes in ieee80211_i.h accordingly. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:51:42 -05:00
Johannes Berg	96f5e66e8a	mac80211: fix aggregation for hardware with ampdu queues Hardware with AMPDU queues currently has broken aggregation. This patch fixes it by making all A-MPDUs go over the regular AC queues, but keeping track of the hardware queues in mac80211. As a first rough version, it actually stops the AC queue for extended periods of time, which can be removed by adding buffering internal to mac80211, but is currently not a huge problem because people rarely use multiple TIDs that are in the same AC (and iwlwifi currently doesn't operate as AP). This is a short-term fix, my current medium-term plan, which I hope to execute soon as well, but am not sure can finish before .30, looks like this: 1) rework the internal queuing layer in mac80211 that we use for fragments if the driver stopped queue in the middle of a fragmented frame to be able to queue more frames at once (rather than just a single frame with its fragments) 2) instead of stopping the entire AC queue, queue up the frames in a per-station/per-TID queue during aggregation session initiation, when the session has come up take all those frames and put them onto the queue from 1) 3) push the ampdu queue layer abstraction this patch introduces in mac80211 into the driver, and remove the virtual queue stuff from mac80211 again This plan will probably also affect ath9k in that mac80211 queues the frames instead of passing them down, even when there are no ampdu queues. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:51:42 -05:00
Johannes Berg	076ae609d2	mac80211: disallow moving netns mac80211 currently assumes init_net for all interfaces, so really will not cope well with network namespaces, at least at this time. To change this, we would have keep track of the netns in addition to the ifindex, which is not something I want to think about right now. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:51:39 -05:00
Vasanthakumar Thiagarajan	53d6f81c78	mac80211: Make sure non-HT connection when IEEE80211_STA_TKIP_WEP_USED is set It is possible that some broken AP might send HT IEs in it's assoc response even though the STA has not sent them in assoc req when WEP/TKIP is used as pairwise cipher suite. Also it is important to check this bit before enabling ht mode in beacon receive path. Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-02-27 14:51:39 -05:00
Jarek Poplawski	1844f74794	pkt_sched: sch_drr: Fix oops in drr_change_class. drr_change_class lacks a check for NULL of tca[TCA_OPTIONS], so oops is possible. Reported-by: Denys Fedoryschenko <denys@visp.net.lb> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-27 02:42:38 -08:00
Andy Grover	fe17f84f5f	RDS: Kconfig and Makefile Add RDS Kconfig and Makefile, and modify net/'s to add us to the build. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:43:35 -08:00
Andy Grover	cbd151bfc7	RDS: Add RDS to AF key strings Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:43:19 -08:00
Andy Grover	55b7ed0b58	RDS: Common RDMA transport code Although most of IB and iWARP are separated from each other, there is some common code required to handle their shared CM listen port. This code listens for CM events and then dispatches the event to the appropriate transport, either IB or iWARP. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:33 -08:00
Andy Grover	fcd8b7c0ec	RDS: Add iWARP support Support for iWARP NICs is implemented as a separate RDS transport from IB. The code, however, is very similar to IB (it was forked, basically.) so let's keep it in one changeset. The reason for this duplicationis that despite its similarity to IB, there are a number of places where it has different semantics. iwarp zcopy support is still under development, and giving it its own sandbox ensures that IB code isn't disrupted while iwarp changes. Over time these transports will re-converge. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:33 -08:00
Andy Grover	e6babe4cc4	RDS/IB: Stats and sysctls IB-specific stats and sysctls. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:32 -08:00
Andy Grover	1e23b3ee0e	RDS/IB: Receive datagrams via IB Header parsing, ring refill. It puts the incoming data into an rds_incoming struct, which is passed up to rds-core. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:32 -08:00
Andy Grover	6a0979df32	RDS/IB: Implement IB-specific datagram send. Specific to IB is a credits-based flow control mechanism, in addition to the expected usage of the IB API to package outgoing data into work requests. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:31 -08:00
Andy Grover	08b48a1ed8	RDS/IB: Implement RDMA ops using FMRs Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:31 -08:00
Andy Grover	f528efe276	RDS/IB: Ring-handling code. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:30 -08:00
Andy Grover	ec16227e14	RDS/IB: Infiniband transport Registers as an RDS transport and an IB client, and uses IB CM API to allocate ids, queue pairs, and the rest of that fun stuff. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:30 -08:00
Andy Grover	eff5f53bef	RDS: RDMA support Some transports may support RDMA features. This handles the non-transport-specific parts, like pinning user pages and tracking mapped regions. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:29 -08:00
Andy Grover	bdbe6fbc6a	RDS: recv.c Upon receiving a datagram from the transport, RDS parses the headers and potentially queues an ACK. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:29 -08:00
Andy Grover	5c11559046	RDS: send.c This is the code to send an RDS datagram. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:28 -08:00
Andy Grover	7875e18e09	RDS: Message parsing Parsing of newly-received RDS message headers (including ext. headers) and copy-to/from-user routines. page.c implements a per-cpu page remainder cache, to reduce the number of allocations needed for small datagrams. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:28 -08:00
Andy Grover	3e5048495c	RDS: sysctls RDS exposes a few tunable parameters via sysctls. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:26 -08:00
Andy Grover	13796bf9ed	RDS: loopback A simple rds transport to handle loopback connections. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:26 -08:00
Andy Grover	00e0f34c61	RDS: Connection handling While arguably the fact that the underlying transport needs a connection to convey RDS's datagrame reliably is not important to rds proper, the transports implemented so far (IB and TCP) have both been connection-oriented, and so the connection state machine-related code is in the common rds code. This patch also includes several work items, to handle connecting, sending, receiving, and shutdown. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:25 -08:00
Andy Grover	a8c879a7ee	RDS: Info and stats RDS currently generates a lot of stats that are accessible via the rds-info utility. This code implements the support for this. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:25 -08:00
Andy Grover	0fbc78cbf5	RDS: Transport code RDS supports multiple transports. While this initial submission only supports Infiniband transport, this abstraction allows others to be added. We're working on an iWARP transport, and also see UDP over DCB as another possibility. This code handles transport registration. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:24 -08:00
Andy Grover	922cb17a5c	RDS: Congestion-handling code RDS handles per-socket congestion by updating peers with a complete congestion map (8KB). This code keeps track of these maps for itself and ones received from peers. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:24 -08:00
Andy Grover	39de828179	RDS: Main header file RDS's main data structure definitions and exported functions. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:23 -08:00
Andy Grover	639b321b4d	RDS: Socket interface Implement the RDS (Reliable Datagram Sockets) interface. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:23 -08:00
Hannes Eder	9ee62630fd	wanrouter: fix sparse warnings: context imbalance Impact: Attribute functions with __acquires(...) resp. __releases(...). Fix this sparse warnings: net/wanrouter/wanproc.c:82:13: warning: context imbalance in 'r_start' - wrong count at exit net/wanrouter/wanproc.c:103:13: warning: context imbalance in 'r_stop' - unexpected unlock net/wanrouter/wanmain.c:765:13: warning: context imbalance in 'lock_adapter_irq' - wrong count at exit net/wanrouter/wanmain.c:771:13: warning: context imbalance in 'unlock_adapter_irq' - unexpected unlock Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:36 -08:00
Hannes Eder	56bca31ff1	inet fragments: fix sparse warning: context imbalance Impact: Attribute function with __releases(...) Fix this sparse warning: net/ipv4/inet_fragment.c:276:35: warning: context imbalance in 'inet_frag_find' - unexpected unlock Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:35 -08:00
Hannes Eder	e57c624be8	decnet: fix sparse warnings: symbol shadows an earlier one Impact: Remove redundant variable declarations, resp. rename inner scope variable. Fix this sparse warnings: net/decnet/af_decnet.c:1252:40: warning: symbol 'skb' shadows an earlier one net/decnet/af_decnet.c:1223:24: originally declared here net/decnet/af_decnet.c:1582:29: warning: symbol 'val' shadows an earlier one net/decnet/af_decnet.c:1527:22: originally declared here net/decnet/dn_dev.c:687:21: warning: symbol 'err' shadows an earlier one net/decnet/dn_dev.c:670:13: originally declared here net/decnet/sysctl_net_decnet.c:182:21: warning: symbol 'len' shadows an earlier one net/decnet/sysctl_net_decnet.c:173:16: originally declared here Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:35 -08:00
Hannes Eder	8521c27ee7	decnet: fix sparse warnings: context imbalance Impact: Attribute functions with __acquires(...) resp. __releases(...). Fix this sparse warnings: net/decnet/dn_dev.c:1324:13: warning: context imbalance in 'dn_dev_seq_start' - wrong count at exit net/decnet/dn_dev.c:1366:13: warning: context imbalance in 'dn_dev_seq_stop' - unexpected unlock Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:34 -08:00
Hannes Eder	63d819caeb	sysctl: fix sparse warning: Should it be static? Impact: Include header file. Fix this sparse warning: net/core/sysctl_net_core.c:123:32: warning: symbol 'net_core_path' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:34 -08:00
Hannes Eder	2db096086e	appletalk: fix warning: format not a string literal and no ... Impact: Use 'static const char[]' instead of 'static char[]', and since the data is const now it can be placed in __initconst. Fix this warning: net/appletalk/ddp.c: In function 'atalk_init': net/appletalk/ddp.c:1894: warning: format not a string literal and no format arguments Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:33 -08:00
Hannes Eder	e3db6cb421	9p: fix sparse warning: cast adds address space Impact: Trust in the comment and add '__force' to the cast. Fix this sparse warning: net/9p/trans_fd.c:420:34: warning: cast adds address space to expression (<asn:1>) Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:32 -08:00
Hannes Eder	81c553299f	net/802: fix sparse warnings: context imbalance Impact: Attribute function with __acquires(...) resp. __releases(...). Fix this sparse warnings: net/802/tr.c:492:21: warning: context imbalance in 'rif_seq_start' - wrong count at exit net/802/tr.c:519:13: warning: context imbalance in 'rif_seq_stop' - unexpected unlock Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:13:32 -08:00
Wei Yongjun	c3431ea71e	llc: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:38 -08:00
Wei Yongjun	47a30b26e5	iucv: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:37 -08:00
Wei Yongjun	db849df63c	decnet: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:36 -08:00
Wei Yongjun	f3fbbe0f6f	core: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:36 -08:00
Wei Yongjun	acb5d75b9b	packet: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:35 -08:00
Wei Yongjun	ce030edfb4	can: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:35 -08:00
Wei Yongjun	91744f6559	netlink: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:34 -08:00
Wei Yongjun	40d44446cf	unix: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:34 -08:00
Wei Yongjun	86dc1ad2be	pktgen: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:33 -08:00
Wei Yongjun	6f96106867	af_key: remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:07:32 -08:00
Wei Yongjun	7585b97a48	Bluetooth: Remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:49 +01:00
Dave Young	2ae9a6be5f	Bluetooth: Move hci_conn_del_sysfs() back to avoid device destruct too early The following commit introduce a regression: commit `7d0db0a373` Author: Marcel Holtmann <marcel@holtmann.org> Date: Mon Jul 14 20:13:51 2008 +0200 [Bluetooth] Use a more unique bus name for connections I get panic as following (by netconsole): [ 2709.344034] usb 5-1: new full speed USB device using uhci_hcd and address 4 [ 2709.505776] usb 5-1: configuration #1 chosen from 1 choice [ 2709.569207] Bluetooth: Generic Bluetooth USB driver ver 0.4 [ 2709.570169] usbcore: registered new interface driver btusb [ 2845.742781] BUG: unable to handle kernel paging request at 6b6b6c2f [ 2845.742958] IP: [<c015515c>] __lock_acquire+0x6c/0xa80 [ 2845.743087] *pde = 00000000 [ 2845.743206] Oops: 0002 [#1] SMP [ 2845.743377] last sysfs file: /sys/class/bluetooth/hci0/hci0:6/type [ 2845.743742] Modules linked in: btusb netconsole snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss rfcomm l2cap bluetooth vfat fuse snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm pl2303 snd_timer psmouse usbserial snd 3c59x e100 serio_raw soundcore i2c_i801 intel_agp mii agpgart snd_page_alloc rtc_cmos rtc_core thermal processor rtc_lib button thermal_sys sg evdev [ 2845.743742] [ 2845.743742] Pid: 0, comm: swapper Not tainted (2.6.29-rc5-smp #54) Dell DM051 [ 2845.743742] EIP: 0060:[<c015515c>] EFLAGS: 00010002 CPU: 0 [ 2845.743742] EIP is at __lock_acquire+0x6c/0xa80 [ 2845.743742] EAX: 00000046 EBX: 00000046 ECX: 6b6b6b6b EDX: 00000002 [ 2845.743742] ESI: 6b6b6b6b EDI: 00000000 EBP: c064fd14 ESP: c064fcc8 [ 2845.743742] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 2845.743742] Process swapper (pid: 0, ti=c064e000 task=c05d1400 task.ti=c064e000) [ 2845.743742] Stack: [ 2845.743742] c05d1400 00000002 c05d1400 00000001 00000002 00000000 f65388dc c05d1400 [ 2845.743742] 6b6b6b6b 00000292 c064fd0c c0153732 00000000 00000000 00000001 f700fa50 [ 2845.743742] 00000046 00000000 00000000 c064fd40 c0155be6 00000000 00000002 00000001 [ 2845.743742] Call Trace: [ 2845.743742] [<c0153732>] ? trace_hardirqs_on_caller+0x72/0x1c0 [ 2845.743742] [<c0155be6>] ? lock_acquire+0x76/0xa0 [ 2845.743742] [<c03e1aad>] ? skb_dequeue+0x1d/0x70 [ 2845.743742] [<c046c885>] ? _spin_lock_irqsave+0x45/0x80 [ 2845.743742] [<c03e1aad>] ? skb_dequeue+0x1d/0x70 [ 2845.743742] [<c03e1aad>] ? skb_dequeue+0x1d/0x70 [ 2845.743742] [<c03e1f94>] ? skb_queue_purge+0x14/0x20 [ 2845.743742] [<f8171f5a>] ? hci_conn_del+0x10a/0x1c0 [bluetooth] [ 2845.743742] [<f81399c9>] ? l2cap_disconn_ind+0x59/0xb0 [l2cap] [ 2845.743742] [<f81795ce>] ? hci_conn_del_sysfs+0x8e/0xd0 [bluetooth] [ 2845.743742] [<f8175758>] ? hci_event_packet+0x5f8/0x31c0 [bluetooth] [ 2845.743742] [<c03dfe19>] ? sock_def_readable+0x59/0x80 [ 2845.743742] [<c046c14d>] ? _read_unlock+0x1d/0x20 [ 2845.743742] [<f8178aa9>] ? hci_send_to_sock+0xe9/0x1d0 [bluetooth] [ 2845.743742] [<c015388b>] ? trace_hardirqs_on+0xb/0x10 [ 2845.743742] [<f816fa6a>] ? hci_rx_task+0x2ba/0x490 [bluetooth] [ 2845.743742] [<c0133661>] ? tasklet_action+0x31/0xc0 [ 2845.743742] [<c013367c>] ? tasklet_action+0x4c/0xc0 [ 2845.743742] [<c0132eb7>] ? __do_softirq+0xa7/0x170 [ 2845.743742] [<c0116dec>] ? ack_apic_level+0x5c/0x1c0 [ 2845.743742] [<c0132fd7>] ? do_softirq+0x57/0x60 [ 2845.743742] [<c01333dc>] ? irq_exit+0x7c/0x90 [ 2845.743742] [<c01055bb>] ? do_IRQ+0x4b/0x90 [ 2845.743742] [<c01333d5>] ? irq_exit+0x75/0x90 [ 2845.743742] [<c010392c>] ? common_interrupt+0x2c/0x34 [ 2845.743742] [<c010a14f>] ? mwait_idle+0x4f/0x70 [ 2845.743742] [<c0101c05>] ? cpu_idle+0x65/0xb0 [ 2845.743742] [<c045731e>] ? rest_init+0x4e/0x60 [ 2845.743742] Code: 0f 84 69 02 00 00 83 ff 07 0f 87 1e 06 00 00 85 ff 0f 85 08 05 00 00 8b 4d cc 8b 49 04 85 c9 89 4d d4 0f 84 f7 04 00 00 8b 75 d4 <f0> ff 86 c4 00 00 00 89 f0 e8 56 a9 ff ff 85 c0 0f 85 6e 03 00 [ 2845.743742] EIP: [<c015515c>] __lock_acquire+0x6c/0xa80 SS:ESP 0068:c064fcc8 [ 2845.743742] ---[ end trace 4c985b38f022279f ]--- [ 2845.743742] Kernel panic - not syncing: Fatal exception in interrupt [ 2845.743742] ------------[ cut here ]------------ [ 2845.743742] WARNING: at kernel/smp.c:329 smp_call_function_many+0x151/0x200() [ 2845.743742] Hardware name: Dell DM051 [ 2845.743742] Modules linked in: btusb netconsole snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss rfcomm l2cap bluetooth vfat fuse snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm pl2303 snd_timer psmouse usbserial snd 3c59x e100 serio_raw soundcore i2c_i801 intel_agp mii agpgart snd_page_alloc rtc_cmos rtc_core thermal processor rtc_lib button thermal_sys sg evdev [ 2845.743742] Pid: 0, comm: swapper Tainted: G D 2.6.29-rc5-smp #54 [ 2845.743742] Call Trace: [ 2845.743742] [<c012e076>] warn_slowpath+0x86/0xa0 [ 2845.743742] [<c015041b>] ? trace_hardirqs_off+0xb/0x10 [ 2845.743742] [<c0146384>] ? up+0x14/0x40 [ 2845.743742] [<c012e661>] ? release_console_sem+0x31/0x1e0 [ 2845.743742] [<c046c8ab>] ? _spin_lock_irqsave+0x6b/0x80 [ 2845.743742] [<c015041b>] ? trace_hardirqs_off+0xb/0x10 [ 2845.743742] [<c046c900>] ? _read_lock_irqsave+0x40/0x80 [ 2845.743742] [<c012e7f2>] ? release_console_sem+0x1c2/0x1e0 [ 2845.743742] [<c0146384>] ? up+0x14/0x40 [ 2845.743742] [<c015041b>] ? trace_hardirqs_off+0xb/0x10 [ 2845.743742] [<c046a3d7>] ? __mutex_unlock_slowpath+0x97/0x160 [ 2845.743742] [<c046a563>] ? mutex_trylock+0xb3/0x180 [ 2845.743742] [<c046a4a8>] ? mutex_unlock+0x8/0x10 [ 2845.743742] [<c015b991>] smp_call_function_many+0x151/0x200 [ 2845.743742] [<c010a1a0>] ? stop_this_cpu+0x0/0x40 [ 2845.743742] [<c015ba61>] smp_call_function+0x21/0x30 [ 2845.743742] [<c01137ae>] native_smp_send_stop+0x1e/0x50 [ 2845.743742] [<c012e0f5>] panic+0x55/0x110 [ 2845.743742] [<c01065a8>] oops_end+0xb8/0xc0 [ 2845.743742] [<c010668f>] die+0x4f/0x70 [ 2845.743742] [<c011a8c9>] do_page_fault+0x269/0x610 [ 2845.743742] [<c011a660>] ? do_page_fault+0x0/0x610 [ 2845.743742] [<c046cbaf>] error_code+0x77/0x7c [ 2845.743742] [<c015515c>] ? __lock_acquire+0x6c/0xa80 [ 2845.743742] [<c0153732>] ? trace_hardirqs_on_caller+0x72/0x1c0 [ 2845.743742] [<c0155be6>] lock_acquire+0x76/0xa0 [ 2845.743742] [<c03e1aad>] ? skb_dequeue+0x1d/0x70 [ 2845.743742] [<c046c885>] _spin_lock_irqsave+0x45/0x80 [ 2845.743742] [<c03e1aad>] ? skb_dequeue+0x1d/0x70 [ 2845.743742] [<c03e1aad>] skb_dequeue+0x1d/0x70 [ 2845.743742] [<c03e1f94>] skb_queue_purge+0x14/0x20 [ 2845.743742] [<f8171f5a>] hci_conn_del+0x10a/0x1c0 [bluetooth] [ 2845.743742] [<f81399c9>] ? l2cap_disconn_ind+0x59/0xb0 [l2cap] [ 2845.743742] [<f81795ce>] ? hci_conn_del_sysfs+0x8e/0xd0 [bluetooth] [ 2845.743742] [<f8175758>] hci_event_packet+0x5f8/0x31c0 [bluetooth] [ 2845.743742] [<c03dfe19>] ? sock_def_readable+0x59/0x80 [ 2845.743742] [<c046c14d>] ? _read_unlock+0x1d/0x20 [ 2845.743742] [<f8178aa9>] ? hci_send_to_sock+0xe9/0x1d0 [bluetooth] [ 2845.743742] [<c015388b>] ? trace_hardirqs_on+0xb/0x10 [ 2845.743742] [<f816fa6a>] hci_rx_task+0x2ba/0x490 [bluetooth] [ 2845.743742] [<c0133661>] ? tasklet_action+0x31/0xc0 [ 2845.743742] [<c013367c>] tasklet_action+0x4c/0xc0 [ 2845.743742] [<c0132eb7>] __do_softirq+0xa7/0x170 [ 2845.743742] [<c0116dec>] ? ack_apic_level+0x5c/0x1c0 [ 2845.743742] [<c0132fd7>] do_softirq+0x57/0x60 [ 2845.743742] [<c01333dc>] irq_exit+0x7c/0x90 [ 2845.743742] [<c01055bb>] do_IRQ+0x4b/0x90 [ 2845.743742] [<c01333d5>] ? irq_exit+0x75/0x90 [ 2845.743742] [<c010392c>] common_interrupt+0x2c/0x34 [ 2845.743742] [<c010a14f>] ? mwait_idle+0x4f/0x70 [ 2845.743742] [<c0101c05>] cpu_idle+0x65/0xb0 [ 2845.743742] [<c045731e>] rest_init+0x4e/0x60 [ 2845.743742] ---[ end trace 4c985b38f02227a0 ]--- [ 2845.743742] ------------[ cut here ]------------ [ 2845.743742] WARNING: at kernel/smp.c:226 smp_call_function_single+0x8e/0x110() [ 2845.743742] Hardware name: Dell DM051 [ 2845.743742] Modules linked in: btusb netconsole snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss rfcomm l2cap bluetooth vfat fuse snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm pl2303 snd_timer psmouse usbserial snd 3c59x e100 serio_raw soundcore i2c_i801 intel_agp mii agpgart snd_page_alloc rtc_cmos rtc_core thermal processor rtc_lib button thermal_sys sg evdev [ 2845.743742] Pid: 0, comm: swapper Tainted: G D W 2.6.29-rc5-smp #54 [ 2845.743742] Call Trace: [ 2845.743742] [<c012e076>] warn_slowpath+0x86/0xa0 [ 2845.743742] [<c012e000>] ? warn_slowpath+0x10/0xa0 [ 2845.743742] [<c015041b>] ? trace_hardirqs_off+0xb/0x10 [ 2845.743742] [<c0146384>] ? up+0x14/0x40 [ 2845.743742] [<c012e661>] ? release_console_sem+0x31/0x1e0 [ 2845.743742] [<c046c8ab>] ? _spin_lock_irqsave+0x6b/0x80 [ 2845.743742] [<c015041b>] ? trace_hardirqs_off+0xb/0x10 [ 2845.743742] [<c046c900>] ? _read_lock_irqsave+0x40/0x80 [ 2845.743742] [<c012e7f2>] ? release_console_sem+0x1c2/0x1e0 [ 2845.743742] [<c0146384>] ? up+0x14/0x40 [ 2845.743742] [<c015b7be>] smp_call_function_single+0x8e/0x110 [ 2845.743742] [<c010a1a0>] ? stop_this_cpu+0x0/0x40 [ 2845.743742] [<c026d23f>] ? cpumask_next_and+0x1f/0x40 [ 2845.743742] [<c015b95a>] smp_call_function_many+0x11a/0x200 [ 2845.743742] [<c010a1a0>] ? stop_this_cpu+0x0/0x40 [ 2845.743742] [<c015ba61>] smp_call_function+0x21/0x30 [ 2845.743742] [<c01137ae>] native_smp_send_stop+0x1e/0x50 [ 2845.743742] [<c012e0f5>] panic+0x55/0x110 [ 2845.743742] [<c01065a8>] oops_end+0xb8/0xc0 [ 2845.743742] [<c010668f>] die+0x4f/0x70 [ 2845.743742] [<c011a8c9>] do_page_fault+0x269/0x610 [ 2845.743742] [<c011a660>] ? do_page_fault+0x0/0x610 [ 2845.743742] [<c046cbaf>] error_code+0x77/0x7c [ 2845.743742] [<c015515c>] ? __lock_acquire+0x6c/0xa80 [ 2845.743742] [<c0153732>] ? trace_hardirqs_on_caller+0x72/0x1c0 [ 2845.743742] [<c0155be6>] lock_acquire+0x76/0xa0 [ 2845.743742] [<c03e1aad>] ? skb_dequeue+0x1d/0x70 [ 2845.743742] [<c046c885>] _spin_lock_irqsave+0x45/0x80 [ 2845.743742] [<c03e1aad>] ? skb_dequeue+0x1d/0x70 [ 2845.743742] [<c03e1aad>] skb_dequeue+0x1d/0x70 [ 2845.743742] [<c03e1f94>] skb_queue_purge+0x14/0x20 [ 2845.743742] [<f8171f5a>] hci_conn_del+0x10a/0x1c0 [bluetooth] [ 2845.743742] [<f81399c9>] ? l2cap_disconn_ind+0x59/0xb0 [l2cap] [ 2845.743742] [<f81795ce>] ? hci_conn_del_sysfs+0x8e/0xd0 [bluetooth] [ 2845.743742] [<f8175758>] hci_event_packet+0x5f8/0x31c0 [bluetooth] [ 2845.743742] [<c03dfe19>] ? sock_def_readable+0x59/0x80 [ 2845.743742] [<c046c14d>] ? _read_unlock+0x1d/0x20 [ 2845.743742] [<f8178aa9>] ? hci_send_to_sock+0xe9/0x1d0 [bluetooth] [ 2845.743742] [<c015388b>] ? trace_hardirqs_on+0xb/0x10 [ 2845.743742] [<f816fa6a>] hci_rx_task+0x2ba/0x490 [bluetooth] [ 2845.743742] [<c0133661>] ? tasklet_action+0x31/0xc0 [ 2845.743742] [<c013367c>] tasklet_action+0x4c/0xc0 [ 2845.743742] [<c0132eb7>] __do_softirq+0xa7/0x170 [ 2845.743742] [<c0116dec>] ? ack_apic_level+0x5c/0x1c0 [ 2845.743742] [<c0132fd7>] do_softirq+0x57/0x60 [ 2845.743742] [<c01333dc>] irq_exit+0x7c/0x90 [ 2845.743742] [<c01055bb>] do_IRQ+0x4b/0x90 [ 2845.743742] [<c01333d5>] ? irq_exit+0x75/0x90 [ 2845.743742] [<c010392c>] common_interrupt+0x2c/0x34 [ 2845.743742] [<c010a14f>] ? mwait_idle+0x4f/0x70 [ 2845.743742] [<c0101c05>] cpu_idle+0x65/0xb0 [ 2845.743742] [<c045731e>] rest_init+0x4e/0x60 [ 2845.743742] ---[ end trace 4c985b38f02227a1 ]--- [ 2845.743742] Rebooting in 3 seconds.. My logitec bluetooth mouse trying connect to pc, but pc side reject the connection again and again. then panic happens. The reason is due to hci_conn_del_sysfs now called in hci_event_packet, the del work is done in a workqueue, so it's possible done before skb_queue_purge called. I move the hci_conn_del_sysfs after skb_queue_purge just as that before marcel's commit. Remove the hci_conn_del_sysfs in hci_conn_hash_flush as well due to hci_conn_del will deal with the work. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:49 +01:00
Marcel Holtmann	2526d3d8b2	Bluetooth: Permit BT_SECURITY also for L2CAP raw sockets Userspace pairing code can be simplified if it doesn't have to fall back to using L2CAP_LM in the case of L2CAP raw sockets. This patch allows the BT_SECURITY socket option to be used for these sockets. Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:48 +01:00
Marcel Holtmann	37e62f5516	Bluetooth: Fix RFCOMM usage of in-kernel L2CAP sockets The CID value of L2CAP sockets need to be set to zero. All userspace applications do this via memset() on the sockaddr_l2 structure. The RFCOMM implementation uses in-kernel L2CAP sockets and so it has to make sure that l2_cid is set to zero. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:48 +01:00
Marcel Holtmann	2a517ca687	Bluetooth: Disallow usage of L2CAP CID setting for now In the future the L2CAP layer will have full support for fixed channels and right now it already can export the channel assignment, but for the functions bind() and connect() the usage of only CID 0 is allowed. This allows an easy detection if the kernel supports fixed channels or not, because otherwise it would impossible for application to tell. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:47 +01:00
Marcel Holtmann	8bf4794174	Bluetooth: Change RFCOMM to use BT_CONNECT2 for BT_DEFER_SETUP When BT_DEFER_SETUP is enabled on a RFCOMM socket, then switch its current state from BT_OPEN to BT_CONNECT2. This gives the Bluetooth core a unified way to handle L2CAP and RFCOMM sockets. The BT_CONNECT2 state is designated for incoming connections. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:47 +01:00
Marcel Holtmann	d5f2d2be68	Bluetooth: Fix poll() misbehavior when using BT_DEFER_SETUP When BT_DEFER_SETUP has been enabled on a Bluetooth socket it keeps signaling POLLIN all the time. This is a wrong behavior. The POLLIN should only be signaled if the client socket is in BT_CONNECT2 state and the parent has been BT_DEFER_SETUP enabled. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:46 +01:00
Marcel Holtmann	96a3183322	Bluetooth: Set authentication requirement before requesting it The authentication requirement got only updated when the security level increased. This is a wrong behavior. The authentication requirement is read by the Bluetooth daemon to make proper decisions when handling the IO capabilities exchange. So set the value that is currently expected by the higher layers like L2CAP and RFCOMM. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:44 +01:00
Marcel Holtmann	00ae4af91d	Bluetooth: Fix authentication requirements for L2CAP security check The L2CAP layer can trigger the authentication via an ACL connection or later on to increase the security level. When increasing the security level it didn't use the same authentication requirements when triggering a new ACL connection. Make sure that exactly the same authentication requirements are used. The only exception here are the L2CAP raw sockets which are only used for dedicated bonding. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:43 +01:00
Marcel Holtmann	2950f21acb	Bluetooth: Ask upper layers for HCI disconnect reason Some of the qualification tests demand that in case of failures in L2CAP the HCI disconnect should indicate a reason why L2CAP fails. This is a bluntly layer violation since multiple L2CAP connections could be using the same ACL and thus forcing a disconnect reason is not a good idea. To comply with the Bluetooth test specification, the disconnect reason is now stored in the L2CAP connection structure and every time a new L2CAP channel is added it will set back to its default. So only in the case where the L2CAP channel with the disconnect reason is really the last one, it will propagated to the HCI layer. The HCI layer has been extended with a disconnect indication that allows it to ask upper layers for a disconnect reason. The upper layer must not support this callback and in that case it will nicely default to the existing behavior. If an upper layer like L2CAP can provide a disconnect reason that one will be used to disconnect the ACL or SCO link. No modification to the ACL disconnect timeout have been made. So in case of Linux to Linux connection the initiator will disconnect the ACL link before the acceptor side can signal the specific disconnect reason. That is perfectly fine since Linux doesn't make use of this value anyway. The L2CAP layer has a perfect valid error code for rejecting connection due to a security violation. It is unclear why the Bluetooth specification insists on having specific HCI disconnect reason. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:43 +01:00
Marcel Holtmann	f29972de8e	Bluetooth: Add CID field to L2CAP socket address structure In preparation for L2CAP fixed channel support, the CID value of a L2CAP connection needs to be accessible via the socket interface. The CID is the connection identifier and exists as source and destination value. So extend the L2CAP socket address structure with this field and change getsockname() and getpeername() to fill it in. The bind() and connect() functions have been modified to handle L2CAP socket address structures of variable sizes. This makes them future proof if additional fields need to be added. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:42 +01:00
Marcel Holtmann	e1027a7c69	Bluetooth: Request L2CAP fixed channel list if available If the extended features mask indicates support for fixed channels, request the list of available fixed channels. This also enables the fixed channel features bit so remote implementations can request information about it. Currently only the signal channel will be listed. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:42 +01:00
Marcel Holtmann	435fef20ac	Bluetooth: Don't enforce authentication for L2CAP PSM 1 and 3 The recommendation for the L2CAP PSM 1 (SDP) is to not use any kind of authentication or encryption. So don't trigger authentication for incoming and outgoing SDP connections. For L2CAP PSM 3 (RFCOMM) there is no clear requirement, but with Bluetooth 2.1 the initiator is required to enable authentication and encryption first and this gets enforced. So there is no need to trigger an additional authentication step. The RFCOMM service security will make sure that a secure enough link key is present. When the encryption gets enabled after the SDP connection setup, then switch the security level from SDP to low security. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:41 +01:00
Marcel Holtmann	6a8d3010b3	Bluetooth: Fix double L2CAP connection request If the remote L2CAP server uses authentication pending stage and encryption is enabled it can happen that a L2CAP connection request is sent twice due to a race condition in the connection state machine. When the remote side indicates any kind of connection pending, then track this state and skip sending of L2CAP commands for this period. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:41 +01:00
Marcel Holtmann	984947dc64	Bluetooth: Fix race condition with L2CAP information request When two L2CAP connections are requested quickly after the ACL link has been established there exists a window for a race condition where a connection request is sent before the information response has been received. Any connection request should only be sent after an exchange of the extended features mask has been finished. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:41 +01:00
Marcel Holtmann	657e17b03c	Bluetooth: Set authentication requirements if not available When no authentication requirements are selected, but an outgoing or incoming connection has requested any kind of security enforcement, then set these authentication requirements. This ensures that the userspace always gets informed about the authentication requirements (if available). Only when no security enforcement has happened, the kernel will signal invalid requirements. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:40 +01:00
Marcel Holtmann	0684e5f9fb	Bluetooth: Use general bonding whenever possible When receiving incoming connection to specific services, always use general bonding. This ensures that the link key gets stored and can be used for further authentications. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:40 +01:00
Marcel Holtmann	efc7688b55	Bluetooth: Add SCO fallback for eSCO connection attempts When attempting to setup eSCO connections it can happen that some link manager implementations fail to properly negotiate the eSCO parameters and thus fail the eSCO setup. Normally the link manager is responsible for the negotiation of the parameters and actually fallback to SCO if no agreement can be reached. In cases where the link manager is just too stupid, then at least try to establish a SCO link if eSCO fails. For the Bluetooth devices with EDR support this includes handling packet types of EDR basebands. This is particular tricky since for the EDR the logic of enabling/disabling one specific packet type is turned around. This fix contains an extra bitmask to disable eSCO EDR packet when trying to fallback to a SCO connection. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:37 +01:00
Marcel Holtmann	255c76014a	Bluetooth: Don't check encryption for L2CAP raw sockets For L2CAP sockets with medium and high security requirement a missing encryption will enforce the closing of the link. For the L2CAP raw sockets this is not needed, so skip that check. This fixes a crash when pairing Bluetooth 2.0 (and earlier) devices since the L2CAP state machine got confused and then locked up the whole system. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:36 +01:00
Jaikumar Ganesh	6e1031a400	Bluetooth: When encryption is dropped, do not send RFCOMM packets During a role change with pre-Bluetooth 2.1 devices, the remote side drops the encryption of the RFCOMM connection. We allow a grace period for the encryption to be re-established, before dropping the connection. During this grace period, the RFCOMM_SEC_PENDING flag is set. Check this flag before sending RFCOMM packets. Signed-off-by: Jaikumar Ganesh <jaikumar@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:35 +01:00
Dave Young	dd2efd03b4	Bluetooth: Remove CONFIG_DEBUG_LOCK_ALLOC ifdefs Due to lockdep changes, the CONFIG_DEBUG_LOCK_ALLOC ifdef is not needed now. So just remove it here. The following commit fixed the !lockdep build warnings: commit `e8f6fbf62d` Author: Ingo Molnar <mingo@elte.hu> Date: Wed Nov 12 01:38:36 2008 +0000 lockdep: include/linux/lockdep.h - fix warning in net/bluetooth/af_bluetooth.c Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:34 +01:00
Marcel Holtmann	5f9018af00	Bluetooth: Update version numbers With the support for the enhanced security model and the support for deferring connection setup, it is a good idea to increase various version numbers. This is purely cosmetic and has no effect on the behavior, but can be really helpful when debugging problems in different kernel versions. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:34 +01:00
Marcel Holtmann	0588d94fd7	Bluetooth: Restrict application of socket options The new socket options should only be evaluated for SOL_BLUETOOTH level and not for every other level. Previously this causes some minor issues when detecting if a kernel with certain features is available. Also restrict BT_SECURITY to SOCK_SEQPACKET for L2CAP and SOCK_STREAM for the RFCOMM protocol. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:33 +01:00
Marcel Holtmann	f62e4323ab	Bluetooth: Disconnect L2CAP connections without encryption For L2CAP connections with high security setting, the link will be immediately dropped when the encryption gets disabled. For L2CAP connections with medium security there will be grace period where the remote device has the chance to re-enable encryption. If it doesn't happen then the link will also be disconnected. The requirement for the grace period with medium security comes from Bluetooth 2.0 and earlier devices that require role switching. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:33 +01:00
Marcel Holtmann	8c84b83076	Bluetooth: Pause RFCOMM TX when encryption drops A role switch with devices following the Bluetooth pre-2.1 standards or without Encryption Pause and Resume support is not possible if encryption is enabled. Most newer headsets require the role switch, but also require that the connection is encrypted. For connections with a high security mode setting, the link will be immediately dropped. When the connection uses medium security mode setting, then a grace period is introduced where the TX is halted and the remote device gets a change to re-enable encryption after the role switch. If not re-enabled the link will be dropped. Based on initial work by Ville Tervo <ville.tervo@nokia.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:33 +01:00
Marcel Holtmann	9f2c8a03fb	Bluetooth: Replace RFCOMM link mode with security level Change the RFCOMM internals to use the new security levels and remove the link mode details. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:26 +01:00
Marcel Holtmann	2af6b9d518	Bluetooth: Replace L2CAP link mode with security level Change the L2CAP internals to use the new security levels and remove the link mode details. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:26 +01:00
Marcel Holtmann	8c1b235594	Bluetooth: Add enhanced security model for Simple Pairing The current security model is based around the flags AUTH, ENCRYPT and SECURE. Starting with support for the Bluetooth 2.1 specification this is no longer sufficient. The different security levels are now defined as SDP, LOW, MEDIUM and SECURE. Previously it was possible to set each security independently, but this actually doesn't make a lot of sense. For Bluetooth the encryption depends on a previous successful authentication. Also you can only update your existing link key if you successfully created at least one before. And of course the update of link keys without having proper encryption in place is a security issue. The new security levels from the Bluetooth 2.1 specification are now used internally. All old settings are mapped to the new values and this way it ensures that old applications still work. The only limitation is that it is no longer possible to set authentication without also enabling encryption. No application should have done this anyway since this is actually a security issue. Without encryption the integrity of the authentication can't be guaranteed. As default for a new L2CAP or RFCOMM connection, the LOW security level is used. The only exception here are the service discovery sessions on PSM 1 where SDP level is used. To have similar security strength as with a Bluetooth 2.0 and before combination key, the MEDIUM level should be used. This is according to the Bluetooth specification. The MEDIUM level will not require any kind of man-in-the-middle (MITM) protection. Only the HIGH security level will require this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:25 +01:00
Marcel Holtmann	c89b6e6bda	Bluetooth: Fix SCO state handling for incoming connections When the remote device supports only SCO connections, on receipt of the HCI_EV_CONN_COMPLETE event packet, the connect state is changed to BT_CONNECTED, but the socket state is not updated. Hence, the connect() call times out even though the SCO connection has been successfully established. Based on a report by Jaikumar Ganesh <jaikumar@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:25 +01:00
Marcel Holtmann	71aeeaa1fd	Bluetooth: Reject incoming SCO connections without listeners All SCO and eSCO connection are auto-accepted no matter if there is a corresponding listening socket for them. This patch changes this and connection requests for SCO and eSCO without any socket are rejected. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:24 +01:00
Marcel Holtmann	f66dc81f44	Bluetooth: Add support for deferring L2CAP connection setup In order to decide if listening L2CAP sockets should be accept()ed the BD_ADDR of the remote device needs to be known. This patch adds a socket option which defines a timeout for deferring the actual connection setup. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:24 +01:00
Marcel Holtmann	bb23c0ab82	Bluetooth: Add support for deferring RFCOMM connection setup In order to decide if listening RFCOMM sockets should be accept()ed the BD_ADDR of the remote device needs to be known. This patch adds a socket option which defines a timeout for deferring the actual connection setup. The connection setup is done after reading from the socket for the first time. Until then writing to the socket returns ENOTCONN. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:23 +01:00
Marcel Holtmann	c4f912e155	Bluetooth: Add global deferred socket parameter The L2CAP and RFCOMM applications require support for authorization and the ability of rejecting incoming connection requests. The socket interface is not really able to support this. This patch does the ground work for a socket option to defer connection setup. Setting this option allows calling of accept() and then the first read() will trigger the final connection setup. Calling close() would reject the connection. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:23 +01:00
Marcel Holtmann	d58daf42d2	Bluetooth: Preparation for usage of SOL_BLUETOOTH The socket option levels SOL_L2CAP, SOL_RFOMM and SOL_SCO are currently in use by various Bluetooth applications. Going forward the common option level SOL_BLUETOOTH should be used. This patch prepares the clean split of the old and new option levels while keeping everything backward compatibility. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:22 +01:00
Victor Shcherbatyuk	91aa35a5aa	Bluetooth: Fix issue with return value of rfcomm_sock_sendmsg() In case of connection failures the rfcomm_sock_sendmsg() should return an error and not a 0 value. Signed-off-by: Victor Shcherbatyuk <victor.shcherbatyuk@tomtom.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-02-27 06:14:21 +01:00
Pavel Emelyanov	3f53a38131	ipv6: don't use tw net when accounting for recycled tw We already have a valid net in that place, but this is not just a cleanup - the tw pointer can be NULL there sometimes, thus causing an oops in NET_NS=y case. The same place in ipv4 code already works correctly using existing net, rather than tw's one. The bug exists since 2.6.27. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 03:35:13 -08:00
Ingo Molnar	0dcec8c27b	alloc_percpu: add align argument to __alloc_percpu, fix Impact: build fix API was changed, but not all usage sites were converted: net/ipv4/route.c: In function ‘ip_rt_init’: net/ipv4/route.c:3379: error: too few arguments to function ‘__alloc_percpu’ Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-25 14:09:41 +01:00
David S. Miller	f11c179eea	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/orinoco/orinoco.c	2009-02-25 00:02:05 -08:00
Wei Yongjun	bb80087a94	sit: used time_before for comparing jiffies The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-24 23:37:19 -08:00
Wei Yongjun	26d94b46d0	ipip: used time_before for comparing jiffies The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-24 23:36:47 -08:00
Wei Yongjun	da6185d874	gre: used time_before for comparing jiffies The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-24 23:34:48 -08:00
Wei Yongjun	800d55f146	ipv6: Remove some pointless conditionals before kfree_skb() Remove some pointless conditionals before kfree_skb(). The semantic match that finds the problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ expression E; @@ - if (E) - kfree_skb(E); + kfree_skb(E); // </smpl> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-24 23:33:52 -08:00
Pablo Neira Ayuso	1ce85fe402	netlink: change nlmsg_notify() return value logic This patch changes the return value of nlmsg_notify() as follows: If NETLINK_BROADCAST_ERROR is set by any of the listeners and an error in the delivery happened, return the broadcast error; else if there are no listeners apart from the socket that requested a change with the echo flag, return the result of the unicast notification. Thus, with this patch, the unicast notification is handled in the same way of a broadcast listener that has set the NETLINK_BROADCAST_ERROR socket flag. This patch is useful in case that the caller of nlmsg_notify() wants to know the result of the delivery of a netlink notification (including the broadcast delivery) and take any action in case that the delivery failed. For example, ctnetlink can drop packets if the event delivery failed to provide reliable logging and state-synchronization at the cost of dropping packets. This patch also modifies the rtnetlink code to ignore the return value of rtnl_notify() in all callers. The function rtnl_notify() (before this patch) returned the error of the unicast notification which makes rtnl_set_sk_err() reports errors to all listeners. This is not of any help since the origin of the change (the socket that requested the echoing) notices the ENOBUFS error if the notification fails and should resync itself. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-24 23:18:28 -08:00
Joe Perches	a52b8bd338	tcp_scalable: Update malformed & dead url Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-24 16:40:16 -08:00
David S. Miller	8b6f92b1bd	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-02-24 13:49:05 -08:00
Ingo Molnar	0edcf8d692	Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu Conflicts: arch/x86/include/asm/pgtable.h	2009-02-24 21:52:45 +01:00
Eric Dumazet	28337ff543	netfilter: xt_hashlimit fix Commit `784544739a` (netfilter: iptables: lock free counters) broke xt_hashlimit netfilter module : This module was storing a pointer inside its xt_hashlimit_info, and this pointer is not relocated when we temporarly switch tables (iptables -L). This hack is not not needed at all (probably a leftover from ancient time), as each cpu should and can access to its own copy. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-24 15:30:29 +01:00
Josef Drexler	325fb5b4d2	netfilter: xt_recent: fix proc-file addition/removal of IPv4 addresses Fix regression introduded by commit `079aa88` (netfilter: xt_recent: IPv6 support): From http://bugzilla.kernel.org/show_bug.cgi?id=12753: Problem Description: An uninitialized buffer causes IPv4 addresses added manually (via the +IP command to the proc interface) to never match any packets. Similarly, the -IP command fails to remove IPv4 addresses. Details: In the function recent_entry_lookup, the xt_recent module does comparisons of the entire nf_inet_addr union value, both for IPv4 and IPv6 addresses. For addresses initialized from actual packets the remaining 12 bytes not occupied by the IPv4 are zeroed so this works correctly. However when setting the nf_inet_addr addr variable in the recent_mt_proc_write function, only the IPv4 bytes are initialized and the remaining 12 bytes contain garbage. Hence addresses added in this way never match any packets, unless these uninitialized 12 bytes happened to be zero by coincidence. Similarly, addresses cannot consistently be removed using the proc interface due to mismatch of the garbage bytes (although it will sometimes work to remove an address that was added manually). Reading the /proc/net/xt_recent/ entries hides this problem because this only uses the first 4 bytes when displaying IPv4 addresses. Steps to reproduce: $ iptables -I INPUT -m recent --rcheck -j LOG $ echo +169.254.156.239 > /proc/net/xt_recent/DEFAULT $ cat /proc/net/xt_recent/DEFAULT src=169.254.156.239 ttl: 0 last_seen: 119910 oldest_pkt: 1 119910 [At this point no packets from 169.254.156.239 are being logged.] $ iptables -I INPUT -s 169.254.156.239 -m recent --set $ cat /proc/net/xt_recent/DEFAULT src=169.254.156.239 ttl: 0 last_seen: 119910 oldest_pkt: 1 119910 src=169.254.156.239 ttl: 255 last_seen: 126184 oldest_pkt: 4 125434, 125684, 125934, 126184 [At this point, adding the address via an iptables rule, packets are being logged correctly.] $ echo -169.254.156.239 > /proc/net/xt_recent/DEFAULT $ cat /proc/net/xt_recent/DEFAULT src=169.254.156.239 ttl: 0 last_seen: 119910 oldest_pkt: 1 119910 src=169.254.156.239 ttl: 255 last_seen: 126992 oldest_pkt: 10 125434, 125684, 125934, 126184, 126434, 126684, 126934, 126991, 126991, 126992 $ echo -169.254.156.239 > /proc/net/xt_recent/DEFAULT $ cat /proc/net/xt_recent/DEFAULT src=169.254.156.239 ttl: 0 last_seen: 119910 oldest_pkt: 1 119910 src=169.254.156.239 ttl: 255 last_seen: 126992 oldest_pkt: 10 125434, 125684, 125934, 126184, 126434, 126684, 126934, 126991, 126991, 126992 [Removing the address via /proc interface failed evidently.] Possible solutions: - initialize the addr variable in recent_mt_proc_write - compare only 4 bytes for IPv4 addresses in recent_entry_lookup Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-24 14:53:12 +01:00
Pablo Neira Ayuso	7d1e04598e	netfilter: nf_conntrack: account packets drop by tcp_packet() Since tcp_packet() may return -NF_DROP in two situations, the packet-drop stats must be increased. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-24 14:48:01 +01:00
David S. Miller	e70049b9e7	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	2009-02-24 03:50:29 -08:00
Jesper Dangaard Brouer	d18921a0e3	Doc: Refer to ip-sysctl.txt for strict vs. loose rp_filter mode The IP_ADVANCED_ROUTER Kconfig describes the rp_filter proc option. Recent changes added a loose mode. Instead of documenting this change too places, refer to the document describing it: Documentation/networking/ip-sysctl.txt I'm considering moving the rp_filter description away from the Kconfig file into ip-sysctl.txt. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-24 03:47:42 -08:00
Linus Torvalds	f7e603ad8f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: amend the fix for SO_BSDCOMPAT gsopt infoleak netns: build fix for net_alloc_generic	2009-02-23 20:29:21 -08:00
Eugene Teo	50fee1dec5	net: amend the fix for SO_BSDCOMPAT gsopt infoleak The fix for CVE-2009-0676 (upstream commit `df0bca04`) is incomplete. Note that the same problem of leaking kernel memory will reappear if someone on some architecture uses struct timeval with some internal padding (for example tv_sec 64-bit and tv_usec 32-bit) --- then, you are going to leak the padded bytes to userspace. Signed-off-by: Eugene Teo <eugeneteo@kernel.sg> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-23 15:38:41 -08:00
Clemens Noss	ebe47d47b7	netns: build fix for net_alloc_generic net_alloc_generic was defined in #ifdef CONFIG_NET_NS, but used unconditionally. Move net_alloc_generic out of #ifdef. Signed-off-by: Clemens Noss <cnoss@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-23 15:37:35 -08:00
Linus Torvalds	d38e84ee39	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netns: fix double free at netns creation veth : add the set_mac_address capability sunlance: Beyond ARRAY_SIZE of ib->btx_ring sungem: another error printed one too early ISDN: fix sc/shmem printk format warning SMSC: timeout reaches -1 smsc9420: handle magic field of ethtool_eeprom sundance: missing parentheses? smsc9420: fix another postfixed timeout wimax/i2400m: driver loads firmware v1.4 instead of v1.3 vlan: Update skb->mac_header in __vlan_put_tag(). cxgb3: Add support for PCI ID 0x35. tcp: remove obsoleted comment about different passes TG3: &&/\|\| confusion ATM: misplaced parentheses? net/mv643xx: don't disable the mib timer too early and lock properly net/mv643xx: use GFP_ATOMIC while atomic atl1c: Atheros L1C Gigabit Ethernet driver net: Kill skb_truesize_check(), it only catches false-positives. net: forcedeth: Fix wake-on-lan regression	2009-02-23 14:36:05 -08:00
Eric W. Biederman	ce16c5337a	netns: Remove net_alive It turns out that net_alive is unnecessary, and the original problem that led to it being added was simply that the icmp code thought it was a network device and wound up being unable to handle packets while there were still packets in the network namespace. Now that icmp and tcp have been fixed to properly register themselves this problem is no longer present and we have a stronger guarantee that packets will not arrive in a network namespace then that provided by net_alive in netif_receive_skb. So remove net_alive allowing packet reception run a little faster. Additionally document the strong reason why network namespace cleanup is safe so that if something happens again someone else will have a chance of figuring it out. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:50 -08:00
Eric W. Biederman	6a1b3054d9	tcp: Like icmp use register_pernet_subsys To remove the possibility of packets flying around when network devices are being cleaned up use reisger_pernet_subsys instead of register_pernet_device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:49 -08:00
Eric W. Biederman	959d272649	netns: Fix icmp shutdown. Recently I had a kernel panic in icmp_send during a network namespace cleanup. There were packets in the arp queue that failed to be sent and we attempted to generate an ICMP host unreachable message, but failed because icmp_sk_exit had already been called. The network devices are removed from a network namespace and their arp queues are flushed before we do attempt to shutdown subsystems so this error should have been impossible. It turns out icmp_init is using register_pernet_device instead of register_pernet_subsys. Which resulted in icmp being shut down while we still had the possibility of packets in flight, making a nasty NULL pointer deference in interrupt context possible. Changing this to register_pernet_subsys fixes the problem in my testing. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:48 -08:00
Jesper Dangaard Brouer	a6e8f27f3c	ipv4: Clean whitespaces in net/ipv4/Kconfig. While going through net/ipv4/Kconfig cleanup whitespaces. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:48 -08:00
Jesper Dangaard Brouer	b2cc46a8ee	ipv4: Fix rp_filter description in net/ipv4/Kconfig. The reverse path filter (rp_filter) will NOT get enabled when enabling forwarding. Read the code and tested in in practice. Most distributions do enable it in startup scripts. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:47 -08:00
Stephen Hemminger	0117cfabe3	snap: handle registration error and compile warning If this module can't load, it is almost certainly because something else is already bound to that SAP. So in that case, return the same error code as other SAP usage, and fail the module load. Also fixes a compiler warning about printk of non const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:47 -08:00
Stephen Hemminger	01af4a0e3c	llc: fix non-const printk warning Mark some strings as const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:46 -08:00
Stephen Hemminger	5747a1aacd	ip: ipip compile warning Get rid of compile warning about non-const format Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:45 -08:00
Stephen Hemminger	c1cf8422f0	ip: add loose reverse path filtering Extend existing reverse path filter option to allow strict or loose filtering. (See http://en.wikipedia.org/wiki/Reverse_path_filtering). For compatibility with existing usage, the value 1 is chosen for strict mode and 2 for loose mode. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 19:54:45 -08:00
Paul Moore	586c250037	cipso: Fix documentation comment The CIPSO protocol engine incorrectly stated that the FIPS-188 specification could be found in the kernel's Documentation directory. This patch corrects that by removing the comment and directing users to the FIPS-188 documented hosted online. For the sake of completeness I've also included a link to the CIPSO draft specification on the NetLabel website. Thanks to Randy Dunlap for spotting the error and letting me know. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-02-23 10:05:54 +11:00
Daniel Lezcano	486a87f1e5	netns: fix double free at netns creation This patch fix a double free when a network namespace fails. The previous code does a kfree of the net_generic structure when one of the init subsystem initialization fails. The 'setup_net' function does kfree(ng) and returns an error. The caller, 'copy_net_ns', call net_free on error, and this one calls kfree(net->gen), making this pointer freed twice. This patch make the code symetric, the net_alloc does the net_generic allocation and the net_free frees the net_generic. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-22 00:07:53 -08:00
Herbert Xu	7691367d71	tcp: Always set urgent pointer if it's beyond snd_nxt Our TCP stack does not set the urgent flag if the urgent pointer does not fit in 16 bits, i.e., if it is more than 64K from the sequence number of a packet. This behaviour is different from the BSDs, and clearly contradicts the purpose of urgent mode, which is to send the notification (though not necessarily the associated data) as soon as possible. Our current behaviour may in fact delay the urgent notification indefinitely if the receiver window does not open up. Simply matching BSD however may break legacy applications which incorrectly rely on the out-of-band delivery of urgent data, and conversely the in-band delivery of non-urgent data. Alexey Kuznetsov suggested a safe solution of following BSD only if the urgent pointer itself has not yet been transmitted. This way we guarantee that when the remote end sees the packet with non-urgent data marked as urgent due to wrap-around we would have advanced the urgent pointer beyond, either to the actual urgent data or to an as-yet untransmitted packet. The only potential downside is that applications on the remote end may see multiple SIGURG notifications. However, this would occur anyway with other TCP stacks. More importantly, the outcome of such a duplicate notification is likely to be harmless since the signal itself does not carry any information other than the fact that we're in urgent mode. Thanks to Ilpo Järvinen for fixing a critical bug in this and Jeff Chua for reporting that bug. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-21 23:52:29 -08:00
Hannes Eder	66da8c529a	ipv6: fix sparse warning: Using plain integer as NULL pointer Fix this sparse warning: net/ipv6/xfrm6_state.c:72:26: warning: Using plain integer as NULL pointer Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-21 23:37:10 -08:00
Patrick Ohly	cd4d8fdad1	net: kernel panic in dev_hard_start_xmit: remove faulty software TX time stamping The current implementation of the TX software time stamping fallback is faulty because it accesses the skb after ndo_start_xmit() returns successfully. This patch removes the fallback, which fixes kernel panics seen during stress tests. Hardware time stamping is not affected by this removal. Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-21 02:42:18 -08:00
Eric Dumazet	08361aa807	netfilter: ip_tables: unfold two critical loops in ip_packet_match() While doing oprofile tests I noticed two loops are not properly unrolled by gcc Using a hand coded unrolled loop provides nice speedup : ipt_do_table credited of 2.52 % of cpu instead of 3.29 % in tbench. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-20 11:03:33 +01:00
Adam Nielsen	268cb38e18	netfilter: x_tables: add LED trigger target Kernel module providing implementation of LED netfilter target. Each instance of the target appears as a led-trigger device, which can be associated with one or more LEDs in /sys/class/leds/ Signed-off-by: Adam Nielsen <a.nielsen@shikadi.net> Acked-by: Richard Purdie <rpurdie@linux.intel.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-20 10:55:14 +01:00
Hagen Paul Pfeifer	af07d241dc	netfilter: fix hardcoded size assumptions get_random_bytes() is sometimes called with a hard coded size assumption of an integer. This could not be true for next centuries. This patch replace it with a compile time statement. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-20 10:48:06 +01:00
Hagen Paul Pfeifer	e478075c6f	netfilter: nf_conntrack: table max size should hold at least table size Table size is defined as unsigned, wheres the table maximum size is defined as a signed integer. The calculation of max is 8 or 4, multiplied the table size. Therefore the max value is aligned to unsigned. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-20 10:47:09 +01:00
Stephen Hemminger	784544739a	netfilter: iptables: lock free counters The reader/writer lock in ip_tables is acquired in the critical path of processing packets and is one of the reasons just loading iptables can cause a 20% performance loss. The rwlock serves two functions: 1) it prevents changes to table state (xt_replace) while table is in use. This is now handled by doing rcu on the xt_table. When table is replaced, the new table(s) are put in and the old one table(s) are freed after RCU period. 2) it provides synchronization when accesing the counter values. This is now handled by swapping in new table_info entries for each cpu then summing the old values, and putting the result back onto one cpu. On a busy system it may cause sampling to occur at different times on each cpu, but no packet/byte counts are lost in the process. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Sucessfully tested on my dual quad core machine too, but iptables only (no ipv6 here) BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago) Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-02-20 10:35:32 +01:00
Pablo Neira Ayuso	be0c22a46c	netlink: add NETLINK_BROADCAST_ERROR socket option This patch adds NETLINK_BROADCAST_ERROR which is a netlink socket option that the listener can set to make netlink_broadcast() return errors in the delivery to the caller. This option is useful if the caller of netlink_broadcast() do something with the result of the message delivery, like in ctnetlink where it drops a network packet if the event delivery failed, this is used to enable reliable logging and state-synchronization. If this socket option is not set, netlink_broadcast() only reports ESRCH errors and silently ignore ENOBUFS errors, which is what most netlink_broadcast() callers should do. This socket option is based on a suggestion from Patrick McHardy. Patrick McHardy can exchange this patch for a beer from me ;). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-20 01:01:08 -08:00
Santwona Behera	59089d8d16	ethtool: Add RX pkt classification interface Signed-off-by: Santwona Behera <santwona.behera@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-20 00:58:13 -08:00

... 5 6 7 8 9 ...

12273 Commits