linux

Commit Graph

Author	SHA1	Message	Date
Jay Vosburgh	6146b1a4da	bonding: Fix ALB mode to balance traffic on VLANs The current ALB function that processes incoming ARPs does not handle traffic for VLANs configured above bonding. This causes traffic on those VLANs to all be assigned the same slave. This patch corrects that misbehavior by locating the bonding interface nested below the VLAN interface. Bug reported by Sven Anders <anders@anduras.de>, who also tested an earlier version of this patch and confirmed that it resolved the problem. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-06 00:49:40 -05:00
Brian Haley	305d552acc	bonding: send IPv6 neighbor advertisement on failover This patch adds better IPv6 failover support for bonding devices, especially when in active-backup mode and there are only IPv6 addresses configured, as reported by Alex Sidorenko. - Creates a new file, net/drivers/bonding/bond_ipv6.c, for the IPv6-specific routines. Both regular bonds and VLANs over bonds are supported. - Adds a new tunable, num_unsol_na, to limit the number of unsolicited IPv6 Neighbor Advertisements that are sent on a failover event. Default is 1. - Creates two new IPv6 neighbor discovery functions: ndisc_build_skb() ndisc_send_skb() These were required to support VLANs since we have to be able to add the VLAN id to the skb since ndisc_send_na() and friends shouldn't be asked to do this. These two routines are basically __ndisc_send() split into two pieces, in a slightly different order. - Updates Documentation/networking/bonding.txt and bumps the rev of bond support to 3.4.0. On failover, this new code will generate one packet: - An unsolicited IPv6 Neighbor Advertisement, which helps the switch learn that the address has moved to the new slave. Testing has shown that sending just the NA results in pretty good behavior when in active-back mode, I saw no lost ping packets for example. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-06 00:49:37 -05:00
Jarek Poplawski	61c9eaf900	pkt_sched: Fix qdisc len in qdisc_peek_dequeued() A packet dequeued and stored as gso_skb in qdisc_peek_dequeued() should be seen as part of the queue for sch->q.qlen queries until it's really dequeued with qdisc_dequeue_peeked(), so qlen needs additional updating in these functions. (Updating qstats.backlog shouldn't matter here.) Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 16:02:34 -08:00
Eric W. Biederman	0a36b345ab	net: Don't leak packets when a netns is going down I have been tracking for a while a case where when the network namespace exits the cleanup gets stck in an endless precessess of: unregister_netdevice: waiting for lo to become free. Usage count = 3 unregister_netdevice: waiting for lo to become free. Usage count = 3 unregister_netdevice: waiting for lo to become free. Usage count = 3 unregister_netdevice: waiting for lo to become free. Usage count = 3 unregister_netdevice: waiting for lo to become free. Usage count = 3 unregister_netdevice: waiting for lo to become free. Usage count = 3 unregister_netdevice: waiting for lo to become free. Usage count = 3 It turns out that if you listen on a multicast address an unsubscribe packet is sent when the network device goes down. If you shutdown the network namespace without carefully cleaning up this can trigger the unsubscribe packet to be sent over the loopback interface while the network namespace is going down. All of which is fine except when we drop the packet and forget to free it leaking the skb and the dst entry attached to. As it turns out the dst entry hold a reference to the idev which holds the dev and keeps everything from being cleaned up. Yuck! By fixing my earlier thinko and add the needed kfree_skb and everything cleans up beautifully. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 16:00:24 -08:00
Eric W. Biederman	ae33bc40c0	net: Guaranetee the proper ordering of the loopback device. I was recently hunting a bug that occurred in network namespace cleanup. In looking at the code it became apparrent that we have and will continue to have cases where if we have anything going on in a network namespace there will be assumptions that the loopback device is present. Things like sending igmp unsubscribe messages when we bring down network devices invokes the routing code which assumes that at least the loopback driver is present. Therefore to avoid magic initcall ordering hackery that is hard to follow and hard to get right insert a call to register the loopback device directly from net_dev_init(). This guarantes that the loopback device is the first device registered and the last network device to go away. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 16:00:02 -08:00
Eric W. Biederman	d0c082cea6	netns: Delete virtual interfaces during namespace cleanup When physical devices are inside of network namespace and that network namespace terminates we can not make them go away. We have to keep them and moving them to the initial network namespace is the best we can do. For virtual devices left in a network namespace that is exiting we have no need to preserve them and we now have the infrastructure that allows us to delete them. So delete virtual devices when we exit a network namespace. Keeping the necessary user space clean up after a network namespace exits much more tractable. Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 15:59:38 -08:00
Geert Uytterhoeven	dc8a0843a4	[JFFS2] fix race condition in jffs2_lzo_compress() deflate_mutex protects the globals lzo_mem and lzo_compress_buf. However, jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the compressed data from lzo_compress_buf. Correct this by moving the mutex unlock after the copy. In addition, document what deflate_mutex actually protects. Cc: stable@kernel.org Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Richard Purdie <rpurdie@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-11-05 23:22:02 +01:00
Randy Dunlap	b0d5fdef52	net/9p: fix printk format warnings Fix printk format warnings in net/9p. Built cleanly on 7 arches. net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-11-05 13:19:07 -06:00
Roel Kluin	9f3e9bbe62	unsigned fid->fid cannot be negative Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-11-05 13:19:07 -06:00
Huang Weiyi	1558c62149	9p: rdma: remove duplicated #include Removed duplicated #include <rdma/ib_verbs.h> in net/9p/trans_rdma.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-11-05 13:19:07 -06:00
Tom Tucker	45abdf1c7b	p9: Fix leak of waitqueue in request allocation path If a T or R fcall cannot be allocated, the function returns an error but neglects to free the wait queue that was successfully allocated. If it comes through again a second time this wq will be overwritten with a new allocation and the old allocation will be leaked. Also, if the client is subsequently closed, the close path will attempt to clean up these allocations, so set the req fields to NULL to avoid duplicate free. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-11-05 13:19:06 -06:00
Tom Tucker	82b189eaaf	9p: Remove unneeded free of fcall for Flush T and R fcall are reused until the client is destroyed. There does not need to be a special case for Flush Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-11-05 13:19:06 -06:00
Tom Tucker	cac23d6505	9p: Make all client spin locks IRQ safe The client lock must be IRQ safe. Some of the lock acquisition paths took regular spin locks. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-11-05 13:19:06 -06:00
Tom Tucker	517ac45af4	9p: rdma: Set trans prior to requesting async connection ops The RDMA connection manager is fundamentally asynchronous. Since the async callback context is the client pointer, the transport in the client struct needs to be set prior to calling the first async op. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-11-05 13:19:06 -06:00
Ingo Molnar	9fcd18c9e6	sched: re-tune balancing Impact: improve wakeup affinity on NUMA systems, tweak SMP systems Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain balancing defaults on NUMA and SMP systems. Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason why we would not want to have wakeup affinity across nodes as well. (we already do this in the standard NUMA template.) lat_ctx on a NUMA box is particularly happy about this change: before: \| phoenix:~/l> ./lat_ctx -s 0 2 \| "size=0k ovr=2.60 \| 2 5.70 after: \| phoenix:~/l> ./lat_ctx -s 0 2 \| "size=0k ovr=2.65 \| 2 2.07 a 2.75x speedup. pipe-test is similarly happy about it too: \| phoenix:~/sched-tests> ./pipe-test \| 18.26 usecs/loop. \| 14.70 usecs/loop. \| 14.38 usecs/loop. \| 10.55 usecs/loop. # +WAKE_AFFINE on domain0+domain1 \| 8.63 usecs/loop. \| 8.59 usecs/loop. \| 9.03 usecs/loop. \| 8.94 usecs/loop. \| 8.96 usecs/loop. \| 8.63 usecs/loop. Also: - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings) - enable SD_WAKE_BALANCE on SMP domains Sysbench+postgresql improves all around the board, quite significantly: .28-rc3-11474e2c .28-rc3-11474e2c-tune ------------------------------------------------- 1: 571 688 +17.08% 2: 1236 1206 -2.55% 4: 2381 2642 +9.89% 8: 4958 5164 +3.99% 16: 9580 9574 -0.07% 32: 7128 8118 +12.20% 64: 7342 8266 +11.18% 128: 7342 8064 +8.95% 256: 7519 7884 +4.62% 512: 7350 7731 +4.93% ------------------------------------------------- SUM: 55412 59341 +6.62% So it's a win both for the runup portion, the peak area and the tail. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-05 18:04:38 +01:00
Eric W. Biederman	467622ef2a	[MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4) For "unlock" cycles to 16bit devices in 8bit compatibility mode we need to use the byte addresses 0xaaa and 0x555. These effectively match the word address 0x555 and 0x2aa, except the latter has its low bit set. Most chips don't care about the value of the 'A-1' pin in x8 mode, but some -- like the ST M29W320D -- do. So we need to be careful to set it where appropriate. cfi_send_gen_cmd is only ever passed addresses where the low byte is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are affected by this patch, by masking in the extra low bit when the device is known to be in compatibility mode. [dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa] v4: Fix stupid typo in cfi_build_cmd_addr that failed to compile I'm writing this patch way to late at night. v3: Bring all of the work back into cfi_build_cmd_addr including calling of map_bankwidth(map) and cfi_interleave(cfi) So every caller doesn't need to. v2: Only modified the address if we our device_type is larger than our bus width. Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-11-05 14:40:25 +01:00
David S. Miller	518a09ef11	tcp: Fix recvmsg MSG_PEEK influence of blocking behavior. Vito Caputo noticed that tcp_recvmsg() returns immediately from partial reads when MSG_PEEK is used. In particular, this means that SO_RCVLOWAT is not respected. Simply remove the test. And this matches the behavior of several other systems, including BSD. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 03:36:01 -08:00
Alexey Dobriyan	efb9a8c28c	netfilter: netns ct: walk netns list under RTNL netns list (just list) is under RTNL. But helper and proto unregistration happen during rmmod when RTNL is not held, and that's how it was tested: modprobe/rmmod vs clone(CLONE_NEWNET)/exit. BUG: unable to handle kernel paging request at 0000000000100100 <=== IP: [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack] PGD 15e300067 PUD 15e1d8067 PMD 0 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC last sysfs file: /sys/kernel/uevent_seqnum CPU 0 Modules linked in: nf_conntrack_proto_sctp(-) nf_conntrack_proto_dccp(-) af_packet iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 sr_mod cdrom [last unloaded: nf_conntrack_proto_sctp] Pid: 16758, comm: rmmod Not tainted 2.6.28-rc2-netns-xfrm #3 RIP: 0010:[<ffffffffa009890f>] [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack] RSP: 0018:ffff88015dc1fec8 EFLAGS: 00010212 RAX: 0000000000000000 RBX: 00000000001000f8 RCX: 0000000000000000 RDX: ffffffffa009575c RSI: 0000000000000003 RDI: ffffffffa00956b5 RBP: ffff88015dc1fed8 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: ffff88015dc1fe48 R12: ffffffffa0458f60 R13: 0000000000000880 R14: 00007fff4c361d30 R15: 0000000000000880 FS: 00007f624435a6f0(0000) GS:ffffffff80521580(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000100100 CR3: 0000000168969000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process rmmod (pid: 16758, threadinfo ffff88015dc1e000, task ffff880179864218) Stack: ffffffffa0459100 0000000000000000 ffff88015dc1fee8 ffffffffa0457934 ffff88015dc1ff78 ffffffff80253fef 746e6e6f635f666e 6f72705f6b636172 00707463735f6f74 ffffffff8024cb30 00000000023b8010 0000000000000000 Call Trace: [<ffffffffa0457934>] nf_conntrack_proto_sctp_fini+0x10/0x1e [nf_conntrack_proto_sctp] [<ffffffff80253fef>] sys_delete_module+0x19f/0x1fe [<ffffffff8024cb30>] ? trace_hardirqs_on_caller+0xf0/0x114 [<ffffffff803ea9b2>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff8020b52b>] system_call_fastpath+0x16/0x1b Code: 13 35 e0 e8 c4 6c 1a e0 48 8b 1d 6d c6 46 e0 eb 16 48 89 df 4c 89 e2 48 c7 c6 fc 85 09 a0 e8 61 cd ff ff 48 8b 5b 08 48 83 eb 08 <48> 8b 43 08 0f 18 08 48 8d 43 08 48 3d 60 4f 50 80 75 d3 5b 41 RIP [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack] RSP <ffff88015dc1fec8> CR2: 0000000000100100 ---[ end trace bde8ac82debf7192 ]--- Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 03:03:18 -08:00
Benjamin Thery	e3ec6cfc26	ipv6: fix run pending DAD when interface becomes ready With some net devices types, an IPv6 address configured while the interface was down can stay 'tentative' forever, even after the interface is set up. In some case, pending IPv6 DADs are not executed when the device becomes ready. I observed this while doing some tests with kvm. If I assign an IPv6 address to my interface eth0 (kvm driver rtl8139) when it is still down then the address is flagged tentative (IFA_F_TENTATIVE). Then, I set eth0 up, and to my surprise, the address stays 'tentative', no DAD is executed and the address can't be pinged. I also observed the same behaviour, without kvm, with virtual interfaces types macvlan and veth. Some easy steps to reproduce the issue with macvlan: 1. ip link add link eth0 type macvlan 2. ip -6 addr add 2003::ab32/64 dev macvlan0 3. ip addr show dev macvlan0 ... inet6 2003::ab32/64 scope global tentative ... 4. ip link set macvlan0 up 5. ip addr show dev macvlan0 ... inet6 2003::ab32/64 scope global tentative ... Address is still tentative I think there's a bug in net/ipv6/addrconf.c, addrconf_notify(): addrconf_dad_run() is not always run when the interface is flagged IF_READY. Currently it is only run when receiving NETDEV_CHANGE event. Looks like some (virtual) devices doesn't send this event when becoming up. For both NETDEV_UP and NETDEV_CHANGE events, when the interface becomes ready, run_pending should be set to 1. Patch below. 'run_pending = 1' could be moved below the if/else block but it makes the code less readable. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 01:43:57 -08:00
Eric Dumazet	270acefafe	net: sk_free_datagram() should use sk_mem_reclaim_partial() I noticed a contention on udp_memory_allocated on regular UDP applications. While tcp_memory_allocated is seldom used, it appears each incoming UDP frame is currently touching udp_memory_allocated when queued, and when received by application. One possible solution is to use sk_mem_reclaim_partial() instead of sk_mem_reclaim(), so that we keep a small reserve (less than one page) of memory for each UDP socket. We did something very similar on TCP side in commit `9993e7d313` ([TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer()) A more complex solution would need to convert prot->memory_allocated to use a percpu_counter with batches of 64 or 128 pages. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 01:38:06 -08:00
Randy Dunlap	b22cecdd8f	net/9p: fix printk format warnings Fix printk format warnings in net/9p. Built cleanly on 7 arches. net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-05 01:35:55 -08:00
Peter Zijlstra	02479099c2	sched: fix buddies for group scheduling Impact: scheduling order fix for group scheduling For each level in the hierarchy, set the buddy to point to the right entity. Therefore, when we do the hierarchical schedule, we have a fair chance of ending up where we meant to. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-05 10:30:15 +01:00
Peter Zijlstra	4793241be4	sched: backward looking buddy Impact: improve/change/fix wakeup-buddy scheduling Currently we only have a forward looking buddy, that is, we prefer to schedule to the task we last woke up, under the presumption that its going to consume the data we just produced, and therefore will have cache hot benefits. This allows co-waking producer/consumer task pairs to run ahead of the pack for a little while, keeping their cache warm. Without this, we would interleave all pairs, utterly trashing the cache. This patch introduces a backward looking buddy, that is, suppose that in the above scenario, the consumer preempts the producer before it can go to sleep, we will therefore miss the wakeup from consumer to producer (its already running, after all), breaking the cycle and reverting to the cache-trashing interleaved schedule pattern. The backward buddy will try to schedule back to the task that woke us up in case the forward buddy is not available, under the assumption that the last task will be the one with the most cache hot task around barring current. This will basically allow a task to continue after it got preempted. In order to avoid starvation, we allow either buddy to get wakeup_gran ahead of the pack. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-05 10:30:14 +01:00
Peter Zijlstra	d95f98d069	sched: fix fair preempt check Impact: fix cross-class preemption Inter-class wakeup preemptions should go on class order. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-05 10:30:13 +01:00
Peter Zijlstra	f4b6755fb3	sched: cleanup fair task selection Impact: cleanup Clean up task selection Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-05 10:30:13 +01:00
Stephen Rothwell	454666eb78	powerpc: Fix "unused variable" warning in pci_dlpar.c This gets rid of this build warning: arch/powerpc/platforms/pseries/pci_dlpar.c: In function 'init_phb_dynamic': arch/powerpc/platforms/pseries/pci_dlpar.c:192: warning: unused variable 'b' This is one of the very few warnings left in a ppc64_defconfig build and getting rid of it will make it easier to see future introduced ones (in fact this was introduced very recently). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-05 19:59:08 +11:00
Alexey Dobriyan	9c8b4aff18	powerpc/cell: Fix compile error in ras.c This fixes this error on Cell when CONFIG_KEXEC = n: arch/powerpc/platforms/cell/ras.c:299: error: implicit declaration of function 'crash_shutdown_register' We have to include <asm/kexec.h> because it contains the dummy definition of crash_shutdown_register that is used when CONFIG_KEXEC=n, but <linux/kexec.h> doesn't include <asm/kexec.h> in that case. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-05 19:59:08 +11:00
Alexey Dobriyan	fce4d58353	powerpc/ps3: Fix compile error in ps3-lpm.c Compiling with CONFIG_SMP = n and CONFIG_PS3_LPM != n gives this error: drivers/ps3/ps3-lpm.c:838: error: implicit declaration of function 'get_hard_smp_processor_id' This fixes it. We have to include <asm/smp.h> rather than <linux/smp.h> because the UP definition of get_hard_smp_processor_id() is in <asm/smp.h>, and <linux/smp.h> only includes <asm/smp.h> if CONFIG_SMP = y. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-05 19:59:08 +11:00
Gerrit Renker	d99a7bd210	dccp: Cleanup routines for feature negotiation This inserts the required de-allocation routines for memory allocated by feature negotiation in the socket destructors, replacing dccp_feat_clean() in one instance. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:56:30 -08:00
Gerrit Renker	ac75773c27	dccp: Per-socket initialisation of feature negotiation This provides feature-negotiation initialisation for both DCCP sockets and DCCP request_sockets, to support feature negotiation during connection setup. It also resolves a FIXME regarding the congestion control initialisation. Thanks to Wei Yongjun for help with the IPv6 side of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:55:49 -08:00
Gerrit Renker	61e6473efb	dccp: List management for new feature negotiation This adds list initial fields and list management functions for the new feature negotiation implementation. Thanks to Arnaldo for suggestions and improvements. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:54:04 -08:00
Gerrit Renker	7d43d1a0f2	dccp: Implement lookup table for feature-negotiation information A lookup table for feature-negotiation information, extracted from RFC 4340/42, is provided by this patch. All currently known features can be found in this table, along with their feature location, their default value, and type. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:43:47 -08:00
Gerrit Renker	bd012f2e7b	dccp: Basic data structure for feature negotiation This patch prepares for the new and extended feature-negotiation routines. The following feature-negotiation data structures are provided: * a container for the various (SP or NN) values, * symbolic state names to track feature states, * an entry struct which holds all current information together, * elementary functions to fill in and process these structures. Entry structs are arranged as FIFO for the following reason: RFC 4340 specifies that if multiple options of the same type are present, they are processed in the order of their appearance in the packet; which means that this order needs to be preserved in the local data structure (the later insertion code also respects this order). The struct list_head has been chosen for the following reasons: the most frequent operations are * add new entry at tail (when receiving Change or setting socket options); * delete entry (when Confirm has been received); * deep copy of entire list (cloning from listening socket onto request socket). The NN value has been set to 64 bit, which is a currently sufficient upper limit (Sequence Window feature has 48 bit). Thanks to Arnaldo, who contributed the streamlined layout of the entry struct. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 23:38:20 -08:00
Patrick McHardy	9b22ea5609	net: fix packet socket delivery in rx irq handler The changes to deliver hardware accelerated VLAN packets to packet sockets (commit `bc1d0411`) caused a warning for non-NAPI drivers. The __vlan_hwaccel_rx() function is called directly from the drivers RX function, for non-NAPI drivers that means its still in RX IRQ context: [ 27.779463] ------------[ cut here ]------------ [ 27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81() ... [ 27.782520] [<c0264755>] netif_nit_deliver+0x5b/0x75 [ 27.782590] [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162 [ 27.782664] [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1] [ 27.782738] [<c0155b17>] handle_IRQ_event+0x23/0x51 [ 27.782808] [<c015692e>] handle_edge_irq+0xc2/0x102 [ 27.782878] [<c0105fd5>] do_IRQ+0x4d/0x64 Split hardware accelerated VLAN reception into two parts to fix this: - __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN device lookup, then calls netif_receive_skb()/netif_rx() - vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb() in softirq context, performs the real reception and delivery to packet sockets. Reported-and-tested-by: Ramon Casellas <ramon.casellas@cttc.es> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 14:49:57 -08:00
Andreas Steffen	79654a7698	xfrm: Have af-specific init_tempsel() initialize family field of temporary selector While adding MIGRATE support to strongSwan, Andreas Steffen noticed that the selectors provided in XFRM_MSG_ACQUIRE have their family field uninitialized (those in MIGRATE do have their family set). Looking at the code, this is because the af-specific init_tempsel() (called via afinfo->init_tempsel() in xfrm_init_tempsel()) do not set the value. Reported-by: Andreas Steffen <andreas.steffen@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Arnaud Ebalard <arno@natisbad.org>	2008-11-04 14:49:19 -08:00
Alexey Dobriyan	d5f642384e	net: #ifdef ->sk_security Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-04 14:45:58 -08:00
Tony Lindgren	5c32f62b97	ARM: OMAP: Fix define for twl4030 irqs Otherwise twl4030 gpios won't work. Signed-off-by: Tony Lindgren <tony@atomide.com>	2008-11-04 13:35:08 -08:00
Tony Lindgren	52414739ca	ARM: OMAP: Fix get_irqnr_and_base to clear spurious interrupt bits On omap24xx, INTCPS_SIR_IRQ_OFFSET bits [6:0] contains the current active interrupt number. However, on 34xx INTCPS_SIR_IRQ_OFFSET bits [31:7] also contains the SPURIOUSIRQFLAG, which gets set if the interrupt sorting information is invalid. If the SPURIOUSIRQFLAG bits are not ignored, the interrupt code will occasionally produce a bunch of confusing errors: irq -33, desc: c02ddcc8, depth: 0, count: 0, unhandled: 0 ->handle_irq(): c006f23c, handle_bad_irq+0x0/0x22c ->chip(): 00000000, 0x0 ->action(): 00000000 Fix this by masking out only the ACTIVEIRQ bits. Also fix a confusing comment. Signed-off-by: Tony Lindgren <tony@atomide.com>	2008-11-04 13:35:07 -08:00
Zhaolei	e621f266d4	ARM: OMAP: Fix debugfs_create_'s error checking method for arm/plat-omap debugfs_create_() returns NULL if an error occurs, returns -ENODEV when debugfs is not enabled in the kernel. Comparing to PATCH v1, because clk_debugfs_init is included in "#if defined CONFIG_DEBUG_FS", we only need to check NULL return. Thanks Li Zefan <lizf@cn.fujitsu.com> debugfs_create_u8() and other function's return value's checking method are also fixed in this patch. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Tony Lindgren <tony@atomide.com>	2008-11-04 13:35:07 -08:00
Sanjeev Premi	85d7a07026	ARM: OMAP: Fix compiler warnings in gpmc.c Fix these compiler warnings: gpmc.c: In function 'gpmc_init': gpmc.c:432: warning: 'return' with a value, in function returning void gpmc.c:439: warning: 'return' with a value, in function returning void Signed-off-by: Sanjeev Premi <premi@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>	2008-11-04 13:35:06 -08:00
Linus Torvalds	75fa67706c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: xfrm: Fix xfrm_policy_gc_lock handling. niu: Use pci_ioremap_bar(). bnx2x: Version Update bnx2x: Calling netif_carrier_off at the end of the probe bnx2x: PCI configuration bug on big-endian bnx2x: Removing the PMF indication when unloading mv643xx_eth: fix SMI bus access timeouts net: kconfig cleanup fs_enet: fix polling XFRM: copy_to_user_kmaddress() reports local address twice SMC91x: Fix compilation on some platforms. udp: Fix the SNMP counter of UDP_MIB_INERRORS udp: Fix the SNMP counter of UDP_MIB_INDATAGRAMS drivers/net/smc911x.c: Fix lockdep warning on xmit.	2008-11-04 08:30:12 -08:00
Linus Torvalds	4edfd20faf	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: mask off DET when restoring SControl for detach libata: implement ATA_HORKAGE_ATAPI_MOD16_DMA and apply it libata: Fix a potential race condition in ata_scsi_park_show() sata_nv: fix generic, nf2/3 detection regression sata_via: restore vt*_prepare_host error handling sata_promise: add ATA engine reset to reset ops	2008-11-04 08:19:01 -08:00
Jianjun Kong	54074d5932	drivers: remove duplicated #include Signed-off-by: Jianjun Kong <jianjun@zeuux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-04 08:18:19 -08:00
Russell King	d2ed5cb80a	[ARM] fix VFP+softfloat binaries 2.6.28-rc tightened up the ELF architecture checks on ARM. For non-EABI it only allows VFP if the hardware supports it. However, the kernel fails to also inspect the soft-float flag, so it incorrectly rejects binaries using soft-VFP. The fix is simple: also check that EF_ARM_SOFT_FLOAT isn't set before rejecting VFP binaries on non-VFP hardware. Acked-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-11-04 15:02:09 +00:00
Mark McLoughlin	e4ab1b3cbb	x86/docs: remove noirqbalance param docs Impact: documentation fix irqbalance was removed by: commit `8b8e8c1bf7` Author: Yinghai Lu <yhlu.kernel@gmail.com> Date: Tue Aug 19 20:50:23 2008 -0700 Remove the associated documentation for noirqbalance. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-04 14:50:34 +01:00
Alok Kataria	70de9a9704	x86: don't use tsc_khz to calculate lpj if notsc is passed Impact: fix udelay when "notsc" boot parameter is passed With notsc passed on commandline, tsc may not be used for udelays, make sure that we do not use tsc_khz to calculate the lpj value in such cases. Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Alok N Kataria <akataria@vmware.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-11-04 09:55:26 +01:00
Tejun Heo	299246f9a2	libata: mask off DET when restoring SControl for detach libata restores SControl on detach; however, trying to restore non-zero DET can cause undeterministic behavior including PMP device going offline till power cycling. Mask off DET when restoring SControl. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-04 01:08:33 -05:00
Tejun Heo	6a87e42e95	libata: implement ATA_HORKAGE_ATAPI_MOD16_DMA and apply it libata always uses PIO for ATAPI commands when the number of bytes to transfer isn't multiple of 16 but quantum DAT72 chokes on odd bytes PIO transfers. Implement a horkage to skip the mod16 check and apply it to the quantum device. This is reported by John Clark in the following thread. http://thread.gmane.org/gmane.linux.ide/34748 Signed-off-by: Tejun Heo <tj@kernel.org> Cc: John Clark <clarkjc@runbox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-04 01:08:27 -05:00
Elias Oltmanns	a464189de3	libata: Fix a potential race condition in ata_scsi_park_show() Peter Moulder has pointed out that there is a slight chance that a negative value might be passed to jiffies_to_msecs() in ata_scsi_park_show(). This is fixed by saving the value of jiffies in a local variable, thus also reducing code since the volatile variable jiffies is accessed only once. Signed-off-by: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Tejun Heo <tj.kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-04 01:08:24 -05:00
Tejun Heo	3c324283e6	sata_nv: fix generic, nf2/3 detection regression All three flavors of sata_nv's are different in how their hardreset behaves. * generic: Hardreset is not reliable. Link often doesn't come online after hardreset. * nf2/3: A little bit better - link comes online with longer debounce timing. However, nf2/3 can't reliable wait for the first D2H Register FIS, so it can't wait for device readiness or classify the device after hardreset. Follow-up SRST required. * ck804: Hardreset finally works. The core layer change to prefer hardreset and follow up changes exposed the above issues and caused various detection regressions for all three flavors. This patch, hopefully, fixes all the known issues and should make sata_nv error handling more reliable. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-11-04 01:08:11 -05:00

... 3 4 5 6 7 ...

118885 Commits All Branches Search

118885 Commits

All Branches