Define __BIOVEC_PHYS_MERGEABLE as the default implementation of
BIOVEC_PHYS_MERGEABLE, so that its available for reuse within an
arch-specific definition of BIOVEC_PHYS_MERGEABLE.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Impact: clarify menuconfig text
Mention ACPI in the top-level menu to give a clue as to where
it lives. This matches what ia64 does.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adding a spare to a raid10 doesn't cause recovery to start.
This is due to an silly type in
commit 6c2fce2ef6
and so is a bug in 2.6.27 and .28-rc.
Thanks to Thomas Backlund for bisecting to find this.
Cc: Thomas Backlund <tmb@mandriva.org>
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
It turns out that it is only safe to call blkdev_ioctl when the device
is actually open (as ->bd_disk is set to NULL on last close). And it
is quite possible for do_md_stop to be called when the device is not
open. So discard the call to blkdev_ioctl(BLKRRPART) which was
added in
commit 934d9c23b4
It is just as easy to call this ioctl from userspace when needed (on
mdadm -S) so leave it out of the kernel
Signed-off-by: NeilBrown <neilb@suse.de>
Impact: make NR_IRQS big enough for system with lots of apic/pins
If lots of IO_APIC's are there (or can be there), size the same way
as 64-bit, depending on MAX_IO_APICS and NR_CPUS.
This fixes the boot problem reported by Ben Hutchings on a 32-bit
server with 5 IO-APICs and 240 IO-APIC pins.
Signed-off-by: Yinghai <yinghai@kernel.org>
Tested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix boot hang on 32-bit systems with more than 224 IO-APIC pins
On some 32-bit systems with a lot of IO-APICs probe_nr_irqs() can
return a value larger than NR_IRQS. This will lead to probe_irq_on()
overrunning the irq_desc array.
I hit this when running net-next-2.6 (close to 2.6.28-rc3) on a
Supermicro dual Xeon system. NR_IRQS is 224 but probe_nr_irqs() detects
5 IOAPICs and returns 240. Here are the log messages:
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
Tue Nov 4 16:53:47 2008 IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x02] address[0xfec81000] gsi_base[24])
Tue Nov 4 16:53:47 2008 IOAPIC[1]: apic_id 2, version 32, address 0xfec81000, GSI 24-47
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x03] address[0xfec81400] gsi_base[48])
Tue Nov 4 16:53:47 2008 IOAPIC[2]: apic_id 3, version 32, address 0xfec81400, GSI 48-71
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x04] address[0xfec82000] gsi_base[72])
Tue Nov 4 16:53:47 2008 IOAPIC[3]: apic_id 4, version 32, address 0xfec82000, GSI 72-95
Tue Nov 4 16:53:47 2008 ACPI: IOAPIC (id[0x05] address[0xfec82400] gsi_base[96])
Tue Nov 4 16:53:47 2008 IOAPIC[4]: apic_id 5, version 32, address 0xfec82400, GSI 96-119
Tue Nov 4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
Tue Nov 4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Tue Nov 4 16:53:47 2008 Enabling APIC mode: Flat. Using 5 I/O APICs
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
David Gibson suggested that since we are now unconditionally copying
the dtb into a malloc()ed buffer, it would be sensible to add a little
padding to the buffer at that point, so that further device tree
manipulations won't need to reallocate it.
This implements that suggestion.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Andrew Morton suggested that using a macro that makes an array
reference look like a function call makes it harder to understand the
code.
This therefore removes the huge_pgtable_cache(psize) macro and
replaces its uses with pgtable_cache[HUGE_PGTABLE_INDEX(psize)].
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Free dynamically allocated device data structures when device registration
fails. This fixes memory leakage when the registration fails.
Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since we started using the generic timekeeping code, we haven't had a
powerpc-specific version of do_gettimeofday, and hence there is now
nothing that reads the do_gtod variable in arch/powerpc/kernel/time.c.
This therefore removes it and the code that sets it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently the clock_gettime implementation in the VDSO produces a
result with microsecond resolution for the cases that are handled
without a system call, i.e. CLOCK_REALTIME and CLOCK_MONOTONIC. The
nanoseconds field of the result is obtained by computing a
microseconds value and multiplying by 1000.
This changes the code in the VDSO to do the computation for
clock_gettime with nanosecond resolution. That means that the
resolution of the result will ultimately depend on the timebase
frequency.
Because the timestamp in the VDSO datapage (stamp_xsec, the real time
corresponding to the timebase count in tb_orig_stamp) is in units of
2^-20 seconds, it doesn't have sufficient resolution for computing a
result with nanosecond resolution. Therefore this adds a copy of
xtime to the VDSO datapage and updates it in update_gtod() along with
the other time-related fields.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that all of the remaining dma_mapping_ops have had their
map_/unmap_single functions updated to become map/unmap_page
functions, there is no need to have the map_/unmap_single function
pointers in the dma_mapping_ops.
So, this removes them and also removes the code that does the checking
for which set of functions to use.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Acked-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This does a few cosmetic cleanups, moving a couple of things around
but without actually changing what the code does.
(There is a minor change in ordering of operations in
pcibios_setup_bus_devices but it should have no impact).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The pseries PCI hotplug code has a number of issues, ranging from
incorrect resource setup to crashes, depending on what is added,
when, whether it contains a bridge, etc etc....
This fixes a whole bunch of these, while actually simplifying the code
a bit, using more generic code in the process and factoring out common
code between adding of a PHB, a slot or a device.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
To properly fix PCI hotplug, it's useful to be able to make the fixup
passes on all devices whether they were just hot plugged or already
there.
However, pcibios_allocate_bus_resources() wouldn't cope well with
being called twice for a given bus. This makes it ignore resources
that have already been allocated, along with adding a bit of debug
output.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
To properly fix PCI hotplug, it's useful to be able to make the fixup
passes on all devices whether they were just hot plugged or already
there.
The EEH code however used to not be very friendly with calling
eeh_add_device_late() multiple time, and not very rebust in the way it
generally tests whether a device is in the expected state vs. the EEH
code.
This improves it, along with cleaning up a couple of debug printk's.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, our PCI code uses the pcibios_fixup_bus() callback, which
is called by the generic code when probing PCI buses, for two
different things.
One is to set up things related to the bus itself, such as reading
bridge resources for P2P bridges, fixing them up, or setting up the
iommu's associated with bridges on some platforms.
The other is some setup for each individual device under that bridge,
mostly setting up DMA mappings and interrupts.
The problem is that this approach doesn't work well with PCI hotplug
when an existing bus is re-probed for new children. We fix this
problem by splitting pcibios_fixup_bus into two routines:
pcibios_setup_bus_self() is now called to setup the bus itself
pcibios_setup_bus_devices() is now called to setup devices
pcibios_fixup_bus() is then modified to call these two after reading the
bridge bases, and the OF based PCI probe is modified to avoid calling
into the first one when rescanning an existing bridge.
[paulus@samba.org - fixed eeh.h for 32-bit compile now that pci-common.c
is including it unconditionally.]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
deflate_mutex protects the globals lzo_mem and lzo_compress_buf. However,
jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the
compressed data from lzo_compress_buf. Correct this by moving the mutex
unlock after the copy.
In addition, document what deflate_mutex actually protects.
Cc: stable@kernel.org
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Removed duplicated #include <rdma/ib_verbs.h> in
net/9p/trans_rdma.c.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
If a T or R fcall cannot be allocated, the function returns an error
but neglects to free the wait queue that was successfully allocated.
If it comes through again a second time this wq will be overwritten
with a new allocation and the old allocation will be leaked.
Also, if the client is subsequently closed, the close path will
attempt to clean up these allocations, so set the req fields to
NULL to avoid duplicate free.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
T and R fcall are reused until the client is destroyed. There does
not need to be a special case for Flush
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The client lock must be IRQ safe. Some of the lock acquisition paths
took regular spin locks.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The RDMA connection manager is fundamentally asynchronous.
Since the async callback context is the client pointer, the
transport in the client struct needs to be set prior to calling
the first async op.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Impact: improve wakeup affinity on NUMA systems, tweak SMP systems
Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
balancing defaults on NUMA and SMP systems.
Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
why we would not want to have wakeup affinity across nodes as well.
(we already do this in the standard NUMA template.)
lat_ctx on a NUMA box is particularly happy about this change:
before:
| phoenix:~/l> ./lat_ctx -s 0 2
| "size=0k ovr=2.60
| 2 5.70
after:
| phoenix:~/l> ./lat_ctx -s 0 2
| "size=0k ovr=2.65
| 2 2.07
a 2.75x speedup.
pipe-test is similarly happy about it too:
| phoenix:~/sched-tests> ./pipe-test
| 18.26 usecs/loop.
| 14.70 usecs/loop.
| 14.38 usecs/loop.
| 10.55 usecs/loop. # +WAKE_AFFINE on domain0+domain1
| 8.63 usecs/loop.
| 8.59 usecs/loop.
| 9.03 usecs/loop.
| 8.94 usecs/loop.
| 8.96 usecs/loop.
| 8.63 usecs/loop.
Also:
- disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
- enable SD_WAKE_BALANCE on SMP domains
Sysbench+postgresql improves all around the board, quite significantly:
.28-rc3-11474e2c .28-rc3-11474e2c-tune
-------------------------------------------------
1: 571 688 +17.08%
2: 1236 1206 -2.55%
4: 2381 2642 +9.89%
8: 4958 5164 +3.99%
16: 9580 9574 -0.07%
32: 7128 8118 +12.20%
64: 7342 8266 +11.18%
128: 7342 8064 +8.95%
256: 7519 7884 +4.62%
512: 7350 7731 +4.93%
-------------------------------------------------
SUM: 55412 59341 +6.62%
So it's a win both for the runup portion, the peak area and the tail.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For "unlock" cycles to 16bit devices in 8bit compatibility mode we need
to use the byte addresses 0xaaa and 0x555. These effectively match
the word address 0x555 and 0x2aa, except the latter has its low bit set.
Most chips don't care about the value of the 'A-1' pin in x8 mode,
but some -- like the ST M29W320D -- do. So we need to be careful to
set it where appropriate.
cfi_send_gen_cmd is only ever passed addresses where the low byte
is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are
affected by this patch, by masking in the extra low bit when the device
is known to be in compatibility mode.
[dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa]
v4: Fix stupid typo in cfi_build_cmd_addr that failed to compile
I'm writing this patch way to late at night.
v3: Bring all of the work back into cfi_build_cmd_addr
including calling of map_bankwidth(map) and cfi_interleave(cfi)
So every caller doesn't need to.
v2: Only modified the address if we our device_type is larger than our
bus width.
Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Vito Caputo noticed that tcp_recvmsg() returns immediately from
partial reads when MSG_PEEK is used. In particular, this means that
SO_RCVLOWAT is not respected.
Simply remove the test. And this matches the behavior of several
other systems, including BSD.
Signed-off-by: David S. Miller <davem@davemloft.net>
The function pcibios_do_bus_setup() was used by pcibios_fixup_bus()
to perform setup that is different between the 32-bit and 64-bit
code. This difference no longer exists, thus the function is removed
and the setup now done directly from pci-common.c.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 32-bit and 64-bit powerpc PCI code used to set up the resource
pointers of the root bus of a given PHB in completely different
places.
This unifies this in large part, by making 32-bit use a routine very
similar to what 64-bit does when initially scanning the PCI busses.
The actual setup of the PHB resources itself is then moved to a
common function in pci-common.c.
This should cause no functional change on 64-bit. On 32-bit, the
effect is that the PHB resources are going to be setup a bit earlier,
instead of being setup from pcibios_fixup_bus().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This removes the various DBG() macro from the powerpc PCI code and
makes it use the standard pr_debug instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Update memcpy() to add two new feature sections: one for aligning the
destination before copying and one for copying using aligned load
and store doubles.
These new feature sections will only affect Power6 and Cell because
the CPU feature bit was only added to these two processors.
Power6 gets its best performance in memcpy() when aligning neither the
source nor the destination, while Cell gets its best performance when
just the destination is aligned. But in order to save on CPU feature
bits we can use the previously added CPU_FTR_CP_USE_DCBTZ feature bit
to differentiate between Power6 and Cell (because CPU_FTR_CP_USE_DCBTZ
was added to Cell but not Power6).
The first feature section acts to nop out the branch that takes us to
the code that aligns us to an eight byte boundary for the destination.
We only want to nop out this branch on Power6.
So the ALT_FTR_SECTION_END() for this feature section creates a test
mask of the two feature bits ORed together and provides an expected
result of just CPU_FTR_UNALIGNED_LD_STD, thus we nop out the branch
if we're on a CPU that has CPU_FTR_UNALIGNED_LD_STD set and
CPU_FTR_CP_USE_DCBTZ unset.
For the second feature section added, if we're on a CPU that has the
CPU_FTR_UNALIGNED_LD_STD bit set then we don't want to do the copy
with aligned loads and stores (and the appropriate shifting left and
right instructions), so we want to nop out the branch to
.Lsrc_unaligned.
The andi. used for this branch is moved to just above the branch
because this allows us to nop out both instructions with just one
feature section which gives us better performance and doesn't hurt
readability which two separate feature sections did.
Moving the andi. to just above the branch doesn't have any noticeable
negative effect on the remaining 64bit processors (the ones that
didn't have this feature bit added).
On Cell this simple modification results in an improvement to measured
memcpy() bandwidth of up to 50% in the hot cache case and up to 15% in
the cold cache case.
On Power6 we get memory bandwidth results that are up to three times
faster in the hot cache case and up to 50% faster in the cold cache
case.
Commit 2a9294369b ("powerpc: Add new CPU
feature: CPU_FTR_CP_USE_DCBTZ") was where CPU_FTR_CP_USE_DCBTZ was
added.
To say that Cell gets its best performance in memcpy() with just the
destination aligned is true but only for the reason that the indirect
shift and rotate instructions, sld and srd, are microcoded on Cell.
This means that either the destination or the source can be aligned,
but not both, and seeing as we get better performance with the
destination aligned we choose this option.
While we're at it make a one line change from cmpldi r1,... to
cmpldi cr1,... for consistency.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a new CPU feature bit, CPU_FTR_UNALIGNED_LD_STD, to be added
to the 64bit powerpc chips that can do unaligned load double and
store double without any performance hit.
This is added to Power6 and Cell and will be used in the next commit
to disable the code that gets the destination address aligned on
those CPUs where doing that doesn't improve performance.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A new field has been added to the VPA as a method for the client OS to
communicate to firmware the number of page-ins it is performing when
running collaborative memory overcommit. The hypervisor will use this
information to better determine if a partition is experiencing memory
pressure and needs more memory allocated to it.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 'ibm,interrupt-server#-size' properties are not in the cpu nodes,
which is where we currently look for them, but rather live under the
interrupt source controller nodes (which have "ibm,ppc-xics" in their
compatible property).
This moves the code that looks for the ibm,interrupt-server#-size
properties from xics_update_irq_servers() into xics_init_IRQ().
Also this adds a check for mismatched sizes across the interrupt
source controller nodes. Not sure this is necessary as in this case
the firmware might be seriously busted.
This property only appears on POWER6 boxes and is only used in the
set-indicator(gqirm) call, and apparently firmware currently ignores
the value we pass. Nevertheless we need to fix it in case future
firmware versions use it.
Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We don't want to encourage the device_type usage. It isn't used in
the code, so we can simply remove it from the dts files.
Suggested-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When no hardware method is provided to sync the timebase registers
across the machine, and the platform doesn't sync them for us, then we
use a generic software implementation. Currently, the code for that
has many printks, and they don't have log levels. Most of the printks
are only useful for debugging the code, and since we haven't had any
problems with it for years, this turns them into pr_debug.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The code to properly expose domain numbers in /proc is somewhat
bogus on ppc64 as it depends on the "buid" field being non-0,
but that field is really pseries specific.
This removes that code and makes ppc64 use the same code as 32-bit
which effectively decides whether to expose domains based on
ppc_pci_flags set by the platform, and sets the default for 64-bit
to enable domains and enable compatibility for domain 0 (which
strips the domain number for domain 0 to help with X servers).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
With some net devices types, an IPv6 address configured while the
interface was down can stay 'tentative' forever, even after the interface
is set up. In some case, pending IPv6 DADs are not executed when the
device becomes ready.
I observed this while doing some tests with kvm. If I assign an IPv6
address to my interface eth0 (kvm driver rtl8139) when it is still down
then the address is flagged tentative (IFA_F_TENTATIVE). Then, I set
eth0 up, and to my surprise, the address stays 'tentative', no DAD is
executed and the address can't be pinged.
I also observed the same behaviour, without kvm, with virtual interfaces
types macvlan and veth.
Some easy steps to reproduce the issue with macvlan:
1. ip link add link eth0 type macvlan
2. ip -6 addr add 2003::ab32/64 dev macvlan0
3. ip addr show dev macvlan0
...
inet6 2003::ab32/64 scope global tentative
...
4. ip link set macvlan0 up
5. ip addr show dev macvlan0
...
inet6 2003::ab32/64 scope global tentative
...
Address is still tentative
I think there's a bug in net/ipv6/addrconf.c, addrconf_notify():
addrconf_dad_run() is not always run when the interface is flagged IF_READY.
Currently it is only run when receiving NETDEV_CHANGE event. Looks like
some (virtual) devices doesn't send this event when becoming up.
For both NETDEV_UP and NETDEV_CHANGE events, when the interface becomes
ready, run_pending should be set to 1. Patch below.
'run_pending = 1' could be moved below the if/else block but it makes
the code less readable.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Impact: scheduling order fix for group scheduling
For each level in the hierarchy, set the buddy to point to the right entity.
Therefore, when we do the hierarchical schedule, we have a fair chance of
ending up where we meant to.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: improve/change/fix wakeup-buddy scheduling
Currently we only have a forward looking buddy, that is, we prefer to
schedule to the task we last woke up, under the presumption that its
going to consume the data we just produced, and therefore will have
cache hot benefits.
This allows co-waking producer/consumer task pairs to run ahead of the
pack for a little while, keeping their cache warm. Without this, we
would interleave all pairs, utterly trashing the cache.
This patch introduces a backward looking buddy, that is, suppose that
in the above scenario, the consumer preempts the producer before it
can go to sleep, we will therefore miss the wakeup from consumer to
producer (its already running, after all), breaking the cycle and
reverting to the cache-trashing interleaved schedule pattern.
The backward buddy will try to schedule back to the task that woke us
up in case the forward buddy is not available, under the assumption
that the last task will be the one with the most cache hot task around
barring current.
This will basically allow a task to continue after it got preempted.
In order to avoid starvation, we allow either buddy to get wakeup_gran
ahead of the pack.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix cross-class preemption
Inter-class wakeup preemptions should go on class order.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In 777e208d40 we changed from outputting
field->cpu (a char) to iter->cpu (unsigned int), increasing the resulting
structure size by 3 bytes.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This gets rid of this build warning:
arch/powerpc/platforms/pseries/pci_dlpar.c: In function 'init_phb_dynamic':
arch/powerpc/platforms/pseries/pci_dlpar.c:192: warning: unused variable 'b'
This is one of the very few warnings left in a ppc64_defconfig build and
getting rid of it will make it easier to see future introduced ones (in
fact this was introduced very recently).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>