Take account of whether a socket is bound to a particular device when
selecting an IPv6 raw socket to receive a packet. Also perform this
check when receiving IPv6 packets with router alert options.
Signed-off-by: Andrew McDonald <andrew@mcdonald.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Add new nfnetlink_queue module
- Add new ipt_NFQUEUE and ip6t_NFQUEUE modules to access queue numbers 1-65535
- Mark ip_queue and ip6_queue Kconfig options as OBSOLETE
- Update feature-removal-schedule to remove ip[6]_queue in December
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- split netfiler verdict in 16bit verdict and 16bit queue number
- add 'queuenum' argument to nf_queue_outfn_t and its users ip[6]_queue
- move NFNL_SUBSYS_ definitions from enum to #define
- introduce autoloading for nfnetlink subsystem modules
- add MODULE_ALIAS_NFNL_SUBSYS macro
- add nf_unregister_queue_handlers() to register all handlers for a given
nf_queue_outfn_t
- add more verbose DEBUGP macro definition to nfnetlink.c
- make nfnetlink_subsys_register fail if subsys already exists
- add some more comments and debug statements to nfnetlink.c
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rerouting functionality is required by the core, therefore it has
to be implemented by the core and not in individual queue handlers.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Remove bogus code for compiling netlink as module
- Add module refcounting support for modules implementing a netlink
protocol
- Add support for autoloading modules that implement a netlink protocol
as soon as someone opens a socket for that protocol
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is nothing IPv4-specific in it. In fact, it was already used by
IPv6, too... Upcoming nfnetlink_queue code will use it for any kind
of packet.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead, set it in one place, namely the beginning of
netif_receive_skb().
Based upon suggestions from Jamal Hadi Salim.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 the following unused global function:
- xfrm4_state.c: xfrm4_state_fini
- remove the following unneeded EXPORT_SYMBOL's:
- ip_output.c: ip_finish_output
- ip_output.c: sysctl_ip_default_ttl
- fib_frontend.c: ip_dev_find
- inetpeer.c: inet_peer_idlock
- ip_options.c: ip_options_compile
- ip_options.c: ip_options_undo
- net/core/request_sock.c: sysctl_max_syn_backlog
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bonding just wants the device before the skb_bond()
decapsulation occurs, so simply pass that original
device into packet_type->func() as an argument.
It remains to be seen whether we can use this same
exact thing to get rid of skb->input_dev as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ctnetlink subsystem for userspace-access to ip_conntrack table.
This allows reading and updating of existing entries, as well as
creating new ones (and new expect's) via nfnetlink.
Please note the 'strange' byte order: nfattr (tag+length) are in host
byte order, while the payload is always guaranteed to be in network
byte order. This allows a simple userspace process to encapsulate netlink
messages into arch-independent udp packets by just processing/swapping the
headers and not knowing anything about the actual payload.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes the private element from skbuff, that is only used by
HIPPI. Instead it uses skb->cb[] to hold the additional data that is
needed in the output path from hard_header to device driver.
PS: The only qdisc that might potentially corrupt this cb[] is if
netem was used over HIPPI. I will take care of that by fixing netem
to use skb->stamp. I don't expect many users of netem over HIPPI
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce "nfnetlink" (netfilter netlink) layer. This layer is used as
transport layer for all userspace communication of the new upcoming
netfilter subsystems, such as ctnetlink, nfnetlink_queue and some day even
the mythical pkttables ;)
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a notifier chain based event mechanism for ip_conntrack state
changes. As opposed to the previous implementations in patch-o-matic, we
do no longer need a field in the skb to achieve this.
Thanks to the valuable input from Patrick McHardy and Rusty on the idea
of a per_cpu implementation.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the "list" member of struct sk_buff, as it is entirely
redundant. All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.
Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu@fr.zoreil.com> fixed
up.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
As discussed at netconf'05, we're trying to save every bit in sk_buff.
The patch below makes sk_buff 8 bytes smaller. I did some basic
testing on my notebook and it seems to work.
The only real in-tree user of nfcache was IPVS, who only needs a
single bit. Unfortunately I couldn't find some other free bit in
sk_buff to stuff that bit into, so I introduced a separate field for
them. Maybe the IPVS guys can resolve that to further save space.
Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and
alike are only 6 values defined), but unfortunately the bluetooth code
overloads pkt_type :(
The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just
came up with a way how to do it without any skb fields, so it's safe
to remove it.
- remove all never-implemented 'nfcache' code
- don't have ipvs code abuse 'nfcache' field. currently get's their own
compile-conditional skb->ipvs_property field. IPVS maintainers can
decide to move this bit elswhere, but nfcache needs to die.
- remove skb->nfcache field to save 4 bytes
- move skb->nfctinfo into three unused bits to save further 4 bytes
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As discussed at netconf'05, we convert nfmark and conntrack-mark to be
32bits even on 64bit architectures.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Richard Purdie
Add some extra pxa27x register definitions needed for the Sharp
SL-C3000 (Spitz).
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
There is no need to define __ARCH_WANT_SYS_FADVISE64 on ARM since it
only serves to compile in a compatibility wrapper for sys_fadvise64
which never was tied to any syscall number.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add the definitions for the S3C2410_CLKSLOW registers to
the header files, and show the values when the system
starts up
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Deepak Saxena
This patch implements the set_irq_type() hooks for configuring GPIO
IRQ type and updates all the platforms to use it instead of the
gpio_line_config() function which is now used to configure input
vs. output on the pins.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It appears that a memory barrier soon after a mispredicted
branch, not just in the delay slot, can cause the hang
condition of this cpu errata.
So move them out-of-line, and explicitly put them into
a "branch always, predict taken" delay slot which should
fully kill this problem.
Signed-off-by: David S. Miller <davem@davemloft.net>
When the spinlock routines were moved out of line into
kernel/spinlock.c this made it so that the debugging
spinlocks record lock acquisition program counts in the
kernel/spinlock.c functions not in their callers.
This makes the debugging info kind of useless.
So record the correct caller's program counter and
now this feature is useful once more.
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed sparc architecture specific users of asm/segment.h and
asm-sparc/segment.h itself
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed sparc64 architecture specific users of asm/segment.h and
asm-sparc64/segment.h itself
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current uncorrectable error handling was poor enough
that the processor could just loop taking the same
trap over and over again. Fix things up so that we
at least get a log message and perhaps even some register
state.
In the process, much consolidation became possible,
particularly with the correctable error handler.
Prefix assembler and C function names with "spitfire"
to indicate that these are for Ultra-I/II/IIi/IIe only.
More work is needed to make these routines robust and
featureful to the level of the Ultra-III error handlers.
Signed-off-by: David S. Miller <davem@davemloft.net>
* ieee1394_device_id has kernel_ulong_t field after an odd number of
__u32 ones. Since mod_devicetable.h is included both from kernel and
from host build helper, we may be in trouble if we are building on
32bit host for 64bit target - userland sees unsigned long long,
kernel sees unsigned long and while their sizes match, alignments
might not. Fixed by forcing alignment. Fortunately, almost nobody
else needs that - the rest of such fields is naturally aligned as it
is.
* of_device_id has void * in it. Host userland helpers need
kernel_ulong_t instead, since their void * might have nothing to do
with the kernel one. Fixed in the same way it's done for similar
problems in pcmcia_device_id (ifdef __KERNEL__).
* pcmcia_device_id has the same problem as ieee1394_device_id. Fixed
the same way.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paulus, I think this is now a reasonable candidate for the post-2.6.13
queue.
Relax address restrictions for hugepages on ppc64
Presently, 64-bit applications on ppc64 may only use hugepages in the
address region from 1-1.5T. Furthermore, if hugepages are enabled in
the kernel config, they may only use hugepages and never normal pages
in this area. This patch relaxes this restriction, allowing any
address to be used with hugepages, but with a 1TB granularity. That
is if you map a hugepage anywhere in the region 1TB-2TB, that entire
area will be reserved exclusively for hugepages for the remainder of
the process's lifetime. This works analagously to hugepages in 32-bit
applications, where hugepages can be mapped anywhere, but with 256MB
(mmu segment) granularity.
This patch applies on top of the four level pagetable patch
(http://patchwork.ozlabs.org/linuxppc64/patch?id=1936).
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch moves power4_enable_pmcs() to arch/ppc64/kernel/pmc.c.
I've tested it on P5 LPAR and P4. It does what it used to.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
If both CONFIG_XMON and CONFIG_XMON_DEFAULT is enabled in the .config,
there is no way to disable xmon again. setup_system calls first xmon_init,
later parse_early_param. So a new 'xmon=off' cmdline option will do the right
thing.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We can now remove CONFIG_MSCHUNKS as it doesn't do anything interesting
anymore.
The only macro in abs_addr.h which is called by non-iSeries code is
phys_to_abs(), so remove the other dummy implementations, and we add a
firmware feature check to phys_to_abs().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We no longer need the lmb code to know about abs and phys addresses, so
remove the physbase variable from the lmb_property struct.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
abs_to_phys() is a macro that turns out to do nothing, and also has the
unfortunate property that it's not the inverse of phys_to_abs() on iSeries.
The following is for my benefit as much as everyone else.
With CONFIG_MSCHUNKS enabled, the lmb code is changed such that it keeps
a physbase variable for each lmb region. This is used to take the possibly
discontiguous lmb regions and present them as a contiguous address space
beginning from zero.
In this context each lmb region's base address is its "absolute" base
address, and its physbase is it's "physical" address (from Linux's point of
view). The abs_to_phys() macro does the mapping from "absolute" to "physical".
Note: This is not related to the iSeries mapping of physical to absolute
(ie. Hypervisor) addresses which is maintained with the msChunks structure.
And the msChunks structure is not controlled via CONFIG_MSCHUNKS.
Once upon a time you could compile for non-iSeries with CONFIG_MSCHUNKS
enabled. But these days CONFIG_MSCHUNKS depends on CONFIG_PPC_ISERIES, so
for non-iSeries code abs_to_phys() is a no-op.
On iSeries we always have one lmb region which spans from 0 to
systemcfg->physicalMemorySize (arch/ppc64/kernel/iSeries_setup.c line 383).
This region has a base (ie. absolute) address of 0, and a physbase address
of 0 (as calculated in lmb_analyze() (arch/ppc64/kernel/lmb.c line 144)).
On iSeries, abs_to_phys(aa) is defined as lmb_abs_to_phys(aa), which finds
the lmb region containing aa (and there's only one, ie. 0), and then does:
return lmb.memory.region[0].physbase + (aa - lmb.memory.region[0].base)
physbase == base == 0, so you're left with "return aa".
So remove abs_to_phys(), and lmb_abs_to_phys() which is the implementation
of abs_to_phys() for iSeries.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
physRpn_to_absRpn is a no-op on non-iSeries platforms, remove the two
redundant calls.
There's only one caller on iSeries so fold the logic in there so we can get
rid of it completely.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The only caller of chunk_offset() and abs_chunk() is phys_to_abs(), so
fold the former two into the latter.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Rename the msChunks struct to get rid of the StUdlY caps and make it a bit
clearer what it's for.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Chunks are 256KB, so use constants for the size/shift/mask, rather than
getting them from the msChunks struct. The iSeries debugger (??) might still
need access to the values in the msChunks struct, so we keep them around
for now, but set them from the constant values.
Replace msChunks_entry typedef with regular u32.
Simplify msChunks_alloc() to manipulate klimit directly, rather than via
a parameter.
Move msChunks_alloc() and msChunks into iSeries_setup.c, as that's where
they're used.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The msChunks code was written to work on pSeries, but now it's only used on
iSeries. This means there's no need to do PTRRELOC anymore, so remove it all.
A few places were getting "extern reloc_offset()" from abs_addr.h, move it
into system.h instead.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make firmware_has_feature() evaluate at compile time for the non pSeries
case and tidy up code where possible.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Create the firmware_has_feature() inline and move the firmware feature
stuff into its own header file.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The firmware_features field of struct cpu_spec should really be a separate
variable as the firmware features do not depend on the chip and the
bitmask is constructed independently. By removing it, we save 112 bytes
from the cpu_specs array and we access the bitmask directly instead of via
the cur_cpu_spec pointer.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On ppc64 machines with segment tables, CPU0's segment table is at a
fixed address, currently 0x9000. This patch moves it to the free
space at 0x6000, just below the fwnmi data area. This saves 8k of
space in vmlinux and the runtime kernel image.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Comments in head.S suggest that the iSeries naca has a fixed address,
because tools expect to find it there. The only tool which appears to
access the naca is addRamDisk, but both the in-kernel version and the
version used in RHEL and SuSE in fact locate the NACA the same way as
the hypervisor does, by following the pointer in the hvReleaseData
structure.
Since the requirement for a fixed address seems to be obsolete, this
patch removes the naca from head.S and replaces it with a normal C
initializer.
For good measure, it removes an old version of addRamDisk.c which was
sitting, unused, in the ppc32 tree.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch just splits out the pSeries specific parts of vio.c.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch allows us to have a different bus if matching function for
each platform.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since the iSeries vio iommu tables cannot be used until after the vio bus has
been initialised, move the initialisation of the tables to there.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch splits the iSeries specific parts out of vio.c.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch updates the format of the flattened device-tree passed
between the boot trampoline and the kernel to support a more compact
representation, for use by embedded systems mostly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Implement 4-level pagetables for ppc64
This patch implements full four-level page tables for ppc64, thereby
extending the usable user address range to 44 bits (16T).
The patch uses a full page for the tables at the bottom and top level,
and a quarter page for the intermediate levels. It uses full 64-bit
pointers at every level, thus also increasing the addressable range of
physical memory. This patch also tweaks the VSID allocation to allow
matching range for user addresses (this halves the number of available
contexts) and adds some #if and BUILD_BUG sanity checks.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds back the code that was taken out, thus re-enabling:
* The PHY Layer to initialize without crashing
* Drivers to actually connect to PHYs
* The entire PHY Control Layer
This patch is used by the gianfar driver, and other drivers which are in
development.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
- changes license of all code from OSL+GPL to plain ole GPL
- except for NVIDIA, who hasn't yet responded about sata_nv
- copyright holders were already contacted privately
- adds info in each driver about where hardware/protocol docs may be
obtained
- where I have made major contributions, updated copyright dates
Debug variables and procfs dir should be "ieee80211", not "ipw".
Signed-off-by: Jouni Malinen <jkmaline@cc.hut.fi>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
IEEE 802.11 code has no business touching payloads of EAPOL frames.
There are some EAPOL structures defined for debugging and these were
confusingly called EAP types which they are not. Let's just remove these
before someone else starts using them in the kernel.
Signed-off-by: Jouni Malinen <jkmaline@cc.hut.fi>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
No need to maintain support for WIRELESS_EXT < 17 since this kernel
tree is already using WIRELESS_EXT 18.
Signed-off-by: Jouni Malinen <jkmaline@cc.hut.fi>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Move the InfiniBand headers from drivers/infiniband/include to include/rdma.
This allows InfiniBand-using code to live elsewhere, and lets us remove the
ugly EXTRA_CFLAGS include path from the InfiniBand Makefiles.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some nodes can have large holes on x86-64.
This fixes problems with the VM allowing too many dirty pages because it
overestimates the number of available RAM in a node. In extreme cases you
can end up with all RAM filled with dirty pages which can lead to deadlocks
and other nasty behaviour.
This patch just tells the VM about the known holes from e820. Reserved
(like the kernel text or mem_map) is still not taken into account, but that
should be only a few percent error now.
Small detail is that the flat setup uses the NUMA free_area_init_node() now
too because it offers more flexibility.
(akpm: lotsa thanks to Martin for working this problem out)
Cc: Martin Bligh <mbligh@mbligh.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I recently had a BUG_ON() go off spuriously on a gcc 4.0 compiled kernel.
It turns out gcc-4.0 was removing a sign extension while earlier gcc
versions would not. Thinking this to be a compiler bug, I submitted a
report:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23422
It turns out we need to cast the input in order to tell gcc to sign extend
it.
Thanks to Andrew Pinski for his help on this bug.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch to support P-state transitions on ia64. This driver is based on ACPI,
and uses the ACPI processor driver interface to find out the P-state support
information for the processor. This driver plugs into generic cpufreq
infrastructure.
Once this driver is loaded successfully, ondemand/userspace governor can be
used to change the CPU frequency dynamically based on load or on request from
userspace process.
Refer :
ACPI specification -
http://www.acpi.info
P-state related PAL calls -
http://developer.intel.com/design/itanium/downloads/24869909.pdf
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to abstract irq_affinity down to the pci provider level since
different SGI hardware implements this in different ways.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
From: Michael Wu <flamingice@sourmilk.net>
This patch:
- fixes misc. whitespace/comments
- replaces u16 with __le16/__be16 where appropriate
Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
From: Gertjan van Wingerde <gwingerde@home.nl>
Attached patch cleans up the long lists of #defines for status codes,
reason codes, and information elements.
Signed-off-by: Gertjan van Wingerde <gwingerde@home.nl>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
From: Gertjan van Wingerde <gwingerde@home.nl>
Attached patch updates the definitions of the generic ieee80211 stack to
the latest versions of the published 802.11x specification suite.
Signed-off-by: Gertjan van Wingerde <gwingerde@home.nl>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Clean up of SGI SN partitioning related code.
The SN_SAL_GET_SN_INFO SAL call returns the partition ID, making
the SN_SAL_SYSCTL_PARTITION_GET SAL call redundant. Remove sn_partid
and use sn_partition_id.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Add a new exported function for determining the nearest node
with CPUs for I/O nodes and fix a bug where the hwperf dynamic
misc device was being registered before misc_init().
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Bugfix to export PCI topology information in /proc/sgi_sn/sn_topology.
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Currently, region numbers are defined in several files, with several
names. For example, we have REGION_KERNEL in asm/page.h and
RGN_KERNEL in pgtable.h
We also have address definitions that should depend on the
RGN_XXX macros, but are currently just long constants.
The following patch reorganises all the definitions so that they have
the same form (RGN_XXX), are in one place, and that addresses that
depend on RGN_XXX are derived from them.
(This is a necessary but not sufficient patch to allow UML-like
operation on IA64).
Thanks to David Mosberger for catching the change I missed in mmu_context.h.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Parenthesize "p" to avoid ambiguity. No callers have a problem today;
this is just to clean up the bad form.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
As pointed out in the following thread, the CLOCK_TICK_RATE setting for
IXP4xx is incorrect b/c the HW ignores the lowest 2 bits of the LATCH
value.
http://lists.arm.linux.org.uk/pipermail/linux-arm-kernel/2005-August/030950.html
Tnx to George Anziger and Egil Hjelmeland for finding the issue.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
remove the bogus games with explicit ifdefs on __CHECKER__
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
a bunch of functions switched from volatile to __attribute__((noreturn)) and
from const to __attribute_pure__
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
extern on physid_2_cpu[] does not belong in smp.h - the thing is static.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
alpha xchg has to be a macro - alpha disables always_inline and if that
puppy does not get inlined, we immediately blow up on undefined reference.
Happens even on gcc3; with gcc4 that happens a _lot_.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fixed kconfig dependencies on ISA_DMA_API for parts of sound/* that rely
on it.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We ran into the limit with the maximum number of waiters at one of our sites.
This patch increases the number of possible waiters from 2^15 to 2^31 by using
a long for the counter in struct rw_semaphore. S390 and alpha already do this.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Kenneth Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
o Brown paperbag bug - ax25_findbyuid() was always returning a NULL pointer
as the result. Breaks ROSE completly and AX.25 if UID policy set to deny.
o While the list structure of AX.25's UID to callsign mapping table was
properly protected by a spinlock, it's elements were not refcounted
resulting in a race between removal and usage of an element.
Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The socket flag cleanups that went into 2.6.12-rc1 are basically oring
the flags of an old socket into the socket just being created.
Unfortunately that one was just initialized by sock_init_data(), so already
has SOCK_ZAPPED set. As the result zapped sockets are created and all
incoming connection will fail due to this bug which again was carefully
replicated to at least AX.25, NET/ROM or ROSE.
In order to keep the abstraction alive I've introduced sock_copy_flags()
to copy the socket flags from one sockets to another and used that
instead of the bitwise copy thing. Anyway, the idea here has probably
been to copy all flags, so sock_copy_flags() should be the right thing.
With this the ham radio protocols are usable again, so I hope this will
make it into 2.6.13.
Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Interrupts from devices sharing the same IRQ could cause
ata_host_intr to finish commands being processed by atapi_packet_task
if the commands are using ATA_PROT_ATAPI_NODATA or ATA_PROT_ATAPI_DMA
protocol. This is because libata interrupt handler is unaware that
interrupts are not expected during that period. This patch adds
ATA_FLAG_NOINTR flag to tell the interrupt handler that we're not
expecting interrupts.
Note that once proper HSM is implemented for interrupt-driven PIO,
this should be merged into it and this flag will be removed.
ahci.c is a different kind of beast, so it's left alone.
* The following drivers use ata_qc_issue_prot and ata_interrupt, so
changes in libata core will do.
ata_piix sata_sil sata_svw sata_via sata_sis sata_uli
* The following drivers use ata_qc_issue_prot and custom intr handler.
They need this change to work correctly.
sata_nv sata_vsc
* The following drivers use custom issue function and intr handler.
Currently all custom issue functions don't support ATAPI, so this
change is irrelevant, updated for consistency and to avoid later
mistakes.
sata_promise sata_qstor sata_sx4
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This bug could cause oopses and page state corruption, because ncpfs
used the generic page-cache symlink handlign functions. But those
functions only work if the page cache is guaranteed to be "stable", ie a
page that was installed when the symlink walk was started has to still
be installed in the page cache at the end of the walk.
We could have fixed ncpfs to not use the generic helper routines, but it
is in many ways much cleaner to instead improve on the symlink walking
helper routines so that they don't require that absolute stability.
We do this by allowing "follow_link()" to return a error-pointer as a
cookie, which is fed back to the cleanup "put_link()" routine. This
also simplifies NFS symlink handling.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
GCC 4.x really dislikes the games we are playing in
unaligned.c, and the cleanest way to fix this is to
move things into assembler.
Noted by Al Viro.
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no point in having the host name duplicated between
the mmc_host structure and the encapsulated class device
structure.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Create a mmc_host class to allow enumeration of MMC host controllers
even though they have no card(s) inserted.
Patch based on work by Pierre Ossman.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Not only was this unused, but its somewhat eccentric declaration
of "static inline const unsigned long" gives gcc4 heartburn.
Signed-off-by: Tony Luck <tony.luck@intel.com>
BCM5785 (HT1000) is a Opteron Southbridge from Serverworks/Broadcom that
incorporates a single channel ATA100 IDE controller that is functionally
identical to the Serverworks CSB6 IDE controller. This patch adds support
for the new PCI device ID and also the support for this controller.
Signed-off-by: Narendra Sankar <nsankar@broadcom.com>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Adds support for Netcell Revolution to pci-ide generic driver by including
it in the list of devices matched. Includes the Revolution in the list of
simplex devices forced into DMA mode.
Signed-off-by: Matt Gillette <matt.gillette@netcell.com>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@elka.pw.edu.pl>
Fixes the incorrect DCR base value for the 440SP SRAM controller.
Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes build on 4xx stb03xxx when general purpose dma engine support is
enabled.
Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add inotify and ioprio syscall stubs to SH64.
Signed-off-by: Robert Love <rml@novell.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add inotify and ioprio syscall stubs to SH.
Signed-off-by: Robert Love <rml@novell.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Down the road we want to eliminate the use of the global kernel lock entirely
from the NFS client. To do this, we need to protect the fields in the
nfs_inode structure adequately. Start by serializing updates to the
"cache_validity" field.
Note this change addresses an SMP hang found by njw@osdl.org, where processes
deadlock because nfs_end_data_update and nfs_revalidate_mapping update the
"cache_validity" field without proper serialization.
Test plan:
Millions of fsx ops on SMP clients. Run Nick Wilson's breaknfs program on
large SMP clients.
Signed-off-by: Chuck Lever <cel@netapp.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce atomic bitops to manipulate the bits in the nfs_inode structure's
"flags" field.
Using bitops means we can use a generic wait_on_bit call instead of an ad hoc
locking scheme in fs/nfs/inode.c, so we can remove the "nfs_i_wait" field from
nfs_inode at the same time.
The other new flags field will continue to use bitmask and logic AND and OR.
This permits several flags to be set at the same time efficiently. The
following patch adds a spin lock to protect these flags, and this spin lock
will later cover other fields in the nfs_inode structure, amortizing the cost
of using this type of serialization.
Test plan:
Millions of fsx ops on SMP clients.
Signed-off-by: Chuck Lever <cel@netapp.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Certain bits in nfsi->flags can be manipulated with atomic bitops, and some
are better manipulated via logical bitmask operations.
This patch splits the flags field into two. The next patch introduces atomic
bitops for one of the fields.
Test plan:
Millions of fsx ops on SMP clients.
Signed-off-by: Chuck Lever <cel@netapp.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add platform device data for the SA11x0 MCP device. This allows
platforms to customise the configuration of the SA11x0 MCP device
according to their needs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Shub2 provides a much improved mechanism for issuing internode
TLB purges. Add code to support the newer mechanism. There is also
some debug code (disabled) that is useful for testing.
Collect statistics on the number, type & duration of TLB purges.
This data will be useful for making future improvements in the algorithms.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Update the SN address macros so that they work on both shub1
and shub2. Most of the code to support shub2 was added last year
but this patch fixes a few bugs and adds macros to help generate
both processor-specific physical addresses & numalink physical
addresses. More cleanup & optimization will be done later.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Paulus suggested that we put xLparMap in its own .c file so that we can
generate a .s file to be included into head.S. This doesn't get around
the problem of having it at a fixed address, but it makes it more
palatable.
It would be good if this could be included in 2.6.13 as it solves our
build problems with various versions of binutils and gcc. In
particular, it allows us to build an iSeries kernel on Debian unstable
using their biarch compiler.
This has been built and booted on iSeries and built for pSeries and g5.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On the 6700/6702 PXH part, a MSI may get corrupted if an ACPI hotplug
driver and SHPC driver in MSI mode are used together.
This patch will prevent MSI from being enabled for the SHPC as part of
an early pci quirk, as well as on any pci device which sets the no_msi
bit.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chuck Ebbert noticed that the desc_empty macro is incorrect. Fix it.
Thankfully, this is not used as a security check, but it can falsely
overwrite TLS segments with carefully chosen base / limits. I do not
believe this is an issue in practice, but it is a kernel bug.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@osdl.org>
[ x86-64 had the same problem, and the same fix. Linus ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When the client performs an exclusive create and opens the file for writing,
a Netapp filer will first create the file using the mode 01777. It does this
since an NFSv3/v4 exclusive create cannot immediately set the mode bits.
The 01777 mode then gets put into the inode->i_mode. After the file creation
is successful, we then do a setattr to change the mode to the correct value
(as per the NFS spec).
The problem is that nfs_refresh_inode() no longer updates inode->i_mode, so
the latter retains the 01777 mode. A bit later, the VFS notices this, and calls
remove_suid(). This of course now resets the file mode to inode->i_mode & 0777.
Hey presto, the file mode on the server is now magically changed to 0777. Duh...
Fixes http://bugzilla.linux-nfs.org/show_bug.cgi?id=32
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds a MOVE_SELF event to inotify. It is sent whenever the inode
you are watching is moved. We need this event so that we can catch
something like this:
- app1:
watch /etc/mtab
- app2:
cp /etc/mtab /tmp/mtab-work
mv /etc/mtab /etc/mtab~
mv /tmp/mtab-work /etc/mtab
app1 still thinks it's watching /etc/mtab but it's actually watching
/etc/mtab~.
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
IEEE 802.11 has a capability field flag called ESS, but ieee80211 had
renamed this to BSS for some reason. hostap has been using
WLAN_CAPABILITY_ESS and since that matches with the standard, lets use
it as the name for this define. Add WLAN_CAPABILITY_BSS as a backwards
compatibility name for the same bit since ieee80211 and ipw2200 are
using this and there are versions outside kernel tree that expect to
find this define name.
Signed-off-by: Jouni Malinen <jkmaline@cc.hut.fi>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
IEEE 802.11 frame control has two bits reserved for protocol
version. IEEE80211_FCTL_VERS was not used anywhere, but I would assume
it was supposed to be a mask for the protocol field and as such, it
should be 0x0003, not 0x0002. This matches with WLAN_FC_PVER
definition in hostap.
Signed-off-by: Jouni Malinen <jkmaline@cc.hut.fi>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
This reverts commits
71db63acff
[PATCH] increase PCIBIOS_MIN_IO on x86
and
0b2bfb4e7f
ACPI: increase PCIBIOS_MIN_IO on x86
since Lukas Sandströ<lukass@etek.chalmers.se> reports that this breaks
his on-board nvidia audio.
We should re-visit this later. For now we revert the change
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There was a rather silly and embarrassing typo in the sh _syscall6().
For the syscall ABI we have the trapa value specified as 0x10 + number
of arguments, this was being set incorrectly in the _syscall6() case
which ended up causing some problems for users.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The patch below should fix a race which could cause stale TLB entries.
Specifically, when 2 CPUs ended up racing for entrance to
wrap_mmu_context(). The losing CPU would find that by the time it
acquired ctx.lock, mm->context already had a valid value, but then it
failed to (re-)check the delayed TLB flushing logic and hence could
end up using a context number when there were still stale entries in
its TLB. The fix is to check for delayed TLB flushes only after
mm->context is valid (non-zero). The patch also makes GCC v4.x
happier by defining a non-volatile variant of mm_context_t called
nv_mm_context_t.
Signed-off-by: David Mosberger-Tang <David.Mosberger@acm.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This fixes a race during initialization with the NAPI softirq
processing by using an RCU approach.
This race was discovered when refill_skbs() was added to
the setup code.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add limited retry logic to netpoll_send_skb
Each time we attempt to send, decrement our per-device retry counter.
On every successful send, we reset the counter.
We delay 50us between attempts with up to 20000 retries for a total of
1 second. After we've exhausted our retries, subsequent failed
attempts will try only once until reset by success.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are many instances of
skb->protocol = htons(ETH_P_*);
skb->protocol = __constant_htons(ETH_P_*);
and
skb->protocol = *_type_trans(...);
Most of *_type_trans() are already endian-annotated, so, let's shift
attention on other warnings.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Altix patch to add an SN pci provider for TIOCE, which is SGI's
PCI Express implementation.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to add TIO "huge-window" address support to sn_dma_flush().
Update copyright in affected files.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Altix patch to abstract the force_interrupt() mechanism away from the
pcibr provider.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cosmetic altix patch to rename SGI_PCIBR_ERROR to something more generic and
remove a duplicate #define.
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
1. Nontemporal store for spin unlock.
A nontemporal store will not update the LRU setting for the cacheline. The
cacheline with the lock may therefore be evicted faster from the cpu
caches. Doing so may be useful since it increases the chance that the
exclusive cache line has been evicted when another cpu is trying to
acquire the lock.
The time between dropping and reacquiring a lock on the same cpu is
typically very small so the danger of the cacheline being
evicted is negligible.
2. Avoid semaphore operation in write_unlock and use nontemporal store
write_lock uses a cmpxchg like the regular spin_lock but write_unlock uses
clear_bit which requires a load and then a loop over a cmpxchg. The
following patch makes write_unlock simply use a nontemporal store to clear
the highest 8 bits. We will then still have the lower 3 bytes (24 bits)
left to count the readers.
Doing the byte store will reduce the number of possible readers from 2^31
to 2^24 = 16 million.
These patches were discussed already:
http://marc.theaimsgroup.com/?t=111472054400001&r=1&w=2http://marc.theaimsgroup.com/?l=linux-ia64&m=111401837707849&w=2
The nontemporal stores will only work using GCC. If a compiler is used
that does not support inline asm then fallback C code is used. This
will preserve the byte store but not be able to do the nontemporal stores.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch removes the following stupid compile error that happens
when CONFIG_HOTPLUG is not defined on ia64.
arch/ia64/kernel/built-in.o(.text+0x712): In function `acpi_unregister_ioapic':
: undefined reference to `iosapic_remove'
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Patch from Ben Dooks
Rename the s3c2410_report_oc() to s3c2410_usb_report_oc()
as this is an usb specific function.
Change port power on the usb-simtec implementation to only
power up the output if both are set, as per the usb 1.1
specification
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Unfortunately, we can't use the "user" bit in the page tables to
control whether a page table entry is "global" or "asid" specific,
since the vector page is mapped as "user" accessible but is not
process specific.
Therefore, give direct control of the ARMv6 "nG" (not global)
bit to the mm layers.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
1. Move hwif_to_node to ide.h
2. Use hwif_to_node in ide-disk.c
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This removes the now unused fsnotify_unlink & fsnotify_rmdir code.
Compile tested.
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Revert commit fec59a711e, which is
breaking sparc64 that doesn't have a working pci_update_resource.
We'll re-do this after 2.6.13 when we'll do it all properly.
We have some nasty issues with 2.6.12-rc6. Any request to scan on
the lpfc or qla2xxx FC adapters will oops. What is happening is the
system is defaulting to non-transport registered targets, which
inherit the parent of the scan. On this second scan, performed by
the attribute, the parent becomes the shost instead of the rport.
The slave functions in the 2 FC adapters use starget_to_rport()
routines, which incorrectly map the shost as an rport pointer.
Additionally, this pointed out other weaknesses:
- If the target structure is torn down outside of the transport,
we have no method for it to be regenerated at the proper parent.
- We have race conditions on the target being allocated by both
the midlayer scan (parent=shost) and by the fc transport
(parent=rport).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
NETLINK_ARPD is unused, allocate it to the Open-iSCSI folks.
NETLINK_ROUTE6 and NETLINK_TAPBASE are no longer used, delete
them.
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch below unhooks fsnotify from vfs_unlink & vfs_rmdir. It
introduces two new fsnotify calls, that are hooked in at the dcache
level. This not only more closely matches how the VFS layer works, it
also avoids the problem with locking and inode lifetimes.
The two functions are
- fsnotify_nameremove -- called when a directory entry is going away.
It notifies the PARENT of the deletion. This is called from
d_delete().
- inoderemove -- called when the files inode itself is going away. It
notifies the inode that is being deleted. This is called from
dentry_iput().
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sparc can not include linux/pagemap.h because of the following circular
dependency:
asm-sparc/pgtable include linux/swap.h
linux/swap.h include now linux/pagemap.h
linux/pagemap.h include linux/mm.h
linux/mm.h include asm/pgtable.h
It needs to have the swp_entry_t type fully visible in pgtable.h,
we can't work around this using macros.
Signed-off-by: Olaf Hering <olh@suse.de>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In file included from linux-2.6.13-rc5/arch/i386/kernel/timers/timer_pit.c:20:
linux-2.6.13-rc5/include/asm-i386/mach-visws/do_timer.h: In function `do_timer_overflow':
linux-2.6.13-rc5/include/asm-i386/mach-visws/do_timer.h:32: error: `i8259A_lock' undeclared (first use in this function)
linux-2.6.13-rc5/include/asm-i386/mach-visws/do_timer.h:32: error: (Each undeclared identifier is reported only once
linux-2.6.13-rc5/include/asm-i386/mach-visws/do_timer.h:32: error: for each function it appears in.)
make[3]: *** [arch/i386/kernel/timers/timer_pit.o] Error 1
make[2]: *** [arch/i386/kernel/timers] Error 2
make[1]: *** [arch/i386/kernel] Error 2
make: *** [_all] Error 2
Signed-off-by: Tom Duffy <thomas.duffy.99@alumni.brown.edu>
Cc: Andrey Panin <pazke@orbita1.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's not the real deflateBound() in newer zlib libraries, partly because
the upcoming usage of it won't have the "stream" available, so we can't
have the same interfaces anyway.
Here's an incremental patch with comment updates and some additional
grammar cleanups.
Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes the unused bt_dump() function and it also removes
its BT_DMP macro. It also unexports the hci_dev_get(), hci_send_cmd()
and hci_si_event() functions.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch fixes a bug in the PPC440 pagetable attributes that breaks swap
support. It also adds some notes on the PPC440 attribute fields.
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> for CELF
Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
My patch in commit fa72b903f7 incorrectly
removed blk_queue_tag->real_max_depth.
The original resize implementation was incorrect in the following
points.
* actual allocation size of tag_index was shorter than real_max_size,
but assumed to be of the same size, possibly causing memory access
beyond the allocated area.
* bits in tag_map between max_deptn and real_max_depth were
initialized to 1's, making the tags permanently reserved.
In an attempt to fix above two bugs, I had removed allocation optimization
in init_tag_map and real_max_size. Tag map/index were allocated and freed
immediately during resize.
Unfortunately, I wasn't considering that tag map/index can be resized
dynamically with tags beyond new_depth active. This led to accessing
freed area after shrinking tags and led to the following bug reporting
thread on linux-scsi.
http://marc.theaimsgroup.com/?l=linux-scsi&m=112319898111885&w=2
To fix the problem, I've revived real_max_depth without allocation
optimization in init_tag_map, and Andrew Vasquez confirmed that the
problem was fixed. As Jens is not going to be available for a week, he
asked me to make sure that this patch reaches you.
http://marc.theaimsgroup.com/?l=linux-scsi&m=112325778530886&w=2
Also, a comment was added to make sure that real_max_size is needed for
dynamic shrinking.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This avoids the whole #ifdef mess by just getting a copy of
dentry->d_inode before d_delete is called - that makes the codepaths the
same for the INOTIFY/DNOTIFY cases as for the regular no-notify case.
I've been running this under a Gnome session for the last 10 minutes.
Inotify is being used extensively.
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In yenta_socket, we default to using the resource setting of the CardBus
bridge. However, this is a PCI-bus-centric view of resources and thus needs
to be converted to generic resources first. Therefore, add a call to
pcibios_bus_to_resource() call in between. This function is a mere wrapper on
x86 and friends, however on some others it already exists, is added in this
patch (alpha, arm, ppc, ppc64) or still needs to be provided (parisc -- where
is its pcibios_resource_to_bus() ?).
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some PCI devices (e.g. 3c905B, 3c556B) lose all configuration
(including BARs) when transitioning from D3hot->D0. This leaves such
a device in an inaccessible state. The patch below causes the BARs
to be restored when enabling such a device, so that its driver will
be able to access it.
The patch also adds pci_restore_bars as a new global symbol, and adds a
correpsonding EXPORT_SYMBOL_GPL for that.
Some firmware (e.g. Thinkpad T21) leaves devices in D3hot after a
(re)boot. Most drivers call pci_enable_device very early, so devices
left in D3hot that lose configuration during the D3hot->D0 transition
will be inaccessible to their drivers.
Drivers could be modified to account for this, but it would
be difficult to know which drivers need modification. This is
especially true since often many devices are covered by the same
driver. It likely would be necessary to replicate code across dozens
of drivers.
The patch below should trigger only when transitioning from D3hot->D0
(or at boot), and only for devices that have the "no soft reset" bit
cleared in the PM control register. I believe it is safe to include
this patch as part of the PCI infrastructure.
The cleanest implementation of pci_restore_bars was to call
pci_update_resource. Unfortunately, that does not currently exist
for the sparc64 architecture. The patch below includes a null
implemenation of pci_update_resource for sparc64.
Some have expressed interest in making general use of the the
pci_restore_bars function, so that has been exported to GPL licensed
modules.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The kexec boot is not successful on some power machines since all CPUs are
getting removed from global interrupt queue (GIQ) before kexec boot. Some
systems always expect at least one CPU in GIQ. Hence, this patch will make
sure that only secondary CPUs are removed from GIQ.
Signed-off-by: Haren Myneni <hbabu@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The recent change to never ignore the bitmap, revealed that the bitmap isn't
begin flushed properly when an array is stopped.
We call bitmap_daemon_work three times as there is a three-stage pipeline for
flushing updates to the bitmap file.
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Checking pte_dirty instead of pte_write in __follow_page is problematic
for s390, and for copy_one_pte which leaves dirty when clearing write.
So revert __follow_page to check pte_write as before, and make
do_wp_page pass back a special extra VM_FAULT_WRITE bit to say it has
done its full job: once get_user_pages receives this value, it no longer
requires pte_write in __follow_page.
But most callers of handle_mm_fault, in the various architectures, have
switch statements which do not expect this new case. To avoid changing
them all in a hurry, make an inline wrapper function (using the old
name) that masks off the new bit, and use the extended interface with
double underscores.
Yes, we do have a call to do_wp_page from do_swap_page, but no need to
change that: in rare case it's needed, another do_wp_page will follow.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
[ Cleanups by Nick Piggin ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a number of x86 laptops that have some non-PCI IO ports
in the 0x1000-0x1fff range, and it's quite hard to control the correct
order of resource allocation between PCI and other subsystems controlling
these ports. Especially with modular kernel.
So just increase PCIBIOS_MIN_IO to 0x4000 to prevent any new PCI
resource allocations in the problematic range (this limitation must
apply _only_ to the root bus resources - see Linus' change in
pci_bus_alloc_resource). As PCIBIOS_MIN_IO and PCIBIOS_MIN_CARDBUS_IO
are the same now on i386 and x86-64, we can remove the latter.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dont include asm-generic/topology.h unconditionally, we end up overriding
all the ppc64 specific functions when NUMA is on.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We don't want these to be global functions.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add system calls for io priorities and inotify.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add PPC440EP core support. PPC440EP is a PPC440-based SoC with a classic PPC
FPU and another set of peripherals.
Signed-off-by: Wade Farnsworth <wfarnsworth@mvista.com>
Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Fixed some bttv card numbers.
- BTTV and SAA7134 version numbers incremented to reflect changes.
- pci_dma_supported() is called after pci_set_dma_mask() which
already did check that for us. This patch removes the unneeded call to
pci_dma_supported() at bttv-driver.c
- Ensure a sufficient I2C bus idle time between 2 messages for
saa7134-i2c.c
- It is important to write at first to MO_GP3_IO for cx88-tvaudio.c
- Use try_to_freeze() instead of refrigerator at msp3400.c
- Recognizing the MFPE05-2 Tuner at tveeprom.c
- Add new parameter to help identify radio chipsets at tuner module:
show_i2c=1 will show 16 reading bytes from detected tuners.
- BTTV does generate some Unimplemented IOCTL log at tuner module:
0x40046d11(dir=1,tp=0x6d,nr=17,sz=4) means that it is sending
MSP3400 calls to non-msp3400 tuners. Warning eliminated.
VIDIOSAUDIO is also called, so debug messages updated. It is still
requiring IOCTL implementation.
- Added two more tuners.
- Add support for the SVideo input on the GDI Black Gold.
Signed-off-by: Peter Missel <peter.missel@onlinehome.de>
Signed-off-by: Graham Bevan <graham.bevan@ntlworld.com>
Signed-off-by: Torsten Seeboth <Torsten.Seeboth@t-online.de>
Signed-off-by: Hartmut Hackmann <hartmut.hackmann@t.online.de>
Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Michael Krufky <mkrufky@m1k.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@brturbo.com.br>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a file is moved over an existing file that you are watching,
inotify won't send you a DELETE_SELF event and it won't unref the inode
until the inotify instance is closed by the application.
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-Wundef found an (although perhaps harmless) bug:
<-- snip -->
...
CC net/ieee80211/ieee80211_crypt.o
In file included from net/ieee80211/ieee80211_crypt.c:21:
include/net/ieee80211.h:26:5: warning: "WIRELESS_EXT" is not defined
CC net/ieee80211/ieee80211_crypt_wep.o
In file included from net/ieee80211/ieee80211_crypt_wep.c:20:
include/net/ieee80211.h:26:5: warning: "WIRELESS_EXT" is not defined
CC net/ieee80211/ieee80211_crypt_ccmp.o
CC net/ieee80211/ieee80211_crypt_tkip.o
In file included from net/ieee80211/ieee80211_crypt_tkip.c:23:
include/net/ieee80211.h:26:5: warning: "WIRELESS_EXT" is not defined
...
<-- snip -->
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
ethernet drivers to remain as ignorant as is reasonable of the connected
PHY's design and operation details.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
inotify system call support for PPC64
[ I don't think we need sys32 compatibility versions--and if we do, I
failed in life. ]
Signed-off-by: Robert Love <rml@novell.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add inotify system call stubs to PPC32.
Signed-off-by: Robert Love <rml@novell.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add reference count and disable ACPI PCI Interrupt Link
when no device still uses it.
Warn when drivers have not released Link at suspend time.
http://bugzilla.kernel.org/show_bug.cgi?id=3469
Signed-off-by: David Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
sync_tsc was using smp_call_function to ask the boot processor to report
it's tsc value. smp_call_function performs an IPI_send_allbutself which is
a broadcast ipi. There is a window during processor startup during which
the target cpu has started and before it has initialized it's interrupt
vectors so it can properly process an interrupt. Receveing an interrupt
during that window will triple fault the cpu and do other nasty things.
Why cli does not protect us from that is beyond me.
The simple fix is to match ia64 and provide a smp_call_function_single.
Which avoids the broadcast and is more efficient.
This certainly fixes the problem of getting stuck on boot which was
very easy to trigger on my SMP Hyperthreaded Xeon, and I think
it fixes it for the right reasons.
Minor changes by AK
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the patch from:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0506.3/0985.html
Is the the following line suppose inside the if CONFIG_PCI=n
#define pci_dma_burst_advice(pdev, strat, strategy_parameter) do { } while (0)
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the standard hardware page table manipulation macros.
This is possible now that linux works with all 4 levels
of the page tables.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some edge problems with the original C rewrite.
Thanks go to Cal Peake, who pinpointed the breakage to the rewrite, and
tested this fixed version.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We added an include of asm/vm86.h in include/asm-i386/ptrace.h. Since UML
includes the underlying arch's ptrace.h, it needs an asm/vm86.h in order to
build.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch eliminates the GCC4 warning on the x86_64 platform:
kernel/sched.c:1824: warning: control may reach end of non-void function
'sched_find_first_bit' being inlined.
The change follows the lead of others, i.e. it is guaranteed that at least
one of b[0], b[1], or b[2] will have a bit set and evaluate to true. That
being said, GCC4.0.0 notices that the code flow does not return anything if
b[0], b[1] and b[2] are not true. Since we know better, if it's not b[0] or
b[1], it has to be b[2].
Signed-off-by: Jesse Millan <jessem@cs.pdx.edu>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This avoids some potential stack overflows with very deep softirq callchains.
i386 does this too.
TOADD CFI annotation
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This avoids confusing the disassembler. Costs 2 bytes per BUG.
Thanks to Suresh Siddha and Jan Beulich for suggesting suitable instructions.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Harmless because the kernel didn't use it. Noticed by Travis Betak
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Were either outdated or misleading.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Minor cleanup.
Move things into their include files, remove obsolete includes, fix
indentation, remove obsolete special cases etc.
I also added the per cpu section to asm-generic/sections.h and fixed
init/main.c to use it.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Makes it slightly more efficient.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We sometimes forgot to check whether the exclusive store succeeded.
Ensure that we always check. Also ensure that we always use the
out of line versions, since the inline versions are not SMP safe.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It may shut up gcc, but it also incorrectly changes the semantics of the
smp_call_function() helpers.
You can fix the warning other ways if you are interested (create another
inline function that takes no arguments and returns zero), but
preferably gcc just shouldn't complain about unused return values from
statement expressions in the first place.
Avoid using "rep scas", just let the compiler select a sequence of
regular instructions.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Apparently gcc 4.0 complains about "({ 0; });", which leads to -Werror
breakage in one of the alpha oprofile modules.
One might could argue that this is a gcc bug, in that statement-expressions
should be considered to be function-like rather than statement-like for the
purposes of this warning. But it's just as easy to use an inline function
in the first place, side-stepping the issue.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
EMU10K1/EMU10K2 driver
It allows the user to force the snd-emu10k1 module to think the user
has a particular sound card. Useful if their particular sound card
is not yet recognised.
Signed-off-by: James Courtier-Dutton <James@superbug.co.uk>
ALSA Core
This patch changes, adds and remove some comments, which will
make now more sense and fit on a 80-char line. It also changes
the order of snd_power_wait() to make the file more readable.
It removes the device.c comment in front of _snd_minor,
cause snd_minor has nothing to do with device.c.
The both typos in the kernel-docs were corrected too.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
`gcc -W' likes to complain if the static keyword is not at the beginning of
the declaration. This patch fixes all remaining occurrences of "inline
static" up with "static inline" in the entire kernel tree (140 occurrences in
47 files).
While making this change I came across a few lines with trailing whitespace
that I also fixed up, I have also added or removed a blank line or two here
and there, but there are no functional changes in the patch.
Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch contains the following possible cleanups:
- make two needlessly global structs static
- #if 0 the EXPORT_SYMBOL'ed but unused function tveeprom_dump
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use tabs for formatting like anywhere else in this file.
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The cache parameter to mb_cache_shrink isn't used. We may as well remove
it.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I believe that there is a problem with the handling of POSIX locks, which
the attached patch should address.
The problem appears to be a race between fcntl(2) and close(2). A
multithreaded application could close a file descriptor at the same time as
it is trying to acquire a lock using the same file descriptor. I would
suggest that that multithreaded application is not providing the proper
synchronization for itself, but the OS should still behave correctly.
SUS3 (Single UNIX Specification Version 3, read: POSIX) indicates that when
a file descriptor is closed, that all POSIX locks on the file, owned by the
process which closed the file descriptor, should be released.
The trick here is when those locks are released. The current code releases
all locks which exist when close is processing, but any locks in progress
are handled when the last reference to the open file is released.
There are three cases to consider.
One is the simple case, a multithreaded (mt) process has a file open and
races to close it and acquire a lock on it. In this case, the close will
release one reference to the open file and when the fcntl is done, it will
release the other reference. For this situation, no locks should exist on
the file when both the close and fcntl operations are done. The current
system will handle this case because the last reference to the open file is
being released.
The second case is when the mt process has dup(2)'d the file descriptor.
The close will release one reference to the file and the fcntl, when done,
will release another, but there will still be at least one more reference
to the open file. One could argue that the existence of a lock on the file
after the close has completed is okay, because it was acquired after the
close operation and there is still a way for the application to release the
lock on the file, using an existing file descriptor.
The third case is when the mt process has forked, after opening the file
and either before or after becoming an mt process. In this case, each
process would hold a reference to the open file. For each process, this
degenerates to first case above. However, the lock continues to exist
until both processes have released their references to the open file. This
lock could block other lock requests.
The changes to release the lock when the last reference to the open file
aren't quite right because they would allow the lock to exist as long as
there was a reference to the open file. This is too long.
The new proposed solution is to add support in the fcntl code path to
detect a race with close and then to release the lock which was just
acquired when such as race is detected. This causes locks to be released
in a timely fashion and for the system to conform to the POSIX semantic
specification.
This was tested by instrumenting a kernel to detect the handling locks and
then running a program which generates case #3 above. A dangling lock
could be reliably generated. When the changes to detect the close/fcntl
race were added, a dangling lock could no longer be generated.
Cc: Matthew Wilcox <willy@debian.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The atomic64 primitives are supposed to have 64-bit parameters instead of int.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The find_next_{zero}_bit primitives on s390* should never return a bit number
bigger then the bit field size. In the case of a bitfield that doesn't end on
a word boundary, an offset that makes the search start at the last word of the
bit field and the last word doesn't contain any zero/one bits the search is
continued with a call to find_first_bit with a negative size. The search
normally ends pretty quickly because the words following the bit field contain
a mix of zeros and ones. But the bit number that is returned in this case is
too big.
To fix this and additional if to check for this case is needed. To make the
code easier to read I removed the assembler parts from the
find_next_{zero}_bit functions, the C-ified code is as good.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>