Commit Graph

2182 Commits

Author SHA1 Message Date
Bjorn Helgaas 2adf75160b PCI: read bridge windows before filling in subtractive decode resources
No functional change; this fills in the bus subtractive decode resources
after reading the bridge window information rather than before.  Also,
print out the subtractive decode resources as we already do for the
positive decode windows.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-23 09:43:25 -08:00
Bjorn Helgaas fa27b2d108 PCI: split up pci_read_bridge_bases()
No functional change; this breaks up pci_read_bridge_bases() into separate
pieces for the I/O, memory, and prefetchable memory windows, similar to how
Yinghai recently split up pci_setup_bridge() in 68e84ff3bdc.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-23 09:43:17 -08:00
Kenji Kaneshige b16694f70c PCIe PME: use pci_pcie_cap()
Use pci_pcie_cap() instead of pci_find_capability() to get PCIe
capability offset. This reduces redundant search in PCI configuration
space.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:21:21 -08:00
Rafael J. Wysocki 6cbf82148f PCI PM: Run-time callbacks for PCI bus type
Introduce run-time PM callbacks for the PCI bus type.  Make the new
callbacks work in analogy with the existing system sleep PM
callbacks, so that the drivers already converted to struct dev_pm_ops
can use their suspend and resume routines for run-time PM without
modifications.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:21:19 -08:00
Kenji Kaneshige 552be54cc4 PCIe PME: use pci_is_pcie()
Use pci_is_pcie() instead of looking at obsolete is_pcie field in
struct pci_dev.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:21:10 -08:00
Rafael J. Wysocki b67ea76172 PCI / ACPI / PM: Platform support for PCI PME wake-up
Although the majority of PCI devices can generate PMEs that in
principle may be used to wake up devices suspended at run time,
platform support is generally necessary to convert PMEs into wake-up
events that can be delivered to the kernel.  If ACPI is used for this
purpose, PME signals generated by a PCI device will trigger the ACPI
GPE associated with the device to generate an ACPI wake-up event that
we can set up a handler for, provided that everything is configured
correctly.

Unfortunately, the subset of PCI devices that have GPEs associated
with them is quite limited.  The devices without dedicated GPEs have
to rely on the GPEs associated with other devices (in the majority of
cases their upstream bridges and, possibly, the root bridge) to
generate ACPI wake-up events in response to PME signals from them.

Add ACPI platform support for PCI PME wake-up:
o Add a framework making is possible to use ACPI system notify
  handlers for run-time PM.
o Add new PCI platform callback ->run_wake() to struct
  pci_platform_pm_ops allowing us to enable/disable the platform to
  generate wake-up events for given device.  Implemet this callback
  for the ACPI platform.
o Define ACPI wake-up handlers for PCI devices and PCI root buses and
  make the PCI-ACPI binding code register wake-up notifiers for all
  PCI devices present in the ACPI tables.
o Add function pci_dev_run_wake() which can be used by PCI drivers to
  check if given device is capable of generating wake-up events at
  run time.

Developed in cooperation with Matthew Garrett <mjg@redhat.com>.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:21:02 -08:00
Rafael J. Wysocki c39fae1416 PCI PM: Make it possible to force using INTx for PCIe PME signaling
Apparently, some machines may have problems with PCI run-time power
management if MSIs are used for the native PCIe PME signaling.  In
particular, on the MSI Wind U-100 PCIe PME interrupts are not
generated by a PCIe root port after a resume from suspend to RAM, if
the system wake-up was triggered by a PME from the device attached to
this port.  [It doesn't help to free the interrupt on suspend and
request it back on resume, even if that is done along with disabling
the MSI and re-enabling it, respectively.]  However, if INTx
interrupts are used for this purpose on the same machine, everything
works just fine.

For this reason, add a kernel command line switch allowing one to
request that MSIs be not used for the native PCIe PME signaling,
introduce a DMI table allowing us to blacklist machines that need
this switch to be set by default and put the MSI Wind U-100 into this
table.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:20:39 -08:00
Rafael J. Wysocki c7f486567c PCI PM: PCIe PME root port service driver
PCIe native PME detection mechanism is based on interrupts generated
by root ports or event collectors every time a PCIe device sends a
PME message upstream.

Once a PME message has been sent by an endpoint device and received
by its root port (or event collector in the case of root complex
integrated endpoints), the Requester ID from the message header is
registered in the root port's Root Status register.  At the same
time, the PME Status bit of the Root Status register is set to
indicate that there's a PME to handle.  If PCIe PME interrupt is
enabled for the root port, it generates an interrupt once the PME
Status has been set.  After receiving the interrupt, the kernel can
identify the PCIe device that generated the PME using the Requester
ID from the root port's Root Status register. [For details, see PCI
Express Base Specification, Rev. 2.0.]

Implement a driver for the PCIe PME root port service working in
accordance with the above description.

Based on a patch from Shaohua Li <shaohua.li@intel.com>.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:20:31 -08:00
Rafael J. Wysocki 58ff463396 PCI PM: Add function for checking PME status of devices
Add function pci_check_pme_status() that will check the PME status
bit of given device and clear it along with the PME enable bit.  It
will be necessary for PCI run-time power management.

Based on a patch from Shaohua Li <shaohua.li@intel.com>

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:20:24 -08:00
Yinghai Lu 9958610552 PCI: set PCI_PREF_RANGE_TYPE_64 in pci_bridge_check_ranges
Make pci_bridge_check_ranges() store the PCI_PREF_RANGE_TYPE_64 in
addition to IORESOURCE_MEM_64.  Just like pci_read_bridge_bases().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:25 -08:00
Yinghai Lu 32180e402f PCI: pciehp: second try to get big range for pcie devices
Handle the case where the slot bridge that doesn't get a pre-allocated
resource big enough to handle its child resources..  For example pcie
devices need 256M, but the bridge only gets 2M preallocated.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:25 -08:00
Yinghai Lu 9789ac979b PCI: pciehp: cleanup flow in pciehp_configure_device
Move bus_size_bridges and assign resources out of pciehp_add_bridge()
and do them all together, one time, including slot bridge, to avoid to
calling assign resources several times when there are several bridges
under the slot bridge.  Using pci_assign_unassigned_bridge_resources.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:24 -08:00
Yinghai Lu 6841ec681a PCI: introduce pci_assign_unassigned_bridge_resources
For use by pciehp.

pci_setup_bridge() will not check enabled for the slot bridge, otherwise
update res is not updated to bridge BAR.  That is, bridge is already
enabled for port service.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:23 -08:00
Yinghai Lu 977d17bb17 PCI: update bridge resources to get more big ranges in PCI assign unssigned
BIOS separates IO ranges between several IOHs, and on some slots, BIOS assigns
resources to a bridge, but stops assigning resources to the device under that
bridge, because the device needs a big resource.

So:
  1. allocate resources and record the failed device resources
  2. clear the BIOS assigned resources of the parent bridge of failing device
  3. go back and call pci assign unassigned
  4. if it still fails, go up the tree, clear more bridges. and try again

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:22 -08:00
Yinghai Lu d65245c329 PCI: don't shrink bridge resources
When clearing leaf bridge resources, trying to get a big enough one, we
could shrink the bridge if there is no resource under it.  Confirm
against the old resource side to make sure we're increasing the
allocation.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:21 -08:00
Yinghai Lu cd81e1ea1a PCI: reject mmio ranges starting at 0 on pci_bridge read
We already track unassigned resources in struct resource, and this
prevents us from overwriting resource flags and info in the unassigned
case.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:21 -08:00
Yinghai Lu 568ddef873 PCI: add failed_list to pci_bus_assign_resources
This allows us to track failed allocations for later re-trying with
reallocation.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:20 -08:00
Yinghai Lu 5009b46025 PCI: add pci_bridge_release_resources and pci_bus_release_bridge_resources
We use this in later patches to free resrouce ranges for reassignment in
an effort to support a wider variety of PCI topologies.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:19 -08:00
Andrew Morton ba02b242bb PCI hotplug: check ioremap() return value in ibmphp_ebda.c
check ioremap() return value.

Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:19 -08:00
Randy Dunlap 939fdc6735 PCI hotplug: fix ibmphp build error
Add header file to fix build error:

drivers/pci/hotplug/ibmphp_hpc.c:135: error: implicit declaration of function 'init_MUTEX'
drivers/pci/hotplug/ibmphp_hpc.c:136: error: implicit declaration of function 'init_MUTEX_LOCKED'
drivers/pci/hotplug/ibmphp_hpc.c:797: error: implicit declaration of function 'down'
drivers/pci/hotplug/ibmphp_hpc.c:807: error: implicit declaration of function 'up'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:18 -08:00
Matthew Wilcox 4fb88c1a28 PCI: Make pci_scan_slot more robust
Yinghai pointed out that the new pci_scan_slot() crashes when called
on an ARI-capable slot that is empty.  Fix this by exiting early from
pci_scan_slot if there is no device in the slot.

Also make next_ari_func() robust against devices not existing in case
the ARI capability is corrupt.  ARI also requires that the devices be
listed in order, so if we find a function listed that is out of order,
stop scanning to prevent loops.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:17 -08:00
Jiri Slaby 0bf01c3c86 PCI: hotplug/cpcihp, fix pci device refcounting
Stanse found an ommitted pci_dev_put on one error path in
cpcihp_generic_init. The path is taken on !dev, but also when
dev->hdr_type != PCI_HEADER_TYPE_BRIDGE. However it omits to
pci_dev_put on the latter.

As it is fine to pass NULL to pci_dev_put, put it in there
uncoditionally.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Scott Murray <scott@spiteful.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:02 -08:00
Tilman Schmidt 41a68a748b PCI: push deprecated pci_find_device() function to last user
The ISDN4Linux HiSax driver family contains the last remaining users
of the deprecated pci_find_device() function. This patch creates a
private copy of that function in HiSax, and removes the now unused
global function together with its controlling configuration option,
CONFIG_PCI_LEGACY.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:01 -08:00
Yinghai Lu 7c9342b8dd PCI: don't dump resource when bus resource flags indicates unused
Don't print out resources without flags to avoid cluttering up the debug
output.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:17:00 -08:00
Yinghai Lu 7cc5997d1d PCI: separate pci_setup_bridge to small functions
This is a good cleanup in itself, and makes it easier to modify specific
resource types in later code.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:16:59 -08:00
Chandru b0fc889c43 PCI hotplug: ibmphp: read the length of ebda and map entire ebda region
ibmphp driver currently maps only 1KB of ebda memory area into kernel address
space during driver initialization. This causes kernel oops when the driver is
modprobe'd and it accesses memory beyond 1KB within ebda segment. The first
byte of ebda segment actually stores the length of the ebda region in
Kilobytes. Hence make use of the length parameter and map the entire ebda
region.

Signed-off-by: Chandru Siddalingappa <chandru@linux.vnet.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:16:58 -08:00
Jiri Slaby 6fcaf17ac7 PCI hotplug: fix memory leaks
Stanse found a cut&pasted memory leak in pciehp_queue_pushbutton_work
and shpchp_queue_pushbutton_work. info is not freed/assigned on all
paths. Fix that.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:16:57 -08:00
Dominik Brodowski 3b7a17fcda resource/PCI: mark struct resource as const
Now that we return the new resource start position, there is no
need to update "struct resource" inside the align function.
Therefore, mark the struct resource as const.

Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:16:57 -08:00
Dominik Brodowski b26b2d494b resource/PCI: align functions now return start of resource
As suggested by Linus, align functions should return the start
of a resource, not void. An update of "res->start" is no longer
necessary.

Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:16:56 -08:00
Rafael J. Wysocki 93177a748b PCI: Clean up build for CONFIG_PCI_QUIRKS unset
Currently, drivers/pci/quirks.c is built unconditionally, but if
CONFIG_PCI_QUIRKS is unset, the only things actually built in this
file are definitions of global variables and empty functions (due to
the #ifdef CONFIG_PCI_QUIRKS embracing all of the code inside the
file).  This is not particularly nice and if someone overlooks
the #ifdef CONFIG_PCI_QUIRKS, build errors are introduced.

To clean that up, move the definitions of the global variables in
quirks.c that are always built to pci.c, move the definitions of
the empty functions (compiled when CONFIG_PCI_QUIRKS is unset) to
headers (additionally make these functions static inline) and modify
drivers/pci/Makefile so that quirks.c is only built if
CONFIG_PCI_QUIRKS is set.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:21 -08:00
Jesse Barnes 3804259475 PCI hotplug: remove obsolete usage of get_bus_speed from rpaphp hotplug ops
No longer needed and causes build breakage.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:20 -08:00
Matthew Wilcox 9dfd97fe12 PCI: Add support for reporting PCIe 3.0 speeds
Add the 8.0 GT/s speed.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:19 -08:00
Matthew Wilcox 45b4cdd57e PCI: Add support for AGP in cur/max bus speed
Take advantage of some gaps in the table to fit in support for AGP speeds.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:19 -08:00
Matthew Wilcox 9be60ca049 PCI: Add support for detection of PCIe and PCI-X bus speeds
Both PCIe and PCI-X bridges report their secondary bus speed in their
respective capabilities.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:18 -08:00
Matthew Wilcox 3749c51ac6 PCI: Make current and maximum bus speeds part of the PCI core
Move the max_bus_speed and cur_bus_speed into the pci_bus.  Expose the
values through the PCI slot driver instead of the hotplug slot driver.
Update all the hotplug drivers to use the pci_bus instead of their own
data structures.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:17 -08:00
Matthew Wilcox 536c8cb49e PCI: Unify pcie_link_speed and pci_bus_speed
These enums must not overlap anyway, since we only have a single
pci_bus_speed_strings array.  Use a single enum, and move it to
pci.h.  Add 'SPEED' to the pcie names to make it clear what they are.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:17 -08:00
Matthew Wilcox f07852d644 PCI: Rewrite pci_scan_slot
The Alternate Routing-ID Interpretation capability allows a single device
to have up to 256 functions.  They can be populated sparsely, so the
current technique of scanning every eighth function is not guaranteed
to find them all.  By introducing a 'next_fn' function pointer, we can
use the linked list of functions in the ARI capability to scan all the
functions which exist.

We can then speed up the pci_scan_slot by skipping the scan of subsequent
devfns for PCIe devices which are the direct children of Root Ports or
Downstream Ports.  These devices are only permitted to implement device
0, unless they are ARI devices, in which case they'll be scanned by the
ARI code above.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:15:16 -08:00
Alexander Duyck 7a0deb6bcd pci: add support for 82576NS serdes to existing SR-IOV quirk
This patch adds support for the 82576NS Serdes adapter to the existing pci
quirk for 82576 parts.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-22 15:45:33 -08:00
Jesse Barnes cf4c43dd43 PCI: Add pci_bus_find_ext_capability
For use by code that needs to walk extended capability lists before
pci_dev structures are set up.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <43F901BD926A4E43B106BF17856F07559FB80CFD@orsmsx508.amr.corp.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-19 16:12:26 -08:00
Len Brown 0e2ecbaefd Merge branches 'bugzilla-14886', 'bugzilla-15000', 'bugzilla-15040', 'bugzilla-15108', 'pdc', 'hotplug-null-ref' and 'thinkpad' into release 2010-02-18 03:51:04 -05:00
David S. Miller 2bb4646fce Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-02-16 22:09:29 -08:00
Williams, Mitch A fb8a0d9d1b pci: Add SR-IOV convenience functions and macros
Add and export pci_num_vf to allow other subsystems to determine how many
virtual function devices are associated with an SR-IOV physical function
device.
Add macros dev_is_pci, dev_is_ps, and dev_num_vf to make it easier for
non-PCI specific code to determine SR-IOV capabilities.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-12 16:56:07 -08:00
H. Peter Anvin c85e4aae69 ibmphp: Rename add_range() to add_bus_range() to avoid conflict
Rename add_range() to add_bus_range() to avoid conflict with the
naming of the generic range manipulation functions.

LKML-Reference: <1265793639-15071-4-git-send-email-yinghai@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-10 17:45:09 -08:00
Mike Travis 95a8b6efc5 pci: Update pci_set_vga_state() to call arch functions
Update pci_set_vga_state to call arch dependent functions to enable Legacy
VGA I/O transactions to be redirected to correct target.

[akpm@linux-foundation.org: make pci_register_set_vga_state() __init]
Signed-off-by: Mike Travis <travis@sgi.com>
LKML-Reference: <201002022238.o12McE1J018723@imap1.linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Robin Holt <holt@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-05 14:05:41 -08:00
Andres Salomon 73d2eaac8a CS5536: apply pci quirk for BIOS SMBUS bug
The new cs5535-* drivers use PCI header config info rather than MSRs to
determine the memory region to use for things like GPIOs and MFGPTs.  As
anticipated, we've run into a buggy BIOS:

[    0.081818] pci 0000:00:14.0: reg 10: [io  0x6000-0x7fff]
[    0.081906] pci 0000:00:14.0: reg 14: [io  0x6100-0x61ff]
[    0.082015] pci 0000:00:14.0: reg 18: [io  0x6200-0x63ff]
[    0.082917] pci 0000:00:14.2: reg 20: [io  0xe000-0xe00f]
[    0.083551] pci 0000:00:15.0: reg 10: [mem 0xa0010000-0xa0010fff]
[    0.084436] pci 0000:00:15.1: reg 10: [mem 0xa0011000-0xa0011fff]
[    0.088816] PCI: pci_cache_line_size set to 32 bytes
[    0.088938] pci 0000:00:14.0: address space collision: [io 0x6100-0x61ff] already in use
[    0.089052] pci 0000:00:14.0: can't reserve [io  0x6100-0x61ff]

This is a Soekris board, and its BIOS sets the size of the PCI ISA bridge
device's BAR0 to 8k.  In reality, it should be 8 bytes (BAR0 is used for
SMBus stuff).  This quirk checks for an incorrect size, and resets it
accordingly.

Signed-off-by: Andres Salomon <dilinger@collabora.co.uk>
Tested-by: Leigh Porter <leigh@leighporter.org>
Tested-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-05 07:36:50 -08:00
Linus Torvalds 32337f8a70 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: TIF_ABI_PENDING bit removal
  powerpc/pseries: Fix xics build without CONFIG_SMP
  powerpc/4xx: Add pcix type 1 transactions
  powerpc/pci: Add missing call to header fixup
  powerpc/pci: Add missing hookup to pci_slot
  powerpc/pci: Add calls to set_pcie_port_type() and set_pcie_hotplug_bridge()
  powerpc/40x: Update the PowerPC 40x board defconfigs
  powerpc/44x: Update PowerPC 44x board defconfigs
2010-02-01 10:37:58 -08:00
Thomas Renninger 7779688fc3 ACPI: acpi_bus_{scan,bus,add}: return -ENODEV if no device was found
Callers (acpi_memhotplug.c, dock.c and others) check for the return
value of acpi_bus_add() and assume a valid device was returned in
case zero was returned.

Thus return -ENODEV if no device was found in acpi_bus_scan and
propagate this through acpi_bus_add and acpi_bus_start.

Also remove a confusing comment in acpiphp_glue.c, acpi_bus_scan
will and cannot invoke if acpi_bus_add returns no valid device.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-01-31 21:43:32 -05:00
Benjamin Herrenschmidt bb209c8287 powerpc/pci: Add calls to set_pcie_port_type() and set_pcie_hotplug_bridge()
We are missing these when building the pci_dev from scratch off
the Open Firmware device-tree

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-29 16:51:10 +11:00
Lin Ming 439913fffd ACPI: replace acpi_integer by u64
acpi_integer is now obsolete and removed from the ACPICA code base,
replaced by u64.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-01-28 01:47:33 -05:00
Andrew Patterson bd1f46deba PCI: fix nested spinlock hang in aer_inject
The aer_inject module hangs in aer_inject() when checking the device's
error masks.  The hang is due to a recursive use of the aer_inject lock.
The aer_inject() routine grabs the lock while processing the error and then
calls pci_read_config_dword to read the masks. The pci_read_config_dword
routine is earlier overridden by pci_read_aer, which among other things,
grabs the aer_inject lock.

Fixed by moving the pci_read_config_dword calls to read the masks to before
the lock is taken.

Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-25 10:42:52 -08:00
Youquan,Song b49bfd3290 PCIe AER: prevent AER injection if hardware masks error reporting
The Correcteable/Uncorrectable Error Mask Registers are used by PCIe AER
driver which will controls the reporting of individual errors to PCIe RC
via PCIe error messages.

If hardware masks special error reporting to RC, the aer_inject driver
should not inject aer error.

Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Youquan, Song <youquan.song@intel.com>
Acked-by: Ying, Huang <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-04 15:52:49 -08:00
Rafael J. Wysocki 1ae861e652 PCI/PM: Use per-device D3 delays
It turns out that some PCI devices require extra delays when changing
power state from D3 to D0 (and the other way around).  Although this
is against the PCI specification, we can handle it quite easily by
allowing drivers to define arbitrary D3 delays for devices known to
require extra time for switching power states.

Introduce additional field d3_delay in struct pci_dev and use it to
store the value of the device's D0->D3 delay, in miliseconds.  Make
the PCI PM core code use the per-device d3_delay unless
pci_pm_d3_delay is greater (in which case the latter is used).
[This also allows the driver to specify d3_delay shorter than the
 10 ms required by the PCI standard if the device is known to be able
 to handle that.]

Make the sky2 driver set d3_delay to 150 for devices handled by it.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=14730 which is a
listed regression from 2.6.30.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-04 15:41:47 -08:00
David John 6be954d1f9 PCI: Check the node argument passed to cpumask_of_node
Commit e0cd516 "PCI: derive nearby CPUs from device's instead of bus'
NUMA information" causes an null pointer dereference when reading from
the sysfs attributes local_cpu* on Intel machines with no ACPI NUMA
proximity info, since dev->numa_node gets set to -1 for all PCI devices,
which then gets passed to cpumask_of_node.

Add a check to prevent this.

Signed-off-by: David John <davidjon@xenontk.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-04 15:10:56 -08:00
Youquan,Song 46256f83d0 PCI: AER: fix aer inject result in kernel oops
If the BIOS does not export _OSC to allow OS take over the PCIe AER, the
pcie aer driver will not initialize the aer service. However, the
aer_inject driver does not check this scenario, which results in a kernel
oops when injecting an aer error into OS.  For example:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000350
IP: [<ffffffff812e08f7>] _spin_lock_irqsave+0xc/0x23
PGD 155c41067 PUD 157fe0067 PMD 0
Oops: 0002 [#1] SMP
Pid: 5119, comm: aer-inject Not tainted 2.6.32-rc8-mce #2
RIP: 0010:[<ffffffff812e08f7>]  [<ffffffff812e08f7>] _spin_lock_irqsave+0xc/0x23
RSP: 0018:ffff880157f81e28  EFLAGS: 00010096
RAX: 0000000000000296 RBX: 0000000000000000 RCX: 0000000000000100
RDX: 0000000000010000 RSI: 0000000000000246 RDI: 0000000000000350
RBP: ffff880157f81e28 R08: 0000000000000004 R09: ffff880157f81dac
R10: ffff88015a666f60 R11: ffff88015a666f40 R12: ffff88015758cc00
R13: 0000000000000350 R14: 0000000000000000 R15: 0000000000000100
FS:  00007f4d4a66e6f0(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000350 CR3: 000000015661a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process aer-inject (pid: 5119, threadinfo ffff880157f80000, task ffff8801585f4340)
Stack:
 ffff880157f81e78 ffffffff811b1615 ffff880157f81e78 ffffffff81222823
Call Trace:
 [<ffffffff811b1615>] aer_irq+0x38/0x117
 [<ffffffff81222823>] ? device_for_each_child+0x5f/0x6f
 [<ffffffffa00967bf>] aer_inject_write+0x409/0x45e [aer_inject]
 [<ffffffff810eb80e>] vfs_write+0xae/0x16a
 [<ffffffff810eb98e>] sys_write+0x47/0x6e
 [<ffffffff8100ba2b>] system_call_fastpath+0x16/0x1b
RIP  [<ffffffff812e08f7>] _spin_lock_irqsave+0xc/0x23
 RSP <ffff880157f81e28>
CR2: 0000000000000350

So check the _OSC before assuming that AER is available to the OS.

Signed-off-by: Youquan, Song <youquan.song@intel.com>
Acked-by: Ying, Huang <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-04 08:31:46 -08:00
Hidetoshi Seto 40da4186a5 PCI: pcie portdrv: style cleanup
No change in logic.

Before:
  drivers/pci/pcie/portdrv_core.c:
    total: 7 errors, 2 warnings, 508 lines checked
  drivers/pci/pcie/portdrv_pci.c:
    total: 4 errors, 2 warnings, 300 lines checked

After:
  drivers/pci/pcie/portdrv_core.c:
    total: 0 errors, 0 warnings, 506 lines checked
  drivers/pci/pcie/portdrv_pci.c:
    total: 0 errors, 0 warnings, 299 lines checked

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-04 08:29:37 -08:00
Linus Torvalds df9d1e8a43 pci: avoid compiler warning in quirks.c
Introduced by commit 5b889bf23 ("PCI: Fix build if quirks are not
enabled"), which made the pci_dev_reset_methods[] array static and
'const', but didn't then change the code to match, and use a const
pointer when moving it to quirks.c.

Trivially fixed by just adding the required 'const' to the iterator
variable.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-31 16:44:43 -08:00
Rafael J. Wysocki 5b889bf237 PCI: Fix build if quirks are not enabled
After commit b9c3b26641 ("PCI: support
device-specific reset methods") the kernel build is broken if
CONFIG_PCI_QUIRKS is unset.

Fix this by moving pci_dev_specific_reset() to drivers/pci/quirks.c and
providing an empty replacement for !CONFIG_PCI_QUIRKS builds.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-31 12:00:45 -08:00
Linus Torvalds d661d76b02 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI/cardbus: Add a fixup hook and fix powerpc
  PCI: change PCI nomenclature in drivers/pci/ (non-comment changes)
  PCI: change PCI nomenclature in drivers/pci/ (comment changes)
  PCI: fix section mismatch on update_res()
  PCI: add Intel 82599 Virtual Function specific reset method
  PCI: add Intel USB specific reset method
  PCI: support device-specific reset methods
  PCI: Handle case when no pci device can provide cache line size hint
  PCI/PM: Propagate wake-up enable for PCIe devices too
  vgaarbiter: fix a typo in the vgaarbiter Documentation
2009-12-30 13:13:24 -08:00
Benjamin Herrenschmidt 2d1c861871 PCI/cardbus: Add a fixup hook and fix powerpc
The cardbus code creates PCI devices without ever going through the
necessary fixup bits and pieces that normal PCI devices go through.

There's in fact a commented out call to pcibios_fixup_bus() in there,
it's commented because ... it doesn't work.

I could make pcibios_fixup_bus() do the right thing on powerpc easily
but I felt it cleaner instead to provide a specific hook pci_fixup_cardbus
for which a weak empty implementation is provided by the PCI core.

This fixes cardbus on powerbooks and probably all other PowerPC
platforms which was broken completely for ever on some platforms and
since 2.6.31 on others such as PowerBooks when we made the DMA ops
mandatory (since those are setup by the fixups).

Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 18:55:51 -08:00
Stefan Assmann 7e8af37a9a PCI: change PCI nomenclature in drivers/pci/ (non-comment changes)
Changing occurrences of variants of PCI-X and PCIe to the PCI-SIG
terms listed in the "Trademark and Logo Usage Guidelines".
http://www.pcisig.com/developers/procedures/logos/Trademark_and_Logo_Usage_Guidelines_updated_112206.pdf

Patch is limited to drivers/pci/ and changes concern non-comment parts or
anything that might be visible to the user.

Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:54 -08:00
Stefan Assmann 45e829ea41 PCI: change PCI nomenclature in drivers/pci/ (comment changes)
Changing occurrences of variants of PCI-X and PCIe to the PCI-SIG
terms listed in the "Trademark and Logo Usage Guidelines".
http://www.pcisig.com/developers/procedures/logos/Trademark_and_Logo_Usage_Guidelines_updated_112206.pdf

Patch is limited to drivers/pci/ and changes concern comments only.

Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:53 -08:00
Dexuan Cui c763e7b58f PCI: add Intel 82599 Virtual Function specific reset method
Handle device specific timeout and use FLR.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:52 -08:00
Dexuan Cui aeb30016fe PCI: add Intel USB specific reset method
Handle device specific reset requirements (i.e. vendor reg for reset
along with appropriate timeout).

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:51 -08:00
Dexuan Cui b9c3b26641 PCI: support device-specific reset methods
Add a new type of quirk for resetting devices at pci_dev_reset time.
This is necessary to handle device with nonstandard reset procedures,
especially useful for guest drivers.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:50 -08:00
Csaba Henk 2820f333e3 PCI: Handle case when no pci device can provide cache line size hint
Prior to this patch, if pci_read_config_byte(dev, PCI_CACHE_LINE_SIZE, ...)
returns 0 for all dev, pci_cache_line_size ends up set to zero
(instead of pci_dfl_cache_line_size).

This patch ensures the pci_cache_line_size = pci_dfl_cache_line_size
setting in the above scenario.

This happens in case of a kvm-88 guest (where, consequently, the rtl8139
NIC failed to initialize).

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Csaba Henk <csaba@gluster.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:50 -08:00
Rafael J. Wysocki dc1a94ae17 PCI/PM: Propagate wake-up enable for PCIe devices too
Having read the PM part of the PCIe 2.0 specification more carefully
I think that it was a mistake to restrict the wake-up enable
propagation to non-PCIe devices, because if we do not request
control of the root ports' PME registers via OSC, PCIe PME is
supposed to be handled by the platform, just like the non-PCIe PME.
Even if we do that, the wake-up propagation is done to allow the
devices to wake up the system from sleep states which involves the
platform anyway, so it won't hurt.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 13:37:49 -08:00
Linus Torvalds a79960e576 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  implement early_io{re,un}map for ia64
  Revert "Intel IOMMU: Avoid memory allocation failures in dma map api calls"
  intel-iommu: ignore page table validation in pass through mode
  intel-iommu: Fix oops with intel_iommu=igfx_off
  intel-iommu: Check for an RMRR which ends before it starts.
  intel-iommu: Apply BIOS sanity checks for interrupt remapping too.
  intel-iommu: Detect DMAR in hyperspace at probe time.
  dmar: Fix build failure without NUMA, warn on bogus RHSA tables and don't abort
  iommu: Allocate dma-remapping structures using numa locality info
  intr_remap: Allocate intr-remapping table using numa locality info
  dmar: Allocate queued invalidation structure using numa locality info
  dmar: support for parsing Remapping Hardware Static Affinity structure
2009-12-16 10:11:38 -08:00
Alexey Dobriyan 471452104b const: constify remaining dev_pm_ops
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:25 -08:00
Linus Torvalds 11bd04f6f3 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (109 commits)
  PCI: fix coding style issue in pci_save_state()
  PCI: add pci_request_acs
  PCI: fix BUG_ON triggered by logical PCIe root port removal
  PCI: remove ifdefed pci_cleanup_aer_correct_error_status
  PCI: unconditionally clear AER uncorr status register during cleanup
  x86/PCI: claim SR-IOV BARs in pcibios_allocate_resource
  PCI: portdrv: remove redundant definitions
  PCI: portdrv: remove unnecessary struct pcie_port_data
  PCI: portdrv: minor cleanup for pcie_port_device_register
  PCI: portdrv: add missing irq cleanup
  PCI: portdrv: enable device before irq initialization
  PCI: portdrv: cleanup service irqs initialization
  PCI: portdrv: check capabilities first
  PCI: portdrv: move PME capability check
  PCI: portdrv: remove redundant pcie type calculation
  PCI: portdrv: cleanup pcie_device registration
  PCI: portdrv: remove redundant pcie_port_device_probe
  PCI: Always set prefetchable base/limit upper32 registers
  PCI: read-modify-write the pcie device control register when initiating pcie flr
  PCI: show dma_mask bits in /sys
  ...

Fixed up conflicts in:
	arch/x86/kernel/amd_iommu_init.c
	drivers/pci/dmar.c
	drivers/pci/hotplug/acpiphp_glue.c
2009-12-11 12:18:16 -08:00
Linus Torvalds 3067e02f8f Merge branch 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPICA: Update version to 20091112.
  ACPICA: Add additional module-level code support
  ACPICA: Deploy new create integer interface where appropriate
  ACPICA: New internal utility function to create Integer objects
  ACPICA: Add repair for predefined methods that must return sorted lists
  ACPICA: Fix possible fault if return Package objects contain NULL elements
  ACPICA: Add post-order callback to acpi_walk_namespace
  ACPICA: Change package length error message to an info message
  ACPICA: Reduce severity of predefined repair messages, Warning to Info
  ACPICA: Update version to 20091013
  ACPICA: Fix possible memory leak for Scope ASL operator
  ACPICA: Remove possibility of executing _REG methods twice
  ACPICA: Add repair for bad _MAT buffers
  ACPICA: Add repair for bad _BIF/_BIX packages
2009-12-09 19:57:06 -08:00
Linus Torvalds 849e8dea09 Merge branch 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: hpet: Make WARN_ON understandable
  x86: arch specific support for remapping HPET MSIs
  intr-remap: generic support for remapping HPET MSIs
  x86, hpet: Simplify the HPET code
  x86, hpet: Disable per-cpu hpet timer if ARAT is supported
2009-12-08 19:26:55 -08:00
KOSAKI Motohiro 354bb65e6e Revert "Intel IOMMU: Avoid memory allocation failures in dma map api calls"
commit eb3fa7cb51 said Intel IOMMU

    Intel IOMMU driver needs memory during DMA map calls to setup its
    internal page tables and for other data structures.  As we all know
    that these DMA map calls are mostly called in the interrupt context
    or with the spinlock held by the upper level drivers(network/storage
    drivers), so in order to avoid any memory allocation failure due to
    low memory issues, this patch makes memory allocation by temporarily
    setting PF_MEMALLOC flags for the current task before making memory
    allocation calls.

    We evaluated mempools as a backup when kmem_cache_alloc() fails
    and found that mempools are really not useful here because
     1) We don't know for sure how much to reserve in advance
     2) And mempools are not useful for GFP_ATOMIC case (as we call
        memory alloc functions with GFP_ATOMIC)

    (akpm: point 2 is wrong...)

The above description doesn't justify to waste system emergency memory
at all. Non MM subsystem must not use PF_MEMALLOC. Memory reclaim need
few memory, anyone must not prevent it. Otherwise the system cause
mysterious hang-up and/or OOM Killer invokation.

Plus, akpm already pointed out what we should do.

Then, this patch revert it.

Cc: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-12-08 10:12:04 +00:00
Chris Wright 1672af1164 intel-iommu: ignore page table validation in pass through mode
We are seeing a bug when booting w/ iommu=pt with current upstream
(bisect blames 19943b0e30 "intel-iommu:
Unify hardware and software passthrough support).

The issue is specific to this loop during identity map initialization
of each device:

domain_context_mapping_one(si_domain, ..., CONTEXT_TT_PASS_THROUGH)
...
		/* Skip top levels of page tables for
		* iommu which has less agaw than default.
		*/
		for (agaw = domain->agaw; agaw != iommu->agaw; agaw--) {
			pgd = phys_to_virt(dma_pte_addr(pgd));
			if (!dma_pte_present(pgd)) {      <------ failing here
				spin_unlock_irqrestore(&iommu->lock, flags);
			return -ENOMEM;
		}

This box has 2 iommu's in it.  The catchall iommu has MGAW == 48, and
SAGAW == 4.  The other iommu has MGAW == 39, SAGAW == 2.

The device that's failing the above pgd test is the only device connected
to the non-catchall iommu, which has a smaller address width than the
domain default.  This test is not necessary since the context is in PT
mode and the ASR is ignored.

Thanks to Don Dutile for discovering and debugging this one.

Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-12-08 10:03:25 +00:00
David Woodhouse 44cd613c0e intel-iommu: Fix oops with intel_iommu=igfx_off
The hotplug notifier will call find_domain() to see if the device in
question has been assigned an IOMMU domain. However, this should never
be called for devices with a "dummy" domain, such as graphics devices
when intel_iommu=igfx_off is set and the corresponding IOMMU isn't even
initialised. If you do that, it'll oops as it dereferences the (-1)
pointer.

The notifier function should check iommu_no_mapping() for the
device before doing anything else.

Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-12-08 10:03:06 +00:00
David Woodhouse 5595b528b4 intel-iommu: Check for an RMRR which ends before it starts.
Some HP BIOSes report an RMRR region (a region which needs a 1:1 mapping
in the IOMMU for a given device) which has an end address lower than its
start address. Detect that and warn, rather than triggering the
BUG() in dma_pte_clear_range().

Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-12-08 10:02:52 +00:00
David Woodhouse 6ecbf01c7c intel-iommu: Apply BIOS sanity checks for interrupt remapping too.
The BIOS errors where an IOMMU is reported either at zero or a bogus
address are causing problems even when the IOMMU is disabled -- because
interrupt remapping uses the same hardware. Ensure that the checks get
applied for the interrupt remapping initialisation too.

Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-12-08 10:02:39 +00:00
Chris Wright 2c99220810 intel-iommu: Detect DMAR in hyperspace at probe time.
Many BIOSes will lie to us about the existence of an IOMMU, and claim
that there is one at an address which actually returns all 0xFF.

We need to detect this early, so that we know we don't have a viable
IOMMU and can set up swiotlb before it's too late.

Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-12-08 10:02:15 +00:00
David Woodhouse ec20849193 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Merge the BIOS workarounds from 2.6.32, and the swiotlb fallback on failure.
2009-12-08 09:59:24 +00:00
Linus Torvalds 7b626acb8f Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
  x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree
  x86/amd-iommu: Remove amd_iommu_pd_table
  x86/amd-iommu: Move reset_iommu_command_buffer out of locked code
  x86/amd-iommu: Cleanup DTE flushing code
  x86/amd-iommu: Introduce iommu_flush_device() function
  x86/amd-iommu: Cleanup attach/detach_device code
  x86/amd-iommu: Keep devices per domain in a list
  x86/amd-iommu: Add device bind reference counting
  x86/amd-iommu: Use dev->arch->iommu to store iommu related information
  x86/amd-iommu: Remove support for domain sharing
  x86/amd-iommu: Rearrange dma_ops related functions
  x86/amd-iommu: Move some pte allocation functions in the right section
  x86/amd-iommu: Remove iommu parameter from dma_ops_domain_alloc
  x86/amd-iommu: Use get_device_id and check_device where appropriate
  x86/amd-iommu: Move find_protection_domain to helper functions
  x86/amd-iommu: Simplify get_device_resources()
  x86/amd-iommu: Let domain_for_device handle aliases
  x86/amd-iommu: Remove iommu specific handling from dma_ops path
  x86/amd-iommu: Remove iommu parameter from __(un)map_single
  x86/amd-iommu: Make alloc_new_range aware of multiple IOMMUs
  ...
2009-12-05 09:49:07 -08:00
Kleber Sacilotto de Souza 9e0b5b2c44 PCI: fix coding style issue in pci_save_state()
Remove a stray space in pci_save_state().

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 16:21:02 -08:00
Chris Wright 5d990b6275 PCI: add pci_request_acs
Commit ae21ee65e8 "PCI: acs p2p upsteram
forwarding enabling" doesn't actually enable ACS.

Add a function to pci core to allow an IOMMU to request that ACS
be enabled.  The existing mechanism of using iommu_found() in the pci
core to know when ACS should be enabled doesn't actually work due to
initialization order;  iommu has only been detected not initialized.

Have Intel and AMD IOMMUs request ACS, and Xen does as well during early
init of dom0.

Cc: Allen Kay <allen.m.kay@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 16:19:24 -08:00
Kenji Kaneshige b26a34aa47 PCI: fix BUG_ON triggered by logical PCIe root port removal
This problem happened when removing PCIe root port using PCI logical
hotplug operation.

The immediate cause of this problem is that the pointer to invalid
data structure is passed to pcie_update_aspm_capable() by
pcie_aspm_exit_link_state(). When pcie_aspm_exit_link_state() received
a pointer to root port link, it unconfigures the root port link and
frees its data structure at first. At this point, there are not links
to configure under the root port and the data structure for root port
link is already freed. So pcie_aspm_exit_link_state() must not call
pcie_update_aspm_capable() and pcie_config_aspm_path().

This patch fixes the problem by changing pcie_aspm_exit_link_state()
not to call pcie_update_aspm_capable() and pcie_config_aspm_path() if
the specified link is root port link.

------------[ cut here ]------------
kernel BUG at drivers/pci/pcie/aspm.c:606!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:40/0000:40:13.0/remove
CPU 1
Modules linked in: shpchp
Pid: 9345, comm: sysfsd Not tainted 2.6.32-rc5 #98 ProLiant DL785 G6
RIP: 0010:[<ffffffff811df69b>]  [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
RSP: 0018:ffff88082a2f5ca0  EFLAGS: 00010202
RAX: 0000000000000e77 RBX: ffff88182cc3e000 RCX: ffff88082a33d006
RDX: 0000000000000001 RSI: ffffffff811dff4a RDI: ffff88182cc3e000
RBP: ffff88082a2f5cc0 R08: ffff88182cc3e000 R09: 0000000000000000
R10: ffff88182fc00180 R11: ffff88182fc00198 R12: ffff88182cc3e000
R13: 0000000000000000 R14: ffff88182cc3e000 R15: ffff88082a2f5e20
FS:  00007f259a64b6f0(0000) GS:ffff880864600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007feb53f73da0 CR3: 000000102cc94000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sysfsd (pid: 9345, threadinfo ffff88082a2f4000, task ffff88082a33cf00)
Stack:
 ffff88182cc3e000 ffff88182cc3e000 0000000000000000 ffff88082a33cf00
<0> ffff88082a2f5cf0 ffffffff811dff52 ffff88082a2f5cf0 ffff88082c525168
<0> ffff88402c9fd2f8 ffff88402c9fd2f8 ffff88082a2f5d20 ffffffff811d7db2
Call Trace:
 [<ffffffff811dff52>] pcie_aspm_exit_link_state+0xf5/0x11e
 [<ffffffff811d7db2>] pci_stop_bus_device+0x76/0x7e
 [<ffffffff811d7d67>] pci_stop_bus_device+0x2b/0x7e
 [<ffffffff811d7e4f>] pci_remove_bus_device+0x15/0xb9
 [<ffffffff811dcb8c>] remove_callback+0x29/0x3a
 [<ffffffff81135aeb>] sysfs_schedule_callback_work+0x15/0x6d
 [<ffffffff81072790>] worker_thread+0x19d/0x298
 [<ffffffff8107273b>] ? worker_thread+0x148/0x298
 [<ffffffff81135ad6>] ? sysfs_schedule_callback_work+0x0/0x6d
 [<ffffffff810765c0>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff810725f3>] ? worker_thread+0x0/0x298
 [<ffffffff8107629e>] kthread+0x7d/0x85
 [<ffffffff8102eafa>] child_rip+0xa/0x20
 [<ffffffff8102e4bc>] ? restore_args+0x0/0x30
 [<ffffffff81076221>] ? kthread+0x0/0x85
 [<ffffffff8102eaf0>] ? child_rip+0x0/0x20
Code: 89 e5 8a 50 48 31 c0 c0 ea 03 83 e2 07 e8 b2 de fe ff c9 48 98 c3 55 48 89 e5 41 56 49 89 fe 41 55 41 54 53 48 83 7f 10 00 74 04 <0f> 0b eb fe 48 8b 05 da 7d 63 00 4c 8d 60 e8 4c 89 e1 eb 24 4c
RIP  [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
 RSP <ffff88082a2f5ca0>
---[ end trace 6ae0f65bdeab8555 ]---

Reported-by: Alex Chiang <achiang@hp.com>
Tested-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 16:09:59 -08:00
Andrew Patterson 638bba0828 PCI: remove ifdefed pci_cleanup_aer_correct_error_status
The pci_cleanup_aer_correct_error_status() function has been
#if 0'd out since 2.6.25.  Time to remove the dead code.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 16:03:19 -08:00
Andrew Patterson 6cdfd995a6 PCI: unconditionally clear AER uncorr status register during cleanup
The current implementation of pci_cleanup_aer_uncorrect_error_status
only clears either fatal or non-fatal error status bits depending
on the state of the I/O channel. This implementation will then often
leave some bits set after PCI error recovery completes.  The uncleared bit
settings will then be falsely reported the next time an AER interrupt is
generated for that hierarchy. An easy way to illustrate this issue is to
use the aer-inject module to simultaneously inject both an uncorrectable
non-fatal and uncorrectable fatal error.  One of the errors will not be
cleared.

This patch resolves this issue by unconditionally clearing all bits in
the AER uncorrectable status register. All settings and corrective action
strategies are saved and determined before
pci_cleanup_aer_uncorrect_error_status is called, so this change should not
affect errory handling functionality.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 16:03:11 -08:00
Kenji Kaneshige f9f45604ed PCI: portdrv: remove redundant definitions
Remove unnecessary definitions from portdrv.h and use generic
definitions instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:56:24 -08:00
Kenji Kaneshige 694f88ef7a PCI: portdrv: remove unnecessary struct pcie_port_data
Remove 'port_type' field in struct pcie_port_data(), because we can
get port type information from struct pci_dev. With this change, this
patch also does followings:

 - Remove struct pcie_port_data because it no longer has any field.
 - Remove portdrv private definitions about port type (PCIE_RC_PORT,
   PCIE_SW_UPSTREAM_PORT and PCIE_SW_DOWNSTREAM_PORT), and use generic
   definitions instead.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:56:19 -08:00
Kenji Kaneshige 40717c39b1 PCI: portdrv: minor cleanup for pcie_port_device_register
Minor cleanups for pcie_port_device_register().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:56:10 -08:00
Kenji Kaneshige fbb5de70bb PCI: portdrv: add missing irq cleanup
Add missing service irqs cleanup in the error code path of
pcie_port_device_register().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:56:06 -08:00
Kenji Kaneshige 1ce5e83063 PCI: portdrv: enable device before irq initialization
Call pci_enable_device() before initializing service irqs, because
legacy interrupt is initialized in pci_enable_device() on some
architectures.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:55:59 -08:00
Kenji Kaneshige dc5351784e PCI: portdrv: cleanup service irqs initialization
This patch cleans up the service irqs initialization as follows:

 - Remove 'irq_mode' field in pcie_port_data and related definitions,
   which is not needed because we can get the same information from
   'is_msix', 'is_msi' and 'pin' fields in struct pci_dev.

 - Change the name of 'vectors' argument of assign_interrupt_mode() to
   'irqs' because it holds irq numbers actually. People might confuse
   it with CPU vector or MSI/MSI-X vector.

 - Change function name assign_interrupt_mode() to init_service_irqs()
   becasuse we no longer have 'irq_mode' data structure, and new name
   is more straightforward (IMO).

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:55:51 -08:00
Kenji Kaneshige d013598d9a PCI: portdrv: check capabilities first
Move capability check capability to the beginning of
pcie_port_device_register() prevents redundant execution path.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:55:44 -08:00
Kenji Kaneshige 9e5d0b16da PCI: portdrv: move PME capability check
No reason to check PME capability outside get_port_device_capability().
Do it in get_port_device_capability().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:55:37 -08:00
Kenji Kaneshige 2dd60e96b4 PCI: portdrv: remove redundant pcie type calculation
PCIe port type is already stored in 'pcie_type' field of struct
pci_dev. So we don't need to get it from pci configuration space.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:55:26 -08:00
Kenji Kaneshige 52a0f24bea PCI: portdrv: cleanup pcie_device registration
In the current port bus driver implementation, pcie_device allocation,
initialization and registration are done in separated functions. Doing
those in one function make the code simple and easier to read.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:55:18 -08:00
Kenji Kaneshige 898294c975 PCI: portdrv: remove redundant pcie_port_device_probe
We don't need pcie_port_device_probe() because we can get pci
device/port type using pci_is_pcie() and 'pcie_type' fields in struct
pci_dev. Remove pcie_port_device_probe().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:55:12 -08:00
Alex Williamson 59353ea30e PCI: Always set prefetchable base/limit upper32 registers
Prior to 1f82de10 we always initialized the upper 32bits of the
prefetchable memory window, regardless of the address range used.
Now we only touch it for a >32bit address, which means the upper32
registers remain whatever the BIOS initialized them too.

It's valid for the BIOS to set the upper32 base/limit to
0xffffffff/0x00000000, which makes us program prefetchable ranges
like 0xffffffffabc00000 - 0x00000000abc00000

Revert the chunk of 1f82de10 that made this conditional so we always
write the upper32 registers and remove now unused pref_mem64 variable.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:52:43 -08:00
Shmulik Ravid 04b55c4732 PCI: read-modify-write the pcie device control register when initiating pcie flr
The pcie_flr routine writes the device control register with the FLR bit
set clearing all other fields for the FLR duration. Among other fields,
the Max_Payload_Size is also cleared which can cause errors if there are
transactions lurking in the HW pipeline. The patch replaces the blank
write with read-modify-write of the control register keeping the other
fields intact.

Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:49:44 -08:00
Yinghai Lu bb965401fd PCI: show dma_mask bits in /sys
So we can catch if the driver sets an incorrect dma_mask.

Reviewed-by: Grant Grundler <grundler@google.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:47:50 -08:00
Yinghai Lu c6a415761c PCI: add debug output for DMA mask info
This allows us to find out what DMA mask is used for each PCI device at boot
time; useful for debugging.

After the patch:
ehci_hcd 0000:00:02.1: using 31bit consistent DMA mask
e1000 0000:0b:01.0: using 64bit DMA mask
e1000 0000:0b:01.0: using 64bit consistent DMA mask
e1000e 0000:04:00.0: using 64bit DMA mask
e1000e 0000:04:00.0: using 64bit consistent DMA mask
ixgb 0000:0c:01.0: using 64bit DMA mask
ixgb 0000:0c:01.0: using 64bit consistent DMA mask
aacraid 0000:86:00.0: using 32bit DMA mask
aacraid 0000:86:00.0: using 32bit consistent DMA mask
aacraid 0000:86:00.0: using 64bit DMA mask
aacraid 0000:86:00.0: using 64bit consistent DMA mask
qla2xxx 0000:0c:02.0: using 64bit consistent DMA mask
qla2xxx 0000:0c:02.1: using 64bit consistent DMA mask
lpfc 0000:06:00.0: using 64bit DMA mask
lpfc 0000:06:00.1: using 64bit DMA mask
pata_amd 0000:00:06.0: using 32bit DMA mask
pata_amd 0000:00:06.0: using 32bit consistent DMA mask
mptsas 0000:0c:04.0: using 64bit DMA mask
mptsas 0000:0c:04.0: using 64bit consistent DMA mask

forcedeth 0000:00:08.0: using 39bit DMA mask
forcedeth 0000:00:08.0: using 39bit consistent DMA mask
niu 0000:02:00.0: using 44bit DMA mask
niu 0000:02:00.0: using 44bit consistent DMA mask
sata_nv 0000:00:05.0: using 32bit DMA mask
sata_nv 0000:00:05.0: using 32bit consistent DMA mask
ib_mthca 0000:03:00.0: using 64bit DMA mask
ib_mthca 0000:03:00.0: using 64bit consistent DMA mask

Reviewed-by: Grant Grundler <grundler@google.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:46:20 -08:00
Jesse Barnes 5c788a695a PCI: ibmphp_hpc: don't release hw sem twice if kthread stops
If we stop the kthread, we may end up up'ing the sem twice, which seems
unintended.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:18:01 -08:00