The definitions of the MASK macros in aerdrv_errprint.c are tricky and
unsafe. For example, AER_AGENT_TRANSMITTER_MASK(_sev, _stat) behaves like:
	static inline int func(int _sev, int _stat)
	{
		if (_sev == AER_CORRECTABLE)
			return _stat & (PCI_ERR_COR_REP_ROLL | PCI_ERR_COR_REP_TIMER);
		else
			return _stat & PCI_ERR_COR_REP_ROLL;
	}
In the else path, taken for uncorrectable errors, testing bits in _stat
against PCI_ERR_COR_* makes no sense, because _stat should contain only
PCI_ERR_UNC_* bits read from the uncorrectable error status register.
For now this is harmless, because no uncorrectable error is defined at
the bit position used by PCI_ERR_COR_REP_ROLL (bit 8).
Likewise, AER_AGENT_COMPLETER_MASK is always PCI_ERR_UNC_COMP_ABORT, which
works only because bit 15 of the correctable error status is not defined.
These MASK macros will therefore become wrong as soon as a new error is
defined at one of these bit positions. (In fact, bit 15 of the correctable
error status is now defined in PCIe 2.1.)
This patch makes these MASK macros stricter, so that they do not return
PCI_ERR_COR_* bits for uncorrectable error status, and vice versa.
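As an illustration of the stricter behaviour described above, here is a
minimal userspace sketch. It is not the code from this patch: the function
names strict_transmitter_mask() and strict_completer_mask() are
hypothetical, and the constants are redefined locally on the assumption
that they match the kernel headers, so that the example builds outside
the kernel tree.

```c
#include <stdint.h>

/* Assumed to mirror the kernel definitions; redefined locally so the
 * sketch compiles outside the kernel tree. */
#define AER_NONFATAL            0
#define AER_FATAL               1
#define AER_CORRECTABLE         2

#define PCI_ERR_COR_REP_ROLL    0x00000100u /* bit 8, correctable status */
#define PCI_ERR_COR_REP_TIMER   0x00001000u /* bit 12, correctable status */
#define PCI_ERR_UNC_COMP_ABORT  0x00008000u /* bit 15, uncorrectable status */

/* Hypothetical strict variant: transmitter bits exist only in the
 * correctable status register, so any other severity yields 0. */
static inline uint32_t strict_transmitter_mask(int sev, uint32_t stat)
{
	if (sev == AER_CORRECTABLE)
		return stat & (PCI_ERR_COR_REP_ROLL | PCI_ERR_COR_REP_TIMER);
	return 0;
}

/* Hypothetical strict variant: completer abort exists only in the
 * uncorrectable status register, so correctable severity yields 0. */
static inline uint32_t strict_completer_mask(int sev, uint32_t stat)
{
	if (sev == AER_CORRECTABLE)
		return 0;
	return stat & PCI_ERR_UNC_COMP_ABORT;
}
```

With this shape, a bit 15 set in a correctable status value can no longer
be mistaken for a completer abort, whatever meaning a later specification
assigns to that bit.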
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Patch 3 implements the core part of PCI-Express AER and the aerdrv
port service driver.
When a root port service device is probed, aerdrv calls request_irq to
register an irq handler for the AER error interrupt.
When a device sends a PCI-Express error message to the root port,
the root port triggers an interrupt, via either MSI or IO-APIC, and
the kernel runs the irq handler. The handler reads the root error
status register and schedules a work item. The work item calls the
core part to process the error based on its type
(correctable/non-fatal/fatal).
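The split between the irq handler and the deferred work can be modelled
in plain userspace C. This is only a sketch under stated assumptions, not
the driver's code: struct root_port_model, model_irq(), model_work() and
the queue depth are all hypothetical, and clearing the status register in
the handler is an assumption that the description above does not spell out.

```c
#include <stddef.h>
#include <stdint.h>

#define SRC_QUEUE_LEN 4 /* hypothetical queue depth */

/* Hypothetical stand-in for a root port and its pending error queue. */
struct root_port_model {
	uint32_t root_status_reg;      /* models the root error status register */
	uint32_t queue[SRC_QUEUE_LEN]; /* status snapshots awaiting the work */
	size_t head, tail;             /* producer and consumer indices */
	int work_pending;              /* set when work has been "scheduled" */
	uint32_t last_status;          /* last snapshot seen by the work */
};

/* Models the irq handler: snapshot the status, queue it, schedule work.
 * Returns 0 if the interrupt was not ours, 1 if it was handled. */
static int model_irq(struct root_port_model *rp)
{
	uint32_t status = rp->root_status_reg;
	size_t next = (rp->head + 1) % SRC_QUEUE_LEN;

	if (status == 0)
		return 0;
	rp->root_status_reg = 0; /* assumption: handler clears the status */
	if (next == rp->tail)
		return 1;        /* queue full: snapshot is dropped */
	rp->queue[rp->head] = status;
	rp->head = next;
	rp->work_pending = 1;
	return 1;
}

/* Models the deferred work: drain the queue outside "interrupt context".
 * Returns the number of snapshots processed. */
static int model_work(struct root_port_model *rp)
{
	int n = 0;

	while (rp->tail != rp->head) {
		rp->last_status = rp->queue[rp->tail];
		rp->tail = (rp->tail + 1) % SRC_QUEUE_LEN;
		n++;
	}
	rp->work_pending = 0;
	return n;
}
```

The point of the split is that the handler does only the minimum (capture
and queue), while the potentially slow error processing runs later.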
For correctable errors, the patch just clears the correctable error
status register of the device.
For non-fatal errors, the patch follows the generic PCI error handler
rules and calls the error callback functions of the endpoint's driver. If
the device is a bridge, the patch broadcasts the error to downstream
devices.
For fatal errors, the patch resets the PCI-Express link and follows the
generic PCI error handler rules, calling the error callback functions of
the endpoint's driver. If the device is a bridge, the patch broadcasts
the error to downstream devices.
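The per-severity policy above can be summarised as a small decision
function. This is a userspace sketch, not the driver's implementation:
the enum, the ACT_* flags and recovery_actions() are hypothetical names
introduced only to make the three cases explicit.

```c
/* Hypothetical severity values for this sketch only. */
enum sev_model { SEV_NONFATAL, SEV_FATAL, SEV_CORRECTABLE };

/* Hypothetical action flags describing what the core would do. */
#define ACT_CLEAR_COR_STATUS 0x1u /* clear correctable error status */
#define ACT_RESET_LINK       0x2u /* reset the PCI-Express link */
#define ACT_CALL_DRIVER_CB   0x4u /* call the endpoint driver's callbacks */
#define ACT_BROADCAST_DOWN   0x8u /* broadcast to downstream devices */

/* Returns the set of actions for an error of the given severity,
 * reported by a device that is (is_bridge != 0) or is not a bridge. */
static unsigned int recovery_actions(enum sev_model sev, int is_bridge)
{
	unsigned int acts = 0;

	switch (sev) {
	case SEV_CORRECTABLE:
		return ACT_CLEAR_COR_STATUS;
	case SEV_FATAL:
		acts |= ACT_RESET_LINK;
		/* fall through: fatal errors also notify drivers */
	case SEV_NONFATAL:
		acts |= is_bridge ? ACT_BROADCAST_DOWN : ACT_CALL_DRIVER_CB;
		return acts;
	}
	return 0; /* unknown severity: do nothing */
}
```

For example, recovery_actions(SEV_FATAL, 1) yields the link reset plus
the downstream broadcast, while a correctable error never reaches a
driver callback.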
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>