Commit Graph

21 Commits

Author SHA1 Message Date
Chen Gong c5a130325f ACPI/APEI: Add parameter check before error injection
When param1 is enabled in EINJ but not assigned a valid
value, it can sometimes cause an error like the one below:

APEI: Can not request [mem 0x7aaa7000-0x7aaa7007] for APEI EINJ Trigger registers

This happens because some firmware accesses the target address specified in
param1 to trigger the error when injecting a memory error. This
causes a resource conflict with regular memory, so the address must be removed
from the trigger table resources, but an incorrect param1/param2
combination prevents that removal. Add an extra check to avoid
this kind of error.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-06-06 15:20:51 -07:00
Wei Yongjun b8edb64119 ACPI, APEI, EINJ: Fix error return code in einj_init()
Fix to return -ENOMEM in the debugfs_create_xxx() error handling
case instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-06-05 16:18:12 -07:00
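
The error-path pattern described in commit b8edb64119 above can be sketched in plain C. This is a minimal illustration, not the actual einj_init() code: probe_step(), create_entry() and example_init() are hypothetical stand-ins, with create_entry() playing the role of a debugfs_create_xxx() call that returns NULL on failure.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical earlier step that returns 0 on success. */
static int probe_step(void)
{
    return 0;
}

/* Hypothetical stand-in for a pointer-returning create call. */
static void *create_entry(int should_fail)
{
    static int dummy;
    return should_fail ? NULL : &dummy;
}

/* Hypothetical init routine showing the corrected error path. */
int example_init(int fail_create)
{
    int rc;
    void *entry;

    rc = probe_step();          /* succeeds, so rc is now 0 */
    if (rc)
        goto err_cleanup;

    rc = -ENOMEM;               /* reset before the pointer-returning call */
    entry = create_entry(fail_create);
    if (!entry)
        goto err_cleanup;       /* without the reset, this path would return 0 */

    return 0;

err_cleanup:
    /* partial setup would be undone here */
    return rc;
}
```

Without the `rc = -ENOMEM` assignment, a failed create call would jump to the cleanup label with rc still holding the 0 left by the earlier successful step, and the caller would see success.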
Chen Gong 112f1fc08d ACPI, APEI, EINJ: Add missed ACPI5 support for error trigger table
To handle the error trigger table correctly, the memory region must be
removed from the requested region. We had a series of patches to do this,
culminating in:
	commit b4e008dc5
	ACPI, APEI, EINJ, Refine the fix of resource conflict

but when ACPI5 support was added, we missed updating this area. So
when using the EINJ table on an ACPI5-enabled machine, we get the following error:

APEI: Can not request [mem 0x526b80000-0x526b80007] for APEI EINJ Trigger registers

Fix this by checking for the acpi5 case and using the same code
that was added earlier.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-12-07 11:50:02 -08:00
Chen Gong ee49089dc7 ACPI, APEI, EINJ, new parameter to control trigger action
Some APEI firmware implementations access the injected address
specified in param1 to trigger the error when injecting a memory
error, which means that if an SRAR error is injected, a crash
always happens because the access is executed in kernel context. This
new parameter can disable the trigger action so that control is taken
over by the user. In this way, an SRAR error can happen in user
context instead of crashing the system. This function is highly
dependent on the BIOS implementation, so please ensure you know the
BIOS trigger procedure before you enable this switch.

v2:
  notrigger should be created together with param1/param2

Tested-by: Tony Luck <tony.luck@lintel.com>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:30:18 -04:00
Chen Gong 185210cc75 ACPI, APEI, EINJ, limit the range of einj_param
On platforms with ACPI 4.x support, parameter extension
is not always available, which means einj_param takes effect only
when parameter extension is enabled.

v2->v1: stopping early in einj_get_parameter_address for einj_param

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:30:18 -04:00
Luck, Tony 459413db33 Use acpi_os_map_memory() instead of ioremap() in einj driver
ioremap() has become more picky and is now spitting out console messages like:

 ioremap error for 0xbddbd000-0xbddbe000, requested 0x10, got 0x0

when loading the einj driver.  What we are trying to do here is map
a couple of data structures that the EINJ table points to. Perhaps
acpi_os_map_memory() is a better tool for this?
Most importantly it works, but as a side benefit it maps the structures
into kernel virtual space so we can access them with normal C memory
dereferences, so instead of using:
	writel(param1, &v5param->apicid);
we can use the more natural:
	v5param->apicid = param1;

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-23 19:39:10 -05:00
Dan Carpenter 29924b9f8f ACPI, APEI, EINJ, cleanup 0 vs NULL confusion
This function is returning pointers.  Sparse complains here:
drivers/acpi/apei/einj.c:262:32: warning:
	Using plain integer as NULL pointer

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-23 19:38:52 -05:00
Niklas Söderlund 4c40aed869 ACPI, APEI, EINJ Allow empty Trigger Error Action Table
According to the ACPI spec [1], section 18.6.4, the TRIGGER_ERROR action
table can consist of zero elements.

[1] Advanced Configuration and Power Interface Specification
    Revision 5.0, December 6, 2011
	http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf

Signed-off-by: Niklas Söderlund <niklas.soderlund@ericsson.com>
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-23 19:31:11 -05:00
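
The zero-element case from commit 4c40aed869 above can be illustrated with a small size check. This is a sketch under assumptions rather than the driver's code: struct trigger_header, TRIGGER_ENTRY_SIZE and trigger_table_ok() are hypothetical names, and the 32-byte entry size is assumed for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified trigger table header. */
struct trigger_header {
    uint32_t header_size;
    uint32_t revision;
    uint32_t table_size;   /* header plus all entries, in bytes */
    uint32_t entry_count;
};

/* Assumed size of one action entry, for illustration only. */
#define TRIGGER_ENTRY_SIZE 32u

/* Accept a table whose size equals its header size (zero entries). */
bool trigger_table_ok(const struct trigger_header *t)
{
    if (t->header_size != sizeof(*t))
        return false;
    /* '<' rather than '<=': table_size == header_size means an empty table */
    if (t->table_size < t->header_size)
        return false;
    return t->entry_count ==
           (t->table_size - t->header_size) / TRIGGER_ENTRY_SIZE;
}
```

The key point is the comparison operator: rejecting `table_size <= header_size` would refuse a table that the spec allows, while `<` rejects only tables too small to hold their own header.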
Len Brown 79ba0db69c Merge branches 'einj', 'intel_idle', 'misc', 'srat' and 'turbostat-ivb' into release
2012-01-18 01:15:54 -05:00
Tony Luck c130bd6f82 acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
ACPI 5.0 provides extensions to the EINJ mechanism to specify the
target for the error injection - by APICID for cpu related errors,
by address for memory related errors, and by segment/bus/device/function
for PCIe related errors. Also extensions for vendor specific error
injections.

Tested-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-18 01:14:17 -05:00
Xiao, Hui b4e008dc53 ACPI, APEI, EINJ, Refine the fix of resource conflict
The current fix for the resource conflict removes the address region <param1 &
param2, ~param2+1> from the trigger resources, which relies heavily on valid user
input. This patch avoids such potential issues by fetching the
exact address region from the trigger action table entry.

Signed-off-by: Xiao, Hui <hui.xiao@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:41 -05:00
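
The region arithmetic <param1 & param2, ~param2+1> mentioned in commit b4e008dc53 above can be worked through in a few lines of C. This is only a sketch: struct mem_region and region_from_params() are hypothetical names, and the validity test (the mask must describe a power-of-two sized region) is an assumption added here to show why the formula depends on valid user input. It is not the check the driver performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: base address and size in bytes. */
struct mem_region {
    uint64_t base;
    uint64_t size;
};

/*
 * Derive <param1 & param2, ~param2 + 1> from an address (param1) and
 * an address mask (param2). Returns false when the mask does not
 * describe a contiguous, power-of-two sized region.
 */
bool region_from_params(uint64_t param1, uint64_t param2,
                        struct mem_region *out)
{
    uint64_t size = ~param2 + 1;    /* two's complement of the mask */

    if (param2 == 0)                /* size would wrap around to 0 */
        return false;
    if (size & (size - 1))          /* mask has holes: not 1...10...0 */
        return false;

    out->base = param1 & param2;    /* align the address down to the region */
    out->size = size;
    return true;
}
```

For example, an address of 0x10000123 with a mask of 0xfffffffffffff000 yields a base of 0x10000000 and a size of 0x1000, while a zero mask yields no usable region at all.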
Huang Ying fdea163d8c ACPI, APEI, EINJ, Fix resource conflict on some machine
Some APEI firmware implementations access the injected address
specified in param1 to trigger the error when injecting a memory error.
This causes a resource conflict with RAM.

On one of our testing machines, when injecting at memory address
0x10000000, the following error is reported in dmesg:

  APEI: Can not request iomem region <0000000010000000-0000000010000008> for GARs.

This patch removes the injecting memory address range from trigger
table resources to avoid conflict.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:38 -05:00
Huang Ying ad6861547b ACPI, APEI, Remove table not found message
Because APEI tables are optional, these messages may confuse users. For
example:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/599715

Reported-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:29 -05:00
Bjorn Helgaas 46b91e379f ACPI, APEI, Print resource errors in conventional format
Use the normal %pR-like format for MMIO and I/O port ranges.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:26 -05:00
Huang Ying c3e6088e10 ACPI, APEI, EINJ Param support is disabled by default
EINJ parameter support is only usable with some specific BIOSes.
Originally, it was expected to be harmless on BIOSes that do not support
it.  But we have now found that it causes issues (memory overwriting) on
some BIOSes.  So param support is disabled by default and only enabled
when the newly added module parameter named "param_extension" is
explicitly specified.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:59 -04:00
Huang Ying 392913de7c ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERST
This patch changes APEI EINJ and ERST to use apei_exec_run for
mandatory actions, and apei_exec_run_optional for optional actions.

Cc: Thomas Renninger <trenn@novell.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:35:14 -04:00
Roland Dreier dbee8a0aff x86: remove 32-bit versions of readq()/writeq()
The presence of a writeq() implementation on 32-bit x86 that splits the
64-bit write into two 32-bit writes turns out to break the mpt2sas driver
(and in general is risky for drivers as was discussed in
<http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
revert 2c5643b1c5 ("x86: provide readq()/writeq() on 32-bit too") and
follow-on cleanups.

This unfortunately leads to pushing non-atomic definitions of readq() and
writeq() to various x86-only drivers that in the meantime started using the
definitions in the x86 version of <asm/io.h>.  However as discussed
exhaustively, this is actually the right thing to do, because the right
way to split a 64-bit transaction is hardware dependent and therefore
belongs in the hardware driver (eg mpt2sas needs a spinlock to make sure
no other accesses occur in between the two halves of the access).

Build tested on 32- and 64-bit x86 allmodconfig.

Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
Acked-by: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Ravi Anand <ravi.anand@qlogic.com>
Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Acked-by: James Bottomley <James.Bottomley@parallels.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:44 -07:00
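
The driver-side approach argued for in commit dbee8a0aff above, where the driver itself decides how to split a 64-bit access and protects the two halves with a lock, can be sketched in user-space C. Everything here is hypothetical: struct fake_dev models a device register as two 32-bit fields, a C11 atomic_flag stands in for a kernel spinlock, and the low-then-high write order is an arbitrary choice that real hardware may not accept.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical device: one 64-bit register exposed as two 32-bit halves. */
struct fake_dev {
    atomic_flag lock;           /* stand-in for a driver spinlock */
    volatile uint32_t reg_lo;
    volatile uint32_t reg_hi;
};

void fake_dev_init(struct fake_dev *dev)
{
    atomic_flag_clear(&dev->lock);
    dev->reg_lo = 0;
    dev->reg_hi = 0;
}

/* Write 64 bits as two 32-bit stores with no other access in between. */
void fake_dev_writeq(struct fake_dev *dev, uint64_t val)
{
    /* spin until the lock is acquired */
    while (atomic_flag_test_and_set_explicit(&dev->lock,
                                             memory_order_acquire))
        ;

    dev->reg_lo = (uint32_t)val;            /* low half first (assumed order) */
    dev->reg_hi = (uint32_t)(val >> 32);    /* then the high half */

    atomic_flag_clear_explicit(&dev->lock, memory_order_release);
}
```

Because the ordering and locking live in the driver, each driver can pick whatever its hardware requires, which is the reason the commit gives for not providing a generic split writeq() in the architecture headers.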
Stefan Weil e8a8b252fb Fix spelling mistakes in comments
milisecond -> millisecond
 meassge -> message

Cc: Kalle Valo <kvalo@adurom.com>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-01-03 13:51:58 +01:00
Huang Ying 3a78f96532 ACPI, APEI, Fix APEI related table size checking
On Huang Ying's machine:

erst_tab->header_length == sizeof(struct acpi_table_einj)

but Yinghai reported that on his machine,

erst_tab->header_length == sizeof(struct acpi_table_einj) -
sizeof(struct acpi_table_header)

To make the erst table size checking code work on all systems, both
cases are treated as PASS.

Same situation applies to einj_tab->header_length, so corresponding
table size checking is changed in similar way too.

v2:

- Treat both table sizes as valid

Originally-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-29 13:59:18 -04:00
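
The relaxed size check described in commit 3a78f96532 above can be sketched as follows. The struct layouts are hypothetical and simplified (the real ACPI table structs have more fields), and header_length_ok() is an illustrative name, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified layouts for illustration only. */
struct table_header {
    uint8_t bytes[36];
};

struct einj_table {
    struct table_header header;
    uint32_t header_length;
    uint32_t reserved;
    uint32_t entries;
};

/*
 * Accept both interpretations seen in the field: header_length counted
 * with or without the common table header.
 */
bool header_length_ok(uint32_t header_length)
{
    return header_length == sizeof(struct einj_table) ||
           header_length == sizeof(struct einj_table) -
                            sizeof(struct table_header);
}
```

Any other value is still rejected, so the check stays strict while tolerating the two firmware behaviours reported in the commit message.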
Huang Ying 6e320ec1d9 ACPI, APEI, EINJ injection parameters support
Some hardware error injection needs parameters, for example, it is
useful to specify memory address and memory address mask for memory
errors.

Some BIOSes allow parameters to be specified via an unpublished
extension. This patch adds support for it. The parameters will be
ignored on machines without the necessary BIOS support.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:42:08 -04:00
Huang Ying e40213450b ACPI, APEI, EINJ support
EINJ provides a hardware error injection mechanism, which is useful for
debugging and testing other APEI and RAS features.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:35:29 -04:00