Commit Graph

185 Commits

Author SHA1 Message Date
Len Brown 79ba0db69c Merge branches 'einj', 'intel_idle', 'misc', 'srat' and 'turbostat-ivb' into release 2012-01-18 01:15:54 -05:00
Tony Luck c130bd6f82 acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
ACPI 5.0 provides extensions to the EINJ mechanism to specify the
target for the error injection - by APICID for cpu related errors,
by address for memory related errors, and by segment/bus/device/function
for PCIe related errors. Also extensions for vendor specific error
injections.

Tested-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-18 01:14:17 -05:00
Myron Stowe 700130b41f ACPI APEI: Convert atomicio routines
APEI needs memory access in interrupt context.  The obvious choice is
acpi_read(), but originally it couldn't be used in interrupt context
because it makes temporary mappings with ioremap().  Therefore, we added
drivers/acpi/atomicio.c, which provides:
    acpi_pre_map_gar()     -- ioremap in process context
	acpi_atomic_read()     -- memory access in interrupt context
	acpi_post_unmap_gar()  -- iounmap

Later we added acpi_os_map_generic_address() (2971852) and enhanced
acpi_read() so it works in interrupt context as long as the address has
been previously mapped (620242a).  Now this sequence:
    acpi_os_map_generic_address()    -- ioremap in process context
    acpi_read()/apei_read()          -- now OK in interrupt context
    acpi_os_unmap_generic_address()
is equivalent to what atomicio.c provides.

This patch introduces apei_read() and apei_write(), which currently are
functional equivalents of acpi_read() and acpi_write().  This is mainly
proactive, to prevent APEI breakages if acpi_read() and acpi_write()
are ever augmented to support the 'bit_offset' field of GAS, as APEI's
__apei_exec_write_register() precludes splitting up functionality
related to 'bit_offset' and APEI's 'mask' (see its
APEI_EXEC_PRESERVE_REGISTER block).

With apei_read() and apei_write() in place, usages of atomicio routines
are converted to apei_read()/apei_write() and existing calls within
osl.c and the CA, based on the re-factoring that was done in an earlier
patch series - http://marc.info/?l=linux-acpi&m=128769263327206&w=2:
    acpi_pre_map_gar()     -->  acpi_os_map_generic_address()
    acpi_post_unmap_gar()  -->  acpi_os_unmap_generic_address()
    acpi_atomic_read()     -->  apei_read()
    acpi_atomic_write()    -->  apei_write()

Note that acpi_read() and acpi_write() currently use 'bit_width'
for accessing GARs which seems incorrect.  'bit_width' is the size of
the register, while 'access_width' is the size of the access the
processor must generate on the bus.  The 'access_width' may be larger,
for example, if the hardware only supports 32-bit or 64-bit reads.  I
wanted to minimize any possible impacts with this patch series so I
did *not* change this behavior.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 04:36:40 -05:00
Huang Ying 4134b8c881 ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
Some firmware will access memory in ACPI NVS region via APEI.  That
is, instructions in APEI ERST/EINJ table will read/write ACPI NVS
region.  The original resource conflict checking in APEI code will
check memory/ioport accessed by APEI via general resource management
mech.  But ACPI NVS region is marked as busy already, so that the
false resource conflict will prevent APEI ERST/EINJ to work.

To fix this, this patch excludes ACPI NVS regions when APEI components
request resources.  So that they will not conflict with ACPI NVS
regions.

Reported-and-tested-by: Pavel Ivanov <paivanof@gmail.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:46 -05:00
Xiao, Hui b4e008dc53 ACPI, APEI, EINJ, Refine the fix of resource conflict
Current fix for resource conflict is to remove the address region <param1 &
param2, ~param2+1> from trigger resource, which is highly relies on valid user
input. This patch is trying to avoid such potential issues by fetching the
exact address region from trigger action table entry.

Signed-off-by: Xiao, Hui <hui.xiao@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:41 -05:00
Huang Ying fdea163d8c ACPI, APEI, EINJ, Fix resource conflict on some machine
Some APEI firmware implementation will access injected address
specified in param1 to trigger the error when injecting memory error.
This will cause resource conflict with RAM.

On one of our testing machine, if injecting at memory address
0x10000000, the following error will be reported in dmesg:

  APEI: Can not request iomem region <0000000010000000-0000000010000008> for GARs.

This patch removes the injecting memory address range from trigger
table resources to avoid conflict.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:38 -05:00
Huang Ying 46d12f0bcb ACPI, APEI, Printk queued error record before panic
Because printk is not safe inside NMI handler, the recoverable error
records received in NMI handler will be queued to be printked in a
delayed IRQ context via irq_work.  If a fatal error occurs after the
recoverable error and before the irq_work processed, we lost a error
report.

To solve the issue, the queued error records are printked in NMI
handler if system will go panic.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:33 -05:00
Huang Ying 5ba82ab534 ACPI, APEI, GHES, Distinguish interleaved error report in kernel log
In most cases, printk only guarantees messages from different printk
calling will not be interleaved between each other.  But, one APEI
GHES hardware error report will involve multiple printk calling,
normally each for one line.  So it is possible that the hardware error
report comes from different generic hardware error source will be
interleaved.

In this patch, a sequence number is prefixed to each line of error
report.  So that, even if they are interleaved, they still can be
distinguished by the prefixed sequence number.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:31 -05:00
Huang Ying ad6861547b ACPI, APEI, Remove table not found message
Because APEI tables are optional, these message may confuse users, for
example,

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/599715

Reported-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:29 -05:00
Bjorn Helgaas 46b91e379f ACPI, APEI, Print resource errors in conventional format
Use the normal %pR-like format for MMIO and I/O port ranges.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:26 -05:00
Huang Ying a654e5ee4f ACPI, APEI, GHES: Add PCIe AER recovery support
aer_recover_queue() is called when recoverable PCIe AER errors are
notified by firmware to do the recovery work.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 03:54:21 -05:00
Rusty Russell 90ab5ee941 module_param: make bool parameters really bool (drivers & misc)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:20 +10:30
Kees Cook 3d6d8d20ec pstore: pass reason to backend write callback
This allows a backend to filter on the dmesg reason as well as the pstore
reason. When ramoops is switched to pstore, this is needed since it has
no interest in storing non-crash dmesg details.

Drop pstore_write() as it has no users, and handling the "reason" here
has no obviously correct value.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-11-17 13:13:29 -08:00
Kees Cook f6f8285132 pstore: pass allocated memory region back to caller
The buf_lock cannot be held while populating the inodes, so make the backend
pass forward an allocated and filled buffer instead. This solves the following
backtrace. The effect is that "buf" is only ever used to notify the backends
that something was written to it, and shouldn't be used in the read path.

To replace the buf_lock during the read path, isolate the open/read/close
loop with a separate mutex to maintain serialized access to the backend.

Note that is is up to the pstore backend to cope if the (*write)() path is
called in the middle of the read path.

[   59.691019] BUG: sleeping function called from invalid context at .../mm/slub.c:847
[   59.691019] in_atomic(): 0, irqs_disabled(): 1, pid: 1819, name: mount
[   59.691019] Pid: 1819, comm: mount Not tainted 3.0.8 #1
[   59.691019] Call Trace:
[   59.691019]  [<810252d5>] __might_sleep+0xc3/0xca
[   59.691019]  [<810a26e6>] kmem_cache_alloc+0x32/0xf3
[   59.691019]  [<810b53ac>] ? __d_lookup_rcu+0x6f/0xf4
[   59.691019]  [<810b68b1>] alloc_inode+0x2a/0x64
[   59.691019]  [<810b6903>] new_inode+0x18/0x43
[   59.691019]  [<81142447>] pstore_get_inode.isra.1+0x11/0x98
[   59.691019]  [<81142623>] pstore_mkfile+0xae/0x26f
[   59.691019]  [<810a2a66>] ? kmem_cache_free+0x19/0xb1
[   59.691019]  [<8116c821>] ? ida_get_new_above+0x140/0x158
[   59.691019]  [<811708ea>] ? __init_rwsem+0x1e/0x2c
[   59.691019]  [<810b67e8>] ? inode_init_always+0x111/0x1b0
[   59.691019]  [<8102127e>] ? should_resched+0xd/0x27
[   59.691019]  [<8137977f>] ? _cond_resched+0xd/0x21
[   59.691019]  [<81142abf>] pstore_get_records+0x52/0xa7
[   59.691019]  [<8114254b>] pstore_fill_super+0x7d/0x91
[   59.691019]  [<810a7ff5>] mount_single+0x46/0x82
[   59.691019]  [<8114231a>] pstore_mount+0x15/0x17
[   59.691019]  [<811424ce>] ? pstore_get_inode.isra.1+0x98/0x98
[   59.691019]  [<810a8199>] mount_fs+0x5a/0x12d
[   59.691019]  [<810b9174>] ? alloc_vfsmnt+0xa4/0x14a
[   59.691019]  [<810b9474>] vfs_kern_mount+0x4f/0x7d
[   59.691019]  [<810b9d7e>] do_kern_mount+0x34/0xb2
[   59.691019]  [<810bb15f>] do_mount+0x5fc/0x64a
[   59.691019]  [<810912fb>] ? strndup_user+0x2e/0x3f
[   59.691019]  [<810bb3cb>] sys_mount+0x66/0x99
[   59.691019]  [<8137b537>] sysenter_do_call+0x12/0x26

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-11-17 12:58:07 -08:00
Linus Torvalds 1c39865151 Merge branch 'pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
* 'pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  pstore: make pstore write function return normal success/fail value
  pstore: change mutex locking to spin_locks
  pstore: defer inserting OOPS entries into pstore
2011-11-01 10:52:29 -07:00
Linus Torvalds 8a4a8918ed Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  llist: Add back llist_add_batch() and llist_del_first() prototypes
  sched: Don't use tasklist_lock for debug prints
  sched: Warn on rt throttling
  sched: Unify the ->cpus_allowed mask copy
  sched: Wrap scheduler p->cpus_allowed access
  sched: Request for idle balance during nohz idle load balance
  sched: Use resched IPI to kick off the nohz idle balance
  sched: Fix idle_cpu()
  llist: Remove cpu_relax() usage in cmpxchg loops
  sched: Convert to struct llist
  llist: Add llist_next()
  irq_work: Use llist in the struct irq_work logic
  llist: Return whether list is empty before adding in llist_add()
  llist: Move cpu_relax() to after the cmpxchg()
  llist: Remove the platform-dependent NMI checks
  llist: Make some llist functions inline
  sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switch
  sched: Remove redundant test in check_preempt_tick()
  sched: Add documentation for bandwidth control
  sched: Return unused runtime on group dequeue
  ...
2011-10-26 17:08:43 +02:00
Chen Gong b238b8fa93 pstore: make pstore write function return normal success/fail value
Currently pstore write interface employs record id as return
value, but it is not enough because it can't tell caller if
the write operation is successful. Pass the record id back via
an argument pointer and return zero for success, non-zero for
failure.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-10-12 09:17:24 -07:00
Don Zickus 9c48f1c629 x86, nmi: Wire up NMI handlers to new routines
Just convert all the files that have an nmi handler to the new routines.
Most of it is straight forward conversion.  A couple of places needed some
tweaking like kgdb which separates the debug notifier from the nmi handler
and mce removes a call to notify_die.

[Thanks to Ying for finding out the history behind that mce call

https://lkml.org/lkml/2010/5/27/114

And Boris responding that he would like to remove that call because of it

https://lkml.org/lkml/2011/9/21/163]

The things that get converted are the registeration/unregistration routines
and the nmi handler itself has its args changed along with code removal
to check which list it is on (most are on one NMI list except for kgdb
which has both an NMI routine and an NMI Unknown routine).

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Corey Minyard <minyard@acm.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10 06:56:57 +02:00
Huang Ying 1230db8e15 llist: Make some llist functions inline
Because llist code will be used in performance critical scheduler
code path, make llist_add() and llist_del_all() inline to avoid
function calling overhead and related 'glue' overhead.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-2-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-04 11:30:53 +02:00
Don Zickus abd4d5587b pstore: change mutex locking to spin_locks
pstore was using mutex locking to protect read/write access to the
backend plug-ins.  This causes problems when pstore is executed in
an NMI context through panic() -> kmsg_dump().

This patch changes the mutex to a spin_lock_irqsave then also checks to
see if we are in an NMI context.  If we are in an NMI and can't get the
lock, just print a message stating that and blow by the locking.

All this is probably a hack around the bigger locking problem but it
solves my current situation of trying to sleep in an NMI context.

Tested by loading the lkdtm module and executing a HARDLOCKUP which
will cause the machine to panic inside the nmi handler.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-08-16 11:55:58 -07:00
Chen Gong 03ba176a29 ACPI APEI: Add Kconfig option IRQ_WORK for GHES
IRQ_WORK is used by GHES, but it is selected by PERF_EVENT.
For now PERF_EVENT is selected by x86 by default, but
in concept, IRQ_WORK should be selected by GHES, not by others.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-11 15:42:09 -04:00
Matthew Garrett b3b46d76d0 APEI: Fix WHEA _OSC call
Bit 0 of the support parameter to the OSC call should be set in order to
indicate that the OS supports the WHEA mechanism. Stuart Hayes tracked
an APEI issue on some Dell platforms down to this.

Reported-by: Stuart Hayes <Stuart_Hayes@Dell.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-11 12:18:38 -04:00
Len Brown d0e323b470 Merge branch 'apei' into apei-release
Some trivial conflicts due to other various merges
adding to the end of common lists sooner than this one.

	arch/ia64/Kconfig
	arch/powerpc/Kconfig
	arch/x86/Kconfig
	lib/Kconfig
	lib/Makefile

Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:30:42 -04:00
Huang Ying c3e6088e10 ACPI, APEI, EINJ Param support is disabled by default
EINJ parameter support is only usable for some specific BIOS.
Originally, it is expected to have no harm for BIOS does not support
it.  But now, we found it will cause issue (memory overwriting) for
some BIOS.  So param support is disabled by default and only enabled
when newly added module parameter named "param_extension" is
explicitly specified.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:59 -04:00
Len Brown 70cb6e1da0 APEI GHES: 32-bit buildfix
drivers/acpi/apei/ghes.c:542: warning: integer overflow in expression
drivers/acpi/apei/ghes.c:619: warning: integer overflow in expression

ghes.c:(.text+0x46289): undefined reference to `__udivdi3'
  in function ghes_estatus_cache_add().

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:59 -04:00
Huang Ying ba61ca4aab ACPI, APEI, GHES: Add hardware memory error recovery support
memory_failure_queue() is called when recoverable memory errors are
notified by firmware to do the recovery work.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:58 -04:00
Huang Ying 152cef40a8 ACPI, APEI, GHES, Error records content based throttle
printk is used by GHES to report hardware errors.  Ratelimit is
enforced on the printk to avoid too many hardware error reports in
kernel log.  Because there may be thousands or even millions of
corrected hardware errors during system running.

Currently, a simple scheme is used.  That is, the total number of
hardware error reporting is ratelimited.  This may cause some issues
in practice.

For example, there are two kinds of hardware errors occurred in
system.  One is corrected memory error, because the fault memory
address is accessed frequently, there may be hundreds error report
per-second.  The other is corrected PCIe AER error, it will be
reported once per-second.  Because they share one ratelimit control
structure, it is highly possible that only memory error is reported.

To avoid the above issue, an error record content based throttle
algorithm is implemented in the patch.  Where after the first
successful reporting, all error records that are same are throttled for
some time, to let other kinds of error records have the opportunity to
be reported.

In above example, the memory errors will be throttled for some time,
after being printked.  Then the PCIe AER error will be printked
successfully.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:57 -04:00
Huang Ying 67eb2e9907 ACPI, APEI, GHES, printk support for recoverable error via NMI
Some APEI GHES recoverable errors are reported via NMI, but printk is
not safe in NMI context.

To solve the issue, a lock-less memory allocator is used to allocate
memory in NMI handler, save the error record into the allocated
memory, put the error record into a lock-less list.  On the other
hand, an irq_work is used to delay the operation from NMI context to
IRQ context.  The irq_work IRQ handler will remove nodes from
lock-less list, printk the error record and do some further processing
include recovery operation, then free the memory.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 11:15:57 -04:00
Matthew Garrett b94fdd077e pstore: Make "part" unsigned
We'll never have a negative part, so just make this an unsigned int.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-07-22 16:14:29 -07:00
Matthew Garrett 56280682ce pstore: Add extra context for writes and erases
EFI only provides small amounts of individual storage, and conventionally
puts metadata in the storage variable name. Rather than add a metadata
header to the (already limited) variable storage, it's easier for us to
modify pstore to pass all the information we need to construct a unique
variable name to the appropriate functions.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-07-22 16:14:20 -07:00
Matthew Garrett 638c1fd303 pstore: Extend API for more flexibility in new backends
Some pstore implementations may not have a static context, so extend the
API to pass the pstore_info struct to all calls and allow for a context
pointer.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-07-22 16:14:06 -07:00
Huang Ying 9fb0bfe140 ACPI, APEI, Add WHEA _OSC support
APEI firmware first mode must be turned on explicitly on some
machines, otherwise there may be no GHES hardware error record for
hardware error notification.  APEI bit in generic _OSC call can be
used to do that, but on some machine, a special WHEA _OSC call must be
used.  This patch adds the support to that WHEA _OSC call.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:38:49 -04:00
Huang Ying b6a9501658 ACPI, APEI, GHES, Support disable GHES at boot time
Some machine may have broken firmware so that GHES and firmware first
mode should be disabled.  This patch adds support to that.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:36:34 -04:00
Huang Ying 86cd47334b ACPI, APEI, GHES, Prevent GHES to be built as module
GHES (Generic Hardware Error Source) is used to process hardware error
notification in firmware first mode.  But because firmware first mode
can be turned on but can not be turned off, it is unreasonable to
unload the GHES module with firmware first mode turned on.  To avoid
confusion, this patch makes GHES can be enabled/disabled in
configuration time, but not built as module and unloaded at run time.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:35:57 -04:00
Huang Ying 392913de7c ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERST
This patch changes APEI EINJ and ERST to use apei_exec_run for
mandatory actions, and apei_exec_run_optional for optional actions.

Cc: Thomas Renninger <trenn@novell.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:35:14 -04:00
Huang Ying eecf2f7124 ACPI, APEI, Add apei_exec_run_optional
Some actions in APEI ERST and EINJ tables are optional, for example,
ACPI_EINJ_BEGIN_OPERATION action is used to do some preparation for
error injection, and firmware may choose to do nothing here.  While
some other actions are mandatory, for example, firmware must provide
ACPI_EINJ_GET_ERROR_TYPE implementation.

Original implementation treats all actions as optional (that is, can
have no instructions), that may cause issue if firmware does not
provide some mandatory actions.  To fix this, this patch adds
apei_exec_run_optional, which should be used for optional actions.
The original apei_exec_run should be used for mandatory actions.

Cc: Thomas Renninger <trenn@novell.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:34:49 -04:00
Huang Ying 5588340d46 ACPI, APEI, GHES, Do not ratelimit fatal error printk before panic
printk is used by GHES to report hardware errors.  Normally, the
printk will be ratelimited to avoid too many hardware error reports in
kernel log.  Because there may be thousands or even millions of
corrected hardware errors during system running.

That is different for fatal hardware error, because system will go
panic as soon as possible, there will be no more than several error
records.  And these error records are valuable for system fault
diagnosis, so they should not be ratelimited.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:33:57 -04:00
Chen Gong d37afc50e6 ACPI, APEI, ERST, Fix erst-dbg long record reading issue
When we debug ERST table with erst-dbg, if the error record in ERST
table is too long(>4K), it can't be read out.  So this patch increases
the buffer size to 16K to ensure such error records can be read from
ERST table.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:31:51 -04:00
Huang Ying ca7cc5110a ACPI, APEI, ERST, Prevent erst_dbg from loading if ERST is disabled
erst_dbg module can not work when ERST is disabled.  So disable module
loading to provide clearer information to user.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:29:52 -04:00
Huang Ying 4d2b2956ef ACPI, APEI, HEST, Detect duplicated hardware error source ID
The firmware on some machine will report duplicated hardware error
source ID in HEST.  This is considered a firmware bug.  To provide
better warning message, this patch adds duplicated hardware error
source ID detecting and corresponding printk.

This patch fixes #37412 on kernel bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=37412

Reported-by: marconifabio@ubuntu-it.org
Signed-off-by: Huang Ying <ying.huang@intel.com>
Tested-by: Mathias <janedo.spam@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-13 23:27:56 -04:00
Roland Dreier dbee8a0aff x86: remove 32-bit versions of readq()/writeq()
The presense of a writeq() implementation on 32-bit x86 that splits the
64-bit write into two 32-bit writes turns out to break the mpt2sas driver
(and in general is risky for drivers as was discussed in
<http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>).  To fix this,
revert 2c5643b1c5 ("x86: provide readq()/writeq() on 32-bit too") and
follow-on cleanups.

This unfortunately leads to pushing non-atomic definitions of readq() and
write() to various x86-only drivers that in the meantime started using the
definitions in the x86 version of <asm/io.h>.  However as discussed
exhaustively, this is actually the right thing to do, because the right
way to split a 64-bit transaction is hardware dependent and therefore
belongs in the hardware driver (eg mpt2sas needs a spinlock to make sure
no other accesses occur in between the two halves of the access).

Build tested on 32- and 64-bit x86 allmodconfig.

Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com
Acked-by: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Kashyap Desai <Kashyap.Desai@lsi.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Ravi Anand <ravi.anand@qlogic.com>
Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Acked-by: James Bottomley <James.Bottomley@parallels.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:44 -07:00
Luck, Tony 5d2a8342f6 pstore: Fix Kconfig dependencies for apei->pstore
Geert Uytterhoeven ran a dependency checker which kicked out this warning:

+ warning: (ACPI_APEI) selects PSTORE which has unmet direct dependencies (MISC_FILESYSTEMS):  => N/A

Randy confirmed that the fix was to "select MISC_FILESYSTEMS" too.

Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-05-20 10:34:35 -07:00
Chen Gong f5ec25deb2 pstore: fix potential logic issue in pstore read interface
1) in the calling of erst_read, the parameter of buffer size
maybe overflows and cause crash

2) the return value of erst_read should be checked more strictly

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-05-16 11:05:08 -07:00
Chen Gong 06cf91b4b4 pstore: fix pstore filesystem mount/remount issue
Currently after mount/remount operation on pstore filesystem,
the content on pstore will be lost. It is because current ERST
implementation doesn't support multi-user usage, which moves
internal pointer to the end after accessing it. Adding
multi-user support for pstore usage.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-05-16 11:05:00 -07:00
Chen Gong 8d38d74b64 pstore: fix one type of return value in pstore
the return type of function _read_ in pstore is size_t,
but in the callback function of _read_, the logic doesn't
consider it too much, which means if negative value (assuming
error here) is returned, it will be converted to positive because
of type casting. ssize_t is enough for this function.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-05-16 11:04:51 -07:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Len Brown 02e2407858 Merge branch 'linus' into release
Conflicts:
	arch/x86/kernel/acpi/sleep.c

Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-23 02:34:54 -04:00
Huang Ying c413d76820 ACPI, APEI, Add PCIe AER error information printing support
The AER error information printing support is implemented in
drivers/pci/pcie/aer/aer_print.c.  So some string constants, functions
and macros definitions can be re-used without being exported.

The original PCIe AER error information printing function is not
re-used directly because the overall format is quite different.  And
changing the original printing format may make some original users'
scripts broken.

Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-21 22:59:08 -04:00
Huang Ying 885b976fad ACPI, APEI, Add ERST record ID cache
APEI ERST firmware interface and implementation has no multiple users
in mind.  For example, if there is four records in storage with ID: 1,
2, 3 and 4, if two ERST readers enumerate the records via
GET_NEXT_RECORD_ID as follow,

reader 1		reader 2
1
			2
3
			4
-1
			-1

where -1 signals there is no more record ID.

Reader 1 has no chance to check record 2 and 4, while reader 2 has no
chance to check record 1 and 3.  And any other GET_NEXT_RECORD_ID will
return -1, that is, other readers will has no chance to check any
record even they are not cleared by anyone.

This makes raw GET_NEXT_RECORD_ID not suitable for used by multiple
users.

To solve the issue, an in-memory ERST record ID cache is designed and
implemented.  When enumerating record ID, the ID returned by
GET_NEXT_RECORD_ID is added into cache in addition to be returned to
caller.  So other readers can check the cache to get all record ID
available.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-21 22:59:06 -04:00
Tony Luck afe997a183 Pull pstorev4 into release branch 2011-03-16 09:58:31 -07:00
Rafael J. Wysocki d3072e6a7e ACPI: Fix boot problem related to APEI with acpi_disabled set
Commit 415e12b237 ("PCI/ACPI: Request _OSC control once for each root
bridge (v3)") put the acpi_hest_init() call in acpi_pci_root_init() into
a wrong place, presumably because the author confused acpi_pci_disabled
with acpi_disabled.  Bring the code ordering in acpi_pci_root_init()
back to sanity.

Additionally, make sure that hest_disable is set when acpi_disabled is
set, which is going to prevent acpi_hest_parse(), that still may be
executed for acpi_disabled=1 through aer_acpi_firmware_first(), from
crashing because of uninitialized hest_tab.

Reported-and-tested-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-16 11:56:26 -08:00
Linus Torvalds d73b388459 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI/PM: Report wakeup events before resuming devices
  PCI/PM: Use pm_wakeup_event() directly for reporting wakeup events
  PCI: sysfs: Update ROM to include default owner write access
  x86/PCI: make Broadcom CNB20LE driver EMBEDDED and EXPERIMENTAL
  x86/PCI: don't use native Broadcom CNB20LE driver when ACPI is available
  PCI/ACPI: Request _OSC control once for each root bridge (v3)
  PCI: enable pci=bfsort by default on future Dell systems
  PCI/PCIe: Clear Root PME Status bits early during system resume
  PCI: pci-stub: ignore zero-length id parameters
  x86/PCI: irq and pci_ids patch for Intel Patsburg
  PCI: Skip id checking if no id is passed
  PCI: fix __pci_device_probe kernel-doc warning
  PCI: make pci_restore_state return void
  PCI: Disable ASPM if BIOS asks us to
  PCI: Add mask bit definition for MSI-X table
  PCI: MSI: Move MSI-X entry definition to pci_regs.h

Fix up trivial conflicts in drivers/net/{skge.c,sky2.c} that had in the
meantime been converted to not use legacy PCI power management, and thus
no longer use pci_restore_state() at all (and that caused trivial
conflicts with the "make pci_restore_state return void" patch)
2011-01-14 09:29:05 -08:00
Rafael J. Wysocki 415e12b237 PCI/ACPI: Request _OSC control once for each root bridge (v3)
Move the evaluation of acpi_pci_osc_control_set() (to request control of
PCI Express native features) into acpi_pci_root_add() to avoid calling
it many times for the same root complex with the same arguments.
Additionally, check if all of the requisite _OSC support bits are set
before calling acpi_pci_osc_control_set() for a given root complex.

References: https://bugzilla.kernel.org/show_bug.cgi?id=20232
Reported-by: Ozan Caglayan <ozan@pardus.org.tr>
Tested-by: Ozan Caglayan <ozan@pardus.org.tr>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14 08:55:41 -08:00
Linus Torvalds 52cfd503ad Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (59 commits)
  ACPI / PM: Fix build problems for !CONFIG_ACPI related to NVS rework
  ACPI: fix resource check message
  ACPI / Battery: Update information on info notification and resume
  ACPI: Drop device flag wake_capable
  ACPI: Always check if _PRW is present before trying to evaluate it
  ACPI / PM: Check status of power resources under mutexes
  ACPI / PM: Rename acpi_power_off_device()
  ACPI / PM: Drop acpi_power_nocheck
  ACPI / PM: Drop acpi_bus_get_power()
  Platform / x86: Make fujitsu_laptop use acpi_bus_update_power()
  ACPI / Fan: Rework the handling of power resources
  ACPI / PM: Register power resource devices as soon as they are needed
  ACPI / PM: Register acpi_power_driver early
  ACPI / PM: Add function for updating device power state consistently
  ACPI / PM: Add function for device power state initialization
  ACPI / PM: Introduce __acpi_bus_get_power()
  ACPI / PM: Introduce function for refcounting device power resources
  ACPI / PM: Add functions for manipulating lists of power resources
  ACPI / PM: Prevent acpi_power_get_inferred_state() from making changes
  ACPICA: Update version to 20101209
  ...
2011-01-13 20:15:35 -08:00
Len Brown 03b6e6e58d Merge branch 'apei' into release 2011-01-12 05:02:22 -05:00
Huang Ying 81e88fdc43 ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support
Generic Hardware Error Source provides a way to report platform
hardware errors (such as that from chipset). It works in so called
"Firmware First" mode, that is, hardware errors are reported to
firmware firstly, then reported to Linux by firmware. This way, some
non-standard hardware error registers or non-standard hardware link
can be checked by firmware to produce more valuable hardware error
information for Linux.

This patch adds POLL/IRQ/NMI notification types support.

Because the memory area used to transfer hardware error information
from BIOS to Linux can be determined only in NMI, IRQ or timer
handler, but general ioremap can not be used in atomic context, so a
special version of atomic ioremap is implemented for that.

Known issue:

- Error information can not be printed for recoverable errors notified
  via NMI, because printk is not NMI-safe. Will fix this via delay
  printing to IRQ context via irq_work or make printk NMI-safe.

v2:

- adjust printk format per comments.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 03:06:19 -05:00
Tony Luck 0bb77c465f pstore: X86 platform interface using ACPI/APEI/ERST
The 'error record serialization table' in ACPI provides a suitable
amount of persistent storage for use by the pstore filesystem.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-01-03 14:22:11 -08:00
Stefan Weil e8a8b252fb Fix spelling mistakes in comments
milisecond -> millisecond
 meassge -> message

Cc: Kalle Valo <kvalo@adurom.com>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-01-03 13:51:58 +01:00
Huang Ying 32c361f574 ACPI, APEI, Report GHES error information via printk
printk is one of the methods to report hardware errors to user space.
This patch implements hardware error reporting for GHES via printk.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-12-13 23:42:39 -05:00
Huang Ying f59c55d04b ACPI, APEI, Add APEI generic error status printing support
In APEI, Hardware error information reported by firmware to Linux
kernel is in the data structure of APEI generic error status (struct
acpi_hes_generic_status).  While now printk is used by Linux kernel to
report hardware error information to user space.

So, this patch adds printing support for the data structure, so that
the corresponding hardware error information can be reported to user
space via printk.

PCIe AER information printing is not implemented yet.  Will refactor the
original PCIe AER information printing code to avoid code duplicating.

The output format is as follow:

<error record> :=
APEI generic hardware error status
severity: <integer>, <severity string>
section: <integer>, severity: <integer>, <severity string>
flags: <integer>
<section flags strings>
fru_id: <uuid string>
fru_text: <string>
section_type: <section type string>
<section data>

<severity string>* := recoverable | fatal | corrected | info

<section flags strings># :=
[primary][, containment warning][, reset][, threshold exceeded]\
[, resource not accessible][, latent error]

<section type string> := generic processor error | memory error | \
PCIe error | unknown, <uuid string>

<section data> :=
<generic processor section data> | <memory section data> | \
<pcie section data> | <null>

<generic processor section data> :=
[processor_type: <integer>, <proc type string>]
[processor_isa: <integer>, <proc isa string>]
[error_type: <integer>
<proc error type strings>]
[operation: <integer>, <proc operation string>]
[flags: <integer>
<proc flags strings>]
[level: <integer>]
[version_info: <integer>]
[processor_id: <integer>]
[target_address: <integer>]
[requestor_id: <integer>]
[responder_id: <integer>]
[IP: <integer>]

<proc type string>* := IA32/X64 | IA64

<proc isa string>* := IA32 | IA64 | X64

<processor error type strings># :=
[cache error][, TLB error][, bus error][, micro-architectural error]

<proc operation string>* := unknown or generic | data read | data write | \
instruction execution

<proc flags strings># :=
[restartable][, precise IP][, overflow][, corrected]

<memory section data> :=
[error_status: <integer>]
[physical_address: <integer>]
[physical_address_mask: <integer>]
[node: <integer>]
[card: <integer>]
[module: <integer>]
[bank: <integer>]
[device: <integer>]
[row: <integer>]
[column: <integer>]
[bit_position: <integer>]
[requestor_id: <integer>]
[responder_id: <integer>]
[target_id: <integer>]
[error_type: <integer>, <mem error type string>]

<mem error type string>* :=
unknown | no error | single-bit ECC | multi-bit ECC | \
single-symbol chipkill ECC | multi-symbol chipkill ECC | master abort | \
target abort | parity error | watchdog timeout | invalid address | \
mirror Broken | memory sparing | scrub corrected error | \
scrub uncorrected error

<pcie section data> :=
[port_type: <integer>, <pcie port type string>]
[version: <integer>.<integer>]
[command: <integer>, status: <integer>]
[device_id: <integer>:<integer>:<integer>.<integer>
slot: <integer>
secondary_bus: <integer>
vendor_id: <integer>, device_id: <integer>
class_code: <integer>]
[serial number: <integer>, <integer>]
[bridge: secondary_status: <integer>, control: <integer>]

<pcie port type string>* := PCIe end point | legacy PCI end point | \
unknown | unknown | root port | upstream switch port | \
downstream switch port | PCIe to PCI/PCI-X bridge | \
PCI/PCI-X to PCIe bridge | root complex integrated endpoint device | \
root complex event collector

Where, [] designate corresponding content is optional

All <field string> description with * has the following format:

field: <integer>, <field string>

Where value of <integer> should be the position of "string" in <field
string> description. Otherwise, <field string> will be "unknown".

All <field strings> description with # has the following format:

field: <integer>
<field strings>

Where each string in <fields strings> corresponding to one set bit of
<integer>. The bit position is the position of "string" in <field
strings> description.

For more detailed explanation of every field, please refer to UEFI
specification version 2.3 or later, section Appendix N: Common
Platform Error Record.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-12-13 23:42:12 -05:00
Jan Beulich bec4f22a2d ACPI/HEST: adjust section selection
Properly const-, __init-, and __read_mostly-annotate this code.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-12-11 02:01:48 -05:00
Huang Ying 3b38bb5f7f ACPI, APEI, use raw spinlock in ERST
ERST writing may be used in NMI or Machine Check Exception handler. So
it need to use raw spinlock instead of normal spinlock.  This patch
fixes it.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-12-11 02:01:46 -05:00
Linus Torvalds 092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Arnd Bergmann 6038f373a3 llseek: automatically add .llseek fop
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15 15:53:27 +02:00
Len Brown fdb8c58a16 Merge branches 'apei', 'battery-mwh-fix', 'bugzilla-10807', 'bugzilla-14736', 'bugzilla-14679', 'bugzilla-16396', 'launchpad-613381' and 'misc' into release 2010-09-29 15:18:28 -04:00
Huang Ying 0bbba38a61 ACPI, APEI, Fix ERST MOVE_DATA instruction implementation
The src_base and dst_base fields in apei_exec_context are physical
address, so they should be ioremaped before being used in ERST
MOVE_DATA instruction.

Reported-by: Javier Martinez Canillas <martinez.javier@gmail.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-29 14:10:09 -04:00
Huang Ying 23f124ca3d ACPI, APEI, Fix error path for memory allocation
In ERST debug/test support patch, a dynamic allocated buffer is
used. The may-failed memory allocation should be tried firstly before
free the previous buffer.

APEI resource management memory allocation related error path is fixed
too.

v2:

- Fix error messages for APEI resources management

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-29 14:02:35 -04:00
Jin Dongming 1dd6b20e36 ACPI, APEI, HEST Fix the unsuitable usage of platform_data
platform_data in hest_parse_ghes() is used for saving the address of entry
information of erst_tab. When the device is failed to be added, platform_data
will be freed by platform_device_put(). But the value saved in platform_data
should not be freed here. If it is done, it will make system panic.

So I think platform_data should save the address of allocated memory
which saves entry information of erst_tab.

This patch fixed it and I confirmed it on x86_64 next-tree.

v2:
    Transport the pointer of hest_hdr to platform_data using
    platform_device_add_data()

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-29 14:02:26 -04:00
Huang Ying 3a78f96532 ACPI, APEI, Fix APEI related table size checking
On Huang Ying's machine:

erst_tab->header_length == sizeof(struct acpi_table_einj)

but Yinghai reported that on his machine,

erst_tab->header_length == sizeof(struct acpi_table_einj) -
sizeof(struct acpi_table_header)

To make erst table size checking code works on all systems, both
testing are treated as PASS.

Same situation applies to einj_tab->header_length, so corresponding
table size checking is changed in similar way too.

v2:

- Treat both table size as valid

Originally-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-29 13:59:18 -04:00
Lucas De Marchi 58f87ed0d4 ACPI: Fix typos
Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-28 21:38:19 -04:00
Len Brown 95ee46aa86 Merge branch 'linus' into release
Conflicts:
	drivers/acpi/debug.c

Signed-off-by: Len Brown <len.brown@intel.com>
2010-08-15 01:06:31 -04:00
Huang Ying 2ff729d506 ACPI, APEI, ERST debug support
This patch adds debugging/testing support to ERST. A misc device is
implemented to export raw ERST read/write/clear etc operations to user
space. With this patch, we can add ERST testing support to
linuxfirmwarekit ISO (linuxfirmwarekit.org) to verify the kernel
support and the firmware implementation.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-08-14 22:47:55 -04:00
Thomas Gleixner 0a7992c908 acpi: fix bogus preemption logic
The ACPI_PREEMPTION_POINT() logic was introduced in commit 8bd108d
(ACPICA: add preemption point after each opcode parse).  The follow up
commits abe1dfab6, 138d15692, c084ca70 tried to fix the preemption logic
back and forth, but nobody noticed that the usage of
in_atomic_preempt_off() in that context is wrong.

The check which guards the call of cond_resched() is:

    if (!in_atomic_preempt_off() && !irqs_disabled())

in_atomic_preempt_off() is not intended for general use as the comment
above the macro definition clearly says:

 * Check whether we were atomic before we did preempt_disable():
 * (used by the scheduler, *after* releasing the kernel lock)

On a CONFIG_PREEMPT=n kernel the usage of in_atomic_preempt_off() works by
accident, but with CONFIG_PREEMPT=y it's just broken.

The whole purpose of the ACPI_PREEMPTION_POINT() is to reduce the latency
on a CONFIG_PREEMPT=n kernel, so make ACPI_PREEMPTION_POINT() depend on
CONFIG_PREEMPT=n and remove the in_atomic_preempt_off() check.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16210

[akpm@linux-foundation.org: fix build]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Len Brown <lenb@kernel.org>
Cc: Francois Valenduc <francois.valenduc@tvcablenet.be>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-12 08:43:29 -07:00
Huang Ying 7ad6e94355 ACPI, APEI, Manage GHES as platform devices
Register GHES during HEST initialization as platform devices. And make
GHES driver into platform device driver. So that the GHES driver
module can be loaded automatically when there are GHES available.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-08-08 14:55:52 -04:00
Huang Ying ad4ecef2f1 ACPI, APEI, Rename CPER and GHES severity constants
The abbreviation of severity should be SEV instead of SER, so the CPER
severity constants are renamed accordingly. GHES severity constants
are renamed in the same way too.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-08-08 14:55:26 -04:00
Huang Ying 2663b3f235 ACPI, APEI, Fix a typo of error path of apei_resources_request
Fix a typo of error path of apei_resources_request. release_mem_region
and release_region should be interchange.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-08-08 14:55:12 -04:00
Daniel J Blueman 980533b018 correct console log level when ERST ACPI table is not found
When booting 2.6.35-rc3 on a x86 system without an ERST ACPI table with
the 'quiet' option, we still observe an "ERST: Table is not found!"
warning.

Quiesce it to the same info log level as the other 'table not found'
warnings.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-01 18:40:29 -07:00
Tejun Heo e0fb8c4185 acpi: update gfp/slab.h includes
Implicit slab.h inclusion via percpu.h is about to go away.  Make sure
gfp.h or slab.h is included as necessary.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
2010-06-28 10:19:19 +10:00
Huang Ying 6e320ec1d9 ACPI, APEI, EINJ injection parameters support
Some hardware error injection needs parameters, for example, it is
useful to specify memory address and memory address mask for memory
errors.

Some BIOSes allow parameters to be specified via an unpublished
extension. This patch adds support to it. The parameters will be
ignored on machines without necessary BIOS support.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:42:08 -04:00
Huang Ying a08f82d080 ACPI, APEI, Error Record Serialization Table (ERST) support
ERST is a way provided by APEI to save and retrieve hardware error
record to and from some simple persistent storage (such as flash).

The Linux kernel support implementation is quite simple and workable
in NMI context. So it can be used to save hardware error record into
flash in hardware error exception or NMI handler, where other more
complex persistent storage such as disk is not usable. After saving
hardware error records via ERST in hardware error exception or NMI
handler, the error records can be retrieved and logged into disk or
network after a clean reboot.

For more information about ERST, please refer to ACPI Specification
version 4.0, section 17.4.

This patch incorporate fixes from Jin Dongming.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:41:31 -04:00
Huang Ying d334a49113 ACPI, APEI, Generic Hardware Error Source memory error support
Generic Hardware Error Source provides a way to report platform
hardware errors (such as that from chipset). It works in so called
"Firmware First" mode, that is, hardware errors are reported to
firmware firstly, then reported to Linux by firmware. This way, some
non-standard hardware error registers or non-standard hardware link
can be checked by firmware to produce more valuable hardware error
information for Linux.

Now, only SCI notification type and memory errors are supported. More
notification type and hardware error type will be added later. These
memory errors are reported to user space through /dev/mcelog via
faking a corrected Machine Check, so that the error memory page can be
offlined by /sbin/mcelog if the error count for one page is beyond the
threshold.

On some machines, Machine Check can not report physical address for
some corrected memory errors, but GHES can do that. So this simplified
GHES is implemented firstly.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:41:16 -04:00
Huang Ying 06d65deade ACPI, APEI, UEFI Common Platform Error Record (CPER) header
CPER stands for Common Platform Error Record, it is the hardware error
record format used to describe platform hardware error by various APEI
tables, such as ERST, BERT and HEST etc.

For more information about CPER, please refer to Appendix N of UEFI
Specification version 2.3.

This patch mainly includes the data structure difinition header file
used by other files.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:41:05 -04:00
Huang Ying e40213450b ACPI, APEI, EINJ support
EINJ provides a hardware error injection mechanism, this is useful for
debugging and testing of other APEI and RAS features.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:35:29 -04:00
Huang Ying 9dc9666416 ACPI, APEI, HEST table parsing
HEST describes error sources in detail; communicating operational
parameters (i.e. severity levels, masking bits, and threshold values)
to OS as necessary. It also allows the platform to report error
sources for which OS would typically not implement support (for
example, chipset-specific error registers).

HEST information may be needed by other subsystems. For example, HEST
PCIE AER error source information describes whether a PCIE root port
works in "firmware first" mode, this is needed by general PCIE AER
error subsystem. So a public HEST tabling parsing interface is
provided.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:35:06 -04:00
Huang Ying a643ce207f ACPI, APEI, APEI supporting infrastructure
APEI stands for ACPI Platform Error Interface, which allows to report
errors (for example from the chipset) to the operating system. This
improves NMI handling especially. In addition it supports error
serialization and error injection.

For more information about APEI, please refer to ACPI Specification
version 4.0, chapter 17.

This patch provides some common functions used by more than one APEI
tables, mainly framework of interpreter for EINJ and ERST.

A machine readable language is defined for EINJ and ERST for OS to
execute, and so to drive the firmware to fulfill the corresponding
functions. The machine language for EINJ and ERST is compatible, so a
common framework is defined for them.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-19 22:34:30 -04:00