Given the fact that the ACPI "EINJ" (error injection) facility is not
universally available, implement software infrastructure to validate the
memcpy_mcsafe() exception handling implementation.
For each potential read exception point in memcpy_mcsafe(), inject a
emulated exception point at the address identified by 'mcsafe_inject'
variable. With this infrastructure implement a test to validate that the
'bytes remaining' calculation is correct for a range of various source
buffer alignments.
This code is compiled out by default. The CONFIG_MCSAFE_DEBUG
configuration symbol needs to be manually enabled by editing
Kconfig.debug. I.e. this functionality can not be accidentally enabled
by a user / distro, it's only for development.
Cc: <x86@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Sysfs userspace tooling generally expects the kernel to emit a newlines
when reading sysfs attributes.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The nfit_test.1 bus provides a pmem topology without blk-aperture
enabling, so it presents different failure modes for label space
handling. Allow custom DSM command error injection.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Given that libnvdimm driver stack takes specific actions on DIMM command
error codes like -EACCES, provide a facility to inject custom failures.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The default value for smart ctrl_temperature was the same as the
threshold for ctrl_temperature. As a result, any arbitrary smart
injection to the nfit_test dimm could cause this alarm to trigger
and cause an acpi notification. Drop the default value to below the
threshold, so that unrelated injections don't trigger notifications.
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add support for the smart injection command in the nvdimm unit test
framework. This allows for directly injecting to smart fields and flags
that are supported in the injection command. If the injected values are
past the threshold, then an acpi notification is also triggered.
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When you load nfit_test you currently see the following error in dmesg:
nfit_test nfit_test.0: found a zero length table '0' parsing nfit
This happens because when we parse the nfit_test.0 table via
acpi_nfit_init(), we specify a size of nfit_test->nfit_size. For the first
pass through nfit_test.0 where (t->setup_hotplug == 0) this is the size of
the entire buffer we allocated, including space for the hot plug
structures, not the size that we've actually filled in.
Fix this by only trying to parse the size of the structures that we've
filled in.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
It turns out that we were overrunning the 'nfit_buf' buffer in
nfit_test0_setup() in the (t->setup_hotplug == 1) case because we failed to
correctly account for all of the acpi_nfit_memory_map structures.
Fix the structure count which will increase the allocation size of
'nfit_buf' in nfit_test0_alloc(). Also add some WARN_ON()s to
nfit_test0_setup() and nfit_test1_setup() to catch future issues where the
size of the buffer doesn't match the amount of data we're writing.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In nfit_test0_setup() and nfit_test1_setup() we keep an 'offset' value
which we use to calculate where in our 'nfit_buf' we will place our next
structure. The handling of 'offset' and the calculation of the placement
of the next structure is a bit inconsistent, though. We don't update
'offset' after we insert each structure, sometimes causing us to update it
for multiple structures' sizes at once. When calculating the position of
the next structure we aren't always able to just use 'offset', but
sometimes have to add in other structure sizes as well.
Fix this by updating 'offset' after each structure insertion in a
consistent way, allowing us to always calculate the position of the next
structure to be inserted by just using 'nfit_buf + offset'.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The libnvdimm unit tests will fail when they are run against the
production / in-tree version of libnvdimm.ko or nfit.ko due to
symbols not being mocked per nfit_test's expectation. For example,
nfit_test expects acpi_evaluate_dsm() to be replaced by
__wrap_acpi_evaluate_dsm() to test how acpi_nfit_ctl() responds to
different stimuli.
Create a test-only symbol name that nfit_test links against to cause
module load failures when the wrong module is present.
For example, with this change, attempts to use the wrong module will
report:
nfit_test: Unknown symbol libnvdimm_test (err 0)
Reported-by: Dave Jiang <dave.jiang@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Adding support code to simulate the enabling of LSS status in support of
the Intel DSM v1.6 Function Index 10: Enable Latch System Shutdown Status.
This is only for testing of libndctl support for LSS enable. The actual
functionality requires a reboot and therefore is not simulated. The enable
value is not recorded in nfit_test since there's no DSM to actually query
the current status of the LSS enable.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Adding support in nfit_test for DSM v1.6 firmware update sequence. The test
will simulate the flashing of firmware to the DIMM. A bogus version string
will be returned as the test has no idea how to parse the firmware binary.
Any bogus binary can be used to "update" as the actual binary is not copied
into the kernel.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
[ vishal: also move smart calls into the nd_cmd_call block ]
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Adding NFIT platform capabilities sub table in nfit_test simulated ACPI
NFIT table. Only the first NFIT table is added with the capability
sub-table.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Allow the smart_threshold values to be changed via the 'set smart
threshold command' and trigger notifications when the thresholds are
met.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The kernel's ND_IOCTL_SMART_THRESHOLD command is based on a payload
definition that has become broken / out-of-sync with recent versions of
the NVDIMM_FAMILY_INTEL definition. Deprecate the use of the
ND_IOCTL_SMART_THRESHOLD command in favor of the ND_CMD_CALL approach
taken by NVDIMM_FAMILY_{HPE,MSFT}, where we can manage the per-vendor
variance in userspace.
In a couple years, when the new scheme is widely deployed in userspace
packages, the ND_IOCTL_SMART_THRESHOLD support can be removed. For now
we prevent new binaries from compiling against the kernel header
definitions, but kernel still compatible with old binaries. The
libndctl.h [1] header is now the authoritative interface definition for
NVDIMM SMART.
[1]: https://github.com/pmem/ndctl
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Validate command parsing in acpi_nfit_ctl for the clear error command.
This tests for a crash condition introduced by commit 4b27db7e26
"acpi, nfit: add support for the _LSI, _LSR, and _LSW label methods".
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Ensure that the in/out sizes passed in the nd_cmd_package are sane for
the fixed output size commands (i.e. inject error and clear injected
error).
Reported-by: Dariusz Dokupil <dariusz.dokupil@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The injected badrange entries can only be cleared from the kernel's
accounting by writing to the affected blocks, so when such a write sends
the clear errror DSM to nfit_test, also clear the ranges from
nfit_test's badrange list. This lets an 'ARS Inject error status' DSM to
return the correct status, omitting the cleared ranges.
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add nfit_test emulation for the new ACPI 6.2 error injectino DSMs.
This will allow unit tests to selectively inject the errors they wish to
test for.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[vishal: Move injection functions to ND_CMD_CALL]
[vishal: Add support for the notification option]
[vishal: move an nfit_test private definition into a local header]
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
To test ndctl list which use interface of Translate SPA,
nfit_test needs to emulates it.
This test module searches region which includes SPA and
returns 1 dimm handle which is last one.
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Improve coverage of NVDIMM-N test scenarios by providing a test bus
incapable of label operations.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
mmio_flush_range() suffers from a lack of clearly-defined semantics,
and is somewhat ambiguous to port to other architectures where the
scope of the writeback implied by "flush" and ordering might matter,
but MMIO would tend to imply non-cacheable anyway. Per the rationale
in 67a3e8fe90 ("nd_blk: change aperture mapping from WC to WB"), the
only existing use is actually to invalidate clean cache lines for
ARCH_MEMREMAP_PMEM type mappings *without* writeback. Since the recent
cleanup of the pmem API, that also now happens to be the exact purpose
of arch_invalidate_pmem(), which would be a far more well-defined tool
for the job.
Rather than risk potentially inconsistent implementations of
mmio_flush_range() for the sake of one callsite, streamline things by
removing it entirely and instead move the ARCH_MEMREMAP_PMEM related
definitions up to the libnvdimm level, so they can be shared by NFIT
as well. This allows NFIT to be enabled for arm64.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The root cause of panic is the num_pm of nfit_test1 is wrong.
Though 1 is specified for num_pm at nfit_test_init(), it must be 2,
because nfit_test1->spa_set[] array has 2 elements.
Since the array is smaller than expected, the driver breaks other area.
(it is often the link list of devres).
As a result, panic occurs like the following example.
CPU: 4 PID: 2233 Comm: lt-libndctl Tainted: G O 4.12.0-rc1+ #12
RIP: 0010:__list_del_entry_valid+0x6c/0xa0
Call Trace:
release_nodes+0x76/0x260
devres_release_all+0x3c/0x50
device_release_driver_internal+0x159/0x200
device_release_driver+0x12/0x20
bus_remove_device+0xfd/0x170
device_del+0x1e8/0x330
platform_device_del+0x28/0x90
platform_device_unregister+0x12/0x30
nfit_test_exit+0x2a/0x93b [nfit_test]
Cc: <stable@vger.kernel.org>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
bytes. Instead we convert them to use guid_t type. At the same time we
convert current users.
acpi_str_to_uuid() becomes useless after the conversion and it's safe to
get rid of it.
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The workqueue may still be running when the devres callbacks start
firing to deallocate an acpi_nfit_desc instance. Stop and flush the
workqueue before letting any other devres de-allocations proceed.
Reported-by: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Keep the nfit_test instances alive until after nfit_test_teardown(), as
we may be doing resource lookups until the final un-registrations have
completed. This fixes crashes of the form.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: __release_resource+0x12/0x90
Call Trace:
remove_resource+0x23/0x40
__wrap_remove_resource+0x29/0x30 [nfit_test_iomap]
acpi_nfit_remove_resource+0xe/0x10 [nfit]
devm_action_release+0xf/0x20
release_nodes+0x16d/0x2b0
devres_release_all+0x3c/0x60
device_release+0x21/0x90
kobject_release+0x6a/0x170
kobject_put+0x2f/0x60
put_device+0x17/0x20
platform_device_unregister+0x20/0x30
nfit_test_exit+0x36/0x960 [nfit_test]
Reported-by: Linda Knippers <linda.knippers@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add a simulated dimm with an ACPI_NFIT_MEM_MAP_FAILED indication, and
set the ACPI_NFIT_MEM_HEALTH_ENABLED flag on all the dimms where
nfit_test simulates health events, but spread it out over several
redundant memdev entries to test that the nfit driver coalesces all the
flags.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For testing changes to the iset cookie algorithm we need a value that is
constant from run-to-run.
Stop including dynamic data in the emulated region_offset values. Also,
pick values that sort in a different order depending on whether the
comparison is a memcmp() of two 8-byte arrays or subtraction of two
64-bit values.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A recent flurry of bug discoveries in the nfit driver's DSM marshalling
routine has highlighted the fact that we do not have unit test coverage
for this routine. Add a self-test of acpi_nfit_ctl() routine before
probing the "nfit_test.0" device. This mocks stimulus to acpi_nfit_ctl()
and if any of the tests fail "nfit_test.0" will be unavailable causing
the rest of the tests to not run / fail.
This unit test will also be a place to land reproductions of quirky BIOS
behavior discovered in the field and ensure the kernel does not regress
against implementations it has seen in practice.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Update nfit_test infrastructure to enable labels for the dimm on the
nfit_test.1 bus. This bus has a pmem region without aliased blk space,
so it is a candidate for dynamically enabling label support by writing
a namespace index block.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Update nfit_test to handle multiple sub-allocations within a given pmem
region. The mock resource now tracks and un-tracks sub-ranges as they
are requested and released (either explicitly or via devm callback).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add an nfit_test specific attribute for gating whether a get_config_size
DSM, or any DSM for that matter, succeeds or fails. The get_config_size
DSM is initial motivation since that is the first command libnvdimm core
issues to determine the state of the namespace label area.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Commit 480b6837aa "nvdimm: fix PHYS_PFN/PFN_PHYS mixup" identified
that we were passing an invalid address to devm_nvdimm_ioremap(). With
that fixed it exposed a bug in the memory reservation size for flush
hint tables. Since we map a full page we need to mock a full page of
memory to back the flush hint table entries.
Cc: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Trigger an nmemX/nfit/flags attribute to fire an event whenever a
smart-threshold DSM is received.
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We have had a couple bugs in this implementation in the past and before
we add another ->notify() implementation for nvdimm devices, lets allow
this routine to be exercised via nfit_test.
Rewrite acpi_nfit_notify() in terms of a generic struct device and
acpi_handle parameter, and then implement a mock acpi_evaluate_object()
that returns a _FIT payload.
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The unit tests crash when hotplug races the previous probe. This race
requires that the loading of the nfit_test module be terminated with
SIGTERM, and the module to be unloaded while the ars scan is still
running.
In contrast to the normal nfit driver, the unit test calls
acpi_nfit_init() twice to simulate hotplug, whereas the nominal case
goes through the acpi_nfit_notify() event handler. The
acpi_nfit_notify() path is careful to flush the previous region
registration before servicing the hotplug event. The unit test was
missing this guarantee.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810cdce7>] pwq_activate_delayed_work+0x47/0x170
[..]
Call Trace:
[<ffffffff810ce186>] pwq_dec_nr_in_flight+0x66/0xa0
[<ffffffff810ce490>] process_one_work+0x2d0/0x680
[<ffffffff810ce331>] ? process_one_work+0x171/0x680
[<ffffffff810ce88e>] worker_thread+0x4e/0x480
[<ffffffff810ce840>] ? process_one_work+0x680/0x680
[<ffffffff810ce840>] ? process_one_work+0x680/0x680
[<ffffffff810d5343>] kthread+0xf3/0x110
[<ffffffff8199846f>] ret_from_fork+0x1f/0x40
[<ffffffff810d5250>] ? kthread_create_on_node+0x230/0x230
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
While testing the new on-demand ARS patches we discovered that
differences between the nfit_test and normal nfit driver shutdown paths
can leak resources. Unify the shutdown paths to trigger via a devm_
callback when the acpi_desc->dev is unbound from its driver.
Reviewed-by: Lee, Chun-Yi <jlee@suse.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Let the provider module be explicitly passed in rather than implicitly
assumed by the module that calls nvdimm_bus_register(). This is in
preparation for unifying the nfit and nfit_test driver teardown paths.
Reviewed-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Pass the nfit buffer as a parameter rather than hanging it off of
acpi_desc.
Reviewed-by: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
New for ACPI 6.1, these fields are used in the common dimm
representation format defined by section 5.2.25.9 "NVDIMM representation
format".
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Test the virtual disk ranges that platform firmware like EDK2/OVMF might
emit.
Tested-by: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
DMA_CMA is incompatible with SWIOTLB used in enterprise distro
configurations. Switch to vmalloc() allocations for all resources.
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Clarify the distinction between "commands", the ioctls userspace calls
to request the kernel take some action on a given dimm device, and
"_DSMs", the actual function numbers used in the firmware interface to
the DIMM. _DSMs are ACPI specific whereas commands are Linux kernel
generic.
This is in preparation for breaking the 1:1 implicit relationship
between the kernel ioctl number space and the firmware specific function
numbers.
Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>