Allow for the enablement of MEF and the support for the extended
epoch in SIE and VSIE for the extended guest TOD-Clock.
A new interface is used for getting/setting a guest's extended TOD-Clock
that uses a single ioctl invocation, KVM_S390_VM_TOD_EXT. Since the
host time is a moving target that might see an epoch switch or STP sync
checks we need an atomic ioctl and cannot use the exisiting two
interfaces. The old method of getting and setting the guest TOD-Clock is
still retained and is used when the old ioctls are called.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If the exit flag of SIE indicates that a machine check has happened
during guest's running and needs to be injected, inject it to the guest
accordingly.
But some machine checks, e.g. Channel Report Pending (CRW), refer to
host conditions only (the guest's channel devices are not managed by
the kernel directly) and are therefore not injected into the guest.
External Damage (ED) is also not reinjected into the guest because ETR
conditions are gone in Linux and STP conditions are not enabled in the
guest, and ED contains only these 8 ETR and STP conditions.
In general, instruction-processing damage, system recovery,
storage error, service-processor damage and channel subsystem damage
will be reinjected into the guest, and the remain (System damage,
timing-facility damage, warning, ED and CRW) will be handled on the host.
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If the KSS facility is available on the machine, we also make it
available for our KVM guests.
The KSS facility bypasses storage key management as long as the guest
does not issue a related instruction. When that happens, the control is
returned to the host, which has to turn off KSS for a guest vcpu
before retrying the instruction.
Signed-off-by: Corey S. McQuay <csmcquay@linux.vnet.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch adds guarded storage support for KVM guest. We need to
setup the necessary control blocks, the kvm_run structure for the
new registers, the necessary wrappers for VSIE, as well as the
machine check save areas.
GS is enabled lazily and the register saving and reloading is done in
KVM code. As this feature adds new content for migration, we provide
a new capability for enablement (KVM_CAP_S390_GS).
Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's replace the bitmasks by defines. Reconstructed from code, comments
and commit messages.
Tried to keep the defines short and map them to feature names. In case
they don't completely map to features, keep them in the stye of ICTL
defines.
This effectively drops all "U" from the existing numbers. I think this
should be fine (as similarly done for e.g. ICTL defines).
I am not 100% sure about the ECA_MVPGI and ECA_PROTEXCI bits as they are
always used in pairs.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170313104828.13362-1-david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[some renames, add one missing place]
When we get a PER i-fetch event on an EXECUTE or EXECUTE RELATIVE LONG
instruction, because the executed instruction generated a PER i-fetch
event, then the PER address points at the EXECUTE function, not the
fetched one.
Therefore, when filtering PER events, we have to take care of the
really fetched instruction, which we can only get by reading in guest
virtual memory.
For icpt code 4 and 56, we directly have additional information about an
EXECUTE instruction at hand. For icpt code 8, we always have to read
in guest virtual memory.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[small fixes]
sparse with __CHECK_ENDIAN__ shows that ar_t was never properly
used across KVM on s390. We can now:
- fix all places
- do not make ar_t special
Since ar_t is just used as a register number (no endianness issues
for u8), and all other register numbers are also just plain int
variables, let's just use u8, which matches the __u8 in the userspace
ABI for the memop ioctl.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
If the SCA entries aren't used by the hardware (no SIGPIF), we
can simply not set the entries, stick to the basic sca and allow more
than 64 VCPUs.
To hinder any other facility from using these entries, let's properly
provoke intercepts by not setting the MCN and keeping the entries
unset.
This effectively allows when running KVM under KVM (vSIE) or under z/VM to
provide more than 64 VCPUs to a guest. Let's limit it to 255 for now, to
not run into problems if the CPU numbers are limited somewhere else.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Only enable runtime instrumentation if the guest issues an RI related
instruction or if userspace changes the riccb to a valid state.
This makes entry/exit a tiny bit faster.
Initial patch by Christian Borntraeger
Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Vector registers are only to be stored if the facility is available
and if the guest has set up the machine check extended save area.
If anything goes wrong while writing the vector registers, the vector
registers are to be marked as invalid. Please note that we are allowed
to write the registers although they are marked as invalid.
Machine checks and "store status" SIGP orders are two different concepts,
let's correctly separate these. As the SIGP part is completely handled in
user space, we can drop it.
This patch is based on a patch from Cornelia Huck.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
In case we have to emuluate an instruction or part of it (instruction,
partial instruction, operation exception), we have to inject a PER
instruction-fetching event for that instruction, if hardware told us to do
so.
In case we retry an instruction, we must not inject the PER event.
Please note that we don't filter the events properly yet, so guest
debugging will be visible for the guest.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
As we want to make use of CPUSTAT_WAIT also when a VCPU is not idle but
to force interception of external calls, let's check in the bitmap instead.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Whenever we want to wake up a VCPU (e.g. when injecting an IRQ), we
have to kick it out of vsie, so the request will be handled faster.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch adds basic support for nested virtualization on s390x, called
VSIE (virtual SIE) and allows it to be used by the guest if the necessary
facilities are supported by the hardware and enabled for the guest.
In order to make this work, we have to shadow the sie control block
provided by guest 2. In order to gain some performance, we have to
reuse the same shadow blocks as good as possible. For now, we allow
as many shadow blocks as we have VCPUs (that way, every VCPU can run the
VSIE concurrently).
We have to watch out for the prefix getting unmapped out of our shadow
gmap and properly get the VCPU out of VSIE in that case, to fault the
prefix pages back in. We use the PROG_REQUEST bit for that purpose.
This patch is based on an initial prototype by Tobias Elpelt.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
For now, we only have an interface to query and configure facilities
indicated via STFL(E). However, we also have features indicated via
SCLP, that have to be indicated to the guest by user space and usually
require KVM support.
This patch allows user space to query and configure available cpu features
for the guest.
Please note that disabling a feature doesn't necessarily mean that it is
completely disabled (e.g. ESOP is mostly handled by the SIE). We will try
our best to disable it.
Most features (e.g. SCLP) can't directly be forwarded, as most of them need
in addition to hardware support, support in KVM. As we later on want to
turn these features in KVM explicitly on/off (to simulate different
behavior), we have to filter all features provided by the hardware and
make them configurable.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Store Hypervisor Information is an emulated z/VM instruction that
provides a guest with basic information about the layers it is running
on. This includes information about the cpu configuration of both the
machine and the lpar, as well as their names, machine model and
machine type. This information enables an application to determine the
maximum capacity of CPs and IFLs available to software.
The instruction is available whenever the facility bit 74 is set,
otherwise executing it results in an operation exception.
It is important to check the validity flags in the sections before
using data from any structure member. It is not guaranteed that all
members will be valid on all machines / machine configurations.
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We can fit the 2k for the STFLE interpretation and the crypto
control block into one DMA page. As we now only have to allocate
one DMA page, we can clean up the code a bit.
As a nice side effect, this also fixes a problem with crycbd alignment in
case special allocation debug options are enabled, debugged by Sascha
Silbe.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The cpu timer is a mean to measure task execution time. We want
to account everything for a VCPU for which it is responsible. Therefore,
if the VCPU wants to sleep, it shall be accounted for it.
We can easily get this done by not disabling cpu timer accounting when
scheduled out while sleeping because of enabled wait.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We want to manually step the cpu timer in certain scenarios in the future.
Let's abstract any access to the cpu timer, so we can hide the complexity
internally.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We have some confusion about ilc vs. ilen in our current code. So let's
correctly use the term ilen when dealing with (ilc << 1).
Program irq injection didn't take care of the correct ilc in case of
irqs triggered by EXECUTE functions, let's provide one function
kvm_s390_get_ilen() to take care of all that.
Also, manually specifying in intercept handlers the size of the
instruction (and sometimes overwriting that value for EXECUTE internally)
doesn't make too much sense. So also provide the functions:
- kvm_s390_retry_instr to retry the currently intercepted instruction
- kvm_s390_rewind_psw to rewind the PSW without internal overwrites
- kvm_s390_forward_psw to forward the PSW
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch adds new structures and updates some existing ones to
provide the base for Extended SCA functionality.
The old sca_* structures were renamed to bsca_* to keep things uniform.
The access to fields of SIGP controls were turned into bitfields instead
of hardcoded bitmasks.
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch generalizes access to the IPTE controls, which is a part of SCA.
This is to prepare for upcoming introduction of Extended SCA support.
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's factor this out and always use get_tod_clock_fast() when
reading the guest TOD.
STORE CLOCK FAST does not do serialization and, therefore, might
result in some fuzziness between different processors in a way
that subsequent calls on different CPUs might have time stamps that
are earlier. This semantics is fine though for all KVM use cases.
To make it obvious that the new function has STORE CLOCK FAST
semantics we name it kvm_s390_get_tod_clock_fast.
With this patch, we only have a handful of places were we
have to care about STP sync (using preempt_disable() logic).
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's move that whole logic into one function. We now always use unsigned
values when calculating the epoch (to avoid over/underflow defined).
Also, we always have to get all VCPUs out of SIE before doing the update
to avoid running differing VCPUs with different TODs.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The main reason to keep program injection in kernel separated until now
was that we were able to do some checking, if really only the owning
thread injects program interrupts (via waitqueue_active(li->wq)).
This BUG_ON was never triggered and the chances of really hitting it, if
another thread injected a program irq to another vcpu, were very small.
Let's drop this check and turn kvm_s390_inject_program_int() and
kvm_s390_inject_prog_irq() into simple inline functions that makes use of
kvm_s390_inject_vcpu().
__must_check can be dropped as they are implicitely given by
kvm_s390_inject_vcpu(), to avoid ugly long function prototypes.
Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
In addition to the per VM debug logs, let's provide a global
one for KVM-wide events, like new guests or fatal errors.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
SPX (SET PREFIX) and SIGP (Set prefix) can change the prefix
register of a CPU. As sigp set prefix may be handled in user
space (KVM_CAP_S390_USER_SIGP), we would not log the changes
triggered via SIGP in that case. Let's have just one VCPU_EVENT
at the central location that tracks prefix changes.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
As we already only enable CMMA when userspace requests it, we can
safely move the additional checks to the request handler and avoid
doing them multiple times. This also tells userspace if CMMA is
available.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Some updates to the control blocks need to be done in a way that
ensures that no CPU is within SIE. Provide wrappers around the
s390_vcpu_block functions and adopt the TOD migration code to
update in a guaranteed fashion. Also rename these functions to
have the kvm_s390_ prefix as everything else.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
exit_sie_sync is used to kick CPUs out of SIE and prevent reentering at
any point in time. This is used to reload the prefix pages and to
set the IBS stuff in a way that guarantees that after this function
returns we are no longer in SIE. All current users trigger KVM requests.
The request must be set before we block the CPUs to avoid races. Let's
make this implicit by adding the request into a new function
kvm_s390_sync_requests that replaces exit_sie_sync and split out
s390_vcpu_block and s390_vcpu_unblock, that can be used to keep
CPUs out of SIE independent of requests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This patch adds support to migrate vcpu interrupts. Two new vcpu ioctls
are added which get/set the complete status of pending interrupts in one
go. The ioctls are marked as available with the new capability
KVM_CAP_S390_IRQ_STATE.
We can not use a ONEREG, as the number of pending local interrupts is not
constant and depends on the number of CPUs.
To retrieve the interrupt state we add an ioctl KVM_S390_GET_IRQ_STATE.
Its input parameter is a pointer to a struct kvm_s390_irq_state which
has a buffer and length. For all currently pending interrupts, we copy
a struct kvm_s390_irq into the buffer and pass it to userspace.
To store interrupt state into a buffer provided by userspace, we add an
ioctl KVM_S390_SET_IRQ_STATE. It passes a struct kvm_s390_irq_state into
the kernel and injects all interrupts contained in the buffer.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This patch makes interrupt handling compliant to the z/Architecture
Principles of Operation with regard to interrupt priorities.
Add a bitmap for pending floating interrupts. Each bit relates to a
interrupt type and its list. A turned on bit indicates that a list
contains items (interrupts) which need to be delivered. When delivering
interrupts on a cpu we can merge the existing bitmap for cpu-local
interrupts and floating interrupts and have a single mechanism for
delivery.
Currently we have one list for all kinds of floating interrupts and a
corresponding spin lock. This patch adds a separate list per
interrupt type. An exception to this are service signal and machine check
interrupts, as there can be only one pending interrupt at a time.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The patch represents capability KVM_CAP_S390_VECTOR_REGISTERS by means
of the SIMD facility bit. This allows to a) disable the use of SIMD when
used in conjunction with a not-SIMD-aware QEMU, b) to enable SIMD when
used with a SIMD-aware version of QEMU and c) finally by means of a QEMU
version using the future cpu model ioctls.
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Tested-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
In access register mode, the write_guest() read_guest() and other
functions will invoke the access register translation, which
requires an ar, designated by one of the instruction fields.
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Store additional status in the machine check handler, in order to
collect status (such as vector registers) that is not defined by
store status.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The reinjection of an I/O interrupt can fail if the list is at the limit
and between the dequeue and the reinjection, another I/O interrupt is
injected (e.g. if user space floods kvm with I/O interrupts).
This patch avoids this memory leak and returns -EFAULT in this special
case. This error is not recoverable, so let's fail hard. This can later
be avoided by not dequeuing the interrupt but working directly on the
locked list.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Most facility related decisions in KVM have to take into account:
- the facilities offered by the underlying run container (LPAR/VM)
- the facilities supported by the KVM code itself
- the facilities requested by a guest VM
This patch adds the KVM driver requested facilities to the test routine.
It additionally renames struct s390_model_fac to kvm_s390_fac and its field
names to be more meaningful.
The semantics of the facilities stored in the KVM architecture structure
is changed. The address arch.model.fac->list now points to the guest
facility list and arch.model.fac->mask points to the KVM facility mask.
This patch fixes the behaviour of KVM for some facilities for guests
that ignore the guest visible facility bits, e.g. guests could use
transactional memory intructions on hosts supporting them even if the
chosen cpu model would not offer them.
The userspace interface is not affected by this change.
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The patch introduces facilities and cpu_ids per virtual machine.
Different virtual machines may want to expose different facilities and
cpu ids to the guest, so let's make them per-vm instead of global.
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Only one external call may be pending at a vcpu at a time. For this
reason, we have to detect whether the SIGP externcal call interpretation
facility is available. If so, all external calls have to be injected
using this mechanism.
SIGP EXTERNAL CALL orders have to return whether another external
call is already pending. This check was missing until now.
SIGP SENSE hasn't returned yet in all conditions whether an external
call was pending.
If a SIGP EXTERNAL CALL irq is to be injected and one is already
pending, -EBUSY is returned.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
As a SIGP STOP is an interrupt with the least priority, it may only result
in stop of the vcpu when no other interrupts are left pending.
To detect whether a non-stop irq is pending, we need a way to mask out
stop irqs from the general kvm_cpu_has_interrupt() function. For this
reason, the existing function (with an outdated name) is replaced by
kvm_s390_vcpu_has_irq() which allows to mask out pending stop irqs.
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch removes the famous action_bits and moves the handling of
SIGP STOP AND STORE STATUS directly into the SIGP STOP interrupt.
The new local interrupt infrastructure is used to track pending stop
requests.
STOP irqs are the only irqs that don't get actively delivered. They
remain pending until the stop function is executed (=stop intercept).
If another STOP irq is already pending, -EBUSY will now be returned
(needed for the SIGP handling code).
Migration of pending SIGP STOP (AND STORE STATUS) orders should now
be supported out of the box.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch adapts handling of local interrupts to be more compliant with
the z/Architecture Principles of Operation and introduces a data
structure
which allows more efficient handling of interrupts.
* get rid of li->active flag, use bitmap instead
* Keep interrupts in a bitmap instead of a list
* Deliver interrupts in the order of their priority as defined in the
PoP
* Use a second bitmap for sigp emergency requests, as a CPU can have
one request pending from every other CPU in the system.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
A couple of our interception handlers rewind the PSW to the beginning
of the instruction to run the intercepted instruction again during the
next SIE entry. This normally works fine, but there is also the
possibility that the instruction did not get run directly but via an
EXECUTE instruction.
In this case, the PSW does not point to the instruction that caused the
interception, but to the EXECUTE instruction! So we've got to rewind the
PSW to the beginning of the EXECUTE instruction instead.
This is now accomplished with a new helper function kvm_s390_rewind_psw().
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Using the new kvm_register_device_ops() interface makes us get rid of
an #ifdef in common code.
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We now propagate interrupt injection errors back to the ioctl. We
should mark functions that might fail with __must_check.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Currently we just kill the userspace process and exit the thread
immediatly without making sure that we don't hold any locks etc.
Improve this by making KVM_RUN return -EFAULT if the lowcore is not
mapped during interrupt delivery. To achieve this we need to pass
the return code of guest memory access routines used in interrupt
delivery all the way back to the KVM_RUN ioctl.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Use the KVM_REQ_TLB_FLUSH request in order to trigger tlb flushes instead
of manipulating the SIE control block whenever we need it. Also trigger it for
a control register sync directly instead of (ab)using kvm_s390_set_prefix().
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The load PSW handler does not have to inject pending machine checks.
This can wait until the CPU runs the generic interrupt injection code.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
We can get rid of the tasklet used for waking up a VCPU in the hrtimer
code but wakeup the VCPU directly.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>