BUG: sleeping function called from invalid context at /local/scratch/dariof/linux/kernel/mutex.c:271
in_atomic(): 1, irqs_disabled(): 0, pid: 3256, name: qemu-dm
1 lock held by qemu-dm/3256:
#0: (&(&priv->lock)->rlock){......}, at: [<ffffffff813223da>] gntdev_ioctl+0x2bd/0x4d5
Pid: 3256, comm: qemu-dm Tainted: G W 3.1.0-rc8+ #5
Call Trace:
[<ffffffff81054594>] __might_sleep+0x131/0x135
[<ffffffff816bd64f>] mutex_lock_nested+0x25/0x45
[<ffffffff8131c7c8>] free_xenballooned_pages+0x20/0xb1
[<ffffffff8132194d>] gntdev_put_map+0xa8/0xdb
[<ffffffff816be546>] ? _raw_spin_lock+0x71/0x7a
[<ffffffff813223da>] ? gntdev_ioctl+0x2bd/0x4d5
[<ffffffff8132243c>] gntdev_ioctl+0x31f/0x4d5
[<ffffffff81007d62>] ? check_events+0x12/0x20
[<ffffffff811433bc>] do_vfs_ioctl+0x488/0x4d7
[<ffffffff81007d4f>] ? xen_restore_fl_direct_reloc+0x4/0x4
[<ffffffff8109168b>] ? lock_release+0x21c/0x229
[<ffffffff81135cdd>] ? rcu_read_unlock+0x21/0x32
[<ffffffff81143452>] sys_ioctl+0x47/0x6a
[<ffffffff816bfd82>] system_call_fastpath+0x16/0x1b
gntdev_put_map tries to acquire a mutex when freeing pages back to the
xenballoon pool, so it cannot be called with a spinlock held. In
gntdev_release, the spinlock is not needed as we are freeing the
structure later; in the ioctl, only the list manipulation needs to be
under the lock.
Reported-and-Tested-By: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If we want to use granted pages for AIO, changing the mappings of a user
vma and the corresponding p2m is not enough, we also need to update the
kernel mappings accordingly.
Currently this is only needed for pages that are created for user usages
through /dev/xen/gntdev. As in, pages that have been in use by the
kernel and use the P2M will not need this special mapping.
However there are no guarantees that in the future the kernel won't
start accessing pages through the 1:1 even for internal usage.
In order to avoid the complexity of dealing with highmem, we allocated
the pages lowmem.
We issue a HYPERVISOR_grant_table_op right away in
m2p_add_override and we remove the mappings using another
HYPERVISOR_grant_table_op in m2p_remove_override.
Considering that m2p_add_override and m2p_remove_override are called
once per page we use multicalls and hypercall batching.
Use the kmap_op pointer directly as argument to do the mapping as it is
guaranteed to be present up until the unmapping is done.
Before issuing any unmapping multicalls, we need to make sure that the
mapping has already being done, because we need the kmap->handle to be
set correctly.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v1: Removed GRANT_FRAME_BIT usage]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Add an highmem parameter to alloc_xenballooned_pages, to allow callers to
request lowmem or highmem pages.
Fix the code style of free_xenballooned_pages' prototype.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* 'stable/backend.base.v3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pci: Fix compiler error when CONFIG_XEN_PRIVILEGED_GUEST is not set.
xen/p2m: Add EXPORT_SYMBOL_GPL to the M2P override functions.
xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override.
xen/irq: The Xen hypervisor cleans up the PIRQs if the other domain forgot.
xen/irq: Export 'xen_pirq_from_irq' function.
xen/irq: Add support to check if IRQ line is shared with other domains.
xen/irq: Check if the PCI device is owned by a domain different than DOMID_SELF.
xen/pci: Add xen_[find|register|unregister]_device_domain_owner functions.
* 'stable/gntalloc.v7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/gntdev,gntalloc: Remove unneeded VM flags
We should unlock here and also decrement the number of &map->users.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
copy_to_user() returns the amount of data remaining to be copied. We
want to return a negative error code here. The upper layers just
call WARN_ON() if we return non-zero so this doesn't change the
behavior. But returning -EFAULT is still cleaner.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Grant mappings cause the PFN<->MFN mapping to be lost on the pages used
for the mapping. Instead of leaking memory, use pages that have already
been ballooned out and so have no valid mapping. This removes the need
for the bad-page leak workaround as pages are repopulated by the balloon
driver.
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The only time when granted pages need to be treated specially is when
using Xen's PTE modification for grant mappings owned by another domain
(that is, only gntdev on PV guests). Otherwise, the area does not
require VM_DONTCOPY and VM_PFNMAP, since it can be accessed just like
any other page of RAM.
Since the vm_operations_struct close operations decrement reference
counts, a corresponding open function that increments them is required
now that it is possible to have multiple references to a single area.
We are careful in the gntdev to check if we can remove those flags. The
reason that we need to be careful in gntdev on PV guests is because we are
not changing the PFN/MFN mapping on PV; instead, we change the application's
page tables to point to the other domain's memory. This means that the vma
cannot be copied without using another grant mapping hypercall; it also
requires special handling on unmap, which is the reason for gntdev's
dependency on the MMU notifier.
For gntalloc, this is not a concern - the pages are owned by the domain
using the gntalloc device, and can be mapped and unmapped in the same manner
as any other page of memory.
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Added in git commit "We are.." from email correspondence]
addr is actually a virtual address so use an unsigned long. Fixes:
CC drivers/xen/gntdev.o
drivers/xen/gntdev.c: In function 'map_grant_pages':
drivers/xen/gntdev.c:268: warning: cast from pointer to integer of different size
Reduce the scope of the variable at the same time.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The handle with numeric value 0 is a valid map handle, so it cannot
be used to indicate that a page has not been mapped. Use -1 instead.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If an already-mapped area of the device was mapped into userspace a
second time, a hypercall was incorrectly made to remap the memory
again. Avoid the hypercall on later mmap calls, and fail the mmap call
if a writable mapping is attempted on a read-only range.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
In paravirtualized domains, mn_invl_page or mn_invl_range_start can
unmap a segment of a mapped region without unmapping all pages. When
the region is later released, the pages will be unmapped twice, leading
to an incorrect -EINVAL return.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The is_mapped flag used to be set at the completion of the map operation,
but was not checked in all error paths. Use map->vma instead, which will
now be cleared if the initial grant mapping fails.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
In paravirtualized guests, the struct page* for mappings is only a
placeholder, and cannot be used to access the granted memory. Use the
userspace mapping that we have set up in order to implement
UNMAP_NOTIFY_CLEAR_BYTE.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The error path did not decrement the reference count of the grant structure.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This ioctl allows the users of a shared page to be notified when
the other end exits abnormally.
[v2: updated description in structs]
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
HVM does not allow direct PTE modification, so instead we request
that Xen change its internal p2m mappings on the allocated pages and
map the memory into userspace normally.
Note:
The HVM path for map and unmap is slightly different: HVM keeps the pages
mapped until the area is deleted, while the PV case (use_ptemod being true)
must unmap them when userspace unmaps the range. In the normal use case,
this makes no difference to users since unmap time is deletion time.
[v2: Expanded commit descr.]
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This allows userspace to perform mmap() on the gntdev device and then
immediately close the filehandle or remove the mapping using the
remove ioctl, with the mapped area remaining valid until unmapped.
This also fixes an infinite loop when a gntdev device is closed
without first unmapping all areas.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This should be faster if many mappings exist, and also removes
the only user of map->vma not related to PTE modification.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Because there is no limitation on how many times a user can open a
given device file, an per-file-description limit on the number of
pages granted offers little to no benefit. Change to a global limit
and remove the ioctl() as the parameter can now be changed via sysfs.
Xen tools changeset 22768:f8d801e5573e is needed to eliminate the
error this change produces in xc_gnttab_set_max_grants.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Use gnttab_map_refs and gnttab_unmap_refs to map and unmap the grant
ref, so that we can have a corresponding struct page.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
apply_to_page_range will acquire PTE lock while priv->lock is held,
and mn_invl_range_start tries to acquire priv->lock with PTE already
held. Fix by not holding priv->lock during the entire map operation.
This is safe because map->vma is set nonzero while the lock is held,
which will cause subsequent maps to fail and will cause the unmap
ioctl (and other users of gntdev_del_map) to return -EBUSY until the
area is unmapped. It is similarly impossible for gntdev_vma_close to
be called while the vma is still being created.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
It's the struct page of the L1 pte page. But we can get its mfn
by simply doing an arbitrary_virt_to_machine() on it anyway (which is
the safe conservative choice; since we no longer allow HIGHPTE pages,
we would never expect to be operating on a mapped pte page).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This flag controls the meaning of gnttab_map_grant_ref.host_addr and
specifies that the field contains a reference to the pte entry to be
used to perform the mapping. Therefore move the use of this flag to
the point at which we actually use a reference to the pte instead of
something else, splitting up the usage of the flag in this way is
confusing and potentially error prone.
The other flags are all properties of the mapping itself as opposed to
properties of the hypercall arguments and therefore it make sense to
continue to pass them round in map->flags.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Cc: Derek G. Murray <Derek.Murray@cl.cam.ac.uk>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
These pages are from other domains, so don't have any local PFN.
VM_PFNMAP is the closest concept Linux has to this.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The gntdev driver allows usermode to map granted pages from other
domains. This is typically used to implement a Xen backend driver
in user mode.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>