Commit Graph

547861 Commits

Author SHA1 Message Date
Julien Grall 5995a68a62 xen/privcmd: Add support for Linux 64KB page granularity
The hypercall interface (as well as the toolstack) is always using 4KB
page granularity. When the toolstack is asking for mapping a series of
guest PFN in a batch, it expects to have the page map contiguously in
its virtual memory.

When Linux is using 64KB page granularity, the privcmd driver will have
to map multiple Xen PFN in a single Linux page.

Note that this solution works on page granularity which is a multiple of
4KB.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:42 +01:00
Julien Grall d0089e8a0e net/xen-netback: Make it running on 64KB page granularity
The PV network protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity working as a
network backend on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data in small
chunk of 4KB. The rest of the code is relying on the grant table code.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:41 +01:00
Julien Grall 30c5d7f0da net/xen-netfront: Make it running on 64KB page granularity
The PV network protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity using network
device on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data in small
chunk of 4KB. The rest of the code is relying on the grant table code.

Note that we allocate a Linux page for each rx skb but only the first
4KB is used. We may improve the memory usage by extending the size of
the rx skb.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:40 +01:00
Julien Grall 67de5dfbc1 block/xen-blkback: Make it running on 64KB page granularity
The PV block protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity behaving as a
block backend on a non-modified Xen.

It's only necessary to adapt the ring size and the number of request per
indirect frames. The rest of the code is relying on the grant table
code.

Note that the grant table code is allocating a Linux page per grant
which will result to waste 6OKB for every grant when Linux is using 64KB
page granularity. This could be improved by sharing the page between
multiple grants.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: "Roger Pau Monné" <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:40 +01:00
Julien Grall c004a6fe0c block/xen-blkfront: Make it running on 64KB page granularity
The PV block protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity using block
device on a non-modified Xen.

The block API is using segment which should at least be the size of a
Linux page. Therefore, the driver will have to break the page in chunk
of 4K before giving the page to the backend.

When breaking a 64KB segment in 4KB chunks, it is possible that some
chunks are empty. As the PV protocol always require to have data in the
chunk, we have to count the number of Xen page which will be in use and
avoid sending empty chunks.

Note that, a pre-defined number of grants are reserved before preparing
the request. This pre-defined number is based on the number and the
maximum size of the segments. If each segment contains a very small
amount of data, the driver may reserve too many grants (16 grants is
reserved per segment with 64KB page granularity).

Furthermore, in the case of persistent grants we allocate one Linux page
per grant although only the first 4KB of the page will be effectively
in use. This could be improved by sharing the page with multiple grants.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:39 +01:00
Julien Grall 5ed5451d99 xen/grant-table: Make it running on 64KB granularity
The Xen interface is using 4KB page granularity. This means that each
grant is 4KB.

The current implementation allocates a Linux page per grant. On Linux
using 64KB page granularity, only the first 4KB of the page will be
used.

We could decrease the memory wasted by sharing the page with multiple
grant. It will require some care with the {Set,Clear}ForeignPage macro.

Note that no changes has been made in the x86 code because both Linux
and Xen will only use 4KB page granularity.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:39 +01:00
Julien Grall a001c9d95c xen/events: fifo: Make it running on 64KB granularity
Only use the first 4KB of the page to store the events channel info. It
means that we will waste 60KB every time we allocate page for:
     * control block: a page is allocating per CPU
     * event array: a page is allocating everytime we need to expand it

I think we can reduce the memory waste for the 2 areas by:

    * control block: sharing between multiple vCPUs. Although it will
    require some bookkeeping in order to not free the page when the CPU
    goes offline and the other CPUs sharing the page still there

    * event array: always extend the array event by 64K (i.e 16 4K
    chunk). That would require more care when we fail to expand the
    event channel.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:38 +01:00
Julien Grall 30756c6299 xen/balloon: Don't rely on the page granularity is the same for Xen and Linux
For ARM64 guests, Linux is able to support either 64K or 4K page
granularity. Although, the hypercall interface is always based on 4K
page granularity.

With 64K page granularity, a single page will be spread over multiple
Xen frame.

To avoid splitting the page into 4K frame, take advantage of the
extent_order field to directly allocate/free chunk of the Linux page
size.

Note that PVMMU is only used for PV guest (which is x86) and the page
granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
that because the code has not been modified.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:38 +01:00
Julien Grall 9652c08012 tty/hvc: xen: Use xen page definition
The console ring is always based on the page granularity of Xen.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:37 +01:00
Julien Grall 7d567928db xen/xenbus: Use Xen page definition
All the ring (xenstore, and PV rings) are always based on the page
granularity of Xen.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:37 +01:00
Julien Grall 36f8abd36f xen/biomerge: Don't allow biovec's to be merged when Linux is not using 4KB pages
On ARM all dma-capable devices on a same platform may not be protected
by an IOMMU. The DMA requests have to use the BFN (i.e MFN on ARM) in
order to use correctly the device.

While the DOM0 memory is allocated in a 1:1 fashion (PFN == MFN), grant
mapping will screw this contiguous mapping.

When Linux is using 64KB page granularitary, the page may be split
accross multiple non-contiguous MFN (Xen is using 4KB page
granularity). Therefore a DMA request will likely fail.

Checking that a 64KB page is using contiguous MFN is tedious. For
now, always says that biovec are not mergeable.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:36 +01:00
Julien Grall 4f503fbdf3 block/xen-blkfront: split get_grant in 2
Prepare the code to support 64KB page granularity. The first
implementation will use a full Linux page per indirect and persistent
grant. When non-persistent grant is used, each page of a bio request
may be split in multiple grant.

Furthermore, the field page of the grant structure is only used to copy
data from persistent grant or indirect grant. Avoid to set it for other
use case as it will have no meaning given the page will be split in
multiple grant.

Provide 2 functions, to setup indirect grant, the other for bio page.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:36 +01:00
Julien Grall a7a6df2223 block/xen-blkfront: Store a page rather a pfn in the grant structure
All the usage of the field pfn are done using the same idiom:

pfn_to_page(grant->pfn)

This will  return always the same page. Store directly the page in the
grant to clean up the code.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:35 +01:00
Julien Grall 33204663ef block/xen-blkfront: Split blkif_queue_request in 2
Currently, blkif_queue_request has 2 distinct execution path:
    - Send a discard request
    - Send a read/write request

The function is also allocating grants to use for generating the
request. Although, this is only used for read/write request.

Rather than having a function with 2 distinct execution path, separate
the function in 2. This will also remove one level of tabulation.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:34 +01:00
Julien Grall 3922f32c1e xen/grant: Add helper gnttab_page_grant_foreign_access_ref_one
Many PV drivers contain the idiom:

pfn = page_to_gfn(...) /* Or similar */
gnttab_grant_foreign_access_ref

Replace it by a new helper. Note that when Linux is using a different
page granularity than Xen, the helper only gives access to the first 4KB
grant.

This is useful where drivers are allocating a full Linux page for each
grant.

Also include xen/interface/grant_table.h rather than xen/grant_table.h in
asm/page.h for x86 to fix a compilation issue [1]. Only the former is
useful in order to get the structure definition.

[1] Interdependency between asm/page.h and xen/grant_table.h which result
to page_mfn not being defined when necessary.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:34 +01:00
Julien Grall 008c320a96 xen/grant: Introduce helpers to split a page into grant
Currently, a grant is always based on the Xen page granularity (i.e
4KB). When Linux is using a different page granularity, a single page
will be split between multiple grants.

The new helpers will be in charge of splitting the Linux page into grants
and call a function given by the caller on each grant.

Also provide an helper to count the number of grants within a given
contiguous region.

Note that the x86/include/asm/xen/page.h is now including
xen/interface/grant_table.h rather than xen/grant_table.h. It's
necessary because xen/grant_table.h depends on asm/xen/page.h and will
break the compilation. Furthermore, only definition in
interface/grant_table.h is required.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:33 +01:00
Julien Grall 1084b1988d xen: Add Xen specific page definition
The Xen hypercall interface is always using 4K page granularity on ARM
and x86 architecture.

With the incoming support of 64K page granularity for ARM64 guest, it
won't be possible to re-use the Linux page definition in Xen drivers.

Introduce Xen page definition helpers based on the Linux page
definition. They have exactly the same name but prefixed with
XEN_/xen_ prefix.

Also modify xen_page_to_gfn to use new Xen page definition.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:33 +01:00
Julien Grall 5031612b5e arm/xen: Drop pte_mfn and mfn_pte
They are not used in common code expect in one place in balloon.c which is
only compiled when Linux is using PV MMU. It's not the case on ARM.

Rather than worrying how to handle the 64KB case, drop them.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:32 +01:00
Julien Grall a0f2e80fcd net/xen-netback: xenvif_gop_frag_copy: move GSO check out of the loop
The skb doesn't change within the function. Therefore it's only
necessary to check if we need GSO once at the beginning.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:32 +01:00
David Vrabel 4a69c909de xen/balloon: pre-allocate p2m entries for ballooned pages
Pages returned by alloc_xenballooned_pages() will be used for grant
mapping which will call set_phys_to_machine() (in PV guests).

Ballooned pages are set as INVALID_P2M_ENTRY in the p2m and thus may
be using the (shared) missing tables and a subsequent
set_phys_to_machine() will need to allocate new tables.

Since the grant mapping may be done from a context that cannot sleep,
the p2m entries must already be allocated.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:31 +01:00
David Vrabel 8edfcf882e x86/xen: export xen_alloc_p2m_entry()
Rename alloc_p2m() to xen_alloc_p2m_entry() and export it.

This is useful for ensuring that a p2m entry is allocated (i.e., not a
shared missing or identity entry) so that subsequent set_phys_to_machine()
calls will require no further allocations.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Make xen_alloc_p2m_entry() a nop on auto-xlate guests.
2015-10-23 14:20:28 +01:00
David Vrabel 1cf6a6c829 xen/balloon: use hotplugged pages for foreign mappings etc.
alloc_xenballooned_pages() is used to get ballooned pages to back
foreign mappings etc.  Instead of having to balloon out real pages,
use (if supported) hotplugged memory.

This makes more memory available to the guest and reduces
fragmentation in the p2m.

This is only enabled if the xen.balloon.hotplug_unpopulated sysctl is
set to 1.  This sysctl defaults to 0 in case the udev rules to
automatically online hotplugged memory do not exist.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Add xen.balloon.hotplug_unpopulated sysctl to enable use of hotplug
  for unpopulated pages.
2015-10-23 14:20:05 +01:00
David Vrabel 81b286e0f1 xen/balloon: make alloc_xenballoon_pages() always allocate low pages
All users of alloc_xenballoon_pages() wanted low memory pages, so
remove the option for high memory.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:05 +01:00
David Vrabel b2ac6aa8f7 xen/balloon: only hotplug additional memory if required
Now that we track the total number of pages (included hotplugged
regions), it is easy to determine if more memory needs to be
hotplugged.

Add a new BP_WAIT state to signal that the balloon process needs to
wait until kicked by the memory add notifier (when the new section is
onlined by userspace).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Return BP_WAIT if enough sections are already hotplugged.

v2:
- New BP_WAIT status after adding new memory sections.
2015-10-23 14:20:04 +01:00
David Vrabel de5a77d842 xen/balloon: rationalize memory hotplug stats
The stats used for memory hotplug make no sense and are fiddled with
in odd ways.  Remove them and introduce total_pages to track the total
number of pages (both populated and unpopulated) including those within
hotplugged regions (note that this includes not yet onlined pages).

This will be used in a subsequent commit (xen/balloon: only hotplug
additional memory if required) when deciding whether additional memory
needs to be hotplugged.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:04 +01:00
David Vrabel 55b3da98a4 xen/balloon: find non-conflicting regions to place hotplugged memory
Instead of placing hotplugged memory at the end of RAM (which may
conflict with PCI devices or reserved regions) use allocate_resource()
to get a new, suitably aligned resource that does not conflict.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Remove stale comment.
2015-10-23 14:20:03 +01:00
David Vrabel f5775e0b61 x86/xen: discard RAM regions above the maximum reservation
During setup, discard RAM regions that are above the maximum
reservation (instead of marking them as E820_UNUSABLE).  This allows
hotplug memory to be placed at these addresses.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:03 +01:00
David Vrabel f6a6cb1afe xen/balloon: remove scratch page left overs
Commit 0bb599fd30 (xen: remove scratch
frames for ballooned pages and m2p override) removed the use of the
scratch page for ballooned out pages.

Remove some left over function definitions.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:02 +01:00
David Vrabel 62cedb9f13 mm: memory hotplug with an existing resource
Add add_memory_resource() to add memory using an existing "System RAM"
resource.  This is useful if the memory region is being located by
finding a free resource slot with allocate_resource().

Xen guests will make use of this in their balloon driver to hotplug
arbitrary amounts of memory in response to toolstack requests.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
2015-10-23 14:19:58 +01:00
Linus Torvalds 7379047d55 Linux 4.3-rc6 2015-10-18 16:08:42 -07:00
Linus Torvalds c44b325568 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Here are some bugfixes for the I2C subsystem.

  Kieran found a flaw in the recently renewed wake irq handling.  Mika
  handled a user bug report where the ACPI info turned out to be
  unusable.  I updated MAINTAINERS so that such bug reports will sooner
  get to the right people.  Geert pointed me to a problem of some i2c
  drivers regarding PM which I fixed"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: designware: Do not use parameters from ACPI on Dell Inspiron 7348
  MAINTAINERS: add maintainers for Synopsis Designware I2C drivers
  i2c: designware-platdrv: enable RuntimePM before registering to the core
  i2c: s3c2410: enable RuntimePM before registering to the core
  i2c: rcar: enable RuntimePM before registering to the core
  i2c: return probe deferred status on dev_pm_domain_attach
2015-10-18 12:07:48 -07:00
Mika Westerberg 56d4b8a24c i2c: designware: Do not use parameters from ACPI on Dell Inspiron 7348
ACPI SSCN/FMCN methods were originally added because then the platform can
provide the most accurate HCNT/LCNT values to the driver. However, this
seems not to be true for Dell Inspiron 7348 where using these causes the
touchpad to fail in boot:

  i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
  i2c_designware INT3433:00: i2c_dw_handle_tx_abort: lost arbitration
  i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
  i2c_designware INT3433:00: controller timed out

The values received from ACPI are (in fast mode):

  HCNT: 72
  LCNT: 160

this translates to following timings (input clock is 100MHz on Broadwell):

  tHIGH: 720 ns (spec min 600 ns)
  tLOW: 1600 ns (spec min 1300 ns)
  Bus period: 2920 ns (assuming 300 ns tf and tr)
  Bus speed: 342.5 kHz

Both tHIGH and tLOW are within the I2C specification.

The calculated values when ACPI parameters are not used are (in fast mode):

  HCNT: 87
  LCNT: 159

which translates to:

  tHIGH: 870 ns (spec min 600 ns)
  tLOW: 1590 ns (spec min 1300 ns)
  Bus period 3060 ns (assuming 300 ns tf and tr)
  Bus speed 326.8 kHz

These values are also within the I2C specification.

Since both ACPI and calculated values meet the I2C specification timing
requirements it is hard to say why the touchpad does not function properly
with the ACPI values except that the bus speed is higher in this case (but
still well below the max 400kHz).

Solve this by adding DMI quirk to the driver that disables using ACPI
parameters on this particulare machine.

Reported-by: Pavel Roskin <plroskin@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Pavel Roskin <plroskin@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2015-10-18 14:11:08 +02:00
Linus Torvalds 81429a6dbc Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq/timer fixes from Thomas Gleixner:
 "irq: a fix for the new hierarchical MSI interrupt handling which
  unbreaks PCI=n configurations.

  timers: a fix for the new hrtimer clock offset update mechanism to
  ensure that the boot time offset is respected"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/msi: Do not use pci_msi_[un]mask_irq as default methods

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Increment clock_was_set_seq in timekeeping_init()
2015-10-17 08:47:27 -07:00
Linus Torvalds 16c8b9cb24 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "Just two small fixups to ads7846 touchscreen controller driver and
  Cypress touchpad driver"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: cyapa - fix the copy paste error on electrodes_rx value
  Input: ads7846 - correct the value got from SPI
2015-10-16 17:39:27 -07:00
Linus Torvalds 4f3f9573b6 Just one revert for Armada XP devices. The conversion to
of_clk_get_parent_name() wasn't a direct translation, so we
 revert back to of_clk_get() + __clk_get_name(). We could make
 of_clk_get_parent_name() more robust, but that may have unintended
 side-effects, so we'll do that in the next version.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJWIYkmAAoJENidgRMleOc9hiMP+gP1dsiiAsZCmJLwSQeqE2Gl
 klVRiX7q2jPu61nEo3fSnhlrIm45PARr5J/vDbW8+/DV2RnEsbI9RfdDgkPSNqe3
 0VbhEGKY2RhMpjx1dpmyC5WFPWR8ESel5CljXXAGhdMfuaLxy55HJvpWaljwR0cr
 w6Gd66K381GBXLKgFUdYmlXbNLbtvS81Q4z4v/NZZt1r/hqBsjAkyRpE8ix8gHpy
 8CgzY6PRgyF45+66PpkuSsXSfLSDxvkv15DM7pxa37nsFMh10mqRqqagTX6KLJAD
 4687nBT0P4YzsUYoCGmY4mg8tkiiEnFqKPQFQ/lHxT74l9Ddst4OZoj67VuPBp59
 3YM364Rshj5H89EkZ5ARKLLJ2eFmK7eqGfwqpJVD3K51IDzUznDLE7WZlb0lBQXO
 5hkpeFUAbd83RSFXIYZmpkHJamp/cmCnbnUYcluS/h/tgzOleO2oK4P7ud9vkxvB
 mtbHsK+ax7te4FUVyI6zk4IPrzFhkDkuOg0CAsJRPQSeQ4TPNbOk7RgSZ9zNqq/w
 2KoI2tbJQ6fyknH8kBB2Oxmfl1FWNPy5keLGEnX1FJ/4jXlFcJF00MJ4bW1cbj2i
 gnwXcqrMZavT0ixNLhDfLILM7i8UkxoXw3jvn0k+CHtpwGuQjKet4q6eUgDcgkiv
 yFAzvxyYsKf7OtH/tFmL
 =kSCW
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "Just one revert for Armada XP devices: the conversion to
  of_clk_get_parent_name() wasn't a direct translation, so we
  revert back to of_clk_get() + __clk_get_name().

  We could make of_clk_get_parent_name() more robust, but that
  may have unintended side-effects, so we'll do that in the
  next version"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  Partially revert "clk: mvebu: Convert to clk_hw based provider APIs"
2015-10-16 17:11:14 -07:00
Linus Torvalds 045ce74349 Two DM target error path cleanup fixes (one for stable in DM thinp and
one for a v4.3-rc5 thinko in DM snapshot).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWIVMAAAoJEMUj8QotnQNaUmQH/3/zAJbGRL7WBdPRMdG1Fq5G
 nM4I3wUcorLWaZZZ6jpN3A4zX+UIOscH/Si6/33pV7+wG3JuvIcOgZQ7CBLW6Ipl
 D8JqpIjG5yfO71CPFIzhM/aHlJCjAj79Ejyj0W9ci2xntbglIxYcipSd55P4LgzU
 7Yu1mA32gHHIN0sghvZEVIpavwYV6RRxhTEwxYP0GeTc3PRpUbSyuYJijlWVPH8T
 orTg5SeqRLDymYLMRrGs8lHlOBhbKipXsZwfLwUq8zXIBYFseKdQpuac1u/HMJlS
 ssnLUooQdOnSxxf8Vn/ozhD8BxKkO0jtVZ8OAb3PKXD4QDQgMXLXwNjxDyOU2DY=
 =VCRB
 -----END PGP SIGNATURE-----

Merge tag 'dm-4.3-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:
 "Two DM target error path cleanup fixes (one for stable in DM thinp and
  one for a v4.3-rc5 thinko in DM snapshot)"

* tag 'dm-4.3-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm thin: fix missing pool reference count decrement in pool_ctr error path
  dm snapshot persistent: fix missing cleanup in persistent_ctr error path
2015-10-16 13:03:05 -07:00
Linus Torvalds 6aa8ca4df0 Merge branch 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "I have two more bug fixes for btrfs.

  My commit fixes a bug we hit last week at FB, a combination of lots of
  hard links and an admin command to resolve inode numbers.

  Dave is adding checks to make sure balance on current kernels ignores
  filters it doesn't understand.  The penalty for being wrong is just
  doing more work (not crashing etc), but it's a good fix"

* 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: fix use after free iterating extrefs
  btrfs: check unsupported filters in balance arguments
2015-10-16 12:55:34 -07:00
Linus Torvalds 59bcce1216 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
 "Just two small items from Ilya:

  The first patch fixes the RBD readahead to grab full objects.  The
  second fixes the write ops to prevent undue promotion when a cache
  tier is configured on the server side"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: use writefull op for object size writes
  rbd: set max_sectors explicitly
2015-10-16 12:47:02 -07:00
Linus Torvalds a4c4c49a61 Power management and ACPI fixes for v4.3-rc6
- Fix a regression introduced by a recent ACPICA cleanup that
    uncovered a latent bug (Lv Zheng).
 
  - Fix a recent regression in the generic power domains framework
    that may cause it to violate PM QoS latency constraints in some
    cases (Ulf Hansson).
 
  - Fix an intel_pstate driver crash on the Knights Landing chips
    that do not update the MPERF counter as often as expected by the
    driver which may result in a divide by 0 (Srinivas Pandruvada).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJWIPwyAAoJEILEb/54YlRx4OYP/ivVGcwUYVLbb9c7ePX220pb
 QlkwQH/uTh1tSwSCNq2X4e6hgPdVKYx3/Buc0t89bVFwV+upqUcfNMfj0GWvXO3e
 2fLgb/27Zg3ZKBe8XqeHbibN7HQ9Wz/s9rJRLVCWLUY/5WHRCyyO2JfELjuX4pNq
 tJTH7WIFIJ3riuV97DfO55ZrpKL2msnFvItv+pKVuuVWJMJKpHg+NQKQ5UWegDed
 IYaT1FlvVwGwlrvGvUNi8qWTwrziNDKd460qrOZFLTTdMwg4UZPI991euK6ZEot7
 refZPtrDIrPBD3tPOWjxERVuleIL+Bw7sq+JKUX9IdM4m69UcAOz63oXRLYm/ilV
 nAvo6eDFc37FVQHGv1JzzbZIyvZOeFMfvRN3R06Qfm9Kdu2wjjTd/fYd63IabHe+
 HwpgZbEhGRarwZ3VOJzQrLaa/gMTltpRKEiMQeHSnUmSCVJiKK/q5b9RBFfAOpfa
 wIlaFxsx1GC8QBatL4I8X2M0w9UQpY4N48NZX+FTRx1zCEkocugHbn6vIHW7USVu
 5mBoghdnPcfZJS66cSPa508LZOdfhw4vB8jQXzlG+v1PBdw7ISf6wnEyTawYMpiB
 pnIciv18WYTk/MmVdkf97TLHMbuiMVkyOaSZ1SpdE+leN8dGOhCscBb/P088ellk
 A9N4dzCeaSo6uHY6OMhZ
 =+ku3
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These fix two recent regressions (ACPICA, the generic power domains
  framework) and one crash that may happen on specific hardware
  supported since 4.1 (intel_pstate).

  Specifics:

   - Fix a regression introduced by a recent ACPICA cleanup that
     uncovered a latent bug (Lv Zheng).

   - Fix a recent regression in the generic power domains framework that
     may cause it to violate PM QoS latency constraints in some cases
     (Ulf Hansson).

   - Fix an intel_pstate driver crash on the Knights Landing chips that
     do not update the MPERF counter as often as expected by the driver
     which may result in a divide by 0 (Srinivas Pandruvada)"

* tag 'pm+acpi-4.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Fix divide by zero on Knights Landing (KNL)
  ACPICA: Tables: Fix FADT dependency regression
  PM / Domains: Fix validation of latency constraints in genpd governor
2015-10-16 12:25:54 -07:00
Linus Torvalds 8b7b56f37b Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Nothing too crazy or exciting:

   - two MAINTAINERS entries that I didn't see the point in delaying.
   - one drm mst fix to stop sending uninitialised data to monitors
   - two amdgpu fixes
   - one radeon mst tiling fix
   - one vmwgfx regression fix
   - one virtio warning fix.

  I have found one locking problem that needs a bit of reorg to fix, but
  I'm not sure it's worth putting in -fixes as I don't think we've seen
  it hit in the real world ever, I just found it using the virtio-gpu
  driver when working on it.  I'll possibly send it next week once I've
  time to discuss with Daniel"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/virtio: use %llu format string form atomic64_t
  MAINTAINERS: Add myself as maintainer for the gma500 driver
  MAINTAINERS: add a maintainer for the atmel-hlcdc DRM driver
  drm/amdgpu: Keep the pflip interrupts always enabled v7
  drm/amdgpu: adjust default dispclk (v2)
  drm/dp/mst: make mst i2c transfer code more robust.
  drm/radeon: attach tile property to mst connector
  drm/vmwgfx: Fix kernel NULL pointer dereference on older hardware
2015-10-16 12:19:11 -07:00
Linus Torvalds ebb65c81e1 powerpc fixes for 4.3 #3
- Re-enable CONFIG_SCSI_DH in our defconfigs
  - Remove unused os_area_db_id_video_mode
  - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew
  - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew
  - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew
  - cxl: Workaround malformed pcie packets on some cards from Philippe
  - cxl: Fix number of allocated pages in SPA from Christophe Lombard
  - Fix checkstop in native_hpte_clear() with lockdep from Cyril
  - Panic on unhandled Machine Check on powernv from Daniel
  - selftests/powerpc: Fix build failure of load_unaligned_zeropad test
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWIM0xAAoJEFHr6jzI4aWAsCIP/04uAiPCqWOwHjr8/eAlNAmJ
 GaA6b91QUUpBlyXgzYZShS/FQEnyukbGTUzaS3KwijOdRJtCHxvl2eG7pOCws+GS
 2YeA9mBm7MgYT0BJ+KLGCgrF5C/sc+LN3udO9Kf1LimLpp+fIILHgEmhrfy00wUp
 f7tJ/Rvpt23PmcCDX0PhA7NuOrRu5hQOQ9rsqJfzc7XObZAG1AfISPgALgaeAINc
 XqQfWiNFLmDJyhV9K39rUXSTvHYl6pPnfDj4GelfjQD2l/csH0M4MeGW2tHNkgVy
 CakLWOP3zdZVTYTcB8wypnoZxATPhEsHehJmQ4fu3n0WR1vHfCqh4rFZuPaaX0NG
 P3In0eOV285RIpNLcwkchN+07Ops1Fvi5XonaQpgHCcI9c4H7IAGPbQau2DhR9sU
 DyZQ+/6wNzpXbM7llM3VyTA2zvvyiuEzuIZI78XWexO/Ny6TCItRtEqJEXMA+ChX
 lKbLluRnQcnn5sizK0yj4mtkffAbu7Za1KGl1nm1Q/5pBQWsC40wFcRLNNdzqVmH
 7tSp8cIEYunCYKy5bAheWJTzpUgGD55EEcUkQFHVm5LKBXyA73qJRSMuLZqtnB3z
 g6eTiEKhZvVFedNMDNFnNWrvOnd8JpyjGLRAbqgwMhN+lgVvmwwSSB6V2SefMnuL
 HCSGqR40vPA9bH0Cz/ND
 =3ze+
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Re-enable CONFIG_SCSI_DH in our defconfigs
 - Remove unused os_area_db_id_video_mode
 - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew
 - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew
 - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew
 - cxl: Workaround malformed pcie packets on some cards from Philippe
 - cxl: Fix number of allocated pages in SPA from Christophe Lombard
 - Fix checkstop in native_hpte_clear() with lockdep from Cyril
 - Panic on unhandled Machine Check on powernv from Daniel
 - selftests/powerpc: Fix build failure of load_unaligned_zeropad test

* tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Fix build failure of load_unaligned_zeropad test
  powerpc/powernv: Panic on unhandled Machine Check
  powerpc: Fix checkstop in native_hpte_clear() with lockdep
  cxl: Fix number of allocated pages in SPA
  cxl: Workaround malformed pcie packets on some cards
  cxl: fix leak of ctx->mapping when releasing kernel API contexts
  cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API
  cxl: fix leak of IRQ names in cxl_free_afu_irqs()
  powerpc/ps3: Remove unused os_area_db_id_video_mode
  powerpc/configs: Re-enable CONFIG_SCSI_DH
2015-10-16 12:07:43 -07:00
Linus Torvalds 3d875182d7 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "6 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  sh: add copy_user_page() alias for __copy_user()
  lib/Kconfig: ZLIB_DEFLATE must select BITREVERSE
  mm, dax: fix DAX deadlocks
  memcg: convert threshold to bytes
  builddeb: remove debian/files before build
  mm, fs: obey gfp_mapping for add_to_page_cache()
2015-10-16 11:42:37 -07:00
Ross Zwisler 934ed25ea5 sh: add copy_user_page() alias for __copy_user()
copy_user_page() is needed by DAX.  Without this we get a compile error
for DAX on SH:

  fs/dax.c:280:2: error: implicit declaration of function `copy_user_page' [-Werror=implicit-function-declaration]
    copy_user_page(vto, (void __force *)vfrom, vaddr, to);
      ^

This was done with a random config that happened to include DAX support.

This patch has only been compile tested.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Andrew Morton 1fd4e5c347 lib/Kconfig: ZLIB_DEFLATE must select BITREVERSE
lib/built-in.o: In function `__bitrev32':
deftree.c:(.text+0x1e799): undefined reference to `byte_rev_table'
deftree.c:(.text+0x1e7a0): undefined reference to `byte_rev_table'
deftree.c:(.text+0x1e7b4): undefined reference to `byte_rev_table'
deftree.c:(.text+0x1e7c1): undefined reference to `byte_rev_table'

Anything which uses bitrevX() has to select BITREVERSE, to grab
lib/bitrev.o.

Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Ross Zwisler 0f90cc6609 mm, dax: fix DAX deadlocks
The following two locking commits in the DAX code:

commit 843172978b ("dax: fix race between simultaneous faults")
commit 46c043ede4 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")

introduced a number of deadlocks and other issues which need to be fixed
for the v4.3 kernel.  The list of issues in DAX after these commits
(some newly introduced by the commits, some preexisting) can be found
here:

  https://lkml.org/lkml/2015/9/25/602 (Subject: "Re: [PATCH] dax: fix deadlock in __dax_fault").

This undoes most of the changes introduced by those two commits,
essentially returning us to the DAX locking scheme that was used in
v4.2.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Shaohua Li 424cdc1413 memcg: convert threshold to bytes
page_counter_memparse() returns pages for the threshold, while
mem_cgroup_usage() returns bytes for memory usage.  Convert the
threshold to bytes.

Fixes: 3e32cb2e0a ("memcg: rename cgroup_event to mem_cgroup_event").
Signed-off-by: Shaohua Li <shli@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Riku Voipio 8d740a37b9 builddeb: remove debian/files before build
Commit 3716001bcb ("deb-pkg: add source package") added the ability to
create a debian changelog file.  This exposed that previously the
builddeb script hasn't cleared debian/files between builds.

As debian/files keeps accumulating entries, the changes file will end up
growing indefinelty.  With outdated entries in debian/files, builddeb
script will exit with failure.  This regression impacts those who use
"make deb-pkg" target to build kernel into a .deb package and never use
"make mrproper" or other means to clean kernel tree from generated
directories.

To fix the regression, remove debian/files before starting build and in
the generated clean rule.

Fixes: 3716001bcb ("deb-pkg: add source package")
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reported-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Michal Marek <mmarek@suse.cz>
Cc: maximilian attems <maks@stro.at>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Michal Hocko 063d99b4fa mm, fs: obey gfp_mapping for add_to_page_cache()
Commit 6afdb859b7 ("mm: do not ignore mapping_gfp_mask in page cache
allocation paths") has caught some users of hardcoded GFP_KERNEL used in
the page cache allocation paths.  This, however, wasn't complete and
there were others which went unnoticed.

Dave Chinner has reported the following deadlock for xfs on loop device:
: With the recent merge of the loop device changes, I'm now seeing
: XFS deadlock on my single CPU, 1GB RAM VM running xfs/073.
:
: The deadlocked is as follows:
:
: kloopd1: loop_queue_read_work
:       xfs_file_iter_read
:       lock XFS inode XFS_IOLOCK_SHARED (on image file)
:       page cache read (GFP_KERNEL)
:       radix tree alloc
:       memory reclaim
:       reclaim XFS inodes
:       log force to unpin inodes
:       <wait for log IO completion>
:
: xfs-cil/loop1: <does log force IO work>
:       xlog_cil_push
:       xlog_write
:       <loop issuing log writes>
:               xlog_state_get_iclog_space()
:               <blocks due to all log buffers under write io>
:               <waits for IO completion>
:
: kloopd1: loop_queue_write_work
:       xfs_file_write_iter
:       lock XFS inode XFS_IOLOCK_EXCL (on image file)
:       <wait for inode to be unlocked>
:
: i.e. the kloopd, with it's split read and write work queues, has
: introduced a dependency through memory reclaim. i.e. that writes
: need to be able to progress for reads make progress.
:
: The problem, fundamentally, is that mpage_readpages() does a
: GFP_KERNEL allocation, rather than paying attention to the inode's
: mapping gfp mask, which is set to GFP_NOFS.
:
: The didn't used to happen, because the loop device used to issue
: reads through the splice path and that does:
:
:       error = add_to_page_cache_lru(page, mapping, index,
:                       GFP_KERNEL & mapping_gfp_mask(mapping));

This has changed by commit aa4d86163e ("block: loop: switch to VFS
ITER_BVEC").

This patch changes mpage_readpage{s} to follow gfp mask set for the
mapping.  There are, however, other places which are doing basically the
same.

lustre:ll_dir_filler is doing GFP_KERNEL from the function which
apparently uses GFP_NOFS for other allocations so let's make this
consistent.

cifs:readpages_get_pages is called from cifs_readpages and
__cifs_readpages_from_fscache called from the same path obeys mapping
gfp.

ramfs_nommu_expand_for_mapping is hardcoding GFP_KERNEL as well
regardless it uses mapping_gfp_mask for the page allocation.

ext4_mpage_readpages is the called from the page cache allocation path
same as read_pages and read_cache_pages

As I've noticed in my previous post I cannot say I would be happy about
sprinkling mapping_gfp_mask all over the place and it sounds like we
should drop gfp_mask argument altogether and use it internally in
__add_to_page_cache_locked that would require all the filesystems to use
mapping gfp consistently which I am not sure is the case here.  From a
quick glance it seems that some file system use it all the time while
others are selective.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Dave Chinner <david@fromorbit.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-16 11:42:28 -07:00
Ilya Dryomov e30b7577bf rbd: use writefull op for object size writes
This covers only the simplest case - an object size sized write, but
it's still useful in tiering setups when EC is used for the base tier
as writefull op can be proxied, saving an object promotion.

Even though updating ceph_osdc_new_request() to allow writefull should
just be a matter of fixing an assert, I didn't do it because its only
user is cephfs.  All other sites were updated.

Reflects ceph.git commit 7bfb7f9025a8ee0d2305f49bf0336d2424da5b5b.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-10-16 16:49:01 +02:00
Ilya Dryomov 0d9fde4fc8 rbd: set max_sectors explicitly
Commit 30e2bc08b2 ("Revert "block: remove artifical max_hw_sectors
cap"") restored a clamp on max_sectors.  It's now 2560 sectors instead
of 1024, but it's not good enough: we set max_hw_sectors to rbd object
size because we don't want object sized I/Os to be split, and the
default object size is 4M.

So, set max_sectors to max_hw_sectors in rbd at queue init time.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-10-16 16:48:36 +02:00