When we receive a remote identity address during SMP key distribution we
should ensure that any associated L2CAP channel instances get their
address information correspondingly updated (so that e.g. doing
getpeername on associated sockets returns the correct address).
This patch adds a new L2CAP core function l2cap_conn_update_id_addr()
which is used to iterate through all L2CAP channels associated with a
connection and update their address information.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Commit bfacbb9 (Bluetooth: Use devname:vhci module alias for virtual HCI
driver) added the module alias to hci_vhci module so it's possible to
create the /dev/vhci node. However creating an alias without
specifying the minor doesn't allow us to create the node ahead,
triggerring module auto-load when it's first accessed.
Starting with depmod from kmod 16 we started to warn if there's a
devname alias without specifying the major and minor.
Let's do the same done for uhid, kvm, fuse and others, specifying a
fixed minor. In systems with systemd as the init the following will
happen: on early boot systemd will call "kmod static-nodes" to read
/lib/modules/$(uname -r)/modules.devname and then create the nodes. When
first accessed these "dead" nodes will trigger the module loading.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There are many situations where we need to check if an LE address is an
RPA and if so try to look up the IRK for it. To simplify such cases this
patch adds a convenience function for the job.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When mgmt_unpair_device is called we should also remove any associated
IRKs. This patch adds a hci_remove_irk convenience function and ensures
that it's called when mgmt_unpair_device is called.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There are many functions that never fail but still declare an integer
return value for no reason. This patch converts these functions to use a
void return value to avoid any confusion of whether they can fail or not.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When removing Long Term Keys we should also be checking that the given
address type (public vs random) matches. This patch updates the
hci_remove_ltk function to take an extra parameter and uses it for
address type matching.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch implements the Load IRKs command for the management
interface. The command is used to load the kernel with the initial set
of IRKs. It also sets a HCI_RPA_RESOLVING flag to indicate that we can
start requesting devices to distribute their IRK to us.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When implementing support for Resolvable Private Addresses (RPAs) we'll
need to in several places be able to identify such addresses. This patch
adds a simple convenience function to do the identification of the
address type.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds the initial IRK storage and management functions to the
HCI core. This includes storing a list of IRKs per HCI device and the
ability to add, remove and lookup entries in that list.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Previously the crypto context has only been available for LE SMP
sessions, but now that we'll need to perform operations also during
discovery it makes sense to have this context part of the hci_dev
struct. Later, the context can be removed from the SMP context.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The tty driver api design prefers no-fail writes if the driver
write_room() method has previously indicated space is available
to accept writes. Since this is trivially possible for the
RFCOMM tty driver, do so.
Introduce rfcomm_dlc_send_noerror(), which queues but does not
schedule the krfcomm thread if the dlc is not yet connected
(and thus does not error based on the connection state).
The mtu size test is also unnecessary since the caller already
chunks the written data into mtu size.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Tested-By: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Only one session/channel combination may be in use at any one
time. However, the failure does not occur until the tty is
opened (in rfcomm_dlc_open()).
Because these settings are actually bound at rfcomm device
creation (via RFCOMMCREATEDEV ioctl), validate and fail before
creating the rfcomm tty device.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Tested-By: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When RFCOMM_RELEASE_ONHUP is set, the rfcomm tty driver 'takes over'
the initial rfcomm_dev reference created by the RFCOMMCREATEDEV ioctl.
The assumption is that the rfcomm tty driver will release the
rfcomm_dev reference when the tty is freed (in rfcomm_tty_cleanup()).
However, if the tty is never opened, the 'take over' never occurs,
so when RFCOMMRELEASEDEV ioctl is called, the reference is not
released.
Track the state of the reference 'take over' so that the release
is guaranteed by either the RFCOMMRELEASEDEV ioctl or the rfcomm tty
driver.
Note that the synchronous hangup in rfcomm_release_dev() ensures
that rfcomm_tty_install() cannot race with the RFCOMMRELEASEDEV ioctl.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Tested-By: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
No logic prevents an rfcomm_dev from being released multiple
times. For example, if the rfcomm_dev ref count is large due
to pending tx, then multiple RFCOMMRELEASEDEV ioctls may
mistakenly release the rfcomm_dev too many times. Note that
concurrent ioctls are not required to create this condition.
Introduce RFCOMM_DEV_RELEASED status bit which guarantees the
rfcomm_dev can only be released once.
NB: Since the flags are exported to userspace, introduce the status
field to track state for which userspace should not be aware.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Tested-By: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The tty core supports two models for handling tty_port lifetimes;
the tty_port can use the kref supplied by tty_port (which will
automatically destruct the tty_port when the ref count drops to
zero) or it can destruct the tty_port manually.
For tty drivers that choose to use the port kref to manage the
tty_port lifetime, it is not possible to safely acquire a port
reference conditionally. If the last reference is released after
evaluating the condition but before acquiring the reference, a
bogus reference will be held while the tty_port destruction
commences.
Rather, only acquire a port reference if the ref count is non-zero
and allow the caller to distinguish if a reference has successfully
been acquired.
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-By: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Now that the LE L2CAP Connection Oriented Channel support has undergone a
decent amount of testing we can make it officially supported. This patch
removes the enable_lecoc module parameter which was previously needed to
enable support for LE L2CAP CoC.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There's no way the driver can pre-build the radiotap header,
so remove the comment stating that it can.
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch adds to hdev the connection parameters list (hdev->le_
conn_params). The elements from this list (struct hci_conn_params)
contains the connection parameters (for now, minimum and maximum
connection interval) that should be used during the connection
establishment.
Moreover, this patch adds helper functions to manipulate hdev->le_
conn_params list. Some of these functions are also declared in
hci_core.h since they will be used outside hci_core.c in upcoming
patches.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The LTK key types available right now are unauthenticated and
authenticated ones. Provide two simple constants for it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The struct smp_ltk does not need to be packed and so remove __packed.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The field is not a boolean, it is actually a field for a key type. So
name it properly.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When encryption for LE links has been enabled, it will always be use
AES-CCM encryption. In case of BR/EDR Secure Connections, the link
will also use AES-CCM encryption. In both cases track the AES-CCM
status in the connection flags.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Originally allowing the use of debug keys was done via the Load Link
Keys management command. However this is BR/EDR specific and to be
flexible and allow extending this to LE as well, make this an independent
command.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When the controller has been enabled to allow usage of debug keys, then
clearly identify that in the current settings information.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
This patch groups the list_head fields from struct hci_dev together
and removes empty lines between them.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch creates two new fields in struct hci_conn to save the
minimum and maximum connection interval values used to establish
the connection this object represents.
This change is required in order to know what parameters the
connection is currently using.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
If LTK distribution happens in both directions we will have two LTKs for
the same remote device: one which is used when we're connecting as
master and another when we're connecting as slave. When looking up LTKs
from the locally stored list we shouldn't blindly return the first match
but also consider which type of key is in question. If we do not do this
we may end up selecting an incorrect encryption key for a connection.
This patch fixes the issue by always specifying to the LTK lookup
functions whether we're looking for a master or a slave key.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There's no reason why A2MP should need or deserve its on channel type.
Instead we should be able to group all fixed CID users under a single
channel type and reuse as much code as possible for them. Where CID
specific exceptions are needed the chan-scid value can be used.
This patch renames the current A2MP channel type to a generic one and
thereby paves the way to allow converting ATT and SMP (and any future
fixed channel protocols) to use the new channel type.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds a queue for incoming L2CAP data that's received before
l2cap_connect_cfm is called and processes the data once
l2cap_connect_cfm is called. This way we ensure that we have e.g. all
remote features before processing L2CAP signaling data (which is very
important for making the correct security decisions).
The processing of the pending rx data needs to be done through
queue_work since unlike l2cap_recv_acldata, l2cap_connect_cfm is called
with the hci_dev lock held which could cause potential deadlocks.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
For debugging purposes of Secure Connection Only support a simple
debugfs entry is used to indicate if this mode is active or not.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
With the introduction of security level 4, the RFCOMM sockets need to
be made aware of this new level. This change ensures that the pairing
requirements are set correctly for these connections.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
With the introduction of security level 4, the L2CAP sockets need to
be made aware of this new level. This change ensures that the pairing
requirements are set correctly for these connections.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The security level 4 is a new strong security requirement that is based
around 128-bit equivalent strength for link and encryption keys required
using FIPS approved algorithms. Which means that E0, SAFER+ and P-192
are not allowed. Only connections created with P-256 resulting from
using Secure Connections support are allowed.
This security level needs to be enforced when Secure Connection Only
mode is enabled for a controller or a service requires FIPS compliant
strong security. Currently it is not possible to enable either of
these two cases. This patch just puts in the foundation for being
able to handle security level 4 in the future.
It should be noted that devices or services with security level 4
requirement can only communicate using Bluetooth 4.1 controllers
with support for Secure Connections. There is no backward compatibilty
if used with older hardware.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
It is important to know if Secure Connections support has been enabled
for a given remote device. The information is provided in the remote
host features page. So track this information and provide a simple
helper function to extract the status.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The current management interface only allows to provide the remote
OOB input of P-192 data. This extends the command to also accept
P-256 data as well. To make this backwards compatible, the userspace
can decide to only provide P-192 data or the combined P-192 and P-256
data. It is also allowed to leave the P-192 data empty if userspace
only has the remote P-256 data.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Add function to allow adding P-192 and P-256 data to the internal
storage. This also fixes a few coding style issues from the previous
helper functions for the out-of-band credentials storage.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When Secure Connections has been enabled it is possible to provide P-192
and/or P-256 data during the pairing process. The internal out-of-band
credentials storage has been extended to also hold P-256 data.
Initially the P-256 data will be empty and with Secure Connections enabled
no P-256 data will be provided. This is according to the specification
since it might be possible that the remote side did not provide either
of the out-of-band credentials.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Bluetooth 4.1 specification with Secure Connections support has
just been released and controllers with this feature are still in
an early stage.
A handful of controllers have already support for it, but they do
not always identify this feature correctly. This debugfs entry
allows to tell the kernel that the controller can be treated as
it would fully support Secure Connections.
Using debugfs to force Secure Connections support of course does
not make this feature magically appear in all controllers. This
is a debug functionality for early adopters. Once the majority
of controllers matures this quirk will be removed.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
For Secure Connections support and the usage of out-of-band pairing,
it is needed to read the P-256 hash and randomizer or P-192 hash and
randomizer. This change will read P-192 data when Secure Connections
is disabled and P-192 and P-256 data when it is enabled.
The difference is between using HCI Read Local OOB Data and using the
new HCI Read Local OOB Extended Data command. The first one has been
introduced with Bluetooth 2.1 and returns only the P-192 data.
< HCI Command: Read Local OOB Data (0x03|0x0057) plen 0
> HCI Event: Command Complete (0x0e) plen 36
Read Local OOB Data (0x03|0x0057) ncmd 1
Status: Success (0x00)
Hash C from P-192: 975a59baa1c4eee391477cb410b23e6d
Randomizer R with P-192: 9ee63b7dec411d3b467c5ae446df7f7d
The second command has been introduced with Bluetooth 4.1 and will
return P-192 and P-256 data.
< HCI Command: Read Local OOB Extended Data (0x03|0x007d) plen 0
> HCI Event: Command Complete (0x0e) plen 68
Read Local OOB Extended Data (0x03|0x007d) ncmd 1
Status: Success (0x00)
Hash C from P-192: 6489731804b156fa6355efb8124a1389
Randomizer R with P-192: 4781d5352fb215b2958222b3937b6026
Hash C from P-256: 69ef8a928b9d07fc149e630e74ecb991
Randomizer R with P-256: 4781d5352fb215b2958222b3937b6026
The change for the management interface is transparent and no change
is required for existing userspace. The Secure Connections feature
needs to be manually enabled. When it is disabled, then userspace
only gets the P-192 returned and with Secure Connections enabled,
userspace gets P-192 and P-256 in an extended structure.
It is also acceptable to just ignore the P-256 data since it is not
required to support them. The pairing with out-of-band credentials
will still succeed. However then of course no Secure Connection will
b established.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The support for Secure Connections need to be explicitly enabled by
userspace. This is required since only userspace that can handle the
new link key types should enable support for Secure Connections.
This command handling is similar to how Secure Simple Pairing enabling
is done. It also tracks the case when Secure Connections support is
enabled via raw HCI commands. This makes sure that the host features
page is updated as well.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The MGMT_SETTING_SECURE_CONN setting is used to track the support and
status for Secure Connections from the management interface. For HCI
based tracking HCI_SC_ENABLED flag is used.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
With the introduction of Secure Connections, the list of link key types
got extended by P-256 versions of authenticated and unauthenticated
link keys.
To avoid any confusion the previous authenticated and unauthenticated
link key types got ammended with a P912 postfix. And the two new keys
have a P256 postfix now. Existing code using the previous definitions
has been adjusted.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Secure Connections feature introduces the support for P-256 strength
pairings (compared to P-192 with Secure Simple Pairing). This however
means that for out-of-band pairing the hash and randomizer needs to be
differentiated. Two new commands are introduced to handle the possible
combinations of P-192 and P-256. This add the HCI command definition
for both.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Secure Connections feature is optional and host stacks have to
manually enable it. This add the HCI command definiton for reading
and writing this setting.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The support for Secure Connections introduces two new controller
features and one new host feature.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
NAPI was originally added to mac80211 a long time ago (by John in
commit 4e6cbfd09c in July 2010), but then removed years later
(by Stanislaw in commit 30c97120c6 in February 2013). No driver
ever used it, so that was fine.
Now I'm adding support for NAPI to our driver, so add some code
to mac80211 again to support NAPI. John was originally wrapping
some (but not nearly all NAPI-related functions), but that doesn't
scale very well with the number of functions that are there, some
of which are even only inlines. Thus, instead of doing that, let
the drivers manage the NAPI struct, except for napi_add() which is
needed so mac80211 knows how to call napi_gro_receive().
Also remove some no longer needed definitions that were left when
NAPI support was removed.
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Reviewed-by: Eyal Shapira <eyal@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There's no driver using this flag and consequently no userspace
application is actually looking at it. As it seems unlikely for
any driver to start using it, remove it and the (very little)
code that used it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In case of beacon_loss with IEEE80211_HW_CONNECTION_MONITOR
device, mac80211 probes the ap (and disconnects on timeout)
but ignores the ack.
If we already got an ack, there's no reason to continue
disconnecting. this can help devices that supports
IEEE80211_HW_CONNECTION_MONITOR only partially (e.g. take
care of keep alives, but does not probe the ap.
In case the device wants to disconnect without probing,
it can just call ieee80211_connection_loss.
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pull vfs fixes from Al Viro:
"A couple of fixes, both -stable fodder. The O_SYNC bug is fairly
old..."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix a kmap leak in virtio_console
fix O_SYNC|O_APPEND syncing the wrong range on write()
It actually goes back to 2004 ([PATCH] Concurrent O_SYNC write support)
when sync_page_range() had been introduced; generic_file_write{,v}() correctly
synced
pos_after_write - written .. pos_after_write - 1
but generic_file_aio_write() synced
pos_before_write .. pos_before_write + written - 1
instead. Which is not the same thing with O_APPEND, obviously.
A couple of years later correct variant had been killed off when
everything switched to use of generic_file_aio_write().
All users of generic_file_aio_write() are affected, and the same bug
has been copied into other instances of ->aio_write().
The fix is trivial; the only subtle point is that generic_write_sync()
ought to be inlined to avoid calculations useless for the majority of
calls.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull x86 fixes from Peter Anvin:
"Quite a varied little collection of fixes. Most of them are
relatively small or isolated; the biggest one is Mel Gorman's fixes
for TLB range flushing.
A couple of AMD-related fixes (including not crashing when given an
invalid microcode image) and fix a crash when compiled with gcov"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode, AMD: Unify valid container checks
x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y
x86/efi: Allow mapping BGRT on x86-32
x86: Fix the initialization of physnode_map
x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable()
x86/intel/mid: Fix X86_INTEL_MID dependencies
arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT
mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge
x86: mm: change tlb_flushall_shift for IvyBridge
x86/mm: Eliminate redundant page table walk during TLB range flushing
x86/mm: Clean up inconsistencies when flushing TLB ranges
mm, x86: Account for TLB flushes only when debugging
x86/AMD/NB: Fix amd_set_subcaches() parameter type
x86/quirks: Add workaround for AMD F16h Erratum792
x86, doc, kconfig: Fix dud URL for Microcode data
This is a patch to improve swap readahead algorithm. It's from Hugh and
I slightly changed it.
Hugh's original changelog:
swapin readahead does a blind readahead, whether or not the swapin is
sequential. This may be ok on harddisk, because large reads have
relatively small costs, and if the readahead pages are unneeded they can
be reclaimed easily - though, what if their allocation forced reclaim of
useful pages? But on SSD devices large reads are more expensive than
small ones: if the readahead pages are unneeded, reading them in caused
significant overhead.
This patch adds very simplistic random read detection. Stealing the
PageReadahead technique from Konstantin Khlebnikov's patch, avoiding the
vma/anon_vma sophistications of Shaohua Li's patch, swapin_nr_pages()
simply looks at readahead's current success rate, and narrows or widens
its readahead window accordingly. There is little science to its
heuristic: it's about as stupid as can be whilst remaining effective.
The table below shows elapsed times (in centiseconds) when running a
single repetitive swapping load across a 1000MB mapping in 900MB ram
with 1GB swap (the harddisk tests had taken painfully too long when I
used mem=500M, but SSD shows similar results for that).
Vanilla is the 3.6-rc7 kernel on which I started; Shaohua denotes his
Sep 3 patch in mmotm and linux-next; HughOld denotes my Oct 1 patch
which Shaohua showed to be defective; HughNew this Nov 14 patch, with
page_cluster as usual at default of 3 (8-page reads); HughPC4 this same
patch with page_cluster 4 (16-page reads); HughPC0 with page_cluster 0
(1-page reads: no readahead).
HDD for swapping to harddisk, SSD for swapping to VertexII SSD. Seq for
sequential access to the mapping, cycling five times around; Rand for
the same number of random touches. Anon for a MAP_PRIVATE anon mapping;
Shmem for a MAP_SHARED anon mapping, equivalent to tmpfs.
One weakness of Shaohua's vma/anon_vma approach was that it did not
optimize Shmem: seen below. Konstantin's approach was perhaps mistuned,
50% slower on Seq: did not compete and is not shown below.
HDD Vanilla Shaohua HughOld HughNew HughPC4 HughPC0
Seq Anon 73921 76210 75611 76904 78191 121542
Seq Shmem 73601 73176 73855 72947 74543 118322
Rand Anon 895392 831243 871569 845197 846496 841680
Rand Shmem 1058375 1053486 827935 764955 764376 756489
SSD Vanilla Shaohua HughOld HughNew HughPC4 HughPC0
Seq Anon 24634 24198 24673 25107 21614 70018
Seq Shmem 24959 24932 25052 25703 22030 69678
Rand Anon 43014 26146 28075 25989 26935 25901
Rand Shmem 45349 45215 28249 24268 24138 24332
These tests are, of course, two extremes of a very simple case: under
heavier mixed loads I've not yet observed any consistent improvement or
degradation, and wider testing would be welcome.
Shaohua Li:
Test shows Vanilla is slightly better in sequential workload than Hugh's
patch. I observed with Hugh's patch sometimes the readahead size is
shrinked too fast (from 8 to 1 immediately) in sequential workload if
there is no hit. And in such case, continuing doing readahead is good
actually.
I don't prepare a sophisticated algorithm for the sequential workload
because so far we can't guarantee sequential accessed pages are swap out
sequentially. So I slightly change Hugh's heuristic - don't shrink
readahead size too fast.
Here is my test result (unit second, 3 runs average):
Vanilla Hugh New
Seq 356 370 360
Random 4525 2447 2444
Attached graph is the swapin/swapout throughput I collected with 'vmstat
2'. The first part is running a random workload (till around 1200 of
the x-axis) and the second part is running a sequential workload.
swapin and swapout throughput are almost identical in steady state in
both workloads. These are expected behavior. while in Vanilla, swapin
is much bigger than swapout especially in random workload (because wrong
readahead).
Original patches by: Shaohua Li and Konstantin Khlebnikov.
[fengguang.wu@intel.com: swapin_nr_pages() can be static]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This capabilities weren't propagated to the radiotap header.
We don't set here the VHT_KNOWN / MCS_HAVE flag because not
all the low level drivers will know how to properly flag
the frames, hence the low level driver will be in charge
of setting IEEE80211_RADIOTAP_MCS_HAVE_FEC,
IEEE80211_RADIOTAP_MCS_HAVE_STBC and / or
IEEE80211_RADIOTAP_VHT_KNOWN_STBC according to its
capabilities.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
ieee80211_rx_status.flags is full. Define a new vht_flag
variable to be able to set more VHT related flags and make
room in flags.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Acked-by: Kalle Valo <kvalo@qca.qualcomm.com> [ath10k]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The purpose of this housekeeping is to make some room for
VHT flags. The radiotap vendor fields weren't in use.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
- Revert "xen/grant-table: Avoid m2p_override during mapping" as it broke Xen ARM build.
- Fix CR4 not being set on AP processors in Xen PVH mode.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJS8AyQAAoJEFjIrFwIi8fJbD4IAJssMuaLI5CRsSWBgDFHHDFt
srVJpDOYQiDr/TxkwFCVcL4sFy9Htb3KMArU4eIBl6uMqQbGa+3rHyXcHYI219YY
XH3D8RG+9JChwsxtaeUEzwx1C8ehcygD34vtdcoQXa7eBuEi4TL3HeLifR+HrXKO
UdFrTA34FmvpVFbSuRXkZh5sd6ca9et9xHuQHM8SIY6pVokY6xaEYOp17tfPZpwM
7A6LFjUjXeugHC2L3+/H8UOHA9nSZQvnMiZOWq2Cusc2Dt2V7emzgk2wcc2CHttf
EA6GbtiJzHqMPmt5EjubI9hHdSMB31HpY4hnQE38+ucl+BwiSdRE9z2Rm4TYClg=
=IX4M
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fixes from Konrad Rzeszutek Wilk:
"Bug-fixes:
- Revert "xen/grant-table: Avoid m2p_override during mapping" as it
broke Xen ARM build.
- Fix CR4 not being set on AP processors in Xen PVH mode"
* tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pvh: set CR4 flags for APs
Revert "xen/grant-table: Avoid m2p_override during mapping"
Pull NVMe driver update from Matthew Wilcox:
"Looks like I missed the merge window ... but these are almost all
bugfixes anyway (the ones that aren't have been baking for months)"
* git://git.infradead.org/users/willy/linux-nvme:
NVMe: Namespace use after free on surprise removal
NVMe: Correct uses of INIT_WORK
NVMe: Include device and queue numbers in interrupt name
NVMe: Add a pci_driver shutdown method
NVMe: Disable admin queue on init failure
NVMe: Dynamically allocate partition numbers
NVMe: Async IO queue deletion
NVMe: Surprise removal handling
NVMe: Abort timed out commands
NVMe: Schedule reset for failed controllers
NVMe: Device resume error handling
NVMe: Cache dev->pci_dev in a local pointer
NVMe: Fix lockdep warnings
NVMe: compat SG_IO ioctl
NVMe: remove deprecated IRQF_DISABLED
NVMe: Avoid shift operation when writing cq head doorbell
This changes 'do_execve()' to get the executable name as a 'struct
filename', and to free it when it is done. This is what the normal
users want, and it simplifies and streamlines their error handling.
The controlled lifetime of the executable name also fixes a
use-after-free problem with the trace_sched_process_exec tracepoint: the
lifetime of the passed-in string for kernel users was not at all
obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize
the pathname allocation lifetime with the execve() having finished,
which in turn meant that the trace point that happened after
mm_release() of the old process VM ended up using already free'd memory.
To solve the kernel string lifetime issue, this simply introduces
"getname_kernel()" that works like the normal user-space getname()
function, except with the source coming from kernel memory.
As Oleg points out, this also means that we could drop the tcomm[] array
from 'struct linux_binprm', since the pathname lifetime now covers
setup_new_exec(). That would be a separate cleanup.
Reported-by: Igor Zhbanov <i.zhbanov@samsung.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The element ID list is currently almost sorted by amendment
or similar topic, but the order is difficult to maintain and
not very transparent. Sort the list by ID instead, and add
a lot of missing IDs.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In case we will get regulatory request with rule
where max_bandwidth_khz is set to 0 handle this
case as a special one.
If max_bandwidth_khz == 0 we should calculate maximum
available bandwidth base on all frequency contiguous rules.
In case we need auto calculation we just have to set:
country PL: DFS-ETSI
(2402 - 2482 @ 40), (N/A, 20)
(5170 - 5250 @ AUTO), (N/A, 20)
(5250 - 5330 @ AUTO), (N/A, 20), DFS
(5490 - 5710 @ 80), (N/A, 27), DFS
This mean we will calculate maximum bw for rules where
AUTO (N/A) were set, 160MHz (5330 - 5170) in example above.
So we will get:
(5170 - 5250 @ 160), (N/A, 20)
(5250 - 5330 @ 160), (N/A, 20), DFS
In other case:
country FR: DFS-ETSI
(2402 - 2482 @ 40), (N/A, 20)
(5170 - 5250 @ AUTO), (N/A, 20)
(5250 - 5330 @ 80), (N/A, 20), DFS
(5490 - 5710 @ 80), (N/A, 27), DFS
We will get 80MHz (5250 - 5170):
(5170 - 5250 @ 80), (N/A, 20)
(5250 - 5330 @ 80), (N/A, 20), DFS
Base on this calculations we will set correct channel
bandwidth flags (eg. IEEE80211_CHAN_NO_80MHZ).
We don't need any changes in CRDA or internal regulatory.
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
[extend nl80211 description a bit, fix typo]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
It was possible to break interface combinations in
the following way:
combo 1: iftype = AP, num_ifaces = 2, num_chans = 2,
combo 2: iftype = AP, num_ifaces = 1, num_chans = 1, radar = HT20
With the above interface combinations it was
possible to:
step 1. start AP on DFS channel by matching combo 2
step 2. start AP on non-DFS channel by matching combo 1
This was possible beacuse (step 2) did not consider
if other interfaces require radar detection.
The patch changes how cfg80211 tracks channels -
instead of channel itself now a complete chandef
is stored.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When receiving an IBSS_JOINED event select the BSS object
based on the {bssid, channel} couple rather than the bssid
only.
With the current approach if another cell having the same
BSSID (but using a different channel) exists then cfg80211
picks up the wrong BSS object.
The result is a mismatching channel configuration between
cfg80211 and the driver, that can lead to any sort of
problem.
The issue can be triggered by having an IBSS sitting on
given channel and then asking the driver to create a new
cell using the same BSSID but with a different frequency.
By passing the channel to cfg80211_get_bss() we can solve
this ambiguity and retrieve/create the correct BSS object.
All the users of cfg80211_ibss_joined() have been changed
accordingly.
Moreover WARN when cfg80211_ibss_joined() gets a NULL
channel as argument and remove a bogus call of the same
function in ath6kl (it does not make sense to call
cfg80211_ibss_joined() with a zero BSSID on ibss-leave).
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Cc: Arend van Spriel <arend@broadcom.com>
Cc: Bing Zhao <bzhao@marvell.com>
Cc: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Cc: libertas-dev@lists.infradead.org
Acked-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
[minor code cleanup in ath6kl]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Action, disassoc and deauth frames are bufferable, and as such don't
have the PM bit in the frame control field reserved which means we
need to react to the bit when receiving in such a frame.
Fix this by introducing a new helper ieee80211_is_bufferable_mmpdu()
and using it for the RX path that currently ignores the PM bit in
any non-data frames for doze->wake transitions, but listens to it in
all frames for wake->doze transitions, both of which are wrong.
Also use the new helper in the TX path to clean up the code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The scheduled scan matchsets were intended to be a list of filters,
with the found BSS having to pass at least one of them to be passed
to the host. When the RSSI attribute was added, however, this was
broken and currently wpa_supplicant adds that attribute in its own
matchset; however, it doesn't intend that to mean that anything
that passes the RSSI filter should be passed to the host, instead
it wants it to mean that everything needs to also have higher RSSI.
This is semantically problematic because we have a list of filters
like [ SSID1, SSID2, SSID3, RSSI ] with no real indication which
one should be OR'ed and which one AND'ed.
To fix this, move the RSSI filter attribute into each matchset. As
we need to stay backward compatible, treat a matchset with only the
RSSI attribute as a "default RSSI filter" for all other matchsets,
but only if there are other matchsets (an RSSI-only matchset by
itself is still desirable.)
To make driver implementation easier, keep a global min_rssi_thold
for the entire request as well. The only affected driver is ath6kl.
I found this when I looked into the code after Raja Mani submitted
a patch fixing the n_match_sets calculation to disregard the RSSI,
but that patch didn't address the semantic issue.
Reported-by: Raja Mani <rmani@qti.qualcomm.com>
Acked-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A few places weren't checking that the frame passed to the
function actually has enough data even though the function
clearly documents it must have a payload byte. Make this
safer by changing the function to take an skb and checking
the length inside. The old version is preserved for now as
the rtl* drivers use it and don't have a correct skb.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There's not a single rate control algorithm actually in
a separate module where the module refcount would be
required. Similarly, there's no specific rate control
module.
Therefore, all the module handling code in rate control
is really just dead code, so remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Change the code to allow making all the rate control ops
const, nothing ever needs to change them. Also change all
drivers to make use of this and mark the ops const.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Allow to force SGI, LGI.
Mainly for test purpose.
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This required liberally sprinkling 'const' over brcmfmac
and mwifiex but seems like a useful thing to do since the
pointer can't really be written.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Addition of the frequency hints showed up couple of places in cfg80211
where pointers could be marked const and a shared function could be used
to fetch a valid channel.
Signed-off-by: Jouni Malinen <j@w1.fi>
[fix mwifiex]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This allows drivers to advertise the maximum number of associated
stations they support in AP mode (including P2P GO). User space
applications can use this for cleaner way of handling the limit (e.g.,
hostapd rejecting IEEE 802.11 authentication without manual
configuration of the limit) or to figure out what type of use cases can
be executed with multiple devices before trying and failing.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This clarifies the expected driver behavior on the older
NL80211_ATTR_MAC and NL80211_ATTR_WIPHY_FREQ attributes and adds a new
set of similar attributes with _HINT postfix to enable use of a
recommendation of the initial BSS to choose. This can be helpful for
some drivers that can avoid an additional full scan on connection
request if the information is provided to them (user space tools like
wpa_supplicant already has that information available based on earlier
scans).
In addition, this can be used to get more expected behavior for cases
where a specific BSS should be picked first based on operations like
Interworking network selection or WPS. These cases were already easily
addressed with drivers that leave BSS selection to user space, but there
was no convenient way to do this with drivers that take care of BSS
selection internally without using the NL80211_ATTR_MAC which is not
really desired since it is needed for other purposes to force the
association to remain with the same BSS.
Signed-off-by: Jouni Malinen <j@w1.fi>
[add const, fix policy]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A beacon should never have a Channel Switch Announcement information
element with a count of 0, because a count of 1 means switch just
before the next beacon. So, if a count of 0 was valid in a beacon, it
would have been transmitted in the next channel already, which is
useless. A CSA count equal to zero is only meaningful in action
frames or probe_responses.
Fix the ieee80211_csa_is_complete() and ieee80211_update_csa()
functions accordingly.
With a CSA count of 0, we won't transmit any CSA beacons, because the
switch will happen before the next TBTT. To avoid extra work and
potential confusion in the drivers, complete the CSA immediately,
instead of waiting for the driver to call ieee80211_csa_finish().
To keep things simpler, we also switch immediately when the CSA count
is 1, while in theory we should delay the switch until just before the
next TBTT.
Additionally, move the ieee80211_csa_finish() function to cfg.c,
where it makes more sense.
Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Acked-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This reverts commit 08ece5bb23.
As it breaks ARM builds and needs more attention
on the ARM side.
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Pull SLAB changes from Pekka Enberg:
"Random bug fixes that have accumulated in my inbox over the past few
months"
* 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
mm: Fix warning on make htmldocs caused by slab.c
mm: slub: work around unneeded lockdep warning
mm: sl[uo]b: fix misleading comments
slub: Fix possible format string bug.
slub: use lockdep_assert_held
slub: Fix calculation of cpu slabs
slab.h: remove duplicate kmalloc declaration and fix kernel-doc warnings
nfs41_wake_and_assign_slot() relies on the task->tk_msg.rpc_argp and
task->tk_msg.rpc_resp always pointing to the session sequence arguments.
nfs4_proc_open_confirm tries to pull a fast one by reusing the open
sequence structure, thus causing corruption of the NFSv4 slot table.
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Pull vfs fixes from Al Viro:
"Several obvious fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Fix mountpoint reference leakage in linkat
hfsplus: use xattr handlers for removexattr
Typo in compat_sys_lseek() declaration
fs/super.c: sync ro remount after blocking writers
vfs: unexport the getname() symbol
Highlights:
- Fix several races in nfs_revalidate_mapping
- NFSv4.1 slot leakage in the pNFS files driver
- Stable fix for a slot leak in nfs40_sequence_done
- Don't reject NFSv4 servers that support ACLs with only ALLOW aces
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJS7Bb+AAoJEGcL54qWCgDyDuQP/17nKR5e6MLhixcAbvlcH+pN
8CGolAM3HmRXDWUW/PkBH3UguG8Tzx1Ex26vIxipPeTSwZabf6194Twj6L97DEGZ
2SouD158BW1TkAbhEN/alKB/4ZCPos05iXjZkrL7MRff+8FD0UvWR2pBT1F2jQdY
ZftG76Q72qhZHfH07ZMxM/v4Oy2Ge98RDD35gfuuqMSjHpmN9tiB55PeheW33LVY
fu6I/JEwmlJpgy2qUcDv7v0V4mDpjC7XbcjjHpMHL8zp/C5Rx/rdgt9OQPlwmjdV
FD8MWNXLc5TWxIouLDFPVUv3WZPjyu449QHS9Wc95fSqsHcdl4j4SwLAoSvUIdHt
vDI5PtWhw3WAezbtiuCQnT0xcoIOn5bLjOVP13taDcV9vlZLcFlyOpZ5gVE4/Yju
zm4sCW2+imDc74ERGa4fukF6QhzzAVmD8RlCJwuNzwCfXiZ36+xSanMYiPoUiwLL
OVNgymrm0fe7GVFQKWN2D+Vr68OQEmARO+KfA3UzP5rQV+9CU8zSHjbcoRWZ59QG
VahOS5WDLQSrMp8W37yAHH9IiAWveAAKJJTHlOniRqH90QYPgyW18fTo7YcpW313
AQGFgr/1n4t27MWRLu5rdoN5v8+kwNi0UV6oboNIPoP1v15NkEMvc7HKFj5M883R
qEYfe5wqN/eRNj68NT/+
=B7f0
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights:
- Fix several races in nfs_revalidate_mapping
- NFSv4.1 slot leakage in the pNFS files driver
- Stable fix for a slot leak in nfs40_sequence_done
- Don't reject NFSv4 servers that support ACLs with only ALLOW aces"
* tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: initialize the ACL support bits to zero.
NFSv4.1: Cleanup
NFSv4.1: Clean up nfs41_sequence_done
NFSv4: Fix a slot leak in nfs40_sequence_done
NFSv4.1 free slot before resending I/O to MDS
nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING
NFS: Fix races in nfs_revalidate_mapping
sunrpc: turn warn_gssd() log message into a dprintk()
NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping
nfs: handle servers that support only ALLOW ACE type.
Pull SCSI target updates from Nicholas Bellinger:
"The highlights this round include:
- add support for SCSI Referrals (Hannes)
- add support for T10 DIF into target core (nab + mkp)
- add support for T10 DIF emulation in FILEIO + RAMDISK backends (Sagi + nab)
- add support for T10 DIF -> bio_integrity passthrough in IBLOCK backend (nab)
- prep changes to iser-target for >= v3.15 T10 DIF support (Sagi)
- add support for qla2xxx N_Port ID Virtualization - NPIV (Saurav + Quinn)
- allow percpu_ida_alloc() to receive task state bitmask (Kent)
- fix >= v3.12 iscsi-target session reset hung task regression (nab)
- fix >= v3.13 percpu_ref se_lun->lun_ref_active race (nab)
- fix a long-standing network portal creation race (Andy)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits)
target: Fix percpu_ref_put race in transport_lun_remove_cmd
target/iscsi: Fix network portal creation race
target: Report bad sector in sense data for DIF errors
iscsi-target: Convert gfp_t parameter to task state bitmask
iscsi-target: Fix connection reset hang with percpu_ida_alloc
percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask
iscsi-target: Pre-allocate more tags to avoid ack starvation
qla2xxx: Configure NPIV fc_vport via tcm_qla2xxx_npiv_make_lport
qla2xxx: Enhancements to enable NPIV support for QLOGIC ISPs with TCM/LIO.
qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure
IB/isert: pass scatterlist instead of cmd to fast_reg_mr routine
IB/isert: Move fastreg descriptor creation to a function
IB/isert: Avoid frwr notation, user fastreg
IB/isert: seperate connection protection domains and dma MRs
tcm_loop: Enable DIF/DIX modes in SCSI host LLD
target/rd: Add DIF protection into rd_execute_rw
target/rd: Add support for protection SGL setup + release
target/rd: Refactor rd_build_device_space + rd_release_device_space
target/file: Add DIF protection support to fd_execute_rw
target/file: Add DIF protection init/format support
...
Pull media updates from Mauro Carvalho Chehab:
- a new jpeg codec driver for Samsung Exynos (jpeg-hw-exynos4)
- a new dvb frontend for ds2103 chipset (m88ds2103)
- a new sensor driver for Samsung S5K5BAF UXGA (s5k5baf)
- new drivers for R-Car VSP1
- a new radio driver: radio-raremono
- a new tuner driver for ts2022 chipset (m88ts2022)
- the analog part of em28xx is now a separate module that only
load/runs if the device is not a pure digital TV device
- added a staging driver for bcm2048 radio devices
- the omap 2 video driver (omap24xx) was moved to staging. This driver
is for an old hardware and uses a deprecated Kernel internal API. If
nobody cares enough to fix it, it would be removed on a couple Kernel
releases
- the sn9c102 driver was moved to staging. This driver was replaced by
gspca, and disabled on some distros, as almost all devices are known
to work properly with gspca. It should be removed from kernel on a
couple Kernel releases
- lots of driver fixes, improvements and cleanups
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (421 commits)
[media] media: v4l2-dev: fix video device index assignment
[media] rc-core: reuse device numbers
[media] em28xx-cards: properly initialize the device bitmap
[media] Staging: media: Fix line length exceeding 80 characters in as102_drv.c
[media] Staging: media: Fix line length exceeding 80 characters in as102_fe.c
[media] Staging: media: Fix quoted string split across line in as102_fe.c
[media] media: st-rc: Add reset support
[media] m2m-deinterlace: fix allocated struct type
[media] radio-usb-si4713: fix sparse non static symbol warnings
[media] em28xx-audio: remove needless check before usb_free_coherent()
[media] au0828: Fix sparse non static symbol warning
Revert "[media] go7007-usb: only use go->dev after allocated"
[media] em28xx-audio: provide an error code when URB submit fails
[media] em28xx: fix check for audio only usb interfaces when changing the usb alternate setting
[media] em28xx: fix usb alternate setting for analog and digital video endpoints > 0
[media] em28xx: make 'em28xx_ctrl_ops' static
em28xx-alsa: Fix error patch for init/fini
[media] em28xx-audio: flush work at .fini
[media] drxk: remove the option to load firmware asynchronously
[media] em28xx: adjust period size at runtime
...
- ACPI device hotplug fix preventing ACPI drivers from binding to device
objects that acpi_bus_trim() has been called for and the devices
represented by them may not be operational.
- Recent cpufreq changes related to the "boost" (turbo) feature broke
the acpi-cpufreq error code path causing a NULL pointer dereference
to occur on some systems. Fix from Konrad Rzeszutek Wilk.
- The log level of a CPU initialization error message added recently
needs to be reduced, because the particular BIOS issue indicated by
it turns out to be widespread and doesn't really matter for the
majority of systems having it. From Jiang Liu.
- The regulator API needs to be told to stay away from things on systems
with ACPI BIOSes or it may conflict with the BIOS' own handling of
voltage regulators. Fix from Mark Brown that works around a 3.13
regression in lm90 on PCs occuring if the regulator API is enabled.
- Prevent the Exynos4 devfreq driver from being built on multiplatform,
because it depends on things that aren't available during such builds.
From Sachin Kamat.
- Upstream ACPICA doesn't use the bool type as defined in the kernel,
so modify the kernel's ACPICA code to follow the upstream in that
respect (only one variable definition is affected) to reduce
divergences between the two. From Lv Zheng.
- Make the ACPI device PM code use ACPI_COMPANION() instead of its own
routine doing the same thing (and invokng ACPI_COMPANION() in the
process).
- Modify some routines in the ACPI processor driver to follow the
common convention and return negative integers on errors. From
Hanjun Guo.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJS6mQLAAoJEILEb/54YlRxc6sP/0tYmO8qOF0XwHhA9AQKlcVG
DZxui1pEbhXM3lftwqvVnma1hOKY1T014+JFxldCjvV9hyxdxTvcdvhKRxrj/eAk
DY7p2TK3N0eBScpMQLv/nKMjS20Z0bxz3QNxhC0S3eOYk5QPN3aUhsqG2iu6aFxJ
FzOOV5mQGhGT6a3LmXvjNROXjJN4dibMmGvl+bJ/8Y5zJUUvAOWmxphqjaIZm7WC
Lh4a8wUfxfmq8Mj1DDI/qk/aWP8QTBDYiG8C8XjEa1pj9oHjsrCHNogSrfRHRwa1
sDW/6EjSbLYnKuUDeoA61wXyVvlvR22uTTswDwbJ2mt18WTcmbvPP6Znh7m3oiF2
aUDoh/8KvYGRhDBZm+wf6TjD/KdcIQdOptNl6irCMsHkYNDK/n7f++Blkjy/xmea
UsF6l4XkIbqJCWWsTwO2VjB0p0QPRS49s/YtUxjETqCjdhNKnAQr2wfBH5/fcD6c
yJPkCGOlqcPQAPOrWyabAuStDfFUEUoFkZ/kbZov2xUIXyJFv5gfj1boZ3Ekqhmd
klcOmsvVP8QZLF82iFIhDdgWLNCRPESNfJpN0mLpg2sYMWQtK+r1OnSKpbVNW0Jn
6Ea23WSm0ZEd0QX8tYLjf/iUsIUjn48gJuXG+CDlTtkQtLNdr1VTQd5ZeKm2kcOg
+82EVRoeZi+sqx4YrEQi
=VBdN
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes and cleanups from Rafael Wysocki:
- ACPI device hotplug fix preventing ACPI drivers from binding to device
objects that acpi_bus_trim() has been called for and the devices
represented by them may not be operational.
- Recent cpufreq changes related to the "boost" (turbo) feature broke
the acpi-cpufreq error code path causing a NULL pointer dereference
to occur on some systems. Fix from Konrad Rzeszutek Wilk.
- The log level of a CPU initialization error message added recently
needs to be reduced, because the particular BIOS issue indicated by
it turns out to be widespread and doesn't really matter for the
majority of systems having it. From Jiang Liu.
- The regulator API needs to be told to stay away from things on systems
with ACPI BIOSes or it may conflict with the BIOS' own handling of
voltage regulators. Fix from Mark Brown that works around a 3.13
regression in lm90 on PCs occuring if the regulator API is enabled.
- Prevent the Exynos4 devfreq driver from being built on multiplatform,
because it depends on things that aren't available during such builds.
From Sachin Kamat.
- Upstream ACPICA doesn't use the bool type as defined in the kernel,
so modify the kernel's ACPICA code to follow the upstream in that
respect (only one variable definition is affected) to reduce
divergences between the two. From Lv Zheng.
- Make the ACPI device PM code use ACPI_COMPANION() instead of its own
routine doing the same thing (and invokng ACPI_COMPANION() in the
process).
- Modify some routines in the ACPI processor driver to follow the
common convention and return negative integers on errors. From
Hanjun Guo.
* tag 'pm+acpi-3.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / scan: Clear match_driver flag in acpi_bus_trim()
ACPI / init: Flag use of ACPI and ACPI idioms for power supplies to regulator API
acpi-cpufreq: De-register CPU notifier and free struct msr on error.
ACPICA: Remove bool usage from ACPICA.
PM / devfreq: Disable Exynos4 driver build on multiplatform
ACPI / PM: Use ACPI_COMPANION() to get ACPI companions of devices
ACPI / scan: reduce log level of "ACPI: \_PR_.CPU4: failed to get CPU APIC ID"
ACPI / processor: Return specific error value when mapping lapic id
Pull timer/dynticks updates from Ingo Molnar:
"This tree contains misc dynticks updates: a fix and three cleanups"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/nohz: Fix overflow error in scheduler_tick_max_deferment()
nohz_full: fix code style issue of tick_nohz_full_stop_tick
nohz: Get timekeeping max deferment outside jiffies_lock
tick: Rename tick_check_idle() to tick_irq_enter()
Pull core debug changes from Ingo Molnar:
"This contains mostly kernel debugging related updates:
- make hung_task detection more configurable to distros
- add final bits for x86 UV NMI debugging, with related KGDB changes
- update the mailing-list of MAINTAINERS entries I'm involved with"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hung_task: Display every hung task warning
sysctl: Add neg_one as a standard constraint
x86/uv/nmi, kgdb/kdb: Fix UV NMI handler when KDB not configured
x86/uv/nmi: Fix Sparse warnings
kgdb/kdb: Fix no KDB config problem
MAINTAINERS: Restore "L: linux-kernel@vger.kernel.org" entries
- Xen ARM couldn't use the new FIFO events
- Xen ARM couldn't use the SWIOTLB if compiled as 32-bit with 64-bit PCIe devices.
- Grant table were doing needless M2P operations.
- Ratchet down the self-balloon code so it won't OOM.
- Fix misplaced kfree in Xen PVH error code paths.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJS68IQAAoJEFjIrFwIi8fJAWgH/j4HStEey3rgGcqwIWSHkkap
+t55wsrT8Ylq6CzZjaUtCo3pB7HotW526x/0rA2pxVqHn/8oCN/1EtdrNtYm/umX
qOoda+db5NIjAEGVLWSLqGyokJQDrX/brXIWfYR300e9fnJi7yT/rFC4QHoZVUYl
5LME8XH/jE012vvYelNu6DbbodlRmVCT8hctJS+eB5ER2WmtD9Pkw4GybEXPVYJz
hE0Ts1DN91nKP2FGJb+mfB9UFT5X8i00akAK+Qc1R3sRnRh6eRoNV8dgyCnudKpO
UPEdiAZvgij+mzlgIYSz6nKH0U/VbvRsG3lc3i5Si3o+vR3CYPCkvzOGX2d0rjw=
=7cxW
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.14-rc0-late-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen bugfixes from Konrad Rzeszutek Wilk:
"Bug-fixes for the new features that were added during this cycle.
There are also two fixes for long-standing issues for which we have a
solution: grant-table operations extra work that was not needed
causing performance issues and the self balloon code was too
aggressive causing OOMs.
Details:
- Xen ARM couldn't use the new FIFO events
- Xen ARM couldn't use the SWIOTLB if compiled as 32-bit with 64-bit PCIe devices.
- Grant table were doing needless M2P operations.
- Ratchet down the self-balloon code so it won't OOM.
- Fix misplaced kfree in Xen PVH error code paths"
* tag 'stable/for-linus-3.14-rc0-late-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pvh: Fix misplaced kfree from xlated_setup_gnttab_pages
drivers: xen: deaggressive selfballoon driver
xen/grant-table: Avoid m2p_override during mapping
xen/gnttab: Use phys_addr_t to describe the grant frame base address
xen: swiotlb: handle sizeof(dma_addr_t) != sizeof(phys_addr_t)
arm/xen: Initialize event channels earlier
The grant mapping API does m2p_override unnecessarily: only gntdev needs it,
for blkback and future netback patches it just cause a lock contention, as
those pages never go to userspace. Therefore this series does the following:
- the original functions were renamed to __gnttab_[un]map_refs, with a new
parameter m2p_override
- based on m2p_override either they follow the original behaviour, or just set
the private flag and call set_phys_to_machine
- gnttab_[un]map_refs are now a wrapper to call __gnttab_[un]map_refs with
m2p_override false
- a new function gnttab_[un]map_refs_userspace provides the old behaviour
It also removes a stray space from page.h and change ret to 0 if
XENFEAT_auto_translated_physmap, as that is the only possible return value
there.
v2:
- move the storing of the old mfn in page->index to gnttab_map_refs
- move the function header update to a separate patch
v3:
- a new approach to retain old behaviour where it needed
- squash the patches into one
v4:
- move out the common bits from m2p* functions, and pass pfn/mfn as parameter
- clear page->private before doing anything with the page, so m2p_find_override
won't race with this
v5:
- change return value handling in __gnttab_[un]map_refs
- remove a stray space in page.h
- add detail why ret = 0 now at some places
v6:
- don't pass pfn to m2p* functions, just get it locally
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Suggested-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
On x86, SLUB creates and handles <=8192-byte allocations internally.
It passes larger ones up to the allocator. Saying "up to order 2" is,
at best, ambiguous. Is that order-1? Or (order-2 bytes)? Make
it more clear.
SLOB commits a similar sin. It *handles* page-size requests, but the
comment says that it passes up "all page size and larger requests".
SLOB also swaps around the order of the very-similarly-named
KMALLOC_SHIFT_HIGH and KMALLOC_SHIFT_MAX #defines. Make it
consistent with the order of the other two allocators.
Cc: Matt Mackall <mpm@selenic.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Pull btrfs updates from Chris Mason:
"This is a pretty big pull, and most of these changes have been
floating in btrfs-next for a long time. Filipe's properties work is a
cool building block for inheriting attributes like compression down on
a per inode basis.
Jeff Mahoney kicked in code to export filesystem info into sysfs.
Otherwise, lots of performance improvements, cleanups and bug fixes.
Looks like there are still a few other small pending incrementals, but
I wanted to get the bulk of this in first"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits)
Btrfs: fix spin_unlock in check_ref_cleanup
Btrfs: setup inode location during btrfs_init_inode_locked
Btrfs: don't use ram_bytes for uncompressed inline items
Btrfs: fix btrfs_search_slot_for_read backwards iteration
Btrfs: do not export ulist functions
Btrfs: rework ulist with list+rb_tree
Btrfs: fix memory leaks on walking backrefs failure
Btrfs: fix send file hole detection leading to data corruption
Btrfs: add a reschedule point in btrfs_find_all_roots()
Btrfs: make send's file extent item search more efficient
Btrfs: fix to catch all errors when resolving indirect ref
Btrfs: fix protection between walking backrefs and root deletion
btrfs: fix warning while merging two adjacent extents
Btrfs: fix infinite path build loops in incremental send
btrfs: undo sysfs when open_ctree() fails
Btrfs: fix snprintf usage by send's gen_unique_name
btrfs: fix defrag 32-bit integer overflow
btrfs: sysfs: list the NO_HOLES feature
btrfs: sysfs: don't show reserved incompat feature
btrfs: call permission checks earlier in ioctls and return EPERM
...
Merge misc fixes from Andrew Morton:
"A few hotfixes and various leftovers which were awaiting other merges.
Mainly movement of zram into mm/"
* emailed patches fron Andrew Morton <akpm@linux-foundation.org>: (25 commits)
memcg: fix mutex not unlocked on memcg_create_kmem_cache fail path
Documentation/filesystems/vfs.txt: update file_operations documentation
mm, oom: base root bonus on current usage
mm: don't lose the SOFT_DIRTY flag on mprotect
mm/slub.c: fix page->_count corruption (again)
mm/mempolicy.c: fix mempolicy printing in numa_maps
zram: remove zram->lock in read path and change it with mutex
zram: remove workqueue for freeing removed pending slot
zram: introduce zram->tb_lock
zram: use atomic operation for stat
zram: remove unnecessary free
zram: delay pending free request in read path
zram: fix race between reset and flushing pending work
zsmalloc: add maintainers
zram: add zram maintainers
zsmalloc: add copyright
zram: add copyright
zram: remove old private project comment
zram: promote zram from staging
zsmalloc: move it under mm
...
Pull MIPS updates from Ralf Baechle:
"The most notable new addition inside this pull request is the support
for MIPS's latest and greatest core called "inter/proAptiv". The
patch series describes this core as follows.
"The interAptiv is a power-efficient multi-core microprocessor
for use in system-on-chip (SoC) applications. The interAptiv combines
a multi-threading pipeline with a coherence manager to deliver improved
computational throughput and power efficiency. The interAptiv can
contain one to four MIPS32R3 interAptiv cores, system level
coherence manager with L2 cache, optional coherent I/O port,
and optional floating point unit."
The platform specific patches touch all 3 Broadcom families. It adds
support for the new Broadcom/Netlogix XLP9xx Soc, building a common
BCM63XX SMP kernel for all BCM63XX SoCs regardless of core type/count
and full gpio button/led descriptions for BCM47xx.
The rest of the series are cleanups and bug fixes that are MIPS
generic and consist largely of changes that Imgtec/MIPS had published
in their linux-mti-3.10.git stable tree. Random other cleanups and
patches preparing code to be merged in 3.15"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (139 commits)
mips: select ARCH_MIGHT_HAVE_PC_SERIO
mips: delete non-required instances of include <linux/init.h>
MIPS: KVM: remove shadow_tlb code
MIPS: KVM: use common EHINV aware UNIQUE_ENTRYHI
mips/ide: flush dcache also if icache does not snoop dcache
MIPS: BCM47XX: fix position of cpu_wait disabling
MIPS: BCM63XX: select correct MIPS_L1_CACHE_SHIFT value
MIPS: update MIPS_L1_CACHE_SHIFT based on MIPS_L1_CACHE_SHIFT_<N>
MIPS: introduce MIPS_L1_CACHE_SHIFT_<N>
MIPS: ZBOOT: gather string functions into string.c
arch/mips/pci: don't check resource with devm_ioremap_resource
arch/mips/lantiq/xway: don't check resource with devm_ioremap_resource
bcma: gpio: don't cast u32 to unsigned long
ssb: gpio: add own IRQ domain
MIPS: BCM47XX: fix sparse warnings in board.c
MIPS: BCM47XX: add board detection for Linksys WRT54GS V1
MIPS: BCM47XX: fix detection for some boards
MIPS: BCM47XX: Enable buttons support on SSB
MIPS: BCM47XX: Convert WNDR4500 to new syntax
MIPS: BCM47XX: Use "timer" trigger for status LEDs
...
Pull more powerpc bits from Ben Herrenschmidt:
"Here are a few more powerpc bits for this merge window. The bulk is
made of two pull requests from Scott and Anatolij that I had missed
previously (they arrived while I was away). Since both their branches
are in -next independently, and the content has been around for a
little while, they can still go in.
The rest is mostly bug and regression fixes, a small series of
cleanups to our pseries cpuidle code (including moving it to the right
place), and one new cpuidle bakend for the powernv platform. I also
wired up the new sched_attr syscalls"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (37 commits)
powerpc: Wire up sched_setattr and sched_getattr syscalls
powerpc/hugetlb: Replace __get_cpu_var with get_cpu_var
powerpc: Make sure "cache" directory is removed when offlining cpu
powerpc/mm: Fix mmap errno when MAP_FIXED is set and mapping exceeds the allowed address space
powerpc/powernv/cpuidle: Back-end cpuidle driver for powernv platform.
powerpc/pseries/cpuidle: smt-snooze-delay cleanup.
powerpc/pseries/cpuidle: Remove MAX_IDLE_STATE macro.
powerpc/pseries/cpuidle: Make cpuidle-pseries backend driver a non-module.
powerpc/pseries/cpuidle: Use cpuidle_register() for initialisation.
powerpc/pseries/cpuidle: Move processor_idle.c to drivers/cpuidle.
powerpc: Fix 32-bit frames for signals delivered when transactional
powerpc/iommu: Fix initialisation of DART iommu table
powerpc/numa: Fix decimal permissions
powerpc/mm: Fix compile error of pgtable-ppc64.h
powerpc: Fix hw breakpoints on !HAVE_HW_BREAKPOINT configurations
clk: corenet: Adds the clock binding
powerpc/booke64: Guard e6500 tlb handler with CONFIG_PPC_FSL_BOOK3E
powerpc/512x: dts: add MPC5125 clock specs
powerpc/512x: clk: support MPC5121/5123/5125 SoC variants
powerpc/512x: clk: enforce even SDHC divider values
...
Pull kbuild changes from Michal Marek:
- fix make -s detection with make-4.0
- fix for scripts/setlocalversion when the kernel repository is a
submodule
- do not hardcode ';' in macros that expand to assembler code, as some
architectures' assemblers use a different character for newline
- Fix passing --gdwarf-2 to the assembler
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
frv: Remove redundant debugging info flag
mn10300: Remove redundant debugging info flag
kbuild: Fix debugging info generation for .S files
arch: use ASM_NL instead of ';' for assembler new line character in the macro
kbuild: Fix silent builds with make-4
Fix detectition of kernel git repository in setlocalversion script [take #2]
Add my copyright to the zsmalloc source code which I maintain.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves zsmalloc under mm directory.
Before that, description will explain why we have needed custom
allocator.
Zsmalloc is a new slab-based memory allocator for storing compressed
pages. It is designed for low fragmentation and high allocation success
rate on large object, but <= PAGE_SIZE allocations.
zsmalloc differs from the kernel slab allocator in two primary ways to
achieve these design goals.
zsmalloc never requires high order page allocations to back slabs, or
"size classes" in zsmalloc terms. Instead it allows multiple
single-order pages to be stitched together into a "zspage" which backs
the slab. This allows for higher allocation success rate under memory
pressure.
Also, zsmalloc allows objects to span page boundaries within the zspage.
This allows for lower fragmentation than could be had with the kernel
slab allocator for objects between PAGE_SIZE/2 and PAGE_SIZE. With the
kernel slab allocator, if a page compresses to 60% of it original size,
the memory savings gained through compression is lost in fragmentation
because another object of the same size can't be stored in the leftover
space.
This ability to span pages results in zsmalloc allocations not being
directly addressable by the user. The user is given an
non-dereferencable handle in response to an allocation request. That
handle must be mapped, using zs_map_object(), which returns a pointer to
the mapped region that can be used. The mapping is necessary since the
object data may reside in two different noncontigious pages.
The zsmalloc fulfills the allocation needs for zram perfectly
[sjenning@linux.vnet.ibm.com: borrow Seth's quote]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make smp_call_function_single and friends more efficient by using a
lockless list.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now we have memblock_virt_alloc_low to replace original bootmem api in
swiotlb.
But we should not use BOOTMEM_LOW_LIMIT for arch that does not support
CONFIG_NOBOOTMEM, as old api take 0.
| #define alloc_bootmem_low(x) \
| __alloc_bootmem_low(x, SMP_CACHE_BYTES, 0)
|#define alloc_bootmem_low_pages_nopanic(x) \
| __alloc_bootmem_low_nopanic(x, PAGE_SIZE, 0)
and we have
#define BOOTMEM_LOW_LIMIT __pa(MAX_DMA_ADDRESS)
for CONFIG_NOBOOTMEM.
Restore goal to 0 to fix ia64 crash, that Tony found.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: Tony Luck <tony.luck@gmail.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull block IO driver changes from Jens Axboe:
- bcache update from Kent Overstreet.
- two bcache fixes from Nicholas Swenson.
- cciss pci init error fix from Andrew.
- underflow fix in the parallel IDE pg_write code from Dan Carpenter.
I'm sure the 1 (or 0) users of that are now happy.
- two PCI related fixes for sx8 from Jingoo Han.
- floppy init fix for first block read from Jiri Kosina.
- pktcdvd error return miss fix from Julia Lawall.
- removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from
Michael Opdenacker.
- comment typo fix for the loop driver from Olaf Hering.
- potential oops fix for null_blk from Raghavendra K T.
- two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing
an OOM problem and a problem with handling security locked conditions
* 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits)
mg_disk: Spelling s/finised/finished/
null_blk: Null pointer deference problem in alloc_page_buffers
mtip32xx: Correctly handle security locked condition
mtip32xx: Make SGL container per-command to eliminate high order dma allocation
drivers/block/loop.c: fix comment typo in loop_config_discard
drivers/block/cciss.c:cciss_init_one(): use proper errnos
drivers/block/paride/pg.c: underflow bug in pg_write()
drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
drivers/block/sx8.c: use module_pci_driver()
floppy: bail out in open() if drive is not responding to block0 read
bcache: Fix auxiliary search trees for key size > cacheline size
bcache: Don't return -EINTR when insert finished
bcache: Improve bucket_prio() calculation
bcache: Add bch_bkey_equal_header()
bcache: update bch_bkey_try_merge
bcache: Move insert_fixup() to btree_keys_ops
bcache: Convert sorting to btree_keys
bcache: Convert debug code to btree_keys
bcache: Convert btree_iter to struct btree_keys
bcache: Refactor bset_tree sysfs stats
...