When reading the dm cache metadata from disk, ignore the policy hints
unless they were generated by the same major version number of the same
policy module.
The hints are considered to be private data belonging to the specific
module that generated them and there is no requirement for them to make
sense to different versions of the policy that generated them.
Policy modules are all required to work fine if no previous hints are
supplied (or if existing hints are lost).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Separate dm cache policy version string into 3 unsigned numbers
corresponding to major, minor and patchlevel and store them at the end
of the on-disk metadata so we know which version of the policy generated
the hints in case a future version wants to use them differently.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
We have found a race in the optimisation used in the dm cache
writethrough implementation. Currently, dm core sends the cache target
two bios, one for the origin device and one for the cache device and
these are processed in parallel. This patch avoids the race by
changing the code back to a simpler (slower) implementation which
processes the two writes in series, one after the other, until we can
develop a complete fix for the problem.
When the cache is in writethrough mode it needs to send WRITE bios to
both the origin and cache devices.
Previously we've been implementing this by having dm core query the
cache target on every write to find out how many copies of the bio it
wants. The cache will ask for two bios if the block is in the cache,
and one otherwise.
Then main problem with this is it's racey. At the time this check is
made the bio hasn't yet been submitted and so isn't being taken into
account when quiescing a block for migration (promotion or demotion).
This means a single bio may be submitted when two were needed because
the block has since been promoted to the cache (catastrophic), or two
bios where only one is needed (harmless).
I really don't want to start entering bios into the quiescing system
(deferred_set) in the get_num_write_bios callback. Instead this patch
simplifies things; only one bio is submitted by the core, this is
first written to the origin and then the cache device in series.
Obviously this will have a latency impact.
deferred_writethrough_bios is introduced to record bios that must be
later issued to the cache device from the worker thread. This deferred
submission, after the origin bio completes, is required given that we're
in interrupt context (writethrough_endio).
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
When writing the dirty bitset to the metadata device on a clean
shutdown, clear the dirty bits. Previously they were left indicating
the cache was dirty. This led to confusion about whether there really
was dirty data in the cache or not. (This was a harmless bug.)
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
If the cache policy's config values are not able to be set we must
set the policy to NULL after destroying it in create_cache_policy()
so we don't attempt to destroy it a second time later.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Return error if cache_create() fails.
A missing return check made cache_ctr continue even after an error in
cache_create() resulting in the cache object being destroyed. So a
simple failure like an odd number of cache policy config value arguments
would result in an oops.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Squash various 32bit link errors.
>> on i386:
>> drivers/built-in.o: In function `is_discarded_oblock':
>> dm-cache-target.c:(.text+0x1ea28e): undefined reference to `__udivdi3'
...
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
A deadlock was found in the prefetch code in the dm verity map
function. This patch fixes this by transferring the prefetch
to a worker thread and skipping it completely if kmalloc fails.
If generic_make_request is called recursively, it queues the I/O
request on the current->bio_list without making the I/O request
and returns. The routine making the recursive call cannot wait
for the I/O to complete.
The deadlock occurs when one thread grabs the bufio_client
mutex and waits for an I/O to complete but the I/O is queued
on another thread's current->bio_list and is waiting to get
the mutex held by the first thread.
The fix recognises that prefetching is not essential. If memory
can be allocated, it queues the prefetch request to the worker thread,
but if not, it does nothing.
Signed-off-by: Paul Taysom <taysom@chromium.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: stable@kernel.org
Fix a discard granularity calculation to work for non power of 2 block sizes.
In order for thinp to passdown discard bios to the underlying data
device, the data device must have a discard granularity that is a
factor of the thinp block size. Originally this check was done by
using bitops since the block_size was known to be a power of two.
Introduced by commit f13945d757
("dm thin: support a non power of 2 discard_granularity").
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Fix a bug in dm_btree_remove that could leave leaf values with incorrect
reference counts. The effect of this was that removal of a shared block
could result in the space maps thinking the block was no longer used.
More concretely, if you have a thin device and a snapshot of it, sending
a discard to a shared region of the thin could corrupt the snapshot.
Thinp uses a 2-level nested btree to store it's mappings. This first
level is indexed by thin device, and the second level by logical
block.
Often when we're removing an entry in this mapping tree we need to
rebalance nodes, which can involve shadowing them, possibly creating a
copy if the block is shared. If we do create a copy then children of
that node need to have their reference counts incremented. In this
way reference counts percolate down the tree as shared trees diverge.
The rebalance functions were incrementing the children at the
appropriate time, but they were always assuming the children were
internal nodes. This meant the leaf values (in our case packed
block/flags entries) were not being incremented.
Cc: stable@vger.kernel.org
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
The udc_irq service runs the isr_tr_complete_handler which in turn
"nukes" the endpoints, including a call to rndis_response_complete,
if appropriate. If the rndis_msg_parser fails here, an error will
be printed using a dev_err call (through the ERROR() macro).
However, if the usb cable was just disconnected the device (cdev)
might not be available and will be null. Since the dev_err macro will
dereference the cdev pointer we get a null pointer exception.
Reviewed-by: Radovan Lekanovic <radovan.lekanovic@sonymobile.com>
Signed-off-by: Truls Bengtsson <truls.bengtsson@sonymobile.com>
Signed-off-by: Oskar Andero <oskar.andero@sonymobile.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Building a kernel for imx_v4_v5_defconfig with CONFIG_USB_ULPI disabled, results
in the following error:
arch/arm/mach-imx/built-in.o: In function 'pca100_init':
platform-mx2-emma.c:(.init.text+0x6788): undefined reference to 'otg_ulpi_create'
platform-mx2-emma.c:(.init.text+0x682c): undefined reference to 'mxc_ulpi_access_ops'
Fix this by providing a no-op definition of *otg_ulpi_create for the case when
CONFIG_USB_ULPI is not defined.
Acked-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
This patch fixes an "off-by-one" bug found in
581791f (FunctionFS: enable multiple functions).
During gfs_bind/gfs_unbind the functionfs_bind/functionfs_unbind should be
called for every functionfs instance. With the "i" pre-decremented they
were not called for the zeroth instance.
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: <stable@vger.kernel.org>
[ balbi@ti.com : added offending commit's subject ]
Signed-off-by: Felipe Balbi <balbi@ti.com>
This patch (as1667) removes an incorrect driver->unbind() call from
the net2280 driver. If startup fails, the UDC core takes care of
unbinding the gadget driver automatically; the controller driver
shouldn't do it too.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Felipe Balbi <balbi@ti.com>
This patch (as1666) fixes a regression in the UDC core. The core
takes care of unbinding gadget drivers, and it does the unbinding
before telling the UDC driver to turn off the controller hardware.
When the call to the udc_stop callback is made, the gadget no longer
has a driver. The callback routine should not be invoked with a
pointer to the old driver; doing so can cause problems (such as
use-after-free accesses in net2280).
This patch should be applied, with appropriate context changes, to all
the stable kernels going back to 3.1.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
with the latest udc_start/udc_stop conversion,
too much code was deleted which ended up creating
a regression in net2272 and net2280 drivers.
To fix the regression we revert one hunk of the
original commits.
Signed-off-by: Felipe Balbi <balbi@ti.com>
There is a typo in convert_to_spdif_status() about checking the
emphasis IEC958 status bit. It should check the given value instead
of the resultant value.
Reported-by: Martin Weishart <martin.weishart@telosalliance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In data=journal mode, if we unmount the file system before a
transaction has a chance to complete, when the journal inode is being
evicted, we can end up calling into jbd2_log_wait_commit() for the
last transaction, after the journalling machinery has been shut down.
Arguably we should adjust ext4_should_journal_data() to return FALSE
for the journal inode, but the only place it matters is
ext4_evict_inode(), and so to save a bit of CPU time, and to make the
patch much more obviously correct by inspection(tm), we'll fix it by
explicitly not trying to waiting for a journal commit when we are
evicting the journal inode, since it's guaranteed to never succeed in
this case.
This can be easily replicated via:
mount -t ext4 -o data=journal /dev/vdb /vdb ; umount /vdb
------------[ cut here ]------------
WARNING: at /usr/projects/linux/ext4/fs/jbd2/journal.c:542 __jbd2_log_start_commit+0xba/0xcd()
Hardware name: Bochs
JBD2: bad log_start_commit: 3005630206 3005630206 0 0
Modules linked in:
Pid: 2909, comm: umount Not tainted 3.8.0-rc3 #1020
Call Trace:
[<c015c0ef>] warn_slowpath_common+0x68/0x7d
[<c02b7e7d>] ? __jbd2_log_start_commit+0xba/0xcd
[<c015c177>] warn_slowpath_fmt+0x2b/0x2f
[<c02b7e7d>] __jbd2_log_start_commit+0xba/0xcd
[<c02b8075>] jbd2_log_start_commit+0x24/0x34
[<c0279ed5>] ext4_evict_inode+0x71/0x2e3
[<c021f0ec>] evict+0x94/0x135
[<c021f9aa>] iput+0x10a/0x110
[<c02b7836>] jbd2_journal_destroy+0x190/0x1ce
[<c0175284>] ? bit_waitqueue+0x50/0x50
[<c028d23f>] ext4_put_super+0x52/0x294
[<c020efe3>] generic_shutdown_super+0x48/0xb4
[<c020f071>] kill_block_super+0x22/0x60
[<c020f3e0>] deactivate_locked_super+0x22/0x49
[<c020f5d6>] deactivate_super+0x30/0x33
[<c0222795>] mntput_no_expire+0x107/0x10c
[<c02233a7>] sys_umount+0x2cf/0x2e0
[<c02233ca>] sys_oldumount+0x12/0x14
[<c08096b8>] syscall_call+0x7/0xb
---[ end trace 6a954cc790501c1f ]---
jbd2_log_wait_commit: error: j_commit_request=-1289337090, tid=0
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Commit 84c17543ab (ext4: move work from io_end to inode) triggered a
regression when running xfstest #270 when the file system is mounted
with dioread_nolock.
The problem is that after ext4_evict_inode() calls ext4_ioend_wait(),
this guarantees that last io_end structure has been freed, but it does
not guarantee that the workqueue structure, which was moved into the
inode by commit 84c17543ab, is actually finished. Once
ext4_flush_completed_IO() calls ext4_free_io_end() on CPU #1, this
will allow ext4_ioend_wait() to return on CPU #2, at which point the
evict_inode() codepath can race against the workqueue code on CPU #1
accessing EXT4_I(inode)->i_unwritten_work to find the next item of
work to do.
Fix this by calling cancel_work_sync() in ext4_ioend_wait(), which
will be renamed ext4_ioend_shutdown(), since it is only used by
ext4_evict_inode(). Also, move the call to ext4_ioend_shutdown()
until after truncate_inode_pages() and filemap_write_and_wait() are
called, to make sure all dirty pages have been written back and
flushed from the page cache first.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e
*pdpt = 0000000030bc3001 *pde = 0000000000000000
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in:
Pid: 6, comm: kworker/u:0 Not tainted 3.8.0-rc3-00013-g84c1754-dirty #91 Bochs Bochs
EIP: 0060:[<c01dda6a>] EFLAGS: 00010046 CPU: 0
EIP is at cwq_activate_delayed_work+0x3b/0x7e
EAX: 00000000 EBX: 00000000 ECX: f505fe54 EDX: 00000000
ESI: ed5b697c EDI: 00000006 EBP: f64b7e8c ESP: f64b7e84
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 30bc2000 CR4: 000006f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process kworker/u:0 (pid: 6, ti=f64b6000 task=f64b4160 task.ti=f64b6000)
Stack:
f505fe00 00000006 f64b7e9c c01de3d7 f6435540 00000003 f64b7efc c01def1d
f6435540 00000002 00000000 0000008a c16d0808 c040a10b c16d07d8 c16d08b0
f505fe00 c16d0780 00000000 00000000 ee153df4 c1ce4a30 c17d0e30 00000000
Call Trace:
[<c01de3d7>] cwq_dec_nr_in_flight+0x71/0xfb
[<c01def1d>] process_one_work+0x5d8/0x637
[<c040a10b>] ? ext4_end_bio+0x300/0x300
[<c01e3105>] worker_thread+0x249/0x3ef
[<c01ea317>] kthread+0xd8/0xeb
[<c01e2ebc>] ? manage_workers+0x4bb/0x4bb
[<c023a370>] ? trace_hardirqs_on+0x27/0x37
[<c0f1b4b7>] ret_from_kernel_thread+0x1b/0x28
[<c01ea23f>] ? __init_kthread_worker+0x71/0x71
Code: 01 83 15 ac ff 6c c1 00 31 db 89 c6 8b 00 a8 04 74 12 89 c3 30 db 83 05 b0 ff 6c c1 01 83 15 b4 ff 6c c1 00 89 f0 e8 42 ff ff ff <8b> 13 89 f0 83 05 b8 ff 6c c1
6c c1 00 31 c9 83
EIP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e SS:ESP 0068:f64b7e84
CR2: 0000000000000000
---[ end trace a1923229da53d8a4 ]---
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Creation of individual mixer controls may fail, but that shouldn't cause
the entire mixer creation to fail. Even worse, if the mixer creation
fails, that will error out the entire device probing.
All the functions called by parse_audio_unit() should return -EINVAL if
they find descriptors that are unsupported or believed to be malformed,
so we can safely handle this error code as a non-fatal condition in
snd_usb_mixer_controls().
That fixes a long standing bug which is commonly worked around by
adding quirks which make the driver ignore entire interfaces. Some of
them might now be unnecessary.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Reported-and-tested-by: Rodolfo Thomazelli <pe.soberbo@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
In check_input_term() and parse_audio_feature_unit(), propagate the
error value that has been returned by a failing function instead of
-EINVAL. That helps cleaning up the error pathes in the mixer.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
UAC2_EXTENSION_UNIT_V2 differs from UAC1_EXTENSION_UNIT, but can be handled in
the same way when parsing the unit. Otherwise parse_audio_unit() fails when it
sees an extension unit on a UAC2 device.
UAC2_EXTENSION_UNIT_V2 is outside the range allocated by UAC1.
Signed-off-by: Torstein Hegge <hegge@resisty.net>
Acked-by: Daniel Mack <zonque@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Alex writes:
"Mostly just small bug fixes. Big change is new pci ids
for Richland APUs."
* 'drm-fixes-3.9' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: add Richland pci ids
drm/radeon: add support for Richland APUs
drm/radeon/benchmark: allow same domains for dma copy
drm/radeon/benchmark: make sure bo blit copy exists before using it
drm/radeon: fix backend map setup on 1 RB trinity boards
drm/radeon: fix S/R on VM systems (cayman/TN/SI)
Lots of thermal fixes and fix a lockdep warning we've been seeing.
* 'drm-nouveau-fixes-3.9' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nv50/kms: prevent lockdep false-positive in page flipping path
drm/nouveau/core: fix return value of nouveau_object_del()
drm/nouveau/hwmon: do not expose a buggy temperature if it is unavailable
drm/nouveau/therm: display the availability of the internal sensor
drm/nouveau/therm: disable temperature management if the sensor isn't readable
drm/nouveau/therm: disable auto fan management if temperature is not available
drm/nv40/therm: reserve negative temperatures for errors
drm/nv40/therm: disable temperature reading if the bios misses some parameters
drm/nouveau/therm-ic: the temperature is off by sensor_constant, warn the user
drm/nouveau/therm: remove some confusion introduced by therm_mode
drm/nouveau/therm: do not make assumptions on temperature
drm/nv40/therm: increase the sensor's settling delay to 20ms
drm/nv40/therm: improve selection between the old and the new style
Once instance of this Kconfig macro remained after commit
51acbcec6c ("md: remove
CONFIG_MULTICORE_RAID456"). Remove that one too. And, while we're at it,
also remove it from the defconfig files that carry it.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: NeilBrown <neilb@suse.de>
A number of problems can occur due to races between
resync/recovery and discard.
- if sync_request calls handle_stripe() while a discard is
happening on the stripe, it might call handle_stripe_clean_event
before all of the individual discard requests have completed
(so some devices are still locked, but not all).
Since commit ca64cae960
md/raid5: Make sure we clear R5_Discard when discard is finished.
this will cause R5_Discard to be cleared for the parity device,
so handle_stripe_clean_event() will not be called when the other
devices do become unlocked, so their ->written will not be cleared.
This ultimately leads to a WARN_ON in init_stripe and a lock-up.
- If handle_stripe_clean_event() does clear R5_UPTODATE at an awkward
time for resync, it can lead to s->uptodate being less than disks
in handle_parity_checks5(), which triggers a BUG (because it is
one).
So:
- keep R5_Discard on the parity device until all other devices have
completed their discard request
- make sure we don't try to have a 'discard' and a 'sync' action at
the same time.
This involves a new stripe flag to we know when a 'discard' is
happening, and the use of R5_Overlap on the parity disk so when a
discard is wanted while a sync is active, so we know to wake up
the discard at the appropriate time.
Discard support for RAID5 was added in 3.7, so this is suitable for
any -stable kernel since 3.7.
Cc: stable@vger.kernel.org (v3.7+)
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Tested-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
MD: Prevent sysfs operations on uninitialized kobjects
Device-mapper does not use sysfs; but when device-mapper is leveraging
MD's RAID personalities, MD sometimes attempts to update sysfs. This
patch adds checks for 'mddev-kobj.sd' in sysfs_[un]link_rdev to ensure
it is about to operate on something valid. This patch also checks for
'mddev->kobj.sd' before calling 'sysfs_notify' in 'remove_and_add_spares'.
Although 'sysfs_notify' already makes this check, doing so in
'remove_and_add_spares' prevents an additional mutex operation.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
MD RAID5: Fix kernel oops when RAID4/5/6 is used via device-mapper
Commit a9add5d (v3.8-rc1) added blktrace calls to the RAID4/5/6 driver.
However, when device-mapper is used to create RAID4/5/6 arrays, the
mddev->gendisk and mddev->queue fields are not setup. Therefore, calling
things like trace_block_bio_remap will cause a kernel oops. This patch
conditionalizes those calls on whether the proper fields exist to make
the calls. (Device-mapper will call trace_block_bio_remap on its own.)
This patch is suitable for the 3.8.y stable kernel.
Cc: stable@vger.kernel.org (v3.8+)
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Pull kvm fixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)
KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797)
KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796)
KVM: x86: fix deadlock in clock-in-progress request handling
KVM: allow host header to be included even for !CONFIG_KVM
Since commit 1ed850f356
md/raid5: make sure to_read and to_write never go negative.
It has been possible for handle_stripe_dirtying to be called
when there isn't actually any work to do.
It then calls schedule_reconstruction() which will set R5_LOCKED
on the parity block(s) even when nothing else is happening.
This then causes problems in do_release_stripe().
So add checks to schedule_reconstruction() so that if it doesn't
find anything to do, it just aborts.
This bug was introduced in v3.7, so the patch is suitable
for -stable kernels since then.
Cc: stable@vger.kernel.org (v3.7+)
Reported-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
We can only have one page of data in each sg element, so we can not
cross a page boundary. Fail this case.
The 'while (len > 0 && data_len > 0) {}' loop is not necessary. The loop
can only be executed once.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch bumps the default FILEIO backend FD_MAX_SECTORS value from
1024 -> 2048 in order to allow block_size=512 to handle 1M sized I/Os.
The current default rejects I/Os larger than 512K in sbc_parse_cdb():
[12015.915146] SCSI OP 2ah with too big sectors 1347 exceeds backend
hw_max_sectors: 1024
[12015.977744] SCSI OP 2ah with too big sectors 2048 exceeds backend
hw_max_sectors: 1024
This issue is present in >= v3.5 based kernels, introduced after the
removal of se_task logic.
Reported-by: Viljami Ilola <azmulx@netikka.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Commit 28c70f162 ("drm/i915: use the gmbus irq for waits") switched to
using GMBUS irqs instead of GPIO bit-banging for chipset generations 4
and above.
It turns out though that on many systems this leads to spurious interrupts
being generated, long after the register write to disable the IRQs has been
issued.
Typically this results in the spurious interrupt source getting
disabled:
[ 9.636345] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 9.637915] Pid: 4157, comm: ifup Tainted: GF 3.9.0-rc2-00341-g0863702 #422
[ 9.639484] Call Trace:
[ 9.640731] <IRQ> [<ffffffff8109b40d>] __report_bad_irq+0x1d/0xc7
[ 9.640731] [<ffffffff8109b7db>] note_interrupt+0x15b/0x1e8
[ 9.640731] [<ffffffff810999f7>] handle_irq_event_percpu+0x1bf/0x214
[ 9.640731] [<ffffffff81099a88>] handle_irq_event+0x3c/0x5c
[ 9.640731] [<ffffffff8109c139>] handle_fasteoi_irq+0x7a/0xb0
[ 9.640731] [<ffffffff8100400e>] handle_irq+0x1a/0x24
[ 9.640731] [<ffffffff81003d17>] do_IRQ+0x48/0xaf
[ 9.640731] [<ffffffff8142f1ea>] common_interrupt+0x6a/0x6a
[ 9.640731] <EOI> [<ffffffff8142f952>] ? system_call_fastpath+0x16/0x1b
[ 9.640731] handlers:
[ 9.640731] [<ffffffffa000d771>] usb_hcd_irq [usbcore]
[ 9.640731] [<ffffffffa0306189>] yenta_interrupt [yenta_socket]
[ 9.640731] Disabling IRQ #16
The really curious thing is now that irq 16 is _not_ the interrupt for
the i915 driver when using MSI, but it _is_ the interrupt when not
using MSI. So by all indications it seems like gmbus is able to
generate a legacy (shared) interrupt in MSI mode on some
configurations. I've tried to reproduce this and the differentiating
thing seems to be that on unaffected systems no other device uses irq
16 (which seems to be the non-MSI intel gfx interrupt on all gm45).
I have no idea how that even can happen.
To avoid tempting this elephant into a rage, just disable gmbus
interrupt support on gen 4.
v2: Improve the commit message with exact details of what's going on.
Also add a comment in the code to warn against this particular
elephant in the room.
v3: Move the comment explaing how gen4 blows up next to the definition
of HAS_GMBUS_IRQ to keep the code-flow straight. Suggested by Chris
Wilson.
Signed-off-by: Jiri Kosina <jkosina@suse.cz> (v1)
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://lkml.org/lkml/2013/3/8/325
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- Fix for the return type of xfs_iomap_eof_prealloc_initial_size
from a1e16c2666
- Fix for a failed buffer readahead causing subsequent callers to
fail incorrectly
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRSOIAAAoJENaLyazVq6ZODqQP/2m1iZVIA9CXFf5hS2QZgkc2
MHq+QaQ1aaZlAIRCnZO4XrWoLw4tH7AmsHA7dVJVz/ZhVrJg4ahfdSS6qR5EGWFb
I5uE8LD8ZhpIiW6mBytJ7g9ST6xnaeean2sMwa0BcVK3uF84nO/uBopntZVrVlZE
sMuklZe8GfxDpF6SBxVGG+5+OeLXzFmf+s+xoCYN410uuzYoT8/jveFP6a5ARcmH
xEcOJA2+3o2z4/fsdx/Euf6LnDMSyOsAFUJCtnmBdKUA5w9DrJJqGpDDPEkg9h6d
/DTPYXEWx6+w4xoMnIf09oEdCSamBVTWcRFXtftN03VNrbRNtyVwAc8HUaSNmt0p
I3P/b5NJ5guH7uK72jp61N2RP7D5KOqwkwR58Y1SJWuwcgatYuB3NM5UeUyJBILj
ViZ4DsKGE6BCl8T3hwkN+mxSxB+o7O8AypjWdEviBXbVIG9CwOxr1IEatl3eyV5T
8QsNFb0LJcWzl1+F/uUYe1Goeqxvzupt7omUaRONdMnac3uFIk0ARrdxXFgawIJ9
lgeftBCmMkqqLZUACSfmfCYNwyupz3E6bYB7Azwx01qg7CzTPUfIL2SxqDYp2dup
/s+R7HL4HOJ0FCzjCZxHHO/1jsWgu265dJdpaQw/UcIe2IuEFGr558deHEM62bDW
rWCVHj5eY5NRGyzSwzqB
=41Vk
-----END PGP SIGNATURE-----
Merge tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfs
Pull XFS fixes from Ben Myers:
- Fix for a potential infinite loop which was introduced in commit
4d559a3bcb ("xfs: limit speculative prealloc near ENOSPC
thresholds")
- Fix for the return type of xfs_iomap_eof_prealloc_initial_size from
commit a1e16c2666 ("xfs: limit speculative prealloc size on sparse
files")
- Fix for a failed buffer readahead causing subsequent callers to fail
incorrectly
* tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfs:
xfs: ensure we capture IO errors correctly
xfs: fix xfs_iomap_eof_prealloc_initial_size type
xfs: fix potential infinite loop in xfs_iomap_prealloc_size()
Mantas Mikulėnas reported that his graphics hardware failed to
initialise after commit f9a37be0f0 ("x86: Use PCI setup data").
The aim of this commit was to ensure that ROM images were available on
some Apple systems that don't expose the GPU ROM via any other source.
In this case, UEFI appears to have provided a broken ROM image that we
were using even though there was a perfectly valid ROM available via
other sources. The simplest way to handle this seems to be to just
re-order pci_map_rom() and leave any firmare-supplied ROM to last.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Tested-by: Mantas Mikulėnas <grawity@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull sparc fixes from David Miller:
"Just some minor fixups, a sunsu console setup panic cure, and
recognition of a Fujitsu sun4v cpu."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: remove unused "config BITS"
sparc: delete "if !ULTRA_HAS_POPULATION_COUNT"
sparc64: correctly recognize SPARC64-X chips
sparc,leon: fix GRPCI2 device0 PCI config space access
sunsu: Fix panic in case of nonexistent port at "console=ttySY" cmdline option
- Fix padding computation in struct ucontext (no ABI change).
- Minor clean-up after the signal patches (unused var).
- Two old Kconfig options clean-up.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iQIcBAABAgAGBQJRSKTfAAoJEGvWsS0AyF7xR6IP/0/KsTKWikL5BJb1AIb20OMi
VKnqZYZefzSb/vQf7lx/k6sZ6aQ8y6CxoXMuEV42CVZG3JgDzUERgvX4/3upFTFM
5s5+pDLp5ASE97oDpRV0HkYePM0MwQGnyZjD1MBskxcAheYFnPbALGEnV5wG0J5b
7/FjUmmL5jbQPUhweGh3jHIWOvwNyQfXya+kdKiI/SGHOqqJ5DUY631yiUB5GUEa
KNCCYHCE2OyfcbZTV0oDFjleeokZC0J1fKRph28925k5DOZX/FDDs2C1i8dqL5hV
wHWpVFngtqrgHf/vriXn80vXgLoWvdYBD1tuFpDLyEmSpTdbVyjjZPz9pp6L4shb
oYxcFcPmf5PGH2+cZM2JzZ0dxx0NdnpEJBqdYcsjdwhM3InM0rVAy2mUu1uAEppg
4CQ/8+KZK4RW1UksuxVA+7oE83Q6Q9xGng66Y39J2d7a+GnDDLtdydYf9Z3e/ayF
lXnNsb3Hvh+Wq4/cjjwijPCf4WThlU2k1i+i+nAURsNnoLp4VkbzR/vvvwykeLE5
Wn/zEPUlNRUAN7JuskNx17yMSGpIeWaL46+odX00oDChVTUv/Gvr3ngxetNpvPxU
ErmVU2njxvrCrxquGA5fh4F3YKhhaW6KRvXYce6dB2jgdQyABmSwextt28TZTGtM
nGDTtStktMZEt09WbsjZ
=FN/w
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 fixes from Catalin Marinas:
- Fix !SMP build error.
- Fix padding computation in struct ucontext (no ABI change).
- Minor clean-up after the signal patches (unused var).
- Two old Kconfig options clean-up.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS
arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED
arm64: fix padding computation in struct ucontext
arm64: Fix build error with !SMP
arm64: Removed unused variable in compat_setup_rt_frame()
sparc's asm/module.h got removed in commit
786d35d45c ("Make most arch asm/module.h
files use asm-generic/module.h"). That removed the only two uses of this
Kconfig symbol. So we can remove its entry too.
> >From arch/sparc/Makefile:
> ifeq ($(CONFIG_SPARC32),y)
> [...]
>
> [...]
> export BITS := 32
> [...]
>
> else
> [...]
>
> [...]
> export BITS := 64
> [...]
>
> So $(BITS) is set depending on whether CONFIG_SPARC32 is set or not.
> Using $(BITS) in sparc's Makefiles is not using CONFIG_BITS. That
> doesn't count as usage of "config BITS".
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix ARM BPF JIT handling of negative 'k' values, from Chen Gang.
2) Insufficient space reserved for bridge netlink values, fix from
Stephen Hemminger.
3) Some dst_neigh_lookup*() callers don't interpret error pointer
correctly, fix from Zhouyi Zhou.
4) Fix transport match in SCTP active_path loops, from Xugeng Zhang.
5) Fix qeth driver handling of multi-order SKB frags, from Frank
Blaschka.
6) fec driver is missing napi_disable() call, resulting in crashes on
unload, from Georg Hofmann.
7) Don't try to handle PMTU events on a listening socket, fix from Eric
Dumazet.
8) Fix timestamp location calculations in IP option processing, from
David Ward.
9) FIB_TABLE_HASHSZ setting is not controlled by the correct kconfig
tests, from Denis V Lunev.
10) Fix TX descriptor push handling in SFC driver, from Ben Hutchings.
11) Fix isdn/hisax and tulip/de4x5 kconfig dependencies, from Arnd
Bergmann.
12) bnx2x statistics don't handle 4GB rollover correctly, fix from
Maciej Żenczykowski.
13) Openvswitch bug fixes for vport del/new error reporting, missing
genlmsg_end() call in netlink processing, and mis-parsing of
LLC/SNAP ethernet types. From Rich Lane.
14) SKB pfmemalloc state should only be propagated from the head page of
a compound page, fix from Pavel Emelyanov.
15) Fix link handling in tg3 driver for 5715 chips when autonegotation
is disabled. From Nithin Sujir.
16) Fix inverted test of cpdma_check_free_tx_desc return value in
davinci_emac driver, from Mugunthan V N.
17) vlan_depth is incorrectly calculated in skb_network_protocol(), from
Li RongQing.
18) Fix probing of Gobi 1K devices in qmi_wwan driver, and fix NCM
device mode backwards compat in cdc_ncm driver. From Bjørn Mork.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
inet: limit length of fragment queue hash table bucket lists
qeth: Fix scatter-gather regression
qeth: Fix invalid router settings handling
qeth: delay feature trace
tcp: dont handle MTU reduction on LISTEN socket
bnx2x: fix occasional statistics off-by-4GB error
vhost/net: fix heads usage of ubuf_info
bridge: Add support for setting BR_ROOT_BLOCK flag.
bnx2x: add missing napi deletion in error path
drivers: net: ethernet: ti: davinci_emac: fix usage of cpdma_check_free_tx_desc()
ethernet/tulip: DE4x5 needs VIRT_TO_BUS
isdn: hisax: netjet requires VIRT_TO_BUS
net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility
rtnetlink: Mask the rta_type when range checking
Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally"
Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug
smsc75xx: configuration help incorrectly mentions smsc95xx
net: fec: fix missing napi_disable call
net: fec: restart the FEC when PHY speed changes
skb: Propagate pfmemalloc on skb from head page only
...
Commit 2d78d4beb6 ("[PATCH] bitops:
sparc64: use generic bitops") made the default of GENERIC_HWEIGHT depend
on !ULTRA_HAS_POPULATION_COUNT. But since there's no Kconfig symbol with
that name, this always evaluates to true. Delete this dependency.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the guest specifies a IOAPIC_REG_SELECT with an invalid value and follows
that with a read of the IOAPIC_REG_WINDOW KVM does not properly validate
that request. ioapic_read_indirect contains an
ASSERT(redir_index < IOAPIC_NUM_PINS), but the ASSERT has no effect in
non-debug builds. In recent kernels this allows a guest to cause a kernel
oops by reading invalid memory. In older kernels (pre-3.3) this allows a
guest to read from large ranges of host memory.
Tested: tested against apic unit tests.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
There is a potential use after free issue with the handling of
MSR_KVM_SYSTEM_TIME. If the guest specifies a GPA in a movable or removable
memory such as frame buffers then KVM might continue to write to that
address even after it's removed via KVM_SET_USER_MEMORY_REGION. KVM pins
the page in memory so it's unlikely to cause an issue, but if the user
space component re-purposes the memory previously used for the guest, then
the guest will be able to corrupt that memory.
Tested: Tested against kvmclock unit test
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If the guest sets the GPA of the time_page so that the request to update the
time straddles a page then KVM will write onto an incorrect page. The
write is done byusing kmap atomic to get a pointer to the page for the time
structure and then performing a memcpy to that page starting at an offset
that the guest controls. Well behaved guests always provide a 32-byte aligned
address, however a malicious guest could use this to corrupt host kernel
memory.
Tested: Tested against kvmclock unit test.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The Kconfig entry for DEBUG_ERRORS is a verbatim copy of the former arm
entry for that symbol. It got removed in v2.6.39 because it wasn't
actually used anywhere. There are still no users of DEBUG_ERRORS so
remove this entry too.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
[catalin.marinas@arm.com: removed option from defconfig]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Config option GENERIC_HARDIRQS_NO_DEPRECATED was removed in commit
78c8982564 ("genirq: Remove the now obsolete
config options and select statements"), but the select was accidentally
reintroduced in commit 8c2c3df31e ("arm64:
Build infrastructure").
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch introduces a constant limit of the fragment queue hash
table bucket list lengths. Currently the limit 128 is choosen somewhat
arbitrary and just ensures that we can fill up the fragment cache with
empty packets up to the default ip_frag_high_thresh limits. It should
just protect from list iteration eating considerable amounts of cpu.
If we reach the maximum length in one hash bucket a warning is printed.
This is implemented on the caller side of inet_frag_find to distinguish
between the different users of inet_fragment.c.
I dropped the out of memory warning in the ipv4 fragment lookup path,
because we already get a warning by the slab allocator.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a scatter-gather regression introduced with
commit 5640f768 net: use a per task frag allocator
Now the qeth driver can cope with bigger framents and split a fragment in
sub framents if required.
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Give a bad return code when specifying a router setting that is either
invalid or not support on the respective device type. In addition, fall back
the previous setting instead of silently switching back to 'no routing'.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Delay tracing of the card features until the optional commands have been
enabled.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>