Commit Graph

68424 Commits

Author SHA1 Message Date
Stefan Richter cb7c96da36 firewire: core: optimize Topology Map creation
The Topology Map of the local node was created in CPU byte order,
then a temporary big endian copy was created to compute the CRC,
and when a read request to the Topology Map arrived it had to be
converted to big endian byte order again.

We now generate it in big endian byte order in the first place.
This also rids us of 1000 bytes stack usage in tasklet context.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-10-14 23:10:48 +02:00
Stefan Richter fe242579e9 firewire: core: clarify generate_config_rom usage
Move the static config ROM buffer into the scope of the two callers of
generate_config_rom().  That way the ROM length can be passed over as
return value rather than through a pointer argument.

It also becomes more obvious that accesses to the config ROM buffer have
to be serialized and how this is accomplished.  And firewire-core.ko
shrinks a bit as well.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-10-14 23:10:48 +02:00
Stefan Richter 8e85973efc firewire: optimize config ROM creation
The config ROM image of the local node was created in CPU byte order,
then a temporary big endian copy was created to compute the CRC, and
finally the card driver created its own big endian copy.

We now generate it in big endian byte order in the first place to avoid
one byte order conversion and the temporary on-stack copy of the ROM
image (1000 bytes stack usage in process context).  Furthermore, two
1000 bytes memset()s are replaced by one 1000 bytes - ROM length sized
memset.

The trivial fw_memcpy_{from,to}_be32() helpers are now superfluous and
removed.  The newly added __compute_block_crc() function will be folded
into fw_compute_block_crc() in a subsequent change.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-10-14 23:10:48 +02:00
Stefan Richter e21fcf798e firewire: cdev: normalize variable names
Unify some names:
  - "e" for pointers to subtypes of struct event,
  - "event" for struct members and pointers to struct event,
  - "r" for pointers to subtypes of struct client_resource,
  - "resource" for struct members and pointers to struct client_resource,
  - other names for struct members and pointers to other types.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-10-14 23:10:48 +02:00
Stefan Richter 9fb551bf72 firewire: normalize style of queue_work wrappers
A few stylistic changes to unify some code patterns in the subsystem:

  - The similar queue_delayed_work helpers fw_schedule_bm_work,
    schedule_iso_resource, and sbp2_queue_work now have the same call
    convention.
  - Two conditional calls of schedule_iso_resource are factored into
    another small helper.
  - An sbp2_target_get helper is added as counterpart to
    sbp2_target_put.

Object size of firewire-core is decreased a little bit, object size of
firewire-sbp2 remains unchanged.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-10-14 23:10:48 +02:00
Stefan Richter 7e44c0b56b firewire: cdev: fix memory leak in an error path
If copy_from_user in an FW_CDEV_IOC_SEND_RESPONSE ioctl failed, an
inbound_transaction_resource instance is no longer referenced and needs
to be freed.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-10-14 21:55:19 +02:00
Stefan Richter eaf76e0d02 firewire: sbp2: provide fallback if mgt_ORB_timeout is missing
The Unit_Characteristics entry of an SBP-2 unit directory is not
mandatory as far as I can tell.  If it is missing, we would probably
fail to log in into the target because firewire-sbp2 would not wait for
status after it sent the login request.

The fix moves the cleanup of tgt->mgt_orb_timeout into a place where it
is executed exactly once before login, rather than 0..n times depending
on the target's config ROM.  With targets with one or more
Unit_Characteristics entries, the result is the same as before.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-10-14 21:55:19 +02:00
Stefan Richter 625f0850a8 ieee1394: sbp2: remove a workaround for Momobay FX-3A
The inquiry delay is not necessary anymore in tests on a recent kernel.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:41 +02:00
Stefan Richter 3c5f80357c firewire: sbp2: remove a workaround for Momobay FX-3A
The inquiry delay does more harm than good in tests on a recent kernel.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:40 +02:00
Stefan Richter 094614fc14 firewire: sbp2: fix status reception
Per SBP-2 clause 5.3, a target shall store 8...32 bytes of status
information.  Trailing zeros after the first 8 bytes don't need to be
stored, they are implicit.  Fix the status write handler to clear all
unwritten status data.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:40 +02:00
Stefan Richter 85cb9b6864 firewire: core: fix topology map response handler
This register is 1 kBytes large.  Adjust topology_map.length to prevent
registration of other response handlers in this region and to make sure
that we respond to requests to the upper half of the register.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:40 +02:00
Stefan Richter b171e204b3 firewire: core: fix race with parallel PCI device probe
The config ROM buffer received from generate_config_rom is a globally
shared static buffer.  Extend the card_mutex protection in fw_add_card
until after the config ROM was copied into the card driver's buffer.
Otherwise, parallelized card driver probes may end up with ROM contents
that were meant for a different card.

firewire-ohci's card->driver->enable hook is safe to be called within
the card_mutex.  Furthermore, it is safe to reorder card_list update
versus card enable, which simplifies the code a little.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:40 +02:00
Stefan Richter 18668ff9a3 firewire: core: header file cleanup
fw_card_get, fw_card_put, fw_card_release are currently not exported for
use outside the firewire-core.  Move their definitions/ declarations
from the subsystem header file to the core header file.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:40 +02:00
Stefan Richter 928ec5f148 firewire: ohci: fix Self ID Count register mask (safeguard against buffer overflow)
The selfIDSize field of Self ID Count is 9 bits wide, and we are only
interested in the high 8 bits.  Fix the mask accordingly.  The
previously too large mask didn't do damage though because the next few
bits in the register are reserved and therefore zero with presently
existing hardware.

Also, check for the maximum possible self ID count of 252 (according to
OHCI 1.1 clause 11.2 and IEEE 1394a-2000 clause 4.3.4.1, i.e. up to four
self IDs of up to 63 nodes, even though IEEE 1394 up to edition 2008
defines only up to three self IDs per node).  More than 252 self IDs
would only happen if the self ID receive DMA unit malfunctioned, which
would likely be caught by other self ID buffer checks.  However, check
it early to be sure.  More than 253 quadlets would overflow the Topology
Map CSR.

Reported-By: PaX Team
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:40 +02:00
Michael Buesch 64549e9357 ieee1394: raw1394: Do not leak memory on failed trylock.
Do not leak the allocated memory in case the mutex_trylock() failed
to acquire the lock.

Signed-off-by: Michael Buesch <mb@bu3sch.de>

This bug does not happen in practice:  All raw1394 clients use
libraw1394, and accesses to a libraw1394 handle need to be serialized
by the client.  This is documented in libraw1394's API reference.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-12 14:48:40 +02:00
Ed Cashin 7135a71b19 aoe: allocate unused request_queue for sysfs
Andy Whitcroft reported an oops in aoe triggered by use of an
incorrectly initialised request_queue object:

  [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add
		an uninitialized object, something is seriously wrong.
  [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu
  [ 2645.959107] Call Trace:
  [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70
  [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0
  [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160
  [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe]

The request queue of an aoe device is not used but can be allocated in
code that does not sleep.

Bruno bisected this regression down to

  cd43e26f07

  block: Expose stacked device queues in sysfs

"This seems to generate /sys/block/$device/queue and its contents for
 everyone who is using queues, not just for those queues that have a
 non-NULL queue->request_fn."

Addresses http://bugs.launchpad.net/bugs/410198
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942

Note that embedding a queue inside another object has always been
an illegal construct, since the queues are reference counted and
must persist until the last reference is dropped. So aoe was
always buggy in this respect (Jens).

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Bruno Premont <bonbons@linux-vserver.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-09 14:10:18 +02:00
Linus Torvalds e6890f6f3d i915: disable interrupts before tearing down GEM state
Reinette Chatre reports a frozen system (with blinking keyboard LEDs)
when switching from graphics mode to the text console, or when
suspending (which does the same thing). With netconsole, the oops
turned out to be

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
	IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915]

and it's due to the i915_gem.c code doing drm_irq_uninstall() after
having done i915_gem_idle(). And the i915_gem_idle() path will do

  i915_gem_idle() ->
    i915_gem_cleanup_ringbuffer() ->
      i915_gem_cleanup_hws() ->
        dev_priv->hw_status_page = NULL;

but if an i915 interrupt comes in after this stage, it may want to
access that hw_status_page, and gets the above NULL pointer dereference.

And since the NULL pointer dereference happens from within an interrupt,
and with the screen still in graphics mode, the common end result is
simply a silently hung machine.

Fix it by simply uninstalling the irq handler before idling rather than
after. Fixes

    http://bugzilla.kernel.org/show_bug.cgi?id=13819

Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-08 17:09:24 -07:00
Zhenyu Wang 7c8460db30 drm/i915: fix mask bits setting
eDP is exclusive connector too, and add missing crtc_mask
setting for TV.

This fixes

	http://bugzilla.kernel.org/show_bug.cgi?id=14139

Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reported-and-tested-by: Carlos R. Mafra <crmafra2@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-08 10:16:20 -07:00
Linus Torvalds 3ff323f890 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register.
2009-09-07 11:42:25 -07:00
Linus Torvalds 4886b5b485 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  gianfar: Fix build.
2009-09-07 11:40:24 -07:00
Linus Torvalds cbeb2864b1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
  pcmcia: add CNF-CDROM-ID for ide
2009-09-07 11:40:15 -07:00
Linus Torvalds f69fb9c398 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
  agp/intel: support for new chip variant of IGDNG mobile
  drm/i915: Unref old_obj on get_fence_reg() error path
  drm/i915: increase default latency constant (v2 w/comment)
2009-09-07 11:38:30 -07:00
Dave Airlie a54775c875 drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register.
This adds some rv350+ register for LTE/GTE discard,
and enables the rv515 two sided stencil register.
It also disables the DEPTHXY_OFFSET register which
can be used to workaround the CS checker.
Moves rs690 to proper place in rs600 and uses correct
table on rs600.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-09-07 15:26:19 +10:00
David S. Miller d9d8e0418f gianfar: Fix build.
Reported by Michael Guntsche <mike@it-loops.com>

--------------------
Commit
38bddf04bc gianfar: gfar_remove needs to call unregister_netdev()

breaks the build of the gianfar driver because "dev" is undefined in
this function. To quickly test rc9 I changed this to priv->ndev but I do
not know if this is the correct one.
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-06 01:41:24 -07:00
Linus Torvalds f815c335d2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: sbp2: fix freeing of unallocated memory
  firewire: ohci: fix Ricoh R5C832, video reception
  firewire: ohci: fix Agere FW643 and multiple cameras
  firewire: core: fix crash in iso resource management
2009-09-05 14:59:00 -07:00
Linus Torvalds 5136a6c0fd Merge git://git.infradead.org/~dwmw2/mtd-2.6.31
* git://git.infradead.org/~dwmw2/mtd-2.6.31:
  JFFS2: add missing verify buffer allocation/deallocation
  mtd: nftl: fix offset alignments
  mtd: nftl: write support is broken
  mtd: m25p80: fix null pointer dereference bug
2009-09-05 14:57:04 -07:00
Linus Torvalds 59430c2f43 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  tc: Fix unitialized kernel memory leak
  pkt_sched: Revert tasklet_hrtimer changes.
  net: sk_free() should be allowed right after sk_alloc()
  gianfar: gfar_remove needs to call unregister_netdev()
  ipw2200: firmware DMA loading rework
2009-09-05 14:52:41 -07:00
Linus Torvalds 3bb314f01c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Re-enable cpufreq suspend and resume code
2009-09-05 14:51:24 -07:00
Linus Torvalds 154f807e55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm snapshot: fix on disk chunk size validation
  dm exception store: split set_chunk_size
  dm snapshot: fix header corruption race on invalidation
  dm snapshot: refactor zero_disk_area to use chunk_io
  dm log: userspace add luid to distinguish between concurrent log instances
  dm raid1: do not allow log_failure variable to unset after being set
  dm log: remove incorrect field from userspace table output
  dm log: fix userspace status output
  dm stripe: expose correct io hints
  dm table: add more context to terse warning messages
  dm table: fix queue_limit checking device iterator
  dm snapshot: implement iterate devices
  dm multipath: fix oops when request based io fails when no paths
2009-09-05 13:51:07 -07:00
Linus Torvalds 9b6a3df372 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI SR-IOV: correct broken resource alignment calculations
2009-09-05 13:50:46 -07:00
Linus Torvalds 6399534472 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: atkbd - add Compaq Presario R4000-series repeat quirk
  Input: i8042 - add Acer Aspire 5536 to the nomux list
2009-09-05 13:41:29 -07:00
Linus Torvalds ac89a9174d pty: don't limit the writes to 'pty_space()' inside 'pty_write()'
The whole write-room thing is something that is up to the _caller_ to
worry about, not the pty layer itself.  The total buffer space will
still be limited by the buffering routines themselves, so there is no
advantage or need in having pty_write() artificially limit the size
somehow.

And what happened was that the caller (the n_tty line discipline, in
this case) may have verified that there is room for 2 bytes to be
written (for NL -> CRNL expansion), and it used to then do those writes
as two single-byte writes.  And if the first byte written (CR) then
caused a new tty buffer to be allocated, pty_space() may have returned
zero when trying to write the second byte (LF), and then incorrectly
failed the write - leading to a lost newline character.

This should finally fix

	http://bugzilla.kernel.org/show_bug.cgi?id=14015

Reported-by: Mikael Pettersson <mikpe@it.uu.se>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-05 13:27:10 -07:00
Linus Torvalds 37f81fa1f6 n_tty: do O_ONLCR translation as a single write
When translating CR to CRNL in the n_tty line discipline, we did it as
two tty_put_char() calls.  Which works, but is stupid, and has caused
problems before too with bad interactions with the write_room() logic.
The generic USB serial driver had that problem, for example.

Now the pty layer had similar issues after being moved to the generic
tty buffering code (in commit d945cb9cce20ac7143c2de8d88b187f62db99bdc:
"pty: Rework the pty layer to use the normal buffering logic").

So stop doing the silly separate two writes, and do it as a single write
instead.  That's what the n_tty layer already does for the space
expansion of tabs (XTABS), and it means that we'll now always have just
a single write for the CRNL to match the single 'tty_write_room()' test,
which hopefully means that the next time somebody screws up buffering,
it won't cause weeks of debugging.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-05 12:46:07 -07:00
Stefan Richter baed6b82d9 firewire: sbp2: fix freeing of unallocated memory
If a target writes invalid status (typically status of a command that
already timed out), firewire-sbp2 attempts to put away an ORB that
doesn't exist.  https://bugzilla.redhat.com/show_bug.cgi?id=519772

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-05 15:59:34 +02:00
Stefan Richter 4fe0badd58 firewire: ohci: fix Ricoh R5C832, video reception
In dual-buffer DMA mode, no video frames are ever received from R5C832
by libdc1394.  Fallback to packet-per-buffer DMA works reliably.
http://thread.gmane.org/gmane.linux.kernel.firewire.devel/13393/focus=13476

Reported-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-05 15:59:34 +02:00
Stefan Richter fc383796a8 firewire: ohci: fix Agere FW643 and multiple cameras
An Agere FW643 OHCI 1.1 card works fine for video reception from one
camera but fails early if receiving from two cameras.  After a short
while, no IR IRQ events occur and the context control register does not
react anymore.  This happens regardless whether both IR DMA contexts are
dual-buffer or one is dual-buffer and the other packet-per-buffer.

This can be worked around by disabling dual buffer DMA mode entirely.
http://sourceforge.net/mailarchive/message.php?msg_name=4A7C0594.2020208%40gmail.com
(Reported by Samuel Audet.)

In another report (by Jonathan Cameron), an FW643 works OK with two
cameras in dual buffer mode.  Whether this is due to different chip
revisions or different usage patterns (different video formats) is not
yet clear.  However, as far as the current capabilities of
firewire-core's isochronous I/O interface are concerned, simply
switching off dual-buffer on non-working and working FW643s alike is not
a problem in practice.  We only need to revisit this issue if we are
going to enhance the interface, e.g. so that applications can explicitly
choose modes.

Reported-by: Samuel Audet <samuel.audet@gmail.com>
Reported-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-05 15:59:34 +02:00
Stefan Richter 1821bc19d5 firewire: core: fix crash in iso resource management
This fixes a regression due to post 2.6.30 commit "firewire: core: do
not DMA-map stack addresses" 6fdc037094.

As David Moore noted, a previously correct sizeof() expression became
wrong since the commit changed its argument from an array to a pointer.
This resulted in an oops in ohci_cancel_packet in the shared workqueue
thread's context when an isochronous resource was to be freed.

Reported-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2009-09-05 15:59:34 +02:00
Mikulas Patocka ae0b7448e9 dm snapshot: fix on disk chunk size validation
Fix some problems seen in the chunk size processing when activating a
pre-existing snapshot.

For a new snapshot, the chunk size can either be supplied by the creator
or a default value can be used.  For an existing snapshot, the
chunk size in the snapshot header on disk should always be used.

If someone attempts to load an existing snapshot and has the 'default
chunk size' option set, the kernel uses its default value even when it
is incorrect for the snapshot being loaded.  This patch ensures the
correct on-disk value is always used.

Secondly, when the code does use the chunk size stored on the disk it is
prudent to revalidate it, so the code can exit cleanly if it got
corrupted as happened in
https://bugzilla.redhat.com/show_bug.cgi?id=461506 .

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:43 +01:00
Mikulas Patocka 2defcc3fb4 dm exception store: split set_chunk_size
Break the function set_chunk_size to two functions in preparation for
the fix in the following patch.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:41 +01:00
Mikulas Patocka 61578dcd3f dm snapshot: fix header corruption race on invalidation
If a persistent snapshot fills up, a race can corrupt the on-disk header
which causes a crash on any future attempt to activate the snapshot
(typically while booting).  This patch fixes the race.

When the snapshot overflows, __invalidate_snapshot is called, which calls
snapshot store method drop_snapshot. It goes to persistent_drop_snapshot that
calls write_header. write_header constructs the new header in the "area"
location.

Concurrently, an existing kcopyd job may finish, call copy_callback
and commit_exception method, that goes to persistent_commit_exception.
persistent_commit_exception doesn't do locking, relying on the fact that
callbacks are single-threaded, but it can race with snapshot invalidation and
overwrite the header that is just being written while the snapshot is being
invalidated.

The result of this race is a corrupted header being written that can
lead to a crash on further reactivation (if chunk_size is zero in the
corrupted header).

The fix is to use separate memory areas for each.

See the bug: https://bugzilla.redhat.com/show_bug.cgi?id=461506

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:39 +01:00
Mikulas Patocka 02d2fd31de dm snapshot: refactor zero_disk_area to use chunk_io
Refactor chunk_io to prepare for the fix in the following patch.

Pass an area pointer to chunk_io and simplify zero_disk_area to use
chunk_io.  No functional change.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:37 +01:00
Jonathan Brassow 7ec23d5094 dm log: userspace add luid to distinguish between concurrent log instances
Device-mapper userspace logs (like the clustered log) are
identified by a universally unique identifier (UUID).  This
identifier is used to associate requests from the kernel to
a specific log in userspace.  The UUID must be unique everywhere,
since multiple machines may use this identifier when communicating
about a particular log, as is the case for cluster logs.

Sometimes, device-mapper/LVM may re-use a UUID.  This is the
case during pvmoves, when moving from one segment of an LV
to another, or when resizing a mirror, etc.  In these cases,
a new log is created with the same UUID and loaded in the
"inactive" slot.  When a device-mapper "resume" is issued,
the "live" table is deactivated and the new "inactive" table
becomes "live".  (The "inactive" table can also be removed
via a device-mapper 'clear' command.)

The above two issues were colliding.  More than one log was being
created with the same UUID, and there was no way to distinguish
between them.  So, sometimes the wrong log would be swapped
out during the exchange.

The solution is to create a locally unique identifier,
'luid', to go along with the UUID.  This new identifier is used
to determine exactly which log is being referenced by the kernel
when the log exchange is made.  The identifier is not
universally safe, but it does not need to be, since
create/destroy/suspend/resume operations are bound to a specific
machine; and these are the operations that make up the exchange.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:34 +01:00
Jonathan Brassow d2b698644c dm raid1: do not allow log_failure variable to unset after being set
This patch fixes a bug which was triggering a case where the primary leg
could not be changed on failure even when the mirror was in-sync.

The case involves the failure of the primary device along with
the transient failure of the log device.  The problem is that
bios can be put on the 'failures' list (due to log failure)
before 'fail_mirror' is called due to the primary device failure.
Normally, this is fine, but if the log device failure is transient,
a subsequent iteration of the work thread, 'do_mirror', will
reset 'log_failure'.  The 'do_failures' function then resets
the 'in_sync' variable when processing bios on the failures list.
The 'in_sync' variable is what is used to determine if the
primary device can be switched in the event of a failure.  Since
this has been reset, the primary device is incorrectly assumed
to be not switchable.

The case has been seen in the cluster mirror context, where one
machine realizes the log device is dead before the other machines.
As the responsibilities of the server migrate from one node to
another (because the mirror is being reconfigured due to the failure),
the new server may think for a moment that the log device is fine -
thus resetting the 'log_failure' variable.

In any case, it is inappropiate for us to reset the 'log_failure'
variable.  The above bug simply illustrates that it can actually
hurt us.

Cc: stable@kernel.org
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:32 +01:00
Jonathan Brassow b8313b6da7 dm log: remove incorrect field from userspace table output
The output of 'dmsetup table' includes an internal field that should not
be there.  This patch removes it.  To make the fix simpler, we first
reorder a constructor argument

The 'device size' argument is generated internally.  Currently it is
placed as the last space-separated word of the constructor string.
However, we need to use a version of the string without this word, so we
move it to the beginning instead so it is trivial to skip past it.

We keep a copy of the arguments passed to userspace for creating a log,
just in case we need to resend them.  These are the same arguments that
are desired in the STATUSTYPE_TABLE request, except for one.  When
creating the userspace log, the userspace daemon must know the size of
the mirror, so that is added to the arguments given in the constructor
table.  We were printing this extra argument out as well, which is a
mistake.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:30 +01:00
Jonathan Brassow 4142a96917 dm log: fix userspace status output
Fix 'dmsetup table' output.

There is a missing ' ' at the end of the string causing two
words to run together.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:28 +01:00
Mike Snitzer 40bea43127 dm stripe: expose correct io hints
Set sensible I/O hints for striped DM devices in the topology
infrastructure added for 2.6.31 for userspace tools to
obtain via sysfs.

Add .io_hints to 'struct target_type' to allow the I/O hints portion
(io_min and io_opt) of the 'struct queue_limits' to be set by each
target and implement this for dm-stripe.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:25 +01:00
Mike Snitzer a963a95622 dm table: add more context to terse warning messages
A couple of recent warning messages make it difficult for the reader to
determine exactly what is wrong.  This patch adds more information to
those messages.

The messages were added by these commits:
  5dea271b6d ("dm table: pass correct dev area size
to device_area_is_valid")
  ea9df47cc9 ("dm table: fix blk_stack_limits arg
to use bytes not sectors")

The patch also corrects references to logical_block_size in printk format
strings from %hu to %u.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:24 +01:00
Mikulas Patocka f6a1ed1086 dm table: fix queue_limit checking device iterator
The logic to check for valid device areas is inverted relative to proper
use with iterate_devices.

The iterate_devices method calls its callback for every underlying
device in the target.  If any callback returns non-zero, iterate_devices
exits immediately.  But the callback device_area_is_valid() returns 0 on
error and 1 on success.  The overall effect without is that an error is
issued only if every device is invalid.

This patch renames device_area_is_valid to device_area_is_invalid and
inverts the logic so that one invalid device is sufficient to raise
an error.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:22 +01:00
Mike Snitzer 8811f46c1f dm snapshot: implement iterate devices
Implement the .iterate_devices for the origin and snapshot targets.
dm-snapshot's lack of .iterate_devices resulted in the inability to
properly establish queue_limits for both targets.

With 4K sector drives: an unfortunate side-effect of not establishing
proper limits in either targets' DM device was that IO to the devices
would fail even though both had been created without error.

Commit af4874e03e ("dm target:s introduce
iterate devices fn") in 2.6.31-rc1 should have implemented .iterate_devices
for dm-snap.c's origin and snapshot targets.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:19 +01:00
Kiyoshi Ueda a77e28c7e1 dm multipath: fix oops when request based io fails when no paths
The patch posted at http://marc.info/?l=dm-devel&m=124539787228784&w=2
which was merged into cec47e3d4a ("dm:
prepare for request based option") introduced a regression in
request-based dm.

If map_request() calls dm_kill_unmapped_request() to complete a cloned
bio without dispatching it, clone->bio is still set when
dm_end_request() is called and the BUG_ON(clone->bio) is incorrect.

The patch fixes this bug by freeing bio in dm_end_request() if the clone
has bio.  I've redone my tests to cover all I/O paths and confirmed
there's no other regression.

Here is the oops I hit in request-based dm when I do I/O to a multipath
device which doesn't have any active path nor queue_if_no_path setting:

------------[ cut here ]------------
kernel BUG at /root/2.6.31-rc4.rqdm/drivers/md/dm.c:828!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 1
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq dm_mirror dm_region_hash dm_log dm_service_time dm_multipath scsi_dh dm_mod video output sbs sbshc battery ac sg sr_mod e1000e button cdrom serio_raw rtc_cmos rtc_core rtc_lib piix lpfc scsi_transport_fc ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 7, comm: ksoftirqd/1 Not tainted 2.6.31-rc4.rqdm #1 Express5800/120Lj [N8100-1417]
RIP: 0010:[<ffffffffa023629d>]  [<ffffffffa023629d>] dm_softirq_done+0xbd/0x100 [dm_mod]
RSP: 0018:ffff8800280a1f08  EFLAGS: 00010282
RAX: ffffffffa02544e0 RBX: ffff8802aa1111d0 RCX: ffff8802aa1111e0
RDX: ffff8802ab913e70 RSI: 0000000000000000 RDI: ffff8802ab913e70
RBP: ffff8800280a1f28 R08: ffffc90005457040 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000fffffffb
R13: ffff8802ab913e88 R14: ffff8802ab9c1438 R15: 0000000000000100
FS:  0000000000000000(0000) GS:ffff88002809e000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000003d54a98640 CR3: 000000029f0a1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ksoftirqd/1 (pid: 7, threadinfo ffff8802ae50e000, task ffff8802ae4f8040)
Stack:
 ffff8800280a1f38 0000000000000020 ffffffff814f30a0 0000000000000004
<0> ffff8800280a1f58 ffffffff8116b245 ffff8800280a1f38 ffff8800280a1f38
<0> ffff8800280a1f58 0000000000000001 ffff8800280a1fa8 ffffffff810477bc
Call Trace:
 <IRQ>
 [<ffffffff8116b245>] blk_done_softirq+0x75/0x90
 [<ffffffff810477bc>] __do_softirq+0xcc/0x210
 [<ffffffff81047170>] ? ksoftirqd+0x0/0x110
 [<ffffffff8100ce7c>] call_softirq+0x1c/0x50
 <EOI>
 [<ffffffff8100e785>] do_softirq+0x65/0xa0
 [<ffffffff81047170>] ? ksoftirqd+0x0/0x110
 [<ffffffff810471e0>] ksoftirqd+0x70/0x110
 [<ffffffff81059559>] kthread+0x99/0xb0
 [<ffffffff8100cd7a>] child_rip+0xa/0x20
 [<ffffffff8100c73c>] ? restore_args+0x0/0x30
 [<ffffffff810594c0>] ? kthread+0x0/0xb0
 [<ffffffff8100cd70>] ? child_rip+0x0/0x20
Code: 44 89 e6 48 89 df e8 23 fb f2 e0 be 01 00 00 00 4c 89 f7 e8 f6 fd ff ff 5b 41 5c 41 5d 41 5e c9 c3 4c 89 ef e8 85 fe ff ff eb ed <0f> 0b eb fe 41 8b 85 dc 00 00 00 48 83 bb 10 01 00 00 00 89 83
RIP  [<ffffffffa023629d>] dm_softirq_done+0xbd/0x100 [dm_mod]
 RSP <ffff8800280a1f08>
---[ end trace 16af0a1d8542da55 ]---

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:16 +01:00