Commit Graph

151 Commits

Author SHA1 Message Date
Linus Torvalds 643ad15d47 Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 protection key support from Ingo Molnar:
 "This tree adds support for a new memory protection hardware feature
  that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

  There's a background article at LWN.net:

      https://lwn.net/Articles/643797/

  The gist is that protection keys allow the encoding of
  user-controllable permission masks in the pte.  So instead of having a
  fixed protection mask in the pte (which needs a system call to change
  and works on a per page basis), the user can map a (handful of)
  protection mask variants and can change the masks runtime relatively
  cheaply, without having to change every single page in the affected
  virtual memory range.

  This allows the dynamic switching of the protection bits of large
  amounts of virtual memory, via user-space instructions.  It also
  allows more precise control of MMU permission bits: for example the
  executable bit is separate from the read bit (see more about that
  below).

  This tree adds the MM infrastructure and low level x86 glue needed for
  that, plus it adds a high level API to make use of protection keys -
  if a user-space application calls:

        mmap(..., PROT_EXEC);

  or

        mprotect(ptr, sz, PROT_EXEC);

  (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
  this special case, and will set a special protection key on this
  memory range.  It also sets the appropriate bits in the Protection
  Keys User Rights (PKRU) register so that the memory becomes unreadable
  and unwritable.

  So using protection keys the kernel is able to implement 'true'
  PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
  PROT_READ as well.  Unreadable executable mappings have security
  advantages: they cannot be read via information leaks to figure out
  ASLR details, nor can they be scanned for ROP gadgets - and they
  cannot be used by exploits for data purposes either.

  We know about no user-space code that relies on pure PROT_EXEC
  mappings today, but binary loaders could start making use of this new
  feature to map binaries and libraries in a more secure fashion.

  There is other pending pkeys work that offers more high level system
  call APIs to manage protection keys - but those are not part of this
  pull request.

  Right now there's a Kconfig that controls this feature
  (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
  (like most x86 CPU feature enablement code that has no runtime
  overhead), but it's not user-configurable at the moment.  If there's
  any serious problem with this then we can make it configurable and/or
  flip the default"

* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
  mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
  x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
  mm/core, x86/mm/pkeys: Add execute-only protection keys support
  x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
  x86/mm/pkeys: Allow kernel to modify user pkey rights register
  x86/fpu: Allow setting of XSAVE state
  x86/mm: Factor out LDT init from context init
  mm/core, x86/mm/pkeys: Add arch_validate_pkey()
  mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
  x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
  x86/mm/pkeys: Add Kconfig prompt to existing config option
  x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
  x86/mm/pkeys: Dump PKRU with other kernel registers
  mm/core, x86/mm/pkeys: Differentiate instruction fetches
  x86/mm/pkeys: Optimize fault handling in access_error()
  mm/core: Do not enforce PKEY permissions on remote mm access
  um, pkeys: Add UML arch_*_access_permitted() methods
  mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
  x86/mm/gup: Simplify get_user_pages() PTE bit handling
  ...
2016-03-20 19:08:56 -07:00
Kai Makisara 8038e6456a st: Fix MTMKPART to work with newer drives
Change the MTMKPART operation of the MTIOCTOP ioctl so that it works
also with current drives (LTO-5/6, etc.). Send a separate FORMAT MEDIUM
command if the partition mode page indicates that this is required. Use
LOAD to position the tape at the beginning of tape.

The operation is extended so that if the argument is negative, its
absolute value specifies the size of partition 0, which is the
physically first partition of the tape.

Signed-off-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Tested-by: Shane M Seymour <shane.seymour@hpe.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-02-23 21:27:02 -05:00
Dave Hansen d4edcf0d56 mm/gup: Switch all callers of get_user_pages() to not pass tsk/mm
We will soon modify the vanilla get_user_pages() so it can no
longer be used on mm/tasks other than 'current/current->mm',
which is by far the most common way it is called.  For now,
we allow the old-style calls, but warn when they are used.
(implemented in previous patch)

This patch switches all callers of:

	get_user_pages()
	get_user_pages_unlocked()
	get_user_pages_locked()

to stop passing tsk/mm so they will no longer see the warnings.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: jack@suse.cz
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160212210156.113E9407@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-16 10:11:12 +01:00
James Bottomley be9e2f775f Merge branch 'mkp-fixes' into fixes 2015-12-03 09:32:33 -08:00
Maurizio Lombardi ab08ee1439 st: fix potential null pointer dereference.
If cdev_add() returns an error, the code calls
cdev_del() passing the STm->cdevs[rew] pointer as parameter;
the problem is that the pointer has not been initialized yet.

This patch fixes the problem by moving the STm->cdevs[rew] pointer
initialization before the call to cdev_add().
It also sets STm->devs[rew] and STm->cdevs[rew] to NULL in
case of failure.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2015-11-19 12:12:42 -05:00
Seymour, Shane M d9b43a10f0 st: allow debug output to be enabled or disabled via sysfs
Change st driver to allow enabling or disabling debug output
via sysfs file /sys/bus/scsi/drivers/st/debug_flag.

Previously the only way to enable debug output was:

1. loading the driver with the module parameter debug_flag=1
2. an ioctl call (this method was also the only way to dynamically
disable debug output).

To use the ioctl you need a second tape drive (if you are
actively testing the first tape drive) since a second process
cannot open the first tape drive if it is in use.

The this change is only functional if the value of the macro
DEBUG in st.c is a non-zero value (which it is by default).

Signed-off-by: Shane Seymour <shane.seymour@hpe.com>
Reviewed-by: Laurence Oberman <oberman.l@gmail.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-11-09 17:17:27 -08:00
Johannes Thumshirn 2d3a5d215e st: Destroy st_index_idr on module exit
Destroy st_index_idr on module exit, reclaiming the allocated memory.

This was detected by the following semantic patch (written by Luis Rodriguez
<mcgrof@suse.com>)
<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@

module_init(init);

@ defines_module_exit @
identifier exit;
@@

module_exit(exit);

@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@

DEFINE_IDR(idr);

@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 idr_destroy(&idr);
 ...
}

@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 +idr_destroy(&idr);
}
</SmPL>

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-26 07:45:25 -07:00
Seymour, Shane M 10978e48cc st: convert DRIVER_ATTR macros to DRIVER_ATTR_RO
Convert DRIVER_ATTR macros to DRIVER_ATTR_RO requested by
Greg KH. Also switched to using scnprintf instead of snprintf
per Documentation/filesystems/sysfs.txt.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shane Seymour <shane.seymour@hp.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-12 15:54:24 -07:00
Seymour, Shane M 442d75628a st: convert to using driver attr groups for sysfs
This patch changes the st driver to use attribute groups so
driver sysfs files are created automatically. See the
following for reference:

http://kroah.com/log/blog/2013/06/26/how-to-create-a-sysfs-file-correctly/

Signed-off-by: Shane Seymour <shane.seymour@hp.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-08-12 11:52:05 -07:00
Seymour, Shane M e7ac6c6666 st: null pointer dereference panic caused by use after kref_put by st_open
Two SLES11 SP3 servers encountered similar crashes simultaneously
following some kind of SAN/tape target issue:

...
qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
qla2xxx [0000:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2.
qla2xxx [0000:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2.
qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [0000:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2.
qla2xxx [0000:81:00.0]-00af:3: Performing ISP error recovery - ha=ffff88bf04d18000.
 rport-3:0-0: blocked FC remote port time out: removing target and saving binding
qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [0000:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2.
 rport-2:0-0: blocked FC remote port time out: removing target and saving binding
sg_rq_end_io: device detached
BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
IP: [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP
CPU 0
...
Supported: No, Proprietary modules are loaded [1739975.390463]
Pid: 27965, comm: ABCD Tainted: PF           X 3.0.101-0.29-default #1 HP ProLiant DL580 Gen8
RIP: 0010:[<ffffffff8133b268>]  [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
RSP: 0018:ffff8839dc1e7c68  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff883f0592fc00 RCX: 0000000000000090
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000138
RBP: 0000000000000138 R08: 0000000000000010 R09: ffffffff81bd39d0
R10: 00000000000009c0 R11: ffffffff81025790 R12: 0000000000000001
R13: ffff883022212b80 R14: 0000000000000004 R15: ffff883022212b80
FS:  00007f8e54560720(0000) GS:ffff88407f800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000002a8 CR3: 0000007e6ced6000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ABCD (pid: 27965, threadinfo ffff8839dc1e6000, task ffff883592e0c640)
Stack:
 ffff883f0592fc00 00000000fffffffa 0000000000000001 ffff883022212b80
 ffff883eff772400 ffffffffa03fa309 0000000000000000 0000000000000000
 ffffffffa04003a0 ffff883f063196c0 ffff887f0379a930 ffffffff8115ea1e
Call Trace:
 [<ffffffffa03fa309>] st_open+0x129/0x240 [st]
 [<ffffffff8115ea1e>] chrdev_open+0x13e/0x200
 [<ffffffff811588a8>] __dentry_open+0x198/0x310
 [<ffffffff81167d74>] do_last+0x1f4/0x800
 [<ffffffff81168fe9>] path_openat+0xd9/0x420
 [<ffffffff8116946c>] do_filp_open+0x4c/0xc0
 [<ffffffff8115a00f>] do_sys_open+0x17f/0x250
 [<ffffffff81468d92>] system_call_fastpath+0x16/0x1b
 [<00007f8e4f617fd0>] 0x7f8e4f617fcf
Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 <f0> ff 8f 70 01 00 00 0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0
RIP  [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
 RSP <ffff8839dc1e7c68>
CR2: 00000000000002a8

Analysis reveals the cause of the crash to be due to STp->device
being NULL. The pointer was NULLed via scsi_tape_put(STp) when it
calls scsi_tape_release(). In st_open() we jump to err_out after
scsi_block_when_processing_errors() completes and returns the
device as offline (sdev_state was SDEV_DEL):

1180 /* Open the device. Needs to take the BKL only because of incrementing the SCSI host
1181    module count. */
1182 static int st_open(struct inode *inode, struct file *filp)
1183 {
1184         int i, retval = (-EIO);
1185         int resumed = 0;
1186         struct scsi_tape *STp;
1187         struct st_partstat *STps;
1188         int dev = TAPE_NR(inode);
1189         char *name;
...
1217         if (scsi_autopm_get_device(STp->device) < 0) {
1218                 retval = -EIO;
1219                 goto err_out;
1220         }
1221         resumed = 1;
1222         if (!scsi_block_when_processing_errors(STp->device)) {
1223                 retval = (-ENXIO);
1224                 goto err_out;
1225         }
...
1264  err_out:
1265         normalize_buffer(STp->buffer);
1266         spin_lock(&st_use_lock);
1267         STp->in_use = 0;
1268         spin_unlock(&st_use_lock);
1269         scsi_tape_put(STp); <-- STp->device = 0 after this
1270         if (resumed)
1271                 scsi_autopm_put_device(STp->device);
1272         return retval;

The ref count for the struct scsi_tape had already been reduced
to 1 when the .remove method of the st module had been called.
The kref_put() in scsi_tape_put() caused scsi_tape_release()
to be called:

0266 static void scsi_tape_put(struct scsi_tape *STp)
0267 {
0268         struct scsi_device *sdev = STp->device;
0269
0270         mutex_lock(&st_ref_mutex);
0271         kref_put(&STp->kref, scsi_tape_release); <-- calls this
0272         scsi_device_put(sdev);
0273         mutex_unlock(&st_ref_mutex);
0274 }

In scsi_tape_release() the struct scsi_device in the struct
scsi_tape gets set to NULL:

4273 static void scsi_tape_release(struct kref *kref)
4274 {
4275         struct scsi_tape *tpnt = to_scsi_tape(kref);
4276         struct gendisk *disk = tpnt->disk;
4277
4278         tpnt->device = NULL; <<<---- where the dev is nulled
4279
4280         if (tpnt->buffer) {
4281                 normalize_buffer(tpnt->buffer);
4282                 kfree(tpnt->buffer->reserved_pages);
4283                 kfree(tpnt->buffer);
4284         }
4285
4286         disk->private_data = NULL;
4287         put_disk(disk);
4288         kfree(tpnt);
4289         return;
4290 }

Although the problem was reported on SLES11.3 the problem appears
in linux-next as well.

The crash is fixed by reordering the code so we no longer access
the struct scsi_tape after the kref_put() is done on it in st_open().

Signed-off-by: Shane Seymour <shane.seymour@hp.com>
Signed-off-by: Darren Lavender <darren.lavender@hp.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-07-16 15:32:32 +03:00
Seymour, Shane M 05545c92db st: implement tape statistics
This patch implements tape statistics in the st module via
sysfs. Current no statistics are available for tape I/O and there
is no easy way to reuse the block layer statistics for tape
as tape is a character device and does not have perform I/O in
sector sized chunks (the size of the data written to tape
can change). For tapes we also need extra stats related to
things like tape movement (via other I/O).

There have been multiple end users requesting statistics
including AT&T (and some HP customers who have not given
permission to be named). It is impossible for them
to investigate any issues related to tape performance
in a non-invasive way.

[jejb: eliminate PRId64]
Signed-off-by: Shane Seymour <shane.seymour@hp.com>
Tested-by: Shane Seymour <shane.seymour@hp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-06-02 08:03:25 -07:00
Andrea Arcangeli 7e33912849 mm: gup: use get_user_pages_unlocked
This allows those get_user_pages calls to pass FAULT_FLAG_ALLOW_RETRY to
the page fault in order to release the mmap_sem during the I/O.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:05 -08:00
Christoph Hellwig 3af6b35261 scsi: remove scsi_driver owner field
The driver core driver structure has grown an owner field and now
requires it to be set for all modular drivers.  Set it up for
all scsi_driver instances and get rid of the now superflous
scsi_driver owner field.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Shane M Seymour <shane.seymour@hp.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-24 20:01:28 +01:00
Christoph Hellwig dccfa688ca st: call scsi_set_medium_removal directly
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12 11:16:12 +01:00
Christoph Hellwig 906d15fbd2 scsi: split scsi_nonblockable_ioctl
The calling conventions for this function are bad as it could return
-ENODEV both for a device not currently online and a not recognized ioctl.

Add a new scsi_ioctl_block_when_processing_errors function that wraps
scsi_block_when_processing_errors with the a special case for the
SG_SCSI_RESET ioctl command, and handle the SG_SCSI_RESET case itself
in scsi_ioctl.  All callers of scsi_ioctl now must call the above helper
to check for the EH state, so that the ioctl handler itself doesn't
have to.

Reported-by: Robert Elliott <Elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12 11:16:11 +01:00
Hannes Reinecke d811b848eb scsi: use sdev as argument for sense code printing
We should be using the standard dev_printk() variants for
sense code printing.

[hch: remove __scsi_print_sense call in xen-scsiback, Acked by Juergen]
[hch: folded bracing fix from Dan Carpenter]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12 11:15:58 +01:00
Hannes Reinecke 22e0d99415 scsi: introduce sdev_prefix_printk()
Like scmd_printk(), but the device name is passed in as
a string. Can be used by eg ULDs which do not have access
to the scsi_cmnd structure.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12 11:15:57 +01:00
Laurence Oberman 2bec708a88 st: add a debug_flag module parameter request
This patch adds a debug_flag parameter that can be set on module load, and allows the DEBUG facility without a module recompile.
Note that now DEBUG 1 is the default with this patch.

Usage: modprobe st debug_flag=1

Signed-off-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Kai M??kisara <kai.makisara@kolumbus.fi>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12 11:15:55 +01:00
Linus Torvalds d3dc366bba Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block
Pull core block layer changes from Jens Axboe:
 "This is the core block IO pull request for 3.18.  Apart from the new
  and improved flush machinery for blk-mq, this is all mostly bug fixes
  and cleanups.

   - blk-mq timeout updates and fixes from Christoph.

   - Removal of REQ_END, also from Christoph.  We pass it through the
     ->queue_rq() hook for blk-mq instead, freeing up one of the request
     bits.  The space was overly tight on 32-bit, so Martin also killed
     REQ_KERNEL since it's no longer used.

   - blk integrity updates and fixes from Martin and Gu Zheng.

   - Update to the flush machinery for blk-mq from Ming Lei.  Now we
     have a per hardware context flush request, which both cleans up the
     code should scale better for flush intensive workloads on blk-mq.

   - Improve the error printing, from Rob Elliott.

   - Backing device improvements and cleanups from Tejun.

   - Fixup of a misplaced rq_complete() tracepoint from Hannes.

   - Make blk_get_request() return error pointers, fixing up issues
     where we NULL deref when a device goes bad or missing.  From Joe
     Lawrence.

   - Prep work for drastically reducing the memory consumption of dm
     devices from Junichi Nomura.  This allows creating clone bio sets
     without preallocating a lot of memory.

   - Fix a blk-mq hang on certain combinations of queue depths and
     hardware queues from me.

   - Limit memory consumption for blk-mq devices for crash dump
     scenarios and drivers that use crazy high depths (certain SCSI
     shared tag setups).  We now just use a single queue and limited
     depth for that"

* 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits)
  block: Remove REQ_KERNEL
  blk-mq: allocate cpumask on the home node
  bio-integrity: remove the needless fail handle of bip_slab creating
  block: include func name in __get_request prints
  block: make blk_update_request print prefix match ratelimited prefix
  blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio
  block: fix alignment_offset math that assumes io_min is a power-of-2
  blk-mq: Make bt_clear_tag() easier to read
  blk-mq: fix potential hang if rolling wakeup depth is too high
  block: add bioset_create_nobvec()
  block: use bio_clone_fast() in blk_rq_prep_clone()
  block: misplaced rq_complete tracepoint
  sd: Honor block layer integrity handling flags
  block: Replace strnicmp with strncasecmp
  block: Add T10 Protection Information functions
  block: Don't merge requests if integrity flags differ
  block: Integrity checksum flag
  block: Relocate bio integrity flags
  block: Add a disk flag to block integrity profile
  block: Add prefix to block integrity profile flags
  ...
2014-10-18 11:53:51 -07:00
Subhash Jadavani 6fe8c1dbef scsi: balance out autopm get/put calls in scsi_sysfs_add_sdev()
SCSI Well-known logical units generally don't have any scsi driver
associated with it which means no one will call scsi_autopm_put_device()
on these wlun scsi devices and this would result in keeping the
corresponding scsi device always active (hence LLD can't be suspended as
well). Same exact problem can be seen for other scsi device representing
normal logical unit whose driver is yet to be loaded. This patch fixes
the above problem with this approach:

- make the scsi_autopm_put_device call at the end of scsi_sysfs_add_sdev
  to make it balance out the get earlier in the function.
- let drivers do paired get/put calls in their probe methods.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Dolev Raviv <draviv@codeaurora.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-15 16:02:05 -07:00
Joe Lawrence a492f07545 block,scsi: fixup blk_get_request dead queue scenarios
The blk_get_request function may fail in low-memory conditions or during
device removal (even if __GFP_WAIT is set). To distinguish between these
errors, modify the blk_get_request call stack to return the appropriate
ERR_PTR. Verify that all callers check the return status and consider
IS_ERR instead of a simple NULL pointer check.

For consistency, make a similar change to the blk_mq_alloc_request leg
of blk_get_request.  It may fail if the queue is dead, or the caller was
unwilling to wait.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-28 10:03:46 -06:00
Hannes Reinecke b30d8bca5b scsi: Implement st_printk()
Update the st driver to use dev_printk() variants instead of
plain printk(); this will prefix logging messages with the
appropriate device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17 22:07:41 +02:00
Jens Axboe f27b087b81 block: add blk_rq_set_block_pc()
With the optimizations around not clearing the full request at alloc
time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC
up to the user allocating the request.

Add a blk_rq_set_block_pc() that sets the command type to
REQ_TYPE_BLOCK_PC, and properly initializes the members associated
with this type of request. Update callers to use this function instead
of manipulating rq->cmd_type directly.

Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed
attempt.

Signed-off-by: Jens Axboe <axboe@fb.com>
2014-06-06 07:57:37 -06:00
Maurizio Lombardi d6216c4734 [SCSI] st: fix corruption of the st_modedef structures in st_set_options()
When copying the st_modedef structures the devs pointers must be preserved
in the same way as with the cdevs pointers.

This fixes bug 70271: https://bugzilla.kernel.org/show_bug.cgi?id=70271

[  135.037052] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
[  135.045048] IP: [<ffffffff812af6a1>] kernfs_find_ns+0x21/0x150
[  135.050999] PGD 220623067 PUD 222171067 PMD 0
[  135.055593] Oops: 0000 [#1] SMP
[  135.058938] Modules linked in: bnx2fc cnic uio fcoe libfcoe libfc 8021q mrp scsi_transport_fc garp scsi_tgt stp llc binfmt_misc dm_round_robin dm_multipath uinput iTCO_wdt iTCO_vendor_support microcode sg pcspkr serio_raw osst st(-) i2c_i801 lpc_ich mfd_core e1000e ptp pps_core ipmi_si ipmi_msghandler video tpm_infineon ext4(F) jbd2(F) mbcache(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) sr_mod(F) cdrom(F) pata_acpi(F) ata_generic(F) ata_piix(F) libata(F) mpt2sas(F) scsi_transport_sas(F) raid_class(F) ast(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) dm_mirror(F) dm_region_hash(F) dm_log(F) dm_mod(F)
[  135.119686] CPU: 2 PID: 2028 Comm: rmmod Tainted: GF            3.14.0-rc1-linux-mainline+ #14
[  135.128453] Hardware name: wortmann To be filled by O.E.M./P8B-M Series, BIOS 6103 12/06/2012
[  135.137127] task: ffff880001de29d0 ti: ffff8802206e4000 task.ti: ffff8802206e4000
[  135.144742] RIP: 0010:[<ffffffff812af6a1>]  [<ffffffff812af6a1>] kernfs_find_ns+0x21/0x150
[  135.153148] RSP: 0018:ffff8802206e5c98  EFLAGS: 00010282
[  135.158562] RAX: ffff880001de29d0 RBX: 0000000000000000 RCX: 0000000000000006
[  135.165814] RDX: 0000000000000000 RSI: ffffffff817627e0 RDI: 0000000000000000
[  135.173040] RBP: ffff8802206e5cc8 R08: 0000000000000000 R09: 0000000000000001
[  135.180303] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff817627e0
[  135.187554] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
[  135.194774] FS:  00007f817c720700(0000) GS:ffff880227200000(0000) knlGS:0000000000000000
[  135.202995] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  135.208878] CR2: 0000000000000098 CR3: 00000002219b0000 CR4: 00000000000407e0
[  135.216139] Stack:
[  135.218185]  ffffffff81af63a0 0000000000000000 ffffffff817627e0 0000000000000000
[  135.225783]  0000000000000000 0000000000000001 ffff8802206e5cf8 ffffffff812af8de
[  135.233347]  ffff880226801900 ffffffff81b43320 0000000000000000 ffff880221a7c1c0
[  135.240972] Call Trace:
[  135.243463]  [<ffffffff812af8de>] kernfs_find_and_get_ns+0x3e/0x70
[  135.249743]  [<ffffffff812ae27d>] sysfs_unmerge_group+0x1d/0x60
[  135.255716]  [<ffffffff81464da9>] pm_qos_sysfs_remove_latency+0x19/0x20
[  135.262430]  [<ffffffff81466a91>] dev_pm_qos_constraints_destroy+0x31/0x1e0
[  135.269500]  [<ffffffff81464de6>] dpm_sysfs_remove+0x16/0x50
[  135.275263]  [<ffffffff8145c077>] device_del+0x47/0x1e0
[  135.280554]  [<ffffffff8145c232>] device_unregister+0x22/0x60
[  135.286406]  [<ffffffffa02e23bd>] remove_cdevs+0x4d/0x90 [st]
[  135.292247]  [<ffffffffa02e78ff>] st_remove+0x3f/0xb0 [st]
[  135.297851]  [<ffffffff8145f39f>] __device_release_driver+0x7f/0xf0
[  135.304237]  [<ffffffff8145f4e8>] driver_detach+0xd8/0xe0
[  135.309722]  [<ffffffff8145e0fc>] bus_remove_driver+0x5c/0xd0
[  135.315553]  [<ffffffff81460170>] driver_unregister+0x30/0x70
[  135.321366]  [<ffffffffa02e97f4>] exit_st+0x5c/0x868 [st]
[  135.326861]  [<ffffffff8111b31a>] SyS_delete_module+0x19a/0x1f0
[  135.332891]  [<ffffffff810e336d>] ? trace_hardirqs_on+0xd/0x10
[  135.338811]  [<ffffffff81141974>] ? __audit_syscall_entry+0x94/0x100
[  135.345282]  [<ffffffff8135b1fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  135.351806]  [<ffffffff816e8de9>] system_call_fastpath+0x16/0x1b
[  135.357859] Code: ff eb e3 0f 1f 80 00 00 00 00 55 48 89 e5 48 83 ec 30 48 89 5d d8 4c 89 65 e0 4c 89 6d e8 4c 89 75 f0 4c 89 7d f8 66 66 66 66 90 <44> 0f b7 bf 98 00 00 00 8b 05 71 6d 87 00 48 89 fb 49 89 f4 49
[  135.378282] RIP  [<ffffffff812af6a1>] kernfs_find_ns+0x21/0x150
[  135.384355]  RSP <ffff8802206e5c98>
[  135.387881] CR2: 0000000000000098
[  135.391298] ---[ end trace 1968409221ddb3c8 ]---

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:22 -07:00
Bodo Stroesser 769989a4a0 [SCSI] st: fix enlarge_buffer
This patch removes a bug in enlarge_buffer() that can make a
read or write fail under special conditions.

After changing TRY_DIRECT_IO to 0 and ST_MAX_SG to 32 in
st_options.h, a program that writes a first block of 128k and
than a second bigger block (e.g. 256k) fails. The second write
returns errno EOVERFLOW, as enlarge_buffer() checks the sg list
and detects that it already is full.
As enlarge_buffer uses different page allocation orders
depending on the size of the buffer needed, the check does not
make sense.

Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-12-19 20:56:28 -08:00
Greg Kroah-Hartman c69c6be5a8 [SCSI] st: convert class code to use dev_groups
The dev_attrs field of struct class is going away soon, dev_groups
should be used instead.  This converts the scsi tape class code to use
the correct field.

Cc: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-21 10:10:50 -07:00
Joe Lawrence 2b5bebccd2 [SCSI] st: Take additional queue ref in st_probe
This patch fixes a reference count bug in the SCSI tape driver which can be
reproduced with the following:

* Boot with slub_debug=FZPU, tape drive attached
* echo 1 > /sys/devices/... tape device pci path .../remove
* Wait for device removal
* echo 1 > /sys/kernel/slab/blkdev_queue/validate
* Slub debug complains about corrupted poison pattern

In commit 523e1d39 (block: make gendisk hold a reference to its queue)
add_disk() and disk_release() were modified to get/put an additional
reference on a disk queue to fix a reference counting discrepency
between bdev release and SCSI device removal.  The ST driver never
calls add_disk(), so this commit introduced an extra kref put when the
ST driver frees its struct gendisk.

Attempts were made to fix this bug at the block level [1] but later
abandoned due to floppy driver issues [2].

[1] https://lkml.org/lkml/2012/8/27/354
[2] https://lkml.org/lkml/2012/9/22/113

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Acked-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-04-06 11:14:20 +01:00
Tejun Heo b98c52b572 scsi: convert to idr_alloc()
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:18 -08:00
Al Viro 496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
Hannes Reinecke 0644f5393e [SCSI] st: remove st_mutex
The st_mutex was created when the BKL was removed, and
prevents simultaneous st_open calls. It is better to
protect just the necessary data.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Lee Duncan <lduncan@suse.com>
Acked-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-09-24 13:07:02 +04:00
Jeff Mahoney 26898afd67 [SCSI] st: clean up device file creation and removal
This patch cleans up the st device file creation and removal.

Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-09-14 17:59:29 +01:00
Jeff Mahoney 6c648d95a6 [SCSI] st: get rid of scsi_tapes array
st currently allocates an array to store pointers to all of the
scsi_tape objects. It's used to discover available indexes to use as the
base for the minor number selection and later to look up scsi_tape
devices for character devices.

We switch to using an IDR for minor selection and a pointer from
st_modedef back to scsi_tape for the lookups.

Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-09-14 17:59:28 +01:00
Jeff Mahoney e3f2a9cc84 [SCSI] st: clean up dev cleanup in st_probe
st_probe leaves a cdev pointer hanging around that is compared
during the error path and freed later. There's no need for the pointer
to hang around at all. So we free it immediately and simplify the error
handling.

Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-09-14 17:59:28 +01:00
Jeff Mahoney af23782bef [SCSI] st: Use static class attributes
st currently sets up and tears down class attributes manually for
every tape drive in the system. This patch uses a statically defined
class with class attributes to let the device core do it for us.

Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-09-14 17:59:28 +01:00
Linus Torvalds a75ee6ecd4 SCSI updates on 20120331
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJPdrIWAAoJEDeqqVYsXL0Mny4IAMTzXGOXCykpWhdIe2R8w0Ys
 eIoTJhBKoQWnLTV8cOODtwmtZcoQLeXkZmizZiAJvX6O1tOgueg+W4AFa9grxXGI
 O0d1bSb2ardzU7VZrZSY60Hd4bylMwn4Xv/0dRrQMwTJO0LEeGWsJPV2+2BuXwMB
 lGCNB67oUBXgMOI1jUZQRwx/mBzQ3e/gINjnpZTNKHia7YkX/yVTFISq7htgfDN7
 1wRGxymbHtVap3NbtUO96BUUndAiF5vom+4WNvaQUyPrCc6aoGWjv+J9DQXY/zgv
 AYjujAluK396D6YncGFAWBzYOg9WFbq54v0PRUanjcTTAu5ILs2BxqWdhmnvl14=
 =IH8T
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

Pull SCSI updates from James Bottomley:
 "This is primarily another round of driver updates (lpfc, bfa, fcoe,
  ipr) plus a new ufshcd driver.  There shouldn't be anything
  controversial in here (The final deletion of scsi proc_ops which
  caused some build breakage has been held over until the next merge
  window to give us more time to stabilise it).

  I'm afraid, with me moving continents at exactly the wrong time,
  anything submitted after the merge window opened has been held over to
  the next merge window."

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (63 commits)
  [SCSI] ipr: Driver version 2.5.3
  [SCSI] ipr: Increase alignment boundary of command blocks
  [SCSI] ipr: Increase max concurrent oustanding commands
  [SCSI] ipr: Remove unnecessary memory barriers
  [SCSI] ipr: Remove unnecessary interrupt clearing on new adapters
  [SCSI] ipr: Fix target id allocation re-use problem
  [SCSI] atp870u, mpt2sas, qla4xxx use pci_dev->revision
  [SCSI] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up
  [SCSI] bfa: Update the driver version to 3.0.23.0
  [SCSI] bfa: BSG and User interface fixes.
  [SCSI] bfa: Fix to avoid vport delete hang on request queue full scenario.
  [SCSI] bfa: Move service parameter programming logic into firmware.
  [SCSI] bfa: Revised Fabric Assigned Address(FAA) feature implementation.
  [SCSI] bfa: Flash controller IOC pll init fixes.
  [SCSI] bfa: Serialize the IOC hw semaphore unlock logic.
  [SCSI] bfa: Modify ISR to process pending completions
  [SCSI] bfa: Add fc host issue lip support
  [SCSI] mpt2sas: remove extraneous sas_log_info messages
  [SCSI] libfc: fcoe_transport_create fails in single-CPU environment
  [SCSI] fcoe: reduce contention for fcoe_rx_list lock [v2]
  ...
2012-03-31 13:31:23 -07:00
David Howells 9ffc93f203 Remove all #inclusions of asm/system.h
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28 18:30:03 +01:00
Lee Duncan c743e44fbb [SCSI] st: expand ability to write immediate filemarks
The st tape driver recently added the MTWEOFI ioctl, which writes
a tape filemark (EOF), like the MTWEOF ioctl, except that MTWEOFI
returns immediately. This makes certain applications, like backup
software, run much more quickly on buffered tape drives.

Since legacy applications do not know about this new MTWEOFI ioctl,
this patch adds a new ioctl option that tells the st driver to return
immediately when writing an EOF (i.e. a filemark). This new flag
is much like the existing flag that tells the st driver to perform
writes (and certain other IOs) immediately, but this new flag only
applies to writing EOFs.

This new feature is controlled via the MTSETDRVBUFFER ioctl, using
the newly-defined MT_ST_NOWAIT_EOF flag.

Use of this new feature is displayed via the sysfs tape "options"
attribute.

The st documentation was updated to mention this new flag, as well
as the problems that can occur from using it.

Signed-off-by: Lee Duncan <lduncan@suse.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-03-27 08:26:34 +01:00
Oliver Neukum 46a243f72d [SCSI] st: implement PM
This implements basic power management for SCSI tapes.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 08:08:51 -06:00
Petr Uzel c68bf8eeaa [SCSI] st: fix race in st_scsi_execute_end
The call to complete() in st_scsi_execute_end() wakes up sleeping thread
in write_behind_check(), which frees the st_request, thus invalidating
the pointer to the associated bio structure, which is then passed to the
blk_rq_unmap_user(). Fix by storing pointer to bio structure into
temporary local variable.

This bug is present since at least linux-2.6.32.

CC: stable@kernel.org
Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
Reported-by: Juergen Groß <juergen.gross@ts.fujitsu.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 13:27:28 +04:00
FUJITA Tomonori 46081b1664 [SCSI] st: Increase success probability in driver buffer allocation
Modify allocation to try the minimum possible page order allowed by the HBA
scatter/gather segment limit in allocation of the driver's internal
buffer. This increases the probability of successful allocation. The
allocation may still fail if this minimum order is > 0.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Reported-by: Lukas Kolbe <lkolbe@techfak.uni-bielefeld.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-12-22 23:26:50 -06:00
Kai Makisara 373daacfce [SCSI] st: Store page order before driver buffer allocation
The order of the pages allocated for the driver buffer must be stored before
allocation because it is used in freeing already allocated pages if
allocation fails.

Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Reported-by: Lukas Kolbe <lkolbe@techfak.uni-bielefeld.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-12-22 23:26:50 -06:00
Linus Torvalds c70b5296e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (84 commits)
  [SCSI] be2iscsi: SGE Len == 64K
  [SCSI] be2iscsi: Remove premature free of cid
  [SCSI] be2iscsi: More time for FW
  [SCSI] libsas: fix bug for vacant phy
  [SCSI] sd: Fix overflow with big physical blocks
  [SCSI] st: add MTWEOFI to write filemarks without flushing drive buffer
  [SCSI] libsas: Don't issue commands to devices that have been hot-removed
  [SCSI] megaraid_sas: Add Online Controller Reset to MegaRAID SAS drive
  [SCSI] lpfc 8.3.17: Update lpfc driver version to 8.3.17
  [SCSI] lpfc 8.3.17: Replace function reset methodology
  [SCSI] lpfc 8.3.17: SCSI fixes
  [SCSI] lpfc 8.3.17: BSG fixes
  [SCSI] lpfc 8.3.17: SLI Additions and Fixes
  [SCSI] lpfc 8.3.17: Code Cleanup and Locking fixes
  [SCSI] zfcp: Remove scsi_cmnd->serial_number from debug traces
  [SCSI] ipr: fix array error logging
  [SCSI] aha152x: enable PCMCIA on 64bit
  [SCSI] scsi_dh_alua: Handle all states correctly
  [SCSI] cxgb4i: connection and ddp setting update
  [SCSI] cxgb3i: fixed connection over vlan
  ...
2010-10-22 17:34:15 -07:00
Kai Makisara 3e51d3c924 [SCSI] st: add MTWEOFI to write filemarks without flushing drive buffer
This patch adds a new MTIOCTOP operation MTWEOFI that writes filemarks with
immediate bit set. This means that the drive does not flush its buffer and the
next file can be started immediately. This speeds up writing in applications
that have to write multiple small files.

Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-10-08 17:16:22 -05:00
Arnd Bergmann 2a48fc0ab2 block: autoconvert trivial BKL users to private mutex
The block device drivers have all gained new lock_kernel
calls from a recent pushdown, and some of the drivers
were already using the BKL before.

This turns the BKL into a set of per-driver mutexes.
Still need to check whether this is safe to do.

file=$1
name=$2
if grep -q lock_kernel ${file} ; then
    if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
    else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
    fi
    sed -i ${file} \
        -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                     /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

} }"  \
    -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
    -e '/[      ]*cycle_kernel_lock();/d'
else
    sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                -e '/cycle_kernel_lock()/d'
fi

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-05 15:01:10 +02:00
Jan Blunck b4d878e23c st: use noop_llseek() instead of default_llseek()
st_open() suggests that llseek() doesn't work: "We really want to do
nonseekable_open(inode, filp); here, but some versions of tar incorrectly
call lseek on tapes and bail out if that fails.  So we disallow pread()
and pwrite(), but permit lseeks."

Instead of using the fallback default_llseek() the driver should use
noop_llseek() which leaves the file->f_pos untouched but succeeds.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Cc: Willem Riede <osst@riede.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:56 -07:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Martin K. Petersen 8a78362c4e block: Consolidate phys_segment and hw_segment limits
Except for SCSI no device drivers distinguish between physical and
hardware segment limits.  Consolidate the two into a single segment
limit.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-02-26 13:58:08 +01:00
FUJITA Tomonori c982c368bb [SCSI] st: fix mdata->page_order handling
dio transfer always resets mdata->page_order to zero. It breaks
high-order pages previously allocated for non-dio transfer.

This patches adds reserved_page_order to st_buffer structure to save
page order for non-dio transfer.

http://bugzilla.kernel.org/show_bug.cgi?id=14563

When enlarge_buffer() allocates 524288 from 0, st uses six-order page
allocation. So mdata->page_order is 6 and frp_seg is 2.

After that, if st uses dio, sgl_map_user_pages() sets
mdata->page_order to 0 for st_do_scsi(). After that, when we call
normalize_buffer(), it frees only free frp_seg * PAGE_SIZE (2 * 4096)
though we should free frp_seg * PAGE_SIZE << 6 (2 * 4096 << 6). So we
see buffer_size is set to 516096 (524288 - 8192).

Reported-by: Joachim Breuer <linux-kernel@jmbreuer.net>
Tested-by: Joachim Breuer <linux-kernel@jmbreuer.net>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-10 08:54:13 -06:00
Roel Kluin 832151f458 [SCSI] st: fix test of value range in st_set_options()
value cannot logically be less than START and greater than BUFFERSIZE.

#define EXTENDED_SENSE_START  18

// vi include/scsi/scsi_cmnd.h +105
#define SCSI_SENSE_BUFFERSIZE 	96

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-12-04 12:01:49 -06:00
David Jeffery 2c2ed8bfd8 [SCSI] st: fix possible memory use after free after MTSETBLK ioctl
A memory use after free bug can manifest if the MTSETBLK or SET_DENS_AND_BLK
ioctl features are used to set the tape's blocksize from 0 to non-zero.
After the driver sets the new block size, in this one case it calls
normalize_buffer() to free the device's internal data buffers.  However, the
ioctl code assumes there is always a buffer and does not check or allocate
a buffer if there isn't one.  So any following ioctl calls can corrupt
a part of memory by writing data to memory that the st driver had freed.

This patch removes the normalize_buffer() call and the specialness of
changing from a 0 to non-zero blocksize to fix the possible use of
memory after it has been freed by the st driver.

signed-off-by: David Jeffery <djeffery@redhat.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-02 14:11:58 -05:00