linux

Author	SHA1	Message	Date
Josh Durgin	c666601a93	rbd: move snap_rwsem to the device, rename to header_rwsem A new temporary header is allocated each time the header changes, but only the changed properties are copied over. We don't need a new semaphore for each header update. This addresses http://tracker.newdream.net/issues/2174 Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:52 -05:00
Alex Elder	32eec68d2f	rbd: don't drop the rbd_id too early Currently an rbd device's id is released when it is removed, but it is done before the code is run to clean up sysfs-related files (such as /sys/bus/rbd/devices/1). It's possible that an rbd is still in use after the rbd_remove() call has been made. It's essentially the same as an active inode that stays around after it has been removed--until its final close operation. This means that the id shows up as free for reuse at a time it should not be. The effect of this was seen by Jens Rehpoehler, who: - had a filesystem mounted on an rbd device - unmapped that filesystem (without unmounting) - found that the mount still worked properly - but hit a panic when he attempted to re-map a new rbd device This re-map attempt found the previously-unmapped id available. The subsequent attempt to reuse it was met with a panic while attempting to (re-)install the sysfs entry for the new mapped device. Fix this by holding off "putting" the rbd id, until the rbd_device release function is called--when the last reference is finally dropped. Note: This fixes: http://tracker.newdream.net/issues/1907 Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:50 -05:00
Alex Elder	593a9e7b34	rbd: small changes Here is another set of small code tidy-ups: - Define SECTOR_SHIFT and SECTOR_SIZE, and use these symbolic names throughout. Tell the blk_queue system our physical block size, in the (unlikely) event we want to use something other than the default. - Delete the definition of struct rbd_info, which is never used. - Move the definition of dev_to_rbd() down in its source file, just above where it gets first used, and change its name to dev_to_rbd_dev(). - Replace an open-coded operation in rbd_dev_release() to use dev_to_rbd_dev() instead. - Calculate the segment size for a given rbd_device just once in rbd_init_disk(). - Use the '%zd' conversion specifier in rbd_snap_size_show(), since the value formatted is a size_t. - Switch to the '%llu' conversion specifier in rbd_snap_id_show(), since the value formatted is unsigned. Signed-off-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:50 -05:00
Alex Elder	00f1f36ffa	rbd: do some refactoring A few blocks of code are rearranged a bit here: - In rbd_header_from_disk(): - Don't bother computing snap_count until we're sure the on-disk header starts with a good signature. - Move a few independent lines of code so they are after a check for a failed memory allocation. - Get rid of unnecessary local variable "ret". - Make a few other changes in rbd_read_header(), similar to the above--just moving things around a bit while preserving the functionality. - In rbd_rq_fn(), just assign rq in the while loop's controlling expression rather than duplicating it before and at the end of the loop body. This allows the use of "continue" rather than "goto next" in a number of spots. - Rearrange the logic in snap_by_name(). End result is the same. Signed-off-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:50 -05:00
Alex Elder	fed4c143ba	rbd: fix module sysfs setup/teardown code Once rbd_bus_type is registered, it allows an "add" operation via the /sys/bus/rbd/add bus attribute, and adding a new rbd device that way establishes a connection between the device and rbd_root_dev. But rbd_root_dev is not registered until after the rbd_bus_type registration is complete. This could (in principle anyway) result in an invalid state. Since rbd_root_dev has no tie to rbd_bus_type we can reorder these two initializations and never be faced with this scenario. In addition, unregister the device in the event the bus registration fails at module init time. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:50 -05:00
Alex Elder	7ef3214af2	rbd: don't allocate mon_addrs buffer in rbd_add() The mon_addrs buffer in rbd_add is used to hold a copy of the monitor IP addresses supplied via /sys/bus/rbd/add. That is passed to rbd_get_client(), which never modifies it (nor do any of the functions it gets passed to thereafter)--the mon_addr parameter to rbd_get_client() is a pointer to constant data, so it can't be modified. Furthermore, rbd_get_client() has the length of the mon_addrs buffer and that is used to ensure nothing goes beyond its end. Based on all this, there is no reason that a buffer needs to be used to hold a copy of the mon_addrs provided via /sys/bus/rbd/add. Instead, the location within that passed-in buffer can be provided, along with the length of the "token" therein which represents the monitor IP's. A small change to rbd_add_parse_args() allows the address within the buffer to be passed back, and the length is already returned. This now means that, at least from the perspective of this interface, there is no such thing as a list of monitor addresses that is too long. Signed-off-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:50 -05:00
Alex Elder	5214ecc45c	rbd: have rbd_parse_args() report found mon_addrs size The argument parsing routine already computes the size of the mon_addrs buffer it extracts from the "command." Pass it to the caller so it can use it to provide the length to rbd_get_client(). Signed-off-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:49 -05:00
Alex Elder	81a8979378	rbd: do a few checks at build time This is a bit gratuitous, but there are a few things that can be verified at build time rather than run time, so do that. Signed-off-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:49 -05:00
Alex Elder	e28fff268e	rbd: don't use sscanf() in rbd_add_parse_args() Make use of a few simple helper routines to parse the arguments rather than sscanf(). This will treat both missing and too-long arguments as invalid input (rather than silently truncating the input in the too-long case). In time this can also be used by rbd_add() to use the passed-in buffer in place, rather than copying its contents into new buffers. It appears to me that the sscanf() previously used would not correctly handle a supplied snapshot--the two final "%s" conversion specifications were not separated by a space, and I'm not sure how sscanf() handles that situation. It may not be well-defined. So that may be a bug this change fixes (but I didn't verify that). The sizes of the mon_addrs and options buffers are now passed to rbd_add_parse_args(), so they can be supplied to copy_token(). Signed-off-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:49 -05:00
Alex Elder	a725f65e52	rbd: encapsulate argument parsing for rbd_add() Move the code that parses the arguments provided to rbd_add() (which are supplied via /sys/bus/rbd/add) into a separate function. Also rename the "mon_dev_name" variable in rbd_add() to be "mon_addrs". The variable represents a list of one or more comma-separated monitor IP addresses, each with an optional port number. I think "mon_addrs" captures that notion a little better. Signed-off-by: Alex Elder <elder@dreamhost.com>	2012-03-22 10:47:48 -05:00
Alex Elder	27cc25943f	rbd: simplify error handling in rbd_add() If a couple pointers are initialized to NULL then a single "out_nomem" label can be used for all of the memory allocation failure cases in rbd_add(). Also, get rid of the "irc" local variable there. There is no real need for "rc" to be type ssize_t, and it can be used in the spot "irc" was. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:48 -05:00
Alex Elder	60571c7d55	rbd: reduce memory used for rbd_dev fields The length of the string containing the monitor address specification(s) will never exceed the length of the string passed in to rbd_add(). The same holds true for the ceph + rbd options string. So reduce the amount of memory allocated for these to that length rather than the maximum (1024 bytes). Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:48 -05:00
Alex Elder	d720bcb0a8	rbd: have rbd_get_client() return a rbd_client rbd_get_client() currently returns an error code, and assigns the rbd_client field of the rbd_device structure it is passed if successful. Instead, have it return the created rbd_client structure and return a pointer-coded error if there is an error. This makes the assignment of the client pointer more obvious at the call site. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:48 -05:00
Alex Elder	f0f8cef5a3	rbd: a few simple changes Here are a few very simple cleanups: - Add a "RBD_" prefix to the two driver name string definitions. - Move the definition of struct rbd_request below struct rbd_req_coll to avoid the need for an empty declaration of the latter. - Move and group the definitions of rbd_root_dev_release() and rbd_root_dev, as well as rbd_bus_type and rbd_bus_attrs[], close to the top of the file. Arrange the latter so rbd_bus_type.bus_attrs can be initialized statically. - Get rid of an unnecessary local variable in rbd_open(). - Rework some hokey logic in rbd_bus_add_dev(), so the value of "ret" at the end is either 0 or -ENOENT to avoid the need for the code duplication that was there. - Rename a goto target in rbd_add(). Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:48 -05:00
Alex Elder	432b858749	rbd: rename "node_lock" The spinlock used to protect rbd_client_list is named "node_lock". Rename it to "rbd_client_list_lock" to make it more obvious what it's for. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:48 -05:00
Alex Elder	bc534d86be	rbd: move ctl_mutex lock inside rbd_client_create() Since rbd_client_create() is only called in one place, move the acquisition of the mutex around that call inside that function. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	d97081b0c7	rbd: move ctl_mutex lock inside rbd_get_client() Since rbd_get_client() is only called in one place, move the acquisition of the mutex around that call inside that function. Furthermore, within rbd_get_client(), it appears the mutex only needs to be held while calling rbd_client_create(). (Moving the lock inside that function will wait for the next patch.) Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	e6994d3dde	rbd: release client list lock sooner In rbd_get_client(), if a client is reused, a number of things get done while still holding the list lock unnecessarily. This just moves a few things that need no lock protection outside the lock. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	d184f6bfde	rbd: restore previous rbd id sequence behavior It used to be that selecting a new unique identifier for an added rbd device required searching all existing ones to find the highest id in use. A recent change made that unnecessary, but made it so that id's used were monotonically non-decreasing. It's a bit more pleasant to have smaller rbd id's though, and this change makes ids get allocated as they were before--each new id is one more than the maximum currently in use. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	499afd5b8e	rbd: tie rbd_dev_list changes to rbd_id operations The only time entries are added to or removed from the global rbd_dev_list is exactly when a "put" or "get" operation is being performed on a rbd_dev's id. So just move the list management code into get/put routines. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	e124a82f3c	rbd: protect the rbd_dev_list with a spinlock The rbd_dev_list is just a simple list of all the current rbd_devices. Using the ctl_mutex as a concurrency guard is overkill. Instead, use a spinlock for that specific purpose. This also reduces the window that the ctl_mutex needs to be held in rbd_add(). Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	1ddbe94eda	rbd: rework calculation of new rbd id's In order to select a new unique identifier for an added rbd device, the list of all existing ones is searched and a value one greater than the highest id is used. The list search can be avoided by using an atomic variable that keeps track of the current highest id. Using a get/put model for id's we can limit the boundless growth of id numbers a bit by arranging to reuse the current highest id once it gets released. Add these calls to "put" the id when an rbd is getting removed. Note that this changes the pattern of device id's used--new values will never be below the highest one seen so far (even if there exists an unused lower one). I assert this is OK because the key property of an rbd id is its uniqueness, not its magnitude. Regardless, a follow-on patch will restore the old way of doing things, I just think this commit just makes the incremental change to atomics a little easier to understand. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	b7f23c361b	rbd: encapsulate new rbd id selection Move the loop that finds a new unique rbd id to use into its own helper function. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Josh Durgin	cc9d734c3d	rbd: use a single value of snap_name to mean no snap There's already a constant for this anyway. Since rbd_header_set_snap() is only used to set the rbd device snap_name field, just do that within that function rather than having it take the snap_name as an argument. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net> v2: Changed interface rbd_header_set_snap() so it explicitly updates the snap_name in the rbd_device. Also added a BUILD_BUG_ON() to verify the size of the snap_name field is sufficient for SNAP_HEAD_NAME.	2012-03-22 10:47:47 -05:00
Alex Elder	1dbb439913	rbd: do not duplicate ceph_client pointer in rbd_device The rbd_device structure maintains a duplicate copy of the ceph_client pointer maintained in its rbd_client structure. There appears to be no good reason for this, and its presence presents a risk of them getting out of synch or otherwise misused. So kill it off, and use the rbd_client copy only. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	ee57741c52	rbd: make ceph_parse_options() return a pointer ceph_parse_options() takes the address of a pointer as an argument and uses it to return the address of an allocated structure if successful. With this interface it is not evident at call sites that the pointer is always initialized. Change the interface to return the address instead (or a pointer-coded error code) to make the validity of the returned pointer obvious. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:47 -05:00
Alex Elder	2107978668	rbd: a few small cleanups Some minor cleanups in "drivers/block/rbd.c": - Use the more meaningful "RBD_MAX_OBJ_NAME_LEN" in place of "96" in the definition of RBD_MAX_MD_NAME_LEN. - Use DEFINE_SPINLOCK() to define and initialize node_lock. - Drop a needless (char *) cast in parse_rbd_opts_token(). - Make a few minor formatting changes. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-03-22 10:47:46 -05:00
Igor Mammedov	b9136d207f	xen: initialize platform-pci even if xen_emul_unplug=never When xen_emul_unplug=never is specified on kernel command line reading files from /sys/hypervisor is broken (returns -EBUSY). It is caused by xen_bus dependency on platform-pci and platform-pci isn't initialized when xen_emul_unplug=never is specified. Fix it by allowing platform-pci to ignore xen_emul_unplug=never, and do not initialize xen_[blk\|net]front instead. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-22 11:37:11 -04:00
Linus Torvalds	5375871d43	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc merge from Benjamin Herrenschmidt: "Here's the powerpc batch for this merge window. It is going to be a bit more nasty than usual as in touching things outside of arch/powerpc mostly due to the big iSeriesectomy :-) We finally got rid of the bugger (legacy iSeries support) which was a PITA to maintain and that nobody really used anymore. Here are some of the highlights: - Legacy iSeries is gone. Thanks Stephen ! There's still some bits and pieces remaining if you do a grep -ir series arch/powerpc but they are harmless and will be removed in the next few weeks hopefully. - The 'fadump' functionality (Firmware Assisted Dump) replaces the previous (equivalent) "pHyp assisted dump"... it's a rewrite of a mechanism to get the hypervisor to do crash dumps on pSeries, the new implementation hopefully being much more reliable. Thanks Mahesh Salgaonkar. - The "EEH" code (pSeries PCI error handling & recovery) got a big spring cleaning, motivated by the need to be able to implement a new backend for it on top of some new different type of firmware. The work isn't complete yet, but a good chunk of the cleanups is there. Note that this adds a field to struct device_node which is not very nice and which Grant objects to. I will have a patch soon that moves that to a powerpc private data structure (hopefully before rc1) and we'll improve things further later on (hopefully getting rid of the need for that pointer completely). Thanks Gavin Shan. - I dug into our exception & interrupt handling code to improve the way we do lazy interrupt handling (and make it work properly with "edge" triggered interrupt sources), and while at it found & fixed a wagon of issues in those areas, including adding support for page fault retry & fatal signals on page faults. - Your usual random batch of small fixes & updates, including a bunch of new embedded boards, both Freescale and APM based ones, etc..." I fixed up some conflicts with the generalized irq-domain changes from Grant Likely, hopefully correctly. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (141 commits) powerpc/ps3: Do not adjust the wrapper load address powerpc: Remove the rest of the legacy iSeries include files powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces init: Remove CONFIG_PPC_ISERIES powerpc: Remove FW_FEATURE ISERIES from arch code tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable powerpc/spufs: Fix double unlocks powerpc/5200: convert mpc5200 to use of_platform_populate() powerpc/mpc5200: add options to mpc5200_defconfig powerpc/mpc52xx: add a4m072 board support powerpc/mpc5200: update mpc5200_defconfig to fit for charon board Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup powerpc/44x: Add additional device support for APM821xx SoC and Bluestone board powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board MAINTAINERS: Update PowerPC 4xx tree powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board powerpc: document the FSL MPIC message register binding powerpc: add support for MPIC message register API powerpc/fsl: Added aliased MSIIR register address to MSI node in dts powerpc/85xx: mpc8548cds - add 36-bit dts ...	2012-03-21 18:55:10 -07:00
Linus Torvalds	9f3938346a	Merge branch 'kmap_atomic' of git://github.com/congwang/linux Pull kmap_atomic cleanup from Cong Wang. It's been in -next for a long time, and it gets rid of the (no longer used) second argument to k[un]map_atomic(). Fix up a few trivial conflicts in various drivers, and do an "evil merge" to catch some new uses that have come in since Cong's tree. * 'kmap_atomic' of git://github.com/congwang/linux: (59 commits) feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename] drbd: remove the second argument of k[un]map_atomic() zcache: remove the second argument of k[un]map_atomic() gma500: remove the second argument of k[un]map_atomic() dm: remove the second argument of k[un]map_atomic() tomoyo: remove the second argument of k[un]map_atomic() sunrpc: remove the second argument of k[un]map_atomic() rds: remove the second argument of k[un]map_atomic() net: remove the second argument of k[un]map_atomic() mm: remove the second argument of k[un]map_atomic() lib: remove the second argument of k[un]map_atomic() power: remove the second argument of k[un]map_atomic() kdb: remove the second argument of k[un]map_atomic() udf: remove the second argument of k[un]map_atomic() ubifs: remove the second argument of k[un]map_atomic() squashfs: remove the second argument of k[un]map_atomic() reiserfs: remove the second argument of k[un]map_atomic() ocfs2: remove the second argument of k[un]map_atomic() ntfs: remove the second argument of k[un]map_atomic() ...	2012-03-21 09:40:26 -07:00
Linus Torvalds	69a7aebcf0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "It's indeed trivial -- mostly documentation updates and a bunch of typo fixes from Masanari. There are also several linux/version.h include removals from Jesper." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits) kcore: fix spelling in read_kcore() comment constify struct pci_dev * in obvious cases Revert "char: Fix typo in viotape.c" init: fix wording error in mm_init comment usb: gadget: Kconfig: fix typo for 'different' Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c" writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header writeback: fix typo in the writeback_control comment Documentation: Fix multiple typo in Documentation tpm_tis: fix tis_lock with respect to RCU Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c" Doc: Update numastat.txt qla4xxx: Add missing spaces to error messages compiler.h: Fix typo security: struct security_operations kerneldoc fix Documentation: broken URL in libata.tmpl Documentation: broken URL in filesystems.tmpl mtd: simplify return logic in do_map_probe() mm: fix comment typo of truncate_inode_pages_range power: bq27x00: Fix typos in comment ...	2012-03-20 21:12:50 -07:00
Linus Torvalds	ed378a52da	USB merge for 3.4-rc1 Here's the big USB merge for the 3.4-rc1 merge window. Lots of gadget driver reworks here, driver updates, xhci changes, some new drivers added, usb-serial core reworking to fix some bugs, and other various minor things. There are some patches touching arch code, but they have all been acked by the various arch maintainers. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Merge tag 'usb-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB merge for 3.4-rc1 from Greg KH: "Here's the big USB merge for the 3.4-rc1 merge window. Lots of gadget driver reworks here, driver updates, xhci changes, some new drivers added, usb-serial core reworking to fix some bugs, and other various minor things. There are some patches touching arch code, but they have all been acked by the various arch maintainers." * tag 'usb-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (302 commits) net: qmi_wwan: add support for ZTE MF820D USB: option: add ZTE MF820D usb: gadget: f_fs: Remove lock is held before freeing checks USB: option: make interface blacklist work again usb/ub: deprecate & schedule for removal the "Low Performance USB Block" driver USB: ohci-pxa27x: add clk_prepare/clk_unprepare calls USB: use generic platform driver on ath79 USB: EHCI: Add a generic platform device driver USB: OHCI: Add a generic platform device driver USB: ftdi_sio: new PID: LUMEL PD12 USB: ftdi_sio: add support for FT-X series devices USB: serial: mos7840: Fixed MCS7820 device attach problem usb: Don't make USB_ARCH_HAS_{XHCI,OHCI,EHCI} depend on USB_SUPPORT. usb gadget: fix a section mismatch when compiling g_ffs with CONFIG_USB_FUNCTIONFS_ETH USB: ohci-nxp: Remove i2c_write(), use smbus USB: ohci-nxp: Support for LPC32xx USB: ohci-nxp: Rename symbols from pnx4008 to nxp USB: OHCI-HCD: Rename ohci-pnx4008 to ohci-nxp usb: gadget: Kconfig: fix typo for 'different' usb: dwc3: pci: fix another failure path in dwc3_pci_probe() ...	2012-03-20 11:26:30 -07:00
Cong Wang	589973a704	drbd: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:29 +08:00
Cong Wang	cfd8005c99	block: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:16 +08:00
Steven Noonan	3467811e26	xen-blkfront: make blkif_io_lock spinlock per-device This patch moves the global blkif_io_lock to the per-device structure. The spinlock seems to exist for two reasons: to disable IRQs when in the interrupt handlers for blkfront, and to protect the blkfront VBDs when a detachment is requested. Having a global blkif_io_lock doesn't make sense given the use case, and it drastically hinders performance due to contention. All VBDs with pending IOs have to take the lock in order to get work done, which serializes everything pretty badly. Signed-off-by: Steven Noonan <snoonan@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Andrew Jones	dad5cf659b	xen/blkfront: don't put bdev right after getting it We should hang onto bdev until we're done with it. Signed-off-by: Andrew Jones <drjones@redhat.com> [v1: Fixed up git commit description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Akinobu Mita	34ae2e47d9	xen-blkfront: use bitmap_set() and bitmap_clear() Use bitmap_set and bitmap_clear rather than modifying individual bits in a memory region. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Daniel De Graaf	b2167ba6dd	xen/blkback: Enable blkback on HVM guests Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Daniel De Graaf	4f14faaab4	xen/blkback: use grant-table.c hypercall wrappers Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-03-20 12:52:41 +01:00
Sebastian Andrzej Siewior	7396bd9fa1	usb/ub: deprecate & schedule for removal the "Low Performance USB Block" driver Deprecate this driver. All devices which can be handled by this driver can also be handled by the usb-storage driver. Acked-By: Pete Zaitcev <zaitcev@redhat.com> Cc: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-03-16 13:30:10 -07:00
Stephen Rothwell	ba7a4822b4	powerpc: Remove some of the legacy iSeries specific device drivers These drivers are specific to the PowerPC legacy iSeries platform and their Kconfig is specified in arch/powerpc. Legacy iSeries is being removed, so these drivers can no longer be selected. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-03-16 09:28:05 +11:00
Linus Torvalds	f1cbd03f5e	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Been sitting on this for a while, but lets get this out the door. This fixes various important bugs for 3.3 final, along with a few more trivial ones. Please pull!" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix ioc leak in put_io_context block, sx8: fix pointer math issue getting fw version Block: use a freezable workqueue for disk-event polling drivers/block/DAC960: fix -Wuninitialized warning drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning block: fix __blkdev_get and add_disk race condition block: Fix setting bio flags in drivers (sd_dif/floppy) block: Fix NULL pointer dereference in sd_revalidate_disk block: exit_io_context() should call elevator_exit_icq_fn() block: simplify ioc_release_fn() block: replace icq->changed with icq->flags	2012-03-14 17:16:45 -07:00
Greg Kroah-Hartman	f7a0d426f3	Merge 3.3-rc7 into usb-next This resolves the conflict with drivers/usb/host/ehci-fsl.h that happened with changes in Linus's and this branch at the same time. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-03-12 09:13:31 -07:00
Muthu Kumar	9354f1b8e6	floppy/scsi: fix setting of BIO flags Fix setting bio flags in drivers (sd_dif/floppy). Signed-off-by: Muthukumar R <muthur@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-05 15:49:43 -08:00
Dan Carpenter	ea5f4db8ec	block, sx8: fix pointer math issue getting fw version "mem" is type u8. We need parentheses here or it screws up the pointer math probably leading to an oops. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-03-03 19:44:39 +01:00
Danny Kukawka	cecd353a02	drivers/block/DAC960: fix -Wuninitialized warning Set CommandMailbox with memset before using it. Fix for: drivers/block/DAC960.c: In function ‘DAC960_V1_EnableMemoryMailboxInterface’: arch/x86/include/asm/io.h:61:1: warning: ‘CommandMailbox.Bytes[12]’ may be used uninitialized in this function [-Wuninitialized] drivers/block/DAC960.c:1175:30: note: ‘CommandMailbox.Bytes[12]’ was declared here Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-03-02 10:48:35 +01:00
Danny Kukawka	bca505f109	drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning Fixed compiler warning: comparison between ‘DAC960_V2_IOCTL_Opcode_T’ and ‘enum <anonymous>’ Renamed enum, added a new enum for SCSI_10.CommandOpcode in DAC960_V2_ProcessCompletedCommand(). Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-03-02 10:48:32 +01:00
Muthukumar R	12ebffd146	block: Fix setting bio flags in drivers (sd_dif/floppy) Fix setting bio flags in drivers (sd_dif/floppy). Signed-off-by: Muthukumar R <muthur@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-03-02 10:40:58 +01:00
Sebastian Andrzej Siewior	7ac4704c09	usb/storage: a couple defines from drivers/usb/storage/transport.h to include/linux/usb/storage.h This moves the BOT data structures for CBW and CSW from drivers internal header file to global include able file in include/. The storage gadget is using the same name for CSW but a different for CBW so I fix it up properly. The same goes for the ub driver and keucr driver in staging. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-02-28 11:05:18 -08:00
Hitoshi Mitake	797a796a13	asm-generic: architecture independent readq/writeq for 32bit environment This provides unified readq()/writeq() helper functions for 32-bit drivers. For some cases, readq/writeq without atomicity is harmful, and order of io access has to be specified explicitly. So in this patch, two new header files which contain non-atomic readq/writeq are added. - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/writeq with the order of lower address -> higher address - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/writeq with reversed order This allows us to remove some readq()s that were added to drivers when the default non-atomic ones were removed in commit `dbee8a0aff` ("x86: remove 32-bit versions of readq()/writeq()"). The drivers which need readq/writeq but can do with the non-atomic ones must add the line: #include <asm-generic/io-64-nonatomic-lo-hi.h> /* or hi-lo.h */ But this will be a no-op in 64-bit environments, and no other #ifdefs are required. So I believe that this patch can solve the problem of 1. driver-specific readq/writeq 2. atomicity and order of io access This patch is tested with building allyesconfig and allmodconfig as ARCH=x86 and ARCH=i386 on top of tip/master. Cc: Kashyap Desai <Kashyap.Desai@lsi.com> Cc: Len Brown <lenb@kernel.org> Cc: Ravi Anand <ravi.anand@qlogic.com> Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Cc: James Bottomley <James.Bottomley@parallels.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Roland Dreier <roland@purestorage.com> Cc: James Bottomley <jbottomley@parallels.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-02-21 16:47:28 -08:00
Jesper Juhl	d0156f4d62	NVM Express: Remove unneeded include of linux/version.h from nvme.c There's no need for drivers/block/nvme.c to include linux/version.h, so remove the include. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-02-21 11:48:54 +01:00
Linus Torvalds	3ec1e88b33	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Says Jens: "Time to push off some of the pending items. I really wanted to wait until we had the regression nailed, but alas it's not quite there yet. But I'm very confident that it's "just" a missing expire on exit, so fix from Tejun should be fairly trivial. I'm headed out for a week on the slopes. - Killing the barrier part of mtip32xx. It doesn't really support barriers, and it doesn't need them (writes are fully ordered). - A few fixes from Dan Carpenter, preventing overflows of integer multiplication. - A fixup for loop, fixing a previous commit that didn't quite solve the partial read problem from Dave Young. - A bio integer overflow fix from Kent Overstreet. - Improvement/fix of the door "keep locked" part of the cdrom shared code from Paolo Benzini. - A few cfq fixes from Shaohua Li. - A fix for bsg sysfs warning when removing a file it did not create from Stanislaw Gruszka. - Two fixes for floppy from Vivek, preventing a crash. - A few block core fixes from Tejun. One killing the over-optimized ioc exit path, cleaning that up nicely. Two others fixing an oops on elevator switch, due to calling into the scheduler merge check code without holding the queue lock." 
* 'for-linus' of git://git.kernel.dk/linux-block: block: fix lockdep warning on io_context release put_io_context() relay: prevent integer overflow in relay_open() loop: zero fill bio instead of return -EIO for partial read bio: don't overflow in bio_get_nr_vecs() floppy: Fix a crash during rmmod floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never called cdrom: move shared static to cdrom_device_info bsg: fix sysfs link remove warning block: don't call elevator callbacks for plug merges block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator functions mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the unused member of struct driver_data block: strip out locking optimization in put_io_context() cdrom: use copy_to_user() without the underscores block: fix ioc locking warning block: fix NULL icq_cache reference block,cfq: change code order	2012-02-11 10:07:11 -08:00
Dave Young	306df0716a	loop: zero fill bio instead of return -EIO for partial read Commit `8268f5a741` ("deny partial write for loop dev fd") tried to fix the loop device partial read information leak problem, but it changed the semantics of read behavior. When we read beyond the end of the device we should get 0 bytes, which is normal behavior; we should not just return -EIO. Instead of returning -EIO, zero out the bio to avoid an information leak in case of a partial read. Signed-off-by: Dave Young <dyoung@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Tested-by: Jeff Moyer <jmoyer@redhat.com> Cc: Dmitry Monakhov <dmonakhov@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-02-08 22:07:19 +01:00
Vivek Goyal	4609dff6b5	floppy: Fix a crash during rmmod floppy driver does not call add_disk() on all the drives hence we don't take gendisk reference on request queue for these drives. Don't call put_disk() with disk->queue set, otherwise we try to put the reference we never took. Reported-and-tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de> Signed-off-by: Vivek Goyal<vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-02-08 20:03:39 +01:00
Vivek Goyal	3f9a5aabd0	floppy: Cleanup disk->queue before caling put_disk() if add_disk() was never called add_disk() takes a gendisk reference on the request queue. If the driver failed during initialization and never called add_disk(), then that extra reference is not taken. That reference is put in put_disk(). The floppy driver allocates the disk, allocates the queue, sets disk->queue and then realizes that the floppy controller is not present. It tries to tear down everything and tries to put a reference down in put_disk() which was never taken. In such error cases, clean up disk->queue before calling put_disk() so that we never try to put down a reference which was never taken in the first place. Reported-and-tested-by: Suresh Jayaraman <sjayaraman@suse.com> Tested-by: Dirk Gouders <gouders@et.bocholt.fh-gelsenkirchen.de> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-02-08 20:03:38 +01:00
Asai Thambi S P	4e8670e261	mtip32xx: removed the irrelevant argument of mtip_hw_submit_io() and the unused member of struct driver_data Removed the following: * irrelevant argument 'barrier' of mtip_hw_submit_io() * unused member 'eh_active' of struct driver_data Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-02-07 07:54:31 +01:00
Linus Torvalds	6c073a7ee2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fix safety of rbd_put_client() rbd: fix a memory leak in rbd_get_client() ceph: create a new session lock to avoid lock inversion ceph: fix length validation in parse_reply_info() ceph: initialize client debugfs outside of monc->mutex ceph: change "ceph.layout" xattr to be "ceph.file.layout"	2012-02-02 15:47:33 -08:00
Alex Elder	d23a4b3fd6	rbd: fix safety of rbd_put_client() The rbd_client structure uses a kref to arrange for cleaning up and freeing an instance when its last reference is dropped. The cleanup routine is rbd_client_release(), and one of the things it does is delete the rbd_client from rbd_client_list. It acquires node_lock to do so, but the way it is done is still not safe. The problem is that when attempting to reuse an existing rbd_client, the structure found might already be in the process of getting destroyed and cleaned up. Here's the scenario, with "CLIENT" representing an existing rbd_client that's involved in the race:
    Thread on CPU A              | Thread on CPU B
    ---------------              | ---------------
    rbd_put_client(CLIENT)       | rbd_get_client()
    kref_put()                   | (acquires node_lock)
    kref->refcount becomes 0     | __rbd_client_find() returns CLIENT
    calls rbd_client_release()   | kref_get(&CLIENT->kref);
                                 | (releases node_lock)
    (acquires node_lock)         |
    deletes CLIENT from list     | ...and starts using CLIENT...
    (releases node_lock)         |
    and frees CLIENT             | <-- but CLIENT gets freed here
Fix this by having rbd_put_client() acquire node_lock. The result could still be improved, but at least it avoids this problem. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-02 12:56:59 -08:00
Alex Elder	97bb59a03d	rbd: fix a memory leak in rbd_get_client() If an existing rbd client is found to be suitable for use in rbd_get_client(), the rbd_options structure is not being freed as it should. Fix that. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-02-02 12:49:27 -08:00
Linus Torvalds	93c3d65b28	nvme: fix merge error due to change of 'make_request_fn' fn type The type of 'make_request_fn' changed in `5a7bbad27a` ("block: remove support for bio remapping from ->make_request"), but the merge of the nvme driver didn't take that into account, and as a result the driver would compile with a warning: drivers/block/nvme.c: In function 'nvme_alloc_ns': drivers/block/nvme.c:1336:2: warning: passing argument 2 of 'blk_queue_make_request' from incompatible pointer type [enabled by default] include/linux/blkdev.h:830:13: note: expected 'void (*)(struct request_queue *, struct bio *)' but argument is of type 'int (*)(struct request_queue *, struct bio *)' It's benign, but the warning is annoying. Reported-by: Stephen Rothwell <sfr@canb.auug.org> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-18 15:41:27 -08:00
Linus Torvalds	92b5abbb44	Merge git://git.infradead.org/users/willy/linux-nvme * git://git.infradead.org/users/willy/linux-nvme: (105 commits) NVMe: Set number of queues correctly NVMe: Version 0.8 NVMe: Set queue flags correctly NVMe: Simplify nvme_unmap_user_pages NVMe: Mark the end of the sg list NVMe: Fix DMA mapping for admin commands NVMe: Rename IO_TIMEOUT to NVME_IO_TIMEOUT NVMe: Merge the nvme_bio and nvme_prp data structures NVMe: Change nvme_completion_fn to take a dev NVMe: Change get_nvmeq to take a dev instead of a namespace NVMe: Simplify completion handling NVMe: Update Identify Controller data structure NVMe: Implement doorbell stride capability NVMe: Version 0.7 NVMe: Don't probe namespace 0 Fix calculation of number of pages in a PRP List NVMe: Create nvme_identify and nvme_get_features functions NVMe: Fix memory leak in nvme_dev_add() NVMe: Fix calls to dma_unmap_sg NVMe: Correct sg list setup in nvme_map_user_pages ...	2012-01-18 12:34:09 -08:00
Linus Torvalds	16008d6416	Merge branch 'for-3.3/drivers' of git://git.kernel.dk/linux-block * 'for-3.3/drivers' of git://git.kernel.dk/linux-block: mtip32xx: do rebuild monitoring asynchronously xen-blkfront: Use kcalloc instead of kzalloc to allocate array mtip32xx: uninitialized variable in mtip_quiesce_io() mtip32xx: updates based on feedback xen-blkback: convert hole punching to discard request on loop devices xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io xen/blk[front|back]: Enhance discard support with secure erasing support. xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard together mtip32xx: update to new ->make_request() API mtip32xx: add module.h include to avoid conflict with moduleh tree mtip32xx: mark a few more items static mtip32xx: ensure that all local functions are static mtip32xx: cleanup compat ioctl handling mtip32xx: fix warnings/errors on 32-bit compiles block: Add driver for Micron RealSSD pcie flash cards	2012-01-15 12:48:41 -08:00
Linus Torvalds	b3c9dd182e	Merge branch 'for-3.3/core' of git://git.kernel.dk/linux-block * 'for-3.3/core' of git://git.kernel.dk/linux-block: (37 commits) Revert "block: recursive merge requests" block: Stop using macro stubs for the bio data integrity calls blockdev: convert some macros to static inlines fs: remove unneeded plug in mpage_readpages() block: Add BLKROTATIONAL ioctl block: Introduce blk_set_stacking_limits function block: remove WARN_ON_ONCE() in exit_io_context() block: an exiting task should be allowed to create io_context block: ioc_cgroup_changed() needs to be exported block: recursive merge requests block, cfq: fix empty queue crash caused by request merge block, cfq: move icq creation and rq->elv.icq association to block core block, cfq: restructure io_cq creation path for io_context interface cleanup block, cfq: move io_cq exit/release to blk-ioc.c block, cfq: move icq cache management to block core block, cfq: move io_cq lookup to blk-ioc.c block, cfq: move cfqd->icq_list to request_queue and add request->elv.icq block, cfq: reorganize cfq_io_context into generic and cfq specific parts block: remove elevator_queue->ops block: reorder elevator switch sequence ... Fix up conflicts in: - block/blk-cgroup.c Switch from can_attach_task to can_attach - block/cfq-iosched.c conflict with now removed cic index changes (we now use q->id instead)	2012-01-15 12:24:45 -08:00
Jens Axboe	85a0f7b220	Merge branch 'for-3.3/mtip32xx' into for-3.3/drivers	2012-01-15 10:39:35 +01:00
Paolo Bonzini	577ebb374c	block: add and use scsi_blk_cmd_ioctl Introduce a wrapper around scsi_cmd_ioctl that takes a block device. The function will then be enhanced to detect partition block devices and, in that case, subject the ioctls to whitelisting. Cc: linux-scsi@vger.kernel.org Cc: Jens Axboe <axboe@kernel.dk> Cc: James Bottomley <JBottomley@parallels.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-14 15:07:24 -08:00
Linus Torvalds	0a80939b3e	Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPD2aFAAoJENkgDmzRrbjxNzsQAIeYbbrXYLjr6kQzUSngj/eC FzjaTEfYTQIeuQCFJHcHthyc5lXV4sQbo3jOezW+Bp5yuDJL2aWIHesSfWZe7imu zQdM4VshOYdAmUR9Q0AW5zhB8Smbs7/AyABiF2jm4p0ZPOuyMDSlei9sjvE9Vjvt B7g5ht7L6kz0JbDnwwy0u5gs+tEitwpXYId9Y4ysZIBzIbL0qkPX8veOddGTMy0N 8xhWXaKtufpjvxFD2ORLDsw3AkoF1xXSNuFd/5nzCNpbeE7TW931jfkPoqJumuAO 7GLxcU9kKYl+IICobC6wBtsj/RrB7w+cBXMvPGwdBliam1qaRhUcJZi5FLM/Ha5d 2A9QDYNUpoXiO8JbPXrV9Z+Y0+Co8RilsQj7R/rjZh6AbbYCWt9nxzx2Svl/RfTr xfiimHuB2P3rHjOvpCXULwOOuE5c8MzPuWncpdjiD3uGXOY/aY+X1m+if/quJw9D grPlKL0+YiRakEYUeGG4M77KCqyKFZaF7L7UQPbqfZcj8V/9AW3/7U5I/B9RlAjs idsr4fcf5s0N+oKUyTCW1ncpUDQNiwbU2NyJQqeu1ZxaRGj72AgyvsaNeyIPDyK+ f6x95Bi7i8KLjXc9Z1KvJwh2Nxt25gNUiTYVha/9H2NpJGd1cfI15kTOGXrgddVv 1pvuGcJDZwYiwfiXr3FL =HHrh -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://github.com/rustyrussell/linux Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 * tag 'for-linus' of git://github.com/rustyrussell/linux: module_param: check that bool parameters really are bool. intelfbdrv.c: bailearly is an int module_param paride/pcd: fix bool verbose module parameter. module_param: make bool parameters really bool (drivers & misc) module_param: make bool parameters really bool (arch) module_param: make bool parameters really bool (core code) kernel/async: remove redundant declaration. printk: fix unnecessary module_param_name. lirc_parallel: fix module parameter description. module_param: avoid bool abuse, add bint for special cases. module_param: check type correctness for module_param_array modpost: use linker section to generate table. modpost: use a table rather than a giant if/else statement. 
modules: sysfs - export: taint, coresize, initsize kernel/params: replace DEBUGP with pr_debug module: replace DEBUGP with pr_debug module: struct module_ref should contains long fields module: Fix performance regression on modules with large symbol tables module: Add comments describing how the "strmap" logic works Fix up conflicts in scripts/mod/file2alias.c due to the new linker- generated table approach to adding __mod_*_device_table entries. The ARM sa11x0 mcp bus needed to be converted to that too.	2012-01-14 12:32:16 -08:00
Linus Torvalds	1a52bb0b68	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: ensure prealloc_blob is in place when removing xattr rbd: initialize snap_rwsem in rbd_add() ceph: enable/disable dentry complete flags via mount option vfs: export symbol d_find_any_alias() ceph: always initialize the dentry in open_root_dentry() libceph: remove useless return value for osd_client __send_request() ceph: avoid iput() while holding spinlock in ceph_dir_fsync ceph: avoid useless dget/dput in encode_fh ceph: dereference pointer after checking for NULL crush: fix force for non-root TAKE ceph: remove unnecessary d_fsdata conditional checks ceph: Use kmemdup rather than duplicating its implementation Fix up conflicts in fs/ceph/super.c (d_alloc_root() failure handling vs always initialize the dentry in open_root_dentry)	2012-01-13 10:29:21 -08:00
Rusty Russell	1b9fbafb3a	paride/pcd: fix bool verbose module parameter. Dan Carpenter points out that it's an int, not a bool: pcd.c:427: if (verbose > 1) pcd.c:433: if (verbose > 1) pcd.c:437: if (verbose < 2) pcd.c:506:#define DBMSG(msg) ((verbose>1)?(msg):NULL) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dan Carpenter <dan.carpenter@oracle.com>	2012-01-13 09:32:26 +10:30
Rusty Russell	90ab5ee941	module_param: make bool parameters really bool (drivers & misc) module_param(bool) used to counter-intuitively take an int. In `fddd5201` (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-01-13 09:32:20 +10:30
Alex Elder	0e805a1d85	rbd: initialize snap_rwsem in rbd_add() New rbd device structures get initialized in rbd_add(). Many of the fields rely on being initially zero-filled. However, lockdep was noticing that the rw_semaphore embedded in the header field was not getting properly initialized. Fix that. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>	2012-01-12 11:00:50 -08:00
Amit Shah	f8fb5bc23a	virtio: blk: Add freeze, restore handlers to support S4 Delete the vq and flush any pending requests from the block queue on the freeze callback to prepare for hibernation. Re-create the vq in the restore callback to resume normal function. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-01-12 15:44:45 +10:30
Amit Shah	6abd6e5a44	virtio: blk: Move vq initialization to separate function The probe and PM restore functions will share this code. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-01-12 15:44:45 +10:30
Michael S. Tsirkin	4678d6f970	virtio_blk: fix config handler race Fix a theoretical race related to config work handler: a config interrupt might happen after we flush config work but before we reset the device. It will then cause the config work to run during or after reset. Two problems with this: - if this runs after device is gone we will get use after free - access of config while reset is in progress is racy (as layout is changing). As a solution 1. flush after reset when we know there will be no more interrupts 2. add a flag to disable config access before reset Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-01-12 15:44:44 +10:30
Rusty Russell	f96fde41f7	virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf Remove wrapper functions. This makes the allocation type explicit in all callers; I used GFP_KERNEL where it seemed obvious, left it at GFP_ATOMIC otherwise. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Christoph Hellwig <hch@lst.de>	2012-01-12 15:44:42 +10:30
Matthew Wilcox	df34813990	NVMe: Set number of queues correctly The number of submission & completion queues should be set by calling Set Features, not Get Features. Reported-by: Kwok Kong <Kwok.Kong@idt.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-11 09:22:24 -05:00
Linus Torvalds	4690dfa8cd	Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze * 'next' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Wire-up new system calls microblaze: Remove NO_IRQ from architecture input: xilinx_ps2: Don't use NO_IRQ block: xsysace: Don't use NO_IRQ microblaze: Trivial asm fix microblaze: Fix debug message in module microblaze: Remove eprintk macro microblaze: Send CR before LF for early console microblaze: Change NO_IRQ to 0 microblaze: Use irq_of_parse_and_map for timer microblaze: intc: Change variable name microblaze: Use of_find_compatible_node for timer and intc microblaze: Add __cmpdi2 microblaze: Synchronize __pa __va macros	2012-01-10 17:37:49 -08:00
Matthew Wilcox	366e8217e5	NVMe: Version 0.8 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 16:30:15 -05:00
Matthew Wilcox	4eeb9215a0	NVMe: Set queue flags correctly QUEUE_FLAG_* are flags (other than QUEUE_FLAG_DEFAULT), so they cannot be ORed together. Set the queue flags using queue_flag_set_unlocked(). Reported-by: Donald Wood <donald.e.wood@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 16:29:23 -05:00
Matthew Wilcox	1c2ad9faaf	NVMe: Simplify nvme_unmap_user_pages By using the iod->nents field (the same way other I/O paths do), we can avoid recalculating the number of sg entries at unmap time, and make nvme_unmap_user_pages() easier to call. Also, use the 'write' parameter instead of assuming DMA_FROM_DEVICE. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:54:22 -05:00
Matthew Wilcox	fe304c43c6	NVMe: Mark the end of the sg list For user I/O and admin commands, we were forgetting to mark the end of the SG list. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:54:14 -05:00
Matthew Wilcox	497421880a	NVMe: Fix DMA mapping for admin commands We were always mapping as DMA_FROM_DEVICE then unmapping with DMA_TO_DEVICE which was clearly not correct. Follow the same pattern as nvme_submit_io() and key off the bottom bit of the opcode to determine whether this is a read or a write. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:54:05 -05:00
Matthew Wilcox	ff976d724a	NVMe: Rename IO_TIMEOUT to NVME_IO_TIMEOUT IO_TIMEOUT is a little too generic and might be used by other parts of the kernel in the future. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:53:54 -05:00
Matthew Wilcox	eca18b2394	NVMe: Merge the nvme_bio and nvme_prp data structures The new merged data structure is called nvme_iod. This improves performance for mid-sized I/Os (in the 16k range) since we save a memory allocation. It is also a slightly simpler interface to use. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:51:20 -05:00
Matthew Wilcox	5c1281a3bf	NVMe: Change nvme_completion_fn to take a dev The queue is only needed for some rare occasions, and it's more consistent to pass the device around. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:51:00 -05:00
Matthew Wilcox	040a93b52a	NVMe: Change get_nvmeq to take a dev instead of a namespace Upcoming patches require calling get_nvmeq when we don't have a namespace. Some callers already have the device in a local variable anyway. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:49:18 -05:00
Matthew Wilcox	c2f5b65020	NVMe: Simplify completion handling Instead of encoding the handler type in the bottom two bits of the per-completion context pointer, store the handler function as well as the context pointer. This gives us more flexibility and the code is clearer. It comes at the cost of an extra 8k of memory per queue, but this feels like a reasonable price to pay. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-01-10 14:47:46 -05:00
Linus Torvalds	90160371b3	Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits) xen/pciback: Expand the warning message to include domain id. xen/pciback: Fix "device has been assigned to X domain!" warning xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind" xen/xenbus: don't reimplement kvasprintf via a fixed size buffer xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX. Xen: consolidate and simplify struct xenbus_driver instantiation xen-gntalloc: introduce missing kfree xen/xenbus: Fix compile error - missing header for xen_initial_domain() xen/netback: Enable netback on HVM guests xen/grant-table: Support mappings required by blkback xenbus: Use grant-table wrapper functions xenbus: Support HVM backends xen/xenbus-frontend: Fix compile error with randconfig xen/xenbus-frontend: Make error message more clear xen/privcmd: Remove unused support for arch specific privcmp mmap xen: Add xenbus_backend device xen: Add xenbus device driver xen: Add privcmd device driver xen/gntalloc: fix reference counts on multi-page mappings ...	2012-01-10 10:09:59 -08:00
Linus Torvalds	98793265b4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits) Kconfig: acpi: Fix typo in comment. misc latin1 to utf8 conversions devres: Fix a typo in devm_kfree comment btrfs: free-space-cache.c: remove extra semicolon. fat: Spelling s/obsolate/obsolete/g SCSI, pmcraid: Fix spelling error in a pmcraid_err() call tools/power turbostat: update fields in manpage mac80211: drop spelling fix types.h: fix comment spelling for 'architectures' typo fixes: aera -> area, exntension -> extension devices.txt: Fix typo of 'VMware'. sis900: Fix enum typo 'sis900_rx_bufer_status' decompress_bunzip2: remove invalid vi modeline treewide: Fix comment and string typo 'bufer' hyper-v: Update MAINTAINERS treewide: Fix typos in various parts of the kernel, and fix some comments. clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR gpio: Kconfig: drop unknown symbol 'CS5535_GPIO' leds: Kconfig: Fix typo 'D2NET_V2' sound: Kconfig: drop unknown symbol ARCH_CLPS7500 ... Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new kconfig additions, close to removed commented-out old ones)	2012-01-08 13:21:22 -08:00
Linus Torvalds	972b2c7199	Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc/*/mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount ...	2012-01-08 12:19:57 -08:00
Al Viro	ece2ccb668	Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z	2012-01-06 23:15:54 -05:00
Linus Torvalds	356b95424c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: (21 commits) m68k/mac: Make CONFIG_HEARTBEAT unavailable on Mac m68k/serial: Remove references to obsolete serial config options m68k/net: Remove obsolete IRQ_FLG_* users m68k: Don't comment out syscalls used by glibc m68k/atari: Move declaration of atari_SCC_reset_done to header file m68k/serial: Remove references to obsolete CONFIG_SERIAL167 m68k/hp300: Export hp300_ledstate m68k: Initconst section fixes m68k/mac: cleanup macro case mac_scsi: fix mac_scsi on some powerbooks m68k/mac: fix powerbook 150 adb_type m68k/mac: fix baboon irq disable and shutdown m68k/mac: oss irq fixes m68k/mac: fix nubus slot irq disable and shutdown m68k/mac: enable via_alt_mapping on performa 580 m68k/mac: cleanup forward declarations m68k/mac: cleanup mac_irq_pending m68k/mac: cleanup mac_clear_irq m68k/mac: early console m68k/mvme16x: Add support for EARLY_PRINTK ... Fix up trivial conflict in arch/m68k/Kconfig.debug due to new EARLY_PRINTK config option addition clashing with movement of the BOOTPARAM options.	2012-01-06 18:28:12 -08:00
Michal Simek	ba2d5affde	block: xsysace: Don't use NO_IRQ Drivers shouldn't use NO_IRQ. Microblaze and PPC define NO_IRQ as 0 and this reference will be removed in near future. Signed-off-by: Michal Simek <monstr@monstr.eu> Reviewed-by: Ryan Mallon <rmallon@gmail.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> CC: Rob Herring <rob.herring@calxeda.com>	2012-01-05 08:34:29 +01:00
Jan Beulich	73db144b58	Xen: consolidate and simplify struct xenbus_driver instantiation The 'name', 'owner', and 'mod_name' members are redundant with the identically named fields in the 'driver' sub-structure. Rather than switching each instance to specify these fields explicitly, introduce a macro to simplify this. Eliminate further redundancy by allowing the drvname argument to DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from the ID table will be used for .driver.name). Also eliminate the questionable xenbus_register_{back,front}end() wrappers - their sole remaining purpose was the checking of the 'owner' field, proper setting of which shouldn't be an issue anymore when the macro gets used. v2: Restore DRV_NAME for the driver name in xen-pciback. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-01-04 17:01:17 -05:00
Asai Thambi S P	62ee8c13e2	mtip32xx: do rebuild monitoring asynchronously Earlier, rebuild monitoring was done in the context of probe. Now the service thread takes the responsibility of rebuild monitoring, and probe returns good status. Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-01-04 22:01:32 +01:00
Al Viro	2c9ede55ec	switch device_get_devnode() and ->devnode() to umode_t * both callers of device_get_devnode() are only interested in lower 16bits and nobody tries to return anything wider than 16bit anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:55 -05:00
Al Viro	ff01bb4832	fs: move code out of buffer.c Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export kill_bdev as well, so brd doesn't have to open code it. Reduce buffer_head.h requirement accordingly. Removed a rather large comment from invalidate_bdev, as it looked a bit obsolete to bother moving. The small comment replacing it says enough. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:07 -05:00
Jens Axboe	f748040bb8	Merge branch 'stable/for-jens-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.3/drivers	2011-12-25 16:46:46 +01:00
Linus Torvalds	b0d78ee89c	Merge branch 'for-linus' of git://git.kernel.dk/linux-block * 'for-linus' of git://git.kernel.dk/linux-block: block: don't kick empty queue in blk_drain_queue() block/swim3: Locking fixes loop: Fix discard_alignment default setting cfq-iosched: fix cfq_cic_link() race confition cfq-iosched: free cic_index if blkio_alloc_blkg_stats fails cciss: fix flush cache transfer length cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler loop: fix loop block driver discard and encryption comment block: initialize request_queue's numa node during	2011-12-16 10:05:14 -08:00
Thomas Meyer	f094148a17	xen-blkfront: Use kcalloc instead of kzalloc to allocate array The advantage of kcalloc is that it prevents integer overflows that could result from multiplying the number of elements by the element size; it is also a bit nicer to read. The semantic patch that makes this change is available in https://lkml.org/lkml/2011/11/25/107 Signed-off-by: Thomas Meyer <thomas@m3y3r.de> [v1: Separated the drivers/block/cciss_scsi.c out of this patch] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-12-16 12:36:52 -05:00
Tejun Heo	1ba64edef6	block, sx8: kill blk_insert_request() The only user left for blk_insert_request() is sx8 and it can be trivially switched to use blk_execute_rq_nowait() - special requests aren't included in io stat and sx8 doesn't use block layer tagging. Switch sx8 and kill blk_insert_request(). This patch doesn't introduce any functional difference. Only compile tested. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-14 00:33:37 +01:00
Linus Torvalds	653f42f6b6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: add missing spin_unlock at ceph_mdsc_build_path() ceph: fix SEEK_CUR, SEEK_SET regression crush: fix mapping calculation when force argument doesn't exist ceph: use i_ceph_lock instead of i_lock rbd: remove buggy rollback functionality rbd: return an error when an invalid header is read ceph: fix rasize reporting by ceph_show_options	2011-12-13 14:59:42 -08:00
Benjamin Herrenschmidt	b302545744	block/swim3: Locking fixes The old PowerMac swim3 driver has some "interesting" locking issues, using a private lock and failing to lock the queue before completing requests, which triggered WARN_ONs among others. This rips out the private lock, makes everything operate under the block queue lock, and generally makes things simpler. We used to also share a queue between the two possible instances which was problematic since we might pick the wrong controller in some cases, so make the queue and the current request per-instance and use queuedata to point to our private data which is a lot cleaner. We still share the queue lock but then, it's nearly impossible to actually use 2 swim3's simultaneously: one would need to have a Wallstreet PowerBook, the only machine afaik with two of these on the motherboard, and populate both hotswap bays with a floppy drive (the machine ships only with one), so nobody cares... While at it, add a little fix to clear up stale interrupts when loading the driver or plugging a floppy drive in a bay. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-12 12:42:12 +01:00
Finn Thain	ed04c97d51	m68k/mac: cleanup forward declarations Move some forward declarations into header files and adjust includes. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>	2011-12-10 19:52:46 +01:00
Josh Durgin	51703306b3	rbd: remove buggy rollback functionality This doesn't interact with resizing well, since it doesn't set the size of the device to the size at the snapshot. It's also an expensive operation to be synchronous. Rollback can still be done with the userspace rbd tool. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>	2011-12-07 10:46:19 -08:00
Josh Durgin	81e759fbf7	rbd: return an error when an invalid header is read This protects against opening future rbd images that have incompatible format changes. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>	2011-12-07 10:46:10 -08:00
Justin P. Mattock	42b2aa86c6	treewide: Fix typos in various parts of the kernel, and fix some comments. The below patch fixes some typos in various parts of the kernel, as well as fixes some comments. Please let me know if I missed anything, and I will try to get it changed and resent. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-12-02 14:57:31 +01:00
Lukas Czerner	dfaf3c036c	loop: Fix discard_alignment default setting discard_alignment is not relevant to the loop driver since it is supposed to be set as a workaround for the old sector 63 alignments. So set it to zero rather than block size. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-12-02 14:47:03 +01:00
Stephen M. Cameron	59bd71a81b	cciss: fix flush cache transfer length We weren't filling in the transfer length of the flush cache command (it transfers 4 bytes of zeroes). Firmware didn't seem to be bothered by this, but it should be fixed. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-28 20:12:05 +01:00
Stephen M. Cameron	6225da4815	cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler IRQF_SHARED is required for older controllers that don't support MSI(X) and which may end up sharing an interrupt. Also remove deprecated IRQF_DISABLED. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-28 20:12:05 +01:00
Dave Young	ae95757a90	loop: fix loop block driver discard and encryption comment The loop driver does not support discard if encryption is enabled, fix the comment. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-25 09:41:25 +01:00
Dan Carpenter	3e54a3d1b8	mtip32xx: uninitialized variable in mtip_quiesce_io() We recently introduced a new continue in the loop, which makes gcc complain. In theory, if MTIP_FLAG_SVC_THD_ACTIVE_BIT is set, we could hit continue over and over until eventually we time out of the loop. In that case "active" should be set to true, but right now it's uninitialized. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-24 12:59:00 +01:00
Asai Thambi S P	60ec0eecfa	mtip32xx: updates based on feedback * queue ncq commands when a non-ncq is in progress or error handling is active * merge variables 'internal_cmd_in_progress' and 'eh_active' into new variable 'flags' * get rid of read/write semaphore 'internal_sem' * new service thread to issue queued commands * use macros from ata.h for command codes * return ENOTTY for BLKFLSBUF ioctl * style changes Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-23 08:29:24 +01:00
Li Dongyang	ae18be11b5	xen-blkback: convert hole punching to discard request on loop devices As of `dfaa2ef68e`, loop devices support discard requests. We can simply issue a discard request and the loop driver will punch the hole for us, so we don't need to touch the internals of the loop device and punch the hole ourselves. Thanks. V0->V1: rebased on devel/for-jens-3.3 Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-11-18 13:28:05 -05:00
Konrad Rzeszutek Wilk	421463526f	xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io .. and move it to its own function that will deal with the discard operation. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-11-18 13:28:03 -05:00
Konrad Rzeszutek Wilk	5ea4298669	xen/blk[front\|back]: Enhance discard support with secure erasing support. Part of the blkdev_issue_discard(xx) operation is that it can also issue a secure discard operation that will permanently remove the sectors in question. We advertise that we can support that via the 'discard-secure' attribute and on the request, if the 'secure' bit is set, we will attempt to pass in REQ_DISCARD \| REQ_SECURE. CC: Li Dongyang <lidongyang@novell.com> [v1: Used 'flag' instead of 'secure:1' bit] [v2: Use 'reserved' uint8_t instead of adding a new value] [v3: Check for nseg when mapping instead of operation] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-11-18 13:28:01 -05:00
Konrad Rzeszutek Wilk	97e36834f5	xen/blk[front\|back]: Squash blkif_request_rw and blkif_request_discard together In a union type structure to deal with the overlapping attributes in a easier manner. Suggested-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-11-18 13:27:59 -05:00
Dan Carpenter	a2c2a0e668	paride: fix potential information leak in pg_read() Smatch has a new check for Rosenberg type information leaks where structs are copied to the user with uninitialized stack data in them. In this case, the pg_write_hdr struct has a hole in it. struct pg_write_hdr { char magic; /* 0 1 */ char func; /* 1 1 */ /* XXX 2 bytes hole, try to pack */ int dlen; /* 4 4 */ Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Tim Waugh <tim@cyberelk.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-16 09:21:50 +01:00
Stephen M. Cameron	0007a4c90a	cciss: auto engage SCSI mid layer at driver load time A long time ago, probably in 2002, one of the distros, or maybe more than one, loaded block drivers prior to loading the SCSI mid layer. This meant that the cciss driver, being a block driver, could not engage the SCSI mid layer at init time without panicking, and relied on being poked by a userland program after the system was up (and the SCSI mid layer was therefore present) to engage the SCSI mid layer. This is no longer the case, and cciss can safely rely on the SCSI mid layer being present at init time and engage the SCSI mid layer straight away. This means that users will see their tape drives and medium changers at driver load time without need for a script in /etc/rc.d that does this: for x in /proc/driver/cciss/cciss*; do echo "engage scsi" > $x; done However, if no tape drives or medium changers are detected, the SCSI mid layer will not be engaged. If a tape drive or medium changer is later hot-added to the system it will then be necessary to use the above script or similar for the device(s) to be accessible. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-16 09:21:49 +01:00
Dmitry Monakhov	7035b5df3c	loop: cleanup set_status interface 1) Anyone who has read access to loopdev has permission to call set_status and may change important parameters such as lo_offset, lo_sizelimit and so on, which contradicts a read-access pattern and is effectively write access. 2) Add lo_offset over i_size check to prevent blkdev_size overflow. ##Testcase_begin #dd if=/dev/zero of=./file bs=1k count=1 #losetup /dev/loop0 ./file /* userspace_application */ struct loop_info64 loinf; fd = open("/dev/loop0", O_RDONLY); ioctl(fd, LOOP_GET_STATUS64, &loinf); /* Set offset to any value which is bigger than i_size, and sizelimit * to nonzero value */ loinf.lo_offset = 4096*1024; loinf.lo_sizelimit = 1024; ioctl(fd, LOOP_SET_STATUS64, &loinf); /* After this loop device will have size similar to 0x7fffffffffxxxx */ #blockdev --getsz /dev/loop0 ##OUTPUT: 36028797018955968 ##Testcase_end [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-16 09:21:49 +01:00
Dmitry Monakhov	3bb9068278	loop: prevent information leak after failed read If a read was not fully successful, we have to fail the whole bio to prevent an information leak of old pages. ##Testcase_begin dd if=/dev/zero of=./file bs=1M count=1 losetup /dev/loop0 ./file -o 4096 truncate -s 0 ./file # Oops, loop offset is now beyond i_size, so read will silently fail. # So the bio's pages would not be cleared, which may result in an information leak. hexdump -C /dev/loop0 ##testcase_end Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-16 09:21:48 +01:00
Matthew Garrett	1937335856	The Windows driver .inf disables ASPM on all cciss devices. Do the same. Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: iss_storagedev@hp.com Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-11 22:05:54 +01:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00
Linus Torvalds	06d381484f	Merge branch 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: net: xen-netback: use API provided by xenbus module to map rings block: xen-blkback: use API provided by xenbus module to map rings xen: use generic functions instead of xen_{alloc, free}_vm_area()	2011-11-06 18:31:36 -08:00
Jens Axboe	a71f483d79	mtip32xx: update to new ->make_request() API Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-05 08:36:21 +01:00
Jens Axboe	0e838c624e	mtip32xx: add module.h include to avoid conflict with moduleh tree Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-05 08:35:10 +01:00
Jens Axboe	3ff147d3a8	mtip32xx: mark a few more items static Missed two items: mtip_major, and mtip_pci_driver. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-05 08:35:10 +01:00
Jens Axboe	6316668fbc	mtip32xx: ensure that all local functions are static Kill the declarations in the header file and mark them as static. Reshuffle a few functions to ensure that everything is properly declared before being used. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-05 08:35:10 +01:00
Jens Axboe	ef0f158734	mtip32xx: cleanup compat ioctl handling Do the conversion/copy up front instead of passing in a compat flag to the ioctl handler and subsequently to the exec_drive_taskfile() function. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-05 08:35:10 +01:00
Jens Axboe	16d02c040b	mtip32xx: fix warnings/errors on 32-bit compiles We need to clean up the compat ioctl handling, but this makes it work for now at least. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2011-11-05 08:35:10 +01:00
Sam Bradshaw	88523a6155	block: Add driver for Micron RealSSD pcie flash cards This adds mtip32xx, a driver supporting Microns line of pci-express flash storage cards. Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Sam Bradshaw <sbradshaw@micron.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-11-05 08:35:10 +01:00
Linus Torvalds	3d0a8d10cf	Merge branch 'for-3.2/drivers' of git://git.kernel.dk/linux-block * 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits) virtio-blk: use ida to allocate disk index hpsa: add small delay when using PCI Power Management to reset for kump cciss: add small delay when using PCI Power Management to reset for kump xen/blkback: Fix two races in the handling of barrier requests. xen/blkback: Check for proper operation. xen/blkback: Fix the inhibition to map pages when discarding sector ranges. xen/blkback: Report VBD_WSECT (wr_sect) properly. xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests. xen-blkfront: plug device number leak in xlblk_init() error path xen-blkfront: If no barrier or flush is supported, use invalid operation. xen-blkback: use kzalloc() in favor of kmalloc()+memset() xen-blkback: fixed indentation and comments xen-blkfront: fix a deadlock while handling discard response xen-blkfront: Handle discard requests. xen-blkback: Implement discard requests ('feature-discard') xen-blkfront: add BLKIF_OP_DISCARD and discard request struct drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd() drivers/block/loop.c: emit uevent on auto release drivers/block/cpqarray.c: use pci_dev->revision loop: always allow userspace partitions and optionally support automatic scanning ... Fix up trivial header file inclusion conflict in drivers/block/loop.c	2011-11-04 17:22:14 -07:00
Linus Torvalds	b4fdcb02f1	Merge branch 'for-3.2/core' of git://git.kernel.dk/linux-block * 'for-3.2/core' of git://git.kernel.dk/linux-block: (29 commits) block: don't call blk_drain_queue() if elevator is not up blk-throttle: use queue_is_locked() instead of lockdep_is_held() blk-throttle: Take blkcg->lock while traversing blkcg->policy_list blk-throttle: Free up policy node associated with deleted rule block: warn if tag is greater than real_max_depth. block: make gendisk hold a reference to its queue blk-flush: move the queue kick into blk-flush: fix invalid BUG_ON in blk_insert_flush block: Remove the control of complete cpu from bio. block: fix a typo in the blk-cgroup.h file block: initialize the bounce pool if high memory may be added later block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown block: drop @tsk from attempt_plug_merge() and explain sync rules block: make get_request[_wait]() fail if queue is dead block: reorganize throtl_get_tg() and blk_throtl_bio() block: reorganize queue draining block: drop unnecessary blk_get/put_queue() in scsi_cmd_ioctl() and blk_get_tg() block: pass around REQ_* flags instead of broken down booleans during request alloc/free block: move blk_throtl prototypes to block/blk.h block: fix genhd refcounting in blkio_policy_parse_and_set() ... Fix up trivial conflicts due to "mddev_t" -> "struct mddev" conversion and making the request functions be of type "void" instead of "int" in - drivers/md/{faulty.c,linear.c,md.c,md.h,multipath.c,raid0.c,raid1.c,raid10.c,raid5.c} - drivers/staging/zram/zram_drv.c	2011-11-04 17:06:58 -07:00
Matthew Wilcox	f1938f6e1e	NVMe: Implement doorbell stride capability The doorbell stride allows devices to spread out their doorbells instead of packing them tightly. This feature was added as part of ECN 003. This patch also enables support for more than 512 queues :-) Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:05 -04:00
Matthew Wilcox	ce38c14957	NVMe: Version 0.7 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:05 -04:00
Matthew Wilcox	2b2c189687	NVMe: Don't probe namespace 0 ECN 001 documented that namespace 0 is not valid. Sending an Identify with CNS of 0 and Namespace of 0 is an undefined command. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:04 -04:00
Nisheeth Bhat	0d1bc91258	Fix calculation of number of pages in a PRP List The existing calculation underestimated the number of pages required as it did not take into account the pointer at the end of each page. The replacement calculation may overestimate the number of pages required if the last page in the PRP List is entirely full. By using ->npages as a counter as we fill in the pages, we ensure that we don't try to free a page that was never allocated. Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:04 -04:00
Matthew Wilcox	bc5fc7e4b2	NVMe: Create nvme_identify and nvme_get_features functions Instead of open-coding calls to nvme_submit_admin_cmd, these small wrappers are simpler to use (the patch removes 14 lines from nvme_dev_add() for example). Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:04 -04:00
Matthew Wilcox	684f5c2025	NVMe: Fix memory leak in nvme_dev_add() The driver was allocating 8k of memory, then freeing 4k of it. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:04 -04:00
Nisheeth Bhat	d1a490e026	NVMe: Fix calls to dma_unmap_sg dma_unmap_sg() must be called with the same 'nents' passed to dma_map_sg(), not the number returned from dma_map_sg(). Signed-off-by: Nisheeth Bhat <nisheeth.bhat@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:04 -04:00
Matthew Wilcox	d0ba1e497b	NVMe: Correct sg list setup in nvme_map_user_pages Our SG list was constructed to always fill the entire first page, even if that was more than the length of the I/O. This is probably harmless, but some IOMMUs might do something bad. Correcting the first call to sg_set_page() made it look a lot closer to the sg_set_page() in the loop, so fold the first call to sg_set_page() into the loop. Reported-by: Nisheeth Bhat <nisheeth.bhat@intel.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2011-11-04 15:53:04 -04:00
Matthew Wilcox	6413214c5d	Fix bug in NVME_IOCTL_SUBMIT_IO Missing 'break' in the switch statement meant that we'd fall through to the 'return -EINVAL' case.	2011-11-04 15:53:04 -04:00
Matthew Wilcox	6bbf1acdde	NVMe: Rework ioctls Remove the special-purpose IDENTIFY, GET_RANGE_TYPE, DOWNLOAD_FIRMWARE and ACTIVATE_FIRMWARE commands. Replace them with a generic ADMIN_CMD ioctl that can submit any admin command. Add a new ID ioctl that returns the namespace ID of the queried device. It corresponds to the SCSI Idlun ioctl. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:03 -04:00
Matthew Wilcox	eac623ba7a	NVMe: Add the nvme thread to the wait queue before waking it up If the I/O was not completed by a single NVMe command, we add the bio to the congestion list and wake up the kthread to resubmit it. But the kthread calls remove_wait_queue() unconditionally, which will oops if it's not on the wait queue. So add the kthread to the wait queue before waking it up. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:03 -04:00
Matthew Wilcox	6f0f54499f	NVMe: Return real error from nvme_create_queue nvme_setup_io_queues() was assuming that a NULL return from nvme_create_queue() was an out-of-memory error. That's not necessarily true; the adapter might return -EIO, for example. Change the calling convention to return an ERR_PTR on failure instead of NULL. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:03 -04:00
Matthew Wilcox	be5e094840	NVMe: Version 0.6 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:03 -04:00
Matthew Wilcox	184d2944cb	NVMe: Add a few calling convention notes For the benefit of reviewers, add comments to a few functions describing their calling context Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:03 -04:00
Matthew Wilcox	b77954cbdd	NVMe: Handle failures from memory allocations in nvme_setup_prps If any of the memory allocations in nvme_setup_prps fail, handle it by modifying the passed-in data length to reflect the number of bytes we are actually able to send. Also allow the caller to specify the GFP flags they need; for user-initiated commands, we can use GFP_KERNEL allocations. The various callers are updated to handle this possibility; the main I/O path is already prepared for this possibility (as it may happen due to nvme_map_bio being unable to map all the segments of the I/O). The other callers return -ENOMEM instead of doing partial I/Os. Reported-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:03 -04:00
Matthew Wilcox	5aff9382dd	NVMe: Use an IDA to allocate minor numbers The current approach of using the namespace ID as the minor number doesn't work when there are multiple adapters in the machine. Rather than statically partitioning the number of namespaces between adapters, dynamically allocate minor numbers to namespaces as they are detected. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:03 -04:00
Matthew Wilcox	fd63e9ceee	NVMe: Add include of delay.h for msleep Previously it was being implicitly included through some other header file Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:02 -04:00
Matthew Wilcox	8de055350f	NVMe: Add support for timing out I/Os In the kthread, walk the list of outstanding I/Os and check they've not hit the timeout. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:02 -04:00
Matthew Wilcox	21075bdee0	NVMe: Rename cancel_cmdid_data to cancel_cmdid The trailing '_data' on the end was annoying and inconsistent. Also, make it actually return the data since this is needed for timing out commands. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:02 -04:00
Matthew Wilcox	09a58f5364	NVMe: Fix bug in error handling When an I/O completed with an error, we would call bio_endio twice (once with -EIO and once with 0). Found by inspection. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:02 -04:00
Matthew Wilcox	22605f9681	NVMe: Time out initialisation after a few seconds The device reports (in its capability register) how long it will take to initialise. If that time elapses before the ready bit becomes set, conclude the device is broken and refuse to initialise it. Log a nice error message so the user knows why we did nothing. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:02 -04:00
Matthew Wilcox	aba2080f3f	NVMe: Fix warning in free_irq We need to clear the affinity mask before calling free_irq() Reported-by: Shane Michael Matthews <shane.matthews@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:02 -04:00
Matthew Wilcox	7f53f9d242	NVMe: Correct the Controller Configuration settings The arbitration field was extended by one bit, shifting the shutdown notification bits by one. Also, the SQ/CQ entry size was made configurable for future extensions. Reported-by: Paul Luse <paul.e.luse@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:01 -04:00
Matthew Wilcox	8ef700678f	NVMe: Version 0.5 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:01 -04:00
Matthew Wilcox	6c7d49455c	NVMe: Change the definition of nvme_user_io The read and write commands don't define a 'result', so there's no need to copy it back to userspace. Remove the ability of the ioctl to submit commands to a different namespace; it's just asking for trouble, and the use case I have in mind will be addressed through a different ioctl in the future. That removes the need for both the block_shift and nsid arguments. Check that the opcode is one of 'read' or 'write'. Other opcodes may be added in the future, but we will need a different structure definition for them. The nblocks field is redefined to be 0-based. This allows the user to request the full 65536 blocks. Don't byteswap the reftag, apptag and appmask. Martin Petersen tells me these are calculated in big-endian and are transmitted to the device in big-endian. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:01 -04:00
Matthew Wilcox	4948168280	NVMe: Add compat_ioctl Make ioctls work for 32-bit applications on 64-bit kernels. The structures are defined to be the same for both 32- and 64-bit applications, so we can use the same handler for both. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:01 -04:00
Matthew Wilcox	9ecdc94621	NVMe: Simplify queue lookup Fill in all the num_possible_cpus() entries with duplicate pointers. This reduces the complexity of the frequently-called get_nvmeq(), as well as avoiding a bug in it when there are fewer queues than CPUs. Reported-by: Shane Michael Matthews <shane.matthews@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:01 -04:00
Matthew Wilcox	3cb967c039	NVMe: Remove the kthread from the wait queue Once there are no more bios on the congestion list, we can stop waking up the nvme kthread every time a completion happens. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:00 -04:00
Matthew Wilcox	7523d834dd	NVMe: Fix off-by-one when filling in PRP lists If the last element in the PRP list fits on the end of the page, there's no need to allocate an extra page to put that single element in. It can fit on the end of the page. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:00 -04:00
Matthew Wilcox	ac88c36a38	NVMe: Fix interpretation of 'Number of Namespaces' field The spec says this is a 0s based value. We don't need to handle the maximal value because it's reserved to mean "every namespace". Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:00 -04:00
Matthew Wilcox	19e899b2f9	NVMe: Remove outdated comments The head can never overrun the tail since we won't allocate enough command IDs to let that happen. The status codes are in sync with the spec. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:00 -04:00
Matthew Wilcox	fa92282149	NVMe: Fix comment formatting Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:00 -04:00
Matthew Wilcox	714a7a2288	NVMe: Convert comments to kernel-doc notation Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:53:00 -04:00
Matthew Wilcox	b57ab0fada	NVMe: Version 0.4 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:59 -04:00
Matthew Wilcox	e6d15f79f9	NVMe: Reduce maximum queue depth by 1 The spec says we're not allowed to completely fill the submission queue. Solve this by reducing the number of allocatable cmdids by 1. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:59 -04:00
Matthew Wilcox	d8ee9d69f2	NVMe: Fix discontiguous accesses When we submit subsequent portions of the I/O, we need to access the updated block, not start reading again from the original position. This was showing up as miscompares in the XFS randholes testcase. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:59 -04:00
Matthew Wilcox	1ad2f8932a	NVMe: Handle bios that contain non-virtually contiguous addresses NVMe scatterlists must be virtually contiguous, like almost all I/Os. However, when the filesystem lays out files with a hole, it can be that adjacent LBAs map to non-adjacent virtual addresses. Handle this by submitting one NVMe command at a time for each virtually discontiguous range. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:59 -04:00
Matthew Wilcox	00df5cb4eb	NVMe: Implement Flush Linux implements Flush as a bit in the bio. That means there may also be data associated with the flush; if so the flush should be sent before the data. To avoid completing the bio twice, I add CMD_CTX_FLUSH to indicate the completion routine should do nothing. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:59 -04:00
Matthew Wilcox	c42705592b	NVMe: Mark CMD_CTX_CANCELLED as being unlikely Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:59 -04:00
Matthew Wilcox	7547881d09	NVMe: Correct SQ doorbell semantics The value written to the doorbell needs to be the first free index in the queue, not the most recently used index in the queue. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:58 -04:00
Matthew Wilcox	740216fc59	NVMe: Let the kthread take care of devices earlier If interrupts are misconfigured, the kthread will be needed to process admin queue completions. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:58 -04:00
Matthew Wilcox	b348b7d543	NVMe: Rename nr_queues to nr_io_queues I got confused about whether this included the admin queue or not, and had to resort to reading the spec. It doesn't include the admin queue, so make that clear in the name. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:58 -04:00
Matthew Wilcox	ca1615424c	NVMe: Remove setting of 'flags' in rw command This was the data transfer bit until spec rev 0.92 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:58 -04:00
Matthew Wilcox	ad8a5df97c	NVMe: Release 0.3 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:58 -04:00
Matthew Wilcox	1fa6aeadf1	NVMe: Add a kthread to handle the congestion list Instead of trying to resubmit I/Os in the I/O completion path (in interrupt context), wake up a kthread which will resubmit I/O from user context. This allows mke2fs to run to completion. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:58 -04:00
Matthew Wilcox	eeee322647	NVMe: Handle failures differently in nvme_submit_bio_queue() Return -EBUSY if the queue is full or -ENOMEM if we failed to allocate memory (or map a scatterlist). Also use GFP_ATOMIC to allocate the nvme_bio and move the locking to the callers of nvme_submit_bio_queue(). In nvme_make_request(), don't permit an I/O to jump the queue -- if the congestion list already has an entry, just add to the tail, rather than trying to submit. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:58 -04:00
Matthew Wilcox	768308400f	NVMe: Handle physical merging of bvec entries In order to not overrun the sg array, we have to merge physically contiguous pages into a single sg entry. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:57 -04:00
Matthew Wilcox	1974b1ae88	NVMe: Check for DMA mapping failure If dma_map_sg returns 0 (failure), we need to fail the I/O. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:57 -04:00
Matthew Wilcox	d567760c40	NVMe: Pass the nvme_dev to nvme_free_prps and nvme_setup_prps We were passing the nvme_queue to access the q_dmadev for the dma_alloc_coherent calls, but since we moved to the dma pool API, we really only need the nvme_dev. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:57 -04:00
Matthew Wilcox	99802a7aee	NVMe: Optimise memory usage for I/Os between 4k and 128k Add a second memory pool for smaller I/Os. We can pack 16 of these on a single page instead of using an entire page for each one. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:57 -04:00
Matthew Wilcox	091b609258	NVMe: Switch to use DMA Pool API Calling dma_free_coherent from interrupt context causes warnings. Using the DMA pools delays freeing until pool destruction, so avoids the problem. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:57 -04:00
Matthew Wilcox	d534df3c73	NVMe: Rename nvme_req_info to nvme_bio There are too many things called 'info' in this driver. This data structure is auxiliary information for a struct bio, so call it nvme_bio, or nbio when used as a variable. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:56 -04:00
Shane Michael Matthews	e025344c56	NVMe: Initial PRP List support Add a pointer to the nvme_req_info to hold a new data structure (nvme_prps) which contains a list of the pages allocated to this particular request for holding PRP list entries. nvme_setup_prps() now returns this pointer. To allocate and free the memory used for PRP lists, we need a struct device, so we need to pass the nvme_queue pointer to many functions which didn't use to need it. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:56 -04:00
Matthew Wilcox	51882d00f0	NVMe: Advance the sg pointer when filling in an sg list For multipage BIOs, we were always using sg[0] instead of advancing through the list. Oops :-) Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:56 -04:00
Matthew Wilcox	d2d8703481	NVMe: Renumber the special context values If POISON_POINTER_DELTA isn't defined, ensure they're in page 0 which should never be mapped. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:56 -04:00
Matthew Wilcox	9294bbed78	NVMe: Handle the congestion list a little better In the bio completion handler, check for bios on the congestion list for this NVM queue. Also, lock the congestion list in the make_request function as the queue may end up being shared between multiple CPUs. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:56 -04:00
Matthew Wilcox	e85248e516	NVMe: Record the timeout for each command In addition to recording the completion data for each command, record the anticipated completion time. Choose a timeout of 5 seconds for normal I/Os and 60 seconds for admin I/Os. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:56 -04:00
Matthew Wilcox	ec6ce618d6	NVMe: Need to lock queue during interrupt handling If we're sharing a queue between multiple CPUs and we cancel a sync I/O, we must have the queue locked to avoid corrupting the stack of the thread that submitted the I/O. It turns out this is the same locking that's needed for the threaded irq handler, so share that code. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:56 -04:00
Matthew Wilcox	48e3d39816	NVMe: Detect command IDs completing that are out of range If the adapter completes a command ID that is outside the bounds of the array, return CMD_CTX_INVALID instead of random data, and print a message in the sync_completion handler (which is rapidly becoming the misc completion handler :-) Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:55 -04:00
Matthew Wilcox	b36235df01	NVMe: Detect commands that are completed twice Set the context value to CMD_CTX_COMPLETED, and print a message in the sync_completion handler if we see it. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:55 -04:00
Matthew Wilcox	be7b62754e	NVMe: Use a symbolic name to represent cancelled commands instead of 0 I have plans for other special values in sync_completion. Plus, this is more self-documenting, and lets us detect bogus usages. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:55 -04:00
Matthew Wilcox	58ffacb545	NVMe: Add a module parameter to use a threaded interrupt We're currently calling bio_endio from hard interrupt context. This is not a good idea for preemptible kernels as it will cause longer latencies. Using a threaded interrupt will run the entire queue processing mechanism (including bio_endio) in a thread, which can be preempted. Unfortunately, it also adds about 7us of latency to the single-I/O case, so make it a module parameter for the moment. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:55 -04:00
Matthew Wilcox	b1ad37efca	NVMe: Call put_nvmeq() before calling nvme_submit_sync_cmd() We can't have preemption disabled when we call schedule(). Accept the possibility that we'll get preempted, and it'll cost us some cacheline bounces. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:55 -04:00
Matthew Wilcox	3c0cf138d7	NVMe: Allow fatal signals to interrupt I/O If the user sends a fatal signal, sleeping in the TASK_KILLABLE state permits the task to be aborted. The only wrinkle is making sure that if/when the command completes later that it doesn't upset anything. Handle this by setting the data pointer to 0, and checking the value isn't NULL in the sync completion path. Eventually, bios can be cancelled through this path too. Note that the cmdid isn't freed to prevent reuse. We should also abort the command in the future, but this is a good start. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:55 -04:00
Matthew Wilcox	db5d0c198d	NVMe: Release 0.2 Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:54 -04:00
Matthew Wilcox	6ee44cdced	NVMe: Add download / activate firmware ioctls Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:54 -04:00
Matthew Wilcox	388f037f4e	NVMe: Move sysfs entries to the right place Because I wasn't setting driverfs_dev, the devices were showing up under /sys/devices/virtual/block. Now they appear underneath the PCI device which they belong to. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:54 -04:00
Shane Michael Matthews	5911f20039	NVMe: Disable the device before we write the admin queues In case the card has been left in a partially-configured state, write 0 to the Enable bit. Signed-off-by: Shane Michael Matthews <shane.matthews@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2011-11-04 15:52:54 -04:00

2453 Commits