Commit Graph

40 Commits

Author SHA1 Message Date
Christoph Hellwig 70246286e9 block: get rid of bio_rw and READA
These two are confusing leftover of the old world order, combining
values of the REQ_OP_ and REQ_ namespaces.  For callers that don't
special case we mostly just replace bi_rw with bio_data_dir or
op_is_write, except for the few cases where a switch over the REQ_OP_
values makes more sense.  Any check for READA is replaced with an
explicit check for REQ_RAHEAD.  Also remove the READA alias for
REQ_RAHEAD.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-20 17:37:01 -06:00
Matias Bjørling 855cdd2c0b lightnvm: make rrpc_map_page call nvm_get_blk outside locks
The nvm_get_blk() function is called with rlun->lock held. This is ok
when the media manager implementation doesn't go out of its atomic
context. However, if a media manager persists its metadata, and
guarantees that the block is given to the target, this is no longer
a viable approach. Therefore, clean up the flow of rrpc_map_page,
and make sure that nvm_get_blk() is called without any locks acquired.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 41285fad51 lightnvm: remove _unlocked variant of [get/put]_blk
The [get/put]_blk API enables targets to get ownership of blocks at
runtime. This information is currently not recorded on disk, and the
information is therefore lost on power failure. To restore the
metadata, the [get/put]_blk must persist its metadata. In that case,
we need to control the outer lock, so that we can disable them while
updating the on-disk metadata. Fortunately, the _unlocked versions can
be removed, which allows us to move the lock into the [get/put]_blk
functions.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 8c39eddbf2 lightnvm: remove unused lists from struct rrpc_block
The ->list, ->open_list, and ->closed_list lists were previously used
for statistics. However, their usage have been removed, and thus these
can safely be removed.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 077d238999 lightnvm: remove open/close statistics for gennvm
The responsibility of the media manager is not to keep track of
open/closed blocks. This is better maintained within a target,
that already manages this information on writes.

Remove the statistics and merge the states NVM_BLK_ST_OPEN and
NVM_BLK_ST_CLOSED.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling 5114e2773a lightnvm: remove checkpatch warning for unsigned ints
Checkpatch found two incidents where the type was preferred to be
written out in full.

./drivers/lightnvm/rrpc.h:184: WARNING: Prefer 'unsigned int' to bare
use of 'unsigned'
./drivers/lightnvm/rrpc.h:209: WARNING: Prefer 'unsigned int' to bare
use of 'unsigned'
./drivers/lightnvm/rrpc.c:51: WARNING: Prefer 'unsigned int' to bare use
of 'unsigned'

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Wenwei Tao 0de2415bb7 lightnvm: break the loop when rqd is not null
Break the loop when rqd is not null to reduce
an unnecessary schedule.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Mike Christie 95fe6c1a20 block, fs, mm, drivers: use bio set/get op accessors
This patch converts the simple bi_rw use cases in the block,
drivers, mm and fs code to set/get the bio operation using
bio_set_op_attrs/bio_op

These should be simple one or two liner cases, so I just did them
in one patch. The next patches handle the more complicated
cases in a module per patch.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-07 13:41:38 -06:00
Javier González 116f7d4a21 lightnvm: reserved space calculation incorrect
The nvm_dev->max_pages_per_blk variable was removed in favor of the new
nvm->sec_per_blk variable. The ->max_pages_per_blk variable was still
used in rrpc_capacity, reporting the reserved capacity to zero. Replace
with ->sec_per_blk to calculate the reserved area again.

Signed-off-by: Javier González <javier@cnexlabs.com>
Updated patch description. Was "lightnvm: eliminate redundant variable"
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González 6d5be9590b lightnvm: rename nr_pages to nr_ppas on nvm_rq
The number of ppas contained on a request is not necessarily the number
of pages that it maps to neither on the target nor on the device side.
In order to avoid confusion, rename nr_pages to nr_ppas since it is what
the variable actually contains.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González cca87bc9d3 lightnvm: do not assume sequential lun alloc.
When doing GC, rrpc calculates the physical LUN to which the rrpc block
belongs too. This calculation is based on the assumption that LUNs are
assigned sequentially to the LUN list. Use the reference to the LUN
instead. This saves us the calculation and allows us to align LUNs in a
different manner to, for example, take advantage of devide parallelism.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González 57682b4915 lightnvm: do not free unused metadata on rrpc
rrpc does not save any metadata on a given request. Thus, do not attempt
to free the metadata dma region.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Simon A. F. Lund 6063fe399d lightnvm: rename nvm_targets to nvm_tgt_type
The functions nvm_register_target(), nvm_unregister_target() and
associated list refers to a target type that is being registered by a
target type module. Rename nvm_*_targets() to nvm_*_tgt_type(), so that
the intension is clear.

This enables target instances to use the _nvm_*_targets() naming.

Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Wenwei Tao 909049a719 lightnvm: store rrpc->soffset in device sector size
Since we mainly use soffset in device sector size, we therefore store
this value in rrpc->soffset, instead of the offset in 512byte sector
size. This eliminates the "(ilog2(dev->sec_size) - 9)" calculation on
each I/O.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Updated patch description.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Wenwei Tao 66e3d07f75 lightnvm: calculate rrpc total blocks and sectors up front
Calculate rrpc total blocks and sectors up front, make sense
to use them. For example, we use rrpc->nr_sects to calculate rrpc
area size, but it makes no sense if we don't initialize it up front,
since it would be zero until we finish rrpc luns init.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Wenwei Tao da1e284919 lightnvm: add a bitmap of luns
Add a bitmap of luns to indicate the status
of luns: inuse/available. When create targets
do the necessary check to avoid allocating luns
that are already allocated.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Freed dev->lun_map if nvm_core_init later failed in the init process.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-18 18:10:38 -07:00
Wenwei Tao 4c9dacb82d lightnvm: specify target's logical address area
We can create more than one target on a lightnvm
device by specifying its begin lun and end lun.

But only specify the physical address area is not
enough, we need to get the corresponding non-
intersection logical address area division from
the backend device's logcial address space.
Otherwise the targets on the device might use
the same logical addresses cause incorrect
information in the device's l2p table.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-18 18:10:37 -07:00
Linus Torvalds 237045fc3c Merge branch 'for-4.6/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "This is the block driver pull request for this merge window.  It sits
  on top of for-4.6/core, that was just sent out.

  This contains:

   - A set of fixes for lightnvm.  One from Alan, fixing an overflow,
     and the rest from the usual suspects, Javier and Matias.

   - A set of fixes for nbd from Markus and Dan, and a fixup from Arnd
     for correct usage of the signed 64-bit divider.

   - A set of bug fixes for the Micron mtip32xx, from Asai.

   - A fix for the brd discard handling from Bart.

   - Update the maintainers entry for cciss, since that hardware has
     transferred ownership.

   - Three bug fixes for bcache from Eric Wheeler.

   - Set of fixes for xen-blk{back,front} from Jan and Konrad.

   - Removal of the cpqarray driver.  It has been disabled in Kconfig
     since 2013, and we were initially scheduled to remove it in 3.15.

   - Various updates and fixes for NVMe, with the most important being:

        - Removal of the per-device NVMe thread, replacing that with a
          watchdog timer instead. From Christoph.

        - Exposing the namespace WWID through sysfs, from Keith.

        - Set of cleanups from Ming Lin.

        - Logging the controller device name instead of the underlying
          PCI device name, from Sagi.

        - And a bunch of fixes and optimizations from the usual suspects
          in this area"

* 'for-4.6/drivers' of git://git.kernel.dk/linux-block: (49 commits)
  NVMe: Expose ns wwid through single sysfs entry
  drivers:block: cpqarray clean up
  brd: Fix discard request processing
  cpqarray: remove it from the kernel
  cciss: update MAINTAINERS
  NVMe: Remove unused sq_head read in completion path
  bcache: fix cache_set_flush() NULL pointer dereference on OOM
  bcache: cleaned up error handling around register_cache()
  bcache: fix race of writeback thread starting before complete initialization
  NVMe: Create discard zero quirk white list
  nbd: use correct div_s64 helper
  mtip32xx: remove unneeded variable in mtip_cmd_timeout()
  lightnvm: generalize rrpc ppa calculations
  lightnvm: remove struct nvm_dev->total_blocks
  lightnvm: rename ->nr_pages to ->nr_sects
  lightnvm: update closed list outside of intr context
  xen/blback: Fit the important information of the thread in 17 characters
  lightnvm: fold get bb tbl when using dual/quad plane mode
  lightnvm: fix up nonsensical configure overrun checking
  xen-blkback: advertise indirect segment support earlier
  ...
2016-03-18 17:13:31 -07:00
Javier González afb18e0ed8 lightnvm: generalize rrpc ppa calculations
In rrpc, some calculations assume a certain configuration (e.g., 1 LUN,
1 sector per page). The reason behind this was that LightNVM used a
simple configuration with QEMU to test core features in the beginning.
This patch relaxes these assumptions and generalizes calculation,
allowing multiple luns to be used.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:47:53 -07:00
Matias Bjørling 4ece44af73 lightnvm: rename ->nr_pages to ->nr_sects
The struct rrpc->nr_pages can easily be interpreted as the number of
flash pages allocated to rrpc, while it is the nr_sects. Make sure that
this is reflected from the variable name.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:46:35 -07:00
Javier González 6adb03de40 lightnvm: update closed list outside of intr context
When an I/O finishes, full blocks are moved from the open to the closed
list - a lock is taken to protect the list. This happens at the moment
in the interrupt context, which is not correct.

This patch moves this logic to the block workqueue instead, avoiding
holding a spinlock without interrupt save in an interrupt context.

Signed-off-by: Javier González <javier@cnexlabs.com>
Fixes: ff0e498bfa ("lightnvm: manage open and closed blocks sepa...")
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:46:35 -07:00
Wenwei Tao 16c6d048d7 lightnvm: put bio before return
The bio is not returned if the data page cannot be allocated.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-02-04 09:19:45 -07:00
Javier González ff0e498bfa lightnvm: manage open and closed blocks separately
LightNVM targets need to know the state of the flash block when doing
flash optimizations. An example is implementing a write buffer to
respect the flash page size. Currently, block state is not accounted
for; the media manager only differentiates among free, bad and in-use
blocks.

This patch adds the logic in the generic media manager to enable
targets manage blocks into open and close separately, and it implements
such management in rrpc. It also adds a set of flags to describe the
state of the block (open, closed, free, bad).

In order to avoid taking two locks (nvm_lun and rrpc_lun) consecutively,
we introduce lockless get_/put_block primitives so that the open and
close list locks and future common logic is handled within the nvm_lun
lock.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Javier González d7a64d275b lightnvm: reference rrpc lun in rrpc block
Currently, a rrpc block only points to its nvm_lun. If a user wants to
find the associated rrpc lun, it will have to calculate the index and
look it up manually. By referencing the rrpc lun directly, this step can
be omitted, at the cost of a larger memory footprint.

This is important for upcoming patches that implement write buffering in
rrpc.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Matias Bjørling 72d256ecc5 lightnvm: move rq->error to nvm_rq->error
Instead of passing request error into the LightNVM modules, incorporate
it into the nvm_rq.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Wenwei Tao 4b79beb4c3 lightnvm: move the pages per block check out of the loop
There is no need to check whether dev's pages per block is
beyond rrpc support every time we init a lun, we only need
to check it once before enter the lun init loop.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Wenwei Tao b262924be0 lightnvm: fix locking and mempool in rrpc_lun_gc
This patch fix two issues in rrpc_lun_gc

1. prio_list is protected by rrpc_lun's lock not nvm_lun's, so
acquire rlun's lock instead of lun's before operate on the list.

2. we delete block from prio_list before allocating gcb, but gcb
allocation may fail, we end without putting it back to the list,
this makes the block won't get reclaimed in the future. To solve
this issue, delete block after gcb allocation.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao d0ca798f96 lightnvm: put block back to gc list on its reclaim fail
We delete a block from the gc list before reclaim it, so
put it back to the list on its reclaim fail, otherwise
this block will not get reclaimed and be programmable
in the future.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao 2b11c1b24e lightnvm: check bi_error in gc
We should check last io completion status before
starting another one.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Matias Bjørling 91276162de lightnvm: refactor end_io functions for sync
To implement sync I/O support within the LightNVM core, the end_io
functions are refactored to take an end_io function pointer instead of
testing for initialized media manager, followed by calling its end_io
function.

Sync I/O can then be implemented using a callback that signal I/O
completion. This is similar to the logic found in blk_to_execute_io().
By implementing it this way, the underlying device I/Os submission logic
is abstracted away from core, targets, and media managers.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao c27278bddd lightnvm: unlock rq and free ppa_list on submission fail
When rrpc_write_ppalist_rq and rrpc_read_ppalist_rq succeed, we setup
rq correctly, but nvm_submit_io may afterward fail since it cannot
allocate request or nvme_nvm_command, we return error but forget to
cleanup the previous work.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Javier Gonzalez 3bfbc6adbc lightnvm: add check after mempool allocation
The mempool allocation might fail. Make sure to return error when it
does, instead of causing a kernel panic.

Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao 3cd485b1f8 lightnvm: fix bio submission issue
Put bio when submission fails, since we get it
before submission. And return error when backend
device driver doesn't provide a submit_io method,
thus we can end IO properly.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:15 -07:00
Matias Bjørling 16f26c3aa9 lightnvm: replace req queue with nvmdev for lld
In the case where a request queue is passed to the low lever lightnvm
device drive integration, the device driver might pass its admin
commands through another queue. Instead pass nvm_dev, and let the
low level drive the appropriate queue.

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Wenwei Tao d3d1a43842 lightnvm: put blks when luns configure failed
Put the allocated blocks back to the free list
when the luns configure failed, to make these
blocks useable to others.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Wenwei Tao f27a629953 lightnvm: use flags in rrpc_get_blk
rrpc_get_blk use constant 0 as the input parameter
of nvm_get_blk, this may result in getting gc block
failed unexpectedly.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Matias Bjørling 7386af270c lightnvm: remove linear and device addr modes
The linear and device specific address modes can be replaced with a
simple offset and bit length conversion that is generic across all
devices.

This both simplifies the specification and removes the special case for
qemu nvme, that previously relied on the linear address mapping.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:20:34 -07:00
Jens Axboe dece16353e block: change ->make_request_fn() and users to return a queue cookie
No functional changes in this patch, but it prepares us for returning
a more useful cookie related to the IO that was queued up.

Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
2015-11-07 10:40:46 -07:00
Matias Bjørling b7ceb7d500 lightnvm: refactor phys addrs type to u64
For cases where CONFIG_LBDAF is not set. The struct ppa_addr exceeds its
type on 32 bit architectures. ppa_addr requires a 64bit integer to hold
the generic ppa format. We therefore refactor it to u64 and
replaces the sector_t usages with u64 for physical addresses.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-03 09:53:24 -07:00
Matias Bjørling ae1519ec44 rrpc: Round-robin sector target with cost-based gc
This target allows an Open-Channel SSD to be exposed asas a block
device.

It implements a round-robin approach for sector allocation,
together with a greedy cost-based garbage collector.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-29 16:21:42 +09:00