linux

Author	SHA1	Message	Date
Christoph Lameter	f8891e5e1f	[PATCH] Light weight event counters The remaining counters in page_state after the zoned VM counter patches have been applied are all just for show in /proc/vmstat. They have no essential function for the VM. We use a simple increment of per cpu variables. In order to avoid the most severe races we disable preempt. Preempt does not prevent the race between an increment and an interrupt handler incrementing the same statistics counter. However, that race is exceedingly rare, we may only lose one increment or so and there is no requirement (at least not in kernel) that the vm event counters have to be accurate. In the non preempt case this results in a simple increment for each counter. For many architectures this will be reduced by the compiler to a single instruction. This single instruction is atomic for i386 and x86_64. And therefore even the rare race condition in an interrupt is avoided for both architectures in most cases. The patchset also adds an off switch for embedded systems that allows building Linux kernels without these counters. The implementation of these counters is through inline code that hopefully results in only a single increment instruction being emitted (i386, x86_64) or in the increment being hidden through instruction concurrency (EPIC architectures such as ia64 can get that done). Benefits: - VM event counter operations usually reduce to a single inline instruction on i386 and x86_64. - No interrupt disable, only preempt disable for the preempt case. Preempt disable can also be avoided by moving the counter into a spinlock. - Handling is similar to zoned VM counters. - Simple and easily extendable. - Can be omitted to reduce memory use for embedded use. References: RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2 RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2 local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2 V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2 V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2 V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-30 11:25:36 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Chandra Seetharaman	5a67e4c5b6	[PATCH] cpu hotplug: use hotplug version of cpu notifier in appropriate places Make use of the newly defined hotplug version of cpu_notifier functionality wherever appropriate. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:41 -07:00
Chandra Seetharaman	054cc8a2d8	[PATCH] cpu hotplug: revert initdata patch submitted for 2.6.17 This patch reverts notifier_block changes made in 2.6.17 Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:41 -07:00
Andreas Mohr	d6e05edc59	spelling fixes acquired (aquired) contiguous (contigious) successful (succesful, succesfull) surprise (suprise) whether (weather) some other misspellings Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-26 18:35:02 +02:00
Andi Kleen	8269730b38	[BLOCK] Fix bounce limit address check Do a safer check for when to enable DMA. Currently we enable ISA DMA for cases that do not need it, resulting in OOM conditions when ZONE_DMA runs out of space. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-06-23 17:10:39 +02:00
Jens Axboe	b31dc66a54	[PATCH] Kill PF_SYNCWRITE flag A process flag to indicate whether we are doing sync io is incredibly ugly. It also causes performance problems when one does a lot of async io and then proceeds to sync it. Part of the io will go out as async, and the other part as sync. This causes a disconnect between the previously submitted io and the synced io. For io schedulers such as CFQ, this will cause us lost merges and suboptimal behaviour in scheduling. Remove PF_SYNCWRITE completely from the fsync/msync paths, and let the O_DIRECT path just directly indicate that the writes are sync by using WRITE_SYNC instead. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-06-23 17:10:39 +02:00
Paolo 'Blaisorblade' Giarrusso	a038e25364	[PATCH] blk_start_queue() must be called with irq disabled - add warning The queue lock can be taken from interrupts so it must always be taken with irq disabling primitives. Some primitives already verify this. blk_start_queue() is called under this lock, so interrupts must be disabled. Also document this requirement clearly in blk_init_queue(), where the queue spinlock is set. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-06-23 17:10:38 +02:00
Oleg Nesterov	626ab0e69d	[PATCH] list: use list_replace_init() instead of list_splice_init() list_splice_init(list, head) does unneeded job if it is known that list_empty(head) == 1. We can use list_replace_init() instead. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:43:07 -07:00
Jens Axboe	fd0ff8aa1d	[PATCH] blk: fix gendisk->in_flight accounting during barrier sequence While executing a barrier sequence, the bar_rq which carries the actual write was accounted as normal IO on completion, while it wasn't on queueing. This caused gendisk->in_flight to be decremented by 1 after each barrier, which messed up the statistics. This patch makes bar_rq not accounted as normal IO. As the containing barrier request as a whole is accounted, part of it shouldn't be. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-05-23 10:39:43 -07:00
Jens Axboe	dac07ec121	[BLOCK] limit request_fn recursion Don't recurse back into the driver even if the unplug threshold is met, when the driver asks for a requeue. This is both silly from a logical point of view (requeues typically happen due to driver/hardware shortage), and also dangerous since we could hit an endless request_fn -> requeue -> unplug -> request_fn loop and crash on stack overrun. Also limit blk_run_queue() to one level of recursion, similar to how blk_start_queue() works. This patch fixed a real problem with SLES10 and lpfc, and it could hit any SCSI lld that returns non-zero from its ->queuecommand() handler. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-05-11 12:38:59 -07:00
Chandra Seetharaman	649bbaa484	[PATCH] Remove __devinitdata from notifier block definitions Few of the notifier_chain_register() callers use __devinitdata in the definition of notifier_block data structure. It is incorrect as the data structure should be available after the initializations (they do not unregister them during initializations). This was leading to an oops when notifier_chain_register() call is invoked for those callback chains after initialization. This patch fixes all such usages to _not_ have the notifier_block data structure in the init data section. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-26 08:27:50 -07:00
Coywolf Qi Hunt	7daac49020	[patch] cleanup: use blk_queue_stopped This cleans up the source to use blk_queue_stopped. Signed-off-by: Coywolf Qi Hunt <qiyong@freeforge.net> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-20 13:04:36 +02:00
Martin Waitz	a580290c3e	Documentation: fix minor kernel-doc warnings This patch updates the comments to match the actual code. Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-04-02 13:59:55 +02:00
Linus Torvalds	7baf398f12	Merge branch 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block * 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block: [BLOCK] cfq-iosched: seek and async performance fixes [PATCH] ll_rw_blk: fix 80-col offender in put_io_context() [PATCH] cfq-iosched: small cfq_choose_req() optimization [PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree	2006-03-28 09:25:44 -08:00
KAMEZAWA Hiroyuki	0a94502277	[PATCH] for_each_possible_cpu: fixes for generic part replaces for_each_cpu with for_each_possible_cpu(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-28 09:16:05 -08:00
Jens Axboe	7143dd4b01	[PATCH] ll_rw_blk: fix 80-col offender in put_io_context() This makes akpm more happy. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-03-28 09:00:28 +02:00
Jens Axboe	e2d74ac066	[PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree On setups with many disks, we spend a considerable amount of time looking up the process-disk mapping on each queue of io. Testing with a NULL based block driver, this costs 40-50% reduction in throughput for 1000 disks. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-03-28 08:59:01 +02:00
Linus Torvalds	4fa639123d	Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block * 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] Don't make debugfs depend on DEBUG_KERNEL [PATCH] Fix blktrace compile with sysfs not defined [PATCH] unused label in drivers/block/cciss. [BLOCK] increase size of disk stat counters [PATCH] blk_execute_rq_nowait-speedup [PATCH] ide-cd: quiet down GPCMD_READ_CDVD_CAPACITY failure [BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion [PATCH] kzalloc() conversion in drivers/block [PATCH] update max_sectors documentation	2006-03-27 08:46:49 -08:00
NeilBrown	89e5c8b5b8	[PATCH] md: Make sure QUEUE_FLAG_CLUSTER is set properly for md. This flag should be set for a virtual device iff it is set for all underlying devices. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-27 08:45:00 -08:00
Andrew Morton	4c5d0bbde9	[PATCH] blk_execute_rq_nowait-speedup Both elv_add_request() and generic_unplug_device() grab the queue lock and disable interrupts, do that locally and use the __ variants. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-03-27 09:29:02 +02:00
Jens Axboe	f68110fc28	[BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion Signed-off-by: Jens Axboe <axboe@suse.de>	2006-03-27 09:29:02 +02:00
Jens Axboe	2056a782f8	[PATCH] Block queue IO tracing support (blktrace) as of 2006-03-23 Signed-off-by: Jens Axboe <axboe@suse.de>	2006-03-23 20:00:26 +01:00
Al Viro	483f4afc42	[PATCH] fix sysfs interaction and lifetime rules handling for queues	2006-03-18 18:34:37 -05:00
Al Viro	334e94de9b	[PATCH] deal with rmmod/put_io_context() races Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-03-18 18:34:15 -05:00
Al Viro	c981ff9f89	[PATCH] fix locking in queue_requests_store() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-03-18 18:33:51 -05:00
Al Viro	8669aafdb5	[PATCH] fix double-free in blk_init_queue_node() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-03-18 18:33:49 -05:00
Andi Kleen	5ee1af9f51	[PATCH] block: disable block layer bouncing for most memory on 64bit systems The low level PCI DMA mapping functions should handle it in most cases. This should fix problems with depleting the DMA zone early. The old code used precious GFP_DMA memory in many cases where it was not needed. Signed-off-by: Andi Kleen <ak@suse.de> Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-08 18:10:31 -08:00
Tejun Heo	30e9656cc3	[PATCH] block: implement elv_insert and use it (fix ordcolor flipping bug) q->ordcolor must only be flipped on initial queueing of a hardbarrier request. Constructing ordered sequence and requeueing used to pass through __elv_add_request() which flips q->ordcolor when it sees a barrier request. This patch separates out elv_insert() from __elv_add_request() and uses elv_insert() when constructing ordered sequence and requeueing. elv_insert() inserts the given request at the specified position and does nothing else. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-08 07:52:58 -08:00
Jens Axboe	9a7a67af8b	[PATCH] fix ordering on requeued request drainage Previously, if a fs request which was being drained failed and got requeued, blk_do_ordered() didn't allow it to be reissued, which causes queue stall. This patch makes blk_do_ordered() use the sequence of each request to determine whether a request can be issued or not. This fixes the bug and simplifies code. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-05 11:06:51 -08:00
Eric Dumazet	88a2a4ac6b	[PATCH] percpu data: only iterate over possible CPUs percpu_data blindly allocates bootmem memory to store NR_CPUS instances of cpudata, instead of allocating memory only for possible cpus. As a preparation for changing that, we need to convert various 0 -> NR_CPUS loops to use for_each_cpu(). (The above only applies to users of asm-generic/percpu.h. powerpc has gone it alone and is presently only allocating memory for present CPUs, so it's currently corrupting memory). Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@steeleye.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <axboe@suse.de> Cc: Anton Blanchard <anton@samba.org> Acked-by: William Irwin <wli@holomorphy.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-05 11:06:51 -08:00
Jun'ichi "Nick" Nomura	3eaf840e0b	[PATCH] device-mapper disk statistics: timing Record I/O timing statistics The start time is added to struct dm_io, an existing structure allocated privately internally within dm and attached to each incoming bio. We export disk_round_stats() from block/ll_rw_blk.c instead of creating a private clone. Signed-off-by: Jun'ichi "Nick" Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-01 08:53:11 -08:00
Jens Axboe	fddfdeafa8	[BLOCK] A few kerneldoc fixups Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-31 15:24:34 +01:00
Tetsuo Takata	60481b12b8	[BLOCK] ll_rw_blk: fix setting of ->ordered on init This makes XFS barrier mounts succeed on my SCSI system. Signed-off-by: Tetsuo Takata <takatatt@intellilink.co.jp> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-24 10:34:36 +01:00
Jens Axboe	53e86061b5	[BLOCK] ll_rw_blk: use preempt-disabling disk_stat_add() in completion It can legally be called with interrupts/preemption enabled. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-24 10:06:19 +01:00
Jens Axboe	2cb2e147a6	[BLOCK] ll_rw_blk: make max_sectors and max_hw_sectors unsigned ints IDE lba48 can support full 64k request size, which overflows the max_hw_sectors variable. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-24 10:06:19 +01:00
Linus Torvalds	e2688f00dc	Merge branch 'blk-softirq' of git://brick.kernel.dk/data/git/linux-2.6-block Manual merge for trivial #include changes	2006-01-09 09:26:40 -08:00
Jens Axboe	ff856bad67	[BLOCK] ll_rw_blk: Enable out-of-order request completions through softirq Request completion can be a quite heavy process, since it needs to iterate through the entire request and complete the bio's it holds. This patch adds blk_complete_request() which moves this processing into a dedicated block softirq. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-09 16:02:34 +01:00
Jens Axboe	356cebea11	[BLOCK] Kill blk_attempt_remerge() It's a broken interface, it's done way too late. And apparently it triggers slab problems in recent kernels as well (most likely after the generic dispatch code was merged). So kill it, ide-cd is the only user of it. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-09 15:30:20 +01:00
Nicolas Kaiser	1abee6d2d1	[BLOCK][TRIVIAL] ll_rw_blk: header included twice linux/blkdev.h included twice Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-09 14:44:15 +01:00
Tejun Heo	797e7dbbee	[BLOCK] reimplement handling of barrier request Reimplement handling of barrier requests. * Flexible handling to deal with various capabilities of target devices. * Retry support for falling back. * Tagged queues which don't support ordered tag can do ordered. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-06 09:51:03 +01:00
Tejun Heo	52d9e67536	[BLOCK] ll_rw_blk: separate out bio init part from __make_request Separate out bio initialization part from __make_request. It will be used by the following blk_ordered_reimpl. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-06 09:49:58 +01:00
Tejun Heo	8ffdc6550c	[BLOCK] add @uptodate to end_that_request_last() and @error to rq_end_io_fn() add @uptodate argument to end_that_request_last() and @error to rq_end_io_fn(). there's no generic way to pass error code to request completion function, making generic error handling of non-fs request difficult (rq->errors is driver-specific and each driver uses it differently). this patch adds @uptodate to end_that_request_last() and @error to rq_end_io_fn(). for fs requests, this doesn't really matter, so just using the same uptodate argument used in the last call to end_that_request_first() should suffice. imho, this can also help the generic command-carrying request jens is working on. Signed-off-by: tejun heo <htejun@gmail.com> Signed-Off-By: Jens Axboe <axboe@suse.de>	2006-01-06 09:49:03 +01:00
Arjan van de Ven	64100099ed	[BLOCK] mark some block/ variables cons the patch below marks various read-only variables in block/* as const, so that gcc can optimize the use of them; eg gcc will replace the use by the value directly now and will even remove the memory usage of these. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-06 09:46:02 +01:00
Jens Axboe	88ee5ef157	[BLOCK] ll_rw_blk: fastpath get_request() Originally from: Nick Piggin <nickpiggin@yahoo.com.au> Move current_io_context out of the get_request fastpath. Also try to streamline a few other things in this area. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-01-06 09:39:04 +01:00
Mike Christie	defd94b754	[SCSI] separate max_sectors from max_hw_sectors - export __blk_put_request and blk_execute_rq_nowait needed for async REQ_BLOCK_PC requests - separate max_hw_sectors and max_sectors for block/scsi_ioctl.c and SG_IO bio.c helpers per Jens's last comments. Since block/scsi_ioctl.c SG_IO was already testing against max_sectors and SCSI-ml was setting max_sectors and max_hw_sectors to the same value this does not change any scsi SG_IO behavior. It only prepares ll_rw_blk.c, scsi_ioctl.c and bio.c for when SCSI-ml begins to set a valid max_hw_sectors for all LLDs. Today if a LLD does not set it SCSI-ml sets it to a safe default and some LLDs set it to an artificially low value to overcome memory and feedback issues. Note: Since we now cap max_sectors to BLK_DEF_MAX_SECTORS, which is 1024, drivers that used to call blk_queue_max_sectors with a large value of max_sectors will now see the fs requests capped to BLK_DEF_MAX_SECTORS. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-15 15:11:40 -08:00
Mike Christie	6e39b69e7e	[SCSI] export blk layer functions needed for blk_execute_rq_nowait To send async requests we need these two functions exported. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2005-12-14 19:00:50 -08:00
Coywolf Qi Hunt	eb97b73d75	[BLOCK] new block/ directory comment tidy Some leftover comments referring to drivers/block that are now block/. They don't add any information we don't already have, so kill them. Signed-off-by: Coywolf Qi Hunt <qiyong@fc-cn.com> Signed-off-by: Jens Axboe <axboe@suse.de>	2005-11-18 21:59:31 +01:00
Linus Torvalds	333c47c847	Merge branch 'block-dir' of git://brick.kernel.dk/data/git/linux-2.6-block	2005-11-07 08:32:39 -08:00
Jens Axboe	3a65dfe8c0	[BLOCK] Move all core block layer code to new block/ directory drivers/block/ is right now a mix of core and driver parts. Let's move the core parts to a new top level directory. Al will move the fs/ related block parts to block/ next. Signed-off-by: Jens Axboe <axboe@suse.de>	2005-11-04 08:43:35 +01:00

100 Commits