Add the addresses and definitions I care about for Panel Self Refresh, as
documented in the eDP spec.
I'm sending these out before some other patches because this should be a fairly
simple one to get upstream and not require too much fuss (where the others may
have some fuss).
This file is a mess with white spacing. I tried to stay consistent with the
surrounding code.
v2: had some silly mistakes in v1 which Keith caught
Cc: Dave Airlie <airlied@redhat.com>
Cc: Keith Packard <keithp@keithp.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
Idle the GPU before doing any unmaps. We know if VT-d is in use through
an exported variable from iommu code.
This should avoid a known HW issue.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Keith Packard <keithp@keithp.com>
For the !HAVE_ATOMIC_IOMAP case the stub functions did not call
pagefault_disable/_enable. The i915 driver relies on the map
actually being atomic, otherwise it can deadlock with it's own
pagefault handler in the gtt pwrite fastpath.
This is exercised by gem_mmap_gtt from the intel-gpu-toosl gem
testsuite.
v2: Chris Wilson noted the lack of an include.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
ELD (EDID-Like Data) describes to the HDMI/DP audio driver the audio
capabilities of the plugged monitor.
This adds drm_edid_to_eld() for converting EDID to ELD. The converted
ELD will be saved in a new drm_connector.eld[128] data field. This is
necessary because the graphics driver will need to fixup some of the
data fields (eg. HDMI/DP connection type, AV sync delay) before writing
to the hardware ELD buffer. drm_av_sync_delay() will help the graphics
drivers dynamically compute the AV sync delay for fixing-up the ELD.
ELD selection policy: it's possible for one encoder to be associated
with multiple connectors (ie. monitors), in which case the first found
ELD will be returned by drm_select_eld(). This policy may not be
suitable for all users, but let's start it simple first.
The impact of ELD selection policy: assume there are two monitors, one
supports stereo playback and the other has 8-channel output; cloned
display mode is used, so that the two monitors are associated with the
same internal encoder. If only the stereo playback capability is reported,
the user won't be able to start 8-channel playback; if the 8-channel ELD
is reported, then user space applications may send 8-channel samples
down, however the user may actually be listening to the 2-channel
monitor and not connecting speakers to the 8-channel monitor.
According to James, many TVs will either refuse the display anything or
pop-up an OSD warning whenever they receive hdmi audio which they cannot
handle. Eventually we will require configurability and/or per-monitor
audio control even when the video is cloned.
CC: Zhao Yakui <yakui.zhao@intel.com>
CC: Wang Zhenyu <zhenyu.z.wang@intel.com>
CC: Jeremy Bush <contractfrombelow@gmail.com>
CC: Christopher White <c.white@pulseforce.com>
CC: Pierre-Louis Bossart <pierre-louis.bossart@intel.com>
CC: Paul Menzel <paulepanter@users.sourceforge.net>
CC: James Cloos <cloos@jhcloos.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
* 'drm-nouveau-next' of git://git.freedesktop.org/git/nouveau/linux-2.6: (353 commits)
drm/nouveau: remove allocations from gart populate() hook
drm/nvc0/fb: slightly improve PMFB intr handling, move out of nvc0_graph.c
drm/nvc0/fifo: avoid touching missing subfifos
drm/nvd9/disp: bail out of mode_set_base if no fb bound to crtc
drm/nvd9/disp: stub some more api hooks so we don't oops on resume
drm/nouveau: fix printk typo in ioremap failure path
drm/nvc0/pm: minor clock readback fixes
drm/nv40/pm: execute memory reset script from vbios
drm/nv50/gr: refactor initialisation
drm/nouveau: if requested, try harder at disabling sysmem pushbufs
drm/nv50/gr: enable ctxprog xfer only when we need it to save power
drm/nouveau/dp: add support for displayport table 0x30
drm/nouveau/dp: return master dp table pointer too when looking up encoder
drm/nouveau/bios: simplify U/d table hash matching func to just match
drm/nouveau/dp: preserve non-pattern bits in DP_TRAINING_PATTERN_SET
drm/nvc0/gr: remove MODULE_FIRMWARE() lines
drm/nouveau/dp: use alternate lane mask for nvaf
drm/nouveau/dp: link rate scripts are selected with a comparison table
drm/nv40/pm: write nv40-specific reclocking routines
drm/nv40/pm: parse geometric delta clock from vbios
...
Fix kernel-doc warning about internal/private data by marking it
as "private:" so that kernel-doc will ignore it.
Warning(include/linux/regulator/consumer.h:128): No description found for parameter 'ret'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix kernel-doc warning in net/cfg80211.h:
Warning(include/net/cfg80211.h:1884): No description found for parameter 'registered'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'perf-fixes-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
x86, perf: Check that current->mm is alive before getting user callchain
perf_event: Fix broken calc_timer_values()
perf events: Fix slow and broken cgroup context switch code
Some of the flags are OS/arch dependent we add a 9p
protocol value which maps to asm-generic/fcntl.h values in Linux
Based on the original patch from Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
This bumps driver major version as a result of previous incompatible
interface changes.
In addition, a leftover command definition is removed from the
vmwgfx_drm.h header.
Also a strict version check is enforced on the exebuf ioctl.
This is intended to be the last major bump before exiting staging.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Will be needed for queries and drm event-driven throttling.
As a benefit, they help avoid stale user-space fence handles.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is needed before we introduce the fence objects.
Otherwise this will be even more confusing. The plan is to use the following:
seqno: A 32-bit sequence number that may be passed in the fifo.
marker: Objects, carrying a seqno, that track fifo submission time. They
are used for fifo lag based throttling.
fence objects: Kernel space objects, possibly accessible from user-space and
carrying a 32-bit seqno together with signaled status.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Since we don't allow user-space to map the fifo anymore,
add a parameter to get fifo hw version and
an ioctl to copy the 3D capabilities.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecranz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This was previously used by user-space to check whether a fence
sequence had passed or not.
With fence objects that's not needed anymore.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It doesn't seem like its needed. If this turns out to be an incorrect
assumption, we can reinstate it.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It was only used for bringup debugging, and probably doesn't work
anymore. Remove it.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The new DRM_RADEON_GEM_WAIT ioctl combines GEM_WAIT_IDLE and GEM_BUSY (there
is a NO_WAIT flag to get the latter) with USAGE_READ and USAGE_WRITE flags
to take advantage of the new ttm_bo_wait changes.
Also bump the DRM version.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Sometimes we want to know whether a buffer is busy and wait for it (bo_wait).
However, sometimes it would be more useful to be able to query whether
a buffer is busy and being either read or written, and wait until it's stopped
being either read or written. The point of this is to be able to avoid
unnecessary waiting, e.g. if a GPU has written something to a buffer and is now
reading that buffer, and a CPU wants to map that buffer for read, it needs to
only wait for the last write. If there were no write, there wouldn't be any
waiting needed.
This, or course, requires user space drivers to send read/write flags
with each relocation (like we have read/write domains in radeon, so we can
actually use those for something useful now).
Now how this patch works:
The read/write flags should passed to ttm_validate_buffer. TTM maintains
separate sync objects of the last read and write for each buffer, in addition
to the sync object of the last use of a buffer. ttm_bo_wait then operates
with one the sync objects.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (42 commits)
netpoll: fix incorrect access to skb data in __netpoll_rx
cassini: init before use in cas_interruptN.
can: ti_hecc: Fix uninitialized spinlock in probe
can: ti_hecc: Fix unintialized variable
net: sh_eth: fix the compile error
net/phy: fix DP83865 phy interrupt handler
sendmmsg/sendmsg: fix unsafe user pointer access
ibmveth: Fix leak when recycling skb and hypervisor returns error
arp: fix rcu lockdep splat in arp_process()
bridge: fix a possible use after free
bridge: Pseudo-header required for the checksum of ICMPv6
mcast: Fix source address selection for multicast listener report
MAINTAINERS: Update GIT trees for network development
ath9k: Fix PS wrappers in ath9k_set_coverage_class
carl9170: Fix mismatch in carl9170_op_set_key mutex lock-unlock
wl12xx: add max_sched_scan_ssids value to the hw description
wl12xx: Fix validation of pm_runtime_get_sync return value
wl12xx: Remove obsolete testmode NVS push command
bcma: add uevent to the bus, to autoload drivers
ath9k_hw: Fix STA (AR9485) bringup issue due to incorrect MAC address
...
The current cgroup context switch code was incorrect leading
to bogus counts. Furthermore, as soon as there was an active
cgroup event on a CPU, the context switch cost on that CPU
would increase by a significant amount as demonstrated by a
simple ping/pong example:
$ ./pong
Both processes pinned to CPU1, running for 10s
10684.51 ctxsw/s
Now start a cgroup perf stat:
$ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 100
$ ./pong
Both processes pinned to CPU1, running for 10s
6674.61 ctxsw/s
That's a 37% penalty.
Note that pong is not even in the monitored cgroup.
The results shown by perf stat are bogus:
$ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 100
Performance counter stats for 'sleep 100':
CPU1 <not counted> cycles test
CPU1 16,984,189,138 cycles # 0.000 GHz
The second 'cycles' event should report a count @ CPU clock
(here 2.4GHz) as it is counting across all cgroups.
The patch below fixes the bogus accounting and bypasses any
cgroup switches in case the outgoing and incoming tasks are
in the same cgroup.
With this patch the same test now yields:
$ ./pong
Both processes pinned to CPU1, running for 10s
10775.30 ctxsw/s
Start perf stat with cgroup:
$ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 10
Run pong outside the cgroup:
$ /pong
Both processes pinned to CPU1, running for 10s
10687.80 ctxsw/s
The penalty is now less than 2%.
And the results for perf stat are correct:
$ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 10
Performance counter stats for 'sleep 10':
CPU1 <not counted> cycles test # 0.000 GHz
CPU1 23,933,981,448 cycles # 0.000 GHz
Now perf stat reports the correct counts for
for the non cgroup event.
If we run pong inside the cgroup, then we also get the
correct counts:
$ perf stat -e cycles,cycles -A -a -G test -C 1 -- sleep 10
Performance counter stats for 'sleep 10':
CPU1 22,297,726,205 cycles test # 0.000 GHz
CPU1 23,933,981,448 cycles # 0.000 GHz
10.001457237 seconds time elapsed
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110825135803.GA4697@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The nfsservctl system call is now gone, so we should remove all
linkage for it.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
omap-serial: Allow IXON and IXOFF to be disabled.
TTY: serial, document ignoring of uart->ops->startup error
TTY: pty, fix pty counting
8250: Fix race condition in serial8250_backup_timeout().
serial/8250_pci: delete duplicate data definition
8250_pci: add support for Rosewill RC-305 4x serial port card
tty: Add "spi:" prefix for spi modalias
atmel_serial: fix atmel_default_console_device
serial: 8250_pnp: add Intermec CV60 touchscreen device
drivers/serial/ucc_uart.c: Fix compiler warning
pch_uart: Set PCIe bus number using probe parameter
serial: samsung: Fix build error
We need a callback to do some things after pwm_enable, pwm_disable
and pwm_config.
Signed-off-by: Dilan Lee <dilee@nvidia.com>
Reviewed-by: Robert Morell <rmorell@nvidia.com>
Reviewed-by: Arun Murthy <arun.murthy@stericsson.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace/remove use of RIO v.1.2 registers/bits that are not
forward-compatible with newer versions of RapidIO specification.
RapidIO specification v.1.3 removed Write Port CSR, Doorbell CSR,
Mailbox CSR and Mailbox and Doorbell bits of the PEF CAR.
Use of removed (since RIO v.1.3) register bits affects users of
currently available 1.3 and 2.x compliant devices who may use not so
recent kernel versions.
Removing checks for unsupported bits makes corresponding routines
compatible with all versions of RapidIO specification. Therefore,
backporting makes stable kernel versions compliant with RIO v.1.3 and
later as well.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Chul Kim <chul.kim@idt.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Purely in-memory filesystems do not use the inode hash as the dcache
tells us if an entry already exists. As a result, they do not call
unlock_new_inode, and thus directory inodes do not get put into a
different lockdep class for i_sem.
We need the different lockdep classes, because the locking order for
i_mutex is different for directory inodes and regular inodes. Directory
inodes can do "readdir()", which takes i_mutex *before* possibly taking
mm->mmap_sem (due to a page fault while copying the directory entry to
user space).
In contrast, regular inodes can be mmap'ed, which takes mm->mmap_sem
before accessing i_mutex.
The two cases can never happen for the same inode, so no real deadlock
can occur, but without the different lockdep classes, lockdep cannot
understand that. As a result, if CONFIG_DEBUG_LOCK_ALLOC is set, this
can lead to false positives from lockdep like below:
find/645 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<ffffffff81109514>] might_fault+0x5c/0xac
but task is already holding lock:
(&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffff81149f34>]
vfs_readdir+0x5b/0xb4
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&sb->s_type->i_mutex_key#15){+.+.+.}:
[<ffffffff8108ac26>] lock_acquire+0xbf/0x103
[<ffffffff814db822>] __mutex_lock_common+0x4c/0x361
[<ffffffff814dbc46>] mutex_lock_nested+0x40/0x45
[<ffffffff811daa87>] hugetlbfs_file_mmap+0x82/0x110
[<ffffffff81111557>] mmap_region+0x258/0x432
[<ffffffff811119dd>] do_mmap_pgoff+0x2ac/0x306
[<ffffffff81111b4f>] sys_mmap_pgoff+0x118/0x16a
[<ffffffff8100c858>] sys_mmap+0x22/0x24
[<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b
-> #0 (&mm->mmap_sem){++++++}:
[<ffffffff8108a4bc>] __lock_acquire+0xa1a/0xcf7
[<ffffffff8108ac26>] lock_acquire+0xbf/0x103
[<ffffffff81109541>] might_fault+0x89/0xac
[<ffffffff81149cff>] filldir+0x6f/0xc7
[<ffffffff811586ea>] dcache_readdir+0x67/0x205
[<ffffffff81149f54>] vfs_readdir+0x7b/0xb4
[<ffffffff8114a073>] sys_getdents+0x7e/0xd1
[<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b
This patch moves the directory vs file lockdep annotation into a helper
function that can be called by in-memory filesystems and has hugetlbfs
call it.
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* '3.1-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (21 commits)
target: Convert acl_node_lock to be IRQ-disabling
target: Make locking in transport_deregister_session() IRQ safe
tcm_fc: init/exit functions should not be protected by "#ifdef MODULE"
target: Print subpage too for unhandled MODE SENSE pages
iscsi-target: Fix iscsit_allocate_se_cmd_for_tmr failure path bugs
iscsi-target: Implement iSCSI target IPv6 address printing.
target: Fix task SGL chaining breakage with transport_allocate_data_tasks
target: Fix task count > 1 handling breakage and use max_sector page alignment
target: Add missing DATA_SG_IO transport_cmd_get_valid_sectors check
target: Fix SYNCHRONIZE_CACHE zero LBA + range breakage
target: Remove duplicate task completions in transport_emulate_control_cdb
target: Fix WRITE_SAME usage with transport_get_size
target: Add WRITE_SAME (10) parsing and refactor passthrough checks
target: Fix write payload exception handling with ->new_cmd_map
iscsi-target: forever loop bug in iscsit_attach_ooo_cmdsn()
iscsi-target: remove duplicate return
target: Convert target_core_rd.c to use use BUG_ON
iscsi-target: Fix leak on failure in iscsi_copy_param_list()
target: Use ERR_CAST inlined function
target: Make standard INQUIRY return 'not connected' for tpg_virt_lun0
...
I ran into a couple of programs which broke with the new Linux 3.0
version. Some of those were binary only. I tried to use LD_PRELOAD to
work around it, but it was quite difficult and in one case impossible
because of a mix of 32bit and 64bit executables.
For example, all kind of management software from HP doesnt work, unless
we pretend to run a 2.6 kernel.
$ uname -a
Linux svivoipvnx001 3.0.0-08107-g97cd98f #1062 SMP Fri Aug 12 18:11:45 CEST 2011 i686 i686 i386 GNU/Linux
$ hpacucli ctrl all show
Error: No controllers detected.
$ rpm -qf /usr/sbin/hpacucli
hpacucli-8.75-12.0
Another notable case is that Python now reports "linux3" from
sys.platform(); which in turn can break things that were checking
sys.platform() == "linux2":
https://bugzilla.mozilla.org/show_bug.cgi?id=664564
It seems pretty clear to me though it's a bug in the apps that are using
'==' instead of .startswith(), but this allows us to unbreak broken
programs.
This patch adds a UNAME26 personality that makes the kernel report a
2.6.40+x version number instead. The x is the x in 3.x.
I know this is somewhat ugly, but I didn't find a better workaround, and
compatibility to existing programs is important.
Some programs also read /proc/sys/kernel/osrelease. This can be worked
around in user space with mount --bind (and a mount namespace)
To use:
wget ftp://ftp.kernel.org/pub/linux/kernel/people/ak/uname26/uname26.c
gcc -o uname26 uname26.c
./uname26 program
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: check size of FUSE_NOTIFY_INVAL_ENTRY message
fuse: mark pages accessed when written to
fuse: delete dead .write_begin and .write_end aops
fuse: fix flock
fuse: fix non-ANSI void function notation
tty_operations->remove is normally called like:
queue_release_one_tty
->tty_shutdown
->tty_driver_remove_tty
->tty_operations->remove
However tty_shutdown() is called from queue_release_one_tty() only if
tty_operations->shutdown is NULL. But for pty, it is not.
pty_unix98_shutdown() is used there as ->shutdown.
So tty_operations->remove of pty (i.e. pty_unix98_remove()) is never
called. This results in invalid pty_count. I.e. what can be seen in
/proc/sys/kernel/pty/nr.
I see this was already reported at:
https://lkml.org/lkml/2009/11/5/370
But it was not fixed since then.
This patch is kind of a hackish way. The problem lies in ->install. We
allocate there another tty (so-called tty->link). So ->install is
called once, but ->remove twice, for both tty and tty->link. The fix
here is to count both tty and tty->link and divide the count by 2 for
user.
And to have ->remove called, let's make tty_driver_remove_tty() global
and call that from pty_unix98_shutdown() (tty_operations->shutdown).
While at it, let's document that when ->shutdown is defined,
tty_shutdown() is not called.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Certain platform specific or Host-WiLink Interface specific actions would be
required to be taken when the chip is being enabled and after the chip is
disabled such as configuration of the mux modes for the GPIO of host connected
to the nshutdown of the chip or relinquishing UART after the chip is disabled.
Similar actions can also be taken when the chip is in deep sleep or when the
chip is awake. Performance enhancements such as configuring the host to run
faster when chip is awake and slower when chip is asleep can also be made
here.
Signed-off-by: Pavan Savoy <pavan_savoy@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch changes target_emulate_inquiry_std() to set the 'not connected'
(0x35) bit in standard INQUIRY response data when we are processing a
request to a virtual LUN=0 mapping from struct se_device *g_lun0_dev that
have been setup for us in transport_lookup_cmd_lun().
This addresses an issue where qla2xxx FC clients need to be able
to create demo-mode I_T FC Nexuses by default, but should not be
exposing the default set of TPG LUNs to all FC clients. This includes
adding an new optional target_core_fabric_ops->tpg_check_demo_mode_login_only()
caller to allow demo_mode nexuses to skip the old default of bulding
a demo-mode MappedLUNs list via core_tpg_add_node_to_devs().
(roland: Add missing tpg_check_demo_mode_login_only check in core_dev_add_lun)
Reported-by: Roland Dreier <roland@purestorage.com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@risingtidesystems.com>
* 'for-linus' of git://git.kernel.dk/linux-block: (23 commits)
Revert "cfq: Remove special treatment for metadata rqs."
block: fix flush machinery for stacking drivers with differring flush flags
block: improve rq_affinity placement
blktrace: add FLUSH/FUA support
Move some REQ flags to the common bio/request area
allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
xen/blkback: Make description more obvious.
cfq-iosched: Add documentation about idling
block: Make rq_affinity = 1 work as expected
block: swim3: fix unterminated of_device_id table
block/genhd.c: remove useless cast in diskstats_show()
drivers/cdrom/cdrom.c: relax check on dvd manufacturer value
drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse
bsg-lib: add module.h include
cfq-iosched: Reduce linked group count upon group destruction
blk-throttle: correctly determine sync bio
loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other
loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices
loop: add management interface for on-demand device allocation
loop: replace linked list of allocated devices with an idr index
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: OF: Don't crash when bridge parent is NULL.
PCI: export pcie_bus_configure_settings symbol
PCI: code and comments cleanup
PCI: make cardbus-bridge resources optional
PCI: make SRIOV resources optional
PCI : ability to relocate assigned pci-resources
PCI: honor child buses add_size in hot plug configuration
PCI: Set PCI-E Max Payload Size on fabric
Revert the pass-good area introduced in ffd1f609ab ("writeback:
introduce max-pause and pass-good dirty limits") and make the max-pause
area smaller and safe.
This fixes ~30% performance regression in the ext3 data=writeback
fio_mmap_randwrite_64k/fio_mmap_randrw_64k test cases, where there are
12 JBOD disks, on each disk runs 8 concurrent tasks doing reads+writes.
Using deadline scheduler also has a regression, but not that big as CFQ,
so this suggests we have some write starvation.
The test logs show that
- the disks are sometimes under utilized
- global dirty pages sometimes rush high to the pass-good area for
several hundred seconds, while in the mean time some bdi dirty pages
drop to very low value (bdi_dirty << bdi_thresh). Then suddenly the
global dirty pages dropped under global dirty threshold and bdi_dirty
rush very high (for example, 2 times higher than bdi_thresh). During
which time balance_dirty_pages() is not called at all.
So the problems are
1) The random writes progress so slow that they break the assumption of
the max-pause logic that "8 pages per 200ms is typically more than
enough to curb heavy dirtiers".
2) The max-pause logic ignored task_bdi_thresh and thus opens the possibility
for some bdi's to over dirty pages, leading to (bdi_dirty >> bdi_thresh)
and then (bdi_thresh >> bdi_dirty) for others.
3) The higher max-pause/pass-good thresholds somehow leads to the bad
swing of dirty pages.
The fix is to allow the task to slightly dirty over task_bdi_thresh, but
no way to exceed bdi_dirty and/or global dirty_thresh.
Tests show that it fixed the JBOD regression completely (both behavior
and performance), while still being able to cut down large pause times
in balance_dirty_pages() for single-disk cases.
Reported-by: Li Shaohua <shaohua.li@intel.com>
Tested-by: Li Shaohua <shaohua.li@intel.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
This allows the cast in lowmem_page_address (introduced as a warning
fixup to 33dd4e0ec9 "mm: make some struct page's const") to be
removed.
Propagate const'ness to page_to_section() as well since it is required
by __page_to_pfn.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Followup to 33dd4e0ec9 "mm: make some struct page's const" which missed the
HASHED_PAGE_VIRTUAL case.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rtc: Limit RTC PIE frequency
rtc: Fix hrtimer deadlock
rtc: Handle errors correctly in rtc_irq_set_state()
Fixup trivial conflicts in drivers/rtc/interface.c due to slightly
trivially versions of the same patch coming in two different ways.
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
irq: Track the owner of irq descriptor
irq: Always set IRQF_ONESHOT if no primary handler is specified
genirq: Fix wrong bit operation
Commit ae1b153962, block: reimplement
FLUSH/FUA to support merge, introduced a performance regression when
running any sort of fsyncing workload using dm-multipath and certain
storage (in our case, an HP EVA). The test I ran was fs_mark, and it
dropped from ~800 files/sec on ext4 to ~100 files/sec. It turns out
that dm-multipath always advertised flush+fua support, and passed
commands on down the stack, where those flags used to get stripped off.
The above commit changed that behavior:
static inline struct request *__elv_next_request(struct request_queue *q)
{
struct request *rq;
while (1) {
- while (!list_empty(&q->queue_head)) {
+ if (!list_empty(&q->queue_head)) {
rq = list_entry_rq(q->queue_head.next);
- if (!(rq->cmd_flags & (REQ_FLUSH | REQ_FUA)) ||
- (rq->cmd_flags & REQ_FLUSH_SEQ))
- return rq;
- rq = blk_do_flush(q, rq);
- if (rq)
- return rq;
+ return rq;
}
Note that previously, a command would come in here, have
REQ_FLUSH|REQ_FUA set, and then get handed off to blk_do_flush:
struct request *blk_do_flush(struct request_queue *q, struct request *rq)
{
unsigned int fflags = q->flush_flags; /* may change, cache it */
bool has_flush = fflags & REQ_FLUSH, has_fua = fflags & REQ_FUA;
bool do_preflush = has_flush && (rq->cmd_flags & REQ_FLUSH);
bool do_postflush = has_flush && !has_fua && (rq->cmd_flags &
REQ_FUA);
unsigned skip = 0;
...
if (blk_rq_sectors(rq) && !do_preflush && !do_postflush) {
rq->cmd_flags &= ~REQ_FLUSH;
if (!has_fua)
rq->cmd_flags &= ~REQ_FUA;
return rq;
}
So, the flush machinery was bypassed in such cases (q->flush_flags == 0
&& rq->cmd_flags & (REQ_FLUSH|REQ_FUA)).
Now, however, we don't get into the flush machinery at all. Instead,
__elv_next_request just hands a request with flush and fua bits set to
the scsi_request_fn, even if the underlying request_queue does not
support flush or fua.
The agreed upon approach is to fix the flush machinery to allow
stacking. While this isn't used in practice (since there is only one
request-based dm target, and that target will now reflect the flush
flags of the underlying device), it does future-proof the solution, and
make it function as designed.
In order to make this work, I had to add a field to the struct request,
inside the flush structure (to store the original req->end_io). Shaohua
had suggested overloading the union with rb_node and completion_data,
but the completion data is used by device mapper and can also be used by
other drivers. So, I didn't see a way around the additional field.
I tested this patch on an HP EVA with both ext4 and xfs, and it recovers
the lost performance. Comments and other testers, as always, are
appreciated.
Cheers,
Jeff
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Function genpd_queue_power_off_work() is not defined for
CONFIG_PM_RUNTIME, so pm_genpd_poweroff_unused() causes a build
error to happen in that case. Fix the problem by making
pm_genpd_poweroff_unused() depend on CONFIG_PM_RUNTIME too.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>