We already use them for openat() and friends, but fstat() also wants to
be able to use O_PATH file descriptors. This should make it more
directly comparable to the O_SEARCH of Solaris.
Note that you could already do the same thing with "fstatat()" and an
empty path, but just doing "fstat()" directly is simpler and faster, so
there is no reason not to just allow it directly.
See also commit 332a2e1244, which did the same thing for fchdir, for
the same reasons.
Reported-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org # O_PATH introduced in 3.0+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit a606dac368 adds support to link
devices which have _PRx, if a device does not have _PRx, a warning
message will be printed.
This commit is for ZPODD on Intel ZPODD capable platforms, on other
platforms, it has no problem if there is no power resource for this
device, so a warning here is not appropriate, change it to debug.
Reported-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Commit 3236b2d469 ("IB/qib: MADs with misset M_Keys should return
failure") introduced a return code assignment that unfortunately
introduced an unconditional exit for the routine due to missing braces.
This patch adds the braces to correct the original patch.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Fix CQE expansion of unsignaled WQE -- don't expand the CQE when the
WQE index of the completed CQE matches with last pending WQE (tail) in
the queue.
Signed-off-by: Parav Pandit <parav.pandit@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
After calling into the lower filesystem to do a rename, the lower target
inode's attributes were not copied up to the eCryptfs target inode. This
resulted in the eCryptfs target inode staying around, rather than being
evicted, because i_nlink was not updated for the eCryptfs inode. This
also meant that eCryptfs didn't do the final iput() on the lower target
inode so it stayed around, as well. This would result in a failure to
free up space occupied by the target file in the rename() operation.
Both target inodes would eventually be evicted when the eCryptfs
filesystem was unmounted.
This patch calls fsstack_copy_attr_all() after the lower filesystem
does its ->rename() so that important inode attributes, such as i_nlink,
are updated at the eCryptfs layer. ecryptfs_evict_inode() is now called
and eCryptfs can drop its final reference on the lower inode.
http://launchpad.net/bugs/561129
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Cc: <stable@vger.kernel.org> [2.6.39+]
Since eCryptfs only calls fput() on the lower file in
ecryptfs_release(), eCryptfs should call the lower filesystem's
->flush() from ecryptfs_flush().
If the lower filesystem implements ->flush(), then eCryptfs should try
to flush out any dirty pages prior to calling the lower ->flush(). If
the lower filesystem does not implement ->flush(), then eCryptfs has no
need to do anything in ecryptfs_flush() since dirty pages are now
written out to the lower filesystem in ecryptfs_release().
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Fixes a regression caused by:
821f749 eCryptfs: Revert to a writethrough cache model
That patch reverted some code (specifically, 32001d6f) that was
necessary to properly handle open() -> mmap() -> close() -> dirty pages
-> munmap(), because the lower file could be closed before the dirty
pages are written out.
Rather than reapplying 32001d6f, this approach is a better way of
ensuring that the lower file is still open in order to handle writing
out the dirty pages. It is called from ecryptfs_release(), while we have
a lock on the lower file pointer, just before the lower file gets the
final fput() and we overwrite the pointer.
https://launchpad.net/bugs/1047261
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Artemy Tregubenko <me@arty.name>
Tested-by: Artemy Tregubenko <me@arty.name>
Tested-by: Colin Ian King <colin.king@canonical.com>
YAMON requires and enforces the RTC Data Mode (Register B, DM bit) to
binary, that is the bit is set every time the board goes through the
firmware bootstrap sequence. Likewise its calendar manipulation commands
interpret or set the RTC registers unconditionally as binary, never
actually checking what the value of the DM bit is, under the (correct)
assumption that it has been previously set, to indicate the binary mode.
A change to Linux a while ago however introduced a platform-specific
tweak that clears that bit and therefore forces the data mode to BCD.
This causes clock corruption and misinterpretation that has to be fixed up
by user-mode tools in system startup scripts as the initial clock is often
incorrect according to the BCD interpretation forced.
This change removes the hack; a comment included refers to alarm code,
but even if it was broken at one point by requiring the BCD mode, it
should have been trivially corrected and even if not, given how rarely the
alarm feature is used, that was not really a reasonable justification to
break the system clock that is indeed used by virtually everything. And
either way the alarm code has been since fixed anyway.
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4336/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When using the commands below to write some data to a virtio-scsi LUN of the
QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will crash.
# sudo mkfs.ext4 /dev/sdb (/dev/sdb is the virtio-scsi LUN.)
# sudo mount /dev/sdb /mnt
# dd if=/dev/zero of=/mnt/file bs=1M count=1024
In current implementation, sg_set_buf is called to add buffers to sg list which
is put into the virtqueue eventually. But if there are some HighMem pages in
table->sgl you can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
in QEMU because an invalid GPA is passed by virtqueue.
Two solutions are discussed here:
http://lkml.indiana.edu/hypermail/linux/kernel/1207.3/00675.html
Finally, value assignment approach was adopted because:
Value assignment creates a well-formed scatterlist, because the termination
marker in source sg_list has been set in blk_rq_map_sg(). The last entry of the
source sg_list is just copied to the the last entry in destination list. Note
that, for now, virtio_ring does not care about the form of the scatterlist and
simply processes the first out_num + in_num consecutive elements of the sg[]
array.
I have tested the patch on my workstation. QEMU would not crash any more.
Cc: <stable@vger.kernel.org> # 3.4: 4fe74b1: [SCSI] virtio-scsi: SCSI driver
Signed-off-by: Wang Sen <senwang@linux.vnet.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The code currently always selects turbo mode for PCA9665, no matter which
clock frequency is configured. This is because it compares the clock frequency
against constants reflecting (boundary / 100). Compare against real boundary
frequencies to fix the problem.
Signed-off-by: Thomas Kavanagh <tkavanagh@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Guide people to where their patches can be found these days.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Ben Dooks <ben-linux@fluff.org>
We noticed a plymouth bug on Fedora 18, and I then
noticed this stupid thinko, fixing it fixed the problem
with plymouth.
Cc: stable@vger.kernel.org
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Avoid the construction of a malformed DMA request sent to
the DMA controller.
Log message is for debug only because this condition is unlikely to
append and may only trigger at driver development time.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Stable <stable@vger.kernel.org> [2.6.31+]
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
s/dma_memcpy/slave_sg/ and it is sg length that we are
talking about.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Stable <stable@vger.kernel.org> [2.6.31+]
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Alex writes:
This is the current set of radeon fixes for 3.6. Two small fixes:
- fix the fence issues introduced in 3.5 with 64-bit fences
- PLL fix for multiple DP heads
* 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: make 64bit fences more robust v3
drm/radeon: rework pll selection (v3)
This patch adds on the fixes done in commits 89dd86db78 ("mlx4_core:
Allow large mlx4_buddy bitmaps") and 3de819e6b6 ("mlx4_core: Fix
integer overflow issues around MTT table") so that memory registration
of up to 8TB (log_num_mtt=31) finally works.
It fixes integer overflows in a few mlx4_table_yyy routines in icm.c
by using a u64 intermediate variable, and int/uint issues that caused
table indexes to become nagive by setting some variables to be u32
instead of int. These problems cause crashes when a user attempted to
register > 512GB of RAM.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 0090def("ACPI: Add interface to register/unregister device
to/from power resources") used resource_lock to protect the devices list
that relies on power resource. It caused a mutex dead lock, as below
acpi_power_on ---> lock resource_lock
__acpi_power_on
acpi_power_on_device
acpi_power_get_inferred_state
acpi_power_get_list_state ---> lock resource_lock
This patch adds a new mutex "devices_lock" to protect the devices list
and calls acpi_power_on_device in acpi_power_on, instead of
__acpi_power_on, after the resource_lock is released.
[rjw: Changed data type of a boolean variable to bool.]
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
It turns out that there are ACPI BIOSes defining device objects with
_PSx and without either _PSC or _PRx. For devices corresponding to
those ACPI objetcs __acpi_bus_get_power() returns ACPI_STATE_UNKNOWN
and their initial power states are regarded as unknown as a result.
If such a device is a parent of another power-manageable device, the
child cannot be put into a low-power state through ACPI, because
__acpi_bus_set_power() refuses to change power states of devices
whose parents' power states are unknown.
To work around this problem, observe that the ACPI power state of
a device cannot be higher-power (lower-number) than the power state
of its parent. Thus, if the device's _PSC method or the
configuration of its power resources indicates that the device is
in D0, the device's parent has to be in D0 as well. Consequently,
if the parent's power state is unknown when we've just learned that
its child's power state is D0, we can safely set the parent's
power.state field to ACPI_STATE_D0.
Tested-by: Aaron Lu <aaron.lu@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
If vlan option is being specified in the pktgen and packet size
being requested is less than 46 bytes, despite being illogical
request, pktgen should not crash the kernel.
BUG: unable to handle kernel paging request at ffff88021fb82000
Process kpktgend_0 (pid: 1184, threadinfo ffff880215f1a000, task ffff880218544530)
Call Trace:
[<ffffffffa0637cd2>] ? pktgen_finalize_skb+0x222/0x300 [pktgen]
[<ffffffff814f0084>] ? build_skb+0x34/0x1c0
[<ffffffffa0639b11>] pktgen_thread_worker+0x5d1/0x1790 [pktgen]
[<ffffffffa03ffb10>] ? igb_xmit_frame_ring+0xa30/0xa30 [igb]
[<ffffffff8107ba20>] ? wake_up_bit+0x40/0x40
[<ffffffff8107ba20>] ? wake_up_bit+0x40/0x40
[<ffffffffa0639540>] ? spin+0x240/0x240 [pktgen]
[<ffffffff8107b4e3>] kthread+0x93/0xa0
[<ffffffff81615de4>] kernel_thread_helper+0x4/0x10
[<ffffffff8107b450>] ? flush_kthread_worker+0x80/0x80
[<ffffffff81615de0>] ? gs_change+0x13/0x13
The root cause of why pktgen is not able to handle this case is due
to comparison of signed (datalen) and unsigned data (sizeof), which
eventually passes a huge number to skb_put().
Signed-off-by: Nishank Trivedi <nistrive@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The acpi_evalf() function modifies four bytes of data but in
fan_get_status() we pass a pointer to u8. I have modified the
function to use type checking now.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Fix a device reference count leakage issue in function
eeepc_rfkill_hotplug().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
MODULE_PARM_DESC for wlan_status is further in the same file
Signed-off-by: Maxim A. Nikulin <M.A.Nikulin@gmail.com>
Signed-off-by: Corentin Chary <corentin.chary@gmail.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
This function fails to add the start address of the gmux I/O range to
the requested port address and thus writes to the wrong location.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Study of Apple's binary driver revealed that the GMUX_READ_PORT should
be written between calls to gmux_index_wait_ready and
gmux_index_wait_complete (i.e., the new index protocol must be
followed). If this is not done correctly, the indexed
gmux device only partially accepts writes which lead to problems
concerning GPU switching. Special thanks to Seth Forshee who helped
greatly with identifying unnecessary changes.
Signed-off-by: Bernhard Froemel <froemel@vmars.tuwien.ac.at>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
This patch extracts and displays version information from the indexed
gmux device as it is also done for the classic gmux device.
Signed-off-by: Bernhard Froemel <froemel@vmars.tuwien.ac.at>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Commit a334872224 added afex support but lacked
several logical changes. This lack can cause afex to crash, and also
have a slight effect on other flows (i.e., driver always assumes the Tx ring
has less available buffers than what it actually has).
This patch adds the missing segments, fixing said issues.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under traffic, there are several registers that when read (e.g., via
'ethtool -d') may cause the chip to stall.
This patch corrects the registers read in such flows.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch propagates users' requested flow-control into the link layer,
which will later be used to advertise this flow-control for auto-negotiation
(until now these values were ignored).
Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to this fix, the driver reported the chip's active duplex state
is always 'full', even if using half-duplex mode.
Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prevent updating the xmac PFC configuration when using a link speed
slower than 10G -the umac block is responsible for 1G or slower connections,
therefore it is possible the xmac block is reset when connection is slower.
Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FW needs the driver statistics for management. Current logic is broken
in that the function that gathers the port statistics does not copy
its own statistics to a place where the FW can use it.
This patch causes every function that can pass statistics to the FW to
do so.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During traffic when DCB is enabled, it is possible for multiple instances
of statistics queries to be sent to the chip - this may cause the FW to assert.
This patch prevents the sending of an additional instance of statistics query
while the previous query hasn't completed.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For DP we can use the same PPLL for all active DP
encoders. Take advantage of that to prevent cases
where we may end up sharing a PPLL between DP and
non-DP which won't work. Also clean up the code
a bit.
v2: - fix missing pll_id assignment in crtc init
v3: - fix DP PPLL check
- document functions
- break in main encoder search loop after matching.
no need to keep checking additional encoders.
fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=54471
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This fixes a hang on suspend due to calling wdm_suspend on
the unregistered data interface. The hang should have been
a NULL pointer reference had it not been for a logic error
in the cdc_wdm code.
commit 230718bd net: qmi_wwan: bind to both control and data interface
changed qmi_wwan to use cdc_wdm as a subdriver for devices with
a two-interface QMI/wwan function. The commit failed to update
qmi_wwan_suspend and qmi_wwan_resume, which were written to handle
either a single combined interface function, or no subdriver at all.
The result was that we called into the subdriver both when the
control interface was suspended and when the data interface was
suspended. Calling the subdriver suspend function with an
unregistered interface is not supported and will make the
subdriver bug out.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
gred_dequeue() and gred_drop() do not seem to get called when the
queue is empty, meaning that we never start idling while in WRED
mode. And since qidlestart is not stored by gred_store_wred_set(),
we would never stop idling while in WRED mode if we ever started.
This messes up the average queue size calculation that influences
packet marking/dropping behavior.
Now, we start WRED mode idling as we are removing the last packet
from the queue. Also we now actually stop WRED mode idling when we
are enqueuing a packet.
Cc: Bruce Osler <brosler@cisco.com>
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
q->vars.qavg is a Wlog scaled value, but q->backlog is not. In order
to pass q->vars.qavg as the backlog value, we need to un-scale it.
Additionally, the qave value returned via netlink should not be Wlog
scaled, so we need to un-scale the result of red_calc_qavg().
This caused artificially high values for "Average Queue" to be shown
by 'tc -s -d qdisc', but did not affect the actual operation of GRED.
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each pair of DPs only needs to be compared once when searching for
a non-unique prio value.
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is a bad idea to hold a spinlock and call flush_work_sync.
Move the workqueue cleanup outside the spinlock and use cancel_work_sync,
on closing the channel this seems to be the more correct function.
Remove the never used and constant return value of mISDN_freebchannel.
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Cc: <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso say:
====================
The following patchset contains four updates for your net tree, they are:
* Fix crash on timewait sockets, since the TCP early demux was added,
in nfnetlink_log, from Eric Dumazet.
* Fix broken syslog log-level for xt_LOG and ebt_log since printk format was
converted from <.> to a 2 bytes pattern using ASCII SOH, from Joe Perches.
* Two security fixes for the TCP connection tracking targeting off-path attacks,
from Jozsef Kadlecsik. The problem was discovered by Jan Wrobel and it is
documented in: http://mixedbit.org/reflection_scan/reflection_scan.pdf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Lezcano reported seeing multi-second stalls from
keyboard input on his T61 laptop when NOHZ and CPU_IDLE
were enabled on a 32bit kernel.
He bisected the problem down to commit
1e75fa8be9 ("time: Condense timekeeper.xtime into xtime_sec").
After reproducing this issue, I narrowed the problem down
to the fact that timekeeping_get_ns() returns a 64bit
nsec value that hasn't been accumulated. In some cases
this value was being then stored in timespec.tv_nsec
(which is a long).
On 32bit systems, with idle times larger then 4 seconds
(or less, depending on the value of xtime_nsec), the
returned nsec value would overflow 32bits. This limited
kept time from increasing, causing timers to not expire.
The fix is to make sure we don't directly store the
result of timekeeping_get_ns() into a tv_nsec field,
instead using a 64bit nsec value which can then be
added into the timespec via timespec_add_ns().
Reported-and-bisected-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Link: http://lkml.kernel.org/r/1347405963-35715-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Current implementation simply ignores attribute flags. Thus, there is
no notification to userland of unsupported features. Check syscall's
attribute flags to let userland know if a feature is supported by the
kernel. This is also needed to distinguish between future kernels what
might support a feature.
Cc: <stable@vger.kernel.org> v3.5..
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120910093018.GO8285@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch exports the clockticks event and its encoding to user level.
The clockticks event was exported for Nehalem/Westmere but not for Sandy
Bridge (client). Given that it uses a special encoding, it needs to be
exported to user tools, so users can do:
# perf stat -a -C 0 -e uncore_cbox_0/clockticks/ sleep 1
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120829130122.GA32336@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>