Commit Graph

535275 Commits

Author SHA1 Message Date
WANG Cong a6c1aea044 cls_u32: complete the check for non-forced case in u32_destroy()
In commit 1e052be69d ("net_sched: destroy proto tp when all filters are gone")
I added a check in u32_destroy() to see if all real filters are gone
for each tp, however, that is only done for root_ht, same is needed
for others.

This can be reproduced by the following tc commands:

tc filter add dev eth0 parent 1:0 prio 5 handle 15: protocol ip u32 divisor 256
tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:2 u32
ht 15:2: match ip src 10.0.0.2 flowid 1:10
tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:3 u32
ht 15:2: match ip src 10.0.0.3 flowid 1:10

Fixes: 1e052be69d ("net_sched: destroy proto tp when all filters are gone")
Reported-by: Akshat Kakkar <akshat.1984@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 17:02:48 -07:00
Jan Beulich e308fd3bb2 LSM: restore certain default error codes
While in most cases commit b1d9e6b064 ("LSM: Switch to lists of hooks")
retained previous error returns, in three cases it altered them without
any explanation in the commit message. Restore all of them - in the
security_old_inode_init_security() case this led to reiserfs using
uninitialized data, sooner or later crashing the system (the only other
user of this function - ocfs2 - was unaffected afaict, since it passes
pre-initialized structures).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-08-26 09:46:50 +10:00
Ross Zwisler de4a196c02 nfit, nd_blk: BLK status register is only 32 bits
Only read 32 bits for the BLK status register in read_blk_stat().

The format and size of this register is defined in the
"NVDIMM Driver Writer's guide":

http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-25 19:42:01 -04:00
Russell King aac27c7a0d net: fec: use reinit_completion() in mdio accessor functions
Rather than re-initialising the entire completion on every mdio access,
use reinit_completion() which only resets the completion count.  This
avoids possible reinitialisation of the contained spinlock and waitqueue
while they may be in use (eg, mid-completion.)

Such an event could occur if there's a long delay in interrupt handling
causing the mdio accessor to time out, then a second access comes in
while the interrupt handler on a different CPU has called complete().
Another scenario where this has been observed is while locking has
been missing at the phy layer, allowing concurrent attempts to access
the MDIO bus.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 16:33:16 -07:00
Russell King 05a7f582be net: phy: add locking to phy_read_mmd_indirect()/phy_write_mmd_indirect()
The phy layer is missing locking for the above two functions - it
has been observed that two threads (userspace and the phy worker
thread) can race, entering the bus ->write or ->read functions
simultaneously.

This causes the FEC driver to initialise a completion while another
thread is waiting on it or while the interrupt is calling complete()
on it, which causes spinlock unlock-without-lock, spinlock lockups,
and completion timeouts.

Fixes: a59a4d192 ("phy: add the EEE support and the way to access to the MMD registers.")
Fixes: 0c1d77dfb ("net: libphy: Add phy specific function to access mmd phy registers")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 16:30:46 -07:00
Marcelo Ricardo Leitner bef0057b7b vxlan: re-ignore EADDRINUSE from igmp_join
Before 56ef9c909b40[1] it used to ignore all errors from igmp_join().
That commit enhanced that and made it error out whatever error happened
with igmp_join(), but that's not good because when using multicast
groups vxlan will try to join it multiple times if the socket is reused
and then the 2nd and further attempts will fail with EADDRINUSE.

As we don't track to which groups the socket is already subscribed, it's
okay to just ignore that error.

Fixes: 56ef9c909b ("vxlan: Move socket initialization to within rtnl scope")
Reported-by: John Nielsen <lists@jnielsen.net>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 16:24:35 -07:00
David S. Miller e732cdd416 linux-can-fixes-for-4.2-20150825
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCgAGBQJV3BEJAAoJEP5prqPJtc/Hb2wH/RFxO+rBh16yZBJjFlPbn2ZQ
 VbcngZtowJle9kjTBINCN/8KsjOhdpn9oT8iOxVcrwyxPa/gWcqnz7cip9regabu
 6fOnIlmCnomJ9E/9Gt4joqsB14Zlbubn4xU+VJacZRDjXktJSeHGexxYHnAsROC6
 V4W2yIySd5T1UvlzSCbbugRLa9c0ROtLj2RxdHrTicbmcyQrA/bvErACFGtlInso
 PV6YLFk+ESk3RH0vl2FxUkNC2g7QiKp7zhX9eAuDkEg2CIYCL1sNQt6eAQMPaRbc
 o9u60JLbqiXrKbvlmOGBnIPqkBWxbYX3Jo9Qc4bEyS93Gzdn73kMwnFbMgd8Oek=
 =JbLo
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.2-20150825' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
this is the updated pull request of one patch by me for the peak_usb
driver. It fixes the driver, so that non FD adapters don't provide CAN
FD bittimings.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 16:12:45 -07:00
Kazuya Mizuguchi 83bc805bff net: compile renesas directory if NET_VENDOR_RENESAS is configured
Currently the renesas ethernet driver directory is compiled if SH_ETH is
configured rather than NET_VENDOR_RENESAS. Although incorrect that was
quite harmless as until recently as SH_ETH configured the only driver in
the renesas directory. However, as of c156633f13 ("Renesas Ethernet AVB
driver proper") the renesas directory includes another driver, configured
by RAVB, and it makes little sense for it to have a hidden dependency on
SH_ETH.

Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[horms: rewrote changelog]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 16:03:54 -07:00
huaibin Wang d4257295ba ip6_gre: release cached dst on tunnel removal
When a tunnel is deleted, the cached dst entry should be released.

This problem may prevent the removal of a netns (seen with a x-netns IPv6
gre tunnel):
  unregister_netdevice: waiting for lo to become free. Usage count = 3

CC: Dmitry Kozlov <xeb@mail.ru>
Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Signed-off-by: huaibin Wang <huaibin.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 14:33:00 -07:00
Jeff Moyer 74c9c9134b mtip32x: fix regression introduced by blk-mq per-hctx flush
Hi,

After commit f70ced0917 (blk-mq: support per-distpatch_queue flush
machinery), the mtip32xx driver may oops upon module load due to walking
off the end of an array in mtip_init_cmd.  On initialization of the
flush_rq, init_request is called with request_index >= the maximum queue
depth the driver supports.  For mtip32xx, this value is used to index
into an array.  What this means is that the driver will walk off the end
of the array, and either oops or cause random memory corruption.

The problem is easily reproduced by doing modprobe/rmmod of the mtip32xx
driver in a loop.  I can typically reproduce the problem in about 30
seconds.

Now, in the case of mtip32xx, it actually doesn't support flush/fua, so
I think we can simply return without doing anything.  In addition, no
other mq-enabled driver does anything with the request_index passed into
init_request(), so no other driver is affected.  However, I'm not really
sure what is expected of drivers.  Ming, what did you envision drivers
would do when initializing the flush requests?

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-25 14:35:51 -06:00
Tejun Heo 006a0973ed writeback: sync_inodes_sb() must write out I_DIRTY_TIME inodes and always call wait_sb_inodes()
e79729123f ("writeback: don't issue wb_writeback_work if clean")
updated writeback path to avoid kicking writeback work items if there
are no inodes to be written out; unfortunately, the avoidance logic
was too aggressive and broke sync_inodes_sb().

* sync_inodes_sb() must write out I_DIRTY_TIME inodes but I_DIRTY_TIME
  inodes dont't contribute to bdi/wb_has_dirty_io() tests and were
  being skipped over.

* inodes are taken off wb->b_dirty/io/more_io lists after writeback
  starts on them.  sync_inodes_sb() skipping wait_sb_inodes() when
  bdi_has_dirty_io() breaks it by making it return while writebacks
  are in-flight.

This patch fixes the breakages by

* Removing bdi_has_dirty_io() shortcut from bdi_split_work_to_wbs().
  The callers are already testing the condition.

* Removing bdi_has_dirty_io() shortcut from sync_inodes_sb() so that
  it always calls into bdi_split_work_to_wbs() and wait_sb_inodes().

* Making bdi_split_work_to_wbs() consider the b_dirty_time list for
  WB_SYNC_ALL writebacks.

Kudos to Eryu, Dave and Jan for tracking down the issue.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: e79729123f ("writeback: don't issue wb_writeback_work if clean")
Link: http://lkml.kernel.org/g/20150812101204.GE17933@dhcp-13-216.nay.redhat.com
Reported-and-bisected-by: Eryu Guan <eguan@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.com>
Cc: Ted Ts'o <tytso@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-25 14:35:09 -06:00
David Daney 8b63ec1837 phylib: Make PHYs children of their MDIO bus, not the bus' parent.
commit 18ee49ddb0 ("phylib: rename mii_bus::dev to mii_bus::parent")
changed the parent of PHY devices from the bus to the bus parent.

Then, commit 4dea547fef ("phylib: rework to prepare for OF
registration of PHYs") moved the code into phy_device.c

At this point, it is somewhat unclear why the change was seen as
necessary.  But, when we look at the device model tree in
/sys/devices, it is clearly incorrect.  The PHYs should be children of
their MDIO bus.

Change the PHY's parent device to be the MDIO bus device.

Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25 11:30:23 -07:00
Linus Torvalds b1713b135f Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Thomas Gleixner:
 "A single fix for a APIC regression introduced in 4.0 which went
  undetected until now.

  I screwed up the x2apic cleanup in a subtle way.  The screwup is only
  visible on systems which have x2apic preenabled in the BIOS and need
  to disable it during boot"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Fix fallout from x2apic cleanup
2015-08-25 09:01:05 -07:00
Marc Kleine-Budde 06b23f7fbb can: pcan_usb: don't provide CAN FD bittimings by non-FD adapters
The CAN FD data bittiming constants are provided via netlink only when there
are valid CAN FD constants available in priv->data_bittiming_const.

Due to the indirection of pointer assignments in the peak_usb driver the
priv->data_bittiming_const never becomes NULL - not even for non-FD adapters.

The data_bittiming_const points to zero'ed data which leads to this result
when running 'ip -details link show can0':

35: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0
    can state STOPPED restart-ms 0
	  pcan_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
	  : dtseg1 0..0 dtseg2 0..0 dsjw 1..0 dbrp 0..0 dbrp-inc 0  <== BROKEN!
	  clock 8000000

This patch changes the struct peak_usb_adapter::bittiming_const and struct
peak_usb_adapter::data_bittiming_const to pointers to fix the assignemnt
problems.

Cc: linux-stable <stable@vger.kernel.org> # >= 4.0
Reported-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-08-25 08:50:00 +02:00
Takashi Iwai c7cd0ef66a ALSA: hda - Fix path power activation
The widget power-saving code tries to turn up/down the power of each
widget in the I/O paths that are modified at each jack plug/unplug.
The recent report revealed that the power activation leaves some
widgets unpowered after plugging.  This is because
snd_hda_activate_path() turns on path->active flag at the end of the
function while the path power management is done before that.  Then
it's regarded as if nothing is active, and the driver turns off the
power.

The fix is simply to set the flag at the beginning of the function,
before trying to power up.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102521
Cc: <stable@vger.kernel.org> [v4.1+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-08-25 07:59:02 +02:00
Takashi Iwai 9d2b48f730 ALSA: hda - Check all inputs for is_active_nid_for_any()
The is_active_nid_for_any() function in the generic parser is supposed
to check all connections from/to the given widget, but the current
code checks only the first input connection (index = 0).

This patch corrects the code to check all inputs by passing -1 to
index argument.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102521
Cc: <stable@vger.kernel.org> [v4.1+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-08-25 07:59:01 +02:00
J. Bruce Fields 883985f612 nfsd: Add Jeff Layton as co-maintainer
Jeff has been doing a lot of development (including much of the
state-locking rewrite just as one example) plus lots of review and other
miscellaneous nfsd work, so let's acknowledge the status quo.

I'll continue to be the one to send regular pull requests but Jeff will
should be available to cover there occasionally too.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-24 14:54:24 -07:00
David Ahern ba51b6be38 net: Fix RCU splat in af_key
Hit the following splat testing VRF change for ipsec:

[  113.475692] ===============================
[  113.476194] [ INFO: suspicious RCU usage. ]
[  113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
[  113.477545] -------------------------------
[  113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
[  113.479288]
[  113.479288] other info that might help us debug this:
[  113.479288]
[  113.480207]
[  113.480207] rcu_scheduler_active = 1, debug_locks = 1
[  113.480931] 2 locks held by setkey/6829:
[  113.481371]  #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
[  113.482509]  #1:  (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
[  113.483509]
[  113.483509] stack backtrace:
[  113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
[  113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
[  113.486845]  0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
[  113.487732]  ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
[  113.488628]  0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
[  113.489525] Call Trace:
[  113.489813]  [<ffffffff81518af2>] dump_stack+0x4c/0x65
[  113.490389]  [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
[  113.491039]  [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
[  113.491735]  [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
[  113.492442]  [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
[  113.493077]  [<ffffffff81064268>] __might_sleep+0x6c/0x82
[  113.493681]  [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
[  113.494508]  [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
[  113.495149]  [<ffffffff814012b5>] skb_clone+0x64/0x80
[  113.495712]  [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
[  113.496380]  [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
[  113.497024]  [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
[  113.497653]  [<ffffffff814e9770>] pfkey_process+0x162/0x17e
[  113.498274]  [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213

In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
the RCU lock.

Since pfkey_broadcast takes the RCU lock the allocation argument is
pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
The one call outside of rcu can be done with GFP_KERNEL.

Fixes: 7f6b9dbd5a ("af_key: locking change")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-24 14:48:10 -07:00
Adrian Hunter 9d1bf02ac3 perf tools: Update Intel PT documentation
Update Intel PT documentation to describe new features.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-26-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:51:09 -03:00
Adrian Hunter 7eacca3ebb perf tools: Add Intel PT support for decoding TRACESTOP packets
A TRACESTOP packet is produced when an Intel PT trace enters a defined
region of the address space at which point the tracing stops.

This patch just adds decoder support.

Support for specifying TRACESTOP regions is left until later.

For details refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-25-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:50:23 -03:00
Adrian Hunter 0de802abd1 perf tools: Add Intel PT support for using CYC packets
CYC packets are a new Intel PT feature.

CYC packets provide even finer grain timestamp information than MTC and
TSC packets.  A CYC packet contains the number of CPU cycles since the
last CYC packet. Unlike MTC and TSC packets, CYC packets are only sent
when another packet is also sent.

Support for this feature is indicated by:

/sys/bus/event_source/devices/intel_pt/caps/psb_cyc

which contains "1" if the feature is supported and "0" otherwise.

CYC packets can be requested using a PMU config term e.g. perf record -e
intel_pt/cyc/u sleep 1

The frequency of CYC packets can also be specified.  e.g. perf record -e
intel_pt/cyc,cyc_thresh=2/u sleep 1

CYC packets are not requested by default.

Valid cyc_thresh values are given by:

/sys/bus/event_source/devices/intel_pt/caps/cycle_thresholds

which contains a hexadecimal value, the bits of which represent valid
values e.g. bit 2 set means value 2 is valid.

The value represents the minimum number of CPU cycles that must have
passed before a CYC packet can be sent.  The number of CPU cycles is:

    2 ^ (value - 1)

e.g. value 4 means 8 CPU cycles must pass before a CYC packet can be
sent.  Note a CYC packet is still only sent when another packet is sent,
not at, e.g. every 8 CPU cycles.

If an invalid value is entered, the error message will give a list of
valid values e.g.

    $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname
    Invalid cyc_thresh for intel_pt. Valid values are: 0-12

tools/perf/Documentation/intel-pt.txt is updated in a later patch as
there are a number of new features being added.

For more information refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-24-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:49:43 -03:00
Adrian Hunter cc33618619 perf tools: Add Intel PT support for decoding CYC packets
CYC packets provide even finer grain timestamp information than MTC and
TSC packets.  A CYC packet contains the number of CPU cycles since the
last CYC packet.

This patch just adds decoder support.  The CPU frequency can be related
to TSC using the Maximum Non-Turbo Ratio in combination with the CBR
(core-to-bus ratio) packet.  However more accuracy is achieved by simply
interpolating the number of cycles between other timing packets like MTC
or TSC.  This patch takes the latter approach.

Support for a default value and validation of values is provided by a
later patch. Also documentation is updated in a separate patch.

For details refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-23-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:49:04 -03:00
Adrian Hunter b45fc0bfaf perf tools: Add Intel PT support for using MTC packets
MTC packets are a new Intel PT feature.

MTC packets provide finer grain timestamp information than TSC packets.

Support for this feature is indicated by:

  /sys/bus/event_source/devices/intel_pt/caps/mtc

which contains "1" if the feature is supported and "0" otherwise.

MTC packets can be requested using a PMU config term e.g. perf record -e
intel_pt/mtc/u sleep 1

The frequency of MTC packets can also be specified.  e.g. perf record -e
intel_pt/mtc,mtc_period=2/u sleep 1

The default value is 3 or the nearest lower value that is supported.  0
is always supported.

Valid values are given by:

/sys/bus/event_source/devices/intel_pt/caps/mtc_periods

which contains a hexadecimal value, the bits of which represent valid
values e.g. bit 2 set means value 2 is valid.

The value is converted to the MTC frequency as:

	CTC-frequency / (2 ^ value)

e.g. value 3 means one eighth of CTC-frequency

Where CTC is the hardware crystal clock, the frequency of which can be
related to TSC via values provided in cpuid leaf 0x15.

If an invalid value is entered, the error message will give a list of
valid values e.g.

	$ perf record -e intel_pt/mtc_period=15/u uname
	Invalid mtc_period for intel_pt. Valid values are: 0,3,6,9

tools/perf/Documentation/intel-pt.txt is updated in a later patch as
there are a number of new features being added.

For more information refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-22-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:48:06 -03:00
Adrian Hunter 79b58424b8 perf tools: Add Intel PT support for decoding MTC packets
MTC packets provide finer grain timestamp information than TSC packets.
MTC packets record time using the hardware crystal clock (CTC) which is
related to TSC packets using a TMA packet.

This patch just adds decoder support.

Support for a default value and validation of values is provided by a
later patch. Also documentation is updated in a separate patch.

For details refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-21-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:46:56 -03:00
Adrian Hunter 11fa7cb86b perf tools: Pass Intel PT information for decoding MTC and CYC
Record additional information in the AUXTRACE_INFO event in preparation
for decoding MTC and CYC packets.  Pass the information to the decoder.

The AUXTRACE_INFO record can be extended by using the size to indicate
the presence of new members.

The additional information includes PMU config bit positions and the TSC
to CTC (hardware crystal clock) ratio needed to decode MTC packets.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-20-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:46:43 -03:00
Adrian Hunter 3d49807870 perf tools: Add new Intel PT packet definitions
New features have been added to Intel PT which include a number of new
packet definitions.

This patch adds packet definitions for new packets: TMA, MTC, CYC, VMCS,
TRACESTOP and MNT.  Also another bit in PIP is defined.

This patch only adds support for the definitions. Later patches add
support for decoding TMA, MTC, CYC and TRACESTOP which is where those
packets are explained.

VMCS and the newly defined bit in PIP are used with virtualization which
is not supported yet.  MNT is a maintenance packet which the decoder
should ignore.

For details, refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-19-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:46:06 -03:00
Adrian Hunter bc9b6bf07c perf tools: Add Intel PT support for PSB periods
The PSB packet is a synchronization packet that provides a starting
point for decoding or recovery from errors.

This patch adds support for a new Intel PT feature that allows the
frequency of PSB packets to be specified.

Support for this feature is indicated by
/sys/bus/event_source/devices/intel_pt/caps/psb_cyc which contains "1"
if the feature is supported and "0" otherwise.

The PSB period can be specified as a PMU config term e.g. perf record -e
intel_pt/psb_period=2/u sleep 1

The default value is 3 or the nearest lower value that is supported.  0
is always supported.

Valid values are given by:

/sys/bus/event_source/devices/intel_pt/caps/psb_periods

which contains a hexadecimal value, the bits of which represent valid
values e.g. bit 2 set means value 2 is valid.

The value is converted to the approximate number of trace bytes between
PSB packets as:

	2 ^ (value + 11)

e.g. value 3 means 16KiB bytes between PSBs

If an invalid value is entered, the error message will give a list of
valid values e.g.

	$ perf record -e intel_pt/psb_period=15/u uname
	Invalid psb_period for intel_pt. Valid values are: 0-5

tools/perf/Documentation/intel-pt.txt is updated in a later patch as
there are a number of new features being added.

For more information about PSB periods refer to the Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace from June 2015 or
later.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:45:08 -03:00
Adrian Hunter 2a21d03686 perf tools: Fix Intel PT 'instructions' sample period
The period on synthesized 'instructions' samples was being set to a
fixed value, whereas the correct value is the number of instructions
since the last sample, which is a value that the decoder can provide.
So do it that way.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:42:26 -03:00
Arnaldo Carvalho de Melo 5c9ce1e644 perf ordered_events: Clear the progress bar at the end of a flush
We were depending on the next screen operation after a flush() being
one that would redraw the whole screen so that the progress bar would
be overwritten, when that didn't happen a screen artifact of, say, a
error dialog window would be overlaid on top of the progress bar, fix
it by calling ui_browser__finish(), that now has a TUI implementation.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-el0fyw6duemnx62lydjzhs8c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 17:16:22 -03:00
Arnaldo Carvalho de Melo 1e259ad4a2 perf ui tui progress: Implement the ui_progress_ops->finish() method
So that we can erase the progress bar after we're done with it, avoiding
things like:

-------------------------------------------------------------------

          ┌─Error:──────────────────────────────────────────────────────┐
          │Can't annotate unmapped_area_topdown:                        │
          │                                                             │
          │No vmlinux file with build id a826726b5ddacfab1f0bade868f1a79│
          │was found in the path.                                       │
          │                                                             │
          │Note that annotation using /proc/kcore requires CAP_SYS_RAWIO│
┌Processin│                                                             │──┐
│         │Please use:                                                  │  │
└─────────│                                                             │──┘
          │  perf buildid-cache -vu vmlinux                             │
          │                                                             │
          │or:                                                          │
          │                                                             │
          │  --vmlinux vmlinux                                          │
          │                                                             │
          │                                                             │
          │Press any key...                                             │
          └─────────────────────────────────────────────────────────────┘

Can't annotate unmapped_area_topdown:
-------------------------------------------------------------------

I.e. that finished progress bar behind the error window. It is not a
problem when we end up redrawing the whole screen, but its ugly when
we present such error windows, provide a TUI method so that code like
the above may avoid this situation, as will be done with the annotation
code in the next cset.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qvktnojzwwe37pweging058t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 16:18:26 -03:00
Arnaldo Carvalho de Melo c0b4dffbc5 perf annotate: Reset the dso find_symbol cache when removing symbols
The 'annotate' tool does some filtering in the entries in a DSO but
forgot to reset the cache done in dso__find_symbol(), cauxing a SEGV:

  [root@zoo ~]# perf annotate netlink_poll
  perf: Segmentation fault
  -------- backtrace --------
  perf[0x526ceb]
  /lib64/libc.so.6(+0x34960)[0x7faedfbe0960]
  perf(rb_erase+0x223)[0x499d63]
  perf[0x4213e9]
  perf[0x4bc123]
  perf[0x4bc621]
  perf[0x4bf26b]
  perf[0x4bc855]
  perf(perf_session__process_events+0x340)[0x4bddc0]
  perf(cmd_annotate+0x6bb)[0x421b5b]
  perf[0x479063]
  perf(main+0x60a)[0x42098a]
  /lib64/libc.so.6(__libc_start_main+0xf0)[0x7faedfbcbfe0]
  perf[0x420aa9]
  [0x0]
  [root@zoo ~]#

Fix it by reseting the find cache when removing symbols.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: b685ac22b4 ("perf symbols: Add front end cache for DSO symbol lookup")
Link: http://lkml.kernel.org/n/tip-b2y9x46y0t8yem1ive41zqyp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24 13:33:14 -03:00
Markus Osterhoff c7e69ae6b4 ALSA: hda: fix possible NULL dereference
After a for-loop was replaced by list_for_each_entry, see
Commit bbbc7e8502 ("ALSA: hda - Allocate hda_pcm objects dynamically"),
Commit 751e221689 ("ALSA: hda: fix possible null dereference"),
a possible NULL pointer dereference has been introduced; this patch adds
the NULL check on pcm->pcm, while leaving a potentially superfluous
check on pcm itself untouched.

Signed-off-by: Markus Osterhoff <linux-kernel@k-raum.org>
Cc: <stable@vger.kernel.org> #v4.1+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-08-24 15:02:03 +02:00
Jaedon Shin b6df7d61c8 net: bcmgenet: fix uncleaned dma flags
Clean the dma flags of multiq ring buffer int the interface stop
process. This patch fixes that the genet is not running while the
interface is re-enabled.

$ ifup eth0 - running after booting
$ ifdown eth0
$ ifup eth0 - not running and occur tx_timeout

The bcmgenet_dma_disable() in bcmgenet_open() do clean ring16 dma flag
only. If the genet has multiq, the dma register is not cleaned. and
bcmgenet_init_dma() is not done correctly. in case
GENET_V2(tx_queues=4), tdma_ctrl has 0x1e after running
bcmgenet_dma_disable().

Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 23:00:41 -07:00
Florian Fainelli eed635699a net: bcmgenet: Avoid sleeping in bcmgenet_timeout
bcmgenet_timeout() executes in atomic context, yet we will invoke
napi_disable() which does sleep. Looking back at the changes, disabling
TX napi and re-enabling it is completely useless, since we reclaim all
TX buffers and re-enable interrupts, and wake up the TX queues.

Fixes: 13ea657806 ("net: bcmgenet: improve TX timeout")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 22:59:33 -07:00
Linus Torvalds c13dcf9f2d Linux 4.2-rc8 2015-08-23 20:52:59 -07:00
Linus Torvalds d683477020 SCSI fixes on 20150823
A couple of major (hang and deadlock) fixes with fortunately fairly rare
 triggering conditions.  The PM oops is only really triggered by people using
 enclosure services (rare) and the fnic driver is mostly used in enterprise
 environments.
 
 Signed-off-by: James Bottomley <JBottomley@Odin.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJV2e34AAoJEDeqqVYsXL0MywIH/0ZzZzofgUammzkjalMxoW1b
 rojSyB1bpuADc8eXpqsw1x6coNxKK85e9aAmplXqdykgazw44lzkH43Vez7gbwGN
 JG5+utu2hQUMJYF9bQ3NLPu5JgxgP0w6aqY1L1ZndKFrRmEnM53UcojNT3M2NIN3
 cgA5ICDD0RSQy24KDSZaN+y3SvopEcY5eXUcLfshrwXI3yAIH4G39z8hQHCFGHZB
 BkYq9qjI5T4P7PRE5trRYu7B9rO8IJpoYdPnmI3i49jIyJlpFXP3FlLSmL+gCzyO
 FBhdln9sCulCWnTirRf7Gsbq6LfXpihtzzgfCtPoxxwI4PRcIBh0jxu3Od8o9NY=
 =jY8K
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "A couple of major (hang and deadlock) fixes with fortunately fairly
  rare triggering conditions.  The PM oops is only really triggered by
  people using enclosure services (rare) and the fnic driver is mostly
  used in enterprise environments"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  SCSI: Fix NULL pointer dereference in runtime PM
  fnic: Use the local variable instead of I/O flag to acquire io_req_lock in fnic_queuecommand() to avoid deadloack
2015-08-23 20:46:22 -07:00
Ken-ichirou MATSUZAWA c953e23936 netlink: mmap: fix tx type check
I can't send netlink message via mmaped netlink socket since

    commit: a8866ff6a5
    netlink: make the check for "send from tx_ring" deterministic

msg->msg_iter.type is set to WRITE (1) at

    SYSCALL_DEFINE6(sendto, ...
        import_single_range(WRITE, ...
            iov_iter_init(1, WRITE, ...

call path, so that we need to check the type by iter_is_iovec()
to accept the WRITE.

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 16:04:46 -07:00
Linus Torvalds eb63b34bdf Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS bug fixes from Ralf Baechle:
 "Two more fixes for 4.2.

  One fixes a build issue with the LLVM assembler - LLVM assembler macro
  names are case sensitive, GNU as macro names are insensitive; the
  other corrects a license string (GPL v2, not GPLv2) such that the
  module loader will recognice the license correctly"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  FIRMWARE: bcm47xx_nvram: Fix module license.
  MIPS: Fix LLVM build issue.
2015-08-23 07:23:09 -07:00
Linus Torvalds c4c53bad40 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull 9p regression fix from Al Viro:
 "Fix for breakage introduced when switching p9_client_{read,write}() to
  struct iov_iter * (went into 4.1)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  9p: ensure err is initialized to 0 in p9_client_read/write
2015-08-22 20:22:11 -07:00
Vincent Bernat 999b8b88c6 9p: ensure err is initialized to 0 in p9_client_read/write
Some use of those functions were providing unitialized values to those
functions. Notably, when reading 0 bytes from an empty file on a 9P
filesystem, the return code of read() was not 0.

Tested with this simple program:

    #include <assert.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, const char **argv)
    {
        assert(argc == 2);
        char buffer[256];
        int fd = open(argv[1], O_RDONLY|O_NOCTTY);
        assert(fd >= 0);
        assert(read(fd, buffer, 0) == 0);
        return 0;
    }

Cc: stable@vger.kernel.org # v4.1
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-08-22 21:35:02 -04:00
Linus Torvalds b7dec838b5 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Another couple of small ARM fixes.

  A patch from Masahiro Yamada who noticed that "make -jN all zImage"
  would end up generating bad images where N > 1, and a patch from
  Nicolas to fix the Marvell CPU user access optimisation code when page
  faults are disabled"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8418/1: add boot image dependencies to not generate invalid images
  ARM: 8414/1: __copy_to_user_memcpy: fix mmap semaphore usage
2015-08-22 15:48:04 -07:00
Adrian Hunter 5839a5506d perf tools: Fix tarball build broken by pt/bts
Fix some include paths and add missing inat_types.h.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/55D77696.60102@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-22 12:27:07 -03:00
Linus Torvalds d0b89bd548 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Various low level fixes: fix more fallout from the FPU rework and the
  asm entry code rework, plus an MSI rework fix, and an idle-tracing fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu/math-emu: Fix crash in fork()
  x86/fpu/math-emu: Fix math-emu boot crash
  x86/idle: Restore trace_cpu_idle to mwait_idle() calls
  x86/irq: Build correct vector mapping for multiple MSI interrupts
  Revert "sched/x86_64: Don't save flags on context switch"
2015-08-22 08:15:36 -07:00
Linus Torvalds c3a0651422 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Tooling fixes: a 'perf record' deadlock fix plus debuggability fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf top: Show backtrace when handling a SIGSEGV on --stdio mode
  perf tools: Fix buildid processing
  perf tools: Make fork event processing more resilient
  perf tools: Avoid deadlock when map_groups are broken
2015-08-22 08:06:28 -07:00
Thomas Gleixner a57e456a7b x86/apic: Fix fallout from x2apic cleanup
In the recent x2apic cleanup I got two things really wrong:
1) The safety check in __disable_x2apic which allows the function to
   be called unconditionally is backwards. The check is there to
   prevent access to the apic MSR in case that the machine has no
   apic. Though right now it returns if the machine has an apic and
   therefor the disabling of x2apic is never invoked.

2) x2apic_disable() sets x2apic_mode to 0 after registering the local
   apic. That's wrong, because register_lapic_address() checks x2apic
   mode and therefor takes the wrong code path.

This results in boot failures on machines with x2apic preenabled by
BIOS and can also lead to an fatal MSR access on machines without
apic.

The solutions are simple:
1) Correct the sanity check for apic availability
2) Clear x2apic_mode _before_ calling register_lapic_address()

Fixes: 659006bf3a 'x86/x2apic: Split enable and setup function'
Reported-and-tested-by: Javier Monteagudo <javiermon@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1224764
Cc: stable@vger.kernel.org # 4.0+
Cc: Laura Abbott <labbott@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
2015-08-22 17:01:48 +02:00
Linus Torvalds 84f3fe4608 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A series of small fixlets for a regression visible on OMAP devices
  caused by the conversion of the OMAP interrupt chips to hierarchical
  interrupt domains.  Mostly one liners on the driver side plus a small
  helper function in the core to avoid open coded mess in the drivers"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/crossbar: Restore set_wake functionality
  irqchip/crossbar: Restore the mask on suspend behaviour
  ARM: OMAP: wakeupgen: Restore the irq_set_type() mechanism
  irqchip/crossbar: Restore the irq_set_type() mechanism
  genirq: Introduce irq_chip_set_type_parent() helper
  genirq: Don't return ENOSYS in irq_chip_retrigger_hierarchy
2015-08-22 07:45:36 -07:00
Linus Torvalds f8a89fc05a Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
 "Two minimalistic fixes for 4.2 regressions:

   - Eric fixed a thinko in the timer_list base switching code caused by
     the overhaul of the timer wheel.  It can cause a cpu to see the
     wrong base for a timer while we move the timer around.

   - Guenter fixed a regression for IMX if booted w/o device tree, where
     the timer interrupt is not initialized and therefor the machine
     fails to boot"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/imx: Fix boot with non-DT systems
  timer: Write timer->flags atomically
2015-08-22 07:37:41 -07:00
Ingo Molnar 827409b2f5 x86/fpu/math-emu: Fix crash in fork()
During later stages of math-emu bootup the following crash triggers:

	 math_emulate: 0060:c100d0a8
	 Kernel panic - not syncing: Math emulation needed in kernel
	 CPU: 0 PID: 1511 Comm: login Not tainted 4.2.0-rc7+ #1012
	 [...]
	 Call Trace:
	  [<c181d50d>] dump_stack+0x41/0x52
	  [<c181c918>] panic+0x77/0x189
	  [<c1003530>] ? math_error+0x140/0x140
	  [<c164c2d7>] math_emulate+0xba7/0xbd0
	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
	  [<c1109c3c>] ? __alloc_pages_nodemask+0x12c/0x870
	  [<c136ac20>] ? proc_clear_tty+0x40/0x70
	  [<c136ac6e>] ? session_clear_tty+0x1e/0x30
	  [<c1003530>] ? math_error+0x140/0x140
	  [<c1003575>] do_device_not_available+0x45/0x70
	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
	  [<c18258e6>] error_code+0x5a/0x60
	  [<c1003530>] ? math_error+0x140/0x140
	  [<c100d0a8>] ? fpu__copy+0x138/0x1c0
	  [<c100c205>] arch_dup_task_struct+0x25/0x30
	  [<c1048cea>] copy_process.part.51+0xea/0x1480
	  [<c115a8e5>] ? dput+0x175/0x200
	  [<c136af70>] ? no_tty+0x30/0x30
	  [<c1157242>] ? do_vfs_ioctl+0x322/0x540
	  [<c104a21a>] _do_fork+0xca/0x340
	  [<c1057b06>] ? SyS_rt_sigaction+0x66/0x90
	  [<c104a557>] SyS_clone+0x27/0x30
	  [<c1824a80>] sysenter_do_call+0x12/0x12

The reason is the incorrect assumption in fpu_copy(), that FNSAVE
can be executed from math-emu kernels as well.

Don't try to copy the registers, the soft state will be copied
by fork anyway, so the child task inherits the parent task's
soft math state.

With this fix applied math-emu kernels boot up fine on modern
hardware and the 'no387 nofxsr' boot options.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 10:23:03 +02:00
Ingo Molnar 5fc960380e x86/fpu/math-emu: Fix math-emu boot crash
On a math-emu bootup the following crash occurs:

	Initializing CPU#0
	------------[ cut here ]------------
	kernel BUG at arch/x86/kernel/traps.c:779!
	invalid opcode: 0000 [#1] SMP
	[...]
	EIP is at do_device_not_available+0xe/0x70
	[...]
	Call Trace:
	 [<c18238e6>] error_code+0x5a/0x60
	 [<c1002bd0>] ? math_error+0x140/0x140
	 [<c100bbd9>] ? fpu__init_cpu+0x59/0xa0
	 [<c1012322>] cpu_init+0x202/0x330
	 [<c104509f>] ? __native_set_fixmap+0x1f/0x30
	 [<c1b56ab0>] trap_init+0x305/0x346
	 [<c1b548af>] start_kernel+0x1a5/0x35d
	 [<c1b542b4>] i386_start_kernel+0x82/0x86

The reason is that in the following commit:

  b1276c48e9 ("x86/fpu: Initialize fpregs in fpu__init_cpu_generic()")

I failed to consider math-emu's limitation that it cannot execute the
FNINIT instruction in kernel mode.

The long term fix might be to allow math-emu to execute (certain) kernel
mode FPU instructions, but for now apply the safe (albeit somewhat ugly)
fix: initialize the emulation state explicitly without trapping out to
the FPU emulator.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 10:02:04 +02:00
Ingo Molnar 0e53909a1c perf/core improvements and fixes:
User visible:
 
 - Fix segfault using 'perf script --show-mmap-events', affects
   only current perf/core (Adrian Hunter).
 
 - /proc/kcore requires CAP_SYS_RAWIO message too noisy, make it
   debug only (Adrian Hunter)
 
 - Fix Intel PT timestamp handling (Adrian Hunter)
 
 - Add Intel BTS support, with a call-graph script to show it and
   PT in use in a GUI using 'perf script' python scripting with
   postgresql and Qt (Adrian Hunter)
 
 - Add checks for returned EVENT_ERROR type in libtraceevent, fixing
   a bug that surfaced on arm64 systems (Dean Nelson)
 
 - Fallback to using kallsyms when libdw fails to handle a vmlinux file,
   that can happen, for instance, when perf is statically linked and
   then libdw fails to load libebl_{arch}.so (Wang Nan)
 
 Infrastructure:
 
 - Initialize reference counts in map__clone() (Arnaldo Carvalho de Melo)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV10xwAAoJENZQFvNTUqpAZaMP/39aHr+m26/fWD8zecaqpCVQ
 toHRT74S/+kz5PrJ9hVSOMdHQeD1aXkYQaegXg5YmzOHmnprBpbNS4pipjvTIuOC
 psynqW3wjckryI1hUpSOF/QHzAE/X6YyQwprwGEUt3V0DSJN8opYFKTV4an8c/+g
 OwLvC+3YXvLj6QbHz0UX+u7aRxATXbwsBgqlTrRu4HBCXS0zMgx74fpLKfc3eDee
 5Nj44PxTEpHq1ixp755gX/UVBbcNPNansA8w8ghQ/lX2zkXGnEoy6ZSC8N4aZVER
 vqiNIsXDraKD++iXCBQknFVaa59LNy4+c4VjYi8h9J1egnnGyjwSUdxSkv+V200Z
 y2fk5ic9BOZYBUx9ojW6sMg8xwsZ3NRkelPMmHMlEQVicYE2V2eJAz6seoUlzfEt
 bDnu7HhOeqU/6QLpA5qylNWksuMDheJGHIyv1IARhrNEX2F467pCdeBX0JKI7zyH
 s7CpNR4oQhBEBl9kyIT89O2hLS1AaHlLhiB6ctAse/7NWsqGJYOg4sCqLB6xF4/B
 NsNU5H9CUMcf3E9AFTOJ1jFhn98P+p4Gra4VNUuUUD5sZ05ltu5pANouztGs3hhS
 CYxmppwrD3yPrM18Gumhb44aqpWjwMsx5nCO0zQfvKpoLxO4DQKdqBGYbIlrcV6X
 8VnsIIJrJVGDn/KSnscw
 =rUMO
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Fix segfault using 'perf script --show-mmap-events', affects
    only current perf/core. (Adrian Hunter)

  - /proc/kcore requires CAP_SYS_RAWIO message too noisy, make it
    debug only. (Adrian Hunter)

  - Fix Intel PT timestamp handling. (Adrian Hunter)

  - Add Intel BTS support, with a call-graph script to show it and
    PT in use in a GUI using 'perf script' python scripting with
    postgresql and Qt. (Adrian Hunter)

  - Add checks for returned EVENT_ERROR type in libtraceevent, fixing
    a bug that surfaced on arm64 systems. (Dean Nelson)

  - Fallback to using kallsyms when libdw fails to handle a vmlinux file,
    that can happen, for instance, when perf is statically linked and
    then libdw fails to load libebl_{arch}.so. (Wang Nan)

Infrastructure changes:

  - Initialize reference counts in map__clone(). (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-22 08:45:46 +02:00