Commit Graph

737418 Commits

Author SHA1 Message Date
Hans de Goede ed0545a7fb drm/i915: Free memdup-ed DSI VBT data structures on driver_unload
Make intel_bios_cleanup function free the DSI VBT data structures which
are memdup-ed by parse_mipi_config() and parse_mipi_sequence().

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180214082151.25015-2-hdegoede@redhat.com
(cherry picked from commit e1b86c85f6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-02-14 11:41:55 -08:00
Hans de Goede 7928e9bb09 drm/i915: Add intel_bios_cleanup() function
Add an intel_bios_cleanup() function to act as counterpart of
intel_bios_init() and move the cleanup of vbt related resources there,
putting it in the same file as the allocation.

Changed in v2:
-While touching the code anyways, remove the unnecessary:
 if (dev_priv->vbt.child_dev) done before kfree(dev_priv->vbt.child_dev)

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180214082151.25015-1-hdegoede@redhat.com
(cherry picked from commit 785f076b3b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-02-14 11:41:47 -08:00
Hans de Goede 405cacc947 drm/i915/vlv: Add cdclk workaround for DSI
At least on the Chuwi Vi8 (non pro/plus) the LCD panel will show an image
shifted aprox. 20% to the left (with wraparound) and sometimes also wrong
colors, showing that the panel controller is starting with sampling the
datastream somewhere mid-line. This happens after the first blanking and
re-init of the panel.

After looking at drm.debug output I noticed that initially we inherit the
cdclk of 333333 KHz set by the GOP, but after the re-init we picked 266667
KHz, which turns out to be the cause of this problem, a quick hack to hard
code the cdclk to 333333 KHz makes the problem go away.

I've tested this on various Bay Trail devices, to make sure this not does
cause regressions on other devices and the higher cdclk does not cause
any problems on the following devices:
-GP-electronic T701      1024x600   333333 KHz cdclk after this patch
-PEAQ C1010              1920x1200  333333 KHz cdclk after this patch
-PoV mobii-wintab-800w    800x1280  333333 KHz cdclk after this patch
-Asus Transformer-T100TA 1368x768   320000 KHz cdclk after this patch

Also interesting wrt this is the comment in vlv_calc_cdclk about the
existing workaround to avoid 200 Mhz as clock because that causes issues
in some cases.

This commit extends the "do not use 200 Mhz" workaround with an extra
check to require atleast 320000 KHz (avoiding 266667 KHz) when a DSI
panel is active.

Changes in v2:
-Change the commit message and the code comment to not treat the GOP as
 a reference, the GOP should not be treated as a reference

Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171220105017.11259-1-hdegoede@redhat.com
(cherry picked from commit c8dae55a8c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-02-14 11:39:44 -08:00
Rodrigo Vivi a885691943 Merge tag 'gvt-fixes-2018-02-14' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2018-02-14

- gtt mmio 8b access fix (Tina)
- one KBL required mmio reg for switch (Weinan)
- one trace log typo fix (Weinan)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180214052827.4nny7vkcoca4vjhn@zhen-hp.sh.intel.com
2018-02-14 11:23:21 -08:00
Will Deacon 2ce77f6d8a arm64: proc: Set PTE_NG for table entries to avoid traversing them twice
When KASAN is enabled, the swapper page table contains many identical
mappings of the zero page, which can lead to a stall during boot whilst
the G -> nG code continually walks the same page table entries looking
for global mappings.

This patch sets the nG bit (bit 11, which is IGNORED) in table entries
after processing the subtree so we can easily skip them if we see them
a second time.

Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-14 18:58:20 +00:00
Linus Torvalds 6556677a80 Fix regressions in patch Implement iomap for block_map
This tag is meant for pulling a patch called gfs2: Fixes to
 "Implement iomap for block_map". The patch fixes some
 regressions we recently discovered in commit 3974320ca6.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJahFiJAAoJENeLYdPf93o7JKYH/irlIZM7NPHhiOcot1lXG6HL
 x1fV9u6Rjw7QimctgM6ks1lu/R7hamNvOCAPz7TFXIo0grWes2qOcZa7tdWqkpZK
 TGmSIv+NfrI9NzB3PwleImClfHR8SOgIh/ZlvHQWu9JvKkPlZ3Ik0mZCXbzUFn0I
 Q5ebe+yvaaGeU3QUzsdBgTWuYRE0uQfIylyTz7f8wc9PDp2zB2l01CCCbat/VEWe
 Jy1HlXSiQsmR0N5ypm5d3AszXJ0zbHfjQzKpNACP59WrRjnKvxsBan7En5pQBFnP
 lhLWClqxgtXlvmSb4Takw+Cu9aS2zCYizQ8eqecX5FKQp1Vufoxs48EqRnq55IY=
 =vJqP
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-4.16.rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Bob Peterson:
 "Fix regressions in the gfs2 iomap for block_map implementation we
  recently discovered in commit 3974320ca6"

* tag 'gfs2-4.16.rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Fixes to "Implement iomap for block_map"
2018-02-14 10:14:59 -08:00
Linus Torvalds 694a20dae6 powerpc fixes for 4.16 #2
A larger batch of fixes than we'd like. Roughly 1/3 fixes for new code, 1/3
 fixes for stable and 1/3 minor things.
 
 There's four commits fixing bugs when using 16GB huge pages on hash, caused by
 some of the preparatory changes for pkeys.
 
 Two fixes for bugs in the enhanced IRQ soft masking for local_t, one of which
 broke KVM in some circumstances.
 
 Four fixes for Power9. The most bizarre being a bug where futexes stopped
 working because a NULL pointer dereference didn't trap during early boot (it
 aliased the kernel mapping). A fix for memory hotplug when using the Radix MMU,
 and a fix for live migration of guests using the Radix MMU.
 
 Two fixes for hotplug on pseries machines. One where we weren't correctly
 updating NUMA info when CPUs are added and removed. And the other fixes
 crashes/hangs seen when doing memory hot remove during boot, which is apparently
 a thing people do.
 
 Finally a handful of build fixes for obscure configs and other minor fixes.
 
 Thanks to:
   Alexey Kardashevskiy, Aneesh Kumar K.V, Balbir Singh, Colin Ian King, Daniel
   Henrique Barboza, Florian Weimer, Guenter Roeck, Harish, Laurent Vivier,
   Madhavan Srinivasan, Mauricio Faria de Oliveira, Nathan Fontenot, Nicholas
   Piggin, Sam Bobroff.
 -----BEGIN PGP SIGNATURE-----
 
 iQIwBAABCAAaBQJahDTmExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYAd
 chAAtVe8hmkEJefTbU63GBeqva0JHSiTu2DENZAlN/epWtbtyl05PLETMdTcwGCv
 nK2zzR+xbSFN1DzZK8KQfDBW33McKZE+YkHwYOC8Kff/N0SKdHK4zvxYr7FTZGzG
 9uSG5vrxVEsPLT/yANabl0d0vKWMsJ1jZquvJAU0eLNUbA/skGjEPADtXqYQUXiA
 EnW4xeczsMLjuzTleoRqrBx74Gulovuq9LVAjfDvkydWlCU9MQkrodCgP0V2hQtw
 RAJ/QLY+NS/vMCBnvVOGBaKzIqrfeQTHF3P0j4pyBeBq/2kNuidM5n25uoc31wUq
 DE4Ebe2FJA6CHP5KEyf7dr9y7gsks/ak3/CKs+l6Yz3/0BqenEMhu6WKJ1tgf9cC
 qAmi1dIjtpw6JZ6baCbkloUdAGNjKVfLWB9ld9VIfg0C+C3y4L7+TKJukxrCBGI6
 hopfT/3p8xUdla3euiRXRLZzajyKDGrqk71hk5J/J0ChXfWB0B51X0F6NIfH41Mn
 YsVUQ95p3zS79Pl942ijGScFX/bNVLfEEGzlI/nwU/wbTxF5g/XNXm5PjBsGSr/W
 zFcCwCpFV2b/kypQoxQA5CbrKRCLOleDA/lLOxW/1NMYOQsNj05DM9wYAw5Bl+lX
 AVj2c5jM9heNN4scxDiufRNfqZbyjZ4fFUpXLNqs7N5vcks=
 =BmuL
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "A larger batch of fixes than we'd like. Roughly 1/3 fixes for new
  code, 1/3 fixes for stable and 1/3 minor things.

  There's four commits fixing bugs when using 16GB huge pages on hash,
  caused by some of the preparatory changes for pkeys.

  Two fixes for bugs in the enhanced IRQ soft masking for local_t, one
  of which broke KVM in some circumstances.

  Four fixes for Power9. The most bizarre being a bug where futexes
  stopped working because a NULL pointer dereference didn't trap during
  early boot (it aliased the kernel mapping). A fix for memory hotplug
  when using the Radix MMU, and a fix for live migration of guests using
  the Radix MMU.

  Two fixes for hotplug on pseries machines. One where we weren't
  correctly updating NUMA info when CPUs are added and removed. And the
  other fixes crashes/hangs seen when doing memory hot remove during
  boot, which is apparently a thing people do.

  Finally a handful of build fixes for obscure configs and other minor
  fixes.

  Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Balbir Singh, Colin
  Ian King, Daniel Henrique Barboza, Florian Weimer, Guenter Roeck,
  Harish, Laurent Vivier, Madhavan Srinivasan, Mauricio Faria de
  Oliveira, Nathan Fontenot, Nicholas Piggin, Sam Bobroff"

* tag 'powerpc-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Fix to use ucontext_t instead of struct ucontext
  powerpc/kdump: Fix powernv build break when KEXEC_CORE=n
  powerpc/pseries: Fix build break for SPLPAR=n and CPU hotplug
  powerpc/mm/hash64: Zero PGD pages on allocation
  powerpc/mm/hash64: Store the slot information at the right offset for hugetlb
  powerpc/mm/hash64: Allocate larger PMD table if hugetlb config is enabled
  powerpc/mm: Fix crashes with 16G huge pages
  powerpc/mm: Flush radix process translations when setting MMU type
  powerpc/vas: Don't set uses_vas for kernel windows
  powerpc/pseries: Enable RAS hotplug events later
  powerpc/mm/radix: Split linear mapping on hot-unplug
  powerpc/64s/radix: Boot-time NULL pointer protection using a guard-PID
  ocxl: fix signed comparison with less than zero
  powerpc/64s: Fix may_hard_irq_enable() for PMI soft masking
  powerpc/64s: Fix MASKABLE_RELON_EXCEPTION_HV_OOL macro
  powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove
2018-02-14 10:06:41 -08:00
Nitzan Carmi 8000d1fdb0 nvme-rdma: fix sysfs invoked reset_ctrl error flow
When reset_controller that is invoked by sysfs fails,
it enters an error flow which practically removes the
nvme ctrl entirely (similar to delete_ctrl flow). It
causes the system to hang, since a sysfs attribute cannot
be unregistered by one of its own methods.

This can be fixed by calling delete_ctrl as a work rather
than sequential code. In addition, it should give the ctrl
a chance to recover using reconnection mechanism (consistant
with FC reset_ctrl error flow). Also, while we're here, return
suitable errno in case the reset ended with non live ctrl.

Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2018-02-14 15:44:22 +02:00
Israel Rukshin 7756f72ccd nvmet: Change return code of discard command if not supported
Execute discard command on block device that doesn't support it
should return success.
Returning internal error while using multi-path fails the path.

Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2018-02-14 15:38:59 +02:00
Christian Borntraeger fa08a3b4eb virtio/s390: implement PM operations for virtio_ccw
Suspend/Resume to/from disk currently fails. Let us wire
up the necessary callbacks. This is mostly just forwarding
the requests to the virtio drivers. The only thing that
has to be done in virtio_ccw itself is to re-set the
virtio revision.

Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20171207141102.70190-2-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[CH: merged <20171218083706.223836-1-borntraeger@de.ibm.com> to fix
!CONFIG_PM configs]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-14 14:34:09 +02:00
Jan-Marek Glogowski fdcc968a3b ALSA: hda/realtek: PCI quirk for Fujitsu U7x7
These laptops have a combined jack to attach headsets, the U727 on
the left, the U757 on the right, but a headsets microphone doesn't
work. Using hdajacksensetest I found that pin 0x19 changed the
present state when plugging the headset, in addition to 0x21, but
didn't have the correct configuration (shown as "Not connected").

So this sets the configuration to the same values as the headphone
pin 0x21 except for the device type microphone, which makes it
work correctly. With the patch the configured pins for U727 are

Pin 0x12 (Internal Mic, Mobile-In): present = No
Pin 0x14 (Internal Speaker): present = No
Pin 0x19 (Black Mic, Left side): present = No
Pin 0x1d (Internal Aux): present = No
Pin 0x21 (Black Headphone, Left side): present = No

Signed-off-by: Jan-Marek Glogowski <glogow@fbihome.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-02-14 12:02:26 +01:00
Phil Elwell 118032be38 mmc: bcm2835: Don't overwrite max frequency unconditionally
The optional DT parameter max-frequency could init the max bus frequency.
So take care of this, before setting the max bus frequency.

Fixes: 660fc733bd ("mmc: bcm2835: Add new driver for the sdhost controller.")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-02-14 11:30:10 +01:00
Jerome Brunet fe0e58048f Revert "mmc: meson-gx: include tx phase in the tuning process"
This reverts commit 0a44697627.

This commit was initially intended to fix problems with hs200 and hs400
on some boards, mainly the odroid-c2. The OC2 (Rev 0.2) I have performs
well in this modes, so I could not confirm these issues.

We've had several reports about the issues being still present on (some)
OC2, so apparently, this change does not do what it was supposed to do.
Maybe the eMMC signal quality is on the edge on the board. This may
explain the variability we see in term of stability, but this is just a
guess. Lowering the max_frequency to 100Mhz seems to do trick for those
affected by the issue

Worse, the commit created new issues (CRC errors and hangs) on other
boards, such as the kvim 1 and 2, the p200 or the libretech-cc.

According to amlogic, the Tx phase should not be tuned and left in its
default configuration, so it is best to just revert the commit.

Fixes: 0a44697627 ("mmc: meson-gx: include tx phase in the tuning process")
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-02-14 11:30:03 +01:00
Takashi Iwai d15d662e89 ALSA: seq: Fix racy pool initializations
ALSA sequencer core initializes the event pool on demand by invoking
snd_seq_pool_init() when the first write happens and the pool is
empty.  Meanwhile user can reset the pool size manually via ioctl
concurrently, and this may lead to UAF or out-of-bound accesses since
the function tries to vmalloc / vfree the buffer.

A simple fix is to just wrap the snd_seq_pool_init() call with the
recently introduced client->ioctl_mutex; as the calls for
snd_seq_pool_init() from other side are always protected with this
mutex, we can avoid the race.

Reported-by: 范龙飞 <long7573@126.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-02-14 10:39:08 +01:00
Weinan Li 3cc7644e4a drm/i915/gvt: fix one typo of render_mmio trace
Fix one typo of render_mmio trace, exchange the mmio value of old and new.

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-02-14 10:35:00 +08:00
Tina Zhang a26ca6ad4c drm/i915/gvt: Support BAR0 8-byte reads/writes
GGTT is in BAR0 with 8 bytes aligned. With a qemu patch (commit:
38d49e8c1523d97d2191190d3f7b4ce7a0ab5aa3), VFIO can use 8-byte reads/
writes to access it.

This patch is to support the 8-byte GGTT reads/writes.

Ideally, we would like to support 8-byte reads/writes for the total BAR0.
But it needs more work for handling 8-byte MMIO reads/writes.

This patch can fix the issue caused by partial updating GGTT entry, during
guest booting up.

v3:
- Use intel_vgpu_get_bar_gpa() stead. (Zhenyu)
- Include all the GGTT checking logic in gtt_entry(). (Zhenyu)

v2:
- Limit to GGTT entry. (Zhenyu)

Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-02-14 10:34:44 +08:00
Weinan Li 37ad4e6878 drm/i915/gvt: add 0xe4f0 into gen9 render list
Guest may set this register on KBL platform, it can impact hardware
behavior, so add it into the gen9 render list. Otherwise gpu hang issue may
happen during different vgpu switch.

v2: separate it from patch set.

Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-02-14 10:34:30 +08:00
Chris Wilson 4b8b41d15d drm/i915/pmu: Fix building without CONFIG_PM
As we peek inside struct device to query members guarded by CONFIG_PM,
so must be the code.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 1fe699e301 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207160428.17015-1-chris@chris-wilson.co.uk
(cherry picked from commit 05273c950a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-4-tvrtko.ursulin@linux.intel.com
2018-02-13 16:56:06 -08:00
Tvrtko Ursulin 4c83f0a788 drm/i915/pmu: Fix sleep under atomic in RC6 readout
We are not allowed to call intel_runtime_pm_get from the PMU counter read
callback since the former can sleep, and the latter is running under IRQ
context.

To workaround this, we record the last known RC6 and while runtime
suspended estimate its increase by querying the runtime PM core
timestamps.

Downside of this approach is that we can temporarily lose a chunk of RC6
time, from the last PMU read-out to runtime suspend entry, but that will
eventually catch up, once device comes back online and in the presence of
PMU queries.

Also, we have to be careful not to overshoot the RC6 estimate, so once
resumed after a period of approximation, we only update the counter once
it catches up. With the observation that RC6 is increasing while the
device is suspended, this should not pose a problem and can only cause
slight inaccuracies due clock base differences.

v2: Simplify by estimating on top of PM core counters. (Imre)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943
Fixes: 6060b6aec0 ("drm/i915/pmu: Add RC6 residency metrics")
Testcase: igt/perf_pmu/rc6-runtime-pm
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206183311.17924-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 1fe699e301)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-3-tvrtko.ursulin@linux.intel.com
2018-02-13 16:56:03 -08:00
Tvrtko Ursulin d3f84c8b09 drm/i915/pmu: Fix PMU enable vs execlists tasklet race
Commit 99e48bf98d ("drm/i915: Lock out execlist tasklet while peeking
inside for busy-stats") added a tasklet_disable call in busy stats
enabling, but we failed to understand that the PMU enable callback runs
as an hard IRQ (IPI).

Consequence of this is that the PMU enable callback can interrupt the
execlists tasklet, and will then deadlock when it calls
intel_engine_stats_enable->tasklet_disable.

To fix this, I realized it is possible to move the engine stats enablement
and disablement to PMU event init and destroy hooks. This allows for much
simpler implementation since those hooks run in normal context (can
sleep).

v2: Extract engine_event_destroy. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 99e48bf98d ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats")
Testcase: igt/perf_pmu/enable-race-*
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205093448.13877-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit b2f78cda26)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-2-tvrtko.ursulin@linux.intel.com
2018-02-13 16:55:59 -08:00
Chris Wilson edb76b01ac drm/i915: Lock out execlist tasklet while peeking inside for busy-stats
In order to prevent a race condition where we may end up overaccounting
the active state and leaving the busy-stats believing the GPU is 100%
busy, lock out the tasklet while we reconstruct the busy state. There is
no direct spinlock guard for the execlists->port[], so we need to
utilise tasklet_disable() as a synchronous barrier to prevent it, the
only writer to execlists->port[], from running at the same time as the
enable.

Fixes: 4900727d35 ("drm/i915/pmu: Reconstruct active state on starting busy-stats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115092041.13509-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit 99e48bf98d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213095747.2424-1-tvrtko.ursulin@linux.intel.com
2018-02-13 16:55:55 -08:00
Chris Wilson 117172c8f9 drm/i915/breadcrumbs: Ignore unsubmitted signalers
When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno, or it may match the cleared seqno with the
now zero rq->global_seqno. This confuses the signaler into action and
signaling the fence.

Fixes: d6a2289d9d ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.12+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@chris-wilson.co.uk
(cherry picked from commit fd10e2ce99)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213090154.17373-1-chris@chris-wilson.co.uk
2018-02-13 16:55:45 -08:00
Keith Busch 4244140d7b nvme-pci: Fix timeouts in connecting state
We need to halt the controller immediately if we haven't completed
initialization as indicated by the new "connecting" state.

Fixes: ad70062cdb ("nvme-pci: introduce RECONNECTING state to mark initializing procedure")
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-02-13 17:09:50 -07:00
Keith Busch 815c6704bf nvme-pci: Remap CMB SQ entries on every controller reset
The controller memory buffer is remapped into a kernel address on each
reset, but the driver was setting the submission queue base address
only on the very first queue creation. The remapped address is likely to
change after a reset, so accessing the old address will hit a kernel bug.

This patch fixes that by setting the queue's CMB base address each time
the queue is created.

Fixes: f63572dff1 ("nvme: unmap CMB and remove sysfs file in reset path")
Reported-by: Christian Black <christian.d.black@intel.com>
Cc: Jon Derrick <jonathan.derrick@intel.com>
Cc: <stable@vger.kernel.org> # 4.9+
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-02-13 17:09:50 -07:00
Jianchao Wang 3fd176b754 nvme: fix the deadlock in nvme_update_formats
nvme_update_formats will invoke nvme_ns_remove under namespaces_mutext.
The will cause deadlock because nvme_ns_remove will also require
the namespaces_mutext. Fix it by getting the ns entries which should
be removed under namespaces_mutext and invoke nvme_ns_remove out of
namespaces_mutext.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2018-02-13 17:09:50 -07:00
Andreas Gruenbacher 49edd5bf42 gfs2: Fixes to "Implement iomap for block_map"
It turns out that commit 3974320ca6 "Implement iomap for block_map"
introduced a few bugs that trigger occasional failures with xfstest
generic/476:

In gfs2_iomap_begin, we jump to do_alloc when we determine that we are
beyond the end of the allocated metadata (height > ip->i_height).
There, we can end up calling hole_size with a metapath that doesn't
match the current metadata tree, which doesn't make sense.  After
untangling the code at do_alloc, fix this by checking if the block we
are looking for is within the range of allocated metadata.

In addition, add a BUG() in case gfs2_iomap_begin is accidentally called
for reading stuffed files: this is handled separately.  Make sure we
don't truncate iomap->length for reads beyond the end of the file; in
that case, the entire range counts as a hole.

Finally, revert to taking a bitmap write lock when doing allocations.
It's unclear why that change didn't lead to any failures during testing.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-02-13 13:38:10 -07:00
Linus Torvalds 61f14c015f MIPS changes for 4.16-rc2
A single change (and associated DT binding update) to allow the address
 of the MIPS Cluster Power Controller (CPC) to be chosen by DT, which
 allows SMP to work on generic MIPS kernels where the bootloader hasn't
 configured the CPC address (i.e. the new Ranchu platform).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAlqCxagACgkQbAtpk944
 dnqc5g//Ss6wWO2hgUtzUTEwPgmCLANG1cVvykI0a1rROzdnEUI8CRUoiRlD0A0B
 gxpDZKNXWkpZ1veL7XuwKjFLk1tW7a/MNB/nxIPhruGRl1uBH8BOo3OXYE/ZzvKl
 74kN55Ykz4sltlyuTSSG/6e41ysXkSB4xJdTb/hx6jPOVwFM4RoOFODmhRYf5mKO
 p9N9ZYpqC07IYL6upRqEhEG1LePio3aVx66ngq+d+8SOISMP3puXf5TkvRFkkCfz
 OSPsvDtbsm8tf1yM4vvw7PNK4JsuS+OjbDMaLZXZFy+OBMAb0VJ2ZCG9OM5Chkvc
 Dqkb5Ds0pB0kYGHL8bh726q5NmcIVfKT5k0XRyz5a3weHdSbCn5/pHPg5uxtvlDP
 xt2i6k3HJjoMb5FmbhObROf6O904d5vi4u0E17EefWOwEaDn23PruzqUDqAGgq4g
 k84hXuVSZd/Ymu/9Lh+KYlhyiqCKcReleIRzg+ySU5bmXZR8izkiTdU1NIXRH4mg
 4xi7SV/tygACd0cu42CF6b5lOWIGBZZ5qtyI93cfWRCngL2LT0rYfCNg+IuuK9eb
 hZ2YZ7AjqUWYMPQgxHJ6rPLslY9/LDiW3OrtL7/3gEQyC3D41XYSIFThMO+DDC5c
 Ok6nJNxnEE3AvqE5iHjr/PA3GRx6bUmv/2Ty+DzDqWnO7Gayxls=
 =E7mu
 -----END PGP SIGNATURE-----

Merge tag 'mips_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips

Pull MIPS fix from James Hogan:
 "A single change (and associated DT binding update) to allow the
  address of the MIPS Cluster Power Controller (CPC) to be chosen by DT,
  which allows SMP to work on generic MIPS kernels where the bootloader
  hasn't configured the CPC address (i.e. the new Ranchu platform)"

* tag 'mips_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
  MIPS: CPC: Map registers using DT in mips_cpc_default_phys_base()
  dt-bindings: Document mti,mips-cpc binding
2018-02-13 09:35:17 -08:00
Christoph Hellwig 7bcfab202c powerpc/macio: set a proper dma_coherent_mask
We have expected busses to set up a coherent mask to properly use the
common dma mapping code for a long time, and now that I've added a warning
macio turned out to not set one up yet.  This sets it to the same value
as the dma_mask, which seems to be what the drivers expect.

Reported-by: Mathieu Malaterre <malat@debian.org>
Tested-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-02-13 08:58:53 -08:00
Nitesh Shetty 67b4110f8c blk: optimization for classic polling
This removes the dependency on interrupts to wake up task. Set task
state as TASK_RUNNING, if need_resched() returns true,
while polling for IO completion.
Earlier, polling task used to sleep, relying on interrupt to wake it up.
This made some IO take very long when interrupt-coalescing is enabled in
NVMe.

Reference:
http://lists.infradead.org/pipermail/linux-nvme/2018-February/015435.html

Changes since v2->v3:
	-using __set_current_state() instead of set_current_state()

Changes since v1->v2:
	-setting task state once in blk_poll, instead of multiple
callers.

Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-13 09:12:04 -07:00
Tony Luck fd0e786d9d x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
In the following commit:

  ce0fa3e56a ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")

... we added code to memory_failure() to unmap the page from the
kernel 1:1 virtual address space to avoid speculative access to the
page logging additional errors.

But memory_failure() may not always succeed in taking the page offline,
especially if the page belongs to the kernel.  This can happen if
there are too many corrected errors on a page and either mcelog(8)
or drivers/ras/cec.c asks to take a page offline.

Since we remove the 1:1 mapping early in memory_failure(), we can
end up with the page unmapped, but still in use. On the next access
the kernel crashes :-(

There are also various debug paths that call memory_failure() to simulate
occurrence of an error. Since there is no actual error in memory, we
don't need to map out the page for those cases.

Revert most of the previous attempt and keep the solution local to
arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:

	1) there is a real error
	2) memory_failure() succeeds.

All of this only applies to 64-bit systems. 32-bit kernel doesn't map
all of memory into kernel space. It isn't worth adding the code to unmap
the piece that is mapped because nobody would run a 32-bit kernel on a
machine that has recoverable machine checks.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert (Persistent Memory) <elliott@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: stable@vger.kernel.org #v4.14
Fixes: ce0fa3e56a ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 16:25:06 +01:00
Tycho Andersen 2dd6fd2e99 locking/semaphore: Update the file path in documentation
While reading this header I noticed that the locking stuff has moved to
kernel/locking/*, so update the path in semaphore.h to point to that.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180201114119.1090-1-tycho@tycho.ws
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 15:00:06 +01:00
Will Deacon 61e02392d3 locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()
A test_and_{}_bit() operation fails if the value of the bit is such that
the modification does not take place. For example, if test_and_set_bit()
returns 1. In these cases, follow the behaviour of cmpxchg and allow the
operation to be unordered. This also applies to test_and_set_bit_lock()
if the lock is found to be be taken already.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518528619-20049-1-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:55:53 +01:00
Will Deacon 11dc13224c locking/qspinlock: Ensure node->count is updated before initialising node
When queuing on the qspinlock, the count field for the current CPU's head
node is incremented. This needn't be atomic because locking in e.g. IRQ
context is balanced and so an IRQ will return with node->count as it
found it.

However, the compiler could in theory reorder the initialisation of
node[idx] before the increment of the head node->count, causing an
IRQ to overwrite the initialised node and potentially corrupt the lock
state.

Avoid the potential for this harmful compiler reordering by placing a
barrier() between the increment of the head node->count and the subsequent
node initialisation.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518528177-19169-3-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:50:14 +01:00
Will Deacon 95bcade33a locking/qspinlock: Ensure node is initialised before updating prev->next
When a locker ends up queuing on the qspinlock locking slowpath, we
initialise the relevant mcs node and publish it indirectly by updating
the tail portion of the lock word using xchg_tail. If we find that there
was a pre-existing locker in the queue, we subsequently update their
->next field to point at our node so that we are notified when it's our
turn to take the lock.

This can be roughly illustrated as follows:

  /* Initialise the fields in node and encode a pointer to node in tail */
  tail = initialise_node(node);

  /*
   * Exchange tail into the lockword using an atomic read-modify-write
   * operation with release semantics
   */
  old = xchg_tail(lock, tail);

  /* If there was a pre-existing waiter ... */
  if (old & _Q_TAIL_MASK) {
	prev = decode_tail(old);
	smp_read_barrier_depends();

	/* ... then update their ->next field to point to node.
	WRITE_ONCE(prev->next, node);
  }

The conditional update of prev->next therefore relies on the address
dependency from the result of xchg_tail ensuring order against the
prior initialisation of node. However, since the release semantics of
the xchg_tail operation apply only to the write portion of the RmW,
then this ordering is not guaranteed and it is possible for the CPU
to return old before the writes to node have been published, consequently
allowing us to point prev->next to an uninitialised node.

This patch fixes the problem by making the update of prev->next a RELEASE
operation, which also removes the reliance on dependency ordering.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518528177-19169-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:50:14 +01:00
Arnd Bergmann 01684e72f1 x86/error_inject: Make just_return_func() globally visible
With link time optimizations enabled, I get a link failure:

  ./ccLbOEHX.ltrans19.ltrans.o: In function `override_function_with_return':
  <artificial>:(.text+0x7f3): undefined reference to `just_return_func'

Marking the symbol .globl makes it work as expected.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 540adea380 ("error-injection: Separate error-injection from kprobe")
Link: http://lkml.kernel.org/r/20180202145634.200291-3-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:33:35 +01:00
mike.travis@hpe.com c25d99d20b x86/platform/UV: Fix GAM Range Table entries less than 1GB
The latest UV platforms include the new ApachePass NVDIMMs into the
UV address space.  This has introduced address ranges in the Global
Address Map Table that are less than the previous lowest range, which
was 2GB.  Fix the address calculation so it accommodates address ranges
from bytes to exabytes.

Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Andrew Banman <andrew.banman@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180205221503.190219903@stormcage.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:15:45 +01:00
Marcin Nowakowski 67a3ba25aa
MIPS: Fix incorrect mem=X@Y handling
Commit 73fbc1eba7 ("MIPS: fix mem=X@Y commandline processing") added a
fix to ensure that the memory range between PHYS_OFFSET and low memory
address specified by mem= cmdline argument is not later processed by
free_all_bootmem.  This change was incorrect for systems where the
commandline specifies more than 1 mem argument, as it will cause all
memory between PHYS_OFFSET and each of the memory offsets to be marked
as reserved, which results in parts of the RAM marked as reserved
(Creator CI20's u-boot has a default commandline argument 'mem=256M@0x0
mem=768M@0x30000000').

Change the behaviour to ensure that only the range between PHYS_OFFSET
and the lowest start address of the memories is marked as protected.

This change also ensures that the range is marked protected even if it's
only defined through the devicetree and not only via commandline
arguments.

Reported-by: Mathieu Malaterre <mathieu.malaterre@gmail.com>
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@mips.com>
Fixes: 73fbc1eba7 ("MIPS: fix mem=X@Y commandline processing")
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v4.11+
Tested-by: Mathieu Malaterre <malat@debian.org>
Patchwork: https://patchwork.linux-mips.org/patch/18562/
Signed-off-by: James Hogan <jhogan@kernel.org>
2018-02-13 13:14:41 +00:00
Progyan Bhattacharya 74eb816b21 x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore
The file was generated by make command and should not be in the source tree.

Signed-off-by: Progyan Bhattacharya <progyanb@acm.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 14:10:29 +01:00
Leo Yan 43d1b29b27 sched/cpufreq: Remove unused SUGOV_KTHREAD_PRIORITY macro
Since schedutil kernel thread directly set priority to 0, the macro
SUGOV_KTHREAD_PRIORITY is not used.  So remove it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: http://lkml.kernel.org/r/1518097702-9665-1-git-send-email-leo.yan@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 13:04:03 +01:00
Jaedon Shin 627f4a2bdf
MIPS: BMIPS: Fix section mismatch warning
Remove the __init annotation from bmips_cpu_setup() to avoid the
following warning.

WARNING: vmlinux.o(.text+0x35c950): Section mismatch in reference from the function brcmstb_pm_s3() to the function .init.text:bmips_cpu_setup()
The function brcmstb_pm_s3() references
the function __init bmips_cpu_setup().
This is often because brcmstb_pm_s3 lacks a __init
annotation or the annotation of bmips_cpu_setup is wrong.

Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: linux-mips@linux-mips.org
Reviewed-by: James Hogan <jhogan@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/18589/
Signed-off-by: James Hogan <jhogan@kernel.org>
2018-02-13 11:53:28 +00:00
Masayoshi Mizuma 295cc7eb31 x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
When a physical CPU is hot-removed, the following warning messages
are shown while the uncore device is removed in uncore_pci_remove():

  WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988
  uncore_pci_remove+0xf1/0x110
  ...
  CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1
  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  ...
  Call Trace:
  pci_device_remove+0x36/0xb0
  device_release_driver_internal+0x145/0x210
  pci_stop_bus_device+0x76/0xa0
  pci_stop_root_bus+0x44/0x60
  acpi_pci_root_remove+0x1f/0x80
  acpi_bus_trim+0x54/0x90
  acpi_bus_trim+0x2e/0x90
  acpi_device_hotplug+0x2bc/0x4b0
  acpi_hotplug_work_fn+0x1a/0x30
  process_one_work+0x141/0x340
  worker_thread+0x47/0x3e0
  kthread+0xf5/0x130

When uncore_pci_remove() runs, it tries to get the package ID to
clear the value of uncore_extra_pci_dev[].dev[] by using
topology_phys_to_logical_pkg(). The warning messesages are
shown because topology_phys_to_logical_pkg() returns -1.

  arch/x86/events/intel/uncore.c:
  static void uncore_pci_remove(struct pci_dev *pdev)
  {
  ...
          phys_id = uncore_pcibus_to_physid(pdev->bus);
  ...
                  pkg = topology_phys_to_logical_pkg(phys_id); // returns -1
                  for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
                          if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
                                  uncore_extra_pci_dev[pkg].dev[i] = NULL;
                                  break;
                          }
                  }
                  WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // <=========== HERE!!

topology_phys_to_logical_pkg() tries to find
cpuinfo_x86->phys_proc_id that matches the phys_pkg argument.

  arch/x86/kernel/smpboot.c:
  int topology_phys_to_logical_pkg(unsigned int phys_pkg)
  {
          int cpu;

          for_each_possible_cpu(cpu) {
                  struct cpuinfo_x86 *c = &cpu_data(cpu);

                  if (c->initialized && c->phys_proc_id == phys_pkg)
                          return c->logical_proc_id;
          }
          return -1;
  }

However, the phys_proc_id was already set to 0 by remove_siblinginfo()
when the CPU was offlined.

So, topology_phys_to_logical_pkg() cannot find the correct
logical_proc_id and always returns -1.

As the result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning
messages are shown.

What is worse is that the bogus 'pkg' index results in two bugs:

 - We dereference uncore_extra_pci_dev[] with a negative index
 - We fail to clean up a stale pointer in uncore_extra_pci_dev[][]

To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo().

This should not cause any problems, because ->phys_proc_id is not
used after it is hot-removed and it is re-set while hot-adding.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yasu.isimatu@gmail.com
Cc: <stable@vger.kernel.org>
Fixes: 30bb981185 ("x86/topology: Avoid wasting 128k for package id array")
Link: http://lkml.kernel.org/r/ed738d54-0f01-b38b-b794-c31dc118c207@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-13 12:47:28 +01:00
Harish ecdf06e1ea selftests/powerpc: Fix to use ucontext_t instead of struct ucontext
With glibc 2.26 'struct ucontext' is removed to improve POSIX
compliance, which breaks powerpc/alignment_handler selftest. Fix the
test by using ucontext_t. Tested on ppc, works with older glibc
versions as well.

Fixes the following:
  alignment_handler.c: In function ‘sighandler’:
  alignment_handler.c:68:5: error: dereferencing pointer to incomplete type ‘struct ucontext’
    ucp->uc_mcontext.gp_regs[PT_NIP] += 4;

Signed-off-by: Harish <harish@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:44:18 +11:00
Guenter Roeck 9109617545 powerpc/kdump: Fix powernv build break when KEXEC_CORE=n
If KEXEC_CORE is not enabled, powernv builds fail as follows.

  arch/powerpc/platforms/powernv/smp.c: In function 'pnv_smp_cpu_kill_self':
  arch/powerpc/platforms/powernv/smp.c:236:4: error:
  	implicit declaration of function 'crash_ipi_callback'

Add dummy function calls, similar to kdump_in_progress(), to solve the
problem.

Fixes: 4145f35864 ("powernv/kdump: Fix cases where the kdump kernel can get HMI's")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:39:37 +11:00
Guenter Roeck 82343484a2 powerpc/pseries: Fix build break for SPLPAR=n and CPU hotplug
Commit e67e02a544 ("powerpc/pseries: Fix cpu hotplug crash with
memoryless nodes") adds an unconditional call to
find_and_online_cpu_nid(), which is only declared if CONFIG_PPC_SPLPAR
is enabled. This results in the following build error if this is not
the case.

  arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
  arch/powerpc/platforms/pseries/hotplug-cpu.c:369:
  			undefined reference to `.find_and_online_cpu_nid'

Follow the guideline provided by similar functions and provide a dummy
function if CONFIG_PPC_SPLPAR is not enabled. This also moves the
external function declaration into an include file where it should be.

Fixes: e67e02a544 ("powerpc/pseries: Fix cpu hotplug crash with memoryless nodes")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[mpe: Change subject to emphasise the build fix]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:38:39 +11:00
Aneesh Kumar K.V fc5c2f4a55 powerpc/mm/hash64: Zero PGD pages on allocation
On powerpc we allocate page table pages from slab caches of different
sizes. Currently we have a constructor that zeroes out the objects when
we allocate them for the first time.

We expect the objects to be zeroed out when we free the the object
back to slab cache. This happens in the unmap path. For hugetlb pages
we call huge_pte_get_and_clear() to do that.

With the current configuration of page table size, both PUD and PGD
level tables are allocated from the same slab cache. At the PUD level,
we use the second half of the table to store the slot information. But
we never clear that when unmapping.

When such a freed object is then allocated for a PGD page, the second
half of the page table page will not be zeroed as expected. This
results in a kernel crash.

Fix it by always clearing PGD pages when they're allocated.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Change log wording and formatting, add whitespace]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:37:48 +11:00
Aneesh Kumar K.V ff31e10546 powerpc/mm/hash64: Store the slot information at the right offset for hugetlb
The hugetlb pte entries are at the PMD and PUD level, so we can't use
PTRS_PER_PTE to find the second half of the page table. Use the right
offset for PUD/PMD to get to the second half of the table.

Fixes: bf9a95f9a6 ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:37:48 +11:00
Aneesh Kumar K.V 4a7aa4fecb powerpc/mm/hash64: Allocate larger PMD table if hugetlb config is enabled
We use the second half of the page table to store slot information, so we must
allocate it always if hugetlb is possible.

Fixes: bf9a95f9a6 ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:37:48 +11:00
Aneesh Kumar K.V fae2211697 powerpc/mm: Fix crashes with 16G huge pages
To support memory keys, we moved the hash pte slot information to the
second half of the page table. This was ok with PTE entries at level
4 (PTE page) and level 3 (PMD). We already allocate larger page table
pages at those levels to accomodate extra details. For level 4 we
already have the extra space which was used to track 4k hash page
table entry details and at level 3 the extra space was allocated to
track the THP details.

With hugetlbfs PTE, we used this extra space at the PMD level to store
the slot details. But we also support hugetlbfs PTE at PUD level for
16GB pages and PUD level page didn't allocate extra space. This
resulted in memory corruption.

Fix this by allocating extra space at PUD level when HUGETLB is
enabled.

Fixes: bf9a95f9a6 ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:37:47 +11:00
Alexey Kardashevskiy 62e984ddfd powerpc/mm: Flush radix process translations when setting MMU type
Radix guests do normally invalidate process-scoped translations when a
new pid is allocated but migrated guests do not invalidate these so
migrated guests crash sometime, especially easy to reproduce with
migration happening within first 10 seconds after the guest boot start
on the same machine.

This adds the "Invalidate process-scoped translations" flush to fix
radix guests migration.

Fixes: 2ee13be34b ("KVM: PPC: Book3S HV: Update kvmppc_set_arch_compat() for ISA v3.00")
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:37:47 +11:00
Nicholas Piggin b00b628986 powerpc/vas: Don't set uses_vas for kernel windows
cp_abort is only required for user windows, because kernel context
must not be preempted between a copy/paste pair.

Without this patch, the init task gets used_vas set when it runs the
nx842_powernv_init initcall, which opens windows for kernel usage.

used_vas is then never cleared anywhere, so it gets propagated into
all other tasks. It's a property of the address space, so it should
really be cleared when a new mm is created (or in dup_mmap if the
mmaps are marked as VM_DONTCOPY). For now we seem to have no such
driver, so leave that for another patch.

Fixes: 6c8e6bb2a5 ("powerpc/vas: Add support for user receive window")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13 22:37:46 +11:00