Commit Graph

481811 Commits

Author SHA1 Message Date
Oded Gabbay 19f6d2a660 amdkfd: Add basic modules to amdkfd
This patch adds the process module and three helper modules:

- kfd_process, which handles process which open /dev/kfd

- kfd_doorbell, which provides helper functions for doorbell allocation,
  release and mapping to userspace

- kfd_pasid, which provides helper functions for pasid allocation and release

- kfd_aperture, which provides helper functions for managing the LDS, Local GPU
  memory and Scratch memory apertures of the process

This patch only contains the basic kfd_process module, which doesn't contain
the reference to the queue scheduler. This was done to allow easier code review.

Also, this patch doesn't contain the calls to the IOMMU driver for binding the
pasid to the device. Again, this was done to allow easier code review

The kfd_process object is created when a process opens /dev/kfd and is closed
when the mm_struct of that process is teared-down.

v3:

Removed kfd_vidmem.c file
Replaced direct mmput call to mmu_notifier release
Removed typedefs
Moved bool field to end of the structure
Added new kernel params for gart usage limitation
Added initialization of sa manager
Fixed debug messages
Remove support for LDS in 32 bit
Changed code to support mmap of doorbell pages from userspace
Added documentation for apertures

v4: Replaced RCU by SRCU for kfd_process list management

v5:

Move amdkfd from drm/radeon/ to drm/amd/
Rename kfd_aperture.c to kfd_flat_memory.c
Protect against multiple init calls
MQD size is H/W dependent so moved it to device info structure
Rename kfd_mem_obj structure's members
Use delayed function for process tear-down

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 23:25:31 +03:00
Evgeny Pinchuk 5b5c4e40a3 amdkfd: Add topology module to amdkfd
This patch adds the topology module to the driver. The topology is exposed to
userspace through the sysfs.

The calls to add and remove a device to/from topology are done by the radeon
driver.

v3:

The CPU information, that is provided in the topology section of the amdkfd
driver, is extracted from the CRAT table. Unlike the CPU information located
in /sys/devices/system/cpu/cpu*, which is extracted from the SRAT table.

While the CPU information provided by the CRAT and the SRAT tables might be
identical, the node topology might be different. The SRAT table contains the
topology of CPU nodes only. The CRAT table contains the topology of CPU and GPU
nodes together (and can be interleaved). For example CPU node 1 in SRAT can be
CPU node 3 in CRAT. Furthermore it's worth to mention that the CRAT table
contains only HSA compatible nodes (nodes which are compliant with the HSA
spec).

To recap, amdkfd exposes a different kind of topology than the one exposed by
/sys/devices/system/cpu/cpu even though it may contain similar information.

v4:

The topology module doesn't support uevent handling and doesn't notify the
userspace about runtime modifications. It is up to the userspace to acquire
snapshots of the topology information created by the amdkfd and exposed
in sysfs.

The following is an example of how the topology looks on a Kaveri A10-7850K
system with amdkfd installed:

/sys/devices/virtual/kfd/kfd/
|
--- topology/
      |
      |--- generation_id
      |--- system_properties
      |--- nodes/
            |
            |--- 0/
                 |
                 |--- gpu_id
                 |--- name
                 |--- properties
                 |--- caches/
                      |
                      |--- 0/
                           |
                           |--- properties
                      |--- 1/
                           |
                           |--- properties
                      |--- 2/
                           |
                           |--- properties
                 |--- io_links/
                      |
                 |--- mem_banks/
                      |
                      |--- 0/
                           |
                           |--- properties
                      |--- 1/
                           |
                           |--- properties
                      |--- 2/
                           |
                           |--- properties
                      |--- 3/
                           |
                           |--- properties

v5:

Move amdkfd from drm/radeon/ to drm/amd/

Add a check if dev->gpu pointer is null before accessing it in the
node_show function in kfd_topology.c
This situation may occur when amdkfd is loaded and there is a GPU with a CRAT
table, but that GPU isn't supported by amdkfd

Signed-off-by: Evgeny Pinchuk <evgeny.pinchuk@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 21:22:32 +03:00
Oded Gabbay 4a488a7ad7 amdkfd: Add amdkfd skeleton driver
This patch adds the amdkfd skeleton driver. The driver does nothing except
define a /dev/kfd device.

It returns -ENODEV on all amdkfd IOCTLs.

v3: Move bool field to the end of structure, removed the pmc ioctls and added
a meaningful error message for ioctl error.

v5:

Create a new folder drm/amd and move amdkfd from drm/radeon/ to drm/amd/
Remove scheduler_class from kfd_priv.h as it was never used
Add skeleton implementation of the Get Version IOCTL

v6:
Update module version to the correct number and remove the "default m" from the
Kconfig file

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 21:08:55 +03:00
Oded Gabbay b7facbaec7 amdkfd: Add IOCTL set definitions of amdkfd
- KFD_IOC_GET_VERSION:
	Retrieves the interface version of amdkfd

- KFD_IOC_CREATE_QUEUE:
	Creates a usermode queue that runs on a specific GPU device

- KFD_IOC_DESTROY_QUEUE:
	Destroys an existing usermode queue

- KFD_IOC_SET_MEMORY_POLICY:
	Sets the memory policy of the default and alternate aperture of the
        calling process

- KFD_IOC_GET_CLOCK_COUNTERS:
	Retrieves counters (timestamps) of CPU and GPU

- KFD_IOC_GET_PROCESS_APERTURES:
	Retrieves information about process apertures that were initialized
        during the open() call of the amdkfd device

- KFD_IOC_UPDATE_QUEUE:
	Updates configuration of an existing usermode queue

v3: Remove pragma pack and pmc ioctls. Added parameter for doorbell offset and
a comment on counters

v5:

Add define for AQL queues.
Fix arguments of Get Version IOCTL
Make IOCTL's structures to be the same size on 32/64 bit

v6: Change the version of the amdkfd-thunk interface

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-16 15:55:29 +03:00
Oded Gabbay 16423d6793 Update MAINTAINERS and CREDITS files with amdkfd info
v6: Update entries to reflect new name & location of driver

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-15 13:08:36 +03:00
Oded Gabbay e28740ece3 drm/radeon: Add radeon <--> amdkfd interface
This patch adds the interface between the radeon driver and the amdkfd driver.
The interface implementation is contained in radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside radeon_device structure.

All the register accesses that amdkfd need are done using this interface. This
allows us to avoid direct register accesses in amdkfd proper,  while also
avoiding locking between amdkfd and radeon.

The single exception is the doorbells that are used in both of the drivers.
However, because they are located in separate pci bar pages, the danger of
sharing registers between the drivers is minimal.

Having said that, we are planning to move the doorbells as well to radeon.

v3:

Add interface for sa manager init and fini. The init function will allocate a
buffer on system memory and pin it to the GART address space via the radeon sa
manager.

All mappings of buffers to GART address space are done via the radeon sa
manager. The interface of allocate memory will use the radeon sa manager to sub
allocate from the single buffer that was allocated during the init function.

Change lower_32/upper_32 calls to use linux macros

Add documentation for the interface

v4:

Change ptr field type in kgd_mem from uint32_t* to void* to match to type that
is returned by radeon_sa_bo_cpu_addr

v5:

Change format of mqd structure to work with latest KV firmware
Add support for AQL queues creation to enable working with open-source HSA
runtime.
Move generic kfd-->kgd interface and other generic kgd definitions to a generic
header file that will be used by AMD's radeon and amdgpu drivers

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-15 13:53:32 +03:00
Oded Gabbay 1c0a46255f drm/radeon: adding synchronization for GRBM GFX
Implementing a lock for selecting and accessing shader engines and arrays.
This lock will make sure that radeon and amdkfd are not colliding when
accessing shader engines and arrays with GRBM_GFX_INDEX register.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-14 15:36:08 +03:00
Oded Gabbay ebff8453d3 drm/radeon: Report doorbell configuration to amdkfd
radeon and amdkfd share the doorbell aperture.
radeon sets it up, takes the doorbells required for its own rings
and reports the setup to amdkfd.
radeon reserved doorbells are at the start of the doorbell aperture.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-01-28 14:43:19 +02:00
Oded Gabbay 28b57b856b drm/radeon/cik: Don't touch int of pipes 1-7
amdkfd should set interrupts for pipes 1-7.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-02-11 18:28:24 +02:00
Oded Gabbay 62a7b7fbd0 drm/radeon: reduce number of free VMIDs and pipes in KV
To support HSA on KV, we need to limit the number of vmids and pipes
that are available for radeon's use with KV.

This patch reserves VMIDs 8-15 for amdkfd (so radeon can only use VMIDs
0-7) and also makes radeon thinks that KV has only a single MEC with a single
pipe in it

v3: Use define for static vmid allocation in radeon

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-01-16 17:35:44 +02:00
Oded Gabbay a015c1e926 iommu/amd: fix accounting of device_state
This patch fixes a bug in the accounting of the device_state.
In the current code, the device_state was put (decremented) too many times,
which sometimes lead to the driver getting stuck permanently in
put_device_state_wait(). That happen because the device_state->count would go
below zero, which is never supposed to happen.

The root cause is that the device_state was decremented in put_pasid_state()
and put_pasid_state_wait() but also in all the functions that call those
functions. Therefore, the device_state was decremented twice in each of these
code paths.

The fix is to decouple the device_state accounting from the pasid_state
accounting - remove the call to put_device_state() from the
put_pasid_state() and the put_pasid_state_wait())

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-10 10:57:36 +02:00
Joerg Roedel e7cc3dd48c iommu/amd: use new invalidate_range mmu-notifier
Make use of the new invalidate_range mmu_notifier call-back and remove the
old logic of assigning an empty page-table between invalidate_range_start
and invalidate_range_end.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-13 13:46:09 +11:00
Joerg Roedel 0f0a327fa1 mmu_notifier: add the callback for mmu_notifier_invalidate_range()
Now that the mmu_notifier_invalidate_range() calls are in place, add the
callback to allow subsystems to register against it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Oded Gabbay <Oded.Gabbay@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-13 13:46:09 +11:00
Joerg Roedel 34ee645e83 mmu_notifier: call mmu_notifier_invalidate_range() from VMM
Add calls to the new mmu_notifier_invalidate_range() function to all
places in the VMM that need it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Oded Gabbay <Oded.Gabbay@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-13 13:46:09 +11:00
Joerg Roedel 1897bdc4d3 mmu_notifier: add mmu_notifier_invalidate_range()
This notifier closes an important gap in the current mmu_notifier
implementation, the existing callbacks are called too early or too late to
reliably manage a non-CPU TLB.  Specifically, invalidate_range_start() is
called when all pages are still mapped and invalidate_range_end() when all
pages are unmapped and potentially freed.

This is fine when the users of the mmu_notifiers manage their own SoftTLB,
like KVM does.  When the TLB is managed in software it is easy to wipe out
entries for a given range and prevent new entries to be established until
invalidate_range_end is called.

But when the user of mmu_notifiers has to manage a hardware TLB it can
still wipe out TLB entries in invalidate_range_start, but it can't make
sure that no new TLB entries in the given range are established between
invalidate_range_start and invalidate_range_end.

To avoid silent data corruption the entries in the non-CPU TLB need to be
flushed when the pages are unmapped (at this point in time no _new_ TLB
entries can be established in the non-CPU TLB) but not yet freed (as the
non-CPU TLB may still have _existing_ entries pointing to the pages about
to be freed).

To fix this problem we need to catch the moment when the Linux VMM flushes
remote TLBs (as a non-CPU TLB is not very CPU TLB), as this is the point
in time when the pages are unmapped but _not_ yet freed.

The mmu_notifier_invalidate_range() function aims to catch that moment.

IOMMU code will be one user of the notifier-callback.  Currently this is
only the AMD IOMMUv2 driver, but its code is about to be more generalized
and converted to a generic IOMMU-API extension to fit the needs of similar
functionality in other IOMMUs as well.

The current attempt in the AMD IOMMUv2 driver to work around the
invalidate_range_start/end() shortcoming is to assign an empty page table
to the non-CPU TLB between any invalidata_range_start/end calls.  With the
empty page-table assigned, every page-table walk to re-fill the non-CPU
TLB will cause a page-fault reported to the IOMMU driver via an interrupt,
possibly causing interrupt storms.

The page-fault handler in the AMD IOMMUv2 driver doesn't handle the fault
if an invalidate_range_start/end pair is active, it just reports back
SUCCESS to the device and let it refault the page.  But existing hardware
(newer Radeon GPUs) that makes use of this feature don't re-fault
indefinitly, after a certain number of faults for the same address the
device enters a failure state and needs to be resetted.

To avoid the GPUs entering a failure state we need to get rid of the
empty-page-table workaround and use the mmu_notifier_invalidate_range()
function introduced with this patch.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Oded Gabbay <Oded.Gabbay@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-11-13 13:46:09 +11:00
Dave Airlie 7fd36c0bae Merge branch 'drm-next-3.19' of git://people.freedesktop.org/~agd5f/linux into drm-next
Radeon patches for 3.19.  Christian has a number of GPUVM improvements
slated as well, but I'd like to wait until he gets back to work next week
to pull those in. Highlights of this pull:
- ttm performance improvements
- CI dpm fixes

* 'drm-next-3.19' of git://people.freedesktop.org/~agd5f/linux: (26 commits)
  drm/radeon/si/ci: make u8 static arrays constant
  drm/radeon: set power control in ci dpm enable
  drm/radeon: powertune fixes for hawaii
  drm/radeon: fix dpm mc init for certain hawaii boards
  drm/radeon: set bootup pcie level to max for ci dpm
  drm/radeon: fix default dpm state setup
  drm/radeon: workaround a hw bug in bonaire pcie dpm
  drm/radeon: fix mclk vddc configuration for cards for hawaii
  drm/radeon: fix sclk DS enablement
  drm/radeon: fix activity settings for sclk and mclk for CI
  drm/radeon: improve mclk param calcuations for ci dpm
  drm/radeon: fix dram timing for certain hawaii boards
  drm/radeon: switch force state commands for CI
  drm/radeon: fix for memory training on bonaire 0x6649
  drm/radeon/ci: handle gpio controlled dpm features properly
  drm/radeon: store the gpio shift as well
  drm/radeon: export radeon_atombios_lookup_gpio
  drm/radeon: fix typo in CI dpm disable
  drm/radeon: rework CI dpm thermal setup
  drm/radeon: rework SI dpm thermal setup
  ...
2014-11-13 09:06:41 +10:00
Dave Airlie c81b99423b drm/radeon/si/ci: make u8 static arrays constant
These two arrays don't change, just make them constant,
reduces data segment by a few bytes.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:46 -05:00
Alex Deucher b94b95e7e3 drm/radeon: set power control in ci dpm enable
Necessary for poper operation.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:46 -05:00
Alex Deucher 542b379b55 drm/radeon: powertune fixes for hawaii
- bapm is not available on hawaii
- update pt defaults

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:45 -05:00
Alex Deucher 90b2fee35c drm/radeon: fix dpm mc init for certain hawaii boards
Needs special overrides for certain vram configurations.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:44 -05:00
Alex Deucher 4e21518c3d drm/radeon: set bootup pcie level to max for ci dpm
Avoids problems when re-loading the driver.  Does not
affect power saving when dpm is enabled.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:44 -05:00
Alex Deucher b6b41cf3b6 drm/radeon: fix default dpm state setup
Only enable the first levels for mclk and sclk.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:43 -05:00
Alex Deucher 36654dd4b9 drm/radeon: workaround a hw bug in bonaire pcie dpm
Some boards get stuck in pcie x1 otherwise.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:42 -05:00
Alex Deucher 127e056e2a drm/radeon: fix mclk vddc configuration for cards for hawaii
Need to use vddc0 for vdcc1 for certain hawaii configurations.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:42 -05:00
Alex Deucher 489ba72c1e drm/radeon: fix sclk DS enablement
Only enable it for levels 0 and 1.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:41 -05:00
Alex Deucher d3052b8ce8 drm/radeon: fix activity settings for sclk and mclk for CI
Only need to be enabled on the first level.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:40 -05:00
Alex Deucher c0392f8f09 drm/radeon: improve mclk param calcuations for ci dpm
Properly take into account the post divider.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:40 -05:00
Alex Deucher 21b8a36904 drm/radeon: fix dram timing for certain hawaii boards
Certain memory configurations need a fix.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:39 -05:00
Alex Deucher 1c52279f57 drm/radeon: switch force state commands for CI
Use the preferred SMC commands for forcing state on CI.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:38 -05:00
Alex Deucher 9feb3dda5c drm/radeon: fix for memory training on bonaire 0x6649
Workaround for memory link training on certain variants
of 0x6649.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:38 -05:00
Alex Deucher 34fc0b58d9 drm/radeon/ci: handle gpio controlled dpm features properly
Certain feature enablement depends on entries in the atom
gpio pin table.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:37 -05:00
Alex Deucher 727b3d25be drm/radeon: store the gpio shift as well
We need this in the dpm code.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:36 -05:00
Alex Deucher 09e619c0c6 drm/radeon: export radeon_atombios_lookup_gpio
We need it for dpm.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:36 -05:00
Alex Deucher 129acb7c0b drm/radeon: fix typo in CI dpm disable
Need to disable DS, not enable it when disabling dpm.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2014-11-12 11:56:35 -05:00
Alex Deucher 1955f107a7 drm/radeon: rework CI dpm thermal setup
In preparation for fan control.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:35 -05:00
Alex Deucher 2271e2e2a2 drm/radeon: rework SI dpm thermal setup
In preparation for fan control.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:34 -05:00
Alex Deucher 9b92d1ec62 drm/radeon/dpm: grab fan info from vbios
Required for fan control support.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:33 -05:00
Michel Dänzer 507d0ca71b drm/ttm: Use only DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN
DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems
against the idea of TTM_PL_FLAG_TOPDOWN:

* The smallest hole may be in the overall bottom of the area
* If the hole isn't much larger than the BO, it doesn't make much
  difference whether the BO is placed at the bottom or at the top of the
  hole

Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:33 -05:00
Michel Dänzer c165812cbf drm/ttm: Add DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN
If the BO should be placed at the top of the area, we should start looking
for holes from the top.

Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:32 -05:00
Michel Dänzer a8b5ebe6b5 drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs
I wasn't sure if TTM_PL_FLAG_TOPDOWN works correctly with non-0 lpfn, but
AFAICT it does.

Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:56:31 -05:00
Michel Dänzer 2a85aedd11 drm/radeon: Try evicting from CPU accessible to inaccessible VRAM first
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:29:10 -05:00
Michel Dänzer c9da4a4b38 drm/radeon: Try placing NO_CPU_ACCESS BOs outside of CPU accessible VRAM
This avoids them getting in the way of BOs which might be accessed by
the CPU. They can still go to the CPU accessible part of VRAM though if
there's no space outside of it.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2014-11-12 11:29:10 -05:00
Daniel Vetter fcf93f6948 drm: More specific locking for get* ioctls
Motivated by the per-plane locking I've gone through all the get*
ioctls and reduced the locking to the bare minimum required.

v2: Rebase and make it compile ...

v3: Review from Sean:
- Simplify return handling in getplane_res.
- Add a comment to getplane_res that the plane list is invariant and
  can be walked locklessly.

v4: Actually git add.

Cc: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-12 17:56:34 +10:00
Daniel Vetter 4d02e2de0e drm: Per-plane locking
Turned out to be much simpler on top of my latest atomic stuff than
what I've feared. Some details:

- Drop the modeset_lock_all snakeoil in drm_plane_init. Same
  justification as for the equivalent change in drm_crtc_init done in

	commit d0fa1af40e
	Author: Daniel Vetter <daniel.vetter@ffwll.ch>
	Date:   Mon Sep 8 09:02:49 2014 +0200

	    drm: Drop modeset locking from crtc init function

  Without these the drm_modeset_lock_init would fall over the exact
  same way.

- Since the atomic core code wraps the locking switching it to
  per-plane locks was a one-line change.

- For the legacy ioctls add a plane argument to the locking helper so
  that we can grab the right plane lock (cursor or primary). Since the
  universal cursor plane might not be there, or someone really crazy
  might forgoe the primary plane even accept NULL.

- Add some locking WARN_ON to the atomic helpers for good paranoid
  measure and to check that it all works out.

Tested on my exynos atomic hackfest with full lockdep checks and ww
backoff injection.

v2: I've forgotten about the load-detect code in i915.

v3: Thierry reported that in latest 3.18-rc vmwgfx doesn't compile any
more due to

commit 21e88620aa
Author: Rob Clark <robdclark@gmail.com>
Date:   Thu Oct 30 13:39:04 2014 -0400

    drm/vmwgfx: fix lock breakage

Rebased and fix this up.

Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-12 17:56:12 +10:00
Rob Clark 5ee3229c87 drm: export atomic wait_for_vblanks helper (v2)
v1: original
v2: danvet's kerneldoc nitpicks

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-12 17:55:44 +10:00
Dave Airlie 51b44eb17b Linux 3.18-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUX/DqAAoJEHm+PkMAQRiGLtQH/iAt3fRHlYDXjaJian/KG1Cb
 wVP0I+HWZmvVmmd0PzyaxCZLgRNwdmmYHEH4QLy2JwZ3jZfFHlxhy+hDWCgz+67t
 bIzkLs0Pf1T4kJ2+r8qW2kBEz9PWJHGTQw7NTqZ++Ts3rPptBA6Fg4mEJ6fQigXy
 qRIY68DpipUkXV9BWBWijnTmrvP5tt7JtPzBr4DC8frMjvWct8+XwYhc2k2tEv2j
 LwLYb1OW6PUpPv2BQBfWjqqH77vYNQVhJwuwGcDe2YZdI0UFkDheL24+RbbPcZ4f
 OnrLjJSSgzv6lBWkAaXZK7/WJ/JZbXxEqHzWZQ3xXoQov97bm7lEYJqqi5gDasQ=
 =6Qpa
 -----END PGP SIGNATURE-----

Merge tag 'v3.18-rc4' into drm-next

backmerge to get vmwgfx locking changes into next as the
conflict with per-plane locking.
2014-11-12 17:53:30 +10:00
Dave Airlie cc7096fb6d drm/mode: document path property and function to set it. (v1.1)
These two didn't get documented properly, do so.

Pointed out by Daniel.

v1.1: add missing boilerplate (Daniel)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-11-10 10:21:14 +10:00
Dave Airlie 122387a53e Merge tag 'topic/atomic-helpers-2014-11-09' of git://anongit.freedesktop.org/drm-intel into drm-next
So here's my atomic series, finally all debugged&reviewed. Sean Paul has
done a full detailed pass over it all, and a lot of other people have
commented and provided feedback on some parts. Rob Clark also converted
msm over the w/e and seems happy. The only small thing is that Rob wants
to export the wait_for_vblank, which imo makes sense. Since there's other
stuff still to do I think we should apply Rob's patch (once it has grown
appropriate kerneldoc) later on top of this.

This is just the core<->driver interface plus a big pile of helpers. Short
recap of the main ideas:

- There are essentially three helper libraries in this patch set:

  * Transitional helpers to use the new plane callbacks for legacy plane
    updates and in the crtc helper's ->mode_set callback. These helpers are
    only temporarily used to convert drivers to atomic, but they allow a
    nice separation between changing the driver backend and switching to
    the atomic commit logic.

  * Legacy helpers to implement all the legacy driver entry points
    (page_flip, set_config, plane vfuncs) on top of the new atomic driver
    interface. These are completely driver agnostic. The reason for having
    the legacy support as helpers is that drivers can switch step-by-step.
    And they could e.g. even keep the legacy page_flip code around for some
    old platforms where converting to full-blown atomic isn't worth it.

  * Atomic helpers which implement the various new ->atomic_* driver
    interfaces in terms of the revised crtc helper and new plane helper
    hooks.

- The revised crtc helper implemenation essentially implements all the
  lessons learned in the i915 modeset rework (when using the atomic helpers
  only):

  * Enable/disable sequence for a given config are always the same and
    callbacks are always called in the same order. This contrast starkly
    with the crtc helpers, where the sequence of operations is heavily
    dependent on the previous config.

    One corollary of this is that if the configuration of a crtc only
    partially changes (e.g. a connector moves in a cloned config) the
    helper code will still disable/enable the full display pipeline. This
    is the only way to ensure that the enable/disable sequence is always
    the same.

  * It won't call disable or enable hooks more than once any more because
    it lost track of state, thanks to the atomic state tracking. And if
    drivers implement the ->reset hook properly (by either resetting the hw
    or reading out the hw state into the atomic structures) this even
    extends to the hardware state. So no more disable-me-harder kind of
    nonsense.

  * The only thing missing is the hw state readout/cross-check support, but
    if drivers have hw state readout support in their ->reset handlers it's
    simple to extend that to cross-check the hw state.

  * The crtc->mode_set callback is gone and its replacement only sets crtc
    timings and no longer updates the primary plane state. This way we can
    finally implement primary planes properly.

- The new plane helpers should be suitable enough for pretty much
  everything, and a perfect fit for hardware with GO bits. Even if they
  don't fit the atomic helper library is rather flexible and exports all
  the functions for the individual steps to drivers. So drivers can pick
  what matches and implement their own magic for everything else.

- A big difference compared to all previous atomic series is that this one
  doesn't implement async commit in a generic way. Imo driver requirements
  for that are too diverse to create anything reasonable sane which would
  actually work on a reasonable amount of different drivers. Also, we've
  never had a helper library for page_flips even, so it's really hard to
  know what might work and what's stupid without a bit of experience in the form
  of a few driver implementations.

  I think with the current flexibility for drivers to pick individual
  stages and existing helpers like drm_flip_queue it's rather easy though
  to implement proper async commit.

- There's a few other differences of minor importance to earlier atomic
  series:

  * Common/generic properties are parsed in the callers/core and not in
    drivers, and passed to drivers by directly setting the right members in
    atomic state structures. That greatly simplifies all the transitional
    and legacy helpers an removes a lot of boilerplate code.

  * There's no crazy trylock mode used for the async commit since these
    helpers don't do async commit. A simple ordered flip queue of atomic
    state updates should be sufficient for preventing concurrent hw access
    anyway, as long as synchronous updates stall correctly with e.g.
    flush_work_queue or similar function. Abusing locks to enforce ordering
    isn't a good idea imo anyway.

  * These helpers reuse the existing ->mode_fixup hooks in the atomic_check
    callback. Which means that drivers need to adapat and move a lot less code
    into their atomic_check callbacks.

Now this isn't everything needed in the drm core and helpers for full
atomic support. But it's enough to start with converting drivers, and
except for actually testing multiplane and multicrtc updates also enough to
implement full atomic updates. Still missing are:

- Per-plane locking. Since these helpers here encapsulate the locking
  completely this should be fairly easy to implement.

- fbdev support for atomic_check/commit, so that multi-pipe finally works
  sanely in fbcon.

- Adding and decoding shared/core properties. That just needs to be rebased
  from Rob's latest patch series, with minor adjustments so that the
  decoding happens in the core instead of in drivers.

- Actually adding the atomic ioctl. Again just rebasing Rob's latest patch
  should be all that's needed.

- Resolving how to deal with DPMS in atomic. Atomic is a good excuse to fix up
  the crazy semantics dpms currently has. I'm floating an RFC about this topic
  already.

- Finally I couldn't test connector/encoder stealing properly since my test
  vehicle here doesn't allow a connector on different crtcs. So drivers
  which support this might see some surprises in that area. There is no semantic
  change though in how encoder stealing and assignment works (or at least no
  intended one), so I think the risk is minimal.

As just mentioned I've done a fake conversion of an existing driver using
crtc helpers to debug the helper code and validate the smooth transition
approach. And that smooth transition was the really big motivation for
this. It seems to actually work and consists of 3 phases:

Phase 1: Rework driver backend for crtc/plane helpers

The requirement here is that universal plane support is already implement. If
universal plane support isn't implement yet it might be better though to just do
it as part of this phase, directly using the new plane helpers. There are two
big things to do:

- Split up the existing ->update/disable_plane hooks into check/commit
  hooks and extract the crtc-wide prep/flush parts (like setting/clearing
  GO bits).

- The other big change is to split the crtc->mode_set hook into the plane
  update (done using the plane helpers) and the crtc setup in a new
  ->mode_set_nofb hook.

When phase 1 is complete the driver implements all the new callbacks which
push the software state into hardware, but still using all the legacy entry
points and crtc helpers. The transitional helpers serve as impendance
mismatch here.

Phase 2: Rework state handling

This consists of rolling out the state handling helpers for planes, crtcs
and connectors and reviewing all ->mode_fixup and similar hooks to make
sure they don't depend upon implicit global state which might change in the
atomic world. Any such code must be moved into ->atomic_check functions which
just rely on the free-standing atomic state update structures.

This phase also adds a few small pieces of fixup code to make sure the
atomic state doesn't get out of sync in the legacy driver callbacks.

Phase 3: Roll out atomic support

Now it's just about replacing vfuncs with the ones provided by the helper
and filling out the small missing pieces (like atomic_check logic or async
commit support needed for page_flips). Due to the prep work in phase 1 no
changes to the driver backend functions should be required, and because of
the prep work in phase 2 atomic implementations can be rolled out
step-by-step. So if async commit ins't implemented yet page_flip can be
implemented with the legacy functions without wreaking havoc in the other
operations.

* tag 'topic/atomic-helpers-2014-11-09' of git://anongit.freedesktop.org/drm-intel:
  drm/atomic: Refcounting for plane_state->fb
  drm: Docbook integration and over sections for all the new helpers
  drm/atomic-helpers: functions for state duplicate/destroy/reset
  drm/atomic-helper: implement ->page_flip
  drm/atomic-helpers: document how to implement async commit
  drm/atomic: Integrate fence support
  drm/atomic-helper: implementatations for legacy interfaces
  drm: Atomic crtc/connector updates using crtc/plane helper interfaces
  drm/crtc-helper: Transitional functions using atomic plane helpers
  drm/plane-helper: transitional atomic plane helpers
  drm: Add atomic/plane helpers
  drm: Global atomic state handling
  drm: Add atomic driver interface definitions for objects
  drm/modeset_lock: document trylock_only in kerneldoc
  drm: fixup kerneldoc in drm_crtc.h
  drm: Pull drm_crtc.h into the kerneldoc template
  drm: Move drm_crtc_init from drm_crtc.h to drm_plane_helper.h
2014-11-10 09:59:16 +10:00
Linus Torvalds 206c5f60a3 Linux 3.18-rc4 2014-11-09 14:55:29 -08:00
Linus Torvalds ee867cf97a arm64 fixes:
- enable bpf syscall for compat
 - cpu_suspend fix when checking the idle state type
 - defconfig update
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUXzboAAoJEGvWsS0AyF7xSP8P/1IGrDeWgT5+J9edv1B4QYm1
 hw+X+T6Wywl4yPpkh7vljL64ajDa0wPFKu3t83xLpuFE2TEKT+EdABB9TJHAb8RN
 hebRGYQXMCfDowTXS2QGK058rtHFVsdH3j8NWoUNwIBVW6Wq2Jcaw87V3E+K3TLL
 FlYnxKen40N/IUJRe+M5nn7VUo7lXPbIiVGHb3gfPBq4yeLccIsVYWYNIYUEDMTQ
 /BJTmbY9Hte0uZK5+q4cXqmNKiDgODskIFIOQUowJoKPR0cuhBraC61GIMSSbqRA
 sbOs8yjadRcHsyjnFL3dlrhbRKzV9Vy4EDJ7l15m9j67MSs2UHiZ65AKHHyUbS1C
 vIYMNc2gcTQH3kUyoWs+Dc5lrBhQ/T3rBz9bPnRu9SXmr0RjwnFkKcgOsm19B6cv
 hqYuj8Pbjg5+Qoq0gRznQRzM3S3RN2WhzXSHt/gmHIKfcd8mtpHQDNRKayqlIWBS
 paiCKkl7fn8uuZVuuzn4gSK50k3kvpJV6YddiTfLa7bYslIEpQyIAJXHurCY+y96
 tTVIyO2vh1RzJ0b1HT2hHeDSDHmAXuHMQoc3Q4TNkHap+TTcI70+/TTCw23qswV9
 dMuz7iN0/U96O2EiQlmWFpGhC1VZnjbzdIalnnEnfEfMTFA2vJqZr5eyzID4TfR0
 b3PkGNf1/qgPiibwbJ3G
 =CBMO
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 - enable bpf syscall for compat
 - cpu_suspend fix when checking the idle state type
 - defconfig update

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: defconfig: update defconfig for 3.18
  arm64: compat: Enable bpf syscall
  arm64: psci: fix cpu_suspend to check idle state type for index
2014-11-09 14:49:56 -08:00