On architectures for which access to GPU memory is non-coherent,
caches need to be flushed and invalidated explicitly when BO control
changes between CPU and GPU.
This patch adds buffer synchronization functions which invokes the
correct API (PCI or DMA) to ensure synchronization is effective.
Based on the TTM DMA cache helper patches by Lucas Stach.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Specify TTM_PL_FLAG_UNCACHED when allocating GPFIFOs and fences to
allow them to be safely accessed by the kernel without being synced
on non-coherent architectures.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Allow nouveau_bo_new() to recognize the TTM_PL_FLAG_UNCACHED flag, which
means that we want the allocated BO to be perfectly coherent between the
CPU and GPU. This is useful on non-coherent architectures for which we
do not want to manually sync some rarely-accessed buffers: typically,
fences and pushbuffers.
A TTM BO allocated with the TTM_PL_FLAG_UNCACHED on a non-coherent
architecture will be populated using the DMA API, and accesses to it
performed using the coherent mapping performed by dma_alloc_coherent().
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Add a function allowing us to know whether a device is CPU-coherent,
i.e. accesses performed by the CPU on GPU-mapped buffers will
be immediately visible on the GPU side and vice-versa.
For now, a device is considered to be coherent if it uses the PCI bus on
a non-ARM architecture.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pinned BOs are supposed to remain in their current location until
unpinned. Display a warning for the supposedly-erroneous case where we
are trying to move such objects.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Probably missing something here, doesn't make a lot of sense to write
or+link data into a register whose offset is calculated by the same
or+link info..
This is the all I've witnessed the binary driver and vbios doing so
far, so it'll do.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The binary driver has been doing this since GF119, and we've somehow
gotten away with it. But, TMDS that hasn't been initialised already
by the x86 vbios code is distorted without it on GM204.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Logging at trace level, rather than as en error, as it seems conceivable
that failure could be normal under certain circumstances (new bios,
older sink that doesn't support a particular DPCD address)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Starting from GM204, certain registers are no longer accessible by the host
(or unsigned PMU firmware).
This commit implements devinit on PMU, using a signed microcode image, and
devinit data, from the VBIOS.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The field at +0x2 is technically processor specific, though I don't know
that it's ever mattered in practice (yet).
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We're about to need to be able to fetch additional chunks of data beyond
the primary bios image, which makes fetching a lot more complicated.
This splits out the verious shadowing routines to be nothing more than
very dumb "fetch this much data from this offset" routines, and leaves
the logic of what and how much to fetch in common code.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>