2018-05-08 18:39:47 +08:00
|
|
|
/*
|
|
|
|
* Copyright 2018 Red Hat Inc.
|
|
|
|
*
|
|
|
|
* Permission is hereby granted, free of charge, to any person obtaining a
|
|
|
|
* copy of this software and associated documentation files (the "Software"),
|
|
|
|
* to deal in the Software without restriction, including without limitation
|
|
|
|
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
|
|
|
|
* and/or sell copies of the Software, and to permit persons to whom the
|
|
|
|
* Software is furnished to do so, subject to the following conditions:
|
|
|
|
*
|
|
|
|
* The above copyright notice and this permission notice shall be included in
|
|
|
|
* all copies or substantial portions of the Software.
|
|
|
|
*
|
|
|
|
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
|
|
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
|
|
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
|
|
|
* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
|
|
|
|
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
|
|
|
|
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
|
|
|
|
* OTHER DEALINGS IN THE SOFTWARE.
|
|
|
|
*/
|
|
|
|
#include "head.h"
|
|
|
|
#include "base.h"
|
|
|
|
#include "core.h"
|
|
|
|
#include "curs.h"
|
|
|
|
#include "ovly.h"
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
#include "crc.h"
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
#include <nvif/class.h>
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
#include <nvif/event.h>
|
|
|
|
#include <nvif/cl0046.h>
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
#include <drm/drm_atomic_helper.h>
|
|
|
|
#include <drm/drm_crtc_helper.h>
|
2020-01-23 21:59:30 +08:00
|
|
|
#include <drm/drm_vblank.h>
|
2018-05-08 18:39:47 +08:00
|
|
|
#include "nouveau_connector.h"
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
void
|
2018-05-08 18:39:47 +08:00
|
|
|
nv50_head_flush_clr(struct nv50_head *head,
|
|
|
|
struct nv50_head_atom *asyh, bool flush)
|
2018-05-08 18:39:47 +08:00
|
|
|
{
|
2018-05-08 18:39:47 +08:00
|
|
|
union nv50_head_atom_mask clr = {
|
|
|
|
.mask = asyh->clr.mask & ~(flush ? 0 : asyh->set.mask),
|
|
|
|
};
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
if (clr.crc) nv50_crc_atomic_clr(head);
|
2018-05-08 18:39:47 +08:00
|
|
|
if (clr.olut) head->func->olut_clr(head);
|
2018-05-08 18:39:47 +08:00
|
|
|
if (clr.core) head->func->core_clr(head);
|
|
|
|
if (clr.curs) head->func->curs_clr(head);
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
void
|
|
|
|
nv50_head_flush_set(struct nv50_head *head, struct nv50_head_atom *asyh)
|
|
|
|
{
|
|
|
|
if (asyh->set.view ) head->func->view (head, asyh);
|
|
|
|
if (asyh->set.mode ) head->func->mode (head, asyh);
|
|
|
|
if (asyh->set.core ) head->func->core_set(head, asyh);
|
2018-05-08 18:39:47 +08:00
|
|
|
if (asyh->set.olut ) {
|
|
|
|
asyh->olut.offset = nv50_lut_load(&head->olut,
|
|
|
|
asyh->olut.buffer,
|
2018-12-11 12:50:02 +08:00
|
|
|
asyh->state.gamma_lut,
|
|
|
|
asyh->olut.load);
|
2018-05-08 18:39:47 +08:00
|
|
|
head->func->olut_set(head, asyh);
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
if (asyh->set.curs ) head->func->curs_set(head, asyh);
|
|
|
|
if (asyh->set.base ) head->func->base (head, asyh);
|
|
|
|
if (asyh->set.ovly ) head->func->ovly (head, asyh);
|
|
|
|
if (asyh->set.dither ) head->func->dither (head, asyh);
|
|
|
|
if (asyh->set.procamp) head->func->procamp (head, asyh);
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
if (asyh->set.crc ) nv50_crc_atomic_set (head, asyh);
|
2018-05-08 18:39:47 +08:00
|
|
|
if (asyh->set.or ) head->func->or (head, asyh);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
nv50_head_atomic_check_procamp(struct nv50_head_atom *armh,
|
|
|
|
struct nv50_head_atom *asyh,
|
|
|
|
struct nouveau_conn_atom *asyc)
|
|
|
|
{
|
|
|
|
const int vib = asyc->procamp.color_vibrance - 100;
|
|
|
|
const int hue = asyc->procamp.vibrant_hue - 90;
|
|
|
|
const int adj = (vib > 0) ? 50 : 0;
|
|
|
|
asyh->procamp.sat.cos = ((vib * 2047 + adj) / 100) & 0xfff;
|
|
|
|
asyh->procamp.sat.sin = ((hue * 2047) / 100) & 0xfff;
|
|
|
|
asyh->set.procamp = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
nv50_head_atomic_check_dither(struct nv50_head_atom *armh,
|
|
|
|
struct nv50_head_atom *asyh,
|
|
|
|
struct nouveau_conn_atom *asyc)
|
|
|
|
{
|
|
|
|
u32 mode = 0x00;
|
|
|
|
|
2020-03-18 02:54:06 +08:00
|
|
|
if (asyc->dither.mode) {
|
|
|
|
if (asyc->dither.mode == DITHERING_MODE_AUTO) {
|
|
|
|
if (asyh->base.depth > asyh->or.bpc * 3)
|
|
|
|
mode = DITHERING_MODE_DYNAMIC2X2;
|
|
|
|
} else {
|
|
|
|
mode = asyc->dither.mode;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
|
2020-03-18 02:54:06 +08:00
|
|
|
if (asyc->dither.depth == DITHERING_DEPTH_AUTO) {
|
|
|
|
if (asyh->or.bpc >= 8)
|
|
|
|
mode |= DITHERING_DEPTH_8BPC;
|
|
|
|
} else {
|
|
|
|
mode |= asyc->dither.depth;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
asyh->dither.enable = mode;
|
|
|
|
asyh->dither.bits = mode >> 1;
|
|
|
|
asyh->dither.mode = mode >> 3;
|
|
|
|
asyh->set.dither = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
nv50_head_atomic_check_view(struct nv50_head_atom *armh,
|
|
|
|
struct nv50_head_atom *asyh,
|
|
|
|
struct nouveau_conn_atom *asyc)
|
|
|
|
{
|
|
|
|
struct drm_connector *connector = asyc->state.connector;
|
|
|
|
struct drm_display_mode *omode = &asyh->state.adjusted_mode;
|
|
|
|
struct drm_display_mode *umode = &asyh->state.mode;
|
|
|
|
int mode = asyc->scaler.mode;
|
|
|
|
struct edid *edid;
|
|
|
|
int umode_vdisplay, omode_hdisplay, omode_vdisplay;
|
|
|
|
|
|
|
|
if (connector->edid_blob_ptr)
|
|
|
|
edid = (struct edid *)connector->edid_blob_ptr->data;
|
|
|
|
else
|
|
|
|
edid = NULL;
|
|
|
|
|
|
|
|
if (!asyc->scaler.full) {
|
|
|
|
if (mode == DRM_MODE_SCALE_NONE)
|
|
|
|
omode = umode;
|
|
|
|
} else {
|
|
|
|
/* Non-EDID LVDS/eDP mode. */
|
|
|
|
mode = DRM_MODE_SCALE_FULLSCREEN;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* For the user-specified mode, we must ignore doublescan and
|
|
|
|
* the like, but honor frame packing.
|
|
|
|
*/
|
|
|
|
umode_vdisplay = umode->vdisplay;
|
|
|
|
if ((umode->flags & DRM_MODE_FLAG_3D_MASK) == DRM_MODE_FLAG_3D_FRAME_PACKING)
|
|
|
|
umode_vdisplay += umode->vtotal;
|
|
|
|
asyh->view.iW = umode->hdisplay;
|
|
|
|
asyh->view.iH = umode_vdisplay;
|
|
|
|
/* For the output mode, we can just use the stock helper. */
|
|
|
|
drm_mode_get_hv_timing(omode, &omode_hdisplay, &omode_vdisplay);
|
|
|
|
asyh->view.oW = omode_hdisplay;
|
|
|
|
asyh->view.oH = omode_vdisplay;
|
|
|
|
|
|
|
|
/* Add overscan compensation if necessary, will keep the aspect
|
|
|
|
* ratio the same as the backend mode unless overridden by the
|
|
|
|
* user setting both hborder and vborder properties.
|
|
|
|
*/
|
|
|
|
if ((asyc->scaler.underscan.mode == UNDERSCAN_ON ||
|
|
|
|
(asyc->scaler.underscan.mode == UNDERSCAN_AUTO &&
|
|
|
|
drm_detect_hdmi_monitor(edid)))) {
|
|
|
|
u32 bX = asyc->scaler.underscan.hborder;
|
|
|
|
u32 bY = asyc->scaler.underscan.vborder;
|
|
|
|
u32 r = (asyh->view.oH << 19) / asyh->view.oW;
|
|
|
|
|
|
|
|
if (bX) {
|
|
|
|
asyh->view.oW -= (bX * 2);
|
|
|
|
if (bY) asyh->view.oH -= (bY * 2);
|
|
|
|
else asyh->view.oH = ((asyh->view.oW * r) + (r / 2)) >> 19;
|
|
|
|
} else {
|
|
|
|
asyh->view.oW -= (asyh->view.oW >> 4) + 32;
|
|
|
|
if (bY) asyh->view.oH -= (bY * 2);
|
|
|
|
else asyh->view.oH = ((asyh->view.oW * r) + (r / 2)) >> 19;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Handle CENTER/ASPECT scaling, taking into account the areas
|
|
|
|
* removed already for overscan compensation.
|
|
|
|
*/
|
|
|
|
switch (mode) {
|
|
|
|
case DRM_MODE_SCALE_CENTER:
|
2019-05-26 06:41:49 +08:00
|
|
|
/* NOTE: This will cause scaling when the input is
|
|
|
|
* larger than the output.
|
|
|
|
*/
|
|
|
|
asyh->view.oW = min(asyh->view.iW, asyh->view.oW);
|
|
|
|
asyh->view.oH = min(asyh->view.iH, asyh->view.oH);
|
|
|
|
break;
|
2018-05-08 18:39:47 +08:00
|
|
|
case DRM_MODE_SCALE_ASPECT:
|
2019-05-26 06:41:49 +08:00
|
|
|
/* Determine whether the scaling should be on width or on
|
|
|
|
* height. This is done by comparing the aspect ratios of the
|
|
|
|
* sizes. If the output AR is larger than input AR, that means
|
|
|
|
* we want to change the width (letterboxed on the
|
|
|
|
* left/right), otherwise on the height (letterboxed on the
|
|
|
|
* top/bottom).
|
|
|
|
*
|
|
|
|
* E.g. 4:3 (1.333) AR image displayed on a 16:10 (1.6) AR
|
|
|
|
* screen will have letterboxes on the left/right. However a
|
|
|
|
* 16:9 (1.777) AR image on that same screen will have
|
|
|
|
* letterboxes on the top/bottom.
|
|
|
|
*
|
|
|
|
* inputAR = iW / iH; outputAR = oW / oH
|
|
|
|
* outputAR > inputAR is equivalent to oW * iH > iW * oH
|
|
|
|
*/
|
|
|
|
if (asyh->view.oW * asyh->view.iH > asyh->view.iW * asyh->view.oH) {
|
|
|
|
/* Recompute output width, i.e. left/right letterbox */
|
2018-05-08 18:39:47 +08:00
|
|
|
u32 r = (asyh->view.iW << 19) / asyh->view.iH;
|
|
|
|
asyh->view.oW = ((asyh->view.oH * r) + (r / 2)) >> 19;
|
|
|
|
} else {
|
2019-05-26 06:41:49 +08:00
|
|
|
/* Recompute output height, i.e. top/bottom letterbox */
|
2018-05-08 18:39:47 +08:00
|
|
|
u32 r = (asyh->view.iH << 19) / asyh->view.iW;
|
|
|
|
asyh->view.oH = ((asyh->view.oW * r) + (r / 2)) >> 19;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
asyh->set.view = true;
|
|
|
|
}
|
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
static int
|
2018-05-08 18:39:47 +08:00
|
|
|
nv50_head_atomic_check_lut(struct nv50_head *head,
|
|
|
|
struct nv50_head_atom *asyh)
|
|
|
|
{
|
|
|
|
struct nv50_disp *disp = nv50_disp(head->base.base.dev);
|
2018-05-08 18:39:47 +08:00
|
|
|
struct drm_property_blob *olut = asyh->state.gamma_lut;
|
2019-09-06 12:13:59 +08:00
|
|
|
int size;
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
/* Determine whether core output LUT should be enabled. */
|
|
|
|
if (olut) {
|
|
|
|
/* Check if any window(s) have stolen the core output LUT
|
|
|
|
* to as an input LUT for legacy gamma + I8 colour format.
|
|
|
|
*/
|
|
|
|
if (asyh->wndw.olut) {
|
|
|
|
/* If any window has stolen the core output LUT,
|
|
|
|
* all of them must.
|
|
|
|
*/
|
|
|
|
if (asyh->wndw.olut != asyh->wndw.mask)
|
|
|
|
return -EINVAL;
|
|
|
|
olut = NULL;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|
|
|
|
|
2019-09-06 12:13:59 +08:00
|
|
|
if (!olut) {
|
|
|
|
if (!head->func->olut_identity) {
|
|
|
|
asyh->olut.handle = 0;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
size = 0;
|
|
|
|
} else {
|
|
|
|
size = drm_color_lut_size(olut);
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
|
2019-09-06 12:13:59 +08:00
|
|
|
if (!head->func->olut(head, asyh, size)) {
|
|
|
|
DRM_DEBUG_KMS("Invalid olut\n");
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->olut.handle = disp->core->chan.vram.handle;
|
|
|
|
asyh->olut.buffer = !asyh->olut.buffer;
|
2019-09-06 12:13:59 +08:00
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
return 0;
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
nv50_head_atomic_check_mode(struct nv50_head *head, struct nv50_head_atom *asyh)
|
|
|
|
{
|
|
|
|
struct drm_display_mode *mode = &asyh->state.adjusted_mode;
|
|
|
|
struct nv50_head_mode *m = &asyh->mode;
|
|
|
|
u32 blankus;
|
|
|
|
|
|
|
|
drm_mode_set_crtcinfo(mode, CRTC_INTERLACE_HALVE_V | CRTC_STEREO_DOUBLE);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* DRM modes are defined in terms of a repeating interval
|
|
|
|
* starting with the active display area. The hardware modes
|
|
|
|
* are defined in terms of a repeating interval starting one
|
|
|
|
* unit (pixel or line) into the sync pulse. So, add bias.
|
|
|
|
*/
|
|
|
|
|
|
|
|
m->h.active = mode->crtc_htotal;
|
|
|
|
m->h.synce = mode->crtc_hsync_end - mode->crtc_hsync_start - 1;
|
|
|
|
m->h.blanke = mode->crtc_hblank_end - mode->crtc_hsync_start - 1;
|
|
|
|
m->h.blanks = m->h.blanke + mode->crtc_hdisplay;
|
|
|
|
|
|
|
|
m->v.active = mode->crtc_vtotal;
|
|
|
|
m->v.synce = mode->crtc_vsync_end - mode->crtc_vsync_start - 1;
|
|
|
|
m->v.blanke = mode->crtc_vblank_end - mode->crtc_vsync_start - 1;
|
|
|
|
m->v.blanks = m->v.blanke + mode->crtc_vdisplay;
|
|
|
|
|
|
|
|
/*XXX: Safe underestimate, even "0" works */
|
|
|
|
blankus = (m->v.active - mode->crtc_vdisplay - 2) * m->h.active;
|
|
|
|
blankus *= 1000;
|
|
|
|
blankus /= mode->crtc_clock;
|
|
|
|
m->v.blankus = blankus;
|
|
|
|
|
|
|
|
if (mode->flags & DRM_MODE_FLAG_INTERLACE) {
|
|
|
|
m->v.blank2e = m->v.active + m->v.blanke;
|
|
|
|
m->v.blank2s = m->v.blank2e + mode->crtc_vdisplay;
|
|
|
|
m->v.active = (m->v.active * 2) + 1;
|
|
|
|
m->interlace = true;
|
|
|
|
} else {
|
|
|
|
m->v.blank2e = 0;
|
|
|
|
m->v.blank2s = 1;
|
|
|
|
m->interlace = false;
|
|
|
|
}
|
|
|
|
m->clock = mode->crtc_clock;
|
|
|
|
|
|
|
|
asyh->or.nhsync = !!(mode->flags & DRM_MODE_FLAG_NHSYNC);
|
|
|
|
asyh->or.nvsync = !!(mode->flags & DRM_MODE_FLAG_NVSYNC);
|
|
|
|
asyh->set.or = head->func->or != NULL;
|
|
|
|
asyh->set.mode = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int
|
|
|
|
nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state)
|
|
|
|
{
|
|
|
|
struct nouveau_drm *drm = nouveau_drm(crtc->dev);
|
|
|
|
struct nv50_head *head = nv50_head(crtc);
|
|
|
|
struct nv50_head_atom *armh = nv50_head_atom(crtc->state);
|
|
|
|
struct nv50_head_atom *asyh = nv50_head_atom(state);
|
|
|
|
struct nouveau_conn_atom *asyc = NULL;
|
|
|
|
struct drm_connector_state *conns;
|
|
|
|
struct drm_connector *conn;
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
int i, ret;
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
NV_ATOMIC(drm, "%s atomic_check %d\n", crtc->name, asyh->state.active);
|
|
|
|
if (asyh->state.active) {
|
|
|
|
for_each_new_connector_in_state(asyh->state.state, conn, conns, i) {
|
|
|
|
if (conns->crtc == crtc) {
|
|
|
|
asyc = nouveau_conn_atom(conns);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (armh->state.active) {
|
|
|
|
if (asyc) {
|
|
|
|
if (asyh->state.mode_changed)
|
|
|
|
asyc->set.scaler = true;
|
|
|
|
if (armh->base.depth != asyh->base.depth)
|
|
|
|
asyc->set.dither = true;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
if (asyc)
|
|
|
|
asyc->set.mask = ~0;
|
|
|
|
asyh->set.mask = ~0;
|
|
|
|
asyh->set.or = head->func->or != NULL;
|
|
|
|
}
|
|
|
|
|
2019-05-08 12:54:34 +08:00
|
|
|
if (asyh->state.mode_changed || asyh->state.connectors_changed)
|
2018-05-08 18:39:47 +08:00
|
|
|
nv50_head_atomic_check_mode(head, asyh);
|
|
|
|
|
|
|
|
if (asyh->state.color_mgmt_changed ||
|
2018-05-08 18:39:47 +08:00
|
|
|
memcmp(&armh->wndw, &asyh->wndw, sizeof(asyh->wndw))) {
|
|
|
|
int ret = nv50_head_atomic_check_lut(head, asyh);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
asyh->olut.visible = asyh->olut.handle != 0;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
if (asyc) {
|
|
|
|
if (asyc->set.scaler)
|
|
|
|
nv50_head_atomic_check_view(armh, asyh, asyc);
|
|
|
|
if (asyc->set.dither)
|
|
|
|
nv50_head_atomic_check_dither(armh, asyh, asyc);
|
|
|
|
if (asyc->set.procamp)
|
|
|
|
nv50_head_atomic_check_procamp(armh, asyh, asyc);
|
|
|
|
}
|
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
if (head->func->core_calc) {
|
2018-05-08 18:39:47 +08:00
|
|
|
head->func->core_calc(head, asyh);
|
2018-05-08 18:39:47 +08:00
|
|
|
if (!asyh->core.visible)
|
|
|
|
asyh->olut.visible = false;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->set.base = armh->base.cpp != asyh->base.cpp;
|
|
|
|
asyh->set.ovly = armh->ovly.cpp != asyh->ovly.cpp;
|
|
|
|
} else {
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->olut.visible = false;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->core.visible = false;
|
|
|
|
asyh->curs.visible = false;
|
|
|
|
asyh->base.cpp = 0;
|
|
|
|
asyh->ovly.cpp = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!drm_atomic_crtc_needs_modeset(&asyh->state)) {
|
|
|
|
if (asyh->core.visible) {
|
|
|
|
if (memcmp(&armh->core, &asyh->core, sizeof(asyh->core)))
|
|
|
|
asyh->set.core = true;
|
|
|
|
} else
|
|
|
|
if (armh->core.visible) {
|
|
|
|
asyh->clr.core = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (asyh->curs.visible) {
|
|
|
|
if (memcmp(&armh->curs, &asyh->curs, sizeof(asyh->curs)))
|
|
|
|
asyh->set.curs = true;
|
|
|
|
} else
|
|
|
|
if (armh->curs.visible) {
|
|
|
|
asyh->clr.curs = true;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
if (asyh->olut.visible) {
|
|
|
|
if (memcmp(&armh->olut, &asyh->olut, sizeof(asyh->olut)))
|
|
|
|
asyh->set.olut = true;
|
|
|
|
} else
|
|
|
|
if (armh->olut.visible) {
|
|
|
|
asyh->clr.olut = true;
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
} else {
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->clr.olut = armh->olut.visible;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->clr.core = armh->core.visible;
|
|
|
|
asyh->clr.curs = armh->curs.visible;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->set.olut = asyh->olut.visible;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->set.core = asyh->core.visible;
|
|
|
|
asyh->set.curs = asyh->curs.visible;
|
|
|
|
}
|
|
|
|
|
2020-06-30 06:36:25 +08:00
|
|
|
ret = nv50_crc_atomic_check_head(head, asyh, armh);
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
if (asyh->clr.mask || asyh->set.mask)
|
|
|
|
nv50_atom(asyh->state.state)->lock_core = true;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static const struct drm_crtc_helper_funcs
|
|
|
|
nv50_head_help = {
|
|
|
|
.atomic_check = nv50_head_atomic_check,
|
2020-01-23 21:59:29 +08:00
|
|
|
.get_scanout_position = nouveau_display_scanoutpos,
|
2018-05-08 18:39:47 +08:00
|
|
|
};
|
|
|
|
|
|
|
|
static void
|
|
|
|
nv50_head_atomic_destroy_state(struct drm_crtc *crtc,
|
|
|
|
struct drm_crtc_state *state)
|
|
|
|
{
|
|
|
|
struct nv50_head_atom *asyh = nv50_head_atom(state);
|
|
|
|
__drm_atomic_helper_crtc_destroy_state(&asyh->state);
|
|
|
|
kfree(asyh);
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drm_crtc_state *
|
|
|
|
nv50_head_atomic_duplicate_state(struct drm_crtc *crtc)
|
|
|
|
{
|
|
|
|
struct nv50_head_atom *armh = nv50_head_atom(crtc->state);
|
|
|
|
struct nv50_head_atom *asyh;
|
|
|
|
if (!(asyh = kmalloc(sizeof(*asyh), GFP_KERNEL)))
|
|
|
|
return NULL;
|
|
|
|
__drm_atomic_helper_crtc_duplicate_state(crtc, &asyh->state);
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->wndw = armh->wndw;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->view = armh->view;
|
|
|
|
asyh->mode = armh->mode;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->olut = armh->olut;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->core = armh->core;
|
|
|
|
asyh->curs = armh->curs;
|
|
|
|
asyh->base = armh->base;
|
|
|
|
asyh->ovly = armh->ovly;
|
|
|
|
asyh->dither = armh->dither;
|
|
|
|
asyh->procamp = armh->procamp;
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
asyh->crc = armh->crc;
|
2019-05-12 01:08:31 +08:00
|
|
|
asyh->or = armh->or;
|
2019-02-02 08:20:04 +08:00
|
|
|
asyh->dp = armh->dp;
|
2018-05-08 18:39:47 +08:00
|
|
|
asyh->clr.mask = 0;
|
|
|
|
asyh->set.mask = 0;
|
|
|
|
return &asyh->state;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
nv50_head_reset(struct drm_crtc *crtc)
|
|
|
|
{
|
|
|
|
struct nv50_head_atom *asyh;
|
|
|
|
|
|
|
|
if (WARN_ON(!(asyh = kzalloc(sizeof(*asyh), GFP_KERNEL))))
|
|
|
|
return;
|
|
|
|
|
2019-03-01 20:56:12 +08:00
|
|
|
if (crtc->state)
|
|
|
|
nv50_head_atomic_destroy_state(crtc, crtc->state);
|
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
__drm_atomic_helper_crtc_reset(crtc, &asyh->state);
|
|
|
|
}
|
|
|
|
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
static int
|
|
|
|
nv50_head_late_register(struct drm_crtc *crtc)
|
|
|
|
{
|
|
|
|
return nv50_head_crc_late_register(nv50_head(crtc));
|
|
|
|
}
|
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
static void
|
|
|
|
nv50_head_destroy(struct drm_crtc *crtc)
|
|
|
|
{
|
|
|
|
struct nv50_head *head = nv50_head(crtc);
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
|
2020-06-08 12:47:37 +08:00
|
|
|
nvif_notify_dtor(&head->base.vblank);
|
2018-05-08 18:39:47 +08:00
|
|
|
nv50_lut_fini(&head->olut);
|
2018-05-08 18:39:47 +08:00
|
|
|
drm_crtc_cleanup(crtc);
|
|
|
|
kfree(head);
|
|
|
|
}
|
|
|
|
|
|
|
|
static const struct drm_crtc_funcs
|
|
|
|
nv50_head_func = {
|
|
|
|
.reset = nv50_head_reset,
|
|
|
|
.gamma_set = drm_atomic_helper_legacy_gamma_set,
|
|
|
|
.destroy = nv50_head_destroy,
|
|
|
|
.set_config = drm_atomic_helper_set_config,
|
|
|
|
.page_flip = drm_atomic_helper_page_flip,
|
|
|
|
.atomic_duplicate_state = nv50_head_atomic_duplicate_state,
|
|
|
|
.atomic_destroy_state = nv50_head_atomic_destroy_state,
|
2020-01-23 21:59:30 +08:00
|
|
|
.enable_vblank = nouveau_display_vblank_enable,
|
|
|
|
.disable_vblank = nouveau_display_vblank_disable,
|
|
|
|
.get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp,
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
.late_register = nv50_head_late_register,
|
2018-05-08 18:39:47 +08:00
|
|
|
};
|
|
|
|
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
static const struct drm_crtc_funcs
|
|
|
|
nvd9_head_func = {
|
|
|
|
.reset = nv50_head_reset,
|
|
|
|
.gamma_set = drm_atomic_helper_legacy_gamma_set,
|
|
|
|
.destroy = nv50_head_destroy,
|
|
|
|
.set_config = drm_atomic_helper_set_config,
|
|
|
|
.page_flip = drm_atomic_helper_page_flip,
|
|
|
|
.atomic_duplicate_state = nv50_head_atomic_duplicate_state,
|
|
|
|
.atomic_destroy_state = nv50_head_atomic_destroy_state,
|
|
|
|
.enable_vblank = nouveau_display_vblank_enable,
|
|
|
|
.disable_vblank = nouveau_display_vblank_disable,
|
|
|
|
.get_vblank_timestamp = drm_crtc_vblank_helper_get_vblank_timestamp,
|
|
|
|
.verify_crc_source = nv50_crc_verify_source,
|
|
|
|
.get_crc_sources = nv50_crc_get_sources,
|
|
|
|
.set_crc_source = nv50_crc_set_source,
|
|
|
|
.late_register = nv50_head_late_register,
|
|
|
|
};
|
|
|
|
|
|
|
|
static int nv50_head_vblank_handler(struct nvif_notify *notify)
|
|
|
|
{
|
|
|
|
struct nouveau_crtc *nv_crtc =
|
|
|
|
container_of(notify, struct nouveau_crtc, vblank);
|
|
|
|
|
|
|
|
if (drm_crtc_handle_vblank(&nv_crtc->base))
|
|
|
|
nv50_crc_handle_vblank(nv50_head(&nv_crtc->base));
|
|
|
|
|
|
|
|
return NVIF_NOTIFY_KEEP;
|
|
|
|
}
|
|
|
|
|
drm/nouveau/kms/nv50-: Use less encoders by making mstos per-head
Currently, for every single MST capable DRM connector we create a set of
fake encoders, one for each possible head. Unfortunately this ends up
being a huge waste of encoders. While this currently isn't causing us
any problems, it's extremely close to doing so.
The ThinkPad P71 is a good example of this. Originally when trying to
figure out why nouveau was failing to load on this laptop, I discovered
it was because nouveau was creating too many encoders. This ended up
being because we were mistakenly creating MST encoders for the eDP port,
however we are still extremely close to hitting the encoder limit on
this machine as it exposes 1 eDP port and 5 DP ports, resulting in 31
encoders.
So while this fix didn't end up being necessary to fix the P71, we still
need to implement this so that we avoid hitting the encoder limit for
valid display configurations in the event that some machine with more
connectors then this becomes available. Plus, we don't want to let good
code go to waste :)
So, use less encoders by only creating one MSTO per head. Then, attach
each new MSTC to each MSTO which corresponds to a head that it's parent
DP port is capable of using. This brings the number of encoders we
register on the ThinkPad P71 from 31, down to just 15. Yay!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-09-14 06:03:52 +08:00
|
|
|
struct nv50_head *
|
2018-05-08 18:39:47 +08:00
|
|
|
nv50_head_create(struct drm_device *dev, int index)
|
|
|
|
{
|
|
|
|
struct nouveau_drm *drm = nouveau_drm(dev);
|
|
|
|
struct nv50_disp *disp = nv50_disp(dev);
|
|
|
|
struct nv50_head *head;
|
2019-06-11 15:03:21 +08:00
|
|
|
struct nv50_wndw *base, *ovly, *curs;
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
struct nouveau_crtc *nv_crtc;
|
2018-05-08 18:39:47 +08:00
|
|
|
struct drm_crtc *crtc;
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
const struct drm_crtc_funcs *funcs;
|
2018-05-08 18:39:47 +08:00
|
|
|
int ret;
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
head = kzalloc(sizeof(*head), GFP_KERNEL);
|
|
|
|
if (!head)
|
drm/nouveau/kms/nv50-: Use less encoders by making mstos per-head
Currently, for every single MST capable DRM connector we create a set of
fake encoders, one for each possible head. Unfortunately this ends up
being a huge waste of encoders. While this currently isn't causing us
any problems, it's extremely close to doing so.
The ThinkPad P71 is a good example of this. Originally when trying to
figure out why nouveau was failing to load on this laptop, I discovered
it was because nouveau was creating too many encoders. This ended up
being because we were mistakenly creating MST encoders for the eDP port,
however we are still extremely close to hitting the encoder limit on
this machine as it exposes 1 eDP port and 5 DP ports, resulting in 31
encoders.
So while this fix didn't end up being necessary to fix the P71, we still
need to implement this so that we avoid hitting the encoder limit for
valid display configurations in the event that some machine with more
connectors then this becomes available. Plus, we don't want to let good
code go to waste :)
So, use less encoders by only creating one MSTO per head. Then, attach
each new MSTC to each MSTO which corresponds to a head that it's parent
DP port is capable of using. This brings the number of encoders we
register on the ThinkPad P71 from 31, down to just 15. Yay!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-09-14 06:03:52 +08:00
|
|
|
return ERR_PTR(-ENOMEM);
|
2018-05-08 18:39:47 +08:00
|
|
|
|
|
|
|
head->func = disp->core->func->head;
|
|
|
|
head->base.index = index;
|
2018-05-08 18:39:48 +08:00
|
|
|
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
if (disp->disp->object.oclass < GF110_DISP)
|
|
|
|
funcs = &nv50_head_func;
|
|
|
|
else
|
|
|
|
funcs = &nvd9_head_func;
|
|
|
|
|
2018-05-08 18:39:48 +08:00
|
|
|
if (disp->disp->object.oclass < GV100_DISP) {
|
2019-06-11 15:03:21 +08:00
|
|
|
ret = nv50_base_new(drm, head->base.index, &base);
|
|
|
|
ret = nv50_ovly_new(drm, head->base.index, &ovly);
|
2018-05-08 18:39:48 +08:00
|
|
|
} else {
|
|
|
|
ret = nv50_wndw_new(drm, DRM_PLANE_TYPE_PRIMARY,
|
2019-06-11 15:03:21 +08:00
|
|
|
head->base.index * 2 + 0, &base);
|
|
|
|
ret = nv50_wndw_new(drm, DRM_PLANE_TYPE_OVERLAY,
|
|
|
|
head->base.index * 2 + 1, &ovly);
|
2018-05-08 18:39:48 +08:00
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
if (ret == 0)
|
|
|
|
ret = nv50_curs_new(drm, head->base.index, &curs);
|
|
|
|
if (ret) {
|
|
|
|
kfree(head);
|
drm/nouveau/kms/nv50-: Use less encoders by making mstos per-head
Currently, for every single MST capable DRM connector we create a set of
fake encoders, one for each possible head. Unfortunately this ends up
being a huge waste of encoders. While this currently isn't causing us
any problems, it's extremely close to doing so.
The ThinkPad P71 is a good example of this. Originally when trying to
figure out why nouveau was failing to load on this laptop, I discovered
it was because nouveau was creating too many encoders. This ended up
being because we were mistakenly creating MST encoders for the eDP port,
however we are still extremely close to hitting the encoder limit on
this machine as it exposes 1 eDP port and 5 DP ports, resulting in 31
encoders.
So while this fix didn't end up being necessary to fix the P71, we still
need to implement this so that we avoid hitting the encoder limit for
valid display configurations in the event that some machine with more
connectors then this becomes available. Plus, we don't want to let good
code go to waste :)
So, use less encoders by only creating one MSTO per head. Then, attach
each new MSTC to each MSTO which corresponds to a head that it's parent
DP port is capable of using. This brings the number of encoders we
register on the ThinkPad P71 from 31, down to just 15. Yay!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-09-14 06:03:52 +08:00
|
|
|
return ERR_PTR(ret);
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|
|
|
|
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
nv_crtc = &head->base;
|
|
|
|
crtc = &nv_crtc->base;
|
2019-06-11 15:03:21 +08:00
|
|
|
drm_crtc_init_with_planes(dev, crtc, &base->plane, &curs->plane,
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
funcs, "head-%d", head->base.index);
|
2018-05-08 18:39:47 +08:00
|
|
|
drm_crtc_helper_add(crtc, &nv50_head_help);
|
2019-09-06 12:13:59 +08:00
|
|
|
/* Keep the legacy gamma size at 256 to avoid compatibility issues */
|
2018-05-08 18:39:47 +08:00
|
|
|
drm_mode_crtc_set_gamma_size(crtc, 256);
|
2019-09-06 12:13:59 +08:00
|
|
|
drm_crtc_enable_color_mgmt(crtc, base->func->ilut_size,
|
|
|
|
disp->disp->object.oclass >= GF110_DISP,
|
|
|
|
head->func->olut_size);
|
2018-05-08 18:39:47 +08:00
|
|
|
|
2018-05-08 18:39:47 +08:00
|
|
|
if (head->func->olut_set) {
|
|
|
|
ret = nv50_lut_init(disp, &drm->client.mmu, &head->olut);
|
drm/nouveau/kms/nv50-: Use less encoders by making mstos per-head
Currently, for every single MST capable DRM connector we create a set of
fake encoders, one for each possible head. Unfortunately this ends up
being a huge waste of encoders. While this currently isn't causing us
any problems, it's extremely close to doing so.
The ThinkPad P71 is a good example of this. Originally when trying to
figure out why nouveau was failing to load on this laptop, I discovered
it was because nouveau was creating too many encoders. This ended up
being because we were mistakenly creating MST encoders for the eDP port,
however we are still extremely close to hitting the encoder limit on
this machine as it exposes 1 eDP port and 5 DP ports, resulting in 31
encoders.
So while this fix didn't end up being necessary to fix the P71, we still
need to implement this so that we avoid hitting the encoder limit for
valid display configurations in the event that some machine with more
connectors then this becomes available. Plus, we don't want to let good
code go to waste :)
So, use less encoders by only creating one MSTO per head. Then, attach
each new MSTC to each MSTO which corresponds to a head that it's parent
DP port is capable of using. This brings the number of encoders we
register on the ThinkPad P71 from 31, down to just 15. Yay!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-09-14 06:03:52 +08:00
|
|
|
if (ret) {
|
|
|
|
nv50_head_destroy(crtc);
|
|
|
|
return ERR_PTR(ret);
|
|
|
|
}
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|
|
|
|
|
2020-06-08 12:47:37 +08:00
|
|
|
ret = nvif_notify_ctor(&disp->disp->object, "kmsVbl", nv50_head_vblank_handler,
|
drm/nouveau/kms/nvd9-: Add CRC support
This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:
https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt
We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.
Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.
Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.
Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.
Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.
Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.
In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".
Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
some corrections to the empty function declarations that we use if
CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
debugfs_create_file() - Greg K-H
Changes since v3:
(no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
2019-10-08 02:20:12 +08:00
|
|
|
false, NV04_DISP_NTFY_VBLANK,
|
|
|
|
&(struct nvif_notify_head_req_v0) {
|
|
|
|
.head = nv_crtc->index,
|
|
|
|
},
|
|
|
|
sizeof(struct nvif_notify_head_req_v0),
|
|
|
|
sizeof(struct nvif_notify_head_rep_v0),
|
|
|
|
&nv_crtc->vblank);
|
|
|
|
if (ret)
|
|
|
|
return ERR_PTR(ret);
|
|
|
|
|
drm/nouveau/kms/nv50-: Use less encoders by making mstos per-head
Currently, for every single MST capable DRM connector we create a set of
fake encoders, one for each possible head. Unfortunately this ends up
being a huge waste of encoders. While this currently isn't causing us
any problems, it's extremely close to doing so.
The ThinkPad P71 is a good example of this. Originally when trying to
figure out why nouveau was failing to load on this laptop, I discovered
it was because nouveau was creating too many encoders. This ended up
being because we were mistakenly creating MST encoders for the eDP port,
however we are still extremely close to hitting the encoder limit on
this machine as it exposes 1 eDP port and 5 DP ports, resulting in 31
encoders.
So while this fix didn't end up being necessary to fix the P71, we still
need to implement this so that we avoid hitting the encoder limit for
valid display configurations in the event that some machine with more
connectors then this becomes available. Plus, we don't want to let good
code go to waste :)
So, use less encoders by only creating one MSTO per head. Then, attach
each new MSTC to each MSTO which corresponds to a head that it's parent
DP port is capable of using. This brings the number of encoders we
register on the ThinkPad P71 from 31, down to just 15. Yay!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-09-14 06:03:52 +08:00
|
|
|
return head;
|
2018-05-08 18:39:47 +08:00
|
|
|
}
|