mm: avoid spurious 'bad pmd' warning messages
When the pmd_devmap() checks were added by5c7fb56e5e
("mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmd") to add better support for DAX huge pages, they were all added to the end of if() statements after existing pmd_trans_huge() checks. So, things like: - if (pmd_trans_huge(*pmd)) + if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) When further checks were added after pmd_trans_unstable() checks by commit7267ec008b
("mm: postpone page table allocation until we have page to map") they were also added at the end of the conditional: + if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd)) This ordering is fine for pmd_trans_huge(), but doesn't work for pmd_trans_unstable(). This is because DAX huge pages trip the bad_pmd() check inside of pmd_none_or_trans_huge_or_clear_bad() (called by pmd_trans_unstable()), which prints out a warning and returns 1. So, we do end up doing the right thing, but only after spamming dmesg with suspicious looking messages: mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5) Reorder these checks in a helper so that pmd_devmap() is checked first, avoiding the error messages, and add a comment explaining why the ordering is important. Fixes: commit7267ec008b
("mm: postpone page table allocation until we have page to map") Link: http://lkml.kernel.org/r/20170522215749.23516-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Xiong Zhou <xzhou@redhat.com> Cc: Eryu Guan <eguan@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
parent
c288983ddd
commit
d0f0931de9
40
mm/memory.c
40
mm/memory.c
|
@ -3029,6 +3029,17 @@ static int __do_fault(struct vm_fault *vmf)
|
|||
return ret;
|
||||
}
|
||||
|
||||
/*
|
||||
* The ordering of these checks is important for pmds with _PAGE_DEVMAP set.
|
||||
* If we check pmd_trans_unstable() first we will trip the bad_pmd() check
|
||||
* inside of pmd_none_or_trans_huge_or_clear_bad(). This will end up correctly
|
||||
* returning 1 but not before it spams dmesg with the pmd_clear_bad() output.
|
||||
*/
|
||||
static int pmd_devmap_trans_unstable(pmd_t *pmd)
|
||||
{
|
||||
return pmd_devmap(*pmd) || pmd_trans_unstable(pmd);
|
||||
}
|
||||
|
||||
static int pte_alloc_one_map(struct vm_fault *vmf)
|
||||
{
|
||||
struct vm_area_struct *vma = vmf->vma;
|
||||
|
@ -3052,18 +3063,27 @@ static int pte_alloc_one_map(struct vm_fault *vmf)
|
|||
map_pte:
|
||||
/*
|
||||
* If a huge pmd materialized under us just retry later. Use
|
||||
* pmd_trans_unstable() instead of pmd_trans_huge() to ensure the pmd
|
||||
* didn't become pmd_trans_huge under us and then back to pmd_none, as
|
||||
* a result of MADV_DONTNEED running immediately after a huge pmd fault
|
||||
* in a different thread of this mm, in turn leading to a misleading
|
||||
* pmd_trans_huge() retval. All we have to ensure is that it is a
|
||||
* regular pmd that we can walk with pte_offset_map() and we can do that
|
||||
* through an atomic read in C, which is what pmd_trans_unstable()
|
||||
* provides.
|
||||
* pmd_trans_unstable() via pmd_devmap_trans_unstable() instead of
|
||||
* pmd_trans_huge() to ensure the pmd didn't become pmd_trans_huge
|
||||
* under us and then back to pmd_none, as a result of MADV_DONTNEED
|
||||
* running immediately after a huge pmd fault in a different thread of
|
||||
* this mm, in turn leading to a misleading pmd_trans_huge() retval.
|
||||
* All we have to ensure is that it is a regular pmd that we can walk
|
||||
* with pte_offset_map() and we can do that through an atomic read in
|
||||
* C, which is what pmd_trans_unstable() provides.
|
||||
*/
|
||||
if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
|
||||
if (pmd_devmap_trans_unstable(vmf->pmd))
|
||||
return VM_FAULT_NOPAGE;
|
||||
|
||||
/*
|
||||
* At this point we know that our vmf->pmd points to a page of ptes
|
||||
* and it cannot become pmd_none(), pmd_devmap() or pmd_trans_huge()
|
||||
* for the duration of the fault. If a racing MADV_DONTNEED runs and
|
||||
* we zap the ptes pointed to by our vmf->pmd, the vmf->ptl will still
|
||||
* be valid and we will re-check to make sure the vmf->pte isn't
|
||||
* pte_none() under vmf->ptl protection when we return to
|
||||
* alloc_set_pte().
|
||||
*/
|
||||
vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
|
||||
&vmf->ptl);
|
||||
return 0;
|
||||
|
@ -3690,7 +3710,7 @@ static int handle_pte_fault(struct vm_fault *vmf)
|
|||
vmf->pte = NULL;
|
||||
} else {
|
||||
/* See comment in pte_alloc_one_map() */
|
||||
if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
|
||||
if (pmd_devmap_trans_unstable(vmf->pmd))
|
||||
return 0;
|
||||
/*
|
||||
* A regular pmd is established and it can't morph into a huge
|
||||
|
|
Loading…
Reference in New Issue