Commit Graph

120 Commits

Author SHA1 Message Date
Will Deacon bf950040a5 arm64: pgtable: use a single bit for PTE_WRITE regardless of DBM
Depending on CONFIG_ARM64_HW_AFDBM, we use either bit 57 or 51 of the
pte to represent PTE_WRITE. Given that bit 51 is reserved prior to
ARMv8.1, we can just use that bit regardless of the config option. That
also matches what happens if a kernel configured with ARM64_HW_AFDBM=y
is run on a CPU without the DBM functionality.

Cc: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-09-14 12:28:45 +01:00
Catalin Marinas 62d96c71d2 arm64: Fix pte_modify() to preserve the hardware dirty information
The pte_modify() function with hardware AF/DBM enabled must transfer the
hardware dirty information to the software PTE_DIRTY bit. However, it
was setting this bit in newprot and the mask does not cover such bit.
This patch sets PTE_DIRTY on the original pte which will be preserved in
the returned value.

Fixes: 2f4b829c62 ("arm64: Add support for hardware updates of the access and dirty pte bits")
Cc: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-09-14 12:28:41 +01:00
Catalin Marinas b847415ce9 arm64: Fix the pte_hw_dirty() check when AF/DBM is enabled
Commit 2f4b829c62 ("arm64: Add support for hardware updates of the
access and dirty pte bits") introduced support for handling hardware
updates of the access flag and dirty status. The PTE is automatically
dirtied in hardware (if supported) by clearing the PTE_RDONLY bit when
the PTE_DBM/PTE_WRITE bit is set. The pte_hw_dirty() macro was added to
detect a hardware dirtied pte. The pte_dirty() macro checks for both
software PTE_DIRTY and pte_hw_dirty().

Functions like pte_modify() clear the PTE_RDONLY bit since it is meant
to be set in set_pte_at() when written to memory. In such cases,
pte_hw_dirty() would return true even though such pte is clean. This
patch changes pte_hw_dirty() to test the PTE_DBM/PTE_WRITE bit together
with PTE_RDONLY.

Fixes: 2f4b829c62 ("arm64: Add support for hardware updates of the access and dirty pte bits")
Reported-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-09-14 12:28:31 +01:00
Jonathan (Zhixiong) Zhang 8d446c8647 arm64/mm: Add PROT_DEVICE_nGnRnE and PROT_NORMAL_WT
UEFI spec 2.5 section 2.3.6.1 defines that
EFI_MEMORY_[UC|WC|WT|WB] are possible EFI memory types for
AArch64.

Each of those EFI memory types is mapped to a corresponding
AArch64 memory type. So we need to define PROT_DEVICE_nGnRnE
and PROT_NORMWL_WT additionaly.

MT_NORMAL_WT is defined, and its encoding is added to MAIR_EL1
when initializing the CPU.

Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438936621-5215-6-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-08 10:37:40 +02:00
Will Deacon 766ffb6980 arm64: pgtable: fix definition of pte_valid
pte_valid should check if the PTE_VALID bit (1 << 0) is set in the pte,
so fix the macro definition to use bitwise & instead of logical &&.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-28 16:14:03 +01:00
Will Deacon 4b3dc9679c arm64: force CONFIG_SMP=y and remove redundant #ifdefs
Nobody seems to be producing !SMP systems anymore, so this is just
becoming a source of kernel bugs, particularly if people want to use
coherent DMA with non-shared pages.

This patch forces CONFIG_SMP=y for arm64, removing a modest amount of
code in the process.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27 11:08:40 +01:00
Catalin Marinas 2f4b829c62 arm64: Add support for hardware updates of the access and dirty pte bits
The ARMv8.1 architecture extensions introduce support for hardware
updates of the access and dirty information in page table entries. With
TCR_EL1.HA enabled, when the CPU accesses an address with the PTE_AF bit
cleared in the page table, instead of raising an access flag fault the
CPU sets the actual page table entry bit. To ensure that kernel
modifications to the page tables do not inadvertently revert a change
introduced by hardware updates, the exclusive monitor (ldxr/stxr) is
adopted in the pte accessors.

When TCR_EL1.HD is enabled, a write access to a memory location with the
DBM (Dirty Bit Management) bit set in the corresponding pte
automatically clears the read-only bit (AP[2]). Such DBM bit maps onto
the Linux PTE_WRITE bit and to check whether a writable (DBM set) page
is dirty, the kernel tests the PTE_RDONLY bit. In order to allow
read-only and dirty pages, the kernel needs to preserve the software
dirty bit. The hardware dirty status is transferred to the software
dirty bit in ptep_set_wrprotect() (using load/store exclusive loop) and
pte_modify().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27 11:08:39 +01:00
Will Deacon cba3574fd5 arm64: move update_mmu_cache() into asm/pgtable.h
Mark Brown reported an allnoconfig build failure in -next:

  Today's linux-next fails to build an arm64 allnoconfig due to "mm:
  make GUP handle pfn mapping unless FOLL_GET is requested" which
  causes:

  >       arm64-allnoconfig
  > ../mm/gup.c:51:4: error: implicit declaration of function
    'update_mmu_cache' [-Werror=implicit-function-declaration]

Fix the error by moving the function to asm/pgtable.h, as is the case
for most other architectures.

Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-27 11:08:39 +01:00
Kirill A. Shutemov 9f25e6ad58 arm64: expose number of page table levels on Kconfig level
We would want to use number of page table level to define mm_struct.
Let's expose it as CONFIG_PGTABLE_LEVELS.

ARM64_PGTABLE_LEVELS is renamed to PGTABLE_LEVELS and defined before
sourcing init/Kconfig: arch/Kconfig will define default value and it's
sourced from init/Kconfig.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14 16:49:01 -07:00
Feng Kan 6910fa16db arm64: enable PTE type bit in the mask for pte_modify
Caught during Trinity testing. The pte_modify does not allow
modification for PTE type bit. This cause the test to hang
the system. It is found that the PTE can't transit from an
inaccessible page (b00) to a valid page (b11) because the mask
does not allow it. This happens when a big block of mmaped
memory is set the PROT_NONE, then the a small piece is broken
off and set to PROT_WRITE | PROT_READ cause a huge page split.

Signed-off-by: Feng Kan <fkan@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-02-26 18:30:12 +00:00
Linus Torvalds 59d53737a8 Merge branch 'akpm' (patches from Andrew)
Merge second set of updates from Andrew Morton:
 "More of MM"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
  mm/nommu.c: fix arithmetic overflow in __vm_enough_memory()
  mm/mmap.c: fix arithmetic overflow in __vm_enough_memory()
  vmstat: Reduce time interval to stat update on idle cpu
  mm/page_owner.c: remove unnecessary stack_trace field
  Documentation/filesystems/proc.txt: describe /proc/<pid>/map_files
  mm: incorporate read-only pages into transparent huge pages
  vmstat: do not use deferrable delayed work for vmstat_update
  mm: more aggressive page stealing for UNMOVABLE allocations
  mm: always steal split buddies in fallback allocations
  mm: when stealing freepages, also take pages created by splitting buddy page
  mincore: apply page table walker on do_mincore()
  mm: /proc/pid/clear_refs: avoid split_huge_page()
  mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP)
  mempolicy: apply page table walker on queue_pages_range()
  arch/powerpc/mm/subpage-prot.c: use walk->vma and walk_page_vma()
  memcg: cleanup preparation for page table walk
  numa_maps: remove numa_maps->vma
  numa_maps: fix typo in gather_hugetbl_stats
  pagemap: use walk->vma instead of calling find_vma()
  clear_refs: remove clear_refs_private->vma and introduce clear_refs_test_walk()
  ...
2015-02-11 18:23:28 -08:00
Linus Torvalds 6b00f7efb5 arm64 updates for 3.20:
- reimplementation of the virtual remapping of UEFI Runtime Services in
   a way that is stable across kexec
 - emulation of the "setend" instruction for 32-bit tasks (user
   endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
   accordingly)
 - compat_sys_call_table implemented in C (from asm) and made it a
   constant array together with sys_call_table
 - export CPU cache information via /sys (like other architectures)
 - DMA API implementation clean-up in preparation for IOMMU support
 - macros clean-up for KVM
 - dropped some unnecessary cache+tlb maintenance
 - CONFIG_ARM64_CPU_SUSPEND clean-up
 - defconfig update (CPU_IDLE)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU25v3AAoJEGvWsS0AyF7xYjcP/j8ESvs+z0BPgeJ6XREfOnCh
 cp+w/1rJ5BafJ5RRkibrciwTNOIJS4FGMivWyURtoh430lS0Rh7fxZ3Ouna3xjrT
 Nf7AxenWoA8Lo6wHh+FlNUeGk3iWfX6WwA2tYrbKudK+LBJ1wHjwpE7cWQO0FgwJ
 aFDahu+QD5/u45p/VcVctMtiEDvOxBdO8gfat6r+YkLm7pbRxQkZnpA/JE4Gps1p
 Td5jvMNH9pXI5pffSbeR9Q+vs/r0yqKLXQg01Eb2bZgGDgwf9yzADrHuaKamZt35
 X5flmLiTGC6swJCJvUkZC1Nuue33bXcvW5+vgvar+MNGyXsxv+B/wARLqGhiWhQZ
 nLGwFpuNu6wdY9tGHb/XR8khcewkw1/lRH1hHKhchrmRyUqHvXcPgC5tamjLrY8C
 BV3BAeQvRho8OKwWUmbXIlyON1vPux6CJdj4D/A5NL+qph2WHeVWJCXg6nVFx0Wc
 Eb3bXbI4QRwTFL7pGRF8RyZJBAQtgYhQMKWMW2GHgUgn+r1EixG73BZoSwvpHrrw
 FOR9AVNfVBqmNON8xiIb3DN4EViq76EF0jrsZh5I9EoWS2w5qtk60kJQgXE+M4EE
 vOlmh3dhEVfCN2SxOn0bgoQmTulyjqGauTSSJKQbIBuinPFveukrJfGNFIWt0SZs
 f38FBMo6sgU4VG85B+Fr
 =X5x/
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "arm64 updates for 3.20:

   - reimplementation of the virtual remapping of UEFI Runtime Services
     in a way that is stable across kexec
   - emulation of the "setend" instruction for 32-bit tasks (user
     endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
     accordingly)
   - compat_sys_call_table implemented in C (from asm) and made it a
     constant array together with sys_call_table
   - export CPU cache information via /sys (like other architectures)
   - DMA API implementation clean-up in preparation for IOMMU support
   - macros clean-up for KVM
   - dropped some unnecessary cache+tlb maintenance
   - CONFIG_ARM64_CPU_SUSPEND clean-up
   - defconfig update (CPU_IDLE)

  The EFI changes going via the arm64 tree have been acked by Matt
  Fleming.  There is also a patch adding sys_*stat64 prototypes to
  include/linux/syscalls.h, acked by Andrew Morton"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (47 commits)
  arm64: compat: Remove incorrect comment in compat_siginfo
  arm64: Fix section mismatch on alloc_init_p[mu]d()
  arm64: Avoid breakage caused by .altmacro in fpsimd save/restore macros
  arm64: mm: use *_sect to check for section maps
  arm64: drop unnecessary cache+tlb maintenance
  arm64:mm: free the useless initial page table
  arm64: Enable CPU_IDLE in defconfig
  arm64: kernel: remove ARM64_CPU_SUSPEND config option
  arm64: make sys_call_table const
  arm64: Remove asm/syscalls.h
  arm64: Implement the compat_sys_call_table in C
  syscalls: Declare sys_*stat64 prototypes if __ARCH_WANT_(COMPAT_)STAT64
  compat: Declare compat_sys_sigpending and compat_sys_sigprocmask prototypes
  arm64: uapi: expose our struct ucontext to the uapi headers
  smp, ARM64: Kill SMP single function call interrupt
  arm64: Emulate SETEND for AArch32 tasks
  arm64: Consolidate hotplug notifier for instruction emulation
  arm64: Track system support for mixed endian EL0
  arm64: implement generic IOMMU configuration
  arm64: Combine coherent and non-coherent swiotlb dma_ops
  ...
2015-02-11 18:03:54 -08:00
Kirill A. Shutemov d016bf7ece mm: make FIRST_USER_ADDRESS unsigned long on all archs
LKP has triggered a compiler warning after my recent patch "mm: account
pmd page tables to the process":

    mm/mmap.c: In function 'exit_mmap':
 >> mm/mmap.c:2857:2: warning: right shift count >= width of type [enabled by default]

The code:

 > 2857                WARN_ON(mm_nr_pmds(mm) >
   2858                                round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT);

In this, on tile, we have FIRST_USER_ADDRESS defined as 0.  round_up() has
the same type -- int.  PUD_SHIFT.

I think the best way to fix it is to define FIRST_USER_ADDRESS as unsigned
long.  On every arch for consistency.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:03 -08:00
Kirill A. Shutemov 9b3e661e58 arm64: drop PTE_FILE and pte_file()-related helpers
We've replaced remap_file_pages(2) implementation with emulation.  Nobody
creates non-linear mapping anymore.

This patch also adjust __SWP_TYPE_SHIFT and increase number of bits
availble for swap offset.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:31 -08:00
zhichang.yuan 523d6e9fae arm64:mm: free the useless initial page table
For 64K page system, after mapping a PMD section, the corresponding initial
page table is not needed any more. That page can be freed.

Signed-off-by: Zhichang Yuan <zhichang.yuan@linaro.org>
[catalin.marinas@arm.com: added BUG_ON() to catch late memblock freeing]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-01-28 12:07:28 +00:00
Ard Biesheuvel 8ce837cee8 arm64/mm: add create_pgd_mapping() to create private page tables
For UEFI, we need to install the memory mappings used for Runtime Services
in a dedicated set of page tables. Add create_pgd_mapping(), which allows
us to allocate and install those page table entries early.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-01-12 08:16:52 +00:00
Jungseok Lee 5d96e0cba2 arm64: mm: Add pgd_page to support RCU fast_gup
This patch adds pgd_page definition in order to keep supporting
HAVE_GENERIC_RCU_GUP configuration. In addition, it changes pud_page
expression to align with pmd_page for readability.

An introduction of pgd_page resolves the following build breakage
under 4KB + 4Level memory management combo.

mm/gup.c: In function 'gup_huge_pgd':
mm/gup.c:889:2: error: implicit declaration of function 'pgd_page' [-Werror=implicit-function-declaration]
  head = pgd_page(orig);
  ^
mm/gup.c:889:7: warning: assignment makes pointer from integer without a cast
  head = pgd_page(orig);

Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
[catalin.marinas@arm.com: remove duplicate pmd_page definition]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-12-23 16:39:17 +00:00
Kirill A. Shutemov c164e038ee mm: fix huge zero page accounting in smaps report
As a small zero page, huge zero page should not be accounted in smaps
report as normal page.

For small pages we rely on vm_normal_page() to filter out zero page, but
vm_normal_page() is not designed to handle pmds.  We only get here due
hackish cast pmd to pte in smaps_pte_range() -- pte and pmd format is not
necessary compatible on each and every architecture.

Let's add separate codepath to handle pmds.  follow_trans_huge_pmd() will
detect huge zero page for us.

We would need pmd_dirty() helper to do this properly.  The patch adds it
to THP-enabled architectures which don't yet have one.

[akpm@linux-foundation.org: use do_div to fix 32-bit build]
Signed-off-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengwei Yin <yfw.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10 17:41:08 -08:00
Linus Torvalds 8a5de18239 Second batch of changes for KVM/{arm,arm64} for 3.18
- Support for 48bit IPA and VA (EL2)
 - A number of fixes for devices mapped into guests
 - Yet another VGIC fix for BE
 - A fix for CPU hotplug
 - A few compile fixes (disabled VGIC, strict mm checks)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUQkyTAAoJECPQ0LrRPXpDwrUP/1WbELgB74W35CJ1zIc4KuBi
 unP1muW3QAr9Vmp/KovRKyLKFiRaTRlQsszaI78f4ZQ++0vzivU8dZwV81Gn1y/v
 0qF63OB0UYsOgXMRrh5JTEqzUyNyNBLUH+FAQiEO/srDoH5WLp3Zq7ThjzjwGn7Q
 K2ArxFiml+p2BGIGKWe3XIrxNgpW4oWhfe1kW4WU7sshuJlut3Nee+q2lSIg9mZx
 2VXYnLNzSsHizgQHuVEyXIqn8HA5FSCvjBYIUcLERlWB0I66WvzOqg9rH/BmNNR2
 H+cBDY+9D8KBUBG9zZSG7hZ0mAONKcOnxGZWGzte3Oi7FMZkB3Y/zrIs0na4iB5Y
 FxE8j+2qclZk9fkHQ7wn9Ws8hpGR2OrFlc2O5ZoBJJ2KJ4wMRHMeEbYRBCRQbTCN
 +81SUW7mh2j/La0JBqZ6DhhTiymUdIB+6v78im9WGlHsFRAIHBt0Q0u/pIyY+GJs
 OH7FoswI3vF5iODlHeRO1yjaO3rkj+IJwqTuUuhAIGu9+qnof3ge+eM1cOqrudNa
 u2kDz+BC21+Q8dflOF99Ryz7cMWqMiwtR+N+OUYpxc7RL7mCeHVANJxdWIFHKa2Z
 XJaHmbKjmw8AoR0fbS6YWOl2xmIhqU+FAngI+mow/Hz4pJDpR2K3w17ASXIVVQVX
 go2bvGHONkdn8Ji3Asap
 =3fl5
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

Pull second batch of changes for KVM/{arm,arm64} from Marc Zyngier:
 "The most obvious thing is the sizeable MMU changes to support 48bit
  VAs on arm64.

  Summary:

   - support for 48bit IPA and VA (EL2)
   - a number of fixes for devices mapped into guests
   - yet another VGIC fix for BE
   - a fix for CPU hotplug
   - a few compile fixes (disabled VGIC, strict mm checks)"

[ I'm pulling directly from Marc at the request of Paolo Bonzini, whose
  backpack was stolen at Düsseldorf airport and will do new keys and
  rebuild his web of trust.    - Linus ]

* tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
  arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs
  arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort
  arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE
  arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
  arm/arm64: KVM: map MMIO regions at creation time
  arm64: kvm: define PAGE_S2_DEVICE as read-only by default
  ARM: kvm: define PAGE_S2_DEVICE as read-only by default
  arm/arm64: KVM: add 'writable' parameter to kvm_phys_addr_ioremap
  arm/arm64: KVM: fix potential NULL dereference in user_mem_abort()
  arm/arm64: KVM: use __GFP_ZERO not memset() to get zeroed pages
  ARM: KVM: fix vgic-disabled build
  arm: kvm: fix CPU hotplug
2014-10-18 14:32:31 -07:00
Ard Biesheuvel 4a513fb009 arm64: kvm: define PAGE_S2_DEVICE as read-only by default
Now that we support read-only memslots, we need to make sure that
pass-through device mappings are not mapped writable if the guest
has requested them to be read-only. The existing implementation
already honours this by calling kvm_set_s2pte_writable() on the new
pte in case of writable mappings, so all we need to do is define
the default pgprot_t value used for devices to be PTE_S2_RDONLY.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-10 13:07:38 +02:00
Linus Torvalds 0cf744bc7a Merge branch 'akpm' (fixes from Andrew Morton)
Merge patch-bomb from Andrew Morton:
 - part of OCFS2 (review is laggy again)
 - procfs
 - slab
 - all of MM
 - zram, zbud
 - various other random things: arch, filesystems.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  nosave: consolidate __nosave_{begin,end} in <asm/sections.h>
  include/linux/screen_info.h: remove unused ORIG_* macros
  kernel/sys.c: compat sysinfo syscall: fix undefined behavior
  kernel/sys.c: whitespace fixes
  acct: eliminate compile warning
  kernel/async.c: switch to pr_foo()
  include/linux/blkdev.h: use NULL instead of zero
  include/linux/kernel.h: deduplicate code implementing clamp* macros
  include/linux/kernel.h: rewrite min3, max3 and clamp using min and max
  alpha: use Kbuild logic to include <asm-generic/sections.h>
  frv: remove deprecated IRQF_DISABLED
  frv: remove unused cpuinfo_frv and friends to fix future build error
  zbud: avoid accessing last unused freelist
  zsmalloc: simplify init_zspage free obj linking
  mm/zsmalloc.c: correct comment for fullness group computation
  zram: use notify_free to account all free notifications
  zram: report maximum used memory
  zram: zram memory size limitation
  zsmalloc: change return value unit of zs_get_total_size_bytes
  zsmalloc: move pages_allocated to zs_pool
  ...
2014-10-09 22:26:14 -04:00
Steve Capper 29e5694054 arm64: mm: enable RCU fast_gup
Activate the RCU fast_gup for ARM64.  We also need to force THP splits to
broadcast an IPI s.t.  we block in the fast_gup page walker.  As THP
splits are comparatively rare, this should not lead to a noticeable
performance degradation.

Some pre-requisite functions pud_write and pud_page are also added.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Tested-by: Dann Frazier <dann.frazier@canonical.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:01 -04:00
Linus Torvalds 80213c03c4 PCI changes for the v3.18 merge window:
Enumeration
     - Check Vendor ID only for Config Request Retry Status (Rajat Jain)
     - Enable Config Request Retry Status when supported (Rajat Jain)
     - Add generic domain handling (Catalin Marinas)
     - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado)
 
   Resource management
     - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu)
     - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr)
 
   PCI device hotplug
     - Prevent NULL dereference during pciehp probe (Andreas Noever)
     - Move _HPP & _HPX handling into core (Bjorn Helgaas)
     - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas)
     - Apply _HPP/_HPX to display devices (Bjorn Helgaas)
     - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas)
     - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas)
     - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas)
     - Fix wait time in pciehp timeout message (Yinghai Lu)
     - Add more pciehp Slot Control debug output (Yinghai Lu)
     - Stop disabling pciehp notifications during init (Yinghai Lu)
 
   MSI
     - Remove arch_msi_check_device() (Alexander Gordeev)
     - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev)
     - Move D0 check into pci_msi_check_device() (Alexander Gordeev)
     - Remove unused kobject from struct msi_desc (Yijing Wang)
     - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang)
     - Add "msi_bus" sysfs MSI/MSI-X control for endpoints (Yijing Wang)
     - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang)
     - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang)
     - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang)
 
   Power management
     - Drop unused runtime PM support code for PCIe ports (Rafael J.  Wysocki)
     - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki)
 
   AER
     - Add additional AER error strings (Gong Chen)
     - Make <linux/aer.h> standalone includable (Thierry Reding)
 
   Virtualization
     - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson)
     - Add ACS quirk for Intel 10G NICs (Alex Williamson)
     - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp)
     - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson)
     - Add device flag helpers (Ethan Zhao)
     - Assume all Mellanox devices have broken INTx masking (Gavin Shan)
 
   Generic host bridge driver
     - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau)
     - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau)
     - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau)
     - Fix the conversion of IO ranges into IO resources (Liviu Dudau)
     - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau)
     - Add support for parsing PCI host bridge resources from DT (Liviu Dudau)
     - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau)
     - Add arm64 architectural support for PCI (Liviu Dudau)
 
   APM X-Gene
     - Add APM X-Gene PCIe driver (Tanmay Inamdar)
     - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar)
 
   Freescale i.MX6
     - Probe in module_init(), not fs_initcall() (Lucas Stach)
     - Delay enabling reference clock for SS until it stabilizes (Tim Harvey)
 
   Marvell MVEBU
     - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni)
 
   NVIDIA Tegra
     - Make sure the PCIe PLL is really reset (Eric Yuen)
     - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang)
     - Fix extended configuration space mapping (Peter Daifuku)
     - Implement resource hierarchy (Thierry Reding)
     - Clear CLKREQ# enable on port disable (Thierry Reding)
     - Add Tegra124 support (Thierry Reding)
 
   ST Microelectronics SPEAr13xx
     - Pass config resource through reg property (Pratyush Anand)
 
   Synopsys DesignWare
     - Use NULL instead of false (Fabio Estevam)
     - Parse bus-range property from devicetree (Lucas Stach)
     - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach)
     - Remove pci_assign_unassigned_resources() (Lucas Stach)
     - Check private_data validity in single place (Lucas Stach)
     - Setup and clear exactly one MSI at a time (Lucas Stach)
     - Remove open-coded bitmap operations (Lucas Stach)
     - Fix configuration base address when using 'reg' (Minghuan Lian)
     - Fix IO resource end address calculation (Minghuan Lian)
     - Rename get_msi_data() to get_msi_addr() (Minghuan Lian)
     - Add get_msi_data() to pcie_host_ops (Minghuan Lian)
     - Add support for v3.65 hardware (Murali Karicheri)
     - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand)
 
   TI Keystone
     - Add TI Keystone PCIe driver (Murali Karicheri)
     - Limit MRSS for all downstream devices (Murali Karicheri)
     - Assume controller is already in RC mode (Murali Karicheri)
     - Set device ID based on SoC to support multiple ports (Murali Karicheri)
 
   Xilinx AXI
     - Add Xilinx AXI PCIe driver (Srikanth Thokala)
     - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter)
 
   Miscellaneous
     - Clean up whitespace (Quentin Lambert)
     - Remove assignments from "if" conditions (Quentin Lambert)
     - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri)
     - x86: Mark DMI tables as initialization data (Mathias Krause)
     - x86: Move __init annotation to the correct place (Mathias Krause)
     - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause)
     - x86: Constify pci_mmcfg_probes[] array (Mathias Krause)
     - x86: Mark PCI BIOS initialization code as such (Mathias Krause)
     - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya)
     - Remove unnecessary variable in pci_add_dynid() (Tobias Klauser)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNWmJAAoJEFmIoMA60/r8GncP/3uHRoBrnaF6pv+S1l1p3Fs/
 l1kKH91/IuAAU7VJX8pkNybFqx02topWmiVVXAzqvD01PcRLGCLjPbWl5h+y5/Ja
 CHZH33AwHAmm0kt4BrOSOeHTLJhAigly2zV3P4F8jRIgyaeMoGZ6Ko4tkQUpm21k
 +ohrOd4cxYkmzzCjKwsZZhKnyRNpae8FmTk3VQBPuN8DbhvFPrqo5/+GeAdSZTdS
 HZHpfl2HL4095aY7uBVsZqNkjQyl6SnWwjkjLnuI8q3qA3BLgDZE/Jr8F/MNuW1V
 y01JIjerFWMDFyBIkpg7moYnODy6oP3KvczwYdKGmqsJja+0MQvYhLTwD+R/yTQS
 SewJA0mL3T3EJEfnFYkCiaIX27xIwk/FxHfaKPN91xgx/QM7xCVZNrU2/dXjhoX1
 GqLKxOEaFHhWWTyT5Dj27I0ZcElzFZ3tIwvrHfs8y22oAuAlsAypaUgvUwRfL4CO
 hOj4ITZa0t041sYWqxCoGAA9Fdp8HMzNKKS5F4mhADz4Ad9v6uPCNv/s/RoxVsbm
 jhZOtPYJ0/iCA+kNVX563S8Z3VpfPI+7bBjcj2WKdzW+IlICvOKT+kvwL2Tv/rE7
 w0hrNsbkgGsYbPldMx7LwCavsUtYFuNj0zoU6vkhP2jk6O2Tn5VXDmjrXH0v3iHI
 v03vlUtre0bQ26fzDyLQ
 =4Zv1
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "The interesting things here are:

   - Turn on Config Request Retry Status Software Visibility.  This
     caused hangs last time, but we included a fix this time.
   - Rework PCI device configuration to use _HPP/_HPX more aggressively
   - Allow PCI devices to be put into D3cold during system suspend
   - Add arm64 PCI support
   - Add APM X-Gene host bridge driver
   - Add TI Keystone host bridge driver
   - Add Xilinx AXI host bridge driver

  More detailed summary:

  Enumeration
    - Check Vendor ID only for Config Request Retry Status (Rajat Jain)
    - Enable Config Request Retry Status when supported (Rajat Jain)
    - Add generic domain handling (Catalin Marinas)
    - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado)

  Resource management
    - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu)
    - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr)

  PCI device hotplug
    - Prevent NULL dereference during pciehp probe (Andreas Noever)
    - Move _HPP & _HPX handling into core (Bjorn Helgaas)
    - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas)
    - Apply _HPP/_HPX to display devices (Bjorn Helgaas)
    - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas)
    - Fix wait time in pciehp timeout message (Yinghai Lu)
    - Add more pciehp Slot Control debug output (Yinghai Lu)
    - Stop disabling pciehp notifications during init (Yinghai Lu)

  MSI
    - Remove arch_msi_check_device() (Alexander Gordeev)
    - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev)
    - Move D0 check into pci_msi_check_device() (Alexander Gordeev)
    - Remove unused kobject from struct msi_desc (Yijing Wang)
    - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang)
    - Add "msi_bus" sysfs MSI/MSI-X control for endpoints (Yijing Wang)
    - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang)
    - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang)
    - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang)

  Power management
    - Drop unused runtime PM support code for PCIe ports (Rafael J.  Wysocki)
    - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki)

  AER
    - Add additional AER error strings (Gong Chen)
    - Make <linux/aer.h> standalone includable (Thierry Reding)

  Virtualization
    - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson)
    - Add ACS quirk for Intel 10G NICs (Alex Williamson)
    - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp)
    - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson)
    - Add device flag helpers (Ethan Zhao)
    - Assume all Mellanox devices have broken INTx masking (Gavin Shan)

  Generic host bridge driver
    - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau)
    - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau)
    - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau)
    - Fix the conversion of IO ranges into IO resources (Liviu Dudau)
    - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau)
    - Add support for parsing PCI host bridge resources from DT (Liviu Dudau)
    - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau)
    - Add arm64 architectural support for PCI (Liviu Dudau)

  APM X-Gene
    - Add APM X-Gene PCIe driver (Tanmay Inamdar)
    - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar)

  Freescale i.MX6
    - Probe in module_init(), not fs_initcall() (Lucas Stach)
    - Delay enabling reference clock for SS until it stabilizes (Tim Harvey)

  Marvell MVEBU
    - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni)

  NVIDIA Tegra
    - Make sure the PCIe PLL is really reset (Eric Yuen)
    - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang)
    - Fix extended configuration space mapping (Peter Daifuku)
    - Implement resource hierarchy (Thierry Reding)
    - Clear CLKREQ# enable on port disable (Thierry Reding)
    - Add Tegra124 support (Thierry Reding)

  ST Microelectronics SPEAr13xx
    - Pass config resource through reg property (Pratyush Anand)

  Synopsys DesignWare
    - Use NULL instead of false (Fabio Estevam)
    - Parse bus-range property from devicetree (Lucas Stach)
    - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach)
    - Remove pci_assign_unassigned_resources() (Lucas Stach)
    - Check private_data validity in single place (Lucas Stach)
    - Setup and clear exactly one MSI at a time (Lucas Stach)
    - Remove open-coded bitmap operations (Lucas Stach)
    - Fix configuration base address when using 'reg' (Minghuan Lian)
    - Fix IO resource end address calculation (Minghuan Lian)
    - Rename get_msi_data() to get_msi_addr() (Minghuan Lian)
    - Add get_msi_data() to pcie_host_ops (Minghuan Lian)
    - Add support for v3.65 hardware (Murali Karicheri)
    - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand)

  TI Keystone
    - Add TI Keystone PCIe driver (Murali Karicheri)
    - Limit MRSS for all downstream devices (Murali Karicheri)
    - Assume controller is already in RC mode (Murali Karicheri)
    - Set device ID based on SoC to support multiple ports (Murali Karicheri)

  Xilinx AXI
    - Add Xilinx AXI PCIe driver (Srikanth Thokala)
    - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter)

  Miscellaneous
    - Clean up whitespace (Quentin Lambert)
    - Remove assignments from "if" conditions (Quentin Lambert)
    - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri)
    - x86: Mark DMI tables as initialization data (Mathias Krause)
    - x86: Move __init annotation to the correct place (Mathias Krause)
    - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause)
    - x86: Constify pci_mmcfg_probes[] array (Mathias Krause)
    - x86: Mark PCI BIOS initialization code as such (Mathias Krause)
    - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya)
    - Remove unnecessary variable in pci_add_dynid() (Tobias Klauser)"

* tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (109 commits)
  arm64: dts: Add APM X-Gene PCIe device tree nodes
  PCI: Add ACS quirk for AMD A88X southbridge devices
  PCI: xgene: Add APM X-Gene PCIe driver
  PCI: designware: Remove open-coded bitmap operations
  PCI/MSI: Remove unnecessary temporary variable
  PCI/MSI: Use __write_msi_msg() instead of write_msi_msg()
  MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
  PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg()
  PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints
  PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib
  PCI/MSI: Remove unused kobject from struct msi_desc
  PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported()
  PCI/MSI: Move D0 check into pci_msi_check_device()
  PCI/MSI: Remove arch_msi_check_device()
  irqchip: armada-370-xp: Remove arch_msi_check_device()
  PCI/MSI/PPC: Remove arch_msi_check_device()
  arm64: Add architectural support for PCI
  PCI: Add pci_remap_iospace() to map bus I/O resources
  of/pci: Add support for parsing PCI host bridge resources from DT
  of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()
  ...

Conflicts:
	arch/arm64/boot/dts/apm-storm.dtsi
2014-10-09 15:03:49 -04:00
Liviu Dudau d1e6dc91b5 arm64: Add architectural support for PCI
Use the generic PCI domain and OF functions to provide support for PCI
on arm64.

[bhelgaas: Change comments to use generic PCI, not just PCIe.  Nothing at
this level is PCIe-specific.]
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2014-09-30 17:08:57 -06:00
Laura Abbott b6d4f2800b arm64: Introduce {set,clear}_pte_bit
It's useful to be able to change individual bits in ptes at times.
Introduce functions for this and update existing pte_mk* functions
to use these primatives.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
[will: added missing inline keyword for new header functions]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-09-08 14:39:18 +01:00
Catalin Marinas 7f0b1bf045 arm64: Fix barriers used for page table modifications
The architecture specification states that both DSB and ISB are required
between page table modifications and subsequent memory accesses using the
corresponding virtual address. When TLB invalidation takes place, the
tlb_flush_* functions already have the necessary barriers. However, there are
other functions like create_mapping() for which this is not the case.

The patch adds the DSB+ISB instructions in the set_pte() function for
valid kernel mappings. The invalid pte case is handled by tlb_flush_*
and the user mappings in general have a corresponding update_mmu_cache()
call containing a DSB. Even when update_mmu_cache() isn't called, the
kernel can still cope with an unlikely spurious page fault by
re-executing the instruction.

In addition, the set_pmd, set_pud() functions gain an ISB for
architecture compliance when block mappings are created.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
2014-07-24 10:25:42 +01:00
Catalin Marinas 7078db4621 arm64: asm/pgtable.h pmd/pud definitions clean-up
Non-functional change to group together the pmd/pud definitions and
reduce the amount of #if CONFIG_ARM64_PGTABLE_LEVELS.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Jungseok Lee <jungseoklee85@gmail.com>
2014-07-23 15:28:10 +01:00
Catalin Marinas 08375198b0 arm64: Determine the vmalloc/vmemmap space at build time based on VA_BITS
Rather than guessing what the maximum vmmemap space should be, this
patch allows the calculation based on the VA_BITS and sizeof(struct
page). The vmalloc space extends to the beginning of the vmemmap space.

Since the virtual kernel memory layout now depends on the build
configuration, this patch removes the detailed description in
Documentation/arm64/memory.txt in favour of information printed during
kernel booting.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Jungseok Lee <jungseoklee85@gmail.com>
2014-07-23 15:28:05 +01:00
Catalin Marinas abe669d7e1 arm64: Convert bool ARM64_x_LEVELS to int ARM64_PGTABLE_LEVELS
Rather than having several Kconfig options, define int
ARM64_PGTABLE_LEVELS which will be also useful in converting some of the
pgtable macros.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Jungseok Lee <jungseoklee85@gmail.com>
2014-07-23 15:27:46 +01:00
Jungseok Lee c79b954bf6 arm64: mm: Implement 4 levels of translation tables
This patch implements 4 levels of translation tables since 3 levels
of page tables with 4KB pages cannot support 40-bit physical address
space described in [1] due to the following issue.

It is a restriction that kernel logical memory map with 4KB + 3 levels
(0xffffffc000000000-0xffffffffffffffff) cannot cover RAM region from
544GB to 1024GB in [1]. Specifically, ARM64 kernel fails to create
mapping for this region in map_mem function since __phys_to_virt for
this region reaches to address overflow.

If SoC design follows the document, [1], over 32GB RAM would be placed
from 544GB. Even 64GB system is supposed to use the region from 544GB
to 576GB for only 32GB RAM. Naturally, it would reach to enable 4 levels
of page tables to avoid hacking __virt_to_phys and __phys_to_virt.

However, it is recommended 4 levels of page table should be only enabled
if memory map is too sparse or there is about 512GB RAM.

References
----------
[1]: Principles of ARM Memory Maps, White Paper, Issue C

Signed-off-by: Jungseok Lee <jays.lee@samsung.com>
Reviewed-by: Sungjinn Chung <sungjinn.chung@samsung.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
[catalin.marinas@arm.com: MEMBLOCK_INITIAL_LIMIT removed, same as PUD_SIZE]
[catalin.marinas@arm.com: early_ioremap_init() updated for 4 levels]
[catalin.marinas@arm.com: 48-bit VA depends on BROKEN until KVM is fixed]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Jungseok Lee <jungseoklee85@gmail.com>
2014-07-23 15:27:40 +01:00
Jungseok Lee e41ceed035 arm64: Introduce VA_BITS and translation level options
This patch adds virtual address space size and a level of translation
tables to kernel configuration. It facilicates introduction of
different MMU options, such as 4KB + 4 levels, 16KB + 4 levels and
64KB + 3 levels, easily.

The idea is based on the discussion with Catalin Marinas:
http://www.spinics.net/linux/lists/arm-kernel/msg319552.html

Signed-off-by: Jungseok Lee <jays.lee@samsung.com>
Reviewed-by: Sungjinn Chung <sungjinn.chung@samsung.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Jungseok Lee <jungseoklee85@gmail.com>
2014-07-23 15:27:21 +01:00
Catalin Marinas b2f8c07bcb arm64: Remove duplicate (SWAPPER|IDMAP)_DIR_SIZE definitions
Just keep the asm/page.h definition as this is included in vmlinux.lds.S
as well.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
2014-07-17 16:02:42 +01:00
Steve Capper f3b766a26d arm64: mm: Fix horrendous config typo
The define ARM64_64K_PAGES is tested for rather than
CONFIG_ARM64_64K_PAGES. Correct that typo here.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-07-04 14:22:30 +01:00
Will Deacon e3a920afc3 arm64: mm: remove broken &= operator from pmd_mknotpresent
This should be a plain old '&' and could easily lead to undefined
behaviour if the target of a pmd_mknotpresent invocation was the same
as the parameter.

Fixes: 9c7e535fcc (arm64: mm: Route pmd thp functions through pte equivalents)
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v3.15
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-06-18 16:34:30 +01:00
Linus Torvalds cc07aabc53 - Optimised assembly string/memory routines (based on the AArch64 Cortex
Strings library contributed to glibc but re-licensed under GPLv2)
 - Optimised crypto algorithms making use of the ARMv8 crypto extensions
   (together with kernel API for using FPSIMD instructions in interrupt
   context)
 - Ftrace support
 - CPU topology parsing from DT
 - ESR_EL1 (Exception Syndrome Register) exposed to user space signal
   handlers for SIGSEGV/SIGBUS (useful to emulation tools like Qemu)
 - 1GB section linear mapping if applicable
 - Barriers usage clean-up
 - Default pgprot clean-up
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iQIcBAABAgAGBQJTkb+CAAoJEGvWsS0AyF7xLyEQAJgL8s2SdDyd+R8aukNDu3n9
 tCK7yVHO9Kg96dfeXVuSOVEo2jszo6R3nxzUL05FMovr230WBcmoeHvHz8ETGnw1
 g0yO8Ltkckjevog4UleCa3wGtYISjvwwrTalzbqoEWzsF2AV8oiqv/yuIn/EdkUr
 jaOqfNsnAQa8TIz4vMhi/AVdJWTTU/F6WP80oqCbxqXu/WL2InuBlHtOJMbk1HDI
 u1DJUGDQ1B9OgSVRkAOjCjSsEtz8sDY3lXsg3V1qT5+NbZTyomYM2IiBLdgQcX4P
 t/rqX9nX4VmRQtzefeP5WhKFks2x80C0BKibWC4teeL++tJHbgbFkyjoZZGcP27o
 zued3cYABrjrcAEU6ko/LUiL2Q4ozBOzosClpjpWulCxNPzsOps82UZWo3F3XbAt
 xjE3k7WF9WeNBOJdDGrarEaSLdnjjgCLoWVs8cOUYLpOOrtdSw16D29jJ68U0Y5g
 31wdwKxoueC8SFt8M9fP9J9Jyau08g+kvW1xQXrRmroppweFxjSpSy90imARyux/
 wUFz79HxkQB79ZHpJ0I5TNrw/w+7pBnfVSKGPOzrk+ZUsaH76caNRBoffUCzFMzz
 T3Sc8A36TZtOIcGR/Q4DMZNFXlIUXDSzCHP2Iu0QoIjTd5Ex96cqNvy3nswCYWwv
 yGe3ZEqUq9+WL7snNW4v
 =Jj8U
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into next

Pull arm64 updates from Catalin Marinas:
 - Optimised assembly string/memory routines (based on the AArch64
   Cortex Strings library contributed to glibc but re-licensed under
   GPLv2)
 - Optimised crypto algorithms making use of the ARMv8 crypto extensions
   (together with kernel API for using FPSIMD instructions in interrupt
   context)
 - Ftrace support
 - CPU topology parsing from DT
 - ESR_EL1 (Exception Syndrome Register) exposed to user space signal
   handlers for SIGSEGV/SIGBUS (useful to emulation tools like Qemu)
 - 1GB section linear mapping if applicable
 - Barriers usage clean-up
 - Default pgprot clean-up

Conflicts as per Catalin.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (57 commits)
  arm64: kernel: initialize broadcast hrtimer based clock event device
  arm64: ftrace: Add system call tracepoint
  arm64: ftrace: Add CALLER_ADDRx macros
  arm64: ftrace: Add dynamic ftrace support
  arm64: Add ftrace support
  ftrace: Add arm64 support to recordmcount
  arm64: Add 'notrace' attribute to unwind_frame() for ftrace
  arm64: add __ASSEMBLY__ in asm/insn.h
  arm64: Fix linker script entry point
  arm64: lib: Implement optimized string length routines
  arm64: lib: Implement optimized string compare routines
  arm64: lib: Implement optimized memcmp routine
  arm64: lib: Implement optimized memset routine
  arm64: lib: Implement optimized memmove routine
  arm64: lib: Implement optimized memcpy routine
  arm64: defconfig: enable a few more common/useful options in defconfig
  ftrace: Make CALLER_ADDRx macros more generic
  arm64: Fix deadlock scenario with smp_send_stop()
  arm64: Fix machine_shutdown() definition
  arm64: Support arch_irq_work_raise() via self IPIs
  ...
2014-06-06 10:43:28 -07:00
Linus Torvalds 1aacb90eaa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial into next
Pull trivial tree changes from Jiri Kosina:
 "Usual pile of patches from trivial tree that make the world go round"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
  staging: go7007: remove reference to CONFIG_KMOD
  aic7xxx: Remove obsolete preprocessor define
  of: dma: doc fixes
  doc: fix incorrect formula to calculate CommitLimit value
  doc: Note need of bc in the kernel build from 3.10 onwards
  mm: Fix printk typo in dmapool.c
  modpost: Fix comment typo "Modules.symvers"
  Kconfig.debug: Grammar s/addition/additional/
  wimax: Spelling s/than/that/, wording s/destinatary/recipient/
  aic7xxx: Spelling s/termnation/termination/
  arm64: mm: Remove superfluous "the" in comment
  of: Spelling s/anonymouns/anonymous/
  dma: imx-sdma: Spelling s/determnine/determine/
  ath10k: Improve grammar in comments
  ath6kl: Spelling s/determnine/determine/
  of: Improve grammar for of_alias_get_id() documentation
  drm/exynos: Spelling s/contro/control/
  radio-bcm2048.c: fix wrong overflow check
  doc: printk-formats: do not mention casts for u64/s64
  doc: spelling error changes
  ...
2014-06-04 08:50:34 -07:00
Will Deacon ceb218359d arm64: mm: fix pmd_write CoW brokenness
Commit 9c7e535fcc ("arm64: mm: Route pmd thp functions through pte
equivalents") changed the pmd manipulator and accessor functions to
convert the target pmd to a pte, process it with the pte functions, then
convert it back. Along the way, we gained support for PTE_WRITE, however
this is completely ignored by set_pmd_at, and so we fail to set the
PMD_SECT_RDONLY for PMDs, resulting in all sorts of lovely failures (like
CoW not working).

Partially reverting the offending commit (by making use of
PMD_SECT_RDONLY explicitly for pmd_{write,wrprotect,mkwrite} functions)
leads to further issues because pmd_write can then return potentially
incorrect values for page table entries marked as RDONLY, leading to
BUG_ON(pmd_write(entry)) tripping under some THP workloads.

This patch fixes the issue by routing set_pmd_at through set_pte_at,
which correctly takes the PTE_WRITE flag into account. Given that
THP mappings are always anonymous, the additional cache-flushing code
in __sync_icache_dcache won't impose any significant overhead as the
flush will be skipped.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-05-29 11:31:14 +01:00
Catalin Marinas 5a0fdfada3 Revert "arm64: Introduce execute-only page access permissions"
This reverts commit bc07c2c6e9.

While the aim is increased security for --x memory maps, it does not
protect against kernel level reads. Until SECCOMP is implemented for
arm64, revert this patch to avoid giving a false idea of execute-only
mappings.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-16 16:44:32 +01:00
Will Deacon 98f7685ee6 arm64: barriers: make use of barrier options with explicit barriers
When calling our low-level barrier macros directly, we can often suffice
with more relaxed behaviour than the default "all accesses, full system"
option.

This patch updates the users of dsb() to specify the option which they
actually require.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-09 17:03:15 +01:00
Steve Capper 206a2a73a6 arm64: mm: Create gigabyte kernel logical mappings where possible
We have the capability to map 1GB level 1 blocks when using a 4K
granule.

This patch adjusts the create_mapping logic s.t. when mapping physical
memory on boot, we attempt to use a 1GB block if both the VA and PA
start and end are 1GB aligned. This both reduces the levels of lookup
required to resolve a kernel logical address, as well as reduces TLB
pressure on cores that support 1GB TLB entries.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Tested-by: Jungseok Lee <jays.lee@samsung.com>
[catalin.marinas@arm.com: s/prot_sect_kernel/PROT_SECT_NORMAL_EXEC/]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-09 16:10:58 +01:00
Catalin Marinas a501e32430 arm64: Clean up the default pgprot setting
The primary aim of this patchset is to remove the pgprot_default and
prot_sect_default global variables and rely strictly on predefined
values. The original goal was to be able to run SMP kernels on UP
hardware by not setting the Shareability bit. However, it is unlikely to
see UP ARMv8 hardware and even if we do, the Shareability bit is no
longer assumed to disable cacheable accesses.

A side effect is that the device mappings now have the Shareability
attribute set. The hardware, however, should ignore it since Device
accesses are always Outer Shareable.

Following the removal of the two global variables, there is some PROT_*
macro reshuffling and cleanup, including the __PAGE_* macros (replaced
by PAGE_*).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
2014-05-09 15:53:37 +01:00
Catalin Marinas bc07c2c6e9 arm64: Introduce execute-only page access permissions
The ARMv8 architecture allows execute-only user permissions by clearing
the PTE_UXN and PTE_USER bits. The kernel, however, can still access
such page, so execute-only page permission does not protect against
read(2)/write(2) etc. accesses. Systems requiring such protection must
implement/enable features like SECCOMP.

This patch changes the arm64 __P100 and __S100 protection_map[] macros
to the new __PAGE_EXECONLY attributes. A side effect is that
pte_valid_user() no longer triggers for __PAGE_EXECONLY since PTE_USER
isn't set. To work around this, the check is done on the PTE_NG bit via
the pte_valid_ng() macro. VM_READ is also checked now for page faults.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-05-09 15:53:36 +01:00
Geert Uytterhoeven aad9061bf3 arm64: mm: Remove superfluous "the" in comment
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-05-05 15:37:50 +02:00
Linus Torvalds 1ce235faa8 - KGDB support for arm64
- PCI I/O space extended to 16M (in preparation of PCIe support patches)
 - Dropping ZONE_DMA32 in favour of ZONE_DMA (we only need one for the
   time being), together with swiotlb late initialisation to correctly
   setup the bounce buffer
 - DMA API cache maintenance support (not all ARMv8 platforms have
   hardware cache coherency)
 - Crypto extensions advertising via ELF_HWCAP2 for compat user space
 - Perf support for dwarf unwinding in compat mode
 - asm/tlb.h converted to the generic mmu_gather code
 - asm-generic rwsem implementation
 - Code clean-up
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iQIcBAABAgAGBQJTOaqsAAoJEGvWsS0AyF7xYNUP/3/IPySIB+/6pyUG6q7kvIpF
 Di93M+VdmnLEOKhhx/tjkiEmEQMp0hFPeOlQRWf/Ugg4ksulP6gRejdDEjIfkmsk
 LrRXLjvH79NDJbN0pTUXqGDvLLZ9Qnib+HEOuKABIYUrwhNKySBk+5omGfXFtwLR
 Mb5JxPX0kbBXOqbOX4RgANQoRlE8GxJR3V245zlGxA4klcN4IiaDy/99kj+kaeaa
 Cl8X9K2I550IZ2YUAWPOut2aee2qRFQtAhIDgVthTYlGRx7Y/rDLM16B8fFY/T0H
 7azIpSO5hk5lp8J3giJHYajlJlXNla5FeHQb8XAVnlyqFBmCUn0vvd2VbPvWREJp
 UD8t1vZZt/s2he6CVAQIfQghwLyzrpPa19KbnyI+3HtsZ+NS/puBJmcVKZ2PBY/L
 28BsRzB7BKAPEVhNmyPwFHNdZTvjaqYUCLhQ0uTp1sSHMcLeSs7+vyMR99f/0u9E
 doSYAeF41ZkxHXL5xEevdj4sFkCEY1XFxER1Y8VM1rqHTeGEoeYbdS/u9tEeBgit
 jBelvHAlNTBgbur2nW4E9fQpAF2CsvWnRq6lSmDRTkyjzcLUQqA8bsQJ3aUyJtZt
 j17kUIzSH1q7x3zAaWQcvMVeawdkv2+HanjuTOdeO2ehvyG71vvxA3RkCv8o5Jhh
 da+jAMhkpYQxk8mSKkWm
 =8+cB
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull ARM64 updates from Catalin Marinas:
 - KGDB support for arm64
 - PCI I/O space extended to 16M (in preparation of PCIe support
   patches)
 - Dropping ZONE_DMA32 in favour of ZONE_DMA (we only need one for the
   time being), together with swiotlb late initialisation to correctly
   setup the bounce buffer
 - DMA API cache maintenance support (not all ARMv8 platforms have
   hardware cache coherency)
 - Crypto extensions advertising via ELF_HWCAP2 for compat user space
 - Perf support for dwarf unwinding in compat mode
 - asm/tlb.h converted to the generic mmu_gather code
 - asm-generic rwsem implementation
 - Code clean-up

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
  arm64: Remove pgprot_dmacoherent()
  arm64: Support DMA_ATTR_WRITE_COMBINE
  arm64: Implement custom mmap functions for dma mapping
  arm64: Fix __range_ok macro
  arm64: Fix duplicated Kconfig entries
  arm64: mm: Route pmd thp functions through pte equivalents
  arm64: rwsem: use asm-generic rwsem implementation
  asm-generic: rwsem: de-PPCify rwsem.h
  arm64: enable generic CPU feature modalias matching for this architecture
  arm64: smp: make local symbol static
  arm64: debug: make local symbols static
  ARM64: perf: support dwarf unwinding in compat mode
  ARM64: perf: add support for frame pointer unwinding in compat mode
  ARM64: perf: add support for perf registers API
  arm64: Add boot time configuration of Intermediate Physical Address size
  arm64: Do not synchronise I and D caches for special ptes
  arm64: Make DMA coherent and strongly ordered mappings not executable
  arm64: barriers: add dmb barrier
  arm64: topology: Implement basic CPU topology support
  arm64: advertise ARMv8 extensions to 32-bit compat ELF binaries
  ...
2014-03-31 15:01:45 -07:00
Catalin Marinas 196adf2f30 arm64: Remove pgprot_dmacoherent()
Since this macro is identical to pgprot_writecombine() and is only used
in a single place, remove it completely to avoid confusion. On ARMv7+
processors, the coherent DMA mapping must be Normal NonCacheable (a.k.a.
writecombine) to avoid mismatched hardware attribute aliases (with the
kernel linear mapping as Normal Cacheable).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-24 10:35:35 +00:00
Steve Capper 9c7e535fcc arm64: mm: Route pmd thp functions through pte equivalents
Rather than have separate hugetlb and transparent huge page pmd
manipulation functions, re-wire our thp functions to simply call the
pte equivalents.

This allows THP to take advantage of the new PTE_WRITE logic introduced
in:
  c2c93e5 arm64: mm: Introduce PTE_WRITE

To represent splitting THPs we use the PTE_SPECIAL bit as this is not
used for pmds.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-03-14 18:02:10 +00:00
Catalin Marinas 71fdb6bf61 arm64: Do not synchronise I and D caches for special ptes
Special pte mappings are not intended to be executable and do not even
have an associated struct page. This patch ensures that we do not call
__sync_icache_dcache() on such ptes.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Cc: <stable@vger.kernel.org>
2014-03-13 11:22:29 +00:00
Catalin Marinas de2db74329 arm64: Make DMA coherent and strongly ordered mappings not executable
pgprot_{dmacoherent,writecombine,noncached} don't need to generate
executable mappings with side-effects like __sync_icache_dcache() being
called when the mapping is in user space.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Cc: <stable@vger.kernel.org>
2014-03-13 11:22:21 +00:00
Steve Capper 84fe6826c2 arm64: mm: Add double logical invert to pte accessors
Page table entries on ARM64 are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
	gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch adds a double logical invert to all the pte_ accessors to
ensure predictable downcasting.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-28 15:44:19 +00:00
Steve Capper c2c93e5b7f arm64: mm: Introduce PTE_WRITE
We have the following means for encoding writable or dirty ptes:

                                PTE_DIRTY       PTE_RDONLY
!pte_dirty && !pte_write        0               1
!pte_dirty && pte_write         0               1
pte_dirty && !pte_write         1               1
pte_dirty && pte_write          1               0

So we can't distinguish between writable clean ptes and read only
ptes. This can cause problems with ptes being incorrectly flagged as
read only when they are writable but not dirty.

This patch introduces a new software bit PTE_WRITE which allows us to
correctly identify writable ptes. PTE_RDONLY is now only clear for
valid ptes where a page is both writable and dirty.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-31 11:30:49 +00:00
Steve Capper 44b6dfc556 arm64: mm: Remove PTE_BIT_FUNC macro
Expand out the pte manipulation functions. This makes our life easier
when using things like tags and cscope.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-01-31 11:30:05 +00:00
Catalin Marinas 3676f9ef54 arm64: Move PTE_PROT_NONE higher up
PTE_PROT_NONE means that a pte is present but does not have any
read/write attributes. However, setting the memory type like
pgprot_writecombine() is allowed and such bits overlap with
PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the
vma->vm_pg_prot on PROT_NONE mappings.

This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit
59911ca432 (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE
together with the other software bits.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org> # 3.11+
2013-11-29 15:22:59 +00:00
Catalin Marinas 4f00130b70 arm64: Use Normal NonCacheable memory for writecombine
This provides better performance compared to Device GRE and also allows
unaligned accesses. Such memory is intended to be used with standard RAM
(e.g. framebuffers) and not I/O.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-29 15:05:07 +00:00
Catalin Marinas 847264fb7e arm64: Use 42-bit address space with 64K pages
This patch expands the VA_BITS to 42 when the 64K page configuration is
enabled allowing 2TB kernel linear mapping. Linux still uses 2 levels of
page tables in this configuration with pgd now being a full page.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2013-11-05 17:23:52 +00:00
Linus Torvalds 1873e50028 Main features:
- KVM and Xen ports to AArch64
 - Hugetlbfs and transparent huge pages support for arm64
 - Applied Micro X-Gene Kconfig entry and dts file
 - Cache flushing improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0bZAAAoJEGvWsS0AyF7xTEEP/R/aRoqWwbVAMlwAhujq616O
 t4RzIyBXZXqxS9I+raokCX4mgYxdeisJlzN2hoq73VEX2BQlXZoYh8vmfY9WeNSM
 2pdfif2HF7oo9ymCRyqfuhbumPrTyJhpbguzOYrxPqpp2f1hv2D8hbUJEFj429yL
 UjqTFoONngfouZmAlwrPGZQKhBI95vvN53yvDMH0PWfvpm07DKGIQMYp20y0pj8j
 slhLH3bh2kfpS1cf23JtH6IICwWD2pXW0POo569CfZry6bI74xve+Trcsm7iPnsO
 PSI1P046ME1mu3SBbKwiPIdN/FQqWwTHW07fvMmH/xuXu3Zs/mxgzi7vDzDrVvTg
 PJSbKWD6N/IPPwKS/gCUmWWDASO0bXx3KlDuRZqAjbRojs0UPUOTUhzJM/BHUms1
 vY2QS9lAm02LmZZrk1LeKKP85gB+qKQvHuOVhIOldWeLGKtsNufz1kynz6YTqsLq
 uUB55KwbhQ7q8+aoY6lWujqiTXMoLkBgGdjHs2I407PAv7ZjlhRWk2fIry7xJifp
 rKu2cIlWsRe4CGvGI410NvIJFrGvJAV4wA43sgBDjPumyILgT/5jw9r3RpJEBZZs
 akw/Bl1CbL+gMjyoPUWgcWZdRkUCE0eLrgyMOmaYfst8cOTaWw4dWLvUG/bBZg+Y
 mGnuEQUQtAPadk8P/Sv3
 =PZ/e
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull ARM64 updates from Catalin Marinas:
 "Main features:
   - KVM and Xen ports to AArch64
   - Hugetlbfs and transparent huge pages support for arm64
   - Applied Micro X-Gene Kconfig entry and dts file
   - Cache flushing improvements

  For arm64 huge pages support, there are x86 changes moving part of
  arch/x86/mm/hugetlbpage.c into mm/hugetlb.c to be re-used by arm64"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (66 commits)
  arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board
  arm64: Add defines for APM ARMv8 implementation
  arm64: Enable APM X-Gene SOC family in the defconfig
  arm64: Add Kconfig option for APM X-Gene SOC family
  arm64/Makefile: provide vdso_install target
  ARM64: mm: THP support.
  ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
  ARM64: mm: HugeTLB support.
  ARM64: mm: Move PTE_PROT_NONE bit.
  ARM64: mm: Make PAGE_NONE pages read only and no-execute.
  ARM64: mm: Restore memblock limit when map_mem finished.
  mm: thp: Correct the HPAGE_PMD_ORDER check.
  x86: mm: Remove general hugetlb code from x86.
  mm: hugetlb: Copy general hugetlb code from x86 to mm.
  x86: mm: Remove x86 version of huge_pmd_share.
  mm: hugetlb: Copy huge_pmd_share from x86 to mm.
  arm64: KVM: document kernel object mappings in HYP
  arm64: KVM: MAINTAINERS update
  arm64: KVM: userspace API documentation
  arm64: KVM: enable initialization of a 32bit vcpu
  ...
2013-07-03 10:31:38 -07:00
Catalin Marinas aa729dccb5 Merge branch 'for-next/hugepages' of git://git.linaro.org/people/stevecapper/linux into upstream-hugepages
* 'for-next/hugepages' of git://git.linaro.org/people/stevecapper/linux:
  ARM64: mm: THP support.
  ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
  ARM64: mm: HugeTLB support.
  ARM64: mm: Move PTE_PROT_NONE bit.
  ARM64: mm: Make PAGE_NONE pages read only and no-execute.
  ARM64: mm: Restore memblock limit when map_mem finished.
  mm: thp: Correct the HPAGE_PMD_ORDER check.
  x86: mm: Remove general hugetlb code from x86.
  mm: hugetlb: Copy general hugetlb code from x86 to mm.
  x86: mm: Remove x86 version of huge_pmd_share.
  mm: hugetlb: Copy huge_pmd_share from x86 to mm.

Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/pgtable-hwdef.h
	arch/arm64/include/asm/pgtable.h
2013-07-01 11:20:58 +01:00
Al Viro 40d158e618 consolidate io_remap_pfn_range definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:46:35 +04:00
Steve Capper af07484863 ARM64: mm: THP support.
Bring Transparent HugePage support to ARM. The size of a
transparent huge page depends on the normal page size. A
transparent huge page is always represented as a pmd.

If PAGE_SIZE is 4KB, THPs are 2MB.
If PAGE_SIZE is 64KB, THPs are 512MB.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2013-06-14 09:52:41 +01:00
Steve Capper 084bd29810 ARM64: mm: HugeTLB support.
Add huge page support to ARM64, different huge page sizes are
supported depending on the size of normal pages:

PAGE_SIZE is 4KB:
   2MB - (pmds) these can be allocated at any time.
1024MB - (puds) usually allocated on bootup with the command line
         with something like: hugepagesz=1G hugepages=6

PAGE_SIZE is 64KB:
 512MB - (pmds) usually allocated on bootup via command line.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2013-06-14 09:52:40 +01:00
Steve Capper 59911ca432 ARM64: mm: Move PTE_PROT_NONE bit.
Under ARM64, PTEs can be broadly categorised as follows:
   - Present and valid: Bit #0 is set. The PTE is valid and memory
     access to the region may fault.

   - Present and invalid: Bit #0 is clear and bit #1 is set.
     Represents present memory with PROT_NONE protection. The PTE
     is an invalid entry, and the user fault handler will raise a
     SIGSEGV.

   - Not present (file or swap): Bits #0 and #1 are clear.
     Memory represented has been paged out. The PTE is an invalid
     entry, and the fault handler will try and re-populate the
     memory where necessary.

Huge PTEs are block descriptors that have bit #1 clear. If we wish
to represent PROT_NONE huge PTEs we then run into a problem as
there is no way to distinguish between regular and huge PTEs if we
set bit #1.

To resolve this ambiguity this patch moves PTE_PROT_NONE from
bit #1 to bit #2 and moves PTE_FILE from bit #2 to bit #3. The
number of swap/file bits is reduced by 1 as a consequence, leaving
60 bits for file and swap entries.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2013-06-14 09:52:19 +01:00
Steve Capper 072b1b62a6 ARM64: mm: Make PAGE_NONE pages read only and no-execute.
If we consider the following code sequence:

	my_pte = pte_modify(entry, myprot);
	x = pte_write(my_pte);
	y = pte_exec(my_pte);

If myprot comes from a PROT_NONE page, then x and y will both be
true which is undesireable behaviour.

This patch sets the no-execute and read-only bits for PAGE_NONE
such that the code above will return false for both x and y.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2013-06-14 09:42:46 +01:00
Catalin Marinas 63917f0b5b Merge branch 'kvm-arm64/kvm-for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into upstream
* 'kvm-arm64/kvm-for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms: (33 commits)
  arm64: KVM: document kernel object mappings in HYP
  arm64: KVM: MAINTAINERS update
  arm64: KVM: userspace API documentation
  arm64: KVM: enable initialization of a 32bit vcpu
  arm64: KVM: 32bit guest fault injection
  arm64: KVM: 32bit specific register world switch
  arm64: KVM: CPU specific 32bit coprocessor access
  arm64: KVM: 32bit handling of coprocessor traps
  arm64: KVM: 32bit conditional execution emulation
  arm64: KVM: 32bit GP register access
  arm64: KVM: define 32bit specific registers
  arm64: KVM: Build system integration
  arm64: KVM: PSCI implementation
  arm64: KVM: Plug the arch timer
  ARM: KVM: timer: allow DT matching for ARMv8 cores
  arm64: KVM: Plug the VGIC
  arm64: KVM: Exit handling
  arm64: KVM: HYP mode world switch implementation
  arm64: KVM: hypervisor initialization code
  arm64: KVM: guest one-reg interface
  ...

Conflicts:
	arch/arm64/Makefile
2013-06-12 16:48:38 +01:00
Will Deacon 9ab6d02fdd arm64: pgtable: use pte_index instead of __pte_index
pte_index is a useful helper outside of arch/arm64, for things like the
ARM SMMU driver, so rename __pte_index to pte_index to be consistent
with both arch/arm/ and also the definitions of pmd_index and pgd_index.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-06-11 18:15:24 +01:00
Marc Zyngier 363116073a arm64: KVM: define HYP and Stage-2 translation page flags
Add HYP and S2 page flags, for both normal and device memory.

Reviewed-by: Christopher Covington <cov@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2013-06-07 14:03:31 +01:00
Will Deacon a6fadf7e67 arm64: mm: introduce present, faulting entries for PAGE_NONE
This is mostly a port of dbf62d5006 ("ARM: mm: introduce L_PTE_VALID
for page table entries") and 26ffd0d43b ("ARM: mm: introduce present,
faulting entries for PAGE_NONE") from ARM, which makes use of present,
faulting page table entries for page table entries mapped as PROT_NONE.

The main difference with this implementation is that we can make use of
the two pte type bits in order to avoid allocating a software bit for
identifying PROT_NONE pages, instead reserving the 10b suffix for these
types of mappings.

This is required to prevent users from accessing such pages via syscalls
such as read/write over a pipe.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-10 10:48:48 +00:00
Will Deacon 02522463c8 arm64: mm: only wrprotect clean ptes if they are present
Marking non-present ptes as read-only can corrupt file ptes, breaking
things like swap and file mappings.

This patch ensures that we only manipulate user pte bits when the pte
is marked present.

Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-01-10 10:43:44 +00:00
Linus Torvalds 97ebe8f55a Main AArch64 changes:
- Generic execve, kernel_thread, fork/vfork/clone.
 - Preparatory patches for KVM support (initialising EL2 mode for later
   installing KVM support, hypervisor stub).
 - Signal handling corner case fix (alternative signal stack set up for a
   SEGV handler, which is raised in response to RLIMIT_STACK being
   reached).
 - Sub-nanosecond timer error fix.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iQIcBAABAgAGBQJQx1TPAAoJEGvWsS0AyF7xSrEP/R7KPhKSKIJKW0n3nP/uGe5g
 isUiTM+y4koGzeHShao8I7VUXZLJptYHiviy12Rf0S/IK0L25P1p29ABLndd8SPB
 5lqQehLz35bAIzmRXypvz4szpCwlRXPzEcHX7cnid0Nv27A9hVpfssYM2HIKLIJN
 1AXZAxjlNmPHCc+hd+QOnP8d7h6KGiZWqiC1lsuU12Ma4oZIwiS225oxUdMg5d4I
 AxfWAvVLy14eNxDRqBgA0W2Jxe62TD82LrgD4tP88mbwWsFIyE5dea2yYShOJnBe
 mwLWw4Jovfe5VLSn00yggqM5JPp36sM/7Bka5EZaGKY2HllVtSwqnshUChG3fw3/
 fepN4nB0L8lPgTMfQAUjNKqZWgt2vwIGC+7GLX+Sg6/kOidRaxsQgU710gNvceZu
 E417RTtW4WM8IA+euCTiq3huJt7iOt8APSblpPWnrf8M7ntJKV4ESTOhtN30mR2D
 ZYeMZp1DYrET3Pxkd+bMdaRYGhMqAlpfCF096H+A4FscicbDLC+KincWtW/YpOXE
 voWDxE/Rd+3nAhCVL+A2HUSw9lNddsFvxRR9hQWfQ3uvMiDp7AS6O4EAYcK60GiA
 YsEnksMQr/ksscNf/7nvpY6DBNkeuZjj9IGfbFYVqWZ80f//8NEoJCNDzNPlATHU
 ddPpD5ZayUQ3UUMulQGg
 =fQ+2
 -----END PGP SIGNATURE-----

Merge tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull ARM64 updates from Catalin Marinas:

 - Generic execve, kernel_thread, fork/vfork/clone.

 - Preparatory patches for KVM support (initialising EL2 mode for later
   installing KVM support, hypervisor stub).

 - Signal handling corner case fix (alternative signal stack set up for
   a SEGV handler, which is raised in response to RLIMIT_STACK being
   reached).

 - Sub-nanosecond timer error fix.

* tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (30 commits)
  arm64: Update the MAINTAINERS entry
  arm64: compat for clock_adjtime(2) is miswired
  arm64: move FP-SIMD save/restore code to a macro
  arm64: hyp: initialize vttbr_el2 to zero
  arm64: add hypervisor stub
  arm64: record boot mode when entering the kernel
  arm64: move vector entry macro to assembler.h
  arm64: add AArch32 execution modes to ptrace.h
  arm64: expand register mapping between AArch32 and AArch64
  arm64: generic timer: use virtual counter instead of physical at EL0
  arm64: vdso: defer shifting of nanosecond component of timespec
  arm64: vdso: rework __do_get_tspec register allocation and return shift
  arm64: vdso: check sequence counter even for coarse realtime operations
  arm64: vdso: fix clocksource mask when extracting bottom 56 bits
  ARM64: Remove incorrect Kconfig symbol HAVE_SPARSE_IRQ
  Documentation: Fixes a word in Documentation/arm64/memory.txt
  arm64: Make !dirty ptes read-only
  arm64: Convert empty flush_cache_{mm,page} functions to static inline
  arm64: signal: let the compiler inline compat_get_sigframe
  arm64: signal: return struct rt_sigframe from get_sigframe
  ...

Conflicts:
	arch/arm64/include/asm/unistd32.h
2012-12-12 07:49:02 -08:00
Catalin Marinas 33eaa58f85 arm64: Make !dirty ptes read-only
The AArch64 Linux port relies on the mm code to wrprotect clean ptes.
This however is not the case with newly created ptes and
PAGE_SHARED(_EXEC) is writable but !dirty.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
2012-11-29 15:32:13 +00:00
Catalin Marinas 8e620b0476 arm64: Distinguish between user and kernel XN bits
On AArch64, the meaning of the XN bit has changed to UXN (user). The PXN
(privileged) bit must be set to prevent kernel execution. Without the
PXN bit set, the CPU may speculatively access device memory. This patch
ensures that all the mappings that the kernel must not execute from
(including user mappings) have the PXN bit set.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2012-11-16 15:50:25 +00:00
Catalin Marinas 4f04d8f005 arm64: MMU definitions
The virtual memory layout is described in
Documentation/arm64/memory.txt. This patch adds the MMU definitions for
the 4KB and 64KB translation table configurations. The SECTION_SIZE is
2MB with 4KB page and 512MB with 64KB page configuration.

PHYS_OFFSET is calculated at run-time and stored in a variable (no
run-time code patching at this stage).

On the current implementation, both user and kernel address spaces are
512G (39-bit) each with a maximum of 256G for the RAM linear mapping.
Linux uses 3 levels of translation tables with the 4K page configuration
and 2 levels with the 64K configuration. Extending the memory space
beyond 39-bit with the 4K pages or 42-bit with 64K pages requires an
additional level of translation tables.

The SPARSEMEM configuration is global to all AArch64 platforms and
allows for 1GB sections with SPARSEMEM_VMEMMAP enabled by default.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2012-09-17 13:41:56 +01:00