Commit Graph

391 Commits

Author SHA1 Message Date
Linus Torvalds d34687ab97 ARC updates for 4.6-rc1
- Big Endian io accessors fix [Lada]
 - Spellos fixes [Adam]
 - Fix for DW GMAC breakage [Alexey]
 - Making DMA API 64-bit ready
 - Shutting up -Wmaybe-uninitialized noise for ARC
 - Other minor fixes here and there, comments update
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW738lAAoJEGnX8d3iisJeProP/icm32aIHY0QXmJCBXCmQfLa
 HHzfBeJ2KsG8pIRgrvraK3FJkmFr+WxZ7x6b5hPNYeHIT3c179/GZ3DlssM1md0u
 sa50o5jmwd/J4o5jCKpUB/hx7wiAjpC2CYb6qIg39A2Nq5JhOFJV30XMbCscXkLI
 ae/o8oATi1502cf1OQ2EqNWKfME4ogG1KsEUNrSzcd+1P8LZxsnEVBmXuPHVdHLw
 kTHVgmCELsEchaV/QY9pY+uHkm9Y4vV18v0vqbklwED+cHkjmXQ2UysP3/J8KXKN
 PVSqmtUJIS2vxDGK5mWvz6jkWmU8gRXoT14ZqdmMARmhVhp3+JTm2fQ53NUwZ+b2
 JpPNGWVQRi86AaiUE8Fm+eWjC242CAm+lsBfx+mvqWpEvFGMlnRKw8oZiyeJhhIw
 3M1yrulQG7QbTSuQrgQwfGqtrhl2nnq+X0uoMJXYHupNDQ42QK8wmJ9bT7cmutD0
 K3Tmi84qoiSnN/HhWK/D9d60bLGvUY4RKiLjAcJz7lbMjtRhT/rpFFcFYCIhJyZs
 y//jOZK67o1ecDXBTaUcvT+edOrQVsmatn3w0p9VwATe8OiKHsLA/0UD34gwiECy
 o9g/i4tc2GfOLFoLv66czXTU9IuoKDh3HrTJgET7r1Re/+FKgJ+2+GX6AbiJzbhY
 9jsAAI/ZpsS6qMhvSz3d
 =n0fk
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC architecture updates from Vineet Gupta:
 - Big Endian io accessors fix [Lada]
 - Spellos fixes [Adam]
 - Fix for DW GMAC breakage [Alexey]
 - Making DMA API 64-bit ready
 - Shutting up -Wmaybe-uninitialized noise for ARC
 - Other minor fixes here and there, comments update

* tag 'arc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (21 commits)
  ARCv2: ioremap: Support dynamic peripheral address space
  ARC: dma: reintroduce platform specific dma<->phys
  ARC: dma: ioremap: use phys_addr_t consistenctly in code paths
  ARC: dma: pass_phys() not sg_virt() to cache ops
  ARC: dma: non-coherent pages need V-P mapping if in HIGHMEM
  ARC: dma: Use struct page based page allocator helpers
  ARC: build: Turn off -Wmaybe-uninitialized for ARC gcc 4.8
  ARC: [plat-axs10x] add Ethernet PHY description in .dts
  arc: use of_platform_default_populate() to populate default bus
  ARC: thp: unbork !CONFIG_TRANSPARENT_HUGEPAGE build
  arc: [plat-nsimosci*] use ezchip network driver
  ARCv2: LLSC: software backoff is NOT needed starting HS2.1c
  ARC: mm: Use virt_to_pfn() for addr >> PAGE_SHIFT pattern
  ARC: [plat-nsim] document ranges
  ARC: build: Better way to detect ISA compatible toolchain
  ARCv2: Allow enabling PAE40 w/o HIGHMEM
  ARC: [BE] readl()/writel() to work in Big Endian CPU configuration
  ARC: [*defconfig] No need to specify CONFIG_CROSS_COMPILE
  ARC: [BE] Select correct CROSS_COMPILE prefix
  ARC: bitops: Remove non relevant comments
  ...
2016-03-21 13:00:46 -07:00
Linus Torvalds 1200b6809d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support more Realtek wireless chips, from Jes Sorenson.

   2) New BPF types for per-cpu hash and arrap maps, from Alexei
      Starovoitov.

   3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

   4) Allow the use of SO_REUSEPORT in order to do per-thread processing
   of incoming TCP/UDP connections.  The muxing can be done using a
   BPF program which hashes the incoming packet.  From Craig Gallek.

   5) Add a multiplexer for TCP streams, to provide a messaged based
      interface.  BPF programs can be used to determine the message
      boundaries.  From Tom Herbert.

   6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

   7) Avoid factorial complexity when taking down an inetdev interface
      with lots of configured addresses.  We were doing things like
      traversing the entire address less for each address removed, and
      flushing the entire netfilter conntrack table for every address as
      well.

   8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

   9) Allow offloading u32 classifiers to hardware, and implement for
      ixgbe, from John Fastabend.

  10) Allow configuring IRQ coalescing parameters on a per-queue basis,
      from Kan Liang.

  11) Extend ethtool so that larger link mode masks can be supported.
      From David Decotigny.

  12) Introduce devlink, which can be used to configure port link types
      (ethernet vs Infiniband, etc.), port splitting, and switch device
      level attributes as a whole.  From Jiri Pirko.

  13) Hardware offload support for flower classifiers, from Amir Vadai.

  14) Add "Local Checksum Offload".  Basically, for a tunneled packet
      the checksum of the outer header is 'constant' (because with the
      checksum field filled into the inner protocol header, the payload
      of the outer frame checksums to 'zero'), and we can take advantage
      of that in various ways.  From Edward Cree"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  net/mlx4_core: Fix backward compatibility on VFs
  phy: mdio-thunder: Fix some Kconfig typos
  lan78xx: add ndo_get_stats64
  lan78xx: handle statistics counter rollover
  RDS: TCP: Remove unused constant
  RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
  net: smc911x: convert pxa dma to dmaengine
  team: remove duplicate set of flag IFF_MULTICAST
  bonding: remove duplicate set of flag IFF_MULTICAST
  net: fix a comment typo
  ethernet: micrel: fix some error codes
  ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
  bpf, dst: add and use dst_tclassid helper
  bpf: make skb->tc_classid also readable
  net: mvneta: bm: clarify dependencies
  cls_bpf: reset class and reuse major in da
  ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
  ldmvsw: Add ldmvsw.c driver code
  ...
2016-03-19 10:05:34 -07:00
Vineet Gupta deaf7565eb ARCv2: ioremap: Support dynamic peripheral address space
The peripheral address space is architectural address window which is
uncached and typically used to wire up peripherals.

For ARC700 cores (ARCompact ISA based) this was fixed to 1GB region
0xC000_0000 - 0xFFFF_FFFF.

For ARCv2 based HS38 cores the start address is flexible and can be
0xC, 0xD, 0xE, 0xF 000_000 by programming AUX_NON_VOLATILE_LIMIT reg
(typically done in bootloader)

Further in cas of PAE, the physical address can extend beyond 4GB so
need to confine this check, otherwise all pages beyond 4GB will be
treated as uncached

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19 14:34:10 +05:30
Vineet Gupta f2e3d55397 ARC: dma: reintroduce platform specific dma<->phys
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19 14:34:09 +05:30
Vineet Gupta f5db19e93f ARC: dma: ioremap: use phys_addr_t consistenctly in code paths
To support dma in physical memory beyond 4GB with PAE40

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-19 14:34:09 +05:30
Vineet Gupta 01609ec2fa ARC, thp: remove infrastructure for handling splitting PMDs
With THP refcounting work, no need to mark PMDs splitting.

(ARC got missed under the sweeping arch change as THP support was likely
not present in orig baseline)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Vineet Gupta c511eaaa78 ARC: thp: unbork !CONFIG_TRANSPARENT_HUGEPAGE build
linux-next for 4.6-rc1 timeline reported ARC build failures !THP

| arch/arc/include/asm/tlbflush.h:29:0: warning: "flush_pmd_tlb_range" redefined [enabled by default]
| arch/arc/include/asm/tlbflush.h:29:0: warning: "flush_pmd_tlb_range" redefined [enabled by default]
| arch/arc/include/asm/tlbflush.h:29:0: warning: "flush_pmd_tlb_range" redefined [enabled by default]

Turns out that commit ("mm/thp/migration: switch from flush_tlb_range
to flush_pmd_tlb_range") triggered the issue while the problem was in
ARC code where THP specific helpers were not guarded with #ifdef.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-17 15:31:45 +05:30
Linus Torvalds 63e30271b0 PCI changes for the v4.6 merge window:
Enumeration
     Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas)
     Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas
 
   Resource management
     Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
     Don't assign or reassign immutable resources (Bjorn Helgaas)
     Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas)
     Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas)
     Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas)
     ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas)
     ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
     MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
     Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas)
     Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas)
     rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
     designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
 
   Virtualization
     Wait for up to 1000ms after FLR reset (Alex Williamson)
     Support SR-IOV on any function type (Kelly Zytaruk)
     Add ACS quirk for all Cavium devices (Manish Jaggi)
 
   AER
     Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas)
     Restore pci_ops pointer while calling original pci_ops (David Daney)
     Fix aer_inject error codes (Jean Delvare)
     Use dev_warn() in aer_inject (Jean Delvare)
     Log actual error causes in aer_inject (Jean Delvare)
     Log aer_inject error injections (Jean Delvare)
 
   VPD
     Prevent VPD access for buggy devices (Babu Moger)
     Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas)
     Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas)
     Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas)
     Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas)
     Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas)
     Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas)
     Update VPD definitions (Hannes Reinecke)
     Allow access to VPD attributes with size 0 (Hannes Reinecke)
     Determine actual VPD size on first access (Hannes Reinecke)
 
   Generic host bridge driver
     Move structure definitions to separate header file (David Daney)
     Add pci_host_common_probe(), based on gen_pci_probe() (David Daney)
     Expose pci_host_common_probe() for use by other drivers (David Daney)
 
   Altera host bridge driver
     Fix altera_pcie_link_is_up() (Ley Foon Tan)
 
   Cavium ThunderX host bridge driver
     Add PCIe host driver for ThunderX processors (David Daney)
     Add driver for ThunderX-pass{1,2} on-chip devices (David Daney)
 
   Freescale i.MX6 host bridge driver
     Add DT bindings to configure PHY Tx driver settings (Justin Waters)
     Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach)
     Move PHY reset into imx6_pcie_establish_link() (Lucas Stach)
     Remove broken Gen2 workaround (Lucas Stach)
     Move link up check into imx6_pcie_wait_for_link() (Lucas Stach)
 
   Freescale Layerscape host bridge driver
     Add "fsl,ls2085a-pcie" compatible ID (Yang Shi)
 
   Intel VMD host bridge driver
     Attach VMD resources to parent domain's resource tree (Jon Derrick)
     Set bus resource start to 0 (Keith Busch)
 
   Microsoft Hyper-V host bridge driver
     Add fwnode_handle to x86 pci_sysdata (Jake Oshins)
     Look up IRQ domain by fwnode_handle (Jake Oshins)
     Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins)
 
   NVIDIA Tegra host bridge driver
     Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding)
     Implement ->{add,remove}_bus() callbacks (Thierry Reding)
     Remove unused struct tegra_pcie.num_ports field (Thierry Reding)
     Track bus -> CPU mapping (Thierry Reding)
     Remove misleading PHYS_OFFSET (Thierry Reding)
 
   Renesas R-Car host bridge driver
     Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman)
 
   Synopsys DesignWare host bridge driver
     ARC: Add PCI support (Joao Pinto)
     Add generic dw_pcie_wait_for_link() (Joao Pinto)
     Add default link up check if sub-driver doesn't override (Joao Pinto)
     Add driver for prototyping kits based on ARC SDP (Joao Pinto)
 
   TI Keystone host bridge driver
     Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin)
 
   Xilinx AXI host bridge driver
     Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada)
     Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada)
     Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada)
     Update Zynq binding with Microblaze node (Bharat Kumar Gogada)
     microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada)
 
   Xilinx NWL host bridge driver
     Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada)
 
   Miscellaneous
     Check device_attach() return value always (Bjorn Helgaas)
     Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas)
     Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas)
     ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas)
     Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas)
     Remove includes of asm/pci-bridge.h (Bjorn Helgaas)
     Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas)
     unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas)
     Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler)
     Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas)
     Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa)
     frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig)
     Move pci_dma_* helpers to common code (Christoph Hellwig)
     Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus)
     Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson)
     Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW6XgMAAoJEFmIoMA60/r8Yq4P/1nNwwZPikU+9Z8k0HyGPll6
 vqXBOYj/wlbAxJTzH2weaoyUamFrwvsKaO3Vap3xHkAeTFPD/Dp0TipCCNMrZ82Z
 j1y83JJpenkRyX6ifLARCNYpOtvnvgzSrO9x7Sb2Xfqb64dPb7+jGAfOpGNzhKsO
 n1nj/L7RGx8Q6fNFGf8ANMXKTsdkdL+1pdwegjUXmD5WdOT+oW8DmqVbhyfSKwl0
 E8r4Ml2lIg7Qd5Wu5iKMIBsR0+5HEyrwV7ch92wXChwKfoRwG70qnn7FGdc0y5ZB
 XvJuj8UD5UeMxEUeoRa9SwU6wWQT3Q9e6BzMS+P+43z36SPYjMfy/Xffv054z/bY
 rQomLjuGxNLESpmfNK5JfKxWoe2YNXjHQIDWMrAHyNlwdKJbYiwPcxnZJhvOa/eB
 p0QYcGS7O43STjibG9PZhzeq8tuSJRshxi0W6iB9QlqO8qs8nJQxIO+sZj/vl4yz
 lSnswWcV9062KITl8Fe9xDw244/RTz1xSVCdldlSoDhJyeMOjRvzS8raUMyyVmbA
 YULsI3l2iCl+fwDm/T21o7hJG966oYdAmgEv7lc7BWfgEAMg//LZXvMzVvrPFB2D
 R77u/0idtOciVJrmnO/x9DnQO2hzro9SLmVH6m0+0YU4wSSpZfGn98PCrtkatOAU
 c8zT9dJgyJVE3Z7cnPJ4
 =otsF
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for v4.6:

  Enumeration:
   - Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas)
   - Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas

  Resource management:
   - Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
   - Don't assign or reassign immutable resources (Bjorn Helgaas)
   - Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas)
   - Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas)
   - Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas)
   - ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas)
   - ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
   - MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
   - Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas)
   - Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas)
   - rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
   - designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)

  Virtualization:
   - Wait for up to 1000ms after FLR reset (Alex Williamson)
   - Support SR-IOV on any function type (Kelly Zytaruk)
   - Add ACS quirk for all Cavium devices (Manish Jaggi)

  AER:
   - Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas)
   - Restore pci_ops pointer while calling original pci_ops (David Daney)
   - Fix aer_inject error codes (Jean Delvare)
   - Use dev_warn() in aer_inject (Jean Delvare)
   - Log actual error causes in aer_inject (Jean Delvare)
   - Log aer_inject error injections (Jean Delvare)

  VPD:
   - Prevent VPD access for buggy devices (Babu Moger)
   - Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas)
   - Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas)
   - Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas)
   - Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas)
   - Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas)
   - Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas)
   - Update VPD definitions (Hannes Reinecke)
   - Allow access to VPD attributes with size 0 (Hannes Reinecke)
   - Determine actual VPD size on first access (Hannes Reinecke)

  Generic host bridge driver:
   - Move structure definitions to separate header file (David Daney)
   - Add pci_host_common_probe(), based on gen_pci_probe() (David Daney)
   - Expose pci_host_common_probe() for use by other drivers (David Daney)

  Altera host bridge driver:
   - Fix altera_pcie_link_is_up() (Ley Foon Tan)

  Cavium ThunderX host bridge driver:
   - Add PCIe host driver for ThunderX processors (David Daney)
   - Add driver for ThunderX-pass{1,2} on-chip devices (David Daney)

  Freescale i.MX6 host bridge driver:
   - Add DT bindings to configure PHY Tx driver settings (Justin Waters)
   - Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach)
   - Move PHY reset into imx6_pcie_establish_link() (Lucas Stach)
   - Remove broken Gen2 workaround (Lucas Stach)
   - Move link up check into imx6_pcie_wait_for_link() (Lucas Stach)

  Freescale Layerscape host bridge driver:
   - Add "fsl,ls2085a-pcie" compatible ID (Yang Shi)

  Intel VMD host bridge driver:
   - Attach VMD resources to parent domain's resource tree (Jon Derrick)
   - Set bus resource start to 0 (Keith Busch)

  Microsoft Hyper-V host bridge driver:
   - Add fwnode_handle to x86 pci_sysdata (Jake Oshins)
   - Look up IRQ domain by fwnode_handle (Jake Oshins)
   - Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins)

  NVIDIA Tegra host bridge driver:
   - Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding)
   - Implement ->{add,remove}_bus() callbacks (Thierry Reding)
   - Remove unused struct tegra_pcie.num_ports field (Thierry Reding)
   - Track bus -> CPU mapping (Thierry Reding)
   - Remove misleading PHYS_OFFSET (Thierry Reding)

  Renesas R-Car host bridge driver:
   - Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman)

  Synopsys DesignWare host bridge driver:
   - ARC: Add PCI support (Joao Pinto)
   - Add generic dw_pcie_wait_for_link() (Joao Pinto)
   - Add default link up check if sub-driver doesn't override (Joao Pinto)
   - Add driver for prototyping kits based on ARC SDP (Joao Pinto)

  TI Keystone host bridge driver:
   - Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin)

  Xilinx AXI host bridge driver:
   - Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada)
   - Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada)
   - Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada)
   - Update Zynq binding with Microblaze node (Bharat Kumar Gogada)
   - microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada)

  Xilinx NWL host bridge driver:
   - Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada)

  Miscellaneous:
   - Check device_attach() return value always (Bjorn Helgaas)
   - Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas)
   - Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas)
   - ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas)
   - Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas)
   - Remove includes of asm/pci-bridge.h (Bjorn Helgaas)
   - Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas)
   - unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas)
   - Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler)
   - Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas)
   - Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa)
   - frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig)
   - Move pci_dma_* helpers to common code (Christoph Hellwig)
   - Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus)
   - Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson)
   - Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)"

* tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits)
  PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
  PCI: designware: Add driver for prototyping kits based on ARC SDP
  PCI: designware: Add default link up check if sub-driver doesn't override
  PCI: designware: Add generic dw_pcie_wait_for_link()
  PCI: Cleanup pci/pcie/Kconfig whitespace
  PCI: Simplify pci_create_attr() control flow
  PCI: Don't leak memory if sysfs_create_bin_file() fails
  PCI: Simplify sysfs ROM cleanup
  PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
  MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
  MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
  ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
  ia64/PCI: Use ioremap() instead of open-coded equivalent
  ia64/PCI: Use temporary struct resource * to avoid repetition
  PCI: Clean up pci_map_rom() whitespace
  PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
  PCI: thunder: Add driver for ThunderX-pass{1,2} on-chip devices
  PCI: thunder: Add PCIe host driver for ThunderX processors
  PCI: generic: Expose pci_host_common_probe() for use by other drivers
  PCI: generic: Add pci_host_common_probe(), based on gen_pci_probe()
  ...
2016-03-16 14:45:55 -07:00
Alexander Duyck 01cfbad79a ipv4: Update parameters for csum_tcpudp_magic to their original types
This patch updates all instances of csum_tcpudp_magic and
csum_tcpudp_nofold to reflect the types that are usually used as the source
inputs.  For example the protocol field is populated based on nexthdr which
is actually an unsigned 8 bit value.  The length is usually populated based
on skb->len which is an unsigned integer.

This addresses an issue in which the IPv6 function csum_ipv6_magic was
generating a checksum using the full 32b of skb->len while
csum_tcpudp_magic was only using the lower 16 bits.  As a result we could
run into issues when attempting to adjust the checksum as there was no
protocol agnostic way to update it.

With this change the value is still truncated as many architectures use
"(len + proto) << 8", however this truncation only occurs for values
greater than 16776960 in length and as such is unlikely to occur as we stop
the inner headers at ~64K in size.

I did have to make a few minor changes in the arm, mn10300, nios2, and
score versions of the function in order to support these changes as they
were either using things such as an OR to combine the protocol and length,
or were using ntohs to convert the length which would have truncated the
value.

I also updated a few spots in terms of whitespace and type differences for
the addresses.  Most of this was just to make sure all of the definitions
were in sync going forward.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13 23:55:13 -04:00
Vineet Gupta c2ff5cf273 ARC: mm: Use virt_to_pfn() for addr >> PAGE_SHIFT pattern
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 11:59:47 +05:30
Vineet Gupta 20d780374c ARC: build: Better way to detect ISA compatible toolchain
ARC architecture has 2 instruction sets: ARCompact/ARCv2.
While same gcc supports compiling for either (using appropriate toggles),
we can't use the same toolchain to build kernel because libgcc needs
to be unique and the toolchian (uClibc based) is not multilibed.

uClibc toolchain is convenient since it allows all userspace and
kernel to be built with a single install for an ISA.

This however means 2 gnu installs (with same triplet prefix) are needed
for building for 2 ISA and need to be in PATH.
As developers we keep switching the builds, but would occassionally fail
to update the PATH leading to usage of wrong tools. And this would only
show up at the end of kernel build when linking incompatible libgcc.

So the initial solution was to have gcc define a special preprocessor macro
DEFAULT_CPU_xxx which is unique for default toolchain configuration.
Claudiu proposed using grep for an existing preprocessor macro which is
again uniquely defined per ISA.

Cc: Michal Marek <mmarek@suse.cz>
Suggested-by: Claudiu Zissulescu <claziss@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 11:58:30 +05:30
Lada Trimasova f778cc6571 ARC: [BE] readl()/writel() to work in Big Endian CPU configuration
read{l,w}() write{l,w}() primitives should use le{16,32}_to_cpu() and
cpu_to_le{16,32}() respectively to ensure device registers are read
correctly in Big Endian CPU configuration.

Per Arnd Bergmann
| Most drivers using readl() or readl_relaxed() expect those to perform byte
| swaps on big-endian architectures, as the registers tend to be fixed endian

This was needed for getting UART to work correctly on a Big Endian ARC.

The ARC accessors originally were fine, and the bug got introduced
inadventently by commit b8a0330239 ("ARCv2: barriers")

Fixes: b8a0330239 ("ARCv2: barriers")
Link: http://lkml.kernel.org/r/201603100845.30602.arnd@arndb.de
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org  [4.2+]
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
[vgupta: beefed up changelog, added Fixes/stable tags]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-12 09:46:45 +05:30
Vineet Gupta 2a41b6dc28 ARC: bitops: Remove non relevant comments
commit 80f420842f removed the ARC bitops microoptimization but failed
to prune the comments to same effect

Fixes: 80f420842f ("ARC: Make ARC bitops "safer" (add anti-optimization)")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11 14:59:54 +05:30
Adam Buchbinder 7423cc0cae ARC: Fix misspellings in comments.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-11 14:59:53 +05:30
Joao Pinto c1678ffcde ARC: Add PCI support
Add PCI support to ARC and update drivers/pci Makefile enabling the ARC
arch to use the generic PCI setup functions.

[bhelgaas: fold in Joao's pci-dma-compat.h & pci-bridge.h build fix (I
should have caught this myself, sorry]
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
2016-03-10 14:44:13 -06:00
Chen Gang 4e0b6ca9da asm-generic: page.h: Remove useless get_user_page and free_user_page
They are not symmetric with each other, neither are used in real world
(can not be found by grep command in source code root directory), so
remove them.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-02-26 15:24:55 +01:00
Vineet Gupta 9681787930 ARCv2: SMP: Push IPI_IRQ into IPI provider
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24 11:07:31 +05:30
Vineet Gupta dbcbc7e7ce ARC: [intc-compact] Remove IPI setup from ARCompact port
There is no real ARC700 based SMP SoC so remove IPI definition.
EZChip's SMP ARC700 is going to use a different intc and IPI provider
anyways.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24 11:07:31 +05:30
Vineet Gupta bb143f814e ARCv2: SMP: Emulate IPI to self using software triggered interrupt
ARConnect/MCIP Inter-Core-Interrupt module can't send interrupt to
local core. So use core intc capability to trigger software
interrupt to self, using an unsued IRQ #21.

This showed up as csd deadlock with LTP trace_sched on a dual core
system. This test acts as scheduler fuzzer, triggering all sorts of
schedulting activity. Trouble starts with IPI to self, which doesn't get
delivered (effectively lost due to H/w capability), but the msg intended
to be sent remain enqueued in per-cpu @ipi_data.

All subsequent IPIs to this core from other cores get elided due to the
IPI coalescing optimization in ipi_send_msg_one() where a pending msg
implies an IPI already sent and assumes other core is yet to ack it.
After the elided IPI, other core simply goes into csd_lock_wait()
but never comes out as this core never sees the interrupt.

Fixes STAR 9001008624

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>        [4.2]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24 11:07:28 +05:30
Vineet Gupta a150b085b6 ARCv2: boot report CCMs (Closely Coupled Memories)
- ARCv2 uses a seperate BCR for {I,D}CCM base address:
  ARCompact encoded both base/size in same BCR

- Size encoding in common BCR is different for ARCompact/ARCv2

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-18 17:30:18 +05:30
Linus Torvalds e835a65f7a ARC fixes for 4.5
- Corner case of returning to delay slot from interrupt
 - Changing default interrupt prioiry level
 - Kconfig'ize support for super pages
 - Other minor fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWvxdUAAoJEGnX8d3iisJeRkYP/2HZAt4J6c5MPk/NSy8rabVX
 2bB1m5jYXlBmJAIsmWm+WcDL72MdrB1Owtc5tEN+hIoQQa2QQpxolp32IslHg0o8
 C9CCzmF+iR8wz3caVk3javpsbze23XbHho/kdx/l2Ed3Fi+syI/9jF1GiboydRtR
 X22an1lslA6Y44pYxFmSFcMCv7XclFkJNe1ltxsgN9/QapnNrE/HWqUIy+SMr2Oo
 Tpo3m/Dc+IfMMejYyupc3keyAhyeux69lJXPuOzYiurgGUIyXz15Un2mQ9gZWf0u
 W56L/55VpQVuah46qrp5CBTLmdJA5cBqr0F8RqmZAqrEYLgn5SD4IhDjamo1qsP/
 FfFh0cG955SoEyCsUOPILWUFR5TeS4rJK+ZJjErUb+dwEC1BWZR0/Dn1s9KJN8b7
 GgGV8yXruDACFlFnCqnlxVs1TKOPOUqD2NZRAdsKunp+ywNrvGdD43xWONcriyvr
 2KW0nb+mH3RRk8HQzKjfqsVhLMoR7n1MD/+tg8ME8usLn1ik0hBerT56CX0Wh/yQ
 VnOUX6xqlaRydeJJgCUyByz3+jJVvj8sk/VZbr19F0p9id6wpiPQeNus2AcoHFKW
 OyvWcfxzqKegXrYtMsy8IoFzx73zJaXV3ht0I09rhAj3JkdF7vFEIUpKIhsWqxAK
 yWKKqLcVKga/2Yc8jduI
 =FNDd
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "I've been sitting on some of these fixes for a while.

   - Corner case of returning to delay slot from interrupt
   - Changing default interrupt prioiry level
   - Kconfig'ize support for super pages
   - Other minor fixes"

* tag 'arc-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: mm: Introduce explicit super page size support
  ARCv2: intc: Allow interruption by lowest priority interrupt
  ARCv2: Check for LL-SC livelock only if LLSC is enabled
  ARC: shrink cpuinfo by not saving full timer BCR
  ARCv2: clocksource: Rename GRTC -> GFRC ...
  ARCv2: STAR 9000950267: Handle return from intr to Delay Slot #2
2016-02-13 08:18:21 -08:00
Vineet Gupta 37eda9df5b ARC: mm: Introduce explicit super page size support
MMUv4 supports 2 concurrent page sizes: Normal and Super [4K to 16M]

So far Linux supported a single super page size for a given Normal page,
depending on the software page walking address split.
e.g. we had 11:8:13 address split for 8K page, which meant super page
was 2 ^(8+13) = 2M (given that THP size has to be PMD_SHIFT)

Now we turn this around, by allowing multiple Super Pages in Kconfig
(currently 2M and 16M only) and forcing page walker address split to
PGDIR_SHIFT and PAGE_SHIFT

For configs without Super page, things are same as before and
PGDIR_SHIFT can be hacked to get non default address split

The motivation for this change is a customer who needs 16M super page
and a 8K Normal page combo.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-12 12:10:25 +05:30
Vineet Gupta dec2b2849c ARCv2: intc: Allow interruption by lowest priority interrupt
ARC HS Cores support configurable multiple interrupt priorities of upto
16 levels.

There is processor "interrupt preemption threshhold" in STATUS32.E[4:1]
And several places need to set this up:
1. seed value as kernel is booting
2. seed value for user space programs
3. Arg to SLEEP instruction in idle task (what interrupt prio can wake)
4. Per-IRQ line prioirty (i.e. what is the priority of interrupt
   raised by a peripheral or timer or perf counter...

Currently above sites use the highest priority 0. This can be potential
problem when multiple priorities are supported. e.g. user space could
only be interrupted by P0 interrupt, not others...
So turn this over and instead make default interruption level to be
the lowest priority possible 15. This should be fine even if there are
fewer priority levels configured (say two: P0 HIGH, P1 LOW)

This feature also effectively disables FIRQ feature if present in
hardware config. With old code, a P0 interrupt would be FIRQ, needing
special handling (ISR or Register Banks) which is NOT supported yet.
Now it not be P0 (P15 or whatever is lowest prio) so FIRQ is not
triggered.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-10 06:38:50 +05:30
Vineet Gupta b89bd1f4fb ARC: shrink cpuinfo by not saving full timer BCR
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-01-29 16:51:03 +05:30
Vineet Gupta d584f0fb04 ARCv2: clocksource: Rename GRTC -> GFRC ...
... it is now called Global Free Running Counter

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-01-29 16:51:02 +05:30
Christoph Hellwig e1c7e32453 dma-mapping: always provide the dma_map_ops based implementation
Move the generic implementation to <linux/dma-mapping.h> now that all
architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
that everyone supports them.

[valentinrothberg@gmail.com: remove leftovers in Kconfig]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Christoph Hellwig 052c96dbe3 arc: convert to dma_map_ops
[vgupta@synopsys.com: ARC: dma mapping fixes #2]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Carlos Palminha <CARLOS.PALMINHA@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Alexey Brodkin 4b32e89af7 ARC: mm: fix building for MMU v2
ARC700 cores with MMU v2 don't have IC_PTAG AUX register and so we only
define ARC_REG_IC_PTAG for MMU versions >= 3.

But current implementation of cache_line_loop_vX() routines assumes
availability of all of them (v2, v3 and v4) simultaneously.

And given undefined ARC_REG_IC_PTAG if CONFIG_MMU_VER=2 we're seeing
compilation problem:
---------------------------------->8-------------------------------
  CC      arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
   aux_tag = ARC_REG_IC_PTAG;
             ^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------

The simples fix is to have ARC_REG_IC_PTAG defined regardless MMU
version being used.

We don't use it in cache_line_loop_v2() anyways so who cares.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 12:10:40 +05:30
Vineet Gupta 575a9d4e2c ARC: smp: Rename platform hook @init_cpu_smp -> @init_per_cpu
Makes it similar to smp_ops which also has callback with same name

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 12:56:56 +05:30
Noam Camus b474a02382 ARC: rename smp operation init_irq_cpu() to init_per_cpu()
This will better reflect its description i.e. "any needed setup..."
and not just do an "IPI request".

Signed-off-by: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 12:56:43 +05:30
Vineet Gupta bc79c9a721 ARC: dw2 unwind: Reinstante unwinding out of modules
The fix which removed linear searching of dwarf (because binary lookup
data always exists) missed out on the fact that modules don't get the
binary lookup tables info. This caused unwinding out of modules to stop
working.

So add binary lookup header setup (equivalent of eh_frame_hdr setup) to
modules as well.

While at it, confine the header setup to within unwinder code,
reducing one API exposed out of unwinder code.

Fixes: 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 11:10:23 +05:30
Vineet Gupta b8628f3fe4 ARCv2: Use the default irq priority for idle sleep
Although kernel doesn't support the multiple IRQ priority levels provided
by HS38x core intc yet, ensure that the default prio value is used
anyways by relevant code.

SLEEP insn needs to be provided the IRQ priority level which can
interrupt it. This needs to be the default level which maynot
necessarily be 0 as assumed by current code.

This change allows a kernel with ARCV2_IRQ_DEF_PRIO = 1 to boot fine.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-16 14:17:06 +05:30
Vineet Gupta 512b5b89b9 ARC: Abstract out ISA specific SLEEP args
No semantical changes, prepares for ARCv2 specific change in next commit

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-16 14:17:02 +05:30
Linus Torvalds b3a0d9a232 ARC fixes for 4.4-rc1
- A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build failure)
 - cpu_relax() now compiler barrier for UP as well
 - Handling of userspace Bus Errors for ARCompact builds
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWRucxAAoJEGnX8d3iisJehaMP/RBYry4TGSSYDC7DqkbpTDlv
 Wd/HE3HDNC0r6hqLXMO2MBLWLvK22pJPcGzXkXtw8kqwTYtKWjZ0IE3R72Lgfw4P
 xlWKdjeI4RXkzJKOZhh6/7HVTkOSto4cUAcdwFcq4ec8asJBAwl/zWoM2Rw8PdfG
 A11iAZTHUSavj/QkFpuqhtvRMRUe76cK41RvdoJfOtB9MjRF3XD0+ceXeDTYuoRb
 aETH42JS5XjRvGShcvaUOCKDZcxlsPyd9LZZAzrLLIoepb7pBluh+YVJH3wJXBcl
 ECjXSprv6GUR1C8R7G3lMtGwIt2tBINTuxH/ZVQp2pUKIUW/TL8MXEQGvWfrosXL
 SbgsIYQSTuc9aO5c5qZ7MSEWz+hLDVlrgWZzs7FsNH7wxREQgl0hjsr382X0Y4n0
 tWPVMr0Hvu7rLdH0gvxIXA3rF92Q9kfLTXj2flaiwXgCxoOoeQj/CtqhUs1xbzed
 qJXf9bwWsjhexxwvHi9CJpyYaVFDp2kkxMoJvzvsLT9cOUGC0XKHnCT4Q96VE5Ms
 9bVBngAugeZRNqB8vKi/84oU8A1Sq+KoTk6b87Z/wpzkh06tmsMQrOEbzoZQsDh9
 6yPW7hgYb794apY9oKwTshHZzXDPg8J/+SVzKht8f84YtSbzS476K70PqnFeoUkD
 2W7IeYKaUolgt3n3BoOO
 =isVd
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "Found a couple of brown paper bag bugs with the prev pull request
  (including a SMP build breakage report from Guenter).  Since these are
  urgent I also decided to send over a bunch of other pending fixes
  which could have otherwise waited an rc or two.

  Summary:

   - A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build
     failure)
   - cpu_relax() now compiler barrier for UP as well
   - handling of userspace Bus Errors for ARCompact builds"

* tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: Fix silly typo in MAINTAINERS file
  ARC: cpu_relax() to be compiler barrier even for UP
  ARC: use ASL assembler mnemonic
  ARC: [arcompact] Handle bus error from userspace as Interrupt not exception
  ARC: remove extraneous header include
  ARCv2: lib: memcpy: use local symbols
2015-11-14 09:09:37 -08:00
Vineet Gupta 1cfc05cbe2 ARC: cpu_relax() to be compiler barrier even for UP
cpu_relax() on ARC has been barrier only for SMP (and no-op for UP). Per
recent discussions, it is safer to make it a compiler barrier
unconditionally.

Link: http://lkml.kernel.org/r/53A7D3AA.9020100@synopsys.com
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:12:30 +05:30
Linus Torvalds d63a978865 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - More gradual enhancements to atomic ops: new atomic*_read_ctrl()
     ops, synchronize atomic_{read,set}() ordering requirements between
     architectures, add atomic_long_t bitops.  (Peter Zijlstra)

   - Add _{relaxed|acquire|release}() variants for inc/dec atomics and
     use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
     This enables weakly ordered architectures (such as arm64) to make
     use of more locking related optimizations.  (Davidlohr Bueso)

   - Implement atomic[64]_{inc,dec}_relaxed() on ARM.  (Will Deacon)

   - Futex kernel data cache footprint micro-optimization.  (Rasmus
     Villemoes)

   - pvqspinlock runtime overhead micro-optimization.  (Waiman Long)

   - misc smaller fixlets"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
  locking/rwsem: Use acquire/release semantics
  locking/mcs: Use acquire/release semantics
  locking/rtmutex: Use acquire/release semantics
  locking/mutex: Use acquire/release semantics
  locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
  atomic: Implement atomic_read_ctrl()
  atomic, arch: Audit atomic_{read,set}()
  atomic: Add atomic_long_t bitops
  futex: Force hot variables into a single cache line
  locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
  locking/osq: Relax atomic semantics
  locking/qrwlock: Rename ->lock to ->wait_lock
  locking/Documentation/lockstat: Fix typo - lokcing -> locking
  locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
2015-11-03 16:10:43 -08:00
Vineet Gupta 5a364c2a17 ARC: mm: PAE40 support
This is the first working implementation of 40-bit physical address
extension on ARCv2.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-29 18:41:30 +05:30
Vineet Gupta 28b4af729f ARC: mm: PAE40: switch to using phys_addr_t for physical addresses
That way a single flip of phys_addr_t to 64 bit ensures all places
dealing with physical addresses get correct data

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:29 +05:30
Vineet Gupta 45890f6d34 ARC: mm: HIGHMEM: kmap API implementation
Implement kmap* API for ARC.

This enables
 - permanent kernel maps (pkmaps): :kmap() API
 - fixmap : kmap_atomic()

We use a very simple/uniform approach for both (unlike some of the other
arches). So fixmap doesn't use the customary compile time address stuff.
The important semantic is sleep'ability (pkmap) vs. not (fixmap) which
the API guarantees.

Note that this patch only enables highmem for subsequent PAE40 support
as there is no real highmem for ARC in pure 32-bit paradigm as explained
below.

ARC has 2:2 address split of the 32-bit address space with lower half
being translated (virtual) while upper half unstranslated
(0x8000_0000 to 0xFFFF_FFFF). kernel itself is linked at base of
unstranslated space (i.e. 0x8000_0000 onwards), which is mapped to say
DDR 0x0 by external Bus Glue logic (outside the core). So kernel can
potentially access 1.75G worth of memory directly w/o need for highmem.
(the top 256M is taken by uncached peripheral space from 0xF000_0000 to
0xFFFF_FFFF)

In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while
the logical/virtual addresses remain 32-bits. Thus highmem is required
for kernel proper to be able to access these pages for it's own purposes
(user space is agnostic to this anyways).

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:49:04 +05:30
Vineet Gupta 336e2136e1 ARC: mm: preps ahead of HIGHMEM support
Before we plug in highmem support, some of code needs to be ready for it
 - copy_user_highpage() needs to be using the kmap_atomic API
 - mk_pte() can't assume page_address()
 - do_page_fault() can't assume VMALLOC_END is end of kernel vaddr space

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Alexey Brodkin d40846457f ARC: mm: use generic macros _BITUL()/_AC()
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Vineet Gupta aa0efcde45 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_smp()
MCIP now registers it's own per cpu setup routine (for IPI IRQ request)
using smp_ops.init_irq_cpu().

So no need for platforms to do that. This now completely decouples
platforms from MCIP.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta 286130ebf1 ARC: smp: Introduce smp hook @init_irq_cpu called for all cores
Note this is not part of platform owned static machine_desc,
but more of device owned plat_smp_ops (rather misnamed) which a IPI
provider or some such typically defines.

This will help us seperate out the IPI registration from platform
specific init_cpu_smp() into device specific init_irq_cpu()

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta 8721a7f5a6 ARC: smp: Rename platform hook @init_smp -> @init_cpu_smp
This conveys better that it is called for each cpu

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta 26b8f99623 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_early_smp()
MCIP now registers it's own probe callback with smp_ops.init_early_smp()
which is called by ARC common code, so no need for platforms to do that.

This decouples the platforms and MCIP and helps confine MCIP details
to it's own file.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta e55af4da02 ARC: smp: Introduce smp hook @init_early_smp for Master core
This adds a platform agnostic early SMP init hook which is called on
Master core before calling setup_processor()

  setup_arch()
     smp_init_cpus()
         smp_ops.init_early_smp()
     ...
     setup_processor()

How this helps:
 - Used for one time init of certain SMP centric IP blocks, before
   calling setup_processor() which probes various bits of core,
   possibly including this block

 - Currently platforms need to call this IP block init from their
   init routines, which doesn't make sense as this is specific to ARC
   core and not platform and otherwise requires copy/paste in all
   (and hence a possible point of failure)

e.g. MCIP init is called from 2 platforms currently (axs10x and sim)
which will go away once we have this.

This change only adds the hooks but they are empty for now. Next commit
will populate them and remove the explicit init calls from platforms.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta 4c82f28617 ARC: remove @init_time, @init_irq platform callbacks
These are not in use for ARC platforms. Moreover DT mechanims exist to
probe them w/o explicit platform calls.

 - clocksource drivers can use CLOCKSOURCE_OF_DECLARE()
 - intc IRQCHIP_DECLARE() calls + cascading inside DT allows external
   intc to be probed automatically

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta e0868e6f67 ARC: smp: irqchip: handle IPI as percpu irq like timer
The reason this was not done so far was lack of genuine IPI_IRQ for
ARC700, as we don't have a SMP version of core yet (which might change
soon thx to EZChip). Nevertheles to increase the build coverage, we
need to allow CONFIG_SMP for ARC700 and still be able to run it on a
UP platform (nsim or AXS101) with a UP Device Tree (SMP-on-UP)

The build itself requires some define for IPI_IRQ and even a dummy
value is fine since that code won't run anyways.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta d0890ea5b6 ARC: boot log: decode more mmu config items
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:26 +05:30
Vineet Gupta 964cf28f9d ARC: boot log: move helper macros to header for reuse
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:25 +05:30
Vineet Gupta b598e17f6a ARC: mm: compute TLB size as needed from ways * sets
This frees up some bits to hold more high level info such as PAE being
present, w/o increasing the size of already bloated cpuinfo struct

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:25 +05:30
Vineet Gupta 5c35ee642a ARC: make write_aux_reg safer against macro substitution
It was generating warnings when called as write_aux_reg(x, paddr >> 32)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:24 +05:30
Vineet Gupta 55a2ae775a ARC: [arcompact] entry.S: Improve early return from exception
The requirement is to
 - Reenable Exceptions (AE cleared)
 - Reenable Interrupts (E1/E2 set)

We need to do wiggle these bits into ERSTATUS and call RTIE.

Prev version used the pre-exception STATUS32 as starting point for what
goes into ERSTATUS. This required explicit fixups of U/DE/L bits.

Instead, use the current (in-exception) STATUS32 as starting point.
Being in exception handler U/DE/L can be safely assumed to be correct.
Only AE/E1/E2 need to be fixed.

So the new implementation is slightly better
 -Avoids read form memory
 -Is 4 bytes smaller for the typical 1 level of intr configuration
 -Depicts the semantics more clearly

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:22 +05:30
Vineet Gupta 9dbd3d9bfd ARC: [arcompact] don't check for hard isr calling local_irq_enable()
Historically this was done by ARC IDE driver, which is long gone.
IRQ core is pretty robust now and already checks if IRQs are enabled
in hard ISRs. Thus no point in checking this in arch code, for every
call of irq enabled.

Further if some driver does do that - let it bring down the system so we
notice/fix this sooner than covering up for sucker

This makes local_irq_enable() - for L1 only case atleast simple enough
so we can inline it.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:22 +05:30
Vineet Gupta c7119d56d2 ARCv2: mm: THP: flush_pmd_tlb_range make SMP safe
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:21 +05:30
Vineet Gupta 722fe8fd36 ARCv2: mm: THP: Implement flush_pmd_tlb_range() optimization
Implement the TLB flush routine to evict a sepcific Super TLB entry,
vs. moving to a new ASID on every such flush.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:21 +05:30
Vineet Gupta fe6c1b8611 ARCv2: mm: THP support
MMUv4 in HS38x cores supports Super Pages which are basis for Linux THP
support.

Normal and Super pages can co-exist (ofcourse not overlap) in TLB with a
new bit "SZ" in TLB page desciptor to distinguish between them.
Super Page size is configurable in hardware (4K to 16M), but fixed once
RTL builds.

The exact THP size a Linx configuration will support is a function of:
 - MMU page size (typical 8K, RTL fixed)
 - software page walker address split between PGD:PTE:PFN (typical
   11:8:13, but can be changed with 1 line)

So for above default, THP size supported is 8K * 256 = 2M

Default Page Walker is 2 levels, PGD:PTE:PFN, which in THP regime
reduces to 1 level (as PTE is folded into PGD and canonically referred
to as PMD).

Thus thp PMD accessors are implemented in terms of PTE (just like sparc)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:18 +05:30
Vineet Gupta 24830fc782 ARC: mm: Introduce PTE_SPECIAL
Needed for THP, but will also come in handy for fast GUP later

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:23 +05:30
Vineet Gupta 129cbed54a ARC: mm: pte flags comsetic cleanups, comments
No semantical changes

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Vineet Gupta e8a75963a4 ARC: mm: switch pgtable_to to pte_t *
ARC is the only arch with unsigned long type (vs. struct page *).
Historically this was done to avoid the page_address() calls in various
arch hooks which need to get the virtual/logical address of the table.

Some arches alternately define it as pte_t *, and is as efficient as
unsigned long (generated code doesn't change)

Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Ingo Molnar 82fc167c39 Linux 4.3-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWEUxnAAoJEHm+PkMAQRiGYCYH/3gtGkFdvSLi+E1PfI8Qk3ZA
 XuYA4Mj09JBVSmaICeueMTDVrdiq0OE0zPib26GWlF/za13kNU8KgMR3+6XCuLSX
 DiCmh6mwDItoNoSIIUERLqrFHABXz8rZ3gb3uu2+kNN74Cl0piNm1YpFclEEWjMr
 9Wk5fkq+ontnDVUQOvWUxPiUXOJTvdLXBWTRDw1yTdE3RMNwRI2d/hme6Hq++WYV
 tRalZZKQaoB33js9WRVAoLVunvtna+i+/y7VGLj8QyS0+d6ec81Hey2r1/fR/oG4
 bs4ul6vtqeb3IR/PjUqxF59pSrCLEO+qrp9KrTlJNYgr1m1QyjRxWUdy/XhyaWo=
 =gIhN
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rc4' into locking/core, to pick up fixes before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:10:28 +02:00
Linus Torvalds 30c44659f4 Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull strscpy string copy function implementation from Chris Metcalf.

Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.

The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.

strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result.  To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.

strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string.  Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated.  It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.

strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG.  It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.

So why did I waffle about this for so long?

Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.

And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.

So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches.  Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use global strscpy() rather than private copy
  string: provide strscpy()
  Make asm/word-at-a-time.h available on all architectures
2015-10-04 16:31:13 +01:00
Peter Zijlstra 62e8a3258b atomic, arch: Audit atomic_{read,set}()
This patch makes sure that atomic_{read,set}() are at least
{READ,WRITE}_ONCE().

We already had the 'requirement' that atomic_read() should use
ACCESS_ONCE(), and most archs had this, but a few were lacking.
All are now converted to use READ_ONCE().

And, by a symmetry and general paranoia argument, upgrade atomic_set()
to use WRITE_ONCE().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: james.hogan@imgtec.com
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-23 09:54:28 +02:00
Linus Torvalds ca520cab25 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
2015-09-03 15:46:07 -07:00
Vineet Gupta 9b28829d6d ARCv2: perf: Finally introduce HS perf unit
With all features in place, the ARC HS pct block can now be effectively
allowed to be probed/used

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:59:07 +05:30
Alexey Brodkin e6b1d126bb ARCv2: perf: implement exclusion of event counting in user or kernel mode
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:58:14 +05:30
Alexey Brodkin 36481cf7fb ARCv2: perf: Support sampling events using overflow interrupts
In times of ARC 700 performance counters didn't have support of
interrupt an so for ARC we only had support of non-sampling events.

Put simply only "perf stat" was functional.

Now with ARC HS we have support of interrupts in performance counters
which this change introduces support of.

ARC performance counters act in the following way in regard of
interrupts generation.
 [1] A counter counts starting from value set in PCT_COUNT register pair
 [2] Once counter reaches value set in PCT_INT_CNT interrupt is raised

Basic setup look like this:
 [1] PCT_COUNT = 0;
 [2] PCT_INT_CNT = __limit_value__;
 [3] Enable interrupts for that counter and let it run
 [4] Let counter reach its limit
 [5] Handle interrupt when it happens

Note that PCT HW block is build in CPU core and so ints interrupt
line (which is basically OR of all counters IRQs) is wired directly to
top-level IRQC. That means do de-assert PCT interrupt it's required to
reset IRQs from all counters that have reached their limit values.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:57:43 +05:30
Vineet Gupta fb7c572551 ARC: perf: cap the number of counters to hardware max of 32
The number of counters in PCT can never be more than 32 (while
countable conditions could be 100+) for both ARCompact and ARCv2

And while at it update copyright dates.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:57:03 +05:30
Vineet Gupta 090749502f ARC: add/fix some comments in code - no functional change
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 19:05:49 +05:30
Yuriy Kolerov 6de6066c0d ARC: change some branchs to jumps to resolve linkage errors
When kernel's binary becomes large enough (32M and more) errors
may occur during the final linkage stage. It happens because
the build system uses short relocations for ARC  by default.
This problem may be easily resolved by passing -mlong-calls
option to GCC to use long absolute jumps (j) instead of short
relative branchs (b).

But there are fragments of pure assembler code exist which use
branchs in inappropriate places and cause a linkage error because
of relocations overflow.

First of these fragments is .fixup insertion in futex.h and
unaligned.c. It inserts a code in the separate section (.fixup)
with branch instruction. It leads to the linkage error when
kernel becomes large.

Second of these fragments is calling scheduler's functions
(common kernel code) from entry.S of ARC's code. When kernel's
binary becomes large it may lead to the linkage error because
scheduler may occur far enough from ARC's code in the final
binary.

Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:53:15 +05:30
Vineet Gupta eb2cd8b72b ARC: ensure futex ops are atomic in !LLSC config
W/o hardware assisted atomic r-m-w the best we can do is to disable
preemption.

Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:16:01 +05:30
Vineet Gupta 882a95ae0a ARC: make futex_atomic_cmpxchg_inatomic() return bimodal
Callers of cmpxchg_futex_value_locked() in futex code expect bimodal
return value:
  !0 (essentially -EFAULT as failure)
   0 (success)

Before this patch, the success return value was old value of futex,
which could very well be non zero, causing caller to possibly take the
failure path erroneously.

Fix that by returning 0 for success

(This fix was done back in 2011 for all upstream arches, which ARC
obviously missed)

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:16:00 +05:30
Vineet Gupta ed574e2bbd ARC: futex cosmetics
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:16:00 +05:30
Vineet Gupta 31d30c8208 ARC: add barriers to futex code
The atomic ops on futex need to provide the full barrier just like
regular atomics in kernel.

Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
as core code already does that

Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:15:59 +05:30
Alexey Brodkin f2b0b25a37 ARCv2: Support IO Coherency and permutations involving L1 and L2 caches
In case of ARCv2 CPU there're could be following configurations
that affect cache handling for data exchanged with peripherals
via DMA:
 [1] Only L1 cache exists
 [2] Both L1 and L2 exist, but no IO coherency unit
 [3] L1, L2 caches and IO coherency unit exist

Current implementation takes care of [1] and [2].
Moreover support of [2] is implemented with run-time check
for SLC existence which is not super optimal.

This patch introduces support of [3] and rework of DMA ops
usage. Instead of doing run-time check every time a particular
DMA op is executed we'll have 3 different implementations of
DMA ops and select appropriate one during init.

As for IOC support for it we need:
 [a] Implement empty DMA ops because IOC takes care of cache
     coherency with DMAed data
 [b] Route dma_alloc_coherent() via dma_alloc_noncoherent()
     This is required to make IOC work in first place and also
     serves as optimization as LD/ST to coherent buffers can be
     srviced from caches w/o going all the way to memory

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta:
  -Added some comments about IOC gains
  -Marked dma ops as static,
  -Massaged changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:11:17 +05:30
Vineet Gupta 1097163870 ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff
The increment of delay counter was 2 instructions:
Arithmatic Shfit Left (ASL) + set to 1 on overflow

This can be done in 1 using ROtate Left (ROL)

Suggested-by: Nigel Topham <ntopham@synopsys.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-07 13:56:16 +05:30
Vineet Gupta 87ce62802f ARC: Make pt_regs regs unsigned
KGDB fails to build after f51e2f1911 ("ARC: make sure instruction_pointer()
returns unsigned value")

The hack to force one specific reg to unsigned backfired. There's no
reason to keep the regs signed after all.

|  CC      arch/arc/kernel/kgdb.o
|../arch/arc/kernel/kgdb.c: In function 'kgdb_trap':
| ../arch/arc/kernel/kgdb.c:180:29: error: lvalue required as left operand of assignment
|   instruction_pointer(regs) -= BREAK_INSTR_SIZE;

Reported-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Fixes: f51e2f1911 ("ARC: make sure instruction_pointer() returns unsigned value")
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-05 11:48:21 +05:30
Vineet Gupta b89aa12c17 ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle
The previous commit for delayed retry of SCOND needs some fine tuning
for spin locks.

The backoff from delayed retry in conjunction with spin looping of lock
itself can potentially cause the delay counter to reach high values.
So to provide fairness to any lock operation, after a lock "seems"
available (i.e. just before first SCOND try0, reset the delay counter
back to starting value of 1

Essentially reset delay to 1 for a new spin-wait-loop-acquire cycle.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-04 09:26:35 +05:30
Vineet Gupta e78fdfef84 ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff
This is to workaround the llock/scond livelock

HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping
coherency transactions in the SCU. The exclusive line state keeps rotating
among contenting cores leading to a never ending cycle. So break the cycle
by deferring the retry of failed exclusive access (SCOND). The actual delay
needed is function of number of contending cores as well as the unrelated
coherency traffic from other cores. To keep the code simple, start off with
small delay of 1 which would suffice most cases and in case of contention
double the delay. Eventually the delay is sufficient such that the coherency
pipeline is drained, thus a subsequent exclusive access would succeed.

Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.com
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-04 09:26:34 +05:30
Vineet Gupta 69cbe630f5 ARC: LLOCK/SCOND based rwlock
With LLOCK/SCOND, the rwlock counter can be atomically updated w/o need
for a guarding spin lock.

This in turn elides the EXchange instruction based spinning which causes
the cacheline transition to exclusive state and concurrent spinning
across cores would cause the line to keep bouncing around.
LLOCK/SCOND based implementation is superior as spinning on LLOCK keeps
the cacheline in shared state.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-04 09:26:33 +05:30
Vineet Gupta ae7eae9e03 ARC: LLOCK/SCOND based spin_lock
Current spin_lock uses EXchange instruction to implement the atomic test
and set of lock location (reads orig value and ST 1). This however forces
the cacheline into exclusive state (because of the ST) and concurrent
loops in multiple cores will bounce the line around between cores.

Instead, use LLOCK/SCOND to implement the atomic test and set which is
better as line is in shared state while lock is spinning on LLOCK

The real motivation of this change however is to make way for future
changes in atomics to implement delayed retry (with backoff).
Initial experiment with delayed retry in atomics combined with orig
EX based spinlock was a total disaster (broke even LMBench) as
struct sock has a cache line sharing an atomic_t and spinlock. The
tight spinning on lock, caused the atomic retry to keep backing off
such that it would never finish.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-04 09:26:33 +05:30
Vineet Gupta 8ac0665fb6 ARC: refactor atomic inline asm operands with symbolic names
This reduces the diff in forth-coming patches and also helps understand
better the incremental changes to inline asm.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-04 09:26:32 +05:30
Vineet Gupta f5959cb0c3 Revert "ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock"
Extended testing of quad core configuration revealed that this fix was
insufficient. Specifically LTP open posix shm_op/23-1 would cause the
hardware livelock in llock/scond loop in update_cpu_load_active()

So remove this and make way for a proper workaround

This reverts commit a5c8b52abe.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-04 09:26:31 +05:30
Vineet Gupta e13c42ecbe ARCv2: Fix the peripheral address space detection
With HS 2.1 release, the peripheral space register no longer contains
the uncached space specifics, causing the kernel to panic early on.
So read the newer NON VOLATILE AUX register to get that info.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-03 19:34:07 +05:30
Peter Zijlstra de9e432cb5 atomic: Collapse all atomic_{set,clear}_mask definitions
Move the now generic definitions of atomic_{set,clear}_mask() into
linux/atomic.h to avoid endless and pointless repetition.

Also, provide an atomic_andnot() wrapper for those few archs that can
implement that.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-27 14:06:24 +02:00
Peter Zijlstra e6942b7de2 atomic: Provide atomic_{or,xor,and}
Implement atomic logic ops -- atomic_{or,xor,and}.

These will replace the atomic_{set,clear}_mask functions that are
available on some archs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-27 14:06:24 +02:00
Peter Zijlstra cda7e4137a arc: Provide atomic_{or,xor,and}
Implement atomic logic ops -- atomic_{or,xor,and}.

These will replace the atomic_{set,clear}_mask functions that are
available on some archs.

Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-27 14:06:21 +02:00
Laurent Dufour f2abeef9fd mm: clean up per architecture MM hook header files
Commit 2ae416b142 ("mm: new mm hook framework") introduced an empty
header file (mm-arch-hooks.h) for every architecture, even those which
doesn't need to define mm hooks.

As suggested by Geert Uytterhoeven, this could be cleaned through the use
of a generic header file included via each per architecture
asm/include/Kbuild file.

The PowerPC architecture is not impacted here since this architecture has
to defined the arch_remap MM hook.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-17 16:39:53 -07:00
Alexey Brodkin f51e2f1911 ARC: make sure instruction_pointer() returns unsigned value
Currently instruction_pointer() returns pt_regs->ret and so return value
is of type "long", which implicitly stands for "signed long".

While that's perfectly fine when dealing with 32-bit values if return
value of instruction_pointer() gets assigned to 64-bit variable sign
extension may happen.

And at least in one real use-case it happens already.
In perf_prepare_sample() return value of perf_instruction_pointer()
(which is an alias to instruction_pointer() in case of ARC) is assigned
to (struct perf_sample_data)->ip (which type is "u64").

And what we see if instuction pointer points to user-space application
that in case of ARC lays below 0x8000_0000 "ip" gets set properly with
leading 32 zeros. But if instruction pointer points to kernel address
space that starts from 0x8000_0000 then "ip" is set with 32 leadig
"f"-s. I.e. id instruction_pointer() returns 0x8100_0000, "ip" will be
assigned with 0xffff_ffff__8100_0000. Which is obviously wrong.

In particular that issuse broke output of perf, because perf was unable
to associate addresses like 0xffff_ffff__8100_0000 with anything from
/proc/kallsyms.

That's what we used to see:
 ----------->8----------
  6.27%  ls       [unknown]                [k] 0xffffffff8046c5cc
  2.96%  ls       libuClibc-0.9.34-git.so  [.] memcpy
  2.25%  ls       libuClibc-0.9.34-git.so  [.] memset
  1.66%  ls       [unknown]                [k] 0xffffffff80666536
  1.54%  ls       libuClibc-0.9.34-git.so  [.] 0x000224d6
  1.18%  ls       libuClibc-0.9.34-git.so  [.] 0x00022472
 ----------->8----------

With that change perf output looks much better now:
 ----------->8----------
  8.21%  ls       [kernel.kallsyms]        [k] memset
  3.52%  ls       libuClibc-0.9.34-git.so  [.] memcpy
  2.11%  ls       libuClibc-0.9.34-git.so  [.] malloc
  1.88%  ls       libuClibc-0.9.34-git.so  [.] memset
  1.64%  ls       [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
  1.41%  ls       [kernel.kallsyms]        [k] __d_lookup_rcu
 ----------->8----------

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: arc-linux-dev@synopsys.com
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-13 13:33:18 +05:30
Vineet Gupta 9138d4138d ARC: Add llock/scond to futex backend
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-09 17:36:33 +05:30
Vineet Gupta 80f420842f ARC: Make ARC bitops "safer" (add anti-optimization)
ARCompact/ARCv2 ISA provide that any instructions which deals with
bitpos/count operand ASL, LSL, BSET, BCLR, BMSK .... will only consider
lower 5 bits. i.e. auto-clamp the pos to 0-31.

ARC Linux bitops exploited this fact by NOT explicitly masking out upper
bits for @nr operand in general, saving a bunch of AND/BMSK instructions
in generated code around bitops.

While this micro-optimization has worked well over years it is NOT safe
as shifting a number with a value, greater than native size is
"undefined" per "C" spec.

So as it turns outm EZChip ran into this eventually, in their massive
muti-core SMP build with 64 cpus. There was a test_bit() inside a loop
from 63 to 0 and gcc was weirdly optimizing away the first iteration
(so it was really adhering to standard by implementing undefined behaviour
vs. removing all the iterations which were phony i.e. (1 << [63..32])

| for i = 63 to 0
|    X = ( 1 << i )
|    if X == 0
|       continue

So fix the code to do the explicit masking at the expense of generating
additional instructions. Fortunately, this can be mitigated to a large
extent as gcc has SHIFT_COUNT_TRUNCATED which allows combiner to fold
masking into shift operation itself. It is currently not enabled in ARC
gcc backend, but could be done after a bit of testing.

Fixes STAR 9000866918 ("unsafe "undefined behavior" code in kernel")

Reported-by: Noam Camus <noamc@ezchip.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-07-09 17:36:32 +05:30
Chris Metcalf a6e2f029ae Make asm/word-at-a-time.h available on all architectures
Added the x86 implementation of word-at-a-time to the
generic version, which previously only supported big-endian.

Omitted the x86-specific load_unaligned_zeropad(), which in
any case is also not present for the existing BE-only
implementation of a word-at-a-time, and is only used under
CONFIG_DCACHE_WORD_ACCESS.

Added as a "generic-y" to the Kbuilds of all architectures
that didn't previously have it.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-07-08 16:41:55 -04:00
Linus Torvalds 2d01eedf1d Merge branch 'akpm' (patches from Andrew)
Merge third patchbomb from Andrew Morton:

 - the rest of MM

 - scripts/gdb updates

 - ipc/ updates

 - lib/ updates

 - MAINTAINERS updates

 - various other misc things

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (67 commits)
  genalloc: rename of_get_named_gen_pool() to of_gen_pool_get()
  genalloc: rename dev_get_gen_pool() to gen_pool_get()
  x86: opt into HAVE_COPY_THREAD_TLS, for both 32-bit and 64-bit
  MAINTAINERS: add zpool
  MAINTAINERS: BCACHE: Kent Overstreet has changed email address
  MAINTAINERS: move Jens Osterkamp to CREDITS
  MAINTAINERS: remove unused nbd.h pattern
  MAINTAINERS: update brcm gpio filename pattern
  MAINTAINERS: update brcm dts pattern
  MAINTAINERS: update sound soc intel patterns
  MAINTAINERS: remove website for paride
  MAINTAINERS: update Emulex ocrdma email addresses
  bcache: use kvfree() in various places
  libcxgbi: use kvfree() in cxgbi_free_big_mem()
  target: use kvfree() in session alloc and free
  IB/ehca: use kvfree() in ipz_queue_{cd}tor()
  drm/nouveau/gem: use kvfree() in u_free()
  drm: use kvfree() in drm_free_large()
  cxgb4: use kvfree() in t4_free_mem()
  cxgb3: use kvfree() in cxgb_free_mem()
  ...
2015-07-01 17:47:51 -07:00
Linus Torvalds 0890a26479 - Support for HS38 cores based on ARCv2 ISA
ARCv2 is the next generation ISA from Synopsys and basis for the
      HS3{4,6,8} families of processors which retain the traditional ARC mantra of
      low power and configurability and are now more performant and feature rich.
 
      HS38x is a 10 stage pipeline core which supports MMU (with huge pages) and
      SMP (upto 4 cores) among other features.
 
      + www.synopsys.com/dw/ipdir.php?ds=arc-hs38-processor
      + http://news.synopsys.com/2014-10-14-New-DesignWare-ARC-HS38-Processor-Doubles-Performance-for-Embedded-Linux-Applications
      + http://www.embedded.com/electronics-news/4435975/Synopsys-ARC-HS38-core-gives-2X-boost-to-Linux-based-apps
 
  - Support for ARC SDP (Software Development platform): Main Board + CPU Cards
     = AXS101: CPU Card with ARC700 in silicon @ 700 MHz
     = AXS103: CPU Card with HS38x in FPGA
 
  - Refactoring of ARCompact port to accomodate new ARCv2 ISA
  - Miscll updates/cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVk0g8AAoJEGnX8d3iisJecqsQAI6gvBC4GSNYDrmgGJJK1uLQ
 uf6ZXQRLBtyxwa6VMvaNFe91i5XV5WvEXDuNBQX4FdYbp7Fs+Jz5VK79xFtbVEdU
 H6mgKcs9HBwQvrHBxl54XxxXfX7kD1kxrlV7cL4b7bXTEX0XyH5ROUj600/YP+B4
 8t+XdYcfgFK0HpeFGXVP+Xmv/e+hBbzCpOjOd2ZFqEwymvSpZDc4KZ2yDvV2+Ybn
 JNZ421urQOrxR27njvvPvtpeN7uuJKfRYq7IuIR8+Ad72S19EDdw+DZHp2XoUMXA
 wgydWrrOaX2Dr2CmXHGA1C4nWEG7+Yo9I1WitjJct0tkOQyDR2OIDGmvKGBd1uoS
 QsihtoKBRvns+2gpXBEOmOHmF6ggpHNN0ppIwCp+AK5kX3fmxBtyUekyYmVpg8oQ
 xgFIuJgmiAvW7QB7xIO6SFFt18De2ifDRrKWJwVauvfW/PvUIwuUBEcbh0OHAn54
 ebUUWu2ZdVNe0XCsZOAQGwYHZRWBk8Bn3bhFpNnOliRiF77e9GsKeGYeIswYFy7I
 42Gp35ftEj1pLLFZ1vIsAo72N6ErmHwPOcJkaBYaTbPGPcTEO2aR6b8WOcCjsPxK
 DUeUV3H2HV+6V4jw/96lnsaRqsaj4TsJxEAFRR3wT1DLoRudCIDubaXTdvvDie77
 RgKn4ZdxgmXD97+deBqc
 =KwNo
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC architecture updates from Vineet Gupta:

 - support for HS38 cores based on ARCv2 ISA

     ARCv2 is the next generation ISA from Synopsys and basis for the
     HS3{4,6,8} families of processors which retain the traditional ARC mantra of
     low power and configurability and are now more performant and feature rich.

     HS38x is a 10 stage pipeline core which supports MMU (with huge pages) and
     SMP (upto 4 cores) among other features.

     + www.synopsys.com/dw/ipdir.php?ds=arc-hs38-processor
     + http://news.synopsys.com/2014-10-14-New-DesignWare-ARC-HS38-Processor-Doubles-Performance-for-Embedded-Linux-Applications
     + http://www.embedded.com/electronics-news/4435975/Synopsys-ARC-HS38-core-gives-2X-boost-to-Linux-based-apps

 - support for ARC SDP (Software Development platform): Main Board + CPU Cards
    = AXS101: CPU Card with ARC700 in silicon @ 700 MHz
    = AXS103: CPU Card with HS38x in FPGA

 - refactoring of ARCompact port to accomodate new ARCv2 ISA

 - misc updates/cleanups

* tag 'arc-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (72 commits)
  ARC: Fix build failures for ARCompact in linux-next after ARCv2 support
  ARCv2: Allow older gcc to cope with new regime of ARCv2/ARCompact support
  ARCv2: [vdk] dts files and defconfig for HS38 VDK
  ARCv2: [axs103] Support ARC SDP FPGA platform for HS38x cores
  ARC: [axs101] Prepare for AXS103
  ARCv2: [nsim*hs*] Support simulation platforms for HS38x cores
  ARCv2: All bits in place, allow ARCv2 builds
  ARCv2: SLC: Handle explcit flush for DMA ops (w/o IO-coherency)
  ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock
  ARC: Reduce bitops lines of code using macros
  ARCv2: barriers
  arch: conditionally define smp_{mb,rmb,wmb}
  ARC: add smp barriers around atomics per Documentation/atomic_ops.txt
  ARC: add compiler barrier to LLSC based cmpxchg
  ARCv2: SMP: intc: IDU 2nd level intc for dynamic IRQ distribution
  ARCv2: SMP: clocksource: Enable Global Real Time counter
  ARCv2: SMP: ARConnect debug/robustness
  ARCv2: SMP: Support ARConnect (MCIP) for Inter-Core-Interrupts et al
  ARC: make plat_smp_ops weak to allow over-rides
  ARCv2: clocksource: Introduce 64bit local RTC counter
  ...
2015-07-01 09:24:26 -07:00
Akinobu Mita 92ed932d30 arc: use for_each_sg()
This replaces the plain loop over the sglist array with for_each_sg()
macro which consists of sg_next() function calls.  Since arc doesn't
select ARCH_HAS_SG_CHAIN, it is not necessary to use for_each_sg() in
order to loop over each sg element.  But this can help find problems with
drivers that do not properly initialize their sg tables when
CONFIG_DEBUG_SG is enabled.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30 19:44:58 -07:00
Linus Torvalds ad90fb9751 Merge branch 'for-4.2/sg' of git://git.kernel.dk/linux-block
Pull asm/scatterlist.h removal from Jens Axboe:
 "We don't have any specific arch scatterlist anymore, since parisc
  finally switched over.  Kill the include"

* 'for-4.2/sg' of git://git.kernel.dk/linux-block:
  remove scatterlist.h generation from arch Kbuild files
  remove <asm/scatterlist.h>
2015-06-25 15:22:36 -07:00
Laurent Dufour 2ae416b142 mm: new mm hook framework
CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu).  This includes remapping
the vDSO to the place it has at checkpoint time.

However some architectures like powerpc are keeping a reference to the
vDSO base address to build the signal return stack frame by calling the
vDSO sigreturn service.  So once the vDSO has been moved, this reference
is no more valid and the signal frame built later are not usable.

This patch serie is introducing a new mm hook framework, and a new
arch_remap hook which is called when mremap is done and the mm lock still
hold.  The next patch is adding the vDSO remap and unmap tracking to the
powerpc architecture.

This patch (of 3):

This patch introduces a new set of header file to manage mm hooks:
- per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
- a generic header (include/linux/mm-arch-hooks.h)

The architecture which need to overwrite a hook as to redefine it in its
header file, while architecture which doesn't need have nothing to do.

The default hooks are defined in the generic header and are used in the
case the architecture is not defining it.

In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
be moved here.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:41 -07:00
Vineet Gupta 795f455856 ARCv2: SLC: Handle explcit flush for DMA ops (w/o IO-coherency)
L2 cache on ARCHS processors is called SLC (System Level Cache)
For working DMA (in absence of hardware assisted IO Coherency) we need
to manage SLC explicitly when buffers transition between cpu and
controllers.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:19 +05:30
Vineet Gupta a5c8b52abe ARCv2: STAR 9000837815 workaround hardware exclusive transactions livelock
A quad core SMP build could get into hardware livelock with concurrent
LLOCK/SCOND. Workaround that by adding a PREFETCHW which is serialized by
SCU (System Coherency Unit). It brings the cache line in Exclusive state
and makes others invalidate their lines. This gives enough time for
winner to complete the LLOCK/SCOND, before others can get the line back.

The prefetchw in the ll/sc loop is not nice but this is the only
software workaround for current version of RTL.

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:18 +05:30
Vineet Gupta 04e2eee4b0 ARC: Reduce bitops lines of code using macros
No semantical changes !

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-25 06:00:18 +05:30