linux

Commit Graph

Author	SHA1	Message	Date
Paul Mackerras	08fe1e7bd2	KVM: PPC: Book3S HV: Fix bug in dirty page tracking This fixes a bug in the tracking of pages that get modified by the guest. If the guest creates a large-page HPTE, writes to memory somewhere within the large page, and then removes the HPTE, we only record the modified state for the first normal page within the large page, when in fact the guest might have modified some other normal page within the large page. To fix this we use some unused bits in the rmap entry to record the order (log base 2) of the size of the page that was modified, when removing an HPTE. Then in kvm_test_clear_dirty_npages() we use that order to return the correct number of modified pages. The same thing could in principle happen when removing a HPTE at the host's request, i.e. when paging out a page, except that we never page out large pages, and the guest can only create large-page HPTEs if the guest RAM is backed by large pages. However, we also fix this case for the sake of future-proofing. The reference bit is also subject to the same loss of information. We don't make the same fix here for the reference bit because there isn't an interface for userspace to find out which pages the guest has referenced, whereas there is one for userspace to find out which pages the guest has modified. Because of this loss of information, the kvm_age_hva_hv() and kvm_test_age_hva_hv() functions might incorrectly say that a page has not been referenced when it has, but that doesn't matter greatly because we never page or swap out large pages. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:18 +02:00
Paul Mackerras	1e5bf454f5	KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE The reference (R) and change (C) bits in a HPT entry can be set by hardware at any time up until the HPTE is invalidated and the TLB invalidation sequence has completed. This means that when removing a HPTE, we need to read the HPTE after the invalidation sequence has completed in order to obtain reliable values of R and C. The code in kvmppc_do_h_remove() used to do this. However, commit `6f22bd3265` ("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the read after invalidation as a side effect of other changes. This restores the read of the HPTE after invalidation. The user-visible effect of this bug would be that when migrating a guest, there is a small probability that a page modified by the guest and then unmapped by the guest might not get re-transmitted and thus the destination might end up with a stale copy of the page. Fixes: `6f22bd3265` Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:18 +02:00
Paul Mackerras	b4deba5c41	KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 This builds on the ability to run more than one vcore on a physical core by using the micro-threading (split-core) modes of the POWER8 chip. Previously, only vcores from the same VM could be run together, and (on POWER8) only if they had just one thread per core. With the ability to split the core on guest entry and unsplit it on guest exit, we can run up to 8 vcpu threads from up to 4 different VMs, and we can run multiple vcores with 2 or 4 vcpus per vcore. Dynamic micro-threading is only available if the static configuration of the cores is whole-core mode (unsplit), and only on POWER8. To manage this, we introduce a new kvm_split_mode struct which is shared across all of the subcores in the core, with a pointer in the paca on each thread. In addition we extend the core_info struct to have information on each subcore. When deciding whether to add a vcore to the set already on the core, we now have two possibilities: (a) piggyback the vcore onto an existing subcore, or (b) start a new subcore. Currently, when any vcpu needs to exit the guest and switch to host virtual mode, we interrupt all the threads in all subcores and switch the core back to whole-core mode. It may be possible in future to allow some of the subcores to keep executing in the guest while subcore 0 switches to the host, but that is not implemented in this patch. This adds a module parameter called dynamic_mt_modes which controls which micro-threading (split-core) modes the code will consider, as a bitmap. In other words, if it is 0, no micro-threading mode is considered; if it is 2, only 2-way micro-threading is considered; if it is 4, only 4-way, and if it is 6, both 2-way and 4-way micro-threading mode will be considered. The default is 6. With this, we now have secondary threads which are the primary thread for their subcore and therefore need to do the MMU switch. These threads will need to be started even if they have no vcpu to run, so we use the vcore pointer in the PACA rather than the vcpu pointer to trigger them. It is now possible for thread 0 to find that an exit has been requested before it gets to switch the subcore state to the guest. In that case we haven't added the guest's timebase offset to the timebase, so we need to be careful not to subtract the offset in the guest exit path. In fact we just skip the whole path that switches back to host context, since we haven't switched to the guest context. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:17 +02:00
Paul Mackerras	ec25716508	KVM: PPC: Book3S HV: Make use of unused threads when running guests When running a virtual core of a guest that is configured with fewer threads per core than the physical cores have, the extra physical threads are currently unused. This makes it possible to use them to run one or more other virtual cores from the same guest when certain conditions are met. This applies on POWER7, and on POWER8 to guests with one thread per virtual core. (It doesn't apply to POWER8 guests with multiple threads per vcore because they require a 1-1 virtual to physical thread mapping in order to be able to use msgsndp and the TIR.) The idea is that we maintain a list of preempted vcores for each physical cpu (i.e. each core, since the host runs single-threaded). Then, when a vcore is about to run, it checks to see if there are any vcores on the list for its physical cpu that could be piggybacked onto this vcore's execution. If so, those additional vcores are put into state VCORE_PIGGYBACK and their runnable VCPU threads are started as well as the original vcore, which is called the master vcore. After the vcores have exited the guest, the extra ones are put back onto the preempted list if any of their VCPUs are still runnable and not idle. This means that vcpu->arch.ptid is no longer necessarily the same as the physical thread that the vcpu runs on. In order to make it easier for code that wants to send an IPI to know which CPU to target, we now store that in a new field in struct vcpu_arch, called thread_cpu. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:17 +02:00
Tudor Laurentiu	845ac985cf	KVM: PPC: add missing pt_regs initialization On this switch branch the regs initialization doesn't happen so add it. This was found with the help of a static code analysis tool. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:17 +02:00
Thomas Huth	5358a96341	KVM: PPC: Fix warnings from sparse When compiling the KVM code for POWER with "make C=1", sparse complains about functions missing proper prototypes and a 64-bit constant missing the ULL prefix. Let's fix this by making the functions static or by including the proper header with the prototypes, and by appending a ULL prefix to the constant PPC_MPPE_ADDRESS_MASK. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:16 +02:00
Thomas Huth	129fd4233b	KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig Since the PPC970 support has been removed from the kvm-hv kernel module recently, we should also reflect this change in the help text of the corresponding Kconfig option. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:16 +02:00
Tudor Laurentiu	f5ffe330f5	KVM: PPC: fix suspicious use of conditional operator This was signaled by a static code analysis tool. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-08-22 11:16:16 +02:00
Ross Zwisler	e2e05394e4	pmem, dax: have direct_access use __pmem annotation Update the annotation for the kaddr pointer returned by direct_access() so that it is a __pmem pointer. This is consistent with the PMEM driver and with how this direct_access() pointer is used in the DAX code. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-08-20 14:07:24 -04:00
Samuel Mendoza-Jonas	e72bb8a5a8	powerpc/powernv: Reset HILE before kexec_sequence() On powernv secondary cpus are returned to OPAL, and will then enter the target kernel in big-endian. However if it is set the HILE bit will persist, causing the first exception in the target kernel to be delivered in litte-endian regardless of the current endianness. If running on top of OPAL make sure the HILE bit is reset once we've finished waiting for all of the secondaries to be returned to OPAL. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-20 18:19:09 +10:00
Samuel Mendoza-Jonas	ffebf5f391	powerpc/kexec: Reset secondary cpu endianness before kexec If the target kernel does not inlcude the FIXUP_ENDIAN check, coming from a different-endian kernel will cause the target kernel to panic. All ppc64 kernels can handle starting in big-endian mode, so return to big-endian before branching into the target kernel. This mainly affects pseries as secondaries on powernv are returned to OPAL. Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-20 18:19:08 +10:00
Vasant Hegde	c159b5968e	powerpc/powernv: Create LED platform device This patch adds platform devices for leds. Also export LED related OPAL API's so that led driver can use these APIs. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-20 18:19:07 +10:00
Anshuman Khandual	8a8d91817a	powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states This patch registers the following two new OPAL interfaces calls for the platform LED subsystem. With the help of these new OPAL calls, the kernel will be able to get or set the state of various individual LEDs on the system at any given location code which is passed through the LED specific device tree nodes. (1) OPAL_LEDS_GET_INDICATOR opal_leds_get_ind (2) OPAL_LEDS_SET_INDICATOR opal_leds_set_ind Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Tested-by: Stewart Smith <stewart@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-20 18:19:07 +10:00
Wei Yang	74703cc4e0	powerpc/powernv: Fix the log message when disabling VF On powernv platform, IOV BAR would be shifted if necessary. While the log message is not correct when disabling VFs. This patch fixes this by print correct message based on the offset value. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-20 16:15:24 +10:00
Gerhard Sittig	acf6cec836	powerpc/512x: silence a USB Kconfig dependency warning the PPC_MPC512x config automatically selected USB_EHCI_BIG_ENDIAN_* switches, which made Kconfig warn about "unmet direct dependencies": scripts/kconfig/conf --silentoldconfig Kconfig warning: (PPC_MPC512x && 440EPX) selects USB_EHCI_BIG_ENDIAN_DESC which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD) warning: (PPC_MPC512x && PPC_PS3 && PPC_CELLEB && 440EPX) selects USB_EHCI_BIG_ENDIAN_MMIO which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD) warning: (PPC_MPC512x && 440EPX) selects USB_EHCI_BIG_ENDIAN_DESC which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD) warning: (PPC_MPC512x && PPC_PS3 && PPC_CELLEB && 440EPX) selects USB_EHCI_BIG_ENDIAN_MMIO which has unmet direct dependencies (USB_SUPPORT && USB && USB_EHCI_HCD) make the selected entries additionally depend on USB_EHCI_HCD which silences the warning Signed-off-by: Gerhard Sittig <gsi@denx.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-19 16:14:50 +10:00
Hari Bathini	74943dab6b	powerpc/nvram: print no error when pstore backend is not nvram Pstore only supports one backend at a time. The preferred pstore backend is set by passing the pstore.backend=<name> argument to the kernel at boot time. Currently, while trying to register with pstore, nvram throws an error message even when "pstore.backend != nvram", which is unnecessary. This patch removes the error message in case "pstore.backend != nvram". Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-19 16:14:21 +10:00
Andrzej Hajda	fc9e9cbf4e	powerpc/nvram: use kmemdup rather than duplicating its implementation The patch was generated using fixed coccinelle semantic patch scripts/coccinelle/api/memdup.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2014320 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:34:43 +10:00
Andrzej Hajda	2e16acc5ee	powerpc/pseries: use kmemdup rather than duplicating its implementation The patch was generated using fixed coccinelle semantic patch scripts/coccinelle/api/memdup.cocci [1]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2014320 Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:34:42 +10:00
Gavin Shan	39bfd715b4	powerpc/eeh: Disable automatically blocked PCI config pcibios_set_pcie_reset_state() could be called to complete reset request when passing through PCI device, flag EEH_PE_ISOLATED is set before saving the PCI config sapce. On some Broadcom adapters, EEH_PE_CFG_BLOCKED is automatically set when the flag EEH_PE_ISOLATED is marked. It caused bogus data saved from the PCI config space, which will be restored to the PCI adapter after the reset. Eventually, the hardware can't work with corrupted data in PCI config space. The patch fixes the issue with eeh_pe_state_mark_no_cfg(), which doesn't set EEH_PE_CFG_BLOCKED when seeing EEH_PE_ISOLATED on the PE, in order to avoid the bogus data saved and restored to the PCI config space. Reported-by: Rajanikanth H. Adaveeshaiah <rajanikanth.ha@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:34:42 +10:00
Gavin Shan	dd497154d3	powerpc: Export include/uapi/asm/eeh.h This adds include/uapi/asm/eeh.h to kbuild so that the header file will be exported automatically with below command. The header file was added by commit `ed3e81ff20` ("powerpc/eeh: Move PE state constants around") make INSTALL_HDR_PATH=/tmp/headers \ SRCARCH=powerpc headers_install Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:34:41 +10:00
Vaibhav Jain	e0ad784b65	powerpc/pseries: enable RTC class support A working rtc kernel driver is needed so that hwclock can synchronize system clock to rtc during shutdown/boot. We already have a powernv platform rtc driver located at drivers/rtc/rtc-opal.c. However it depends on CONFIG_RTC_CLASS which is disabled by default. Hence the driver isn't enabled and not compiled for the powernv kernel. We fix this by enabling rtc class support in pseries defconfig which enables this driver and compiles it into the pseries kernel. In case CONFIG_PPC_POWERNV is not enabled we fallback to 'Generic RTC support' driver which emulates the legacy 'PC RTC driver'. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:34:41 +10:00
Nikunj A Dadhania	1d805440a3	powerpc/numa: initialize distance lookup table from drconf path In some situations, a NUMA guest that supports ibm,dynamic-memory-reconfiguration node will end up having flat NUMA distances between nodes. This is because of two problems in the current code. 1) Different representations of associativity lists. There is an assumption about the associativity list in initialize_distance_lookup_table(). Associativity list has two forms: a) [cpu,memory]@x/ibm,associativity has following format: <N> <N integers> b) ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays <M> <N> <M associativity lists each having N integers> M = the number of associativity lists N = the number of entries per associativity list Fix initialize_distance_lookup_table() so that it does not assume "case a". And update the caller to skip the length field before sending the associativity list. 2) Distance table not getting updated from drconf path. Node distance table will not get initialized in certain cases as ibm,dynamic-reconfiguration-memory path does not initialize the lookup table. Call initialize_distance_lookup_table() from drconf path with appropriate associativity list. Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:34:41 +10:00
Andrew Donnellan	53522982fc	powerpc/powernv: move dma_get_required_mask from pnv_phb to pci_controller_ops Simplify the dma_get_required_mask call chain by moving it from pnv_phb to pci_controller_ops, similar to commit `763d2d8df1` ("powerpc/powernv: Move dma_set_mask from pnv_phb to pci_controller_ops"). Previous call chain: 0) call dma_get_required_mask() (kernel/dma.c) 1) call ppc_md.dma_get_required_mask, if it exists. On powernv, that points to pnv_dma_get_required_mask() (platforms/powernv/setup.c) 2) device is PCI, therefore call pnv_pci_dma_get_required_mask() (platforms/powernv/pci.c) 3) call phb->dma_get_required_mask if it exists 4) it only exists in the ioda case, where it points to pnv_pci_ioda_dma_get_required_mask() (platforms/powernv/pci-ioda.c) New call chain: 0) call dma_get_required_mask() (kernel/dma.c) 1) device is PCI, therefore call pci_controller_ops.dma_get_required_mask if it exists 2) in the ioda case, that points to pnv_pci_ioda_dma_get_required_mask() (platforms/powernv/pci-ioda.c) In the p5ioc2 case, the call chain remains the same - dma_get_required_mask() does not find either a ppc_md call or pci_controller_ops call, so it calls __dma_get_required_mask(). Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:32:11 +10:00
Michael Ellerman	73b341efda	powerpc/mm: Drop CONFIG_PPC_HAS_HASH_64K The relation between CONFIG_PPC_HAS_HASH_64K and CONFIG_PPC_64K_PAGES is painfully complicated. But if we rearrange it enough we can see that PPC_HAS_HASH_64K essentially depends on PPC_STD_MMU_64 && PPC_64K_PAGES. We can then notice that PPC_HAS_HASH_64K is used in files that are only built for PPC_STD_MMU_64, meaning it's equivalent to PPC_64K_PAGES. So replace all uses and drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2015-08-18 19:32:10 +10:00
Michael Ellerman	55f8b5b82f	powerpc/mm: Simplify page size kconfig dependencies For config options with only a single value, guarding the single value with 'if' is the same as adding a 'depends' statement. And it's more standard to just use 'depends'. And if the option has both an 'if' guard and a 'depends' we can collapse them into a single 'depends' by combining them with &&. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2015-08-18 19:32:10 +10:00
Michael Ellerman	953005770e	powerpc/mm: Drop the 64K on 4K version of pte_pagesize_index() Now that support for 64k pages with a 4K kernel is removed, this code is unreachable. CONFIG_PPC_HAS_HASH_64K can only be true when CONFIG_PPC_64K_PAGES is also true. But when CONFIG_PPC_64K_PAGES is true we include pte-hash64.h which includes pte-hash64-64k.h, which defines both pte_pagesize_index() and crucially __real_pte, which means this definition can never be used. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2015-08-18 19:31:55 +10:00
Michael Ellerman	f444f1f898	powerpc/cell: Drop support for 64K local store on 4K kernels Back in the olden days we added support for using 64K pages to map the SPU (Synergistic Processing Unit) local store on Cell, when the main kernel was using 4K pages. This was useful at the time because distros were using 4K pages, but using 64K pages on the SPUs could reduce TLB pressure there. However these days the number of Cell users is approaching zero, and supporting this option adds unpleasant complexity to the memory management code. So drop the option, CONFIG_SPU_FS_64K_LS, and all related code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Jeremy Kerr <jk@ozlabs.org>	2015-08-18 19:29:49 +10:00
Michael Ellerman	74b5037baa	powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K PAGE_SIZE. However when built with a 4K PAGE_SIZE there is an additional config option which can be enabled, PPC_HAS_HASH_64K, which means the kernel also knows how to hash a 64K page even though the base PAGE_SIZE is 4K. This is used in one obscure configuration, to support 64K pages for SPU local store on the Cell processor when the rest of the kernel is using 4K pages. In this configuration, pte_pagesize_index() is defined to just pass through its arguments to get_slice_psize(). However pte_pagesize_index() is called for both user and kernel addresses, whereas get_slice_psize() only knows how to handle user addresses. This has been broken forever, however until recently it happened to work. That was because in get_slice_psize() the large kernel address would cause the right shift of the slice mask to return zero. However in commit `7aa0727f33` ("powerpc/mm: Increase the slice range to 64TB"), the get_slice_psize() code was changed so that instead of a right shift we do an array lookup based on the address. When passed a kernel address this means we index way off the end of the slice array and return random junk. That is only fatal if we happen to hit something non-zero, but when we do return a non-zero value we confuse the MMU code and eventually cause a check stop. This fix is ugly, but simple. When we're called for a kernel address we return 4K, which is always correct in this configuration, otherwise we use the slice mask. Fixes: `7aa0727f33` ("powerpc/mm: Increase the slice range to 64TB") Reported-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2015-08-18 19:29:13 +10:00
Jaiprakash Singh	4524cd093f	powerpc/t1023rdb/dts: set ifc nand chip select from 2 to 1 IFC NAND chip select is wrongly mapped to 2 in reg property of NAND node. Due to this kernel is not able probe NAND flash. Set chip select to 1 in reg property. Signed-off-by: Jaiprakash Singh <b44839@freescale.com> Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-17 19:41:16 -05:00
Wang Dongsheng	163e60c169	powerpc/mpc85xx:Add SCFG device tree support of T104x Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-17 19:33:22 -05:00
Priyanka Jain	0d748ec5ea	powerpc/fsl-booke: Add T1040D4RDB/T1042D4RDB board support T1040D4RDB/T1042D4RDB are Freescale Reference Design Board which can support T1040/T1042 QorIQ Power Architecture™ processor respectively T1040D4RDB/T1042D4RDB board Overview ------------------------------------- - SERDES Connections, 8 lanes supporting: - PCI - SGMII - SATA 2.0 - QSGMII(only for T1040D4RDB) - DDR Controller - Supports rates of up to 1600 MHz data-rate - Supports one DDR4 UDIMM -IFC/Local Bus - NAND flash: 1GB 8-bit NAND flash - NOR: 128MB 16-bit NOR Flash - Ethernet - Two on-board RGMII 10/100/1G ethernet ports. - PHY #0 remains powered up during deep-sleep - CPLD - Clocks - System and DDR clock (SYSCLK, “DDRCLK”) - SERDES clocks - Power Supplies - USB - Supports two USB 2.0 ports with integrated PHYs - Two type A ports with 5V@1.5A per port. - SDHC - SDHC/SDXC connector - SPI - On-board 64MB SPI flash - I2C - Devices connected: EEPROM, thermal monitor, VID controller - Other IO - Two Serial ports - ProfiBus port Add support for T1040/T1042D4RDB board: -add device tree -Add entry in corenet_generic.c Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-17 18:58:10 -05:00
Hou Zhiqiang	32d3c4ff01	powerpc/85xx: Remove unused pci fixup hooks on c293pcie The c293pcie board is an endpoint device and it doesn't need PM, so remove hooks pcibios_fixup_phb and pcibios_fixup_bus. Signed-off-by: Hou Zhiqiang <B48286@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-17 18:55:05 -05:00
Kevin Hao	69399ee9cb	powerpc/e6500: hw tablewalk: optimize a bit for tcd lock acquiring codes It makes no sense to put the instructions for calculating the lock value (cpu number + 1) and the clearing of eq bit of cr1 in lbarx/stbcx loop. And when the lock is acquired by the other thread, the current lock value has no chance to equal with the lock value used by current cpu. So we can skip the comparing for these two lock values in the lbz/bne loop. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-17 18:53:47 -05:00
Kevin Hao	e5e55cc08c	powerpc/e6500: remove the stale TCD_LOCK macro Since we moved the "lock" to be the first element of struct tlb_core_data in commit `82d86de25b` ("powerpc/e6500: Make TLB lock recursive"), this macro is not used by any code. Just delete it. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-17 18:53:42 -05:00
Jason Jin	e8c4b3dfe1	powerpc: Add a vga alias node for P1022 In u-boot, when set the video as console, the name 'vga' is used as a general name for the video device, during the fdt_fixup_stdout process, the 'vga' name is used to search in the dtb to setup the 'linux,stdout-path' node. Though the P1022 DIU is not VGA-compatible device, to meet the 'vga' name used in u-boot, the vga alias node is added for P1022 in this patch. At the same time, a display alias is also added so that no other components grow dependencies on the vga alias node. Signed-off-by: Jason Jin <Jason.Jin@freescale.com> Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-17 18:52:33 -05:00
Bjorn Helgaas	1f408d5743	Merge branches 'pci/hotplug', 'pci/iommu', 'pci/irq' and 'pci/virtualization' into next * pci/hotplug: PCI: pciehp: Remove ignored MRL sensor interrupt events PCI: pciehp: Remove unused interrupt events PCI: pciehp: Handle invalid data when reading from non-existent devices PCI: Hold pci_slot_mutex while searching bus->slots list PCI: Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem PCI: pciehp: Simplify pcie_poll_cmd() PCI: Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot * pci/iommu: PCI: Remove pci_ats_enabled() PCI: Stop caching ATS Invalidate Queue Depth PCI: Move ATS declarations to linux/pci.h so they're all together PCI: Clean up ATS error handling PCI: Use pci_physfn() rather than looking up physfn by hand PCI: Inline the ATS setup code into pci_ats_init() PCI: Rationalize pci_ats_queue_depth() error checking PCI: Reduce size of ATS structure elements PCI: Embed ATS info directly into struct pci_dev PCI: Allocate ATS struct during enumeration iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth * pci/irq: PCI: Kill off set_irq_flags() usage * pci/virtualization: PCI: Add ACS quirks for Intel I219-LM/V	2015-08-14 08:16:29 -05:00
Daniel Axtens	e642d11bdb	powerpc/eeh: Probe after unbalanced kref check In the complete hotplug case, EEH PEs are supposed to be released and set to NULL. Normally, this is done by eeh_remove_device(), which is called from pcibios_release_device(). However, if something is holding a kref to the device, it will not be released, and the PE will remain. eeh_add_device_late() has a check for this which will explictly destroy the PE in this case. This check in eeh_add_device_late() occurs after a call to eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(), which will exit without probing if there is an existing PE. This means that on PowerNV, devices with outstanding krefs will not be rediscovered by EEH correctly after a complete hotplug. This is affecting CXL (CAPI) devices in the field. Put the probe after the kref check so that the PE is destroyed and affected devices are correctly rediscovered by EEH. Fixes: `d91dafc02f` ("powerpc/eeh: Delay probing EEH device during hotplug") Cc: stable@vger.kernel.org Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-14 21:31:49 +10:00
Gautham R. Shenoy	e63dbd16ab	powerpc: Add an inline function to update POWER8 HID0 Section 3.7 of Version 1.2 of the Power8 Processor User's Manual prescribes that updates to HID0 be preceded by a SYNC instruction and followed by an ISYNC instruction (Page 91). Create an inline function name update_power8_hid0() which follows this recipe and invoke it from the static split core path. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Tested-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-14 15:58:28 +10:00
Ingo Molnar	f52609fdab	Merge branch 'locking/arch-atomic' into locking/core, because it's ready for upstream Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-12 11:44:30 +02:00
Anshuman Khandual	9afac93343	powerpc/prom: Use DRCONF flags while processing detected LMBs Replace hard coded values with existing DRCONF flags while procesing detected LMBs from the device tree. Does not change any functionality. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-12 15:05:47 +10:00
Anshuman Khandual	8218a3031c	powerpc/xmon: Drop the valid variable completely in dump_segments() The value of 'valid' is always zero when 'esid' is zero, and if 'esid' is non-zero then the value of 'valid' is irrelevant because we are using logical or in the if expression. In fact 'valid' can be dropped completely from dump_segments() by simply doing the check with SLB_ESID_V directly in the if. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-12 15:05:47 +10:00
Anshuman Khandual	9c61f7a0ad	powerpc/prom: Simplify the logic to fetch SLB size The code to fetch the SLB size from the device tree wants to first look for "slb-size" and then if that's not found "ibm,slb-size". We can simplify the code by looking for the properties and then if we find one of them we set mmu_slb_size. We also change the function name from check_cpu_slb_size() to init_mmu_slb_size() as the function doesn't check anything, it only initialises mmu_slb_size. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-12 15:05:46 +10:00
Anshuman Khandual	79d0be7407	powerpc/slb: Add documentation on runtime patching of SLB encoding This patch adds some documentation to patch_slb_encoding() explaining how it works. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Update change log and mention the signedness of the immediate] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-12 15:04:52 +10:00
Anshuman Khandual	2be682af48	powerpc/slb: Rename all the 'slot' occurrences to 'entry' The SLB code uses 'slot' and 'entry' interchangeably, change it to always use 'entry'. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-12 14:50:12 +10:00
Anshuman Khandual	752b8adec4	powerpc/slb: Remove a duplicate extern variable This patch just removes one redundant entry for one extern variable 'slb_compare_rr_to_size' from the scope. This patch does not change any functionality. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-12 14:50:12 +10:00
Dan Williams	92b19ff50e	cleanup IORESOURCE_CACHEABLE vs ioremap() Quoting Arnd: I was thinking the opposite approach and basically removing all uses of IORESOURCE_CACHEABLE from the kernel. There are only a handful of them.and we can probably replace them all with hardcoded ioremap_cached() calls in the cases they are actually useful. All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of ioremap_nocache() if the resource is cacheable, however ioremap() is uncached by default. Clearly none of the existing usages care about the cacheability. Particularly devm_ioremap_resource() never worked as advertised since it always fell back to plain ioremap(). Clean this up as the new direction we want is to convert ioremap_<type>() usages to memremap(..., flags). Suggested-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2015-08-10 23:07:06 -04:00
Viresh Kumar	37a13e78e0	powerpc/time: Migrate to new 'set-state' interface Migrate powerpc driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED. We weren't doing anything in ->set_mode(ONSHOT) and so set_state_oneshot() isn't implemented. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>	2015-08-10 11:41:02 +02:00
Scott Wood	c60232029a	powerpc/fsl: Force coherent memory on e500mc derivatives In CoreNet systems it is not allowed to mix M and non-M mappings to the same memory, and coherent DMA accesses are considered to be M mappings for this purpose. Ignoring this has been observed to cause hard lockups in non-SMP kernels on e6500. Furthermore, e6500 implements the LRAT (logical to real address table) which allows KVM guests to control the WIMGE bits. This means that KVM cannot force the M bit on the way it usually does, so the guest had better set it itself. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 23:00:01 -05:00
Scott Wood	0d61f0b3e2	powerpc/booke64: Move mb() to __set_pte_at() with kernel-addr test map_kernel() doesn't catch all places that create kernel PTEs. In particular, vmalloc() calls set_pte_at() directly. This causes a crash when booting a non-SMP kernel on e6500. Move the sync to __set_pte(), to be executed only for kernel addresses. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 23:00:01 -05:00
Shaohui Xie	3fa647bff3	powerpc/config: enable aquantia PHY Aquantia PHYs used on platforms such as T2080RDB, T1024RDB. Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:33 -05:00
Shaohui Xie	b6808fb731	powerpc/85xx: enable teranetics PHY The PHY uses XAUI interface to connect to MAC, mostly the PHY used on riser card. Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:33 -05:00
Shengzhou Liu	add888d6b2	powerpc/t1023rdb: add ina220 current sensor node Add support for INA220 current sensor. Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:32 -05:00
Shengzhou Liu	4a6b8a4b20	powerpc/t1024rdb: add ina220 current sensor node Add support for INA220 current sensor. Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:32 -05:00
LEROY Christophe	295ffb4189	powerpc/32: Few optimisations in memcpy This patch adds a few optimisations in memcpy functions by using lbzu/stbu instead of lxb/stb and by re-ordering insn inside a loop to reduce latency due to loading Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:29 -05:00
LEROY Christophe	0b05e2d671	powerpc/32: cacheable_memcpy becomes memcpy cacheable_memcpy uses dcbz instruction and is more efficient than memcpy when the destination is in RAM. If the destination is in an io area, memcpy_toio() is normally used, not memcpy This patch renames memcpy as generic_memcpy, and renames cacheable_memcpy as memcpy On MPC885, we get approximatly 7% increase of the transfer rate on an FTP reception Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:27 -05:00
LEROY Christophe	c152f149ce	powerpc/32: Merge the new memset() with the old one cacheable_memzero() which has become the new memset() and the old memset() are quite similar, so just merge them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:24 -05:00
LEROY Christophe	5b2a32e806	powerpc/32: memset(0): use cacheable_memzero cacheable_memzero uses dcbz instruction and is more efficient than memset(0) when the destination is in RAM This patch renames memset as generic_memset, and defines memset as a prolog to cacheable_memzero. This prolog checks if the byte to set is 0. If not, it falls back to generic_memcpy() cacheable_memzero disappears as it is not referenced anywhere anymore Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:21 -05:00
LEROY Christophe	df087e450d	Partially revert "powerpc: Remove duplicate cacheable_memcpy/memzero functions" This partially reverts commit 'powerpc: Remove duplicate cacheable_memcpy/memzero functions ("b05ae4ee602b7dc90771408ccf0972e1b3801a35")' Functions cacheable_memcpy/memzero are more efficient than memcpy/memset as they use the dcbz instruction which avoids refill of the cacheline with the data that we will overwrite. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:21 -05:00
LEROY Christophe	934628c7e6	powerpc: use memset_io() to clear CPM Muram CPM muram is not cached, so use memset_io() instead of memset() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:20 -05:00
Scott Wood	2f7d2b74a9	powerpc/mm: Don't call __flush_dcache_icache_phys() with PA>VA __flush_dcache_icache_phys() requires the ability to access the memory with the MMU disabled, which means that on a 32-bit system any memory above 4 GiB is inaccessible. In particular, mpc86xx is 32-bit and can have more than 4 GiB of RAM. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:20 -05:00
LEROY Christophe	501c8de7b0	powerpc: add support for csum_add() The C version of csum_add() as defined in include/net/checksum.h gives the following assembly in ppc32: 0: 7c 04 1a 14 add r0,r4,r3 4: 7c 64 00 10 subfc r3,r4,r0 8: 7c 63 19 10 subfe r3,r3,r3 c: 7c 63 00 50 subf r3,r3,r0 and the following in ppc64: 0xc000000000001af8 <+0>: add r3,r3,r4 0xc000000000001afc <+4>: cmplw cr7,r3,r4 0xc000000000001b00 <+8>: mfcr r4 0xc000000000001b04 <+12>: rlwinm r4,r4,29,31,31 0xc000000000001b08 <+16>: add r3,r4,r3 0xc000000000001b0c <+20>: clrldi r3,r3,32 0xc000000000001b10 <+24>: blr include/net/checksum.h also offers the possibility to define an arch specific function. This patch provides a specific csum_add() inline function. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:19 -05:00
LEROY Christophe	92c985f1d7	powerpc: put csum_tcpudp_magic inline csum_tcpudp_magic() is only a few instructions, and does modify really few registers. So it is not worth having it as a separate function and suffer function branching and saving of volatile registers. This patch makes it inline by use of the already existing csum_tcpudp_nofold() function. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:19 -05:00
Scott Wood	44d5401425	powerpc/85xx: Use kconfig fragments Unify mpc85xx and corenet configs using fragments, to ease maintenance and avoid the sort of drift that the previous patch fixed. Hardware and software options are separated, with the hope that other embedded platforms could share the software options, and to make it easier to maintain custom/alternate configs that focus on either hardware or software options. Due to the previous patch, this patch should not affect the results of any of the affected defconfigs -- only how those results are achieved. The resulting config is more or less the union of the options that any of the configs previously selected. No attempt was made in this (or the previous) patch to edit out questionable options, but this patch will make it easier to do so in future patches. Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:19 -05:00
Scott Wood	7e2ad2ef85	powerpc/85xx: Make defconfigs consistent The mpc85xx and corenet configs have many differences between them that can't be explained by the target hardware of each config. The next patch will consolidate these targets using kconfig fragments; this patch shows what the resulting defconfigs will look like (generated by using savedefconfig on a fragment-generated config). Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:18 -05:00
Michael Ellerman	ecc7456880	powerpc: Update corenet32_smp_defconfig for modern distros corenet32_smp_defconfig is missing some things that modern distros require, enable them. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:18 -05:00
Yao Yuan	f728b8b7d0	powerpc/corenet32: enable DMA in defconfig By default we enable DMA(CONFIG_FSL_DMA) support which are needed on P2041RDB, P3041DS, P4080DS, B4860QDS, etc. Signed-off-by: Yuan Yao <yao.yuan@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:17 -05:00
Yangbo Lu	7a2efb3a88	powerpc/corenet: enable eSDHC Signed-off-by: Yangbo Lu <yangbo.lu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-08-07 22:59:17 -05:00
Laurent Pinchart	60acc4ebe7	treewide: Fix typo compatability -> compatibility Even though 'compatability' has a dedicated entry in the Wiktionary, it's listed as 'Mispelling of compatibility'. Fix it. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> for the atomic_helper.c Signed-off-by: Jiri Kosina <jkosina@suse.com>	2015-08-07 14:01:39 +02:00
Masanari Iida	971bd8fa36	treewide: Fix typo in printk This patch fix spelling typo inv various part of sources. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2015-08-07 13:58:05 +02:00
Amanieu d'Antras	3c00cb5e68	signal: fix information leak in copy_siginfo_from_user32 This function can leak kernel stack data when the user siginfo_t has a positive si_code value. The top 16 bits of si_code descibe which fields in the siginfo_t union are active, but they are treated inconsistently between copy_siginfo_from_user32, copy_siginfo_to_user32 and copy_siginfo_to_user. copy_siginfo_from_user32 is called from rt_sigqueueinfo and rt_tgsigqueueinfo in which the user has full control overthe top 16 bits of si_code. This fixes the following information leaks: x86: 8 bytes leaked when sending a signal from a 32-bit process to itself. This leak grows to 16 bytes if the process uses x32. (si_code = __SI_CHLD) x86: 100 bytes leaked when sending a signal from a 32-bit process to a 64-bit process. (si_code = -1) sparc: 4 bytes leaked when sending a signal from a 32-bit process to a 64-bit process. (si_code = any) parsic and s390 have similar bugs, but they are not vulnerable because rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code to a different process. These bugs are also fixed for consistency. Signed-off-by: Amanieu d'Antras <amanieu@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-08-07 04:39:40 +03:00
Naveen N. Rao	197165d449	powerpc/ftrace: add powerpc timebase as a trace clock source Add a new powerpc-specific trace clock using the timebase register, similar to x86-tsc. This gives us - a fast, monotonic, hardware clock source for trace entries, and - a clock that can be used to correlate events across cpus as well as across hypervisor and guests. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 16:36:23 +10:00
Wei Yongjun	35a7f41cc6	powerpc/4xx: Fix return value check in hsta_msi_probe() In case of error, the functions platform_get_resource() and kmalloc() returns NULL not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 16:33:46 +10:00
Joe Perches	a825ac078b	powerpc: Remove redundant breaks break; break; isn't useful. Remove one. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 15:10:20 +10:00
Kevin Hao	ae2a84b407	powerpc: pci: use %pR for printing struct resource Use %pR to simplify the debug code. This also make the debug info more readable. Signed-off-by: Kevin Hao <haokexin@gmail.com> [mpe: Unsplit multi-line printk strings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 15:10:19 +10:00
Mahesh Salgaonkar	62521ea6db	powerpc/powernv: Invoke opal_cec_reboot2() on unrecoverable HMI. Invoke new opal_cec_reboot2() call with reboot type OPAL_REBOOT_PLATFORM_ERROR (for unrecoverable HMI interrupts) to inform BMC/OCC about this error, so that BMC can collect relevant data for error analysis and decide what component to de-configure before rebooting. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 15:10:19 +10:00
Mahesh Salgaonkar	e784b6499d	powerpc/powernv: Invoke opal_cec_reboot2() on unrecoverable machine check errors. On non-recoverable MCE errors in kernel space, Linux kernel panics and system reboots. On BMC based system opal-prd runs as a daemon in the host. Hence, kernel crash may prevent opal-prd to detect and analyze this MCE error. This may land us in a situation where the faulty memory never gets de-configured and Linux would keep hitting same MCE error again and again. If this happens in early stage of kernel initialization, then Linux will keep crashing and rebooting in a loop. This patch fixes this issue by invoking new opal_cec_reboot2() call with reboot type OPAL_REBOOT_PLATFORM_ERROR to inform BMC/OCC about this error, so that BMC can collect relevant data for error analysis and decide what component to de-configure before rebooting. This patch is dependent on OPAL patchset posted on skiboot mailing list at https://lists.ozlabs.org/pipermail/skiboot/2015-July/001771.html that introduces opal_cec_reboot2() opal call. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 15:10:18 +10:00
Mahesh Salgaonkar	1852ae276b	powerpc/powernv: Pull all HMI events before panic. In the event of unrecovered HMI the existing code panics as soon as it receives the first unrecovered HMI event. This makes host to report partial information about HMIs before panic. There may be more errors which would have caused the HMI and hence more HMI event would have been generated waiting to be pulled by host. This patch implements a logic to pull and display all the HMI event before going down panic path. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 15:10:18 +10:00
Mahesh Salgaonkar	c33e11d0dd	powerpc/powernv: display reason for Malfunction Alert HMI. The V2 version of HMI event now carries additional information for Malfunction Alert. It now contains error information about CORE and NX checkstop. This patch checks and displays the check stop reason before panic. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-06 15:09:59 +10:00
Paul E. McKenney	12d560f4ea	rcu,locking: Privatize smp_mb__after_unlock_lock() RCU is the only thing that uses smp_mb__after_unlock_lock(), and is likely the only thing that ever will use it, so this commit makes this macro private to RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>	2015-08-04 08:49:21 -07:00
Konstantin Khlebnikov	c56dadf397	sched/preempt, powerpc, kvm: Use need_resched() instead of should_resched() Function should_resched() is equal to (!preempt_count() && need_resched()). In preemptive kernel preempt_count here is non-zero because of vc->lock. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Graf <agraf@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150715095203.12246.72922.stgit@buzz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:24 +02:00
Peter Zijlstra	11276d5306	locking/static_keys: Add a new static_key interface There are various problems and short-comings with the current static_key interface: - static_key_{true,false}() read like a branch depending on the key value, instead of the actual likely/unlikely branch depending on init value. - static_key_{true,false}() are, as stated above, tied to the static_key init values STATIC_KEY_INIT_{TRUE,FALSE}. - we're limited to the 2 (out of 4) possible options that compile to a default NOP because that's what our arch_static_branch() assembly emits. So provide a new static_key interface: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); Which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. This means adding a second arch_static_branch_jump() assembly helper which emits a JMP per default. In order to determine the right instruction for the right state, encode the branch type in the LSB of jump_entry::key. This is the final step in removing the naming confusion that has led to a stream of avoidable bugs such as: `a833581e37` ("x86, perf: Fix static_key bug in load_mm_cr4()") ... but it also allows new static key combinations that will give us performance enhancements in the subsequent patches. Tested-by: Rabin Vincent <rabin@rab.in> # arm Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 11:34:15 +02:00
Peter Zijlstra	76b235c6bc	jump_label: Rename JUMP_LABEL_{EN,DIS}ABLE to JUMP_LABEL_{JMP,NOP} Since we've already stepped away from ENABLE is a JMP and DISABLE is a NOP with the branch_default bits, and are going to make it even worse, rename it to make it all clearer. This way we don't mix multiple levels of logic attributes, but have a plain 'physical' name for what the current instruction patching status of a jump label is. This is a first step in removing the naming confusion that has led to a stream of avoidable bugs such as: `a833581e37` ("x86, perf: Fix static_key bug in load_mm_cr4()") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org [ Beefed up the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 11:34:12 +02:00
Andrey Konovalov	76695af20c	locking, arch: use WRITE_ONCE()/READ_ONCE() in smp_store_release()/smp_load_acquire() Replace ACCESS_ONCE() macro in smp_store_release() and smp_load_acquire() with WRITE_ONCE() and READ_ONCE() on x86, arm, arm64, ia64, metag, mips, powerpc, s390, sparc and asm-generic since ACCESS_ONCE() does not work reliably on non-scalar types. WRITE_ONCE() and READ_ONCE() were introduced in the following commits: `230fa253df` ("kernel: Provide READ_ONCE and ASSIGN_ONCE") `43239cbe79` ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Alexander Duyck <alexander.h.duyck@redhat.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@suse.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/1438528264-714-1-git-send-email-andreyknvl@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 10:59:30 +02:00
Yijing Wang	017ffe64e8	PCI: Hold pci_slot_mutex while searching bus->slots list Previously, pci_setup_device() and similar functions searched the pci_bus->slots list without any locking. It was possible for another thread to update the list while we searched it. Add pci_dev_assign_slot() to search the list while holding pci_slot_mutex. [bhelgaas: changelog, fold in CONFIG_SYSFS fix] Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2015-07-30 16:19:53 -05:00
Alistair Popple	b8d65e9662	powerpc/eeh-powernv: Fix unbalanced IRQ warning pnv_eeh_next_error() re-enables the eeh opal event interrupt but it gets called from a loop if there are more outstanding events to process, resulting in a warning due to enabling an already enabled interrupt. Instead the interrupt should only be re-enabled once the last outstanding event has been processed. Tested-by: Daniel Axtens <dja@axtens.net> Reported-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Alistair Popple <alistair@popple.id.au> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-30 19:01:32 +10:00
Michael Ellerman	2449acc534	powerpc/kernel: Enable seccomp filter This commit enables seccomp filter on powerpc, now that we have all the necessary pieces in place. To support seccomp's desire to modify the syscall return value under some circumstances, we use a different ABI to the ptrace ABI. That is we use r3 as the syscall return value, and orig_gpr3 is the first syscall parameter. This means the seccomp code, or a ptracer via SECCOMP_RET_TRACE, will see -ENOSYS preloaded in r3. This is identical to the behaviour on x86, and allows seccomp or the ptracer to either leave the -ENOSYS or change it to something else, as well as rejecting or not the syscall by modifying r0. If seccomp does not reject the syscall, we restore the register state to match what ptrace and audit expect, ie. r3 is the first syscall parameter again. We do this restore using orig_gpr3, which may have been modified by seccomp, which allows seccomp to modify the first syscall paramater and allow the syscall to proceed. We need to #ifdef the the additional handling of r3 for seccomp, so move it all out of line. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-30 14:34:44 +10:00
Marc Zyngier	ad3aedfbb0	genirq/irqdomain: Allow irq domain aliasing It is not uncommon (at least with the ARM stuff) to have a piece of hardware that implements different flavours of "interrupts". A typical example of this is the GICv3 ITS, which implements standard PCI/MSI support, but also some form of "generic MSI". So far, the PCI/MSI domain is registered using the ITS device_node, so that irq_find_host can return it. On the contrary, the raw MSI domain is not registered with an device_node, making it impossible to be looked up by another subsystem (obviously, using the same device_node twice would only result in confusion, as it is not defined which one irq_find_host would return). A solution to this is to "type" domains that may be aliasing, and to be able to lookup an device_node that matches a given type. For this, we introduce irq_find_matching_host() as a superset of irq_find_host: struct irq_domain irq_find_matching_host(struct device_node node, enum irq_domain_bus_token bus_token); where bus_token is the "type" we want to match the domain against (so far, only DOMAIN_BUS_ANY is defined). This result in some moderately invasive changes on the PPC side (which is the only user of the .match method). This has otherwise no functionnal change. Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: <linux-arm-kernel@lists.infradead.org> Cc: Yijing Wang <wangyijing@huawei.com> Cc: Ma Jun <majun258@huawei.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Duc Dang <dhdang@apm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jason Cooper <jason@lakedaemon.net> Link: http://lkml.kernel.org/r/1438091186-10244-2-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-07-30 00:14:36 +02:00
Thomas Gleixner	4b979e4c61	Merge branch 'linus' into irq/core Pull in upstream fixes before applying conflicting changes	2015-07-30 00:13:24 +02:00
Christoph Hellwig	4246a0b63b	block: add a bi_error field to struct bio Currently we have two different ways to signal an I/O error on a BIO: (1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the bi_end_io callback The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not beeing persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns. So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-07-29 08:55:15 -06:00
Luis R. Rodriguez	4c73e89266	arch//io.h: Add ioremap_uc() to all architectures This adds ioremap_uc() only for architectures that do not include asm-generic.h/io.h as that already provides a default definition for them for both cases where you have CONFIG_MMU and you do not, and because of this, the number of architectures this patch address is less than the architectures that the ioremap_wt() patch addressed, "arch//io.h: Add ioremap_wt() to all architectures"). In order to reduce the number of architectures we have to modify by adding new architecture IO APIs we'll have to review the architectures in this patch, see why they can't add asm-generic.h/io.h or issues that would be created by doing so and then spread a consistent inclusion of this header towards the end of their own header. For instance arch/metag includes the asm-generic/io.h before the ioremap*() definitions, this should be the other way around but only once we have guard wrappers for the non-MMU case also for asm-generic/io.h. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Cc: Abhilash Kesavan <a.kesavan@samsung.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@suse.de> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Howells <dhowells@redhat.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Kyle McMartin <kyle@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-am33-list@redhat.com Cc: linux-arch@vger.kernel.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-sh@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20150728181713.GB30479@wotan.suse.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-29 10:02:36 +02:00
Michael Ellerman	1b60bab04e	powerpc/kernel: Add SIG_SYS support for compat tasks SIG_SYS was added in commit `a0727e8ce5` "signal, x86: add SIGSYS info and make it synchronous." Because we use the asm-generic struct siginfo, we got support for SIG_SYS for free as part of that commit. However there was no compat handling added for powerpc. That means we've been advertising the existence of signfo._sifields._sigsys to compat tasks, but not actually filling in the fields correctly. Luckily it looks like no one has noticed, presumably because the only user of SIGSYS in the kernel is seccomp filter, which we don't support yet. So before we enable seccomp filter, add compat handling for SIGSYS. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:13 +10:00
Michael Ellerman	e9fbe68632	powerpc: Change syscall_get_nr() to return int The documentation for syscall_get_nr() in asm-generic says: Note this returns int even on 64-bit machines. Only 32 bits of system call number can be meaningful. If the actual arch value is 64 bits, this truncates to 32 bits so 0xffffffff means -1. However our implementation was never updated to reflect this. Generally it's not important, but there is once case where it matters. For seccomp filter with SECCOMP_RET_TRACE, the tracer will set regs->gpr[0] to -1 to reject the syscall. When the task is a compat task, this means we end up with 0xffffffff in r0 because ptrace will zero extend the 32-bit value. If syscall_get_nr() returns an unsigned long, then a 64-bit kernel will see a positive value in r0 and will incorrectly allow the syscall through seccomp. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:13 +10:00
Michael Ellerman	1cb9839b73	powerpc: Use orig_gpr3 in syscall_get_arguments() Currently syscall_get_arguments() is used by syscall tracepoints, and collect_syscall() which is used in some debugging as well as /proc/pid/syscall. The current implementation just copies regs->gpr[3 .. 5] out, which is fine for all the current use cases. When we enable seccomp filter, that will also start using syscall_get_arguments(). However for seccomp filter we want to use r3 as the return value of the syscall, and orig_gpr3 as the first parameter. This will allow seccomp to modify the return value in r3. To support this we need to modify syscall_get_arguments() to return orig_gpr3 instead of r3. This is safe for all uses because orig_gpr3 always contains the r3 value that was passed to the syscall. We store it in the syscall entry path and never modify it. Update syscall_set_arguments() while we're here, even though it's never used. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:13 +10:00
Michael Ellerman	a765784429	powerpc: Rework syscall_get_arguments() so there is only one loop Currently syscall_get_arguments() has two loops, one for compat and one for regular tasks. In prepartion for the next patch, which changes which registers we use, switch it to only have one loop, so we only have one place to update. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:12 +10:00
Michael Ellerman	1b1a3702a6	powerpc: Don't negate error in syscall_set_return_value() Currently the only caller of syscall_set_return_value() is seccomp filter, which is not enabled on powerpc. This means we have not noticed that our implementation of syscall_set_return_value() negates error, even though the value passed in is already negative. So remove the negation in syscall_set_return_value(), and expect the caller to do it like all other implementations do. Also add a comment about the ccr handling. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:12 +10:00
Michael Ellerman	2923e6d503	powerpc: Drop unused syscall_get_error() syscall_get_error() is unused, and never has been. It's also probably wrong, as it negates r3 before returning it, but that depends on what the caller is expecting. It also doesn't deal with compat, and doesn't deal with TIF_NOERROR. Although we could fix those, until it has a caller and it's clear what semantics the caller wants it's just untested code. So drop it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:12 +10:00
Michael Ellerman	d38374142b	powerpc/kernel: Change the do_syscall_trace_enter() API The API for calling do_syscall_trace_enter() is currently sensible enough, it just returns the (modified) syscall number. However once we enable seccomp filter it will get more complicated. When seccomp filter runs, the seccomp kernel code (via SECCOMP_RET_ERRNO), or a ptracer (via SECCOMP_RET_TRACE), may reject the syscall and may or may not set a return value in r3. That means the assembler that calls do_syscall_trace_enter() can not blindly return ENOSYS, it needs to only return ENOSYS if a return value has not already been set. There is no way to implement that logic with the current API. So change the do_syscall_trace_enter() API to make it deal with the return code juggling, and the assembler can then just return whatever return code it is given. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:11 +10:00
Michael Ellerman	c3525940cc	powerpc/kernel: Switch to using MAX_ERRNO Currently on powerpc we have our own #define for the highest (negative) errno value, called _LAST_ERRNO. This is defined to be 516, for reasons which are not clear. The generic code, and x86, use MAX_ERRNO, which is defined to be 4095. In particular seccomp uses MAX_ERRNO to restrict the value that a seccomp filter can return. Currently with the mismatch between _LAST_ERRNO and MAX_ERRNO, a seccomp tracer wanting to return 600, expecting it to be seen as an error, would instead find on powerpc that userspace sees a successful syscall with a return value of 600. To avoid this inconsistency, switch powerpc to use MAX_ERRNO. We are somewhat confident that generic syscalls that can return a non-error value above negative MAX_ERRNO have already been updated to use force_successful_syscall_return(). I have also checked all the powerpc specific syscalls, and believe that none of them expect to return a non-error value between -MAX_ERRNO and -516. So this change should be safe ... Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Kees Cook <keescook@chromium.org>	2015-07-29 11:56:11 +10:00
Shilpasri G Bhat	196ba2d514	powerpc/powernv: Add definition of OPAL_MSG_OCC message type Add OPAL_MSG_OCC message definition to opal_message_type to receive OCC events like reset, load and throttled. Host performance can be affected when OCC is reset or OCC throttles the max Pstate. We can register to opal_message_notifier to receive OPAL_MSG_OCC type of message and report it to the userspace so as to keep the user informed about the reason for a performance drop in workloads. The reset and load OCC events are notified to kernel when FSP sends OCC_RESET and OCC_LOAD commands. Both reset and load messages are sent to kernel on successful completion of reset and load operation respectively. The throttle OCC event indicates that the Pmax of the chip is reduced. The chip_id and throttle reason for reducing Pmax is also queued along with the message. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-07-28 17:24:13 +02:00
Peter Zijlstra	de9e432cb5	atomic: Collapse all atomic_{set,clear}_mask definitions Move the now generic definitions of atomic_{set,clear}_mask() into linux/atomic.h to avoid endless and pointless repetition. Also, provide an atomic_andnot() wrapper for those few archs that can implement that. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-07-27 14:06:24 +02:00

1 2 3 4 5 ...

14079 Commits