linux

Commit Graph

Author	SHA1	Message	Date
Alexander Graf	b3c5d3c2a4	KVM: PPC: Rename MMIO register identifiers We need the KVM_REG namespace for generic register settings now, so let's rename the existing users to something different, enabling us to reuse the namespace for more visible interfaces. While at it, also move these private constants to a private header. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:41 +02:00
Paul Mackerras	31f3438eca	KVM: PPC: Move kvm_vcpu_ioctl_[gs]et_one_reg down to platform-specific code This moves the get/set_one_reg implementation down from powerpc.c into booke.c, book3s_pr.c and book3s_hv.c. This avoids #ifdefs in C code, but more importantly, it fixes a bug on Book3s HV where we were accessing beyond the end of the kvm_vcpu struct (via the to_book3s() macro) and corrupting memory, causing random crashes and file corruption. On Book3s HV we only accept setting the HIOR to zero, since the guest runs in supervisor mode and its vectors are never offset from zero. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> [agraf update to apply on top of changed ONE_REG patches] Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:41 +02:00
Alexander Graf	1022fc3d3b	KVM: PPC: Add support for explicit HIOR setting Until now, we always set HIOR based on the PVR, but this is just wrong. Instead, we should be setting HIOR explicitly, so user space can decide what the initial HIOR value is - just like on real hardware. We keep the old PVR based way around for backwards compatibility, but once user space uses the SET_ONE_REG based method, we drop the PVR logic. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:41 +02:00
Alexander Graf	e24ed81fed	KVM: PPC: Add generic single register ioctls Right now we transfer a static struct every time we want to get or set registers. Unfortunately, over time we realize that there are more of these than we thought of before and the extensibility and flexibility of transferring a full struct every time is limited. So this is a new approach to the problem. With these new ioctls, we can get and set a single register that is identified by an ID. This allows for very precise and limited transmittal of data. When we later realize that it's a better idea to shove over multiple registers at once, we can reuse most of the infrastructure and simply implement a GET_MANY_REGS / SET_MANY_REGS interface. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:40 +02:00
Sasha Levin	6b75e6bfef	KVM: PPC: Use the vcpu kmem_cache when allocating new VCPUs Currently the code kzalloc()s new VCPUs instead of using the kmem_cache which is created when KVM is initialized. Modify it to allocate VCPUs from that kmem_cache. Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:40 +02:00
Liu Yu	d37b1a037c	KVM: PPC: booke: Add booke206 TLB trace The existing kvm_stlb_write/kvm_gtlb_write were a poor match for the e500/book3e MMU -- mas1 was passed as "tid", mas2 was limited to "unsigned int" which will be a problem on 64-bit, mas3/7 got split up rather than treated as a single 64-bit word, etc. Signed-off-by: Liu Yu <yu.liu@freescale.com> [scottwood@freescale.com: made mas2 64-bit, and added mas8 init] Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:40 +02:00
Paul Mackerras	82ed36164c	KVM: PPC: Book3s HV: Implement get_dirty_log using hardware changed bit This changes the implementation of kvm_vm_ioctl_get_dirty_log() for Book3s HV guests to use the hardware C (changed) bits in the guest hashed page table. Since this makes the implementation quite different from the Book3s PR case, this moves the existing implementation from book3s.c to book3s_pr.c and creates a new implementation in book3s_hv.c. That implementation calls kvmppc_hv_get_dirty_log() to do the actual work by calling kvm_test_clear_dirty on each page. It iterates over the HPTEs, clearing the C bit if set, and returns 1 if any C bit was set (including the saved C bit in the rmap entry). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:39 +02:00
Paul Mackerras	5551489373	KVM: PPC: Book3S HV: Use the hardware referenced bit for kvm_age_hva This uses the host view of the hardware R (referenced) bit to speed up kvm_age_hva() and kvm_test_age_hva(). Instead of removing all the relevant HPTEs in kvm_age_hva(), we now just reset their R bits if set. Also, kvm_test_age_hva() now scans the relevant HPTEs to see if any of them have R set. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:39 +02:00
Paul Mackerras	bad3b5075e	KVM: PPC: Book3s HV: Maintain separate guest and host views of R and C bits This allows both the guest and the host to use the referenced (R) and changed (C) bits in the guest hashed page table. The guest has a view of R and C that is maintained in the guest_rpte field of the revmap entry for the HPTE, and the host has a view that is maintained in the rmap entry for the associated gfn. Both view are updated from the guest HPT. If a bit (R or C) is zero in either view, it will be initially set to zero in the HPTE (or HPTEs), until set to 1 by hardware. When an HPTE is removed for any reason, the R and C bits from the HPTE are ORed into both views. We have to be careful to read the R and C bits from the HPTE after invalidating it, but before unlocking it, in case of any late updates by the hardware. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:39 +02:00
Paul Mackerras	a92bce95f0	KVM: PPC: Book3S HV: Keep HPTE locked when invalidating This reworks the implementations of the H_REMOVE and H_BULK_REMOVE hcalls to make sure that we keep the HPTE locked and in the reverse- mapping chain until we have finished invalidating it. Previously we would remove it from the chain and unlock it before invalidating it, leaving a tiny window when the guest could access the page even though we believe we have removed it from the guest (e.g., kvm_unmap_hva() has been called for the page and has found no HPTEs in the chain). In addition, we'll need this for future patches where we will need to read the R and C bits in the HPTE after invalidating it. Doing this required restructuring kvmppc_h_bulk_remove() substantially. Since we want to batch up the tlbies, we now need to keep several HPTEs locked simultaneously. In order to avoid possible deadlocks, we don't spin on the HPTE bitlock for any except the first HPTE in a batch. If we can't acquire the HPTE bitlock for the second or subsequent HPTE, we terminate the batch at that point, do the tlbies that we have accumulated so far, unlock those HPTEs, and then start a new batch to do the remaining invalidations. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:39 +02:00
Matt Evans	b5434032fc	KVM: PPC: Add KVM_CAP_NR_VCPUS and KVM_CAP_MAX_VCPUS PPC KVM lacks these two capabilities, and as such a userland system must assume a max of 4 VCPUs (following api.txt). With these, a userland can determine a more realistic limit. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:38 +02:00
Matt Evans	03cdab5340	KVM: PPC: Fix vcpu_create dereference before validity check. Fix usage of vcpu struct before check that it's actually valid. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:38 +02:00
Paul Mackerras	4cf302bc10	KVM: PPC: Allow for read-only pages backing a Book3S HV guest With this, if a guest does an H_ENTER with a read/write HPTE on a page which is currently read-only, we make the actual HPTE inserted be a read-only version of the HPTE. We now intercept protection faults as well as HPTE not found faults, and for a protection fault we work out whether it should be reflected to the guest (e.g. because the guest HPTE didn't allow write access to usermode) or handled by switching to kernel context and calling kvmppc_book3s_hv_page_fault, which will then request write access to the page and update the actual HPTE. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:38 +02:00
Paul Mackerras	342d3db763	KVM: PPC: Implement MMU notifiers for Book3S HV guests This adds the infrastructure to enable us to page out pages underneath a Book3S HV guest, on processors that support virtualized partition memory, that is, POWER7. Instead of pinning all the guest's pages, we now look in the host userspace Linux page tables to find the mapping for a given guest page. Then, if the userspace Linux PTE gets invalidated, kvm_unmap_hva() gets called for that address, and we replace all the guest HPTEs that refer to that page with absent HPTEs, i.e. ones with the valid bit clear and the HPTE_V_ABSENT bit set, which will cause an HDSI when the guest tries to access them. Finally, the page fault handler is extended to reinstantiate the guest HPTE when the guest tries to access a page which has been paged out. Since we can't intercept the guest DSI and ISI interrupts on PPC970, we still have to pin all the guest pages on PPC970. We have a new flag, kvm->arch.using_mmu_notifiers, that indicates whether we can page guest pages out. If it is not set, the MMU notifier callbacks do nothing and everything operates as before. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:38 +02:00
Paul Mackerras	697d3899dc	KVM: PPC: Implement MMIO emulation support for Book3S HV guests This provides the low-level support for MMIO emulation in Book3S HV guests. When the guest tries to map a page which is not covered by any memslot, that page is taken to be an MMIO emulation page. Instead of inserting a valid HPTE, we insert an HPTE that has the valid bit clear but another hypervisor software-use bit set, which we call HPTE_V_ABSENT, to indicate that this is an absent page. An absent page is treated much like a valid page as far as guest hcalls (H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that an absent HPTE doesn't need to be invalidated with tlbie since it was never valid as far as the hardware is concerned. When the guest accesses a page for which there is an absent HPTE, it will take a hypervisor data storage interrupt (HDSI) since we now set the VPM1 bit in the LPCR. Our HDSI handler for HPTE-not-present faults looks up the hash table and if it finds an absent HPTE mapping the requested virtual address, will switch to kernel mode and handle the fault in kvmppc_book3s_hv_page_fault(), which at present just calls kvmppc_hv_emulate_mmio() to set up the MMIO emulation. This is based on an earlier patch by Benjamin Herrenschmidt, but since heavily reworked. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:37 +02:00
Paul Mackerras	06ce2c63d9	KVM: PPC: Maintain a doubly-linked list of guest HPTEs for each gfn This expands the reverse mapping array to contain two links for each HPTE which are used to link together HPTEs that correspond to the same guest logical page. Each circular list of HPTEs is pointed to by the rmap array entry for the guest logical page, pointed to by the relevant memslot. Links are 32-bit HPT entry indexes rather than full 64-bit pointers, to save space. We use 3 of the remaining 32 bits in the rmap array entries as a lock bit, a referenced bit and a present bit (the present bit is needed since HPTE index 0 is valid). The bit lock for the rmap chain nests inside the HPTE lock bit. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:37 +02:00
Paul Mackerras	9d0ef5ea04	KVM: PPC: Allow I/O mappings in memory slots This provides for the case where userspace maps an I/O device into the address range of a memory slot using a VM_PFNMAP mapping. In that case, we work out the pfn from vma->vm_pgoff, and record the cache enable bits from vma->vm_page_prot in two low-order bits in the slot_phys array entries. Then, in kvmppc_h_enter() we check that the cache bits in the HPTE that the guest wants to insert match the cache bits in the slot_phys array entry. However, we do allow the guest to create what it thinks is a non-cacheable or write-through mapping to memory that is actually cacheable, so that we can use normal system memory as part of an emulated device later on. In that case the actual HPTE we insert is a cacheable HPTE. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:37 +02:00
Paul Mackerras	da9d1d7f28	KVM: PPC: Allow use of small pages to back Book3S HV guests This relaxes the requirement that the guest memory be provided as 16MB huge pages, allowing it to be provided as normal memory, i.e. in pages of PAGE_SIZE bytes (4k or 64k). To allow this, we index the kvm->arch.slot_phys[] arrays with a small page index, even if huge pages are being used, and use the low-order 5 bits of each entry to store the order of the enclosing page with respect to normal pages, i.e. log_2(enclosing_page_size / PAGE_SIZE). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:37 +02:00
Paul Mackerras	c77162dee7	KVM: PPC: Only get pages when actually needed, not in prepare_memory_region() This removes the code from kvmppc_core_prepare_memory_region() that looked up the VMA for the region being added and called hva_to_page to get the pfns for the memory. We have no guarantee that there will be anything mapped there at the time of the KVM_SET_USER_MEMORY_REGION ioctl call; userspace can do that ioctl and then map memory into the region later. Instead we defer looking up the pfn for each memory page until it is needed, which generally means when the guest does an H_ENTER hcall on the page. Since we can't call get_user_pages in real mode, if we don't already have the pfn for the page, kvmppc_h_enter() will return H_TOO_HARD and we then call kvmppc_virtmode_h_enter() once we get back to kernel context. That calls kvmppc_get_guest_page() to get the pfn for the page, and then calls back to kvmppc_h_enter() to redo the HPTE insertion. When the first vcpu starts executing, we need to have the RMO or VRMA region mapped so that the guest's real mode accesses will work. Thus we now have a check in kvmppc_vcpu_run() to see if the RMO/VRMA is set up and if not, call kvmppc_hv_setup_rma(). It checks if the memslot starting at guest physical 0 now has RMO memory mapped there; if so it sets it up for the guest, otherwise on POWER7 it sets up the VRMA. The function that does that, kvmppc_map_vrma, is now a bit simpler, as it calls kvmppc_virtmode_h_enter instead of creating the HPTE itself. Since we are now potentially updating entries in the slot_phys[] arrays from multiple vcpu threads, we now have a spinlock protecting those updates to ensure that we don't lose track of any references to pages. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:36 +02:00
Paul Mackerras	075295dd32	KVM: PPC: Make the H_ENTER hcall more reliable At present, our implementation of H_ENTER only makes one try at locking each slot that it looks at, and doesn't even retry the ldarx/stdcx. atomic update sequence that it uses to attempt to lock the slot. Thus it can return the H_PTEG_FULL error unnecessarily, particularly when the H_EXACT flag is set, meaning that the caller wants a specific PTEG slot. This improves the situation by making a second pass when no free HPTE slot is found, where we spin until we succeed in locking each slot in turn and then check whether it is full while we hold the lock. If the second pass fails, then we return H_PTEG_FULL. This also moves lock_hpte to a header file (since later commits in this series will need to use it from other source files) and renames it to try_lock_hpte, which is a somewhat less misleading name. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:36 +02:00
Paul Mackerras	93e602490c	KVM: PPC: Add an interface for pinning guest pages in Book3s HV guests This adds two new functions, kvmppc_pin_guest_page() and kvmppc_unpin_guest_page(), and uses them to pin the guest pages where the guest has registered areas of memory for the hypervisor to update, (i.e. the per-cpu virtual processor areas, SLB shadow buffers and dispatch trace logs) and then unpin them when they are no longer required. Although it is not strictly necessary to pin the pages at this point, since all guest pages are already pinned, later commits in this series will mean that guest pages aren't all pinned. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:36 +02:00
Paul Mackerras	b2b2f16508	KVM: PPC: Keep page physical addresses in per-slot arrays This allocates an array for each memory slot that is added to store the physical addresses of the pages in the slot. This array is vmalloc'd and accessed in kvmppc_h_enter using real_vmalloc_addr(). This allows us to remove the ram_pginfo field from the kvm_arch struct, and removes the 64GB guest RAM limit that we had. We use the low-order bits of the array entries to store a flag indicating that we have done get_page on the corresponding page, and therefore need to call put_page when we are finished with the page. Currently this is set for all pages except those in our special RMO regions. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:35 +02:00
Paul Mackerras	8936dda4c2	KVM: PPC: Keep a record of HV guest view of hashed page table entries This adds an array that parallels the guest hashed page table (HPT), that is, it has one entry per HPTE, used to store the guest's view of the second doubleword of the corresponding HPTE. The first doubleword in the HPTE is the same as the guest's idea of it, so we don't need to store a copy, but the second doubleword in the HPTE has the real page number rather than the guest's logical page number. This allows us to remove the back_translate() and reverse_xlate() functions. This "reverse mapping" array is vmalloc'd, meaning that to access it in real mode we have to walk the kernel's page tables explicitly. That is done by the new real_vmalloc_addr() function. (In fact this returns an address in the linear mapping, so the result is usable both in real mode and in virtual mode.) There are also some minor cleanups here: moving the definitions of HPT_ORDER etc. to a header file and defining HPT_NPTE for HPT_NPTEG << 3. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:35 +02:00
Paul Mackerras	4e72dbe135	KVM: PPC: Make wakeups work again for Book3S HV guests When commit f43fdc15fa ("KVM: PPC: booke: Improve timer register emulation") factored out some code in arch/powerpc/kvm/powerpc.c into a new helper function, kvm_vcpu_kick(), an error crept in which causes Book3s HV guest vcpus to stall. This fixes it. On POWER7 machines, guest vcpus are grouped together into virtual CPU cores that share a single waitqueue, so it's important to use vcpu->arch.wqp rather than &vcpu->wq. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:34 +02:00
Liu Yu-B13201	befdc0a65a	KVM: PPC: Avoid patching paravirt template code Currently we patch the whole code include paravirt template code. This isn't safe for scratch area and has impact to performance. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:34 +02:00
Scott Wood	570135243a	KVM: PPC: e500: use hardware hint when loading TLB0 entries The hardware maintains a per-set next victim hint. Using this reduces conflicts, especially on e500v2 where a single guest TLB entry is mapped to two shadow TLB entries (user and kernel). We want those two entries to go to different TLB ways. sesel is now only used for TLB1. Reported-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:34 +02:00
Scott Wood	7b11dc9938	KVM: PPC: e500: Fix TLBnCFG in KVM_CONFIG_TLB The associativity, not just total size, can differ from the host hardware. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:32 +02:00
Alexander Graf	e371f713db	KVM: PPC: Book3S: PR: Fix signal check race As Scott put it: > If we get a signal after the check, we want to be sure that we don't > receive the reschedule IPI until after we're in the guest, so that it > will cause another signal check. we need to have interrupts disabled from the point we do signal_check() all the way until we actually enter the guest. This patch fixes potential signal loss races. Reported-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:30 +02:00
Alexander Graf	ae21216bec	KVM: PPC: align vcpu_kick with x86 Our vcpu kick implementation differs a bit from x86 which resulted in us not disabling preemption during the kick. Get it a bit closer to what x86 does. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:30 +02:00
Alexander Graf	468a12c2b5	KVM: PPC: Use get/set for to_svcpu to help preemption When running the 64-bit Book3s PR code without CONFIG_PREEMPT_NONE, we were doing a few things wrong, most notably access to PACA fields without making sure that the pointers stay stable accross the access (preempt_disable()). This patch moves to_svcpu towards a get/put model which allows us to disable preemption while accessing the shadow vcpu fields in the PACA. That way we can run preemptible and everyone's happy! Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:30 +02:00
Alexander Graf	d33ad328c0	KVM: PPC: Book3s: PR: No irq_disable in vcpu_run Somewhere during merges we ended up from local_irq_enable() foo(); local_irq_disable() to always keeping irqs enabled during that part. However, we now have the following code: foo(); local_irq_disable() which disables interrupts without the surrounding code enabling them again! So let's remove that disable and be happy. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:28 +02:00
Alexander Graf	7d82714d4d	KVM: PPC: Book3s: PR: Disable preemption in vcpu_run When entering the guest, we want to make sure we're not getting preempted away, so let's disable preemption on entry, but enable it again while handling guest exits. Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:27 +02:00
Scott Wood	dfd4d47e9a	KVM: PPC: booke: Improve timer register emulation Decrementers are now properly driven by TCR/TSR, and the guest has full read/write access to these registers. The decrementer keeps ticking (and setting the TSR bit) regardless of whether the interrupts are enabled with TCR. The decrementer stops at zero, rather than going negative. Decrementers (and FITs, once implemented) are delivered as level-triggered interrupts -- dequeued when the TSR bit is cleared, not on delivery. Signed-off-by: Liu Yu <yu.liu@freescale.com> [scottwood@freescale.com: significant changes] Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:27 +02:00
Scott Wood	b59049720d	KVM: PPC: Paravirtualize SPRG4-7, ESR, PIR, MASn This allows additional registers to be accessed by the guest in PR-mode KVM without trapping. SPRG4-7 are readable from userspace. On booke, KVM will sync these registers when it enters the guest, so that accesses from guest userspace will work. The guest kernel, OTOH, must consistently use either the real registers or the shared area between exits. This also applies to the already-paravirted SPRG3. On non-booke, it's not clear to what extent SPRG4-7 are supported (they're not architected for book3s, but exist on at least some classic chips). They are copied in the get/set regs ioctls, but I do not see any non-booke emulation. I also do not see any syncing with real registers (in PR-mode) including the user-readable SPRG3. This patch should not make that situation any worse. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:26 +02:00
Scott Wood	940b45ec18	KVM: PPC: booke: Paravirtualize wrtee Also fix wrteei 1 paravirt to check for a pending interrupt. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:26 +02:00
Scott Wood	29ac26efbd	KVM: PPC: booke: Fix int_pending calculation for MSR[EE] paravirt int_pending was only being lowered if a bit in pending_exceptions was cleared during exception delivery -- but for interrupts, we clear it during IACK/TSR emulation. This caused paravirt for enabling MSR[EE] to be ineffective. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:26 +02:00
Scott Wood	c59a6a3e4e	KVM: PPC: booke: Check for MSR[WE] in prepare_to_enter This prevents us from inappropriately blocking in a KVM_SET_REGS ioctl -- the MSR[WE] will take effect when the guest is next entered. It also causes SRR1[WE] to be set when we enter the guest's interrupt handler, which is what e500 hardware is documented to do. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:26 +02:00
Scott Wood	25051b5a5a	KVM: PPC: Move prepare_to_enter call site into subarch code This function should be called with interrupts disabled, to avoid a race where an exception is delivered after we check, but the resched kick is received before we disable interrupts (and thus doesn't actually trigger the exit code that would recheck exceptions). booke already does this properly in the lightweight exit case, but not on initial entry. For now, move the call of prepare_to_enter into subarch-specific code so that booke can do the right thing here. Ideally book3s would do the same thing, but I'm having a hard time seeing where it does any interrupt disabling of this sort (plus it has several additional call sites), so I'm deferring the book3s fix to someone more familiar with that code. book3s behavior should be unchanged by this patch. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:26 +02:00
Scott Wood	7e28e60ef9	KVM: PPC: Rename deliver_interrupts to prepare_to_enter This function also updates paravirt int_pending, so rename it to be more obvious that this is a collection of checks run prior to (re)entering a guest. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:25 +02:00
Scott Wood	1d1ef22208	KVM: PPC: booke: check for signals in kvmppc_vcpu_run Currently we check prior to returning from a lightweight exit, but not prior to initial entry. book3s already does a similar test. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:25 +02:00
Bharat Bhushan	7401f6266d	KVM: PPC: booke: Do Not start decrementer when SPRN_DEC set 0 As per specification the decrementer interrupt not happen when DEC is written with 0. Also when DEC is zero, no decrementer running. So we should not start hrtimer for decrementer when DEC = 0. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:25 +02:00
Bharat Bhushan	dc2babfea5	KVM: PPC: Fix DEC truncation for greater than 0xffff_ffff/1000 kvmppc_emulate_dec() uses dec_nsec of type unsigned long and does below calculation: dec_nsec = vcpu->arch.dec; dec_nsec *= 1000; This will truncate if DEC value "vcpu->arch.dec" is greater than 0xffff_ffff/1000. For example : For tb_ticks_per_usec = 4a, we can not set decrementer more than ~58ms. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Acked-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:25 +02:00
Bharat Bhushan	f9208427f7	PPC: Fix race in mtmsr paravirt implementation The current implementation of mtmsr and mtmsrd are racy in that it does: * check (int_pending == 0) ---> host sets int_pending = 1 <--- * write shared page * done while instead we should check for int_pending after the shared page is written. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:24 +02:00
Alexander Graf	95325e6b19	KVM: PPC: E500: Support hugetlbfs With hugetlbfs support emerging on e500, we should also support KVM backing its guest memory by it. This patch adds support for hugetlbfs into the e500 shadow mmu code. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:24 +02:00
Scott Wood	841741f23b	KVM: PPC: e500: Don't hardcode PIR=0 The hardcoded behavior prevents proper SMP support. user space shall specify the vcpu's PIR as the vcpu id. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:24 +02:00
Scott Wood	303b7c97e3	KVM: PPC: e500: tlbsx: fix tlb0 esel It should contain the way, not the absolute TLB0 index. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:24 +02:00
Scott Wood	dc83b8bc02	KVM: PPC: e500: MMU API This implements a shared-memory API for giving host userspace access to the guest's TLB. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:24 +02:00
Scott Wood	0164c0f0c4	KVM: PPC: e500: clear up confusion between host and guest entries Split out the portions of tlbe_priv that should be associated with host entries into tlbe_ref. Base victim selection on the number of hardware entries, not guest entries. For TLB1, where one guest entry can be mapped by multiple host entries, we use the host tlbe_ref for tracking page references. For the guest TLB0 entries, we still track it with gtlb_priv, to avoid having to retranslate if the entry is evicted from the host TLB but not the guest TLB. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:23 +02:00
Scott Wood	90b92a6f51	KVM: PPC: e500: Eliminate preempt_disable in local_sid_destroy_all The only place it makes sense to call this function already needs to have preemption disabled. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:23 +02:00
Scott Wood	3bf3cdcc14	KVM: PPC: e500: don't translate gfn to pfn with preemption disabled Delay allocation of the shadow pid until we're ready to disable preemption and write the entry. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:23 +02:00
Christian Borntraeger	b9e5dc8d45	KVM: provide synchronous registers in kvm_run On some cpus the overhead for virtualization instructions is in the same range as a system call. Having to call multiple ioctls to get set registers will make certain userspace handled exits more expensive than necessary. Lets provide a section in kvm_run that works as a shared save area for guest registers. We also provide two 64bit flags fields (architecture specific), that will specify 1. which parts of these fields are valid. 2. which registers were modified by userspace Each bit for these flag fields will define a group of registers (like general purpose) or a single register. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:22 +02:00
Carsten Otte	5b1c1493af	KVM: s390: ucontrol: export SIE control block to user This patch exports the s390 SIE hardware control block to userspace via the mapping of the vcpu file descriptor. In order to do so, a new arch callback named kvm_arch_vcpu_fault is introduced for all architectures. It allows to map architecture specific pages. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:19 +02:00
Carsten Otte	e08b963716	KVM: s390: add parameter for KVM_CREATE_VM This patch introduces a new config option for user controlled kernel virtual machines. It introduces a parameter to KVM_CREATE_VM that allows to set bits that alter the capabilities of the newly created virtual machine. The parameter is passed to kvm_arch_init_vm for all architectures. The only valid modifier bit for now is KVM_VM_S390_UCONTROL. This requires CAP_SYS_ADMIN privileges and creates a user controlled virtual machine on s390 architectures. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-03-05 14:52:18 +02:00
Ingo Molnar	737f24bda7	Merge branch 'perf/urgent' into perf/core Conflicts: tools/perf/builtin-record.c tools/perf/builtin-top.c tools/perf/perf.h tools/perf/util/top.h Merge reason: resolve these cherry-picking conflicts. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-05 09:20:08 +01:00
Thomas Gleixner	ba74c1448f	sched/rt: Document scheduler related skip-resched-check sites Create a distinction between scheduler related preempt_enable_no_resched() calls and the nearly one hundred other places in the kernel that do not want to reschedule, for one reason or another. This distinction matters for -rt, where the scheduler and the non-scheduler preempt models (and checks) are different. For upstream it's purely documentational. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/tip-gs88fvx2mdv5psnzxnv575ke@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-01 10:28:04 +01:00
Thomas Gleixner	bd2f55361f	sched/rt: Use schedule_preempt_disabled() Coccinelle based conversion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-24swm5zut3h9c4a6s46x8rws@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-01 10:28:03 +01:00
Paul Gortmaker	50af5ead3b	bug.h: add include of it to various implicit C users With bug.h currently living right in linux/kernel.h there are files that use BUG_ON and friends but are not including the header explicitly. Fix them up so we can remove the presence in kernel.h file. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-02-29 17:15:08 -05:00
Yinghai Lu	210647af89	PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device The old pci_remove_bus_device actually did stop and remove. Make the name reflect that to reduce confusion. This patch is done by sed scripts and changes back some incorrect __pci_remove_bus_device changes. Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-27 12:12:18 -08:00
David S. Miller	ff4783ce78	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/sfc/rx.c Overlapping changes in drivers/net/ethernet/sfc/rx.c, one to change the rx_buf->is_page boolean into a set of u16 flags, and another to adjust how ->ip_summed is initialized. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-26 21:55:51 -05:00
Danny Kukawka	0a167e0a5c	arch/powerpc/platforms/powernv/setup.c: included asm/xics.h twice arch/powerpc/platforms/powernv/setup.c: included 'asm/xics.h' twice, remove the duplicate. Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-27 11:33:59 +11:00
Danny Kukawka	ed7e3d1ca7	arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice arch/powerpc/kvm/book3s_hv.c: included 'linux/sched.h' twice, remove the duplicate. Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-27 11:33:58 +11:00
Stephen Rothwell	3d066d77cf	powerpc: remove CONFIG_PPC_ISERIES from the architecture Kconfig files After this, we can remove the legacy iSeries code more easily. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-27 11:33:58 +11:00
Benjamin Herrenschmidt	fe83364f0b	powerpc/mpic: Fix allocation of reverse-map for multi-ISU mpics When using a multi-ISU MPIC, we can interrupts up to isu_size * MPIC_MAX_ISU, not just isu_size, so allocate the right size reverse map. Without this, the code will constantly fallback to a linear search. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-27 11:33:58 +11:00
Benjamin Herrenschmidt	f851013cb2	Merge remote-tracking branch 'origin/master' into next	2012-02-27 10:50:11 +11:00
Ben Greear	3bdc0eba0b	net: Add framework to allow sending packets with customized CRC. This is useful for testing RX handling of frames with bad CRCs. Requires driver support to actually put the packet on the wire properly. Signed-off-by: Ben Greear <greearb@candelatech.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2012-02-24 01:37:35 -08:00
Ingo Molnar	c5905afb0e	static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc\|dec]() So here's a boot tested patch on top of Jason's series that does all the cleanups I talked about and turns jump labels into a more intuitive to use facility. It should also address the various misconceptions and confusions that surround jump labels. Typical usage scenarios: #include <linux/static_key.h> struct static_key key = STATIC_KEY_INIT_TRUE; if (static_key_false(&key)) do unlikely code else do likely code Or: if (static_key_true(&key)) do likely code else do unlikely code The static key is modified via: static_key_slow_inc(&key); ... static_key_slow_dec(&key); The 'slow' prefix makes it abundantly clear that this is an expensive operation. I've updated all in-kernel code to use this everywhere. Note that I (intentionally) have not pushed through the rename blindly through to the lowest levels: the actual jump-label patching arch facility should be named like that, so we want to decouple jump labels from the static-key facility a bit. On non-jump-label enabled architectures static keys default to likely()/unlikely() branches. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jason Baron <jbaron@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: a.p.zijlstra@chello.nl Cc: mathieu.desnoyers@efficios.com Cc: davem@davemloft.net Cc: ddaney.cavm@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-02-24 10:05:59 +01:00
Bjorn Helgaas	fb127cb9de	PCI: collapse pcibios_resource_to_bus Everybody uses the generic pcibios_resource_to_bus() supplied by the core now, so remove the ARCH_HAS_GENERIC_PCI_OFFSETS used during conversion. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:19:04 -07:00
Bjorn Helgaas	6c5705fec6	powerpc/PCI: get rid of device resource fixups Tell the PCI core about host bridge address translation so it can take care of bus-to-resource conversion for us. CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:19:03 -07:00
Bjorn Helgaas	673c975624	powerpc/PCI: replace pci_probe_only with pci_flags We already use pci_flags, so this just sets pci_flags directly and removes the intermediate step of figuring out pci_probe_only, then using it to set pci_flags. The PCI core provides a pci_flags definition (currently __weak), so drop the powerpc definitions in favor of that. CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: linuxppc-dev@lists.ozlabs.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:18:58 -07:00
Bjorn Helgaas	3c13be017a	powerpc/PCI: make pci_probe_only default to 0 pci_probe_only is set on ppc64 to prevent resource re-allocation by the core. It's meant to be used in very specific circumstances such as when operating under a hypervisor that may prevent such re-allocation. Instead of default to 1, we make it default to 0 and explicitly set it in the few cases where we need it. This fixes FSL PCI which wants it clear among others. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:18:58 -07:00
Paul Gortmaker	6d166fec12	ppc-6xx: fix build failure in flipper-pic.c and hlwd-pic.c The commit `bae1d8f199` (linux-next) "irq_domain/powerpc: Use common irq_domain structure instead of irq_host" made this change: -static struct irq_host flipper_irq_host; +static struct irq_domain flipper_irq_host; and this change: -static struct irq_host hlwd_irq_host; +static struct irq_domain hlwd_irq_host; The intent was to change the type, and not the name, but then in a couple of instances, it looks like the sed to change the irq_domain_ops name inadvertently also changed the irq_host name where it was not supposed to, causing build failures. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>	2012-02-22 18:41:19 -07:00
Michael Ellerman	f2699491e0	powerpc/perf: Move perf core & PMU code into a subdirectory The perf code has grown a lot since it started, and is big enough to warrant its own subdirectory. For reference it's ~60% bigger than the oprofile code. It declutters the kernel directory, makes it simpler to grep for "just perf stuff", and allows us to shorten some filenames. While we're at it, make it more obvious that we have two implementations of the core perf logic. One for (roughly) Book3S CPUs, which was the original implementation, and the other for Freescale embedded CPUs. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:04 +11:00
Mahesh Salgaonkar	12d9299241	fadump: Remove the phyp assisted dump code. Remove the phyp assisted dump implementation which is not is use. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:03 +11:00
Mahesh Salgaonkar	67b43b9d7c	fadump: Invalidate the fadump registration during machine shutdown. If dump is active during system reboot, shutdown or halt then invalidate the fadump registration as it does not get invalidated automatically. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:03 +11:00
Mahesh Salgaonkar	b500afff11	fadump: Invalidate registration and release reserved memory for general use. This patch introduces an sysfs interface '/sys/kernel/fadump_release_mem' to invalidate the last fadump registration, invalidate '/proc/vmcore', release the reserved memory for general use and re-register for future kernel dump. Once the dump is copied to the disk, unlike phyp dump, the userspace tool can release all the memory reserved for dump with one single operation of echo 1 to '/sys/kernel/fadump_release_mem'. Release the reserved memory region excluding the size of the memory required for future kernel dump registration. And therefore, unlike kdump, Fadump doesn't need a 2nd reboot to get back the system to the production configuration. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:02 +11:00
Mahesh Salgaonkar	d34c5f26cf	fadump: Add PT_NOTE program header for vmcoreinfo Introduce a PT_NOTE program header that points to physical address of vmcoreinfo_note buffer declared in kernel/kexec.c. The vmcoreinfo note buffer is populated during crash_fadump() at the time of system crash. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:02 +11:00
Mahesh Salgaonkar	ebaeb5ae24	fadump: Convert firmware-assisted cpu state dump data into elf notes. When registered for firmware assisted dump on powerpc, firmware preserves the registers for the active CPUs during a system crash. This patch reads the cpu register data stored in Firmware-assisted dump format (except for crashing cpu) and converts it into elf notes and updates the PT_NOTE program header accordingly. The exact register state for crashing cpu is saved to fadump crash info structure in scratch area during crash_fadump() and read during second kernel boot. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:01 +11:00
Mahesh Salgaonkar	2df173d9e8	fadump: Initialize elfcore header and add PT_LOAD program headers. Build the crash memory range list by traversing through system memory during the first kernel before we register for firmware-assisted dump. After the successful dump registration, initialize the elfcore header and populate PT_LOAD program headers with crash memory ranges. The elfcore header is saved in the scratch area within the reserved memory. The scratch area starts at the end of the memory reserved for saving RMR region contents. The scratch area contains fadump crash info structure that contains magic number for fadump validation and physical address where the eflcore header can be found. This structure will also be used to pass some important crash info data to the second kernel which will help second kernel to populate ELF core header with correct data before it gets exported through /proc/vmcore. Since the firmware preserves the entire partition memory at the time of crash the contents of the scratch area will be preserved till second kernel boot. Since the memory dump exported through /proc/vmcore is in ELF format similar to kdump, it will help us to reuse the kdump infrastructure for dump capture and filtering. Unlike phyp dump, userspace tool does not need to refer any sysfs interface while reading /proc/vmcore. NOTE: The current design implementation does not address a possibility of introducing additional fields (in future) to this structure without affecting compatibility. It's on TODO list to come up with better approach to address this. Reserved dump area start => +-------------------------------------+ \| CPU state dump data \| +-------------------------------------+ \| HPTE region data \| +-------------------------------------+ \| RMR region data \| Scratch area start => +-------------------------------------+ \| fadump crash info structure { \| \| magic nummber \| +------\|---- elfcorehdr_addr \| \| \| } \| +----> +-------------------------------------+ \| ELF core header \| Reserved dump area end => +-------------------------------------+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:01 +11:00
Mahesh Salgaonkar	3ccc00a7e0	fadump: Register for firmware assisted dump. On 2012-02-20 11:02:51 Mon, Paul Mackerras wrote: > On Thu, Feb 16, 2012 at 04:44:30PM +0530, Mahesh J Salgaonkar wrote: > > If I have read the code correctly, we are going to get this printk on > non-pSeries machines or on older pSeries machines, even if the user > has not put the fadump=on option on the kernel command line. The > printk will be annoying since there is no actual error condition. It > seems to me that the condition for the printk should include > fw_dump.fadump_enabled. In other words you should probably add > > if (!fw_dump.fadump_enabled) > return 0; > > at the beginning of the function. Hi Paul, Thanks for pointing it out. Please find the updated patch below. The existing patches above this (4/10 through 10/10) cleanly applies on this update. Thanks, -Mahesh. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:01 +11:00
Mahesh Salgaonkar	eb39c8803d	fadump: Reserve the memory for firmware assisted dump. Reserve the memory during early boot to preserve CPU state data, HPTE region and RMA (real mode area) region data in case of kernel crash. At the time of crash, powerpc firmware will store CPU state data, HPTE region data and move RMA region data to the reserved memory area. If the firmware-assisted dump fails to reserve the memory, then fallback to existing kexec-based kdump. Most of the code implementation to reserve memory has been adapted from phyp assisted dump implementation written by Linas Vepstas and Manish Ahuja This patch also introduces a config option CONFIG_FA_DUMP for firmware assisted dump feature on Powerpc (ppc64) architecture. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:01 +11:00
Kyle Moffett	e55d7f737d	powerpc/mpic: Remove duplicate MPIC_WANTS_RESET flag There are two separate flags controlling whether or not the MPIC is reset during initialization, which is completely unnecessary, and only one of them can be specified in the device tree. Also, most platforms in-tree right now do actually want to reset the MPIC during initialization anyways, which means lots of duplicate code passing the MPIC_WANTS_RESET flag. Fix all of the callers which currently do not pass the MPIC_WANTS_RESET flag to pass the MPIC_NO_RESET flag, then remove the MPIC_WANTS_RESET flag and make the code reset the MPIC by default. Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:00 +11:00
Kyle Moffett	c1b8d45db4	powerpc/mpic: Add "last-interrupt-source" property to override hardware The FreeScale PowerQUICC-III-compatible (mpc85xx/mpc86xx) MPICs do not correctly report the number of hardware interrupt sources, so software needs to override the detected value with "256". To avoid needing to write custom board-specific code to detect that scenario, allow it to be easily overridden in the device-tree. Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:50:00 +11:00
Kyle Moffett	5019609fce	powerpc/mpic: Remove MPIC_BROKEN_FRR_NIRQS and duplicate irq_count The mpic->irq_count variable is only used as a software error-checking limit to determine whether or not an IRQ number is valid. In board code which does not manually specify an IRQ count to mpic_alloc(), i.e. 0, it is automatically detected from the number of ISUs and the ISU size. In practice, all hardware ends up with irq_count == num_sources, so all of the runtime checks on mpic->irq_count should just check the value of mpic->num_sources instead. When platform hardware does not correctly report the number of IRQs, which only happens on the MPC85xx/MPC86xx, the MPIC_BROKEN_FRR_NIRQS flag is used to override the detected value of num_sources with the manual irq_count parameter. Since there's no need to manually specify the number of IRQs except in this case, the extra flag can be eliminated and the test changed to "irq_count != 0". Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:49:59 +11:00
Kyle Moffett	9ca163c860	fsl/mpic: Create and document the "single-cpu-affinity" device-tree flag The Freescale MPIC (and perhaps others in the future) is incapable of routing non-IPI interrupts to more than once CPU at a time. Currently all of the Freescale boards msut pass the MPIC_SINGLE_DEST_CPU flag to mpic_alloc(), but that information should really be present in the device-tree. Older board code can't rely on the device-tree having the property set, but newer platforms won't need it manually specified in the code. [BenH: Remove unrelated changes, folded in a different patch] Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:49:59 +11:00
Kyle Moffett	98cca250ae	fsl/mpic: Document and use the "big-endian" device-tree flag The MPIC code checks for a "big-endian" property and sets the flag MPIC_BIG_ENDIAN if one is present, although prior to the "mpic->flags" fixup that would never have worked anways. Unfortunately, even now that it works properly, the Freescale mpic device-node (the "PowerQUICC-III"-compatible one) does not specify it, so all of the board ports need to manually pass it to mpic_alloc(). Document the flag and add it to the pq3 device tree. Existing code will still need to pass the MPIC_BIG_ENDIAN flag because their dtb may not have this property, but new platforms shouldn't need to do so. Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:49:58 +11:00
Kyle Moffett	3a7a7176e8	powerpc/mpic: Fix use of "flags" variable in mpic_alloc() The mpic_alloc() function takes a "flags" parameter and assigns it into the mpic->flags variable fairly early on, but several later pieces of code detect various device-tree properties and save them into the "mpic->flags" variable (EG: "big-endian" => MPIC_BIG_ENDIAN). Unfortunately, a number of codepaths (including several which test the flag MPIC_BIG_ENDIAN!) test "flags" instead of "mpic->flags", and get wrong answers as a result. Consolidate the device-tree flag tests early in mpic_alloc() and change all of the checks after "mpic->flags" is init'ed to use "mpic->flags". [BenH: Fixed up use of mpic->node before it's initialized] Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-23 10:49:58 +11:00
Benjamin Herrenschmidt	18b246fa60	powerpc: Fix various issues with return to userspace We have a few problems when returning to userspace. This is a quick set of fixes for 3.3, I'll look into a more comprehensive rework for 3.4. This fixes: - We kept interrupts soft-disabled when schedule'ing or calling do_signal when returning to userspace as a result of a hardware interrupt. - Rename do_signal to do_notify_resume like all other archs (and do_signal_pending back to do_signal, which it was before Roland changed it). - Add the missing call to key_replace_session_keyring() to do_notify_resume(). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> ---	2012-02-22 16:48:53 +11:00
Michael Ellerman	922b9f86a0	powerpc: Fix program check handling when lockdep is enabled In commit `54321242af` ("Disable interrupts early in Program Check"), we switched from enabling to disabling interrupts in program_check_common. Whereas ENABLE_INTS leaves r3 untouched, if lockdep is enabled DISABLE_INTS calls into lockdep code and will clobber r3. That means we pass a bogus struct pt_regs* into program_check_exception() and all hell breaks loose. So load our regs pointer into r3 after we call DISABLE_INTS. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-22 16:48:49 +11:00
Rusty Russell	07d2f1a54a	powerpc: Remove references to cpu__map This has been obsolescent for a while; time for the final push. In adjacent context, replaced old cpus_ with cpumask_*. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-22 16:48:47 +11:00
Grant Likely	0f22dd395f	of: Only compile OF_DYNAMIC on PowerPC pseries and iseries Only two architectures use the OF node reference counting and reclaim bits. There is no need to compile it for the rest of the PowerPC platforms or for any of the other architectures. This patch makes iseries and pseries select CONFIG_OF_DYNAMIC, and makes it default to off for everything else. It is still safe to turn on CONFIG_OF_DYNAMIC on all architectures, it just isn't necessary. v2: Also select OF_DYNAMIC for PPC_CHROMA and MPC885ADS as reported by Michael Meuling Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Jimi Xenidis <jimix@pobox.com> (for PPC_CHROMA bug fix) Cc: Rob Herring <rob.herring@calxeda.com>	2012-02-21 13:33:00 -07:00
Pavel Emelyanov	ef64a54f6e	sock: Introduce the SO_PEEK_OFF sock option This one specifies where to start MSG_PEEK-ing queue data from. When set to negative value means that MSG_PEEK works as ususally -- peeks from the head of the queue always. When some bytes are peeked from queue and the peeking offset is non negative it is moved forward so that the next peek will return next portion of data. When non-peeking recvmsg occurs and the peeking offset is non negative is is moved backward so that the next peek will still peek the proper data (i.e. the one that would have been picked if there were no non peeking recv in between). The offset is set using per-proto opteration to let the protocol handle the locking issues and to check whether the peeking offset feature is supported by the protocol the socket belongs to. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-21 15:03:48 -05:00
Justin P. Mattock	a80581d0d1	Typos: change aditional to additional. The below patch fixes some typos "aditional" to "additional", and also fixes a comment with another word mispelled. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-02-21 11:40:36 +01:00
David Howells	1dce27c5aa	Wrap accesses to the fd_sets in struct fdtable Wrap accesses to the fd_sets in struct fdtable (for recording open files and close-on-exec flags) so that we can move away from using fd_sets since we abuse the fd_set structs by not allocating the full-sized structure under normal circumstances and by non-core code looking at the internals of the fd_sets. The first abuse means that use of FD_ZERO() on these fd_sets is not permitted, since that cannot be told about their abnormal lengths. This introduces six wrapper functions for setting, clearing and testing close-on-exec flags and fd-is-open flags: void __set_close_on_exec(int fd, struct fdtable fdt); void __clear_close_on_exec(int fd, struct fdtable fdt); bool close_on_exec(int fd, const struct fdtable fdt); void __set_open_fd(int fd, struct fdtable fdt); void __clear_open_fd(int fd, struct fdtable fdt); bool fd_is_open(int fd, const struct fdtable fdt); Note that I've prepended '__' to the names of the set/clear functions because they require the caller to hold a lock to use them. Note also that I haven't added wrappers for looking behind the scenes at the the array. Possibly that should exist too. Signed-off-by: David Howells <dhowells@redhat.com> Link: http://lkml.kernel.org/r/20120216174942.23314.1364.stgit@warthog.procyon.org.uk Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk>	2012-02-19 10:30:52 -08:00
Grant Likely	ff8c3ab816	irq_domain/powerpc: Replace custom xlate functions with library functions This patch converts a number of the powerpc drivers to use the common library of irq_domain xlate functions, dropping a bunch of lines in the process. v5: - Remove tsi108 changes from patch Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-16 06:11:24 -07:00
Grant Likely	9f70b8eb3c	irq_domain/powerpc: constify irq_domain_ops Make all the irq_domain_ops structures in powerpc 'static const' Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-16 06:11:24 -07:00
Grant Likely	a18dc81bf5	irq_domain: constify irq_domain_ops Make irq_domain_ops pointer a constant to make it safer for multiple instances to share the same ops pointer and change the irq_domain code so that it does not modify the ops. v4: Fix mismatched type reference in powerpc code Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-16 06:11:24 -07:00
Grant Likely	1bc04f2cf8	irq_domain: Add support for base irq and hwirq in legacy mappings Add support for a legacy mapping where irq = (hwirq - first_hwirq + first_irq) so that a controller driver can allocate a fixed range of irq_descs and use a simple calculation to translate back and forth between linux and hw irq numbers. This is needed to use an irq_domain with many of the ARM interrupt controller drivers that manage their own irq_desc allocations. Ultimately the goal is to migrate those drivers to use the linear revmap, but doing it this way allows each driver to be converted separately which makes the migration path easier. This patch generalizes the IRQ_DOMAIN_MAP_LEGACY method to use (first_irq-first_hwirq) as the offset between hwirq and linux irq number, and adds checks to make sure that the hwirq number does not exceed range assigned to the controller. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-16 06:11:23 -07:00
Grant Likely	a8db8cf0d8	irq_domain: Replace irq_alloc_host() with revmap-specific initializers Each revmap type has different arguments for setting up the revmap. This patch splits up the generator functions so that each revmap type can do its own setup and the user doesn't need to keep track of how each revmap type handles the arguments. This patch also adds a host_data argument to the generators. There are cases where the host_data pointer will be needed before the function returns. ie. the legacy map calls the .map callback for each irq before returning. v2: - Add void host_data argument to irq_domain_add_() functions - fixed failure to compile - Moved IRQ_DOMAIN_MAP_* defines into irqdomain.c Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-16 06:11:22 -07:00
Grant Likely	cc79ca691c	irq_domain: Move irq_domain code from powerpc to kernel/irq This patch only moves the code. It doesn't make any changes, and the code is still only compiled for powerpc. Follow-on patches will generalize the code for other architectures. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-16 01:37:49 -07:00
Grant Likely	6d9285b00f	irq_domain/powerpc: Eliminate virq_is_host() There is only one user, and it is trivial to open-code. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-16 01:36:46 -07:00
Anton Blanchard	9a45a9407c	powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events perf on POWER stopped working after commit `e050e3f0a7` (perf: Fix broken interrupt rate throttling). That patch exposed a bug in the POWER perf_events code. Since the PMCs count upwards and take an exception when the top bit is set, we want to write 0x80000000 - left in power_pmu_start. We were instead programming in left which effectively disables the counter until we eventually hit 0x80000000. This could take seconds or longer. With the patch applied I get the expected number of samples: SAMPLE events: 9948 Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: <stable@kernel.org>	2012-02-16 16:24:35 +11:00
Benjamin Herrenschmidt	54321242af	powerpc: Disable interrupts early in Program Check Program Check exceptions are the result of WARNs, BUGs, some type of breakpoints, kprobe, and other illegal instructions. We want interrupts (and thus preemption) to remain disabled while doing the initial stage of testing the reason and branching off to a debugger or kprobe, so we are still on the original CPU which makes debugging easier in various cases. This is how the code was intended, hence the local_irq_enable() right in the middle of program_check_exception(). However, the assembly exception prologue for that exception was incorrectly marked as enabling interrupts, which defeats that (and records a redundant enable with lockdep). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-16 16:15:10 +11:00
Stephen Rothwell	a1a1d1bfc9	powerpc: Remove legacy iSeries from ppc64_defconfig Since we are heading towards removing the Legacy iSeries platform, start by no longer building it for ppc64_defconfig. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-16 16:15:08 +11:00
Benjamin Herrenschmidt	13635dfdc6	powerpc/fsl/pci: Fix PCIe fixup regression Upstream changes to the way PHB resources are registered broke the resource fixup for FSL boards. We can no longer rely on the resource pointer array for the PHB's pci_bus structure, so let's leave it alone and go straight for the PHB resources instead. This also makes the code generally more readable. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-16 16:15:03 +11:00
Ira Snyder	40c8cefaaf	powerpc: Fix kernel log of oops/panic instruction dump A kernel oops/panic prints an instruction dump showing several instructions before and after the instruction which caused the oops/panic. The code intended that the faulting instruction be enclosed in angle brackets, however a bug caused the faulting instruction to be interpreted by printk() as the message log level. To fix this, the KERN_CONT log level is added before the actual text of the printed message. === Before the patch === [ 1081.587266] Instruction dump: [ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 [ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000 [ 1081.602500] 4e800020 3803ffd0 2b800009 <4>[ 1081.587266] Instruction dump: <4>[ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 <4>[ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000 <98090000>[ 1081.602500] 4e800020 3803ffd0 2b800009 === After the patch === [ 51.385216] Instruction dump: [ 51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 [ 51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009 <4>[ 51.385216] Instruction dump: <4>[ 51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001 <4>[ 51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009 Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-16 16:11:23 +11:00
Grant Likely	4bbdd45afd	irq_domain/powerpc: eliminate irq_map; use irq_alloc_desc() instead This patch drops the powerpc-specific irq_map table and replaces it with directly using the irq_alloc_desc()/irq_free_desc() interfaces for allocating and freeing irq_desc structures. This patch is a preparation step for generalizing the powerpc-specific virq infrastructure to become irq_domains. As part of this change, the irq_big_lock is changed to a mutex from a raw spinlock. There is no longer any need to use a spin lock since the irq_desc allocation code is now responsible for the critical section of finding an unused range of irq numbers. The radix lookup table is also changed to store the irq_data pointer instead of the irq_map entry since the irq_map is removed. This should end up being functionally equivalent since only allocated irq_descs are ever added to the radix tree. v5: - Really don't ever allocate virq 0. The previous version could still do it if hint == 0 - Respect irq_virq_count setting for NOMAP. Some NOMAP domains cannot use virq values above irq_virq_count. - Use numa_node_id() when allocating irq_descs. Ideally the API should obtain that value from the caller, but that touches a lot of call sites so will be deferred to a follow-on patch. - Fix irq_find_mapping() to include irq numbers lower than NUM_ISA_INTERRUPTS. With the switch to irq_alloc_desc*(), the lowest possible allocated irq is now returned by arch_probe_nr_irqs(). v4: - Fix incorrect access to irq_data structure in debugfs code - Don't ever allocate virq 0 Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-14 14:06:51 -07:00
Grant Likely	bae1d8f199	irq_domain/powerpc: Use common irq_domain structure instead of irq_host This patch drops the powerpc-specific irq_host structures and uses the common irq_domain strucutres defined in linux/irqdomain.h. It also fixes all the users to use the new structure names. Renaming irq_host to irq_domain has been discussed for a long time, and this patch is a step in the process of generalizing the powerpc virq code to be usable by all architecture. An astute reader will notice that this patch actually removes the irq_host structure instead of renaming it. This is because the irq_domain structure already exists in include/linux/irqdomain.h and has the needed data members. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Milton Miller <miltonm@bga.com> Tested-by: Olof Johansson <olof@lixom.net>	2012-02-14 14:06:50 -07:00
H. Peter Anvin	fae89ee8d7	powerpc: Use generic posix_types.h Change the powerpc architecture to use <asm-generic/posix_types.h>. [ v2: fix the definition for __kernel_ssize_t ] Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1328677745-20121-16-git-send-email-hpa@zytor.com Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org>	2012-02-14 12:01:29 -08:00
Thadeu Lima de Souza Cascardo	778a785f02	powerpc/pseries/eeh: Fix crash when error happens during device probe EEH may happen during a PCI driver probe. If the driver is trying to access some register in a loop, the EEH code will try to print the driver name. But the driver pointer in struct pci_dev is not set until probe returns successfully. Use a function to test if the device and the driver pointer is NULL before accessing the driver's name. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-14 15:01:39 +11:00
Brian King	444080d13d	powerpc/pseries: Fix partition migration hang in stop_topology_update This fixes a hang that was observed during live partition migration. Since stop_topology_update must not be called from an interrupt context, call it earlier in the migration process. The hang observed can be seen below: WARNING: at kernel/timer.c:1011 Modules linked in: ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 fuse loop ibmveth sg ext3 jbd mbcache raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc dm_round_robin dm_multipath scsi_dh sd_mod crc_t10dif ibmvfc scsi_transport_fc scsi_tgt scsi_mod dm_snapshot dm_mod NIP: c0000000000c52d8 LR: c00000000004be28 CTR: 0000000000000000 REGS: c00000005ffd77d0 TRAP: 0700 Not tainted (3.2.0-git-00001-g07d106d) MSR: 8000000000021032 <ME,CE,IR,DR> CR: 48000084 XER: 00000001 CFAR: c00000000004be20 TASK = c00000005ec78860[0] 'swapper/3' THREAD: c00000005ec98000 CPU: 3 GPR00: 0000000000000001 c00000005ffd7a50 c000000000fbbc98 c000000000ec8340 GPR04: 00000000282a0020 0000000000000000 0000000000004000 0000000000000101 GPR08: 0000000000000012 c00000005ffd4000 0000000000000020 c000000000f3ba88 GPR12: 0000000000000000 c000000007f40900 0000000000000001 0000000000000004 GPR16: 0000000000000001 0000000000000000 0000000000000000 c000000001022310 GPR20: 0000000000000001 0000000000000000 0000000000200200 c000000001029e14 GPR24: 0000000000000000 0000000000000001 0000000000000040 c00000003f74bc80 GPR28: c00000003f74bc84 c000000000f38038 c000000000f16b58 c000000000ec8340 NIP [c0000000000c52d8] .del_timer_sync+0x28/0x60 LR [c00000000004be28] .stop_topology_update+0x20/0x38 Call Trace: [c00000005ffd7a50] [c00000005ec78860] 0xc00000005ec78860 (unreliable) [c00000005ffd7ad0] [c00000000004be28] .stop_topology_update+0x20/0x38 [c00000005ffd7b40] [c000000000028378] .__rtas_suspend_last_cpu+0x58/0x260 [c00000005ffd7bf0] [c0000000000fa230] .generic_smp_call_function_interrupt+0x160/0x358 [c00000005ffd7cf0] [c000000000036ec8] .smp_ipi_demux+0x88/0x100 [c00000005ffd7d80] [c00000000005c154] .icp_hv_ipi_action+0x5c/0x80 [c00000005ffd7e00] [c00000000012a088] .handle_irq_event_percpu+0x100/0x318 [c00000005ffd7f00] [c00000000012e774] .handle_percpu_irq+0x84/0xd0 [c00000005ffd7f90] [c000000000022ba8] .call_handle_irq+0x1c/0x2c [c00000005ec9ba20] [c00000000001157c] .do_IRQ+0x22c/0x2a8 [c00000005ec9bae0] [c0000000000054bc] hardware_interrupt_entry+0x18/0x1c Exception: 501 at .cpu_idle+0x194/0x2f8 LR = .cpu_idle+0x194/0x2f8 [c00000005ec9bdd0] [c000000000017e58] .cpu_idle+0x188/0x2f8 (unreliable) [c00000005ec9be90] [c00000000067ec18] .start_secondary+0x3e4/0x524 [c00000005ec9bf90] [c0000000000093e8] .start_secondary_prolog+0x10/0x14 Instruction dump: ebe1fff8 4e800020 fbe1fff8 7c0802a6 f8010010 7c7f1b78 f821ff81 78290464 80090014 5400019e 7c0000d0 78000fe0 <0b000000> 4800000c 7c210b78 7c421378 Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-14 15:01:39 +11:00
Michael Ellerman	f1c853b53c	powerpc/powernv: Disable interrupts while taking phb->lock We need to disable interrupts when taking the phb->lock. Otherwise we could deadlock with pci_lock taken from an interrupt. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-14 15:01:39 +11:00
Benjamin Herrenschmidt	6fe5f5f3ff	powerpc: Fix WARN_ON in decrementer_check_overflow We use __get_cpu_var() which triggers a false positive warning in smp_processor_id() thinking interrupts are enabled (at this point, they are soft-enabled but hard-disabled). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-14 15:01:38 +11:00
Benjamin Herrenschmidt	7a768d30ca	powerpc/wsp: Fix IRQ affinity setting We call the cache_hwirq_map() function with a linux IRQ number but it expects a HW irq number. This triggers a BUG on multic-chip setups in addition to not doing the right thing. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-14 15:01:38 +11:00
Srikar Dronamraju	e62894273c	powerpc: Implement GET_IP/SET_IP With this change, helpers such as instruction_pointer() et al, get defined in the generic header in terms of GET_IP Removed the unnecessary definition of profile_pc in !CONFIG_SMP case as suggested by Mike Frysinger. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-14 15:01:38 +11:00
Benjamin Herrenschmidt	454c0bfd0c	powerpc/wsp: Permanently enable PCI class code workaround It appears that on the Chroma card, the class code of the root complex is still wrong even on DD2 or later chips. This could be a firmware issue, but that breaks resource allocation so let's unconditionally fix it up. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-14 15:01:38 +11:00
Grant Likely	07d57a32fb	drivercore: Output common devicetree information in uevent When userspace needs to find a specific device, it currently isn't easy to resolve a /sys/devices/ path from a specific device tree node. Nor is it easy to obtain the compatible list for devices. This patch generalizes the code that inserts OF_* values into the uevent device attribute so that any device that is attached to an OF node will have that information exported to userspace. Without this patch only platform devices and some powerpc-specific busses have access to this data. The original function also creates a MODALIAS property for the compatible list, but that code has not been generalized into the common case because it has the potential to break module loading on a lot of bus types. Bus types are still responsible for their own MODALIAS properties. Boot tested on ARM and compile tested on PowerPC and SPARC. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: Frederic Lambert <frdrc66@gmail.com> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Mark Brown <broonie@sirena.org.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-02-01 14:26:30 -07:00
Ingo Molnar	44a6839711	Merge branch 'perf/fast' into perf/core Merge reason: Lets ready it for v3.4 Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-01-27 12:08:09 +01:00
Rob Herring	2ed86b16ea	irq: make SPARSE_IRQ an optionally hidden option On ARM, we don't want SPARSE_IRQ to be a user visible option. Make SPARSE_IRQ visible based on MAY_HAVE_SPARSE_IRQ instead of depending on HAVE_SPARSE_IRQ. With this, SPARSE_IRQ is not visible on C6X and ARM. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-c6x-dev@linux-c6x.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-sh@vger.kernel.org	2012-01-25 20:37:42 -06:00
Benjamin Herrenschmidt	3493c85366	powerpc: Fix build on some non-freescale platforms Commit `9deaa53ac7` broke build on platforms that use legacy_serial.c without also having CONFIG_SERIAL_8250_FSL enabled due to an unconditional code to a routine in that module. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-01-25 13:33:22 +11:00
Benjamin Herrenschmidt	f7ea82beb2	powerpc/powernv: Fix PCI resource handling Recent changes to the handling of PCI resources for host bridges are breaking the PowerNV code for assigning resources on IODA. The root of the problem is that the pci_bus attached to a host bridge no longer has its "legacy" resource pointers populated but only uses the newer list instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-01-25 13:32:00 +11:00
Christian Kujau	897e01a08c	powerpc/crash: Fix build error without SMP I could not find cpus_in_crash anywhere in the sourcetree, except for arch/powerpc/kernel/crash.c. Moving the definition into the CONFIG_SMP fixes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-01-25 09:47:45 +11:00
Deepthi Dharwar	f7aa554510	powerpc/cpuidle: Make it a bool, not a tristate As pointed out, asm/system.h has empty inline implementations for update_smt_snooze_delay and pseries_notify_cpuidle_add_cpu, which are used when CONFIG_PSERIES_IDLE is undefined. Since those two functions are used in core power architecture functions (store_smt_snooze_delay at kernel/sysfs.c and smp_xics_setup_cpu at platforms/pseries/smp.c), Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-01-25 09:43:06 +11:00
Benjamin Herrenschmidt	407a362f94	Merge remote-tracking branch 'kumar/merge' into merge	2012-01-25 09:40:34 +11:00
Ramneek Mehresh	621c4b999e	powerpc/85xx: Add dr_mode property in USB nodes Add usb2 controller node for P1020RDB, P2020RDB, P2020DS, P1021MDS Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-18 08:05:42 -06:00
Ramneek Mehresh	70a0ac6686	powerpc/85xx: Enable USB2 controller node for P1020RDB Enable USB2 controller node for P1020RDB. USB2 controller is used only when board boots from SPI or SD as it is muxed with eLBC Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-18 08:05:36 -06:00
Jerry Huang	f8b5a31877	powerpc/85xx: Fix cmd12 bug and add the chip compatible for eSDHC According to latest kernel, the auto-cmd12 property should be "sdhci,auto-cmd12", and according to the SDHC binding and the workaround for the special chip, add the chip compatible for eSDHC: "fsl,p1022-esdhc", "fsl,mpc8536-esdhc", "fsl,p1020-esdhc", "fsl,p2020-esdhc" and "fsl,p1010-esdhc". Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-18 08:01:53 -06:00
Linus Torvalds	f429ee3b80	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits) audit: no leading space in audit_log_d_path prefix audit: treat s_id as an untrusted string audit: fix signedness bug in audit_log_execve_info() audit: comparison on interprocess fields audit: implement all object interfield comparisons audit: allow interfield comparison between gid and ogid audit: complex interfield comparison helper audit: allow interfield comparison in audit rules Kernel: Audit Support For The ARM Platform audit: do not call audit_getname on error audit: only allow tasks to set their loginuid if it is -1 audit: remove task argument to audit_set_loginuid audit: allow audit matching on inode gid audit: allow matching on obj_uid audit: remove audit_finish_fork as it can't be called audit: reject entry,always rules audit: inline audit_free to simplify the look of generic code audit: drop audit_set_macxattr as it doesn't do anything audit: inline checks for not needing to collect aux records audit: drop some potentially inadvisable likely notations ... Use evil merge to fix up grammar mistakes in Kconfig file. Bad speling and horrible grammar (and copious swearing) is to be expected, but let's keep it to commit messages and comments, rather than expose it to users in config help texts or printouts.	2012-01-17 16:41:31 -08:00
Eric Paris	b05d8447e7	audit: inline audit_syscall_entry to reduce burden on archs Every arch calls: if (unlikely(current->audit_context)) audit_syscall_entry() which requires knowledge about audit (the existance of audit_context) in the arch code. Just do it all in static inline in audit.h so that arch's can remain blissfully ignorant. Signed-off-by: Eric Paris <eparis@redhat.com>	2012-01-17 16:16:56 -05:00
Eric Paris	d7e7528bcd	Audit: push audit success and retcode into arch ptrace.h The audit system previously expected arches calling to audit_syscall_exit to supply as arguments if the syscall was a success and what the return code was. Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things by converting from negative retcodes to an audit internal magic value stating success or failure. This helper was wrong and could indicate that a valid pointer returned to userspace was a failed syscall. The fix is to fix the layering foolishness. We now pass audit_syscall_exit a struct pt_reg and it in turns calls back into arch code to collect the return value and to determine if the syscall was a success or failure. We also define a generic is_syscall_success() macro which determines success/failure based on if the value is < -MAX_ERRNO. This works for arches like x86 which do not use a separate mechanism to indicate syscall failure. We make both the is_syscall_success() and regs_return_value() static inlines instead of macros. The reason is because the audit function must take a void* for the regs. (uml calls theirs struct uml_pt_regs instead of just struct pt_regs so audit_syscall_exit can't take a struct pt_regs). Since the audit function takes a void* we need to use static inlines to cast it back to the arch correct structure to dereference it. The other major change is that on some arches, like ia64, MIPS and ppc, we change regs_return_value() to give us the negative value on syscall failure. THE only other user of this macro, kretprobe_example.c, won't notice and it makes the value signed consistently for the audit functions across all archs. In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old audit code as the return value. But the ptrace_64.h code defined the macro regs_return_value() as regs[3]. I have no idea which one is correct, but this patch now uses the regs_return_value() function, so it now uses regs[3]. For powerpc we previously used regs->result but now use the regs_return_value() function which uses regs->gprs[3]. regs->gprs[3] is always positive so the regs_return_value(), much like ia64 makes it negative before calling the audit code when appropriate. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion] Acked-by: Tony Luck <tony.luck@intel.com> [for ia64] Acked-by: Richard Weinberger <richard@nod.at> [for uml] Acked-by: David S. Miller <davem@davemloft.net> [for sparc] Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips] Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]	2012-01-17 16:16:56 -05:00
Julia Lawall	0cf572dc00	arch/powerpc/sysdev/fsl_pci.c: add missing iounmap Add missing iounmap in error handling code, in a case where the function already preforms iounmap on some other execution path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e; statement S,S1; int ret; @@ e = $ioremap\\|ioremap_nocache$(...) ... when != iounmap(e) if (<+...e...+>) S ... when any when != iounmap(e) *if (...) { ... when != iounmap(e) return ...; } ... when any iounmap(e); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-17 13:08:35 -06:00
Michael Neuling	ba8438fb4e	powerpc: fix compile error with 85xx/p1022_ds.c Current linux-next compiled with mpc85xx_defconfig causes this: arch/powerpc/platforms/85xx/p1022_ds.c:341:14: error: 'udbg_progress' undeclared here (not in a function) Add include to fix this. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-17 13:08:08 -06:00
Linus Torvalds	c63dbbd526	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: Kbuild: Use dtc's -d (dependency) option dtc: Implement -d option to write out a dependency file kbuild: Fix comment in Makefile.lib scripts/genksyms: clean lex/yacc generated files kbuild: Correctly deal with make options which contain an "s"	2012-01-16 14:34:54 -08:00
Stephen Warren	7c43185138	Kbuild: Use dtc's -d (dependency) option This hooks dtc into Kbuild's dependency system. Thus, for example, "make dtbs" will rebuild tegra-harmony.dtb if only tegra20.dtsi has changed yet tegra-harmony.dts has not. The previous lack of this feature recently caused me to have very confusing "git bisect" results. For ARM, it's obvious what to add to $(targets). I'm not familiar enough with other architectures to know what to add there. Powerpc appears to already add various .dtb files into $(targets), but the other archs may need something added to $(targets) to work. Signed-off-by: Stephen Warren <swarren@nvidia.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> [mmarek: Dropped arch/c6x part to avoid merging commits from the middle of the merge window] Signed-off-by: Michal Marek <mmarek@suse.cz>	2012-01-15 00:04:35 +01:00
Linus Torvalds	8e63dd6e1c	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Fix unpaired __trace_hcall_entry and __trace_hcall_exit powerpc: Fix RCU idle and hcall tracing	2012-01-14 12:26:23 -08:00
WANG Cong	a3dd332305	kexec: remove KMSG_DUMP_KEXEC KMSG_DUMP_KEXEC is useless because we already save kernel messages inside /proc/vmcore, and it is unsafe to allow modules to do other stuffs in a crash dump scenario. [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Reported-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-12 20:13:11 -08:00
Wanlong Gao	9512938b88	cpumask: update setup_node_to_cpumask_map() comments node_to_cpumask() has been replaced by cpumask_of_node(), and wholly removed since commit `29c337a0` ("cpumask: remove obsolete node_to_cpumask now everyone uses cpumask_of_node"). So update the comments for setup_node_to_cpumask_map(). Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-12 20:13:11 -08:00
Joe Perches	ff2d8b19a3	treewide: convert uses of ATTRIB_NORETURN to __noreturn Use the more commonly used __noreturn instead of ATTRIB_NORETURN. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Joe Perches <joe@perches.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-12 20:13:03 -08:00
Joe Perches	9402c95f34	treewide: remove useless NORET_TYPE macro and uses It's a very old and now unused prototype marking so just delete it. Neaten panic pointer argument style to keep checkpatch quiet. Signed-off-by: Joe Perches <joe@perches.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-12 20:13:03 -08:00
Linus Torvalds	7b67e75147	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits) x86/PCI: Expand the x86_msi_ops to have a restore MSIs. PCI: Increase resource array mask bit size in pcim_iomap_regions() PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT) PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB x86/PCI: amd: factor out MMCONFIG discovery PCI: Enable ATS at the device state restore PCI: msi: fix imbalanced refcount of msi irq sysfs objects PCI: kconfig: English typo in pci/pcie/Kconfig PCI/PM/Runtime: make PCI traces quieter PCI: remove pci_create_bus() xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus() x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented() x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources sparc/PCI: convert to pci_create_root_bus() sh/PCI: convert to pci_scan_root_bus() for correct root bus resources powerpc/PCI: convert to pci_create_root_bus() powerpc/PCI: split PHB part out of pcibios_map_io_space() ... Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due to the same patches being applied in other branches.	2012-01-11 18:50:26 -08:00
Linus Torvalds	e343a895a9	lib: use generic pci_iomap on all architectures Many architectures don't want to pull in iomap.c, so they ended up duplicating pci_iomap from that file. That function isn't trivial, and we are going to modify it https://lkml.org/lkml/2011/11/14/183 so the duplication hurts. This reduces the scope of the problem significantly, by moving pci_iomap to a separate file and referencing that from all architectures. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJPBZXBAAoJECgfDbjSjVRpuuYIAIMD0wE96MuTOSBJX4VG8VAP UyjL9dsfMRy8CKioQo5/fxpTY07YBCWmNauSSX7pzgcoUKBfYIGn4Z1qwGYsWK9M CzLs6PXLTugw0FtKobHZl/klRTWEBS6YOUjp9x568rplwF+Ppk7b993uj7eS/g+e T0mUKzqg4/UavbHd9+W5KgC4drQ5hgtu2WZHoUxBK4umnd3C2G+U82Sthg50o/XU SC8IGm39K8I36HoIWgXj3Y7nkOP3mQELohOT4ZPiVSmLvGS4i47+ix75anO+8ZvZ jxHr8RC85IK1Nd89NZhbKOyvx0QQiwoKUZaTwcWXJNSOADzZnM6icdIsodc+Elo= =ccQZ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost lib: use generic pci_iomap on all architectures Many architectures don't want to pull in iomap.c, so they ended up duplicating pci_iomap from that file. That function isn't trivial, and we are going to modify it https://lkml.org/lkml/2011/11/14/183 so the duplication hurts. This reduces the scope of the problem significantly, by moving pci_iomap to a separate file and referencing that from all architectures. * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: alpha: drop pci_iomap/pci_iounmap from pci-noop.c mn10300: switch to GENERIC_PCI_IOMAP mn10300: add missing __iomap markers frv: switch to GENERIC_PCI_IOMAP tile: switch to GENERIC_PCI_IOMAP tile: don't panic on iomap sparc: switch to GENERIC_PCI_IOMAP sh: switch to GENERIC_PCI_IOMAP powerpc: switch to GENERIC_PCI_IOMAP parisc: switch to GENERIC_PCI_IOMAP mips: switch to GENERIC_PCI_IOMAP microblaze: switch to GENERIC_PCI_IOMAP arm: switch to GENERIC_PCI_IOMAP alpha: switch to GENERIC_PCI_IOMAP lib: add GENERIC_PCI_IOMAP lib: move GENERIC_IOMAP to lib/Kconfig Fix up trivial conflicts due to changes nearby in arch/{m68k,score}/Kconfig	2012-01-10 18:04:27 -08:00
Li Zhong	ebb7f616ab	powerpc: Fix unpaired __trace_hcall_entry and __trace_hcall_exit Unpaired calling of __trace_hcall_entry and __trace_hcall_exit could cause incorrect preempt count. And it might happen as the global variable hcall_tracepoint_refcount is checked separately before calling them. Instead, store the value that was used on entry in the stack frame and retreive it from there after the call Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-01-11 12:50:26 +11:00
Anton Blanchard	a5ccfee05a	powerpc: Fix RCU idle and hcall tracing Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit region. Since pSeries calls H_CEDE in the idle loop, we were violating this rule. commit `a7b152d534` (powerpc: Tell RCU about idle after hcall tracing) tried to work around it by delaying the rcu_idle_enter until after we called the hcall tracepoint, but there are a number of issues with it. The hcall tracepoint trampoline code is called conditionally when the tracepoint is enabled. If the tracepoint is not enabled we never call rcu_idle_enter. The idle_uses_rcu check was also done at compile time which breaks multiplatform builds. The simple fix is to avoid tracing H_CEDE and rely on other tracepoints and the hypervisor dispatch trace log to work out if we called H_CEDE. This fixes a hang during boot on pSeries. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-01-11 11:54:20 +11:00
Linus Torvalds	3dcf6c1b6b	Merge branch 'kvm-updates/3.3' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/3.3' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (74 commits) KVM: PPC: Whitespace fix for kvm.h KVM: Fix whitespace in kvm_para.h KVM: PPC: annotate kvm_rma_init as __init KVM: x86 emulator: implement RDPMC (0F 33) KVM: x86 emulator: fix RDPMC privilege check KVM: Expose the architectural performance monitoring CPUID leaf KVM: VMX: Intercept RDPMC KVM: SVM: Intercept RDPMC KVM: Add generic RDPMC support KVM: Expose a version 2 architectural PMU to a guests KVM: Expose kvm_lapic_local_deliver() KVM: x86 emulator: Use opcode::execute for Group 9 instruction KVM: x86 emulator: Use opcode::execute for Group 4/5 instructions KVM: x86 emulator: Use opcode::execute for Group 1A instruction KVM: ensure that debugfs entries have been created KVM: drop bsp_vcpu pointer from kvm struct KVM: x86: Consolidate PIT legacy test KVM: x86: Do not rely on implicit inclusions KVM: Make KVM_INTEL depend on CPU_SUP_INTEL KVM: Use memdup_user instead of kmalloc/copy_from_user ...	2012-01-10 09:57:11 -08:00
Linus Torvalds	5983faf942	Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty * 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (65 commits) tty: serial: imx: move del_timer_sync() to avoid potential deadlock imx: add polled io uart methods imx: Add save/restore functions for UART control regs serial/imx: let probing fail for the dt case without a valid alias serial/imx: propagate error from of_alias_get_id instead of using -ENODEV tty: serial: imx: Allow UART to be a source for wakeup serial: driver for m32 arch should not have DEC alpha errata serial/documentation: fix documented name of DCD cpp symbol atmel_serial: fix spinlock lockup in RS485 code tty: Fix memory leak in virtual console when enable unicode translation serial: use DIV_ROUND_CLOSEST instead of open coding it serial: add support for 400 and 800 v3 series Titan cards serial: bfin-uart: Remove ASYNC_CTS_FLOW flag for hardware automatic CTS. serial: bfin-uart: Enable hardware automatic CTS only when CTS pin is available. serial: make FSL errata depend on 8250_CONSOLE, not just 8250 serial: add irq handler for Freescale 16550 errata. serial: manually inline serial8250_handle_port serial: make 8250 timeout use the specified IRQ handler serial: export the key functions for an 8250 IRQ handler serial: clean up parameter passing for 8250 Rx IRQ handling ...	2012-01-09 12:09:24 -08:00
Linus Torvalds	98793265b4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits) Kconfig: acpi: Fix typo in comment. misc latin1 to utf8 conversions devres: Fix a typo in devm_kfree comment btrfs: free-space-cache.c: remove extra semicolon. fat: Spelling s/obsolate/obsolete/g SCSI, pmcraid: Fix spelling error in a pmcraid_err() call tools/power turbostat: update fields in manpage mac80211: drop spelling fix types.h: fix comment spelling for 'architectures' typo fixes: aera -> area, exntension -> extension devices.txt: Fix typo of 'VMware'. sis900: Fix enum typo 'sis900_rx_bufer_status' decompress_bunzip2: remove invalid vi modeline treewide: Fix comment and string typo 'bufer' hyper-v: Update MAINTAINERS treewide: Fix typos in various parts of the kernel, and fix some comments. clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR gpio: Kconfig: drop unknown symbol 'CS5535_GPIO' leds: Kconfig: Fix typo 'D2NET_V2' sound: Kconfig: drop unknown symbol ARCH_CLPS7500 ... Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new kconfig additions, close to removed commented-out old ones)	2012-01-08 13:21:22 -08:00
Linus Torvalds	eb59c505f8	Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm * 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits) PM / Hibernate: Implement compat_ioctl for /dev/snapshot PM / Freezer: fix return value of freezable_schedule_timeout_killable() PM / shmobile: Allow the A4R domain to be turned off at run time PM / input / touchscreen: Make st1232 use device PM QoS constraints PM / QoS: Introduce dev_pm_qos_add_ancestor_request() PM / shmobile: Remove the stay_on flag from SH7372's PM domains PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode PM: Drop generic_subsys_pm_ops PM / Sleep: Remove forward-only callbacks from AMBA bus type PM / Sleep: Remove forward-only callbacks from platform bus type PM: Run the driver callback directly if the subsystem one is not there PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412. PM / Sleep: Merge internal functions in generic_ops.c PM / Sleep: Simplify generic system suspend callbacks PM / Hibernate: Remove deprecated hibernation snapshot ioctls PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled() ARM: S3C64XX: Implement basic power domain support PM / shmobile: Use common always on power domain governor ... Fix up trivial conflict in fs/xfs/xfs_buf.c due to removal of unused XBT_FORCE_SLEEP bit	2012-01-08 13:10:57 -08:00
Linus Torvalds	972b2c7199	Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc//mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount ...	2012-01-08 12:19:57 -08:00
Linus Torvalds	fbce1c234f	Changes queued in gpio/next for the start of the 3.3 merge window -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPBSJSAAoJEEFnBt12D9kBDmIP/R6PWxg+NrJaGDmxRWBBkOch iI4Et7GT3Sk5hYqPz0kehiSz1F4bqpYK3JmdfqotJIhQCM2znXM+BaDLzgFnMsRf g4t70PIZQMpk3pj4gtIkMpZCkRdA9VFriIf1cEdHcLiRdNVsXlTE6CcQT4fbE1P/ JTAGBrVDwbchuoPsf7Mbu8SYT+cQdWLKpgzC4fAzI++QoYyYBtfvuCHTE4VvasNS jMDOfT+u0Gwp6LW34v0gKk7MXEQzeq07r1VZxMyuaWbRQoiLr5d1fy97+Fh5+4dU Rb9C6zGIoBzRdiJLlfNYrls/FueDHwPyIwXuapMp5QbaGN5fkhVpIVIhYICOLdHW IXH2V3saK1s6QbD7nErNT2lCWZQSCUKq/PR9W40dj3PYAI0oxbRBSoPdfGuW1Bl7 fYqClUza1Bln8bEmmsvAOnIDdfs6zpkWmfouwx4AGaHTfzIAg4VRdoxdUpS8h+/d sNEYi99IuYfl3KWCwQbWJMgCOvoBdOGZSCsTrXnjLf1HJCsQOt/Md69Ff6DLn1VR gDp0EeqgDH5oXJmQTW1WjXaJJair2j8pY8mlmGxl1AqA4aY8C3dxIfHxlBPzdxii I6KTPaIUSXZaOo6e7XDjjjTzabL4q0aQGT5ahpKj928Rnx00QZNXoTy4FmYkL+0k j7QZfM/p6+vfxekKUR17 =9ugo -----END PGP SIGNATURE----- Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6 Changes queued in gpio/next for the start of the 3.3 merge window * tag 'gpio-for-linus-20120104' of git://git.secretlab.ca/git/linux-2.6: gpio: Add decode of WM8994 GPIO configuration gpio: Convert GPIO drivers to module_platform_driver gpio: Fix typo in comment in Samsung driver gpio: Explicitly index samsung_gpio_cfgs gpio: Add Linus Walleij as gpio co-maintainer of: Add device tree selftests of: create of_phandle_args to simplify return of phandle parsing data gpio/powerpc: Eliminate duplication of of_get_named_gpio_flags() gpio/microblaze: Eliminate duplication of of_get_named_gpio_flags() gpiolib: output basic details and consolidate gpio device drivers pch_gpio: Change company name OKI SEMICONDUCTOR to LAPIS Semiconductor pch_gpio: Support new device LAPIS Semiconductor ML7831 IOH spi/pl022: make the chip deselect handling thread safe spi/pl022: add support for pm_runtime autosuspend spi/pl022: disable the PL022 block when unused spi/pl022: move device disable to workqueue thread spi/pl022: skip default configuration before suspending spi/pl022: fix build warnings spi/pl022: only enable RX interrupts when TX is complete	2012-01-07 12:15:36 -08:00
Linus Torvalds	7affca3537	Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits) arm: fix up some samsung merge sysdev conversion problems firmware: Fix an oops on reading fw_priv->fw in sysfs loading file Drivers:hv: Fix a bug in vmbus_driver_unregister() driver core: remove __must_check from device_create_file debugfs: add missing #ifdef HAS_IOMEM arm: time.h: remove device.h #include driver-core: remove sysdev.h usage. clockevents: remove sysdev.h arm: convert sysdev_class to a regular subsystem arm: leds: convert sysdev_class to a regular subsystem kobject: remove kset_find_obj_hinted() m86k: gpio - convert sysdev_class to a regular subsystem mips: txx9_sram - convert sysdev_class to a regular subsystem mips: 7segled - convert sysdev_class to a regular subsystem sh: dma - convert sysdev_class to a regular subsystem sh: intc - convert sysdev_class to a regular subsystem power: suspend - convert sysdev_class to a regular subsystem power: qe_ic - convert sysdev_class to a regular subsystem power: cmm - convert sysdev_class to a regular subsystem s390: time - convert sysdev_class to a regular subsystem ... Fix up conflicts with 'struct sysdev' removal from various platform drivers that got changed: - arch/arm/mach-exynos/cpu.c - arch/arm/mach-exynos/irq-eint.c - arch/arm/mach-s3c64xx/common.c - arch/arm/mach-s3c64xx/cpu.c - arch/arm/mach-s5p64x0/cpu.c - arch/arm/mach-s5pv210/common.c - arch/arm/plat-samsung/include/plat/cpu.h - arch/powerpc/kernel/sysfs.c and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h	2012-01-07 12:03:30 -08:00
Al Viro	ece2ccb668	Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z	2012-01-06 23:15:54 -05:00
Linus Torvalds	e4e88f31bc	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (185 commits) powerpc: fix compile error with 85xx/p1010rdb.c powerpc: fix compile error with 85xx/p1023_rds.c powerpc/fsl: add MSI support for the Freescale hypervisor arch/powerpc/sysdev/fsl_rmu.c: introduce missing kfree powerpc/fsl: Add support for Integrated Flash Controller powerpc/fsl: update compatiable on fsl 16550 uart nodes powerpc/85xx: fix PCI and localbus properties in p1022ds.dts powerpc/85xx: re-enable ePAPR byte channel driver in corenet32_smp_defconfig powerpc/fsl: Update defconfigs to enable some standard FSL HW features powerpc: Add TBI PHY node to first MDIO bus sbc834x: put full compat string in board match check powerpc/fsl-pci: Allow 64-bit PCIe devices to DMA to any memory address powerpc: Fix unpaired probe_hcall_entry and probe_hcall_exit offb: Fix setting of the pseudo-palette for >8bpp offb: Add palette hack for qemu "standard vga" framebuffer offb: Fix bug in calculating requested vram size powerpc/boot: Change the WARN to INFO for boot wrapper overlap message powerpc/44x: Fix build error on currituck platform powerpc/boot: Change the load address for the wrapper to fit the kernel powerpc/44x: Enable CRASH_DUMP for 440x ... Fix up a trivial conflict in arch/powerpc/include/asm/cputime.h due to the additional sparse-checking code for cputime_t.	2012-01-06 17:58:22 -08:00
Linus Torvalds	9753dfe19a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1958 commits) net: pack skb_shared_info more efficiently net_sched: red: split red_parms into parms and vars net_sched: sfq: extend limits cnic: Improve error recovery on bnx2x devices cnic: Re-init dev->stats_addr after chip reset net_sched: Bug in netem reordering bna: fix sparse warnings/errors bna: make ethtool_ops and strings const xgmac: cleanups net: make ethtool_ops const vmxnet3" make ethtool ops const xen-netback: make ops structs const virtio_net: Pass gfp flags when allocating rx buffers. ixgbe: FCoE: Add support for ndo_get_fcoe_hbainfo() call netdev: FCoE: Add new ndo_get_fcoe_hbainfo() call igb: reset PHY after recovering from PHY power down igb: add basic runtime PM support igb: Add support for byte queue limits. e1000: cleanup CE4100 MDIO registers access e1000: unmap ce4100_gbe_mdio_base_virt in e1000_remove ...	2012-01-06 17:22:09 -08:00
Bjorn Helgaas	45a709f890	powerpc/PCI: convert to pci_create_root_bus() Convert from pci_create_bus() to pci_create_root_bus(). This way the root bus resources are correct immediately. This patch doesn't fix a problem because powerpc fixed the resources before scanning the bus, but it makes powerpc more consistent with other architectures. v2: fix build error with resource pointer passing CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-01-06 12:11:09 -08:00
Bjorn Helgaas	49a6cba4eb	powerpc/PCI: split PHB part out of pcibios_map_io_space() No functional change. This is so we can use pcibios_phb_map_io_space() before we have a struct pci_bus. v2: fix map io phb typo CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-01-06 12:11:08 -08:00
Bjorn Helgaas	a46770f5b9	powerpc/PCI: make pcibios_setup_phb_resources() static CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-01-06 12:11:08 -08:00
Myron Stowe	79c8be8384	PCI: PowerPC: convert pcibios_set_master() to a non-inlined function This patch converts PowerPC's architecture-specific 'pcibios_set_master()' routine to a non-inlined function. This will allow follow on patches to create a generic 'pcibios_set_master()' function using the '__weak' attribute which can be used by all architectures as a default which, if necessary, can then be over- ridden by architecture-specific code. Converting 'pci_bios_set_master()' to a non-inlined function will allow PowerPC's 'pcibios_set_master()' implementation to remain architecture-specific after the generic version is introduced and thus, not change current behavior. No functional change. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-01-06 12:10:38 -08:00
Greg Kroah-Hartman	ff4b8a57f0	Merge branch 'driver-core-next' into Linux 3.2 This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file, and it fixes the build error in the arch/x86/kernel/microcode_core.c file, that the merge did not catch. The microcode_core.c patch was provided by Stephen Rothwell <sfr@canb.auug.org.au> who was invaluable in the merge issues involved with the large sysdev removal process in the driver-core tree. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2012-01-06 11:42:52 -08:00
Linus Torvalds	0db49b72bc	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) sched/tracing: Add a new tracepoint for sleeptime sched: Disable scheduler warnings during oopses sched: Fix cgroup movement of waking process sched: Fix cgroup movement of newly created process sched: Fix cgroup movement of forking process sched: Remove cfs bandwidth period check in tg_set_cfs_period() sched: Fix load-balance lock-breaking sched: Replace all_pinned with a generic flags field sched: Only queue remote wakeups when crossing cache boundaries sched: Add missing rcu_dereference() around ->real_parent usage [S390] fix cputime overflow in uptime_proc_show [S390] cputime: add sparse checking and cleanup sched: Mark parent and real_parent as __rcu sched, nohz: Fix missing RCU read lock sched, nohz: Set the NOHZ_BALANCE_KICK flag for idle load balancer sched, nohz: Fix the idle cpu check in nohz_idle_balance sched: Use jump_labels for sched_feat sched/accounting: Fix parameter passing in task_group_account_field sched/accounting: Fix user/system tick double accounting sched/accounting: Re-use scheduler statistics for the root cgroup ... Fix up conflicts in - arch/ia64/include/asm/cputime.h, include/asm-generic/cputime.h usecs_to_cputime64() vs the sparse cleanups - kernel/sched/fair.c, kernel/time/tick-sched.c scheduler changes in multiple branches	2012-01-06 08:44:54 -08:00
Linus Torvalds	423d091dfe	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) cpu: Export cpu_up() rcu: Apply ACCESS_ONCE() to rcu_boost() return value Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" docs: Additional LWN links to RCU API rcu: Augment rcu_batch_end tracing for idle and callback state rcu: Add rcutorture tests for srcu_read_lock_raw() rcu: Make rcutorture test for hotpluggability before offlining CPUs driver-core/cpu: Expose hotpluggability to the rest of the kernel rcu: Remove redundant rcu_cpu_stall_suppress declaration rcu: Adaptive dyntick-idle preparation rcu: Keep invoking callbacks if CPU otherwise idle rcu: Irq nesting is always 0 on rcu_enter_idle_common rcu: Don't check irq nesting from rcu idle entry/exit rcu: Permit dyntick-idle with callbacks pending rcu: Document same-context read-side constraints rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass rcu: Remove dynticks false positives and RCU failures rcu: Reduce latency of rcu_prepare_for_idle() rcu: Eliminate RCU_FAST_NO_HZ grace-period hang rcu: Avoid needlessly IPIing CPUs at GP end ...	2012-01-06 08:02:40 -08:00
Linus Torvalds	4a2164a7db	Merge branch 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) memblock: Reimplement memblock allocation using reverse free area iterator memblock: Kill early_node_map[] score: Use HAVE_MEMBLOCK_NODE_MAP s390: Use HAVE_MEMBLOCK_NODE_MAP mips: Use HAVE_MEMBLOCK_NODE_MAP ia64: Use HAVE_MEMBLOCK_NODE_MAP SuperH: Use HAVE_MEMBLOCK_NODE_MAP sparc: Use HAVE_MEMBLOCK_NODE_MAP powerpc: Use HAVE_MEMBLOCK_NODE_MAP memblock: Implement memblock_add_node() memblock: s/memblock_analyze()/memblock_allow_resize()/ and update users memblock: Track total size of regions automatically powerpc: Cleanup memblock usage memblock: Reimplement memblock_enforce_memory_limit() using __memblock_remove() memblock: Make memblock functions handle overflowing range @size memblock: Reimplement __memblock_remove() using memblock_isolate_range() memblock: Separate out memblock_isolate_range() from memblock_set_node() memblock: Kill memblock_init() memblock: Kill sentinel entries at the end of static region arrays memblock: Add __memblock_dump_all() ...	2012-01-06 07:54:53 -08:00
Tony Breeds	ef88e3911c	powerpc: fix compile error with 85xx/p1010rdb.c Current linux-next compiled with mpc85xx_defconfig causes this: arch/powerpc/platforms/85xx/p1010rdb.c:41:14: error: 'np' undeclared (first use in this function) arch/powerpc/platforms/85xx/p1023_rds.c:102:14: error: 'np' undeclared (first use in this function) Introduced in: commit `996983b75c` Author: Kyle Moffett <Kyle.D.Moffett@boeing.com> Date: Fri Dec 2 06:28:02 2011 +0000 powerpc/mpic: Search for open-pic device-tree node if NULL Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:52:40 -06:00
Michael Neuling	2c1c743039	powerpc: fix compile error with 85xx/p1023_rds.c Current linux-next compiled with mpc85xx_smp_defconfig causes this: arch/powerpc/platforms/85xx/p1023_rds.c: In function 'mpc85xx_rds_pic_init': arch/powerpc/platforms/85xx/p1023_rds.c:102:14: error: 'np' undeclared (first use in this function) arch/powerpc/platforms/85xx/p1023_rds.c:102:14: note: each undeclared identifier is reported only once for each function it appears in Introduced in: commit `996983b75c` Author: Kyle Moffett <Kyle.D.Moffett@boeing.com> Date: Fri Dec 2 06:28:02 2011 +0000 powerpc/mpic: Search for open-pic device-tree node if NULL Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:51:24 -06:00
Timur Tabi	446bc1ffe4	powerpc/fsl: add MSI support for the Freescale hypervisor Add support for vmpic-msi nodes to the fsl_msi driver. The MSI is virtualized by the hypervisor, so the vmpic-msi does not contain a 'reg' property. Instead, the driver uses hcalls. Add support for the "msi-address-64" property to the fsl_pci driver. The Freescale hypervisor typically puts the virtualized MSIIR register in the page after the end of DDR, so we extend the DDR ATMU to cover it. Any other location for MSIIR is not supported, for now. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:47:44 -06:00
Julia Lawall	c6ca52ad32	arch/powerpc/sysdev/fsl_rmu.c: introduce missing kfree rmu needs to be freed before leaving the function in an error case. A simplified version of the semantic match that finds the problem is as follows: (http://coccinelle.lip6.fr) // <smpl> @r exists@ local idexpression x; statement S; identifier f1; position p1,p2; expression ptr != NULL; @@ x@p1 = $kmalloc\\|kzalloc\\|kcalloc$(...); ... if (x == NULL) S <... when != x when != if (...) { <+...x...+> } x->f1 ...> ( return $0\\|<+...x...+>\\|ptr$; \| return@p2 ...; ) @script:python@ p1 << r.p1; p2 << r.p2; @@ print " file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:43:03 -06:00
Prabhakar Kushwaha	a20cbdeffc	powerpc/fsl: Add support for Integrated Flash Controller Integrated Flash Controller supports various flashes like NOR, NAND and other devices using NOR, NAND and GPCM Machine available on it. IFC supports four chip selects. Signed-off-by: Dipen Dudhat <Dipen.Dudhat@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Li Yang <leoli@freescale.com> Signed-off-by: Liu Shuo <b35362@freescale.com> Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:41:22 -06:00
Kumar Gala	f706bed114	powerpc/fsl: update compatiable on fsl 16550 uart nodes The Freescale serial port's are pretty much a 16550, however there are some FSL specific bugs and features. Add a "fsl,ns16550" compatiable string to allow code to handle those FSL specific issues. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:38:40 -06:00
Timur Tabi	0725696306	powerpc/85xx: fix PCI and localbus properties in p1022ds.dts PCI ranges, localbus reg and localbus chip-select 2 range do not match the memory map setup by bootloader. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:33:58 -06:00
Timur Tabi	3900fad3d3	powerpc/85xx: re-enable ePAPR byte channel driver in corenet32_smp_defconfig Commit `7c4b2f09` (powerpc: Update mpc85xx/corenet 32-bit defconfigs) accidentally disabled the ePAPR byte channel driver in the defconfig for Freescale CoreNet platforms. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:33:58 -06:00
Kumar Gala	389a6c527e	powerpc/fsl: Update defconfigs to enable some standard FSL HW features corenet64_smp_defconfig: - enabled rapidio corenet32_smp_defconfig: - enabled hugetlbfs, rapidio mpc85xx_smp_defconfig: - enabled P1010RDB, hugetlbfs, SPI, SDHC, Crypto/CAAM mpc85xx_smp_defconfig: - enabled hugetlbfs, SPI, SDHC, Crypto/CAAM Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:33:58 -06:00
Andy Fleming	220669495b	powerpc: Add TBI PHY node to first MDIO bus Systems which use the fsl_pq_mdio driver need to specify an address for TBI PHY transactions such that the address does not conflict with any PHYs on the bus (all transactions to that address are directed to the onboard TBI PHY). The driver used to scan for a free address if no address was specified, however this ran into issues when the PHY Lib was fixed so that all MDIO transactions were protected by a mutex. As it is, the code was meant to serve as a transitional tool until the device trees were all updated to specify the TBI address. The best fix for the mutex issue was to remove the scanning code, but it turns out some of the newer SoCs have started to omit the tbi-phy node when SGMII is not being used. As such, these devices will now fail unless we add a tbi-phy node to the first mdio controller. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:33:51 -06:00
Paul Gortmaker	dabc78403f	sbc834x: put full compat string in board match check The commit 883c2cfc8bcc0fd00c5d9f596fb8870f481b5bda: "fix of_flat_dt_is_compatible() to match the full compatible string" causes silent boot death on the sbc8349 board because it was just looking for 8349 and not 8349E -- as originally there were non-E (no SEC/encryption) chips available. Just add the E to the board detection string since all boards I've seen were manufactured with the E versions. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:27:59 -06:00
Kumar Gala	96ea3b4a70	powerpc/fsl-pci: Allow 64-bit PCIe devices to DMA to any memory address There is an issue on FSL-BookE 64-bit devices (P5020) in which PCIe devices that are capable of doing 64-bit DMAs (like an Intel e1000) do not function and crash the kernel if we have >4G of memory in the system. The reason is that the existing code only sets up one inbound window for access to system memory across PCIe. That window is limited to a 32-bit address space. So on systems we'll end up utilizing SWIOTLB for dma mappings. However SWIOTLB dma ops implement dma_alloc_coherent() as dma_direct_alloc_coherent(). Thus we can end up with dma addresses that are not accessible because of the inbound window limitation. We could possibly set the SWIOTLB alloc_coherent op to swiotlb_alloc_coherent() however that does not address the issue since the swiotlb_alloc_coherent() will behave almost identical to dma_direct_alloc_coherent() since the devices coherent_dma_mask will be greater than any address allocated by swiotlb_alloc_coherent() and thus we'll never bounce buffer it into a range that would be dma-able. The easiest and best solution is to just make it so that a 64-bit capable device is able to DMA to any internal system address. We accomplish this by opening up a second inbound window that maps all of memory above the internal SoC address width so we can set it up to access all of the internal SoC address space if needed. We than fixup the dma_ops and dma_offset for PCIe devices with a dma mask greater than the maximum internal SoC address. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2012-01-04 15:27:58 -06:00
Al Viro	0583fcc96b	consolidate umode_t declarations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:17 -05:00
Al Viro	1bc94226d5	switch spu_create(2) to use of SYSCALL_DEFINE4, make it use umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:16 -05:00
Al Viro	c6684b2685	switch spufs guts to umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:15 -05:00
Al Viro	d161a13f97	switch procfs to umode_t use both proc_dir_entry ->mode and populating functions Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:56 -05:00
Al Viro	ff01bb4832	fs: move code out of buffer.c Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export kill_bdev as well, so brd doesn't have to open code it. Reduce buffer_head.h requirement accordingly. Removed a rather large comment from invalidate_bdev, as it looked a bit obsolete to bother moving. The small comment replacing it says enough. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:07 -05:00
Al Viro	6b520e0565	vfs: fix the stupidity with i_dentry in inode destructors Seeing that just about every destructor got that INIT_LIST_HEAD() copied into it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once(); the cost of taking it into inode_init_always() will be negligible for pipes and sockets and negative for everything else. Not to mention the removal of boilerplate code from ->destroy_inode() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:52:40 -05:00
Li Zhong	e4f387d8db	powerpc: Fix unpaired probe_hcall_entry and probe_hcall_exit Unpaired calling of probe_hcall_entry and probe_hcall_exit might happen as following, which could cause incorrect preempt count. __trace_hcall_entry => trace_hcall_entry -> probe_hcall_entry => get_cpu_var => preempt_disable __trace_hcall_exit => trace_hcall_exit -> probe_hcall_exit => put_cpu_var => preempt_enable where: A => B and A -> B means A calls B, but => means A will call B through function name, and B will definitely be called. -> means A will call B through function pointer, so B might not be called if the function pointer is not set. So error happens when only one of probe_hcall_entry and probe_hcall_exit get called during a hcall. This patch tries to move the preempt count operations from probe_hcall_entry and probe_hcall_exit to its callers. Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: stable@kernel.org [v2.6.32+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-01-03 12:09:27 +11:00
David S. Miller	7f8e3234c5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2011-12-30 13:04:14 -05:00
Andreas Schwab	34845636a1	procfs: do not confuse jiffies with cputime64_t Commit `2a95ea6c0d` ("procfs: do not overflow get_{idle,iowait}_time for nohz") did not take into account that one some architectures jiffies and cputime use different units. This causes get_idle_time() to return numbers in the wrong units, making the idle time fields in /proc/stat wrong. Instead of converting the usec value returned by get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function usecs_to_cputime64 to convert it to the correct unit of cputime64_t. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Artem S. Tashkinov" <t.artem@mailcity.com> Cc: Dave Jones <davej@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-12-29 16:31:57 -08:00
Alexander Graf	da69dee073	KVM: PPC: Whitespace fix for kvm.h kvm.h had sparse whitespace at the end of the line. Clean it up so syncing with QEMU gets easier. Signed-off-by: Alexander Graf <agraf@suse.de>	2011-12-27 11:26:43 +02:00
Nishanth Aravamudan	6c9b7c409c	KVM: PPC: annotate kvm_rma_init as __init kvm_rma_init() is only called at boot-time, by setup_arch, which is also __init. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-12-27 11:26:40 +02:00
Xiao Guangrong	28a37544fb	KVM: introduce id_to_memslot function Introduce id_to_memslot to get memslot by slot id Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-12-27 11:17:39 +02:00
Scott Wood	fae9dbb4b4	KVM: PPC: e500: include linux/export.h This is required for THIS_MODULE. We recently stopped acquiring it via some other header. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-12-26 13:28:03 +02:00
Michael Neuling	251da03897	KVM: PPC: fix kvmppc_start_thread() for CONFIG_SMP=N Currently kvmppc_start_thread() tries to wake other SMT threads via xics_wake_cpu(). Unfortunately xics_wake_cpu only exists when CONFIG_SMP=Y so when compiling with CONFIG_SMP=N we get: arch/powerpc/kvm/built-in.o: In function `.kvmppc_start_thread': book3s_hv.c:(.text+0xa1e0): undefined reference to `.xics_wake_cpu' The following should be fine since kvmppc_start_thread() shouldn't called to start non-zero threads when SMP=N since threads_per_core=1. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-12-26 13:28:02 +02:00
Andreas Schwab	96f38d7286	KVM: PPC: protect use of kvmppc_h_pr kvmppc_h_pr is only available if CONFIG_KVM_BOOK3S_64_PR. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-12-26 13:28:01 +02:00
Andreas Schwab	36cc66d638	KVM: PPC: move compute_tlbie_rb to book3s_64 common header compute_tlbie_rb is only used on ppc64 and cannot be compiled on ppc32. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2011-12-26 13:28:00 +02:00
Kay Sievers	edbaa603eb	driver-core: remove sysdev.h usage. The sysdev.h file should not be needed by any in-kernel code, so remove the .h file from these random files that seem to still want to include it. The sysdev code will be going away soon, so this include needs to be removed no matter what. Cc: Jiandong Zheng <jdzheng@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: David Brown <davidb@codeaurora.org> Cc: Daniel Walker <dwalker@fifo99.com> Cc: Bryan Huntsman <bryanh@codeaurora.org> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Wan ZongShun <mcuos.com@gmail.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "Venkatesh Pallipadi Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>	2011-12-21 16:26:03 -08:00
Kay Sievers	86ba41d033	power: suspend - convert sysdev_class to a regular subsystem After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 15:09:52 -08:00
Kay Sievers	cfde779fcc	power: qe_ic - convert sysdev_class to a regular subsystem After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Cc: Timur Tabi <timur@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 15:09:51 -08:00
Kay Sievers	6c9d290952	power: cmm - convert sysdev_class to a regular subsystem After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 15:09:51 -08:00
Kay Sievers	10fbcf4c6c	convert 'memory' sysdev_class to a regular subsystem This moves the 'memory sysdev_class' over to a regular 'memory' subsystem and converts the devices to regular devices. The sysdev drivers are implemented as subsystem interfaces now. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 14:48:43 -08:00
Kay Sievers	8a25a2fd12	cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem and converts the devices to regular devices. The sysdev drivers are implemented as subsystem interfaces now. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Userspace relies on events and generic sysfs subsystem infrastructure from sysdev devices, which are made available with this conversion. Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Len Brown <lenb@kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 14:29:42 -08:00
Rafael J. Wysocki	90363ddf0a	PM: Drop generic_subsys_pm_ops Since the PM core is now going to execute driver callbacks directly if the corresponding subsystem callbacks are not present, forward-only subsystem callbacks (i.e. such that only execute the corresponding driver callbacks) are not necessary any more. Thus it is possible to remove generic_subsys_pm_ops, because the only callback in there that is not forward-only, .runtime_idle, is not really used by the only user of generic_subsys_pm_ops, which is vio_bus_type. However, the generic callback routines themselves cannot be removed from generic_ops.c, because they are used individually by a number of subsystems. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-12-21 22:03:32 +01:00
Rafael J. Wysocki	b00f4dc5ff	Merge branch 'master' into pm-sleep * master: (848 commits) SELinux: Fix RCU deref check warning in sel_netport_insert() binary_sysctl(): fix memory leak mm/vmalloc.c: remove static declaration of va from __get_vm_area_node ipmi_watchdog: restore settings when BMC reset oom: fix integer overflow of points in oom_badness memcg: keep root group unchanged if creation fails nilfs2: potential integer overflow in nilfs_ioctl_clean_segments() nilfs2: unbreak compat ioctl cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask evm: prevent racing during tfm allocation evm: key must be set once during initialization mmc: vub300: fix type of firmware_rom_wait_states module parameter Revert "mmc: enable runtime PM by default" mmc: sdhci: remove "state" argument from sdhci_suspend_host x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT IB/qib: Correct sense on freectxts increment and decrement RDMA/cma: Verify private data length cgroups: fix a css_set not found bug in cgroup_attach_proc oprofile: Fix uninitialized memory access when writing to writing to oprofilefs Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel" ... Conflicts: kernel/cgroup_freezer.c	2011-12-21 21:59:45 +01:00
Suzuki Poulose	eba3d97db8	powerpc/boot: Change the WARN to INFO for boot wrapper overlap message commit `c55aef0e5b` ("powerpc/boot: Change the load address for the wrapper to fit the kernel") introduced a WARNING to inform the user that the uncompressed kernel would overlap the boot uncompressing wrapper code. Change it to an INFO. I initially thought, this would be a 'WARNING' for the those boards, where the link_address should be fixed, so that the user can take actions accordingly. Changing the same to INFO. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-21 15:09:25 -05:00
Peter Zijlstra	35edc2a509	perf, arch: Rework perf_event_index() Put the logic to compute the event index into a per pmu method. This is required because the x86 rules are weird and wonderful and don't match the capabilities of the current scheme. AFAIK only powerpc actually has a usable userspace read of the PMCs but I'm not at all sure anybody actually used that. ARM is restored to the default since it currently does not support userspace access at all. And all software events are provided with a method that reports their index as 0 (disabled). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Stephane Eranian <eranian@google.com> Cc: Arun Sharma <asharma@fb.com> Link: http://lkml.kernel.org/n/tip-dfydxodki16lylkt3gl2j7cw@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-21 11:01:07 +01:00
Josh Boyer	eb975652b8	powerpc/44x: Fix build error on currituck platform The MPIC_PRIMARY define was recently made "default" and the meaning was inverted to MPIC_SECONDARY. This causes compile errors in currituck now, so fix it to the new manner of allocating mpics. Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:41:28 -05:00
Suzuki Poulose	c55aef0e5b	powerpc/boot: Change the load address for the wrapper to fit the kernel The wrapper code which uncompresses the kernel in case of a 'ppc' boot is by default loaded at 0x00400000 and the kernel will be uncompressed to fit the location 0-0x00400000. But with dynamic relocations, the size of the kernel may exceed 0x00400000(4M). This would cause an overlap of the uncompressed kernel and the boot wrapper, causing a failure in boot. The message looks like : zImage starting: loaded at 0x00400000 (sp: 0x0065ffb0) Allocating 0x5ce650 bytes for kernel ... Insufficient memory for kernel at address 0! (_start=00400000, uncompressed size=00591a20) This patch shifts the load address of the boot wrapper code to the next higher MB, according to the size of the uncompressed vmlinux. With the patch, we get the following message while building the image : WARN: Uncompressed kernel (size 0x5b0344) overlaps the address of the wrapper(0x400000) WARN: Fixing the link_address of wrapper to (0x600000) Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:22:56 -05:00
Suzuki Poulose	5b2e478da0	powerpc/44x: Enable CRASH_DUMP for 440x Now that we have relocatable kernel, supporting CRASH_DUMP only requires turning the switches on for UP machines. We don't have kexec support on 47x yet. Enabling SMP support would be done as part of enabling the PPC_47x support. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:22:14 -05:00
Suzuki Poulose	26ecb6c44b	powerpc/44x: Enable CONFIG_RELOCATABLE for PPC44x The following patch adds relocatable kernel support - based on processing of dynamic relocations - for PPC44x kernel. We find the runtime address of _stext and relocate ourselves based on the following calculation. virtual_base = ALIGN(KERNELBASE,256M) + MODULO(_stext.run,256M) relocate() is called with the Effective Virtual Base Address (as shown below) \| Phys. Addr\| Virt. Addr \| Page (256M) \|------------------------\| Boundary \| \| \| \| \| \| \| \| \| Kernel Load \|___________\|_ __ _ _ _ _\|<- Effective Addr(_stext)\| \| ^ \|Virt. Base Addr \| \| \| \| \| \| \| \| \| \|reloc_offset\| \| \| \| \| \| \| \| \| \| \|______v_____\|<-(KERNELBASE)%256M \| \| \| \| \| \| \| \| \| Page(256M) \|-----------\|------------\| Boundary \| \| \| The virt_phys_offset is updated accordingly, i.e, virt_phys_offset = effective. kernel virt base - kernstart_addr I have tested the patches on 440x platforms only. However this should work fine for PPC_47x also, as we only depend on the runtime address and the current TLB XLAT entry for the startup code, which is available in r25. I don't have access to a 47x board yet. So, it would be great if somebody could test this on 47x. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Tony Breeds <tony@bakeyournoodle.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:21:57 -05:00
Suzuki Poulose	368ff8f14d	powerpc: Define virtual-physical translations for RELOCATABLE We find the runtime address of _stext and relocate ourselves based on the following calculation. virtual_base = ALIGN(KERNELBASE,KERNEL_TLB_PIN_SIZE) + MODULO(_stext.run,KERNEL_TLB_PIN_SIZE) relocate() is called with the Effective Virtual Base Address (as shown below) \| Phys. Addr\| Virt. Addr \| Page \|------------------------\| Boundary \| \| \| \| \| \| \| \| \| Kernel Load \|___________\|_ __ _ _ _ _\|<- Effective Addr(_stext)\| \| ^ \|Virt. Base Addr \| \| \| \| \| \| \| \| \| \|reloc_offset\| \| \| \| \| \| \| \| \| \| \|______v_____\|<-(KERNELBASE)%TLB_SIZE \| \| \| \| \| \| \| \| \| Page \|-----------\|------------\| Boundary \| \| \| On BookE, we need __va() & __pa() early in the boot process to access the device tree. Currently this has been defined as : #define __va(x) ((void )(unsigned long)((phys_addr_t)(x) - PHYSICAL_START + KERNELBASE) where: PHYSICAL_START is kernstart_addr - a variable updated at runtime. KERNELBASE is the compile time Virtual base address of kernel. This won't work for us, as kernstart_addr is dynamic and will yield different results for __va()/__pa() for same mapping. e.g., Let the kernel be loaded at 64MB and KERNELBASE be 0xc0000000 (same as PAGE_OFFSET). In this case, we would be mapping 0 to 0xc0000000, and kernstart_addr = 64M Now __va(1MB) = (0x100000) - (0x4000000) + 0xc0000000 = 0xbc100000 , which is wrong. it should be : 0xc0000000 + 0x100000 = 0xc0100000 On platforms which support AMP, like PPC_47x (based on 44x), the kernel could be loaded at highmem. Hence we cannot always depend on the compile time constants for mapping. Here are the possible solutions: 1) Update kernstart_addr(PHSYICAL_START) to match the Physical address of compile time KERNELBASE value, instead of the actual Physical_Address(_stext). The disadvantage is that we may break other users of PHYSICAL_START. They could be replaced with __pa(_stext). 2) Redefine __va() & __pa() with relocation offset #ifdef CONFIG_RELOCATABLE_PPC32 #define __va(x) ((void )(unsigned long)((phys_addr_t)(x) - PHYSICAL_START + (KERNELBASE + RELOC_OFFSET))) #define __pa(x) ((unsigned long)(x) + PHYSICAL_START - (KERNELBASE + RELOC_OFFSET)) #endif where, RELOC_OFFSET could be a) A variable, say relocation_offset (like kernstart_addr), updated at boot time. This impacts performance, as we have to load an additional variable from memory. OR b) #define RELOC_OFFSET ((PHYSICAL_START & PPC_PIN_SIZE_OFFSET_MASK) - \ (KERNELBASE & PPC_PIN_SIZE_OFFSET_MASK)) This introduces more calculations for doing the translation. 3) Redefine __va() & __pa() with a new variable i.e, #define __va(x) ((void )(unsigned long)((phys_addr_t)(x) + VIRT_PHYS_OFFSET)) where VIRT_PHYS_OFFSET : #ifdef CONFIG_RELOCATABLE_PPC32 #define VIRT_PHYS_OFFSET virt_phys_offset #else #define VIRT_PHYS_OFFSET (KERNELBASE - PHYSICAL_START) #endif / CONFIG_RELOCATABLE_PPC32 */ where virt_phy_offset is updated at runtime to : Effective KERNELBASE - kernstart_addr. Taking our example, above: virt_phys_offset = effective_kernelstart_vaddr - kernstart_addr = 0xc0400000 - 0x400000 = 0xc0000000 and __va(0x100000) = 0xc0000000 + 0x100000 = 0xc0100000 which is what we want. I have implemented (3) in the following patch which has same cost of operation as the existing one. I have tested the patches on 440x platforms only. However this should work fine for PPC_47x also, as we only depend on the runtime address and the current TLB XLAT entry for the startup code, which is available in r25. I don't have access to a 47x board yet. So, it would be great if somebody could test this on 47x. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:21:34 -05:00
Suzuki Poulose	9c5f7d39a8	powerpc: Process dynamic relocations for kernel The following patch implements the dynamic relocation processing for PPC32 kernel. relocate() accepts the target virtual address and relocates the kernel image to the same. Currently the following relocation types are handled : R_PPC_RELATIVE R_PPC_ADDR16_LO R_PPC_ADDR16_HI R_PPC_ADDR16_HA The last 3 relocations in the above list depends on value of Symbol indexed whose index is encoded in the Relocation entry. Hence we need the Symbol Table for processing such relocations. Note: The GNU ld for ppc32 produces buggy relocations for relocation types that depend on symbols. The value of the symbols with STB_LOCAL scope should be assumed to be zero. - Alan Modra Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Signed-off-by: Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Alan Modra <amodra@au1.ibm.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:21:08 -05:00
Suzuki Poulose	2391324545	powerpc/44x: Enable DYNAMIC_MEMSTART for 440x DYNAMIC_MEMSTART(old RELOCATABLE) was restricted only to PPC_47x variants of 44x. This patch enables DYNAMIC_MEMSTART for 440x based chipsets. Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linux ppc dev <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:20:38 -05:00
Suzuki Poulose	0f890c8d20	powerpc: Rename mapping based RELOCATABLE to DYNAMIC_MEMSTART for BookE The current implementation of CONFIG_RELOCATABLE in BookE is based on mapping the page aligned kernel load address to KERNELBASE. This approach however is not enough for platforms, where the TLB page size is large (e.g, 256M on 44x). So we are renaming the RELOCATABLE used currently in BookE to DYNAMIC_MEMSTART to reflect the actual method. The CONFIG_RELOCATABLE for PPC32(BookE) based on processing of the dynamic relocations will be introduced in the later in the patch series. This change would allow the use of the old method of RELOCATABLE for platforms which can afford to enforce the page alignment (platforms with smaller TLB size). Changes since v3: * Introduced a new config, NONSTATIC_KERNEL, to denote a kernel which is either a RELOCATABLE or DYNAMIC_MEMSTART(Suggested by: Josh Boyer) Suggested-by: Scott Wood <scottwood@freescale.com> Tested-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Josh Boyer <jwboyer@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linux ppc dev <linuxppc-dev@lists.ozlabs.org> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-20 10:20:19 -05:00
Benjamin Herrenschmidt	3f53638c80	powerpc: Fix old bug in prom_init setting of the color We have an array of 16 entries and a loop of 32 iterations... oops. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:41:25 +11:00
Paul Mackerras	64968f60e7	powerpc: Only use initrd_end as the limit for alloc_bottom if it's inside the RMO. As the kernels and initrd's get bigger boot-loaders and possibly kexec-tools will need to place the initrd outside the RMO. When this happens we end up with no lowmem and the boot doesn't get very far. Only use initrd_end as the limit for alloc_bottom if it's inside the RMO. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:41:24 +11:00
Anton Blanchard	b206590c04	powerpc: Fix comment explaining our VSID layout We support 16TB of user address space and half a million contexts so update the comment to reflect this. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:41:22 +11:00
Andreas Schwab	9f5072d4f6	powerpc: Fix wrong divisor in usecs_to_cputime Commit `d57af9b` (taskstats: use real microsecond granularity for CPU times) renamed msecs_to_cputime to usecs_to_cputime, but failed to update all numbers on the way. This causes nonsensical cpu idle/iowait values to be displayed in /proc/stat (the only user of usecs_to_cputime so far). This also renames __cputime_msec_factor to __cputime_usec_factor, adapting its value and using it directly in cputime_to_usecs instead of doing two multiplications. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:41:20 +11:00
David Rientjes	2011b1d0d3	powerpc/mm: Fix section mismatch for read_n_cells read_n_cells() cannot be marked as .devinit.text since it is referenced from two functions that are not in that section: of_get_lmb_size() and hot_add_drconf_scn_to_nid(). Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:41:18 +11:00
David Rientjes	28e86bdbc9	powerpc/mm: Fix section mismatch for mark_reserved_regions_for_nid mark_reserved_regions_for_nid() is only called from do_init_bootmem(), which is in .init.text, so it must be in the same section to avoid a section mismatch warning. Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:41:16 +11:00
Matt Evans	2c9c6ce019	powerpc: Add __SANE_USERSPACE_TYPES__ to asm/types.h for LL64 PPC64 uses long long for u64 in the kernel, but powerpc's asm/types.h prevents 64-bit userland from seeing this definition, instead defaulting to u64 == long in userspace. Some user programs (e.g. kvmtool) may actually want LL64, so this patch adds a check for __SANE_USERSPACE_TYPES__ so that, if defined, int-ll64.h is included instead. Signed-off-by: Matt Evans <matt@ozlabs.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:41:14 +11:00
Anton Blanchard	a66086b819	powerpc: POWER7 optimised copy_to_user/copy_from_user using VMX Implement a POWER7 optimised copy_to_user/copy_from_user using VMX. For large aligned copies this new loop is over 10% faster, and for large unaligned copies it is over 200% faster. If we take a fault we fall back to the old version, this keeps things relatively simple and easy to verify. On POWER7 unaligned stores rarely slow down - they only flush when a store crosses a 4KB page boundary. Furthermore this flush is handled completely in hardware and should be 20-30 cycles. Unaligned loads on the other hand flush much more often - whenever crossing a 128 byte cache line, or a 32 byte sector if either sector is an L1 miss. Considering this information we really want to get the loads aligned and not worry about the alignment of the stores. Microbenchmarks confirm that this approach is much faster than the current unaligned copy loop that uses shifts and rotates to ensure both loads and stores are aligned. We also want to try and do the stores in cacheline aligned, cacheline sized chunks. If the store queue is unable to merge an entire cacheline of stores then the L2 cache will have to do a read/modify/write. Even worse, we will serialise this with the stores in the next iteration of the copy loop since both iterations hit the same cacheline. Based on this, the new loop does the following things: 1 - 127 bytes Get the source 8 byte aligned and use 8 byte loads and stores. Pretty boring and similar to how the current loop works. 128 - 4095 bytes Get the source 8 byte aligned and use 8 byte loads and stores, 1 cacheline at a time. We aren't doing the stores in cacheline aligned chunks so we will potentially serialise once per cacheline. Even so it is much better than the loop we have today. 4096 - bytes If both source and destination have the same alignment get them both 16 byte aligned, then get the destination cacheline aligned. Do cacheline sized loads and stores using VMX. If source and destination do not have the same alignment, we get the destination cacheline aligned, and use permute to do aligned loads. In both cases the VMX loop should be optimal - we always do aligned loads and stores and are always doing stores in cacheline aligned, cacheline sized chunks. To be able to use VMX we must be careful about interrupts and sleeping. We don't use the VMX loop when in an interrupt (which should be rare anyway) and we wrap the VMX loop in disable/enable_pagefault and fall back to the existing copy_tofrom_user loop if we do need to sleep. The VMX breakpoint of 4096 bytes was chosen using this microbenchmark: http://ozlabs.org/~anton/junkcode/copy_to_user.c Since we are using VMX and there is a cost to saving and restoring the user VMX state there are two broad cases we need to benchmark: - Best case - userspace never uses VMX - Worst case - userspace always uses VMX In reality a userspace process will sit somewhere between these two extremes. Since we need to test both aligned and unaligned copies we end up with 4 combinations. The point at which the VMX loop begins to win is: 0% VMX aligned 2048 bytes unaligned 2048 bytes 100% VMX aligned 16384 bytes unaligned 8192 bytes Considering this is a microbenchmark, the data is hot in cache and the VMX loop has better store queue merging properties we set the breakpoint to 4096 bytes, a little below the unaligned breakpoints. Some future optimisations we can look at: - Looking at the perf data, a significant part of the cost when a task is always using VMX is the extra exception we take to restore the VMX state. As such we should do something similar to the x86 optimisation that restores FPU state for heavy users. ie: /* * If the task has used fpu the last 5 timeslices, just do a full * restore of the math state immediately to avoid the trap; the * chances of needing FPU soon are obviously high now / preload_fpu = tsk_used_math(next_p) && next_p->fpu_counter > 5; and / * fpu_counter contains the number of consecutive context switches * that the FPU is used. If this is over a threshold, the lazy fpu * saving becomes unlazy to save the trap. This is an unsigned char * so that after 256 times the counter wraps and the behavior turns * lazy again; this to deal with bursty apps that only use FPU for * a short time */ - We could create a paca bit to mirror the VMX enabled MSR bit and check that first, avoiding multiple calls to calling enable_kernel_altivec. That should help with iovec based system calls like readv. - We could have two VMX breakpoints, one for when we know the user VMX state is loaded into the registers and one when it isn't. This could be a second bit in the paca so we can calculate the break points quickly. - One suggestion from Ben was to save and restore the VSX registers we use inline instead of using enable_kernel_altivec. [BenH: Fixed a problem with preempt and fixed build without CONFIG_ALTIVEC] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-19 14:40:40 +11:00
Richard Kuo	0766387bcf	powerpc: Use rwsem.h from generic location As of commit `dd472da38`, rwsem.h was moved into asm-generic. This patch removes the arch file and points the build at its new location. Signed-off-by: Richard Kuo <rkuo@codeaurora.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-16 14:39:48 +11:00
Benjamin Herrenschmidt	1e7342e778	Merge remote-tracking branch 'jwb/next' into next Conflicts: arch/powerpc/platforms/40x/ppc40x_simple.c	2011-12-16 11:24:25 +11:00
Benjamin Herrenschmidt	78c5c68a4c	powerpc/pmac: Fix SMP kernels on pre-core99 UP machines The code for "powersurge" SMP would kick in and cause a crash at boot due to the lack of a NULL test. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-16 11:10:25 +11:00
Benjamin Herrenschmidt	8e609d5e7b	powerpc/pmac: Simplify old pmac PIC interrupt handling In the old days, we treated all interrupts from the legacy Apple home made interrupt controllers as level, with a trick reading the "level" register along with the "event" register to work arounds bugs where it would occasionally fail to latch some events. Doing so appeared to work fine for both level and edge interrupts. Later on, we discovered in Darwin source the magic masks that define which interrupts are actually level and which are edge, and implemented a different algorithm, more similar to what Apple does, that treats those differently. I recently discovered however that this caused problems (including loss of interrupts) with an old Wallstreet PowerBook when trying to use the internal modem (connected to a cascaded controller). It looks like some interrupts are treated as edge while they are really level and I'm starting to seriously doubt the correctness of the Darwin code (which has other obvious bugs when you read it, so ...) This patch reverts to our original behaviour of treating everything as a level interrupt. It appears to solve the problems with the modem on the Wallstreet and everything else seems to be working properly as well. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-16 11:10:11 +11:00
Benjamin Herrenschmidt	43ca5d347a	Merge branch 'kexec' into next	2011-12-16 11:09:21 +11:00
Benjamin Herrenschmidt	efdad722ef	Merge branch 'ps3' into next	2011-12-16 11:09:15 +11:00
Benjamin Herrenschmidt	e6f08d37e6	Merge branch 'cpuidle' into next	2011-12-16 11:09:11 +11:00
Martin Schwidefsky	648616343c	[S390] cputime: add sparse checking and cleanup Make cputime_t and cputime64_t nocast to enable sparse checking to detect incorrect use of cputime. Drop the cputime macros for simple scalar operations. The conversion macros are still needed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-12-15 14:56:19 +01:00
Grant Likely	1a2d397a6e	gpio/powerpc: Eliminate duplication of of_get_named_gpio_flags() A large chunk of qe_pin_request() is unnecessarily cut-and-paste directly from of_get_named_gpio_flags(). This patch cuts out the duplicate code and replaces it with a call to of_get_gpio(). v2: fixed compile error due to missing gpio_to_chip() Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org>	2011-12-12 13:40:16 -07:00
Frederic Weisbecker	1268fbc746	nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu() Those two APIs were provided to optimize the calls of tick_nohz_idle_enter() and rcu_idle_enter() into a single irq disabled section. This way no interrupt happening in-between would needlessly process any RCU job. Now we are talking about an optimization for which benefits have yet to be measured. Let's start simple and completely decouple idle rcu and dyntick idle logics to simplify. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:31:57 -08:00
Paul E. McKenney	a7b152d534	powerpc: Tell RCU about idle after hcall tracing The PowerPC pSeries platform (CONFIG_PPC_PSERIES=y) enables hypervisor-call tracing for CONFIG_TRACEPOINTS=y kernels. One of the hypervisor calls that is traced is the H_CEDE call in the idle loop that tells the hypervisor that this OS instance no longer needs the current CPU. However, tracing uses RCU, so this combination of kernel configuration variables needs to avoid telling RCU about the current CPU's idleness until after the H_CEDE-entry tracing completes on the one hand, and must tell RCU that the the current CPU is no longer idle before the H_CEDE-exit tracing starts. In all other cases, it suffices to inform RCU of CPU idleness upon idle-loop entry and exit. This commit makes the required adjustments. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:39 -08:00
Frederic Weisbecker	2bbb6817c0	nohz: Allow rcu extended quiescent state handling seperately from tick stop It is assumed that rcu won't be used once we switch to tickless mode and until we restart the tick. However this is not always true, as in x86-64 where we dereference the idle notifiers after the tick is stopped. To prepare for fixing this, add two new APIs: tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu(). If no use of RCU is made in the idle loop between tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch must instead call the new *_norcu() version such that the arch doesn't need to call rcu_idle_enter() and rcu_idle_exit(). Otherwise the arch must call tick_nohz_enter_idle() and tick_nohz_exit_idle() and also call explicitly: - rcu_idle_enter() after its last use of RCU before the CPU is put to sleep. - rcu_idle_exit() before the first use of RCU after the CPU is woken up. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:31:36 -08:00
Frederic Weisbecker	280f06774a	nohz: Separate out irq exit and idle loop dyntick logic The tick_nohz_stop_sched_tick() function, which tries to delay the next timer tick as long as possible, can be called from two places: - From the idle loop to start the dytick idle mode - From interrupt exit if we have interrupted the dyntick idle mode, so that we reprogram the next tick event in case the irq changed some internal state that requires this action. There are only few minor differences between both that are handled by that function, driven by the ts->inidle cpu variable and the inidle parameter. The whole guarantees that we only update the dyntick mode on irq exit if we actually interrupted the dyntick idle mode, and that we enter in RCU extended quiescent state from idle loop entry only. Split this function into: - tick_nohz_idle_enter(), which sets ts->inidle to 1, enters dynticks idle mode unconditionally if it can, and enters into RCU extended quiescent state. - tick_nohz_irq_exit() which only updates the dynticks idle mode when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called). To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed into tick_nohz_idle_exit(). This simplifies the code and micro-optimize the irq exit path (no need for local_irq_save there). This also prepares for the split between dynticks and rcu extended quiescent state logics. We'll need this split to further fix illegal uses of RCU in extended quiescent states in the idle loop. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:35 -08:00
Paul Gortmaker	9deaa53ac7	serial: add irq handler for Freescale 16550 errata. Sending a break on the SOC UARTs found in some MPC83xx/85xx/86xx chips seems to cause a short lived IRQ storm (/proc/interrupts typically shows somewhere between 300 and 1500 events). Unfortunately this renders SysRQ over the serial console completely inoperable. The suggested workaround in the errata is to read the Rx register, wait one character period, and then read the Rx register again. We achieve this by tracking the old LSR value, and on the subsequent interrupt event after a break, we don't read LSR, instead we just read the RBR again and return immediately. The "fsl,ns16550" is used in the compatible field of the serial device to mark UARTs known to have this issue. Thanks to Scott Wood for providing the errata data which led to a much cleaner fix. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-09 19:14:13 -08:00
Tony Breeds	228d550533	powerpc/47x: Add support for the new IBM currituck platform Based on original work by David 'Shaggy' Kleikamp. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:51:40 -05:00
Tony Breeds	df777bd39a	powerpc/476fpe: Add 476fpe SoC code Based on original work by David 'Shaggy' Kleikamp. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:51:02 -05:00
Tony Breeds	075bcf5879	powerpc/boot: Add mfdcrx Needed for currituck support. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:49:50 -05:00
Tony Breeds	e32a03290c	powerpc/boot: Add extended precision shifts to the boot wrapper. The upcomming currituck patches will need to do 64-bit shifts which will fail with undefined symbol without this patch. I looked at linking against libgcc but we can't guarantee that libgcc was compiled with soft-float. Also Using ../lib/div64.S or ../kernel/misc_32.S, this will break the build as the .o's need to be built with different flags for the bootwrapper vs the kernel. So for now the easyest option is to just copy code from arch/powerpc/kernel/misc_32.S I don't think this code changes too often ;P Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:49:27 -05:00
Christoph Egger	ca899859f1	powerpc/44x: Removing dead CONFIG_PPC47x CONFIG_PPC47x doesn't exist in Kconfig and no 476 processor calls this function ppc44x_pin_tlb() as it has it's own ppc47x_pin_tlb(). This code is probably an artifact of the original 476 code that shouldn't have made it upstream. Signed-off-by: Christoph Egger <siccegge@cs.fau.de> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:48:56 -05:00
Tony Breeds	466c2bc762	powerpc/44x: pci: Setup the dma_window properties for each pci_controller Needed if you want to use swiotlb, harmless otherwise. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:48:36 -05:00
Tony Breeds	8115846e1a	powerpc/44x: pci: Add a want_sdr flag into ppc4xx_pciex_hwops Currituck doesn't need nor use SDR so aborting the pci setup if there is no sdr-base would be bad. Add a flag to ppc4xx_pciex_hwops for the backends to state if they need SDR and then only complain and abort if they do and it's not found in the device tree. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:47:48 -05:00
Tony Breeds	9fb5529679	powerpc/44x: pci: Use PCI_BASE_ADDRESS_MEM_PREFETCH rather than magic value. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Josh Boyer <jwboyer@gmail.com>	2011-12-09 07:45:00 -05:00
Tejun Heo	0ee332c145	memblock: Kill early_node_map[] Now all ARCH_POPULATES_NODE_MAP archs select HAVE_MEBLOCK_NODE_MAP - there's no user of early_node_map[] left. Kill early_node_map[] and replace ARCH_POPULATES_NODE_MAP with HAVE_MEMBLOCK_NODE_MAP. Also, relocate for_each_mem_pfn_range() and helper from mm.h to memblock.h as page_alloc.c would no longer host an alternative implementation. This change is ultimately one to one mapping and shouldn't cause any observable difference; however, after the recent changes, there are some functions which now would fit memblock.c better than page_alloc.c and dependency on HAVE_MEMBLOCK_NODE_MAP instead of HAVE_MEMBLOCK doesn't make much sense on some of them. Further cleanups for functions inside HAVE_MEMBLOCK_NODE_MAP in mm.h would be nice. -v2: Fix compile bug introduced by mis-spelling CONFIG_HAVE_MEMBLOCK_NODE_MAP to CONFIG_MEMBLOCK_HAVE_NODE_MAP in mmzone.h. Reported by Stephen Rothwell. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com>	2011-12-08 10:22:09 -08:00
Tejun Heo	1d7cfe18ec	powerpc: Use HAVE_MEMBLOCK_NODE_MAP powerpc doesn't access early_node_map[] directly and enabling HAVE_MEMBLOCK_NODE_MAP is trivial - replacing add_active_range() calls with memblock_set_node() and selecting HAVE_MEMBLOCK_NODE_MAP is enough. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org>	2011-12-08 10:22:08 -08:00
Tejun Heo	1aadc0560f	memblock: s/memblock_analyze()/memblock_allow_resize()/ and update users The only function of memblock_analyze() is now allowing resize of memblock region arrays. Rename it to memblock_allow_resize() and update its users. * The following users remain the same other than renaming. arm/mm/init.c::arm_memblock_init() microblaze/kernel/prom.c::early_init_devtree() powerpc/kernel/prom.c::early_init_devtree() openrisc/kernel/prom.c::early_init_devtree() sh/mm/init.c::paging_init() sparc/mm/init_64.c::paging_init() unicore32/mm/init.c::uc32_memblock_init() * In the following users, analyze was used to update total size which is no longer necessary. powerpc/kernel/machine_kexec.c::reserve_crashkernel() powerpc/kernel/prom.c::early_init_devtree() powerpc/mm/init_32.c::MMU_init() powerpc/mm/tlb_nohash.c::__early_init_mmu() powerpc/platforms/ps3/mm.c::ps3_mm_add_memory() powerpc/platforms/embedded6xx/wii.c::wii_memory_fixups() sh/kernel/machine_kexec.c::reserve_crashkernel() * x86/kernel/e820.c::memblock_x86_fill() was directly setting memblock_can_resize before populating memblock and calling analyze afterwards. Call memblock_allow_resize() before start populating. memblock_can_resize is now static inside memblock.c. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com>	2011-12-08 10:22:08 -08:00
Tejun Heo	6fbef13c4f	powerpc: Cleanup memblock usage * early_init_devtree(): Total memory size is aligned to PAGE_SIZE; however, alignment isn't enforced if memory_limit is explicitly specified. Simplify the logic and always apply PAGE_SIZE alignment. * MMU_init(): memblock regions is truncated by directly modifying memblock.memory.cnt. This is incomplete (reserved array is not truncated) and unnecessarily low level hindering further memblock improvments. Use memblock_enforce_memory_limit() instead. * wii_memory_fixups(): Unnecessarily low level direct manipulation of memblock regions. The same result can be achieved using properly abstracted operations. Reimplement using memblock API. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org>	2011-12-08 10:22:07 -08:00
Tejun Heo	fe091c208a	memblock: Kill memblock_init() memblock_init() initializes arrays for regions and memblock itself; however, all these can be done with struct initializers and memblock_init() can be removed. This patch kills memblock_init() and initializes memblock with struct initializer. The only difference is that the first dummy entries don't have .nid set to MAX_NUMNODES initially. This doesn't cause any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com>	2011-12-08 10:22:07 -08:00
Tejun Heo	1c16d242aa	memblock: Fix include breakages caused by `24aa07882b` `24aa07882b` (memblock, x86: Replace memblock_x86_reserve/free_range() with generic ones) removed arch/x86/include/asm/memblock.h and dropped its inclusion from include/linux/memblock.h which breaks other architectures which depended on the generic memblock.h pulling in the arch specific one. However, the proper fix isn't adding back the asm inclusion. memblock doesn't have any arch dependent part and doesn't need arch specific header file and asm/memblock.h files are either practically empty or contain mostly unrelated arch specific stuff. * In microblaze, sh, powerpc, sparc and openrisc, asm/memblock.h is either empty or just contains unused MEMBLOCK_DBG() macro. Remove them. * In arm and unicore32, asm/memblock.h contains arch specific stuff. Include it directly from its users. It might be a good idea to rename the header file to avoid confusion. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>	2011-12-08 10:22:06 -08:00
Anton Blanchard	7c637b04fb	powerpc: Enable squashfs as a module Most distros use it so we may as well enable it and get regular compile testing. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:22:54 +11:00
Anton Blanchard	120a52c388	powerpc/nvram: Add spinlock to oops_to_nvram to prevent oops in compression code. When issuing a system reset we almost always oops in the oops_to_nvram code because multiple CPUs are using the deflate work area. Add a spinlock to protect it. To play it safe I'm using trylock to avoid locking up if the NVRAM code oopses. This means we might miss multiple CPUs oopsing at exactly the same time but I think it's best to play it safe for now. Once we are happy with the reliability we can change it to a full spinlock. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:22:54 +11:00
Paul Mackerras	2fde6d20bb	powerpc: Provide a way for KVM to indicate that NV GPR values are lost This fixes a problem where a CPU thread coming out of nap mode can think it has valid values in the nonvolatile GPRs (r14 - r31) as saved away in power7_idle, but in fact the values have been trashed because the thread was used for KVM in the mean time. The result is that the thread crashes because code that called power7_idle (e.g., pnv_smp_cpu_kill_self()) goes to use values in registers that have been trashed. The bit field in SRR1 that tells whether state was lost only reflects the most recent nap, which may not have been the nap instruction in power7_idle. So we need an extra PACA field to indicate that state has been lost even if SRR1 indicates that the most recent nap didn't lose state. We clear this field when saving the state in power7_idle, we set it to a non-zero value when we use the thread for KVM, and we test it in power7_wakeup_noloss. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:22:53 +11:00
Paul Mackerras	cba313da5c	powerpc/powernv: Fix problems in onlining CPUs At present, on the powernv platform, if you off-line a CPU that was online, and then try to on-line it again, the kernel generates a warning message "OPAL Error -1 starting CPU n". Furthermore, if the CPU is a secondary thread that was used by KVM while it was off-line, the CPU fails to come online. The first problem is fixed by only calling OPAL to start the CPU the first time it is on-lined, as indicated by the cpu_start field of its PACA being zero. The second problem is fixed by restoring the cpu_start field to 1 instead of 0 when using the CPU within KVM. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:22:53 +11:00
Anton Blanchard	3339264042	powerpc/pseries: Increase minimum RMO size from 64MB to 256MB The minimum RMO size field in ibm,client-architecture is currently ignored, but a future firmware version will rectify that. Since we always get at least 128MB of RMO right now, asking for 64MB is likely to result in boot failures. We should bump it to at least 128MB, but considering all the boot issues we have on 128MB RMO boxes and all new machines have virtual RMO, we may as well set our minimum to 256MB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:22:53 +11:00
sukadev@linux.vnet.ibm.com	8a3e3d31d1	powerpc: Punch a hole in /dev/mem for librtas With CONFIG_STRICT_DEVMEM=y, user space cannot read any part of /dev/mem. Since this breaks librtas, punch a hole in /dev/mem to allow access to the rmo_buffer that librtas needs. Anton Blanchard reported the problem and helped with the fix. A quick test for this patch: # cat /proc/rtas/rmo_buffer 000000000f190000 10000 # python -c "print 0x000000000f190000 / 0x10000" 3865 # dd if=/dev/mem of=/tmp/foo count=1 bs=64k skip=3865 1+0 records in 1+0 records out 65536 bytes (66 kB) copied, 0.000205235 s, 319 MB/s # dd if=/dev/mem of=/tmp/foo dd: reading `/dev/mem': Operation not permitted 0+0 records in 0+0 records out 0 bytes (0 B) copied, 0.00022519 s, 0.0 kB/s Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:22:52 +11:00
Benjamin Herrenschmidt	11eab297f5	powerpc: Add support for OpenBlockS 600 So I've had one of these for a while and it looks like the vendor never bothered submitting the support upstream. This adds it using ppc40x_simple and provides a device-tree. There are some changes to the boot wrapper because the way u-boot works on this thing, it seems to expect a multipart image with the kernel, initrd and dtb in it. The USB support is missing as it needs the yet unmerged driver for the DWC OTG part and the GPIOs may need further definition in the dts. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:22:52 +11:00
Geoff Levand	987706acf8	powerpc/ps3: Update ps3_defconfig Refresh ps3_defconfig to latest kernel sources and change the options: CONFIG_PPP=m to CONFIG_PPP=n. CONFIG_NAMESPACES=y to CONFIG_NAMESPACES=n CONFIG_NUMA=y to CONFIG_NUMA=n Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2011-12-08 14:05:56 +11:00

... 3 4 5 6 7 ...

9407 Commits