mirror of https://gitee.com/openkylin/linux.git
16964 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Linus Torvalds | 9199c4caa1 |
PCI updates for v3.13:
PCI device hotplug - Move device_del() from pci_stop_dev() to pci_destroy_dev() (Rafael J. Wysocki) Host bridge drivers - Update maintainers for DesignWare, i.MX6, Armada, R-Car (Bjorn Helgaas) - mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin (Jason Gunthorpe) Miscellaneous - Avoid unnecessary CPU switch when calling .probe() (Alexander Duyck) - Revert "workqueue: allow work_on_cpu() to be called recursively" (Bjorn Helgaas) - Disable Bus Master only on kexec reboot (Khalid Aziz) - Omit PCI ID macro strings to shorten quirk names for LTO (Michal Marek) MAINTAINERS | 33 +++++++++++++++++++++++++++++++++ drivers/pci/host/pci-mvebu.c | 5 +++++ drivers/pci/pci-driver.c | 38 ++++++++++++++++++++++++++++++-------- drivers/pci/remove.c | 4 +++- include/linux/kexec.h | 3 +++ include/linux/pci.h | 30 +++++++++++++++--------------- kernel/kexec.c | 4 ++++ kernel/workqueue.c | 32 ++++++++++---------------------- 8 files changed, 103 insertions(+), 46 deletions(-) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJSrLDLAAoJEFmIoMA60/r8JO4P/iakGDezwXcd9YTbdaq/NmEK 31JtBao9rJNSUXpFGaFKGm99A479QNb+Fo1o5NaysyGbcAA2W0HeCkaee13hpw2D l3wvzJRoEwqVySOzMwlGsxxhYuvGiZ2WVALNkwBzwyCiYtypNnUtGNCSp/3J6XKp 2ZUjoagyixdS4oYoR4irZucPBWwzW89Gx4oJ0rBttNVsXjiT1D2OzYDYTuxMyb2E ZjXJqTUYzfwiFxqKPrGOabUPD9GX3EWaFmu01FLlsSznPZkk9yJR3/eq6I6utp/E WSpP+7v3jnPHjZVky1FdHaP5wqurtNRsiWBQVyNYF+M5WofKA1hGA+niJDEB/pVL vE74fJfZzpET1jlZR+7Z5qHPxW2A8Q2ARn74A42DYLQGmJEh7VdARcayV1E4diRH DzEnhf33b9bwIC1Q0cESWZ5RH4r2Xbjg0a1qoVQYi5VEJEMWXID3Ofk64Bd9QOz4 oLZ27V7clurD+gNarch/zgpd9LVIHLFriR2YWPpA0Iwy9EjueH8GsiOS3NqDn2SQ sQg5utE4vdnix4VrCvvbufAr3kJngndtuj/s7I3lZqi7nCyS+jeFFvMUd9h3MUqZ rs1IN1/qeTBBNi2dtmcKDN2ItahCoBhTI37JRaijZwe+B5m6mTe049it+E0SZPI2 IqLOYWL4ggT0se/cMXld =Edvk -----END PGP SIGNATURE----- Merge tag 'pci-v3.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI device hotplug - Move device_del() from pci_stop_dev() to pci_destroy_dev() (Rafael Wysocki) Host bridge drivers - Update maintainers for DesignWare, i.MX6, Armada, R-Car (Bjorn Helgaas) - mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin (Jason Gunthorpe) Miscellaneous - Avoid unnecessary CPU switch when calling .probe() (Alexander Duyck) - Revert "workqueue: allow work_on_cpu() to be called recursively" (Bjorn Helgaas) - Disable Bus Master only on kexec reboot (Khalid Aziz) - Omit PCI ID macro strings to shorten quirk names for LTO (Michal Marek)" * tag 'pci-v3.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers PCI: Disable Bus Master only on kexec reboot PCI: mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin PCI: Omit PCI ID macro strings to shorten quirk names PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev() Revert "workqueue: allow work_on_cpu() to be called recursively" PCI: Avoid unnecessary CPU switch when calling driver .probe() method |
|
Linus Torvalds | 5dec682c7f |
Keyrings fixes 2013-12-10
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIVAwUAUqdgihOxKuMESys7AQIishAAjGG3LnEp12fd7oay5//4SEg31ybPqGj7 Xotk9mblBW7cWPbLqk7xyIhxpIj2zM14XatEV2TfBNZV0Lcrzr1M5s87ZRUX0ui+ FHbtfn/M3Or8AvX7W1HZgG2Se7M0Ba6Y2RRXNsEdgqk6KMQuHhPEZ2FZ5vCx6BAD vXzxWIKVNeqMXR7z6xkyqmpztnVQs+ZgzE9+c96QKyhBdtBy4spDmfgJS90m0pjN HhgONpdfgosknj8yu43rWIQvd3UUO5BVntCeic94Fbgh77ZAEi0dD7ifLz/ebcja pfJfPXxYzkCfIrSdAzQF8iYnQ+5rRomvGsMcvtq6mBooah/YmEkmKpKJDTp47wyN IEUeJaxM2Qp2jcUSjEd7lY9o1AK4sj+90cKeVRUd5kzZP1iolkQHR+dGiAwqbn6w MJBn9fCamvhpsZDhl2G/DICprFDvErd9zAlNubivggJmITnPXNWx+K1RDL4fO4qc zLHTGHkxYyACM/7oJfwbH/NyZ1yu53OlE3R2h6TA6ISc7nAed7qzGAZyrSV92Quc pItGaQ/zNa0sbe+nufCx9FOWu8sA3x7qnazLxhVtlPde9nlxcpGo5cagqcyc4hyp /IGK2JoGkgHvehECY8miJlsu7UqmThIhmy6o4T4X6ErX1M9ifR92WGeXkAi/2mh/ WciVpPvTrqM= =/AdA -----END PGP SIGNATURE----- Merge tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull misc keyrings fixes from David Howells: "These break down into five sets: - A patch to error handling in the big_key type for huge payloads. If the payload is larger than the "low limit" and the backing store allocation fails, then big_key_instantiate() doesn't clear the payload pointers in the key, assuming them to have been previously cleared - but only one of them is. Unfortunately, the garbage collector still calls big_key_destroy() when sees one of the pointers with a weird value in it (and not NULL) which it then tries to clean up. - Three patches to fix the keyring type: * A patch to fix the hash function to correctly divide keyrings off from keys in the topology of the tree inside the associative array. This is only a problem if searching through nested keyrings - and only if the hash function incorrectly puts the a keyring outside of the 0 branch of the root node. * A patch to fix keyrings' use of the associative array. The __key_link_begin() function initially passes a NULL key pointer to assoc_array_insert() on the basis that it's holding a place in the tree whilst it does more allocation and stuff. This is only a problem when a node contains 16 keys that match at that level and we want to add an also matching 17th. This should easily be manufactured with a keyring full of keyrings (without chucking any other sort of key into the mix) - except for (a) above which makes it on average adding the 65th keyring. * A patch to fix searching down through nested keyrings, where any keyring in the set has more than 16 keyrings and none of the first keyrings we look through has a match (before the tree iteration needs to step to a more distal node). Test in keyutils test suite: http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=8b4ae963ed92523aea18dfbb8cab3f4979e13bd1 - A patch to fix the big_key type's use of a shmem file as its backing store causing audit messages and LSM check failures. This is done by setting S_PRIVATE on the file to avoid LSM checks on the file (access to the shmem file goes through the keyctl() interface and so is gated by the LSM that way). This isn't normally a problem if a key is used by the context that generated it - and it's currently only used by libkrb5. Test in keyutils test suite: http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=d9a53cbab42c293962f2f78f7190253fc73bd32e - A patch to add a generated file to .gitignore. - A patch to fix the alignment of the system certificate data such that it it works on s390. As I understand it, on the S390 arch, symbols must be 2-byte aligned because loading the address discards the least-significant bit" * tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: KEYS: correct alignment of system_certificate_list content in assembly file Ignore generated file kernel/x509_certificate_list security: shmem: implement kernel private shmem inodes KEYS: Fix searching of nested keyrings KEYS: Fix multiple key add into associative array KEYS: Fix the keyring hash function KEYS: Pre-clear struct key on allocation |
|
Linus Torvalds | 5cdec2d833 |
futex: move user address verification up to common code
When debugging the read-only hugepage case, I was confused by the fact that get_futex_key() did an access_ok() only for the non-shared futex case, since the user address checking really isn't in any way specific to the private key handling. Now, it turns out that the shared key handling does effectively do the equivalent checks inside get_user_pages_fast() (it doesn't actually check the address range on x86, but does check the page protections for being a user page). So it wasn't actually a bug, but the fact that we treat the address differently for private and shared futexes threw me for a loop. Just move the check up, so that it gets done for both cases. Also, use the 'rw' parameter for the type, even if it doesn't actually matter any more (it's a historical artifact of the old racy i386 "page faults from kernel space don't check write protections"). Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Linus Torvalds | f12d5bfceb |
futex: fix handling of read-only-mapped hugepages
The hugepage code had the exact same bug that regular pages had in commit |
|
Hendrik Brueckner | 62226983da |
KEYS: correct alignment of system_certificate_list content in assembly file
Apart from data-type specific alignment constraints, there are also architecture-specific alignment requirements. For example, on s390 symbols must be on even addresses implying a 2-byte alignment. If the system_certificate_list_end symbol is on an odd address and if this address is loaded, the least-significant bit is ignored. As a result, the load_system_certificate_list() fails to load the certificates because of a wrong certificate length calculation. To be safe, align system_certificate_list on an 8-byte boundary. Also improve the length calculation of the system_certificate_list content. Introduce a system_certificate_list_size (8-byte aligned because of unsigned long) variable that stores the length. Let the linker calculate this size by introducing a start and end label for the certificate content. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> |
|
Rusty Russell | 7cfe5b3310 |
Ignore generated file kernel/x509_certificate_list
$ git status # On branch pending-rebases # Untracked files: # (use "git add <file>..." to include in what will be committed) # # kernel/x509_certificate_list nothing added to commit but untracked files present (use "git add" to track) $ Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> |
|
Khalid Aziz | 4fc9bbf98f |
PCI: Disable Bus Master only on kexec reboot
Add a flag to tell the PCI subsystem that kernel is shutting down in
preparation to kexec a kernel. Add code in PCI subsystem to use this flag
to clear Bus Master bit on PCI devices only in case of kexec reboot.
This fixes a power-off problem on Acer Aspire V5-573G and likely other
machines and avoids any other issues caused by clearing Bus Master bit on
PCI devices in normal shutdown path. The problem was introduced by
|
|
Linus Torvalds | 843f4f4bb1 |
A regression showed up that there's a large delay when enabling
all events. This was prevalent when FTRACE_SELFTEST was enabled which enables all events several times, and caused the system bootup to pause for over a minute. This was tracked down to an addition of a synchronize_sched() performed when system call tracepoints are unregistered. The synchronize_sched() is needed between the unregistering of the system call tracepoint and a deletion of a tracing instance buffer. But placing the synchronize_sched() in the unreg of *every* system call tracepoint is a bit overboard. A single synchronize_sched() before the deletion of the instance is sufficient. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEcBAABAgAGBQJSofYNAAoJEKQekfcNnQGuTUcIAJOx8745KlkFjN4VX+nCWNfP xrrpnLBymbGMA9lQ4fXk+kdiuhH8DjRYdKq9fU4T481MFYKkToUIZH6NeaLI5fr1 0nBPPjVyAlJ+yt9JbOAYa1jEYnAr27ORDHEtdQnqb6OJSky3oh9jCQi+toxmh2qX Sv1tIYeAf3K2V/h5xt6uSl9oiZ6KBtwE3f+xkHWNizaU9i2rq2gxd77fSbPNTIps wLdsESYziA2UeAm13eh8xXo1uqRbfvx7bPr59cu0+3AqdOoaXFG+JoE/MqmljIO5 HkyCKJVqP8HD+QieuEB+hw4zpSsl6A6iKSlaiHm8NMnRRocfM6qMKMQXQ28KhxA= =sYm/ -----END PGP SIGNATURE----- Merge tag 'trace-fixes-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "A regression showed up that there's a large delay when enabling all events. This was prevalent when FTRACE_SELFTEST was enabled which enables all events several times, and caused the system bootup to pause for over a minute. This was tracked down to an addition of a synchronize_sched() performed when system call tracepoints are unregistered. The synchronize_sched() is needed between the unregistering of the system call tracepoint and a deletion of a tracing instance buffer. But placing the synchronize_sched() in the unreg of *every* system call tracepoint is a bit overboard. A single synchronize_sched() before the deletion of the instance is sufficient" * tag 'trace-fixes-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Only run synchronize_sched() at instance deletion time |
|
Steven Rostedt | 3ccb012392 |
tracing: Only run synchronize_sched() at instance deletion time
It has been reported that boot up with FTRACE_SELFTEST enabled can take a
very long time. There can be stalls of over a minute.
This was tracked down to the synchronize_sched() called when a system call
event is disabled. As the self tests enable and disable thousands of events,
this makes the synchronize_sched() get called thousands of times.
The synchornize_sched() was added with
|
|
Linus Torvalds | 1ab231b274 |
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner: - timekeeping: Cure a subtle drift issue on GENERIC_TIME_VSYSCALL_OLD - nohz: Make CONFIG_NO_HZ=n and nohz=off command line option behave the same way. Fixes a long standing load accounting wreckage. - clocksource/ARM: Kconfig update to avoid ARM=n wreckage - clocksource/ARM: Fixlets for the AT91 and SH clocksource/clockevents - Trivial documentation update and kzalloc conversion from akpms pile * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: nohz: Fix another inconsistency between CONFIG_NO_HZ=n and nohz=off time: Fix 1ns/tick drift w/ GENERIC_TIME_VSYSCALL_OLD clocksource: arm_arch_timer: Hide eventstream Kconfig on non-ARM clocksource: sh_tmu: Add clk_prepare/unprepare support clocksource: sh_tmu: Release clock when sh_tmu_register() fails clocksource: sh_mtu2: Add clk_prepare/unprepare support clocksource: sh_mtu2: Release clock when sh_mtu2_register() fails ARM: at91: rm9200: switch back to clockevents_config_and_register tick: Document tick_do_timer_cpu timer: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) NOHZ: Check for nohz active instead of nohz enabled |
|
Linus Torvalds | a45299e727 |
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner: - Correction of fuzzy and fragile IRQ_RETVAL macro - IRQ related resume fix affecting only XEN - ARM/GIC fix for chained GIC controllers * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Gic: fix boot for chained gics irq: Enable all irqs unconditionally in irq_resume genirq: Correct fuzzy and fragile IRQ_RETVAL() definition |
|
Linus Torvalds | a0b57ca33e |
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar: "Various smaller fixlets, all over the place" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/doc: Fix generation of device-drivers sched: Expose preempt_schedule_irq() sched: Fix a trivial typo in comments sched: Remove unused variable in 'struct sched_domain' sched: Avoid NULL dereference on sd_busy sched: Check sched_domain before computing group power MAINTAINERS: Update file patterns in the lockdep and scheduler entries |
|
Linus Torvalds | e321ae4c20 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar: "Misc kernel and tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools lib traceevent: Fix conversion of pointer to integer of different size perf/trace: Properly use u64 to hold event_id perf: Remove fragile swevent hlist optimization ftrace, perf: Avoid infinite event generation loop tools lib traceevent: Fix use of multiple options in processing field perf header: Fix possible memory leaks in process_group_desc() perf header: Fix bogus group name perf tools: Tag thread comm as overriden |
|
Linus Torvalds | 7224b31bd5 |
Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo: "This contains one important fix. The NUMA support added a while back broke ordering guarantees on ordered workqueues. It was enforced by having single frontend interface with @max_active == 1 but the NUMA support puts multiple interfaces on unbound workqueues on NUMA machines thus breaking the ordered guarantee. This is fixed by disabling NUMA support on ordered workqueues. The above and a couple other patches were sitting in for-3.12-fixes but I forgot to push that out, so they ended up waiting a bit too long. My aplogies. Other fixes are minor" * 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues workqueue: fix comment typo for __queue_work() workqueue: fix ordered workqueues in NUMA setups workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY |
|
Linus Torvalds | 2855987d13 |
Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo: "Fixes for three issues. - cgroup destruction path could swamp system_wq possibly leading to deadlock. This actually seems to happen in the wild with memcg because memcg destruction path adds nested dependency on system_wq. Resolved by isolating cgroup destruction work items on its dedicated workqueue. - Possible locking context deadlock through seqcount reported by lockdep - Memory leak under certain conditions" * 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix cgroup_subsys_state leak for seq_files cpuset: Fix memory allocator deadlock cgroup: use a dedicated workqueue for cgroup destruction |
|
Thomas Gleixner | 0e576acbc1 |
nohz: Fix another inconsistency between CONFIG_NO_HZ=n and nohz=off
If CONFIG_NO_HZ=n tick_nohz_get_sleep_length() returns NSEC_PER_SEC/HZ. If CONFIG_NO_HZ=y and the nohz functionality is disabled via the command line option "nohz=off" or not enabled due to missing hardware support, then tick_nohz_get_sleep_length() returns 0. That happens because ts->sleep_length is never set in that case. Set it to NSEC_PER_SEC/HZ when the NOHZ mode is inactive. Reported-by: Michal Hocko <mhocko@suse.cz> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
Helge Deller | 5ecbe3c3c6 |
kernel/extable: fix address-checks for core_kernel and init areas
The init_kernel_text() and core_kernel_text() functions should not include the labels _einittext and _etext when checking if an address is inside the .text or .init sections. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Tejun Heo | e605b36575 |
cgroup: fix cgroup_subsys_state leak for seq_files
If a cgroup file implements either read_map() or read_seq_string(),
such file is served using seq_file by overriding file->f_op to
cgroup_seqfile_operations, which also overrides the release method to
single_release() from cgroup_file_release().
Because cgroup_file_open() didn't use to acquire any resources, this
used to be fine, but since
|
|
Peter Zijlstra | 0fc0287c9e |
cpuset: Fix memory allocator deadlock
Juri hit the below lockdep report: [ 4.303391] ====================================================== [ 4.303392] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] [ 4.303394] 3.12.0-dl-peterz+ #144 Not tainted [ 4.303395] ------------------------------------------------------ [ 4.303397] kworker/u4:3/689 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 4.303399] (&p->mems_allowed_seq){+.+...}, at: [<ffffffff8114e63c>] new_slab+0x6c/0x290 [ 4.303417] [ 4.303417] and this task is already holding: [ 4.303418] (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffff812d2dfb>] blk_execute_rq_nowait+0x5b/0x100 [ 4.303431] which would create a new lock dependency: [ 4.303432] (&(&q->__queue_lock)->rlock){..-...} -> (&p->mems_allowed_seq){+.+...} [ 4.303436] [ 4.303898] the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock: [ 4.303918] -> (&p->mems_allowed_seq){+.+...} ops: 2762 { [ 4.303922] HARDIRQ-ON-W at: [ 4.303923] [<ffffffff8108ab9a>] __lock_acquire+0x65a/0x1ff0 [ 4.303926] [<ffffffff8108cbe3>] lock_acquire+0x93/0x140 [ 4.303929] [<ffffffff81063dd6>] kthreadd+0x86/0x180 [ 4.303931] [<ffffffff816ded6c>] ret_from_fork+0x7c/0xb0 [ 4.303933] SOFTIRQ-ON-W at: [ 4.303933] [<ffffffff8108abcc>] __lock_acquire+0x68c/0x1ff0 [ 4.303935] [<ffffffff8108cbe3>] lock_acquire+0x93/0x140 [ 4.303940] [<ffffffff81063dd6>] kthreadd+0x86/0x180 [ 4.303955] [<ffffffff816ded6c>] ret_from_fork+0x7c/0xb0 [ 4.303959] INITIAL USE at: [ 4.303960] [<ffffffff8108a884>] __lock_acquire+0x344/0x1ff0 [ 4.303963] [<ffffffff8108cbe3>] lock_acquire+0x93/0x140 [ 4.303966] [<ffffffff81063dd6>] kthreadd+0x86/0x180 [ 4.303969] [<ffffffff816ded6c>] ret_from_fork+0x7c/0xb0 [ 4.303972] } Which reports that we take mems_allowed_seq with interrupts enabled. A little digging found that this can only be from cpuset_change_task_nodemask(). This is an actual deadlock because an interrupt doing an allocation will hit get_mems_allowed()->...->__read_seqcount_begin(), which will spin forever waiting for the write side to complete. Cc: John Stultz <john.stultz@linaro.org> Cc: Mel Gorman <mgorman@suse.de> Reported-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Juri Lelli <juri.lelli@gmail.com> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org |
|
Thomas Gleixner | 32e475d76a |
sched: Expose preempt_schedule_irq()
Tony reported that |
|
Linus Torvalds | 8ae516aa8b |
This includes two fixes.
1) is a bug fix that happens when root does the following: echo function_graph > current_tracer modprobe foo echo nop > current_tracer This causes the ftrace internal accounting to get screwed up and crashes ftrace, preventing the user from using the function tracer after that. 2) if a TRACE_EVENT has a string field, and NULL is given for it. The internal trace event code does a strlen() and strcpy() on the source of field. If it is NULL it causes the system to oops. This bug has been there since 2.6.31, but no TRACE_EVENT ever passed in a NULL to the string field, until now. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSlUw+AAoJEKQekfcNnQGugTYIAJQ7Zfhor2Jrw7XzkcBDpQv9 kqL/NvjLfyA49BLwba0VJCqJA56dEfW7kaSa7Wx0qAHdHKATLDA8G4c9FdHRAmZf WJ4jDbrcJqc7DA2vEn4aUuczwvTMx0H1KJHPMAu9taEno3YocIzCMxkuNYelwAz2 XUkUGtR7olF85pyVccfZLKnKPtslSwxWoG6WgEqiAap6fIorPlcSXBVYFqLKVTRJ P2e847eqxMF5ACLmv3dWiEvTPtWY91abN1zpeJYQNjBtQJmzVlvRcXYE6TwPUIFg RtB9n3SrT+0lEWvcxDbQi+4hKHf+JQLkGaYwWCMJihbdF4sh36olzpUOimKVqsk= =+9+v -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "This includes two fixes. 1) is a bug fix that happens when root does the following: echo function_graph > current_tracer modprobe foo echo nop > current_tracer This causes the ftrace internal accounting to get screwed up and crashes ftrace, preventing the user from using the function tracer after that. 2) if a TRACE_EVENT has a string field, and NULL is given for it. The internal trace event code does a strlen() and strcpy() on the source of field. If it is NULL it causes the system to oops. This bug has been there since 2.6.31, but no TRACE_EVENT ever passed in a NULL to the string field, until now" * tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Fix function graph with loading of modules tracing: Allow events to have NULL strings |
|
Steven Rostedt (Red Hat) | 8a56d7761d |
ftrace: Fix function graph with loading of modules
Commit |
|
Bjorn Helgaas | 12997d1a99 |
Revert "workqueue: allow work_on_cpu() to be called recursively"
This reverts commit |
|
Laxman Dewangan | ac01810c9d |
irq: Enable all irqs unconditionally in irq_resume
When the system enters suspend, it disables all interrupts in suspend_device_irqs(), including the interrupts marked EARLY_RESUME. On the resume side things are different. The EARLY_RESUME interrupts are reenabled in sys_core_ops->resume and the non EARLY_RESUME interrupts are reenabled in the normal system resume path. When suspend_noirq() failed or suspend is aborted for any other reason, we might omit the resume side call to sys_core_ops->resume() and therefor the interrupts marked EARLY_RESUME are not reenabled and stay disabled forever. To solve this, enable all irqs unconditionally in irq_resume() regardless whether interrupts marked EARLY_RESUMEhave been already enabled or not. This might try to reenable already enabled interrupts in the non failure case, but the only affected platform is XEN and it has been confirmed that it does not cause any side effects. [ tglx: Massaged changelog. ] Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Acked-by-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Pavel Machek <pavel@ucw.cz> Cc: <ian.campbell@citrix.com> Cc: <rjw@rjwysocki.net> Cc: <len.brown@intel.com> Cc: <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1385388587-16442-1-git-send-email-ldewangan@nvidia.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
Linus Torvalds | 26b265cd29 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu: - Made x86 ablk_helper generic for ARM - Phase out chainiv in favour of eseqiv (affects IPsec) - Fixed aes-cbc IV corruption on s390 - Added constant-time crypto_memneq which replaces memcmp - Fixed aes-ctr in omap-aes - Added OMAP3 ROM RNG support - Add PRNG support for MSM SoC's - Add and use Job Ring API in caam - Misc fixes [ NOTE! This pull request was sent within the merge window, but Herbert has some questionable email sending setup that makes him public enemy #1 as far as gmail is concerned. So most of his emails seem to be trapped by gmail as spam, resulting in me not seeing them. - Linus ] * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (49 commits) crypto: s390 - Fix aes-cbc IV corruption crypto: omap-aes - Fix CTR mode counter length crypto: omap-sham - Add missing modalias padata: make the sequence counter an atomic_t crypto: caam - Modify the interface layers to use JR API's crypto: caam - Add API's to allocate/free Job Rings crypto: caam - Add Platform driver for Job Ring hwrng: msm - Add PRNG support for MSM SoC's ARM: DT: msm: Add Qualcomm's PRNG driver binding document crypto: skcipher - Use eseqiv even on UP machines crypto: talitos - Simplify key parsing crypto: picoxcell - Simplify and harden key parsing crypto: ixp4xx - Simplify and harden key parsing crypto: authencesn - Simplify key parsing crypto: authenc - Export key parsing helper function crypto: mv_cesa: remove deprecated IRQF_DISABLED hwrng: OMAP3 ROM Random Number Generator support crypto: sha256_ssse3 - also test for BMI2 crypto: mv_cesa - Remove redundant of_match_ptr crypto: sahara - Remove redundant of_match_ptr ... |
|
Li Bin | 4e8b22bd1a |
workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues
When one work starts execution, the high bits of work's data contain pool ID. It can represent a maximum of WORK_OFFQ_POOL_NONE. Pool ID is assigned WORK_OFFQ_POOL_NONE when the work being initialized indicating that no pool is associated and get_work_pool() uses it to check the associated pool. So if worker_pool_assign_id() assigns a ID greater than or equal WORK_OFFQ_POOL_NONE to a pool, it triggers leakage, and it may break the non-reentrance guarantee. This patch fix this issue by modifying the worker_pool_assign_id() function calling idr_alloc() by setting @end param WORK_OFFQ_POOL_NONE. Furthermore, in the current implementation, the BUILD_BUG_ON() in init_workqueues makes no sense. The number of worker pools needed cannot be determined at compile time, because the number of backing pools for UNBOUND workqueues is dynamic based on the assigned custom attributes. So remove it. tj: Minor comment and indentation updates. Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
|
Li Bin | 9ef28a73ff |
workqueue: fix comment typo for __queue_work()
It seems the "dying" should be "draining" here. Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
|
Tejun Heo | 8a2b753844 |
workqueue: fix ordered workqueues in NUMA setups
An ordered workqueue implements execution ordering by using single
pool_workqueue with max_active == 1. On a given pool_workqueue, work
items are processed in FIFO order and limiting max_active to 1
enforces the queued work items to be processed one by one.
Unfortunately,
|
|
Oleg Nesterov | 9115122806 |
workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY
Move the setting of PF_NO_SETAFFINITY up before set_cpus_allowed() in create_worker(). Otherwise userland can change ->cpus_allowed in between. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
|
Tejun Heo | e5fca243ab |
cgroup: use a dedicated workqueue for cgroup destruction
Since |
|
Martin Schwidefsky | 4be77398ac |
time: Fix 1ns/tick drift w/ GENERIC_TIME_VSYSCALL_OLD
Since commit |
|
Linus Torvalds | 78dc53c422 |
Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris: "In this patchset, we finally get an SELinux update, with Paul Moore taking over as maintainer of that code. Also a significant update for the Keys subsystem, as well as maintenance updates to Smack, IMA, TPM, and Apparmor" and since I wanted to know more about the updates to key handling, here's the explanation from David Howells on that: "Okay. There are a number of separate bits. I'll go over the big bits and the odd important other bit, most of the smaller bits are just fixes and cleanups. If you want the small bits accounting for, I can do that too. (1) Keyring capacity expansion. KEYS: Consolidate the concept of an 'index key' for key access KEYS: Introduce a search context structure KEYS: Search for auth-key by name rather than target key ID Add a generic associative array implementation. KEYS: Expand the capacity of a keyring Several of the patches are providing an expansion of the capacity of a keyring. Currently, the maximum size of a keyring payload is one page. Subtract a small header and then divide up into pointers, that only gives you ~500 pointers on an x86_64 box. However, since the NFS idmapper uses a keyring to store ID mapping data, that has proven to be insufficient to the cause. Whatever data structure I use to handle the keyring payload, it can only store pointers to keys, not the keys themselves because several keyrings may point to a single key. This precludes inserting, say, and rb_node struct into the key struct for this purpose. I could make an rbtree of records such that each record has an rb_node and a key pointer, but that would use four words of space per key stored in the keyring. It would, however, be able to use much existing code. I selected instead a non-rebalancing radix-tree type approach as that could have a better space-used/key-pointer ratio. I could have used the radix tree implementation that we already have and insert keys into it by their serial numbers, but that means any sort of search must iterate over the whole radix tree. Further, its nodes are a bit on the capacious side for what I want - especially given that key serial numbers are randomly allocated, thus leaving a lot of empty space in the tree. So what I have is an associative array that internally is a radix-tree with 16 pointers per node where the index key is constructed from the key type pointer and the key description. This means that an exact lookup by type+description is very fast as this tells us how to navigate directly to the target key. I made the data structure general in lib/assoc_array.c as far as it is concerned, its index key is just a sequence of bits that leads to a pointer. It's possible that someone else will be able to make use of it also. FS-Cache might, for example. (2) Mark keys as 'trusted' and keyrings as 'trusted only'. KEYS: verify a certificate is signed by a 'trusted' key KEYS: Make the system 'trusted' keyring viewable by userspace KEYS: Add a 'trusted' flag and a 'trusted only' flag KEYS: Separate the kernel signature checking keyring from module signing These patches allow keys carrying asymmetric public keys to be marked as being 'trusted' and allow keyrings to be marked as only permitting the addition or linkage of trusted keys. Keys loaded from hardware during kernel boot or compiled into the kernel during build are marked as being trusted automatically. New keys can be loaded at runtime with add_key(). They are checked against the system keyring contents and if their signatures can be validated with keys that are already marked trusted, then they are marked trusted also and can thus be added into the master keyring. Patches from Mimi Zohar make this usable with the IMA keyrings also. (3) Remove the date checks on the key used to validate a module signature. X.509: Remove certificate date checks It's not reasonable to reject a signature just because the key that it was generated with is no longer valid datewise - especially if the kernel hasn't yet managed to set the system clock when the first module is loaded - so just remove those checks. (4) Make it simpler to deal with additional X.509 being loaded into the kernel. KEYS: Load *.x509 files into kernel keyring KEYS: Have make canonicalise the paths of the X.509 certs better to deduplicate The builder of the kernel now just places files with the extension ".x509" into the kernel source or build trees and they're concatenated by the kernel build and stuffed into the appropriate section. (5) Add support for userspace kerberos to use keyrings. KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches KEYS: Implement a big key type that can save to tmpfs Fedora went to, by default, storing kerberos tickets and tokens in tmpfs. We looked at storing it in keyrings instead as that confers certain advantages such as tickets being automatically deleted after a certain amount of time and the ability for the kernel to get at these tokens more easily. To make this work, two things were needed: (a) A way for the tickets to persist beyond the lifetime of all a user's sessions so that cron-driven processes can still use them. The problem is that a user's session keyrings are deleted when the session that spawned them logs out and the user's user keyring is deleted when the UID is deleted (typically when the last log out happens), so neither of these places is suitable. I've added a system keyring into which a 'persistent' keyring is created for each UID on request. Each time a user requests their persistent keyring, the expiry time on it is set anew. If the user doesn't ask for it for, say, three days, the keyring is automatically expired and garbage collected using the existing gc. All the kerberos tokens it held are then also gc'd. (b) A key type that can hold really big tickets (up to 1MB in size). The problem is that Active Directory can return huge tickets with lots of auxiliary data attached. We don't, however, want to eat up huge tracts of unswappable kernel space for this, so if the ticket is greater than a certain size, we create a swappable shmem file and dump the contents in there and just live with the fact we then have an inode and a dentry overhead. If the ticket is smaller than that, we slap it in a kmalloc()'d buffer" * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (121 commits) KEYS: Fix keyring content gc scanner KEYS: Fix error handling in big_key instantiation KEYS: Fix UID check in keyctl_get_persistent() KEYS: The RSA public key algorithm needs to select MPILIB ima: define '_ima' as a builtin 'trusted' keyring ima: extend the measurement list to include the file signature kernel/system_certificate.S: use real contents instead of macro GLOBAL() KEYS: fix error return code in big_key_instantiate() KEYS: Fix keyring quota misaccounting on key replacement and unlink KEYS: Fix a race between negating a key and reading the error set KEYS: Make BIG_KEYS boolean apparmor: remove the "task" arg from may_change_ptraced_domain() apparmor: remove parent task info from audit logging apparmor: remove tsk field from the apparmor_audit_struct apparmor: fix capability to not use the current task, during reporting Smack: Ptrace access check mode ima: provide hash algo info in the xattr ima: enable support for larger default filedata hash algorithms ima: define kernel parameter 'ima_template=' to change configured default ima: add Kconfig default measurement list template ... |
|
Linus Torvalds | 3eaded86ac |
Merge git://git.infradead.org/users/eparis/audit
Pull audit updates from Eric Paris: "Nothing amazing. Formatting, small bug fixes, couple of fixes where we didn't get records due to some old VFS changes, and a change to how we collect execve info..." Fixed conflict in fs/exec.c as per Eric and linux-next. * git://git.infradead.org/users/eparis/audit: (28 commits) audit: fix type of sessionid in audit_set_loginuid() audit: call audit_bprm() only once to add AUDIT_EXECVE information audit: move audit_aux_data_execve contents into audit_context union audit: remove unused envc member of audit_aux_data_execve audit: Kill the unused struct audit_aux_data_capset audit: do not reject all AUDIT_INODE filter types audit: suppress stock memalloc failure warnings since already managed audit: log the audit_names record type audit: add child record before the create to handle case where create fails audit: use given values in tty_audit enable api audit: use nlmsg_len() to get message payload length audit: use memset instead of trying to initialize field by field audit: fix info leak in AUDIT_GET requests audit: update AUDIT_INODE filter rule to comparator function audit: audit feature to set loginuid immutable audit: audit feature to only allow unsetting the loginuid audit: allow unsetting the loginuid (with priv) audit: remove CONFIG_AUDIT_LOGINUID_IMMUTABLE audit: loginuid functions coding style selinux: apply selinux checks on new audit message types ... |
|
Linus Torvalds | b5898cd057 |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs bits and pieces from Al Viro: "Assorted bits that got missed in the first pull request + fixes for a couple of coredump regressions" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fold try_to_ascend() into the sole remaining caller dcache.c: get rid of pointless macros take read_seqbegin_or_lock() and friends to seqlock.h consolidate simple ->d_delete() instances gfs2: endianness misannotations dump_emit(): use __kernel_write(), not vfs_write() dump_align(): fix the dumb braino |
|
Linus Torvalds | 82023bb7f7 |
More ACPI and power management updates for 3.13-rc1
- ACPI-based device hotplug fixes for issues introduced recently and a fix for an older error code path bug in the ACPI PCI host bridge driver. - Fix for recently broken OMAP cpufreq build from Viresh Kumar. - Fix for a recent hibernation regression related to s2disk. - Fix for a locking-related regression in the ACPI EC driver from Puneet Kumar. - System suspend error code path fix related to runtime PM and runtime PM documentation update from Ulf Hansson. - cpufreq's conservative governor fix from Xiaoguang Chen. - New processor IDs for intel_idle and turbostat and removal of an obsolete Kconfig option from Len Brown. - New device IDs for the ACPI LPSS (Low-Power Subsystem) driver and ACPI-based PCI hotplug (ACPIPHP) cleanup from Mika Westerberg. - Removal of several ACPI video DMI blacklist entries that are not necessary any more from Aaron Lu. - Rework of the ACPI companion representation in struct device and code cleanup related to that change from Rafael J Wysocki, Lan Tianyu and Jarkko Nikula. - Fixes for assigning names to ACPI-enumerated I2C and SPI devices from Jarkko Nikula. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABCAAGBQJSjLYNAAoJEILEb/54YlRxkEQP/1pmFWNwSsxLtTHd+PEs0Xbo QccqvjQrnw/c8GcmK4eZrz6/xyuepmmjy9kfRKj2ENZniy0NEsSFqkTdSO3vYlva 8HKWUj7MV3evhFERXAF6Tu0b4Enx4jOP7VMtmYxJo3qrSnKRUcUzc6DGv/ACsUT1 Nkj0Lhdsg053Z+YzIXrl50w0tCDEMhVmWlMHBtYgr+dMNVnkfPBGkqMblMkKCXT2 w/yHvauZlxQHtI+8bVqTuGgNN0CPzdlpFGiuUF+5mDf6dRX8zlSn56Ia+Wyw1k9X dQp4jYQOgPRo03rNKqQPDiPxUdc7T0RAHRvDB51Ncweuh5PfZGguQe71p6/LKY2W i6zblZ0f/vc13hTiMrP+qzKcwZvgPB5DH7SfnHr61JKV7GNFCdYAqoceS5hYMzR9 d2Fd+txgm763IHWewXfDS/G2cU492R5qr4jpmUIACBQKWDZcqmSRDwRj83t56Ltb jgFBMbg4vZxG7IARhind74xsALxdhsgmFjPmx+0qPWjYxcU8otQZpXbgGNI9iOuW pxIQv5WPQW0tTmwO4HSuVCOwDPLPz5R0jkev7SvSj3Ek3TeD7He4LmnK055CATiC puq+6dp1FISPOPJYk+0DI61qN/CB/qNwRp8LU3ctZwudPVhznIE9FFQ3iN1FdBg2 X8VDcT9t7VvVuxSBjgkj =QMp+ -----END PGP SIGNATURE----- Merge tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI and power management updates from Rafael Wysocki: - ACPI-based device hotplug fixes for issues introduced recently and a fix for an older error code path bug in the ACPI PCI host bridge driver - Fix for recently broken OMAP cpufreq build from Viresh Kumar - Fix for a recent hibernation regression related to s2disk - Fix for a locking-related regression in the ACPI EC driver from Puneet Kumar - System suspend error code path fix related to runtime PM and runtime PM documentation update from Ulf Hansson - cpufreq's conservative governor fix from Xiaoguang Chen - New processor IDs for intel_idle and turbostat and removal of an obsolete Kconfig option from Len Brown - New device IDs for the ACPI LPSS (Low-Power Subsystem) driver and ACPI-based PCI hotplug (ACPIPHP) cleanup from Mika Westerberg - Removal of several ACPI video DMI blacklist entries that are not necessary any more from Aaron Lu - Rework of the ACPI companion representation in struct device and code cleanup related to that change from Rafael J Wysocki, Lan Tianyu and Jarkko Nikula - Fixes for assigning names to ACPI-enumerated I2C and SPI devices from Jarkko Nikula * tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits) PCI / hotplug / ACPI: Drop unused acpiphp_debug declaration ACPI / scan: Set flags.match_driver in acpi_bus_scan_fixed() ACPI / PCI root: Clear driver_data before failing enumeration ACPI / hotplug: Fix PCI host bridge hot removal ACPI / hotplug: Fix acpi_bus_get_device() return value check cpufreq: governor: Remove fossil comment in the cpufreq_governor_dbs() ACPI / video: clean up DMI table for initial black screen problem ACPI / EC: Ensure lock is acquired before accessing ec struct members PM / Hibernate: Do not crash kernel in free_basic_memory_bitmaps() ACPI / AC: Remove struct acpi_device pointer from struct acpi_ac spi: Use stable dev_name for ACPI enumerated SPI slaves i2c: Use stable dev_name for ACPI enumerated I2C slaves ACPI: Provide acpi_dev_name accessor for struct acpi_device device name ACPI / bind: Use (put|get)_device() on ACPI device objects too ACPI: Eliminate the DEVICE_ACPI_HANDLE() macro ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node cpufreq: OMAP: Fix compilation error 'r & ret undeclared' PM / Runtime: Fix error path for prepare PM / Runtime: Update documentation around probe|remove|suspend cpufreq: conservative: set requested_freq to policy max when it is over policy max ... |
|
Linus Torvalds | 1ee2dcc224 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller: "Mostly these are fixes for fallout due to merge window changes, as well as cures for problems that have been with us for a much longer period of time" 1) Johannes Berg noticed two major deficiencies in our genetlink registration. Some genetlink protocols we passing in constant counts for their ops array rather than something like ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were using fixed IDs for their multicast groups. We have to retain these fixed IDs to keep existing userland tools working, but reserve them so that other multicast groups used by other protocols can not possibly conflict. In dealing with these two problems, we actually now use less state management for genetlink operations and multicast groups. 2) When configuring interface hardware timestamping, fix several drivers that simply do not validate that the hwtstamp_config value is one the driver actually supports. From Ben Hutchings. 3) Invalid memory references in mwifiex driver, from Amitkumar Karwar. 4) In dev_forward_skb(), set the skb->protocol in the right order relative to skb_scrub_packet(). From Alexei Starovoitov. 5) Bridge erroneously fails to use the proper wrapper functions to make calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki Makita. 6) When detaching a bridge port, make sure to flush all VLAN IDs to prevent them from leaking, also from Toshiaki Makita. 7) Put in a compromise for TCP Small Queues so that deep queued devices that delay TX reclaim non-trivially don't have such a performance decrease. One particularly problematic area is 802.11 AMPDU in wireless. From Eric Dumazet. 8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts here. Fix from Eric Dumzaet, reported by Dave Jones. 9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn. 10) When computing mergeable buffer sizes, virtio-net fails to take the virtio-net header into account. From Michael Dalton. 11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic bumping, this one has been with us for a while. From Eric Dumazet. 12) Fix NULL deref in the new TIPC fragmentation handling, from Erik Hugne. 13) 6lowpan bit used for traffic classification was wrong, from Jukka Rissanen. 14) macvlan has the same issue as normal vlans did wrt. propagating LRO disabling down to the real device, fix it the same way. From Michal Kubecek. 15) CPSW driver needs to soft reset all slaves during suspend, from Daniel Mack. 16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet. 17) The xen-netfront RX buffer refill timer isn't properly scheduled on partial RX allocation success, from Ma JieYue. 18) When ipv6 ping protocol support was added, the AF_INET6 protocol initialization cleanup path on failure was borked a little. Fix from Vlad Yasevich. 19) If a socket disconnects during a read/recvmsg/recvfrom/etc that blocks we can do the wrong thing with the msg_name we write back to userspace. From Hannes Frederic Sowa. There is another fix in the works from Hannes which will prevent future problems of this nature. 20) Fix route leak in VTI tunnel transmit, from Fan Du. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits) genetlink: make multicast groups const, prevent abuse genetlink: pass family to functions using groups genetlink: add and use genl_set_err() genetlink: remove family pointer from genl_multicast_group genetlink: remove genl_unregister_mc_group() hsr: don't call genl_unregister_mc_group() quota/genetlink: use proper genetlink multicast APIs drop_monitor/genetlink: use proper genetlink multicast APIs genetlink: only pass array to genl_register_family_with_ops() tcp: don't update snd_nxt, when a socket is switched from repair mode atm: idt77252: fix dev refcnt leak xfrm: Release dst if this dst is improper for vti tunnel netlink: fix documentation typo in netlink_set_err() be2net: Delete secondary unicast MAC addresses during be_close be2net: Fix unconditional enabling of Rx interface options net, virtio_net: replace the magic value ping: prevent NULL pointer dereference on write to msg_name bnx2x: Prevent "timeout waiting for state X" bnx2x: prevent CFC attention bnx2x: Prevent panic during DMAE timeout ... |
|
Kirill A. Shutemov | 24b9fdc59a |
kernel/bounds: avoid circular dependencies in generated headers
<linux/spinlock.h> has heavy dependencies on other header files. It triggers circular dependencies in generated headers on IA64, at least: CC kernel/bounds.s In file included from /home/space/kas/git/public/linux/arch/ia64/include/asm/thread_info.h:9:0, from include/linux/thread_info.h:54, from include/asm-generic/preempt.h:4, from arch/ia64/include/generated/asm/preempt.h:1, from include/linux/preempt.h:18, from include/linux/spinlock.h:50, from kernel/bounds.c:14: /home/space/kas/git/public/linux/arch/ia64/include/asm/asm-offsets.h:1:35: fatal error: generated/asm-offsets.h: No such file or directory compilation terminated. Let's replace <linux/spinlock.h> with <linux/spinlock_types.h>, it's enough to find out size of spinlock_t. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-and-Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> |
|
Johannes Berg | c53ed74236 |
genetlink: only pass array to genl_register_family_with_ops()
As suggested by David Miller, make genl_register_family_with_ops() a macro and pass only the array, evaluating ARRAY_SIZE() in the macro, this is a little safer. The openvswitch has some indirection, assing ops/n_ops directly in that code. This might ultimately just assign the pointers in the family initializations, saving the struct genl_family_and_ops and code (once mcast groups are handled differently.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Linus Torvalds | 4007162647 |
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq cleanups from Ingo Molnar: "This is a multi-arch cleanup series from Thomas Gleixner, which we kept to near the end of the merge window, to not interfere with architecture updates. This series (motivated by the -rt kernel) unifies more aspects of IRQ handling and generalizes PREEMPT_ACTIVE" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: preempt: Make PREEMPT_ACTIVE generic sparc: Use preempt_schedule_irq ia64: Use preempt_schedule_irq m32r: Use preempt_schedule_irq hardirq: Make hardirq bits generic m68k: Simplify low level interrupt handling code genirq: Prevent spurious detection for unconditionally polled interrupts |
|
Shigeru Yoshida | 0515973ffb |
sched: Fix a trivial typo in comments
Fix a trivial typo in rq_attach_root(). Signed-off-by: Shigeru Yoshida <shigeru.yoshida@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131117.121236.1990617639803941055.shigeru.yoshida@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Peter Zijlstra | 42eb088ed2 |
sched: Avoid NULL dereference on sd_busy
Commit
|
|
Srikar Dronamraju | 9abf24d465 |
sched: Check sched_domain before computing group power
After commit
|
|
Vince Weaver | 0022cedd4a |
perf/trace: Properly use u64 to hold event_id
The 64-bit attr.config value for perf trace events was being copied into an "int" before doing a comparison, meaning the top 32 bits were being truncated. As far as I can tell this didn't cause any errors, but it did mean it was possible to create valid aliases for all the tracepoint ids which I don't think was intended. (For example, 0xffffffff00000018 and 0x18 both enable the same tracepoint). Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1311151236100.11932@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Peter Zijlstra | 06db0b2171 |
perf: Remove fragile swevent hlist optimization
Currently we only allocate a single cpu hashtable for per-cpu swevents; do away with this optimization for it is fragile in the face of things like perf_pmu_migrate_context(). The easiest thing is to make sure all CPUs are consistent wrt state. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130913111447.GN31370@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Peter Zijlstra | d5b5f391d4 |
ftrace, perf: Avoid infinite event generation loop
Vince's perf-trinity fuzzer found yet another 'interesting' problem. When we sample the irq_work_exit tracepoint with period==1 (or PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an infinite event generation loop: ,-> <IPI> | irq_work_exit() -> | trace_irq_work_exit() -> | ... | __perf_event_overflow() -> (due to fasync) | irq_work_queue() -> (irq_work_list must be empty) '--------- arch_irq_work_raise() Similar things can happen due to regular poll() wakeups if we exceed the ring-buffer wakeup watermark, or have an event_limit. To avoid this, dis-allow sampling this particular tracepoint. In order to achieve this, create a special perf_perm function pointer for each event and call this (when set) on trying to create a tracepoint perf event. [ roasted: use expr... to allow for ',' in your expression ] Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Jones <davej@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Andrew Morton | 050ded1bba |
tick: Document tick_do_timer_cpu
Taken straight from a tglx email ;) Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
Joe Perches | da554eba2e |
timer: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)
Use the helper function instead of __GFP_ZERO. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
|
Thomas Gleixner | d689fe222a |
NOHZ: Check for nohz active instead of nohz enabled
RCU and the fine grained idle time accounting functions check tick_nohz_enabled. But that variable is merily telling that NOHZ has been enabled in the config and not been disabled on the command line. But it does not tell anything about nohz being active. That's what all this should check for. Matthew reported, that the idle accounting on his old P1 machine showed bogus values, when he enabled NOHZ in the config and did not disable it on the kernel command line. The reason is that his machine uses (refined) jiffies as a clocksource which explains why the "fine" grained accounting went into lala land, because it depends on when the system goes and leaves idle relative to the jiffies increment. Provide a tick_nohz_active indicator and let RCU and the accounting code use this instead of tick_nohz_enable. Reported-and-tested-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: john.stultz@linaro.org Cc: mwhitehe@redhat.com Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311132052240.30673@ionos.tec.linutronix.de |
|
Rafael J. Wysocki | b38f67c4ae |
Merge branch 'pm-sleep'
* pm-sleep: PM / Hibernate: Do not crash kernel in free_basic_memory_bitmaps() |
|
Linus Torvalds | b29c8306a3 |
This batch of changes is mostly clean ups and small bug fixes.
The only real feature that was added this release is from Namhyung Kim, who introduced "set_graph_notrace" filter that lets you run the function graph tracer and not trace particular functions and their call chain. Tom Zanussi added some updates to the ftrace multibuffer tracing that made it more consistent with the top level tracing. One of the fixes for perf function tracing required an API change in RCU; the addition of "rcu_is_watching()". As Paul McKenney is pushing that change in this release too, he gave me a branch that included all the changes to get that working, and I pulled that into my tree in order to complete the perf function tracing fix. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSgX5SAAoJEKQekfcNnQGulUAH/jORqJrKaNAulmZ314VsAqfa zMtF5UAAPf7kqc3AN/jtFrhJUNEfxWOo7A4r0FsM/rKdWJF+98GA6aqYVD+XoWFt +36fg1enxbXUjixQ96Uh+o1+BJUgYDqljuWzqSu/oiXWfWwl8+WL4kcbhb+V9WcF SpdzLCWVZRfhyDiN3+0zvyQ8RSG2Pd7CWn9zroI0e4sxGo0Ki6JUnIcXtZGOBDOQ IIZdjXvGSfpJ+3u3XvRPXJcltRCtOsVWxYzrmvRlmHDW5QMe1+WmmrlojTePrLaJ xn8+3WINqetAR+ZQnazbpt1XzJzKa8QtFgpiN0kT6qL7cg3N1Owc4vLGohl7wok= =Nesf -----END PGP SIGNATURE----- Merge tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing update from Steven Rostedt: "This batch of changes is mostly clean ups and small bug fixes. The only real feature that was added this release is from Namhyung Kim, who introduced "set_graph_notrace" filter that lets you run the function graph tracer and not trace particular functions and their call chain. Tom Zanussi added some updates to the ftrace multibuffer tracing that made it more consistent with the top level tracing. One of the fixes for perf function tracing required an API change in RCU; the addition of "rcu_is_watching()". As Paul McKenney is pushing that change in this release too, he gave me a branch that included all the changes to get that working, and I pulled that into my tree in order to complete the perf function tracing fix" * tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Add rcu annotation for syscall trace descriptors tracing: Do not use signed enums with unsigned long long in fgragh output tracing: Remove unused function ftrace_off_permanent() tracing: Do not assign filp->private_data to freed memory tracing: Add helper function tracing_is_disabled() tracing: Open tracer when ftrace_dump_on_oops is used tracing: Add support for SOFT_DISABLE to syscall events tracing: Make register/unregister_ftrace_command __init tracing: Update event filters for multibuffer recordmcount.pl: Add support for __fentry__ ftrace: Have control op function callback only trace when RCU is watching rcu: Do not trace rcu_is_watching() functions ftrace/x86: skip over the breakpoint for ftrace caller trace/trace_stat: use rbtree postorder iteration helper instead of opencoding ftrace: Add set_graph_notrace filter ftrace: Narrow down the protected area of graph_lock ftrace: Introduce struct ftrace_graph_data ftrace: Get rid of ftrace_graph_filter_enabled tracing: Fix potential out-of-bounds in trace_get_user() tracing: Show more exact help information about snapshot |