linux

Commit Graph

Author	SHA1	Message	Date
Florian Westphal	3282e65558	tcp: remove unused mib counters was used by tcp prequeue and header prediction. TCPFORWARDRETRANS use was removed in january. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:37:50 -07:00
Florian Westphal	573aeb0492	tcp: remove CA_ACK_SLOWPATH re-indent tcp_ack, and remove CA_ACK_SLOWPATH; it is always set now. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:37:50 -07:00
Florian Westphal	45f119bf93	tcp: remove header prediction Like prequeue, I am not sure this is overly useful nowadays. If we receive a train of packets, GRO will aggregate them if the headers are the same (HP predates GRO by several years) so we don't get a per-packet benefit, only a per-aggregated-packet one. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:37:49 -07:00
Florian Westphal	b6690b1438	tcp: remove low_latency sysctl Was only checked by the removed prequeue code. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:37:49 -07:00
Florian Westphal	c13ee2a4f0	tcp: reindent two spots after prequeue removal These two branches are now always true, remove the conditional. objdiff shows no changes. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:37:49 -07:00
Florian Westphal	e7942d0633	tcp: remove prequeue support prequeue is a tcp receive optimization that moves part of rx processing from bh to process context. This only works if the socket being processed belongs to a process that is blocked in recv on that socket. In practice, this doesn't happen anymore that often because nowadays servers tend to use an event driven (epoll) model. Even normal client applications (web browsers) commonly use many tcp connections in parallel. This has measureable impact only in netperf (which uses plain recv and thus allows prequeue use) from host to locally running vm (~4%), however, there were no changes when using netperf between two physical hosts with ixgbe interfaces. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:37:49 -07:00
Linus Torvalds	2e7ca2064c	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Several cgroup bug fixes. - cgroup core was calling a migration callback on empty migrations, which could make cpuset crash. - There was a very subtle bug where the controller interface files aren't created directly when cgroup2 is mounted. Because later operations create them, this bug didn't get noticed earlier. - Failed writes to cgroup.subtree_control were incorrectly returning zero" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix error return value from cgroup_subtree_control() cgroup: create dfl_root files on subsys registration cgroup: don't call migration methods if there are no tasks to migrate	2017-07-31 14:03:05 -07:00
Linus Torvalds	ff2620f778	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: "Two notable fixes. - While adding NUMA affinity support to unbound workqueues, the assumption that an unbound workqueue with max_active == 1 is ordered was broken. The plan was to use explicit alloc_ordered_workqueue() for those cases. Unfortunately, I forgot to update the documentation properly and we grew a handful of use cases which depend on that assumption. While we want to convert them to alloc_ordered_workqueue(), we don't really lose anything by enforcing ordered execution on unbound max_active == 1 workqueues and it doesn't make sense to risk subtle bugs. Restore the assumption. - Workqueue assumes that CPU <-> NUMA node mapping remains static. This is a general assumption - we don't have any synchronization mechanism around CPU <-> node mapping. Unfortunately, powerpc may change the mapping dynamically leading to crashes. Michael added a workaround so that we at least don't crash while powerpc hotplug code gets updated" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Work around edge cases for calc of pool's cpumask workqueue: implicit ordered attribute should be overridable workqueue: restore WQ_UNBOUND/max_active==1 to be ordered	2017-07-31 13:37:28 -07:00
Linus Torvalds	3dcc4c7d42	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Dan found a really old bug where libata hotplug code wasn't sanitizing index value from userland and may end up indexing with a negative number. It is scary but fortunately can only be triggered by root. Other than that, minor fixes" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: fix a couple of doc build warnings libata: array underflow in ata_find_dev() ata: sata_rcar: add gen[23] fallback compatibility strings libata: remove unused rc in ata_eh_handle_port_resume libata: Cleanup ata_read_log_page() ata: fix gemini Kconfig dependencies	2017-07-31 13:33:21 -07:00
Sylwester Nawrocki	5b30850bd6	clk: samsung: exynos5420: The EPLL rate table corrections This patch fixes values of the EPLL K coefficient and changes the EPLL output frequency values to match exactly what is possible to achieve with given M, P, S, K coefficients. This allows to avoid rounding errors and unexpected frequency being set with clk_set_rate(), due to recalc_rate returning different values than the PLL rate specified in the exynos5420_epll_24mhz_tbl table. E.g. this prevents a case where two consecutive clk_set_rate() calls with same argument result in different PLL output frequency. The PLL output frequencies have been calculated with formula: f = fxtal * (M * 2^16 + K) / (P * 2^S) / 2^16 where fxtal = 24000000. Fixes: `9842452acd` ("clk: samsung: exynos542x: Add EPLL rate table") Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>	2017-07-31 13:16:03 -07:00
Babu Moger	74ad3d28af	parisc: Define CONFIG_CPU_BIG_ENDIAN While working on enabling queued rwlock on SPARC, found this following code in include/asm-generic/qrwlock.h which uses CONFIG_CPU_BIG_ENDIAN to clear a byte. static inline u8 *__qrwlock_write_byte(struct qrwlock *lock) { return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN); } Problem is many of the fixed big endian architectures don't define CPU_BIG_ENDIAN and clears the wrong byte. Define CPU_BIG_ENDIAN for parisc architecture to fix it. Signed-off-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: Helge Deller <deller@gmx.de>	2017-07-31 17:51:27 +02:00
Jonathan Corbet	2f60e1ab2f	libata: fix a couple of doc build warnings The kerneldoc comments for a couple of functions in drivers/ata/libata-eh.c had fallen behind the current implementation, resulting in these doc build warnings: ./drivers/ata/libata-eh.c:1449: warning: No description found for parameter 'link' ./drivers/ata/libata-eh.c:1449: warning: Excess function parameter 'ap' description in 'ata_eh_done' ./drivers/ata/libata-eh.c:1590: warning: No description found for parameter 'qc' ./drivers/ata/libata-eh.c:1590: warning: Excess function parameter 'dev' description in 'ata_eh_request_sense' Update the comments and make the warnings go away. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-31 08:03:06 -07:00
James Bottomley	93964fd4ea	parisc: pdc_stable: Fix locking when creating sysfs links There's no need to take the write lock when creating sysfs links. This patch fixes the following BUG: BUG: sleeping function called from invalid context at mm/slab.h:416 in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0 CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2-00110-g0b5477d9dabd #111 Backtrace: [<0000000040217ac8>] show_stack+0x20/0x38 [<00000000406fbbb0>] dump_stack+0xb0/0x128 [<0000000040274090>] ___might_sleep+0x180/0x1b8 [<0000000040274144>] __might_sleep+0x7c/0xe8 [<0000000040373874>] kmem_cache_alloc+0x14c/0x1e0 [<0000000040419514>] __kernfs_new_node+0x84/0x1b8 [<000000004041b09c>] kernfs_new_node+0x3c/0x78 [<000000004041e040>] kernfs_create_link+0x40/0xd8 [<000000004041f320>] sysfs_do_create_link_sd.isra.0+0xb0/0x130 [<000000004041f3d4>] sysfs_create_link+0x34/0x58 [<000000004011b4a4>] pdc_stable_init+0x2c4/0x458 [<0000000040200250>] do_one_initcall+0x70/0x1d8 [<0000000040101644>] kernel_init_freeable+0x27c/0x390 [<000000004020be44>] kernel_init+0x24/0x1c0 Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Helge Deller <deller@gmx.de>	2017-07-31 16:43:13 +02:00
Axel Lin	fa8f6d0619	gpio: lp87565: Set proper output level and direction for direction_output The value argument of lp87565_gpio_direction_output() means output level rather than gpio direction. Signed-off-by: Axel Lin <axel.lin@ingics.com> Reviewed-by: Keerthy <j-keerthy@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2017-07-31 15:26:57 +02:00
Rafael J. Wysocki	a684c5b188	thunderbolt: icm: Ignore mailbox errors in icm_suspend() On one of my test machines nhi_mailbox_cmd() called from icm_suspend() times out and returns an error which then is propagated to the caller and causes the entire system suspend to be aborted which isn't very useful. Instead of aborting system suspend, print the error into the log and continue. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Michael Jamet <michael.jamet@intel.com>	2017-07-31 13:24:29 +02:00
Loic Poulain	fb776481c4	Bluetooth: hci_uart: Fix uninitialized alignment value Force alignment value to the default one (1 byte) if uninitialized. This fixes hci_ll serdev driver (alignment = 0) and avoid any further issues with upcoming drivers. Signed-off-by: Loic Poulain <loic.poulain@gmail.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2017-07-31 13:27:37 +03:00
Nicholas Piggin	cc491f1d35	powerpc/64s: Fix stack setup in watchdog soft_nmi_common() The watchdog soft-NMI exception stack setup loads a stack pointer twice, which is an obvious error. It ends up using the system reset interrupt (true-NMI) stack, which is also a bug because the watchdog could be preempted by a system reset interrupt that overwrites the NMI stack. Change the soft-NMI to use the "emergency stack". The current kernel stack is not used, because of the longer-term goal to prevent asynchronous stack access using soft-disable. Fixes: `2104180a53` ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-07-31 20:22:37 +10:00
Michael Ellerman	bb272221e9	Linux v4.13-rc1 Merge tag 'v4.13-rc1' into fixes The fixes branch is based off a random pre-rc1 commit, because we had some fixes that needed to go in before rc1 was released. However we now need to fix some code that went in after that point, but before rc1, so merge rc1 to get that code into fixes so we can fix it!	2017-07-31 20:20:29 +10:00
Kuppuswamy Sathyanarayanan	727fd697da	MAINTAINERS: Add entry for Whiskey Cove PMIC GPIO driver Added maintainer info for Whiskey Cove PMIC GPIO driver. Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2017-07-31 09:13:52 +02:00
Helge Deller	8f8201dfed	parisc: Increase thread and stack size to 32kb Since kernel 4.11 the thread and irq stacks on parisc randomly overflow the default size of 16k. The reason why stack usage suddenly grew is yet unknown. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 4.11+ Signed-off-by: Helge Deller <deller@gmx.de>	2017-07-31 08:41:26 +02:00
John David Anglin	13d57093c1	parisc: Handle vma's whose context is not current in flush_cache_range In testing James' patch to drivers/parisc/pdc_stable.c, I hit the BUG statement in flush_cache_range() during a system shutdown: kernel BUG at arch/parisc/kernel/cache.c:595! CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1 Workqueue: events free_ioctx IAOQ[0]: flush_cache_range+0x144/0x148 IAOQ[1]: flush_cache_page+0x0/0x1a8 RP(r2): flush_cache_range+0xec/0x148 Backtrace: [<00000000402910ac>] unmap_page_range+0x84/0x880 [<00000000402918f4>] unmap_single_vma+0x4c/0x60 [<0000000040291a18>] zap_page_range_single+0x110/0x160 [<0000000040291c34>] unmap_mapping_range+0x174/0x1a8 [<000000004026ccd8>] truncate_pagecache+0x50/0xa8 [<000000004026cd84>] truncate_setsize+0x54/0x70 [<000000004033d534>] put_aio_ring_file+0x44/0xb0 [<000000004033d5d8>] aio_free_ring+0x38/0x140 [<000000004033d714>] free_ioctx+0x34/0xa8 [<00000000401b0028>] process_one_work+0x1b8/0x4d0 [<00000000401b04f4>] worker_thread+0x1b4/0x648 [<00000000401b9128>] kthread+0x1b0/0x208 [<0000000040150020>] end_fault_vector+0x20/0x28 [<0000000040639518>] nf_ip_reroute+0x50/0xa8 [<0000000040638ed0>] nf_ip_route+0x10/0x78 [<0000000040638c90>] xfrm4_mode_tunnel_input+0x180/0x1f8 CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1 Workqueue: events free_ioctx Backtrace: [<0000000040163bf0>] show_stack+0x20/0x38 [<0000000040688480>] dump_stack+0xa8/0x120 [<0000000040163dc4>] die_if_kernel+0x19c/0x2b0 [<0000000040164d0c>] handle_interruption+0xa24/0xa48 This patch modifies flush_cache_range() to handle non current contexts. In as much as this occurs infrequently, the simplest approach is to flush the entire cache when this happens. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: Helge Deller <deller@gmx.de>	2017-07-31 08:22:33 +02:00
Jeff Layton	9c5d58fb9e	ext4: convert swap_inode_data() over to use swap() on most of the fields For some odd reason, it forces a byte-by-byte copy of each field. A plain old swap() on most of these fields would be more efficient. We do need to retain the memswap of i_data however as that field is an array. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz>	2017-07-31 00:55:34 -04:00
Emoly Liu	191eac3300	ext4: error should be cleared if ea_inode isn't added to the cache For Lustre, if ea_inode fails in hash validation but passes parent inode and generation checks, it won't be added to the cache as well as the error "-EFSCORRUPTED" should be cleared, otherwise it will cause "Structure needs cleaning" when running getfattr command. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-9723 Cc: stable@vger.kernel.org Fixes: `dec214d00e` Signed-off-by: Emoly Liu <emoly.liu@intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: tahsin@google.com	2017-07-31 00:40:22 -04:00
Jan Kara	a3bb2d5587	ext4: Don't clear SGID when inheriting ACLs When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit set, DIR1 is expected to have SGID bit set (and owning group equal to the owning group of 'DIR0'). However when 'DIR0' also has some default ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on 'DIR1' to get cleared if user is not member of the owning group. Fix the problem by moving posix_acl_update_mode() out of __ext4_set_acl() into ext4_set_acl(). That way the function will not be called when inheriting ACLs which is what we want as it prevents SGID bit clearing and the mode has been properly set by posix_acl_create() anyway. Fixes: `073931017b` CC: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>	2017-07-30 23:33:01 -04:00
Ernesto A. Fernández	397e434176	ext4: preserve i_mode if __ext4_set_acl() fails When changing a file's acl mask, __ext4_set_acl() will first set the group bits of i_mode to the value of the mask, and only then set the actual extended attribute representing the new acl. If the second part fails (due to lack of space, for example) and the file had no acl attribute to begin with, the system will from now on assume that the mask permission bits are actual group permission bits, potentially granting access to the wrong users. Prevent this by only changing the inode mode after the acl has been set. Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>	2017-07-30 22:43:41 -04:00
Eric Whitney	a627b0a7c1	ext4: remove unused metadata accounting variables Two variables in ext4_inode_info, i_reserved_meta_blocks and i_allocated_meta_blocks, are unused. Removing them saves a little memory per in-memory inode and cleans up clutter in several tracepoints. Adjust tracepoint output from ext4_alloc_da_blocks() for consistency and fix a typo and whitespace near these changes. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>	2017-07-30 22:30:11 -04:00
David S. Miller	764646b08d	Merge branch 'net-sched-actions-improve-dump-performance' Jamal Hadi Salim says: ==================== net sched actions: improve dump performance Changes since v11: ------------------ 1) Jiri - renames: nla_value to value and nla_selector to selector 2) Jiri - rename: validate_nla_bitfield_32 to validate_nla_bitfield_32 3) Jiri - rename: NLA_BITFIELD_32 to NLA_BITFIELD32 4) Jiri - remove unnecessary break when we return in case statement 5) Jiri - rename and move nla_get_bitfield_32 to an earlier patch 6) Jiri - xmas tree alignment of var declaration 7) Jiri - rename all declarations of bitfield 32 vars to be consistent ("bf") 8) Jiri - improve validate_nla_bitfield32() validation to disallow valid bit values that are not selected by the selector Changes since v10: ----------------- 1) Jiri: move type->validate_content() to its own patch Jamal: decided to remove it altogether so we can get this patch set in. 2) Change name of NLA_FLAG_BITS to NLA_BITFIELD_32 based on discussions with D. Ahern and Jiri. D. Ahern suggests to make this a variable bitmap size. My analysis at this point is it too complex and i only need a few bit flags. If we run out of bits someone else can create a new NLA_BITFIELD_XXX and start using that. So please let this go. 3) Jamal - Add Suggested-by: Jiri for type NLA_BITFIELD_32 4) Jiri: Change name allowed_flags to tcaa_root_flags_allowed 5) Jiri: Introduce nla_get_flag_bits_values() helper instead of using memcpy for retrieving nla_bitfield_32 fields. Changes since v9: ----------------- 1) General consensus: - remove again the use of BIT() to maintain uapi consistency ;-> 1) Jiri: - Add a new netlink type NLA_FLAG_BITS to check for valid bits and use it instead of inline vetting (patch 4/4 now) Changes since v8: ----------------- 1) Jiri: - Add back the use of BIT(). Eventually fix iproute2 instead - Rename VALID_TCA_FLAGS to VALID_TCA_ROOT_FLAGS Changes since v7: ----------------- Jamal: No changes. 
Patch 1 went out twice. Resend without two copies of patch 1 changes since v6: ----------------- 1) DaveM: New rules for netlink messages. From now on we are going to start checking for bits that are not used and rejecting anything we dont understand. In the future this is going to require major changes to user space code (tc etc). This is just a start. To quote, David: " Again, bits you aren't using now, make sure userspace doesn't set them. And if it does, reject. " Added checks for ensuring things work as above. 2) Jiri: a)Fix the commit message to properly use "Fixes" description b)Align assignments for nla_policy Changes since v5: ---------------- 0) Remove use of BIT() because it is kernel specific. Requires a separate patch (Jiri can submit that in his cleanups) 1)To paraphrase Eric D. "memcpy(nla_data(count_attr), &cb->args[1], sizeof(u32)); wont work on 64bit BE machines because cb->args[1] (which is 64 bit is larger in size than sizeof(u32))" Fixed 2) Jiri Pirko i) Spotted a bug fix mixed in the patch for wrong TLV fix. Add patch 1/3 to address this. Make part of this series because of dependencies. ii) Rename ACT_LARGE_DUMP_ON -> TCA_FLAG_LARGE_DUMP_ON iii) Satisfy Jiri's obsession against the noun "tcaa" a)Rename struct nlattr tcaa --> struct nlattr tb b)Rename TCAA_ACT_XXX -> TCA_ROOT_XXX Changes since v4: ----------------- 1) Eric D. pointed out that when all skb space is used up by the dump there will be no space to insert the TCAA_ACT_COUNT attribute. 2) Jiri: i) Change: enum { TCAA_UNSPEC, TCAA_ACT_TAB, TCAA_ACT_FLAGS, TCAA_ACT_COUNT, TCAA_ACT_TIME_FILTER, __TCAA_MAX }; to: enum { TCAA_UNSPEC, TCAA_ACT_TAB, TCAA_ACT_FLAGS, TCAA_ACT_COUNT, __TCAA_MAX, }; Jiri plans to followup with the rest of the code to make the style consistent. 
ii) Rename attribute TCAA_ACT_TIME_FILTER --> TCAA_ACT_TIME_DELTA iii) Rename variable jiffy_filter --> jiffy_since iv) Rename msecs_filter --> msecs_since v) get rid of unused cb->args[0] and rename cb->args[4] to cb->args[0] Earlier Changes ---------------- - Jiri mostly on names of things. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Jamal Hadi Salim	e62e484df0	net sched actions: add time filter for action dumping This patch adds support for filtering based on time since last used. When we are dumping a large number of actions it is useful to have the option of filtering based on when the action was last used to reduce the amount of data crossing to user space. With this patch the user space app sets the TCA_ROOT_TIME_DELTA attribute with the value in milliseconds with "time of interest since now". The kernel converts this to jiffies and does the filtering comparison matching entries that have seen activity since then and returns them to user space. Old kernels and old tc continue to work in legacy mode since they dont specify this attribute. Some example (we have 400 actions bound to 400 filters); at installation time. Using updated when tc setting the time of interest to 120 seconds earlier (we see 400 actions): prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l 400 go get some coffee and wait for > 120 seconds and try again: prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l 0 Lets see a filter bound to one of these actions: .... filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 2 success 1) match 7f000002/ffffffff at 12 (success 1 ) action order 1: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1145 sec used 802 sec Action statistics: Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 .... that coffee took long, no? It was good. 
Now lets ping -c 1 127.0.0.2, then run the actions again: prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l 1 More details please: prompt$ hackedtc -s actions ls action gact since 120000 action order 0: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1270 sec used 30 sec Action statistics: Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 And the filter? filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 4 success 2) match 7f000002/ffffffff at 12 (success 2 ) action order 1: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1324 sec used 84 sec Action statistics: Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Jamal Hadi Salim	90825b23a8	net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch When you dump hundreds of thousands of actions, getting only 32 per dump batch even when the socket buffer and memory allocations allow is inefficient. With this change, the user will get as many as possibly fitting within the given constraints available to the kernel. The top level action TLV space is extended. An attribute TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON is set by the user indicating the user is capable of processing these large dumps. Older user space which doesnt set this flag doesnt get the large (than 32) batches. The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many actions are put in a single batch. As such user space app knows how long to iterate (independent of the type of action being dumped) instead of hardcoded maximum of 32 thus maintaining backward compat. Some results dumping 1.5M actions below: first an unpatched tc which doesnt understand these features... prompt$ time -p tc actions ls action gact | grep index | wc -l 1500000 real 1388.43 user 2.07 sys 1386.79 Now lets see a patched tc which sets the correct flags when requesting a dump: prompt$ time -p updatedtc actions ls action gact | grep index | wc -l 1500000 real 178.13 user 2.02 sys 176.96 That is about 8x performance improvement for tc app which sets its receive buffer to about 32K. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Jamal Hadi Salim	df823b0297	net sched actions: Use proper root attribute table for actions Bug fix for an issue which has been around for about a decade. We got away with it because the enumeration was larger than needed. Fixes: `7ba699c604` ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API") Suggested-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Jamal Hadi Salim	64c83d8373	net netlink: Add new type NLA_BITFIELD32 Generic bitflags attribute content sent to the kernel by user. With this netlink attr type the user can either set or unset a flag in the kernel. The value is a bitmap that defines the bit values being set The selector is a bitmask that defines which value bit is to be considered. A check is made to ensure the rules that a kernel subsystem always conforms to bitflags the kernel already knows about. i.e if the user tries to set a bit flag that is not understood then the _it will be rejected_. In the most basic form, the user specifies the attribute policy as: [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags }, where myvalidflags is the bit mask of the flags the kernel understands. If the user _does not_ provide myvalidflags then the attribute will also be rejected. Examples: value = 0x0, and selector = 0x1 implies we are selecting bit 1 and we want to set its value to 0. value = 0x2, and selector = 0x2 implies we are selecting bit 2 and we want to set its value to 1. Suggested-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Eric Whitney	1e21196c8e	ext4: correct comment references to ext4_ext_direct_IO() Commit `914f82a32d` "ext4: refactor direct IO code" deleted ext4_ext_direct_IO(), but references to that function remain in comments. Update them to refer to ext4_direct_IO_write(). Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Jan Kara <jack@suse.cz>	2017-07-30 22:26:40 -04:00
Andrew Lunn	fbbeefdd21	net: fec: Allow reception of frames bigger than 1522 bytes The FEC Receive Control Register has a 14 bit field indicating the longest frame that may be received. It is being set to 1522. Frames longer than this are discarded, but counted as being in error. When using DSA, frames from the switch has an additional header, either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame of 1522 bytes received by the switch on a port becomes 1530 bytes when passed to the host via the FEC interface. Change the maximum receive size to 2048 - 64, where 64 is the maximum rx_alignment applied on the receive buffer for AVB capable FEC cores. Use this value also for the maximum receive buffer size. The driver is already allocating a receive SKB of 2048 bytes, so this change should not have any significant effects. Tested on imx51, imx6, vf610. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:26:01 -07:00
Andrew Lunn	9558df3a82	net: fec: Issue error for missing but expected PHY If the PHY is missing but expected, e.g. because of a typ0 in the dt file, it is not possible to open the interface. ip link returns: RTNETLINK answers: No such device It is not very obvious what the problem is. Add a netdev_err() in this case to make it easier to debug the issue. [ 21.409385] fec 2188000.ethernet eth0: Unable to connect to phy RTNETLINK answers: No such device Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:25:22 -07:00
David S. Miller	509394e841	Merge branch 'dsa-lan9303-Fix-MDIO-issues' Egil Hjelmeland says: ==================== net: dsa: lan9303: Fix MDIO issues. This series fix the MDIO interface for the lan9303 DSA driver. Bugs found after testing on actual HW. This series is extracted from the first patch of my first large series. Significant changes from that version are: - use mdiobus_write_nested, mdiobus_read_nested. - EXPORT lan9303_indirect_phy_ops Unfortunately I do not have access to i2c based system for testing. Changes from first version: - Change EXPORT_SYMBOL to EXPORT_SYMBOL_GPL ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:23:29 -07:00
Egil Hjelmeland	2c3408986c	net: dsa: lan9303: MDIO access phy registers directly Indirect access (PMI) to phy register only work in I2C mode. In MDIO mode phy registers must be accessed directly. Introduced struct lan9303_phy_ops to handle the two modes. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:23:29 -07:00
Egil Hjelmeland	9e866e5dab	net: dsa: lan9303: Renamed indirect phy access functions Preparing for the following fix of MDIO phy access: Renamed functions that access PHY 1 and 2 indirectly through PMI registers. lan9303_port_phy_reg_wait_for_completion() to lan9303_indirect_phy_wait_for_completion() lan9303_port_phy_reg_read() to lan9303_indirect_phy_read() lan9303_port_phy_reg_write() to lan9303_indirect_phy_write() Also changed "val" parameter of lan9303_indirect_phy_write() to u16, for clarity. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:23:29 -07:00
Egil Hjelmeland	ab78acb152	net: dsa: lan9303: Multiply by 4 to get MDIO register lan9303_mdio_write()/_read() must multiply register number by 4 to get offset. Added some commments to the register definitions. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:23:29 -07:00
Egil Hjelmeland	d329ac88eb	net: dsa: lan9303: Fix lan9303_detect_phy_setup() for MDIO Handle that MDIO read with no response return 0xffff. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:23:29 -07:00
Linus Torvalds	16f73eb02d	Linux 4.13-rc3	2017-07-30 12:40:36 -07:00
Linus Torvalds	f137e0b0c5	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of x86 fixes: - prevent the kernel from using the EFI reboot method when EFI is disabled. - two patches addressing clang issues" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Disable the address-of-packed-member compiler warning x86/efi: Fix reboot_mode when EFI runtime services are disabled x86/boot: #undef memcpy() et al in string.c	2017-07-30 12:19:35 -07:00
Linus Torvalds	e4776b8ccb	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "Two patches addressing build warnings caused by inconsistent kernel doc comments" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/wait: Clean up some documentation warnings sched/core: Fix some documentation build warnings	2017-07-30 11:54:08 -07:00
Linus Torvalds	dbc52a8030	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A couple of fixes for performance counters and kprobes: - a series of small patches which make the uncore performance counters on Skylake server systems work correctly - add a missing instruction slot release to the failure path of kprobes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kprobes/x86: Release insn_slot in failure path perf/x86/intel/uncore: Fix missing marker for skx_uncore_cha_extra_regs perf/x86/intel/uncore: Fix SKX CHA event extra regs perf/x86/intel/uncore: Remove invalid Skylake server CHA filter field perf/x86/intel/uncore: Fix Skylake server CHA LLC_LOOKUP event umask perf/x86/intel/uncore: Fix Skylake server PCU PMU event format perf/x86/intel/uncore: Fix Skylake UPI PMU event masks	2017-07-30 11:52:15 -07:00
Linus Torvalds	06efc7df37	Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "Fix for a regression caused by the conversion of x86 to the generic hotplug code. Instead of doing a plain single line revert, this adds a pile of comments so the semantics of the force argument are clear" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/cpuhotplug: Revert "Set force affinity flag on hotplug migration"	2017-07-30 11:27:33 -07:00
Hanjun Guo	f7f3dd5b4c	ACPI: APD: Fix HID for Hisilicon Hip07/08 ACPI HID for Hisilicon Hip07/08 should be HISI02A1/2, not HISI0A21/2, HISI02A1/2 was tested ok but was modified by the stupid typo when upstream the patches (by me), correct them to the right IDs (matching the IDs in drivers/i2c/busses/i2c-designware-platdrv.c). Fixes: `6e14cf361a` (ACPI / APD: Add clock frequency for Hisilicon Hip07/08 I2C controller) Reported-by: Tao Tian <tiantao6@huawei.com> Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-30 14:33:48 +02:00
Rafael J. Wysocki	4815d3c56d	cpufreq: x86: Make scaling_cur_freq behave more as expected After commit `f8475cef90` "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF" the scaling_cur_freq policy attribute in sysfs only behaves as expected on x86 with APERF/MPERF registers available when it is read from at least twice in a row. The value returned by the first read may not be meaningful, because the computations in there use cached values from the previous iteration of aperfmperf_snapshot_khz() which may be stale. To prevent that from happening, modify arch_freq_get_on_cpu() to call aperfmperf_snapshot_khz() twice, with a short delay between these calls, if the previous invocation of aperfmperf_snapshot_khz() was too far back in the past (specifically, more that 1s ago). Also, as pointed out by Doug Smythies, aperf_delta is limited now and the multiplication of it by cpu_khz won't overflow, so simplify the s->khz computations too. Fixes: `f8475cef90` "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF" Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-30 14:26:51 +02:00
Daniel Borkmann	9975a54b3c	bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len bpf_prog_size(prog->len) is not the correct length we want to dump back to user space. The code in bpf_prog_get_info_by_fd() uses this to copy prog->insnsi to user space, but bpf_prog_size(prog->len) also includes the size of struct bpf_prog itself plus program instructions and is usually used either in context of accounting or for bpf_prog_alloc() et al, thus we copy out of bounds in bpf_prog_get_info_by_fd() potentially. Use the correct bpf_prog_insn_size() instead. Fixes: `1e27097690` ("bpf: Add BPF_OBJ_GET_INFO_BY_FD") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 23:29:41 -07:00
Arnd Bergmann	efe967cdec	tcp: avoid bogus gcc-7 array-bounds warning When using CONFIG_UBSAN_SANITIZE_ALL, the TCP code produces a false-positive warning: net/ipv4/tcp_output.c: In function 'tcp_connect': net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds] tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start; ^~ net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds] tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~ I have opened a gcc bug for this, but distros have already shipped compilers with this problem, and it's not clear yet whether there is a way for gcc to avoid the warning. As the problem is related to the bitfield access, this introduces a temporary variable to store the old enum value. I did not notice this warning earlier, since UBSAN is disabled when building with COMPILE_TEST, and that was always turned on in both allmodconfig and randconfig tests. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 23:26:29 -07:00
David S. Miller	736b9b9c50	Merge branch 'ethtool-fec' Roopa Prabhu says: ==================== ethtool: support for forward error correction mode setting on a link Forward Error Correction (FEC) modes i.e Base-R and Reed-Solomon modes are introduced in 25G/40G/100G standards for providing good BER at high speeds. Various networking devices which support 25G/40G/100G provides ability to manage supported FEC modes and the lack of FEC encoding control and reporting today is a source for interoperability issues for many vendors. FEC capability as well as specific FEC mode i.e. Base-R or RS modes can be requested or advertised through bits D44:47 of base link codeword. This patch set intends to provide option under ethtool to manage and report FEC encoding settings for networking devices as per IEEE 802.3 bj, bm and by specs. v2 : - minor patch format fixes and typos pointed out by Andrew - there was a pending discussion on the use of 'auto' vs 'automatic' for fec settings. I have left it as 'auto' because in most cases today auto is used in place of automatic to represent automatically generated values. We use it in other networking config too. I would prefer leaving it as auto. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 23:23:45 -07:00
Casey Leedom	7fece840e3	cxgb4: ethtool forward error correction management support Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 23:23:44 -07:00

693356 Commits, All Branches
