usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Showing message that driver is loaded is common across drivers.
This change also fixes checkpatch (--strict) warning
"Alignment should match open parenthesis".
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Give read access to module parameters to all and write access to root.
This change also improves driver error path testing.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Fix format warnings (seen on i386) in nvdimm/btt.c:
../drivers/nvdimm/btt.c: In function ‘btt_map_init’:
../drivers/nvdimm/btt.c:430:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Wformat=]
dev_WARN_ONCE(to_dev(arena), size < 512,
^
../drivers/nvdimm/btt.c: In function ‘btt_log_init’:
../drivers/nvdimm/btt.c:474:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Wformat=]
dev_WARN_ONCE(to_dev(arena), size < 512,
^
Fixes: 86652d2eb3 ("libnvdimm, btt: clean up warning and error messages")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
These watchdog_ops and watchdog_info structures are only stored
in the ops and info fields of a watchdog_device structure,
respectively, which are const. Thus make the watchdog_ops and
watchdog_info structures const as well.
Done with the help of Coccinelle. The rules for the watchdog_ops case are
as follows:
// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
We should never return more time left than there actually is. So, switch
to a plain divider instead of DIV_ROUND_CLOSEST.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
If we set RWTCSRB to 0, we can gain 4096 as another divider value. This
is supported by all R-Car Gen2 and Gen3 devices which we aim to support.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The error margin of the clks_per_second variable was too large and
caused offsets when used with clock frequencies which left a remainder
after applying the dividers. Now we always calculate directly using the
clock rate and the divider using some helper macros. That also means
that DIV_ROUND_UP moves from probe to the multiplication macro. In
probe, we don't need to ensure anymore that 'clks_per_sec' would go too
fast but rather ensure that the lower limit is really at least 1 to
certainly get a full cycle.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
We should never return more time left than there actually is. So, switch
to a plain divider instead of DIV_ROUND_CLOSEST.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
When checking the clock rate, ensure also that counting all 16 bits
takes at least one second to match the granularity of the framework.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Because the smallest clock divider we can select is 1, 'clks_per_sec'
must be the same type as 'rate'.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Commit a53e35db70 ("reset: Ensure drivers are explicit when requesting
reset lines") started to transition the reset control request API calls
to explicitly state whether the driver needs exclusive or shared reset
control behavior. Convert all drivers requesting exclusive resets to the
explicit API call so the temporary transition helpers can be removed.
No functional changes.
Cc: linux-watchdog@vger.kernel.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Commit a53e35db70 ("reset: Ensure drivers are explicit when requesting
reset lines") started to transition the reset control request API calls
to explicitly state whether the driver needs exclusive or shared reset
control behavior. Convert all drivers requesting exclusive resets to the
explicit API call so the temporary transition helpers can be removed.
No functional changes.
Cc: linux-watchdog@vger.kernel.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Commit a53e35db70 ("reset: Ensure drivers are explicit when requesting
reset lines") started to transition the reset control request API calls
to explicitly state whether the driver needs exclusive or shared reset
control behavior. Convert all drivers requesting exclusive resets to the
explicit API call so the temporary transition helpers can be removed.
No functional changes.
Cc: linux-watchdog@vger.kernel.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Commit a53e35db70 ("reset: Ensure drivers are explicit when requesting
reset lines") started to transition the reset control request API calls
to explicitly state whether the driver needs exclusive or shared reset
control behavior. Convert all drivers requesting exclusive resets to the
explicit API call so the temporary transition helpers can be removed.
No functional changes.
Cc: linux-watchdog@vger.kernel.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The watchdog IP block on Meson8 and Meson8m2 is already supported by the
existing meson-wdt driver. Meson8 uses the same register bits as Meson6,
while the newer Meson8m2 SoC uses the same register bits as Meson8b.
Currently watchdog support on Meson8 SoC already works because
meson8.dtsi simply uses the "amlogic,meson6-wdt" compatible. Adding a
separate compatible for Meson8 makes this more explicit though.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Don't populate array chip_name on the stack but instead make it static.
Makes the object code smaller by 40 bytes:
Before:
text data bss dec hex filename
5641 2840 384 8865 22a1 drivers/watchdog/w83627hf_wdt.o
After:
text data bss dec hex filename
5545 2896 384 8825 2279 drivers/watchdog/w83627hf_wdt.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Check for watchdog_ops structures that are only stored in the ops field of
a watchdog_device structure. This field is declared const, so watchdog_ops
structures that have this property can be declared as const also.
This issue was detected using Coccinelle and the following semantic patch:
@r
disable optional_qualifier@
identifier i;
position p;
@@
static struct watchdog_ops i@p = { ... };
@ok@
identifier r.i;
struct watchdog_device e;
position p;
@@
e.ops = &i@p;
@bad@
position p != {r.p,ok.p};
identifier r.i;
struct watchdog_ops e;
@@
e@i@p
@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
struct watchdog_ops i = { ... };
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Pull networking fixes from David Miller:
"The iwlwifi firmware compat fix is in here as well as some other
stuff:
1) Fix request socket leak introduced by BPF deadlock fix, from Eric
Dumazet.
2) Fix VLAN handling with TXQs in mac80211, from Johannes Berg.
3) Missing __qdisc_drop conversions in prio and qfq schedulers, from
Gao Feng.
4) Use after free in netlink nlk groups handling, from Xin Long.
5) Handle MTU update properly in ipv6 gre tunnels, from Xin Long.
6) Fix leak of ipv6 fib tables on netns teardown, from Sabrina Dubroca
with follow-on fix from Eric Dumazet.
7) Need RCU and preemption disabled during generic XDP data patch,
from John Fastabend"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
bpf: make error reporting in bpf_warn_invalid_xdp_action more clear
Revert "mdio_bus: Remove unneeded gpiod NULL check"
bpf: devmap, use cond_resched instead of cpu_relax
bpf: add support for sockmap detach programs
net: rcu lock and preempt disable missing around generic xdp
bpf: don't select potentially stale ri->map from buggy xdp progs
net: tulip: Constify tulip_tbl
net: ethernet: ti: netcp_core: no need in netif_napi_del
davicom: Display proper debug level up to 6
net: phy: sfp: rename dt properties to match the binding
dt-binding: net: sfp binding documentation
dt-bindings: add SFF vendor prefix
dt-bindings: net: don't confuse with generic PHY property
ip6_tunnel: fix setting hop_limit value for ipv6 tunnel
ip_tunnel: fix setting ttl and tos value in collect_md mode
ipv6: fix typo in fib6_net_exit()
tcp: fix a request socket leak
sctp: fix missing wake ups in some situations
netfilter: xt_hashlimit: fix build error caused by 64bit division
netfilter: xt_hashlimit: alloc hashtable with right size
...
Merge more updates from Andrew Morton:
- most of the rest of MM
- a small number of misc things
- lib/ updates
- checkpatch
- autofs updates
- ipc/ updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (126 commits)
ipc: optimize semget/shmget/msgget for lots of keys
ipc/sem: play nicer with large nsops allocations
ipc/sem: drop sem_checkid helper
ipc: convert kern_ipc_perm.refcount from atomic_t to refcount_t
ipc: convert sem_undo_list.refcnt from atomic_t to refcount_t
ipc: convert ipc_namespace.count from atomic_t to refcount_t
kcov: support compat processes
sh: defconfig: cleanup from old Kconfig options
mn10300: defconfig: cleanup from old Kconfig options
m32r: defconfig: cleanup from old Kconfig options
drivers/pps: use surrounding "if PPS" to remove numerous dependency checks
drivers/pps: aesthetic tweaks to PPS-related content
cpumask: make cpumask_next() out-of-line
kmod: move #ifdef CONFIG_MODULES wrapper to Makefile
kmod: split off umh headers into its own file
MAINTAINERS: clarify kmod is just a kernel module loader
kmod: split out umh code into its own file
test_kmod: flip INT checks to be consistent
test_kmod: remove paranoid UINT_MAX check on uint range processing
vfat: deduplicate hex2bin()
...
I removed all the gperf use, but not the Makefile rules. Sam Ravnborg
says I get bonus points for cleaning this up. I'll hold him to it.
Requested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's pretty much a duplicate of nfs_scan_commit_list() that also
clears the PG_COMMIT_TO_DS flag.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Since the commit list is not ordered, it is possible for nfs_scan_commit_list
to hold a request that nfs_lock_and_join_requests() is waiting for, while
at the same time trying to grab a request that nfs_lock_and_join_requests
already holds.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This reverts commit 1fccb73011.
Reported as Bug 196509 - iTCO_wdt regression reboot before timeout expire
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The kernel watchdog is a great debugging tool for finding tasks that
consume a disproportionate amount of CPU time in contiguous chunks. One
can imagine building a similar watchdog for arbitrary driver threads
using save_stack_trace_tsk() and print_stack_trace(). However, this is
not viable for dynamically loaded driver modules on ARM platforms
because save_stack_trace_tsk() is not exported for those architectures.
Export save_stack_trace_tsk() for the ARM architecture to align with
x86 and support various debugging use cases such as arbitrary driver
thread watchdog timers.
Signed-off-by: Dustin Brown <dustinb@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Differ between illegal XDP action code and just driver
unsupported one to provide better feedback when we throw
a one-time warning here. Reason is that with 814abfabef
("xdp: add bpf_redirect helper function") not all drivers
support the new XDP return code yet and thus they will
fall into their 'default' case when checking for return
codes after program return, which then triggers a
bpf_warn_invalid_xdp_action() stating that the return
code is illegal, but from XDP perspective it's not.
I decided not to place something like a XDP_ACT_MAX define
into uapi i) given we don't have this either for all other
program types, ii) future action codes could have further
encoding there, which would render such define unsuitable
and we wouldn't be able to rip it out again, and iii) we
rarely add new action codes.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 95b80bf3db ("mdio_bus:
Remove unneeded gpiod NULL check"), this commit assumed that GPIOLIB
checks for NULL descriptors, so it's safe to drop them, but it is not
when CONFIG_GPIOLIB is disabled in the kernel. If we do call
gpiod_set_value_cansleep() on a GPIO descriptor we will issue warnings
coming from the inline stubs declared in include/linux/gpio/consumer.h.
Fixes: 95b80bf3db ("mdio_bus: Remove unneeded gpiod NULL check")
Reported-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend says:
====================
net: Fixes for XDP/BPF
The following fixes, UAPI updates, and small improvement,
i. XDP needs to be called inside RCU with preempt disabled.
ii. Not strictly a bug fix but we have an attach command in the
sockmap UAPI already to avoid having a single kernel released with
only the attach and not the detach I'm pushing this into net branch.
Its early in the RC cycle so I think this is OK (not ideal but better
than supporting a UAPI with a missing detach forever).
iii. Final patch replace cpu_relax with cond_resched in devmap.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Be a bit more friendly about waiting for flush bits to complete.
Replace the cpu_relax() with a cond_resched().
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bpf map sockmap supports adding programs via attach commands. This
patch adds the detach command to keep the API symmetric and allow
users to remove previously added programs. Otherwise the user would
have to delete the map and re-add it to get in this state.
This also adds a series of additional tests to capture detach operation
and also attaching/detaching invalid prog types.
API note: socks will run (or not run) programs depending on the state
of the map at the time the sock is added. We do not for example walk
the map and remove programs from previously attached socks.
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
do_xdp_generic must be called inside rcu critical section with preempt
disabled to ensure BPF programs are valid and per-cpu variables used
for redirect operations are consistent. This patch ensures this is true
and fixes the splat below.
The netif_receive_skb_internal() code path is now broken into two rcu
critical sections. I decided it was better to limit the preempt_enable/disable
block to just the xdp static key portion and the fallout is more
rcu_read_lock/unlock calls. Seems like the best option to me.
[ 607.596901] =============================
[ 607.596906] WARNING: suspicious RCU usage
[ 607.596912] 4.13.0-rc4+ #570 Not tainted
[ 607.596917] -----------------------------
[ 607.596923] net/core/dev.c:3948 suspicious rcu_dereference_check() usage!
[ 607.596927]
[ 607.596927] other info that might help us debug this:
[ 607.596927]
[ 607.596933]
[ 607.596933] rcu_scheduler_active = 2, debug_locks = 1
[ 607.596938] 2 locks held by pool/14624:
[ 607.596943] #0: (rcu_read_lock_bh){......}, at: [<ffffffff95445ffd>] ip_finish_output2+0x14d/0x890
[ 607.596973] #1: (rcu_read_lock_bh){......}, at: [<ffffffff953c8e3a>] __dev_queue_xmit+0x14a/0xfd0
[ 607.597000]
[ 607.597000] stack backtrace:
[ 607.597006] CPU: 5 PID: 14624 Comm: pool Not tainted 4.13.0-rc4+ #570
[ 607.597011] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS A17 03/01/2017
[ 607.597016] Call Trace:
[ 607.597027] dump_stack+0x67/0x92
[ 607.597040] lockdep_rcu_suspicious+0xdd/0x110
[ 607.597054] do_xdp_generic+0x313/0xa50
[ 607.597068] ? time_hardirqs_on+0x5b/0x150
[ 607.597076] ? mark_held_locks+0x6b/0xc0
[ 607.597088] ? netdev_pick_tx+0x150/0x150
[ 607.597117] netif_rx_internal+0x205/0x3f0
[ 607.597127] ? do_xdp_generic+0xa50/0xa50
[ 607.597144] ? lock_downgrade+0x2b0/0x2b0
[ 607.597158] ? __lock_is_held+0x93/0x100
[ 607.597187] netif_rx+0x119/0x190
[ 607.597202] loopback_xmit+0xfd/0x1b0
[ 607.597214] dev_hard_start_xmit+0x127/0x4e0
Fixes: d445516966 ("net: xdp: support xdp generic on virtual devices")
Fixes: b5cdae3291 ("net: Generic XDP")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can potentially run into a couple of issues with the XDP
bpf_redirect_map() helper. The ri->map in the per CPU storage
can become stale in several ways, mostly due to misuse, where
we can then trigger a use after free on the map:
i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
and running on a driver not supporting XDP_REDIRECT yet. The
ri->map on that CPU becomes stale when the XDP program is unloaded
on the driver, and a prog B loaded on a different driver which
supports XDP_REDIRECT return code. prog B would have to omit
calling to bpf_redirect_map() and just return XDP_REDIRECT, which
would then access the freed map in xdp_do_redirect() since not
cleared for that CPU.
ii) prog A is calling bpf_redirect_map(), returning a code other
than XDP_REDIRECT. prog A is then detached, which triggers release
of the map. prog B is attached which, similarly as in i), would
just return XDP_REDIRECT without having called bpf_redirect_map()
and thus be accessing the freed map in xdp_do_redirect() since
not cleared for that CPU.
iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
currently not handling ri->map (will be fixed by Jesper), so it's
not being reset. Later loading a e.g. native prog B which would,
say, call bpf_xdp_redirect() and then returns XDP_REDIRECT would
find in xdp_do_redirect() that a map was set and uses that causing
use after free on map access.
Fix thus needs to avoid accessing stale ri->map pointers, naive
way would be to call a BPF function from drivers that just resets
it to NULL for all XDP return codes but XDP_REDIRECT and including
XDP_REDIRECT for drivers not supporting it yet (and let ri->map
being handled in xdp_do_generic_redirect()). There is a less
intrusive way w/o letting drivers call a reset for each BPF run.
The verifier knows we're calling into bpf_xdp_redirect_map()
helper, so it can do a small insn rewrite transparent to the prog
itself in the sense that it fills R4 with a pointer to the own
bpf_prog. We have that pointer at verification time anyway and
R4 is allowed to be used as per calling convention we scratch
R0 to R5 anyway, so they become inaccessible and program cannot
read them prior to a write. Then, the helper would store the prog
pointer in the current CPUs struct redirect_info. Later in
xdp_do_*_redirect() we check whether the redirect_info's prog
pointer is the same as passed xdp_prog pointer, and if that's
the case then all good, since the prog holds a ref on the map
anyway, so it is always valid at that point in time and must
have a reference count of at least 1. If in the unlikely case
they are not equal, it means we got a stale pointer, so we clear
and bail out right there. Also do reset map and the owning prog
in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
bpf_xdp_redirect() won't get mixed up, only the last call should
take precedence. A tc bpf_redirect() doesn't use map anywhere
yet, so no need to clear it there since never accessed in that
layer.
Note that in case the prog is released, and thus the map as
well we're still under RCU read critical section at that time
and have preemption disabled as well. Once we commit with the
__dev_map_insert_ctx() from xdp_do_redirect_map() and set the
map to ri->map_to_flush, we still wait for a xdp_do_flush_map()
to finish in devmap dismantle time once flush_needed bit is set,
so that is fine.
Fixes: 97f91a7cf0 ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't remove rx_napi specifically just before free_netdev(),
it's supposed to be done in it and is confusing w/o tx_napi deletion.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will make it explicit some messages are of the form:
dm9000_dbg(db, 5, ...
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the Rx rate select control gpio property name match the documented
binding. This would make the addition of 'rate-select1-gpios' for SFP+
support more natural.
Also, make the MOD-DEF0 gpio property name match the documentation.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add device-tree binding documentation SFP transceivers. Support for SFP
transceivers has been recently introduced (drivers/net/phy/sfp.c).
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
This complements commit 9a94b3a4bd (dt-binding: phy: don't confuse with
Ethernet phy properties).
The generic PHY 'phys' property sometime appears in the same node with
the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in
ethernet.txt to reduce confusion.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to vxlan/geneve tunnel, if hop_limit is zero, it should fall
back to ip6_dst_hoplimt().
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ttl and tos variables are declared and assigned, but are not used in
iptunnel_xmit() function.
Fixes: cfc7381b30 ("ip_tunnel: add collect_md mode to IPIP tunnel")
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add zstd compression and decompression support to SquashFS. zstd is a
great fit for SquashFS because it can compress at ratios approaching xz,
while decompressing twice as fast as zlib. For SquashFS in particular,
it can decompress as fast as lzo and lz4. It also has the flexibility
to turn down the compression ratio for faster compression times.
The compression benchmark is run on the file tree from the SquashFS archive
found in ubuntu-16.10-desktop-amd64.iso [1]. It uses `mksquashfs` with the
default block size (128 KB) and and various compression algorithms/levels.
xz and zstd are also benchmarked with 256 KB blocks. The decompression
benchmark times how long it takes to `tar` the file tree into `/dev/null`.
See the benchmark file in the upstream zstd source repository located under
`contrib/linux-kernel/squashfs-benchmark.sh` [2] for details.
I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD.
| Method | Ratio | Compression MB/s | Decompression MB/s |
|----------------|-------|------------------|--------------------|
| gzip | 2.92 | 15 | 128 |
| lzo | 2.64 | 9.5 | 217 |
| lz4 | 2.12 | 94 | 218 |
| xz | 3.43 | 5.5 | 35 |
| xz 256 KB | 3.53 | 5.4 | 40 |
| zstd 1 | 2.71 | 96 | 210 |
| zstd 5 | 2.93 | 69 | 198 |
| zstd 10 | 3.01 | 41 | 225 |
| zstd 15 | 3.13 | 11.4 | 224 |
| zstd 16 256 KB | 3.24 | 8.1 | 210 |
This patch was written by Sean Purcell <me@seanp.xyz>, but I will be
taking over the submission process.
[1] http://releases.ubuntu.com/16.10/
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/squashfs-benchmark.sh
zstd source repository: https://github.com/facebook/zstd
Signed-off-by: Sean Purcell <me@seanp.xyz>
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Acked-by: Phillip Lougher <phillip@squashfs.org.uk>
The writeback code wants to send a commit after processing the pages,
which is why we want to delay releasing the struct path until after
that's done.
Also, the layout code expects that we do not free the inode before
we've put the layout segments in pnfs_writehdr_free() and
pnfs_readhdr_free()
Fixes: 919e3bd9a8 ("NFS: Ensure we commit after writeback is complete")
Fixes: 4714fb51fd ("nfs: remove pgio_header refcount, related cleanup")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
ipc_findkey() used to scan all objects to look for the wanted key. This
is slow when using a high number of keys. This change adds an rhashtable
of kern_ipc_perm objects in ipc_ids, so that one lookup cease to be O(n).
This change gives a 865% improvement of benchmark reaim.jobs_per_min on a
56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory [1]
Other (more micro) benchmark results, by the author: On an i5 laptop, the
following loop executed right after a reboot took, without and with this
change:
for (int i = 0, k=0x424242; i < KEYS; ++i)
semget(k++, 1, IPC_CREAT | 0600);
total total max single max single
KEYS without with call without call with
1 3.5 4.9 µs 3.5 4.9
10 7.6 8.6 µs 3.7 4.7
32 16.2 15.9 µs 4.3 5.3
100 72.9 41.8 µs 3.7 4.7
1000 5,630.0 502.0 µs * *
10000 1,340,000.0 7,240.0 µs * *
31900 17,600,000.0 22,200.0 µs * *
*: unreliable measure: high variance
The duration for a lookup-only usage was obtained by the same loop once
the keys are present:
total total max single max single
KEYS without with call without call with
1 2.1 2.5 µs 2.1 2.5
10 4.5 4.8 µs 2.2 2.3
32 13.0 10.8 µs 2.3 2.8
100 82.9 25.1 µs * 2.3
1000 5,780.0 217.0 µs * *
10000 1,470,000.0 2,520.0 µs * *
31900 17,400,000.0 7,810.0 µs * *
Finally, executing each semget() in a new process gave, when still
summing only the durations of these syscalls:
creation:
total total
KEYS without with
1 3.7 5.0 µs
10 32.9 36.7 µs
32 125.0 109.0 µs
100 523.0 353.0 µs
1000 20,300.0 3,280.0 µs
10000 2,470,000.0 46,700.0 µs
31900 27,800,000.0 219,000.0 µs
lookup-only:
total total
KEYS without with
1 2.5 2.7 µs
10 25.4 24.4 µs
32 106.0 72.6 µs
100 591.0 352.0 µs
1000 22,400.0 2,250.0 µs
10000 2,510,000.0 25,700.0 µs
31900 28,200,000.0 115,000.0 µs
[1] http://lkml.kernel.org/r/20170814060507.GE23258@yexl-desktop
Link: http://lkml.kernel.org/r/20170815194954.ck32ta2z35yuzpwp@debix
Signed-off-by: Guillaume Knispel <guillaume.knispel@supersonicimagine.com>
Reviewed-by: Marc Pardo <marc.pardo@supersonicimagine.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Guillaume Knispel <guillaume.knispel@supersonicimagine.com>
Cc: Marc Pardo <marc.pardo@supersonicimagine.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>