Commit Graph

708440 Commits

Author SHA1 Message Date
Kees Cook abec4be3ee net: ethernet: stmmac: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by:  Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook 80c5a20b53 pcmcia/electra_cf: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-pcmcia@lists.infradead.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook 41fce7034b net: tulip: de2104x: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: "yuval.shaia@oracle.com" <yuval.shaia@oracle.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Jarod Wilson <jarod@redhat.com>
Cc: Philippe Reynes <tremyfr@gmail.com>
Cc: netdev@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook eb8c6b5b44 ethernet/broadcom: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
helper to pass the timer pointer explicitly.

Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jarod Wilson <jarod@redhat.com>
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook c3aed70953 xfrm: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
helper to pass the timer pointer explicitly.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook 8e763de0b9 net/hamradio/6pack: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Andreas Koensgen <ajk@comnets.uni-bremen.de>
Cc: linux-hams@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook 5e8b824d91 isdn/hisax: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: Geliang Tang <geliangtang@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook d8eb7e262d net/wireless/ray_cs: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook 2183c1a61c net/usb/usbnet: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. Since the callback is called from
both a timer and a tasklet, adjust the tasklet to pass the timer address
too. When tasklets have their .data field removed, this can be refactored
to call a central function after resolving the correct container_of() for a
separate callback function for timer and tasklet.

Cc: Oliver Neukum <oneukum@suse.com>
Cc: netdev@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:37 +01:00
Kees Cook 0010e3f8b3 net/ti/tlan: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Samuel Chessman <chessman@tux.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:36 +01:00
Kees Cook 4966babd90 net/rose: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-hams@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:36 +01:00
Kees Cook 83a37b3292 net/lapb: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: "Reshetova, Elena" <elena.reshetova@intel.com>
Cc: linux-x25@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:36 +01:00
Kees Cook eb4ddaf474 net/decnet: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
Cc: linux-decnet-user@lists.sourceforge.net
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:39:36 +01:00
David S. Miller 1bbc728988 Merge branch 'dsa-master-and-slave-helpers'
Vivien Didelot says:

====================
net: dsa: master and slave helpers

This patch series adds a few helpers to DSA core for clarity and
readability but brings no functional changes.

A dsa_slave_notify helper calls the DSA notifiers when (un)registering a
slave device.

Most of the DSA slave code only needs to access the dsa_port structure,
not the dsa_slave_priv (which only contains a few PHY-specific members).
Thus a dsa_slave_to_port helper returns a dsa_port structure of a slave
device.

A dsa_slave_to_master returns the master device of a slave device.

After that the netdev member of the dsa_port structure is split into two
explicit master and slave members to avoid confusion, and a dsa_to_port
helper is added for switch drivers to get a const reference to a port.

Changes in v2:
  - prefer dsa_slave_to_master instead of dsa_slave_get_master
  - rename dsa_master_get_slave to dsa_master_find_slave
  - pack master and slave net devices into an anonymous union
  - add dsa_to_port public helper for switch drivers
  - add Reviewed-by tags from Florian
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:34 +01:00
Vivien Didelot c8652c83bc net: dsa: add dsa_to_port helper
The dsa_port structure is part of DSA core data and must only be updated
by the later. It is OK and sometimes necessary for the DSA drivers to
access this data, but this has to be read only.

For that purpose, add a dsa_to_port() helper which returns a const
pointer to a dsa_port structure which must be used by DSA drivers from
now on instead of digging into ds->ports[] themselves.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:33 +01:00
Vivien Didelot f8b8b1cd5a net: dsa: split dsa_port's netdev member
The dsa_port structure has a "netdev" member, which can be used for
either the master device, or the slave device, depending on its type.

It is true that today, CPU port are not exposed to userspace, thus the
port's netdev member can be used to point to its master interface.

But it is still slightly confusing, so split it into more explicit
"master" and "slave" members inside an anonymous union.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:33 +01:00
Vivien Didelot 2231c43b56 net: dsa: rename dsa_master_get_slave
The dsa_master_get_slave is slightly confusing since the idiomatic "get"
term often suggests reference counting, in symmetry to "put".

Rename it to dsa_master_find_slave to make the look up operation clear.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:33 +01:00
Vivien Didelot d0006b0022 net: dsa: add slave to master helper
Many part of the DSA slave code require to get the master device
assigned to a slave device. Remove dsa_master_netdev() in favor of a
dsa_slave_to_master() helper which does that.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:33 +01:00
Vivien Didelot d945097bb1 net: dsa: add slave to port helper
Many portions of DSA core code require to get the dsa_port structure
corresponding to a slave net_device. For this purpose, introduce a
dsa_slave_to_port() helper.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:33 +01:00
Vivien Didelot 6158eaa7a7 net: dsa: add slave notify helper
Both DSA slave create and destroy functions call call_dsa_notifiers with
respectively DSA_PORT_REGISTER and DSA_PORT_UNREGISTER and the same
dsa_notifier_register_info structure.

Wrap this in a dsa_slave_notify helper so prevent cluttering these
functions.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:33 +01:00
Vivien Didelot a5b930e059 net: dsa: use port's cpu_dp when creating a slave
When dsa_slave_create is called, the related port already has a CPU port
assigned to it, available in its cpu_dp member. Use it instead of the
unique tree cpu_dp.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:24:33 +01:00
Johannes Berg a2084f5650 netlink: use NETLINK_CB(in_skb).sk instead of looking it up
When netlink_ack() reports an allocation error to the sending
socket, there's no need to look up the sending socket since
it's available in the SKB's CB. Use that instead of going to
the trouble of looking it up.

Note that the pointer is only available since Eric Biederman's
commit 3fbc290540 ("netlink: Make the sending netlink socket availabe in NETLINK_CB")
which is far newer than the original lookup code (Oct 2003)
(though the field was called 'ssk' in that commit and only got
renamed to 'sk' later, I'd actually argue 'ssk' was better - or
perhaps it should've been 'source_sk' - since there are so many
different 'sk's involved.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:20:13 +01:00
David S. Miller 452606d6c9 Merge branch 'bpf-cpumap-type-for-XDP_REDIRECT'
Jesper Dangaard Brouer says:

====================
net: New bpf cpumap type for XDP_REDIRECT

Introducing a new way to redirect XDP frames.  Notice how no driver
changes are necessary given the design of XDP_REDIRECT.

This redirect map type is called 'cpumap', as it allows redirection
XDP frames to remote CPUs.  The remote CPU will do the SKB allocation
and start the network stack invocation on that CPU.

This is a scalability and isolation mechanism, that allow separating
the early driver network XDP layer, from the rest of the netstack, and
assigning dedicated CPUs for this stage.  The sysadm control/configure
the RX-CPU to NIC-RX queue (as usual) via procfs smp_affinity and how
many queues are configured via ethtool --set-channels.  Benchmarks
show that a single CPU can handle approx 11Mpps.  Thus, only assigning
two NIC RX-queues (and two CPUs) is sufficient for handling 10Gbit/s
wirespeed smallest packet 14.88Mpps.  Reducing the number of queues
have the advantage that more packets being "bulk" available per hard
interrupt[1].

[1] https://www.netdevconf.org/2.1/papers/BusyPollingNextGen.pdf

Use-cases:

1. End-host based pre-filtering for DDoS mitigation.  This is fast
   enough to allow software to see and filter all packets wirespeed.
   Thus, no packets getting silently dropped by hardware.

2. Given NIC HW unevenly distributes packets across RX queue, this
   mechanism can be used for redistribution load across CPUs.  This
   usually happens when HW is unaware of a new protocol.  This
   resembles RPS (Receive Packet Steering), just faster, but with more
   responsibility placed on the BPF program for correct steering.

3. Auto-scaling or power saving via only activating the appropriate
   number of remote CPUs for handling the current load.  The cpumap
   tracepoints can function as a feedback loop for this purpose.

In V7, a --stress-mode was implemented for the samples program, which
between each stats update, adds + removes CPUs from the map
concurrently with traffic.  I did find and fix some concurrency issues
in the tear-down path, details in patch desc.  The stress test have
now been running for 15 hours without any issues, while being
bombarded with 11.6 Mpps via pktgen_sample04_many_flows.sh.

See individual patches for patchset-version changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:19 +01:00
Jesper Dangaard Brouer fad3917e36 samples/bpf: add cpumap sample program xdp_redirect_cpu
This sample program show how to use cpumap and the associated
tracepoints.

It provides command line stats, which shows how the XDP-RX process,
cpumap-enqueue and cpumap kthread dequeue is cooperating on a per CPU
basis.  It also utilize the xdp_exception and xdp_redirect_err
transpoints to allow users quickly to identify setup issues.

One issue with ixgbe driver is that the driver reset the link when
loading XDP.  This reset the procfs smp_affinity settings.  Thus,
after loading the program, these must be reconfigured.  The easiest
workaround it to reduce the RX-queue to e.g. two via:

 # ethtool --set-channels ixgbe1 combined 2

And then add CPUs above 0 and 1, like:

 # xdp_redirect_cpu --dev ixgbe1 --prog 2 --cpu 2 --cpu 3 --cpu 4

Another issue with ixgbe is that the page recycle mechanism is tied to
the RX-ring size.  And the default setting of 512 elements is too
small.  This is the same issue with regular devmap XDP_REDIRECT.
To overcome this I've been using 1024 rx-ring size:

 # ethtool -G ixgbe1 rx 1024 tx 1024

V3:
 - whitespace cleanups
 - bpf tracepoint cannot access top part of struct

V4:
 - report on kthread sched events, according to tracepoint change
 - report average bulk enqueue size

V5:
 - bpf_map_lookup_elem on cpumap not allowed from bpf_prog
   use separate map to mark CPUs not available

V6:
 - correct kthread sched summary output

V7:
 - Added a --stress-mode for concurrently changing underlying cpumap

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:18 +01:00
Jesper Dangaard Brouer f9419f7bd7 bpf: cpumap add tracepoints
This adds two tracepoint to the cpumap.  One for the enqueue side
trace_xdp_cpumap_enqueue() and one for the kthread dequeue side
trace_xdp_cpumap_kthread().

To mitigate the tracepoint overhead, these are invoked during the
enqueue/dequeue bulking phases, thus amortizing the cost.

The obvious use-cases are for debugging and monitoring.  The
non-intuitive use-case is using these as a feedback loop to know the
system load.  One can imagine auto-scaling by reducing, adding or
activating more worker CPUs on demand.

V4: tracepoint remove time_limit info, instead add sched info

V8: intro struct bpf_cpu_map_entry members cpu+map_id in this patch

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:18 +01:00
Jesper Dangaard Brouer 1c601d829a bpf: cpumap xdp_buff to skb conversion and allocation
This patch makes cpumap functional, by adding SKB allocation and
invoking the network stack on the dequeuing CPU.

For constructing the SKB on the remote CPU, the xdp_buff in converted
into a struct xdp_pkt, and it mapped into the top headroom of the
packet, to avoid allocating separate mem.  For now, struct xdp_pkt is
just a cpumap internal data structure, with info carried between
enqueue to dequeue.

If a driver doesn't have enough headroom it is simply dropped, with
return code -EOVERFLOW.  This will be picked up the xdp tracepoint
infrastructure, to allow users to catch this.

V2: take into account xdp->data_meta

V4:
 - Drop busypoll tricks, keeping it more simple.
 - Skip RPS and Generic-XDP-recursive-reinjection, suggested by Alexei

V5: correct RCU read protection around __netif_receive_skb_core.

V6: Setting TASK_RUNNING vs TASK_INTERRUPTIBLE based on talk with Rik van Riel

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:18 +01:00
Jesper Dangaard Brouer 9c270af37b bpf: XDP_REDIRECT enable use of cpumap
This patch connects cpumap to the xdp_do_redirect_map infrastructure.

Still no SKB allocation are done yet.  The XDP frames are transferred
to the other CPU, but they are simply refcnt decremented on the remote
CPU.  This served as a good benchmark for measuring the overhead of
remote refcnt decrement.  If driver page recycle cache is not
efficient then this, exposes a bottleneck in the page allocator.

A shout-out to MST's ptr_ring, which is the secret behind is being so
efficient to transfer memory pointers between CPUs, without constantly
bouncing cache-lines between CPUs.

V3: Handle !CONFIG_BPF_SYSCALL pointed out by kbuild test robot.

V4: Make Generic-XDP aware of cpumap type, but don't allow redirect yet,
 as implementation require a separate upstream discussion.

V5:
 - Fix a maybe-uninitialized pointed out by kbuild test robot.
 - Restrict bpf-prog side access to cpumap, open when use-cases appear
 - Implement cpu_map_enqueue() as a more simple void pointer enqueue

V6:
 - Allow cpumap type for usage in helper bpf_redirect_map,
   general bpf-prog side restriction moved to earlier patch.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:18 +01:00
Jesper Dangaard Brouer 6710e11269 bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP
The 'cpumap' is primarily used as a backend map for XDP BPF helper
call bpf_redirect_map() and XDP_REDIRECT action, like 'devmap'.

This patch implement the main part of the map.  It is not connected to
the XDP redirect system yet, and no SKB allocation are done yet.

The main concern in this patch is to ensure the datapath can run
without any locking.  This adds complexity to the setup and tear-down
procedure, which assumptions are extra carefully documented in the
code comments.

V2:
 - make sure array isn't larger than NR_CPUS
 - make sure CPUs added is a valid possible CPU

V3: fix nitpicks from Jakub Kicinski <kubakici@wp.pl>

V5:
 - Restrict map allocation to root / CAP_SYS_ADMIN
 - WARN_ON_ONCE if queue is not empty on tear-down
 - Return -EPERM on memlock limit instead of -ENOMEM
 - Error code in __cpu_map_entry_alloc() also handle ptr_ring_cleanup()
 - Moved cpu_map_enqueue() to next patch

V6: all notice by Daniel Borkmann
 - Fix err return code in cpu_map_alloc() introduced in V5
 - Move cpu_possible() check after max_entries boundary check
 - Forbid usage initially in check_map_func_compatibility()

V7:
 - Fix alloc error path spotted by Daniel Borkmann
 - Did stress test adding+removing CPUs from the map concurrently
 - Fixed refcnt issue on cpu_map_entry, kthread started too soon
 - Make sure packets are flushed during tear-down, involved use of
   rcu_barrier() and kthread_run only exit after queue is empty
 - Fix alloc error path in __cpu_map_entry_alloc() for ptr_ring

V8:
 - Nitpicking comments and gramma by Edward Cree
 - Fix missing semi-colon introduced in V7 due to rebasing
 - Move struct bpf_cpu_map_entry members cpu+map_id to tracepoint patch

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:18 +01:00
Joel Stanley 4b70c62b9e net: ftgmac100: Request clock and set speed
According to the ASPEED datasheet, gigabit speeds require a clock of
100MHz or higher. Other speeds require 25MHz or higher. This patch
configures a 100MHz clock if the system has a direct-attached
PHY, or 25MHz if the system is running NC-SI which is limited to 100MHz.

There appear to be no other upstream users of the FTGMAC100 driver it is
hard to know the clocking requirements of other platforms. Therefore a
conservative approach was taken with enabling clocks. If the platform is
not ASPEED, both requesting the clock and configuring the speed is
skipped.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Tested-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:09:06 +01:00
David Howells bc5e3a546d rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals
Make AF_RXRPC accept MSG_WAITALL as a flag to sendmsg() to tell it to
ignore signals whilst loading up the message queue, provided progress is
being made in emptying the queue at the other side.

Progress is defined as the base of the transmit window having being
advanced within 2 RTT periods.  If the period is exceeded with no progress,
sendmsg() will return anyway, indicating how much data has been copied, if
any.

Once the supplied buffer is entirely decanted, the sendmsg() will return.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-10-18 11:43:07 +01:00
David Howells f4d15fb6f9 rxrpc: Provide functions for allowing cleaner handling of signals
Provide a couple of functions to allow cleaner handling of signals in a
kernel service.  They are:

 (1) rxrpc_kernel_get_rtt()

     This allows the kernel service to find out the RTT time for a call, so
     as to better judge how large a timeout to employ.

     Note, though, that whilst this returns a value in nanoseconds, the
     timeouts can only actually be in jiffies.

 (2) rxrpc_kernel_check_life()

     This returns a number that is updated when ACKs are received from the
     peer (notably including PING RESPONSE ACKs which we can elicit by
     sending PING ACKs to see if the call still exists on the server).

     The caller should compare the numbers of two calls to see if the call
     is still alive.

These can be used to provide an extending timeout rather than returning
immediately in the case that a signal occurs that would otherwise abort an
RPC operation.  The timeout would be extended if the server is still
responsive and the call is still apparently alive on the server.

For most operations this isn't that necessary - but for FS.StoreData it is:
OpenAFS writes the data to storage as it comes in without making a backup,
so if we immediately abort it when partially complete on a CTRL+C, say, we
have no idea of the state of the file after the abort.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-10-18 11:42:48 +01:00
David Howells a68f4a27f5 rxrpc: Support service upgrade from a kernel service
Provide support for a kernel service to make use of the service upgrade
facility.  This involves:

 (1) Pass an upgrade request flag to rxrpc_kernel_begin_call().

 (2) Make rxrpc_kernel_recv_data() return the call's current service ID so
     that the caller can detect service upgrade and see what the service
     was upgraded to.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-10-18 11:37:20 +01:00
Johannes Berg 3c798a4531 iwlwifi: pcie: remove set but not used variable tcph
This variable is never used, so remove the code to set it.
After this, the variable 'iph' also has the same fate.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:02:01 +03:00
Luca Coelho 1105a33737 iwlwifi: pcie: sort IDs for the 9000 series for easier comparisons
It's hard to find values that are missing in the list, so sorting the
values and comparing them makes it much easier.  To simplify this
task, sort the devices in the list.

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:02:00 +03:00
Liad Kaufman 41fd2fec56 iwlwifi: mvm: add missing lq_color
In the compressed BA notif, the driver didn't parse out
the LQ color, so statistics for the rates tried were
always thrown out. Add it so it gets correctly used.

While at it, fix the name of the relevant field in the
struct.

Fixes: c46e7724bf ("iwlwifi: mvm: support new BA notification response")
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:02:00 +03:00
Luca Coelho 3485e76e73 iwlwifi: define minimum valid address for umac_error_event_table in cfg
We now have two different minimum valid values for
umac_error_event_table.  To avoid hardcoding the minimum value in the
driver, add a value to cfg where it can be read from.

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:01:52 +03:00
Luca Coelho fb5b28469d iwlwifi: mvm: move umac_error_event_table validity check to where it's set
There's no point in checking the validity of the
umac_error_event_table pointer every time we generate a dump.  It's
cleaner to do so when we read the value, namely when we receive the
alive data.

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:00:44 +03:00
Beni Lev 0e1be40a45 iwlwifi: mvm: allow reading UMAC error data from SMEM in A000 devices
Currently, UMAC error data reading is restricted to DCCM.
A000 NICs use SMEM for this data.

Signed-off-by: Beni Lev <beni.lev@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:00:44 +03:00
Johannes Berg 76f4a85e1d iwlwifi: mvm: pass baid_data to iwl_mvm_release_frames()
All callers of iwl_mvm_release_frames() already have the baid_data
pointer, so we don't need to (re)calculate it inside the function.
Just pass it instead.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:00:44 +03:00
Sara Sharon 3f1c4c5806 iwlwifi: mvm: remove duplicated fields in mvm reorder buffer
The reason station id and tid fields are both in baid data and
in the reorder buffer per queue is that we couldn't access the
baid_data in the reorder timer functions.
Now that we do some pointer math and access it anyway, those
fields can be removed.
This save some space and some code.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:00:43 +03:00
Johannes Berg dfdddd92a5 iwlwifi: mvm: allocate reorder buffer according to need
Now that we may have up to 256 entries per reorder buffer, and possibly up
to 16 queues, we can use a LOT of memory for this (64k for each station).
Allocate it according to what we need, which is of course much less for HT
stations (only 16k at a max of 16 queues).

However, this comes at the expense of complicating the code a bit to
calculate the right entry structure to use for each frame.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2017-10-18 13:00:43 +03:00
Alan Brady 6c32e0d9fd i40e: fix u64 division usage
Commit 52eb1ff93e98 ("i40e: Add support setting TC max bandwidth rates")
and commit 1ea6f21ae530 ("i40e: Refactor VF BW rate limiting") add some
needed functionality for TC bandwidth rate limiting.  Unfortunately they
introduce several usages of unsigned 64-bit division which needs to be
handled special by the kernel to support all architectures.

Fixes: 52eb1ff93e98 ("i40e: Add support setting TC max bandwidth
rates")
Fixes: 1ea6f21ae530 ("i40e: Refactor VF BW rate limiting")

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:52 -07:00
Alan Brady cee919959b i40e: convert i40e_set_link_ksettings to new API
This finishes off the conversion to the new ethtool API by removing the
old macros being used in i40e_set_link_ksettings and replacing them with
shiny new ones.

This conversion also allows us to provide link speed support for new 25G
and 10G macros which is included here as well.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00
Alan Brady 636b62d778 i40e: rename 'change' variable to 'autoneg_changed'
This variable isn't actually very descriptive and makes the code a bit
confusing as to what it is being used for.  This patch enhances the
variable with the longer name, 'autoneg_changed', which makes it clear
we are concerned with autoneg changing in this context.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00
Alan Brady 79f04a3aba i40e: convert i40e_get_settings_link_up to new API
This removes references to old ethtool API macros and functions in
i40e_get_settings_link_up as part of the process of converting to the
new API.  The new API also allows us to provide more explicit support
for new 25G and 10G PHY types so some of the PHY types have been
adjusted where necessary as well.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00
Alan Brady 1eaae5198e i40e: convert i40e_phy_type_to_ethtool to new API
We are still largely using the old ethtool API macros.  This is
problematic because eventually they will be removed and they only
support 32 bits of PHY types.

This overhauls i40e_phy_type_to_ethtool to use only the new API.  Doing
this also allows us to provide much better support for newer 25G and 10G
PHY types which is included here as well.

The remaining usages of the old ethtool API will be addressed in other
patches in the series.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00
Alan Brady 5a6cd6de76 ethtool: add ethtool_intersect_link_masks
This function provides a way to intersect two link masks together to
find the common ground between them.  For example in i40e, the driver
first generates link masks for what is supported by the PHY type.  The
driver then gets the link masks for what the NVM supports.  The
resulting intersection between them yields what can truly be supported.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00
Sudheer Mogilappagari 211b4c140a i40e: Add new PHY types for 25G AOC and ACC support
This patch adds support for 25G Active Optical Cables (AOC) and Active
Copper Cables (ACC) PHY types.

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Signed-off-by: Krzysztof Malek <krzysztof.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00
Alan Brady 6987bd25e2 i40e: group autoneg PHY types together
This separates the setting of autoneg in i40e_phy_types_to_ethtool into
its own conditional.  Doing this adds clarity as what PHYs
support/advertise autoneg and makes it easier to add new PHY types in
the future.

This also fixes an issue on devices with CRT_RETIMER where advertising
autoneg was being set, but supported autoneg was not.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00
Alan Brady a03af69f5c i40e: fix whitespace issues in i40e_ethtool.c
There's a number of minor incidental whitespace issues in this file.
This addresses most of the ones I could find.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-17 10:48:51 -07:00