Commit Graph

679514 Commits

Author SHA1 Message Date
David S. Miller d980b8d1fc Merge branch 'MDIO-bus-reset-GPIO-cleanups'
Sergei Shtylyov says:

====================
MDIO bus reset GPIO cleanups

Commit 4c5e7a2c05 ("dt-bindings: mdio: Clarify binding document")
declared that a MDIO reset GPIO property should have only a single GPIO
reference/specifier, however the supporting code was left intact...
Here's a couple of the obvious cleanups to that code:

[1/2] mdio_bus: handle only single PHY reset GPIO
[2/2] mdio_bus: use devm_gpiod_get_optional()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:56:43 -04:00
Sergei Shtylyov fe0e4052fb mdio_bus: use devm_gpiod_get_optional()
The MDIO reset GPIO is really a classical optional GPIO property case,
so devm_gpiod_get_optional() should have been used, not devm_gpiod_get().
Doing this  saves several LoCs...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:56:42 -04:00
Sergei Shtylyov d396e84c56 mdio_bus: handle only single PHY reset GPIO
Commit 4c5e7a2c05 ("dt-bindings: mdio: Clarify binding document")
declared that a MDIO reset GPIO property should have only a single GPIO
reference/specifier, however the supporting code was left intact, still
burdening the kernel with now apparently useless loops -- get rid of them.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:56:42 -04:00
Nathan Fontenot 61d3e1d9bc ibmvnic: Remove netdev notify for failover resets
When handling a driver reset due to a failover of the backing
server on the vios, doing the netdev_notify_peers() can cause
network traffic to stall or halt. Remove the netdev notify call
for failover resets.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:53:36 -04:00
Thomas Falcon 40c9db8ad8 ibmvnic: Client-initiated failover
The IBM vNIC protocol provides support for the user to initiate
a failover from the client LPAR in case the current backing infrastructure
is deemed inadequate or in an error state.

Support for two H_VIOCTL sub-commands for vNIC devices are required
to implement this function. These commands are H_GET_SESSION_TOKEN
and H_SESSION_ERR_DETECTED.

"[H_GET_SESSION_TOKEN] is used to obtain a session token from a VNIC client
adapter.  This token is opaque to the caller and is intended to be used in
tandem with the SESSION_ERROR_DETECTED vioctl subfunction."

"[H_SESSION_ERR_DETECTED] is used to report that the currently active
backing device for a VNIC client adapter is behaving poorly, and that
the hypervisor should attempt to fail over to a different backing device,
if one is available."

To provide tools access to this functionality the vNIC driver creates a
sysfs file that, when written to, will send a request to pHyp to failover
to a different backing device.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:53:35 -04:00
Antoine Ténart 725757aee0 net: mvpp2: enable basic 10G support
On GOP port 0 two MAC modes are available: GMAC and XLG. The XLG MAC is
used for 10G connectivity. This patch adds a basic 10G support by
allowing to use the XLG MAC on port 0 and by reworking the
port_enable/disable functions so that the XLG MAC is configured when
using 10G.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 12:50:43 -04:00
David S. Miller 88efe190f3 Merge branch 'dsa-mv88e6xxx-port-macros-cosmetics'
Vivien Didelot says:

====================
net: dsa: mv88e6xxx: port macros cosmetics

This patch series brings no functional changes.

It prefixes all common port registers macros with MV88E6XXX_PORT.
If registers or some bits differs between switch models, a reference
model is chosen (e.g. MV88E6390_PORT_MAC_CTL_SPEED_10000.)

The register names are documented as found in the datasheets.

Avoid BIT() and shifts defines and prefer a better representation of the
Marvell switch registers with ordered, hexadecimal, 16-bit values.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:13 -04:00
Vivien Didelot b81095947e net: dsa: mv88e6xxx: prefix remaining port macros
For implicit namespacing and clarity, prefix the remaining common Port
Registers macros with MV88E6XXX_PORT.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:12 -04:00
Vivien Didelot 8009df9e70 net: dsa: mv88e6xxx: prefix Port IEEE Priority mapping macros
For implicit namespacing and clarity, prefix the common Port IEEE
Priority Remapping registers macros with MV88E6095_PORT_IEEE_PRIO.

The 88E6390 family turned the 0x18 register into a single indirect
table, document that at the same time.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Also fix the following checkpatch checks with a temporary variable:

    CHECK: Alignment should match open parenthesis
    #65: FILE: drivers/net/dsa/mv88e6xxx/port.c:932:
    +		err = mv88e6xxx_port_ieeepmt_write(chip, port,
    +			   MV88E6390_PORT_IEEE_PRIO_MAP_TABLE_INGRESS_PCP,

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:12 -04:00
Vivien Didelot 2a4614e4ef net: dsa: mv88e6xxx: prefix Port Association Vector macros
For implicit namespacing and clarity, prefix the common Port Association
Vector Register macros with MV88E6XXX_PORT_ASSOC_VECTOR.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:12 -04:00
Vivien Didelot 2cb8cb144e net: dsa: mv88e6xxx: prefix Port Egress Rate Control macros
For implicit namespacing and clarity, prefix the common Port Egress Rate
Control and Port Egress Rate Control 2 registers macros with
MV88E6XXX_PORT_EGRESS_RATE_CTL1 and MV88E6XXX_PORT_EGRESS_RATE_CTL2.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:11 -04:00
Vivien Didelot 81c6edb23b net: dsa: mv88e6xxx: prefix Port Control 2 macros
For implicit namespacing and clarity, prefix the common Port Control 2
Register macros with MV88E6XXX_PORT_CTL2 and the ones which differ
between implementations with a chosen reference model
(e.g. MV88E6095_PORT_CTL2_CPU_PORT_MASK.)

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:11 -04:00
Vivien Didelot b7929fb36d net: dsa: mv88e6xxx: prefix Port Default VLAN macros
For implicit namespacing and clarity, prefix the common Port Default
VLAN Register macros with MV88E6XXX_PORT_DEFAULT_VLAN.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:11 -04:00
Vivien Didelot 7e5cc5f1b5 net: dsa: mv88e6xxx: prefix Port Based VLAN macros
For implicit namespacing and clarity, prefix the common Port Based VLAN
Register macros with MV88E6XXX_PORT_BASE_VLAN.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:10 -04:00
Vivien Didelot cd985bbf9a net: dsa: mv88e6xxx: prefix Port Control 1 macros
For implicit namespacing and clarity, prefix the common Port Control 1
Register macros with MV88E6XXX_PORT_CTL1.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:10 -04:00
Vivien Didelot a89b433bee net: dsa: mv88e6xxx: prefix Port Control macros
For implicit namespacing and clarity, prefix the common Port Control
Register macros with MV88E6XXX_PORT_CTL0 and the ones which differ
between implementations with a chosen reference model
(e.g. MV88E6185_PORT_CTL0_USE_TAG.)

The reason for CTL0 is to make it clear between the badly named
"Port Control", "Port Control 1" and "Port Control 2" registers.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:10 -04:00
Vivien Didelot 107fcc10e8 net: dsa: mv88e6xxx: prefix Port Switch ID macros
For implicit namespacing and clarity, prefix the common Switch ID
Register macros with MV88E6XXX_PORT_SWITCH_ID.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers, this means shifting their values by 4.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:10 -04:00
Vivien Didelot 6c96bbfdd0 net: dsa: mv88e6xxx: prefix Port Jamming macros
For implicit namespacing and clarity, prefix the common Port Jamming
Control Register macros with MV88E6XXX_PORT_JAM_CTL and the ones which
differ between implementations with a chosen reference model
(e.g. MV88E6097_PORT_JAM_CTL.)

The 88E6390 family renamed the register to Flow Control and turned it
into an indirect table. Document that as well.

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:09 -04:00
Vivien Didelot 5ee55577cf net: dsa: mv88e6xxx: prefix Port MAC Control macros
For implicit namespacing and clarity, prefix the common MAC Control
Register macros with MV88E6XXX_PORT_MAC_CTL and the ones which differ
between implementations with a chosen reference model
(e.g. MV88E6065_PORT_MAC_CTL_SPEED_200.)

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:09 -04:00
Vivien Didelot 5f83dc93b2 net: dsa: mv88e6xxx: prefix Port Status macros
For implicit namespacing and clarity, prefix the common Port Status
Register macros with MV88E6XXX_PORT_STS and the ones which differ
between implementations with a chosen reference model
(e.g. MV88E6352_PORT_STS_EEE.)

Document the register and prefer ordered hex masks values for all
Marvell 16-bit registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13 11:23:09 -04:00
Thomas Bogendoerfer a1fa1a00b3 net: phy: marvell: Show complete link partner advertising
Give back all modes advertised by the link partner. This change brings
the marvell phy driver in line with all other phy drivers.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 12:07:36 -04:00
Roopa Prabhu e0090a9e97 vxlan: dont migrate permanent fdb entries during learn
This patch fixes vxlan_snoop to not move permanent fdb entries
on learn events. This is consistent with the bridge fdb
handling of permanent entries.

Fixes: 26a41ae604 ("vxlan: only migrate dynamic FDB entries")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 11:01:00 -04:00
David S. Miller 63a2f310d0 wireless-drivers-next patches for 4.13
The first pull request for 4.13. We have a new driver qtnfmac, but
 also rsi driver got a support for new firmware and supporting ath10k
 SDIO devices was started.
 
 Major changes:
 
 ath10k
 
 * add initial SDIO support (still work in progress)
 
 rsi
 
 * new loading for the new firmware version
 
 rtlwifi
 
 * final patches for the new btcoex support
 
 rt2x00
 
 * add device ID for Epson WN7512BEP
 
 qtnfmac
 
 * new driver for Quantenna QSR10G chipsets
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZPpCkAAoJEG4XJFUm622bUTMH/ipenw7p7otrhw9qtSBcqBRJ
 hS+1i2J+VXDXptFng0ziPWVv6mvhANvBszuOmiNMOgqjx2HrVknwlKUKViklHN3p
 E1PE3/A3bRpzBjMk4j8r/Z7VJK3rDa4WSi/AMJtqV9noNm5FfiOrCk7rXm2MLBns
 x1Wyr/7OX12hiB4SCoOuOZqS/TvHlNCW71BoyRruq01N2MA1iSonLCYJbovgi7Ds
 9acAY90PF8jLC+o7wxYkwuRqWncNhnKOsVNhc/6DKH91zB+C5sw2NGfD3WMfot/Z
 9uC5nBWNadoLGi636y+9evIRgNAFCczCZZSUeY7jMWiVn7XyFy8zoc4fv2ZSBto=
 =6md1
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2017-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for 4.13

The first pull request for 4.13. We have a new driver qtnfmac, but
also rsi driver got a support for new firmware and supporting ath10k
SDIO devices was started.

Major changes:

ath10k

* add initial SDIO support (still work in progress)

rsi

* new loading for the new firmware version

rtlwifi

* final patches for the new btcoex support

rt2x00

* add device ID for Epson WN7512BEP

qtnfmac

* new driver for Quantenna QSR10G chipsets
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 10:14:29 -04:00
David S. Miller 073cf9e20c Merge branch 'udp-reduce-cache-pressure'
Paolo Abeni says:

====================
udp: reduce cache pressure

In the most common use case, many skb fields are not used by recvmsg(), and
the few ones actually accessed lays on cold cachelines, which leads to several
cache miss per packet.

This patch series attempts to reduce such misses with different strategies:
* caching the interesting fields in the scratched space
* avoid accessing at all uninteresting fields
* prefetching

Tested using the udp_sink program by Jesper[1] as the receiver, an h/w l4 rx
hash on the ingress nic, so that the number of ingress nic rx queues hit by the
udp traffic could be controlled via ethtool -L.

The udp_sink program was bound to the first idle cpu, to get more
stable numbers.

On a single numa node receiver:

nic rx queues           vanilla                 patched kernel      delta
1                       1850 kpps               1850 kpps           0%
2                       2370 kpps               2700 kpps           13.9%
16                      2000 kpps               2220 kpps           11%

[1] https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c

v1 -> v2:
  - replaced secpath_reset() with skb_release_head_state()
  - changed udp_dev_scratch fields types to u{32,16} variant,
    replaced bitfield with bool

v2 -> v3:
  - no changes, tested against apachebench for performances regression
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 10:01:30 -04:00
Paolo Abeni b65ac44674 udp: try to avoid 2 cache miss on dequeue
when udp_recvmsg() is executed, on x86_64 and other archs, most skb
fields are on cold cachelines.
If the skb are linear and the kernel don't need to compute the udp
csum, only a handful of skb fields are required by udp_recvmsg().
Since we already use skb->dev_scratch to cache hot data, and
there are 32 bits unused on 64 bit archs, use such field to cache
as much data as we can, and try to prefetch on dequeue the relevant
fields that are left out.

This can save up to 2 cache miss per packet.

v1 -> v2:
  - changed udp_dev_scratch fields types to u{32,16} variant,
    replaced bitfiled with bool

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 10:01:29 -04:00
Paolo Abeni 0a463c78d2 udp: avoid a cache miss on dequeue
Since UDP no more uses sk->destructor, we can clear completely
the skb head state before enqueuing. Amend and use
skb_release_head_state() for that.

All head states share a single cacheline, which is not
normally used/accesses on dequeue. We can avoid entirely accessing
such cacheline implementing and using in the UDP code a specialized
skb free helper which ignores the skb head state.

This saves a cacheline miss at skb deallocation time.

v1 -> v2:
  replaced secpath_reset() with skb_release_head_state()

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 10:01:29 -04:00
Paolo Abeni 3889a803e1 net: factor out a helper to decrement the skb refcount
The same code is replicated in 3 different places; move it to a
common helper.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 10:01:29 -04:00
Niklas Söderlund 78d6102256 sh_eth: add support for changing MTU
The hardware supports the MTU to be changed and the driver it self is
somewhat prepared to support this. This patch hooks up the callbacks to
be able to change the MTU from user-space.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 09:57:52 -04:00
Daniel Borkmann f1c9eed7f4 bpf, arm64: take advantage of stack_depth tracking
Make use of recently implemented stack_depth tracking for arm64 JIT,
so that stack usage can be reduced heavily for programs not using
tail calls at least.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 18:18:32 -04:00
David S. Miller 66e037ca57 mlx5-updates-2017-06-11
This series provides updates to mlx5 header rewrite feature, from Or Gerlitz.
 and three more small updates From maor and eran.
 
 -------
 Or says:
 
 Packets belonging to flows which are different by matching may still need
 to go through the same header re-writes (e.g set the current routing hop
 MACs and issue TTL decrement).  To minimize the number of modify header
 IDs, we add a cache for header re-write IDs which is keyed by the binary
 chain of modify header actions.
 
 The caching is supported for both eswitch and NIC use-cases, where the
 actual conversion of the code to use caching comes in separate patches,
 one per use-case.
 
 Using a per field mask field, the TC pedit action supports modifying
 partial fields. The last patch enables offloading that.
 -------
 
 From Maor, update flow table commands layout to the latest HW spec.
 From Eran, ethtool connector type reporting updates.
 
 Thanks,
 Saeed.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZPUrBAAoJEEg/ir3gV/o+ptQH+gIXfHH1mrJp0ZwM/hhLidYE
 Bj/ie9Y1Ir6q3RU2+g/NLqejtvTIzAyhMfiq4ag4eCVVRuGGjPXRZJWivWXUCbjm
 /XLaXTK62qNxJNAWyzgxEJSUI1URMtQWIf9SF8LMLGiNfZfx8b7o/Q08P18tNxbb
 AeNzmgYetva9lBialWF0dPDaAvd8THngBPF7LYDWghEXbDPvYTfvABN0qUrHs1s2
 /LbmZ7L9U+RDSusz/klYW+/WorNiOm44nwk+KgdnsrVITZVVfblI6VEvgEjprhZF
 c3SFUPYz079WjndNHVWezqsl4gIZIoM9FAtmoiBg+hxTmWJVI9uwllUcarW9s+k=
 =0Cyo
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2017-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2017-06-11

This series provides updates to mlx5 header rewrite feature, from Or Gerlitz.
and three more small updates From maor and eran.

-------
Or says:

Packets belonging to flows which are different by matching may still need
to go through the same header re-writes (e.g set the current routing hop
MACs and issue TTL decrement).  To minimize the number of modify header
IDs, we add a cache for header re-write IDs which is keyed by the binary
chain of modify header actions.

The caching is supported for both eswitch and NIC use-cases, where the
actual conversion of the code to use caching comes in separate patches,
one per use-case.

Using a per field mask field, the TC pedit action supports modifying
partial fields. The last patch enables offloading that.
-------

From Maor, update flow table commands layout to the latest HW spec.
From Eran, ethtool connector type reporting updates.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 18:10:42 -04:00
Grygorii Strashko 6d307f6b09 net: ethernet: ti: cpdma: do not enable host error misc irq
CPSW driver does not handle this interrupt, so there are no reasons to enable
it in hardware.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 22:10:13 -04:00
Grygorii Strashko e9523a5a32 net: ethernet: ti: cpsw: enable HWTSTAMP_FILTER_PTP_V1_L4_EVENT filter
CPSW driver supports PTP v1 messages, but for unknown reasons this filter
is not advertised. As result,
./tools/testing/selftests/networking/timestamping/timestamping utility
can't be used for testing of CPSW RX timestamping with option
SOF_TIMESTAMPING_RX_HARDWARE, because it uses
HWTSTAMP_FILTER_PTP_V1_L4_SYNC filter.

Hence, fix it by advertising HWTSTAMP_FILTER_PTP_V1_L4_XXX filters
in CPSW driver.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 22:10:13 -04:00
David S. Miller 9fd7aca2c6 Merge branch 'bpf-misc-updates'
Daniel Borkmann says:

====================
Misc BPF updates

This set contains a couple of misc updates: stack usage reduction
for perf_sample_data in tracing progs, reduction of stale data in
verifier on register state transitions that I still had in my queue
and few selftest improvements as well as bpf_set_hash() helper for
tc programs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:47 -04:00
Daniel Borkmann ded092cd73 bpf: add bpf_set_hash helper for tc progs
Allow for tc BPF programs to set a skb->hash, apart from clearing
and triggering a recalc that we have right now. It allows for BPF
to implement a custom hashing routine for skb_get_hash().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:47 -04:00
Daniel Borkmann 966789fb86 bpf: remove cg_skb_func_proto and use sk_filter_func_proto directly
Since cg_skb_func_proto() doesn't do anything else than just calling
into sk_filter_func_proto(), remove it and set sk_filter_func_proto()
directly for .get_func_proto callback.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:46 -04:00
Daniel Borkmann f735b64926 bpf, tests: set rlimit also for test_align, so it doesn't fail
When running all the tests, through 'make run_tests', I had
test_align failing due to insufficient rlimit. Set it the same
way as all other test cases from BPF selftests do, so that
test case properly loads everything.

  [...]
  Summary: 7 PASSED, 1 FAILED
  selftests: test_progs [PASS]
  /home/foo/net-next/tools/testing/selftests/bpf
  Test   0: mov ... Failed to load program.
  FAIL
  Test   1: shift ... Failed to load program.
  FAIL
  Test   2: addsub ... Failed to load program.
  FAIL
  Test   3: mul ... Failed to load program.
  FAIL
  Test   4: unknown shift ... Failed to load program.
  FAIL
  Test   5: unknown mul ... Failed to load program.
  FAIL
  Test   6: packet const offset ... Failed to load program.
  FAIL
  Test   7: packet variable offset ... Failed to load program.
  FAIL
  Results: 0 pass 8 fail
  selftests: test_align [PASS]
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:46 -04:00
Daniel Borkmann 5ecf51fd9c bpf, tests: add a test for htab lookup + update traversal
Add a test case to track behaviour when traversing and updating the
htab map. We recently used such traversal, so it's quite useful to
keep it as an example in selftests.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:46 -04:00
Daniel Borkmann 36e24c0030 bpf: reset id on spilled regs in clear_all_pkt_pointers
Right now, we don't reset the id of spilled registers in case of
clear_all_pkt_pointers(). Given pkt_pointers are highly likely to
contain an id, do so by reusing __mark_reg_unknown_value().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:45 -04:00
Daniel Borkmann 4a2ff55aa4 bpf: reset id on CONST_IMM transition
Whenever we set the register to the type CONST_IMM, we currently don't
reset the id to 0. id member is not used in CONST_IMM case, so don't
let it become stale, where pruning won't be able to match later on.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:45 -04:00
Daniel Borkmann d25da6caa2 bpf: don't check spilled reg state for non-STACK_SPILLed type slots
spilled_regs[] state is only used for stack slots of type STACK_SPILL,
never for STACK_MISC. Right now, in states_equal(), even if we have
old and current stack state of type STACK_MISC, we compare spilled_regs[]
for that particular offset. Just skip these like we do everywhere else.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:45 -04:00
Daniel Borkmann 20b9d7ac48 bpf: avoid excessive stack usage for perf_sample_data
perf_sample_data consumes 386 bytes on stack, reduce excessive stack
usage and move it to per cpu buffer. It's allowed due to preemption
being disabled for tracing, xdp and tc programs, thus at all times
only one program can run on a specific CPU and programs cannot run
from interrupt. We similarly also handle bpf_pt_regs.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 19:05:45 -04:00
Fabio Estevam 41e8e40458 net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272
Commit 2b30842b23 ("net: fec: Clear and enable MIB counters on imx51")
introduced fec_enet_clear_ethtool_stats(), but missed to add a stub
for the CONFIG_M5272=y case, causing build failure for the
m5272c3_defconfig.

Add the missing empty stub to fix the build failure.

Fixes: Commit 2b30842b23 ("net: fec: Clear and enable MIB counters on imx51")
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:45:22 -04:00
Chenbo Feng 89dfba3e1b Remove the redundant skb->dev initialization in ip6_fragment
After moves the skb->dev and skb->protocol initialization into
ip6_output, setting the skb->dev inside ip6_fragment is unnecessary.

Fixes: 97a7a37a7b7b("ipv6: Initial skb->dev and skb->protocol in ip6_output")
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:25:21 -04:00
Xin Long 4abf5a653b sctp: no need to check assoc id before calling sctp_assoc_set_id
sctp_assoc_set_id does the assoc id check in the beginning when
processing dupcookie, no need to do the same check before calling
it.

v1->v2:
  fix some typo errs Marcelo pointed in changelog.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:23:31 -04:00
Xin Long c0a4c2d1cd sctp: use read_lock_bh in sctp_eps_seq_show
This patch is to use read_lock_bh instead of local_bh_disable
and read_lock in sctp_eps_seq_show.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:22:26 -04:00
Xin Long 6dfe4b97e0 sctp: fix recursive locking warning in sctp_do_peeloff
Dmitry got the following recursive locking report while running syzkaller
fuzzer, the Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:52
 print_deadlock_bug kernel/locking/lockdep.c:1729 [inline]
 check_deadlock kernel/locking/lockdep.c:1773 [inline]
 validate_chain kernel/locking/lockdep.c:2251 [inline]
 __lock_acquire+0xef2/0x3430 kernel/locking/lockdep.c:3340
 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
 lock_sock_nested+0xcb/0x120 net/core/sock.c:2536
 lock_sock include/net/sock.h:1460 [inline]
 sctp_close+0xcd/0x9d0 net/sctp/socket.c:1497
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432
 sock_release+0x8d/0x1e0 net/socket.c:597
 __sock_create+0x38b/0x870 net/socket.c:1226
 sock_create+0x7f/0xa0 net/socket.c:1237
 sctp_do_peeloff+0x1a2/0x440 net/sctp/socket.c:4879
 sctp_getsockopt_peeloff net/sctp/socket.c:4914 [inline]
 sctp_getsockopt+0x111a/0x67e0 net/sctp/socket.c:6628
 sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2690
 SYSC_getsockopt net/socket.c:1817 [inline]
 SyS_getsockopt+0x240/0x380 net/socket.c:1799
 entry_SYSCALL_64_fastpath+0x1f/0xc2

This warning is caused by the lock held by sctp_getsockopt() is on one
socket, while the other lock that sctp_close() is getting later is on
the newly created (which failed) socket during peeloff operation.

This patch is to avoid this warning by use lock_sock with subclass
SINGLE_DEPTH_NESTING as Wang Cong and Marcelo's suggestion.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:22:25 -04:00
Rosen, Rami 664f46a290 net/packet: remove unneeded declaraion of tpacket_snd().
This patch removes unneeded forward declaration of tpacket_snd()
in net/packet/af_packet.c.

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:15:25 -04:00
Chenbo Feng 384abed1fe bpf: Remove duplicate tcp_filter hook in ipv6
There are two tcp_filter hooks in tcp_ipv6 ingress path currently.
One is at tcp_v6_rcv and another is in tcp_v6_do_rcv. It seems the
tcp_filter() call inside tcp_v6_do_rcv is redundent and some packet
will be filtered twice in this situation. This will cause trouble
when using eBPF filters to account traffic data.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:08:02 -04:00
Nicolas Dichtel cd99c37763 bonding: warn user when 802.3ad speed is unknown
Goal is to advertise the user when ethtool speeds and 802.3ad speeds are
desynchronized.
When this case happens, the kernel needs to be patched.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:06:49 -04:00
Nicolas Dichtel 10d486a30c netns: fix error code when the nsid is already used
When the user tries to assign a specific nsid, idr_alloc() is called with
the range [nsid, nsid+1]. If this nsid is already used, idr_alloc() returns
ENOSPC (No space left on device). In our case, it's better to return
EEXIST to make it clear that the nsid is not available.

CC: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 15:58:50 -04:00