Commit Graph

725745 Commits

Author SHA1 Message Date
Lorenzo Bianconi 62e7b6a57c l2tp: remove l2specific_len dependency in l2tp_core
Remove l2specific_len dependency while building l2tpv3 header or
parsing the received frame since default L2-Specific Sublayer is
always four bytes long and we don't need to rely on a user supplied
value.
Moreover in l2tp netlink code there are no sanity checks to
enforce the relation between l2specific_len and l2specific_type,
so sending a malformed netlink message is possible to set
l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even
L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than
4 leaking memory on the wire and sending corrupted frames.

Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:00:49 -05:00
Lorenzo Bianconi dfffc97d0e l2tp: double-check l2specific_type provided by userspace
Add sanity check on l2specific_type provided by userspace in
l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and
L2TP_L2SPECTYPE_NONE are currently supported.
Moreover explicitly set l2specific_type to L2TP_L2SPECTYPE_DEFAULT
only if the userspace does not provide a value for it

Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 15:00:48 -05:00
David S. Miller e0e8a14971 Merge branch 'cxgb4-reduce-memory-footprint-for-collecting-firmware-dump'
Rahul Lakkireddy says:

====================
cxgb4: reduce memory footprint for collecting firmware dump

Firmware dump can be large (upto 2 GB).  In low memory conditions,
ethtool fails to allocate such large memory.  So, use zlib deflate
to compress collected firmware dump.

Patch 1 updates collection logic to use compression.

Patch 2 adds zlib deflate to compress collected firmware dump.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:56:32 -05:00
Rahul Lakkireddy 91c1953de3 cxgb4: use zlib deflate to compress firmware dump
Use zlib deflate to compress firmware dump. Collect and compress
as much firmware dump as possible into a 32 MB buffer.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:56:32 -05:00
Rahul Lakkireddy 56cf2635ce cxgb4: update dump collection logic to use compression
Update firmware dump collection logic to use compression when available.
Let collection logic attempt to do compression, instead of returning out
of memory early.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:56:32 -05:00
Talat Batheesh 5165674ff5 net/dim: Fix fixpoint divide exception in net_dim_stats_compare
Helmut reported a bug about devision by zero while
running traffic and doing physical cable pull test.

When the cable unplugged the ppms become zero, so when
dividing the current ppms by the previous ppms in the
next dim iteration there is devision by zero.

This patch prevent this division for both ppms and epms.

Fixes: c3164d2fc4 ("net/mlx5e: Added BW check for DIM decision mechanism")
Fixes: 4c4dbb4a73 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Reported-by: Helmut Grauer <helmut.grauer@de.ibm.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:53:32 -05:00
Wei Yongjun 43dd7512b5 devlink: Make some functions static
Fixes the following sparse warnings:

net/core/devlink.c:2297:25: warning:
 symbol 'devlink_resource_find' was not declared. Should it be static?
net/core/devlink.c:2322:6: warning:
 symbol 'devlink_resource_validate_children' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:36:29 -05:00
Wei Yongjun 8df1d08bf2 mlxsw: spectrum: Make function mlxsw_sp_kvdl_part_occ() static
Fixes the following sparse warning:

drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:289:5: warning:
 symbol 'mlxsw_sp_kvdl_part_occ' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:35:32 -05:00
Zhu Yanjun 1da847b992 forcedeth: remove unused variable
The variable miistat is not used. So it is removed.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:32:25 -05:00
Eric Dumazet 5a75114adb ipv6: mcast: remove dead code
Since commit 41033f029e ("snmp: Remove duplicate OUTMCAST stat
increment") one line of code became unneeded.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:17:44 -05:00
Arnd Bergmann ce6289661b caif: reduce stack size with KASAN
When CONFIG_KASAN is set, we can use relatively large amounts of kernel
stack space:

net/caif/cfctrl.c:555:1: warning: the frame size of 1600 bytes is larger than 1280 bytes [-Wframe-larger-than=]

This adds convenience wrappers around cfpkt_extr_head(), which is responsible
for most of the stack growth. With those wrapper functions, gcc apparently
starts reusing the stack slots for each instance, thus avoiding the
problem.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 14:02:12 -05:00
David S. Miller aa73dc95b2 linux-can-next-for-4.16-20180119
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlphvhMTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAe/sog7Dgc9ocrB/92eX9hS7luraFld9CwH2Dq6U9vVTUE
 bEUq9tLznxA8encUX6ulxBow0ZO/vzWjinDy+kjY4dzvswEV0SHm6YgAtd3uvj7W
 4pLNfjl0aqNPdjM3mbr93isyNWixyr1VAX9hCi3sCgTbE29Ms9OmC3rF7HM8bI+J
 thCAVxnNt0P9HngclMiopV92weDsGwcUUw/oU1+FpzEacjrp4frhh/9T1YCng4Ab
 HAWgdTM1k5mKWQDUxzasOQrzcz+WncOox6dzw7IxWDLtNhCwBAHWu8drBLTCWEU6
 e8NiKjjMuLafmhZZpXw+17GDeTYcxpzj7KQ+KR/HV2abXhAmrM+oGsTu
 =MdGk
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-4.16-20180119' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2018-01-16

this is a pull request for net-next/master consisting of 1 patch.

This patch by Arnd Bergmann for the m_can driver silences a compiler
warning if CONFIG_PM is not selected.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 10:30:57 -05:00
David S. Miller 8b7d828788 wireless-drivers-next patches for 4.16
Final few patches before the merge window, nothing really special.
 
 ath9k
 
 * add MSI support (not enabled by default yet)
 
 rtlwifi
 
 * support A-MSDU in A-MPDU aggregation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJaYa7GAAoJEG4XJFUm622bp+IIAJuAvqfr5G1056cmvvpIDF1i
 tsiNcmTkYlV33+99t1GXLgXATM5oYLOe4KLgVk+SxG2mCn2ZIun/TUUExM8tebIr
 a6U1loiWhvztBHkzSglfIGknFHH6Ib6gZ6ti6cb5PpZE/ey1bsjlNBjvI5k2di5h
 2KobJfQPk34e/DgJI49wYO6CwuzuT+rnNaWFzKEUoKm6lwGeUqpukV90gXN1O/qj
 Xt2A9xIlUUEXHBfkJIef34+YMs/c9jeXqwoBMZIs8G/tQGeYOMjqbn7kTLW1MpMX
 bu+59PCmshGdM/OMJsOZux5tm4DFGWaK/FYkedAJ1i5u4WzBEr1IJfo19es4oJM=
 =kFNX
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2018-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for 4.16

Final few patches before the merge window, nothing really special.

ath9k

* add MSI support (not enabled by default yet)

rtlwifi

* support A-MSDU in A-MPDU aggregation
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19 10:26:53 -05:00
Arnd Bergmann 5835919319 can: m_can: mark runtime-PM handlers as __maybe_unused
Building without CONFIG_PM results in a harmless warning:

drivers/net/can/m_can/m_can.c:1763:12: error: 'm_can_runtime_resume' defined but not used [-Werror=unused-function]
drivers/net/can/m_can/m_can.c:1752:12: error: 'm_can_runtime_suspend' defined but not used [-Werror=unused-function]

Marking the functions as __maybe_unused lets the compiler
silently drop them instead.

Fixes: cdf8259d65 ("can: m_can: Add PM Support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-01-19 10:41:15 +01:00
Andrew Morton ef58ca38db net/sched/sch_prio.c: work around gcc-4.4.4 union initializer issues
gcc-4.4.4 has problems witn anon union initializers.  Work around this.

net/sched/sch_prio.c: In function 'prio_dump_offload':
net/sched/sch_prio.c:260: error: unknown field 'stats' specified in initializer
net/sched/sch_prio.c:260: warning: initialization makes integer from pointer without a cast
net/sched/sch_prio.c:261: error: unknown field 'stats' specified in initializer
net/sched/sch_prio.c:261: warning: initialization makes integer from pointer without a cast

Fixes: 7fdb61b44c ("net: sch: prio: Add offload ability to PRIO qdisc")
Cc: Nogah Frankel <nogahf@mellanox.com>
Cc: Yuval Mintz <yuvalm@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 21:11:31 -05:00
Christopher Díaz Riveros 89290b831e flow_netlink: Remove unneeded semicolons
Trivial fix removes unneeded semicolons after if blocks.

This issue was detected by using the Coccinelle software.

Signed-off-by: Christopher Díaz Riveros <chrisadr@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 16:25:01 -05:00
Luis de Bethencourt 4c5386bae9 net/mlx5e: Fix trailing semicolon
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.

Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 16:22:11 -05:00
Jiri Pirko d680b3524c net: sched: silence uninitialized parent variable warning in tc_dump_tfilter
When tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK, parent is still passed
down but the value is never used. Compiler does not recognize it and
issues a warning. Silence it down initializing parent to 0.

Fixes: 7960d1daf2 ("net: sched: use block index as a handle instead of qdisc when block is shared")
Reported-by: David Miller <davem@davemloft.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 10:37:12 -05:00
Ping-Ke Shih 5f9066930b rtlwifi: Support A-MSDU in A-MPDU capability
Due to the fact that A-MSDU deaggregation is done in software,
we set this flag to support the A-MSDU in A-MPDU

Signed-off-by: Steven Ting <steventing@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-18 15:27:11 +02:00
Maya Erez 454099ed09 MAINTAINERS: wireless: update wil6210 maintainer entry
wil6210 maintainer email and mail list has changed, hence update
its MAINTAINERS entry accordingly.

Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-18 15:26:41 +02:00
David S. Miller 4f7d58517f linux-can-next-for-4.16-20180116
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlpeCCQTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAe/sog7Dgc9ti9B/4jAoCYTuYXnkRvS34jdgQQV99QyC8M
 EpcAgzHo2kYNPDf5q8TEVheXxvA9XeGiA+TtjL9cNowxwMMtJev3/dmBtOmU6jek
 RgOsYlR8guxBHOx8pj1IMl1xPoWTLfg4Kv1qnpXx3zvhOP610G/edBBSZt659MGF
 SW6pBoNivbl+EYSM5x81QIfqJ4NlD5AKY4PeeSnGrnSthd8EFNp2zKkcY8nMOJ0D
 Gm2YyxGJXh+lHse965DQRNg+owZxIWyheQplulVrw9v34LOjbFko3Cd+D9KLW5MG
 LVVRJ3E0jm7W75AvxNcv2WP+lZVcDxXqsxFH0dP8WOJNZiKZeJ5aZfji
 =ROXk
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-4.16-20180116' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2018-01-16

this is a pull request for net-next/master consisting of 9 patches.

This is a series of patches, some of them initially by Franklin S Cooper
Jr, which was picked up by Faiz Abbas. Faiz Abbas added some patches
while working on this series, I contributed one as well.

The first two patches add support to CAN device infrastructure to limit
the bitrate of a CAN adapter if the used CAN-transceiver has a certain
maximum bitrate.

The remaining patches improve the m_can driver. They add support for
bitrate limiting to the driver, clean up the driver and add support for
runtime PM.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:08:25 -05:00
Luis de Bethencourt 5ef7e0ba10 vxlan: Fix trailing semicolon
The trailing semicolon is an empty statement that does no operation.
It is completely stripped out by the compiler. Removing it since it doesn't do
anything.

Fixes: 5f35227ea3 ("net: Generalize ndo_gso_check to ndo_features_check")
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:07:24 -05:00
Ganesh Goudar baf5086840 cxgb4: restructure VF mgmt code
restructure the code which adds support for configuring
PCIe VF via mgmt netdevice. which was added by
commit 7829451c69 ("cxgb4: Add control net_device for
configuring PCIe VF")

Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:04:48 -05:00
Kirill Tkhai 42157277af net: Remove spinlock from get_net_ns_by_id()
idr_find() is safe under rcu_read_lock() and
maybe_get_net() guarantees that net is alive.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 15:42:35 -05:00
Kirill Tkhai 0c06bea919 net: Fix possible race in peernet2id_alloc()
peernet2id_alloc() is racy without rtnl_lock() as refcount_read(&peer->count)
under net->nsid_lock does not guarantee, peer is alive:

rcu_read_lock()
peernet2id_alloc()                            ..
  spin_lock_bh(&net->nsid_lock)               ..
  refcount_read(&peer->count) (!= 0)          ..
  ..                                          put_net()
  ..                                            cleanup_net()
  ..                                              for_each_net(tmp)
  ..                                                spin_lock_bh(&tmp->nsid_lock)
  ..                                                __peernet2id(tmp, net) == -1
  ..                                                    ..
  ..                                                    ..
    __peernet2id_alloc(alloc == true)                   ..
  ..                                                    ..
rcu_read_unlock()                                       ..
..                                                synchronize_rcu()
..                                                kmem_cache_free(net)

After the above situation, net::netns_id contains id pointing to freed memory,
and any other dereferencing by the id will operate with this freed memory.

Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except
ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc()
is generic interface, and better we fix it before someone really starts
use it in wrong context.

v2: Don't place refcount_read(&net->count) under net->nsid_lock
    as suggested by Eric W. Biederman <ebiederm@xmission.com>
v3: Rebase on top of net-next

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 15:42:35 -05:00
David S. Miller a29ae44c7a Merge branch 'tun-allow-to-attach-eBPF-filter'
Jason Wang says:

====================
tun: allow to attach eBPF filter

This series tries to implement eBPF socket filter for tun. This could
be used for implementing efficient virtio-net receive filter for
vhost-net.

Changes from V2:
- fix typo
- remove unnecessary double check
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 15:32:11 -05:00
Jason Wang aff3d70a07 tun: allow to attach ebpf socket filter
This patch allows userspace to attach eBPF filter to tun. This will
allow to implement VM dataplane filtering in a more efficient way
compared to cBPF filter by allowing either qemu or libvirt to
attach eBPF filter to tun.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 15:32:10 -05:00
Jason Wang cd5681d7d8 tuntap: rename struct tun_steering_prog to struct tun_prog
To be reused by other eBPF program other than queue selection.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 15:32:10 -05:00
David S. Miller ca46abd6f8 Merge branch 'net-sched-allow-qdiscs-to-share-filter-block-instances'
Jiri Pirko says:

====================
net: sched: allow qdiscs to share filter block instances

Currently the filters added to qdiscs are independent. So for example if you
have 2 netdevices and you create ingress qdisc on both and you want to add
identical filter rules both, you need to add them twice. This patchset
makes this easier and mainly saves resources allowing to share all filters
within a qdisc - I call it a "filter block". Also this helps to save
resources when we do offload to hw for example to expensive TCAM.

So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identification:
$ tc qdisc add dev ens7 ingress_block 22 ingress
                        ^^^^^^^^^^^^^^^^
$ tc qdisc add dev ens8 ingress_block 22 ingress
                        ^^^^^^^^^^^^^^^^

If we don't specify "block" command line option, no shared block would
be created:
$ tc qdisc add dev ens9 ingress

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens9 parent ffff:fff1

To make is more visual, the situation looks like this:

   ens7 ingress qdisc                 ens7 ingress qdisc
          |                                  |
          |                                  |
          +---------->  block 22  <----------+

Unlimited number of qdiscs may share the same block.

Note that this patchset introduces block sharing support also for clsact
qdisc:
$ tc qdisc add dev ens10 ingress_block 23 egress_block 24 clsact
$ tc qdisc show dev ens10
qdisc clsact ffff: dev ens10 parent ffff:fff1 ingress_block 23 egress_block 24

We can add filter using the block index:

$ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop

Note we cannot use the qdisc for filter manipulations of shared blocks:

$ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
Error: This filter block is shared. Please use the block index to manipulate the filters.

We will see the same output if we list filters for ingress qdisc of
ens7 and ens8, also for the block 22:

$ tc filter show block 22
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens7 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens8 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

---
v10->v11:
- patch 2:
 - fixed error path when register_pernet_subsys fails pointed out by Cong
- patch 9:
 - rebased on top of the current net-next

v9->v10:
- patch 7:
 - fixed ifindex magic in the patch description
- userspace patches:
 - added manpages and patch descriptions

v8->v9:
- patch "net: sched: add rt netlink message type for block get" was
  removed, userspace check filter existence using qdisc dump

v7->v8:
- patch 7:
 - added comment to ifindex block magic
- patch 9:
 - new patch
- patch 10:
 - base this on the patch that introduces qdisc-generic block index
   attributes parsing/dumping
- patch 13:
 - rebased on top of current net-next

v6->v7:
- patch 1:
 - unsquashed shared block patch that was previously squashed by mistake
 - fixed error path in block create - freeing chain 0
- patch 2:
 - new patch - splitted from the previous one as it got accidentaly
   squashed in the rebasing process in the past
 - converted to idr extended
 - removed auto-generating of block indexes. Callers have to explicily
   tell that the block is shared by passing non-zero block index
 - fixed error path in block get ext - freeing chain 0
- patch 7:
 - changed extack message for block index handle as suggested by DaveA
 - added extack message when block index does not exist
 - the block ifindex magic is in define and change to 0xffffffff
   as suggested by Jamal
- patch 8:
 - new patch implementing RTM_GETBLOCK in order to query if the block
   with some index exists
- patch 9:
 - adjust to the core changes and check block index attributes for being 0

v5->v6:
- added patch 6 that introduces block handle

v4->v5:
- patch 5:
 - add tracking of binding of devs that are unable to offload and check
   that before block cbs call.

v3->v4:
- patch 1:
 - rebased on top of the current net-next
 - added some extack strings
- patch 3:
 - rebased on top of the current net-next
- patch 5:
 - propagate netdev_ops->ndo_setup_tc error up to tcf_block_offload_bind
   caller
- patch 7:
 - rebased on top of the current net-next

v2->v3:
- removed original patch 1, removing tp->q cls_bpf dependency. Fixed by
  Jakub in the meantime.
- patch 1:
 - rebased on top of the current net-next
- patch 5:
 - new patch
- patch 8:
 - removed "p_" prefix from block index function args
- patch 10:
 - add tc offload feature handling
====================

Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:58 -05:00
Jiri Pirko 4b23258d6a mlxsw: spectrum_acl: Pass mlxsw_sp_port down to ruleset bind/unbind ops
No need to convert from mlxsw_sp_port to net_device and back again.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:58 -05:00
Jiri Pirko 3aaff32304 mlxsw: spectrum_acl: Implement TC block sharing
Benefit from the prepared TC and in-driver ACL infrastructure and
introduce block sharing offload. For that, a new struct "block" is
introduced in spectrum_acl in order to hold a list of specific
block-port bindings.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:58 -05:00
Jiri Pirko 02caf4995a mlxsw: spectrum_acl: Don't store netdev and ingress for ruleset unbind
Instead, pass netdev and ingress flag to ruleset unbind op.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Jiri Pirko 9fe5fdf27e mlxsw: spectrum_acl: Reshuffle code around mlxsw_sp_acl_ruleset_create/destroy
In order to prepare for follow-up changes, make the bind/unbind helpers
very simple. That required move of ht insertion/removal and bind/unbind
calls into mlxsw_sp_acl_ruleset_create/destroy.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Jiri Pirko 51ab2994c3 net: sched: allow ingress and clsact qdiscs to share filter blocks
Benefit from the previously introduced shared filter blocks
infrastructure and allow ingress and clsact qdisc instances to share
filter blocks. The block index is coming from userspace as qdisc option.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Jiri Pirko d47a6b0e7c net: sched: introduce ingress/egress block index attributes for qdisc
Introduce two new attributes to be used for qdisc creation and dumping.
One for ingress block, one for egress block. Introduce a set of ops that
qdisc which supports block sharing would implement.

Passing block indexes in qdisc change is not supported yet and it is
checked and forbidded.

In future, these attributes are to be reused for specifying block
indexes for classes as well. As of this moment however, it is not
supported so a check is in place to forbid it.

Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Jiri Pirko 7960d1daf2 net: sched: use block index as a handle instead of qdisc when block is shared
As the tcm_ifindex with value TCM_IFINDEX_MAGIC_BLOCK is invalid ifindex,
use it to indicate that we work with block, instead of qdisc.
So if tcm_ifindex is set to TCM_IFINDEX_MAGIC_BLOCK, tcm_parent is used
to carry block_index.

If the block is set to be shared between at least 2 qdiscs, it is
forbidden to use the qdisc handle to add/delete filters. In that case,
userspace has to pass block_index.

Also, for dump of the filters, in case the block is shared in between at
least 2 qdiscs, the each filter is dumped with tcm_ifindex value
TCM_IFINDEX_MAGIC_BLOCK and tcm_parent set to block_index. That gives
the user clear indication, that the filter belongs to a shared block
and not only to one qdisc under which it is dumped.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Jiri Pirko caa7260156 net: sched: keep track of offloaded filters and check tc offload feature
During block bind, we need to check tc offload feature. If it is
disabled yet still the block contains offloaded filters, forbid the
bind. Also forbid to register callback for a block that already
contains offloaded filters, as the play back is not supported now.
For keeping track of offloaded filters there is a new counter
introduced, alongside with couple of helpers called from cls_* code.
These helpers set and clear TCA_CLS_FLAGS_IN_HW flag.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:57 -05:00
Jiri Pirko edf6711c98 net: sched: remove classid and q fields from tcf_proto
Both are no longer used, so remove them.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:56 -05:00
Jiri Pirko f36fe1c498 net: sched: introduce block mechanism to handle netif_keep_dst calls
Couple of classifiers call netif_keep_dst directly on q->dev. That is
not possible to do directly for shared blocke where multiple qdiscs are
owning the block. So introduce a infrastructure to keep track of the
block owners in list and use this list to implement block variant of
netif_keep_dst.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:56 -05:00
Jiri Pirko 9d3aaff3d8 net: sched: avoid usage of tp->q in tcf_classify
Use block index in the messages instead.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:56 -05:00
Jiri Pirko 4861738775 net: sched: introduce shared filter blocks infrastructure
Allow qdiscs to share filter blocks among them. Each qdisc type has to
use block get/put extended modifications that enable sharing.
Shared blocks are tracked within each net namespace and identified
by u32 index. This index is passed from user during the qdisc creation.
If user passes index that is not used by any other qdisc, new block
is created. If user passes index that is already used, the existing
block will be re-used.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:56 -05:00
Jiri Pirko a9b19443ed net: sched: introduce support for multiple filter chain pointers registration
So far, there was possible only to register a single filter chain
pointer to block->chain[0]. However, when the blocks will get shareable,
we need to allow multiple filter chain pointers registration.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:53:56 -05:00
David S. Miller c9a824210f Merge branch 'bnxt_en-next'
Michael Chan says:

====================
bnxt_en: Updates for net-next.

First, we upgrade the firmware interface spec.  Due to a change in
the toolchains, the auto-generated bnxt_hsi.h does not match the
old bnxt_hsi.h and the patch is really big.  This should be just
one-time.  Going forward, changes should be incremental.

The next 10 patches implement a new scheme for the PF and VF drivers
to allocate and reserve resources.  The new scheme is more flexible
and allows dynamic and asymmetric distribution of resources, whereas
the old scheme is static and even distribution.

The last few patches add cacheline size setting, a couple of PCI IDs,
better management of VF MAC address, and a better parent switchdev ID
for dual-port devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:27 -05:00
Sathya Perla dd4ea1da12 bnxt_en: export a common switchdev PARENT_ID for all reps of an adapter
Currently the driver exports different switchdev PARENT_IDs for
representors belonging to different SR-IOV PF-pools of an adapter.
This is not correct as the adapter can switch across all vports
of an adapter. This patch fixes this by exporting a common switchdev
PARENT_ID for all reps of an adapter. The PCIE DSN is used as the id.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:27 -05:00
Michael Chan c3480a6037 bnxt_en: Add cache line size setting to optimize performance.
The chip supports 64-byte and 128-byte cache line size for more optimal
DMA performance when matched to the CPU cache line size.  The default is 64.
If the system is using 128-byte cache line size, set it to 128.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:27 -05:00
Vasundhara Volam 91cdda4071 bnxt_en: Forward VF MAC address to the PF.
Forward hwrm_func_vf_cfg command from VF to PF driver, to store
VF MAC address in PF's context.  This will allow "ip link show"
to display all VF MAC addresses.

Maintain 2 locations of MAC address in VF info structure, one for
a PF assigned MAC and one for VF assigned MAC.

Display VF assigned MAC in "ip link show", only if PF assigned MAC is
not valid.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:27 -05:00
Vasundhara Volam 92abef361b bnxt_en: Add BCM5745X NPAR device IDs
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan 8f23d638b3 bnxt_en: Expand bnxt_check_rings() to check all resources.
bnxt_check_rings() is called by ethtool, XDP setup, and ndo_setup_tc()
to see if there are enough resources to support the new configuration.
Expand the call to test all resources if the firmware supports the new
API.  With the more flexible resource allocation scheme, this call must
be made to check that all resources are available before committing to
allocate the resources.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan 4673d66468 bnxt_en: Implement new method for the PF to assign SRIOV resources.
Instead of the old method of evenly dividing the resources to the VFs,
use the new firmware API to specify min and max resources for each VF.
This way, there is more flexibility for each VF to allocate more or less
resources.

The min is the absolute minimum for each VF to function.  The max is the
global resources minus the resources used by the PF.  Each VF is
guaranteed the min.  Up to max resources may be available for some VFs.

The PF driver can use one of 2 strategies specified in NVRAM to assign
the resources.  The old legacy strategy of evenly dividing the resources
or the new flexible strategy.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan 6a1eef5b90 bnxt_en: Reserve resources for RFS.
In bnxt_rfs_capable(), add call to reserve vnic resources to support
NTUPLE.  Return true if we can successfully reserve enough vnics.
Otherwise, reserve the minimum 1 VNIC for normal operations not
supporting NTUPLE and return false.

Also, suppress warning message about not enough resources for NTUPLE when
only 1 RX ring is in use.  NTUPLE filters by definition require multiple
RX rings.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00