Commit Graph

26235 Commits

Author SHA1 Message Date
Kai-Heng Feng 1fb3a7a75e igb: Fix an issue that PME is not enabled during runtime suspend
I210 ethernet card doesn't wakeup when a cable gets plugged. It's
because its PME is not set.

Since commit 42eca23021 ("PCI: Don't touch card regs after runtime
suspend D3"), if the PCI state is saved, pci_pm_runtime_suspend() stops
calling pci_finish_runtime_suspend(), which enables the PCI PME.

To fix the issue, let's not to save PCI states when it's runtime
suspend, to let the PCI subsystem enables PME.

Fixes: 42eca23021 ("PCI: Don't touch card regs after runtime suspend D3")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 12:14:23 -08:00
Young Xiao eec903769b ice: Do not enable NAPI on q_vectors that have no rings
If ice driver has q_vectors w/ active NAPI that has no rings,
then this will result in a divide by zero error. To correct it
I am updating the driver code so that we only support NAPI on
q_vectors that have 1 or more rings allocated to them.

See commit 13a8cd191a ("i40e: Do not enable NAPI on q_vectors
that have no rings") for detail.

Signed-off-by: Young Xiao <YangX92@hotmail.com>
Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 12:10:24 -08:00
Miroslav Lichvar 9a2d57a7a0 i40e: extend PTP gettime function to read system clock
This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl.

Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 12:06:35 -08:00
Konstantin Khorenko 31389b53b3 i40e: define proper net_device::neigh_priv_len
Out of bound read reported by KASan.

i40iw_net_event() reads unconditionally 16 bytes from
neigh->primary_key while the memory allocated for
"neighbour" struct is evaluated in neigh_alloc() as

  tbl->entry_size + dev->neigh_priv_len

where "dev" is a net_device.

But the driver does not setup dev->neigh_priv_len and
we read beyond the neigh entry allocated memory,
so the patch in the next mail fixes this.

Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 12:02:26 -08:00
YueHaibing cd0d465bb6 e100: Fix passing zero to 'PTR_ERR' warning in e100_load_ucode_wait
Fix a static code checker warning:
drivers/net/ethernet/intel/e100.c:1349
 e100_load_ucode_wait() warn: passing zero to 'PTR_ERR'

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 11:54:27 -08:00
David S. Miller 2be09de7d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of conflicts, by happily all cases of overlapping
changes, parallel adds, things of that nature.

Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 11:53:36 -08:00
Jesus Sanchez-Palencia 6f9ae17530 igb: Change RXPBSIZE size when setting Qav mode
Section 4.5.9 of the datasheet says that the total size of all packet
buffers combined (TxPB 0 + 1 + 2 + 3 + RxPB + BMC2OS + OS2BMC) must not
exceed 60KB. Today we are configuring a total of 62KB, so reduce the
RxPB from 32KB to 30KB in order to respect that.

The choice of changing RxPBSIZE here is mainly because it seems more
correct to give more priority to the transmit packet buffers over the
receiver ones when running in Qav mode. Also, the BMC2OS and OS2BMC
sizes are already too short.

Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 11:45:10 -08:00
Jeff Kirsher 59361316af igb: reduce CPU0 latency when updating statistics
This change is based off of the work and suggestion of Jan Jablonsky
<jan.jablonsky@thalesgroup.com>.

The Watchdog workqueue in igb driver is scheduled every 2s for each
network interface. That includes updating a statistics protected by
spinlock. Function igb_update_stats in this case will be protected
against preemption. According to number of a statistics registers
(cca 60), processing this function might cause additional cpu load
 on CPU0.

In case of statistics spinlock may be replaced with mutex, which
reduce latency on CPU0.

CC: Bernhard Kaindl  <bernhard.kaindl@thalesgroup.com>
CC: Jan Jablonsky <jan.jablonsky@thalesgroup.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 11:02:06 -08:00
Jakub Kicinski 4987eaccd2 nfp: bpf: optimize codegen for JSET with a constant
The top word of the constant can only have bits set if sign
extension set it to all-1, therefore we don't really have to
mask the top half of the register.  We can just OR it into
the result as is.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-20 17:28:29 +01:00
Jakub Kicinski 6e774845b3 nfp: bpf: remove the trivial JSET optimization
The verifier will now understand the JSET instruction, so don't
mark the dead branch in the JIT as noop.  We won't generate any
code, anyway.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-12-20 17:28:28 +01:00
Michael Chan 0c2ff8d796 bnxt_en: Adjust default RX coalescing ticks to 10 us.
For a little better performance on faster machines and faster link
speeds.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:16 -08:00
Venkat Duvvuru abd43a1352 bnxt_en: Support for 64-bit flow handle.
Older firmware only supports 16-bit flow handle, because of which the
number of flows that can be offloaded can’t scale beyond a point.
Newer firmware supports 64-bit flow handle enabling the host to scale
upto millions of flows. With the new 64-bit flow handle support, driver
has to query flow stats in a different way compared to the older approach.

This patch adds support for 64-bit flow handle and new way to query
flow stats.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:16 -08:00
Michael Chan cf6daed098 bnxt_en: Increase context memory allocations on 57500 chips for RDMA.
If RDMA is supported on the 57500 chip, increase context memory
allocations for the resources used by RDMA.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:16 -08:00
Michael Chan 08fe9d1816 bnxt_en: Add Level 2 context memory paging support.
Add the new functions bnxt_alloc_ctx_pg_tbls()/bnxt_free_ctx_pg_tbls()
to allocate and free pages for context memory.  The new functions
will handle the different levels of paging support and allocate/free
the pages accordingly using the existing functions.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:16 -08:00
Michael Chan 4f49b2b8d4 bnxt_en: Enhance bnxt_alloc_ring()/bnxt_free_ring().
To support level 2 context page memory structures, enhance the
bnxt_ring_mem_info structure with a "depth" field to specify the page
level and add a flag to specify using full pages for L1 and L2 page
tables.  This is needed to support RDMA functionality on 57500 chips
since RDMA requires more context memory.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:16 -08:00
Venkat Duvvuru 760b6d3341 bnxt_en: Add support for 2nd firmware message channel.
Earlier, some of the firmware commands (ex: CFA_FLOW_*) which are processed
by KONG processor were sent to the CHIMP processor from the host. This
approach was taken as there was no direct message channel to KONG.
CHIMP in turn used to send them to KONG. Newer firmware supports a new
message channel which the host can send messages directly to the KONG
processor.

This patch adds support for required changes needed in the driver
to support direct KONG message channel.  This speeds up flow related
messages sent to the firmware for CLS_FLOWER offload.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:16 -08:00
Venkat Duvvuru 5c209fc821 bnxt_en: Introduce bnxt_get_hwrm_resp_addr & bnxt_get_hwrm_seq_id routines.
These routines will be enhanced in the subsequent patch to
return the 2nd firmware comm. channel's hwrm response address &
sequence id respectively.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:16 -08:00
Venkat Duvvuru 89455017fb bnxt_en: Avoid arithmetic on void * pointer.
Typecast hwrm_cmd_resp_addr to (u8 *) from (void *) before doing
arithmetic.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:15 -08:00
Venkat Duvvuru 2e9ee39877 bnxt_en: Use macros for firmware message doorbell offsets.
In preparation for adding a 2nd communication channel to firmware.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:15 -08:00
Venkat Duvvuru fc718bb2d1 bnxt_en: Set hwrm_intr_seq_id value to its inverted value.
Set hwrm_intr_seq_id value to its inverted value instead of
HWRM_SEQ_INVALID, when an hwrm completion of type
CMPL_BASE_TYPE_HWRM_DONE is received. This will enable us to use
the complete 16-bit sequence ID space.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:15 -08:00
Michael Chan 3322479e6d bnxt_en: Update firmware interface spec. to 1.10.0.33.
The major changes are in the flow offload firmware APIs.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 08:26:15 -08:00
Aviv Heller a64917446e net/mlx5: Fix LAG requirement when CONFIG_MLX5_ESWITCH is off
If CONFIG_MLX5_ESWITCH is not defined, test for SR-IOV being disabled,
instead of calling e-switch LAG prereq routine.

Since LAG with SRIOV is allowed only when switchdev mode is on.

Fixes: eff849b2c6 ("net/mlx5: Allow/disallow LAG according to pre-req only")
Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:03 -08:00
Aviv Heller 0a5b589111 net/mlx5: Fix query_nic_sys_image_guid() error during init
vport system image guid should be queried using vport nic API for
Ethernet ports, and vport hca API for Infiniband ports.

Fixes: fadd59fc50 ("net/mlx5: Introduce inter-device communication mechanism")
Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:03 -08:00
Eli Britstein e32ee6c78e net/mlx5e: Support tunnel encap over tagged Ethernet
Generate encap header depending on the routed device to support
native/tagged Ethernet header.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:03 -08:00
Eli Britstein aa331450b8 net/mlx5e: Support VLAN encap ETH header generation
Support generation of native or tagged Ethernet header for encap
header, depending on provided net device.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:03 -08:00
Eli Britstein c7bcb277bd net/mlx5e: Re-order route and encap header memory allocation
Change the order to first route IPv4/6 and return if error. Only after
successful route continue to allocate an encap header, with no
functional change.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:02 -08:00
Eli Britstein 05ada1adb6 net/mlx5e: Tunnel encap ETH header helper function
In tunnel encap we prepare the encap header for IPv4/6 cases, in two
separate functions. For ETH header generation the code is almost
duplicated.

Move the ETH header generation code from IPv4/6 functions to a helper
function, with no functional change.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:02 -08:00
Eli Britstein b168cff0b9 net/mlx5e: Fail attempt to offload e-switch TC encap flows with vlan on underlay
Currently we don't support nor fail attempts to offload encap flows routed
to vlan device on the underlay network. We wrongly consider a vlan underlay
device to be on the same e-switch b/c the switchdev ID is retrieved recursively.

Add explicit check for that and fail such attempts.

Also align to a more strict check for the ingress and the underlay devices
to practically be on the same eswitch.

Fixes: ce99f6b97f ('net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels')
Fixes: 3e621b19b0 ('net/mlx5e: Support TC encapsulation offloads with upper devices')
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:02 -08:00
Eli Britstein 442e1228cb net/mlx5e: Tunnel routing output devs helper function
For tunnel we determine the output devs for IPv4/6 cases, in two
separate functions, with a duplicated code.

Move that code from IPv4/6 functions to a helper function, with no
functional change.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:01 -08:00
Eli Britstein a0646c88ed net/mlx5e: Fail attempt to offload e-switch TC flows with egress upper devices
We use the switchdev parent HW id helper to identify if the mirred device
shares the same ASIC/port with the ingress device. This can get us wrong
in the presence of upper devices such as vlan or bridge set over the HW
devices (VF or uplink representors), b/c the switchdev ID is retrieved
recursively.

To fail offload attempts in such cases, we condition the check on the
egress device to have not only the same switchdev ID but also the relevant
mlx5 netdev ops.

Fixes: 03a9d11e6e ('net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads')
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:01 -08:00
Or Gerlitz 1ee4457c5c net/mlx5e: Allow vlans on e-switch uplink reps
There are cases (e.g tunneling with vlan on underlay and potentially
more) where this makes sense, so allow that.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:01 -08:00
Gavi Teitz 4c8fb2986d net/mlx5e: Increase VF representors' SQ size to 128
The default size for the VF representors' SQ was too small to handle high
packet rates. Doubling the size from 64 to 128 drastically improves the
packet rate under stress (by about 50%), whereas increasing the size
beyond 128 has not shown to make any further difference.

The impact of the SQ size was measured with UDP traffic, in the following
topology: TG <-> PF <-> TC forwarding <-> VF representor <-> VF in VM
over a single core processing bi-directional traffic, with the following
results:

                                  SQ size of 64:     SQ size of 128:
Packet rate for 64B UDP packets:    860 [Kpps]         1280 [Kpps]
Packet rate for 114B VxLan
encapsulated UDP packets:           320 [Kpps]          500 [Kpps]

Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:00 -08:00
Miroslav Lichvar 4a0475d57a mlx5: extend PTP gettime function to read system clock
Read the system time right before and immediately after reading the low
register of the internal timer. This adds support for the
PTP_SYS_OFFSET_EXTENDED ioctl.

Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:00 -08:00
Miroslav Lichvar 5d8678365c mlx5: update timecounter at least twice per counter overflow
The timecounter needs to be updated at least once in half of the
cyclecounter interval to prevent timecounter_cyc2time() interpreting a
new timestamp as an old value and causing a backward jump.

This would be an issue if the timecounter multiplier was so small that
the update interval would not be limited by the 64-bit overflow in
multiplication.

Shorten the calculated interval to make sure the timecounter is updated
in time even when the system clock is slowed down by up to 10%, the
multiplier is increased by up to 10%, and the scheduled overflow check
is late by 15%.

Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ariel Levkovich <lariel@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20 05:06:00 -08:00
Peng Li 1154bb26c8 net: hns3: remove redundant variable initialization
This patch removes the redundant variable initialization,
as driver will devm_kzalloc to set value to hdev soon.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:59 -08:00
Peng Li 31a16f99e0 net: hns3: fix the descriptor index when get rss type
Driver gets rss information from the last descriptor of the packet.
When driver handle the rss type, ring->next_to_clean indicates the
first descriptor of next packet.

This patch fix the descriptor index with "ring->next_to_clean - 1".

Fixes: 232fc64b6e ("net: hns3: Add HW RSS hash information to RX skb")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:59 -08:00
Jian Shen 8edc2285b7 net: hns3: don't restore rules when flow director is disabled
When user disables flow director, all the rules will be disabled. But
when reset happens, it will restore all the rules again. It's not
reasonable. This patch fixes it by add flow director status check before
restore fules.

Fixes: 6871af29b3 ("net: hns3: Add reset handle for flow director")
Fixes: c17852a893 ("net: hns3: Add support for enable/disable flow director")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:59 -08:00
Jian Shen 0285dbae5d net: hns3: fix vf id check issue when add flow director rule
When add flow director fule for vf, the vf id is used as array
subscript before valid checking, which may cause memory overflow.

Fixes: dd74f815dd ("net: hns3: Add support for rule add/delete for flow director")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:58 -08:00
Huazhong Tan 39cfbc9c4f net: hns3: reset tqp while doing DOWN operation
While doing DOWN operation, the driver will reclaim the memory which has
already used for TX. If the hardware is processing this memory, it will
cause a RCB error to the hardware. According the hardware's description,
the driver should reset the tqp before reclaim the memory during DOWN.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:58 -08:00
Jian Shen 75edb61086 net: hns3: add max vector number check for pf
Each pf supports max 64 vectors and 128 tqps. For 2p/4p core scenario,
there may be more than 64 cpus online. So the result of min_t(u16,
num_Online_cpus(), tqp_num) may be more than 64. This patch adds check
for the vector number.

Fixes: dd38c72604 ("net: hns3: fix for coalesce configuration lost during reset")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:58 -08:00
Peng Li 1b7d7b0581 net: hns3: fix a bug caused by udelay
udelay() in driver may always occupancy processor. If there is only
one cpu in system, the VF driver may initialize fail when insmod
PF and VF driver in the same system. This patch use msleep() to free
cpu when VF wait PF message.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:58 -08:00
Jian Shen a298797532 net: hns3: change default tc state to close
In original codes, default tc value is set to the max tc. It's more
reasonable to close tc by changing default tc value to 1. Users can
enable it with lldp tool when they want to use tc.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:58 -08:00
Jian Shen 8cdb992f0d net: hns3: refine the handle for hns3_nic_net_open/stop()
When triggering nic down, there is a time window between bringing down
the protocol stack and stopping the work task. If the net is up in the
time window, it may bring up the protocol stack again.

This patch fixes it by stop the work task at the beginning of
hns3_nic_net_stop(). To keep symmetrical, start the work task at the
end of hns3_nic_net_open().

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 23:47:58 -08:00
Antoine Tenart 1b451fb205 net: mvpp2: fix the phylink mode validation
The mvpp2_phylink_validate() sets all modes that are supported by a
given PPv2 port. An mistake made the 10000baseT_Full mode being
advertised in some cases when a port wasn't configured to perform at
10G. This patch fixes this.

Fixes: d97c9f4ab0 ("net: mvpp2: 1000baseX support")
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 16:38:35 -08:00
Biao Huang 22a3a5403b net-next: stmmac: dwmac-mediatek: remove fine-tune property
1. remove fine-tune property and related setting to simplify
the timing adjustment flow.
2. set timing value according to the value from device tree,
and will not care whether PHY insert internal delay.

Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 16:24:58 -08:00
Bryan Whitehead e0e587878f lan743x: Remove MAC Reset from initialization
The MAC Reset was noticed to erase important EEPROM settings.
It is also unnecessary since a chip wide reset was done earlier
in initialization, and that reset preserves EEPROM settings.

There for this patch removes the unnecessary MAC specific reset.

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 15:48:11 -08:00
Alaa Hleihel 4765420439 net/mlx5e: Remove the false indication of software timestamping support
mlx5 driver falsely advertises support of software timestamping.
Fix it by removing the false indication.

Fixes: ef9814deaf ("net/mlx5e: Add HW timestamping (TS) support")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-19 13:31:16 -08:00
Yuval Avnery f033788914 net/mlx5: Typo fix in del_sw_hw_rule
Expression terminated with "," instead of ";", resulted in
set_fte getting bad value for modify_enable_mask field.

Fixes: bd5251dbf1 ("net/mlx5_core: Introduce flow steering destination of type counter")
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-19 13:31:16 -08:00
Tariq Toukan bfc698254b net/mlx5e: RX, Fix wrong early return in receive queue poll
When the completion queue of the RQ is empty, do not immediately return.
If left-over decompressed CQEs (from the previous cycle) were processed,
need to go to the finalization part of the poll function.

Bug exists only when CQE compression is turned ON.

This solves the following issue:
mlx5_core 0000:82:00.1: mlx5_eq_int:544:(pid 0): CQ error on CQN 0xc08, syndrome 0x1
mlx5_core 0000:82:00.1 p4p2: mlx5e_cq_error_event: cqn=0x000c08 event=0x04

Fixes: 4b7dfc9925 ("net/mlx5e: Early-return on empty completion queues")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-19 13:31:16 -08:00
Ido Schimmel b61cd7c6f9 mlxsw: spectrum_router: Hold a reference on RIF's netdev
Previous patches tried to make RIF deletion more robust and avoid
use-after-free situations.

As another precaution, hold a reference on a RIF's netdev and release it
when the RIF is deleted.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel 965fa8e600 mlxsw: spectrum_router: Make RIF deletion more robust
In the past we had multiple instances where RIFs were not properly
deleted.

One of the reasons for leaking a RIF was that at the time when IP
addresses were flushed from the respective netdev (prompting the
destruction of the RIF), the netdev was no longer a mlxsw upper. This
caused the inet{,6}addr notification blocks to ignore the NETDEV_DOWN
event and leak the RIF.

Instead of checking whether the netdev is our upper when an IP address
is removed, we can instead check if the netdev has a RIF configured.

To look up a RIF we need to access mlxsw private data, so the patch
stores the notification blocks inside a mlxsw struct. This then allows
us to use container_of() and extract the required private data.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel 21ffedb6db mlxsw: spectrum_router: Propagate 'struct mlxsw_sp' further
Next patch is going to make RIF deletion more robust by removing
reliance on fragile mlxsw_sp_lower_get(). This is because a netdev is
not necessarily our upper anymore when its IP addresses are flushed.

The inet{,6}addr notification blocks are going to resolve 'struct
mlxsw_sp' using container_of(), but the functions they call still use
mlxsw_sp_lower_get().

As a preparation for the next patch, propagate 'struct mlxsw_sp' down to
the functions called from the notification blocks and remove reliance on
mlxsw_sp_lower_get().

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel be2d6f421f mlxsw: spectrum: Properly cleanup LAG uppers when removing port from LAG
When a LAG device or a VLAN device on top of it is enslaved to a bridge,
the driver propagates the CHANGEUPPER event to the LAG's slaves.

This causes each physical port to increase the reference count of the
internal representation of the bridge port by calling
mlxsw_sp_port_bridge_join().

However, when a port is removed from a LAG, the corresponding leave()
function is not called and the reference count is not decremented. This
leads to ugly hacks such as mlxsw_sp_bridge_port_should_destroy() that
try to understand if the bridge port should be destroyed even when its
reference count is not 0.

Instead, make sure that when a port is unlinked from a LAG it would see
the same events as if the LAG (or its uppers) were unlinked from a
bridge.

The above is achieved by walking the LAG's uppers when a port is
unlinked and calling mlxsw_sp_port_bridge_leave() for each upper that is
enslaved to a bridge.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel 635c8c8bba mlxsw: spectrum: Remove reference count from VLAN entries
Commit b3529af6bb ("spectrum: Reference count VLAN entries") started
reference counting port-VLAN entries in a similar fashion to the 8021q
driver.

However, this is not actually needed and only complicates things.
Instead, the driver should forbid the creation of a VLAN on a port if
this VLAN already exists. This would also solve the issue fixed by the
mentioned commit.

Therefore, remove the get()/put() API and use create()/destroy()
instead.

One place that needs special attention is VLAN addition in a VLAN-aware
bridge via switchdev operations. In case the VLAN flags (e.g., 'pvid')
are toggled, then the VLAN entry already exists. To prevent the driver
from wrongly returning EEXIST, the driver is changed to check in the
prepare phase whether the entry already exists and only returns an error
in case it is not associated with the correct bridge port.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel e149113a74 mlxsw: spectrum: Handle VLAN device unlinking
In commit 993107fea5 ("mlxsw: spectrum_switchdev: Fix VLAN device
deletion via ioctl") I fixed a bug caused by the fact that the driver
views differently the deletion of a VLAN device when it is deleted via
an ioctl and netlink.

Instead of relying on a specific order of events (device being
unregistered vs. VLAN filter being updated), simply make sure that the
driver performs the necessary cleanup when the VLAN device is unlinked,
which always happens before the other two events.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel f1d7c33d6a mlxsw: spectrum_fid: Remove unused function
This function is no longer used. Remove it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel 32fd4b49a3 mlxsw: spectrum_router: Do not destroy RIFs based on FID's reference count
Currently, when a RIF is constructed on top of a FID, the RIF increments
the FID's reference count and the RIF is destroyed when the FID's
reference count drops to 1. This effectively means that when no local
ports are member in the FID, the FID is destroyed regardless if the
router port is a member in the FID or not.

The above can lead to the unexpected behavior in which routes using a
VLAN interface as their nexthop device are no longer offloaded after the
last local port leaves the corresponding VLAN (FID).

Example:
# ip -4 route show dev br0.10
192.0.2.0/24 proto kernel scope link src 192.0.2.1 offload
# bridge vlan del vid 10 dev swp3
# ip -4 route show dev br0.10
192.0.2.0/24 proto kernel scope link src 192.0.2.1

After the patch, the route is offloaded before and after the VLAN is
removed from local port 'swp3', as the RIF corresponding to 'br0.10'
continues to exists.

In order to remove RIFs' reliance on the underlying FID's reference
count, we need to add a reference count to sub-port RIFs, which are RIFs
that correspond to physical ports and their uppers (e.g., LAG devices).

In this case, each {Port, VID} ('struct mlxsw_sp_port_vlan') needs to
hold a reference on the RIF. For example:

                       bond0.10
                          |
                        bond0
                          |
                      +-------+
                      |       |
                    swp1    swp2

Both {Port 1, VID 10} and {Port 2, VID 10} will hold a reference on the
RIF corresponding to 'bond0.10'. When the last reference is dropped, the
RIF will be destroyed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Ido Schimmel 927d0ef10a mlxsw: spectrum: Sanitize VLAN interface's uppers
Currently, only VRF and macvlan uppers are supported on top of VLAN
device configured over a bridge, so make sure the driver forbids other
uppers.

Note that enslavement to a VRF is handled earlier in the notification
block, so there is no need to check for a VRF upper here.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 12:28:07 -08:00
Michael Chan 84404d5fd5 bnxt_en: Fix ethtool self-test loopback.
The current code has 2 problems.  It assumes that the RX ring for
the loopback packet is combined with the TX ring.  This is not
true if the ethtool channels are set to non-combined mode.  The
second problem is that it won't work on 57500 chips without
adjusting the logic to get the proper completion ring (cpr) pointer.
Fix both issues by locating the proper cpr pointer through the RX
ring.

Fixes: e44758b78a ("bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:31:09 -08:00
Florian Westphal a84e3f5333 xfrm: prefer secpath_set over secpath_dup
secpath_set is a wrapper for secpath_dup that will not perform
an allocation if the secpath attached to the skb has a reference count
of one, i.e., it doesn't need to be COW'ed.

Also, secpath_dup doesn't attach the secpath to the skb, it leaves
this to the caller.

Use secpath_set in places that immediately assign the return value to
skb.

This allows to remove skb->sp without touching these spots again.

secpath_dup can eventually be removed in followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:38 -08:00
Florian Westphal 6362a6a040 drivers: net: ethernet: mellanox: use skb_sec_path helper
Will avoid touching this when sp pointer is removed from sk_buff struct.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Florian Westphal 2fdb435bc0 drivers: net: intel: use secpath helpers in more places
Use skb_sec_path and secpath_exists helpers where possible.
This reduces noise in followup patch that removes skb->sp pointer.

v2: no changes, preseve acks from v1.

Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:37 -08:00
Ioana Radulescu 610febc68a dpaa2-eth: Add QBMAN related stats
Add statistics for pending frames in Rx/Tx conf FQs and
number of buffers in pool. Available through ethtool -S.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 10:37:22 -08:00
Heiner Kallweit 33f18c96af net: ethernet: don't set phylib state CHANGELINK in drivers
After phy_start() phylib takes care of all needed actions, including
aneg settings and checking link state. There's no need to set state
PHY_CHANGELINK in drivers.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 22:07:20 -08:00
Colin Ian King f7db2beb4c vxge: ensure data0 is initialized in when fetching firmware version information
Currently variable data0 is not being initialized so a garbage value is
being passed to vxge_hw_vpath_fw_api and this value is being written to
the rts_access_steer_data0 register.  There are other occurrances where
data0 is being initialized to zero (e.g. in function
vxge_hw_upgrade_read_version) so I think it makes sense to ensure data0
is initialized likewise to 0.

Detected by CoverityScan, CID#140696 ("Uninitialized scalar variable")

Fixes: 8424e00dfd ("vxge: serialize access to steering control register")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 22:00:40 -08:00
Bryan Whitehead 0db7d253e9 lan743x: Expand phy search for LAN7431
The LAN7431 uses an external phy, and it can be found anywhere in
the phy address space. This patch uses phy address 1 for LAN7430
only. And searches all addresses otherwise.

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 21:35:56 -08:00
David S. Miller 6c86bc2342 mlx5-uplink-rep-2018-12-15
Or Gerlitz says:
 
 This series is essentially a cleanup to align with the rest of the NIC
 switchdev drivers and make us
 more robust and clear/n: currently the PF netdev serves as the mlx5
 e-switch uplink netdev
 representor when going into switchdev mode and back as plain NIC
 netdev when going out.
 This causes some irregularities and misc troubles.
 
 Move to use dedicated uplink rep, as we have for the VF vports.
 
 The uplink rep netdev does has sysfs link and supports the sriov vf
 mac ndo, these two are in
 use by libvirt and other orchestrators, It also has richer ethtool
 support to allow controlling the
 port link & mtu along with supporting dcb and plugging into the mlx5
 lag logic.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcF/MBAAoJEEg/ir3gV/o+apkIAKVeB/ha8qiY+9ntA9B8HVxO
 cYb+fv4+DuYjTg/nN5QdYSAisN4SwFDA5ukz/ckr6XwV65EgTjAtlon/fHg/rDyz
 YT3dJRP9yjNXKxRvxa0K4PmTasmXGXzaJzsXYemHsfCDdnkNknZsqA+yoNe+2c40
 HoNsdy/Pvr18PjzYtpX0BiuSjH623daBmdFH0o+rwuwsApkAUlBJfosTg/7xyU7v
 TAjy8S1HCloJ7OAjOlzXVmYr65sAlQoFr860vLDQ7Og7JemxjSXQ+AdNb6mSAvZ5
 LBEvCA3G5tb4OkFX+1i/alxOcuG8/FoO+O3ZQWve1DNxoS76po851bgHXIdZCPg=
 =pqTq
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-uplink-rep-2018-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed:

====================
mlx5-uplink-rep-2018-12-15

Or Gerlitz says:

This series is essentially a cleanup to align with the rest of the NIC
switchdev drivers and make us
more robust and clear/n: currently the PF netdev serves as the mlx5
e-switch uplink netdev
representor when going into switchdev mode and back as plain NIC
netdev when going out.
This causes some irregularities and misc troubles.

Move to use dedicated uplink rep, as we have for the VF vports.

The uplink rep netdev does has sysfs link and supports the sriov vf
mac ndo, these two are in
use by libvirt and other orchestrators, It also has richer ethtool
support to allow controlling the
port link & mtu along with supporting dcb and plugging into the mlx5
lag logic.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:44:45 -08:00
Anssi Hannula 6e0af29806 net: macb: add missing barriers when reading descriptors
When reading buffer descriptors on RX or on TX completion, an
RX_USED/TX_USED bit is checked first to ensure that the descriptors have
been populated, i.e. the ownership has been transferred. However, there
are no memory barriers to ensure that the data protected by the
RX_USED/TX_USED bit is up-to-date with respect to that bit.

Specifically:

- TX timestamp descriptors may be loaded before ctrl is loaded for the
  TX_USED check, which is racy as the descriptors may be updated between
  the loads, causing old timestamp descriptor data to be used.

- RX ctrl may be loaded before addr is loaded for the RX_USED check,
  which is racy as a new frame may be written between the loads, causing
  old ctrl descriptor data to be used.
  This issue exists for both macb_rx() and gem_rx() variants.

Fix the races by adding DMA read memory barriers on those paths and
reordering the reads in macb_rx().

I have not observed any actual problems in practice caused by these
being missing, though.

Tested on a ZynqMP based system.

Fixes: 89e5785fc8 ("[PATCH] Atmel MACB ethernet driver")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:17:48 -08:00
Anssi Hannula 8159ecab0d net: macb: fix dropped RX frames due to a race
Bit RX_USED set to 0 in the address field allows the controller to write
data to the receive buffer descriptor.

The driver does not ensure the ctrl field is ready (cleared) when the
controller sees the RX_USED=0 written by the driver. The ctrl field might
only be cleared after the controller has already updated it according to
a newly received frame, causing the frame to be discarded in gem_rx() due
to unexpected ctrl field contents.

A message is logged when the above scenario occurs:

  macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor

Fix the issue by ensuring that when the controller sees RX_USED=0 the
ctrl field is already cleared.

This issue was observed on a ZynqMP based system.

Fixes: 4df95131ea ("net/macb: change RX path for GEM")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:17:48 -08:00
Anssi Hannula e100a897bf net: macb: fix random memory corruption on RX with 64-bit DMA
64-bit DMA addresses are split in upper and lower halves that are
written in separate fields on GEM. For RX, bit 0 of the address is used
as the ownership bit (RX_USED). When the RX_USED bit is unset the
controller is allowed to write data to the buffer.

The driver does not guarantee that the controller already sees the upper
half when the RX_USED bit is cleared, possibly resulting in the
controller writing an incoming frame to an address with an incorrect
upper half and therefore possibly corrupting unrelated system memory.

Fix that by adding the necessary DMA memory barrier between the writes.

This corruption was observed on a ZynqMP based system.

Fixes: fff8019a08 ("net: macb: Add 64 bit addressing support for GEM")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Acked-by: Harini Katakam <harini.katakam@xilinx.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 16:17:48 -08:00
Claudiu Beznea 4298388574 net: macb: restart tx after tx used bit read
On some platforms (currently detected only on SAMA5D4) TX might stuck
even the pachets are still present in DMA memories and TX start was
issued for them. This happens due to race condition between MACB driver
updating next TX buffer descriptor to be used and IP reading the same
descriptor. In such a case, the "TX USED BIT READ" interrupt is asserted.
GEM/MACB user guide specifies that if a "TX USED BIT READ" interrupt
is asserted TX must be restarted. Restart TX if used bit is read and
packets are present in software TX queue. Packets are removed from software
TX queue if TX was successful for them (see macb_tx_interrupt()).

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:57:07 -08:00
YueHaibing 8937388acb qlcnic: remove set but not used variables 'op, cmd_op'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1070:5: warning:
 variable 'op' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1342:5: warning:
 variable 'cmd_op' set but not used [-Wunused-but-set-variable]

'op' never used since introduction in commit 7cb03b2347 ("qlcnic:
Support VF-PF communication channel commands.")
'cmd_op' not used since commit 6226204bcf ("qlcnic: Fix operation
type and command type.")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:49:46 -08:00
Dan Carpenter b26322d2ac net: stmmac: Fix an error code in probe()
The function should return an error if create_singlethread_workqueue()
fails.

Fixes: 34877a15f7 ("net: stmmac: Rework and fix TX Timeout code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:45:28 -08:00
Dan Carpenter f07d427689 qed: Fix an error code qed_ll2_start_xmit()
We accidentally deleted the code to set "rc = -ENOMEM;" and this patch
adds it back.

Fixes: d2201a2159 ("qed: No need for LL2 frags indication")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:43:44 -08:00
Heiner Kallweit 2429f13870 net: fec: remove workaround to restart phylib state machine on MDIO timeout
There's a workaround to restart the phylib state machine in case of a
MDIO access timeout. Seems it was introduced to deal with the
consequences of a too small MDIO timeout. See also commit message of
c3b084c24c ("net: fec: Adjust ENET MDIO timeouts") which increased
the timeout value later. Due to the later timeout value fix it seems
to be safe to remove the workaround.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 15:01:55 -08:00
Antoine Tenart 0067917720 net: mvpp2: 10G modes aren't supported on all ports
The mvpp2_phylink_validate() function sets all modes that are
supported by a given PPv2 port. A recent change made all ports to
advertise they support 10G modes in certain cases. This is not true,
as only the port #0 can do so. This patch fixes it.

Fixes: 01b3fd5ac9 ("net: mvpp2: fix detection of 10G SFP modules")
Cc: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 14:48:15 -08:00
Yunsheng Lin af854724e5 net: hns3: fix a SSU buffer checking bug
When caculating the SSU buffer, it first allocate tx and
rx private buffer, then the remaining buffer is for rx
shared buffer. The remaining buffer size should be at
least bigger than or equal to the shared_std, which is the
minimum shared buffer size required by the driver, but
currently if the remaining buffer size is equal to the
shared_std, it returns failure, which causes SSU buffer
allocation failure problem.

This patch fixes this problem by rounding up shared_std before
checking the the remaining buffer size bigger than or equal to
the shared_std.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Yunsheng Lin b9a400ac29 net: hns3: aligning buffer size in SSU to 256 bytes
The hardware expects the buffer size set to SSU is aligned to
256 bytes, this patch aligns the buffer size to 256 byte using
roundup or rounddown function.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Yunsheng Lin 368686be23 net: hns3: getting tx and dv buffer size through firmware
This patch adds support of getting tx and dv buffer size through
firmware, because different version of hardware requires different
size of tx and dv buffer.

This patch also add dv_buf_size to tc' private buffer size even if
pfc is not enable for the tc.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Peng Li 0ad5ea5dbd net: hns3: synchronize speed and duplex from phy when phy link up
Driver calls phy_connect_direct and registers hclge_mac_adjust_link
to synchronize mac speed and duplex from phy. It is better to
synchronize mac speed and duplex from phy when phy link up.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Fuyun Liang 8362089d78 net: hns3: remove 1000M/half support of phy
Our phy does not support 1000M/half, this patch removes 1000M/half from
PHY_SUPPORTED_FEATURES.

Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Peng Li 7445565cd0 net: hns3: update coalesce param per second
coalesce param updates every 100 napi times, it may update a little
late if ping test after a high rate flow, may over napi poll is called
100 times as ping test sends packets every second.

This patch updates coalesce param every second, instead with every
100 napi times. It can not update the param 100% in time, but the
lag time is very short.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Huazhong Tan ae6017a711 net: hns3: fix incomplete uninitialization of IRQ in the hns3_nic_uninit_vector_data()
In the hns3_nic_uninit_vector_data(), the procedure of uninitializing
the tqp_vector's IRQ has not set affinity_notify to NULL and changes
its init flag. This patch fixes it. And for simplificaton, local
variable tqp_vector is used instead of priv->tqp_vector[i].

Fixes: 424eb834a9 ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Huazhong Tan b51c366df7 net: hns3: remove unnecessary configuration recapture while resetting
When doing reset, it is unnecessary to get the hardware's default
configuration again, otherwise, the user's configuration will be
overwritten.

Fixes: 4ed340ab8f ("net: hns3: Add reset process in hclge_main")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Huazhong Tan b644a8d4cb net: hns3: update some variables while hclge_reset()/hclgevf_reset() done
When hclge_reset() completes successfully, it should update the
last_reset_time, set reset_fail_cnt to 0, and set reset_type of
hnae3_ae_dev to HNAE3_NONE_RESET.

Also when hclgevf_reset() completes successfully, it should update
the last_reset_time, and set reset_type of hnae3_ae_dev to
HNAE3_NONE_RESET.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Huazhong Tan 531eba0fe2 net: hns3: fix napi_disable not return problem
While doing DOWN, the calling of napi_disable() may not return, since the
napi_complete() in the hns3_nic_common_poll() will never be called when
HNS3_NIC_STATE_DOWN is set. So we need to call napi_complete() before
checking HNS3_NIC_STETE_DOWN.

Fixes: ff0699e04b ("net: hns3: stop napi polling when HNS3_NIC_STATE_DOWN is set")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Huazhong Tan e3338205f0 net: hns3: uninitialize pci in the hclgevf_uninit
In the hclgevf_pci_reset(), it only uninitialize and initialize
the msi, so if the initialization fails, hclgevf_uninit_hdev()
does not need to uninitialize the msi, but needs to uninitialize
the pci, otherwise it will cause pci resource not free.

Fixes: 862d969a3a ("net: hns3: do VF's pci re-initialization while PF doing FLR")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:01 -08:00
Huazhong Tan cda69d2445 net: hns3: fix error handling int the hns3_get_vector_ring_chain
When hns3_get_vector_ring_chain() failed in the
hns3_nic_init_vector_data(), it should do the error handling instead
of return directly.

Also, cur_chain should be freed instead of chain and head->next should
be set to NULL in error handling of hns3_get_vector_ring_chain.

This patch fixes them.

Fixes: 73b907a083 ("net: hns3: bugfix for buffer not free problem during resetting")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 12:01:00 -08:00
Ido Schimmel 5edb7e8bd5 mlxsw: spectrum_nve: Fix memory leak upon driver reload
The pointer was NULLed before freeing the memory, resulting in a memory
leak. Trace from kmemleak:

unreferenced object 0xffff88820ae36528 (size 512):
  comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330
    [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0
    [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690
    [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260
    [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360
    [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220
    [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0
    [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150
    [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180
    [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0
    [<00000000efc4eae8>] genl_rcv+0x2d/0x40
    [<00000000ea645603>] netlink_unicast+0x52f/0x740
    [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50
    [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120
    [<00000000d85795a9>] __sys_sendto+0x397/0x620
    [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0

Fixes: 6e6030bd54 ("mlxsw: spectrum_nve: Implement common NVE core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 09:17:39 -08:00
Ido Schimmel 5d5043917a mlxsw: spectrum: Add trap for decapsulated ARP packets
After a packet was decapsulated it is classified to the relevant FID
based on its VNI and undergoes L2 forwarding.

Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap
decapsulated ARP packets during L2 forwarding and instead can only trap
such packets in the underlay router during decapsulation.

Add this missing packet trap, which is required for VXLAN routing when
the MAC of the target host is not known.

Fixes: b02597d513 ("mlxsw: spectrum: Add NVE packet traps")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 09:17:38 -08:00
Shalom Toledo cf0b70e71b mlxsw: core: Increase timeout during firmware flash process
During the firmware flash process, some of the EMADs get timed out, which
causes the driver to send them again with a limit of 5 retries. There are
some situations in which 5 retries is not enough and the EMAD access fails.
If the failed EMAD was related to the flashing process, the driver fails
the flashing.

The reason for these timeouts during firmware flashing is cache misses in
the CPU running the firmware. In case the CPU needs to fetch instructions
from the flash when a firmware is flashed, it needs to wait for the
flashing to complete. Since flashing takes time, it is possible for pending
EMADs to timeout.

Fix by increasing EMADs' timeout while flashing firmware.

Fixes: ce6ef68f43 ("mlxsw: spectrum: Implement the ethtool flash_device callback")
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18 09:17:38 -08:00
John Hurley b12c97d45c nfp: flower: fix cb_ident duplicate in indirect block register
Previously the identifier used for indirect block callback registry and
for block rule cb registry (when done via indirect blocks) was the pointer
to the netdev we were interested in receiving updates on. This worked fine
if a single app existed that registered one callback per netdev of
interest. However, if multiple cards are in place and, in turn, multiple
apps, then each app may register the same callback with the same
identifier to both the netdev's indirect block cb list and to a block's cb
list. This can lead to EEXIST errors and/or incorrect cb deletions.

Prevent this conflict by using the app pointer as the identifier for
netdev indirect block cb registry, allowing each app to register a unique
callback per netdev. For block cb registry, the same app may register
multiple cbs to the same block if using TC shared blocks. Instead of the
app, use the pointer to the allocated cb_priv data as the identifier here.
This means that there can be a unique block callback for each app/netdev
combo.

Fixes: 3166dd07a9 ("nfp: flower: offload tunnel decap rules via indirect TC blocks")
Reported-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:34:12 -08:00
Shalom Toledo d1675a1602 mlxsw: spectrum: Update the supported firmware to version 13.1910.622
This new firmware contains:
 * New packet traps for discarded packets
 * Secure firmware flash bug fix
 * Fence mechanism bug fix
 * TCAM RMA bug fix

Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:32:29 -08:00
Vasundhara Volam 56d3746247 bnxt_en: query force speeds before disabling autoneg mode.
With autoneg enabled, PHY loopback test fails. To disable autoneg,
driver needs to send a valid forced speed to FW. FW is not sending
async event for invalid speeds. To fix this, query forced speeds
and send the correct speed when disabling autoneg mode.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:08:53 -08:00
Michael Chan fd3ab1c70e bnxt_en: Do not free port statistics buffer when device is down.
Port statistics which include RDMA counters are useful even when the
netdevice is down.  Do not free the port statistics DMA buffers
when the netdevice is down.  This is keep the snapshot of the port
statistics and counters will just continue counting when the
netdevice goes back up.

Split the bnxt_free_stats() function into 2 functions.  The port
statistics buffers will only be freed when the netdevice is
removed.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:08:53 -08:00
Michael Chan b8875ca356 bnxt_en: Save ring statistics before reset.
With the current driver, the statistics reported by .ndo_get_stats64()
are reset when the device goes down.  Store a snapshot of the
rtnl_link_stats64 before shutdown.  This snapshot is added to the
current counters in .ndo_get_stats64() so that the counters will not
get reset when the device is down.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:08:53 -08:00
Vasundhara Volam 7c675421af bnxt_en: Return linux standard errors in bnxt_ethtool.c
Currently firmware specific errors are returned directly in flash_device
and reset ethtool hooks. Modify it to return linux standard errors
to userspace when flashing operations fail.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:08:53 -08:00
Michael Chan 24654f095e bnxt_en: Don't set ETS on unused TCs.
Currently, the code allows ETS bandwidth weight 0 to be set on unused TCs.
We should not set any DCB parameters on unused TCs at all.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:08:53 -08:00
Michael Chan e37fed7903 bnxt_en: Add ethtool -S priority counters.
Display the CoS counters as additional priority counters by looking up
the priority to CoS queue mapping.  If the TX extended port statistics
block size returned by firmware is big enough to cover the CoS counters,
then we will display the new priority counters.  We call firmware to get
the up-to-date pri2cos mapping to convert the CoS counters to
priority counters.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:08:53 -08:00
Michael Chan b16b689186 bnxt_en: Add SR-IOV support for 57500 chips.
There are some minor differences when assigning VF resources on the
new chips.  The MSIX (NQ) resource has to be assigned and ring group
is not needed on the new chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-17 23:08:53 -08:00