Commit Graph

636833 Commits

Author SHA1 Message Date
Larry Finger c38af3f06a rtlwifi: rtl8192cu: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:25 +02:00
Larry Finger b8c79f4548 rtlwifi: rtl8192de: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:24 +02:00
Larry Finger 2d15acac23 rtlwifi: rtl8192se: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:24 +02:00
Larry Finger c7532b8765 rtlwifi: rtl8723-common: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:23 +02:00
Larry Finger a44f59d603 rtlwifi: rtl8192ee: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:22 +02:00
Larry Finger a67005bc46 rtlwifi: rtl8723ae: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:22 +02:00
Larry Finger 4e2b4378f9 rtlwifi: rtl8723be: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:21 +02:00
Larry Finger 004a1e1679 rtlwifi: rtl8821ae: Remove all instances of DBG_EMERG
This is a step toward eliminating the RT_TRACE macros. Those calls that
have DBG_EMERG as the level are always logged, and they represent error
conditions, thus they are replaced with pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:20 +02:00
Larry Finger b03d968b66 rtlwifi: Remove RT_TRACE messages that use DBG_EMERG
These messages are always logged and represent error conditions, thus
we can use pr_err().

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:20 +02:00
Larry Finger 531940f964 rtlwifi: Replace local debug macro RT_ASSERT
This macro can be replaced with WARN_ONCE. In addition to using a
standard debugging macro for these critical errors, we also get
a stack dump.

In rtl8821ae/hw.c, a senseless comment was removed, and an incorrect
indentation was fixed.

This patch also fixes two places in each of rtl8192ee, rtl8723be,
and rtl8821ae where the logical condition was incorrect.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 15:54:19 +02:00
Stanislaw Gruszka c7d1c77781 rt2x00: add mutex to synchronize config and link tuner
Do not perform mac80211 config and link_tuner at the same time,
this can possibly result in wrong RF or BBP configuration.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 14:03:41 +02:00
Stanislaw Gruszka 31369c323b rt2800: replace msleep() with usleep_range() on channel switch
msleep(1) can sleep much more time then requested 1ms, this is not good
on channel switch, which we want to be performed fast (i.e. to make scan
faster). Replace msleep() with usleep_range(), which has much smaller
maximum sleeping time boundary.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:21 +02:00
Stanislaw Gruszka eb79a8fe94 rt2800: replace mdelay by usleep on vco calibration.
This procedure can sleep, hence mdelay is not needed.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:21 +02:00
Stanislaw Gruszka d96324703f rt2x00: merge agc and vco works with link tuner
We need to perform different actions (AGC and VCO calibrations and VGC
tuning) periodically at different intervals. We don't need separate
works for those, we can use link tuner work and just check for proper
interval on it.

This fixes performing AGC and VCO calibration when scanning on STA
mode. We need to be on-channel to perform those calibrations.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:20 +02:00
Stanislaw Gruszka 24d42ef3b1 rt2800: perform VCO recalibration for RF5592 chip
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:20 +02:00
Stanislaw Gruszka bc00770539 rt2800: warn if doing VCO recalibration for unknow RF chip
Since we reset TX_PIN_CFG register, we have to finish recalibration.
Warn if this is not possible.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:19 +02:00
Stanislaw Gruszka 8845254112 rt2800: rename adjust_freq_offset function
We have different modes of adjusting freq offset on different chips.
Call current adjustment similarly like vendor driver - mode1 .

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:18 +02:00
Stanislaw Gruszka 8f03a7c6e7 rt2800: set MAX_PSDU len according to remote STAs capabilities
MAX_LEN_CFG_MAX_PSDU specify maximum transmitted by HW AMPDU length
(0 - 8kB, 1 - 16kB, 2 - 32kB, 3 - 64kB). Set this option according to
remote stations capabilities (based on HT ampdu_factor). However limit
the value based our hardware TX capabilities as some chips can not send
more than 16kB (factor 1). Limit for all chips is currently 32kB
(factor 2), but perhaps for some chips this could be increased
to 64kB by setting drv_data->max_psdu to 3.

Since MAX_LEN_CFG_MAX_PSDU is global setting, on multi stations modes
(AP, IBSS, mesh) we limit according to less capable remote STA. We can
not set bigger value to speed up communication with some stations and
do not break communication with slow stations.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:18 +02:00
Stanislaw Gruszka a51b89698c rt2800: set minimum MPDU and PSDU lengths to sane values
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:17 +02:00
Stanislaw Gruszka e49abb19d1 rt2800: don't set ht parameters for non-aggregated frames
Do not set ampdu_density and ba_size for frames without AMPDU bit i.e.
frames that will not be aggregated to AMPDU.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:16 +02:00
Stanislaw Gruszka a08b98196a rt2800: make rx ampdu_factor depend on number of rx chains
Initalize max ampdu_factor supported by us based on rx chains, vendor
driver do the same.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:16 +02:00
Andrew Lutomirski 1fef293b8a orinoco: Use shash instead of ahash for MIC calculations
Eric Biggers pointed out that the orinoco driver pointed scatterlists
at the stack.

Fix it by switching from ahash to shash.  The result should be
simpler, faster, and more correct.

Cc: stable@vger.kernel.org # 4.9 only
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:34:15 +02:00
Dan Carpenter c705a6b3aa adm80211: return an error if adm8211_alloc_rings() fails
We accidentally return success when adm8211_alloc_rings() fails but we
should preserve the error code.

Fixes: cc0b88cf5e ("[PATCH] Add adm8211 802.11b wireless driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:29:21 +02:00
Johannes Berg ae3cf47645 iwlegacy: make il3945_mac_ops __ro_after_init
There's no need for this to be only __read_mostly, since
it's only written in a single way depending on the module
parameter, so that can be moved into the module's __init
function, and the ops can be __ro_after_init.

This is a little bit safer since it means the ops can't
be overwritten (accidentally or otherwise), which would
otherwise cause an arbitrary function or bad pointer to
be called.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:28:40 +02:00
Amitkumar Karwar d7864cf212 mwifiex: Enable dynamic bandwidth signalling
Enable dynamic bandwidth signalling by setting the corresponding
bit in MAC control register.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:25:58 +02:00
Amitkumar Karwar b82dd3bdf1 mwifiex: change width of MAC control variable
Firmware has started making use of reserved field.
Accordingly change curr_pkt_filter from u16 to u32.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:25:58 +02:00
Amitkumar Karwar 74c8719b8e mwifiex: sdio: fix use after free issue for save_adapter
If we have sdio work requests received when sdio card reset is
happening, we may end up accessing older save_adapter pointer
later which is already freed during card reset.
This patch solves the problem by cancelling those pending requests.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-30 13:19:40 +02:00
Alexey Khoroshilov d15697de60 adm80211: add checks for dma mapping errors
The driver does not check if mapping dma memory succeed.
The patch adds the checks and failure handling.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-05 13:10:53 +02:00
Dan Carpenter e54a8c4b57 mwifiex: clean up some messy indenting
These lines were indented one tab extra.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-12-05 13:10:11 +02:00
Erik Nordmark adc176c547 ipv6 addrconf: Implemented enhanced DAD (RFC7527)
Implemented RFC7527 Enhanced DAD.
IPv6 duplicate address detection can fail if there is some temporary
loopback of Ethernet frames. RFC7527 solves this by including a random
nonce in the NS messages used for DAD, and if an NS is received with the
same nonce it is assumed to be a looped back DAD probe and is ignored.
RFC7527 is enabled by default. Can be disabled by setting both of
conf/{all,interface}/enhanced_dad to zero.

Signed-off-by: Erik Nordmark <nordmark@arista.com>
Signed-off-by: Bob Gilligan <gilligan@arista.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:21:37 -05:00
David S. Miller ce84c7c663 Merge branch 'mv88e6390-batch-three'
Andrew Lunn says:

====================
mv88e6390 batch 3

More patches to support the MV88e6390. This is mostly refactoring
existing code and adding implementations for the mv88e6390.  This
patchset set which reserved frames are sent to the cpu, the size of
jumbo frames that will be accepted, turn off egress rate limiting, and
configuration of pause frames.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:18:39 -05:00
Andrew Lunn 3ce0e65eb6 net: dsa: mv88e6xxx: Implement mv88e6390 pause control
The mv88e6390 has a number flow control registers accessed via the
Flow Control register. Use these to set the pause control.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:18:39 -05:00
Andrew Lunn b35d322a1d net: dsa: mv88e6xxx: Refactor pause configuration
The mv88e6390 has a different mechanism for configuring pause.
Refactor the code into an ops function, and for the moment, don't add
any mv88e6390 code yet.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:18:39 -05:00
Andrew Lunn ef70b1119e net: dsa: mv88e6xxx: Refactor egress rate limiting
There are two different rate limiting configurations, depending on the
switch generation. Refactor this into ops.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:18:38 -05:00
Andrew Lunn 5f4366660d net: dsa: mv88e6xxx: Refactor setting of jumbo frames
Some switches support jumbo frames. Refactor this code into operations
in the ops structure.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:18:38 -05:00
Andrew Lunn 6e55f69846 net: dsa: mv88e6xxx: Reserved Management frames to CPU
Older devices have a couple of registers in global2. The mv88e6390
family has a single register in global1 behind which hides similar
configuration. Implement and op for this.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:18:38 -05:00
David S. Miller 7a6c5cb960 Merge branch 'mv88e6390-batch-two'
Andrew Lunn says:

====================
MV88E6390 batch two

This is the second batch of patches adding support for the
MV88e6390. They are not sufficient to make it work properly.

The mv88e6390 has a much expanded set of priority maps. Refactor the
existing code, and implement basic support for the new device.

Similarly, the monitor control register has been reworked.

The mv88e6390 has something odd in its EDSA tagging implementation,
which means it is not possible to use it. So we need to use DSA
tagging. This is the first device with EDSA support where we need to
use DSA, and the code does not support this. So two patches refactor
the existing code. The two different register definitions are
separated out, and using DSA on an EDSA capable device is added.

v2:
Add port prefix
Add helper function for 6390
Add _IEEE_ into #defines
Split monitor_ctrl into a number of separate ops.
Remove 6390 code which is management, used in a later patch
s/EGREES/EGRESS/.
Broke up setup_port_dsa() and set_port_dsa() into a number of ops

v3:
Verify mandatory ops for port setup
Don't set ether type for DSA port.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:15:01 -05:00
Andrew Lunn 56995cbc35 net: dsa: mv88e6xxx: Refactor CPU and DSA port setup
Older chips only support DSA tagging. Newer chips have both DSA and
EDSA tagging. Refactor the code by adding port functions for setting the
frame mode, egress mode, and if to forward unknown frames.

This results in the helper mv88e6xxx_6065_family() becoming unused, so
remove it.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
v3:
Verify mandatory ops for port setup
Don't set ether type for DSA port.
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:15:00 -05:00
Andrew Lunn 443d5a1b7d net: dsa: mv88e6xxx: Move the tagging protocol into info
Older chips support a single tagging protocol, DSA. New chips support
both DSA and EDSA, an enhanced version. Having both as an option
changes the register layouts. Up until now, it has been assumed that
if EDSA is supported, it will be used. Hence the register layout has
been determined by which protocol should be used. However, mv88e6390
has a different implementation of EDSA, which requires we need to use
the DSA tagging. Hence separate the selection of the protocol from the
register layout.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:15:00 -05:00
Andrew Lunn 33641994a6 net: dsa: mv88e6xxx: Monitor and Management tables
The mv88e6390 changes the monitor control register into the Monitor
and Management control, which is an indirection register to various
registers.

Add ops to set the CPU port and the ingress/egress port for both
register layouts, to global1

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:15:00 -05:00
Andrew Lunn ef0a731882 net: dsa: mv88e6xxx: Implement mv88e6390 tag remap
The mv88e6390 does not have the two registers to set the frame
priority map. Instead it has an indirection registers for setting a
number of different priority maps. Refactor the old code into an
function, implement the mv88e6390 version, and use an op to call the
right one.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 23:15:00 -05:00
David S. Miller 69248719d0 Merge branch 'fib-notifier-event-replay'
Jiri Pirko says:

====================
ipv4: fib: Replay events when registering FIB notifier

Ido says:

In kernel 4.9 the switchdev-specific FIB offload mechanism was replaced
by a new FIB notification chain to which modules could register in order
to be notified about the addition and deletion of FIB entries. The
motivation for this change was that switchdev drivers need to be able to
reflect the entire FIB table and not only FIBs configured on top of the
port netdevs themselves. This is useful in case of in-band management.

The fundamental problem with this approach is that upon registration
listeners lose all the information previously sent in the chain and
thus have an incomplete view of the FIB tables, which can result in
packet loss. This patchset fixes that by dumping the FIB tables and
replaying notifications previously sent in the chain for the registered
notification block.

The entire dump process is done under RCU and thus the FIB notification
chain is converted to be atomic. The listeners are modified accordingly.
This is done in the first eight patches.

The ninth patch adds a change sequence counter to ensure the integrity
of the FIB dump. The last patch adds the dump itself to the FIB chain
registration function and modifies existing listeners to pass a callback
to be executed in case dump was inconsistent.

---
v3->v4:
- Register the notification block after the dump and protect it using
  the change sequence counter (Hannes Frederic Sowa).
- Since we now integrate the dump into the registration function, drop
  the sysctl to set maximum number of retries and instead set it to a
  fixed number. Lets see if it's really a problem before adding something
  we can never remove.
- For the same reason, dump FIB tables for all net namespaces.
- Add a comment regarding guarantees provided by mutex semantics.

v2->v3:
- Add sysctl to set the number of FIB dump retries (Hannes Frederic Sowa).
- Read the sequence counter under RTNL to ensure synchronization
  between the dump process and other processes changing the routing
  tables (Hannes Frederic Sowa).
- Pass a callback to the dump function to be executed prior to a retry.
- Limit the dump to a single net namespace.

v1->v2:
- Add a sequence counter to ensure the integrity of the FIB dump
  (David S. Miller, Hannes Frederic Sowa).
- Protect notifications from re-ordering in listeners by using an
  ordered workqueue (Hannes Frederic Sowa).
- Introduce fib_info_hold() (Jiri Pirko).
- Relieve rocker from the need to invoke the FIB dump by registering
  to the FIB notification chain prior to ports creation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:37 -05:00
Ido Schimmel c3852ef7f2 ipv4: fib: Replay events when registering FIB notifier
Commit b90eb75494 ("fib: introduce FIB notification infrastructure")
introduced a new notification chain to notify listeners (f.e., switchdev
drivers) about addition and deletion of routes.

However, upon registration to the chain the FIB tables can already be
populated, which means potential listeners will have an incomplete view
of the tables.

Solve that by dumping the FIB tables and replaying the events to the
passed notification block. The dump itself is done using RCU in order
not to starve consumers that need RTNL to make progress.

The integrity of the dump is ensured by reading the FIB change sequence
counter before and after the dump under RTNL. This allows us to avoid
the problematic situation in which the dumping process sends a ENTRY_ADD
notification following ENTRY_DEL generated by another process holding
RTNL.

Callers of the registration function may pass a callback that is
executed in case the dump was inconsistent with current FIB tables.

The number of retries until a consistent dump is achieved is set to a
fixed number to prevent callers from looping for long periods of time.
In case current limit proves to be problematic in the future, it can be
easily converted to be configurable using a sysctl.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00
Ido Schimmel cacaad11f4 ipv4: fib: Allow for consistent FIB dumping
The next patch will enable listeners of the FIB notification chain to
request a dump of the FIB tables. However, since RTNL isn't taken during
the dump, it's possible for the FIB tables to change mid-dump, which
will result in inconsistency between the listener's table and the
kernel's.

Allow listeners to know about changes that occurred mid-dump, by adding
a change sequence counter to each net namespace. The counter is
incremented just before a notification is sent in the FIB chain.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00
Ido Schimmel d3f706f68e ipv4: fib: Convert FIB notification chain to be atomic
In order not to hold RTNL for long periods of time we're going to dump
the FIB tables using RCU.

Convert the FIB notification chain to be atomic, as we can't block in
RCU critical sections.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00
Ido Schimmel 17f8be7daf rocker: Register FIB notifier before creating ports
We can miss FIB notifications sent between the time the ports were
created and the FIB notification block registered.

Instead of receiving these notifications only when they are replayed for
the FIB notification block during registration, just register the
notification block before the ports are created.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00
Ido Schimmel db7019557c rocker: Implement FIB offload in deferred work
Convert rocker to offload FIBs in deferred work in a similar fashion to
mlxsw, which was converted in the previous commits.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00
Ido Schimmel c1bb279cfa rocker: Create an ordered workqueue for FIB offload
As explained in the previous commits, we need to process FIB entries
addition / deletion events in FIFO order or otherwise we can have a
mismatch between the kernel's FIB table and the device's.

Create an ordered workqueue for rocker to which these work items will be
submitted to.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00
Ido Schimmel 3057224e01 mlxsw: spectrum_router: Implement FIB offload in deferred work
FIB offload is currently done in process context with RTNL held, but
we're about to dump the FIB tables in RCU critical section, so we can no
longer sleep.

Instead, defer the operation to process context using deferred work. Make
sure fib info isn't freed while the work is queued by taking a reference
on it and releasing it after the operation is done.

Deferring the operation is valid because the upper layers always assume
the operation was successful. If it's not, then the driver-specific
abort mechanism is called and all routed traffic is directed to slow
path.

The work items are submitted to an ordered workqueue to prevent a
mismatch between the kernel's FIB table and the device's.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00
Ido Schimmel a3832b3189 mlxsw: core: Create an ordered workqueue for FIB offload
We're going to start processing FIB entries addition / deletion events
in deferred work. These work items must be processed in the order they
were submitted or otherwise we can have differences between the kernel's
FIB table and the device's.

Solve this by creating an ordered workqueue to which these work items
will be submitted to. Note that we can't simply convert the current
workqueue to be ordered, as EMADs re-transmissions are also processed in
deferred work.

Later on, we can migrate other work items to this workqueue, such as FDB
notification processing and nexthop resolution, since they all take the
same lock anyway.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 19:29:35 -05:00