Commit Graph

378407 Commits

Author SHA1 Message Date
Nicolas Dichtel 5e6700b3bf sit: add support of x-netns
This patch allows to switch the netns when packet is encapsulated or
decapsulated. In other word, the encapsulated packet is received in a netns,
where the lookup is done to find the tunnel. Once the tunnel is found, the
packet is decapsulated and injecting into the corresponding interface which
stands to another netns.

When one of the two netns is removed, the tunnel is destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-27 22:30:47 -07:00
Nicolas Dichtel 621e84d6f3 dev: introduce skb_scrub_packet()
The goal of this new function is to perform all needed cleanup before sending
an skb into another netns.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-27 22:29:05 -07:00
John W. Linville 0f817ed52d ath10k: minimally handle new channel width enumeration values
CC      drivers/net/wireless/ath/ath10k/mac.o
drivers/net/wireless/ath/ath10k/mac.c: In function ‘chan_to_phymode’:
drivers/net/wireless/ath/ath10k/mac.c:229:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_5’ not handled in switch [-Wswitch]
drivers/net/wireless/ath/ath10k/mac.c:229:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_10’ not handled in switch [-Wswitch]
drivers/net/wireless/ath/ath10k/mac.c:247:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_5’ not handled in switch [-Wswitch]
drivers/net/wireless/ath/ath10k/mac.c:247:3: warning: enumeration value ‘NL80211_CHAN_WIDTH_10’ not handled in switch [-Wswitch]

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:50:09 -04:00
Thomas Pedersen 0c9acaa835 ath9k_htc: ifdef out IFTYPE_MESH advertisement
This is needed so the interface combination can still be
validated when CONFIG_MAC80211_MESH is not enabled.
Otherwise wiphy registration fails.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:45:18 -04:00
Arend van Spriel cc1b546394 brcmfmac: remove code and comment for older kernel support
In the code of the receive path some code was dealing with how
things were done in older kernels. Not really needed for an
upstream driver.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:19 -04:00
Arend van Spriel c6a681ab2c brcmfmac: reduce firmware-signalling locking scope in rx path
In the receive path a spinlock is taken upon parsing the TLV signal
header. This moves to locking to the TLV handling functions where
it protects the data structures.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:19 -04:00
Arend van Spriel 80898a1178 brcmfmac: cleanup debug messages in brcmf_fws_hdrpush()
Trivial cleanup of debug messages.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:18 -04:00
Arend van Spriel 456d0685c7 brcmfmac: tag packet in the netdev transmit callback
Transmit packets needs to be tagged in order to receive a tx status
feedback from the firmware. Determine the tag in the netdev transmit
callback instead of determining the tag just before transfer to the
device. This reduces the number of exception flows and hence makes
the driver code simpler.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:18 -04:00
Franky Lin 3b81a68094 brcmfmac: add broken scatter-gather DMA support
DMA engine of some old SDIO host controllers require block size alignment for
data length of each scatterlist item. This patch introduces an intermediate
buffer list to support this kind of platform. It decreases the throughput
because of an extra memcpy in critical data path. So don't turn this on unless
it's necessary.

Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:18 -04:00
Franky Lin 356bae6fb7 brcmfmac: use unified dongle address preparation function
Introduce a unified dongle backplane address preparation function
brcmf_sdio_addrprep to replace duplicate address prep code.

Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:17 -04:00
Franky Lin b058d4d258 brcmfmac: remove SDIO_REQ_ASYNC flag
Remove SDIO_REQ_ASYNC from brcmfmac since it is not being used.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:17 -04:00
Arend van Spriel 2a5d7b026d brcmfmac: remove (ab)use of NL80211_NUM_ACS
Used NL80211_NUM_ACS to indicate the BCMC fifo used in the driver
which has the same value now, but it is a bad idea relying on that.

Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:16 -04:00
Arend van Spriel 2086374658 brcmfmac: simplify transmit path
When getting a transmit packet from the networking layer simply
enqueue the packet unconditional and have it handled by the dequeue
worker. The transfer of the packet to the bus-specific driver part
is now done from one context.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:16 -04:00
Rafał Miłecki 88f9b65d44 bcma: add support for BCM43142
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:16 -04:00
Rafał Miłecki 8960400eee b43: replace B43_BCMA_EXTRA with modparam allhwsupport
This allows enabling support for extra hardware with just a module
param, without kernel/module recompilation.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:15 -04:00
Michal Kazior d847e3e2e4 ath10k: leave MMIC generation to the HW
Apparently HW doesn't require us to generate MMIC
for TKIP suite.

Each frame was 8 bytes longer than it should be
and some APs would drop frames that exceed 1520
bytes of 802.11 payload. This could be observed
during throughput tests or fragmented IP traffic.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:15 -04:00
Michal Kazior 429ff56a4a ath10k: fix 5ghz channel definitions
Nonsense channel flags were being set.

Although it doesn't seem this was visible to the
user the patch makes sure that channel
availability won't be crippled in the future if
ath_common behaviour changes.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:14 -04:00
Michal Kazior 87b1423b71 ath10k: fix MSI-X setup failpath
Irqs were not freed up correctly upon msi-x setup
failure.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:14 -04:00
Gabor Juhos 0847beb286 rt2x00: rt2800lib: fix default TX power check for RT55xx
The code writes the default_power2 value into the TX field
of the RFCSR50 register, however the condition in the if
statement uses default_power1. Due to this, wrong TX power
value might be written into the register.

Use the correct value in the condition to fix the issue.

Compile tested only.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Cc: stable@vger.kernel.org # 3.10
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:13 -04:00
Sujith Manoharan 9a54c17636 ath9k: Add mix tx gain table for AR9462 2.0
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:13 -04:00
Gabor Juhos bb16d4881e rt2x00: rt2800lib: turn on tertiary PAs/LNAs for 3T/3R devices
The 3T/3R devices are using the tertiary PAs/LNAs
however those are never turned on. Fix the code to
turn on those on for such devices.

Also modify the code to use switch statements to
improve readability.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:13 -04:00
Gabor Juhos 3e23d4e8ce rt2x00: rt2800lib: turn on secondary PAs/LNAs for 3T/3R devices
The secondary PAs/LNAs are turned on only for 2T/2R
devices, however these are used for 3T/3R devices as
well. Always turn those on if the device uses more
than one tx/rx chains.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:12 -04:00
Gabor Juhos 51f877ab1a rt2x00: rt2800: increase EEPROM_SIZE to 512 bytes
Ralink 3T chipsets are using a different EEPROM
layout than the others. The EEPROM on these devices
contain more data than the others which does not fit
into 272 byte which the rt2800 driver actually uses.

The Ralink reference driver defines EEPROM_SIZE to
512/1024 bytes for PCI/USB devices respectively.

Increase the EEPROM_SIZE constant to 512 bytes, in
order to make room for EEPROM data of 3T devices.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Acked-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-27 13:42:12 -04:00
Sachin Kamat 1f3e4b0cc4 can: at91_can: Use of_match_ptr()
of_match_ptr() eliminates having an #ifdef returning NULL for the case
when OF is disabled.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-06-27 15:15:42 +02:00
Fabio Estevam baffd2e8d9 ARM: imx: flexcan: Remove platform file
As there are no more users of the flexcan platform file, let's remove it.

Cc: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-06-27 15:15:32 +02:00
Fabio Estevam b7c4114b07 can: flexcan: Use a regulator to control the CAN transceiver
Instead of using a GPIO to turn on/off the CAN transceiver, it is better to
use a regulator as some systems may use a PMIC to power the CAN transceiver.

Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-06-27 15:15:25 +02:00
Marc Kleine-Budde 30caa4b763 ARM: imx: prepare for removal of flexcan_platform_data
As there are no imx in-tree users of flexcan_platform_data, this patch removes
the possibility to register a flexcan device with platform data.

The functionality to swith on/off CAN transceivers is added to DT via
regulators in a later patch.

Compile time tested with imx_v4_v5_defconfig and imx_v6_v7_defconfig.

Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-06-27 15:15:08 +02:00
John W. Linville 59731bb8c4 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next 2013-06-26 20:00:11 -04:00
Chris Healy 38ae92dc21 fec: Add support for reading RMON registers
Add ethtool operation to read RMON registers.

Tested against net-next on i.MX28.

v2: make conditional on #ifndef CONFIG_M5272

Signed-off-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26 15:45:14 -07:00
Hannes Frederic Sowa 77ecaace6c ipv6: rearm router solicitaion timer when setting new tokenized address
When a new tokenized address gets installed we send out just one
router solicition. We should send out `rtr_solicits' in case one router
advertisment got lost.

So, rearm the timer as we do in addrconf_dad_complete.

Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26 15:23:01 -07:00
Nicolas Dichtel 963b89e80d sit: fix 4in4 + IPsec scenario
Since commit 32b8a8e59c "sit: add IPv4 over IPv4 support",
tunnel->parms.iph.protocol is 0 when both 4in4 and 6in4 are setup, but
xfrm_lookup() is called only when proto is != 0, thus we need to pass the real
value.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26 13:42:03 -07:00
David S. Miller a77471ff70 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
Just one patch this time.

1) Drop packets when the matching SA is in larval state and add a
   statistic counter for that. From Fan Du.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26 13:23:13 -07:00
John W. Linville 729d8d182b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2013-06-26 12:01:42 -04:00
JunweiZhang 8b4d14d8eb netns: exclude ipvs from struct net when IPVS disabled
no real problem is fixed, just save a few bytes in
net_namespace structure.

Signed-off-by: JunweiZhang <junwei.zhang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:46 +09:00
JunweiZhang d0667186eb kernel: remove unnecessary head file
ip_vs.h is not necessary for sysctl_binary.c.

prepare for the next patch to avoid compile issue.

Signed-off-by: JunweiZhang <junwei.zhang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:46 +09:00
Julian Anastasov 4d0c875dcc ipvs: add sync_persist_mode flag
Add sync_persist_mode flag to reduce sync traffic
by syncing only persistent templates.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:46 +09:00
Alexander Frolkin eba3b5a787 ipvs: SH fallback and L4 hashing
By default the SH scheduler rejects connections that are hashed onto a
realserver of weight 0.  This patch adds a flag to make SH choose a
different realserver in this case, instead of rejecting the connection.

The patch also adds a flag to make SH include the source port (TCP, UDP,
SCTP) in the hash as well as the source address.  This basically allows
for deterministic round-robin load balancing (i.e., where any director
in a cluster of directors with identical config will send the same
packet the same way).

The flags are service flags (IP_VS_SVC_F_SCHED*) so that these options
can be set per service.  They are set using a new option to ipvsadm.

Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:46 +09:00
Julian Anastasov acaac5d8bb ipvs: drop SCTP connections depending on state
Drop SCTP connections under load (dropentry context) depending
on the protocol state, just like for TCP: INIT conns are
dropped immediately, established are dropped randomly while
connections in progress or shutdown are skipped.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:46 +09:00
Julian Anastasov 61e7c420b4 ipvs: replace the SCTP state machine
Convert the SCTP state table, so that it is more readable.
Change the states to be according to the diagram in RFC 2960
and add more states suitable for middle box. Still, such
change in states adds incompatibility if systems in sync
setup include this change and others do not include it.

With this change we also have proper transitions in INPUT-ONLY
mode (DR/TUN) where we see packets only from client. Now
we should not switch to 10-second CLOSED state at a time
when we should stay in ESTABLISHED state.

The short names for states are because we have 16-char space
in ipvsadm and 11-char limit for the connection list format.
It is a sequence of the TCP implementation where the longest
state name is ESTABLISHED.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:46 +09:00
Alexander Frolkin c6c96c1883 ipvs: sloppy TCP and SCTP
This adds support for sloppy TCP and SCTP modes to IPVS.

When enabled (sysctls net.ipv4.vs.sloppy_tcp and
net.ipv4.vs.sloppy_sctp), allows IPVS to create connection state on any
packet, not just a TCP SYN (or SCTP INIT).

This allows connections to fail over from one IPVS director to another
mid-flight.

Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:46 +09:00
Julian Anastasov bba54de5bd ipvs: provide iph to schedulers
Before now the schedulers needed access only to IP
addresses and it was easy to get them from skb by
using ip_vs_fill_iph_addr_only.

New changes for the SH scheduler will need the protocol
and ports which is difficult to get from skb for the
IPv6 case. As we have all the data in the iph structure,
to avoid the same slow lookups provide the iph to schedulers.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-26 18:01:45 +09:00
Alexey Brodkin a4a1139b24 arc_emac: fix compile-time errors & warnings on PPC64
As reported by "kbuild test robot" there were some errors and warnings
on attempt to build kernel with "make ARCH=powerpc allmodconfig".

And this patch addresses both errors and warnings.
Below is a list of introduced changes:
1. Fix compile-time errors (misspellings in "dma_unmap_single") on PPC.
2. Use DMA address instead of "skb->data" as a pointer to data buffer.
This fixed warnings on pointer to int conversion on 64-bit systems.
3. Re-implemented initial allocation of Rx buffers in "arc_emac_open" in
the same way they're re-allocated during operation (receiving packets).
So once again DMA address could be used instead of "skb->data".
4. Explicitly use EMAC_BUFFER_SIZE for Rx buffers allocation.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>

Cc: netdev@vger.kernel.org
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Joe Perches <joe@perches.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Mischa Jonker <mjonker@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: linux-kernel@vger.kernel.org
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Florian Fainelli <florian@openwrt.org>
Cc: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26 01:35:44 -07:00
Stephen Hemminger ba609e9bf1 vxlan: fix function name spelling
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 17:06:01 -07:00
Stephen Hemminger 3f5d6af094 Merge ../vxlan-x 2013-06-25 17:02:49 -07:00
Stephen Hemminger 537f7f8494 bridge: check for zero ether address in fdb add
The check for all-zero ether address was removed from rtnetlink core,
since Vxlan uses all-zero ether address to signify default address.
Need to add check back in for bridge.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 16:59:27 -07:00
Veaceslav Falico 8599b52e14 bonding: add an option to fail when any of arp_ip_target is inaccessible
Currently, we fail only when all of the ips in arp_ip_target are gone.
However, in some situations we might need to fail if even one host from
arp_ip_target becomes unavailable.

All situations, obviously, rely on the idea that we need *completely*
functional network, with all interfaces/addresses working correctly.

One real world example might be:
vlans on top on bond (hybrid port). If bond and vlans have ips assigned
and we have their peers monitored via arp_ip_target - in case of switch
misconfiguration (trunk/access port), slave driver malfunction or
tagged/untagged traffic dropped on the way - we will be able to switch
to another slave.

Though any other configuration needs that if we need to have access to all
arp_ip_targets.

This patch adds this possibility by adding a new parameter -
arp_all_targets (both as a module parameter and as a sysfs knob). It can be
set to:

	0 or any (the default) - which works exactly as it's working now -
	the slave is up if any of the arp_ip_targets are up.

	1 or all - the slave is up if all of the arp_ip_targets are up.

This parameter can be changed on the fly (via sysfs), and requires the mode
to be active-backup and arp_validate to be enabled (it obeys the
arp_validate config on which slaves to validate).

Internally it's done through:

1) Add target_last_arp_rx[BOND_MAX_ARP_TARGETS] array to slave struct. It's
   an array of jiffies, meaning that slave->target_last_arp_rx[i] is the
   last time we've received arp from bond->params.arp_targets[i] on this
   slave.

2) If we successfully validate an arp from bond->params.arp_targets[i] in
   bond_validate_arp() - update the slave->target_last_arp_rx[i] with the
   current jiffies value.

3) When getting slave's last_rx via slave_last_rx(), we return the oldest
   time when we've received an arp from any address in
   bond->params.arp_targets[].

If the value of arp_all_targets == 0 - we still work the same way as
before.

Also, update the documentation to reflect the new parameter.

v3->v4:
Kill the forgotten rtnl_unlock(), rephrase the documentation part to be
more clear, don't fail setting arp_all_targets if arp_validate is not set -
it has no effect anyway but can be easier to set up. Also, print a warning
if the last arp_ip_target is removed while the arp_interval is on, but not
the arp_validate.

v2->v3:
Use _bh spinlock, remove useless rtnl_lock() and use jiffies for new
arp_ip_target last arp, instead of slave_last_rx(). On bond_enslave(),
use the same initialization value for target_last_arp_rx[] as is used
for the default last_arp_rx, to avoid useless interface flaps.

Also, instead of failing to remove the last arp_ip_target just print a
warning - otherwise it might break existing scripts.

v1->v2:
Correctly handle adding/removing hosts in arp_ip_target - we need to
shift/initialize all slave's target_last_arp_rx. Also, don't fail module
loading on arp_all_targets misconfiguration, just disable it, and some
minor style fixes.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:58:38 -07:00
Veaceslav Falico d7d35c681f bonding: doc: some details on backup slave arp validation
Add some details to bonding documentation on how backup slave arp
validation works.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:58:38 -07:00
Veaceslav Falico aeea64ac71 bonding: don't trust arp requests unless active slave really works
Currently, if we receive any arp packet on a backup slave in active-backup
mode and arp_validate enabled, we suppose that it's an arp request, swap
source/target ip and try to validate it. This optimization gives us
virtually no downtime in the most common situation (active and backup
slaves are in the same broadcast domain and the active slave failed).

However, if we can't reach the arp_ip_target(s), we end up in an endless
loop of reselecting slaves, because we receive our arp requests, sent by
the active slave, and think that backup slaves are up, thus selecting them
as active and, again, sending arp requests, which fool our backup slaves.

Fix this by not validating the swapped arp packets if the current active
slave didn't receive any arp reply after it was selected as active. This
way we will only accept arp requests if we know that the current active
slave can actually reach arp_ip_target.

v3->v4:
Obey 80 lines and make checkpatch.pl happy, per Sergei's suggestion.

v1->v3:
No change.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:58:38 -07:00
Veaceslav Falico 2c14610210 bonding: don't validate arp if we don't have to
Currently, we validate all the incoming arps if arp_validate not 0.
However, we don't have to validate backup slaves if arp_validate == active
and vice versa, so return early in bond_arp_rcv() in these cases.

It works correctly now because we verify arp_validate in slave_last_rx(),
however we're just doing useless work in bond_arp_rcv().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:58:38 -07:00
Veaceslav Falico 0afee4e8b9 bonding: don't add duplicate targets to arp_ip_target
Print a warning and skip them.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:58:38 -07:00