Commit Graph

287467 Commits

Author SHA1 Message Date
Ben Hutchings cd2d5b529c sfc: Add SR-IOV back-end support for SFC9000 family
On the SFC9000 family, each port has 1024 Virtual Interfaces (VIs),
each with an RX queue, a TX queue, an event queue and a mailbox
register.  These may be assigned to up to 127 SR-IOV virtual functions
per port, with up to 64 VIs per VF.

We allocate an extra channel (IRQ and event queue only) to receive
requests from VF drivers.

There is a per-port limit of 4 concurrent RX queue flushes, and queue
flushes may be initiated by the MC in response to a Function Level
Reset (FLR) of a VF.  Therefore, when SR-IOV is in use, we submit all
flush requests via the MC.

The RSS indirection table is shared with VFs, so the number of RX
queues used in the PF is limited to the number of VIs per VF.

This is almost entirely the work of Steve Hodgson, formerly
shodgson@solarflare.com.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:13 +00:00
Ben Hutchings 28e47c498a sfc: Allocate SRAM between buffer table and descriptor caches at init time
Each port has a block of 64-bit SRAM that is divided between buffer
table and descriptor cache regions at initialisation time.  Currently
we use a fixed allocation, but it needs to be changed to support
larger numbers of queues.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:12 +00:00
Ben Hutchings a9a5250627 sfc: Pass NIC structure into efx_wanted_parallelism()
This lets us identify the NIC affected in case of failure, and
will be necessary to adjust for SR-IOV constraints.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:11 +00:00
Ben Hutchings 7f967c011a sfc: Add support for 'extra' channel types
Abstract some of the channel operations to allow for 'extra'
channels that do not have RX or TX queues.

- Try to assign a channel to each extra channel type that is enabled
  for the NIC, but gracefully degrade if we can't allocate sufficient
  MSI-X vectors
- Allow each extra channel type to generate its own channel name
- Allow channel types to disable reallocation and reinitialisation
  of their channels

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:10 +00:00
Ben Hutchings a16e5b246c sfc: Make all CPU/IRQ/channel/queue counts unsigned
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:09 +00:00
Ben Hutchings 5bbe2f4f64 sfc: Make buffer table indices and counts consistently unsigned
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:08 +00:00
Steve Hodgson a606f4325d sfc: Disable flow control during flushes
The TX DMA engine issues upstream read requests when there is room in
the TX FIFO for the completion. However, the fetches for the rest of
the packet might be delayed by any back pressure.  Since a flush must
wait for an EOP, the entire flush may be delayed by back pressure.

Mitigate this by disabling flow control before the flushes are
started.  Since PF and VF flushes run in parallel introduce
fc_disable, a reference count of the number of flushes outstanding.

The same principle could be applied to Falcon, but that
would bring with it its own testing.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:07 +00:00
Ben Hutchings 90893000e2 sfc: Generalise event generation to cover VF-owned event queues
For SR-IOV we will need to send events to event queues that belong to
VFs serviced by other drivers.  Change the parameters of
efx_generate_event() to allow this and declare it extern.

While we're at it, remove the existing declaration under the wrong
name efx_nic_generate_event().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:06 +00:00
Ben Hutchings 9d9a6973a8 sfc: Use proper function to test for RX channel in efx_poll()
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:25:06 +00:00
Ben Hutchings 9f2cb71c2b sfc: Leave interrupts and event queues enabled whenever we can
When SR-IOV is enabled we may receive FLR (Function-Level Reset)
events, associated queue flush events and requests from VF drivers at
any time.  Therefore we need to keep event queues and interrupts
enabled whenever possible.

Currently we stop interrupt-driven event processing before flushing RX
and TX queues; efx_nic_flush_queues() then polls event queues for
flush events and discards any others it finds.  Change it to work with
the regular event handling functions.

Currently efx_start_channel() fills RX queues synchronously when a
device is brought up.  This could now race with NAPI, so change it to
send fill events.

This was almost entirely written by Steve Hodgson, formerly
shodgson@solarflare.com.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:24:46 +00:00
Ben Hutchings 2ae75dac30 sfc: Generate RX fill events based on RX queues, not channels
This makes it harder to accidentally send such events to TX-only
channels.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:15:03 +00:00
Ben Hutchings 4ef594eb89 sfc: Generalise driver event generation
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:14:59 +00:00
Ben Hutchings 055e0ad014 sfc: Correct MAC filter bitfield definitions
The RMFT_DEST_MAC and TMFT_SRC_MAC register fields were previously
documented as 44 bits wide, whereas a MAC address has 48 bits.
Thankfully the hardware uses the correct width and the driver has
used separate definitions that divide each of these into 32-bit and
16-bit fields.

Fix the initial definitions for these fields and rewrite the latter
definitions to use them.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:11:31 +00:00
Ben Hutchings 3d885e3921 sfc: Add support for TX MAC filters
On Siena each TX queue can be configured to send only packets for
which there is a TX MAC filter that matches the source MAC address,
queue ID, and optionally VID.  This will be used to implement the
'spoofchk' feature for SR-IOV virtual functions.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:11:30 +00:00
Ben Hutchings c274d65c94 sfc: Add support for configuring RX unicast/multicast default filters
On Siena all received packets that don't match a more specific filter
will match the unicast or multicast default filter.  Currently we
leave these set to the default values (RSS with base queue number of
0).  Allow them to be reconfigured to select a single RX queue.

These default filters are programmed through the FILTER_CTL register,
but we represent them internally as an additional table of size 2.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-16 00:11:24 +00:00
Ben Hutchings 7c43161c11 sfc: Warn if unable to create MTDs
Log an explicit warning if we are unable to create MTDs for a net
device.  Also correct the comment about why mtd_device_register() may
fail; there is no longer an MTD table to fill up.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-13 23:40:50 +00:00
Ben Hutchings 5b6262d0cc sfc: Replace some literal constants with EFX_PAGE_SIZE/EFX_BUF_SIZE
The 'page size' for PCIe DMA, i.e. the alignment of boundaries at
which DMA must be broken, is 4KB.  Name this value as EFX_PAGE_SIZE
and use it in efx_max_tx_len().  Redefine EFX_BUF_SIZE as
EFX_PAGE_SIZE since its value is also a result of that requirement,
and use it in efx_init_special_buffer().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-13 23:40:49 +00:00
Ben Hutchings fadac6aae1 sfc: Do not retry hardware probe if it schedules a reset
If efx_pci_probe_main() schedules an INVISIBLE or ALL reset (but
nothing more drastic), we retry it up to 5 times.  So far as I'm
aware, this was a workaround for bugs in Falcon A0 which were fixed
in production silicon.  Remove the retry.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-13 23:40:48 +00:00
Ben Hutchings d9ab70079a sfc: Skip RX end-of-batch work on channels without an RX queue
The code in efx_process_channel() to update the RX queue after each
batch of RX completions works out as a no-op on a TX-only channel
where the RX queue structure is set to all-zeroes, but
(1) efx_channel_get_rx_queue() will BUG() if DEBUG is defined, and
(2) it's a waste of time.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-13 23:40:38 +00:00
David S. Miller 7280f5ae0d sonice: Fix build due to botched netdev_alloc_skb() conversion.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 15:28:15 -05:00
Dan Carpenter af2ce213f6 caif: remove duplicate initialization
"priv" is initialized twice.  I kept the second one, because it is next
to the check for NULL.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 13:39:37 -05:00
Eric Dumazet bb7d92e3e3 sh-eth: use netdev stats structure and fix dma_map_single
No need to maintain a parallel net_device_stats structure in
sh_eth_private, since we have a generic one in netdev

Fix two dma_map_single() incorrect parameters, passing skb->tail instead
of skb->data. Seems that there is no corresponding dmap_unmap_single()
calls for the moment in this driver.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 13:38:57 -05:00
Fabio Estevam b72061a3cb net: fec: Fix build due to wrong dev annotation
commit 21a4e469 (netdev: ethernet dev_alloc_skb to netdev_alloc_skb)
should have used "ndev" instead of "dev".

This causes the following build errors:

drivers/net/ethernet/freescale/fec.c: In function 'fec_enet_rx':
drivers/net/ethernet/freescale/fec.c:714: error: 'dev' undeclared (first use in this function)
drivers/net/ethernet/freescale/fec.c:714: error: (Each undeclared identifier is reported only once
drivers/net/ethernet/freescale/fec.c:714: error: for each function it appears in.)
drivers/net/ethernet/freescale/fec.c: In function 'fec_enet_alloc_buffers':
drivers/net/ethernet/freescale/fec.c:1213: error: 'dev' undeclared (first use in this function)

Fix it, so that fec driver can be built again.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 13:32:50 -05:00
Shriram Rajagopalan c3059be16c net/sched: sch_plug - Queue traffic until an explicit release command
The qdisc supports two operations - plug and unplug. When the
qdisc receives a plug command via netlink request, packets arriving
henceforth are buffered until a corresponding unplug command is received.
Depending on the type of unplug command, the queue can be unplugged
indefinitely or selectively.

This qdisc can be used to implement output buffering, an essential
functionality required for consistent recovery in checkpoint based
fault-tolerance systems. Output buffering enables speculative execution
by allowing generated network traffic to be rolled back. It is used to
provide network protection for Xen Guests in the Remus high availability
project, available as part of Xen.

This module is generic enough to be used by any other system that wishes
to add speculative execution and output buffering to its applications.

This module was originally available in the linux 2.6.32 PV-OPS tree,
used as dom0 for Xen.

For more information, please refer to http://nss.cs.ubc.ca/remus/
and http://wiki.xensource.com/xenwiki/Remus

Changes in V3:
  * Removed debug output (printk) on queue overflow
  * Added TCQ_PLUG_RELEASE_INDEFINITE - that allows the user to
    use this qdisc, for simple plug/unplug operations.
  * Use of packet counts instead of pointers to keep track of
    the buffers in the queue.

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
[author of the code in the linux 2.6.32 pvops tree]
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 12:54:56 -05:00
David S. Miller 17b8a74f00 Merge branch 'tipc_net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2012-02-07 12:31:01 -05:00
Bruce Allan 0e15df490e e1000e: minor whitespace and indentation cleanup
Cleanup of some whitespace and indentation of a single code block.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:18:09 -08:00
Bruce Allan e885d762b7 e1000e: fix sparse warnings with -D__CHECK_ENDIAN__
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:16:54 -08:00
Bruce Allan a2a5b3235d e1000e: fix checkpatch warning from MINMAX test
WARNING: min() should probably be min_t(unsigned int, 4, skb->data_len)

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:15:52 -08:00
Bruce Allan 24b706b2f4 e1000e: cleanup - use braces in both branches of a conditional statement
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:14:50 -08:00
Bruce Allan f23efdff77 e1000e: cleanup e1000_set_phys_id
Use the existing hw pointer.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:13:17 -08:00
Bruce Allan 66092f5925 e1000e: cleanup e1000_init_mac_params_82571()
Combine two switch statements into one, convert a nebulous pointer to one
that is a bit more in keeping with the rest of the driver code and cleanup
some coding style.  No change in functionality, just cosmetic changes.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:12:14 -08:00
Bruce Allan e68782ed78 e1000e: cleanup e1000_init_mac_params_80003es2lan()
Combine two switch statements into one, convert a nebulous pointer to one
that is a bit more in keeping with the rest of the driver code and remove
some dead code (there are no 80003es2lan devices with fiber).  No change in
functionality, just cosmetic changes.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:11:10 -08:00
Bruce Allan 9e2d7657e2 e1000e: cleanup - check return values consistently
The majority of the e1000e code checks most function return values using a
test like 'if (ret_val)' or 'if (!ret_val)' but there are a few instances
of 'if (ret_val == 0)'.  This patch converts the latter to the former for
consistency.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:10:07 -08:00
Bruce Allan f36bb6cacd e1000e: add missing initializers reported when compiling with W=1
warning: missing initializer

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:09:14 -08:00
Tushar Dave b04e36bac5 e1000: Adding e1000_dump function
When TX hang occurs e1000_dump prints TX ring, RX ring and Device registers.

Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:01:46 -08:00
Mitch A Williams ab50a2a430 igbvf: refactor Interrupt Throttle Rate code
The existing ITR code is broken and confusing, with lots of similarly-named
variables that do different things. Additionally, after the driver carefully
determines the optimal interrupt rate for the adapter, it then
ignores it and always writes a fixed, suboptimal value.

This patch refactors that code to make variable names more descriptive of
what they actually do, and then actually writes the calculated result to
the hardware.

Preliminary testing shows that netperf TCP_STREAM tests goes from ~918Mbps
to ~940Mbps, and TCP_RR goes from ~2k transactions/sec up to > 8k.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Robert E Garrett <robertX.e.garrett@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 03:49:23 -08:00
Allan Stephens dff10e9e63 tipc: Minor optimization to rejection of connection-based messages
Modifies message rejection logic so that TIPC doesn't attempt to
send a FIN message to the rejecting port if it is known in advance
that there is no such message because the rejecting port doesn't exist.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens 3175bd9add tipc: Eliminate alteration of publication key during name table purging
Removes code that alters the publication key of a name table entry
that is being forcibly purged from TIPC's name table after contact
with the publishing node has been lost.

Current TIPC ensures that all defunct names are purged before
re-establishing contact with a failed node.  There used to be a risk
that the publication might be accidentally deleted because it might be
re-added to the name table before the purge operation was completed.
But now there is no longer a need to ensure that the new key is different
than the old one.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens 63e7f1ac28 tipc: Prevent loss of fragmented messages over broadcast link
Modifies broadcast link so that an incoming fragmented message is not
lost if reassembly cannot begin because there currently is no buffer
big enough to hold the entire reassembled message. The broadcast link
now ignores the first fragment completely, which causes the sending node
to retransmit the first fragment so that reassembly can be re-attempted.

Previously, the sender would have had no reason to retransmit the 1st
fragment, so we would never have a chance to re-try the allocation.

To do this cleanly without duplicaton, a new bclink_accept_pkt()
function is introduced.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens b76b27cad5 tipc: Prevent loss of fragmented messages over unicast links
Modifies unicast link endpoint logic so an incoming fragmented message
is not lost if reassembly cannot begin because there is no buffer big
enough to hold the entire reassembled message. The link endpoint now
ignores the first fragment completely, which causes the sending node to
retransmit the first fragment so that reassembly can be re-attempted.

Previously, the sender would have had no reason to retransmit the 1st
fragment, so we would never have a chance to re-try the allocation.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens 1ec2bb0840 tipc: Remove obsolete broadcast tag capability
Eliminates support for the broadcast tag field, which is no longer
used by broadcast link NACK messages.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens 7a54d4a99d tipc: Major redesign of broadcast link ACK/NACK algorithms
Completely redesigns broadcast link ACK and NACK mechanisms to prevent
spurious retransmit requests in dual LAN networks, and to prevent the
broadcast link from stalling due to the failure of a receiving node to
acknowledge receiving a broadcast message or request its retransmission.

Note: These changes only impact the timing of when ACK and NACK messages
are sent, and not the basic broadcast link protocol itself, so inter-
operability with nodes using the "classic" algorithms is maintained.

The revised algorithms are as follows:

1) An explicit ACK message is still sent after receiving 16 in-sequence
messages, and implicit ACK information continues to be carried in other
unicast link message headers (including link state messages).  However,
the timing of explicit ACKs is now based on the receiving node's absolute
network address rather than its relative network address to ensure that
the failure of another node does not delay the ACK beyond its 16 message
target.

2) A NACK message is now typically sent only when a message gap persists
for two consecutive incoming link state messages; this ensures that a
suspected gap is not confirmed until both LANs in a dual LAN network have
had an opportunity to deliver the message, thereby preventing spurious NACKs.
A NACK message can also be generated by the arrival of a single link state
message, if the deferred queue is so big that the current message gap
cannot be the result of "normal" mis-ordering due to the use of dual LANs
(or one LAN using a bonded interface). Since link state messages typically
arrive at different nodes at different times the problem of multiple nodes
issuing identical NACKs simultaneously is inherently avoided.

3) Nodes continue to "peek" at NACK messages sent by other nodes. If
another node requests retransmission of a message gap suspected (but not
yet confirmed) by the peeking node, the peeking node forgets about the
gap and does not generate a duplicate retransmit request. (If the peeking
node subsequently fails to receive the lost message, later link state
messages will cause it to rediscover and confirm the gap and send another
NACK.)

4) Message gap "equality" is now determined by the start of the gap only.
This is sufficient to deal with the most common cases of message loss,
and eliminates the need for complex end of gap computations.

5) A peeking node no longer tries to determine whether it should send a
complementary NACK, since the most common cases of message loss don't
require it to be sent. Consequently, the node no longer examines the
"broadcast tag" field of a NACK message when peeking.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens b98158e3b3 tipc: Add missing locks in broadcast link statistics accumulation
Ensures that all attempts to update broadcast link statistics are done
only while holding the lock that protects the link's main data structures,
to prevent interference by simultaneous updates caused by messages
arriving on other interfaces.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens 0232c5a566 tipc: Fix bug in broadcast link duplicate message statistics
Modifies broadcast link so that it increments the "received duplicate
message" count if an incoming message cannot be added to the deferred
message queue because it is already present in the queue. (The aligns
broadcast link behavior with that of TIPC's unicast links.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens 8a275a6a30 tipc: Fix node lock reclamation issues in broadcast link reception
Fixes a pair of problems in broadcast link message reception code
relating to the reclamation of the node lock after consuming an
in-sequence message.

1) Now retests to see if the sending node is still up after reclaiming
   the node lock, and bails out if it is non-operational.

2) Now manipulates the node's deferred message queue only after
   reclaiming the node lock, rather than using queue head pointer
   information that was cached previously.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens 57732560d1 tipc: Add missing broadcast link lock when sending NACK
Ensures that any attempt to send a NACK message over TIPC's broadcast
link has exclusive access to the link's main data structures, to prevent
interference with a simultaneous attempt to send other broadcast link
traffic (such as application-generated multicast messages).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens 47361c87c5 tipc: Fix problem with broadcast link synchronization between nodes
Corrects a problem in which a link endpoint that activates as the
result of receiving a RESET/STATE sequence of link protocol messages
fails to properly record the broadcast link status information about
the node to which it is now communicating with. (The problem does
not occur with the more common RESET/ACTIVATE sequence of messages.)
The fix ensures that the broadcast link status info is updated after
the RESET message resets the link endpoint, rather than before, thereby
preventing new information from being overwritten by the reset operation.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00
Allan Stephens 9349931371 tipc: Ensure broadcast link re-acquires node after link failure
Fix a bug that can prevent TIPC from sending broadcast messages to a node
if contact with the node is lost and then regained. The problem occurs if
the broadcast link first clears the flag indicating the node is part of the
link's distribution set (when it loses contact with the node), and later
fails to restore the flag (when contact is regained); restoration fails
if contact with the node is regained by implicit unicast link activation
triggered by the arrival of a data message, rather than explicitly by the
arrival of a link activation message.

The broadcast link now uses separate fields to track whether a node is
theoretically capable of receiving broadcast messages versus whether it is
actually part of the link's distribution set. The former member is updated
by the receipt of link protocol messages, which can occur at any time; the
latter member is updated only when contact with the node is gained or lost.
This change also permits the simplification of several conditional
expressions since the broadcast link's "supported" field can now only be
set if there are working links to the associated node.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00
Allan Stephens 4d75313ce9 tipc: Prevent broadcast link stalling in dual LAN environments
Ensure that sequence number information about incoming broadcast link
messages is initialized only by the activation of the first link to a
given cluster node.  Previously, a race condition allowed reset and/or
activation messages for a second link to re-initialize this sequence
number information with obsolete values. This could trigger TIPC to
request the retransmission of previously acknowledged broadcast link
messages from that node, resulting in broadcast link processing becoming
stalled if the node had already released one or more of those messages
and was unable to perform the required retransmission.

Thanks to Laser <gotolaser@gmail.com> for identifying this problem
and assisting in the development of this fix.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00
Allan Stephens 92d2c905b4 tipc: Prevent transmission of outdated link protocol messages
Ensures that a link endpoint discards any previously deferred link
protocol message whenever it attempts to send a new one.

Previously, it was possible for a link protocol message that was unsent
due to congestion to be transmitted after newer protocol messages had
been sent. The stale link protocol message might then cause the receiving
link endpoint to malfunction because of its outdated conent.

Thanks to Osamu Kaminuma [okaminum@avaya.com] for diagnosing the problem
and contributing a prototype patch.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:15 -05:00