Commit Graph

492 Commits

Author SHA1 Message Date
Ben Hutchings f88a4a9b65 bonding/vlan: Fix mangled NAs on slaves without VLAN tag insertion
bond_na_send() attempts to insert a VLAN tag in between building and
sending packets of the respective formats.  If the slave does not
implement hardware VLAN tag insertion then vlan_put_tag() will mangle
the network-layer header because the Ethernet header is not present at
this point (unlike in bond_arp_send()).

Fix this by adding the tag out-of-line and relying on
dev_hard_start_xmit() to insert it inline if necessary.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 12:43:22 -08:00
Ben Hutchings ffa95ed50f bonding: Change active slave quietly when bond is down
bond_change_active_slave() may be called when a slave is added, even
if the bond has not been brought up yet.  It may then attempt to send
packets, and further it may use mcast_work which is uninitialised
before the bond is brought up.  Add the necessary checks for
netif_running(bond->dev).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 12:43:22 -08:00
Ben Hutchings 8387451e55 bonding/vlan: Remove redundant VLAN tag insertion logic
A bond may have a mixture of slave devices with and without hardware
VLAN tag insertion capability.  Therefore it always claims this
capability and performs software VLAN tag insertion if the slave does
not.

Since commit 7b9c609037, this has
also been done by dev_hard_start_xmit().  The result is that VLAN-
tagged skbs are now double-tagged when transmitted through slave
devices without hardware VLAN tag insertion!

Remove the now-redundant logic from bond_dev_queue_xmit().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 12:43:21 -08:00
Hillf Danton af3e5bd5f6 bonding: Fix slave selection bug.
The returned slave is incorrect, if the net device under check is not
charged yet by the master.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 12:24:02 -08:00
Taku Izumi f073c7ca29 bonding: add the debugfs facility to the bonding driver
This patch provides the debugfs facility to the bonding driver.
The "bonding" directory is created in the debugfs root and directories of
each bonding interface (like bond0, bond1...) are created in that.

 # mount -t debugfs none /sys/kernel/debug

 # ls /sys/kernel/debug/bonding
 bond0  bond1

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 16:24:33 -08:00
Neil Horman fb4fa76a1f net: Convert netpoll blocking api in bonding driver to be a counter
A while back I made some changes to enable netpoll in the bonding driver.  Among
them was a per-cpu flag that indicated we were in a path that held locks which
could cause the netpoll path to block in during tx, and as such the tx path
should queue the frame for later use.  This appears to have given rise to a
regression.  If one of those paths on which we hold the per-cpu flag yields the
cpu, its possible for us to come back on a different cpu, leading to us clearing
a different flag than we set.  This results in odd netpoll drops, and BUG
backtraces appearing in the log, as we check to make sure that we only clear set
bits, and only set clear bits.  I had though briefly about changing the
offending paths so that they wouldn't sleep, but looking at my origional work
more closely, it doesn't appear that a per-cpu flag is warranted.  We alrady
gate the checking of this flag on IFF_IN_NETPOLL, so we don't hit this in the
normal tx case anyway.  And practically speaking, the normal use case for
netpoll is to only have one client anyway, so we're not going to erroneously
queue netpoll frames when its actually safe to do so.  As such, lets just
convert that per-cpu flag to an atomic counter.  It fixes the rescheduling bugs,
is equivalent from a performance perspective and actually eliminates some code
in the process.

Tested by the reporter and myself, successfully

Reported-by: Liang Zheng <lzheng@redhat.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-09 20:33:46 -08:00
David S. Miller fe6c791570 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
	net/llc/af_llc.c
2010-12-08 13:47:38 -08:00
David Strand d13a2cb63d bonding: check for assigned mac before adopting the slaves mac address
Restore the check for an unassigned mac address before adopting the
first slaves as it's own. The change in behavior was introduced by:

commit c20811a79e
Author: Jiri Pirko <jpirko@redhat.com>

    bonding: move dev_addr cpy to bond_enslave


Signed-off-by: David Strand <dpstrand@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01 11:43:08 -08:00
David S. Miller 24912420e9 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bonding/bond_main.c
	net/core/net-sysfs.c
	net/ipv6/addrconf.c
2010-11-19 13:13:47 -08:00
Eric Dumazet 866f3b25a2 bonding: IGMP handling cleanup
Instead of iterating in_dev->mc_list from bonding driver, its better
to call a helper function provided by igmp.c
Details of implementation (locking) are private to igmp code.

ip_mc_rejoin_group(struct ip_mc_list *im) becomes
ip_mc_rejoin_groups(struct in_device *in_dev);

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18 09:33:19 -08:00
Eric Dumazet 3006bc3889 bonding: fix a race in IGMP handling
RCU conversion in IGMP code done in net-next-2.6 raised a race in
__bond_resend_igmp_join_requests().

It iterates in_dev->mc_list without appropriate protection (RTNL, or
read_lock on in_dev->mc_list_lock).

Another cpu might delete an entry while we use it and trigger a fault.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18 09:30:42 -08:00
Joe Perches c04914af68 drivers/net/bonding: Remove unnecessary casts of netdev_priv
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-17 10:36:50 -08:00
Eric Dumazet e4a7b93bd5 bonding: remove dev_base_lock use
bond_info_seq_start() uses a read_lock(&dev_base_lock) to make sure
device doesn’t disappear. Same goal can be achieved using RCU.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-08 13:50:07 -08:00
Uwe Kleine-König b595076a18 tree-wide: fix comment/printk typos
"gadget", "through", "command", "maintain", "maintain", "controller", "address",
"between", "initiali[zs]e", "instead", "function", "select", "already",
"equal", "access", "management", "hierarchy", "registration", "interest",
"relative", "memory", "offset", "already",

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-01 15:38:34 -04:00
Jarek Poplawski a71fb88145 bonding: Fix lockdep warning after bond_vlan_rx_register()
Fix lockdep warning:
[   52.991402] ======================================================
[   52.991511] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[   52.991569] 2.6.36-04573-g4b60626-dirty #65
[   52.991622] ------------------------------------------------------
[   52.991696] ip/4842 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire:
[   52.991758]  (&bond->lock){++++..}, at: [<efe4d300>] bond_set_multicast_list+0x60/0x2c0 [bonding]
[   52.991966]
[   52.991967] and this task is already holding:
[   52.992008]  (&bonding_netdev_addr_lock_key){+.....}, at: [<c04e5530>] dev_mc_sync+0x50/0xa0
[   52.992008] which would create a new lock dependency:
[   52.992008]  (&bonding_netdev_addr_lock_key){+.....} -> (&bond->lock){++++..}
[   52.992008]
[   52.992008] but this new dependency connects a SOFTIRQ-irq-safe lock:
[   52.992008]  (&(&mc->mca_lock)->rlock){+.-...}
[   52.992008] ... which became SOFTIRQ-irq-safe at:
[   52.992008]   [<c0272beb>] __lock_acquire+0x96b/0x1960
[   52.992008]   [<c027415e>] lock_acquire+0x7e/0xf0
[   52.992008]   [<c05f356d>] _raw_spin_lock_bh+0x3d/0x50
[   52.992008]   [<c0584e40>] mld_ifc_timer_expire+0xf0/0x280
[   52.992008]   [<c024cee6>] run_timer_softirq+0x146/0x310
[   52.992008]   [<c024591d>] __do_softirq+0xad/0x1c0
[   52.992008]
[   52.992008] to a SOFTIRQ-irq-unsafe lock:
[   52.992008]  (&bond->lock){++++..}
[   52.992008] ... which became SOFTIRQ-irq-unsafe at:
[   52.992008] ...  [<c0272c3b>] __lock_acquire+0x9bb/0x1960
[   52.992008]   [<c027415e>] lock_acquire+0x7e/0xf0
[   52.992008]   [<c05f36b8>] _raw_write_lock+0x38/0x50
[   52.992008]   [<efe4cbe4>] bond_vlan_rx_register+0x24/0x70 [bonding]
[   52.992008]   [<c0598010>] register_vlan_dev+0xc0/0x280
[   52.992008]   [<c0599f3a>] vlan_newlink+0xaa/0xd0
[   52.992008]   [<c04ed4b4>] rtnl_newlink+0x404/0x490
[   52.992008]   [<c04ece35>] rtnetlink_rcv_msg+0x1e5/0x220
[   52.992008]   [<c050424e>] netlink_rcv_skb+0x8e/0xb0
[   52.992008]   [<c04ecbac>] rtnetlink_rcv+0x1c/0x30
[   52.992008]   [<c0503bfb>] netlink_unicast+0x24b/0x290
[   52.992008]   [<c0503e37>] netlink_sendmsg+0x1f7/0x310
[   52.992008]   [<c04cd41c>] sock_sendmsg+0xac/0xe0
[   52.992008]   [<c04ceb80>] sys_sendmsg+0x130/0x230
[   52.992008]   [<c04cf04e>] sys_socketcall+0xde/0x280
[   52.992008]   [<c0202d10>] sysenter_do_call+0x12/0x36
[   52.992008]
[   52.992008] other info that might help us debug this:
...
[ Full info at netdev: Wed, 27 Oct 2010 12:24:30 +0200
  Subject: [BUG net-2.6 vlan/bonding] lockdep splats ]

Use BH variant of write_lock(&bond->lock) (as elsewhere in bond_main)
to prevent this dependency.

Fixes commit f35188faa0 [v2.6.36]

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
2010-10-27 14:24:07 -07:00
Bandan Das 7bfc475323 bonding: cleanup: remove braces from single block statements
checkpatch.pl cleanup : Remove braces from single statement
blocks.

Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:49 -07:00
Bandan Das 128ea6c3ee bonding: cleanup : add space around operators
checkpatch.pl cleanup: Added spaces around operators at various places.
Also fixed some c99 style comments that I came across.

Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:49 -07:00
stephen hemminger 26d8ee75e0 bonding: make release_and_destroy static
Only used in main file.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:46 -07:00
stephen hemminger 379b738341 bonding: make bond_resend_igmp_join_requests_delayed static
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 03:09:43 -07:00
Neil Horman 9ff76c951c netpoll: Remove netpoll blocking from uninit path
Some recent testing in netpoll with bonding showed this backtrace

 ------------[ cut here ]------------
 kernel BUG at drivers/net/bonding/bonding.h:134!
 invalid opcode: 0000 [#1] SMP
 last sysfs file: /sys/devices/pci0000:00/0000:00:1d.2/usb7/devnum
 CPU 0
 Pid: 1876, comm: rmmod Not tainted 2.6.36-rc3+ #10 D26928/
 RIP: 0010:[<ffffffffa0514ba4>]  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0
 RSP: 0018:ffff88003b1b5d58  EFLAGS: 00010296
 RAX: ffff88003b9b6200 RBX: ffff8800373e8e00 RCX: 00000000000f4240
 RDX: 00000000ffffffff RSI: 0000000000000286 RDI: 0000000000000286
 RBP: ffff88003b1b5dc8 R08: 0000000000000000 R09: 00000001af7de920
 R10: 0000000000000000 R11: ffff880002495e98 R12: ffff880037922700
 R13: ffff880038c31000 R14: ffff880037922730 R15: 0000000000000286
 FS:  00007f90e6d72700(0000) GS:ffff880002400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 000000346f0d9ad0 CR3: 000000003b263000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process rmmod (pid: 1876, threadinfo ffff88003b1b4000, task ffff88003b36aa80)
 Stack:
 00000000ffffffff ffff88003b1b5d7a ffff8800379221e8 ffff880037922000
 <0> ffff88003b1b5dc8 ffffffff813eb5fb ffff88003b1b5da8 0000000031b177a3
 <0> ffff88003b1b5da8 ffff880037922000 ffff88003b1b5e48 ffff88003b1b5e48
 Call Trace:
 [<ffffffff813eb5fb>] ? rtmsg_ifinfo+0xcb/0xf0
 [<ffffffff813daad8>] rollback_registered_many+0x168/0x280
 [<ffffffff813dac09>] unregister_netdevice_many+0x19/0x80
 [<ffffffff813e97b3>] __rtnl_kill_links+0x63/0x90
 [<ffffffff813e980b>] __rtnl_link_unregister+0x2b/0x60
 [<ffffffff813e9bde>] rtnl_link_unregister+0x1e/0x30
 [<ffffffffa052124b>] bonding_exit+0x37/0x51 [bonding]
 [<ffffffff81098b2e>] sys_delete_module+0x19e/0x270
 [<ffffffff810bb2b2>] ? audit_syscall_entry+0x252/0x280
 [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
 RIP  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0 [bonding]
 RSP <ffff88003b1b5d58>
 ---[ end trace 1395ad691cea24d1 ]---

It occurs because of my recent netpoll blocking patches, which I added to avoid
recursive deadlock in the bonding driver.  It relies on some per cpu bits, but
the shutdown path forces some rescheduling as we cancel workqueues for the
driver and wait for some device refcounts.  If after the forced reschedule, we
wind up on a different cpu we trigger the bughalt in unblock_netpoll_tx.

The fix is to remove the netpoll block/unblock calls from bond_release_all.
This is safe to do because bond_uninit, which is called via ndo_uninit in
rollback_registered_many, doesn't occur until we send a NETDEV_UNREGISTER event,
which triggers netconsole to remove us as a netpoll client, so we are guaranteed
not to recurse into our own tx path here.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-20 01:44:29 -07:00
Neil Horman 45b0cb8abd bonding: Re-enable netpoll over bonding
With the inclusion of previous fixup patches, netpoll over bonding apears to
work reliably with failover conditions.  This reverts Gospos previous commit
c22d7ac844, and allows access again to the netpoll
functionality in the bonding driver.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-18 08:32:09 -07:00
Neil Horman e843fa5088 bonding: Fix deadlock in bonding driver resulting from internal locking when using netpoll
The monitoring paths in the bonding driver take write locks that are shared by
the tx path.  If netconsole is in use, these paths can call printk which puts us
in the netpoll tx path, which, if netconsole is attached to the bonding driver,
result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the
monitor paths in the bonding driver don't claim the xmit_lock, nor should they).
The solution is to use a per cpu flag internal to the driver to indicate when a
cpu is holding the lock in a path that might recusrse into the tx path for the
driver via netconsole.  By checking this flag on transmit, we can defer the
sending of the netconsole frames until a later time using the retransmit feature
of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY.  I've
tested this and am able to transmit via netconsole while causing failover
conditions on the bond slave links.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-18 08:32:07 -07:00
Neil Horman c2355e1ab9 bonding: Fix bonding drivers improper modification of netpoll structure
The bonding driver currently modifies the netpoll structure in its xmit path
while sending frames from netpoll.  This is racy, as other cpus can access the
netpoll structure in parallel. Since the bonding driver points np->dev to a
slave device, other cpus can inadvertently attempt to send data directly to
slave devices, leading to improper locking with the bonding master, lost frames,
and deadlocks.  This patch fixes that up.

This patch also removes the real_dev pointer from the netpoll structure as that
data is really only used by bonding in the poll_controller, and we can emulate
its behavior by check each slave for IS_UP.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-18 08:32:07 -07:00
David S. Miller 69259abb64 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/pcmcia/pcnet_cs.c
	net/caif/caif_socket.c
2010-10-06 19:39:31 -07:00
Krzysztof Oledzki dd53df265b bonding: add Speed/Duplex information to /proc/net/bonding/bond
Effect:
 Slave Interface: eth5
 MII Status: up
 Speed: 10000 Mbps
 Duplex: full
 Link Failure Count: 0
 Permanent HW addr: XX:XX:XX:XX:XX:XX
 Slave queue ID: 0

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-06 18:43:34 -07:00
Krzysztof Piotr Oledzki 546add7946 bonding: reread information about speed and duplex when interface goes up
When an interface was enslaved when it was down, bonding thinks
it has speed -1 even after it goes up. This leads into selecting
a wrong active interface in active/backup mode on mixed 10G/1G or
1G/100M environment.

before:
 bonding: bond0: link status definitely up for interface eth5, 100 Mbps full duplex.
 bonding: bond0: link status definitely up for interface eth0, 100 Mbps full duplex.

after:
 bonding: bond0: link status definitely up for interface eth5, 10000 Mbps full duplex.
 bonding: bond0: link status definitely up for interface eth0, 1000 Mbps full duplex.

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-06 14:28:22 -07:00
Krzysztof Piotr Oledzki 700c2a779e bonding: print information about speed and duplex seen by the driver
before:
 bonding: bond0: link status definitely up for interface eth5
 bonding: bond0: link status definitely up for interface eth0

after:
 bonding: bond0: link status definitely up for interface eth5, 100 Mbps full duplex.
 bonding: bond0: link status definitely up for interface eth0, 100 Mbps full duplex.


Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-06 14:25:06 -07:00
Flavio Leitner c2952c314b bonding: add retransmit membership reports tunable
Allow sysadmins to configure the number of multicast
membership report sent on a link failure event.

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 20:26:58 -07:00
Flavio Leitner 5a37e8ca85 bonding: rejoin multicast groups on VLANs
During a failover, the IGMP membership is sent to update
the switch restoring the traffic, but it misses groups added
to VLAN devices running on top of bonding devices.

This patch changes it to iterate over all VLAN devices
on top of it sending IGMP memberships too.

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 20:26:56 -07:00
Neil Horman 27e6f065df bonding: fix WARN_ON when writing to bond_master sysfs file
Fix a WARN_ON failure in bond_masters sysfs file

Got a report of this warning recently

bonding: bond0 is being created...
------------[ cut here ]------------
WARNING: at fs/proc/generic.c:590 proc_register+0x14d/0x185()
Hardware name: ProLiant BL465c G1
proc_dir_entry 'bonding/bond0' already registered
Modules linked in: bonding ipv6 tg3 bnx2 shpchp amd64_edac_mod edac_core
ipmi_si
ipmi_msghandler serio_raw i2c_piix4 k8temp edac_mce_amd hpwdt microcode hpsa
cc
iss radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded:
scsi_wai
t_scan]
Pid: 935, comm: ifup-eth Not tainted 2.6.33.5-124.fc13.x86_64 #1
Call Trace:
[<ffffffff8104b54c>] warn_slowpath_common+0x77/0x8f
[<ffffffff8104b5b1>] warn_slowpath_fmt+0x3c/0x3e
[<ffffffff8114bf0b>] proc_register+0x14d/0x185
[<ffffffff8114c20c>] proc_create_data+0x87/0xa1
[<ffffffffa0211e9b>] bond_create_proc_entry+0x55/0x95 [bonding]
[<ffffffffa0215e5d>] bond_init+0x95/0xd0 [bonding]
[<ffffffff8138cd97>] register_netdevice+0xdd/0x29e
[<ffffffffa021240b>] bond_create+0x8e/0xb8 [bonding]
[<ffffffffa021c4be>] bonding_store_bonds+0xb3/0x1c1 [bonding]
[<ffffffff812aec85>] class_attr_store+0x27/0x29
[<ffffffff8115423d>] sysfs_write_file+0x10f/0x14b
[<ffffffff81101acf>] vfs_write+0xa9/0x106
[<ffffffff81101be2>] sys_write+0x45/0x69
[<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
---[ end trace a677c3f7f8b16b1e ]---
bonding: Bond creation failed.

It happens because a user space writer to bond_master can try to
register an already existing bond interface name.  Fix it by teaching
bond_create to check for the existance of devices with that name first
in cases where a non-NULL name parameter has been passed in

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-05 20:06:01 -07:00
David S. Miller e40051d134 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/qlcnic/qlcnic_init.c
	net/ipv4/ip_output.c
2010-09-27 01:03:03 -07:00
Eric Dumazet 807540baae drivers/net: return operator cleanup
Change "return (EXPR);" to "return EXPR;"

return is not a function, parentheses are not required.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-26 18:34:29 -07:00
Eric Dumazet e6599c2ecf bonding: enable gro by default
gro can be enabled by default on bonding devices.

Actual support depends on the lower devices.

One can still use ethtool to switch off GRO if needed.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-17 16:53:24 -07:00
Andy Gospodarek ab12811c89 bonding: correctly process non-linear skbs
It was recently brought to my attention that 802.3ad mode bonds would no
longer form when using some network hardware after a driver update.
After snooping around I realized that the particular hardware was using
page-based skbs and found that skb->data did not contain a valid LACPDU
as it was not stored there.  That explained the inability to form an
802.3ad-based bond.  For balance-alb mode bonds this was also an issue
as ARPs would not be properly processed.

This patch fixes the issue in my tests and should be applied to 2.6.36
and as far back as anyone cares to add it to stable.

Thanks to Alexander Duyck <alexander.h.duyck@intel.com> and Jesse
Brandeburg <jesse.brandeburg@intel.com> for the suggestions on this one.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: stable@kerne.org
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-14 14:25:32 -07:00
Jiri Bohac cb32f2a0d1 bonding: Fix jiffies overflow problems (again)
The time_before_eq()/time_after_eq() functions operate on unsigned
long and only work if the difference between the two compared values
is smaller than half the range of unsigned long (31 bits on i386).

Some of the variables (slave->jiffies, dev->trans_start, dev->last_rx)
used by bonding store a copy of jiffies and may not be updated for a
long time. With HZ=1000, time_before_eq()/time_after_eq() will start
giving bad results after ~25 days.

jiffies will never be before slave->jiffies, dev->trans_start,
dev->last_rx by more than possibly a couple ticks caused by preemption
of this code. This allows us to detect/prevent these overflows by
replacing time_before_eq()/time_after_eq() with time_in_range().

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-07 13:57:20 -07:00
Andy Gospodarek c5cb002fb0 bonding: prevent sysfs from allowing arp monitoring with alb/tlb
When using module options arp monitoring and balance-alb/balance-tlb
are mutually exclusive options.  Anytime balance-alb/balance-tlb are
enabled mii monitoring is forced to 100ms if not set.  When configuring
via sysfs no checking is currently done.

Handling these cases with sysfs has to be done a bit differently because
we do not have all configuration information available at once.  This
patch will not allow a mode change to balance-alb/balance-tlb if
arp_interval is already non-zero.  It will also not allow the user to
set a non-zero arp_interval value if the mode is already set to
balance-alb/balance-tlb.  They are still mutually exclusive on a
first-come, first serve basis.

Tested with initscripts on Fedora and manual setting via sysfs.

Signed-off-by: Andy Gospodarek <gospo@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-30 23:27:57 -07:00
David S. Miller bb7e95c8fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x_main.c

Merge bnx2x bug fixes in by hand... :-/

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-27 21:01:35 -07:00
Greg Edwards d8190dff01 bonding: set device in RLB ARP packet handler
After:

commit 6146b1a4da
Author: Jay Vosburgh <fubar@us.ibm.com>
Date:   Tue Nov 4 17:51:15 2008 -0800

    bonding: Fix ALB mode to balance traffic on VLANs

the dev field in the RLB ARP packet handler was set to NULL to wildcard
and accommodate balancing VLANs on top of bonds.

This has the side-effect of the packet handler being called against
other, non RLB-enabled bonds, and a kernel oops results when it tries to
dereference rx_hashtbl in rlb_update_entry_from_arp(), which won't be
set for those bonds, e.g. active-backup.

With the __netif_receive_skb() changes from:

commit 1f3c8804ac
Author: Andy Gospodarek <andy@greyhouse.net>
Date:   Mon Dec 14 10:48:58 2009 +0000

    bonding: allow arp_ip_targets on separate vlans to use arp validation

frames received on VLANs correctly make their way to the bond's handler,
so we no longer need to wildcard the device.

The oops can be reproduced by:

modprobe bonding

echo active-backup > /sys/class/net/bond0/bonding/mode
echo 100 > /sys/class/net/bond0/bonding/miimon
ifconfig bond0 xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx
echo +eth0 > /sys/class/net/bond0/bonding/slaves
echo +eth1 > /sys/class/net/bond0/bonding/slaves

echo +bond1 > /sys/class/net/bonding_masters
echo balance-alb > /sys/class/net/bond1/bonding/mode
echo 100 > /sys/class/net/bond1/bonding/miimon
ifconfig bond1 xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx
echo +eth2 > /sys/class/net/bond1/bonding/slaves
echo +eth3 > /sys/class/net/bond1/bonding/slaves

Pass some traffic on bond0.  Boom.

[ Tested, behaves as advertised.  I do not believe a test of the bonding
mode is necessary, as there is no race between the packet handler and
the bonding mode changing (the mode can only change when the device is
closed).  Also updated the log message to include the reproduction and
full commit ids.  -J ]

Signed-off-by: Greg Edwards <greg.edwards@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24 20:37:48 -07:00
Jay Vosburgh 03dc2f4c52 bonding: don't lock when copying/clearing VLAN list on slave
When copying VLAN information to or removing from a slave
during slave addition or removal, the bonding code currently holds
the bond->lock for write to prevent concurrent modification of the
vlan_list / vlgrp.

	This is unnecessary, as all of these operations occur under
RTNL.  Holding the bond->lock also caused might_sleep issues for
some drivers' ndo_vlan_* functions.  This patch removes the extra
locking.

	Problem reported by Michael Chan <mchan@broadcom.com>

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22 14:14:47 -07:00
Jay Vosburgh f35188faa0 bonding: change test for presence of VLANs
After commit ad1afb0039
("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
it is now regular practice for a VLAN "add vid" for VLAN 0 to
arrive prior to any VLAN registration or creation of a vlan_group.

	This patch updates the bonding code that tests for the presence
of VLANs configured above bonding.  The new logic tests for bond->vlgrp
to determine if a registration has occured, instead of testing that
bonding's internal vlan_list is empty.

	The old code would panic when vlan_list was not empty, but
vlgrp was still NULL (because only an "add vid" for VLAN 0 had occured).

	Bonding still adds VLAN 0 to its internal list so that 802.1p
frames are handled correctly on transmit when non-VLAN accelerated
slaves are members of the bond.  The test against bond->vlan_list
remains in bond_dev_queue_xmit for this reason.

	Modification to the bond->vlgrp now occurs under lock (in
addition to RTNL), because not all inspections of it occur under RTNL.

	Additionally, because 8021q will never issue a "kill vid" for
VLAN 0, there is now logic in bond_uninit to release any remaining
entries from vlan_list.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Cc: Pedro Garcia <pedro.netdev@dondevamos.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22 14:14:46 -07:00
Eric Dumazet 90e1795b9b bonding: avoid a warning
drivers/net/bonding/bond_main.c:179:12: warning: ‘disable_netpoll’
defined but not used

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-19 13:34:16 -07:00
Eric Dumazet db5dda9057 bonding: fix bond_inet6addr_event()
After commit ad1afb0039 (vlan_dev: VLAN 0 should be treated
as "no vlan tag" (802.1p packet)),
bond_inet6addr_event() might be called with a NULL bond->vlgrp pointer, and
a non empty bond->vlan_list. vlan_group_get_device() is dereferencing a NULL pointer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-19 09:34:31 -07:00
Nicolas de Pesloüan 79236680bd bonding: fix a buffer overflow in bonding_show_queue_id.
The test for buffer overflow ensures we have room for 6 more bytes.
sprintf, called with %s:%d, slave->dev->name, slave->queue_id may yield
far more than 6 bytes.

The correct test is res > (PAGE_SIZE - IFNAMSIZ - 6) .

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 18:24:54 -07:00
David S. Miller 597e608a84 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-07-07 15:59:38 -07:00
Eric Dumazet 28172739f0 net: fix 64 bit counters on 32 bit arches
There is a small possibility that a reader gets incorrect values on 32
bit arches. SNMP applications could catch incorrect counters when a
32bit high part is changed by another stats consumer/provider.

One way to solve this is to add a rtnl_link_stats64 param to all
ndo_get_stats64() methods, and also add such a parameter to
dev_get_stats().

Rule is that we are not allowed to use dev->stats64 as a temporary
storage for 64bit stats, but a caller provided area (usually on stack)

Old drivers (only providing get_stats() method) need no changes.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-07 14:58:56 -07:00
Flavio Leitner 42d782ac1b bonding: check if clients MAC addr has changed
When two systems using bonding devices in adaptive load
balancing (ALB) communicates with each other, an endless
ping-pong of ARP replies starts between these two systems.

What happens? In the ALB mode, bonding driver keeps track
of each client connected in a hash table, so it can do the
receive load balancing (RLB). This hash table is updated
when an ARP reply is received, then it scans for the client
entry, updates its MAC address and flag it to be announced
later. Therefore, two seconds later, the alb monitor runs
and send for each updated client entry two ARP replies
updating this specific client. The same process happens on
the receiving system, causing the endless ping-pong of arp
replies.

See more information including the relevant functions below:

   System 1                          System 2
    bond0                             bond0

   ping <system2>
    ARP request  --------->
                           <--------- ARP reply

+->rlb_arp_recv  <---------------------+   <--- loop begins
|  rlb_update_entry_from_arp           |
|  client_info->ntt = 1;               |
|  bond_info->rx_ntt = 1;              |
|                                      |
|         <communication succeed>      |
|                                      |
|  bond_alb_monitor                    |
|  rlb_update_rx_clients               |
|  rlb_update_client                   |
|  arp_create(ARPOP_REPLY)             |
|   send ARP reply -------------->     V
|   send ARP reply -------------->
|                               rlb_arp_recv
|                               rlb_update_entry_from_arp
|                               client_info->ntt = 1;
|                               bond_info->rx_ntt = 1;
|                           < snipped, same as in system 1>
+-------           <-------------- send ARP reply
                   <-------------- send ARP reply

Besides the unneeded networking traffic, this loop breaks
a cluster because a backup system can't take over the IP
address. There is always one system sending an ARP reply
poisoning the network.

This patch fixes the problem adding a check for the MAC
address before updating it. Thus, if the MAC address didn't
change, there is no need to update neither to announce it later.

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 13:51:11 -07:00
Andy Gospodarek c22d7ac844 bonding: prevent netpoll over bonded interfaces
Support for netpoll over bonded interfaces was added here:

	commit f6dc31a85c
	Author: WANG Cong <amwang@redhat.com>
	Date:   Thu May 6 00:48:51 2010 -0700

	    bonding: make bonding support netpoll

but it is bad enough that we should probably just disable netpoll over
bonding until some of the locking logic in the bonding driver is changed
or converted completely to RCU.  Simple actions like changing the active
slave in active-backup mode will hang the box if a high enough printk
debugging level is enabled.

Keeping the old code around will be good for anyone that wants to work
on it (and for after the RCU conversion), so I propose this small patch
rather than ripping it all out.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-28 23:54:10 -07:00
Ben Hutchings be1f3c2c02 net: Enable 64-bit net device statistics on 32-bit architectures
Use struct rtnl_link_stats64 as the statistics structure.

On 32-bit architectures, insert 32 bits of padding after/before each
field of struct net_device_stats to make its layout compatible with
struct rtnl_link_stats64.  Add an anonymous union in net_device; move
stats into the union and add struct rtnl_link_stats64 stats64.

Add net_device_ops::ndo_get_stats64, implementations of which will
return a pointer to struct rtnl_link_stats64.  Drivers that implement
this operation must not update the structure asynchronously.

Change dev_get_stats() to call ndo_get_stats64 if available, and to
return a pointer to struct rtnl_link_stats64.  Change callers of
dev_get_stats() accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-12 15:51:22 -07:00
Changli Gao d8d1f30b95 net-next: remove useless union keyword
remove useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10 23:31:35 -07:00
Andy Gospodarek bb1d912323 bonding: allow user-controlled output slave selection
v2: changed bonding module version, modified to apply on top of changes
from previous patch in series, and updated documentation to elaborate on
multiqueue awareness that now exists in bonding driver.

This patch give the user the ability to control the output slave for
round-robin and active-backup bonding.  Similar functionality was
discussed in the past, but Jay Vosburgh indicated he would rather see a
feature like this added to existing modes rather than creating a
completely new mode.  Jay's thoughts as well as Neil's input surrounding
some of the issues with the first implementation pushed us toward a
design that relied on the queue_mapping rather than skb marks.
Round-robin and active-backup modes were chosen as the first users of
this slave selection as they seemed like the most logical choices when
considering a multi-switch environment.

Round-robin mode works without any modification, but active-backup does
require inclusion of the first patch in this series and setting
the 'all_slaves_active' flag.  This will allow reception of unicast traffic on
any of the backup interfaces.

This was tested with IPv4-based filters as well as VLAN-based filters
with good results.

More information as well as a configuration example is available in the
patch to Documentation/networking/bonding.txt.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-05 02:23:17 -07:00
Andy Gospodarek ebd8e4977a bonding: add all_slaves_active parameter
v2: changed parameter name from 'keep_all' to 'all_slaves_active' and
skipped setting slaves to inactive rather than creating a new flag at
Jay's suggestion.

In an effort to suppress duplicate frames on certain bonding modes
(specifically the modes that do not require additional configuration on
the switch or switches connected to the host), code was added in the
generic receive patch in 2.6.16.  The current behavior works quite well
for most users, but there are some times it would be nice to restore old
functionality and allow all frames to make their way up the stack.

This patch adds support for a new module option and sysfs file called
'all_slaves_active' that will restore pre-2.6.16 functionality if the
user desires.  The default value is '0' and retains existing behavior,
but the user can set it to '1' and allow all frames up if desired.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-05 02:23:17 -07:00
Jiri Pirko 097811bb48 bonding: optimize tlb_get_least_loaded_slave
In the worst case, when the first loop breaks an the end of the slave list,
the slave list is iterated through twice. This patch reduces this
function only to one loop. Also makes it simpler.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 04:16:24 -07:00
Jiri Pirko 5206e24c2c bonding: remove unused original_flags struct slave member
This is stored but never restored. So remove this as it is useless.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 04:16:24 -07:00
Jiri Pirko c20811a79e bonding: move dev_addr cpy to bond_enslave
Move the code that copies slave's mac address in case that's the first slave into
bond_enslave. Ifenslave app does this also but that's not a problem. This is
something that should be done in bond_enslave, and it shound not matter from
where is it called.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 04:16:23 -07:00
Jiri Pirko f9f3545e1e bonding: make bonding_store_slaves simpler
This patch makes bonding_store_slaves function nicer and easier to understand.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:42 -07:00
Jiri Pirko 3dd90905e0 bonding: remove redundant checks from bonding_store_slaves V2
(it's actually the same as v1)

Remove checks that duplicates similar checks in bond_enslave.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:42 -07:00
Jiri Pirko b15ba0fbdc bonding: move slave MTU handling from sysfs V2
V1->V2: corrected res/ret use

For some reason, MTU handling (storing, and restoring) is taking  place in
bond_sysfs. The correct place for this code is in bond_enslave, bond_release.
So move it there.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:41 -07:00
Jiri Pirko 6458590999 bonding: remove unused variable "found"
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-02 03:39:40 -07:00
WANG Cong f6dc31a85c bonding: make bonding support netpoll
Based on Andy's work, but I modified a lot.

Similar to the patch for bridge, this patch does:

1) implement the 2 methods to support netpoll for bonding;

2) modify netpoll during forwarding packets via bonding;

3) disable netpoll support of bonding when a netpoll-unabled device
   is added to bonding;

4) enable netpoll support when all underlying devices support netpoll.

Cc: Andy Gospodarek <gospo@redhat.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-06 00:48:51 -07:00
David S. Miller 4a35ecf8bf Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bonding/bond_main.c
	drivers/net/via-velocity.c
	drivers/net/wireless/iwlwifi/iwl-agn.c
2010-04-06 23:53:30 -07:00
Jiri Pirko 22bedad3ce net: convert multicast list to list_head
Converts the list and the core manipulating with it to be the same as uc_list.

+uses two functions for adding/removing mc address (normal and "global"
 variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
 manipulation with lists on a sandbox (used in bonding and 80211 drivers)

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-03 14:22:15 -07:00
Jiri Pirko a748ee2426 net: move address list functions to a separate file
+little renaming of unicast functions to be smooth with multicast ones

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-03 14:22:11 -07:00
Amerigo Wang 9e2e61fbf8 bonding: fix potential deadlock in bond_uninit()
bond_uninit() is invoked with rtnl_lock held, when it does destroy_workqueue()
which will potentially flush all works in this workqueue, if we hold rtnl_lock
again in the work function, it will deadlock.

So move destroy_workqueue() to destructor where rtnl_lock is not held any more,
suggested by Eric.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-01 17:26:01 -07:00
Eric Dumazet 00ae702847 bonding: bond_xmit_roundrobin() fix
Commit a2fd940f (bonding: fix broken multicast with round-robin mode)
added a problem on litle endian machines.

drivers/net/bonding/bond_main.c:4159: warning: comparison is always
false due to limited range of data type

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-31 03:23:56 -07:00
Andy Gospodarek a2fd940f4c bonding: fix broken multicast with round-robin mode
Round-robin (mode 0) does nothing to ensure that any multicast traffic
originally destined for the host will continue to arrive at the host when
the link that sent the IGMP join or membership report goes down.  One of
the benefits of absolute round-robin transmit.

Keeping track of subscribed multicast groups for each slave did not seem
like a good use of resources, so I decided to simply send on the
curr_active slave of the bond (typically the first enslaved device that
is up).  This makes failover management simple as IGMP membership
reports only need to be sent when the curr_active_slave changes.  I
tested this patch and it appears to work as expected.

Originally reported by Lon Hohberger <lhh@redhat.com>.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Lon Hohberger <lhh@redhat.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-27 16:39:15 -07:00
Frans Pop 2381a55c88 net/various: remove trailing space in messages
Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24 14:10:38 -07:00
Jiri Pirko 32a806c194 bonding: flush unicast and multicast lists when changing type
After the type change, addresses in unicast and multicast lists wouldn't make
sense, not to mention possible different lenghts. So flush both lists here.

Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once
mc_list conversion will be done).

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21 18:31:34 -07:00
stephen hemminger 502a2ffd73 ipv6: convert idev_list to list macros
Convert to list macro's for the list of addresses per interface
in IPv6.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20 15:45:09 -07:00
Jiri Pirko 3ca5b4042e bonding: check return value of nofitier when changing type
This patch adds the possibility to refuse the bonding type change for
other subsystems (such as for example bridge, vlan, etc.)

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18 20:00:02 -07:00
Jiri Pirko 93d9b7d7a8 net: rename notifier defines for netdev type change
Since generally there could be more netdevices changing type other
than bonding, making this event type name "bonding-unrelated"

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-18 20:00:01 -07:00
Andi Kleen 28812fe11a driver-core: Add attribute argument to class_attribute show/store
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.

Also drivers can extend the attributes with own data fields
and use that in the low level function.

This makes the class attributes the same as sysdev_class attributes
and plain attributes.

This will allow further cleanups in drivers.

Full tree sweep converting all users.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:48 -08:00
Patrick McHardy 8d6184e488 bonding: fix device leak on error in bond_create()
When the register_netdevice() call fails, the newly allocated device is
not freed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-27 02:52:05 -08:00
Ajit Khaparde 35cfabdc5e bonding: Remove net_device_stats from bonding struct
There is no need to maintain stats in the bonding structure.
Use the instance of net_device_stats in netdevice.

Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-03 20:32:27 -08:00
David S. Miller 05ba712d7e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-01-28 06:12:38 -08:00
stephen hemminger b473946a08 bonding: bond_open error return value
The convention for API functions in kernel is to return errno value;
bond_open would return -1 if alb setup failed. The only reason that
could happen is if kmalloc() failed.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-28 05:55:54 -08:00
Alexey Dobriyan 2c8c1e7297 net: spread __net_init, __net_exit
__net_init/__net_exit are apparently not going away, so use them
to full extent.

In some cases __net_init was removed, because it was called from
__net_exit code.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-17 19:16:02 -08:00
David S. Miller d4a66e752d Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/benet/be_cmds.h
	include/linux/sysctl.h
2010-01-10 22:55:03 -08:00
Andy Gospodarek 1f3c8804ac bonding: allow arp_ip_targets on separate vlans to use arp validation
This allows a bond device to specify an arp_ip_target as a host that is
not on the same vlan as the base bond device and still use arp
validation.  A configuration like this, now works:

BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=10.0.100.1 arp_validate=3"

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever
9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever

Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.0.100.1

Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:40:05:30:ff:30

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:13:21:be:33:e9

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-01-03 21:17:16 -08:00
Dan Carpenter c99a3d2e04 bond_3ad.c avoid possible null deref
A few lines earlier we assume that best->slave could be either null or non-null so
we should check it here as well.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:24:46 -08:00
Joe Perches a4aee5c808 drivers/net/bonding/: : use pr_fmt
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove DRV_NAME from pr_<level>s
Consolidate long format strings
Remove some extra tab indents
Remove some unnecessary ()s from pr_<level>s arguments
Align pr_<level> arguments

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13 20:06:07 -08:00
Linus Torvalds 4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Thadeu Lima de Souza Cascardo 94e2bd6888 tree-wide: fix some typos and punctuation in comments
fix some typos and punctuation in comments

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:48 +01:00
Joe Perches 8e95a2026f drivers/net: Move && and || to end of previous line
Only files where David Miller is the primary git-signer.
wireless, wimax, ixgbe, etc are not modified.

Compile tested x86 allyesconfig only
Not all files compiled (not x86 compatible)

Added a few > 80 column lines, which I ignored.
Existing checkpatch complaints ignored.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 13:18:01 -08:00
Eric W. Biederman 15449745e5 net: Simplify the bond drivers pernet operations.
Take advantage of the new pernet automatic storage management,
and stop using compatibility network namespace functions.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01 16:15:53 -08:00
David S. Miller 3505d1a9fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/sfc/sfe4001.c
	drivers/net/wireless/libertas/cmd.c
	drivers/staging/Kconfig
	drivers/staging/Makefile
	drivers/staging/rtl8187se/Kconfig
	drivers/staging/rtl8192e/Kconfig
2009-11-18 22:19:03 -08:00
Eric Dumazet f99189b186 netns: net_identifiers should be read_mostly
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:03:25 -08:00
Jay Vosburgh 2d6682db11 bonding: fix 802.3ad standards compliance error
The language of 802.3ad 43.4.9 requires the "recordPDU" function
to, in part, compare the Partner parameter values in a received LACPDU
to the stored Actor values.  If those match, then the Partner's
synchronization state is set to true.

	The current 802.3ad implementation is performing these steps out
of order; first, the synchronization check is done, then the paramters are
checked to see if they match (the synch check being done against a match
check of a prior LACPDU).  This causes delays in establishing aggregators
in some circumstances.

	This patch modifies the 802.3ad code to call __choose_matched,
the function that does the "match" comparisions, as the first step of
__record_pdu, instead of immediately afterwards.  This new behavior is
in compliance with the language of the standard.

	Some additional commentary relating to code vs. standard is also
added.

	Reported by Martin Patterson <martin@gear6.com> who also supplied
the logic of the fix and verified the patch.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15 22:21:34 -08:00
Eric W. Biederman 6639104bd8 bond: Get the rtnl_link_ops support correct
- Don't call rtnl_link_unregister if rtnl_link_register fails
- Set .priv_size so we aren't stomping on uninitialized memory
  when we use netdev_priv, on bond devices created with
  ip link add type bond.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:22 -07:00
Eric W. Biederman ec87fd3b4e bond: Add support for multiple network namespaces
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:21 -07:00
Eric W. Biederman 88ead97710 bond: Implement a basic set of rtnl link ops
This implements a basic set of rtnl link ops and takes advantage of
the fact that rtnl_link_unregister kills all of the surviving
devices to all us to kill bond_free_all.  A module alias
is added so ip link add can pull in the bonding module.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:21 -07:00
Eric W. Biederman c67dfb299e bond: Simplify bond device destruction
Manually inline the code from bond_deinit to bond_uninit.  bond_uninit
is the only caller and it is short.

Move the call of bond_release_all from the netdev notifier into
bond_uninit.  The call site is effectively the same and performing
the call explicitly allows all the paths for destroying a
bonding device to behave the same way.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:20 -07:00
Eric W. Biederman 30c15ba993 bond: Simplify bond_create.
Stop calling dev_get_by_name to see if the bond device already
exists.  register_netdevice already does that.

Stop calling bond_deinit if register_netdevice fails as bond_uninit
is guaranteed to be called if bond_init succeeds.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:19 -07:00
Eric W. Biederman 6151b3d435 bond: Simply bond sysfs group creation
This patch delegates the work of creating the sysfs groups
to the netdev layer and ultimately to the device layer.  This
closes races between uevents.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:19 -07:00
David S. Miller 0519d83d83 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-10-29 21:28:59 -07:00
Linus Torvalds 49b2de8e6f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (43 commits)
  net: Fix 'Re: PACKET_TX_RING: packet size is too long'
  netdev: usb: dm9601.c can drive a device not supported yet, add support for it
  qlge: Fix firmware mailbox command timeout.
  qlge: Fix EEH handling.
  AF_RAW: Augment raw_send_hdrinc to expand skb to fit iphdr->ihl (v2)
  bonding: fix a race condition in calls to slave MII ioctls
  virtio-net: fix data corruption with OOM
  sfc: Set ip_summed correctly for page buffers passed to GRO
  cnic: Fix L2CTX_STATUSB_NUM offset in context memory.
  MAINTAINERS: rt2x00 list is moderated
  airo: Reorder tests, check bounds before element
  mac80211: fix for incorrect sequence number on hostapd injected frames
  libertas spi: fix sparse errors
  mac80211: trivial: fix spelling in mesh_hwmp
  cfg80211: sme: deauthenticate on assoc failure
  mac80211: keep auth state when assoc fails
  mac80211: fix ibss joining
  b43: add 'struct b43_wl' missing declaration
  b43: Fix Bugzilla #14181 and the bug from the previous 'fix'
  rt2x00: Fix crypto in TX frame for rt2800usb
  ...
2009-10-29 09:22:08 -07:00
Jiri Bohac d9d5283228 bonding: fix a race condition in calls to slave MII ioctls
In mii monitor mode, bond_check_dev_link() calls the the ioctl
handler of slave devices. It stores the ndo_do_ioctl function
pointer to a static (!) ioctl variable and later uses it to call the
handler with the IOCTL macro.

If another thread executes bond_check_dev_link() at the same time
(even with a different bond, which none of the locks prevent), a
race condition occurs. If the two racing slaves have different
drivers, this may result in one driver's ioctl handler being
called with a pointer to a net_device controlled with a different
driver, resulting in unpredictable breakage.

Unless I am overlooking something, the "static" must be a
copy'n'paste error (?).

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 22:23:54 -07:00
Jasper Spaans a361c83cb4 bonding: Remove bond_dev from xmit_hash_policy call.
Now that the bonding device is no longer used in determining the device to
which to send packets, it can be dropped from the argument list of the various
xmit_hash_policy calls.

Signed-off-by: Jasper Spaans <spaans@fox-it.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-27 01:05:13 -07:00
David S. Miller cfadf853f6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/sh_eth.c
2009-10-27 01:03:26 -07:00
Jasper Spaans d3da68310a bonding: Modify hash transmit policies to use the packet's source MAC address
Modify bonding hash transmit policies to use the psource MAC address of
the packet instead of the MAC address configured for the bonding device.

The old sitation conflicts with the documentation.

Signed-off-by: Jasper Spaans <spaans@fox-it.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-24 07:02:28 -07:00
Nicolas de Pesloüan 38fc0026da bonding: change bond_create_proc_entry() to return void
The function bond_create_proc_entry is currently of type int.

Two versions of this function exist:

The one in the ifdef CONFIG_PROC_FS branch always return 0.
The one in the else branch (which is empty) return nothing.

When CONFIG_PROC_FS is undef, this cause the following warning:

drivers/net/bonding/bond_main.c: In function `bond_create_proc_entry':
drivers/net/bonding/bond_main.c:3393: warning: control reaches end of
non-void function

No caller of this function use the returned value.

So change the returned type from int to void and remove the
useless return 0; .

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-13 00:45:06 -07:00
Alexey Dobriyan d43c36dc6b headers: remove sched.h from interrupt.h
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-11 11:20:58 -07:00
Nicolas de Pesloüan 49b4ad92d1 bonding: remove useless assignment
The variable old_active is first set to bond->curr_active_slave.
Then, it is unconditionally set to new_active, without being used in between.

The first assignment, having no side effect, is useless.

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07 14:11:00 -07:00
Nicolas de Pesloüan 3c6aaa2461 bonding: fix a parameter name in error message
When parsing module parameters, bond_check_params() erroneously use
'xor_mode' as the name of a module parameter in an error message.

The right name for this parameter is 'xmit_hash_policy'.

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07 14:10:36 -07:00
Jiri Pirko a549952ad3 bonding: introduce primary_reselect option
In some cases there is not desirable to switch back to primary interface when
it's link recovers and rather stay with currently active one. We need to avoid
packetloss as much as we can in some cases. This is solved by introducing
primary_reselect option. Note that enslaved primary slave is set as current
active no matter what.

Patch modified by Jay Vosburgh as follows: fixed bug in action
after change of option setting via sysfs, revised the documentation
update, and bumped the bonding version number.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07 01:07:39 -07:00
Jiri Pirko ce501caf16 bonding: set primary param via sysfs
Primary module parameter passed to bonding is pernament. That means if you
release the primary slave and enslave it again, it becomes the primary slave
again. But if you set primary slave via sysfs, the primary slave is only set
once and it's not remembered in bond->params structure. Therefore the setting is
lost after releasing the primary slave. This simple one-liner fixes this.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-01 14:34:29 -07:00
Anand Gadiyar fd589a8f0a trivial: fix typo "to to" in multiple files
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-21 15:14:55 +02:00
Jiri Pirko b9f602533e bonding: make ab_arp select active slaves as other modes
When I was implementing primary_passive option (formely named primary_lazy) I've
run into troubles with ab_arp. This is the only mode which is not using
bond_select_active_slave() function to select active slave and instead it
selects it itself. This seems to be not the right behaviour and it would be
better to do it in bond_select_active_slave() for all cases. This patch makes
this happen. Please review.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-16 17:04:58 -07:00
Moni Shoua 75c78500dd bonding: remap muticast addresses without using dev_close() and dev_open()
This patch fixes commit e36b9d16c6. The approach
there is to call dev_close()/dev_open() whenever the device type is changed in
order to remap the device IP multicast addresses to HW multicast addresses.
This approach suffers from 2 drawbacks:

*. It assumes tha the device is UP when calling dev_close(), or otherwise
   dev_close() has no affect. It is worth to mention that initscripts (Redhat)
   and sysconfig (Suse) doesn't act the same in this matter. 
*. dev_close() has other side affects, like deleting entries from the routing
   table, which might be unnecessary.

The fix here is to directly remap the IP multicast addresses to HW multicast
addresses for a bonding device that changes its type, and nothing else.
   
Reported-by:   Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-15 02:37:40 -07:00
Eric Dumazet 885a136c52 bonding: use compare_ether_addr_64bits() in ALB
We can speedup ether addresses compares using compare_ether_addr_64bits()
instead of memcmp(). We make sure all operands are at least 8 bytes long and
16bits aligned (or better, long word aligned if possible)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-01 17:40:26 -07:00
Stephen Hemminger 424efe9caf netdev: convert pseudo drivers to netdev_tx_t
These are all drivers that don't touch real hardware.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-01 01:13:40 -07:00
Petri Gynther 6c9888532b bonding: Have bond_check_dev_link examine netif_running
bonding: Have bond_check_dev_link examine netif_running

	Some network devices do not call netif_carrier_off when they
are set administratively down.  Have the bonding link check function
also inspect the netif_running state.  Ignore netif_running if the
bond_check_dev_link function is called with "reporting" set, as in that
case it's inspecting the capabilities of the non-netif_carrier device
driver.

Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:01:20 -07:00
Nicolas de Pesloüan f584130616 bonding: Fix useless test: int > INT_MAX
max_bonds is of type int and cannot be greater than INT_MAX.

Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:01:16 -07:00
Stephen Hemminger 89c76c62f1 bonding: use compare_ether_addr
Bonding can use compare_ether_addr() in bond_release.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:01:15 -07:00
Jay Vosburgh 278339a42a bonding: propogate vlan_features to bonding master
Propogate the vlan_features of the slave devices to the bonding
master device, using the same logic as for regular features.

	Tested by Or Gerlitz <ogerlitz@voltaire.com>, who also removed
the debug logic from the original test patch.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-28 23:01:12 -07:00
Jiri Pirko e5e2a8fd83 bonding: wipe out printk's
I did not introduce new lines over 80 chars. I even eliminated some of
them.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-13 16:43:32 -07:00
David S. Miller da8120355e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/orinoco/main.c
2009-07-16 20:21:24 -07:00
Moni Shoua e36b9d16c6 bonding: clean muticast addresses when device changes type
Bonding device forbids slave device of different types under the same
master.

However, it is possible for a bonding master to change type during its
lifetime.  This can be either from ARPHRD_ETHER to ARPHRD_INFINIBAND
or the other way arround.  The change of type requires device level
multicast address cleanup because device level multicast addresses
depend on the device type.

The patch adds a call to dev_close() before the bonding master changes
type and dev_open() just after that.

In the example below I enslaved an IPoIB device (ib0) under
bond0. Since each bonding master starts as device of type ARPHRD_ETHER
by default, a change of type occurs when ib0 is enslaved.

This is how /proc/net/dev_mcast looks like without the patch

5    bond0           1     0     00ffffffff12601bffff000000000001ff96ca05
5    bond0           1     0     01005e000116
5    bond0           1     0     01005e7ffffd
5    bond0           1     0     01005e000001
5    bond0           1     0     333300000001
6    ib0             1     0     00ffffffff12601bffff000000000001ff96ca05
6    ib0             1     0     333300000001
6    ib0             1     0     01005e000001
6    ib0             1     0     01005e7ffffd
6    ib0             1     0     01005e000116
6    ib0             1     0     00ffffffff12401bffff00000000000000000001
6    ib0             1     0     00ffffffff12601bffff00000000000000000001

and this is how it looks like after the patch.

5    bond0           1     0     00ffffffff12601bffff000000000001ff96ca05
5    bond0           1     0     00ffffffff12601bffff00000000000000000001
5    bond0           1     0     00ffffffff12401bffff0000000000000ffffffd
5    bond0           1     0     00ffffffff12401bffff00000000000000000116
5    bond0           1     0     00ffffffff12401bffff00000000000000000001
6    ib0             1     0     00ffffffff12601bffff000000000001ff96ca05
6    ib0             1     0     00ffffffff12401bffff00000000000000000116
6    ib0             1     0     00ffffffff12401bffff0000000000000ffffffd
6    ib0             2     0     00ffffffff12401bffff00000000000000000001
6    ib0             2     0     00ffffffff12601bffff00000000000000000001

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-16 18:05:23 -07:00
Julia Lawall 8e321c4f72 drivers/net/bonding: Adjust constant name
AD_SHORT_TIMEOUT and AD_STATE_LACP_ACTIVITY have the same value, but
AD_STATE_LACP_ACTIVITY better reflects the intended semantics.

[ J adds: AD_STATE_LACP_ACTIVITY is a value defined by the standard, and
should be set here in accordance with 802.3ad 43.4.12; AD_SHORT_TIMEOUT
is a constant specific to the Linux 802.3ad implementation that happens
to have the same value ]

The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
struct port_params p;
@@
* p.port_state |= AD_SHORT_TIMEOUT
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12 20:11:32 -07:00
Patrick McHardy ec634fe328 net: convert remaining non-symbolic return values in ndo_start_xmit() functions
This patch converts the remaining occurences of raw return values to their
symbolic counterparts in ndo_start_xmit() functions that were missed by the
previous automatic conversion.

Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero
is changed to explicitly use NETDEV_TX_OK.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 19:23:38 -07:00
Stephen Hemminger 181470fcf3 bonding: initialization rework
Need to rework how bonding devices are initialized to make it more
amenable to creating bonding devices via netlink.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:29:04 -07:00
Stephen Hemminger 5c5129b54f bonding: use is_zero_ether_addr
Remove bogus non-portable possibly unaligned way of testing
for zero addres..

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:29:03 -07:00
Stephen Hemminger 373500db92 bonding: network device names are case sensative
The bonding device acts unlike all other Linux network device functions
in that it ignores case of device names. The developer must have come
from windows!

Cleanup the management of names and use standard routines where possible.
Flag places where bonding device still doesn't work right with network
namespaces.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:29:01 -07:00
Stephen Hemminger 6d7ab43ccc bonding: elminate bad refcount code
The "expected_refcount" stuff in bonding sysfs module is a mistake.
Sysfs does proper refcounting, and it is okay to remove a bond device
that has some user process holding the file open.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:29:00 -07:00
Stephen Hemminger 3d632c3f28 bonding: fix style issues
Resolve some of the complaints from checkpatch, and remove "magic emacs format"
comments, and useless MODULE_SUPPORTED_DEVICE(). But should not
change actual code.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:57 -07:00
Stephen Hemminger 9e71626c1c bonding: fix destructor
It is not safe to use a network device destructor that is a function in
the module, since it can be called after module is unloaded if sysfs
handle is open.

When eventually using netlink, the device cleanup code needs to be done
via uninit function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:56 -07:00
Stephen Hemminger 7e08384045 bonding: remove bonding read/write semaphore
The whole read/write semaphore locking can be removed. It doesn't add any
protection that isn't already done by using the RTNL mutex properly.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:54 -07:00
Stephen Hemminger d93216051a bonding: initialize before registration
Avoid a unnecessary carrier state transistion that happens when device
is registered.
Lockdep works better if initialization is done before registration as well.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:52 -07:00
Stephen Hemminger d2991f7535 bonding: bond_create always called with default parameters
bond_create() is always called with same parameters so move the argument
down.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-13 23:28:51 -07:00
Stephen Hemminger 130aa61a77 bonding: fix multiple module load problem
Some users still load bond module multiple times to create bonding
devices.  This accidentally was broken by a later patch about
the time sysfs was fixed.  According to Jay, it was broken
by:
   commit b8a9787edd
   Author: Jay Vosburgh <fubar@us.ibm.com>
   Date:   Fri Jun 13 18:12:04 2008 -0700

     bonding: Allow setting max_bonds to zero

Note: sysfs and procfs still produce WARN() messages when this is done
so the sysfs method is the recommended API.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-11 05:46:04 -07:00
Jiri Pirko ae63e808f5 bonding: use bond_is_lb() when it's appropriate
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 22:08:31 -07:00
Eric Dumazet 93f154b594 net: release dst entry in dev_hard_start_xmit()
One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls
dev_kree_skb() long after original call to dev_queue_xmit(skb).

CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.

It seems right place to release dst is in dev_hard_start_xmit(), for most
devices but ones that are virtual, and some exceptions.

David Miller suggested to define a new device flag, set in alloc_netdev_mq()
(so that most devices set it at init time), and carefuly unset in devices
which dont want a NULL skb->dst in their ndo_start_xmit().

List of devices that must clear this flag is :

- loopback device, because it calls netif_rx() and quoting Patrick :
    "ip_route_input() doesn't accept loopback addresses, so loopback packets
     already need to have a dst_entry attached."
- appletalk/ipddp.c : needs skb->dst in its xmit function

- And all devices that call again dev_queue_xmit() from their xmit function
(as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:19:19 -07:00
Eric W. Biederman 496a60cdcd net: FIX bonding sysfs rtnl_lock deadlock
Sysfs files for a network device can not unconditionally take the
rtnl_lock as the bonding sysfs files do.  If someone accesses those
sysfs files while the network device is being unregistered with the
rtnl_lock held we will deadlock.

So use trylock and restart_syscall to avoid this problem.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:16:00 -07:00
David S. Miller bb803cfbec Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/scsi/fcoe/fcoe.c
2009-05-18 21:08:20 -07:00
Stephen Hemminger 4cd6fe1c64 bonding: fix link down handling in 802.3ad mode
One of the purposes of bonding is to allow for redundant links, and failover
correctly if the cable is pulled. If all the members of a bonded device have
no carrier present, the bonded device itself needs to report no carrier present
to user space so management tools (like routing daemons) can respond.

Bonding in 802.3ad mode does not work correctly for this because it incorrectly
chooses a link that is down as a possible aggregator.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-17 21:15:54 -07:00
Eric Dumazet 9d21493b4b net: tx scalability works : trans_start
struct net_device trans_start field is a hot spot on SMP and high performance
devices, particularly multi queues ones, because every transmitter dirties
it. Is main use is tx watchdog and bonding alive checks.

But as most devices dont use NETIF_F_LLTX, we have to lock
a netdev_queue before calling their ndo_start_xmit(). So it makes
sense to move trans_start from net_device to netdev_queue. Its update
will occur on a already present (and in exclusive state) cache line, for
free.

We can do this transition smoothly. An old driver continue to
update dev->trans_start, while an updated one updates txq->trans_start.

Further patches could also put tx_bytes/tx_packets counters in 
netdev_queue to avoid dirtying dev->stats (vlan device comes to mind)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-17 20:55:16 -07:00
Jiri Pirko 3a6d54c563 net: remove needless (now buggy) & from dev->dev_addr
Patch fixes issues with dev->dev_addr changing from array to pointer.
Hopefully there are no others.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-17 11:59:47 -07:00
Florian Westphal 9d34d1a20e bonding: fix panic if initialization fails
If module initialisation failed (e.g. because the bonding sysfs entry
cannot be created), kernel panics:
 IP: [<ffffffff8024910a>] destroy_workqueue+0x2d/0x146
Call Trace:
 [<ffffffff808268c4>] bond_destructor+0x28/0x78
 [<ffffffff80b64471>] netdev_run_todo+0x231/0x25a
 [<ffffffff80b6dbcd>] rtnl_unlock+0x9/0xb
 [<ffffffff81567907>] bonding_init+0x83e/0x84a

Remove the calls to bond_work_cancel_all() and destroy_workqueue();
both are also called/scheduled via bond_free_all().

bond_destroy_sysfs is unecessary because the sysfs entry has
not been created in the error case.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-09 13:19:47 -07:00
Richard Genoud ed9b58bc44 Remove duplicate slow protocol define in bond_3ad.h
ETH_P_SLOW is already defined in include/linux/if_ether.h.
There's no need to define BOND_ETH_P_LACPDU in drivers/net/bonding/bond_3ad.h

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-09 13:15:49 -07:00
David S. Miller 22f6dacdfc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	include/net/tcp.h
2009-05-08 02:48:30 -07:00
Jiri Pirko aee64faf23 bonding: get rid of CONFIG_PROC_FS ifdefs
Remove CONFIG_PROC_FS ifdefs from the code by adding void functions.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

 drivers/net/bonding/bond_main.c |   30 ++++++++++++++++++++----------
 1 files changed, 20 insertions(+), 10 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-05 12:26:25 -07:00
Jay Vosburgh 815bcc2719 bonding: fix alb mode locking regression
Fix locking issue in alb MAC address management; removed
incorrect locking and replaced with correct locking.  This bug was
introduced in commit 059fe7a578
("bonding: Convert locks to _bh, rework alb locking for new locking")

	Bug reported by Paul Smith <paul@mad-scientist.net>, who also
tested the fix.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-04 21:28:10 -07:00
David S. Miller d252a5e7b7 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-05-03 14:07:43 -07:00
Jiri Pirko 1363d9b135 bonding: correct the cleanup in bond_create()
This patch makes the cleanup in bond_create nicer :) Also now the forgotten
free_netdev is called.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-01 15:35:28 -07:00
Eric Dumazet 689c96cca7 bonding: bond_slave_info_query() fix
bond_slave_info_query() should keep a read lock while accessing slave info,
or risk accessing stale data and corruption.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-01 15:18:19 -07:00
Jiri Pirko 41f8910040 bonding: ignore updelay param when there is no active slave
Pointed out by Sean E. Millichamp.

Quote from Documentation/networking/bonding.txt:
"Note that when a bonding interface has no active links, the
driver will immediately reuse the first link that goes up, even if the
updelay parameter has been specified (the updelay is ignored in this
case).  If there are slave interfaces waiting for the updelay timeout
to expire, the interface that first went into that state will be
immediately reused.  This reduces down time of the network if the
value of updelay has been overestimated, and since this occurs only in
cases with no connectivity, there is no additional penalty for
ignoring the updelay."

This patch actually changes the behaviour in this way.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

 drivers/net/bonding/bond_main.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-27 02:57:44 -07:00
Jiri Pirko 29112f4e24 bonding: use ethtool for link checking first
This patch only changes the order of interfaces to use for checking slave link
status in bond_check_dev_link() to priorize ethtool interface. Should safe some
troubles as ethtool seems to be more supported.

Jirka

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

 drivers/net/bonding/bond_main.c |   26 ++++++++++++--------------
 1 files changed, 12 insertions(+), 14 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-27 02:57:44 -07:00
Jay Vosburgh 2690f8d62e bonding: Remove debug printk
Remove debug printk I accidently left in as part of commit:

commit 6146b1a4da
Author: Jay Vosburgh <fubar@us.ibm.com>
Date:   Tue Nov 4 17:51:15 2008 -0800

    bonding: Fix ALB mode to balance traffic on VLANs

	Reported by Duncan Gibb <duncan.gibb@siriusit.co.uk>

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-14 16:53:14 -07:00
Brian Haley 5a31bec014 Bonding: fix zero address hole bug in arp_ip_target list
Fix a zero address hole bug in the bonding arp_ip_target list
that was causing the bond to ignore ARP replies (bugz 13006).
Instead of just setting the array entry to zero, we now
copy any additional entries down one slot, putting the
zero entry at the end.  With this change we can now have
all the loops that walk the array stop when they hit a zero
since there will be no addresses after it.

Changes are based in part on code fragment provided in kernel:
bugzilla 13006:

	http://bugzilla.kernel.org/show_bug.cgi?id=13006

by Steve Howard <steve@astutenetworks.com>

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-13 00:12:41 -07:00
Alexey Dobriyan 99b7623380 proc 2/2: remove struct proc_dir_entry::owner
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:44 +04:00
Jiri Pirko 5a29f7893f bonding: select current active slave when enslaving device for mode tlb and alb
I've hit an issue on my system when I've been using RealTek RTL8139D cards in
bonding interface in mode balancing-alb. When I enslave a card, the current
active slave (bond->curr_active_slave) is not set and the link is therefore
not functional.

----
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: None
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1f:1f:01:2f:22
----

The thing that gets it right is when I unplug the cable and then I put it back
into the NIC. Then the current active slave is set to eth1 and link is working
just fine. Here is dmesg log with bonding DEBUG messages turned on:
----
ADDRCONF(NETDEV_UP): bond0: link is not ready
event_dev: bond0, event: 1
IFF_MASTER
event_dev: bond0, event: 8
IFF_MASTER
bond_ioctl: master=bond0, cmd=35216
slave_dev=cac5d800: 
slave_dev->name=eth1: 
eth1: ! NETIF_F_VLAN_CHALLENGED
event_dev: eth1, event: 8
eth1: link up, 100Mbps, full-duplex, lpa 0xC5E1
event_dev: eth1, event: 1
event_dev: eth1, event: 8
IFF_SLAVE
Initial state of slave_dev is BOND_LINK_UP
bonding: bond0: enslaving eth1 as an active interface with an up link.
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
event_dev: bond0, event: 4
IFF_MASTER
bond0: no IPv6 routers present

<<<<cable unplug>>>>

eth1: link down
event_dev: eth1, event: 4
IFF_SLAVE
bonding: bond0: link status definitely down for interface eth1, disabling it
event_dev: bond0, event: 4
IFF_MASTER

<<<<cable plug>>>>

eth1: link up, 100Mbps, full-duplex, lpa 0xC5E1
event_dev: eth1, event: 4
IFF_SLAVE
bonding: bond0: link status definitely up for interface eth1.
bonding: bond0: making interface eth1 the new active one.
event_dev: eth1, event: 8
IFF_SLAVE
event_dev: eth1, event: 8
IFF_SLAVE
bonding: bond0: first active interface up!
event_dev: bond0, event: 4
IFF_MASTER
----

The current active slave is set by calling bond_select_active_slave() function
from bond_miimon_commit() function when the slave (eth1) link goes to state up.

I also tested this on other machine with Broadcom NetXtreme II BCM5708
1000Base-T NIC and there all works fine. The thing is that this adapter is down
and goes up after few seconds after it is enslaved.

This patch calls bond_select_active_slave() in bond_enslave() function for modes
alb and tlb and makes sure that the current active slave is set up properly even
when the slave state is already up. Tested on both systems, works fine.

Notice: The same problem can maybe also occrur in mode 8023AD but I'm unable to
test that.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-25 17:23:38 -07:00