For case of bridging pseudo devices, the get created/destroyed (Xen)
need to allow setting address to any valid value.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Need to tell STP daemon about more events, like any time a
device is added even when it is down.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
This patchset based on work by Aji_Srinivas@emc.com provides allows
spanning tree to be controled from userspace. Like hotplug, it
uses call_usermodehelper when spanning tree is enabled so there
is no visible API change. If call to start usermode STP fails
it falls back to existing kernel STP.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
The RSTP daemon needs to be able to flush all dynamic forwarding
entries in the case of topology change.
This is a temporary interface. It will change to a netlink interface
before RSTP daemon is officially released.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Instead of hashing the whole Ethernet address, it should be faster
to just use the last 4 bytes. Add a random salt value to the hash
to make it more difficult to construct worst case DoS hash chains.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
While in the STP learning state, don't route packets; wait until
forwarding delay has expired. The purpose of the forwarding delay
is to detect loops in the network, and if a brouter started up
and started forwarding, it could cause a flood.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Change the bridging hook to be simple function with return value
rather than modifying the skb argument. This could generate better
code and is cleaner.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
To clearly state the intent of copying from linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Right now Xen has a horrible hack that lets it forward packets with
partial checksums. One of the reasons that CHECKSUM_PARTIAL and
CHECKSUM_COMPLETE were added is so that we can get rid of this hack
(where it creates two extra bits in the skbuff to essentially mirror
ip_summed without being destroyed by the forwarding code).
I had forgotten that I've already gone through all the deivce drivers
last time around to make sure that they're looking at ip_summed ==
CHECKSUM_PARTIAL rather than ip_summed != 0 on transmit. In any case,
I've now done that again so it should definitely be safe.
Unfortunately nobody has yet added any code to update CHECKSUM_COMPLETE
values on forward so we I'm setting that to CHECKSUM_NONE. This should
be safe to remove for bridging but I'd like to check that code path
first.
So here is the patch that lets us get rid of the hack by preserving
ip_summed (mostly) on forwarded packets.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now the skb->nh union has just one member, .raw, i.e. it is just like the
skb->mac union, strange, no? I'm just leaving it like that till the transport
layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or
->mac_header_offset?), ditto for ->{h,nh}.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use logic operations rather than memcmp() to compare destination
address with link local multicast addresses.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the network header, it is still legal
to touch skb->nh.raw directly if just adding to, subtracting from or setting it
to another layer header.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the places where we need a pointer to the mac header, it is still legal to
touch skb->mac.raw directly if just adding to, subtracting from or setting it
to another layer header.
This one also converts some more cases to skb_reset_mac_header() that my
regex missed as it had no spaces before nor after '=', ugh.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple case, next will handle the slightly more
"complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently use a special structure (struct skb_timeval) and plain
'struct timeval' to store packet timestamps in sk_buffs and struct
sock.
This has some drawbacks :
- Fixed resolution of micro second.
- Waste of space on 64bit platforms where sizeof(struct timeval)=16
I suggest using ktime_t that is a nice abstraction of high resolution
time services, currently capable of nanosecond resolution.
As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
a 8 byte shrink of this structure on 64bit architectures. Some other
structures also benefit from this size reduction (struct ipq in
ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)
Once this ktime infrastructure adopted, we can more easily provide
nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
SO_TIMESTAMPNS/SCM_TIMESTAMPNS)
Note : this patch includes a bug correction in
compat_sock_get_timestamp() where a "err = 0;" was missing (so this
syscall returned -ENOENT instead of 0)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
CC: Stephen Hemminger <shemminger@linux-foundation.org>
CC: John find <linux.kernel@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
compare_ether_addr() implicitly requires that the addresses
passed are 2-bytes aligned in memory.
This is not true for br_stp_change_bridge_id() and
br_stp_recalculate_bridge_id() in which one of the addresses
is unsigned char *, and thus may not be 2-bytes aligned.
Signed-off-by: Evgeny Kravtsunov <emkravts@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Turning up the warnings on gcc makes it emit warnings
about the placement of 'inline' in function declarations.
Here's everything that was under net/
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
br_fdb_get use atomic_inc to increase the refcount of an element found
on a RCU protected list, which can lead to the following race:
CPU0 CPU1
br_fdb_get: rcu_read_lock
__br_fdb_get: find element
fdb_delete: hlist_del_rcu
br_fdb_put
br_fdb_put: atomic_dec_and_test
call_rcu(fdb_rcu_free) br_fdb_get: atomic_inc
rcu_read_unlock
fdb_rcu_free: kmem_cache_free
Use atomic_inc_not_zero instead.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
One change introduced by the workqueue removal patch is that adding an
interface that is up to a bridge which is also up does not ever call
br_stp_enable_port(), leaving the port in DISABLED state until we do
ifconfig down and up or link events occur.
The following patch to the br_add_if function fixes it.
This is a regression introduced in 2.6.21.
Submitted-by: Aji_Srinivas@emc.com
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change goes with earlier change to get rid of
work queue for path cost. Now stp_set_path_cost does its own
locking. This is to allow it to call br_path_cost() which calls
ethtool interfaces (might sleep).
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Having a work queue for checking carrier leads to lots of race issues.
Simpler to just get the cost when data structure is created and
update on change.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge hasn't used miscdevice for a long long time.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We frequently need the maximum number of possible processors in order to
allocate arrays for all processors. So far this was done using
highest_possible_processor_id(). However, we do need the number of
processors not the highest id. Moreover the number was so far dynamically
calculated on each invokation. The number of possible processors does not
change when the system is running. We can therefore calculate that number
once.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name. Which is
pain for caching and the proc interface never implemented.
I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.
So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there. Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.
To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.
Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm. I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).
Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Acked-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- rename nf_logging to nf_loggers since its an array of registered loggers
- rename nf_log_unregister_logger() to nf_log_unregister() to make it
symetrical to nf_log_register() and convert all users
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netlink users BUG when the allocated skb for an event
notification is undersized. While this is certainly a kernel bug,
its not critical and crashing the kernel is too drastic, especially
when considering that these errors have appeared multiple times in
the past and it BUGs even if no listeners are present.
This patch replaces BUG by WARN_ON and changes the notification
functions to inform potential listeners of undersized allocations
using a unique error code (EMSGSIZE).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This lets the network core have the ability to handle suspend/resume
issues, if it wants to.
Thanks to Frederik Deweerdt <frederik.deweerdt@gmail.com> for the arm
driver fixes.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We cannot compute the gap until we know we have a 'struct ebt_entry' and
not 'struct ebt_entries'. Failure to check can cause crash.
Tested-by: Santiago Garcia Mantinan <manty@manty.net>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the deferred hooks and all related code as scheduled in
feature-removal-schedule.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace all uses of kmem_cache_t with struct kmem_cache.
The patch was generated using the following script:
#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#
set -e
for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done
The script was run like this
sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The attached patch resolves an issue where a IP DNATed packet with a
martian source is forwarded while it's better to drop it. It also
resolves messages complaining about ip forwarding being disabled while
it's actually enabled. Thanks to lepton <ytht.net@gmail.com> for
reporting this problem.
This is probably a candidate for the -stable release.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
... and pass just repl->name to translate_table()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can check newinfo->hook_entry[...] instead.
Kill unused argument.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check for valid_hooks is redundant (newinfo->hook_entry[i] will
be NULL if bit i is not set). Kill it, kill unused arguments.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since newinfo->hook_table[] already has been set up, we can switch to using
it instead of repl->{hook_info,valid_hooks}.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Take intialization of ->hook_entry[...], ->entries_size and ->nentries
over there, pull the check for empty chains into the end of that sucker.
Now it's self-contained, so we can move it up in the very beginning of
translate_table() *and* we can rely on ->hook_entry[] being properly
transliterated after it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's easier to expand the iterator here *and* we'll be able to move all
uses of ebt_replace from translate_table() into this one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split ebt_check_entry_size_and_hooks() in two parts - one that does
sanity checks on pointers (basically, checks that we can safely
use iterator from now on) and the rest of it (looking into details
of entry).
The loop applying ebt_check_entry_size_and_hooks() is split in two.
Populating newinfo->hook_entry[] is done in the first part.
Unused arguments killed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to revisit a chain we'd already finished with during
the check for current hook. It's either instant loop (which
we'd just detected) or a duplicate work.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need that for iterator to work; existing check had been too weak.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to verify that
a) we are not too close to the end of buffer to dereference
b) next entry we'll be checking won't be _before_ our
While we are at it, don't subtract unrelated pointers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The attached patch adds --snat-arp support, which makes it possible to
change the source mac address in both the mac header and the arp header
with one rule.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Removes dependency on buggy rta_buf, fixes a memory corruption bug due to
a unvalidated netlink attribute, and simplifies the code.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Account for the netlink message header size directly in nlmsg_new()
instead of relying on the caller calculate it correctly.
Replaces error handling of message construction functions when
constructing notifications with bug traps since a failure implies
a bug in calculating the size of the skb.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nfmark is being used in various subsystems and has become
the defacto mark field for all kinds of packets. Therefore
it makes sense to rename it to `mark' and remove the
dependency on CONFIG_NETFILTER.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/bridge/netfilter/ebtables.c: In function 'ebt_dev_check':
net/bridge/netfilter/ebtables.c:89: warning: initialization discards qualifiers from pointer target type
So make the char* a const char * and the warning is gone.
Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flush the forwarding table when carrier is lost. This helps for
availability because we don't want to forward to a downed device and
new packets may come in on other links.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following patch adds or/and/xor functionality for the mark target,
while staying backwards compatible.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kill listhelp.h and use the list.h functions instead.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change some bridge sysctl tunables to __read_mostly.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_NETFILTER_DEBUG is enabled, it shouldn't change the
actions of the filtering. The message about skb->dst being NULL
is commonly triggered by dhclient, so it is useless. Make sure all
messages end in newline.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanup and rearrangement for better style and clarity:
Split the function nf_bridge_maybe_copy_header into two pieces
Move copy portion out of line.
Use Ethernet header size macros.
Use header file to handle CONFIG_NETFILTER_BRIDGE differences
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This code is wrong on so many levels, please lose it so it isn't
replicated anywhere else.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We seem to send 3 extra bytes in a TCN, which will be whatever happens
to be on the stack. Thanks to Aji_Srinivas@emc.com for seeing.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge-netfilter code will overwrite memory if there is not
headroom in the skb to save the header. This first showed up when
using Xen with sky2 driver that doesn't allocate the extra space.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the bridge recomputes features, it does not maintain the
constraint that SG/GSO must be off if TX checksum is off.
This patch adds that constraint.
On a completely unrelated note, I've also added TSO6 and TSO_ECN
feature bits if GSO is enabled on the underlying device through
the new NETIF_F_GSO_SOFTWARE macro.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel panic on various SMP machines. The culprit is a null
ub->skb in ulog_send(). If ulog_timer() has already been scheduled on
one CPU and is spinning on the lock, and ipt_ulog_packet() flushes the
queue on another CPU by calling ulog_send() right before it exits,
there will be no skbuff when ulog_timer() acquires the lock and calls
ulog_send(). Cancelling the timer in ulog_send() doesn't help because
it has already been scheduled and is running on the first CPU.
Similar problem exists in ebt_ulog.c and nfnetlink_log.c.
Signed-off-by: Mark Huang <mlhuang@cs.princeton.edu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix code that passes back netlink status messages about
bridge changes. Submitted by Aji_Srinivas@emc.com
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add bridge netfilter deferred output hooks to feature-removal-schedule
and disable them by default. Until their removal they will be
activated by the physdev match when needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the wrapper function skb_is_gso which can be used instead
of directly testing skb_shinfo(skb)->gso_size. This makes things a little
nicer and allows us to change the primary key for indicating whether an skb
is GSO (if we ever want to do that).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix for inability of br_dump_ifinfo to handle non-zero start index:
loop index never increases when entered with non-zero start.
Spotted by Kirill Korotaev.
Signed-off-by: Andrey Savochkin <saw@swsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When GSO packets come from an untrusted source (e.g., a Xen guest domain),
we need to verify the header integrity before passing it to the hardware.
Since the first step in GSO is to verify the header, we can reuse that
code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this
bit set can only be fed directly to devices with the corresponding bit
NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb
is fed to the GSO engine which will allow the packet to be sent to the
hardware if it passes the header check.
This patch changes the sg flag to a full features flag. The same method
can be used to implement TSO ECN support. We simply have to mark packets
with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment
the packet, or segment the first MTU and pass the rest to the hardware for
further segmentation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a generic segmentation offload toggle that can be turned
on/off for each net device. For now it only supports in TCPv4.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP). So
let's merge them.
They were used to tell the protocol of a packet. This function has been
subsumed by the new gso_type field. This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb. As such it's easy to tell whether a given device can process a GSO
skb: you just have to and the gso_type field and the netdev's features
field.
I've made gso_type a conjunction. The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
to be emulated in software.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it is the bridge will only ever declare NETIF_F_IP_CSUM even if all
its constituent devices support NETIF_F_HW_CSUM. This patch fixes
this by supporting the first one out of NETIF_F_NO_CSUM,
NETIF_F_HW_CSUM, and NETIF_F_IP_CSUM that is supported by all
constituent devices.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places. For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places, for that purpose I've added NETIF_F_ALL_CSUM.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
I found a few more spots where pskb_trim_rcsum could be used but were not.
This patch changes them to use it.
Also, sk_filter can get paged skb data. Therefore we must use pskb_trim
instead of skb_trim.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add basic netlink support to the Ethernet bridge. Including:
* dump interfaces in bridges
* monitor link status changes
* change state of bridge port
For some demo programs see:
http://developer.osdl.org/shemminger/prototypes/brnl.tar.gz
These are to allow building a daemon that does alternative
implementations of Spanning Tree Protocol.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return address in use, if some other kernel code has the SAP.
Propogate out error codes from netfilter registration and unwind.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Small optimizations of bridge forwarding path.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several bugs in error handling in br_add_bridge:
- when dev_alloc_name fails, allocated net_device is not freed
- unregister_netdev is called when rtnl lock is held
- free_netdev is called before netdev_run_todo has a chance to be run after
unregistering net_device
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bridge will OOPS on removal if other application has the SAP open.
The bridge SAP might be shared with other usages, so need
to do reference counting on module removal rather than explicit
close/delete.
Since packet might arrive after or during removal, need to clear
the receive function handle, so LLC only hands it to user (if any).
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The prefix argument for nf_log_packet is a format specifier,
so don't pass the user defined string directly to it.
Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that netdevice sysfs registration is done as part of
register_netdevice; bridge code no longer has to be tricky when adding
it's kobjects to bridges.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>