Fix bridge netfilter code so that it uses CONFIG_IPV6 as needed:
net/built-in.o: In function `ebt_filter_ip6':
ebt_ip6.c:(.text+0x87c37): undefined reference to `ipv6_skip_exthdr'
net/built-in.o: In function `ebt_log_packet':
ebt_log.c:(.text+0x88dee): undefined reference to `ipv6_skip_exthdr'
make[1]: *** [.tmp_vmlinux1] Error 1
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Normally, the bridge just chooses the smallest mac address as the
bridge id and mac address of bridge device. But if the administrator
has explictly set the interface address then don't change it.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Any frame addressed to link-local addresses should be processed by local
receive path. The earlier code would process them only if STP was enabled.
Since there are other frames like LACP for bonding, we should always
process them.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the sctp_remaddr_proc_init failed, the proper rollback is
not the sctp_remaddr_proc_exit, but the sctp_assocs_proc_exit.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The H.245 helper is not registered/unregistered, but assigned to
connections manually from the Q.931 helper. This means on unload
existing expectations and connections using the helper are not
cleaned up, leading to the following oops on module unload:
CPU 0 Unable to handle kernel paging request at virtual address c00a6828, epc == 802224dc, ra == 801d4e7c
Oops[#1]:
Cpu 0
$ 0 : 00000000 00000000 00000004 c00a67f0
$ 4 : 802a5ad0 81657e00 00000000 00000000
$ 8 : 00000008 801461c8 00000000 80570050
$12 : 819b0280 819b04b0 00000006 00000000
$16 : 802a5a60 80000000 80b46000 80321010
$20 : 00000000 00000004 802a5ad0 00000001
$24 : 00000000 802257a8
$28 : 802a4000 802a59e8 00000004 801d4e7c
Hi : 0000000b
Lo : 00506320
epc : 802224dc ip_conntrack_help+0x38/0x74 Tainted: P
ra : 801d4e7c nf_iterate+0xbc/0x130
Status: 1000f403 KERNEL EXL IE
Cause : 00800008
BadVA : c00a6828
PrId : 00019374
Modules linked in: ip_nat_pptp ip_conntrack_pptp ath_pktlog wlan_acl wlan_wep wlan_tkip wlan_ccmp wlan_xauth ath_pci ath_dev ath_dfs ath_rate_atheros wlan ath_hal ip_nat_tftp ip_conntrack_tftp ip_nat_ftp ip_conntrack_ftp pppoe ppp_async ppp_deflate ppp_mppe pppox ppp_generic slhc
Process swapper (pid: 0, threadinfo=802a4000, task=802a6000)
Stack : 801e7d98 00000004 802a5a60 80000000 801d4e7c 801d4e7c 802a5ad0 00000004
00000000 00000000 801e7d98 00000000 00000004 802a5ad0 00000000 00000010
801e7d98 80b46000 802a5a60 80320000 80000000 801d4f8c 802a5b00 00000002
80063834 00000000 80b46000 802a5a60 801e7d98 80000000 802ba854 00000000
81a02180 80b7e260 81a021b0 819b0000 819b0000 80570056 00000000 00000001
...
Call Trace:
[<801e7d98>] ip_finish_output+0x0/0x23c
[<801d4e7c>] nf_iterate+0xbc/0x130
[<801d4e7c>] nf_iterate+0xbc/0x130
[<801e7d98>] ip_finish_output+0x0/0x23c
[<801e7d98>] ip_finish_output+0x0/0x23c
[<801d4f8c>] nf_hook_slow+0x9c/0x1a4
One way to fix this would be to split helper cleanup from the unregistration
function and invoke it for the H.245 helper, but since ctnetlink needs to be
able to find the helper for synchonization purposes, a better fix is to
register it normally and make sure its not assigned to connections during
helper lookup. The missing l3num initialization is enough for this, this
patch changes it to use AF_UNSPEC to make it more explicit though.
Reported-by: liannan <liannan@twsz.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Properly free h323_buffer when helper registration fails.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix three ct_extend/NAT extension related races:
- When cleaning up the extension area and removing it from the bysource hash,
the nat->ct pointer must not be set to NULL since it may still be used in
a RCU read side
- When replacing a NAT extension area in the bysource hash, the nat->ct
pointer must be assigned before performing the replacement
- When reallocating extension storage in ct_extend, the old memory must
not be freed immediately since it may still be used by a RCU read side
Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In nr_release(), one code path calls sock_orphan() which
will NULL out sk->sk_socket already.
In the other case, handling states other than NR_STATE_{0,1,2,3},
seems to not be possible other than due to bugs. Even for an
uninitialized nr->state value, that would be zero or NR_STATE_0.
It might be wise to stick a WARN_ON() here.
Signed-off-by: David S. Miller <davem@davemloft.net>
It doesn't grab the sk_callback_lock, it doesn't NULL out
the sk->sk_sleep waitqueue pointer, etc.
Signed-off-by: David S. Miller <davem@davemloft.net>
It doesn't grab the sk_callback_lock, it doesn't NULL out
the sk->sk_sleep waitqueue pointer, etc.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the x25 variant of changeset
9375cb8a12
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the rose variant of changeset
9375cb8a12
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the netrom variant of changeset
9375cb8a12
("ax25: Use sock_graft() and remove bogus sk_socket and sk_sleep init.")
Signed-off-by: David S. Miller <davem@davemloft.net>
The way that listening sockets work in ax25 is that the packet input
code path creates new socks via ax25_make_new() and attaches them
to the incoming SKB. This SKB gets queued up into the listening
socket's receive queue.
When accept()'d the sock gets hooked up to the real parent socket.
Alternatively, if the listening socket is closed and released, any
unborn socks stuff up in the receive queue get released.
So during this time period these sockets are unreachable in any
other way, so no wakeup events nor references to their ->sk_socket
and ->sk_sleep members can occur. And even if they do, all such
paths have to make NULL checks.
So do not deceptively initialize them in ax25_make_new() to the
values in the listening socket. Leave them at NULL.
Finally, use sock_graft() in ax25_accept().
Signed-off-by: David S. Miller <davem@davemloft.net>
Three major portions to this change:
1) Add IW_EV_COMPAT_LCP_LEN, IW_EV_COMPAT_POINT_OFF,
and IW_EV_COMPAT_POINT_LEN helper defines.
2) Delete iw_stream_check_add_*(), they are unused.
3) Add iw_request_info argument to iwe_stream_add_*(), and use it to
size the event and pointer lengths correctly depending upon whether
IW_REQUEST_FLAG_COMPAT is set or not.
4) The mechanical transformations to the drivers and wireless stack
bits to get the iw_request_info passed down into the routines
modified in #3. Also, explicit references to IW_EV_LCP_LEN are
replaced with iwe_stream_lcp_len(info).
With a lot of help and bug fixes from Masakazu Mokuno.
Signed-off-by: David S. Miller <davem@davemloft.net>
Next we can kill the hacks in fs/compat_ioctl.c and also
dispatch compat ioctls down into the driver and 80211 protocol
helper layers in order to handle iw_point objects embedded in
stream replies which need to be translated.
Signed-off-by: David S. Miller <davem@davemloft.net>
It happens that if a packet arrives in a VC between the call to open it on
the hardware and the call to change the backend to br2684, br2684_regvcc
processes the packet and oopses dereferencing skb->dev because it is
NULL before the call to br2684_push().
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Same as for inet_hashfn, prepare its ipv6 incarnation.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Although this hash takes addresses into account, the ehash chains
can also be too long when, for instance, communications via lo occur.
So, prepare the inet_hashfn to take struct net into account.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Listening-on-one-port sockets in many namespaces produce long
chains in the listening_hash-es, so prepare the inet_lhashfn to
take struct net into account.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Binding to some port in many namespaces may create too long
chains in bhash-es, so prepare the hashfn to take struct net
into account.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Every caller already has this one. The new argument is currently
unused, but this will be fixed shortly.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
They both calculate the hash chain, but currently do not have
a struct net pointer, so pass one there via additional argument,
all the more so their callers already have such.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the chain to store a UDP socket is calculated with
simple (x & (UDP_HTABLE_SIZE - 1)). But taking net into account
would make this calculation a bit more complex, so moving it into
a function would help.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Remove ICMP_MIN_LENGTH, as it is unused.
2) Remove unneeded tcp_v4_send_check() declaration.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I just noticed "cat /proc/net/raw" was buggy, missing '\n' separators.
I believe this was introduced by commit 8cd850efa4
([RAW]: Cleanup IPv4 raw_seq_show.)
This trivial patch restores correct behavior, and applies to current
Linus tree (should also be applied to stable tree as well.)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Selected device feature bits can be propagated to VLAN devices, so we
can make use of TX checksum offload and TSO on VLAN-tagged packets.
However, if the physical device does not do VLAN tag insertion or
generic checksum offload then the test for TX checksum offload in
dev_queue_xmit() will see a protocol of htons(ETH_P_8021Q) and yield
false.
This splits the checksum offload test into two functions:
- can_checksum_protocol() tests a given protocol against a feature bitmask
- dev_can_checksum() first tests the skb protocol against the device
features; if that fails and the protocol is htons(ETH_P_8021Q) then
it tests the encapsulated protocol against the effective device
features for VLANs
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now, any time we set a primary transport we set
the changeover_active flag. As a result, we invoke SFR-CACC
even when there has been no changeover events.
Only set changeover_active, when there is a true changeover
event, i.e. we had a primary path and we are changing to
another transport.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch remove the proc fs entry which has been created if fail to
set up proc fs entry for the SCTP protocol.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo's system is still seeing strange behavior, and he
reports that is goes away if the rest of the deferred
accept changes are reverted too.
Therefore this reverts e4c7884028
("[TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack") and
539fae89be ("[TCP]: TCP_DEFER_ACCEPT
updates - defer timeout conflicts with max_thresh").
Just like the other revert, these ideas can be revisited for
2.6.27
Signed-off-by: David S. Miller <davem@davemloft.net>
We've introduced extra need of compat layer for ip_tunnel_prl{}
for PRL (Potential Router List) management. Though compat_ioctl
is still missing in ipv4/ipv6, let's make the interface more
straight-forward and eliminate extra need for nasty compat layer
anyway since the interface is new for 2.6.26.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a htb_hysteresis parameter to htb_sch.ko and by sysfs magic make
it runtime adjustable via
/sys/module/sch_htb/parameters/htb_hysteresis mode 640.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The HTB hysteresis mode reduce the CPU load, but at the
cost of scheduling accuracy.
On ADSL links (512 kbit/s upstream), this inaccuracy introduce
significant jitter, enought to disturbe VoIP. For details see my
masters thesis (http://www.adsl-optimizer.dk/thesis/), chapter 7,
section 7.3.1, pp 69-70.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Acked-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change struct proto destroy function pointer to return void. Noticed
by Al Viro.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In IBSS mode prior to join/creation of new IBSS it is possible that
a frame from unknown station is received and an ibss_add_sta() is
called. This will cause a warning in rate_lowest_index() since the
list of supported rates of our station is not initialized yet.
The fix is to add ibss stations with a rate we received that frame
at; this single-element set will be extended later based on beacon
data. Also there is no need to store stations from a foreign IBSS.
Signed-off-by: Vladimir Koutny <vlado@ksp.sk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Also change the arguments of the phase1, 2 key mixing to take
a pointer to the encrytion key and the tkip_ctx in the same
order.
Do the dereference of the encryption key in the callers.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Take a __le16 directly rather than a host-endian value.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch fixes setting beacon interval
1. in register_hw it honors value requested by the driver
2. It uses default 100 instead of 1000 or 10000. Scanning for beacon
interval ~1sec and above is not sane
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch denies the use of framentation while ampdu is used.
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Implement missing EU regulatory domain for mac80211. Based on the
information in IEEE 802.11-2007 (specifically pages 1142, 1143 & 1148)
and ETSI 301 893 (V1.4.1).
With thanks to Johannes Berg.
Signed-off-by: Tony Vroon <tony@linx.net>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds '\n' in debug printk (wme.c HT DEBUG)
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The patch checks interface status, if it is in IBSS_JOINED mode
show cell id it is associated with.
Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
tcp: Revert 'process defer accept as established' changes.
ipv6: Fix duplicate initialization of rawv6_prot.destroy
bnx2x: Updating the Maintainer
net: Eliminate flush_scheduled_work() calls while RTNL is held.
drivers/net/r6040.c: correct bad use of round_jiffies()
fec_mpc52xx: MPC52xx_MESSAGES_DEFAULT: 2nd NETIF_MSG_IFDOWN => IFUP
ipg: fix receivemode IPG_RM_RECEIVEMULTICAST{,HASH} in ipg_nic_set_multicast_list()
netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()
netfilter: Make nflog quiet when no one listen in userspace.
ipv6: Fail with appropriate error code when setting not-applicable sockopt.
ipv6: Check IPV6_MULTICAST_LOOP option value.
ipv6: Check the hop limit setting in ancillary data.
ipv6 route: Fix route lifetime in netlink message.
ipv6 mcast: Check address family of gf_group in getsockopt(MS_FILTER).
dccp: Bug in initial acknowledgment number assignment
dccp ccid-3: X truncated due to type conversion
dccp ccid-3: TFRC reverse-lookup Bug-Fix
dccp ccid-2: Bug-Fix - Ack Vectors need to be ignored on request sockets
dccp: Fix sparse warnings
dccp ccid-3: Bug-Fix - Zero RTT is possible
This reverts two changesets, ec3c0982a2
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adb
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").
This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.
Ilpo Järvinen first spotted the locking problems. The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.
Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock. The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.
Next is a problem noticed by Vitaliy Gusev, he noted:
----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> goto death;
> }
>
>+ if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+ tcp_send_active_reset(sk, GFP_ATOMIC);
>+ goto death;
Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------
Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:
----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------
So revert this thing for now.
Signed-off-by: David S. Miller <davem@davemloft.net>
In changeset 22dd485022
("raw: Raw socket leak.") code was added so that we
flush pending frames on raw sockets to avoid leaks.
The ipv4 part was fine, but the ipv6 part was not
done correctly. Unlike the ipv4 side, the ipv6 code
already has a .destroy method for rawv6_prot.
So now there were two assignments to this member, and
what the compiler does is use the last one, effectively
making the ipv6 parts of that changeset a NOP.
Fix this by removing the:
.destroy = inet6_destroy_sock,
line, and adding an inet6_destroy_sock() call to the
end of raw6_destroy().
Noticed by Al Viro.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When creation of a new conntrack entry in ctnetlink fails after having
set up the NAT mappings, the conntrack has an extension area allocated
that is not getting properly destroyed when freeing the conntrack again.
This means the NAT extension is still in the bysource hash, causing a
crash when walking over the hash chain the next time:
BUG: unable to handle kernel paging request at 00120fbd
IP: [<c03d394b>] nf_nat_setup_info+0x221/0x58a
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
Pid: 2795, comm: conntrackd Not tainted (2.6.26-rc5 #1)
EIP: 0060:[<c03d394b>] EFLAGS: 00010206 CPU: 1
EIP is at nf_nat_setup_info+0x221/0x58a
EAX: 00120fbd EBX: 00120fbd ECX: 00000001 EDX: 00000000
ESI: 0000019e EDI: e853bbb4 EBP: e853bbc8 ESP: e853bb78
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process conntrackd (pid: 2795, ti=e853a000 task=f7de10f0 task.ti=e853a000)
Stack: 00000000 e853bc2c e85672ec 00000008 c0561084 63c1db4a 00000000 00000000
00000000 0002e109 61d2b1c3 00000000 00000000 00000000 01114e22 61d2b1c3
00000000 00000000 f7444674 e853bc04 00000008 c038e728 0000000a f7444674
Call Trace:
[<c038e728>] nla_parse+0x5c/0xb0
[<c0397c1b>] ctnetlink_change_status+0x190/0x1c6
[<c0397eec>] ctnetlink_new_conntrack+0x189/0x61f
[<c0119aee>] update_curr+0x3d/0x52
[<c03902d1>] nfnetlink_rcv_msg+0xc1/0xd8
[<c0390228>] nfnetlink_rcv_msg+0x18/0xd8
[<c0390210>] nfnetlink_rcv_msg+0x0/0xd8
[<c038d2ce>] netlink_rcv_skb+0x2d/0x71
[<c0390205>] nfnetlink_rcv+0x19/0x24
[<c038d0f5>] netlink_unicast+0x1b3/0x216
...
Move invocation of the extension destructors to nf_conntrack_free()
to fix this problem.
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10875
Reported-and-Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The message "nf_log_packet: can't log since no backend logging module loaded
in! Please either load one, or disable logging explicitly" was displayed for
each logged packet when no userspace application is listening to nflog events.
The message seems to warn for a problem with a kernel module missing but as
said before this is not the case. I thus propose to suppress the message (I
don't see any reason to flood the log because a user application has crashed.)
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6_MULTICAST_HOPS, for example, is not valid for stream sockets.
Since they are virtually unavailable for stream sockets,
we should return ENOPROTOOPT instead of EINVAL.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Only 0 and 1 are valid for IPV6_MULTICAST_LOOP socket option,
and we should return an error of EINVAL otherwise, per RFC3493.
Based on patch from Shan Wei <shanwei@cn.fujitsu.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
When specifing the outgoing hop limit as ancillary data for sendmsg(),
the kernel doesn't check the integer hop limit value as specified in
[RFC-3542] section 6.3.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
1) We may have route lifetime larger than INT_MAX.
In that case we had wired value in lifetime.
Use INT_MAX if lifetime does not fit in s32.
2) Lifetime is valid iif RTF_EXPIRES is set.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
As we do for other socket/timewait-socket specific parameters,
let the callers pass appropriate arguments to
tcp_v{4,6}_do_calc_md5_hash().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
We can share most part of the hash calculation code because
the only difference between IPv4 and IPv6 is their pseudo headers.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
This pacth makes IPv6 address labels per network namespace.
It keeps the global label tables, ip6addrlbl_table, but
adds a 'net' member to each ip6addrlbl_entry.
This new member is taken into account when matching labels.
Changelog
=========
* v1: Initial version
* v2:
* Minize the penalty when network namespaces are not configured:
* the 'net' member is added only if CONFIG_NET_NS is
defined. This saves space when network namespaces are not
configured.
* 'net' value is retrieved with the inlined function
ip6addrlbl_net() that always return &init_net when
CONFIG_NET_NS is not defined.
* 'net' member in ip6addrlbl_entry renamed to the less generic
'lbl_net' name (helps code search).
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
This inline function, for readability, returns if the route
is a "prefix" route regardless if it was installed by RA or by
hand.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
MRT6_VERSION should be used instead of MRT_VERSION in ip6mr.c.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
This patch removes MLDV2_QQIC macro from mcast.c
as it is unused.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
net: Fix routing tables with id > 255 for legacy software
sky2: Hold RTNL while calling dev_close()
s2io iomem annotations
atl1: fix suspend regression
qeth: start dev queue after tx drop error
qeth: Prepare-function to call s390dbf was wrong
qeth: reduce number of kernel messages
qeth: Use ccw_device_get_id().
qeth: layer 3 Oops in ip event handler
virtio: use callback on empty in virtio_net
virtio: virtio_net free transmit skbs in a timer
virtio: Fix typo in virtio_net_hdr comments
virtio_net: Fix skb->csum_start computation
ehea: set mac address fix
sfc: Recover from RX queue flush failure
add missing lance_* exports
ixgbe: fix typo
forcedeth: msi interrupts
ipsec: pfkey should ignore events when no listeners
pppoe: Unshare skb before anything else
...
Step 8.5 in RFC 4340 says for the newly cloned socket
Initialize S.GAR := S.ISS,
but what in fact the code (minisocks.c) does is
Initialize S.GAR := S.ISR,
which is wrong (typo?) -- fixed by the patch.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This fixes a bug in computing the inter-packet-interval t_ipi = s/X:
scaled_div32(a, b) uses u32 for b, but in "scaled_div32(s, X)" the type of the
sending rate `X' is u64. Since X is scaled by 2^6, this truncates rates greater
than 2^26 Bps (~537 Mbps).
Using full 64-bit division now.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This fixes a bug in the reverse lookup of p: given a value f(p), instead of p,
the function returned the smallest tabulated value f(p).
The smallest tabulated value of
10^6 * f(p) = sqrt(2*p/3) + 12 * sqrt(3*p/8) * (32 * p^3 + p)
for p=0.0001 is 8172.
Since this value is scaled by 10^6, the outcome of this bug is that a loss
of 8172/10^6 = 0.8172% was reported whenever the input was below the table
resolution of 0.01%.
This means that the value was over 80 times too high, resulting in large spikes
of the initial loss interval, thus unnecessarily reducing the throughput.
Also corrected the printk format (%u for u32).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This fixes an oversight from an earlier patch, ensuring that Ack Vectors
are not processed on request sockets.
The issue is that Ack Vectors must not be parsed on request sockets, since
the Ack Vector feature depends on the selection of the (TX) CCID. During the
initial handshake the CCIDs are undefined, and so RFC 4340, 10.3 applies:
"Using CCID-specific options and feature options during a negotiation
for the corresponding CCID feature is NOT RECOMMENDED [...]"
And it is not even possible: when the server receives the Request from the
client, the CCID and Ack vector features are undefined; when the Ack finalising
the 3-way hanshake arrives, the request socket has not been cloned yet into a
full socket. (This order is necessary, since otherwise the newly created socket
would have to be destroyed whenever an option error occurred - a malicious
hacker could simply send garbage options and exploit this.)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
This patch fixes the following sparse warnings:
* nested min(max()) expression:
net/dccp/ccids/ccid3.c:91:21: warning: symbol '__x' shadows an earlier one
net/dccp/ccids/ccid3.c:91:21: warning: symbol '__y' shadows an earlier one
* Declaration of function prototypes in .c instead of .h file, resulting in
"should it be static?" warnings.
* Declared "struct dccpw" static (local to dccp_probe).
* Disabled dccp_delayed_ack() - not fully removed due to RFC 4340, 11.3
("Receivers SHOULD implement delayed acknowledgement timers ...").
* Used a different local variable name to avoid
net/dccp/ackvec.c:293:13: warning: symbol 'state' shadows an earlier one
net/dccp/ackvec.c:238:33: originally declared here
* Removed unused functions `dccp_ackvector_print' and `dccp_ackvec_print'.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
In commit $(825de27d9e) (from 27th May, commit
message `dccp ccid-3: Fix "t_ipi explosion" bug'), the CCID-3 window counter
computation was fixed to cope with RTTs < 4 microseconds.
Such RTTs can be found e.g. when running CCID-3 over loopback. The fix removed
a check against RTT < 4, but introduced a divide-by-zero bug.
All steady-state RTTs in DCCP are filtered using dccp_sample_rtt(), which
ensures non-zero samples. However, a zero RTT is possible on initialisation,
when there is no RTT sample from the Request/Response exchange.
The fix is to use the fallback-RTT from RFC 4340, 3.4.
This is also better than just fixing update_win_count() since it allows other
parts of the code to always assume that the RTT is non-zero during the time
that the CCID is used.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Most legacy software do not like tables > 255 as rtm_table is u8
so tb_id is sent &0xff and it is possible to mismatch for example
table 510 with table 254 (main).
This patch introduces RT_TABLE_COMPAT=252 so the code uses it if
tb_id > 255. It makes such old applications happy, new
ones are still able to use RTA_TABLE to get a proper table id.
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Makes people happy who try to keep a list of addresses up to date by
listening to notifications.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>