Commit Graph

119 Commits

Author SHA1 Message Date
Pavel Emelyanov f54873982c [NETNS][DCCPV4]: Use proper net to route the reset packet.
The dccp_v4_route_skb used in dccp_v4_ctl_send_reset, currently
works with init_net's routing tables - fix it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:30:19 -07:00
Pavel Emelyanov b76c4b27fe [NETNS][DCCPV4]: Actually create ctl socket on each net and use it.
Move the call to inet_ctl_sock_create to init callback (and
inet_ctl_sock_destroy to exit one) and use proper ctl sock
in dccp_v4_ctl_send_reset.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:29:59 -07:00
Pavel Emelyanov 7b1cffa8c9 [NETNS][DCCPV4]: Move the dccp_v4_ctl_sk on the struct net.
And replace all its usage with init_net's socket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:29:37 -07:00
Pavel Emelyanov 72a2d61382 [NETNS][DCCPV4]: Add dummy per-net operations.
They will be responsible for ctl socket initialization, but
currently they are void.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:29:13 -07:00
Denis V. Lunev 5677242f43 [NETNS]: Inet control socket should not hold a namespace.
This is a generic requirement, so make inet_ctl_sock_create namespace
aware and create a inet_ctl_sock_destroy wrapper around
sk_release_kernel.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:28:30 -07:00
Denis V. Lunev eee4fe4ded [INET]: Let inet_ctl_sock_create return sock rather than socket.
All upper protocol layers are already use sock internally.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:27:58 -07:00
Denis V. Lunev 3d58b5fa8e [INET]: Rename inet_csk_ctl_sock_create to inet_ctl_sock_create.
This call is nothing common with INET connection sockets code. It
simply creates an unhashes kernel sockets for protocol messages.

Move the new call into af_inet.c after the rename.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:22:32 -07:00
Denis V. Lunev 4f049b4f33 [DCCP]: dccp_v(4|6)_ctl_socket is leaked.
This seems a purism as module can't be unloaded, but though if cleanup
method is present it should be correct and clean all staff created.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:21:33 -07:00
Denis V. Lunev 7630f02681 [DCCP]: Replace socket with sock for reset sending.
Replace dccp_v(4|6)_ctl_socket with sock to unify a code with TCP/ICMP.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:20:52 -07:00
Pavel Emelyanov bdcde3d71a [SOCK]: Drop inuse pcounter from struct proto (v2).
An uppercut - do not use the pcounter on struct proto.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:33 -07:00
Pavel Emelyanov 39d8cda76c [SOCK]: Add udp_hash member to struct proto.
Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to 
struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 
and -v6 protocols.

The result is not that exciting, but it removes some levels of
indirection in udpxxx_get_port and saves some space in code and text.

The first step is to union existing hashinfo and new udp_hash on the
struct proto and give a name to this union, since future initialization 
of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc 
warning about inability to initialize anonymous member this way.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:50:58 -07:00
Eric Dumazet ee6b967301 [IPV4]: Add 'rtable' field in struct sk_buff to alias 'dst' and avoid casts
(Anonymous) unions can help us to avoid ugly casts.

A common cast it the (struct rtable *)skb->dst one.

Defining an union like  :
union {
     struct dst_entry *dst;
     struct rtable *rtable;
};
permits to use skb->rtable in place.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:30:47 -08:00
Denis V. Lunev fd80eb942a [INET]: Remove struct dst_entry *dst from request_sock_ops.rtx_syn_ack.
It looks like dst parameter is used in this API due to historical
reasons.  Actually, it is really used in the direct call to
tcp_v4_send_synack only.  So, create a wrapper for tcp_v4_send_synack
and remove dst from rtx_syn_ack.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:43:03 -08:00
Arnaldo Carvalho de Melo ab1e0a13d7 [SOCK] proto: Add hashinfo member to struct proto
This way we can remove TCP and DCCP specific versions of

sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port
sk->sk_prot->hash:     inet_hash is directly used, only v6 need
                       a specific version to deal with mapped sockets
sk->sk_prot->unhash:   both v4 and v6 use inet_hash directly

struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so
that inet_csk_get_port can find the per family routine.

Now only the lookup routines receive as a parameter a struct inet_hashtable.

With this we further reuse code, reducing the difference among INET transport
protocols.

Eventually work has to be done on UDP and SCTP to make them share this
infrastructure and get as a bonus inet_diag interfaces so that iproute can be
used with these protocols.

net-2.6/net/ipv4/inet_hashtables.c:
  struct proto			     |   +8
  struct inet_connection_sock_af_ops |   +8
 2 structs changed
  __inet_hash_nolisten               |  +18
  __inet_hash                        | -210
  inet_put_port                      |   +8
  inet_bind_bucket_create            |   +1
  __inet_hash_connect                |   -8
 5 functions changed, 27 bytes added, 218 bytes removed, diff: -191

net-2.6/net/core/sock.c:
  proto_seq_show                     |   +3
 1 function changed, 3 bytes added, diff: +3

net-2.6/net/ipv4/inet_connection_sock.c:
  inet_csk_get_port                  |  +15
 1 function changed, 15 bytes added, diff: +15

net-2.6/net/ipv4/tcp.c:
  tcp_set_state                      |   -7
 1 function changed, 7 bytes removed, diff: -7

net-2.6/net/ipv4/tcp_ipv4.c:
  tcp_v4_get_port                    |  -31
  tcp_v4_hash                        |  -48
  tcp_v4_destroy_sock                |   -7
  tcp_v4_syn_recv_sock               |   -2
  tcp_unhash                         | -179
 5 functions changed, 267 bytes removed, diff: -267

net-2.6/net/ipv6/inet6_hashtables.c:
  __inet6_hash |   +8
 1 function changed, 8 bytes added, diff: +8

net-2.6/net/ipv4/inet_hashtables.c:
  inet_unhash                        | +190
  inet_hash                          | +242
 2 functions changed, 432 bytes added, diff: +432

vmlinux:
 16 functions changed, 485 bytes added, 492 bytes removed, diff: -7

/home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c:
  tcp_v6_get_port                    |  -31
  tcp_v6_hash                        |   -7
  tcp_v6_syn_recv_sock               |   -9
 3 functions changed, 47 bytes removed, diff: -47

/home/acme/git/net-2.6/net/dccp/proto.c:
  dccp_destroy_sock                  |   -7
  dccp_unhash                        | -179
  dccp_hash                          |  -49
  dccp_set_state                     |   -7
  dccp_done                          |   +1
 5 functions changed, 1 bytes added, 242 bytes removed, diff: -241

/home/acme/git/net-2.6/net/dccp/ipv4.c:
  dccp_v4_get_port                   |  -31
  dccp_v4_request_recv_sock          |   -2
 2 functions changed, 33 bytes removed, diff: -33

/home/acme/git/net-2.6/net/dccp/ipv6.c:
  dccp_v6_get_port                   |  -31
  dccp_v6_hash                       |   -7
  dccp_v6_request_recv_sock          |   +5
 3 functions changed, 5 bytes added, 38 bytes removed, diff: -33

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-03 04:28:52 -08:00
Pavel Emelyanov c67499c0e7 [NETNS]: Tcp-v4 sockets per-net lookup.
Add a net argument to inet_lookup and propagate it further
into lookup calls. Plus tune the __inet_check_established.

The dccp and inet_diag, which use that lookup functions
pass the init_net into them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-31 19:28:19 -08:00
Denis V. Lunev f1b050bf7a [NETNS]: Add namespace parameter to ip_route_output_flow.
Needed to propagate it down to the __ip_route_output_key.

Signed_off_by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:11:06 -08:00
Pavel Emelyanov 152da81deb [INET]: Uninline the __inet_hash function.
This one is used in quite many places in the networking code and
seems to big to be inline.

After the patch net/ipv4/build-in.o loses ~650 bytes:
add/remove: 2/0 grow/shrink: 0/5 up/down: 461/-1114 (-653)
function                                     old     new   delta
__inet_hash_nolisten                           -     282    +282
__inet_hash                                    -     179    +179
tcp_sacktag_write_queue                     2255    2254      -1
__inet_lookup_listener                       284     274     -10
tcp_v4_syn_recv_sock                         755     493    -262
tcp_v4_hash                                  389      35    -354
inet_hash_connect                           1086     599    -487

This version addresses the issue pointed by Eric, that
while being inline this function was optimized by gcc
in respect to the 'listen_possible' argument.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:59:26 -08:00
Gerrit Renker 8b81941248 [DCCP]: Allow to parse options on Request Sockets
The option parsing code currently only parses on full sk's. This causes a problem for
options sent during the initial handshake (in particular timestamps and feature-negotiation
options). Therefore, this patch extends the option parsing code with an additional argument
for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise
the normal path (parsing on the sk) is used.

Subsequent patches, which implement feature negotiation during connection setup, make use
of this facility.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:57:50 -08:00
David S. Miller c62cf5cb17 [DCCP]: Use DEFINE_PROTO_INUSE infrastructure.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:09:01 -08:00
Gerrit Renker 1238d0873b [DCCP]: One more exemption from full sequence number checks
This fixes the following problem: client connects to peer which has no DCCP
enabled or loaded; ICMP error messages ("Protocol Unavailable") can be seen
on the wire, but the application hangs. Reason: ICMP packets don't get through
to dccp_v4_err.

When reporting errors, a sequence number check is made for the DCCP packet
that had caused an ICMP error to arrive.
Such checks can not be made if the socket is in state LISTEN, RESPOND (which
in the implementation is the same as LISTEN), or REQUEST, since update_gsr()
has not been called in these states, hence the sequence window is 0..0.

This patch fixes the problem by adding the REQUEST state as another exemption
to the window check. The error reporting now works as expected on connecting.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-10-24 10:18:06 -02:00
Gerrit Renker fde20105f3 [DCCP]: Retrieve packet sequence number for error reporting
This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err:
* dccp_hdr_seq currently takes an skb
* however, the transport headers in the skb are shifted, due to the
  preceding IPv4/v6 header.
Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as
argument. Verified that the correct sequence number is now reported in the
error handler.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 10:12:09 -02:00
Jean Delvare 7131c6c736 [INET]: Use MODULE_ALIAS_NET_PF_PROTO_TYPE where possible.
Now that we have this new macro, use it where possible.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-22 02:59:54 -07:00
Gerrit Renker 4a5409a5a8 [DCCP]: Twice the wrong reset code in receiving connection-Requests
This fixes two bugs in processing of connection-Requests in
v{4,6}_conn_request:

 1. Due to using the variable `reset_code', the Reset code generated
    internally by dccp_parse_options() is overwritten with the
    initialised value ("Too Busy") of reset_code, which is not what is
    intended.

 2. When receiving a connection-Request on a multicast or broadcast
    address, no Reset should be generated, to avoid storms of such
    packets. Instead of jumping to the `drop' label, the
    v{4,6}_conn_request functions now return 0. Below is why in my
    understanding this is correct:

    When the conn_request function returns < 0, then the caller,
    dccp_rcv_state_process(), returns 1. In all instances where
    dccp_rcv_state_process is called (dccp_v4_do_rcv, dccp_v6_do_rcv,
    and dccp_child_process), a return value of != 0 from
    dccp_rcv_state_process() means that a Reset is generated.

    If on the other hand the conn_request function returns 0, the
    packet is discarded and no Reset is generated.

Note: There may be a related problem when sending the Response, due to
the following.

	if (dccp_v6_send_response(sk, req, NULL))
		goto drop_and_free;
	/* ... */
	drop_and_free:
		return -1;

In this case, if send_response fails due to transmission errors, the
next thing that is generated is a Reset with a code "Too Busy". I
haven't been able to conjure up such a condition, but it might be good
to change the behaviour here also (not done by this patch).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:54:38 -07:00
Gerrit Renker e356d37a09 [DCCP]: Factor out common code for generating Resets
This factors code common to dccp_v{4,6}_ctl_send_reset into a separate function,
and adds support for filling in the Data 1 ... Data 3 fields from RFC 4340, 5.6.

It is useful to have this separate, since the following Reset codes will always
be generated from the control socket rather than via dccp_send_reset:
 * Code 3, "No Connection", cf. 8.3.1;
 * Code 4, "Packet Error" (identification for Data 1 added);
 * Code 5, "Option Error" (identification for Data 1..3 added, will be used later);
 * Code 6, "Mandatory Error" (same as Option Error);
 * Code 7, "Connection Refused" (what on Earth is the difference to "No Connection"?);
 * Code 8, "Bad Service Code";
 * Code 9, "Too Busy";
 * Code 10, "Bad Init Cookie" (not used).

Code 0 is not recommended by the RFC, the following codes would be used in
dccp_send_reset() instead, since they all relate to an established DCCP connection:
 * Code 1, "Closed";
 * Code 2, "Aborted";
 * Code 11, "Aggression Penalty" (12.3).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-10-10 16:52:44 -07:00
Gerrit Renker 9bf55cda9b [DCCP]: Sequence number wrap-around when sending reset
This replaces normal addition with mod-48 addition so that sequence number
wraparound is respected.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-10-10 16:52:43 -07:00
Micah Gruber c726187225 [DCCP]: Remove unneeded pointer newdp from dccp_v4_request_recv_sock()
This trivial patch removes the unneeded pointer newdp, which is never used.

Signed-off-by: Micah Gruber <micah.gruber@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:59 -07:00
Arnaldo Carvalho de Melo 88c7664f13 [SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:23 -07:00
Arnaldo Carvalho de Melo eddc9ec53b [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:25:10 -07:00
YOSHIFUJI Hideaki c9eaf17341 [NET] DCCP: Fix whitespace errors.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10 23:19:27 -08:00
David S. Miller 8eb9086f21 [IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts.
Do this even for non-blocking sockets.  This avoids the silly -EAGAIN
that applications can see now, even for non-blocking sockets in some
cases (f.e. connect()).

With help from Venkat Tekkirala.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-08 12:38:45 -08:00
Arnaldo Carvalho de Melo 8109b02b53 [DCCP]: Whitespace cleanups
That accumulated over the last months hackaton, shame on me for not
using git-apply whitespace helping hand, will do that from now on.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-11 14:35:00 -08:00
Gerrit Renker 59348b19ef [DCCP]: Simplified conditions due to use of enum:8 states
This reaps the benefit of the earlier patch, which changed the type of
CCID 3 states to use enums, in that many conditions are now simplified
and the number of possible (unexpected) values is greatly reduced.

In a few instances, this also allowed to simplify pre-conditions; where
care has been taken to retain logical equivalence.

[DCCP]: Introduce a consistent BUG/WARN message scheme

This refines the existing set of DCCP messages so that
 * BUG(), BUG_ON(), WARN_ON() have meaningful DCCP-specific counterparts
 * DCCP_CRIT (for severe warnings) is not rate-limited
 * DCCP_WARN() is introduced as rate-limited wrapper

Using these allows a faster and cleaner transition to their original
counterparts once the code has matured into a full DCCP implementation.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:24:38 -08:00
Arnaldo Carvalho de Melo 58a5a7b955 [NET]: Conditionally use bh_lock_sock_nested in sk_receive_skb
Spotted by Ian McDonald, tentatively fixed by Gerrit Renker:

http://www.mail-archive.com/dccp%40vger.kernel.org/msg00599.html

Rewritten not to unroll sk_receive_skb, in the common case, i.e. no lock
debugging, its optimized away.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:51 -08:00
Arnaldo Carvalho de Melo e523a1550e [DCCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in dccp_v4_err()
Spotted by Eric Dumazet in tcp_v4_rcv().

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:50 -08:00
Al Viro 2bda285315 [NET]: Annotate csum_tcpudp_magic() callers in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:30 -08:00
YOSHIFUJI Hideaki cfb6eeb4c8 [TCP]: MD5 Signature Option (RFC2385) support.
Based on implementation by Rick Payne.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:39 -08:00
Gerrit Renker 09dbc3895e [DCCP]: Miscellaneous code tidy-ups
This patch does not change code; it performs some trivial clean/tidy-ups:

  * removal of a `debug_prefix' string in favour of the
    already existing dccp_role(sk)

  * add documentation of structures and constants

  * separated out the cases for invalid packets (step 1
    of the packet validation)

  * removing duplicate statements

  * combining declaration & initialisation

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:30 -08:00
Gerrit Renker b9df3cb8cf [TCP/DCCP]: Introduce net_xmit_eval
Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.

This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:27 -08:00
Gerrit Renker d7f7365f57 [DCCPv6]: Choose a genuine initial sequence number
This
	* resolves a FIXME - DCCPv6 connections started all with
	  an initial sequence number of 1;
	* provides a redirection `secure_dccpv6_sequence_number'
	  in case the init_sequence_v6 code should be updated later;
	* concentrates the update of S.GAR into dccp_connect_init();
	* removes a duplicate dccp_update_gss() in ipv4.c;
	* uses inet->dport instead of usin->sin_port, due to the
	  following assignment in dccp_v4_connect():
 		inet->dport = usin->sin_port;

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:22 -08:00
Gerrit Renker 865e9022d8 [DCCP]: Remove redundant statements in init_sequence (ISS)
This patch removes the following redundancies:

 1) The test skb->protocol == htons(ETH_P_IPV6) in dccp_v6_init_sequence
    is always true since
     * dccp_v6_conn_request() is the only calling function
     * dccp_v6_conn_request() redirects all skb's with ETH_P_IP to
       dccp_v4_conn_request()

 2) The first argument, `struct sock *sk', of dccp_v{4,6}_init_sequence()
    is never used.

(This is similar for tcp_v{4,6}_init_sequence, an analogous patch has been
 submitted to netdev and merged.)

By the way - are the `sport' / `dport' arguments in the right order?
I have made them consistent among calls but they seem to be in the
reverse order.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:21 -08:00
Gerrit Renker afb0a34dd3 [DCCP]: Introduce a consistent naming scheme for sysctls
In order to make their function clearer and obtain a consistent naming
scheme to identify sysctls, all existing DCCP sysctls have been prefixed
with `sysctl_dccp', following the same convention as used by TCP.

Feature-specific sysctls retain the `feat' in the middle, although the
`default' has been dropped, since it is obvious from use.

Also removed a duplicate `dccp_feat_default_sequence_window' in ipv4.c.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:19 -08:00
Gerrit Renker 6f4e5fff1e [DCCP]: Support for partial checksums (RFC 4340, sec. 9.2)
This patch does the following:
  a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2]
  b) provides necessary socket options and documentation as to how to use them
  c) basic support and infrastructure for the Minimum Checksum Coverage feature
     [RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user
     interface

In addition, it

 (1) fixes two bugs in the DCCPv4 checksum computation:
 	* pseudo-header used checksum_len instead of skb->len
	* incorrect checksum coverage calculation based on dccph_x
 (2) removes dccp_v4_verify_checksum() since it reduplicates code of the
     checksum computation; code calling this function is updated accordingly.
 (3) now uses skb_checksum(), which is safer than checksum_partial() if the
     sk_buff has is a non-linear buffer (has pages attached to it).
 (4) fixes an outstanding TODO item:
        * If P.CsCov is too large for the packet size, drop packet and return.

The code has been tested with applications, the latest version of tcpdump now
comes with support for partial DCCP checksums.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:09 -08:00
Gerrit Renker d83ca5accb [DCCP]: Update code comments for Step 2/3
Sorts out the comments for processing steps 2,3 in section 8.5 of RFC 4340.
All comments have been updated against this document, and the reference to step
2 has been made consistent throughout the files.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:04 -08:00
Gerrit Renker cf557926f6 [DCCP]: tidy up dccp_v{4,6}_conn_request
This is a code simplification to remove reduplicated code
by concentrating and abstracting shared code.

Detailed Changes:
2006-12-02 21:22:03 -08:00
Gerrit Renker 3d2fe62b8d [DCCPv4]: remove forward declarations in ipv4.c
This relates to Arnaldo's announcement in
http://www.mail-archive.com/dccp@vger.kernel.org/msg00604.html

Originally this had been part of the Oops fix and is a revised variant of
http://www.mail-archive.com/dccp@vger.kernel.org/msg00598.html

No code change, merely reshuffling, with the particular objective of
having all request_sock_ops close(r) together for more clarity.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:21:59 -08:00
Gerrit Renker 8a73cd09d9 [DCCP]: calling dccp_v{4,6}_reqsk_send_ack is a BUG
This patch removes two functions, the send_ack functions of request_sock,
which are not called/used by the DCCP code. It is correct that these
functions are not called, below is a justification why calling these
functions (on a passive socket in the LISTEN/RESPOND state) would mean
a DCCP protocol violation.

A) Background: using request_sock in TCP:
2006-12-02 21:21:58 -08:00
Gerrit Renker d23c7107bf [DCCP]: Simplify jump labels in dccp_v{4,6}_rcv
This is a code simplification and was singled out from the
DCCPv6 Oops patch on
http://www.mail-archive.com/dccp@vger.kernel.org/msg00600.html

It mainly makes the code consistent between ipv{4,6}.c for the functions
        dccp_v4_rcv
        dccp_v6_rcv
and removes the do_time_wait label to simplify code somewhat.

Commiter note: fixed up a compile problem, trivial.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:21:56 -08:00
Gerrit Renker 9b42078ed6 [DCCP]: Combine allocating & zeroing header space on skb
This is a code simplification:
it combines three often recurring operations into one inline function,

        * allocate `len' bytes header space in skb
        * fill these `len' bytes with zeroes
        * cast the start of this header space as dccp_hdr

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:21:55 -08:00
David S. Miller 494b4e7d81 [DCCP]: Fix typo _read_mostly --> __read_mostly.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:45 -08:00
Eric Dumazet 72a3effaf6 [NET]: Size listen hash tables using backlog hint
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
each LISTEN socket, regardless of various parameters (listen backlog for
example)

On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.

This patch makes the sizing of listen hash table a dynamic parameter,
depending of :
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application  (2nd parameter of listen())

For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().

We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM
usage.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:44 -08:00