linux_old1

Commit Graph

Author	SHA1	Message	Date
Gerrit Renker	6d57b43bf8	[DCCP]: Remove redundant dependency on IP_DCCP This cleans up the consequences of an earlier patch which introduced the `if IP_DCCP' clause into net/dccp/Kconfig. The CCID Kconfig menu is sourced within this clause; as a consequence, all tests of type `depends on IP_DCCP' are now redundant. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:46 -08:00
Gerrit Renker	e333b3edc4	[DCCP]: Promote CCID2 as default CCID This patch addresses the following problems: 1. DCCP relies for its proper functioning on having at least one CCID module enabled (as in TCP plugable congestion control). Currently it is possible to disable both CCIDs and thus leave the DCCP module in a compiled, but entirely non-functional state: no sockets can be created when no CCID is available. Furthermore, the protocol is (again like TCP) not intended to be used without CCIDs. Last, a non-empty CCID list is needed for doing CCID feature negotiation. 2. Internally the default CCID that is advertised by the Linux host is set to CCID2 (DCCPF_INITIAL_CCID in include/linux/dccp.h). Disabling CCID2 in the Kconfig menu without changing the defaults leads to a failure `module not found' when trying to load the dccp module (which internally tries to load the default CCID). 3. The specification (RFC 4340, sec. 10) treats CCID2 somewhat like a `minimum common denominator'; the specification says that: * "New connections start with CCID 2 for both endpoints" * "A DCCP implementation intended for general use, such as an implementation in a general-purpose operating system kernel, SHOULD implement at least CCID 2. The intent is to make CCID 2 broadly available for interoperability [...]" Providing CCID2 as minimum-required CCID (like Reno/Cubic in TCP) thus seems reasonable. Hence this patch automatically selects CCID2 when DCCP is enabled. Documentation also added. Discussions with Ian McDonald on this subject are gratefully acknowledged. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:46 -08:00
Gerrit Renker	8e8c71f1ab	[DCCP]: Honour and make use of shutdown option set by user This extends the DCCP socket API by honouring any shutdown(2) option set by the user. The behaviour is, as much as possible, made consistent with the API for TCP's shutdown. This patch exploits the information provided by the user via the socket API to reduce processing costs: * if the read end is closed (SHUT_RD), it is not necessary to deliver to input CCID; * if the write end is closed (SHUT_WR), the same idea applies, but with a difference - as long as the TX queue has not been drained, we need to receive feedback to keep congestion-control rates up to date. Hence SHUT_WR is honoured only after the last packet (under congestion control) has been sent; * although SHUT_RDWR seems nonsensical, it is nevertheless supported in the same manner as for TCP (and agrees with test for SHUTDOWN_MASK in dccp_poll() in net/dccp/proto.c). Furthermore, most of the code already honours the sk_shutdown flags (dccp_recvmsg() for instance sets the read length to 0 if SHUT_RD had been called); CCID handling is now added to this by the present patch. There will also no longer be any delivery when the socket is in the final stages, i.e. when one of dccp_close(), dccp_fin(), or dccp_done() has been called - which is fine since at that stage the connection is its final stages. Motivation and background are on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/shutdown A FIXME has been added to notify the other end if SHUT_RD has been set (RFC 4340, 11.7). Note: There is a comment in inet_shutdown() in net/ipv4/af_inet.c which asks to "make sure the socket is a TCP socket". This should probably be extended to mean `TCP or DCCP socket' (the code is also used by UDP and raw sockets). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:44 -08:00
Gerrit Renker	c3ada46a00	[CCID3]: Inline for moving average The moving average computation occurs so frequently in the CCID 3 code that it merits an inline function of its own. This is uses a suggestion by Arnaldo as per http://www.mail-archive.com/dccp@vger.kernel.org/msg01662.html Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:43 -08:00
Gerrit Renker	a5358fdc9c	[CCID3]: Accurately determine idle & application-limited periods This fixes/updates the handling of idle and application-limited periods in CCID3, which currently is broken: there is no detection as to how long a sender has been idle - there is only one flag which is toggled in between function calls. Being obsolete now, the `idle' flag is removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:42 -08:00
Gerrit Renker	eb279b79c4	[CCID3]: Ignore trivial amounts of elapsed time This patch fixes a previously undiscovered bug; the problem is in computing the elapsed time as the time between `receiving' the packet (i.e. skb enters CCID module) and sending feedback: - there is no layer-processing, queueing, or delay involved, - hence the elapsed time is in the order of 1 function call - this is in the dimension of maximally 50..100usec - which renders the use of elapsed time almost entirely useless. The fix is simply to ignore such trivial amounts of elapsed time. As a further advantage, the now useless elapsed_time field can be removed from the socket, which reduces the socket structure by another four bytes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:42 -08:00
Gerrit Renker	6c08b2cf48	[CCID3]: Revert use of MSS instead of s This updates the CCID3 code with regard to two instances of using `MSS' in place of `s': 1. The RFC3390-based initial rate: both rfc3448bis as well as the Faster Restart draft now consistently use `s' instead of MSS. 2. Now agrees with section 4.2 of rfc3448bis: "If the sender is ready to send data when it does not yet have a round trip sample, the value of X is set to s bytes per second, for segment size s [...]" Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:41 -08:00
Pavel Emelyanov	b24b8a247f	[NET]: Convert init_timer into setup_timer Many-many code in the kernel initialized the timer->function and timer->data together with calling init_timer(timer). There is already a helper for this. Use it for networking code. The patch is HUGE, but makes the code 130 lines shorter (98 insertions(+), 228 deletions(-)). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:35 -08:00
Joe Perches	5e8e034cc5	[DCCP]: Spelling fixes Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-20 13:59:39 -08:00
Joe Perches	4756daa3b6	[DCCP]: Add missing "space" Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-19 23:46:02 -08:00
Eric Dumazet	230140cffa	[INET]: Remove per bucket rwlock in tcp/dccp ehash table. As done two years ago on IP route cache table (commit `22c047ccbc`) , we can avoid using one lock per hash bucket for the huge TCP/DCCP hash tables. On a typical x86_64 platform, this saves about 2MB or 4MB of ram, for litle performance differences. (we hit a different cache line for the rwlock, but then the bucket cache line have a better sharing factor among cpus, since we dirty it less often). For netstat or ss commands that want a full scan of hash table, we perform fewer memory accesses. Using a 'small' table of hashed rwlocks should be more than enough to provide correct SMP concurrency between different buckets, without using too much memory. Sizing of this table depends on num_possible_cpus() and various CONFIG settings. This patch provides some locking abstraction that may ease a future work using a different model for TCP/DCCP table. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:15:11 -08:00
David S. Miller	c62cf5cb17	[DCCP]: Use DEFINE_PROTO_INUSE infrastructure. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:09:01 -08:00
Gerrit Renker	24c667db59	[CCID2/3]: Initialisation assignments of 0 are redundant Assigning initial values of `0' is redundant when loading a new CCID structure, since in net/dccp/ccid.c the entire CCID structure is zeroed out prior to initialisation in ccid_new(): struct ccid { struct ccid_operations ccid_ops; char ccid_priv[0]; }; // ... if (rx) { memset(ccid + 1, 0, ccid_ops->ccid_hc_rx_obj_size); if (ccid->ccid_ops->ccid_hc_rx_init != NULL && ccid->ccid_ops->ccid_hc_rx_init(ccid, sk) != 0) goto out_free_ccid; } else { memset(ccid + 1, 0, ccid_ops->ccid_hc_tx_obj_size); / analogous to the rx case */ } This patch therefore removes the redundant assignments. Thanks to Arnaldo for the inspiration. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:53:01 -02:00
Gerrit Renker	76fd1e87d9	[DCCP]: Unaligned pointer access This fixes `unaligned (read) access' errors of the type Kernel unaligned access at TPC[100f970c] dccp_parse_options+0x4f4/0x7e0 [dccp] Kernel unaligned access at TPC[1011f2e4] ccid3_hc_tx_parse_options+0x1ac/0x380 [dccp_ccid3] Kernel unaligned access at TPC[100f9898] dccp_parse_options+0x680/0x880 [dccp] by using the get_unaligned macro for parsing options. Commiter note: Preserved the sparse __be{16,32} annotations. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:46:58 -02:00
Gerrit Renker	d8ef2c29a0	[DCCP]: Convert Reset code into socket error number This adds support for converting the 11 currently defined Reset codes into system error numbers, which are stored in sk_err for further interpretation. This makes the externally visible API behaviour similar to TCP, since a client connecting to a non-existing port will experience ECONNREFUSED. * Code 0, Unspecified, is interpreted as non-error (0); * Code 1, Closed (normal termination), also maps into 0; * Code 2, Aborted, maps into "Connection reset by peer" (ECONNRESET); * Code 3, No Connection and Code 7, Connection Refused, map into "Connection refused" (ECONNREFUSED); * Code 4, Packet Error, maps into "No message of desired type" (ENOMSG); * Code 5, Option Error, maps into "Illegal byte sequence" (EILSEQ); * Code 6, Mandatory Error, maps into "Operation not supported on transport endpoint" (EOPNOTSUPP); * Code 8, Bad Service Code, maps into "Invalid request code" (EBADRQC); * Code 9, Too Busy, maps into "Too many users" (EUSERS); * Code 10, Bad Init Cookie, maps into "Invalid request descriptor" (EBADR); * Code 11, Aggression Penalty, maps into "Quota exceeded" (EDQUOT) which makes sense in terms of using more than the `fair share' of bandwidth. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:27:48 -02:00
Gerrit Renker	1238d0873b	[DCCP]: One more exemption from full sequence number checks This fixes the following problem: client connects to peer which has no DCCP enabled or loaded; ICMP error messages ("Protocol Unavailable") can be seen on the wire, but the application hangs. Reason: ICMP packets don't get through to dccp_v4_err. When reporting errors, a sequence number check is made for the DCCP packet that had caused an ICMP error to arrive. Such checks can not be made if the socket is in state LISTEN, RESPOND (which in the implementation is the same as LISTEN), or REQUEST, since update_gsr() has not been called in these states, hence the sequence window is 0..0. This patch fixes the problem by adding the REQUEST state as another exemption to the window check. The error reporting now works as expected on connecting. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-24 10:18:06 -02:00
Gerrit Renker	fde20105f3	[DCCP]: Retrieve packet sequence number for error reporting This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err: * dccp_hdr_seq currently takes an skb * however, the transport headers in the skb are shifted, due to the preceding IPv4/v6 header. Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as argument. Verified that the correct sequence number is now reported in the error handler. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:12:09 -02:00
Arnaldo Carvalho de Melo	6273172e17	[DCCP]: Implement SIOCINQ/FIONREAD Just like UDP. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Leandro Melo de Sales <leandroal@gmail.com> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:50 -07:00
Jean Delvare	7131c6c736	[INET]: Use MODULE_ALIAS_NET_PF_PROTO_TYPE where possible. Now that we have this new macro, use it where possible. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:54 -07:00
Jean Delvare	305e1e9691	[INET]: Let inet_diag and friends autoload By adding module aliases to inet_diag, tcp_diag and dccp_diag, we let them load automatically as needed. This makes tools like "ss" run faster. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:54 -07:00
Ingo Molnar	bd5435e76a	[DCCP]: fix link error with !CONFIG_SYSCTL Do not define the sysctl_dccp_sync_ratelimit sysctl variable in the CONFIG_SYSCTL dependent sysctl.c module - move it to input.c instead. This fixes the following build bug: net/built-in.o: In function `dccp_check_seqno': input.c:(.text+0xbd859): undefined reference to `sysctl_dccp_sync_ratelimit' distcc[29953] ERROR: compile (null) on localhost failed make: *** [vmlinux] Error 1 Found via 'make randconfig' build testing. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:33:06 -07:00
Herbert Xu	e5bbef20e0	[IPV6]: Replace sk_buff ** with sk_buff * in input handlers With all the users of the double pointers removed from the IPv6 input path, this patch converts all occurances of sk_buff ** to sk_buff * in IPv6 input handlers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:50:28 -07:00
Gerrit Renker	4a5409a5a8	[DCCP]: Twice the wrong reset code in receiving connection-Requests This fixes two bugs in processing of connection-Requests in v{4,6}_conn_request: 1. Due to using the variable `reset_code', the Reset code generated internally by dccp_parse_options() is overwritten with the initialised value ("Too Busy") of reset_code, which is not what is intended. 2. When receiving a connection-Request on a multicast or broadcast address, no Reset should be generated, to avoid storms of such packets. Instead of jumping to the `drop' label, the v{4,6}_conn_request functions now return 0. Below is why in my understanding this is correct: When the conn_request function returns < 0, then the caller, dccp_rcv_state_process(), returns 1. In all instances where dccp_rcv_state_process is called (dccp_v4_do_rcv, dccp_v6_do_rcv, and dccp_child_process), a return value of != 0 from dccp_rcv_state_process() means that a Reset is generated. If on the other hand the conn_request function returns 0, the packet is discarded and no Reset is generated. Note: There may be a related problem when sending the Response, due to the following. if (dccp_v6_send_response(sk, req, NULL)) goto drop_and_free; /* ... */ drop_and_free: return -1; In this case, if send_response fails due to transmission errors, the next thing that is generated is a Reset with a code "Too Busy". I haven't been able to conjure up such a condition, but it might be good to change the behaviour here also (not done by this patch). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:38 -07:00
Gerrit Renker	dcad856fe8	[DCCP]: Wrong format in printk The elapsed time uses u32, but printk was using %d, not %u. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-10 16:54:36 -07:00
Gerrit Renker	451bc0473f	[DCCP]: Tidy-up -- minisock initialisation This * removes a declaration of a non-existent function __dccp_minisock_init; * shifts the initialisation function dccp_minisock_init() from options.c to minisocks.c, where it is more naturally expected to be. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:36 -07:00
Gerrit Renker	5e28599a6e	[CCID2]: Sequence number wraparound issues This replaces several uses of standard arithmetic with the DCCP sequence number arithmetic functions. The problem here is that the sequence number wrap-around was not taken into consideration. * Condition "seqp->ccid2s_seq <= prev->ccid2s_seq" has been replaced by dccp_delta_seqno(seqp->ccid2s_seq, prev->ccid2s_seq) >= 0 since if seqp is `before' prev, then the delta_seqno() is positive. * The test whether sequence numbers `a' and `b' are consecutive has the form dccp_delta_seqno(a, b) == 1 * Increment of ccid2hctx_rpseq could be done using dccp_inc_seqno(), but since here the incremented ccid2hctx_rpseq == seqno, used assignment instead. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:35 -07:00
Gerrit Renker	6c58324808	[CCID2]: Remove redundant case block skb's passed to ccid2_hc_tx_send_packet() are headerless, the packet type is decided later, in dccp_write_xmit(). Therefore the first test of the switch/case block is always true, the others are never reached. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:35 -07:00
Gerrit Renker	ee196c2186	[CCID2]: Remove redundant BUG_ON This removes a test for `val < 1' which would only have been triggered when val < 0, due to a preceding test for 0. Fixed by using an unsigned type for cwnd (as in TCP) instead. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:34 -07:00
Gerrit Renker	7d9e8931f9	[CCID2]: Remove ugly BUG_ON This removes an ugly BUG_ON which has been pointed out by Arnaldo. Instead of freezing up the machine, a `critical' message is now issued to the system log. There is potential of doing this more gracefully (eg. there are a few internal variables which could be updated despite the lack of memory), but that requires more complicated changes to the algorithm; thus a `FIXME' has been added. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:34 -07:00
Gerrit Renker	cd1f7d347c	[CCID2]: Simplify interface This patch simplifies the interface of ccid2_hc_tx_alloc_seq(): * ccid2_hc_tx_alloc_seq() is always called with an argument of CCID2_SEQBUF_LEN; * other code - ccid2_hc_tx_check_sanity() - even depends on the assumption that ccid2_hc_tx_alloc_seq() has been called with this particular size; * passing the `gfp_t' argument to ccid2_hc_tx_alloc_seq() is redundant with gfp_any(). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:33 -07:00
Gerrit Renker	042d18f9f3	[DCCP]: Make all `debug' parameters bool This just sets the parameter to bool, since debugging messages are either on or off. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:32 -07:00
Gerrit Renker	7c559a9e44	[DCCP]: Add socket option to query the current MPS This enables applications to query the current value of the Maximum Packet Size via a socket option, suggested as a SHOULD in (RFC 4340, p. 102). This socket option is useful to avoid the annoying bail-out via `-EMSGSIZE'. In particular, as fragmentation is not currently supported (and its use is partly discouraged in RFC 4340). With this option, it is possible to size buffers accordingly, e.g. int buflen = dccp_get_cur_mps(sockfd); /* or */ if (msgsize > dccp_get_cur_mps(sockfd)) die("message is too large for this path"); Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:31 -07:00
Gerrit Renker	bc8498721d	[DCCP]: Wait for CCID This performs a minor optimisation: when ccid_hc_tx_send_packet returns a value greater zero, then the same call previously was done again at the begin of the while loop in dccp_wait_for_ccid. This patch exploits the available information and schedule-timeouts directly instead. Documentation also added. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:31 -07:00
Arnaldo Carvalho de Melo	bb293e6a24	[CCID3]: Remove ifdef surrounding BUG_ON As suggested by DaveM. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:45 -07:00
Gerrit Renker	cecd8d0ec4	[DCCP]: Reduce the number of writable states Since DCCP requires to close both ends of a connection simultaneously, permission to write in state DCCP_CLOSING is removed in dccp_sendmsg(): * if the sending end closed, it would encounter a write error anyhow; * if the other end has closed the connection, it accepts no more data. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:45 -07:00
Gerrit Renker	e356d37a09	[DCCP]: Factor out common code for generating Resets This factors code common to dccp_v{4,6}_ctl_send_reset into a separate function, and adds support for filling in the Data 1 ... Data 3 fields from RFC 4340, 5.6. It is useful to have this separate, since the following Reset codes will always be generated from the control socket rather than via dccp_send_reset: * Code 3, "No Connection", cf. 8.3.1; * Code 4, "Packet Error" (identification for Data 1 added); * Code 5, "Option Error" (identification for Data 1..3 added, will be used later); * Code 6, "Mandatory Error" (same as Option Error); * Code 7, "Connection Refused" (what on Earth is the difference to "No Connection"?); * Code 8, "Bad Service Code"; * Code 9, "Too Busy"; * Code 10, "Bad Init Cookie" (not used). Code 0 is not recommended by the RFC, the following codes would be used in dccp_send_reset() instead, since they all relate to an established DCCP connection: * Code 1, "Closed"; * Code 2, "Aborted"; * Code 11, "Aggression Penalty" (12.3). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:44 -07:00
Gerrit Renker	9bf55cda9b	[DCCP]: Sequence number wrap-around when sending reset This replaces normal addition with mod-48 addition so that sequence number wraparound is respected. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:43 -07:00
Gerrit Renker	a94f0f9705	[DCCP]: Rate-limit DCCP-Syncs This implements a SHOULD from RFC 4340, 7.5.4: "To protect against denial-of-service attacks, DCCP implementations SHOULD impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets, such as not more than eight DCCP-Syncs per second." The rate-limit is maintained on a per-socket basis. This is a more stringent policy than enforcing the rate-limit on a per-source-address basis and protects against attacks with forged source addresses. Moreover, the mechanism is deliberately kept simple. In contrast to xrlim_allow(), bursts of Sync packets in reply to sequence-invalid packets are not supported. This foils such attacks where the receipt of a Sync triggers further sequence-invalid packets. (I have tested this mechanism against xrlim_allow algorithm for Syncs, permitting bursts just increases the problems.) In order to keep flexibility, the timeout parameter can be set via sysctl; and the whole mechanism can even be disabled (which is however not recommended). The algorithm in this patch has been improved with regard to wrapping issues thanks to a suggestion by Arnaldo. Commiter note: Rate limited the step 6 DCCP_WARN too, as it says we're sending a sync. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:43 -07:00
Gerrit Renker	ee1a15922d	[DCCP]: Remove duplicate code for Reset from connected socket In this patch, duplicated code is removed for the case when a Reset packet is sent from a connected socket. This code duplication is between dccp_make_reset and dccp_transmit_skb, which already contained an (up to now entirely unused) switch statement to fill in the reset code from the DCCP_SKB_CB. The only thing that has been removed is the call to dst_clone(dst), since the queue_xmit functions use sk_dst_cache anyway. I wasn't sure which purpose inet_sk_rebuild_header served, so I left it in. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:42 -07:00
Gerrit Renker	0430ee3451	[DCCP]: Add Support for Data 1 .. 3 fields of Reset packets This adds fields to support the informational Data 1..3 fields of the DCCP-Reset packets (RFC 4340, 5.6), and makes minor cosmetic changes to documentation. Code which fills in these fields follows in subsequent patches, it is primarily used for reporting option-processing and feature-negotiation errors. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:42 -07:00
Gerrit Renker	727ecc5faa	[DCCP]: Add FIXME for send_delayed_ack This adds a FIXME to signal that the function dccp_send_delayed_ack is nowhere used in the entire DCCP/CCID code. Using a delayed Ack timer is suggested in 11.3 of RFC 4340, but it has also rather subtle implications for the Ack-Ratio-accounting. CCID2 does not use this (maybe it should). I think leaving the function in is good, in case someone wants to implement this. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:41 -07:00
Gerrit Renker	2e86908f7d	[CCID3]: Move NULL-protection into function This moves several instances of testing against NULL into the function which is used to de-reference the CCID-private data. Committer note: Made the BUG_ON depend on having CONFIG_IP_DCCP_CCID3_DEBUG, as it is too much to have this on production code. Also made sure that the macro is used only after checking if sk_state is not LISTEN, to make it equivalent to what we had before. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:41 -07:00
Gerrit Renker	08831700cc	[DCCP]: Send Reset upon Sync in state REQUEST This fixes the code to correspond to RFC 4340, 7.5.4, which states the exception that a Sync received in state REQUEST generates a Reset (not a SyncAck). To achieve this, only a small change is required. Since dccp_rcv_request_sent_state_process() already uses the correct Reset Code number 4 ("Packet Error"), we only need to shift the if-statement a few lines further down. (To test this case: replace DCCP_PKT_RESPONSE with DCCP_PKT_SYNC in dccp_make_response.) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:40 -07:00
Gerrit Renker	b0d045ca45	[DCCP]: Parameter renaming The parameter `seq' of dccp_send_sync() is in fact an acknowledgement number and not a sequence number - thus renamed by this patch into `ackno'. Secondly, a `critical' warning is added when a Sync/SyncAck could not be sent. Sanity: I have checked all other functions that are called in dccp_transmit_skb, there are no clashes with the use of dccpd_ack_seq; no other function is using this slot at the same time. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:37 -07:00
Gerrit Renker	e155d76922	[DCCP]: Fix Reset/Sync-Flood Bug This updates sequence number checking with regard to RFC 4340, 7.5.4. Missing in the code was an exception for sequence-invalid Reset packets, which get a Sync acknowledging GSR, instead of (as usual) P.seqno. This can lead to an oscillating ping-pong flood of Reset packets. In fact, it has been observed on the wire as follows: 1. client establishes connection to server; 2. before server can write to client, client crashes without notifying the server (NB: now no longer possible due to ABORT function); 3. server sends DCCP-Data packet (has no ackno); 4. client generates Reset "No Connection", seqno=0, increments seqno; 5. server replies with Sync, using ackno = P.seqno; 6. client generates Reset "No Connection" with seqno = ackno + 1; 7. goto (5). The difference is that now in (5) the server uses GSR. This causes the Reset sent by the client in (6) to become sequence-valid, so that in (7) the vicious circle is broken; the Reset is then enqueued and causes the socket to enter TIMEWAIT state. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:37 -07:00
Gerrit Renker	cbe1f5f88a	[DCCP]: Shorten variable names in dccp_check_seqno This patch is in part required by the next patch; it * replaces 6 instances of `DCCP_SKB_CB(skb)->dccpd_seq' with `seqno'; * replaces 7 instances of `DCCP_SKB_CB(skb)->dccpd_ack_seq' with `ackno'; * replaces 1 use of dccp_inc_seqno() by unfolding `ADD48' macro in place. No changes in algorithm, all changes are text replacement/substitution. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:36 -07:00
Gerrit Renker	3393da8241	[DCCP]: Simplify interface of dccp_sample_rtt The third parameter of dccp_sample_rtt now becomes useless and is removed. Also combined the subtraction of the timestamp echo and the elapsed time. This is safe, since (a) presence of timestamp echo is tested first and (b) elapsed time is either present and non-zero or it is not set and equals 0 due to the memset in dccp_parse_options. To avoid measuring option-processing time, the timestamp for measuring the initial Request/Response RTT sample is taken directly when the function is called (the Linux implementation always adds a timestamp on the Request, so there is no loss in doing this). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:35 -07:00
Gerrit Renker	4c70f383e0	[DCCP]: Provide 10s of microsecond timesource This provides a timesource, conveniently used for DCCP timestamps, which returns the elapsed time in 10s of microseconds since initialisation. This makes for a wrap-around time of about 11.9 hours, which should be sufficient for most applications. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:35 -07:00
Gerrit Renker	aa97efd97a	[DCCP]: Reuse ktime_get_real() calls again This patch reduces the number of timestamps taken in the receive path for each packet. The ccid3_hc_tx_update_x() routine is called in * the receive path for each CCID3-controlled packet * for the nofeedback timer (if no feedback arrives during 4 RTT) Currently, when there is no loss, each packet gets timestamped twice. The patch resolves this by recycling the first timestamp taken on packet reception for RTT sampling. When the no_feedback_timer() is called, then the timestamp argument is simply set to NULL - so that ccid3_hc_tx_update_x() takes care of the logic. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:34 -07:00
Eric W. Biederman	457c4cbc5a	[NET]: Make /proc/net per network namespace This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:06 -07:00

1 2 3 4 5 ...

447 Commits