linux_old1

Commit Graph

Author	SHA1	Message	Date
Gerrit Renker	6f4e5fff1e	[DCCP]: Support for partial checksums (RFC 4340, sec. 9.2) This patch does the following: a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2] b) provides necessary socket options and documentation as to how to use them c) basic support and infrastructure for the Minimum Checksum Coverage feature [RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user interface In addition, it (1) fixes two bugs in the DCCPv4 checksum computation: * pseudo-header used checksum_len instead of skb->len * incorrect checksum coverage calculation based on dccph_x (2) removes dccp_v4_verify_checksum() since it reduplicates code of the checksum computation; code calling this function is updated accordingly. (3) now uses skb_checksum(), which is safer than checksum_partial() if the sk_buff has is a non-linear buffer (has pages attached to it). (4) fixes an outstanding TODO item: * If P.CsCov is too large for the packet size, drop packet and return. The code has been tested with applications, the latest version of tcpdump now comes with support for partial DCCP checksums. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:09 -08:00
Gerrit Renker	d83ca5accb	[DCCP]: Update code comments for Step 2/3 Sorts out the comments for processing steps 2,3 in section 8.5 of RFC 4340. All comments have been updated against this document, and the reference to step 2 has been made consistent throughout the files. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:04 -08:00
Gerrit Renker	cf557926f6	[DCCP]: tidy up dccp_v{4,6}_conn_request This is a code simplification to remove reduplicated code by concentrating and abstracting shared code. Detailed Changes:	2006-12-02 21:22:03 -08:00
Ian McDonald	f45b3ec481	[DCCP]: Fix logfile overflow This patch fixes data being spewed into the logs continually. As the code stood if there was a large queue and long delays timeo would go down to zero and never get reset. This fixes it by resetting timeo. Put constant into header as well. Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:02 -08:00
Ian McDonald	fec5b80e49	[DCCP]: Fix DCCP Probe Typo Fixes a typo in Kconfig, patch is by Ian McDonald and is re-sent from http://www.mail-archive.com/dccp@vger.kernel.org/msg00579.html Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:01 -08:00
Gerrit Renker	73c9e02c22	[DCCPv6]: remove forward declarations in ipv6.c This does the same for ipv6.c as the preceding one does for ipv4.c: Only the inet_connection_sock_af_ops forward declarations remain, since at least dccp_ipv6_mapped has a circular dependency to dccp_v6_request_recv_sock. No code change, merely re-ordering. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:00 -08:00
Gerrit Renker	3d2fe62b8d	[DCCPv4]: remove forward declarations in ipv4.c This relates to Arnaldo's announcement in http://www.mail-archive.com/dccp@vger.kernel.org/msg00604.html Originally this had been part of the Oops fix and is a revised variant of http://www.mail-archive.com/dccp@vger.kernel.org/msg00598.html No code change, merely reshuffling, with the particular objective of having all request_sock_ops close(r) together for more clarity. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:59 -08:00
Gerrit Renker	8a73cd09d9	[DCCP]: calling dccp_v{4,6}_reqsk_send_ack is a BUG This patch removes two functions, the send_ack functions of request_sock, which are not called/used by the DCCP code. It is correct that these functions are not called, below is a justification why calling these functions (on a passive socket in the LISTEN/RESPOND state) would mean a DCCP protocol violation. A) Background: using request_sock in TCP:	2006-12-02 21:21:58 -08:00
Arnaldo Carvalho de Melo	f6484f7c7a	[DCCP] timewait: Remove leftover extern declarations Gerrit Renker noticed dccp_tw_deschedule and submitted a patch with a FIXME, but as he suggests in the same patch the best thing is to just ditch this declaration, while doing that also noticed that tcp_tw_count is as well not defined anywhere, so ditch it too. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:57 -08:00
Gerrit Renker	d23c7107bf	[DCCP]: Simplify jump labels in dccp_v{4,6}_rcv This is a code simplification and was singled out from the DCCPv6 Oops patch on http://www.mail-archive.com/dccp@vger.kernel.org/msg00600.html It mainly makes the code consistent between ipv{4,6}.c for the functions dccp_v4_rcv dccp_v6_rcv and removes the do_time_wait label to simplify code somewhat. Commiter note: fixed up a compile problem, trivial. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:56 -08:00
Gerrit Renker	9b42078ed6	[DCCP]: Combine allocating & zeroing header space on skb This is a code simplification: it combines three often recurring operations into one inline function, * allocate `len' bytes header space in skb * fill these `len' bytes with zeroes * cast the start of this header space as dccp_hdr Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:55 -08:00
Gerrit Renker	89e7e57778	[DCCPv6]: Add a FIXME for missing IPV6_PKTOPTIONS This refers to the possible memory leak pointed out in http://www.mail-archive.com/dccp@vger.kernel.org/msg00574.html, fixed by David Miller in http://www.mail-archive.com/netdev@vger.kernel.org/msg24881.html and adds a FIXME to point out where code is missing. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:54 -08:00
Gerrit Renker	60361be1be	[DCCP]: set safe upper bound for option length This is a re-send from http://www.mail-archive.com/dccp@vger.kernel.org/msg00553.html It is the same patch as before, but I have built in Arnaldo's suggestions pointed out in that posting. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:21:53 -08:00
David S. Miller	931731123a	[TCP]: Don't set SKB owner in tcp_transmit_skb(). The data itself is already charged to the SKB, doing the skb_set_owner_w() just generates a lot of noise and extra atomics we don't really need. Lmbench improvements on lat_tcp are minimal: before: TCP latency using localhost: 23.2701 microseconds TCP latency using localhost: 23.1994 microseconds TCP latency using localhost: 23.2257 microseconds after: TCP latency using localhost: 22.8380 microseconds TCP latency using localhost: 22.9465 microseconds TCP latency using localhost: 22.8462 microseconds Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:52 -08:00
David S. Miller	494b4e7d81	[DCCP]: Fix typo _read_mostly --> __read_mostly. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:45 -08:00
Eric Dumazet	72a3effaf6	[NET]: Size listen hash tables using backlog hint We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for each LISTEN socket, regardless of various parameters (listen backlog for example) On x86_64, this means order-1 allocations (might fail), even for 'small' sockets, expecting few connections. On the contrary, a huge server wanting a backlog of 50000 is slowed down a bit because of this fixed limit. This patch makes the sizing of listen hash table a dynamic parameter, depending of : - net.core.somaxconn tunable (default is 128) - net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128) - backlog value given by user application (2nd parameter of listen()) For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of kmalloc(). We still limit memory allocation with the two existing tunables (somaxconn & tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM usage. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:44 -08:00
Akinobu Mita	ac16ca6412	[NET]: Fix kfifo_alloc() error check. The return value of kfifo_alloc() should be checked by IS_ERR(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-25 15:16:49 -08:00
David Howells	c4028958b6	WorkStruct: make allyesconfig Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:57:56 +00:00
YOSHIFUJI Hideaki	f2776ff047	[IPV6]: Fix address/interface handling in UDP and DCCP, according to the scoping architecture. TCP and RAW do not have this issue. Closes Bug #7432. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-21 17:41:56 -08:00
Randy Dunlap	234af48401	[DCCP]: fix printk format warnings Fix printk format warnings: build2.out:net/dccp/ccids/ccid2.c:355: warning: long long unsigned int format, u64 arg (arg 3) build2.out:net/dccp/ccids/ccid2.c:360: warning: long long unsigned int format, u64 arg (arg 3) build2.out:net/dccp/ccids/ccid2.c:482: warning: long long unsigned int format, u64 arg (arg 5) build2.out:net/dccp/ccids/ccid2.c:639: warning: long long unsigned int format, u64 arg (arg 3) build2.out:net/dccp/ccids/ccid2.c:639: warning: long long unsigned int format, u64 arg (arg 4) build2.out:net/dccp/ccids/ccid2.c:674: warning: long long unsigned int format, u64 arg (arg 3) build2.out:net/dccp/ccids/ccid2.c:720: warning: long long unsigned int format, u64 arg (arg 3) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-30 15:24:37 -08:00
Gerrit Renker	0e64e94e47	[DCCP]: Update documentation references. Updates the references to spec documents throughout the code, taking into account that * the DCCP, CCID 2, and CCID 3 drafts all became RFCs in March this year * RFC 1063 was obsoleted by RFC 1191 * draft-ietf-tcpimpl-pmtud-0x.txt was published as an Informational RFC, RFC 2923 on 2000-09-22. All references verified. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-24 16:17:51 -07:00
David S. Miller	fd169f15a6	[DCCP] ipv6: Fix opt_skb leak. Based upon a patch from Jesper Juhl. Try to match the TCP IPv6 code this was copied from as much as possible, so that it's easy to see where to add the ipv6 pktoptions support code. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-21 19:55:21 -07:00
Gerrit Renker	82709531a8	[DCCP]: Fix Oops in DCCPv6 I think I got the cause for the Oops observed in http://www.mail-archive.com/dccp@vger.kernel.org/msg00578.html The problem is always with applications listening on PF_INET6 sockets. Apart from the mentioned oops, I observed another one one, triggered at irregular intervals via timer interrupt: run_timer_softirq -> dccp_keepalive_timer -> inet_csk_reqsk_queue_prune -> reqsk_free -> dccp_v6_reqsk_destructor The latter function is the problem and is also the last function to be called in said kernel panic. In any case, there is a real problem with allocating the right request_sock which is what this patch tackles. It fixes the following problem: - application listens on PF_INET6 - DCCPv4 packet comes in, is handed over to dccp_v4_do_rcv, from there to dccp_v4_conn_request Now: socket is PF_INET6, packet is IPv4. The following code then furnishes the connection with IPv6 - request_sock operations: req = reqsk_alloc(sk->sk_prot->rsk_prot); The first problem is that all further incoming packets will get a Reset since the connection can not be looked up. The second problem is worse: --> reqsk_alloc is called instead of inet6_reqsk_alloc --> consequently inet6_rsk_offset is never set (dangling pointer) --> the request_sock_ops are nevertheless still dccp6_request_ops --> destructor is called via reqsk_free --> dccp_v6_reqsk_destructor tries to free random memory location (inet6_rsk_offset not set) --> panic I have tested this for a while, DCCP sockets are now handled correctly in all three scenarios (v4/v6 only/v4-mapped). Commiter note: I've added the dccp_request_sock_ops forward declaration to keep the tree building and to reduce the size of the patch for 2.6.19, later I'll move the functions to the top of the affected source code to match what we have in the TCP counterpart, where this problem hasn't existed in the first place, dumb me not to have done the same thing on DCCP land 8) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-10-21 19:55:20 -07:00
YOSHIFUJI Hideaki	9469c7b4aa	[NET]: Use typesafe inet_twsk() inline function instead of cast. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:58 -07:00
Al Viro	bada8adc4e	[IPV4]: ip_route_connect() ipv4 address arguments annotated annotated address arguments (port number left alone for now); ditto for inferred net-endian variables in callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:54:06 -07:00
Ian McDonald	e41542f516	[DCCP]: Introduce dccp_probe This adds DCCP probing shamelessly ripped off from TCP probes by Stephen Hemminger. I've put in here support for further CCID3 variables as well. Andrea/Arnaldo might look to extend for CCID2. Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-09-24 18:08:17 -03:00
Ian McDonald	3dd9a7c3a1	[DCCP]: Use constants for CCIDs With constants for CCID numbers this now uses them in some places. Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-09-24 18:03:41 -03:00
Gerrit Renker	00e4d116a7	[DCCP]: Allow default/fallback service code. This has been discussed on dccp@vger and removes the necessity for applications to supply service codes in each and every case. If an application does not want to provide a service code, that's fine, it will be given 0. Otherwise, service codes can be set via socket options as before. This patch has been tested using various client/server configurations (including listening on multiple service codes). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-09-24 17:49:26 -03:00
Andrea Bittau	593f16aa62	[DCCP] CCID2: Add helper functions for changing important CCID2 state Introduce methods which manipulate interesting congestion control state such as pipe and rtt estimate. This is useful for people wishing to monitor the variables of CCID and instrument the code [perhaps using Kprobes]. Personally, I am a fan of encapsulation---that justifies this change =D. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:42 -07:00
Andrea Bittau	374bcf32c8	[DCCP] CCID2: Halve cwnd once upon multiple losses in a single RTT When multiple losses occur in one RTT, the window should be halved only once [a single "congestion event"]. This is now implemented, although not perfectly. Slightly changed the interface for changing the cwnd: pass hctx instead of dp. This is required in order to allow for change_cwnd to be called from _init(). Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:41 -07:00
Andrea Bittau	07978aabd5	[DCCP] CCID2: Allocate seq records on demand Allocate more sequence state on demand. Each time a packet is sent out by CCID2, a record of it needs to be kept. This list of records grows proportionally to cwnd. Previously, the length of this list was hardcored and therefore the cwnd could only grow to this value (of 128). Now, records are allocated on demand as necessary---cwnd may grow as it wishes. The exceptional case of when memory is not available is not handled gracefully. Perhaps, cwnd should be capped at that point. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:40 -07:00
Andrea Bittau	8d424f6ca2	[DCCP] CCID2: Add Kconfig option for CCID2 debug Allow the user to choose whether or not to enable CCID2 debugging via Kconfig. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:39 -07:00
Andrea Bittau	446dec30c7	[DCCP] CCID2: Tell DCCP to quickly check whether cwnd is available If not enough cwnd is available, tell the sender to check again as soon as possible. This will increase CPU utilization (polling frequently for cwnd) but will improve network performance. That is, the sender will need to wait less before detecting the increase of cwnd. A better architecture would be for the CCID to call-back (or dequeue) from DCCP when it is able to transmit traffic -- not the other way around as it currently occurs. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:39 -07:00
Andrea Bittau	d458c25ce2	[DCCP] CCID2: Initialize ssthresh to infinity Initialize the slow-start threshold to infinity. This way, upon connection initiation, slow-start will be exited only upon a packet loss. This patch will allow connections to quickly gain speed. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:11 -07:00
Andrea Bittau	29651cda97	[DCCP] CCID2: Fix jiffie wrap issues Jiffies are now handled correctly (I hope) in CCID2. If they wrap, no problem. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:10 -07:00
Andrea Bittau	4a0a50fb43	[DCCP] ackvec: Remove unused variables Get rid of unused variables in ackvector state. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:09 -07:00
Andrea Bittau	8e27e4650c	[DCCP] ackvec: Fix how DCCP_ACKVEC_STATE_NOT_RECEIVED is used Fix the way state is masked out. DCCP_ACKVEC_STATE_NOT_RECEIVED is defined as appears in the packet, therefore bit shifting is not required. This fix allows CCID2 to correctly detect losses. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:08 -07:00
Andrea Bittau	23d06e3b98	[DCCP] ACKVEC: fix ackvector length calculation Fix ackvector length calculation upon receiving an "ack-of-ack". This patch avoids the ackvector from growing too large which causes it to not be inserted into packets. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:07 -07:00
Dmitry Mishin	fda9ef5d67	[NET]: Fix sk->sk_filter field access Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg needlock = 0, while socket is not locked at that moment. In order to avoid this and similar issues in the future, use rcu for sk->sk_filter field read protection. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org>	2006-09-22 15:18:47 -07:00
Ian McDonald	fc747e82b4	[DCCP]: Tidyup CCID3 list handling As Arnaldo Carvalho de Melo points out I should be using list_entry in case the structure changes in future. Current code functions but is reliant on position and requires type cast. Noticed when doing this that I have one more variable than I needed so removing that also. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:33 -07:00
Ian McDonald	97e5848dd3	[DCCP]: Introduce tx buffering This adds transmit buffering to DCCP. I have tested with CCID2/3 and with loss and rate limiting. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:17 -07:00
Ian McDonald	2a0109a707	[DCCP]: Shift sysctls into feat.h This shifts further sysctls into feat.h. No change in functionality - shifting code only. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:16 -07:00
YOSHIFUJI Hideaki	8e1ef0a95b	[IPV6]: Cache source address as well in ipv6_pinfo{}. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:45 -07:00
Herbert Xu	8f491069b4	[IPV4]: Use network-order dport for all visible inet_lookup_* Right now most inet_lookup_* functions take a host-order hnum instead of a network-order dport because that's how it is represented internally. This means that users of these functions have to be careful about using the right byte-order. To add more confusion, inet_lookup takes a network-order dport unlike all other functions. So this patch changes all visible inet_lookup functions to take a dport and move all dport->hnum conversion inside them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:14 -07:00
Venkat Yekkirala	4237c75c0a	[MLSXFRM]: Auto-labeling of child sockets This automatically labels the TCP, Unix stream, and dccp child sockets as well as openreqs to be at the same MLS level as the peer. This will result in the selection of appropriately labeled IPSec Security Associations. This also uses the sock's sid (as opposed to the isec sid) in SELinux enforcement of secmark in rcv_skb and postroute_last hooks. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:29 -07:00
Venkat Yekkirala	beb8d13bed	[MLSXFRM]: Add flow labeling This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:27 -07:00
Ian McDonald	66a377c504	[DCCP]: Fix CCID3 This fixes CCID3 to give much closer performance to RFC4342. CCID3 is meant to alter sending rate based on RTT and loss. The performance was verified against: http://wand.net.nz/~perry/max_download.php For example I tested with netem and had the following parameters: Delayed Acks 1, MSS 256 bytes, RTT 105 ms, packet loss 5%. This gives a theoretical speed of 71.9 Kbits/s. I measured across three runs with this patch set and got 70.1 Kbits/s. Without this patchset the average was 232 Kbits/s which means Linux can't be used for CCID3 research properly. I also tested with netem turned off so box just acting as router with 1.2 msec RTT. The performance with this is the same with or without the patch at around 30 Mbit/s. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-26 23:40:50 -07:00
Ian McDonald	80193aee18	[DCCP]: Introduce dccp_rx_hist_find_entry This adds a new function dccp_rx_hist_find_entry. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-26 19:07:36 -07:00
Ian McDonald	837d107cd1	[DCCP]: Introduces follows48 function This adds a new function to see if two sequence numbers follow each other. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-26 19:06:42 -07:00
Ian McDonald	e6bccd3573	[DCCP]: Update contact details and copyright Just updating copyright and contacts Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-26 19:01:30 -07:00
Ian McDonald	f3166c0717	[DCCP]: Fix typo This fixes a small typo in net/dccp/libs/packet_history.c Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-26 19:01:03 -07:00
Herbert Xu	497c615aba	[IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls The current users of ip6_dst_lookup can be divided into two classes: 1) The caller holds no locks and is in user-context (UDP). 2) The caller does not want to lookup the dst cache at all. The second class covers everyone except UDP because most people do the cache lookup directly before calling ip6_dst_lookup. This patch adds ip6_sk_dst_lookup for the first class. Similarly ip6_dst_store users can be divded into those that need to take the socket dst lock and those that don't. This patch adds __ip6_dst_store for those (everyone except UDP/datagram) that don't need an extra lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:14 -07:00
Ian McDonald	4b79f0af48	[DCCP]: Fix default sequence window size When using the default sequence window size (100) I got the following in my logs: Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC... Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <= S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <= P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC... I went to alter the default sysctl and it didn't take for new sockets. Below patch fixes this. I think the default is too low but it is what the DCCP spec specifies. As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec to 3.5. This is still far too slow but it is a step in the right direction. Compile tested only for IPv6 but not particularly complex change. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 12:44:21 -07:00
Alan Cox	9faefb6d41	[DCCP]: Fix sparse warnings. No actual bugs that I can see just a couple of unmarked casts getting annoying in my debug log files. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-10 14:50:37 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Jean-Luc Leger	538c5902b8	[PATCH] clean up default value of IP_DCCP_ACKVEC Default values for boolean and tristate options can only be 'y', 'm' or 'n'. This patch removes wrong default for IP_DCCP_ACKVEC. Signed-off-by: Jean-Luc Leger <jean-luc.leger@dspnet.fr.eu.org> Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:43:04 -07:00
Chris Leech	624d116473	[I/OAT]: Make sk_eat_skb I/OAT aware. Add an extra argument to sk_eat_skb, and make it move early copied packets to the async_wait_queue instead of freeing them. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:25:52 -07:00
Andrea Bittau	afec35e3fe	[DCCP] Ackvec: fix soft lockup in ackvec handling code A soft lockup existed in the handling of ack vector records. Specifically, when a tail of the list of ack vector records was removed, it was possible to end up iterating infinitely on an element of the tail. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-11 21:08:03 -07:00
Herbert Xu	134af34632	[DCCP]: Fix sock_orphan dead lock Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead locks. For example, the inet_diag code holds sk_callback_lock without disabling BH. If an inbound packet arrives during that admittedly tiny window, it will cause a dead lock on bh_lock_sock. Another possible path would be through sock_wfree if the network device driver frees the tx skb in process context with BH enabled. We can fix this by moving sock_orphan out of bh_lock_sock. The tricky bit is to work out when we need to destroy the socket ourselves and when it has already been destroyed by someone else. By moving sock_orphan before the release_sock we can solve this problem. This is because as long as we own the socket lock its state cannot change. So we simply record the socket state before the release_sock and then check the state again after we regain the socket lock. If the socket state has transitioned to DCCP_CLOSED in the time being, we know that the socket has been destroyed. Otherwise the socket is still ours to keep. This problem was discoverd by Ingo Molnar using his lock validator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-05 17:09:13 -07:00
Eric Sesterhenn	b8282dcf04	[DCCP]: Fix leak in net/dccp/ipv4.c we dont free req if we cant parse the options. This fixes coverity bug id #1046 Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-11 17:21:06 -07:00
Randy Dunlap	68907dad58	[DCCP]: Use NULL for pointers, comfort sparse. From: Randy Dunlap <rdunlap@xenotime.net> Use NULL instead of 0 for pointers. Fix these sparse warnings: net/dccp/feat.c:207:20: warning: Using plain integer as NULL pointer net/dccp/feat.c:325:21: warning: Using plain integer as NULL pointer net/dccp/feat.c:526:20: warning: Using plain integer as NULL pointer Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-29 13:58:25 -08:00
Davide Libenzi	f348d70a32	[PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications Implement the half-closed devices notifiation, by adding a new POLLRDHUP (and its alias EPOLLRDHUP) bit to the existing poll/select sets. Since the existing POLLHUP handling, that does not report correctly half-closed devices, was feared to be changed, this implementation leaves the current POLLHUP reporting unchanged and simply add a new bit that is set in the few places where it makes sense. The same thing was discussed and conceptually agreed quite some time ago: http://lkml.org/lkml/2003/7/12/116 Since this new event bit is added to the existing Linux poll infrastruture, even the existing poll/select system calls will be able to use it. As far as the existing POLLHUP handling, the patch leaves it as is. The pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing archs and sets the bit in the six relevant files. The other attached diff is the simple change required to sys/epoll.h to add the EPOLLRDHUP definition. There is "a stupid program" to test POLLRDHUP delivery here: http://www.xmailserver.org/pollrdhup-test.c It tests poll(2), but since the delivery is same epoll(2) will work equally. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-25 08:22:56 -08:00
Arnaldo Carvalho de Melo	8ca0d17bd7	[DCCP] feat: Pass dccp_minisock ptr where only the minisock is used This is in preparation for having a dccp_minisock embedded into dccp_request_sock so that feature negotiation can be done prior to creating the full blown dccp_sock. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:51:53 -08:00
Arnaldo Carvalho de Melo	a4bf390242	[DCCP] minisock: Rename struct dccp_options to struct dccp_minisock This will later be included in struct dccp_request_sock so that we can have per connection feature negotiation state while in the 3way handshake, when we clone the DCCP_ROLE_LISTEN socket (in dccp_create_openreq_child) we'll just copy this state from dreq_minisock to dccps_minisock. Also the feature negotiation and option parsing code will mostly touch dccps_minisock, which will simplify some stuff. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:50:58 -08:00
Arnaldo Carvalho de Melo	543d9cfeec	[NET]: Identation & other cleanups related to compat_[gs]etsockopt cset No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs to just after the function exported, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:48:35 -08:00
Arnaldo Carvalho de Melo	dec73ff029	[ICSK] compat: Introduce inet_csk_compat_[gs]etsockopt Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:46:16 -08:00
Dmitry Mishin	3fdadf7d27	[NET]: {get\|set}sockopt compatibility layer This patch extends {get\|set}sockopt compatibility layer in order to move protocol specific parts to their place and avoid huge universal net/compat.c file in the future. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:45:21 -08:00
David S. Miller	fb9504964d	[DCCP]: Fix uninitialized var warnings in dccp_parse_options(). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:36:01 -08:00
Arnaldo Carvalho de Melo	2d0817d11e	[DCCP] options: Make dccp_insert_options & friends yell on error And not the silly LIMIT_NETDEBUG and silently return without inserting the option requested. Also drop some old debugging messages associated to option insertion. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:32:06 -08:00
Arnaldo Carvalho de Melo	110bae4efb	[DCCP]: Remove leftover dccp_send_response prototype Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:31:46 -08:00
Arnaldo Carvalho de Melo	c5fed1597e	[DCCP]: ditch dccp_v[46]_ctl_send_ack Merging it with its only user: dccp_v[46]_reqsk_send_ack. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:31:26 -08:00
Arnaldo Carvalho de Melo	118b2c9532	[DCCP]: Use sk->sk_prot->max_header consistently for non-data packets Using this also provides opportunities for introducing inet_csk_alloc_skb that would call alloc_skb, account it to the sock and skb_reserve(max_header), but I'll leave this for later, for now using sk_prot->max_header consistently is enough. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:31:09 -08:00
Arnaldo Carvalho de Melo	e5a6de915b	[DCCP] options: Fix handling of ackvecs in DATA packets I.e. they should be just ignored, but we have to use 'break', not 'continue', as we have to possibly reset the mandatory flag. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:30:51 -08:00
Arnaldo Carvalho de Melo	6df9424a9c	[DCCP] options: Fix some aspects of mandatory option processing According to dccp draft (draft-ietf-dccp-spec-13.txt) section 5.8.2 (Mandatory Option) the following patch correct the handling of the following cases: 1) "... and any Mandatory options received on DCCP-Data packets MUST be ignored." 2) "The connection is in error and should be reset with Reset Code 5, ... if option O is absent (Mandatory was the last byte of the option list), or if option O equals Mandatory." Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:06:02 -08:00
Arnaldo Carvalho de Melo	c0c736db7e	[DCCP] ccid2: coding style cleanups No changes in the logic where made. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:05:37 -08:00
Arnaldo Carvalho de Melo	45329e71ee	[DCCP] ipv6: cleanups No changes in the logic were made, just removing trailing whitespaces, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:01:29 -08:00
Arnaldo Carvalho de Melo	c4d9390941	[ICSK]: Introduce inet_csk_ctl_sock_create Consolidating open coded sequences in tcp and dccp, v4 and v6. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:01:03 -08:00
Arnaldo Carvalho de Melo	7247887357	[DCCP] ipv6: Add missing ipv6 control socket I guess I forgot to add it, nah, now it just works: 18:04:33.274066 IP6 ::1.1476 > ::1.5001: request (service=0) 18:04:33.334482 IP6 ::1.5001 > ::1.1476: reset (code=bad_service_code) Ditched IP_DCCP_UNLOAD_HACK, as now we would have to do it for both IPv6 and IPv4, so I'll come up with another way for freeing the control sockets in upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:00:37 -08:00
Arnaldo Carvalho de Melo	c25a18ba34	[DCCP]: Uninline some functions Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:58:56 -08:00
Adrian Bunk	5e0817f84c	[DCCP] ipv4: make struct dccp_v4_prot static There's no reason for struct dccp_v4_prot being global. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:58:29 -08:00
Arnaldo Carvalho de Melo	b61fafc4ef	[DCCP]: Move the IPv4 specific bits from proto.c to ipv4.c With this patch in place we can break down the complexity by better compartmentalizing the code that is common to ipv6 and ipv4. Now we have these modules: Module Size Used by dccp_diag 1344 0 inet_diag 9448 1 dccp_diag dccp_ccid3 15856 0 dccp_tfrc_lib 12320 1 dccp_ccid3 dccp_ccid2 5764 0 dccp_ipv4 16996 2 dccp 48208 4 dccp_diag,dccp_ccid3,dccp_ccid2,dccp_ipv4 dccp_ipv6 still requires dccp_ipv4 due to dccp_ipv6_mapped, that is the next target to work on the "hey, ipv4 is legacy, I only want ipv6 dude!" direction. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:25:11 -08:00
Arnaldo Carvalho de Melo	46f09ffa7d	[DCCP]: Rename init_dccp_v4_mibs to dccp_mib_init And introduce dccp_mib_exit grouping previously open coded sequence. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:24:42 -08:00
Arnaldo Carvalho de Melo	075ae86611	[DCCP]: Move dccp_hashinfo from ipv4.c to the core As it is used by both ipv4 and ipv6. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:24:19 -08:00
Arnaldo Carvalho de Melo	0a1ec676dd	[DCCP]: Dont use dccp_v4_checksum in dccp_make_response dccp_make_response is shared by ipv4/6 and the ipv6 code was recalculating the checksum, not good, so move the dccp_v4_checksum call to dccp_v4_send_response. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:23:59 -08:00
Arnaldo Carvalho de Melo	c985ed705f	[DCCP]: Move dccp_[un]hash from ipv4.c to the core As this is used by both ipv4 and ipv6 and is not ipv4 specific. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:23:39 -08:00
Arnaldo Carvalho de Melo	3e0fadc51f	[DCCP]: Move dccp_v4_{init,destroy}_sock to the core Removing one more ipv6 uses ipv4 stuff case in dccp land. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:23:15 -08:00
Arnaldo Carvalho de Melo	017487d7d1	[DCCP]: Generalize dccp_v4_send_reset Renaming it to dccp_send_reset and moving it from the ipv4 specific code to the core dccp code. This fixes some bugs in IPV6 where timers would send v4 resets, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:25:24 -08:00
Arnaldo Carvalho de Melo	e55d912f5b	[DCCP] feat: Introduce sysctls for the default features [root@qemu ~]# for a in /proc/sys/net/dccp/default/* ; do echo $a ; cat $a ; done /proc/sys/net/dccp/default/ack_ratio 2 /proc/sys/net/dccp/default/rx_ccid 3 /proc/sys/net/dccp/default/send_ackvec 1 /proc/sys/net/dccp/default/send_ndp 1 /proc/sys/net/dccp/default/seq_window 100 /proc/sys/net/dccp/default/tx_ccid 3 [root@qemu ~]# So if wanting to test ccid3 as the tx CCID one can just do: [root@qemu ~]# echo 3 > /proc/sys/net/dccp/default/tx_ccid [root@qemu ~]# echo 2 > /proc/sys/net/dccp/default/rx_ccid [root@qemu ~]# cat /proc/sys/net/dccp/default/[tr]x_ccid 2 3 [root@qemu ~]# Of course we also need the setsockopt for each app to tell its preferences, but for testing or defining something other than CCID2 as the default for apps that don't explicitely set their preference the sysctl interface is handy. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:25:02 -08:00
Arnaldo Carvalho de Melo	04e2661e9c	[DCCP]: Call dccp_feat_init more early in dccp_v4_init_sock So that dccp_feat_clean doesn't get confused with uninitialized list_heads. Noticed when testing with no ccid kernel modules. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:24:41 -08:00
Arnaldo Carvalho de Melo	057fc6755a	[DCCP]: Kconfig tidy up Make CCID2 and CCID3 default to what was selected for DCCP and use the standard short description for the CCIDs (TCP-Like & TCP-Friendly). Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:24:22 -08:00
Andrea Bittau	60fe62e789	[DCCP]: sparse endianness annotations This also fixes the layout of dccp_hdr short sequence numbers, problem was not fatal now as we only support long (48 bits) sequence numbers. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:23:32 -08:00
Andrea Bittau	6ffd30fbbb	[DCCP] feat: Actually change the CCID upon negotiation Change the CCID upon successful feature negotiation. Commiter note: patch mostly rewritten to use the new ccid API. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:22:37 -08:00
Arnaldo Carvalho de Melo	91f0ebf7b6	[DCCP] CCID: Improve CCID infrastructure 1. No need for ->ccid_init nor ->ccid_exit, this is what module_{init,exit} does and anynways neither ccid2 nor ccid3 were using it. 2. Rename struct ccid to struct ccid_operations and introduce struct ccid with a pointer to ccid_operations and rigth after it the rx or tx private state. 3. Remove the pointer to the state of the half connections from struct dccp_sock, now its derived thru ccid_priv() from the ccid pointer. Now we also can implement the setsockopt for changing the CCID easily as no ccid init routines can affect struct dccp_sock in any way that prevents other CCIDs from working if a CCID switch operation is asked by apps. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:21:44 -08:00
Andrea Bittau	77ff72d528	[DCCP] CCID2: Drop sock reference count on timer expiration and reset. There was a hybrid use of standard timers and sk_timers. This caused the reference count of the sock to be incorrect when resetting the RTO timer. The sock reference count should now be correct, enabling its destruction, and allowing the DCCP module to be unloaded. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-03-20 17:57:52 -08:00
Andrea Bittau	afe00251dd	[DCCP]: Initial feature negotiation implementation Still needs more work, but boots and doesn't crashes, even does some negotiation! 18:38:52.174934 127.0.0.1.43458 > 127.0.0.1.5001: request <change_l ack_ratio 2, change_r ccid 2, change_l ccid 2> 18:38:52.218526 127.0.0.1.5001 > 127.0.0.1.43458: response <nop, nop, change_l ack_ratio 2, confirm_r ccid 2 2, confirm_l ccid 2 2, confirm_r ack_ratio 2> 18:38:52.185398 127.0.0.1.43458 > 127.0.0.1.5001: <nop, confirm_r ack_ratio 2, ack_vector0 0x00, elapsed_time 212> :-) Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:43:56 -08:00
Andrea Bittau	2a91aa3967	[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation Original work by Andrea Bittau, Arnaldo Melo cleaned up and fixed several issues on the merge process. For now CCID2 was turned the default for all SOCK_DCCP connections, but this will be remedied soon with the merge of the feature negotiation code. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:41:47 -08:00
Arnaldo Carvalho de Melo	aa5d7df3b2	[DCCP] CCID3: Set the no_feedback_timer fields near init_timer Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:35:13 -08:00
Arnaldo Carvalho de Melo	9833d6da00	[DCCP]: Don't alloc ack vector for the control sock Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:34:53 -08:00
Arnaldo Carvalho de Melo	d5e9b2c737	[DCCP] ackvec: Delete all the ack vector records in dccp_ackvec_free Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:20:46 -08:00
Arnaldo Carvalho de Melo	411447019a	[DCCP] CCID: Allow ccid_{init,exit} to be NULL Testing if the ccid being instantiated has these methods in ccid_init(). Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:20:23 -08:00
Andrea Bittau	02bcf28c82	[DCCP] ackvec: Introduce ack vector records Based on a patch by Andrea Bittau. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:19:55 -08:00
Arnaldo Carvalho de Melo	9b07ef5dda	[DCCP] ackvec: Introduce dccp_ackvec_slab Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:16:17 -08:00
Arnaldo Carvalho de Melo	fa23e2ecd3	[DCCP]: Fix error handling in dccp_init Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:16:01 -08:00
Arnaldo Carvalho de Melo	7400d78110	[DCCP] ackvec: Ditch dccpav_buf_len Simplifying the code a bit as we're always using DCCP_MAX_ACKVEC_LEN. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:15:42 -08:00
Ian McDonald	c09966608d	[DCCP] ccid3: Divide by zero fix In rare circumstances 0 is returned by dccp_li_hist_calc_i_mean which leads to a divide by zero in ccid3_hc_rx_packet_recv. Explicitly check for zero return now. Update copyright notice at same time. Found by Arnaldo. Signed-off-by: Ian McDonald <imcdnzl@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-04 21:06:29 -08:00
Al Viro	1b8623545b	[PATCH] remove bogus asm/bug.h includes. A bunch of asm/bug.h includes are both not needed (since it will get pulled anyway) and bogus (since they are done too early). Removed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-07 20:56:35 -05:00
David S. Miller	0cbd782507	[DCCP] ipv6: dccp_v6_send_response() has a DST leak too. It was copy&pasted from tcp_v6_send_synack() which has a DST leak recently fixed by Eric W. Biederman. So dccp_v6_send_response() needs the same fix too. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-31 17:53:37 -08:00
Patrick McHardy	5d39a795bf	[IPV4]: Always set fl.proto in ip_route_newports ip_route_newports uses the struct flowi from the struct rtable returned by ip_route_connect for the new route lookup and just replaces the port numbers if they have changed. If an IPsec policy exists which doesn't match port 0 the struct flowi won't have the proto field set and no xfrm lookup is done for the changed ports. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-31 17:35:35 -08:00
Kris Katterjohn	a8fc3d8dec	[NET]: "signed long" -> "long" Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 13:03:54 -08:00
Patrick McHardy	eb9c7ebe69	[NETFILTER]: Handle NAT in IPsec policy checks Handle NAT of decapsulated IPsec packets by reconstructing the struct flowi of the original packet from the conntrack information for IPsec policy checks. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:37 -08:00
Patrick McHardy	b59c270104	[NETFILTER]: Keep conntrack reference until IPsec policy checks are done Keep the conntrack reference until policy checks have been performed for IPsec NAT support. The reference needs to be dropped before a packet is queued to avoid having the conntrack module unloadable. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:36 -08:00
Patrick McHardy	951dbc8ac7	[IPV6]: Move nextheader offset to the IP6CB Move nextheader offset to the IP6CB to make it possible to pass a packet to ip6_input_finish multiple times and have it skip already parsed headers. As a nice side effect this gets rid of the manual hopopts skipping in ip6_input_finish. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:29 -08:00
David S. Miller	aa0e4e4aea	[DCCP]: ipv6.c needs net/ip6_checksum.c Reported by Dave Jones. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:26 -08:00
Arnaldo Carvalho de Melo	e4dfd449c8	[DCCP] ackvec: use u8 for the buf offsets Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-01-04 01:46:34 -02:00
Andrea Bittau	6742bbcbb8	[DCCP] ackvec: Fix spelling of "throw" Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-01-04 01:45:17 -02:00
Andrea Bittau	e84a9f5e9c	[DCCP]: Notify CCID only after ACK vectors have been processed. The CCID should be notified of packet reception only when a packet is valid. Therefore, the ACK vector needs to be processed before notifying the CCID. Also, the CCID might need information provided by the ACK vector. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 14:26:15 -08:00
Andrea Bittau	9e377202d2	[DCCP]: Send an ACK vector when ACKing a response packet If ACK vectors are used, each packet with an ACK should contain an ACK vector. The only exception currently is response packets. It probably is not a good idea to store ACK vector state before the connection is completed (to help protect from syn floods). Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 14:25:49 -08:00
Andrea Bittau	709dd3aaf5	[DCCP]: Do not process a packet twice when it's not in state DCCP_OPEN. When packets are received, the connection is either in DCCP_OPEN [fast-path] or it isn't. If it's not [e.g. DCCP_PARTOPEN] upper layers will perform sanity checks and parse options. If it is in DCCP_OPEN, dccp_rcv_established() will do it. It is important not to re-parse options in dccp_rcv_established() when it is not called from the fast-path. Else, fore example, the ack vector will be added twice and the CCID will see the packet twice. The solution is to always enfore sanity checks from the upper layers. When packets arrive in the fast-path, sanity checks will be performed before calling dccp_rcv_established(). Note(acme): I rewrote the patch to achieve the same result but keeping dccp_rcv_established with the previous semantics and having it split into __dccp_rcv_established, that doesn't does do any sanity check, code in state != DCCP_OPEN use this lighter version as they already do the sanity checks. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 14:25:17 -08:00
Arnaldo Carvalho de Melo	14c850212e	[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:21 -08:00
Arnaldo Carvalho de Melo	25995ff577	[SOCK]: Introduce sk_receive_skb Its common enough to to justify that, TCP still can't use it as it has the prequeueing stuff, still to be made generic in the not so distant future :-) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:19 -08:00
Eric Dumazet	90ddc4f047	[NET]: move struct proto_ops to const I noticed that some of 'struct proto_ops' used in the kernel may share a cache line used by locks or other heavily modified data. (default linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at least) This patch makes sure a 'struct proto_ops' can be declared as const, so that all cpus can share all parts of it without false sharing. This is not mandatory : a driver can still use a read/write structure if it needs to (and eventually a __read_mostly) I made a global stubstitute to change all existing occurences to make them const. This should reduce the possibility of false sharing on SMP, and speedup some socket system calls. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:15 -08:00
Arnaldo Carvalho de Melo	d83d8461f9	[IP_SOCKGLUE]: Remove most of the tcp specific calls As DCCP needs to be called in the same spots. Now we have a member in inet_sock (is_icsk), set at sock creation time from struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and DCCP) to see if a struct sock instance is a inet_connection_sock for places like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if sk_type was SOCK_STREAM, that is insufficient because we now use the same code for DCCP, that has sk_type SOCK_DCCP. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo	d8313f5ca2	[INET6]: Generalise tcp_v6_hash_connect Renaming it to inet6_hash_connect, making it possible to ditch dccp_v6_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo	a7f5e7f164	[INET]: Generalise tcp_v4_hash_connect Renaming it to inet_hash_connect, making it possible to ditch dccp_v4_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:55 -08:00
Arnaldo Carvalho de Melo	6d6ee43e0b	[TWSK]: Introduce struct timewait_sock_ops So that we can share several timewait sockets related functions and make the timewait mini sockets infrastructure closer to the request mini sockets one. Next changesets will take advantage of this, moving more code out of TCP and DCCP v4 and v6 to common infrastructure. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo	fc44b98053	[DCCP]: Use reqsk_free in dccp_v4_conn_request Now we have the destructor (dccp_v4_reqsk_destructor) in our request_sock_ops vtable. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:53 -08:00
Arnaldo Carvalho de Melo	3df80d9320	[DCCP]: Introduce DCCPv6 Still needs mucho polishing, specially in the checksum code, but works just fine, inet_diag/iproute2 and all 8) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:52 -08:00
Arnaldo Carvalho de Melo	f21e68caa0	[DCCP]: Prepare the AF agnostic core for the introduction of DCCPv6 Basically exports a similar set of functions as the one exported by the non-AF specific TCP code. In the process moved some non-AF specific code from dccp_v4_connect to dccp_connect_init and moved the checksum verification from dccp_invalid_packet to dccp_v4_rcv, so as to use it in dccp_v6_rcv too. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:50 -08:00
Arnaldo Carvalho de Melo	34ca686081	[DCCP]: Just rename dccp_v4_prot to dccp_prot To match TCP equivalent. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:49 -08:00
Arnaldo Carvalho de Melo	57cca05af1	[DCCP]: Introduce dccp_ipv4_af_ops And make the core DCCP code AF agnostic, just like TCP, now its time to work on net/dccp/ipv6.c, we are close to the end! Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:40 -08:00
Arnaldo Carvalho de Melo	971af18bbf	[IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:33 -08:00
Ian McDonald	4c7e689502	[DCCP]: Comment typo I hope to actually change this behaviour shortly but this will help anybody grepping code at present. Signed-off-by: Ian McDonald <imcdnzl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-21 19:02:39 -08:00
Patrick McHardy	a516b04950	[DCCP]: Add missing no_policy flag to struct net_protocol Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-20 21:16:13 -08:00
Jesper Juhl	a51482bde2	[NET]: kfree cleanup From: Jesper Juhl <jesper.juhl@gmail.com> This is the net/ part of the big kfree cleanup patch. Remove pointless checks for NULL prior to calling kfree() in net/. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br> Acked-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Andrew Morton <akpm@osdl.org>	2005-11-08 09:41:34 -08:00
Stephen Hemminger	6df716340d	[TCP/DCCP]: Randomize port selection This patch randomizes the port selected on bind() for connections to help with possible security attacks. It should also be faster in most cases because there is no need for a global lock. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-05 21:23:15 -02:00
Herbert Xu	edc9e81917	[DCCP]: Set socket owner iff packet is not data Here is a complimentary insurance policy for those feeling a bit insecure. You don't have to accept this. However, if you do, you can't blame me for it :) > 1) dccp_transmit_skb sets the owner for all packets except data packets. We can actually verify this by looking at pkt_type. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 22:30:02 -02:00
Herbert Xu	48918a4dbd	[DCCP]: Simplify skb_set_owner_w semantics While we're at it let's reorganise the set_owner_w calls a little so that: 1) dccp_transmit_skb sets the owner for all packets except data packets. 2) Add dccp_skb_entail to set owner for packets queued for retransmission. 3) Make dccp_transmit_skb static. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 19:26:17 -02:00
Al Viro	7d877f3bda	[PATCH] gfp_t: net/* Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-28 08:16:47 -07:00
Herbert Xu	49c5bfaffe	[DCCP]: Clear the IPCB area Turns out the problem has nothing to do with use-after-free or double-free. It's just that we're not clearing the CB area and DCCP unlike TCP uses a CB format that's incompatible with IP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ian McDonald <imcdnzl@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-20 14:49:59 -02:00
Herbert Xu	ffa29347df	[DCCP]: Make dccp_write_xmit always free the packet icmp_send doesn't use skb->sk at all so even if skb->sk has already been freed it can't cause crash there (it would've crashed somewhere else first, e.g., ip_queue_xmit). I found a double-free on an skb that could explain this though. dccp_sendmsg and dccp_write_xmit are a little confused as to what should free the packet when something goes wrong. Sometimes they both go for the ball and end up in each other's way. This patch makes dccp_write_xmit always free the packet no matter what. This makes sense since dccp_transmit_skb which in turn comes from the fact that ip_queue_xmit always frees the packet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-20 14:44:29 -02:00
Herbert Xu	fda0fd6c5b	[DCCP]: Use skb_set_owner_w in dccp_transmit_skb when skb->sk is NULL David S. Miller <davem@davemloft.net> wrote: > One thing you can probably do for this bug is to mark data packets > explicitly somehow, perhaps in the SKB control block DCCP already > uses for other data. Put some boolean in there, set it true for > data packets. Then change the test in dccp_transmit_skb() as > appropriate to test the boolean flag instead of "skb_cloned(skb)". I agree. In fact we already have that flag, it's called skb->sk. So here is patch to test that instead of skb_cloned(). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ian McDonald <imcdnzl@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-20 14:25:28 -02:00
Arnaldo Carvalho de Melo	2a9bc9bb4d	[DCCP]: Transition from PARTOPEN to OPEN when receiving DATA packets Noticed by Andrea Bittau, that provided a patch that was modified to not transition from RESPOND to OPEN when receiving DATA packets. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:25:00 -07:00
Arnaldo Carvalho de Melo	777b25a2fe	[CCID]: Check if ccid is NULL in the hc_[tr]x_exit functions For consistency with ccid_exit and to fix a bug when IP_DCCP_UNLOAD_HACK is enabled as the control sock is not associated to any CCID. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:24:20 -07:00
Al Viro	dd0fc66fb3	[PATCH] gfp flags annotations - part 1 - added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-08 15:00:57 -07:00
Eric Dumazet	81c3d5470e	[INET]: speedup inet (tcp/dccp) lookups Arnaldo and I agreed it could be applied now, because I have other pending patches depending on this one (Thank you Arnaldo) (The other important patch moves skc_refcnt in a separate cache line, so that the SMP/NUMA performance doesnt suffer from cache line ping pongs) 1) First some performance data : -------------------------------- tcp_v4_rcv() wastes a lot of time in __inet_lookup_established() The most time critical code is : sk_for_each(sk, node, &head->chain) { if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif)) goto hit; /* You sunk my battleship! / } The sk_for_each() does use prefetch() hints but only the begining of "struct sock" is prefetched. As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far away from the begining of "struct sock", it has to bring into CPU cache cold cache line. Each iteration has to use at least 2 cache lines. This can be problematic if some chains are very long. 2) The goal ----------- The idea I had is to change things so that INET_MATCH() may return FALSE in 99% of cases only using the data already in the CPU cache, using one cache line per iteration. 3) Description of the patch --------------------------- Adds a new 'unsigned int skc_hash' field in 'struct sock_common', filling a 32 bits hole on 64 bits platform. struct sock_common { unsigned short skc_family; volatile unsigned char skc_state; unsigned char skc_reuse; int skc_bound_dev_if; struct hlist_node skc_node; struct hlist_node skc_bind_node; atomic_t skc_refcnt; + unsigned int skc_hash; struct proto skc_prot; }; Store in this 32 bits field the full hash, not masked by (ehash_size - 1) Using this full hash as the first comparison done in INET_MATCH permits us immediatly skip the element without touching a second cache line in case of a miss. Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to sk_hash and tw_hash) already contains the slot number if we mask with (ehash_size - 1) File include/net/inet_hashtables.h 64 bits platforms : #define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) ((((__u64 )&(inet_sk(__sk)->daddr)))== (__cookie)) && \ ((((__u32 )&(inet_sk(__sk)->dport))) == (__ports)) && \ (!((__sk)->sk_bound_dev_if) \|\| ((__sk)->sk_bound_dev_if == (__dif)))) 32bits platforms: #define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) && \ (inet_sk(__sk)->daddr == (__saddr)) && \ (inet_sk(__sk)->rcv_saddr == (__daddr)) && \ (!((__sk)->sk_bound_dev_if) \|\| ((__sk)->sk_bound_dev_if == (__dif)))) - Adds a prefetch(head->chain.first) in __inet_lookup_established()/__tcp_v4_check_established() and __inet6_lookup_established()/__tcp_v6_check_established() and __dccp_v4_check_established() to bring into cache the first element of the list, before the {read\|write}_lock(&head->lock); Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:13:38 -07:00
Arnaldo Carvalho de Melo	88f964db6e	[DCCP]: Introduce CCID getsockopt for the CCIDs Allocation for the optnames is similar to the DCCP options, with a range for rx and tx half connection CCIDs. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-18 00:19:32 -07:00
Arnaldo Carvalho de Melo	561713cf47	[DCCP]: Don't use necessarily the same CCID for tx and rx Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-18 00:18:52 -07:00
Arnaldo Carvalho de Melo	65299d6c3c	[CCID3]: Introduce include/linux/tfrc.h Moving the TFRC sender and receiver variables to separate structs, so that we can copy these structs to userspace thru getsockopt, dccp_diag, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-18 00:18:32 -07:00
Arnaldo Carvalho de Melo	ae31c3399d	[DCCP]: Move the ack vector code to net/dccp/ackvec.[ch] Isolating it, that will be used when we introduce a CCID2 (TCP-Like) implementation. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-18 00:17:51 -07:00
Arnaldo Carvalho de Melo	67e6b62921	[DCCP]: Introduce DCCP_SOCKOPT_SERVICE As discussed in the dccp@vger mailing list: Now applications have to use setsockopt(DCCP_SOCKOPT_SERVICE, service[s]), prior to calling listen() and connect(). An array of unsigned ints can be passed meaning that the listening sock accepts connection requests for several services. With this we can ditch struct sockaddr_dccp and use only sockaddr_in (and sockaddr_in6 in the future). Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-16 16:58:40 -07:00
Arnaldo Carvalho de Melo	0c10c5d968	[DCCP]: More precisely set reset_code when sending RESET packets Moving the setting of DCCP_SKB_CB(skb)->dccpd_reset_code to the places where events happen that trigger sending a RESET packet. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-16 16:58:33 -07:00
Arnaldo Carvalho de Melo	2b80230a7f	[DCCP]: Handle SYNC packets in dccp_rcv_state_process Eliciting a SYNCACK in response, we were handling SYNC packets only in the DCCP_OPEN state, in dccp_rcv_established. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-13 19:05:08 -03:00
Arnaldo Carvalho de Melo	811265b8e8	[DCCP]: Check if already in the CLOSING state in dccp_rcv_closereq It is possible to receive more than one CLOSEREQ packet if the CLOSE packet sent in response is somehow lost, change the state to DCCP_CLOSING only on the first CLOSEREQ packet received. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-13 19:03:15 -03:00
Arnaldo Carvalho de Melo	59c2353dd0	[CCID3]: Listen socks doesn't have a private CCID block Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-12 14:16:58 -07:00
Arnaldo Carvalho de Melo	59d203f9e9	[CCID3] Cleanup ccid3 debug calls Also use some BUG_ON where appropriate and use LIMIT_NETDEBUG for the unlikely cases where we, at this stage, want to know about, that in my tests hasn't appeared in the radar. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 20:01:25 -03:00
Arnaldo Carvalho de Melo	dc19336c76	[DCCP] Only call the HC _exit() routines in dccp_v4_destroy_sock Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 19:59:26 -03:00
Arnaldo Carvalho de Melo	d7e0fb985c	[CCID3] Initialize ccid3hctx_t_ipi to 250ms To match more closely what is described in RFC 3448. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz>	2005-09-09 19:58:18 -03:00
Arnaldo Carvalho de Melo	59725dc2a2	[CCID3] Introduce ccid3_hc_[rt]x_sk() for overal consistency Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:40:58 -03:00
Arnaldo Carvalho de Melo	b0e567806d	[DCCP] Introduce dccp_timestamp To start the timestamps with 0.0ms, easing the integer maths in the CCIDs, this probably will be reworked to use the to be introduced struct timeval_offset infrastructure out of skb_get_timestamp, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:38:35 -03:00
Arnaldo Carvalho de Melo	954ee31f36	[CCID3] Initialize more fields in ccid3_hc_rx_init The initialization of ccid3hcrx_rtt to 5ms is just a bandaid, I'll continue auditing the CCID3 HC rx codebase to fix this properly, probably I'll add a feedback timer as suggested in the CCID3 draft. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:37:05 -03:00
Arnaldo Carvalho de Melo	b3a3077d96	[CCID3] Make the ccid3hcrx_rtt calc look more like the ccid3hctx_rtt one Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:34:10 -03:00
Arnaldo Carvalho de Melo	1a28599a2c	[CCID3] Use ELAPSED_TIME in the HC TX RTT estimation Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:32:56 -03:00
Arnaldo Carvalho de Melo	1c14ac0ae8	[DCCP] Give precedence to the biggest ELAPSED_TIME We can get this value in an TIMESTAMP_ECHO and/or in an ELAPSED_TIME option, if receiving both give precendence to the biggest one. In my tests they are very close if not equal at all times, so we may well think about removing the code in CCID3 that inserts this option and leaving this to the core, and perhaps even use just TIMESTAMP_ECHO including the elapsed time. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:32:01 -03:00
Arnaldo Carvalho de Melo	27ae543e6f	[CCID3] Calculate ccid3hcrx_x_recv using usecs_div Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:31:07 -03:00
Arnaldo Carvalho de Melo	507d37cf26	[CCID] Only call the HC insert_options methods when requested Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:30:07 -03:00
Arnaldo Carvalho de Melo	0ba7a3ba66	[CCID3] Avoid unsigned integer overflows in usecs_div Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-09-09 02:28:47 -03:00
Arnaldo Carvalho de Melo	c530cfb1ce	[CCID3]: Call sk->sk_write_space(sk) when receiving a feedback packet This makes the send rate calculations behave way more closely to what is specified, with the jitter previously seen on x and x_recv disappearing completely on non lossy setups. This resembles the tcp_data_snd_check code, that possibly we'll end up using in DCCP as well, perhaps moving this code to inet_connection_sock. For now I'm doing the simplest implementation tho. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:13:46 -07:00
Arnaldo Carvalho de Melo	a84ffe4303	[DCCP]: Introduce DCCP_SOCKOPT_PACKET_SIZE So that applications can set dccp_sock->dccps_pkt_size, that in turn is used in the CCID3 half connection init routines to set ccid3hc[tr]x_s and use it in its rate calculations. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:13:37 -07:00
Arnaldo Carvalho de Melo	29e4f8b3c3	[CCID3]: Move ccid3_hc_rx_detect_loss to packet_history.c Renaming it to dccp_rx_hist_detect_loss. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:13:17 -07:00
Arnaldo Carvalho de Melo	072ab6c68e	[CCID3]: Move ccid3_hc_rx_add_hist to packet_history.c Renaming it to dccp_rx_hist_add_packet. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:13:10 -07:00
Arnaldo Carvalho de Melo	36729c1a73	[DCCP]: Move the calc_X routines to dccp_tfrc_lib Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:12:47 -07:00
Arnaldo Carvalho de Melo	5cea0ddce5	[DCCP]: Introduce dccp_tfrc_lib module with net/dccp/ccids/lib/*.c I'll now take a look at the other proposed TFRC DCCP CCIDs to find more code that is now in ccid3.c and move to this module, the loss event rate, calc_X, etc most probably will be moved there. The main goal of these changes is to pave the way for the implementation of more TFRC based DCCP CCIDs and to shrink ccid3.c, reducing its complexity and helping in getting it rock solid. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:12:33 -07:00
Arnaldo Carvalho de Melo	4524b25954	[DCCP]: Just move packet_history.[ch] to net/dccp/ccids/lib/ Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:12:25 -07:00
Arnaldo Carvalho de Melo	ae6706f067	[CCID3]: Move the loss interval code to loss_interval.[ch] And put this into net/dccp/ccids/lib/, where packet_history.[ch] will also be moved and then we'll have a tfrc_lib.ko module that will be used by dccp_ccid3.ko and other CCIDs that are variations of TFRC (RFC 3448). Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:12:17 -07:00
Arnaldo Carvalho de Melo	cfc3c525a3	[CCID3]: Move the CCID3 defines to ccid3.h Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:12:10 -07:00
Arnaldo Carvalho de Melo	6b5e633ab1	[CCID3]: Introduce usecs_div To avoid open coding this all over the place. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:12:03 -07:00
Arnaldo Carvalho de Melo	b6ee3d4ada	[CCID3]: Reorganise timeval handling Introducing functions to add to or subtract from a timeval variable and renaming now_delta to timeval_new_delta that calls do_gettimeofday and then timeval_delta, that should be used when there are several deltas made relative to the current time or setting variables to it, so as to avoid calling do_gettimeofday excessively. I'm leaving these "timeval_" prefixed funcions internal to DCCP for a while till we're sure there are no subtle bugs in it. It also is more correct as it checks if the number of usecs added to or subtracted from a tv_usec field is more than 2 seconds. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:56 -07:00
Arnaldo Carvalho de Melo	1f2333aea3	[CCID3]: Reflow to mostly fit under 80 columns No code changes. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:46 -07:00
Arnaldo Carvalho de Melo	d6809c12b3	[DCCP]: Introduce dccp_wait_for_ccid and use it in dccp_write_xmit This is not quite what I think we should have long term but improves performance for now, so lets use it till we get CCID3 working well, then we can think about using sk_write_queue, perhaps using some ideas from Juwen Lai's old stack for 2.4.20. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:38 -07:00
Arnaldo Carvalho de Melo	75b3f207b4	[DCCP]: Make the Debug Menu available when DCCP is statically linked too Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:27 -07:00
Eric Dumazet	ba89966c19	[NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers This patch puts mostly read only data in the right section (read_mostly), to help sharing of these data between CPUS without memory ping pongs. On one of my production machine, tcp_statistics was sitting in a heavily modified cache line, so every SNMP update had to force a reload. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:18 -07:00
Arnaldo Carvalho de Melo	331968bd0c	[DCCP]: Initial dccp_poll implementation Tested with a patched netcat, no horror stories so far 8) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:05:45 -07:00
Arnaldo Carvalho de Melo	8efa544f9c	[DCCP]: Call the HC exit routines at dccp_v4_destroy_sock Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:05:38 -07:00
Arnaldo Carvalho de Melo	2babe1f6fe	[DCCP]: Introduce dccp_get_info And also hc_tx and hc_rx get_info functions for the CCIDs to fill in information that is specific to them. For now reusing struct tcp_info, later I'll try to figure out a better solution, for now its really nice to get this kind of info: [root@qemu ~]# ./ss -danemi State Recv-Q Send-Q Local Addr:Port Peer Addr:Port LISTEN 0 0 :5001 :* ino:628 sk:c1340040 mem:(r0,w0,f0,t0) cwnd:0 ssthresh:0 ESTAB 0 0 172.20.0.2:5001 172.20.0.1:32785 ino:629 sk:c13409a0 mem:(r0,w0,f0,t0) ts rto:1000 rtt:0.004/0 cwnd:0 ssthresh:0 rcv_rtt:61.377 This, for instance, shows that we're not congestion controlling ACKs, as the above output is in the ttcp receiving host, and ttcp is a one way app, i.e. the received never calls sendmsg, so ccid_hc_tx_send_packet is never called, so the TX half connection stays in TFRC_SSTATE_NO_SENT state and hctx_rtt is never calculated, stays with the value set in ccid3_hc_tx_init, 4us, as show above in milliseconds (0.004ms), upcoming patches will fix this. rcv_rtt seems sane tho, matching ping results :-) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:05:07 -07:00
Arnaldo Carvalho de Melo	4fded33b3e	[CCID3]: Calculate the RTT in the RX half connection Using TIMESTAMP_ECHO and ELAPSED_TIME options received. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:05:01 -07:00
Arnaldo Carvalho de Melo	d4b81ff705	[DCCP]: Export dccp_insert_option_timestamp to CCIDs And don't insert a TIMESTAMP option in all packets, leave the decision to the CCIDs. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:53 -07:00
Arnaldo Carvalho de Melo	012e13eac7	[CCID]: Make ccid_hc_[rt]x_exit accept NULL arguments Just like kfree, etc it will just not call the CCID exit routines when the private data area is set to NULL. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:48 -07:00
Arnaldo Carvalho de Melo	a4beb1b64f	[DCCP]: Send a DATAACK packet when we have a TIMESTAMP_ECHO pending Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:43 -07:00
Arnaldo Carvalho de Melo	20472af986	[DCCP]: Fix skb leak in dccp_sendmsg Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:38 -07:00
Arnaldo Carvalho de Melo	7ad07e7cf3	[DCCP]: Implement the CLOSING timer So that we retransmit CLOSE/CLOSEREQ packets till they elicit an answer or we hit a timeout. Most of the machinery uses TCP approaches, this code has to be polished & audited, but this is better than we had before. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:31 -07:00
David S. Miller	58e45131dc	[DCCP]: Fix printf format warnings on 64-bit. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:27 -07:00
Arnaldo Carvalho de Melo	24117727b7	[DCCP]: Fix ackno setting in SYNC/SYNCACK packets Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:03:52 -07:00
Arnaldo Carvalho de Melo	03ace394ac	[DCCP]: Fix the ACK and SEQ window variables settings This is from a first audit, more eyeballs are more than welcome. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:03:42 -07:00
Arnaldo Carvalho de Melo	a3054d48b9	[DCCP]: Give more info on Step 6 failure debug printk Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:03:33 -07:00
Arnaldo Carvalho de Melo	2807d4ffb0	[DCCP]: Fix seqno setting in dccp_v4_ctl_send_reset Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:03:25 -07:00
Arnaldo Carvalho de Melo	c68e64cfb5	[CCID3]: Reintroduce ccid3hctx_t_rto CCID3 keeps this variable in usecs, inet_connection_socks in jiffies, so to avoid Mars orbiter losses lets reintroduce ccid3hctx_t_rto 8) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:03:18 -07:00
Ian McDonald	1bc0986957	[DCCP]: Fix the timestamp options This changes timestamp, timestamp echo, and elapsed time to use units of 10 usecs as per DCCP spec. This has been tested to verify that times are correct. Also fixed up length and used hton/ntoh more. Still to add in later patches: - actually use elapsed time to adjust RTT (commented out as was prior to this patch) - send options at times more closely following the spec (content is now correct) Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:02:34 -07:00
Arnaldo Carvalho de Melo	c59eab4637	[DCCP]: Use LIMIT_NETDEBUG in some debugging printks Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:02:26 -07:00
Arnaldo Carvalho de Melo	5480855bfb	[DCCP]: Set dccp_ctl_socket to NULL in dccp_ctl_sock_exit Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:02:22 -07:00
Ian McDonald	b1c9fe7b81	[DCCP]: Fix elapsed time option as per section 13.2 of spec v11 The elapsed time can be two bytes or four bytes only. Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:02:03 -07:00
Arnaldo Carvalho de Melo	e92ae93a8a	[DCCP]: Send SYNCACK packets in response to SYNC packets Also fix step 6 when receiving SYNC or SYNCACK packets, i.e. we were not using the updated swl. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:01:50 -07:00
Patrick McHardy	a10cedd4b9	[DCCP]: Fix compiler warnings may be a false warning if there always is something on ccid3hcrx_hist: net/dccp/ccids/ccid3.c: In function 'ccid3_hc_rx_packet_recv': net/dccp/ccids/ccid3.c:1634: warning: 'tstamp.tv_usec' may be used uninitialized in this function net/dccp/ccids/ccid3.c:1634: warning: 'tstamp.tv_sec' may be used uninitialized in this function const on inline functions doesn't have any effect: net/dccp/dccp.h:64: warning: type qualifiers ignored on function return type net/dccp/dccp.h:70: warning: type qualifiers ignored on function return type net/dccp/dccp.h:76: warning: type qualifiers ignored on function return type Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:00:12 -07:00
Arnaldo Carvalho de Melo	a1d3a35518	[DCCP]: Fix sparse warnings Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:59 -07:00
Arnaldo Carvalho de Melo	8649b0d416	[DCCP]: Fix RESET handling in dccp_rcv_state_process To avoid holding TIMEWAIT state for sockets in the LISTEN state. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:50 -07:00
Arnaldo Carvalho de Melo	725ba8eee3	[DCCP]: Introduce the DCCP Kernel hacking menu Only available if CONFIG_DEBUG_KERNEL is enabled in the "Kernel Hacking" Menu. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:43 -07:00
Arnaldo Carvalho de Melo	531669a0a9	[DCCP]: Rewrite dccp_sendmsg to be more like UDP Based on discussions with Nishida-san. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:34 -07:00
Arnaldo Carvalho de Melo	7690af3fff	[DCCP]: Just reflow the source code to fit in 80 columns Andrew Morton should be happy now 8) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:26 -07:00
Arnaldo Carvalho de Melo	c173437669	[PACKET_HISTORY]: Add dccphtx_rtt and rename the win_count fields As requested by Ian. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:17 -07:00
Arnaldo Carvalho de Melo	17b085eace	[INET_DIAG]: Move the tcp_diag interface to the proper place With this the previous setup is back, i.e. tcp_diag can be built as a module, as dccp_diag and both share the infrastructure available in inet_diag. If one selects CONFIG_INET_DIAG as module CONFIG_INET_TCP_DIAG will also be built as a module, as will CONFIG_INET_DCCP_DIAG, if CONFIG_IP_DCCP was selected static or as a module, if CONFIG_INET_DIAG is y, being statically linked CONFIG_INET_TCP_DIAG will follow suit and CONFIG_INET_DCCP_DIAG will be built in the same manner as CONFIG_IP_DCCP. Now to aim at UDP, converting it to use inet_hashinfo, so that we can use iproute2 for UDP sockets as well. Ah, just to show an example of this new infrastructure working for DCCP :-) [root@qemu ~]# ./ss -dane State Recv-Q Send-Q Local Address:Port Peer Address:Port LISTEN 0 0 :5001 :* ino:942 sk:cfd503a0 ESTAB 0 0 127.0.0.1:5001 127.0.0.1:32770 ino:943 sk:cfd50a60 ESTAB 0 0 127.0.0.1:32770 127.0.0.1:5001 ino:947 sk:cfd50700 TIME-WAIT 0 0 127.0.0.1:32769 127.0.0.1:5001 timer:(timewait,3.430ms,0) ino:0 sk:cf209620 Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:54 -07:00
Arnaldo Carvalho de Melo	a8c2190ee7	[INET_DIAG]: Rename tcp_diag.[ch] to inet_diag.[ch] Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put transitioanlly in inet_diag.c. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:48 -07:00
Arnaldo Carvalho de Melo	73c1f4a033	[TCPDIAG]: Just rename everything to inet_diag Next changeset will rename tcp_diag.[ch] to inet_diag.[ch]. I'm taking this longer route so as to easy review, making clear the changes made all along the way. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:44 -07:00
Arnaldo Carvalho de Melo	4f5736c4c7	[TCPDIAG]: Introduce inet_diag_{register,unregister} Next changeset will rename tcp_diag to inet_diag and move the tcp_diag code out of it and into a new tcp_diag.c, similar to the net/dccp/diag.c introduced in this changeset, completing the transition to a generic inet_diag infrastructure. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:38 -07:00
Arnaldo Carvalho de Melo	cef07fd602	[CCID3]: Ditch USEC_IN_SEC as time.h has USEC_PER_SEC That is equivalent, no need to have a private one. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-08-29 15:56:33 -07:00
Arnaldo Carvalho de Melo	8c60f3fab5	[CCID3]: Separate most of the packet history code This also changes the list_for_each_entry_safe_continue behaviour to match its kerneldoc comment, that is, to start after the pos passed. Also adds several helper functions from previously open coded fragments, making the code more clear. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-08-29 15:56:28 -07:00
Arnaldo Carvalho de Melo	540722ffc3	[TCPDIAG]: Implement cheapest way of supporting DCCPDIAG_GETSOCK With ugly ifdefs, etc, but this actually: 1. keeps the existing ABI, i.e. no need to recompile the iproute2 utilities if not interested in DCCP. 2. Provides all the tcp_diag functionality in DCCP, with just a small patch that makes iproute2 support DCCP. Of course I'll get this cleaned-up in time, but for now I think its OK to be this way to quickly get this functionality. iproute2-ss050808 patch at: http://vger.kernel.org/~acme/iproute2-ss050808.dccp.patch Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:56:23 -07:00
Patrick McHardy	64ce207306	[NET]: Make NETDEBUG pure printk wrappers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:56:08 -07:00
Arnaldo Carvalho de Melo	64cf1e5d8b	[DCCP]: Finish the TIMEWAIT minisock support Using most of the infrastructure TCP uses, with a dccp_death_row, etc. As per my current interpretation of the draft what we have with this changeset seems to be all we need (or very close to it 8)). Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:56:03 -07:00
Arnaldo Carvalho de Melo	0b4e03bf0b	[DCCP]: Initialize icsk_rto in dccp_v4_init_sock Fixes nasty bug related to the retransmit timer (yeah, DCCP does retransmits) firing too early. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:55:43 -07:00
Arnaldo Carvalho de Melo	27258ee54f	[DCCP]: Introduce dccp_write_xmit from code in dccp_sendmsg This way it gets closer to the TCP flow, where congestion window checks are done, it seems we can map ccid_hc_tx_send_packet in dccp_write_xmit to tcp_snd_wnd_test in tcp_write_xmit, a CCID2 decision should just fit in here as well... Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:55:18 -07:00
David S. Miller	f6ccf55419	[DCCP]: Fix u64 printf format warnings. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:54:34 -07:00
Arnaldo Carvalho de Melo	bb97d31f51	[INET]: Make inet_create try to load protocol modules Syntax is net-pf-PROTOCOL_FAMILY-PROTOCOL-SOCK_TYPE and if this fails net-pf-PROTOCOL_FAMILY-PROTOCOL. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:50:54 -07:00
Arnaldo Carvalho de Melo	757f612e09	[CCID3]: Reenable list_for_each_entry_safe_continue usage Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:50:08 -07:00
Yoshifumi Nishida	95b81ef794	[DCCP]: Fix checksum routines Signed-off-by: Yoshifumi Nishida <nishida@csl.sony.co.jp> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:49:55 -07:00
Arnaldo Carvalho de Melo	a019d6fe2b	[ICSK]: Move generalised functions from tcp to inet_connection_sock This also improves reqsk_queue_prune and renames it to inet_csk_reqsk_queue_prune, as it deals with both inet_connection_sock and inet_request_sock objects, not just with request_sock ones thus belonging to inet_request_sock. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:49:50 -07:00
Arnaldo Carvalho de Melo	7c657876b6	[DCCP]: Initial implementation Development to this point was done on a subversion repository at: http://oops.ghostprotocols.net:81/cgi-bin/viewcvs.cgi/dccp-2.6/ This repository will be kept at this site for the foreseable future, so that interested parties can see the history of this code, attributions, etc. If I ever decide to take this offline I'll provide the full history at some other suitable place. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:49:46 -07:00

... 15 16 17 18 19 ...

1025 Commits