linux_old1

Commit Graph

Author	SHA1	Message	Date
David S. Miller	87a50699cb	rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo(). Nobody provides non-zero values any longer. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:13 -07:00
David S. Miller	3e12939a2a	inet: Kill FLOWI_FLAG_PRECOW_METRICS. No longer needed. TCP writes metrics, but now in it's own special cache that does not dirty the route metrics. Therefore there is no longer any reason to pre-cow metrics in this way. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:12 -07:00
David S. Miller	1d861aa4b3	inet: Minimize use of cached route inetpeer. Only use it in the absolutely required cases: 1) COW'ing metrics 2) ipv4 PMTU 3) ipv4 redirects Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:11 -07:00
David S. Miller	16d1839907	inet: Remove ->get_peer() method. No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:10 -07:00
David S. Miller	b6242b9b45	tcp: Remove tw->tw_peer No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:09 -07:00
David S. Miller	81166dd6fa	tcp: Move timestamps from inetpeer to metrics cache. With help from Lin Ming. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:08 -07:00
David S. Miller	794785bf12	net: Don't report route RTT metric value in cache dumps. We don't maintain it dynamically any longer, so reporting it would be extremely misleading. Report zero instead. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:06 -07:00
David S. Miller	51c5d0c4b1	tcp: Maintain dynamic metrics in local cache. Maintain a local hash table of TCP dynamic metrics blobs. Computed TCP metrics are no longer maintained in the route metrics. The table uses RCU and an extremely simple hash so that it has low latency and low overhead. A simple hash is legitimate because we only make metrics blobs for fully established connections. Some tweaking of the default hash table sizes, metric timeouts, and the hash chain length limit certainly could use some tweaking. But the basic design seems sound. With help from Eric Dumazet and Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:39:57 -07:00
David S. Miller	ab92bb2f67	tcp: Abstract back handling peer aliveness test into helper function. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 20:33:49 -07:00
David S. Miller	4aabd8ef8c	tcp: Move dynamnic metrics handling into seperate file. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 20:31:36 -07:00
David S. Miller	d3a5ea6e21	Merge branch 'master' of git://1984.lsi.us.es/nf-next	2012-07-07 16:18:50 -07:00
David S. Miller	f4530fa574	ipv4: Avoid overhead when no custom FIB rules are installed. If the user hasn't actually installed any custom rules, or fiddled with the default ones, don't go through the whole FIB rules layer. It's just pure overhead. Instead do what we do with CONFIG_IP_MULTIPLE_TABLES disabled, check the individual tables by hand, one by one. Also, move fib_num_tclassid_users into the ipv4 network namespace. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 22:13:13 -07:00
Eric Dumazet	bf5e53e371	ipv4: defer fib_compute_spec_dst() call ip_options_compile() can avoid calling fib_compute_spec_dst() by default, and perform the call only if needed. David suggested to add a helper to make the call only once. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 03:03:32 -07:00
David S. Miller	f187bc6efb	ipv4: No need to set generic neighbour pointer. Nobody reads it any longer. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 02:41:59 -07:00
David S. Miller	f894cbf847	net: Add optional SKB arg to dst_ops->neigh_lookup(). Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:04:01 -07:00
David S. Miller	5110effee8	net: Do delayed neigh confirmation. When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:03:06 -07:00
David S. Miller	3c521f2ba9	ipv4: Don't report neigh uptodate state in rtcache procfs. Soon routes will not have a cached neigh attached, nor will we be able to necessarily go directly to a neigh from an arbitrary route. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:02:31 -07:00
David S. Miller	a263b30936	ipv4: Make neigh lookups directly in output packet path. Do not use the dst cached neigh, we'll be getting rid of that. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:02:12 -07:00
David S. Miller	11604721a3	ipv4: Fix crashes in ip_options_compile(). The spec_dst uses should be guarded by skb_rtable() being non-NULL not just the SKB being non-null. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-04 16:13:17 -07:00
Pablo Neira Ayuso	08911475d1	netfilter: nf_conntrack: generalize nf_ct_l4proto_net This patch generalizes nf_ct_l4proto_net by splitting it into chunks and moving the corresponding protocol part to where it really belongs to. To clarify, note that we follow two different approaches to support per-net depending if it's built-in or run-time loadable protocol tracker. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>	2012-07-04 19:37:22 +02:00
Pablo Neira Ayuso	a31f2d17b3	netlink: add netlink_kernel_cfg parameter to netlink_kernel_create This patch adds the following structure: struct netlink_kernel_cfg { unsigned int groups; void (input)(struct sk_buff skb); struct mutex *cb_mutex; }; That can be passed to netlink_kernel_create to set optional configurations for netlink kernel sockets. I've populated this structure by looking for NULL and zero parameters at the existing code. The remaining parameters that always need to be set are still left in the original interface. That includes optional parameters for the netlink socket creation. This allows easy extensibility of this interface in the future. This patch also adapts all callers to use this new interface. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-29 16:46:02 -07:00
David S. Miller	7a9bc9b81a	ipv4: Elide fib_validate_source() completely when possible. If rpfilter is off (or the SKB has an IPSEC path) and there are not tclassid users, we don't have to do anything at all when fib_validate_source() is invoked besides setting the itag to zero. We monitor tclassid uses with a counter (modified only under RTNL and marked __read_mostly) and we protect the fib_validate_source() real work with a test against this counter and whether rpfilter is to be done. Having a way to know whether we need no tclassid processing or not also opens the door for future optimized rpfilter algorithms that do not perform full FIB lookups. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-29 01:36:36 -07:00
David S. Miller	3085a4b7d3	ipv4: Remove extraneous assignment of dst->tclassid. We already set it several lines above. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 22:17:39 -07:00
David S. Miller	9e56e3800e	ipv4: Adjust in_dev handling in fib_validate_source() Checking for in_dev being NULL is pointless. In fact, all of our callers have in_dev precomputed already, so just pass it in and remove the NULL checking. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 18:54:02 -07:00
David S. Miller	a207a4b2e8	ipv4: Fix bugs in fib_compute_spec_dst(). Based upon feedback from Julian Anastasov. 1) Use route flags to determine multicast/broadcast, not the packet flags. 2) Leave saddr unspecified in flow key. 3) Adjust how we invoke inet_select_addr(). Pass ip_hdr(skb)->saddr as second arg, and if it was zeronet use link scope. 4) Use loopback as input interface in flow key. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 18:33:58 -07:00
David S. Miller	41347dcdd8	ipv4: Kill rt->rt_spec_dst, no longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 04:05:27 -07:00
David S. Miller	35ebf65e85	ipv4: Create and use fib_compute_spec_dst() helper. The specific destination is the host we direct unicast replies to. Usually this is the original packet source address, but if we are responding to a multicast or broadcast packet we have to use something different. Specifically we must use the source address we would use if we were to send a packet to the unicast source of the original packet. The routing cache precomputes this value, but we want to remove that precomputation because it creates a hard dependency on the expensive rpfilter source address validation which we'd like to make cheaper. There are only three places where this matters: 1) ICMP replies. 2) pktinfo CMSG 3) IP options Now there will be no real users of rt->rt_spec_dst and we can simply remove it altogether. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 03:59:11 -07:00
David S. Miller	70e7341673	ipv4: Show that ip_send_reply() is purely unicast routine. Rename it to ip_send_unicast_reply() and add explicit 'saddr' argument. This removed one of the few users of rt->rt_spec_dst. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 03:21:41 -07:00
David S. Miller	160eb5a6b1	ipv4: Kill early demux method return value. It's completely unnecessary. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 22:01:22 -07:00
David S. Miller	c10237e077	Revert "ipv4: tcp: dont cache unconfirmed intput dst" This reverts commit `c074da2810`. This change has several unwanted side effects: 1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll thus never create a real cached route. 2) All TCP traffic will use DST_NOCACHE and never use the routing cache at all. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 17:05:06 -07:00
Eric Dumazet	22911fc581	net: skb_free_datagram_locked() doesnt drop all packets dropwatch wrongly diagnose all received UDP packets as drops. This patch removes trace_kfree_skb() done in skb_free_datagram_locked(). Locations calling skb_free_datagram_locked() should do it on their own. As a result, drops are accounted on the right function. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 15:40:57 -07:00
Thomas Graf	92a395e52f	ipmr: Do not use RTA_PUT() macros Also fix a needless skb tailroom check for a 4 bytes area after after each rtnexthop block. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 15:36:44 -07:00
Thomas Graf	6e277ed59a	inet_diag: Do not use RTA_PUT() macros Also, no need to trim on nlmsg_put() failure, nothing has been added yet. We also want to use nlmsg_end(), nlmsg_new() and nlmsg_free(). Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 15:36:44 -07:00
Eric Dumazet	c074da2810	ipv4: tcp: dont cache unconfirmed intput dst DDOS synflood attacks hit badly IP route cache. On typical machines, this cache is allowed to hold up to 8 Millions dst entries, 256 bytes for each, for a total of 2GB of memory. rt_garbage_collect() triggers and tries to cleanup things. Eventually route cache is disabled but machine is under fire and might OOM and crash. This patch exploits the new TCP early demux, to set a nocache boolean in case incoming TCP frame is for a not yet ESTABLISHED or TIMEWAIT socket. This 'nocache' boolean is then used in case dst entry is not found in route cache, to create an unhashed dst entry (DST_NOCACHE) SYN-cookie-ACK sent use a similar mechanism (ipv4: tcp: dont cache output dst for syncookies), so after this patch, a machine is able to absorb a DDOS synflood attack without polluting its IP route cache. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 15:34:24 -07:00
Gao feng	a9082b45ad	netfilter: nf_ct_icmp: add icmp_kmemdup[_compat]_sysctl_table function Split sysctl function into smaller chucks to cleanup code and prepare patches to reduce ifdef pollution. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-27 19:14:55 +02:00
Gao feng	f1caad2745	netfilter: nf_conntrack: prepare l4proto->init_net cleanup l4proto->init contain quite redundant code. We can simplify this by adding a new parameter l3proto. This patch prepares that code simplification. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-27 18:31:14 +02:00
David S. Miller	c2bd4baf41	netfilter: ipt_ULOG: Move away from NLMSG_PUT(). And use nlmsg_data() while we're here too. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 21:30:49 -07:00
David S. Miller	d106352d9f	inet_diag: Move away from NLMSG_PUT(). And use nlmsg_data() while we're here too, and remove useless casts. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 21:28:54 -07:00
David S. Miller	251da41301	ipv4: Cache ip_error() routes even when not forwarding. And account for the fact that, when we are not forwarding, we should bump statistic counters rather than emit an ICMP response. RP-filter rejected lookups are still not cached. Since -EHOSTUNREACH and -ENETUNREACH can now no longer be seen in ip_rcv_finish(), remove those checks. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 16:27:09 -07:00
David S. Miller	df67e6c9a6	ipv4: Remove unnecessary code from rt_check_expire(). IPv4 routing cache entries no longer use dst->expires, because the metrics, PMTU, and redirect information are stored in the inetpeer cache. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 00:10:09 -07:00
Vijay Subramanian	7011d0851b	tcp: Fix bug in tcp socket early demux The dest port for the call to __inet_lookup_established() in TCP early demux code is passed with the wrong endian-ness. This causes the lookup to fail leading to early demux not being used. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-23 23:22:38 -07:00
David S. Miller	0b4a9e1a59	Merge branch 'master' of git://1984.lsi.us.es/nf-next Pablo says: ==================== The following four patches provide Netfilter fixes for the cthelper infrastructure that was recently merged mainstream, they are: * two fixes for compilation breakage with two different configurations: - CONFIG_NF_NAT=m and CONFIG_NF_CT_NETLINK=y - NF_CONNTRACK_EVENTS=n and CONFIG_NETFILTER_NETLINK_QUEUE_CT=y * two fixes for sparse warnings. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-23 17:10:10 -07:00
Eric Dumazet	7586eceb0a	ipv4: tcp: dont cache output dst for syncookies Don't cache output dst for syncookies, as this adds pressure on IP route cache and rcu subsystem for no gain. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-22 21:47:33 -07:00
Alexander Duyck	6648bd7e0e	ipv4: Add sysctl knob to control early socket demux This change is meant to add a control for disabling early socket demux. The main motivation behind this patch is to provide an option to disable the feature as it adds an additional cost to routing that reduces overall throughput by up to 5%. For example one of my systems went from 12.1Mpps to 11.6 after the early socket demux was added. It looks like the reason for the regression is that we are now having to perform two lookups, first the one for an established socket, and then the one for the routing table. By adding this patch and toggling the value for ip_early_demux to 0 I am able to get back to the 12.1Mpps I was previously seeing. [ Move local variables in ip_rcv_finish() down into the basic block in which they are actually used. -DaveM ] Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-22 17:11:13 -07:00
Pablo Neira Ayuso	d584a61a93	netfilter: nfnetlink_queue: fix compilation with CONFIG_NF_NAT=m and CONFIG_NF_CT_NETLINK=y LD init/built-in.o net/built-in.o:(.data+0x4408): undefined reference to `nf_nat_tcp_seq_adjust' make: *** [vmlinux] Error 1 This patch adds a new pointer hook (nfq_ct_nat_hook) similar to other existing in Netfilter to solve our complicated configuration dependencies. Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-22 02:49:52 +02:00
David S. Miller	fd62e09b94	tcp: Validate route interface in early demux. Otherwise we might violate reverse path filtering. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-21 14:58:10 -07:00
Eric Dumazet	da55737467	inetpeer: inetpeer_invalidate_tree() cleanup No need to use cmpxchg() in inetpeer_invalidate_tree() since we hold base lock. Also use correct rcu annotations to remove sparse errors (CONFIG_SPARSE_RCU_POINTER=y) net/ipv4/inetpeer.c:144:19: error: incompatible types in comparison expression (different address spaces) net/ipv4/inetpeer.c:149:20: error: incompatible types in comparison expression (different address spaces) net/ipv4/inetpeer.c:595:10: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-20 14:38:55 -07:00
David S. Miller	41063e9dd1	ipv4: Early TCP socket demux. Input packet processing for local sockets involves two major demuxes. One for the route and one for the socket. But we can optimize this down to one demux for certain kinds of local sockets. Currently we only do this for established TCP sockets, but it could at least in theory be expanded to other kinds of connections. If a TCP socket is established then it's identity is fully specified. This means that whatever input route was used during the three-way handshake must work equally well for the rest of the connection since the keys will not change. Once we move to established state, we cache the receive packet's input route to use later. Like the existing cached route in sk->sk_dst_cache used for output packets, we have to check for route invalidations using dst->obsolete and dst->ops->check(). Early demux occurs outside of a socket locked section, so when a route invalidation occurs we defer the fixup of sk->sk_rx_dst until we are actually inside of established state packet processing and thus have the socket locked. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-19 21:22:05 -07:00
David S. Miller	f9242b6b28	inet: Sanitize inet{,6} protocol demux. Don't pretend that inet_protos[] and inet6_protos[] are hashes, thay are just a straight arrays. Remove all unnecessary hash masking. Document MAX_INET_PROTOS. Use RAW_HTABLE_SIZE when appropriate. Reported-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-19 18:56:21 -07:00
David S. Miller	6fac262526	ipv4: Cap ADVMSS metric in the FIB rather than the routing cache. It makes no sense to execute this limit test every time we create a routing cache entry. We can't simply error out on these things since we've silently accepted and truncated them forever. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-17 19:47:34 -07:00

1 2 3 4 5 ...

4835 Commits