Commit Graph

11089 Commits

Author SHA1 Message Date
Gerrit Renker dd9c0e363c dccp: Deprecate Ack Ratio sysctl
This patch deprecates the Ack Ratio sysctl, since
 * Ack Ratio is entirely ignored by CCID-3 and CCID-4,
 * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
 * even if it would work in CCID-2, there is no point for a user to change it:
   - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
   - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts
     (since waiting for Acks which will never arrive in this window),
   - cwnd is not a user-configurable value.

The only reasonable place for Ack Ratio is to print it for debugging. It is
planned to do this later on, as part of e.g. dccp_probe.

With this patch Ack Ratio is now under full control of feature negotiation:
 * Ack Ratio is resolved as a dependency of the selected CCID;
 * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
   the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
   Ratio 2 for both endpoints";
 * what happens then is part of another patch set, since it concerns the
   dynamic update of Ack Ratio while the connection is in full flight.

Thanks to Tomasz Grobelny for discussion leading up to this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 22:55:08 -08:00
Gerrit Renker 2945055984 dccp: Feature negotiation for minimum-checksum-coverage
This provides feature negotiation for server minimum checksum coverage
which so far has been missing.

Since sender/receiver coverage values range only from 0...15, their
type has also been reduced in size from u16 to u4.

Feature-negotiation options are now generated for both sender and receiver
coverage, i.e. when the peer has `forgotten' to enable partial coverage
then feature negotiation will automatically enable (negotiate) the partial
coverage value for this connection.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 22:53:48 -08:00
Gerrit Renker 49aebc66d6 dccp: Deprecate old setsockopt framework
The previous setsockopt interface, which passed socket options via struct
dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
ugly code since the old approach did not distinguish between NN and SP values.

This patch removes the old setsockopt interface and replaces it with two new
functions to register NN/SP values for feature negotiation. 
These are essentially wrappers around the internal __feat_register functions,
with checking added to avoid

 * wrong usage (type);
 * changing values while the connection is in progress.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 22:51:23 -08:00
Gerrit Renker 0c1168398e dccp: Mechanism to resolve CCID dependencies
This adds a hook to resolve features whose value depends on the choice of
CCID. It is done at the server since it can only be done after the CCID
values have been negotiated; i.e. the client will add its CCID preference
list on the Change options sent in the Request, which will be reconciled
with the local preference list of the server.

The concept is documented on
http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\
				implementation_notes.html#ccid_dependencies

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 22:49:52 -08:00
Alexey Dobriyan 908cd2dabb net: use %pF for /proc/net/ptype
Technically, patch changes format for modules, but I think nobody cares.

	-86dd          :ipv6:ipv6_rcv+0x0
	+86dd          ipv6_rcv+0x0/0x400 [ipv6]

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 19:50:35 -08:00
Rémi Denis-Courmont ebfe92ca65 Phonet: refuse to send bigger than MTU packets
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 19:48:49 -08:00
Eric Dumazet 3ab5aee7fe net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls
RCU was added to UDP lookups, using a fast infrastructure :
- sockets kmem_cache use SLAB_DESTROY_BY_RCU and dont pay the
  price of call_rcu() at freeing time.
- hlist_nulls permits to use few memory barriers.

This patch uses same infrastructure for TCP/DCCP established
and timewait sockets.

Thanks to SLAB_DESTROY_BY_RCU, no slowdown for applications
using short lived TCP connections. A followup patch, converting
rwlocks to spinlocks will even speedup this case.

__inet_lookup_established() is pretty fast now we dont have to
dirty a contended cache line (read_lock/read_unlock)

Only established and timewait hashtable are converted to RCU
(bind table and listen table are still using traditional locking)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 19:40:17 -08:00
Eric Dumazet 88ab1932ea udp: Use hlist_nulls in UDP RCU code
This is a straightforward patch, using hlist_nulls infrastructure.

RCUification already done on UDP two weeks ago.

Using hlist_nulls permits us to avoid some memory barriers, both
at lookup time and delete time.

Patch is large because it adds new macros to include/net/sock.h.
These macros will be used by TCP & DCCP in next patch.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 19:39:21 -08:00
Balazs Scheidler e8b2dfe9b4 TPROXY: implemented IP_RECVORIGDSTADDR socket option
In case UDP traffic is redirected to a local UDP socket,
the originally addressed destination address/port
cannot be recovered with the in-kernel tproxy.

This patch adds an IP_RECVORIGDSTADDR sockopt that enables
a IP_ORIGDSTADDR ancillary message in recvmsg(). This
ancillary message contains the original destination address/port
of the packet being received.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 19:32:39 -08:00
Ben Greear 8164f1b797 ipv4: Fix ARP behavior with many mac-vlans
Ben Greear wrote:
> I have 500 mac-vlans on a system talking to 500 other
> mac-vlans.  My problem is that the arp-table gets extremely
> huge because every time an arp-request comes in on all mac-vlans,
> a stale arp entry is added for each mac-vlan.  I have filtering
> turned on, but that doesn't help because the neigh_event_ns call
> below will cause a stale neighbor entry to be created regardless
> of whether a replay will be sent or not.
> Maybe the neigh_event code should be below the checks for dont_send,
> and only create check neigh_event_ns if we are !dont_send?

The attached patch makes it work much better for me.  The patch
will cause the code to NOT create a stale neighbor entry if we
are not going to respond to the ARP request.  The old code
*would* create a stale entry even if we are not going to respond.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16 19:19:38 -08:00
Pavel Emelyanov 5421ae0153 scm: fix scm_fp_list->list initialization made in wrong place
This is the next page of the scm recursion story (the commit 
f8d570a4 net: Fix recursive descent in __scm_destroy()).

In function scm_fp_dup(), the INIT_LIST_HEAD(&fpl->list) of newly
created fpl is done *before* the subsequent memcpy from the old 
structure and thus the freshly initialized list is overwritten.

But that's OK, since this initialization is not required at all,
since the fpl->list is list_add-ed at the destruction time in any
case (and is unused in other code), so I propose to drop both
initializations, rather than moving it after the memcpy.

Please, correct me if I miss something significant.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-14 14:51:45 -08:00
Eric Dumazet ef711cf1d1 net: speedup dst_release()
During tbench/oprofile sessions, I found that dst_release() was in third position.

CPU: Core 2, speed 2999.68 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        symbol name
483726    9.0185  __copy_user_zeroing_intel
191466    3.5697  __copy_user_intel
185475    3.4580  dst_release
175114    3.2648  ip_queue_xmit
153447    2.8608  tcp_sendmsg
108775    2.0280  tcp_recvmsg
102659    1.9140  sysenter_past_esp
101450    1.8914  tcp_current_mss
95067     1.7724  __copy_from_user_ll
86531     1.6133  tcp_transmit_skb

Of course, all CPUS fight on the dst_entry associated with 127.0.0.1 

Instead of first checking the refcount value, then decrement it,
we use atomic_dec_return() to help CPU to make the right memory transaction
(ie getting the cache line in exclusive mode)

dst_release() is now at the fifth position, and tbench a litle bit faster ;)

CPU: Core 2, speed 3000.1 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        symbol name
647107    8.8072  __copy_user_zeroing_intel
258840    3.5229  ip_queue_xmit
258302    3.5155  __copy_user_intel
209629    2.8531  tcp_sendmsg
165632    2.2543  dst_release
149232    2.0311  tcp_current_mss
147821    2.0119  tcp_recvmsg
137893    1.8767  sysenter_past_esp
127473    1.7349  __copy_from_user_ll
121308    1.6510  ip_finish_output
118510    1.6129  tcp_transmit_skb
109295    1.4875  tcp_v4_rcv

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-14 00:53:54 -08:00
Ingo Molnar e8f6fbf62d lockdep: include/linux/lockdep.h - fix warning in net/bluetooth/af_bluetooth.c
fix this warning:

  net/bluetooth/af_bluetooth.c:60: warning: ‘bt_key_strings’ defined but not used
  net/bluetooth/af_bluetooth.c:71: warning: ‘bt_slock_key_strings’ defined but not used

this is a lockdep macro problem in the !LOCKDEP case.

We cannot convert it to an inline because the macro works on multiple types,
but we can mark the parameter used.

[ also clean up a misaligned tab in sock_lock_init_class_and_name() ]

[ also remove #ifdefs from around af_family_clock_key strings - which
  were certainly added to get rid of the ugly build warnings. ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-13 23:19:10 -08:00
Jarek Poplawski f30ab418a1 pkt_sched: Remove qdisc->ops->requeue() etc.
After implementing qdisc->ops->peek() and changing sch_netem into
classless qdisc there are no more qdisc->ops->requeue() users. This
patch removes this method with its wrappers (qdisc_requeue()), and
also unused qdisc->requeue structure. There are a few minor fixes of
warnings (htb_enqueue()) and comments btw.

The idea to kill ->requeue() and a similar patch were first developed
by David S. Miller.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-13 22:56:30 -08:00
Wang Chen 524ad0a791 netdevice: safe convert to netdev_priv() #part-4
We have some reasons to kill netdev->priv:
1. netdev->priv is equal to netdev_priv().
2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously
   netdev_priv() is more flexible than netdev->priv.
But we cann't kill netdev->priv, because so many drivers reference to it
directly.

This patch is a safe convert for netdev->priv to netdev_priv(netdev).
Since all of the netdev->priv is only for read.
But it is too big to be sent in one mail.
I split it to 4 parts and make every part smaller than 100,000 bytes,
which is max size allowed by vger.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 23:39:10 -08:00
Randy Dunlap 1fa989e80a 9p: restrict RDMA usage
Make 9p's RDMA option depend on INET since it uses Infiniband rdma_*
functions and that code depends on INET.  Otherwise 9p can try to
use symbols which don't exist.

ERROR: "rdma_destroy_id" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_connect" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_create_id" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_create_qp" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_resolve_route" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_disconnect" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_resolve_addr" [net/9p/9pnet_rdma.ko] undefined!

I used an if/endif block so that the menu items would remain
presented together.

Also correct an article adjective.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 23:33:57 -08:00
Arnaud Ebalard 7a12122c7a net: Remove unused parameter of xfrm_gen_index()
In commit 2518c7c2b3 ("[XFRM]: Hash
policies when non-prefixed."), the last use of xfrm_gen_policy() first
argument was removed, but the argument was left behind in the
prototype.

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 23:28:15 -08:00
Alexey Dobriyan 9c0188acf6 net: shy netns_ok check
Failure to pass netns_ok check is SILENT, except some MIB counter is
incremented somewhere.

And adding "netns_ok = 1" (after long head-scratching session) is
usually the last step in making some protocol netns-ready...

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 23:23:51 -08:00
Brian Haley 6e093d9dff ipv6: routing header fixes
This patch fixes two bugs:

1. setsockopt() of anything but a Type 2 routing header should return
EINVAL instead of EPERM.  Noticed by Shan Wei
(shanwei@cn.fujitsu.com).

2. setsockopt()/sendmsg() of a Type 2 routing header with invalid
length or segments should return EINVAL.  These values are statically
fixed in RFC 3775, unlike the variable Type 0 was.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 22:59:21 -08:00
David S. Miller ddd535c713 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-11-12 14:37:29 -08:00
Johannes Berg db7fb86b0c mac80211: fix notify_mac function
The ieee80211_notify_mac() function uses ieee80211_sta_req_auth() which
in turn calls ieee80211_set_disassoc() which calls a few functions that
need to be able to sleep, so ieee80211_notify_mac() cannot use RCU
locking for the interface list and must use rtnl locking instead.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-12 16:49:53 -05:00
Patrick Ohly d35aac10eb net: put_cmsg_compat + SO_TIMESTAMP[NS]: use same name for value as caller
In __sock_recv_timestamp() the additional SCM_TIMESTAMP[NS] is used. This
has the same value as SO_TIMESTAMP[NS], so this is a purely cosmetic change.

Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 01:54:56 -08:00
Doug Leith 8f65b5354b tcp_htcp: last_cong bug fix
This patch fixes a minor bug in tcp_htcp.c which has been
highlighted by Lachlan Andrew and Lawrence Stewart.  Currently, the
time since the last congestion event, which is stored in variable
last_cong, is reset whenever there is a state change into
TCP_CA_Open.  This includes transitions of the type
TCP_CA_Open->TCP_CA_Disorder->TCP_CA_Open which are not associated
with backoff of cwnd.  The patch changes last_cong to be updated
only on transitions into TCP_CA_Open that occur after experiencing
the congestion-related states TCP_CA_Loss, TCP_CA_Recovery,
TCP_CA_CWR.

Signed-off-by: Doug Leith <doug.leith@nuim.ie>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 01:41:09 -08:00
Eric Dumazet e42ea986e4 net: Cleanup of neighbour code
Using read_pnet() and write_pnet() in neighbour code ease the reading
of code.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 00:54:54 -08:00
Eric Dumazet 7a9546ee35 net: ib_net pointer should depends on CONFIG_NET_NS
We can shrink size of "struct inet_bind_bucket" by 50%, using
read_pnet() and write_pnet()

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 00:54:20 -08:00
Gerrit Renker 9eca0a47de dccp: Resolve dependencies of features on choice of CCID
This provides a missing link in the code chain, as several features implicitly
depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector
feature, but also Ack Ratio and Send Loss Event Rate (also taken care of).

For Send Ack Vector, the situation is as follows:
 * since CCID2 mandates the use of Ack Vectors, there is no point in allowing 
   endpoints which use CCID2 to disable Ack Vector features such a connection;

 * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer
   with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4);

 * for all other CCIDs, the use of (Send) Ack Vector is optional and thus
   negotiable. However, this implies that the code negotiating the use of Ack
   Vectors also supports it (i.e. is able to supply and to either parse or
   ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack
   Vector support), the use of Ack Vectors is here disabled, with a comment
   in the source code.

An analogous consideration arises for the Send Loss Event Rate feature,
since the CCID-3 implementation does not support the loss interval options
of RFC 4342. To make such use explicit, corresponding feature-negotiation
options are inserted which signal the use of the loss event rate option,
as it is used by the CCID3 code.

Lastly, the values of the Ack Ratio feature are matched to the choice of CCID.

The patch implements this as a function which is called after the user has
made all other registrations for changing default values of features.

The table is variable-length, the reserved (and hence for feature-negotiation
invalid, confirmed by considering section 19.4 of RFC 4340) feature number `0'
is used to mark the end of the table.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 00:48:44 -08:00
Gerrit Renker d90ebcbfa7 dccp: Query supported CCIDs
This provides a data structure to record which CCIDs are locally supported
and three accessor functions:
 - a test function for internal use which is used to validate CCID requests
   made by the user;
 - a copy function so that the list can be used for feature-negotiation;   
 - documented getsockopt() support so that the user can query capabilities.

The data structure is a table which is filled in at compile-time with the
list of available CCIDs (which in turn depends on the Kconfig choices).

Using the copy function for cloning the list of supported CCIDs is useful for
feature negotiation, since the negotiation is now with the full list of available
CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
will not fail if the peer requests to use CCID3 instead of CCID2. 

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 00:47:26 -08:00
Gerrit Renker e8ef967a54 dccp: Registration routines for changing feature values
Two registration routines, for SP and NN features, are provided by this patch,
replacing a previous routine which was used for both feature types.

These are internal-only routines and therefore start with `__feat_register'.

It further exports the known limits of Sequence Window and Ack Ratio as symbolic
constants.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 00:43:40 -08:00
Gerrit Renker f74e91b6cc dccp: Limit feature negotiation to connection setup phase
This patch limits feature (capability) negotation to the connection setup phase:

 1. Although it is theoretically possible to perform feature negotiation at any
    time (and RFC 4340 supports this), in practice this is prohibitively complex,
    as it requires to put traffic on hold for each new negotiation.
 2. As a byproduct of restricting feature negotiation to connection setup, the
    feature-negotiation retransmit timer is no longer required. This part is now
    mapped onto the protocol-level retransmission.
    Details indicating why timers are no longer needed can be found on
    http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\
	                                      implementation_notes.html

This patch disables anytime negotiation, subsequent patches work out full
feature negotiation support for connection setup.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-12 00:42:58 -08:00
Alexey Dobriyan 6bb3ce25d0 net: remove struct dst_entry::entry_size
Unused after kmem_cache_zalloc() conversion.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-11 17:25:22 -08:00
Alexey Dobriyan 9b739ba5e6 net: remove struct neigh_table::pde
->pde isn't actually needed, since name is stashed in ->id.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-11 16:47:44 -08:00
David S. Miller 7e452baf6b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/message/fusion/mptlan.c
	drivers/net/sfc/ethtool.c
	net/mac80211/debugfs_sta.c
2008-11-11 15:43:02 -08:00
Linus Torvalds 0a4cf2c878 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  dsa: fix master interface allmulti/promisc handling
  dsa: fix skb->pkt_type when mac address of slave interface differs
  net: fix setting of skb->tail in skb_recycle_check()
  net: fix /proc/net/snmp as memory corruptor
  mac80211: fix a buffer overrun in station debug code
  netfilter: payload_len is be16, add size of struct rather than size of pointer
  ipv6: fix ip6_mr_init error path
  [4/4] dca: fixup initialization dependency
  [3/4] I/OAT: fix async_tx.callback checking
  [2/4] I/OAT: fix dma_pin_iovec_pages() error handling
  [1/4] I/OAT: fix channel resources free for not allocated channels
  ssb: Fix DMA-API compilation for non-PCI systems
  SSB: hide empty sub menu
  vlan: Fix typos in proc output string
  [netdrvr] usb/hso: Cleanup rfkill error handling
  sfc: Correct address of gPXE boot configuration in EEPROM
  el3_common_init() should be __devinit, not __init
  hso: rfkill type should be WWAN
  mlx4_en: Start port error flow bug fix
  af_key: mark policy as dead before destroying
2008-11-11 09:20:29 -08:00
Lennert Buytenhek df02c6ff2e dsa: fix master interface allmulti/promisc handling
Before commit b6c40d68ff ("net: only
invoke dev->change_rx_flags when device is UP"), the dsa driver could
sort-of get away with only fiddling with the master interface's
allmulti/promisc counts in ->change_rx_flags() and not touching them
in ->open() or ->stop().  After this commit (note that it was merged
almost simultaneously with the dsa patches, which is why this wasn't
caught initially), the breakage that was already there became more
apparent.

Since it makes no sense to keep the master interface's allmulti or
promisc count pinned for a slave interface that is down, copy the vlan
driver's sync logic (which does exactly what we want) over to dsa to
fix this.

Bug report from Dirk Teurlings <dirk@upexia.nl> and Peter van Valderen
<linux@ddcrew.com>.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Dirk Teurlings <dirk@upexia.nl>
Tested-by: Peter van Valderen <linux@ddcrew.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 21:53:12 -08:00
Lennert Buytenhek 14ee6742b1 dsa: fix skb->pkt_type when mac address of slave interface differs
When a dsa slave interface has a mac address that differs from that
of the master interface, eth_type_trans() won't explicitly set
skb->pkt_type back to PACKET_HOST -- we need to do this ourselves
before calling eth_type_trans().

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 21:52:42 -08:00
Lennert Buytenhek 5cd33db212 net: fix setting of skb->tail in skb_recycle_check()
Since skb_reset_tail_pointer() reads skb->data, we need to set
skb->data before calling skb_reset_tail_pointer().  This was causing
spurious skb_over_panic()s from skb_put() being called on a recycled
skb that had its skb->tail set to beyond where it should have been.

Bug report from Peter van Valderen <linux@ddcrew.com>.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 21:45:05 -08:00
Eric Dumazet b971e7ac83 net: fix /proc/net/snmp as memory corruptor
icmpmsg_put() can happily corrupt kernel memory, using a static
table and forgetting to reset an array index in a loop.

Remove the static array since its not safe without proper locking.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 21:43:08 -08:00
Jianjun Kong 013cd39753 mac80211: fix a buffer overrun in station debug code
net/mac80211/debugfs_sta.c
The trailing zero was written to state[4], it's out of bounds.

Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 21:37:39 -08:00
Jesse Brandeburg eb37b41cc2 pktgen: add full reset functionality
While testing pktgen, I found that sometimes my configurations from
previous runs would be left over, particularly when going from a test
with 8 threads down to a test with 4 threads.

This adds new functionality to pktgen where you can call
pgset "reset"

and it will be just like you just insmod'ed pktgen again.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 16:48:03 -08:00
Harvey Harrison b7b45f47d6 netfilter: payload_len is be16, add size of struct rather than size of pointer
payload_len is a be16 value, not cpu_endian, also the size of a ponter
to a struct ipv6hdr was being added, not the size of the struct itself.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 16:46:06 -08:00
Benjamin Thery 87b30a6530 ipv6: fix ip6_mr_init error path
The order of cleanup operations in the error/exit section of ip6_mr_init()
is completely inversed. It should be the other way around.
Also a del_timer() is missing in the error path.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 16:34:11 -08:00
Rémi Denis-Courmont 9b1582d451 Phonet: use net_device built-in stats for GPRS
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 16:21:05 -08:00
Kay Sievers fb28ad3590 net: struct device - replace bus_id with dev_name(), dev_set_name()
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 13:55:14 -08:00
Ferenc Wagner 309f796f30 vlan: Fix typos in proc output string
Signed-off-by: Ferenc Wagner <wferi@niif.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-10 13:37:40 -08:00
David S. Miller 2377989754 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-11-10 13:24:44 -08:00
Luis R. Rodriguez 5166ccd220 cfg80211: Add kdoc for struct regulatory_request
As regulatory_request gets bigger there will be more questions
of what things means, so clarify documenation for it and
keep track of the special alpha2 codes we use internally
and on the userspace regulatory agents.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:41 -05:00
Luis R. Rodriguez 9c96477d10 cfg80211: Add regulatory domain intersection capability
There are certain scenerios where we require intersecting
two regulatory domains. This adds intersection support.
When we enable 802.11d support we will use this to intersect
the regulatory domain from the AP's country IE and what our
regulatory agent believes is correct for a country.

This patch enables intersection for now in the case where
the last regdomain was set by a country IE which was parsed
and the user then wants to set the regulatory domain. Since
we don't support country IE parsing yet this code path will not
be hit, however this allows us to pave the way for 11d support.

Intersection code has been tested in userspace with CRDA.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:41 -05:00
Luis R. Rodriguez d71aaf6053 cfg80211: a reg rule is invalid if freq diff is 0
A regulatory rule is invalid when the frequency difference
between the end of the frequency range and the start is 0.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:41 -05:00
Jouni Malinen fbf1892739 mac80211: Allow AP mode to be enabled
With the addition of basic rate set and TX queue parameter
configuration and confirmation that power save buffering is
working again, mac80211 is now in state that allows AP mode to be
used without major problems. Consequently, it is time to allow this
mode to be enabled without having to patch the kernel.

AP mode requires hostapd for management frame processing and as such,
configuring this mode is only allowed through cfg80211 (not with
iwconfig and WEXT).

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:40 -05:00
Tomas Winkler d61272cbb3 mac80211: fix basic rates setting from association response
In previous code all the rates were marked as basic.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:40 -05:00
Jouni Malinen 318884875b nl80211: Add TX queue parameter configuration
Add a new attribute, NL80211_ATTR_WIPHY_TXQ_PARAMS, that can be used with
NL80211_CMD_SET_WIPHY for userspace (e.g., hostapd) to set TX queue
parameters (txop, cwmin, cwmax, aifs).

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:40 -05:00
Jouni Malinen 90c97a040d nl80211: Add basic rate configuration for AP mode
Add a new attribute, NL80211_ATTR_BSS_BASIC_RATES, that can be used with
NL80211_CMD_SET_BSS for userspace (e.g., hostapd) to set which rates are
in the basic rate set.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:39 -05:00
Johannes Berg bd81525272 wireless: implement basic rate helper function
This adds a helper function that, given a bitmap of basic
rates and a bitrate returns the response rate for this rate.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:35 -05:00
Holger Schurig 2a941ecb51 wireless: fix two bad print_ssid conversions
This patch fixes two current compilation problems. They showed up
with CONFIG_IEEE80211_DEBUG defined.

Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:33 -05:00
Sujith 8469cdef1f mac80211: Add a new event in ieee80211_ampdu_mlme_action
Send a notification to the driver on succesful
reception of an ADDBA response, add IEEE80211_AMPDU_TX_RESUME
for this purpose.

Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:17:32 -05:00
Johannes Berg 41bb73eeac mac80211: remove SSID driver code
Remove the SSID from the driver API since now there is no
driver that requires knowing the SSID and I think it's
unlikely that any hardware design that does require the
SSID will play well with mac80211.

This also removes support for setting the SSID in master
mode which will require a patch to hostapd to not try.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:11:56 -05:00
Johannes Berg 2df78167ad wireless: fix a few sparse warnings
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:10:17 -05:00
Johannes Berg 1239cd58d2 wireless: move mesh config length constant
This is a constant from the 802.11 specification.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:10:16 -05:00
Zhu Yi 97c8b013da mac80211: print reason code for deauth/dissoc frames
The patch prints reason code for deauth/dissoc frames to give users
more ideas what's happened for the disconnection.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-10 15:10:16 -05:00
Miklos Szeredi 6209344f5a net: unix: fix inflight counting bug in garbage collector
Previously I assumed that the receive queues of candidates don't
change during the GC.  This is only half true, nothing can be received
from the queues (see comment in unix_gc()), but buffers could be added
through the other half of the socket pair, which may still have file
descriptors referring to it.

This can result in inc_inflight_move_tail() erronously increasing the
"inflight" counter for a unix socket for which dec_inflight() wasn't
previously called.  This in turn can trigger the "BUG_ON(total_refs <
inflight_refs)" in a later garbage collection run.

Fix this by only manipulating the "inflight" counter for sockets which
are candidates themselves.  Duplicating the file references in
unix_attach_fds() is also needed to prevent a socket becoming a
candidate for GC while the skb that contains it is not yet queued.

Reported-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-09 11:17:33 -08:00
Harvey Harrison f574179b63 tipc: trivial endian annotation in debug statement
Use htonl rather than ntohl on a u32.
net/tipc/name_table.c:557:2: warning: cast to restricted __be32

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-07 23:37:50 -08:00
Thomas Graf f400923735 pkt_sched: Control group classifier
The classifier should cover the most common use case and will work
without any special configuration.

The principle of the classifier is to directly access the
task_struct via get_current(). In order for this to work,
classification requests from softirqs must be ignored. This is
not a problem because the vast majority of packets in softirq
context are not assigned to a task anyway. For this to work, a
mechanism is needed to trace softirq context. 

This repost goes back to the method of relying on the number of
nested bh disable calls for the sake of not adding too much
complexity and the option to come up with something more reliable
if actually needed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-07 22:56:00 -08:00
Eric W. Biederman 505d4f73dd net: Guaranetee the proper ordering of the loopback device. v2
I was recently hunting a bug that occurred in network namespace
cleanup.  In looking at the code it became apparrent that we have
and will continue to have cases where if we have anything going
on in a network namespace there will be assumptions that the
loopback device is present.   Things like sending igmp unsubscribe
messages when we bring down network devices invokes the routing
code which assumes that at least the loopback driver is present.

Therefore to avoid magic initcall ordering hackery that is hard
to follow and hard to get right insert a call to register the
loopback device directly from net_dev_init().    This guarantes
that the loopback device is the first device registered and
the last network device to go away.

But do it carefully so we register the loopback device after
we clear dev_boot_phase.

Signed-off-by: Eric W. Biederman <ebiederm@maxwell.aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-07 22:54:20 -08:00
Eric W. Biederman 5d6d480908 net: fib_rules ordering fixes.
We need to setup the network namespace state before we register
the notifier.  Otherwise if a network device is already registered
we get a nasty NULL pointer dereference.

Signed-off-by: Eric W. Biederman <ebiederm@maxwell.aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-07 22:52:34 -08:00
David S. Miller 3d8160b149 Revert "net: Guaranetee the proper ordering of the loopback device."
This reverts commit ae33bc40c0.
2008-11-07 22:52:14 -08:00
David S. Miller 167c6274c3 Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2008-11-07 01:37:16 -08:00
Harvey Harrison 5c7f033358 phonet: sparse annotations of protocol, remove forward declaration
net/phonet/af_phonet.c:38:36: error: marked inline, but without a definition
net/phonet/pep-gprs.c:63:10: warning: incorrect type in return expression (different base types)
net/phonet/pep-gprs.c:63:10:    expected int
net/phonet/pep-gprs.c:63:10:    got restricted __be16 [usertype] <noident>
net/phonet/pep-gprs.c:65:10: warning: incorrect type in return expression (different base types)
net/phonet/pep-gprs.c:65:10:    expected int
net/phonet/pep-gprs.c:65:10:    got restricted __be16 [usertype] <noident>
net/phonet/pep-gprs.c:124:16: warning: incorrect type in assignment (different base types)
net/phonet/pep-gprs.c:124:16:    expected restricted __be16 [usertype] protocol
net/phonet/pep-gprs.c:124:16:    got unsigned short [unsigned] [usertype] protocol

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-06 23:10:50 -08:00
Harvey Harrison ca62059b7e ipvs: oldlen, newlen should be be16, not be32
Noticed by sparse:
net/netfilter/ipvs/ip_vs_proto_tcp.c:195:6: warning: incorrect type in argument 5 (different base types)
net/netfilter/ipvs/ip_vs_proto_tcp.c:195:6:    expected restricted __be16 [usertype] oldlen
net/netfilter/ipvs/ip_vs_proto_tcp.c:195:6:    got restricted __be32 [usertype] <noident>
net/netfilter/ipvs/ip_vs_proto_tcp.c:196:6: warning: incorrect type in argument 6 (different base types)
net/netfilter/ipvs/ip_vs_proto_tcp.c:196:6:    expected restricted __be16 [usertype] newlen
net/netfilter/ipvs/ip_vs_proto_tcp.c:196:6:    got restricted __be32 [usertype] <noident>
net/netfilter/ipvs/ip_vs_proto_tcp.c:270:6: warning: incorrect type in argument 5 (different base types)
net/netfilter/ipvs/ip_vs_proto_tcp.c:270:6:    expected restricted __be16 [usertype] oldlen
net/netfilter/ipvs/ip_vs_proto_tcp.c:270:6:    got restricted __be32 [usertype] <noident>
net/netfilter/ipvs/ip_vs_proto_tcp.c:271:6: warning: incorrect type in argument 6 (different base types)
net/netfilter/ipvs/ip_vs_proto_tcp.c:271:6:    expected restricted __be16 [usertype] newlen
net/netfilter/ipvs/ip_vs_proto_tcp.c:271:6:    got restricted __be32 [usertype] <noident>
net/netfilter/ipvs/ip_vs_proto_udp.c:206:6: warning: incorrect type in argument 5 (different base types)
net/netfilter/ipvs/ip_vs_proto_udp.c:206:6:    expected restricted __be16 [usertype] oldlen
net/netfilter/ipvs/ip_vs_proto_udp.c:206:6:    got restricted __be32 [usertype] <noident>
net/netfilter/ipvs/ip_vs_proto_udp.c:207:6: warning: incorrect type in argument 6 (different base types)
net/netfilter/ipvs/ip_vs_proto_udp.c:207:6:    expected restricted __be16 [usertype] newlen
net/netfilter/ipvs/ip_vs_proto_udp.c:207:6:    got restricted __be32 [usertype] <noident>
net/netfilter/ipvs/ip_vs_proto_udp.c:282:6: warning: incorrect type in argument 5 (different base types)
net/netfilter/ipvs/ip_vs_proto_udp.c:282:6:    expected restricted __be16 [usertype] oldlen
net/netfilter/ipvs/ip_vs_proto_udp.c:282:6:    got restricted __be32 [usertype] <noident>
net/netfilter/ipvs/ip_vs_proto_udp.c:283:6: warning: incorrect type in argument 6 (different base types)
net/netfilter/ipvs/ip_vs_proto_udp.c:283:6:    expected restricted __be16 [usertype] newlen
net/netfilter/ipvs/ip_vs_proto_udp.c:283:6:    got restricted __be32 [usertype] <noident>

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-06 23:09:56 -08:00
Alexey Dobriyan 70e90679ff af_key: mark policy as dead before destroying
xfrm_policy_destroy() will oops if not dead policy is passed to it.
On error path in pfkey_compile_policy() exactly this happens.

Oopsable for CAP_NET_ADMIN owners.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-06 23:08:37 -08:00
Alexey Dobriyan 76acfdb9b7 net: mark flow_cache_cpu_prepare() as __init
It's called from __init code only. And__devinit in generic networking code
is pretty strange :^)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-06 23:06:44 -08:00
David S. Miller 9eeda9abd1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/ath5k/base.c
	net/8021q/vlan_core.c
2008-11-06 22:43:03 -08:00
Linus Torvalds 4bab0ea1d4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: Fix recursive descent in __scm_destroy().
  iwl3945: fix deadlock on suspend
  iwl3945: do not send scan command if channel count zero
  iwl3945: clear scanning bits upon failure
  ath5k: correct handling of rx status fields
  zd1211rw: Add 2 device IDs
  Fix logic error in rfkill_check_duplicity
  iwlagn: avoid sleep in softirq context
  iwlwifi: clear scanning bits upon failure
  Revert "ath5k: honor FIF_BCN_PRBRESP_PROMISC in STA mode"
  tcp: Fix recvmsg MSG_PEEK influence of blocking behavior.
  netfilter: netns ct: walk netns list under RTNL
  ipv6: fix run pending DAD when interface becomes ready
  net/9p: fix printk format warnings
  net: fix packet socket delivery in rx irq handler
  xfrm: Have af-specific init_tempsel() initialize family field of temporary selector
2008-11-06 16:44:23 -08:00
David S. Miller ca409d6e08 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-11-06 15:52:00 -08:00
Linus Torvalds 39d4e58d36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  net/9p: fix printk format warnings
  unsigned fid->fid cannot be negative
  9p: rdma: remove duplicated #include
  p9: Fix leak of waitqueue in request allocation path
  9p: Remove unneeded free of fcall for Flush
  9p: Make all client spin locks IRQ safe
  9p: rdma: Set trans prior to requesting async connection ops
2008-11-06 15:45:57 -08:00
David S. Miller 3b53fbf431 net: Fix recursive descent in __scm_destroy().
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.

Those, in turn, can close sockets and invoke __scm_destroy() again.

There is nothing which limits how deeply this can occur.

The idea for how to fix this is from Linus.  Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput().  Inside of the initial __scm_destroy() we
keep running the list until it is empty.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-06 15:45:32 -08:00
David Miller f8d570a474 net: Fix recursive descent in __scm_destroy().
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.

Those, in turn, can close sockets and invoke __scm_destroy() again.

There is nothing which limits how deeply this can occur.

The idea for how to fix this is from Linus.  Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput().  Inside of the initial __scm_destroy() we
keep running the list until it is empty.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 13:51:50 -08:00
Jonathan McDowell 4a9d916717 Fix logic error in rfkill_check_duplicity
> I'll have a prod at why the [hso] rfkill stuff isn't working next

Ok, I believe this is due to the addition of rfkill_check_duplicity in
rfkill and the fact that test_bit actually returns a negative value
rather than the postive one expected (which is of course equally true).
So when the second WLAN device (the hso device, with the EEE PC WLAN
being the first) comes along rfkill_check_duplicity returns a negative
value and so rfkill_register returns an error. Patch below fixes this
for me.

Signed-Off-By: Jonathan McDowell <noodles@earth.li>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-06 16:37:09 -05:00
Brian Haley 305d552acc bonding: send IPv6 neighbor advertisement on failover
This patch adds better IPv6 failover support for bonding devices,
especially when in active-backup mode and there are only IPv6 addresses
configured, as reported by Alex Sidorenko.

- Creates a new file, net/drivers/bonding/bond_ipv6.c, for the
   IPv6-specific routines.  Both regular bonds and VLANs over bonds
   are supported.

- Adds a new tunable, num_unsol_na, to limit the number of unsolicited
   IPv6 Neighbor Advertisements that are sent on a failover event.
   Default is 1.

- Creates two new IPv6 neighbor discovery functions:

   ndisc_build_skb()
   ndisc_send_skb()

   These were required to support VLANs since we have to be able to
   add the VLAN id to the skb since ndisc_send_na() and friends
   shouldn't be asked to do this.  These two routines are basically
   __ndisc_send() split into two pieces, in a slightly different order.

- Updates Documentation/networking/bonding.txt and bumps the rev of bond
   support to 3.4.0.

On failover, this new code will generate one packet:

- An unsolicited IPv6 Neighbor Advertisement, which helps the switch
   learn that the address has moved to the new slave.

Testing has shown that sending just the NA results in pretty good
behavior when in active-back mode, I saw no lost ping packets for example.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-11-06 00:49:37 -05:00
Eric W. Biederman 0a36b345ab net: Don't leak packets when a netns is going down
I have been tracking for a while a case where when the
network namespace exits the cleanup gets stck in an
endless precessess of:

unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3
unregister_netdevice: waiting for lo to become free. Usage count = 3

It turns out that if you listen on a multicast address an unsubscribe
packet is sent when the network device goes down.   If you shutdown
the network namespace without carefully cleaning up this can trigger
the unsubscribe packet to be sent over the loopback interface while
the network namespace is going down.

All of which is fine except when we drop the packet and forget to
free it leaking the skb and the dst entry attached to.  As it
turns out the dst entry hold a reference to the idev which holds
the dev and keeps everything from being cleaned up.  Yuck!

By fixing my earlier thinko and add the needed kfree_skb and everything
cleans up beautifully. 

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 16:00:24 -08:00
Eric W. Biederman ae33bc40c0 net: Guaranetee the proper ordering of the loopback device.
I was recently hunting a bug that occurred in network namespace
cleanup.  In looking at the code it became apparrent that we have
and will continue to have cases where if we have anything going
on in a network namespace there will be assumptions that the
loopback device is present.   Things like sending igmp unsubscribe
messages when we bring down network devices invokes the routing
code which assumes that at least the loopback driver is present.

Therefore to avoid magic initcall ordering hackery that is hard
to follow and hard to get right insert a call to register the
loopback device directly from net_dev_init().    This guarantes
that the loopback device is the first device registered and
the last network device to go away.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 16:00:02 -08:00
Eric W. Biederman d0c082cea6 netns: Delete virtual interfaces during namespace cleanup
When physical devices are inside of network namespace and that
network namespace terminates we can not make them go away.  We
have to keep them and moving them to the initial network namespace
is the best we can do.

For virtual devices left in a network namespace that is exiting
we have no need to preserve them and we now have the infrastructure
that allows us to delete them.  So delete virtual devices when we
exit a network namespace.  Keeping the necessary user space clean up
after a network namespace exits much more tractable.

Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 15:59:38 -08:00
Randy Dunlap b0d5fdef52 net/9p: fix printk format warnings
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.

net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Roel Kluin 9f3e9bbe62 unsigned fid->fid cannot be negative
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Huang Weiyi 1558c62149 9p: rdma: remove duplicated #include
Removed duplicated #include <rdma/ib_verbs.h> in
net/9p/trans_rdma.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:07 -06:00
Tom Tucker 45abdf1c7b p9: Fix leak of waitqueue in request allocation path
If a T or R fcall cannot be allocated, the function returns an error
but neglects to free the wait queue that was successfully allocated.

If it comes through again a second time this wq will be overwritten
with a new allocation and the old allocation will be leaked.

Also, if the client is subsequently closed, the close path will
attempt to clean up these allocations, so set the req fields to
NULL to avoid duplicate free.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker 82b189eaaf 9p: Remove unneeded free of fcall for Flush
T and R fcall are reused until the client is destroyed. There does
not need to be a special case for Flush

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker cac23d6505 9p: Make all client spin locks IRQ safe
The client lock must be IRQ safe. Some of the lock acquisition paths
took regular spin locks.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
Tom Tucker 517ac45af4 9p: rdma: Set trans prior to requesting async connection ops
The RDMA connection manager is fundamentally asynchronous.
Since the async callback context is the client pointer, the
transport in the client struct needs to be set prior to calling
the first async op.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-11-05 13:19:06 -06:00
David S. Miller 518a09ef11 tcp: Fix recvmsg MSG_PEEK influence of blocking behavior.
Vito Caputo noticed that tcp_recvmsg() returns immediately from
partial reads when MSG_PEEK is used.  In particular, this means that
SO_RCVLOWAT is not respected.

Simply remove the test.  And this matches the behavior of several
other systems, including BSD.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 03:36:01 -08:00
Alexey Dobriyan efb9a8c28c netfilter: netns ct: walk netns list under RTNL
netns list (just list) is under RTNL. But helper and proto unregistration
happen during rmmod when RTNL is not held, and that's how it was tested:
modprobe/rmmod vs clone(CLONE_NEWNET)/exit.

BUG: unable to handle kernel paging request at 0000000000100100	<===
IP: [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
PGD 15e300067 PUD 15e1d8067 PMD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/kernel/uevent_seqnum
CPU 0
Modules linked in: nf_conntrack_proto_sctp(-) nf_conntrack_proto_dccp(-) af_packet iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 sr_mod cdrom [last unloaded: nf_conntrack_proto_sctp]
Pid: 16758, comm: rmmod Not tainted 2.6.28-rc2-netns-xfrm #3
RIP: 0010:[<ffffffffa009890f>]  [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
RSP: 0018:ffff88015dc1fec8  EFLAGS: 00010212
RAX: 0000000000000000 RBX: 00000000001000f8 RCX: 0000000000000000
RDX: ffffffffa009575c RSI: 0000000000000003 RDI: ffffffffa00956b5
RBP: ffff88015dc1fed8 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88015dc1fe48 R12: ffffffffa0458f60
R13: 0000000000000880 R14: 00007fff4c361d30 R15: 0000000000000880
FS:  00007f624435a6f0(0000) GS:ffffffff80521580(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000100100 CR3: 0000000168969000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 16758, threadinfo ffff88015dc1e000, task ffff880179864218)
Stack:
 ffffffffa0459100 0000000000000000 ffff88015dc1fee8 ffffffffa0457934
 ffff88015dc1ff78 ffffffff80253fef 746e6e6f635f666e 6f72705f6b636172
 00707463735f6f74 ffffffff8024cb30 00000000023b8010 0000000000000000
Call Trace:
 [<ffffffffa0457934>] nf_conntrack_proto_sctp_fini+0x10/0x1e [nf_conntrack_proto_sctp]
 [<ffffffff80253fef>] sys_delete_module+0x19f/0x1fe
 [<ffffffff8024cb30>] ? trace_hardirqs_on_caller+0xf0/0x114
 [<ffffffff803ea9b2>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8020b52b>] system_call_fastpath+0x16/0x1b
Code: 13 35 e0 e8 c4 6c 1a e0 48 8b 1d 6d c6 46 e0 eb 16 48 89 df 4c 89 e2 48 c7 c6 fc 85 09 a0 e8 61 cd ff ff 48 8b 5b 08 48 83 eb 08 <48> 8b 43 08 0f 18 08 48 8d 43 08 48 3d 60 4f 50 80 75 d3 5b 41
RIP  [<ffffffffa009890f>] nf_conntrack_l4proto_unregister+0x96/0xae [nf_conntrack]
 RSP <ffff88015dc1fec8>
CR2: 0000000000100100
---[ end trace bde8ac82debf7192 ]---

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 03:03:18 -08:00
Benjamin Thery e3ec6cfc26 ipv6: fix run pending DAD when interface becomes ready
With some net devices types, an IPv6 address configured while the
interface was down can stay 'tentative' forever, even after the interface
is set up. In some case, pending IPv6 DADs are not executed when the
device becomes ready.

I observed this while doing some tests with kvm. If I assign an IPv6 
address to my interface eth0 (kvm driver rtl8139) when it is still down
then the address is flagged tentative (IFA_F_TENTATIVE). Then, I set
eth0 up, and to my surprise, the address stays 'tentative', no DAD is
executed and the address can't be pinged.

I also observed the same behaviour, without kvm, with virtual interfaces
types macvlan and veth.

Some easy steps to reproduce the issue with macvlan:

1. ip link add link eth0 type macvlan
2. ip -6 addr add 2003::ab32/64 dev macvlan0
3. ip addr show dev macvlan0
   ... 
   inet6 2003::ab32/64 scope global tentative
   ...
4. ip link set macvlan0 up
5. ip addr show dev macvlan0
   ...
   inet6 2003::ab32/64 scope global tentative
   ...
   Address is still tentative

I think there's a bug in net/ipv6/addrconf.c, addrconf_notify():
addrconf_dad_run() is not always run when the interface is flagged IF_READY.
Currently it is only run when receiving NETDEV_CHANGE event. Looks like
some (virtual) devices doesn't send this event when becoming up.

For both NETDEV_UP and NETDEV_CHANGE events, when the interface becomes
ready, run_pending should be set to 1. Patch below.

'run_pending = 1' could be moved below the if/else block but it makes 
the code less readable.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 01:43:57 -08:00
Eric Dumazet 270acefafe net: sk_free_datagram() should use sk_mem_reclaim_partial()
I noticed a contention on udp_memory_allocated on regular UDP applications.

While tcp_memory_allocated is seldom used, it appears each incoming UDP frame
is currently touching udp_memory_allocated when queued, and when received by
application.

One possible solution is to use sk_mem_reclaim_partial() instead of
sk_mem_reclaim(), so that we keep a small reserve (less than one page)
of memory for each UDP socket.

We did something very similar on TCP side in commit
9993e7d313
([TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer())

A more complex solution would need to convert prot->memory_allocated to
use a percpu_counter with batches of 64 or 128 pages.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 01:38:06 -08:00
Randy Dunlap b22cecdd8f net/9p: fix printk format warnings
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.

net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-05 01:35:55 -08:00
Gerrit Renker d99a7bd210 dccp: Cleanup routines for feature negotiation
This inserts the required de-allocation routines for memory allocated
by feature negotiation in the socket destructors, replacing
dccp_feat_clean() in one instance.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-04 23:56:30 -08:00
Gerrit Renker ac75773c27 dccp: Per-socket initialisation of feature negotiation
This provides feature-negotiation initialisation for both DCCP sockets
and DCCP request_sockets, to support feature negotiation during
connection setup.

It also resolves a FIXME regarding the congestion control
initialisation.

Thanks to Wei Yongjun for help with the IPv6 side of this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-04 23:55:49 -08:00
Gerrit Renker 61e6473efb dccp: List management for new feature negotiation
This adds list initial fields and list management functions for the
new feature negotiation implementation.

Thanks to Arnaldo for suggestions and improvements.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-04 23:54:04 -08:00
Gerrit Renker 7d43d1a0f2 dccp: Implement lookup table for feature-negotiation information
A lookup table for feature-negotiation information, extracted from RFC
4340/42, is provided by this patch. All currently known features can
be found in this table, along with their feature location, their
default value, and type.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-04 23:43:47 -08:00
Gerrit Renker bd012f2e7b dccp: Basic data structure for feature negotiation
This patch prepares for the new and extended feature-negotiation
routines.

The following feature-negotiation data structures are provided:
	* a container for the various (SP or NN) values,
	* symbolic state names to track feature states,
	* an entry struct which holds all current information together,
	* elementary functions to fill in and process these structures.

Entry structs are arranged as FIFO for the following reason: RFC 4340
specifies that if multiple options of the same type are present, they
are processed in the order of their appearance in the packet; which
means that this order needs to be preserved in the local data
structure (the later insertion code also respects this order).

The struct list_head has been chosen for the following reasons: the most
frequent operations are

 * add new entry at tail (when receiving Change or setting socket
   options);
 * delete entry (when Confirm has been received);
 * deep copy of entire list (cloning from listening socket onto
   request socket).

The NN value has been set to 64 bit, which is a currently sufficient
upper limit (Sequence Window feature has 48 bit).

Thanks to Arnaldo, who contributed the streamlined layout of the entry
struct.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-04 23:38:20 -08:00
Patrick McHardy 9b22ea5609 net: fix packet socket delivery in rx irq handler
The changes to deliver hardware accelerated VLAN packets to packet
sockets (commit bc1d0411) caused a warning for non-NAPI drivers.
The __vlan_hwaccel_rx() function is called directly from the drivers
RX function, for non-NAPI drivers that means its still in RX IRQ
context:

[   27.779463] ------------[ cut here ]------------
[   27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81()
...
[   27.782520]  [<c0264755>] netif_nit_deliver+0x5b/0x75
[   27.782590]  [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162
[   27.782664]  [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1]
[   27.782738]  [<c0155b17>] handle_IRQ_event+0x23/0x51
[   27.782808]  [<c015692e>] handle_edge_irq+0xc2/0x102
[   27.782878]  [<c0105fd5>] do_IRQ+0x4d/0x64

Split hardware accelerated VLAN reception into two parts to fix this:

- __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN
  device lookup, then calls netif_receive_skb()/netif_rx()

- vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb()
  in softirq context, performs the real reception and delivery to
  packet sockets.

Reported-and-tested-by: Ramon Casellas <ramon.casellas@cttc.es>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-04 14:49:57 -08:00
Andreas Steffen 79654a7698 xfrm: Have af-specific init_tempsel() initialize family field of temporary selector
While adding MIGRATE support to strongSwan, Andreas Steffen noticed that
the selectors provided in XFRM_MSG_ACQUIRE have their family field
uninitialized (those in MIGRATE do have their family set).

Looking at the code, this is because the af-specific init_tempsel()
(called via afinfo->init_tempsel() in xfrm_init_tempsel()) do not set
the value.

Reported-by: Andreas Steffen <andreas.steffen@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
2008-11-04 14:49:19 -08:00
Linus Torvalds 75fa67706c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  xfrm: Fix xfrm_policy_gc_lock handling.
  niu: Use pci_ioremap_bar().
  bnx2x: Version Update
  bnx2x: Calling netif_carrier_off at the end of the probe
  bnx2x: PCI configuration bug on big-endian
  bnx2x: Removing the PMF indication when unloading
  mv643xx_eth: fix SMI bus access timeouts
  net: kconfig cleanup
  fs_enet: fix polling
  XFRM: copy_to_user_kmaddress() reports local address twice
  SMC91x: Fix compilation on some platforms.
  udp: Fix the SNMP counter of UDP_MIB_INERRORS
  udp: Fix the SNMP counter of UDP_MIB_INDATAGRAMS
  drivers/net/smc911x.c: Fix lockdep warning on xmit.
2008-11-04 08:30:12 -08:00
Simon Arlott 6e3354c1e9 netfilter: nf_nat: remove warn_if_extra_mangle
In net/ipv4/netfilter/nf_nat_rule.c, the function warn_if_extra_mangle was added
in commit 5b1158e909 (2006-12-02). I have a DNAT
target in the OUTPUT chain than changes connections with dst 2.0.0.1 to another
address which I'll substitute with 66.102.9.99 below.

On every boot I get the following message:
[  146.252505] NAT: no longer support implicit source local NAT
[  146.252517] NAT: packet src 66.102.9.99 -> dst 2.0.0.1

As far as I can tell from reading the function doing this, it should warn if the
source IP for the route to 66.102.9.99 is different from 2.0.0.1 but that is not
the case. It doesn't make sense to check the DNAT target against the local route
source.

Either the function should be changed to correctly check the route, or it should
be removed entirely as it's been nearly 2 years since it was added.

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:35:39 +01:00
Alexey Dobriyan 249b62035c netfilter: netns ebtables: br_nf_pre_routing_finish() fixup
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:31:29 +01:00
Alexey Dobriyan b71b30a626 netfilter: netns ebtables: ebtable_nat in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:30:46 +01:00
Alexey Dobriyan 4aad10938d netfilter: netns ebtables: ebtable_filter in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:29:58 +01:00
Alexey Dobriyan 8157e6d16a netfilter: netns ebtables: ebtable_broute in netns
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:29:03 +01:00
Alexey Dobriyan dbcdf85a2e netfilter: netns ebtables: more cleanup during ebt_unregister_table()
Now that ebt_unregister_table() can be called during netns stop, and module
pinning scheme can't prevent netns stop, do table cleanup by hand.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:28:04 +01:00
Alexey Dobriyan 6beceee5aa netfilter: netns ebtables: part 2
* return ebt_table from ebt_register_table(), module code will save it into
  per-netns data for unregistration
* duplicate ebt_table at the very beginning of registration -- it's added into
  list, so one ebt_table wouldn't end up in many lists (and each netns has
  different one)
* introduce underscored tables in individial modules, this is temporary to not
  break bisection.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:27:15 +01:00
Alexey Dobriyan 511061e2dd netfilter: netns ebtables: part 1
* propagate netns from userspace, register table in passed netns
* remporarily register every ebt_table in init_net

P. S.: one needs to add ".netns_ok = 1" to igmp_protocol to test with
ebtables(8) in netns.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:22:55 +01:00
Alexey Dobriyan 19223f26d9 netfilter: arptable_filter: merge forward hook
It's identical to NF_ARP_IN hook.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:22:13 +01:00
Alexey Dobriyan d4ec52bae7 netfilter: netns-aware ipt_addrtype
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:21:48 +01:00
Eric Leblond 5f7340eff8 netfilter: xt_NFLOG: don't call nf_log_packet in NFLOG module.
This patch modifies xt_NFLOG to suppress the call to nf_log_packet()
function. The call of this wrapper in xt_NFLOG was causing NFLOG to
use the first initialized module. Thus, if ipt_ULOG is loaded before
nfnetlink_log all NFLOG rules are treated as plain LOG rules.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-11-04 14:21:08 +01:00
David S. Miller d2ad3ca88d net/: Kill now superfluous ->last_rx stores.
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 22:01:07 -08:00
Stephen Hemminger 265eb67fb4 netem: eliminate unneeded return values
All these individual parsing functions never return an error,
so they can be void.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 21:13:26 -08:00
Alexey Dobriyan bbb770e7ab xfrm: Fix xfrm_policy_gc_lock handling.
From: Alexey Dobriyan <adobriyan@gmail.com>

Based upon a lockdep trace by Simon Arlott.

xfrm_policy_kill() can be called from both BH and
non-BH contexts, so we have to grab xfrm_policy_gc_lock
with BH disabling.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 19:11:29 -08:00
Jianjun Kong ab29109210 net: remove two duplicated #include
Removed duplicated #include <rdma/ib_verbs.h> in net/9p/trans_rdma.c
		and  #include <linux/thread_info.h> in net/socket.c

Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 18:23:09 -08:00
Alexey Dobriyan 6d9f239a1e net: '&' redux
I want to compile out proc_* and sysctl_* handlers totally and
stub them to NULL depending on config options, however usage of &
will prevent this, since taking adress of NULL pointer will break
compilation.

So, drop & in front of every ->proc_handler and every ->strategy
handler, it was never needed in fact.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 18:21:05 -08:00
Stephen Hemminger 24f8b2385e net: increase receive packet quantum
This patch gets about 1.25% back on tbench regression.

My change to NAPI for multiqueue support changed the time limit on
network receive processing.  Under sustained loads like tbench, this
can cause the receiver to reschedule prematurely. 

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 17:14:38 -08:00
Julius Volz 48148938b4 IPVS: Remove supports_ipv6 scheduler flag
Remove the 'supports_ipv6' scheduler flag since all schedulers now
support IPv6.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 17:08:56 -08:00
Julius Volz 445483758e IPVS: Add IPv6 support to LBLC/LBLCR schedulers
Add IPv6 support to LBLC and LBLCR schedulers. These were the last
schedulers without IPv6 support, but we might want to keep the
supports_ipv6 flag in the case of future schedulers without IPv6
support.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 17:08:28 -08:00
Jarek Poplawski 67305ebc99 pkt_sched: sch_generic: Kfree gso_skb in qdisc_reset()
Since gso_skb is re-used for qdisc_peek_dequeued(), and this skb is
counted in the qdisc->q.qlen, it has to be kfreed during qdisc_reset()
when qlen is zeroed.

With help from David S. Miller <davem@davemloft.net>

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 02:52:50 -08:00
Jianjun Kong 5799de0b12 net: clean up net/ipv4/tcp_ipv4.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 02:49:10 -08:00
Jianjun Kong 539afedfcc net: clean up net/ipv4/devinet.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 02:48:48 -08:00
Jianjun Kong f4cca7ffb2 net: clean up net/ipv4/pararp.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 02:48:14 -08:00
Jianjun Kong fd3f8c4cb6 net: clean up net/ipv4/ip_fragment.c tcp_timer.c ip_input.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 02:47:38 -08:00
Arnaud Ebalard a1caa32295 XFRM: copy_to_user_kmaddress() reports local address twice
While adding support for MIGRATE/KMADDRESS in strongSwan (as specified
in draft-ebalard-mext-pfkey-enhanced-migrate-00), Andreas Steffen
noticed that XFRMA_KMADDRESS attribute passed to userland contains the
local address twice (remote provides local address instead of remote
one).

This bug in copy_to_user_kmaddress() affects only key managers that use
native XFRM interface (key managers that use PF_KEY are not affected).

For the record, the bug was in the initial changeset I posted which
added support for KMADDRESS (13c1d18931
'xfrm: MIGRATE enhancements (draft-ebalard-mext-pfkey-enhanced-migrate)').

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Reported-by: Andreas Steffen <andreas.steffen@strongswan.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 01:30:23 -08:00
Jianjun Kong c354e12463 net: clean up net/ipv4/ipmr.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 00:28:02 -08:00
Jianjun Kong 09cb105ea7 net: clean up net/ipv4/ip_sockglue.c tcp_output.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 00:27:11 -08:00
Jianjun Kong a7e9ff735b net: clean up net/ipv4/igmp.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 00:26:09 -08:00
Jianjun Kong 6ed2533e55 net: clean up net/ipv4/fib_frontend.c fib_hash.c ip_gre.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 00:25:16 -08:00
Jianjun Kong 5a5f3a8db9 net: clean up net/ipv4/ipip.c raw.c tcp.c tcp_minisocks.c tcp_yeah.c xfrm4_policy.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 00:24:34 -08:00
Jianjun Kong d9319100c1 net: clean up net/ipv4/ah4.c esp4.c fib_semantics.c inet_connection_sock.c inetpeer.c ip_output.c
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 00:23:42 -08:00
David S. Miller e0db4a786b sunrpc: Fix build warning due to typo in %pI4 format changes.
Noticed by Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02 23:57:06 -08:00
Wei Yongjun 0856f93958 udp: Fix the SNMP counter of UDP_MIB_INERRORS
UDP packets received in udpv6_recvmsg() are not only IPv6 UDP packets, but
also have IPv4 UDP packets, so when do the counter of UDP_MIB_INERRORS in
udpv6_recvmsg(), we should check whether the packet is a IPv6 UDP packet
or a IPv4 UDP packet.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02 23:52:46 -08:00
Wei Yongjun f26ba17511 udp: Fix the SNMP counter of UDP_MIB_INDATAGRAMS
If UDP echo is sent to xinetd/echo-dgram, the UDP reply will be received
at the sender. But the SNMP counter of UDP_MIB_INDATAGRAMS will be not
increased, UDP6_MIB_INDATAGRAMS will be increased instead.

  Endpoint A                      Endpoint B
  UDP Echo request ----------->
  (IPv4, Dst port=7)
                   <----------    UDP Echo Reply
                                  (IPv4, Src port=7)

This bug is come from this patch cb75994ec3.

It do counter UDP[6]_MIB_INDATAGRAMS until udp[v6]_recvmsg. Because
xinetd used IPv6 socket to receive UDP messages, thus, when received
UDP packet, the UDP6_MIB_INDATAGRAMS will be increased in function
udpv6_recvmsg() even if the packet is a IPv4 UDP packet.

This patch fixed the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02 23:52:45 -08:00
Julius Volz 20971a0afb IPVS: Add IPv6 support to SH and DH schedulers
Add IPv6 support to SH and DH schedulers. I hope this simple IPv6 address
hashing is good enough. The 128 bit are just XORed into 32 before hashing
them like an IPv4 address.

Signed-off-by: Julius Volz <julius.volz@gmail.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02 23:52:01 -08:00
Linus Torvalds 391e572cd1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
  af_unix: netns: fix problem of return value
  IRDA: remove double inclusion of module.h
  udp: multicast packets need to check namespace
  net: add documentation for skb recycling
  key: fix setkey(8) policy set breakage
  bpa10x: free sk_buff with kfree_skb
  xfrm: do not leak ESRCH to user space
  net: Really remove all of LOOPBACK_TSO code.
  netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys()
  netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys
  net: delete excess kernel-doc notation
  pppoe: Fix socket leak.
  gianfar: Don't reset TBI<->SerDes link if it's already up
  gianfar: Fix race in TBI/SerDes configuration
  at91_ether: request/free GPIO for PHY interrupt
  amd8111e: fix dma_free_coherent context
  atl1: fix vlan tag regression
  SMC91x: delete unused local variable "lp"
  myri10ge: fix stop/go mmio ordering
  bonding: fix panic when taking bond interface down before removing module
  ...
2008-11-02 10:15:52 -08:00
Jarek Poplawski 8ba25dad0a sch_netem: Replace ->requeue() method with open code
After removing netem classful functionality we are sure its inner
qdisc is tfifo, so we can replace qdisc->ops->requeue() method with
open code. After this patch there are no more ops->requeue() users.

The idea of this patch is by Patrick McHardy.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02 00:36:03 -07:00
Jarek Poplawski 0220146411 sch_netem: Remove classful functionality
Patrick McHardy noticed that: "a lot of the functionality of netem
requires the inner tfifo anyways and rate-limiting is usually done
on top of netem. So I would suggest so either hard-wire the tfifo
qdisc or at least make the assumption that inner qdiscs are
work-conserving.", and later: "- a lot of other qdiscs still don't
work as inner qdiscs of netem [...]".

So, according to his suggestion, this patch removes classful options
of netem. The main reason of this change is to remove ops->requeue()
method, which is currently used only by netem.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02 00:35:24 -07:00
Sangtae Ha ae27e98a51 [TCP] CUBIC v2.3
Signed-off-by: Sangtae Ha <sha2@ncsu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-02 00:28:10 -07:00
Jianjun Kong e27dfcea48 af_unix: clean up net/unix/af_unix.c garbage.c sysctl_net_unix.c
clean up net/unix/af_unix.c garbage.c sysctl_net_unix.c

Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01 21:38:31 -07:00
Jianjun Kong 48dcc33e5e af_unix: netns: fix problem of return value
fix problem of return value

net/unix/af_unix.c: unix_net_init()
when error appears, it should return 'error', not always return 0.

Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01 21:37:27 -07:00
Eric Dumazet 920a46115c udp: multicast packets need to check namespace
Current UDP multicast delivery is not namespace aware.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01 21:22:23 -07:00
Eric Dumazet c37ccc0d4e udp: add a missing smp_wmb() in udp_lib_get_port()
Corey Minyard spotted a missing memory barrier in udp_lib_get_port()

We need to make sure a reader cannot read the new 'sk->sk_next' value
and previous value of 'sk->sk_hash'. Or else, an item could be deleted
from a chain, and inserted into another chain. If new chain was empty
before the move, 'next' pointer is NULL, and lockless reader can
not detect it missed following items in original chain.

This patch is temporary, since we expect an upcoming patch
to introduce another way of handling the problem.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01 21:19:18 -07:00
Nicolas Dichtel 7e3a42a12c xfrm6: handling fragment
RFC4301 Section 7.1 says:

"7.1.  Tunnel Mode SAs that Carry Initial and Non-Initial Fragments

     All implementations MUST support tunnel mode SAs that are configured
     to pass traffic without regard to port field (or ICMP type/code or
     Mobility Header type) values.  If the SA will carry traffic for
     specified protocols, the selector set for the SA MUST specify the
     port fields (or ICMP type/code or Mobility Header type) as ANY.  An
     SA defined in this fashion will carry all traffic including initial
     and non-initial fragments for the indicated Local/Remote addresses
     and specified Next Layer protocol(s)."

But for IPv6, fragment is treated as a protocol.  This change catches
protocol transported in fragmented packet.  In IPv4, there is no
problem.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01 21:12:07 -07:00
Stephen Hemminger d1a203eac0 net: add documentation for skb recycling
Commit 04a4bb55bc ("net: add
skb_recycle_check() to enable netdriver skb recycling") added a
method for network drivers to recycle skbuffs, but while use of
this mechanism was documented in the commit message, it should
really have been added as a docbook comment as well -- this
patch does that.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-01 21:01:09 -07:00
Al Viro 233e70f422 saner FASYNC handling on file close
As it is, all instances of ->release() for files that have ->fasync()
need to remember to evict file from fasync lists; forgetting that
creates a hole and we actually have a bunch that *does* forget.

So let's keep our lives simple - let __fput() check FASYNC in
file->f_flags and call ->fasync() there if it's been set.  And lose that
crap in ->release() instances - leaving it there is still valid, but we
don't have to bother anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01 09:49:46 -07:00
Alexey Dobriyan 920da6923c key: fix setkey(8) policy set breakage
Steps to reproduce:

	#/usr/sbin/setkey -f
	flush;
	spdflush;

	add 192.168.0.42 192.168.0.1 ah 24500 -A hmac-md5 "1234567890123456";
	add 192.168.0.42 192.168.0.1 esp 24501 -E 3des-cbc "123456789012123456789012";

	spdadd 192.168.0.42 192.168.0.1 any -P out ipsec
		esp/transport//require
		ah/transport//require;

setkey: invalid keymsg length

Policy dump will bail out with the same message after that.

-recv(4, "\2\16\0\0\32\0\3\0\0\0\0\0\37\r\0\0\3\0\5\0\377 \0\0\2\0\0\0\300\250\0*\0"..., 32768, 0) = 208
+recv(4, "\2\16\0\0\36\0\3\0\0\0\0\0H\t\0\0\3\0\5\0\377 \0\0\2\0\0\0\300\250\0*\0"..., 32768, 0) = 208

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 16:41:26 -07:00
Johannes Berg e25cf4a694 mac80211: fix two kernel-doc warnings
One parameter wasn't described and one I forgot to update when
renaming it; also update TBDs in sta_info.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:02:36 -04:00
Johannes Berg 84fa4f43c4 wireless regulatory: move ignore_request
This function is only used once, move it closer to its caller.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:02:32 -04:00
Johannes Berg 2083c4997b wireless: clean up regulatory ignore_request function
This function has a few WARNs that may eventually trigger
when an AP sends rogue beacons, those must be removed. Some
of the comments in the function are also inappropriate as
this function is concerned with the global hint, not a per-
wiphy thing (which a multidomain flag on a wiphy would imply).

I'm convinced that we don't need to do anything to implement
multi-domain capability as 802.11-2007 specifies it because
it makes only two things mandatory:
 * starting of BSS/IBSS must have country information
   (this can easily be done with a mac80211 patch)
 * a STA must adopt the country information (we already have
   the framework for this)

But we don't have anything implemented anyway for now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:02:31 -04:00
Johannes Berg be3d48106c wireless: remove struct regdom hinting
The code needs to be split out and cleaned up, so as a
first step remove the capability, to add it back in a
subsequent patch as a separate function. Also remove the
publically facing return value of the function and the
wiphy argument. A number of internal functions go from
being generic helpers to just being used for alpha2
setting.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:02:30 -04:00
Johannes Berg d2372b3152 wireless: make regdom passing semantics simpler
The regdom struct is given to the core, so it might as well
free it in error conditions.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:02:30 -04:00
Sujith 8b30b1fe36 mac80211: Re-enable aggregation
Wireless HW without any dedicated queues for aggregation
do not need the ampdu_queues mechanism present right now
in mac80211. Since mac80211 is still incomplete wrt TX MQ
changes, do not allow aggregation sessions for drivers that
set ampdu_queues.

This is only an interim hack until Intel fixes the requeue issue.

Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Signed-off-by: Luis Rodriguez <Luis.Rodriguez@Atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:02:14 -04:00
Andrey Yurovsky 4393dce940 mac80211: allow all interfaces types to handle RX action frames
Eliminate the vif.type check in ieee80211_rx_h_action.  This check is
unnecessary (these action frames can be handled by all interface types) and
currently prevents, for example, AP interfaces from handling BACK action frames
such as ADDBA and DELBA requests.

Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:56 -04:00
Johannes Berg f3e63db2e5 wireless: remove write-only 'granted' variable
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:56 -04:00
Sujith 075cbc9eb1 mac80211: Change WARN_ON to WARN_ON_ONCE
A warning would be printed for every packet that
is transmitted if the rate control information isn't
setup. Change this to WARN_ON_ONCE.

Signed-off-by: Sujith <Sujith.Manoharan@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:54 -04:00
Luis R. Rodriguez d9d2925713 mac80211: make use of regulatory tx power settings on change of tx power
We do not know what max power to allow until a device is targeting
a channel, therefore only allow changing tx power if a channel is defined.
Also make use of the channel's max power setting as defined by
regulatory rules before allowing the user to use the requested power
setting. If the user asked us to figure it out we use the max allowed
by regulatory.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:52 -04:00
Rami Rosen e2ef12d3fd mac80211: check return value of dev_alloc_skb() in ieee80211_sta_join_ibss().
This patch add a check on the return value of dev_alloc_skb() in
ieee80211_sta_join_ibss() in net/mac80211/mlme.c.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:51 -04:00
John W. Linville 7211801527 wireless: avoid some net/ieee80211.h vs. linux/ieee80211.h conflicts
There is quite a lot of overlap in definitions between these headers...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:50 -04:00
John W. Linville 9387b7caf3 wireless: use individual buffers for printing ssid values
Also change escape_ssid to print_ssid to match print_mac semantics.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:50 -04:00
John W. Linville 2819f8ad6d wireless: escape_ssid should handle non-printables
Also use common backslash sequences like \t, \n, \r, and \\ as well as \0.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:49 -04:00
John W. Linville c5d3dce875 wireless: remove NETWORK_EMPTY_ESSID flag
It is unnecessary and of questionable value.  Also remove
is_empty_ssid, as it is also unnecessary.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:48 -04:00
John W. Linville 7e272fcff6 wireless: consolidate on a single escape_essid implementation
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:46 -04:00
Johannes Berg ddf4ac53fb mac80211: insert AP sta entry after filling it
We never clearly defined the semantics of the sta_notify callback
and it was originally posted for iwlwifi which still doesn't use
it at all. With the recent HT rework ath9k started relying on it,
but I made a mistake there in that I made ath9k assume the HT
information has already been filled in at sta_notify time. This
isn't a hard thing to do, so do it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:46 -04:00
Johannes Berg ac9440a4e4 wireless: fix EU check
http://en.wikipedia.org/wiki/De_Morgan%27s_laws is useful.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:44 -04:00
Johannes Berg f6037d09e2 wireless: get rid of pointless request list
We really only need to know the last request at each point in time.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:43 -04:00
Johannes Berg f3b407fba5 wireless: remove cfg80211_reg_mutex
This mutex is wrong, we use cfg80211_drv_mutex (which should
possibly be renamed to just cfg80211_mutex) everywhere except
in one place, fix that and get rid of the extra mutex.

Also get rid of a spurious regulatory_requests list definition.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:42 -04:00
Johannes Berg cf03268e6e wireless: don't publish __regulatory_hint
This function requires an internal lock to be held, so it cannot
be published to other modules in the kernel.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:41 -04:00
colin@cozybit.com 93da9cc17c Add nl80211 commands to get and set o11s mesh networking parameters
The two new commands are NL80211_CMD_GET_MESH_PARAMS and
NL80211_CMD_SET_MESH_PARAMS. There is a new attribute enum,
NL80211_ATTR_MESH_PARAMS, which enumerates the various mesh configuration
parameters.

Moved struct mesh_config from mac80211/ieee80211_i.h to net/cfg80211.h.
nl80211_get_mesh_params and nl80211_set_mesh_params unpack the netlink messages
and ask the driver to get or set the configuration.  This is done via two new
function stubs, get_mesh_params and set_mesh_params, in struct cfg80211_ops.

Signed-off-by: Colin McCabe <colin@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:39 -04:00
Johannes Berg 4a68ec535e mac80211: inform userspace of probe/auth/assoc timeout
I noticed that when for some reason [1] the probe or auth times
out, wpa_supplicant doesn't realise this and only tries the next
AP when it runs into its own timeout, which is ten seconds, and
that's quite long. Fix this by making mac80211 notify userspace
that it didn't associate.

[1] my wrt350n in mixed B/G/HT mode often runs into this, maybe
it's because one of the antennas is broken off and for whatever
reason it decides to use that antenna to transmit the response
frames (auth, probe); I do see beacons fine so it's not totally
broken. Works fine in pure-G mode.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:38 -04:00
Johannes Berg 50fb2e4572 mac80211: remove rate_control_clear
"Clearing" the rate control algorithm is pointless, none of
the algorithms actually uses this operation and it's not even
invoked properly for all channel switching. Also, there's no
need to since rate control algorithms work per station.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:37 -04:00
Felix Fietkau f4a8cd94fc minstrel: improve performance for non-MRR drivers
This patch enhances minstrel's performance for non-MRR setups,
by preventing it from sampling slower rates with >95% success
probability and by putting at least 1 non-sample frame between
several sample frames.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:36 -04:00
Johannes Berg 0a9542ee12 nl80211: fix monitor flags
NLA_NESTED attributes cannot be empty, but we want to be able to
specify "no flags" (empty attribute) vs. "no change" (no attribute).
Therefore, remove the NLA_NESTED policy so it can work as an empty
attribute.

I guess I should have used a u32 for these flags instead, but we're
stuck with it now. Haven't noticed earlier because of a bug in iw...

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:35 -04:00
Johannes Berg 4acf074971 make ieee80211 invisible
This makes CONFIG_IEEE80211 invisible. The drivers that require it
(ipw2100, ipw2200, hostap) select it, and everybody else really
shouldn't even think about using it. Also, since there really is
no point in compiling anything without crypto support these days,
remove the crypto options and just enable them, leaving only the
debugging option which only shows up when a driver is select that
requires it. This makes it hard to enable, but most people wouldn't
want to anyway.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:25 -04:00
Johannes Berg e6a9854b05 mac80211/drivers: rewrite the rate control API
So after the previous changes we were still unhappy with how
convoluted the API is and decided to make things simpler for
everybody. This completely changes the rate control API, now
taking into account 802.11n with MCS rates and more control,
most drivers don't support that though.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:23 -04:00
Johannes Berg cb121bad67 mac80211: add might_sleep to hw_config
Just to catch bugs when changing mac80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:22 -04:00
Johannes Berg ae5eb02641 mac80211: rewrite HT handling
The HT handling has the following deficiencies, which I've
(partially) fixed:
 * it always uses the AP info even if there is no AP,
   hence has no chance of working as an AP
 * it pretends to be HW config, but really is per-BSS
 * channel sanity checking is left to the drivers
 * it generally lets the driver control too much

HT enabling is still wrong with this patch if you have more than
one virtual STA mode interface, but that never happens currently.
Once WDS, IBSS or AP/VLAN gets HT capabilities, it will also be
wrong, see the comment in ieee80211_enable_ht().

Additionally, this fixes a number of bugs:
 * mac80211: ieee80211_set_disassoc doesn't notify the driver any
             more since the refactoring
 * iwl-agn-rs: always uses the HT capabilities from the wrong stuff
               mac80211 gives it rather than the actual peer STA
 * ath9k: a number of bugs resulting from the broken HT API

I'm not entirely happy with putting the HT capabilities into
struct ieee80211_sta as restricted to our own HT TX capabilities,
but I see no cleaner solution for now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:16 -04:00
Johannes Berg bda3933a8a mac80211: move bss_conf into vif
Move bss_conf into the vif struct so that drivers can
access it during ->tx without having to store it in
the private data or similar. No driver updates because
this is only for when they want to start using it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:15 -04:00
Johannes Berg 9124b07740 mac80211: make retry limits part of hw config
Instead of having a separate callback, use the HW config callback
with a new flag to change retry limits.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:14 -04:00
Johannes Berg d51626df57 nl80211: export HT capabilities
This exports the local HT capabilities in nl80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:13 -04:00
Johannes Berg 94778280fa mac80211: provide sequence numbers
I've come to think that not providing sequence numbers for
the normal STA mode case was a mistake, at least two drivers
now had to implement code they wouldn't otherwise need, and
I believe at76_usb and adm8211 might be broken.

This patch makes mac80211 assign a sequence number to all
those frames that need one except beacons. That means that
if a driver only implements modes that do not do beaconing
it need not worry about the sequence number.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:12 -04:00
Henrique de Moraes Holschuh 78236571a5 rfkill: rate-limit rfkill-input workqueue usage (v3)
Limit the number of "expensive" rfkill workqueue operations per second, in
order to not hog system resources too much when faced with a rogue source
of rfkill input events.

The old rfkill-input code (before it was refactored) had such a limit in
place.  It used to drop new events that were past the rate limit.  This
behaviour was not implemented as an anti-DoS measure, but rather as an
attempt to work around deficiencies in input device drivers which would
issue multiple KEY_FOO events too soon for a given key FOO (i.e. ones that
do not implement mechanical debouncing properly).

However, we can't really expect such issues to be worked around by every
input handler out there, and also by every userspace client of input
devices.  It is the input device driver's responsability to do debouncing
instead of spamming the input layer with bogus events.

The new limiter code is focused only on anti-DoS behaviour, and tries to
not lose events (instead, it coalesces them when possible).

The transmitters are updated once every 200ms, maximum.  Care is taken not
to delay a request to _enter_ rfkill transmitter Emergency Power Off (EPO)
mode.

If mistriggered (e.g. by a jiffies counter wrap), the code delays processing
*once* by 200ms.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:10 -04:00
Henrique de Moraes Holschuh 176707997b rfkill: honour EPO state when resuming a rfkill controller
rfkill_resume() would always restore the rfkill controller state to its
pre-suspend state.

Now that we know when we are under EPO, kick the rfkill controller to
SOFT_BLOCKED state instead of to its pre-suspend state when it is resumed
while EPO mode is active.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:10 -04:00
Henrique de Moraes Holschuh d003922dab rfkill: add master_switch_mode and EPO lock to rfkill and rfkill-input
Add of software-based sanity to rfkill and rfkill-input so that it can
reproduce what hardware-based EPO switches do, blocking all transmitters
and locking down any further attempts to unblock them until the switch is
deactivated.

rfkill-input is responsible for issuing the EPO control requests, like
before.

While an rfkill EPO is active, all transmitters are locked to one of the
BLOCKED states and all attempts to change that through the rfkill API
(userspace and kernel) will be either ignored or return -EPERM errors.

The lock will be released upon receipt of EV_SW SW_RFKILL_ALL ON by
rfkill-input, or should modular rfkill-input be unloaded.

This makes rfkill and rfkill-input extend the operation of an existing
wireless master kill switch to all wireless devices in the system, even
those that are not under hardware or firmware control.

Since the above is the expected operational behavior for the master rfkill
switch, the EPO lock functionality is not optional.

Also, extend rfkill-input to allow for three different behaviors when it
receives an EV_SW SW_RFKILL_ALL ON input event.  The user can set which
behavior he wants through the master_switch_mode parameter:

master_switch_mode = 0: EV_SW SW_RFKILL_ALL ON just unlocks rfkill
controller state changes (so that the rfkill userspace and kernel APIs can
now be used to change rfkill controller states again), but doesn't change
any of their states (so they will all remain blocked).  This is the safest
mode of operation, as it requires explicit operator action to re-enable a
transmitter.

master_switch_mode = 1: EV_SW SW_RFKILL_ALL ON causes rfkill-input to
attempt to restore the system to the state before the last EV_SW
SW_RFKILL_ALL OFF event, or to the default global states if no EV_SW
SW_RFKILL_ALL OFF ever happened.   This is the recommended mode of
operation for laptops.

master_switch_mode = 2: tries to unblock all rfkill controllers (i.e.
enable all transmitters) when an EV_SW SW_RFKILL_ALL ON event is received.
This is the default mode of operation, as it mimics the previous behavior
of rfkill-input.

In order to implement these features in a clean way, the entire event
handling of rfkill-input was refactored into a single worker function.

Protection against input event DoS (repeatedly firing rfkill events for
rfkill-input to process) was removed during the code refactoring.  It will
be added back in a future patch.

Note that with these changes, rfkill-input doesn't need to explicitly
handle any radio types for which KEY_<radio type> or SW_<radio type> events
do not exist yet.

Code to handle EV_SW SW_{WLAN,WWAN,BLUETOOTH,WIMAX,...} was added as it
might be needed in the future (and its implementation is not that obvious),
but is currently #ifdef'd out to avoid wasting resources.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ivo van Doorn <IvDoorn@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:09 -04:00
Henrique de Moraes Holschuh 68d2413bec rfkill: export global states to rfkill-input
Export the the global switch states to rfkill-input.  This is needed to
properly implement KEY_* handling without disregarding the initial state.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:09 -04:00
Henrique de Moraes Holschuh cf4b4aab55 rfkill: use killable locks instead of interruptible
Apparently, many applications don't expect to get EAGAIN from fd read/write
operations, since POSIX doesn't mandate it.

Use mutex_lock_killable instead of mutex_lock_interruptible, which won't
cause issues.

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:08 -04:00
Johannes Berg e8975581f6 mac80211: introduce hw config change flags
This makes mac80211 notify the driver which configuration
actually changed, e.g. channel etc.

No driver changes, this is just plumbing, driver authors are
expected to act on this if they want to.

Also remove the HW CONFIG debug printk, it's incorrect, often
we configure something else.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:07 -04:00
Johannes Berg 0f4ac38b59 mac80211: kill hw.conf.antenna_sel_{rx,tx}
Never actually used.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:06 -04:00
Johannes Berg d9fe60dea7 802.11: clean up/fix HT support
This patch cleans up a number of things:
 * the unusable definition of the HT capabilities/HT information
   information elements
 * variable names that are hard to understand
 * mac80211: move ieee80211_handle_ht to ht.c and remove the unused
             enable_ht parameter
 * mac80211: fix bug with MCS rate 32 in ieee80211_handle_ht
 * mac80211: fix bug with casting the result of ieee80211_bss_get_ie
             to an information element _contents_ rather than the
             whole element, add size checking (another out-of-bounds
             access bug fixed!)
 * mac80211: remove some unused return values in favour of BUG_ON
             checking
 * a few minor other things

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:06 -04:00
Rami Rosen 1397dcebd8 mac80211: remove unused declaration of struct sta_attribute.
This patch removes unused definition of struct sta_attribute
in net/mac80211/ieee80211_i.h.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 19:00:02 -04:00
Johannes Berg 7a5158ef8d mac80211: fix short slot handling
This patch makes mac80211 handle short slot requests from the AP
properly. Also warn about uses of IEEE80211_CONF_SHORT_SLOT_TIME
and optimise out the code since it cannot ever be hit anyway.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 18:58:53 -04:00
Johannes Berg e87a2feea7 mac80211: remove max_antenna_gain config
The antenna gain isn't exactly configurable, despite the belief of
some unnamed individual who thinks that the EEPROM might influence
it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 18:06:01 -04:00
Johannes Berg d73782fdde mac80211: clean up ieee80211_hw_config errors
Warn when ieee80211_hw_config returns an error, it shouldn't
happen; remove a number of printks that would happen in such
a case and one printk that is user-triggerable.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 18:06:00 -04:00
Johannes Berg 3db594380b mac80211: remove wiphy_to_hw
This isn't used by anyone, if we ever need it we can add
it back, until then it's useless.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 18:06:00 -04:00
Johannes Berg c6a1fa12d2 mac80211: minor code cleanups
Nothing very interesting, some checkpatch inspired stuff,
some other things.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 18:05:59 -04:00
Johannes Berg 36ff382d00 mac80211: remove writable debugs mesh parameters
These parameters shouldn't be configurable via debugfs, if they
need to be configurable nl80211 support has to be added, if not
then they don't need to be writable here either.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Javier Cardona <javier@cozybit.com>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 18:05:57 -04:00
Johannes Berg 804feeb826 mac80211: remove aggregation status write support from debugfs
This code uses static variables and thus cannot be kept.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-31 18:05:57 -04:00
Harvey Harrison 21454aaad3 net: replace NIPQUAD() in net/*/
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:54:56 -07:00
Harvey Harrison 14d5e834f6 net: replace NIPQUAD() in net/netfilter/
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:54:29 -07:00
Harvey Harrison 673d57e723 net: replace NIPQUAD() in net/ipv4/ net/ipv6/
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:53:57 -07:00
Harvey Harrison cffee385d7 net: replace NIPQUAD() in net/ipv4/netfilter/
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:53:08 -07:00
Jarek Poplawski 77be155cba pkt_sched: Add peek emulation for non-work-conserving qdiscs.
This patch adds qdisc_peek_dequeued() wrapper to emulate peek method
with qdisc->dequeue() and storing "peeked" skb in qdisc->gso_skb until
dequeuing. This is mainly for compatibility reasons not to break some
strange configs because peeking is expected for non-work-conserving
parent qdiscs to query work-conserving child qdiscs.

This implementation requires using qdisc_dequeue_peeked() wrapper
instead of directly calling qdisc->dequeue() for all qdiscs ever
querried with qdisc->ops->peek() or qdisc_peek_dequeued().

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:47:01 -07:00
Jarek Poplawski 03c05f0d4b pkt_sched: Use qdisc->ops->peek() instead of ->dequeue() & ->requeue()
Use qdisc->ops->peek() instead of ->dequeue() & ->requeue() pair.
After this patch the only remaining user of qdisc->ops->requeue() is
netem_enqueue(). Based on ideas of Herbert Xu, Patrick McHardy and
David S. Miller.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:46:19 -07:00
Jarek Poplawski 8e3af97899 pkt_sched: Add qdisc->ops->peek() implementation.
Add qdisc->ops->peek() implementation for work-conserving qdiscs.
With feedback from Patrick McHardy.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:45:55 -07:00
Jarek Poplawski 99c0db2679 pkt_sched: sch_generic: Add generic qdisc->ops->peek() implementation.
With feedback from Patrick McHardy.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:45:27 -07:00
Patrick McHardy 48a8f519e0 pkt_sched: Add ->peek() methods for fifo, prio and SFQ qdiscs.
From: Patrick McHardy <kaber@trash.net>

Just as a demonstration how easy adding a peek operation to the
work-conserving qdiscs actually is. It doesn't need to keep or change
any internal state in many cases thanks to the guarantee that the
packet will either be dequeued or, if another packet arrives, the
upper qdisc will immediately ->peek again to reevaluate the state.

(This is only slightly modified Patrick's patch.)

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:44:18 -07:00
Alexey Dobriyan d5917a35ac xfrm: C99 for xfrm_dev_notifier
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:41:59 -07:00
David S. Miller a1744d3bee Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/p54/p54common.c
2008-10-31 00:17:34 -07:00
fernando@oss.ntt.co a432226614 xfrm: do not leak ESRCH to user space
I noticed that, under certain conditions, ESRCH can be leaked from the
xfrm layer to user space through sys_connect. In particular, this seems
to happen reliably when the kernel fails to resolve a template either
because the AF_KEY receive buffer being used by racoon is full or
because the SA entry we are trying to use is in XFRM_STATE_EXPIRED
state.

However, since this could be a transient issue it could be argued that
EAGAIN would be more appropriate. Besides this error code is not even
documented in the man page for sys_connect (as of man-pages 3.07).

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:06:03 -07:00
David S. Miller e5e7ad44d0 Merge branch 'master' of git://git.infradead.org/users/pcmoore/lblnet-2.6 2008-10-30 23:57:40 -07:00
Alexey Dobriyan 61e5744849 netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys()
register_pernet_gen_device() can't be used is nf_conntrack_pptp module is
also used (compiled in or loaded).

Right now, proto_gre_net_exit() is called before nf_conntrack_pptp_net_exit().
The former shutdowns and frees GRE piece of netns, however the latter
absolutely needs it to flush keymap. Oops is inevitable.

Switch to shiny new register_pernet_gen_subsys() to get correct ordering in
netns ops list.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-30 23:55:44 -07:00
Alexey Dobriyan 485ac57bc1 netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys
netns ops which are registered with register_pernet_gen_device() are
shutdown strictly before those which are registered with
register_pernet_subsys(). Sometimes this leads to opposite (read: buggy)
shutdown ordering between two modules.

Add register_pernet_gen_subsys()/unregister_pernet_gen_subsys() for modules
which aren't elite enough for entry in struct net, and which can't use
register_pernet_gen_device(). PPTP conntracking module is such one.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-30 23:55:16 -07:00
Eric Dumazet c8db3fec5b udp: Should use spin_lock_bh()/spin_unlock_bh() in udp_lib_unhash()
Spotted by Alexander Beregalov

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-30 14:00:53 -07:00
Linus Torvalds 1b2d3d94ec Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  SUNRPC: Fix potential race in put_rpccred()
  SUNRPC: Fix rpcauth_prune_expired
  NFS: Convert nfs_attr_generation_counter into an atomic_long
  SUNRPC: Respond promptly to server TCP resets
2008-10-30 12:51:42 -07:00
Manish Katiyar 47b676c0e0 netlabel: Fix compilation warnings in net/netlabel/netlabel_addrlist.c
Enable netlabel auditing functions only when CONFIG_AUDIT is set

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Paul Moore <paul.moore@hp.com>
2008-10-30 10:44:48 -04:00
Paul Moore f8a024796b netlabel: Fix compiler warnings in netlabel_mgmt.c
Fix the compiler warnings below, thanks to Andrew Morton for finding them.

 net/netlabel/netlabel_mgmt.c: In function `netlbl_mgmt_listentry':
 net/netlabel/netlabel_mgmt.c:268: warning: 'ret_val' might be used
  uninitialized in this function

Signed-off-by: Paul Moore <paul.moore@hp.com>
2008-10-29 16:09:12 -04:00
roel kluin 00af5c6959 cipso: unsigned buf_len cannot be negative
unsigned buf_len cannot be negative

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Paul Moore <paul.moore@hp.com>
2008-10-29 15:55:53 -04:00
Harvey Harrison 5b095d9892 net: replace %p6 with %pI6
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 12:52:50 -07:00
Harvey Harrison 4b7a4274ca net: replace %#p6 format specifier with %pi6
gcc warns when using the # modifier with the %p format specifier,
so we can't use this to omit the colons when needed, introduces
%pi6 instead.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 12:50:24 -07:00
Eric Dumazet 96631ed16c udp: introduce sk_for_each_rcu_safenext()
Corey Minyard found a race added in commit 271b72c7fa
(udp: RCU handling for Unicast packets.)

 "If the socket is moved from one list to another list in-between the
 time the hash is calculated and the next field is accessed, and the
 socket has moved to the end of the new list, the traversal will not
 complete properly on the list it should have, since the socket will
 be on the end of the new list and there's not a way to tell it's on a
 new list and restart the list traversal.  I think that this can be
 solved by pre-fetching the "next" field (with proper barriers) before
 checking the hash."

This patch corrects this problem, introducing a new
sk_for_each_rcu_safenext() macro.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 11:19:58 -07:00
Eric Dumazet f52b5054ec udp: udp_get_next() should use spin_unlock_bh()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 11:19:11 -07:00
Eric Dumazet 8203efb3c6 udp: calculate udp_mem based on low memory instead of all memory
This patch mimics commit 57413ebc4e
(tcp: calculate tcp_mem based on low memory instead of all memory)

The udp_mem array which contains limits on the total amount of memory
used by UDP sockets is calculated based on nr_all_pages.  On a 32 bits
x86 system, we should base this on the number of lowmem pages.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 02:32:32 -07:00
Eric Dumazet 271b72c7fa udp: RCU handling for Unicast packets.
Goals are :

1) Optimizing handling of incoming Unicast UDP frames, so that no memory
 writes should happen in the fast path.

 Note: Multicasts and broadcasts still will need to take a lock,
 because doing a full lockless lookup in this case is difficult.

2) No expensive operations in the socket bind/unhash phases :
  - No expensive synchronize_rcu() calls.

  - No added rcu_head in socket structure, increasing memory needs,
  but more important, forcing us to use call_rcu() calls,
  that have the bad property of making sockets structure cold.
  (rcu grace period between socket freeing and its potential reuse
   make this socket being cold in CPU cache).
  David did a previous patch using call_rcu() and noticed a 20%
  impact on TCP connection rates.
  Quoting Cristopher Lameter :
   "Right. That results in cacheline cooldown. You'd want to recycle
    the object as they are cache hot on a per cpu basis. That is screwed
    up by the delayed regular rcu processing. We have seen multiple
    regressions due to cacheline cooldown.
    The only choice in cacheline hot sensitive areas is to deal with the
    complexity that comes with SLAB_DESTROY_BY_RCU or give up on RCU."

  - Because udp sockets are allocated from dedicated kmem_cache,
  use of SLAB_DESTROY_BY_RCU can help here.

Theory of operation :
---------------------

As the lookup is lockfree (using rcu_read_lock()/rcu_read_unlock()),
special attention must be taken by readers and writers.

Use of SLAB_DESTROY_BY_RCU is tricky too, because a socket can be freed,
reused, inserted in a different chain or in worst case in the same chain
while readers could do lookups in the same time.

In order to avoid loops, a reader must check each socket found in a chain
really belongs to the chain the reader was traversing. If it finds a
mismatch, lookup must start again at the begining. This *restart* loop
is the reason we had to use rdlock for the multicast case, because
we dont want to send same message several times to the same socket.

We use RCU only for fast path.
Thus, /proc/net/udp still takes spinlocks.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 02:11:14 -07:00
Eric Dumazet 645ca708f9 udp: introduce struct udp_table and multiple spinlocks
UDP sockets are hashed in a 128 slots hash table.

This hash table is protected by *one* rwlock.

This rwlock is readlocked each time an incoming UDP message is handled.

This rwlock is writelocked each time a socket must be inserted in
hash table (bind time), or deleted from this table (close time)

This is not scalable on SMP machines :

1) Even in read mode, lock() and unlock() are atomic operations and
 must dirty a contended cache line, shared by all cpus.

2) A writer might be starved if many readers are 'in flight'. This can
 happen on a machine with some NIC receiving many UDP messages. User
 process can be delayed a long time at socket creation/dismantle time.

This patch prepares RCU migration, by introducing 'struct udp_table
and struct udp_hslot', and using one spinlock per chain, to reduce
contention on central rwlock.

Introducing one spinlock per chain reduces latencies, for port
randomization on heavily loaded UDP servers. This also speedup
bindings to specific ports.

udp_lib_unhash() was uninlined, becoming to big.

Some cleanups were done to ease review of following patch
(RCUification of UDP Unicast lookups)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 01:41:45 -07:00
Harvey Harrison b189db5d29 net: remove NIP6(), NIP6_FMT, NIP6_SEQFMT and final users
Open code NIP6_FMT in the one call inside sscanf and one user
of NIP6() that could use %p6 in the netfilter code.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 23:02:38 -07:00
Stephen Hemminger b30200616f vlan: propogate ethtool speed values
This enables more ethtool information. The speed and settings of the
underlying device are propagated up. This makes services like SNMP that
use ethtool to get speed setting, work when managing a vlan, without adding
silly heurtistics into SNMP daemon.

For the driver info, just use existing driver strings.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 23:02:34 -07:00
Harvey Harrison fdb46ee752 net, misc: replace uses of NIP6_FMT with %p6
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 23:02:32 -07:00
Harvey Harrison 0c6ce78abf net: replace uses of NIP6_FMT with %p6
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 23:02:31 -07:00
Harvey Harrison 38ff4fa49b netfilter: replace uses of NIP6_FMT with %p6
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 16:08:13 -07:00
Harvey Harrison b071195deb net: replace all current users of NIP6_SEQFMT with %#p6
The define in kernel.h can be done away with at a later time.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 16:05:40 -07:00
Martin Willi 3a2dfbe8ac xfrm: Notify changes in UDP encapsulation via netlink
Add new_mapping() implementation to the netlink xfrm_mgr to notify
address/port changes detected in UDP encapsulated ESP packets.

Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 16:01:07 -07:00
Alexey Dobriyan 93adcc80f3 net: don't use INIT_RCU_HEAD
call_rcu() will unconditionally rewrite RCU head anyway.
Applies to 
	struct neigh_parms
	struct neigh_table
	struct net
	struct cipso_v4_doi
	struct in_ifaddr
	struct in_device
	rt->u.dst

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 13:25:09 -07:00
Alexey Dobriyan def8b4faff net: reduce structures when XFRM=n
ifdef out
* struct sk_buff::sp		(pointer)
* struct dst_entry::xfrm	(pointer)
* struct sock::sk_policy	(2 pointers)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 13:24:06 -07:00
Jesse Brandeburg 882716604e pktgen: fix multiple queue warning
when testing the new pktgen module with multiple queues and ixgbe with:
	pgset "flag QUEUE_MAP_CPU"

I found that I was getting errors in dmesg like:
pktgen: WARNING: QUEUE_MAP_CPU disabled because CPU count (8) exceeds number
<4>pktgen: WARNING: of tx queues (8) on eth15

you'll note, 8 really doesn't exceed 8.

This patch seemed to fix the logic errors and also the attempts at
limiting line length in printk (which didn't work anyway)

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 13:21:51 -07:00
Trond Myklebust 5f707eb429 SUNRPC: Fix potential race in put_rpccred()
We have to be careful when we try to unhash the credential in
put_rpccred(), because we're not holding the credcache lock, so the call to
rpcauth_unhash_cred() may fail if someone else has looked the cred up, and
obtained a reference to it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-28 15:21:42 -04:00
Trond Myklebust eac0d18d44 SUNRPC: Fix rpcauth_prune_expired
We need to make sure that we don't remove creds from the cred_unused list
if they are still under the moratorium, or else they will never get
garbage collected.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-28 15:21:41 -04:00
Trond Myklebust 2a9e1cfa23 SUNRPC: Respond promptly to server TCP resets
If the server sends us an RST error while we're in the TCP_ESTABLISHED
state, then that will not result in a state change, and so the RPC client
ends up hanging forever (see
http://bugzilla.kernel.org/show_bug.cgi?id=11154)

We can intercept the reset by setting up an sk->sk_error_report callback,
which will then allow us to initiate a proper shutdown and retry...

We also make sure that if the send request receives an ECONNRESET, then we
shutdown too...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-28 15:21:39 -04:00
Patrick McHardy b057efd4d2 netlink: constify struct nlattr * arg to parsing functions
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 11:59:11 -07:00
Eric W. Biederman 3891845e1e netns: Coexist with the sysfs limitations v2
To make testing of the network namespace simpler allow
the network namespace code and the sysfs code to be
compiled and run at the same time.  To do this only
virtual devices are allowed in the additional network
namespaces and those virtual devices are not placed
in the kobject tree.

Since virtual devices don't actually do anything interesting
hardware wise that needs device management there should
be no loss in keeping them out of the kobject tree and
by implication sysfs.  The gain in ease of testing
and code coverage should be significant.

Changelog:

v2: As pointed out by Benjamin Thery it only makes sense to call
    device_rename in the initial network namespace for now.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-27 17:51:47 -07:00
Johannes Berg e174961ca1 net: convert print_mac to %pM
This converts pretty much everything to print_mac. There were
a few things that had conflicts which I have just dropped for
now, no harm done.

I've built an allyesconfig with this and looked at the files
that weren't built very carefully, but it's a huge patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-27 17:06:18 -07:00
Johannes Berg 0c68ae2605 mac80211: convert to %pM away from print_mac
Also remove a few stray DECLARE_MAC_BUF that were no longer
used at all.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-27 17:06:16 -07:00
Neil Horman 1080d709fb net: implement emergency route cache rebulds when gc_elasticity is exceeded
This is a patch to provide on demand route cache rebuilding.  Currently, our
route cache is rebulid periodically regardless of need.  This introduced
unneeded periodic latency.  This patch offers a better approach.  Using code
provided by Eric Dumazet, we compute the standard deviation of the average hash
bucket chain length while running rt_check_expire.  Should any given chain
length grow to larger that average plus 4 standard deviations, we trigger an
emergency hash table rebuild for that net namespace.  This allows for the common
case in which chains are well behaved and do not grow unevenly to not incur any
latency at all, while those systems (which may be being maliciously attacked),
only rebuild when the attack is detected.  This patch take 2 other factors into
account:
1) chains with multiple entries that differ by attributes that do not affect the
hash value are only counted once, so as not to unduly bias system to rebuilding
if features like QOS are heavily used
2) if rebuilding crosses a certain threshold (which is adjustable via the added
sysctl in this patch), route caching is disabled entirely for that net
namespace, since constant rebuilding is less efficient that no caching at all

Tested successfully by me.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-27 17:06:14 -07:00
John W. Linville 51b94bf065 mac80211: correct warnings in minstrel rate control algorithm
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-27 17:46:11 -04:00
Dmitry Baryshkov d8b105f900 RFKILL: fix input layer initialisation
Initialise correctly last fields, so tasks can be actually executed.
On some architectures the initial jiffies value is not zero, so later
all rfkill incorrectly decides that rfkill_*.last is in future.

Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-27 17:46:11 -04:00
Florian Westphal 8b5f12d04b syncookies: fix inclusion of tcp options in syn-ack
David Miller noticed that commit
33ad798c92 '(tcp: options clean up')
did not move the req->cookie_ts check.
This essentially disabled commit 4dfc281702
'[Syncookies]: Add support for TCP options via timestamps.'.

This restores the original logic.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-26 23:10:12 -07:00
Remi Denis-Courmont c3a90c788b Phonet: do not reply to indication reset packets
This fixes a potential error packet loop.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-26 23:07:25 -07:00
Arjan van de Ven 44a504c405 wireless: fix regression caused by regulatory config option
The default for the regulatory compatibility option is wrong;
if you picked the default you ended up with a non-functional wifi
system (at least I did on Fedora 9 with iwl4965).
I don't think even the October 2008 releases of the various distros
has the new userland so clearly the default is wrong, and also
we can't just go about deleting this in 2.6.29...

Change the default to "y" and also adjust the config text a little to
reflect this.

This patch fixes regression #11859

With thanks to Johannes Berg for the diagnostics

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 10:38:52 -07:00
Linus Torvalds 2242d5eff1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
  tcp: Restore ordering of TCP options for the sake of inter-operability
  net: Fix disjunct computation of netdev features
  sctp: Fix to handle SHUTDOWN in SHUTDOWN_RECEIVED state
  sctp: Fix to handle SHUTDOWN in SHUTDOWN-PENDING state
  sctp: Add check for the TSN field of the SHUTDOWN chunk
  sctp: Drop ICMP packet too big message with MTU larger than current PMTU
  p54: enable 2.4/5GHz spectrum by eeprom bits.
  orinoco: reduce stack usage in firmware download path
  ath5k: fix suspend-related oops on rmmod
  [netdrvr] fec_mpc52xx: Implement polling, to make netconsole work.
  qlge: Fix MSI/legacy single interrupt bug.
  smc911x: Make the driver safer on SMP
  smc911x: Add IRQ polarity configuration
  smc911x: Allow Kconfig dependency on ARM
  sis190: add identifier for Atheros AR8021 PHY
  8139x: reduce message severity on driver overlap
  igb: add IGB_DCA instead of selecting INTEL_IOATDMA
  igb: fix tx data corruption with transition to L0s on 82575
  ehea: Fix memory hotplug support
  netdev: DM9000: remove BLACKFIN hacking in DM9000 netdev driver
  ...
2008-10-23 19:19:54 -07:00
Ilpo Järvinen fd6149d332 tcp: Restore ordering of TCP options for the sake of inter-operability
This is not our bug! Sadly some devices cannot cope with the change
of TCP option ordering which was a result of the recent rewrite of
the option code (not that there was some particular reason steming
from the rewrite for the reordering) though any ordering of TCP
options is perfectly legal. Thus we restore the original ordering
to allow interoperability with/through such broken devices and add
some warning about this trap. Since the reordering just happened
without any particular reason, this change shouldn't cost us
anything.

There are already couple of known failure reports (within close
proximity of the last release), so the problem might be more
wide-spread than a single device. And other reports which may
be due to the same problem though the symptoms were less obvious.
Analysis of one of the case revealed (with very high probability)
that sack capability cannot be negotiated as the first option
(SYN never got a response).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Aldo Maggi <sentiniate@tiscali.it>
Tested-by: Aldo Maggi <sentiniate@tiscali.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-23 14:06:35 -07:00
Linus Torvalds 1f6d6e8ebe Merge branch 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
  hrtimers: add missing docbook comments to struct hrtimer
  hrtimers: simplify hrtimer_peek_ahead_timers()
  hrtimers: fix docbook comments
  DECLARE_PER_CPU needs linux/percpu.h
  hrtimers: fix typo
  rangetimers: fix the bug reported by Ingo for real
  rangetimer: fix BUG_ON reported by Ingo
  rangetimer: fix x86 build failure for the !HRTIMERS case
  select: fix alpha OSF wrapper
  select: fix alpha OSF wrapper
  hrtimer: peek at the timer queue just before going idle
  hrtimer: make the futex() system call use the per process slack value
  hrtimer: make the nanosleep() syscall use the per process slack
  hrtimer: fix signed/unsigned bug in slack estimator
  hrtimer: show the timer ranges in /proc/timer_list
  hrtimer: incorporate feedback from Peter Zijlstra
  hrtimer: add a hrtimer_start_range() function
  hrtimer: another build fix
  hrtimer: fix build bug found by Ingo
  hrtimer: make select() and poll() use the hrtimer range feature
  ...
2008-10-23 10:53:02 -07:00
Linus Torvalds 5ed487bc2c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (46 commits)
  [PATCH] fs: add a sanity check in d_free
  [PATCH] i_version: remount support
  [patch] vfs: make security_inode_setattr() calling consistent
  [patch 1/3] FS_MBCACHE: don't needlessly make it built-in
  [PATCH] move executable checking into ->permission()
  [PATCH] fs/dcache.c: update comment of d_validate()
  [RFC PATCH] touch_mnt_namespace when the mount flags change
  [PATCH] reiserfs: add missing llseek method
  [PATCH] fix ->llseek for more directories
  [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent
  [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup
  [PATCH vfs-2.6 4/6] vfs: remove unnecessary fsnotify_d_instantiate()
  [PATCH vfs-2.6 3/6] vfs: add __d_instantiate() helper
  [PATCH vfs-2.6 2/6] vfs: add d_ancestor()
  [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT()
  [PATCH] get rid of on-stack dentry in udf
  [PATCH 2/2] anondev: switch to IDA
  [PATCH 1/2] anondev: init IDR statically
  [JFFS2] Use d_splice_alias() not d_add() in jffs2_lookup()
  [PATCH] Optimise NFS readdir hack slightly.
  ...
2008-10-23 10:22:40 -07:00
Al Viro 421748ecde [PATCH] assorted path_lookup() -> kern_path() conversions
more nameidata eviction

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:52 -04:00
Herbert Xu b63365a2d6 net: Fix disjunct computation of netdev features
My change

    commit e2a6b85247
    net: Enable TSO if supported by at least one device

didn't do what was intended because the netdev_compute_features
function was designed for conjunctions.  So what happened was that
it would simply take the TSO status of the last constituent device.

This patch extends it to support both conjunctions and disjunctions
under the new name of netdev_increment_features.

It also adds a new function netdev_fix_features which does the
sanity checking that usually occurs upon registration.  This ensures
that the computation doesn't result in an illegal combination
since this checking is absent when the change is initiated via
ethtool.

The two users of netdev_compute_features have been converted.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-23 01:11:29 -07:00
Wei Yongjun 2e3f92dad6 sctp: Fix to handle SHUTDOWN in SHUTDOWN_RECEIVED state
Once an endpoint has reached the SHUTDOWN-RECEIVED state,
it MUST NOT send a SHUTDOWN in response to a ULP request.
The Cumulative TSN Ack of the received SHUTDOWN chunk
MUST be processed.

This patch fix to process Cumulative TSN Ack of the received
SHUTDOWN chunk in SHUTDOWN_RECEIVED state.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-23 01:01:18 -07:00
Wei Yongjun cf896d514a sctp: Fix to handle SHUTDOWN in SHUTDOWN-PENDING state
If SHUTDOWN is received in SHUTDOWN-PENDING state, enpoint should enter
the SHUTDOWN-RECEIVED state and check the Cumulative TSN Ack field of
the SHUTDOWN chunk (RFC 4960 Section 9.2). If the SHUTDOWN chunk can
acknowledge all of the send DATA chunks, SHUTDOWN-ACK should be sent.

But now endpoint just silently discarded the SHUTDOWN chunk.

SHUTDOWN received in SHUTDOWN-PENDING state can happend when the last
SACK is lost by network, or the SHUTDOWN chunk can acknowledge all of
the received DATA chunks. The packet sequence(SACK lost) is like this:

Endpoint A                       Endpoint B       ULP
(ESTABLISHED)                    (ESTABLISHED)

               <-----------      DATA
                                             <--- shutdown
                                 Enter SHUTDOWN-PENDING state
  SACK         ----lost---->

  SHUTDOWN(*1) ------------>

               <-----------      SHUTDOWN-ACK

 (*1) silently discarded now.

This patch fix to handle SHUTDOWN in SHUTDOWN-PENDING state as the same
as ESTABLISHED state.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-23 01:00:49 -07:00
Wei Yongjun df10eec476 sctp: Add check for the TSN field of the SHUTDOWN chunk
If SHUTDOWN chunk is received Cumulative TSN Ack beyond the max tsn currently
send, SHUTDOWN chunk be accepted and the association will be broken. New data
is send, but after received SACK it will be drop because TSN in SACK is less
than the Cumulative TSN, data will be retrans again and again even if correct
SACK is received.

The packet sequence is like this:

Endpoint A                       Endpoint B       ULP
(ESTABLISHED)                    (ESTABLISHED)

               <-----------      DATA (TSN=x-1)

               <-----------      DATA (TSN=x)

  SHUTDOWN     ----------->      (Now Cumulative TSN=x+1000)
  (TSN=x+1000)
               <-----------      DATA (TSN=x+1)

  SACK         ----------->      drop the SACK
  (TSN=x+1)
               <-----------      DATA (TSN=x+1)(retrans)

This patch fix this problem by terminating the association and respond to
the sender with an ABORT.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-23 01:00:21 -07:00
Wei Yongjun 91bd6b1e03 sctp: Drop ICMP packet too big message with MTU larger than current PMTU
If ICMP packet too big message is received with MTU larger than current
PMTU, SCTP will still accept this ICMP message and sync the PMTU of assoc
with the wrong MTU.

Endpoing A                 Endpoint B
(ESTABLISHED)              (ESTABLISHED)
ICMP         --------->
(packet too big, MTU too larger)
                           sync PMTU

This patch fixed the problem by drop that ICMP message.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-23 00:59:52 -07:00
Eric Van Hensbergen e45c5405e1 9p: fix sparse warnings
Several sparse warnings were introduced by patches accepted during the merge
window which weren't caught.  This patch fixes those warnings.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-22 18:54:47 -05:00
Tom Tucker fc79d4b104 9p: rdma: RDMA Transport Support for 9P
This patch implements the RDMA transport provider for 9P. It allows
mounts to be performed over iWARP and IB capable network interfaces.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Latchesar Ionkov <lionkov@lanl.gov>
2008-10-22 18:47:39 -05:00
Eric Van Hensbergen 0b15a3a528 9p: fix debug build error
Fixes build problem with 9p when building with debug disabled.
Also contains some fixes for warnings which pop up when 
CONFIG_NET_9P_DEBUG is disabled.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-22 18:47:40 -05:00
Thomas Gleixner 268a3dcfea Merge branch 'timers/range-hrtimers' into v28-range-hrtimers-for-linus-v2
Conflicts:

	kernel/time/tick-sched.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-22 09:48:06 +02:00
Ilpo Järvinen 75e3d8db53 tcp: should use number of sack blocks instead of -1
While looking for the recent "sack issue" I also read all eff_sacks
usage that was played around by some relevant commit. I found
out that there's another thing that is asking for a fix (unrelated
to the "sack issue" though).

This feature has probably very little significance in practice.
Opposite direction timeout with bidirectional tcp comes to me as
the most likely scenario though there might be other cases as
well related to non-data segments we send (e.g., response to the
opposite direction segment). Also some ACK losses or option space
wasted for other purposes is necessary to prevent the earlier
SACK feedback getting to the sender.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-21 16:28:36 -07:00
Linus Torvalds 45e4a24f7b Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: (26 commits)
  9p: add more conservative locking
  9p: fix oops in protocol stat parsing error path.
  9p: fix device file handling
  9p: Improve debug support
  9p: eliminate depricated conv functions
  9p: rework client code to use new protocol support functions
  9p: remove unnecessary tag field from p9_req_t structure
  9p: remove 9p fcall debug prints
  9p: add new protocol support code
  9p: encapsulate version function
  9p: move dirread to fs layer
  9p: adjust 9p vfs write operation
  9p: move readn meta-function from client to fs layer
  9p: consolidate read/write functions
  9p: drop broken unused error path from p9_conn_create()
  9p: make rpc code common and rework flush code
  9p: use the rcall structure passed in the request in trans_fd read_work
  9p: apply common request code to trans_fd
  9p: apply common tagpool handling to trans_fd
  9p: move request management to client code
  ...
2008-10-20 09:39:47 -07:00
Linus Torvalds 5fdf11283e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  netfilter: replace old NF_ARP calls with NFPROTO_ARP
  netfilter: fix compilation error with NAT=n
  netfilter: xt_recent: use proc_create_data()
  netfilter: snmp nat leaks memory in case of failure
  netfilter: xt_iprange: fix range inversion match
  netfilter: netns: use NFPROTO_NUMPROTO instead of NUMPROTO for tables array
  netfilter: ctnetlink: remove obsolete NAT dependency from Kconfig
  pkt_sched: sch_generic: Fix oops in sch_teql
  dccp: Port redirection support for DCCP
  tcp: Fix IPv6 fallout from 'Port redirection support for TCP'
  netdev: change name dropping error codes
  ipvs: Update CONFIG_IP_VS_IPV6 description and help text
2008-10-20 09:06:35 -07:00
Jan Engelhardt fdc9314cbe netfilter: replace old NF_ARP calls with NFPROTO_ARP
(Supplements: ee999d8b95)

NFPROTO_ARP actually has a different value from NF_ARP, so ensure all
callers use the new value so that packets _do_ get delivered to the
registered hooks.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-20 03:34:51 -07:00
Pablo Neira Ayuso 67671841df netfilter: fix compilation error with NAT=n
This patch fixes the compilation of ctnetlink when the NAT support
is not enabled.

/home/benh/kernels/linux-powerpc/net/netfilter/nf_conntrack_netlink.c:819: warning: enum nf_nat_manip_type\u2019 declared inside parameter list
/home/benh/kernels/linux-powerpc/net/netfilter/nf_conntrack_netlink.c:819: warning: its scope is only this definition or declaration, which is probably not what you want

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-20 03:34:27 -07:00
Alexey Dobriyan b09eec161b netfilter: xt_recent: use proc_create_data()
Fixes a crash in recent_seq_start:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
IP: [<ffffffffa002119c>] recent_seq_start+0x4c/0x90 [xt_recent]
PGD 17d33c067 PUD 107afe067 PMD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU 0
Modules linked in: ipt_LOG xt_recent af_packet iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 xt_tcpudp iptable_filter ip_tables x_tables ext2 nls_utf8 fuse sr_mod cdrom [last unloaded: ntfs]
Pid: 32373, comm: cat Not tainted 2.6.27-04ab591808565f968d4406f6435090ad671ebdab #6
RIP: 0010:[<ffffffffa002119c>]  [<ffffffffa002119c>] recent_seq_start+0x4c/0x90 [xt_recent]
RSP: 0018:ffff88015fed7e28  EFLAGS: 00010246
...

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-20 03:33:49 -07:00
Ilpo Järvinen 311670f3ea netfilter: snmp nat leaks memory in case of failure
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-20 03:33:24 -07:00
Alexey Dobriyan 6def1eb481 netfilter: xt_iprange: fix range inversion match
Inverted IPv4 v1 and IPv6 v0 matches don't match anything since 2.6.25-rc1!

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-20 03:32:21 -07:00
Patrick McHardy 041fb574c7 netfilter: ctnetlink: remove obsolete NAT dependency from Kconfig
Now that ctnetlink doesn't have any NAT module depenencies anymore,
we can also remove them from Kconfig.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-20 03:31:17 -07:00
Jarek Poplawski 9f3ffae0db pkt_sched: sch_generic: Fix oops in sch_teql
After these commands:
# modprobe sch_teql
# tc qdisc add dev eth0 root teql0
# tc qdisc del dev eth0 root
we get an oops in teql_destroy() when spin_lock is taken from a null
qdisc_sleeping pointer. It's because at the moment teql0 dev haven't
been activated yet, and a qdisc_root_sleeping() is pointing to noop
qdisc's netdev_queue with qdisc_sleeping uninitialized. This patch
fixes this both for noop and noqueue netdev_queues to avoid similar
problems in the future.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-19 23:37:47 -07:00
Gerrit Renker 944f750227 dccp: Port redirection support for DCCP
Commit a3116ac5c2 from 1st October ("tcp: Port
redirection support for TCP") broke DCCP skb lookup by changing inet_csk_clone,
which is used by DCCP to generate the child socket after the handshake.

This patch updates DCCP to use 'loc_port' instead of 'sport', which fixes the
problem, and thus inheriting port redirection support via the new interface.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-19 23:36:47 -07:00
KOVACS Krisztian fd5070370c tcp: Fix IPv6 fallout from 'Port redirection support for TCP'
'tcp: Port redirection support for TCP' (a3116ac5c) added a new member
to inet_request_sock() which inet_csk_clone() makes use of but failed
to add proper initialization to the IPv6 syncookie code and missed a
couple of places where the new member should be used instead of
inet_sk(sk)->sport.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-19 23:35:58 -07:00
Stephen Hemminger 92845ffd2a netdev: change name dropping error codes
If changename notifier returns an error code, it gets incorrectly
cleared during rollback so the error is never returned to the user.
Found while testing similar code for MTU changes.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-19 23:33:56 -07:00
Julius Volz 0537ae6a3d ipvs: Update CONFIG_IP_VS_IPV6 description and help text
This adds a URL to further info to the CONFIG_IP_VS_IPV6 Kconfig help
text. Also, I think it should be ok to remove the "DANGEROUS" label in the
description line at this point to get people to try it out and find all
the bugs ;) It's still marked as experimental, of course.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-19 23:29:56 -07:00
Eric Van Hensbergen 7eb923b80c 9p: add more conservative locking
During the reorganization some of the multi-theaded locking assumptions were
accidently relaxed.  This patch moves us back towards a more conservative 
locking strategy.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 12:45:40 -05:00
Eric Van Hensbergen f0a0ac2ee5 9p: fix oops in protocol stat parsing error path.
When we get an error on parsing a stat due to a protocol bug, 
we can generate an oops during cleanup because we didn't 
initialize the string pointers in the stat structure.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 12:45:23 -05:00
Eric Van Hensbergen e7f4b8f1a5 9p: Improve debug support
The new debug support lacks some of the information that the previous fcprint
code provided -- this patch focuses on better presentation of debug data along
with more helpful debug along error paths.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 16:20:07 -05:00
Arjan van de Ven 651dab4264 Merge commit 'linus/master' into merge-linus
Conflicts:

	arch/x86/kvm/i8254.c
2008-10-17 09:20:26 -07:00
Eric Van Hensbergen 02da398b95 9p: eliminate depricated conv functions
Remove depricated conv functions which have been replaced with new 
protocol routines.

This patch also reworks the one instance of the file-system code which
directly calls conversion routines (to accomplish unpacking dirreads).

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:06:57 -05:00
Eric Van Hensbergen 51a87c552d 9p: rework client code to use new protocol support functions
Now that the new protocol functions are in place, this patch switches
the client code to using the new support code.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:45 -05:00
Eric Van Hensbergen cb198131b0 9p: remove unnecessary tag field from p9_req_t structure
This removes the vestigial tag field from the p9_req_t structure.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:45 -05:00
Eric Van Hensbergen 51d71f9f7a 9p: remove 9p fcall debug prints
One of the current debug options allows users to get a verbose dump of fcalls.
This isn't really necessary as correctly parsed protocol frames can be printed
as part of the code in the client functions.  The consolidated printfcalls
structure would require new entries to be added for every extension.  This
patch removes the debug print methods and their use.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:44 -05:00
Eric Van Hensbergen ace51c4dd2 9p: add new protocol support code
This adds a new protocol processing support code based on Anthony Liguori's
9p library code.  This code performs protocol marshalling/unmarshalling using
printf like strings to represent protocol elements.  It is my intent to use
them to replace the current functions in conv.c as well as the 
p9_create_* functions.

This should make the client implementation much more clear, and also make it
much easier to add new protocol extensions by limiting the number of places
in which changes need to be made.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:44 -05:00
Eric Van Hensbergen 6936bf60d2 9p: encapsulate version function
Alsmot all 9P client wire functions have their own (set of) functions.
Tversion is an exception as its encapsulated into the client_create code.

This patch moves the protocol specifics of this to a function to match the
rest of the code.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:43 -05:00
Eric Van Hensbergen 06b55b464e 9p: move dirread to fs layer
Currently reading a directory is implemented in the client code.
This function is not actually a wire operation, but a meta operation 
which calls read operations and processes the results.

This patch moves this functionality to the fs layer and calls component
wire operations instead of constructing their packets.  This provides a 
cleaner separation and will help when we reorganize the client functions
and protocol processing methods.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:43 -05:00
Eric Van Hensbergen fbedadc16e 9p: move readn meta-function from client to fs layer
There are a couple of methods in the client code which aren't actually
wire operations.  To keep things organized cleaner, these operations are
being moved to the fs layer.

This patch moves the readn meta-function (which executes multiple wire
reads until a buffer is full) to the fs layer.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:43 -05:00
Eric Van Hensbergen 0fc9655ec6 9p: consolidate read/write functions
Currently there are two separate versions of read and write.  One for
dealing with user buffers and the other for dealing with kernel buffers.
There is a tremendous amount of code duplication in the otherwise
identical versions of these functions.  This patch adds an additional
user buffer parameter to read and write and conditionalizes handling of
the buffer on whether the kernel buffer or the user buffer is populated.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:42 -05:00
Tejun Heo 95820a3651 9p: drop broken unused error path from p9_conn_create()
Post p9_fd_poll() error path which checks m->poll_waddr[i] for PTR_ERR
value has the following problems.

* It's completely unused.  Error value is set iff NULL @wait_address
  has been specified to p9_pollwait() which is guaranteed not to
  happen.

* It dereferences @m after deallocating it (introduced by 571ffeaf and
  spotted by Raja R Harinath.

* It returned the wrong value on error.  It should return
  poll_waddr[i] but it returnes poll_waddr (introduced by 571ffeaf).

* p9_mux_poll_stop() doesn't handle PTR_ERR value.  It will try to
  operate on the PTR_ERR value as if it's a normal pointer and cause
  oops.

As the error path is bogus in the first place, there's no reason to
hold onto it.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Raja R Harinath <harinath@hurrynot.org>
2008-10-17 11:04:42 -05:00
Eric Van Hensbergen 91b8534fa8 9p: make rpc code common and rework flush code
This code moves the rpc function to the common client base,
reorganizes the flush code to be more simple and stable, and
makes the necessary adjustments to the underlying transports
to adapt to the new structure.

This reduces the overall amount of code duplication between the
transports and should make adding new transports more straightforward.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:42 -05:00
Eric Van Hensbergen 1b0a763bdd 9p: use the rcall structure passed in the request in trans_fd read_work
This patch reworks the read_work function to enable it to directly use a passed
in rcall structure.  This should help allow us to remove unnecessary copies
in the future.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:42 -05:00
Eric Van Hensbergen 673d62cdaa 9p: apply common request code to trans_fd
Apply the now common p9_req_t structure to the fd transport.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:42 -05:00
Eric Van Hensbergen ff683452f7 9p: apply common tagpool handling to trans_fd
Simplify trans_fd by using new common client tagpool structure.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:42 -05:00
Eric Van Hensbergen fea511a644 9p: move request management to client code
The virtio transport uses a simplified request management system
that I want to use for all transports.  This patch adapts and moves the
exisiting code for managing requests to the client common code.
Later patches will apply these mechanisms to the other transports.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:42 -05:00
Eric Van Hensbergen 044c776884 9p: eliminate callback complexity
The current trans_fd rpc mechanisms use a dynamic callback mechanism which
introduces a lot of complexity which only accomodates a single special case.
This patch removes much of that complexity in favor of a simple exception
mechanism to deal with flushes.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:41 -05:00
Eric Van Hensbergen 21c003687e 9p: consolidate mux_rpc and request structure
Currently, trans_fd has two structures (p9_req and p9_mux-rpc)
which contain mostly duplicate data.

This patch consolidates these two structures and removes p9_mux_rpc.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:41 -05:00
Eric Van Hensbergen 5503ac5659 9p: remove unnecessary prototypes
Cleanup files by reordering functions in order to remove need for
unnecessary function prototypes.

There are no code changes here, just functions being moved around and
prototypes being eliminated.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:41 -05:00
Eric Van Hensbergen bead27f0a8 9p: remove duplicate client state
Now that we are passing client state into the transport modules, remove
duplicate state which is present in transport private structures.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:41 -05:00
Eric Van Hensbergen 8b81ef589a 9p: consolidate transport structure
Right now there is a transport module structure which provides per-transport
type functions and data and a transport structure which contains per-instance
public data as well as function pointers to instance specific functions.

This patch moves public transport visible instance data to the client
structure (which in some cases had duplicate data) and consolidates the
functions into the transport module structure.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:41 -05:00
Tejun Heo 992b3f1dbe 9p-trans_fd: use single poller
trans_fd used pool of upto 100 pollers to monitor the r/w fds.  The
approach makes sense in userspace back when the only available
interfaces were poll(2) and select(2).  As each event monitor -
trigger - handling iteration took O(n) where `n' is the number of
watched fds, it makes sense to spread them to many pollers such that
the `n' can be divided by the number of pollers.  However, this
doesn't make any sense in kernel because persistent edge triggered
event monitoring is how the whole thing is implemented in the kernel
in the first place.

This patch converts trans_fd to use single poller which watches all
the fds instead of the poll of pollers approach.  All the fds are
registered for monitoring on creation and only the fds with pending
events are scanned when something happens much like how epoll is
implemented.

This change makes trans_fd fd monitoring more efficient and simpler.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-17 11:04:41 -05:00
Linus Torvalds b225ee5bed Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely)
  ipv4: Add a missing rcu_assign_pointer() in routing cache.
  [netdrvr] ibmtr: PCMCIA IBMTR is ok on 64bit
  xen-netfront: Avoid unaligned accesses to IP header
  lmc: copy_*_user under spinlock
  [netdrvr] myri10ge, ixgbe: remove broken select INTEL_IOATDMA
2008-10-17 08:58:52 -07:00
Linus Torvalds 52ad096465 Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (53 commits)
  NFS: Fix a resolution problem with nfs_inode->cache_change_attribute
  NFS: Fix the resolution problem with nfs_inode_attrs_need_update()
  NFS: Changes to inode->i_nlinks must set the NFS_INO_INVALID_ATTR flag
  RPC/RDMA: ensure connection attempt is complete before signalling.
  RPC/RDMA: correct the reconnect timer backoff
  RPC/RDMA: optionally emit useful transport info upon connect/disconnect.
  RPC/RDMA: reformat a debug printk to keep lines together.
  RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls.
  RPC/RDMA: fix connect/reconnect resource leak.
  RPC/RDMA: return a consistent error, when connect fails.
  RPC/RDMA: adhere to protocol for unpadded client trailing write chunks.
  RPC/RDMA: avoid an oops due to disconnect racing with async upcalls.
  RPC/RDMA: maintain the RPC task bytes-sent statistic.
  RPC/RDMA: suppress retransmit on RPC/RDMA clients.
  RPC/RDMA: fix connection IRD/ORD setting
  RPC/RDMA: support FRMR client memory registration.
  RPC/RDMA: check selected memory registration mode at runtime.
  RPC/RDMA: add data types and new FRMR memory registration enum.
  RPC/RDMA: refactor the inline memory registration code.
  NFS: fix nfs_parse_ip_address() corner case
  ...
2008-10-16 15:39:20 -07:00
Johannes Berg 95a5afca4a net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely)
Some code here depends on CONFIG_KMOD to not try to load
protocol modules or similar, replace by CONFIG_MODULES
where more than just request_module depends on CONFIG_KMOD
and and also use try_then_request_module in ebtables.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-16 15:24:51 -07:00
Eric Dumazet 00269b54ed ipv4: Add a missing rcu_assign_pointer() in routing cache.
rt_intern_hash() is doing an update of a RCU guarded hash chain
without using rcu_assign_pointer() or equivalent barrier.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-16 14:18:29 -07:00
Linus Torvalds c813b4e16e Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (46 commits)
  UIO: Fix mapping of logical and virtual memory
  UIO: add automata sercos3 pci card support
  UIO: Change driver name of uio_pdrv
  UIO: Add alignment warnings for uio-mem
  Driver core: add bus_sort_breadthfirst() function
  NET: convert the phy_device file to use bus_find_device_by_name
  kobject: Cleanup kobject_rename and !CONFIG_SYSFS
  kobject: Fix kobject_rename and !CONFIG_SYSFS
  sysfs: Make dir and name args to sysfs_notify() const
  platform: add new device registration helper
  sysfs: use ilookup5() instead of ilookup5_nowait()
  PNP: create device attributes via default device attributes
  Driver core: make bus_find_device_by_name() more robust
  usb: turn dev_warn+WARN_ON combos into dev_WARN
  debug: use dev_WARN() rather than WARN_ON() in device_pm_add()
  debug: Introduce a dev_WARN() function
  sysfs: fix deadlock
  device model: Do a quickcheck for driver binding before doing an expensive check
  Driver core: Fix cleanup in device_create_vargs().
  Driver core: Clarify device cleanup.
  ...
2008-10-16 12:40:26 -07:00
Linus Torvalds cb23832e39 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
  decnet: Fix compiler warning in dn_dev.c
  IPV6: Fix default gateway criteria wrt. HIGH/LOW preference radv option
  net/802/fc.c: Fix compilation warnings
  netns: correct mib stats in ip6_route_me_harder()
  netns: fix net_generic array leak
  rt2x00: fix regression introduced by "mac80211: free up 2 bytes in skb->cb"
  rtl8187: Add USB ID for Belkin F5D7050 with RTL8187B chip
  p54usb: Device ID updates
  mac80211: fixme for kernel-doc
  ath9k/mac80211: disallow fragmentation in ath9k, report to userspace
  libertas : Remove unused variable warning for "old_channel" from cmd.c
  mac80211: Fix scan RX processing oops
  orinoco: fix unsafe locking in spectrum_cs_suspend
  orinoco: fix unsafe locking in orinoco_cs_resume
  cfg80211: fix debugfs error handling
  mac80211: fix debugfs netdev rename
  iwlwifi: fix ct kill configuration for 5350
  mac80211: fix HT information element parsing
  p54: Fix compilation problem on PPC
  mac80211: fix debugfs lockup
  ...
2008-10-16 11:26:26 -07:00
Alexey Dobriyan f221e726bf sysctl: simplify ->strategy
name and nlen parameters passed to ->strategy hook are unused, remove
them.  In general ->strategy hook should know what it's doing, and don't
do something tricky for which, say, pointer to original userspace array
may be needed (name).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ]
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:47 -07:00
Danny ter Haar 404d0ae289 fix random typos
Signed-off-by: Danny ter Haar <dth@cistron.nl>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:30 -07:00
Jason Baron 346e15beb5 driver core: basic infrastructure for per-module dynamic debug messages
Base infrastructure to enable per-module debug messages.

I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
control of debugging statements on a per-module basis in one /proc file,
currently, <debugfs>/dynamic_printk/modules. When, CONFIG_DYNAMIC_PRINTK_DEBUG,
is not set, debugging statements can still be enabled as before, often by
defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
affect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.

The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
can be dynamically enabled/disabled on a per-module basis.

Future plans include extending this functionality to subsystems, that define 
their own debug levels and flags.

Usage:

Dynamic debugging is controlled by the debugfs file, 
<debugfs>/dynamic_printk/modules. This file contains a list of the modules that
can be enabled. The format of the file is as follows:

	<module_name> <enabled=0/1>
		.
		.
		.

	<module_name> : Name of the module in which the debug call resides
	<enabled=0/1> : whether the messages are enabled or not

For example:

	snd_hda_intel enabled=0
	fixup enabled=1
	driver enabled=0

Enable a module:

	$echo "set enabled=1 <module_name>" > dynamic_printk/modules

Disable a module:

	$echo "set enabled=0 <module_name>" > dynamic_printk/modules

Enable all modules:

	$echo "set enabled=1 all" > dynamic_printk/modules

Disable all modules:

	$echo "set enabled=0 all" > dynamic_printk/modules

Finally, passing "dynamic_printk" at the command line enables
debugging for all modules. This mode can be turned off via the above
disable command.

[gkh: minor cleanups and tweaks to make the build work quietly]

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-10-16 09:24:47 -07:00
David S. Miller 8fa0b315fc decnet: Fix compiler warning in dn_dev.c
Use offsetof() instead of home-brewed version.

Based upon initial patch by Steven Whitehouse and suggestions
by Ben Hutchings.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-15 16:03:02 -07:00
Pedro Ribeiro 22441cfa0c IPV6: Fix default gateway criteria wrt. HIGH/LOW preference radv option
Problem observed:
               In IPv6, in the presence of multiple routers candidates to
               default gateway in one segment, each sending a different
               value of preference, the Linux hosts connected to the
               segment weren't selecting the right one in all the
               combinations possible of LOW/MEDIUM/HIGH preference.

This patch changes two files:
include/linux/icmpv6.h
               Get the "router_pref" bitfield in the right place
               (as RFC4191 says), named the bit left with this fix as
               "home_agent" (RFC3775 say that's his function)

net/ipv6/ndisc.c
               Corrects the binary logic behind the updating of the
               router preference in the flags of the routing table

Result:
               With this two fixes applied, the default route used by
               the system was to consistent with the rules mentioned
               in RFC4191 in case of changes in the value of preference
               in router advertisements

Signed-off-by: Pedro Ribeiro <pribeiro@net.ipl.pt>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-15 16:03:01 -07:00
Trond Myklebust 6925bac120 Merge branch 'next' 2008-10-15 15:54:56 -04:00
Manish Katiyar deb28d9bc4 net/802/fc.c: Fix compilation warnings
Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-15 00:13:53 -07:00
David S. Miller ab55570d64 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-10-14 23:19:16 -07:00
Alexey Dobriyan eef9d90dcd netns: correct mib stats in ip6_route_me_harder()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 22:55:21 -07:00
Alexey Dobriyan 4ef079ccc1 netns: fix net_generic array leak
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 22:54:48 -07:00
Johannes Berg 4233df6b74 ath9k/mac80211: disallow fragmentation in ath9k, report to userspace
As I've reported, ath9k currently fails utterly when fragmentation
is enabled. This makes ath9k "support" hardware fragmentation by
not supporting fragmentation at all to avoid the double-free issue.
The patch also changes mac80211 to report errors from the driver
operation to userspace.

That hack in ath9k should be removed once the rate control algorithm
it has is fixed, and we can at that time consider removing the hw
fragmentation support entirely since it's not used by any driver.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org
Acked-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 21:12:37 -04:00
Jouni Malinen d048e503a2 mac80211: Fix scan RX processing oops
ieee80211_bss_info_update() can return NULL. Verify that this is not the
case before calling ieee802111_rx_bss_put() which would trigger an oops
in interrupt context in atomic_dec_and_lock().

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Benoit Papillault <benoit.papillault@free.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 21:08:11 -04:00
Johannes Berg 33c0360bf7 cfg80211: fix debugfs error handling
If something goes wrong creating the debugfs dir or when
debugfs is not compiled in, the current code might lead to
trouble; make it more robust.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:48:25 -04:00
Johannes Berg c74e90a9e3 mac80211: fix debugfs netdev rename
If, for some reason, a netdev has no debugfs dir, we shouldn't
try to rename that dir.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:48:00 -04:00
Johannes Berg 09914813da mac80211: fix HT information element parsing
There's no checking that the HT IEs are of the right length
which can be used by an attacker to cause an out-of-bounds
access by sending a too short HT information/capability IE.
Fix it by simply pretending those IEs didn't exist when too
short.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:47:15 -04:00
Johannes Berg 63044e9f54 mac80211: fix debugfs lockup
When debugfs_create_dir fails, sta_info_debugfs_add_work will not
terminate because it will find the same station again and again.
This is possible whenever debugfs fails for whatever reason; one
reason is a race condition in mac80211, unfortunately we cannot
do much about it, so just document it, it just means some station
may be missing from debugfs.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:46:41 -04:00
Linus Torvalds e413b210c5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (55 commits)
  HID: build drivers for all quirky devices by default
  HID: add missing blacklist entry for Apple ATV ircontrol
  HID: add support for Bright ABNT2 brazilian device
  HID: Don't let Avermedia Radio FM800 be handled by usb hid drivers
  HID: fix numlock led on Dell device 0x413c/0x2105
  HID: remove warn() macro from usb hid drivers
  HID: remove info() macro from usb HID drivers
  HID: add appletv IR receiver quirk
  HID: fix a lockup regression when using force feedback on a PID device
  HID: hiddev.h: Fix example code.
  HID: hiddev.h: Fix mixed space and tabs in example code.
  HID: convert to dev_* prints
  HID: remove hid-ff
  HID: move zeroplus FF processing
  HID: move thrustmaster FF processing
  HID: move pantherlord FF processing
  HID: fix incorrent length condition in hidraw_write()
  HID: fix tty<->hid deadlock
  HID: ignore iBuddy devices
  HID: report descriptor fix for remaining MacBook JIS keyboards
  ...
2008-10-14 16:35:43 -07:00
Jiri Slaby 93c10132a7 HID: move connect quirks
Move connecting from usbhid to the hid layer and fix also hidp in
that manner.
This removes all the ignore/force hidinput/hiddev connecting quirks.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:56 +02:00
Jiri Slaby 8c19a51591 HID: move apple quirks
Move them from the core code to a separate driver.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:49 +02:00
Jiri Slaby d458a9dfc4 HID: move ignore quirks
Move ignore quirks from usbhid-quirks into hid-core code. Also don't output
warning when ENODEV is error code in usbhid and try ordinal input in hidp
when that error is returned.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:49 +02:00
Jiri Slaby c500c97140 HID: hid, make parsing event driven
Next step for complete hid bus, this patch includes:
- call parser either from probe or from hid-core if there is no probe.
- add ll_driver structure and centralize some stuff there (open, close...)
- split and merge usb_hid_configure and hid_probe into several functions
  to allow hooks/fixes between them

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:48 +02:00
Jiri Slaby 85cdaf524b HID: make a bus from hid code
Make a bus from hid core. This is the first step for converting all the
quirks and separate almost-drivers into real drivers attached to this bus.

It's implemented to change behaviour in very tiny manner, so that no driver
needs to be changed this time.

Also add generic drivers for both usb and bt into usbhid or hidp
respectively which will bind all non-blacklisted device. Those blacklisted
will be either grabbed by special drivers or by nobody if they are broken at
the very rude base.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:48 +02:00
Linus Torvalds 8acd3a60bc Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits)
  svcrdma: Fix IRD/ORD polarity
  svcrdma: Update svc_rdma_send_error to use DMA LKEY
  svcrdma: Modify the RPC reply path to use FRMR when available
  svcrdma: Modify the RPC recv path to use FRMR when available
  svcrdma: Add support to svc_rdma_send to handle chained WR
  svcrdma: Modify post recv path to use local dma key
  svcrdma: Add a service to register a Fast Reg MR with the device
  svcrdma: Query device for Fast Reg support during connection setup
  svcrdma: Add FRMR get/put services
  NLM: Remove unused argument from svc_addsock() function
  NLM: Remove "proto" argument from lockd_up()
  NLM: Always start both UDP and TCP listeners
  lockd: Remove unused fields in the nlm_reboot structure
  lockd: Add helper to sanity check incoming NOTIFY requests
  lockd: change nlmclnt_grant() to take a "struct sockaddr *"
  lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses
  lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET
  lockd: Support non-AF_INET addresses in nlm_lookup_host()
  NLM: Convert nlm_lookup_host() to use a single argument
  svcrdma: Add Fast Reg MR Data Types
  ...
2008-10-14 12:31:14 -07:00
Pablo Neira Ayuso e6a7d3c04f netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_nat
This patch removes the module dependency between ctnetlink and
nf_nat by means of an indirect call that is initialized when
nf_nat is loaded. Now, nf_conntrack_netlink only requires
nf_conntrack and nfnetlink.

This patch puts nfnetlink_parse_nat_setup_hook into the
nf_conntrack_core to avoid dependencies between ctnetlink,
nf_conntrack_ipv4 and nf_conntrack_ipv6.

This patch also introduces the function ctnetlink_change_nat
that is only invoked from the creation path. Actually, the
nat handling cannot be invoked from the update path since
this is not allowed. By introducing this function, we remove
the useless nat handling in the update path and we avoid
deadlock-prone code.

This patch also adds the required EAGAIN logic for nfnetlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 11:58:31 -07:00
Patrick McHardy 129404a1f1 netfilter: fix ebtables dependencies
Ingo Molnar reported a build error with ebtables:

ERROR: "ebt_register_table" [net/bridge/netfilter/ebtable_filter.ko] undefined!
ERROR: "ebt_do_table" [net/bridge/netfilter/ebtable_filter.ko] undefined!
ERROR: "ebt_unregister_table" [net/bridge/netfilter/ebtable_filter.ko] undefined!
ERROR: "ebt_register_table" [net/bridge/netfilter/ebtable_broute.ko] undefined!
ERROR: "ebt_do_table" [net/bridge/netfilter/ebtable_broute.ko] undefined!
ERROR: "ebt_unregister_table" [net/bridge/netfilter/ebtable_broute.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

This reason is a missing dependencies that got lost during Kconfig cleanups.
Restore it.

Tested-by: Ingo Molnar <mingo@elte.hu>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 11:57:33 -07:00
Patrick McHardy 38f7ac3eb7 netfilter: restore lost #ifdef guarding defrag exception
Nir Tzachar <nir.tzachar@gmail.com> reported a warning when sending
fragments over loopback with NAT:

[ 6658.338121] WARNING: at net/ipv4/netfilter/nf_nat_standalone.c:89 nf_nat_fn+0x33/0x155()

The reason is that defragmentation is skipped for already tracked connections.
This is wrong in combination with NAT and ip_conntrack actually had some ifdefs
to avoid this behaviour when NAT is compiled in.

The entire "optimization" may seem a bit silly, for now simply restoring the
lost #ifdef is the easiest solution until we can come up with something better.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 11:56:59 -07:00
Linus Torvalds 43096597a4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  qlge: Fix page size ifdef test.
  net: Rationalise email address: Network Specific Parts
  dsa: fix compile bug on s390
  netns: mib6 section fixlet
  enic: Fix Kconfig headline description
  de2104x: wrong MAC address fix
  s390: claw compile fixlet
  net: export genphy_restart_aneg
  cxgb3: extend copyrights to 2008
  cxgb3: update driver version
  net/phy: add missing kernel-doc
  pktgen: fix skb leak in case of failure
  mISDN/dsp_cmx.c: fix size checks
  misdn: use nonseekable_open()
  net: fix driver build errors due to missing net/ip6_checksum.h include
2008-10-14 10:28:49 -07:00
Geert Uytterhoeven 56f26f7b78 net/rfkill/rfkill-input.c needs <linux/sched.h>
For some m68k configs, I get:

| net/rfkill/rfkill-input.c: In function 'rfkill_start':
| net/rfkill/rfkill-input.c:208: error: dereferencing pointer to incomplete type

As the incomplete type is `struct task_struct', including <linux/sched.h> fixes
it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-14 10:23:27 -07:00
Alan Cox 113aa838ec net: Rationalise email address: Network Specific Parts
Clean up the various different email addresses of mine listed in the code
to a single current and valid address. As Dave says his network merges
for 2.6.28 are now done this seems a good point to send them in where
they won't risk disrupting real changes.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 19:01:08 -07:00
Heiko Carstens 510149e319 dsa: fix compile bug on s390
git commit 45cec1bac0
"dsa: Need to select PHYLIB." causes this build bug on s390:

drivers/built-in.o: In function `phy_stop_interrupts':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:631: undefined reference to `free_irq'
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:646: undefined reference to `enable_irq'
drivers/built-in.o: In function `phy_start_interrupts':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:601: undefined reference to `request_irq'
drivers/built-in.o: In function `phy_interrupt':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:528: undefined reference to `disable_irq_nosync'
drivers/built-in.o: In function `phy_change':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:674: undefined reference to `enable_irq'
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:692: undefined reference to `disable_irq'

PHYLIB has alread a depend on !S390, however select PHYLIB at DSA overrides
that unfortunately. So add a depend on !S390 to DSA as well.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 18:58:48 -07:00
Alexey Dobriyan e7dc849494 netns: mib6 section fixlet
LD      net/ipv6/ipv6.o
WARNING: net/ipv6/ipv6.o(.text+0xd8): Section mismatch in reference from the function inet6_net_init() to the function .init.text:ipv6_init_mibs()

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 18:54:07 -07:00
Ilpo Järvinen b4bb4ac8cb pktgen: fix skb leak in case of failure
Seems that skb goes into void unless something magic happened
in pskb_expand_head in case of failure.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 18:43:59 -07:00
Steven Whitehouse a447c09324 vfs: Use const for kernel parser table
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 10:10:37 -07:00
Linus Torvalds fffdedef69 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net/mac80211/rx.c: fix build error
  acpi: Make ACPI_TOSHIBA depend on INPUT.
  net/bfin_mac.c MDIO namespace fixes
  jme: remove unused #include <version.h>
  netfilter: remove unused #include <version.h>
  net: Fix off-by-one in skb_dma_map
  smc911x: Add support for LAN921{5,7,8} chips from SMSC
  qlge: remove duplicated #include
  wireless: remove duplicated #include
  net/au1000_eth.c MDIO namespace fixes
  net/tc35815.c: fix compilation
  sky2: Fix WOL regression
  r8169: NULL pointer dereference on r8169 load
2008-10-13 10:08:08 -07:00
Ingo Molnar bf94e17bc8 net/mac80211/rx.c: fix build error
older versions of gcc do not recognize that ieee80211_rx_h_mesh_fwding()
is unused when CONFIG_MAC80211_MESH is disabled:

  net/built-in.o: In function `ieee80211_rx_h_mesh_fwding':
  rx.c:(.text+0xd89af): undefined reference to `mpp_path_lookup'
  rx.c:(.text+0xd89c6): undefined reference to `mpp_path_add'

as this code construct:

        if (ieee80211_vif_is_mesh(&sdata->vif))
                CALL_RXH(ieee80211_rx_h_mesh_fwding);

still causes ieee80211_rx_h_mesh_fwding() to be linked in.

Protect these places with an #ifdef.

commit b0dee578 ("Fix modpost failure when rx handlers are not inlined.")
solved part of this problem - this patch is still needed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 23:51:38 -07:00
Huang Weiyi 14717f811b netfilter: remove unused #include <version.h>
The file(s) below do not use LINUX_VERSION_CODE nor KERNEL_VERSION.
  net/netfilter/nf_tproxy_core.c

This patch removes the said #include <version.h>.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 21:08:34 -07:00
Dimitris Michailidis ab396eb03f net: Fix off-by-one in skb_dma_map
The unwind loop iterates down to -1 instead of stopping at 0 and ends up
accessing ->frags[-1].

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 21:07:34 -07:00
Huang Weiyi ff71268aa4 wireless: remove duplicated #include
Removed duplicated include <linux/list.h> in 
net/wireless/core.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 21:03:38 -07:00
James Morris 93db628658 Merge branch 'next' into for-linus 2008-10-13 09:35:14 +11:00
Herbert Xu 7bb82d9245 gre: Initialise rtnl_link tunnel parameters properly
Brown paper bag error of calling memset with sizeof(p) instead
of sizeof(*p).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-11 12:20:15 -07:00
David S. Miller f901b64472 ipvs: Add proper dependencies on IP_VS, and fix description header line.
Linus noted a build failure case:

net/netfilter/ipvs/ip_vs_xmit.c: In function 'ip_vs_tunnel_xmit':
net/netfilter/ipvs/ip_vs_xmit.c:616: error: implicit declaration of function 'ip_select_ident'

The proper include file (net/ip.h) is being included in ip_vs_xmit.c to get
that declaration.  So the only possible case where this can happen is if
CONFIG_INET is not enabled.

This seems to be purely a missing dependency in the ipvs/Kconfig file IP_VS
entry.

Also, while we're here, remove the out of date "EXPERIMENTAL" string in the
IP_VS config help header line.  IP_VS no longer depends upon CONFIG_EXPERIMENTAL

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-11 12:18:04 -07:00
Tobias Brunner 1839faab9a af_key: fix SADB_X_SPDDELETE response
When deleting an SPD entry using SADB_X_SPDDELETE, c.data.byid is not
initialized to zero in pfkey_spddelete(). Thus, key_notify_policy()
responds with a PF_KEY message of type SADB_X_SPDDELETE2 instead of
SADB_X_SPDDELETE.

Signed-off-by: Tobias Brunner <tobias.brunner@strongswan.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10 14:07:03 -07:00
Tom Talpey c055551e97 RPC/RDMA: ensure connection attempt is complete before signalling.
The RPC/RDMA connection logic could return early from reconnection
attempts, leading to additional spurious retries.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:15:06 -04:00