Commit Graph

16236 Commits

Author SHA1 Message Date
Dan Carpenter 7e988014cd mac80211: freeing the wrong variable
The intent was to free "msp->ratelist" here.  "msp->sample_table" is
always NULL at this point.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-26 15:32:41 -04:00
John W. Linville 3289a8368c lib80211: remove unused host_build_iv option
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-26 15:09:04 -04:00
John W. Linville ea65145da8 minstrel: don't complain about feedback for unrequested rates
"It's not problematic if minstrel gets feedback for rates that it
doesn't have in its list, it should just ignore it. - Felix"

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Felix Fietkau <nbd@openwrt.org>
2010-07-26 15:09:04 -04:00
John W. Linville 00fc90c886 minstrel_ht: remove unnecessary NULL check in minstrel_ht_update_caps
If sta is NULL, we will have problems long before we get here...

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Felix Fietkau <nbd@openwrt.org>
2010-07-26 15:09:04 -04:00
Ben Greear c736eefadb net: dev_forward_skb should call nf_reset
With conn-track zones and probably with different network
namespaces, the netfilter logic needs to be re-calculated
on packet receive.  If the netfilter logic is not reset,
it will not be recalculated properly.  This patch adds
the nf_reset logic to dev_forward_skb.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-25 21:58:46 -07:00
Eric Dumazet fed66381d6 net: pskb_expand_head() optimization
Move frags[] at the end of struct skb_shared_info, and make
pskb_expand_head() copy only the used part of it instead of whole array.

This should avoid kmemcheck warnings and speedup pskb_expand_head() as
well, avoiding a lot of cache misses.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24 21:05:57 -07:00
stephen hemminger 3b87956ea6 net sched: fix race in mirred device removal
This fixes hang when target device of mirred packet classifier
action is removed.

If a mirror or redirection action is configured to cause packets
to go to another device, the classifier holds a ref count, but was assuming
the adminstrator cleaned up all redirections before removing. The fix
is to add a notifier and cleanup during unregister.

The new list is implicitly protected by RTNL mutex because
it is held during filter add/delete as well as notifier.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24 21:04:20 -07:00
Stefan Assmann c1f79426e2 sysfs: add attribute to indicate hw address assignment type
Add addr_assign_type to struct net_device and expose it via sysfs.
This new attribute has the purpose of giving user-space the ability to
distinguish between different assignment types of MAC addresses.

For example user-space can treat NICs with randomly generated MAC
addresses differently than NICs that have permanent (locally assigned)
MAC addresses.
For the former udev could write a persistent net rule by matching the
device path instead of the MAC address.
There's also the case of devices that 'steal' MAC addresses from slave
devices. In which it is also be beneficial for user-space to be aware
of the fact.

This patch also introduces a helper function to assist adoption of
drivers that generate MAC addresses randomly.

Signed-off-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-24 20:49:29 -07:00
David S. Miller 2a88e7e559 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-commands.h
2010-07-23 14:03:38 -07:00
Andy Shevchenko 451e07a264 net: core: don't use own hex_to_bin() method
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-23 12:50:51 -07:00
Changli Gao 7df0884ce1 netfilter: iptables: use skb->len for accounting
Use skb->len for accounting as xt_quota does.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 16:25:11 +02:00
Changli Gao 261abc8c96 netfilter: ip6tables: use skb->len for accounting
ipv6_hdr(skb)->payload_len is ZERO and can't be used for accounting, if
the payload is a Jumbo Payload specified in RFC2675.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 16:24:34 +02:00
Changli Gao 49daf6a226 xt_quota: report initial quota value instead of current value to userspace
We should copy the initial value to userspace for iptables-save and
to allow removal of specific quota rules.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 14:07:47 +02:00
Changli Gao b0c81aa566 netfilter: xt_quota: use per-rule spin lock
Use per-rule spin lock to improve the scalability.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 14:04:03 +02:00
Changli Gao f667009ecc netfilter: arptables: use arp_hdr_len()
use arp_hdr_len().

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 13:40:53 +02:00
Changli Gao c36952e524 netfilter: nf_nat_core: merge the same lines
proto->unique_tuple() will be called finally, if the previous calls fail. This
patch checks the false condition of (range->flags &IP_NAT_RANGE_PROTO_RANDOM)
instead to avoid duplicate line of code: proto->unique_tuple().

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 13:27:08 +02:00
Eric Dumazet e8648a1fdb netfilter: add xt_cpu match
In some situations a CPU match permits a better spreading of
connections, or select targets only for a given cpu.

With Remote Packet Steering or multiqueue NIC and appropriate IRQ
affinities, we can distribute trafic on available cpus, per session.
(all RX packets for a given flow is handled by a given cpu)

Some legacy applications being not SMP friendly, one way to scale a
server is to run multiple copies of them.

Instead of randomly choosing an instance, we can use the cpu number as a
key so that softirq handler for a whole instance is running on a single
cpu, maximizing cache effects in TCP/UDP stacks.

Using NAT for example, a four ways machine might run four copies of
server application, using a separate listening port for each instance,
but still presenting an unique external port :

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
        -j REDIRECT --to-port 8080

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
        -j REDIRECT --to-port 8081

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
        -j REDIRECT --to-port 8082

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
        -j REDIRECT --to-port 8083

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 12:59:36 +02:00
Hannes Eder 7f1c407579 IPVS: make FTP work with full NAT support
Use nf_conntrack/nf_nat code to do the packet mangling and the TCP
sequence adjusting.  The function 'ip_vs_skb_replace' is now dead
code, so it is removed.

To SNAT FTP, use something like:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
    --vport 21 -j SNAT --to-source 192.168.10.10
and for the data connections in passive mode:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
    --vportctl 21 -j SNAT --to-source 192.168.10.10
using '-m state --state RELATED' would also works.

Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and
nf_nat_ftp are loaded.

[ up-port and minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 12:48:52 +02:00
Hannes Eder 7b215ffc38 IPVS: make friends with nf_conntrack
Update the nf_conntrack tuple in reply direction, as we will see
traffic from the real server (RIP) to the client (CIP).  Once this is
done we can use netfilters SNAT in POSTROUTING, especially with
xt_ipvs, to do source NAT, e.g.:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 --vport 80 \
		  -j SNAT --to-source 192.168.10.10

[ minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 12:46:32 +02:00
Hannes Eder 9c3e1c3967 netfilter: xt_ipvs (netfilter matcher for IPVS)
This implements the kernel-space side of the netfilter matcher xt_ipvs.

[ minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
[ Patrick: added xt_ipvs.h to Kbuild ]
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 12:42:58 +02:00
Dan Carpenter b77026b391 caif: precedence bug
Negate has precedence over comparison so the original assert only
checked that "rfml->fragment_size" was larger than 1 or 0.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22 14:14:47 -07:00
Eric Dumazet 963bfeeeec net: RTA_MARK addition
Add a new rt attribute, RTA_MARK, and use it in
rt_fill_info()/inet_rtm_getroute() to support following commands :

ip route get 192.168.20.110 mark NUMBER
ip route get 192.168.20.108 from 192.168.20.110 iif eth1 mark NUMBER
ip route list cache [192.168.20.110] mark NUMBER

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22 13:46:21 -07:00
Brian Haley 64e724f62a ipv6: Don't add routes to ipv6 disabled interfaces.
If the interface has IPv6 disabled, don't add a multicast or
link-local route since we won't be adding a link-local address.

Reported-by: Mahesh Kelkar <maheshkelkar@gmail.com>
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22 13:41:32 -07:00
David S. Miller be2b6e6235 net: Fix skb_copy_expand() handling of ->csum_start
It should only be adjusted if ip_summed == CHECKSUM_PARTIAL.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22 13:27:09 -07:00
Andrea Shepard 00c5a9834b net: Fix corruption of skb csum field in pskb_expand_head() of net/core/skbuff.c
Make pskb_expand_head() check ip_summed to make sure csum_start is really
csum_start and not csum before adjusting it.

This fixes a bug I encountered using a Sun Quad-Fast Ethernet card and VLANs.
On my configuration, the sunhme driver produces skbs with differing amounts
of headroom on receive depending on the packet size.  See line 2030 of
drivers/net/sunhme.c; packets smaller than RX_COPY_THRESHOLD have 52 bytes
of headroom but packets larger than that cutoff have only 20 bytes.

When these packets reach the VLAN driver, vlan_check_reorder_header()
calls skb_cow(), which, if the packet has less than NET_SKB_PAD (== 32) bytes
of headroom, uses pskb_expand_head() to make more.

Then, pskb_expand_head() needs to adjust a lot of offsets into the skb,
including csum_start.  Since csum_start is a union with csum, if the packet
has a valid csum value this will corrupt it, which was the effect I observed.
The sunhme hardware computes receive checksums, so the skbs would be created
by the driver with ip_summed == CHECKSUM_COMPLETE and a valid csum field, and
then pskb_expand_head() would corrupt the csum field, leading to an "hw csum
error" message later on, for example in icmp_rcv() for pings larger than the
sunhme RX_COPY_THRESHOLD.

On the basis of the comment at the beginning of include/linux/skbuff.h,
I believe that the csum_start skb field is only meaningful if ip_csummed is
CSUM_PARTIAL, so this patch makes pskb_expand_head() adjust it only in that
case to avoid corrupting a valid csum value.

Please see my more in-depth disucssion of tracking down this bug for
more details if you like:

http://puellavulnerata.livejournal.com/112186.html
http://puellavulnerata.livejournal.com/112567.html
http://puellavulnerata.livejournal.com/112891.html
http://puellavulnerata.livejournal.com/113096.html
http://puellavulnerata.livejournal.com/113591.html

I am not subscribed to this list, so please CC me on replies.

Signed-off-by: Andrea Shepard <andrea@persephoneslair.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-22 13:25:18 -07:00
Gustavo F. Padovan 3f30fc1570 net: remove last uses of __attribute__((packed))
Network code uses the __packed macro instead of __attribute__((packed)).

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-21 14:44:29 -07:00
Johannes Berg 7a17a33c0d mac80211: proper IBSS locking
IBSS has never had locking, instead relying on some
memory barriers etc. That's hard to get right, and
I think we had it wrong too until the previous patch.
Since this is not performance sensitive, it doesn't
make sense to have the maintenance overhead of that,
so add proper locking.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-21 15:13:42 -04:00
Johannes Berg bc05d19f4b mac80211: fix IBSS lockdep complaint
Bob reported a lockdep complaint originating in
the mac80211 IBSS code due to the common work
struct patch. The reason is that the IBSS and
station mode code have different locking orders
for the cfg80211 wdev lock and the work struct
(where "locking" implies running/canceling).

Fix this by simply not canceling the work in
the IBSS code, it is not necessary since when
the REQ_RUN bit is cleared, the work will run
without effect if it runs. When the interface
is set down, it is flushed anyway, so there's
no concern about it running after memory has
been invalidated either.

This fixes
https://bugzilla.kernel.org/show_bug.cgi?id=16419

Additionally, looking into this I noticed that
there's a small window while the IBSS is torn
down in which the work may be rescheduled and
the REQ_RUN bit be set again after leave() has
cleared it when a scan finishes at exactly the
same time. Avoid that by setting the ssid_len
to zero before clearing REQ_RUN which signals
to the scan finish code that this interface is
not active.

Reported-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-21 15:13:42 -04:00
Johannes Berg 9dca9c4901 mac80211: refuse shared key auth when WEP is unavailable
When WEP is not available, we should reject shared
key authentication because it could never succeed.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-21 15:13:42 -04:00
Maxime Bizon 5a652052fe cfg80211: fix race between sysfs and cfg80211
device_add() is called before adding the phy to the cfg80211 device
list.

So if a userspace program uses sysfs uevents to detect new phy
devices, and queries nl80211 to get phy info, it can get ENODEV even
though the phy exists in sysfs.

An easy workaround is to hold the cfg80211 mutex until the phy is
present in sysfs/cfg80211/debugfs.

Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-21 15:13:42 -04:00
Gustavo F. Padovan d1c4a17d58 Bluetooth: Enable L2CAP Extended features by default
Change the enable_ertm param to disable_ertm and default value to 0. That
means that L2CAP Extended features are enabled by default now.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:13 -07:00
Gustavo F. Padovan 893ef97112 Bluetooth: Fix typo in hci_event.c
memmory -> memory

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:13 -07:00
Gustavo F. Padovan 5d8868ff3d Bluetooth: Add Google's copyright to L2CAP
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Suraj Sumangala 9981151086 Bluetooth: Implemented HCI frame reassembly for RX from stream
Implemented frame reassembly implementation for reassembling fragments
received from stream.

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Suraj Sumangala f39a3c0640 Bluetooth: Modified hci_recv_fragment() to use hci_reassembly helper
Modified packet based reassembly function hci_recv_fragment() to use
hci_reassembly()

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Suraj Sumangala 33e882a5f2 Bluetooth: Implement hci_reassembly helper to reassemble RX packets
Implements feature to reassemble received HCI frames from any input stream

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Suraj Sumangala cd4c53919e Bluetooth: Add one more buffer for HCI stream reassembly
Additional reassembly buffer to keep track of stream reasembly

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Gustavo F. Padovan dd135240e8 Bluetooth: Update L2CAP version information
We did some changes on the L2CAP configuration process and its behaviour
is bit different now. That justifies a updated on the L2CAP version.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:12 -07:00
Gustavo F. Padovan ce5706bd69 Bluetooth: Add Copyright notice to L2CAP
Copyright for the time I worked on L2CAP during the Google Summer of Code
program.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:11 -07:00
Gustavo F. Padovan 47731de789 Bluetooth: Keep code under column 80
Purely a cosmetic change, it doesn't change the code flow.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:11 -07:00
Gustavo F. Padovan 89746b856c Bluetooth: Fix bug in kzalloc allocation size
Probably a typo error. We were using the wrong struct to get size.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:11 -07:00
Gustavo F. Padovan e9aeb2ddd4 Bluetooth: Send ConfigReq after send a ConnectionRsp
The extended L2CAP features requires that one should initiate a
ConfigReq after send the ConnectionRsp. This patch changes the behaviour
of the configuration process of our stack.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:11 -07:00
João Paulo Rechi Vita 963cf687e8 Bluetooth: Fix error return on L2CAP-HCI interface.
L2CAP only deals with ACL links. EINVAL should be returned otherwise.

Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:10 -07:00
João Paulo Rechi Vita 7a560e5c99 Bluetooth: Fix error value for wrong FCS.
Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:10 -07:00
João Paulo Rechi Vita 57d3b22bf5 Bluetooth: Fix error return for l2cap_connect_rsp().
Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:10 -07:00
João Paulo Rechi Vita bc766db2ef Bluetooth: Fix error return value on sendmsg.
When the socket is in a bad state EBADFD is more appropriate then EINVAL.

Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:10 -07:00
João Paulo Rechi Vita f9dd11b03c Bluetooth: Fix error return value on sendmsg.
When we try to send a message bigger than the outgoing MTU value
EMSGSIZE (message too long) should be returned.

Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:10 -07:00
João Paulo Rechi Vita 305682e837 Bluetooth: Make l2cap_streaming_send() void.
It doesn't make sense to have a return value since we always set it
to 0.

Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:10 -07:00
João Paulo Rechi Vita 8b0dc6dc82 Bluetooth: Fix l2cap_sock_connect error return.
Return a proper error value if socket is already connected.

Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:10 -07:00
Gustavo F. Padovan 712132eb54 Bluetooth: Improve ERTM local busy handling
Now we also check if can push skb userspace just after receive a new
skb instead of only wait the l2cap_busy_work wake up from time to time
to check the local busy condition.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:09 -07:00
Gustavo F. Padovan 218bb9dfd2 Bluetooth: Add backlog queue to ERTM code
backlog queue is the canonical mechanism to avoid race conditions due
interrupts in bottom half context. After the socket lock is released the
net core take care of push all skb in its backlog queue.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:09 -07:00
Gustavo F. Padovan e0f66218b3 Bluetooth: Remove the send_lock spinlock from ERTM
Using a lock to deal with the ERTM race condition - interruption with
new data from the hci layer - is wrong. We should use the native skb
backlog queue.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:09 -07:00
Gustavo F. Padovan 8cb8e6f168 Bluetooth: Don't accept ConfigReq if we aren't in the BT_CONFIG state
If such event happens we shall reply with a Command Reject, because we are
not expecting any configure request.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:09 -07:00
Gustavo F. Padovan cf6c2c0b9f Bluetooth: Disconnect early if mode is not supported
When mode is mandatory we shall not send connect request and report this
to the userspace as well.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:09 -07:00
Gustavo F. Padovan 2ba13ed678 Bluetooth: Remove check for supported mode
Since now we have checks for the supported mode before on
l2cap_info_rsp we can remove the check for it here.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:08 -07:00
Gustavo F. Padovan 6c2ea7a8f5 Bluetooth: Refuse ConfigRsp with different mode
If our mode is Basic Mode we have to refuse any ConfigRsp that proposes
a different mode.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:08 -07:00
Gustavo F. Padovan 625477523b Bluetooth: Actively send request for Basic Mode
The Profile Tuning Suite requires that we send a RFC containing the
Basic Mode configuration when requesting Basic Mode.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:08 -07:00
Gustavo F. Padovan ae12d52efd Bluetooth: Prefer Basic Mode on receipt of ConfigReq
If we choose to use Basic Mode then we have to refuse the received mode
and propose Basic Mode again.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:08 -07:00
Gustavo F. Padovan 742e519b0d Bluetooth: Disconnect the channel if we don't want the proposed mode
If the device is a STATE 2 then it should disconnect the channel if the
remote device propose a mode different from its mandatory mode.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:08 -07:00
Gustavo F. Padovan 85eb53c6f7 Bluetooth: Change the way we set ERTM mode as mandatory
If the socket type is SOCK_STREAM we set Enhanced Retransmisson Mode or
Streaming Mode as mandatory. That means that we will close the channel
if the other side doesn't support or request the the mandatory mode.
Basic mode can't be set as mandatory.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:07 -07:00
Gustavo F. Padovan 6498886863 Bluetooth: Tweaks to l2cap_send_i_or_rr_or_rnr() flow
l2cap_send_sframe() already set the F-bit if we set L2CAP_CONN_SEND_FBIT
and unset L2CAP_CONN_SEND_FBIT after send the F-bit.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:07 -07:00
Gustavo F. Padovan 0e98958d4f Bluetooth: Add debug output to ERTM code
Use the dynamic debug to output info about ERTM protocol stuff.
The following script can be used to enable debug for ERTM:

DEBUGFS="/sys/kernel/debug/dynamic_debug/control"

echo -n 'func l2cap_send_disconn_req +p' > $DEBUGFS
echo -n 'func l2cap_monitor_timeout +p' > $DEBUGFS
echo -n 'func l2cap_retrans_timeout +p' > $DEBUGFS
echo -n 'func l2cap_busy_work  +p' > $DEBUGFS
echo -n 'func l2cap_push_rx_skb +p' > $DEBUGFS
echo -n 'func l2cap_data_channel_iframe +p' > $DEBUGFS
echo -n 'func l2cap_data_channel_rrframe +p' > $DEBUGFS
echo -n 'func l2cap_data_channel_rejframe +p' > $DEBUGFS
echo -n 'func l2cap_data_channel_srejframe +p' > $DEBUGFS
echo -n 'func l2cap_data_channel_rnrframe +p' > $DEBUGFS

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:07 -07:00
Gustavo F. Padovan 9b108fc0cf Bluetooth: Fix ERTM error reporting to the userspace
If any error occurs during transfers we have to tell userspace that
something wrong happened.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:07 -07:00
Gustavo F. Padovan 4ea727ef9d Bluetooth: Fix missing retransmission action with RR(P=1)
The Bluetooth SIG Profile Tuning Suite Software uses the CSA1 spec
to run the L2CAP tests. The new 3.0 spec has a missing
Retransmit-I-Frames action when the Remote side is Busy.
We still start the retransmission timer if Remote is Busy and unacked
frames > 0. We do everything we did before this change plus the
Retransmission of I-frames.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:07 -07:00
Gustavo F. Padovan 2600008967 Bluetooth: Check packet FCS earlier
This way, if FCS is enabled and the packet is corrupted, we just drop it
without read it len, which could be corrupted.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:07 -07:00
Gustavo F. Padovan 45d65c46ac Bluetooth: Check the tx_window size on setsockopt
We have to check if the proposed tx_window value is not greater that
maximum value supported.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:07 -07:00
Gustavo F. Padovan 3cb123d1c0 Bluetooth: Fix handle of received P-bit
ERTM spec mandates that after receive a P-bit we shall send an F-bit in
response. This patch fixes this for retransmitted packets, on
retransmitting we were missing to check for a pending F-bit to be sent.
Also we were missing some annotation to send a F-bit.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:06 -07:00
Gustavo F. Padovan 2ece3684b4 Bluetooth: Update buffer_seq before retransmit frames
Updating buffer_seq first make us able to ack the last I-frame received.
This is also a requirement of the  Profile Tuning Suite software.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:06 -07:00
Gustavo F. Padovan 7fe9b298c9 Bluetooth: Stop ack_timer if ERTM enters in Local Busy or SREJ_SENT
The ack_timer is implemation specific, disabling it in such situation
avoids some potencial errors in the ERTM protocol.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:06 -07:00
Ron Shaffer 2d0a03460a Bluetooth: Reassigned copyright to Code Aurora Forum
Qualcomm, Inc. has reassigned rights to Code Aurora Forum. Accordingly,
as files are modified by Code Aurora Forum members, the copyright
statement will be updated.

Signed-off-by: Ron Shaffer <rshaffer@codeaurora.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:06 -07:00
Johan Hedberg 32c2ece5ea Bluetooth: Add debugfs support for showing the blacklist
This patch adds a debugfs blacklist entry for each HCI device which can
be used to list the current content of the blacklist.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:05 -07:00
Johan Hedberg f03585689f Bluetooth: Add blacklist support for incoming connections
In some circumstances it could be desirable to reject incoming
connections on the baseband level. This patch adds this feature through
two new ioctl's: HCIBLOCKADDR and HCIUNBLOCKADDR. Both take a simple
Bluetooth address as a parameter. BDADDR_ANY can be used with
HCIUNBLOCKADDR to remove all devices from the blacklist.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:05 -07:00
Gustavo F. Padovan 95ffa97827 Bluetooth: Fix L2CAP control bit field corruption
When resending an I-frame, ERTM was reusing the control bits from the last
time it was sent, that was causing a corruption in the new control field
due to it dirty fields.

This patches extracts only the SAR bits from the old field and reuse it to
resend the packet, the others bits should be reset and receive the
updated value.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:05 -07:00
Gustavo F. Padovan c13ffa620f Bluetooth: Proper shutdown ERTM when closing the channel
Fix a crash regarding the Monitor Timeout, it was running even after the
shutdown of the ACL connection, which doesn't make sense.

The same code also fixes another issue, before this patch L2CAP was sending
many Disconnections Requests while we have to send only one.

The issues are related to each other, a expired Monitor Timeout can
trigger a Disconnection Request and then we may have a crash if the link
was already deleted.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:05 -07:00
Nathan Holstein 51893f88dd Bluetooth: Fix bug with ERTM minimum packet length
ERTM and streaming mode L2CAP sockets have no minimum packet length. Only
basic mode connections have minimum length.

Instead, validate the packet containing all necessary control, FCS,
and SAR fields.

The patch fixes the drop of valid packets with length lower than 4.

Signed-off-by: Nathan Holstein <ngh@isomerica.net>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:05 -07:00
João Paulo Rechi Vita bfbacc1155 Bluetooth: Fix SREJ_QUEUE corruption in L2CAP
Since all TxSeq values are modulo, we shall not compare them directly. We
have to compare their offset inside the TxWindow instead.

Signed-off-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:04 -07:00
Gustavo F. Padovan 6e2b6722ab Bluetooth: Fix bug in l2cap_ertm_send() behavior
This patch makes l2cap_ertm_send() similar to the Send-Data action of
the ERTM spec. We shall not check for RemoteBusy or WAIT_F state
inside l2cap_ertm_send().

Such checks were causing a bug in the retransmission logic of ERTM and
making ERTM stalls until the ACL is dropped.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:04 -07:00
Gustavo F. Padovan bc1b1f8bee Bluetooth: Only check SAR bits if frame is an I-frame
The SAR bits doesn't make sense for an S-frame. It doesn't use SAR.

Checking SAR for a S-frames can lead to L2CAP errors, it could close
the channel with an invalid packet length, since we was removing the 2
of the of any frame that match SAR start bits, without check if it is
an I-frame.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:04 -07:00
Gustavo F. Padovan 8ff50ec04a Bluetooth: Fix bug with ERTM vars increment
All ERTM operations regarding the txWindow should be modulo 64,
otherwise we confuse the ERTM logic and connections will break.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Reviewed-by: João Paulo Rechi Vita <jprvita@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:04 -07:00
Gustavo F. Padovan f6337c7711 Bluetooth: Fix drop of packets with invalid req_seq/tx_seq
We shall not use an unsigned var since we are expecting negatives value
there. Using unsigned causes ERTM connection to close due to invalid
ReqSeq numbers.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:04 -07:00
Gustavo F. Padovan 0b31c85ce7 Bluetooth: Remove L2CAP Extended Features from Kconfig
This reverts commit 84fb0a6334 which adds
the L2CAP Extended Features to the Kconfig, that is actually not needed.
One can use other mechanisms to enable L2CAP Extended Features.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:04 -07:00
Gustavo F. Padovan fd059b9bd0 Bluetooth: Remove max_tx and tx_window module paramenters from L2CAP
We don't need these parameters anymore since we have socket options for
them.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-21 10:39:04 -07:00
Dave Chinner 567c7b0ede mm: add context argument to shrinker callback to remaining shrinkers
Add the shrinkers missed in the first conversion of the API in
commit 7f8275d0d6 ("mm: add context argument to
shrinker callback").

Signed-off-by: Dave Chinner <dchinner@redhat.com>
2010-07-21 15:33:01 +10:00
David S. Miller 11fe883936 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/vhost/net.c
	net/bridge/br_device.c

Fix merge conflict in drivers/vhost/net.c with guidance from
Stephen Rothwell.

Revert the effects of net-2.6 commit 573201f36f
since net-next-2.6 has fixes that make bridge netpoll work properly thus
we don't need it disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-20 18:25:24 -07:00
Linus Torvalds 516bd66415 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
  bridge: Partially disable netpoll support
  tcp: fix crash in tcp_xmit_retransmit_queue
  IPv6: fix CoA check in RH2 input handler (mip6_rthdr_input())
  ibmveth: lost IRQ while closing/opening device leads to service loss
  rt2x00: Fix lockdep warning in rt2x00lib_probe_dev()
  vhost: avoid pr_err on condition guest can trigger
  ipmr: Don't leak memory if fib lookup fails.
  vhost-net: avoid flush under lock
  net: fix problem in reading sock TX queue
  net/core: neighbour update Oops
  net: skb_tx_hash() fix relative to skb_orphan_try()
  rfs: call sock_rps_record_flow() in tcp_splice_read()
  xfrm: do not assume that template resolving always returns xfrms
  hostap_pci: set dev->base_addr during probe
  axnet_cs: use spin_lock_irqsave in ax_interrupt
  dsa: Fix Kconfig dependencies.
  act_nat: not all of the ICMP packets need an IP header payload
  r8169: incorrect identifier for a 8168dp
  Phonet: fix skb leak in pipe endpoint accept()
  Bluetooth: Update sec_level/auth_type for already existing connections
  ...
2010-07-20 16:26:42 -07:00
John W. Linville 34782e9e1e wireless: remove unnecessary reg_same_country_ie_hint
"Might as well remove  reg_same_country_ie_hint() completely since we
already dealt with suspend/resume through the regulatory hint
disconnect." -- Luis

Reported-by: Luis R. Rodriguez <mcgrof@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:49:41 -04:00
John W. Linville 20925feee9 wireless: mark cfg80211_is_all_idle as static
CHECK   net/wireless/sme.c
net/wireless/sme.c:38:6: warning: symbol 'cfg80211_is_all_idle' was not declared. Should it be static?

It is not used elsewhere, so mark it static.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:49:38 -04:00
John W. Linville 2ea6fb6d1e wireless: correct sparse warning in generated regdb.c
CHECK   net/wireless/regdb.c
net/wireless/regdb.c:8:34: warning: symbol 'reg_regdb' was not declared.  Should it be static?
net/wireless/regdb.c:11:5: warning: symbol 'reg_regdb_size' was not declared. Should it be static?

Simply include the also generated regdb.h.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:49:37 -04:00
John W. Linville c28991a02c wireless: correct sparse warning in wext-compat.c
CHECK   net/wireless/wext-compat.c
net/wireless/wext-compat.c:1434:5: warning: symbol 'cfg80211_wext_siwpmksa' was not declared. Should it be static?

Add declaration in cfg80211.h.  Also add an EXPORT_SYMBOL_GPL, since all
the peer functions have it.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:49:37 -04:00
John W. Linville 3f6ff6bacd wireless: correct sparse warning in lib80211_crypt_tkip.c
CHECK   net/wireless/lib80211_crypt_tkip.c
net/wireless/lib80211_crypt_tkip.c:581:27: warning: cast to restricted __le16

Caused by dereferencing a "u8 *" and passing it to le16_to_cpu...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:49:36 -04:00
John W. Linville 4f366c5dab wireless: only use alpha2 regulatory information from country IE
The meaning and/or usage of the country IE is somewhat poorly defined.
In practice, this means that regulatory rulesets in a country IE are
often incomplete and might be untrustworthy.  This removes the code
associated with interpreting those rulesets while preserving respect
for country "alpha2" codes also contained in the country IE.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:44:35 -04:00
Neil Horman 70d4bf6d46 drop_monitor: convert some kfree_skb call sites to consume_skb
Convert a few calls from kfree_skb to consume_skb

Noticed while I was working on dropwatch that I was detecting lots of internal
skb drops in several places.  While some are legitimate, several were not,
freeing skbs that were at the end of their life, rather than being discarded due
to an error.  This patch converts those calls sites from using kfree_skb to
consume_skb, which quiets the in-kernel drop_monitor code from detecting them as
drops.  Tested successfully by myself

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-20 13:28:05 -07:00
Neil Horman 4b706372f1 drop_monitor: Add error code to detect duplicate state changes
Patch to add -EAGAIN error to dropwatch netlink message handling code.
-EAGAIN will be returned anytime userspace attempts to transition the state of
the drop monitor service to a state that its already in.  That allows user space
to detect this condition, so it doesn't wait for a success ACK that will never
arrive.  Tested successfully by me

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-20 13:28:04 -07:00
Nicolas Dichtel d79d991379 __dst_free(): put EXPORT_SYMBOLS after the fct
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-20 13:28:03 -07:00
David Gnedt 53e9b1de68 mac80211: set carrier on for monitor interfaces on ieee80211_open
If a station interface is reused as monitor interface it is possible that
the carrier is still set to off. This breaks packet injection on that
monitor interface.
Force the carrier on in monitor interface initialisation like it is also done
for other interface types (e.g. adhoc, mesh point, ap).

Signed-off-by: David Gnedt <david.gnedt@davizone.at>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:02:58 -04:00
Johannes Berg 4ced3f74da mac80211: move QoS-enable to BSS info
Ever since

commit e1b3ec1a2a
Author: Stanislaw Gruszka <sgruszka@redhat.com>
Date:   Mon Mar 29 12:18:34 2010 +0200

    mac80211: explicitly disable/enable QoS

mac80211 is telling drivers, in particular
iwlwifi, whether QoS is enabled or not.

However, this is only relevant for station mode,
since only then will any device send nullfunc
frames and need to know whether they should be
QoS frames or not. In other modes, there are
(currently) no frames the device is supposed to
send.

When you now consider virtual interfaces, it
becomes apparent that the current mechanism is
inadequate since it enables/disables QoS on a
global scale, where for nullfunc frames it has
to be on a per-interface scale.

Due to the above considerations, we can change
the way mac80211 advertises the QoS state to
drivers to only ever advertise it as "off" in
station mode, and make it a per-BSS setting.

Tested-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:02:58 -04:00
Felix Fietkau 875ae5f688 mac80211: fix aggregation action frame handling with AP VLANs
When aggregation related action frames are enqueued for further work,
and they originate from a STA that is part of an AP VLAN, they are
currently enqueued for the AP interface. This breaks the sta_info_get()
lookup in the actual work function, and because of that, aggregation
sessions are not established for this STA.

Fix this by replacing the sta_info_get call with a call to
sta_info_get_bss.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 16:02:58 -04:00
John W. Linville 06ee1c2613 wireless: use netif_rx_ni in ieee80211_send_layer2_update
These synthetic frames are all triggered from userland requests in
process context.

https://bugzilla.kernel.org/show_bug.cgi?id=16412

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-20 15:55:12 -04:00
Herbert Xu 573201f36f bridge: Partially disable netpoll support
The new netpoll code in bridging contains use-after-free bugs
that are non-trivial to fix.

This patch fixes this by removing the code that uses skbs after
they're freed.

As a consequence, this means that we can no longer call bridge
from the netpoll path, so this patch also removes the controller
function in order to disable netpoll.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-19 23:28:25 -07:00
Eric Dumazet d6d9ca0fec net: this_cpu_xxx conversions
Use modern this_cpu_xxx() api, saving few bytes on x86

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-19 15:12:51 -07:00
Ilpo Järvinen 45e77d3145 tcp: fix crash in tcp_xmit_retransmit_queue
It can happen that there are no packets in queue while calling
tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
NULL and that gets deref'ed to get sacked into a local var.

There is no work to do if no packets are outstanding so we just
exit early.

This oops was introduced by 08ebd1721a (tcp: remove tp->lost_out
guard to make joining diff nicer).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
Tested-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-19 12:43:49 -07:00
Eric Dumazet bd27290a59 net: 64bit stats for netdev_queue
Since struct netdev_queue tx_bytes/tx_packets/tx_dropped are already
protected by _xmit_lock, its easy to convert these fields to u64 instead
of unsigned long.
This completes 64bit stats for devices using them (vlan, macvlan, ...)

Strictly, we could avoid the locking in dev_txq_stats_fold() on 64bit
arches, but its slow path and we prefer keep it simple.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-19 09:35:40 -07:00
David S. Miller 6accec76f6 sch_atm: Convert to use standard list_head facilities.
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-18 19:52:55 -07:00
Richard Cochran c1f19b51d1 net: support time stamping in phy devices.
This patch adds a new networking option to allow hardware time stamps
from PHY devices. When enabled, likely candidates among incoming and
outgoing network packets are offered to the PHY driver for possible
time stamping. When accepted by the PHY driver, incoming packets are
deferred for later delivery by the driver.

The patch also adds phylib driver methods for the SIOCSHWTSTAMP ioctl
and callbacks for transmit and receive time stamping. Drivers may
optionally implement these functions.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-18 19:15:26 -07:00
Richard Cochran 28b041139e net: preserve ifreq parameter when calling generic phy_mii_ioctl().
The phy_mii_ioctl() function unnecessarily throws away the original ifreq.
We need access to the ifreq in order to support PHYs that can perform
hardware time stamping.

Two maverick drivers filter the ioctl commands passed to phy_mii_ioctl().
This is unnecessary since phylib will check the command in any case.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-18 19:15:25 -07:00
Pedro Garcia ad1afb0039 vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
- Without the 8021q module loaded in the kernel, all 802.1p packets 
(VLAN 0 but QoS tagging) are silently discarded (as expected, as 
the protocol is not loaded).
 
- Without this patch in 8021q module, these packets are forwarded to 
the module, but they are discarded also if VLAN 0 is not configured,
which should not be the default behaviour, as VLAN 0 is not really
a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
impossible to communicate with mixed 802.1p and non 802.1p devices on
the same network due to arp table issues.

- Changed logic to skip vlan specific code in vlan_skb_recv if VLAN 
is 0 and we have not defined a VLAN with ID 0, but we accept the 
packet with the encapsulated proto and pass it later to netif_rx.

- In the vlan device event handler, added some logic to add VLAN 0 
to HW filter in devices that support it (this prevented any traffic
in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
and probably also with other HW filtered cards, so we fix it here).

- In the vlan unregister logic, prevent the elimination of VLAN 0 
in devices with HW filter.

- The default behaviour is to ignore the VLAN 0 tagging and accept
the packet as if it was not tagged, but we can still define a 
VLAN 0 if desired (so it is backwards compatible).

Signed-off-by: Pedro Garcia <pedro.netdev@dondevamos.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-18 15:38:44 -07:00
Tetsuo Handa 01893c82b4 net: Remove MAX_SOCK_ADDR constant
MAX_SOCK_ADDR is no longer used because commit 230b1839 "net: Use standard
structures for generic socket address structures." replaced
"char address[MAX_SOCK_ADDR];" with "struct sockaddr_storage address;".

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-18 15:29:14 -07:00
Kulikov Vasiliy 8e64159dfb net: dccp: fix sign bug
'gap' is unsigned, so this code is wrong:

    gap = -new_head;
    ...
    if (gap > 0) { ... }

Make 'gap' signed.

The semantic patch that finds this problem (many false-positive results):
(http://coccinelle.lip6.fr/)

// <smpl>
@ r1 @
identifier f;
@@
int f(...) { ... }

@@
identifier r1.f;
type T;
unsigned T x;
@@

*x = f(...)
 ...
*x > 0

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-18 15:07:14 -07:00
Arnaud Ebalard d9a9dc66eb IPv6: fix CoA check in RH2 input handler (mip6_rthdr_input())
The input handler for Type 2 Routing Header (mip6_rthdr_input())
checks if the CoA in the packet matches the CoA in the XFRM state.

Current check is buggy: it compares the adddress in the Type 2
Routing Header, i.e. the HoA, against the expected CoA in the state.
The comparison should be made against the address in the destination
field of the IPv6 header.

The bug remained unnoticed because the main (and possibly only current)
user of the code (UMIP MIPv6 Daemon) initializes the XFRM state with the
unspecified address, i.e. explicitly allows everything.

Yoshifuji-san, can you ack that one?

Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-18 15:04:33 -07:00
John W. Linville 088c87262b mac80211: improve error checking if WEP fails to init
Do this by poisoning the values of wep_tx_tfm and wep_rx_tfm if either
crypto allocation fails.

Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-16 14:03:42 -04:00
Christian Lamparter 15804e3e9d mac80211: skip HT parsing if HW does not support HT
This patch will also fix the odd freeze which occurred
when minstrel_ht connects to an 802.11n network with
legacy hardware.

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-16 14:03:42 -04:00
Ben Greear e40dbc51fb ipmr: Don't leak memory if fib lookup fails.
This was detected using two mcast router tables.  The
pimreg for the second interface did not have a specific
mrule, so packets received by it were handled by the
default table, which had nothing configured.

This caused the ipmr_fib_lookup to fail, causing
the memory leak.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-15 22:38:43 -07:00
Kulikov Vasiliy bb7a0bd600 net: bridge: fix sign bug
ipv6_skip_exthdr() can return error code that is below zero.
'offset' is unsigned, so it makes no sense.
ipv6_skip_exthdr() returns 'int' so we can painlessly change type of
offset to int.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-15 20:27:58 -07:00
Michael S. Tsirkin edf0e1fb0d netfilter: add CHECKSUM target
This adds a `CHECKSUM' target, which can be used in the iptables mangle
table.

You can use this target to compute and fill in the checksum in
a packet that lacks a checksum.  This is particularly useful,
if you need to work around old applications such as dhcp clients,
that do not work well with checksum offloads, but don't want to
disable checksum offload in your device.

The problem happens in the field with virtualized applications.
For reference, see Red Hat bz 605555, as well as
http://www.spinics.net/lists/kvm/msg37660.html

Typical expected use (helps old dhclient binary running in a VM):
iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
	-j CHECKSUM --checksum-fill

Includes fixes by Jan Engelhardt <jengelh@medozas.de>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-15 17:20:46 +02:00
Pablo Neira Ayuso fac42a9a92 netfilter: nf_ct_tcp: fix flow recovery with TCP window tracking enabled
This patch adds the missing bits to support the recovery of TCP flows
without disabling window tracking (aka be_liberal). To ensure a
successful recovery, we have to inject the window scale factor via
ctnetlink.

This patch has been tested with a development snapshot of conntrackd
and the new clause `TCPWindowTracking' that allows to perform strict
TCP window tracking recovery across fail-overs.

With this patch, we don't update the receiver's window until it's not
initiated. We require this to perform a successful recovery. Jozsef
confirmed in a private email that this spotted a real issue since that
should not happen.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-15 17:09:04 +02:00
Tom Herbert b0f77d0eae net: fix problem in reading sock TX queue
Fix problem in reading the tx_queue recorded in a socket.  In
dev_pick_tx, the TX queue is read by doing a check with
sk_tx_queue_recorded on the socket, followed by a sk_tx_queue_get.
The problem is that there is not mutual exclusion across these
calls in the socket so it it is possible that the queue in the
sock can be invalidated after sk_tx_queue_recorded is called so
that sk_tx_queue get returns -1, which sets 65535 in queue_index
and thus dev_pick_tx returns 65536 which is a bogus queue and
can cause crash in dev_queue_xmit.

We fix this by only calling sk_tx_queue_get which does the proper
checks.  The interface is that sk_tx_queue_get returns the TX queue
if the sock argument is non-NULL and TX queue is recorded, else it
returns -1.  sk_tx_queue_recorded is no longer used so it can be
completely removed.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 20:50:29 -07:00
Chihau Chau 5487486858 Net: ethernet: pe2.c: fix EXPORT_SYMBOL macro code style issue
This patch fix a code style issue, if a function is exported, the
EXPORT_SYMBOL macro for it should follow immediately after the closing
function brace line.

Signed-off-by: Chihau Chau <chihau@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 18:27:09 -07:00
Doug Kehn 91a72a7059 net/core: neighbour update Oops
When configuring DMVPN (GRE + openNHRP) and a GRE remote
address is configured a kernel Oops is observed.  The
obserseved Oops is caused by a NULL header_ops pointer
(neigh->dev->header_ops) in neigh_update_hhs() when

void (*update)(struct hh_cache*, const struct net_device*, const unsigned char *)
= neigh->dev->header_ops->cache_update;

is executed.  The dev associated with the NULL header_ops is
the GRE interface.  This patch guards against the
possibility that header_ops is NULL.

This Oops was first observed in kernel version 2.6.26.8.

Signed-off-by: Doug Kehn <rdkehn@yahoo.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 18:02:16 -07:00
Dan Carpenter 0eff683f73 net/sched: potential data corruption
The reset_policy() does:
        memset(d->tcfd_defdata, 0, SIMP_MAX_DATA);
        strlcpy(d->tcfd_defdata, defdata, SIMP_MAX_DATA);

In the original code, the size of d->tcfd_defdata wasn't fixed and if
strlen(defdata) was less than 31, reset_policy() would cause memory
corruption.

Please Note:  The original alloc_defdata() assumes defdata is 32
characters and a NUL terminator while reset_policy() assumes defdata is
31 characters and a NUL.  This patch updates alloc_defdata() to match
reset_policy() (ie a shorter string).  I'm not very familiar with this
code so please review carefully.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 17:56:37 -07:00
Eric Dumazet 87fd308cfc net: skb_tx_hash() fix relative to skb_orphan_try()
commit fc6055a5ba (net: Introduce skb_orphan_try()) added early
orphaning of skbs.

This unfortunately added a performance regression in skb_tx_hash() in
case of stacked devices (bonding, vlans, ...)

Since skb->sk is now NULL, we cannot access sk->sk_hash anymore to
spread tx packets to multiple NIC queues on multiqueue devices.

skb_tx_hash() in this case only uses skb->protocol, same value for all
flows.

skb_orphan_try() can copy sk->sk_hash into skb->rxhash and skb_tx_hash()
can use this saved sk_hash value to compute its internal hash value.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 15:33:27 -07:00
Changli Gao 3a047bf87b rfs: call sock_rps_record_flow() in tcp_splice_read()
rfs: call sock_rps_record_flow() in tcp_splice_read()

call sock_rps_record_flow() in tcp_splice_read(), so the applications using
splice(2) or sendfile(2) can utilize RFS.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/ipv4/tcp.c |    1 +
 1 file changed, 1 insertion(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 14:45:15 -07:00
Timo Teräs d809ec8955 xfrm: do not assume that template resolving always returns xfrms
xfrm_resolve_and_create_bundle() assumed that, if policies indicated
presence of xfrms, bundle template resolution would always return
some xfrms. This is not true for 'use' level policies which can
result in no xfrm's being applied if there is no suitable xfrm states.
This fixes a crash by this incorrect assumption.

Reported-by: George Spelvin <linux@horizon.com>
Bisected-by: George Spelvin <linux@horizon.com>
Tested-by: George Spelvin <linux@horizon.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-14 14:16:48 -07:00
Johannes Berg ccb6c1360f cfg80211: don't get expired BSSes
When kernel-internal users use cfg80211_get_bss()
to get a reference to a BSS struct, they may end
up getting one that would have been removed from
the list if there had been any userspace access
to the list. This leads to inconsistencies and
problems.

Fix it by making cfg80211_get_bss() ignore BSSes
that cfg80211_bss_expire() would remove.

Fixes http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2180

Cc: stable@kernel.org
Reported-by: Jiajia Zheng <jiajia.zheng@intel.com>
Tested-by: Jiajia Zheng <jiajia.zheng@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-14 13:52:45 -04:00
John W. Linville e300d955de Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	drivers/net/wireless/wl12xx/wl1271_cmd.h
2010-07-13 15:57:29 -04:00
Joe Perches 55a40e2432 net/irda: Remove unnecessary casts of private_data
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 21:13:37 -07:00
Joe Perches 8a994a7180 net/core: Remove unnecessary casts of private_data
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 21:13:36 -07:00
Arnd Bergmann 15fd0cd9a2 net: autoconvert trivial BKL users to private mutex
All these files use the big kernel lock in a trivial
way to serialize their private file operations,
typically resulting from an earlier semi-automatic
pushdown from VFS.

None of these drivers appears to want to lock against
other code, and they all use the BKL as the top-level
lock in their file operations, meaning that there
is no lock-order inversion problem.

Consequently, we can remove the BKL completely,
replacing it with a per-file mutex in every case.
Using a scripted approach means we can avoid
typos.

file=$1
name=$2
if grep -q lock_kernel ${file} ; then
    if grep -q 'include.*linux.mutex.h' ${file} ; then
            sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
    else
            sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
    fi
    sed -i ${file} \
        -e "/^#include.*linux.mutex.h/,$ {
                1,/^\(static\|int\|long\)/ {
                     /^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);

} }"  \
    -e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
    -e '/[      ]*cycle_kernel_lock();/d'
else
    sed -i -e '/include.*\<smp_lock.h\>/d' ${file}  \
                -e '/cycle_kernel_lock()/d'
fi

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:21:47 -07:00
Eric Dumazet d361fd599a net: sock_free() optimizations
Avoid two extra instructions in sock_free(), to reload
skb->truesize and skb->sk

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:21:46 -07:00
Changli Gao 7ba4291007 inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage()
a new boolean flag no_autobind is added to structure proto to avoid the autobind
calls when the protocol is TCP. Then sock_rps_record_flow() is called int the
TCP's sendmsg() and sendpage() pathes.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_common.h |    4 ++++
 include/net/sock.h        |    1 +
 include/net/tcp.h         |    8 ++++----
 net/ipv4/af_inet.c        |   15 +++++++++------
 net/ipv4/tcp.c            |   11 +++++------
 net/ipv4/tcp_ipv4.c       |    3 +++
 net/ipv6/af_inet6.c       |    8 ++++----
 net/ipv6/tcp_ipv6.c       |    3 +++
 8 files changed, 33 insertions(+), 20 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:21:46 -07:00
Dan Carpenter 5c4bfa17f3 9p: strlen() doesn't count the terminator
This is an off by one bug because strlen() doesn't count the NULL
terminator.  We strcpy() addr into a fixed length array of size
UNIX_PATH_MAX later on.

The addr variable is the name of the device being mounted.

CC: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:21:43 -07:00
David S. Miller 336a283b9c dsa: Fix Kconfig dependencies.
Based upon a report by Randy Dunlap.

DSA needs PHYLIB, but PHYLIB needs NET_ETHERNET.  So, in order
to select PHYLIB we have to make DSA depend upon NET_ETHERNET.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:03:42 -07:00
Changli Gao 70c2efa5a3 act_nat: not all of the ICMP packets need an IP header payload
not all of the ICMP packets need an IP header payload, so we check the length
of the skbs only when the packets should have an IP header payload.

Based upon analysis and initial patch by Rodrigo Partearroyo González.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
----
 net/sched/act_nat.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 20:00:19 -07:00
Johannes Berg 643f82e32f cfg80211: ignore spurious deauth
Ever since mac80211/drivers are no longer
fully in charge of keeping track of the
auth status, trying to make them do so will
fail. Instead of warning and reporting the
deauthentication to userspace, cfg80211 must
simply ignore it so that spurious
deauthentications, e.g. before starting
authentication, aren't seen by userspace as
actual deauthentications.

Cc: stable@kernel.org
Reported-by: Paul Stewart <pstew@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-12 16:05:31 -04:00
Eric Dumazet 9e34a5b516 net/core: EXPORT_SYMBOL cleanups
CodingStyle cleanups

EXPORT_SYMBOL should immediately follow the symbol declaration.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 12:57:55 -07:00
Eric Dumazet 4bc2f18ba4 net/ipv4: EXPORT_SYMBOL cleanups
CodingStyle cleanups

EXPORT_SYMBOL should immediately follow the symbol declaration.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-12 12:57:54 -07:00
Uwe Kleine-König 698f93159a fix comment/printk typos concerning "already"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-11 21:45:40 +02:00
Ben Hutchings d77535162e net: Document that dev_get_stats() returns the given pointer
Document that dev_get_stats() returns the same stats pointer it was
given.  Remove const qualification from the returned pointer since the
caller may do what it likes with that structure.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-09 17:41:57 -07:00
Ben Hutchings 3cfde79c6c net: Get rid of rtnl_link_stats64 / net_device_stats union
In commit be1f3c2c02 "net: Enable 64-bit
net device statistics on 32-bit architectures" I redefined struct
net_device_stats so that it could be used in a union with struct
rtnl_link_stats64, avoiding the need for explicit copying or
conversion between the two.  However, this is unsafe because there is
no locking required and no lock consistently held around calls to
dev_get_stats() and use of the statistics structure it returns.

In commit 28172739f0 "net: fix 64 bit
counters on 32 bit arches" Eric Dumazet dealt with that problem by
requiring callers of dev_get_stats() to provide storage for the
result.  This means that the net_device::stats64 field and the padding
in struct net_device_stats are now redundant, so remove them.

Update the comment on net_device_ops::ndo_get_stats64 to reflect its
new usage.

Change dev_txq_stats_fold() to use struct rtnl_link_stats64, since
that is what all its callers are really using and it is no longer
going to be compatible with struct net_device_stats.

Eric Dumazet suggested the separate function for the structure
conversion.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-09 17:41:56 -07:00
Changli Gao 116e1f1b09 netfilter: xt_TPROXY: the length of lines should be within 80
According to the Documentation/CodingStyle, the length of lines should
be within 80.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-09 17:31:49 +02:00
Xiaoyu Du 8a0acaac80 ipvs: lvs sctp protocol handler is incorrectly invoked ip_vs_app_pkt_out
lvs sctp protocol handler is incorrectly invoked ip_vs_app_pkt_out
Since there's no sctp helpers at present, it does the same thing as
ip_vs_app_pkt_in.

Signed-off-by: Xiaoyu Du <tingsrain@gmail.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-09 17:27:47 +02:00
Karl Hiramoto 005066122b atm/br2684: register notifier event for carrier signal changes.
When a signal change event occurs call netif_carrier_on/off.

Signed-off-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-09 00:09:21 -07:00
Karl Hiramoto 7313bb8f3d atm: propagate signal changes via notifier
Add notifier chain for changes in atm_dev.

Clients like br2684 will call register_atmdevice_notifier() to be notified of
changes. Drivers will call atm_dev_signal_change() to notify clients like
br2684 of the change.

On DSL and ATM devices it's usefull to have a know if you have a carrier
signal. netdevice LOWER_UP changes can be propagated to userspace via netlink
monitor.

Signed-off-by: Karl Hiramoto <karl@hiramoto.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-09 00:09:20 -07:00
Eric Dumazet a204b48ed4 vlan: allow TSO setting on vlan interfaces
When we need to shape traffic using low speeds, we need to
disable tso on network interface :

ethtool -K eth0.2240 tso off

It seems vlan interfaces miss the set_tso() ethtool method.

Before enabling TSO, we must check real device supports
TSO for VLAN-tagged packets and enables TSO.

Note that a TSO change on real device propagates TSO setting
on all vlans, even if admin selected a different TSO setting.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-08 23:12:21 -07:00
Rémi Denis-Courmont 635f081541 Phonet: fix skb leak in pipe endpoint accept()
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-08 21:45:34 -07:00
Stephen Hemminger dd4ba83dc1 gre: propagate ipv6 transport class
This patch makes IPV6 over IPv4 GRE tunnel propagate the transport
class field from the underlying IPV6 header to the IPV4 Type Of Service
field. Without the patch, all IPV6 packets in tunnel look the same to QoS.

This assumes that IPV6 transport class is exactly the same
as IPv4 TOS. Not sure if that is always the case?  Maybe need
to mask off some bits.

The mask and shift to get tclass is copied from ipv6/datagram.c

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-08 21:35:58 -07:00
Ville Tervo 045309820a Bluetooth: Update sec_level/auth_type for already existing connections
Update auth level for already existing connections if it is lower
than required by new connection.

Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Reviewed-by: Emeltchenko Andrei <andrei.emeltchenko@nokia.com>
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-08 20:35:31 -03:00
Johan Hedberg da213f41cd Bluetooth: Reset the security level after an authentication failure
When authentication fails for a connection the assumed security level
should be set back to BT_SECURITY_LOW so that subsequent connect
attempts over the same link don't falsely assume that security is
adequate enough.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-08 20:35:27 -03:00
Andrei Emeltchenko e501d0553a Bluetooth: Check L2CAP pending status before sending connect request
Due to race condition in L2CAP state machine L2CAP Connection Request
may be sent twice for SDP with the same source channel id. Problems
reported connecting to Apple products, some carkit, Blackberry phones.

...
2010-06-07 21:18:03.651031 < ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Connect req: psm 1 scid 0x0040
2010-06-07 21:18:03.653473 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-06-07 21:18:03.653808 > HCI Event: Auth Complete (0x06) plen 3
    status 0x00 handle 1
2010-06-07 21:18:03.653869 < ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Connect req: psm 1 scid 0x0040
...

Patch uses L2CAP_CONF_CONNECT_PEND flag to mark that L2CAP Connection
Request has been sent already.

Modified version of patch from Ville Tervo.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-07-08 20:35:23 -03:00
John W. Linville 3473187d24 mac80211: remove wep dependency
The current mac80211 code assumes that WEP is always available.  If WEP
fails to initialize, ieee80211_register_hw will always fail.

In some cases (e.g. FIPS certification), the cryptography used by WEP is
unavailable.  However, in such cases there is no good reason why CCMP
encryption (or even no link level encryption) cannot be used.  So, this
patch removes mac80211's assumption that WEP (and TKIP) will always be
available for use.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-08 16:35:50 -04:00
Linus Torvalds 2aa72f6121 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (35 commits)
  NET: SB1250: Initialize .owner
  vxge: show startup message with KERN_INFO
  ll_temac: Fix missing iounmaps
  bridge: Clear IPCB before possible entry into IP stack
  bridge br_multicast: BUG: unable to handle kernel NULL pointer dereference
  net: Fix definition of netif_vdbg() when VERBOSE_DEBUG is defined
  net/ne: fix memory leak in ne_drv_probe()
  xfrm: fix xfrm by MARK logic
  virtio_net: fix oom handling on tx
  virtio_net: do not reschedule rx refill forever
  s2io: resolve statistics issues
  linux/net.h: fix kernel-doc warnings
  net: decreasing real_num_tx_queues needs to flush qdisc
  sched: qdisc_reset_all_tx is calling qdisc_reset without qdisc_lock
  qlge: fix a eeh handler to not add a pending timer
  qlge: Replacing add_timer() to mod_timer()
  usbnet: Set parent device early for netdev_printk()
  net: Revert "rndis_host: Poll status channel before control channel"
  netfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECT
  drivers: bluetooth: bluecard_cs.c: Fixed include error, changed to linux/io.h
  ...
2010-07-07 19:56:00 -07:00
David S. Miller 597e608a84 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-07-07 15:59:38 -07:00
George Kadianakis 49085bd7d4 net/ipv4/ip_output.c: Removal of unused variable in ip_fragment()
Removal of unused integer variable in ip_fragment().

Signed-off-by: George Kadianakis <desnacked@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-07 15:44:59 -07:00
Eric Dumazet 28172739f0 net: fix 64 bit counters on 32 bit arches
There is a small possibility that a reader gets incorrect values on 32
bit arches. SNMP applications could catch incorrect counters when a
32bit high part is changed by another stats consumer/provider.

One way to solve this is to add a rtnl_link_stats64 param to all
ndo_get_stats64() methods, and also add such a parameter to
dev_get_stats().

Rule is that we are not allowed to use dev->stats64 as a temporary
storage for 64bit stats, but a caller provided area (usually on stack)

Old drivers (only providing get_stats() method) need no changes.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-07 14:58:56 -07:00
Herbert Xu 17762060c2 bridge: Clear IPCB before possible entry into IP stack
The bridge protocol lives dangerously by having incestuous relations
with the IP stack.  In this instance an abomination has been created
where a bogus IPCB area from a bridged packet leads to a crash in
the IP stack because it's interpreted as IP options.

This patch papers over the problem by clearing the IPCB area in that
particular spot.  To fix this properly we'd also need to parse any
IP options if present but I'm way too lazy for that.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-07 14:43:33 -07:00
Jiri Slaby 60ea385ff2 NET: nl80211, fix lock imbalance and netdev referencing
Stanse found that nl80211_set_wiphy imporperly handles a lock and netdev
reference and contains unreachable code. It is because there return statement
isntead of assignment to result variable. Fix that.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jouni Malinen <j@w1.fi>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-07 15:48:18 -04:00
Artem Bityutskiy 8eab945c56 sunrpc: make the cache cleaner workqueue deferrable
This patch makes the cache_cleaner workqueue deferrable, to prevent
unnecessary system wake-ups, which is very important for embedded
battery-powered devices.

do_cache_clean() is called every 30 seconds at the moment, and often
makes the system wake up from its power-save sleep state. With this
change, when the workqueue uses a deferrable timer, the
do_cache_clean() invocation will be delayed and combined with the
closest "real" wake-up. This improves the power consumption situation.

Note, I tried to create a DECLARE_DELAYED_WORK_DEFERRABLE() helper
macro, similar to DECLARE_DELAYED_WORK(), but failed because of the
way the timer wheel core stores the deferrable flag (it is the
LSBit in the time->base pointer). My attempt to define a static
variable with this bit set ended up with the "initializer element is
not constant" error.

Thus, I have to use run-time initialization, so I created a new
cache_initialize() function which is called once when sunrpc is
being initialized.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-07-06 12:27:48 -04:00
Herbert Xu 7f285fa78d bridge br_multicast: BUG: unable to handle kernel NULL pointer dereference
On Tue, Jul 06, 2010 at 08:48:35AM +0800, Herbert Xu wrote:
>
> bridge: Restore NULL check in br_mdb_ip_get

Resend with proper attribution.

bridge: Restore NULL check in br_mdb_ip_get

Somewhere along the line the NULL check in br_mdb_ip_get went
AWOL, causing crashes when we receive an IGMP packet with no
multicast table allocated.

This patch restores it and ensures all br_mdb_*_get functions
use it.

Reported-by: Frank Arnold <frank.arnold@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-05 20:08:06 -07:00
Eric Dumazet fe76cda308 ipv4: use skb_dst_copy() in ip_copy_metadata()
Avoid touching dst refcount in ip_fragment().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-05 18:50:56 -07:00
Michal Marek 72c7664f92 ipvs: Kconfig cleanup
IP_VS_PROTO_AH_ESP should be set iff either of IP_VS_PROTO_{AH,ESP} is
selected. Express this with standard kconfig syntax.

Signed-off-by: Michal Marek <mmarek@suse.cz>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-05 10:42:37 +02:00
Eric Dumazet b13b7125e4 netfilter: ipt_REJECT: avoid touching dst ref
We can avoid a pair of atomic ops in ipt_REJECT send_reset()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-05 10:40:09 +02:00
Changli Gao 98b0e84aaa netfilter: ipt_REJECT: postpone the checksum calculation.
postpone the checksum calculation, then if the output NIC supports checksum
offloading, we can utlize it. And though the output NIC doesn't support
checksum offloading, but we'll mangle this packet, this can free us from
updating the checksum, as the checksum calculation occurs later.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-05 10:39:17 +02:00
Changli Gao ea8fbe8f19 netfilter: nf_conntrack_reasm: add fast path for in-order fragments
As the fragments are sent in order in most of OSes, such as Windows, Darwin and
FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
In the fast path, we check if the skb at the end of the inet_frag_queue is the
prev we expect.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-05 10:38:23 +02:00
Peter Kosyh 44b451f163 xfrm: fix xfrm by MARK logic
While using xfrm by MARK feature in
2.6.34 - 2.6.35 kernels, the mark
is always cleared in flowi structure via memset in
_decode_session4 (net/ipv4/xfrm4_policy.c), so
the policy lookup fails.
IPv6 code is affected by this bug too.

Signed-off-by: Peter Kosyh <p.kosyh@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 11:46:07 -07:00
Joe Perches 256df2f387 netdevice.h net/core/dev.c: Convert netdev_<level> logging macros to functions
Reduces an x86 defconfig text and data ~2k.
text is smaller, data is larger.

$ size vmlinux*
   text	   data	    bss	    dec	    hex	filename
7198862	 720112	1366288	9285262	 8dae8e	vmlinux
7205273	 716016	1366288	9287577	 8db799	vmlinux.device_h

Uses %pV and struct va_format
Format arguments are verified before printk

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 10:40:18 -07:00
David S. Miller e490c1defe Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2010-07-02 22:42:06 -07:00
David S. Miller 94e6721d9c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2010-07-02 22:04:49 -07:00
John Fastabend f0796d5c73 net: decreasing real_num_tx_queues needs to flush qdisc
Reducing real_num_queues needs to flush the qdisc otherwise
skbs with queue_mappings greater then real_num_tx_queues can
be sent to the underlying driver.

The flow for this is,

dev_queue_xmit()
	dev_pick_tx()
		skb_tx_hash()  => hash using real_num_tx_queues
		skb_set_queue_mapping()
	...
	qdisc_enqueue_root() => enqueue skb on txq from hash
...
dev->real_num_tx_queues -= n
...
sch_direct_xmit()
	dev_hard_start_xmit()
		ndo_start_xmit(skb,dev) => skb queue set with old hash

skbs are enqueued on the qdisc with skb->queue_mapping set
0 < queue_mappings < real_num_tx_queues.  When the driver
decreases real_num_tx_queues skb's may be dequeued from the
qdisc with a queue_mapping greater then real_num_tx_queues.

This fixes a case in ixgbe where this was occurring with DCB
and FCoE. Because the driver is using queue_mapping to map
skbs to tx descriptor rings we can potentially map skbs to
rings that no longer exist.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-02 21:59:07 -07:00
Ming Lei ecc3d5ae17 minstrel_ht: fix check for downgrading of top2 rate
The check should be against current top2 rate, instead of
current top rate.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-02 13:44:40 -04:00
Ming Lei 009271f918 minstrel_ht: fix updating rate with best probability
The throughput should be considered when updating rate
with best probability.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-07-02 13:44:39 -04:00
Eric Dumazet 499031ac8a netfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECT
We should release dst if dst->error is set.

Bug introduced in 2.6.14 by commit e104411b82
([XFRM]: Always release dst_entry on error in xfrm_lookup)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-02 10:05:01 +02:00
Patrick McHardy 4df53d8bab bridge: add per bridge device controls for invoking iptables
Support more fine grained control of bridge netfilter iptables invocation
by adding seperate brnf_call_*tables parameters for each device using the
sysfs interface. Packets are passed to layer 3 netfilter when either the
global parameter or the per bridge parameter is enabled.

Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-02 09:32:57 +02:00
David S. Miller 05318bc905 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
Conflicts:
	drivers/net/wireless/libertas/host.h
2010-07-01 17:34:14 -07:00
Ben Hutchings a5b6ee291e ethtool: Add support for control of RX flow hash indirection
Many NICs use an indirection table to map an RX flow hash value to one
of an arbitrary number of queues (not necessarily a power of 2).  It
can be useful to remove some queues from this indirection table so
that they are only used for flows that are specifically filtered
there.  It may also be useful to weight the mapping to account for
user processes with the same CPU-affinity as the RX interrupts.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 14:09:37 -07:00
Ben Hutchings 1437ce3983 ethtool: Change ethtool_op_set_flags to validate flags
ethtool_op_set_flags() does not check for unsupported flags, and has
no way of doing so.  This means it is not suitable for use as a
default implementation of ethtool_ops::set_flags.

Add a 'supported' parameter specifying the flags that the driver and
hardware support, validate the requested flags against this, and
change all current callers to pass this parameter.

Change some other trivial implementations of ethtool_ops::set_flags to
call ethtool_op_set_flags().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 14:09:35 -07:00
Changli Gao d6bebca92c fragment: add fast path for in-order fragments
add fast path for in-order fragments

As the fragments are sent in order in most of OSes, such as Windows, Darwin and
FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
In the fast path, we check if the skb at the end of the inet_frag_queue is the
prev we expect.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/inet_frag.h |    1 +
 net/ipv4/ip_fragment.c  |   12 ++++++++++++
 net/ipv6/reassembly.c   |   11 +++++++++++
 3 files changed, 24 insertions(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 13:44:29 -07:00
Eric Dumazet 4ce3c183fc snmp: 64bit ipstats_mib for all arches
/proc/net/snmp and /proc/net/netstat expose SNMP counters.

Width of these counters is either 32 or 64 bits, depending on the size
of "unsigned long" in kernel.

This means user program parsing these files must already be prepared to
deal with 64bit values, regardless of user program being 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

# netstat -s|egrep "InOctets|OutOctets"
    InOctets: 244068329096
    OutOctets: 244069348848

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 13:31:19 -07:00
Changli Gao 504f85c9d0 act_nat: use stack variable
act_nat: use stack variable

structure tc_nat isn't too big for stack, so we can put it in stack.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/sched/act_nat.c |   31 ++++++++++---------------------
 1 file changed, 10 insertions(+), 21 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 12:12:37 -07:00
Changli Gao 5acbf7f10b act_mirred: combine duplicate code
act_mirred: combine duplicate code

tcf_bstats is updated in any way, so we can do it earlier to reduce the size of
the code.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
----
 net/sched/act_mirred.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 12:12:36 -07:00
Helmut Schaa 92b50c4b5b mac82011: Allow selection of minstrel_ht as default rc algorithm
Allow selection of minstrel_ht as default rate control algorithm. At
the moment minstrel_ht can only be requested by the driver code but
not selected as default in make menuconfig. Fix this by using
minstrel_ht when minstrel was selected as default and minstrel_ht
is available.

This change won't affect legacy devices as minstrel_ht falls back to
minstrel in that case.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-30 15:00:53 -04:00
Sebastian Andrzej Siewior 70777d0346 net/core: use ntohs for skb->protocol
This is only noticed by people that are not doing everything correct in
the first place.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 10:39:19 -07:00
Ben Hutchings 784e2710ce ipv6: Use interface max_desync_factor instead of static default
max_desync_factor can be configured per-interface, but nothing is
using the value.

Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 10:28:43 -07:00
Ben Hutchings f56619fc72 ipv6: Clamp reported valid_lft to a minimum of 0
Since addresses are only revalidated every 2 minutes, the reported
valid_lft can underflow shortly before the address is deleted.
Clamp it to a minimum of 0, as for prefered_lft.

Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-30 10:28:43 -07:00
Nicolas Kaiser d1e3168916 net/Makefile: conditionally descend to wireless and ieee802154
Don't descend to wireless and ieee802154 unless they are actually used.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-29 15:32:43 -07:00
John W. Linville c466d4efb8 mac80211: add basic tracing to drv_get_survey
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-29 14:51:23 -04:00
John W. Linville ff3074a4dd mac80211: remove unnecessary check in ieee80211_dump_survey
This check is duplicated in drv_get_survey.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-29 13:55:04 -04:00
Ben Hutchings bf988435bd ethtool: Fix potential user buffer overflow for ETHTOOL_{G, S}RXFH
struct ethtool_rxnfc was originally defined in 2.6.27 for the
ETHTOOL_{G,S}RXFH command with only the cmd, flow_type and data
fields.  It was then extended in 2.6.30 to support various additional
commands.  These commands should have been defined to use a new
structure, but it is too late to change that now.

Since user-space may still be using the old structure definition
for the ETHTOOL_{G,S}RXFH commands, and since they do not need the
additional fields, only copy the originally defined fields to and
from user-space.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-29 01:00:29 -07:00
Ben Hutchings db048b6903 ethtool: Fix potential kernel buffer overflow in ETHTOOL_GRXCLSRLALL
On a 32-bit machine, info.rule_cnt >= 0x40000000 leads to integer
overflow and the buffer may be smaller than needed.  Since
ETHTOOL_GRXCLSRLALL is unprivileged, this can presumably be used for at
least denial of service.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-29 01:00:29 -07:00
Sjur Braendeland 01eebb53a6 caif: Kconfig and Makefile fixes
Use "depends on" instead of "if" in Kconfig files.
Fixed CAIF debug flag, and removed unnecessary clean-* options.

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-29 00:06:38 -07:00
Changli Gao 210d6de78c act_mirred: don't clone skb when skb isn't shared
don't clone skb when skb isn't shared

When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need
to clone a new skb. As the skb will be freed after this function returns, we
can use it freely once we get a reference to it.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/sch_generic.h |   11 +++++++++--
 net/sched/act_mirred.c    |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-28 23:24:32 -07:00
Eric Dumazet c4ead4c595 tcp: tso_fragment() might avoid GFP_ATOMIC
We can pass a gfp argument to tso_fragment() and avoid GFP_ATOMIC
allocations sometimes.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-28 23:24:31 -07:00
Eric Dumazet 9618e2ffd7 vlan: 64 bit rx counters
Use u64_stats_sync infrastructure to implement 64bit rx stats.

(tx stats are addressed later)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-28 23:24:31 -07:00
Eric Dumazet 7a9b2d5950 net: use this_cpu_ptr()
use this_cpu_ptr(p) instead of per_cpu_ptr(p, smp_processor_id())

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-28 23:24:29 -07:00
Felix Fietkau 38bdb650f9 mac80211: fix the for_each_sta_info macro
Because of an ambiguity in the for_each_sta_info macro, it can
currently only be used if the third parameter is set to 'sta'.
Fix this by renaming the parameter to '_sta'.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-28 15:16:20 -04:00
John W. Linville 5ed3bc7288 mac80211: use netif_receive_skb in ieee80211_tx_status callpath
This avoids the extra queueing from calling netif_rx.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-28 15:14:51 -04:00
John W. Linville 5548a8a113 mac80211: use netif_receive_skb in ieee80211_rx callpath
This avoids the extra queueing from calling netif_rx.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-28 15:14:51 -04:00
Patrick McHardy 7eb9282cd0 netfilter: ipt_LOG/ip6t_LOG: add option to print decoded MAC header
The LOG targets print the entire MAC header as one long string, which is not
readable very well:

IN=eth0 OUT= MAC=00:15:f2:24:91:f8:00:1b:24:dc:61:e6:08:00 ...

Add an option to decode known header formats (currently just ARPHRD_ETHER devices)
in their individual fields:

IN=eth0 OUT= MACSRC=00:1b:24:dc:61:e6 MACDST=00:15:f2:24:91:f8 MACPROTO=0800 ...
IN=eth0 OUT= MACSRC=00:1b:24:dc:61:e6 MACDST=00:15:f2:24:91:f8 MACPROTO=86dd ...

The option needs to be explicitly enabled by userspace to avoid breaking
existing parsers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-28 14:16:08 +02:00
Patrick McHardy cf377eb4ae netfilter: ipt_LOG/ip6t_LOG: remove comparison within loop
Remove the comparison within the loop to print the macheader by prepending
the colon to all but the first printk.

Based on suggestion by Jan Engelhardt <jengelh@medozas.de>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-28 14:12:41 +02:00
Linus Torvalds 31cafd9589 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (52 commits)
  phylib: Add autoload support for the LXT973 phy.
  ISDN: hysdn, fix potential NULL dereference
  vxge: fix memory leak in vxge_alloc_msix() error path
  isdn/gigaset: correct CAPI connection state storage
  isdn/gigaset: encode HLC and BC together
  isdn/gigaset: correct CAPI DATA_B3 Delivery Confirmation
  isdn/gigaset: correct CAPI voice connection encoding
  isdn/gigaset: honor CAPI application's buffer size request
  cpmac: do not leak struct net_device on phy_connect errors
  smc91c92_cs: fix the problem that lan & modem does not work simultaneously
  ipv6: fix NULL reference in proxy neighbor discovery
  Bluetooth: Bring back var 'i' increment
  xfrm: check bundle policy existance before dereferencing it
  sky2: enable rx/tx in sky2_phy_reinit()
  cnic: Disable statistics initialization for eth clients that do not support statistics
  net: add dependency on fw class module to qlcnic and netxen_nic
  snmp: fix SNMP_ADD_STATS()
  hso: remove setting of low_latency flag
  udp: Fix bogus UFO packet generation
  lasi82596: fix netdev_mc_count conversion
  ...
2010-06-27 11:28:02 -07:00
Florian Westphal 172d69e63c syncookies: add support for ECN
Allows use of ECN when syncookies are in effect by encoding ecn_ok
into the syn-ack tcp timestamp.

While at it, remove a uneeded #ifdef CONFIG_SYN_COOKIES.
With CONFIG_SYN_COOKIES=nm want_cookie is ifdef'd to 0 and gcc
removes the "if (0)".

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-26 22:00:03 -07:00
Florian Westphal 734f614bc1 syncookies: do not store rcv_wscale in tcp timestamp
As pointed out by Fernando Gont there is no need to encode rcv_wscale
into the cookie.

We did not use the restored rcv_wscale anyway; it is recomputed
via tcp_select_initial_window().

Thus we can save 4 bits in the ts option space by removing rcv_wscale.
In case window scaling was not supported, we set the (invalid) wscale
value 0xf.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-26 22:00:03 -07:00
Eric Dumazet 9587c6ddd4 ipv6: remove ipv6_statistics
commit 9261e53701 (ipv6: making ip and icmp statistics per/namespace)
forgot to remove ipv6_statistics variable.

commit bc417d99bf (ipv6: remove stale MIB definitions) took care of
icmpv6_statistics & icmpv6msg_statistics

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Denis V. Lunev <den@openvz.org>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:17 -07:00
Eric Dumazet 1823e4c80e snmp: add align parameter to snmp_mib_init()
In preparation for 64bit snmp counters for some mibs,
add an 'align' parameter to snmp_mib_init(), instead
of assuming mibs only contain 'unsigned long' fields.

Callers can use __alignof__(type) to provide correct
alignment.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:17 -07:00
Eric Dumazet 4b4194c40f arp: RCU change in arp_solicit()
Avoid two atomic ops in arp_solicit()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:16 -07:00
Gerrit Renker 59b80802a8 dccp: make implementation of Syn-RTT symmetric
This patch is thanks to Andre Noll who reported the issue and helped testing.

The Syn-RTT sampled during the initial handshake currently only works for
the client sending the DCCP-Request. TFRC penalizes the absence of an RTT
sample with a very slow initial speed (1 packet per second), which delays
slow-start significantly, resulting in sluggish performance.

This patch mirrors the "Syn RTT" principle by adding a timestamp also onto
the DCCP-Response, producing an RTT sample  when the (Data)Ack completing
the handshake arrives.

Also changed the documentation to 'TFRC' since Syn RTTs are also used by CCID-4.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:15 -07:00
Gerrit Renker a7d13fbf85 dccp: remove unused function argument
This removes an unused 'sk' argument from several option-inserting functions.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:14 -07:00
Joe Perches f9467eaec3 net/core/pktgen.c: Use pr_<level>
Add pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove "pktgen: " from formats
Convert printks to pr_<level>
Added func_enter() for debugging
Moved version to end of string at module_init
Coalesced long formats

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:12 -07:00
Hagen Paul Pfeifer 01f2f3f6ef net: optimize Berkeley Packet Filter (BPF) processing
Gcc is currenlty not in the ability to optimize the switch statement in
sk_run_filter() because of dense case labels. This patch replace the
OR'd labels with ordered sequenced case labels. The sk_chk_filter()
function is modified to patch/replace the original OPCODES in a
ordered but equivalent form. gcc is now in the ability to transform the
switch statement in sk_run_filter into a jump table of complexity O(1).

Until this patch gcc generates a sequence of conditional branches (O(n) of 567
byte .text segment size (arch x86_64):

7ff: 8b 06                 mov    (%rsi),%eax
801: 66 83 f8 35           cmp    $0x35,%ax
805: 0f 84 d0 02 00 00     je     adb <sk_run_filter+0x31d>
80b: 0f 87 07 01 00 00     ja     918 <sk_run_filter+0x15a>
811: 66 83 f8 15           cmp    $0x15,%ax
815: 0f 84 c5 02 00 00     je     ae0 <sk_run_filter+0x322>
81b: 77 73                 ja     890 <sk_run_filter+0xd2>
81d: 66 83 f8 04           cmp    $0x4,%ax
821: 0f 84 17 02 00 00     je     a3e <sk_run_filter+0x280>
827: 77 29                 ja     852 <sk_run_filter+0x94>
829: 66 83 f8 01           cmp    $0x1,%ax
[...]

With the modification the compiler translate the switch statement into
the following jump table fragment:

7ff: 66 83 3e 2c           cmpw   $0x2c,(%rsi)
803: 0f 87 1f 02 00 00     ja     a28 <sk_run_filter+0x26a>
809: 0f b7 06              movzwl (%rsi),%eax
80c: ff 24 c5 00 00 00 00  jmpq   *0x0(,%rax,8)
813: 44 89 e3              mov    %r12d,%ebx
816: e9 43 03 00 00        jmpq   b5e <sk_run_filter+0x3a0>
81b: 41 89 dc              mov    %ebx,%r12d
81e: e9 3b 03 00 00        jmpq   b5e <sk_run_filter+0x3a0>

Furthermore, I reordered the instructions to reduce cache line misses by
order the most common instruction to the start.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:33:12 -07:00
stephen hemminger 9f888160bd ipv6: fix NULL reference in proxy neighbor discovery
The addition of TLLAO option created a kernel OOPS regression
for the case where neighbor advertisement is being sent via
proxy path.  When using proxy, ipv6_get_ifaddr() returns NULL
causing the NULL dereference.

Change causing the bug was:
commit f7734fdf61
Author: Octavian Purdila <opurdila@ixiacom.com>
Date:   Fri Oct 2 11:39:15 2009 +0000

    make TLLAO option for NA packets configurable

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-25 21:16:57 -07:00
Tim Gardner d70a011dbb netfilter: complete the deprecation of CONFIG_NF_CT_ACCT
CONFIG_NF_CT_ACCT has been deprecated for awhile and
was originally scheduled for removal by 2.6.29.

Removing support for this config option also stops
this deprecation warning message in the kernel log.

[   61.669627] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   61.669850] CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
[   61.669852] nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
[   61.669853] sysctl net.netfilter.nf_conntrack_acct=1 to enable it.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
[Patrick: changed default value to 0]
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-25 14:46:56 +02:00
Tim Gardner a8756201ba netfilter: xt_connbytes: Force CT accounting to be enabled
Check at rule install time that CT accounting is enabled. Force it
to be enabled if not while also emitting a warning since this is not
the default state.

This is in preparation for deprecating CONFIG_NF_CT_ACCT upon which
CONFIG_NETFILTER_XT_MATCH_CONNBYTES depended being set.

Added 2 CT accounting support functions:

nf_ct_acct_enabled() - Get CT accounting state.
nf_ct_set_acct() - Enable/disable CT accountuing.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-25 14:44:07 +02:00
Gustavo F. Padovan 1a61a83ff5 Bluetooth: Bring back var 'i' increment
commit ff6e2163f2 accidentally added a
regression on the bnep code. Fixing it.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-24 22:08:37 -07:00
Konstantin Khorenko 565b7b2d2e tcp: do not send reset to already closed sockets
i've found that tcp_close() can be called for an already closed
socket, but still sends reset in this case (tcp_send_active_reset())
which seems to be incorrect.  Moreover, a packet with reset is sent
with different source port as original port number has been already
cleared on socket.  Besides that incrementing stat counter for
LINUX_MIB_TCPABORTONCLOSE also does not look correct in this case.

Initially this issue was found on 2.6.18-x RHEL5 kernel, but the same
seems to be true for the current mainstream kernel (checked on
2.6.35-rc3).  Please, correct me if i missed something.

How that happens:

1) the server receives a packet for socket in TCP_CLOSE_WAIT state
   that triggers a tcp_reset():

Call Trace:
 <IRQ>  [<ffffffff8025b9b9>] tcp_reset+0x12f/0x1e8
 [<ffffffff80046125>] tcp_rcv_state_process+0x1c0/0xa08
 [<ffffffff8003eb22>] tcp_v4_do_rcv+0x310/0x37a
 [<ffffffff80028bea>] tcp_v4_rcv+0x74d/0xb43
 [<ffffffff8024ef4c>] ip_local_deliver_finish+0x0/0x259
 [<ffffffff80037131>] ip_local_deliver+0x200/0x2f4
 [<ffffffff8003843c>] ip_rcv+0x64c/0x69f
 [<ffffffff80021d89>] netif_receive_skb+0x4c4/0x4fa
 [<ffffffff80032eca>] process_backlog+0x90/0xec
 [<ffffffff8000cc50>] net_rx_action+0xbb/0x1f1
 [<ffffffff80012d3a>] __do_softirq+0xf5/0x1ce
 [<ffffffff8001147a>] handle_IRQ_event+0x56/0xb0
 [<ffffffff8006334c>] call_softirq+0x1c/0x28
 [<ffffffff80070476>] do_softirq+0x2c/0x85
 [<ffffffff80070441>] do_IRQ+0x149/0x152
 [<ffffffff80062665>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff80008a2e>] __handle_mm_fault+0x6cd/0x1303
 [<ffffffff80008903>] __handle_mm_fault+0x5a2/0x1303
 [<ffffffff80033a9d>] cache_free_debugcheck+0x21f/0x22e
 [<ffffffff8006a263>] do_page_fault+0x49a/0x7dc
 [<ffffffff80066487>] thread_return+0x89/0x174
 [<ffffffff800c5aee>] audit_syscall_exit+0x341/0x35c
 [<ffffffff80062e39>] error_exit+0x0/0x84

tcp_rcv_state_process()
...  // (sk_state == TCP_CLOSE_WAIT here)
...
        /* step 2: check RST bit */
        if(th->rst) {
                tcp_reset(sk);
                goto discard;
        }
...
---------------------------------
tcp_rcv_state_process
 tcp_reset
  tcp_done
   tcp_set_state(sk, TCP_CLOSE);
     inet_put_port
      __inet_put_port
       inet_sk(sk)->num = 0;

   sk->sk_shutdown = SHUTDOWN_MASK;

2) After that the process (socket owner) tries to write something to
   that socket and "inet_autobind" sets a _new_ (which differs from
   the original!) port number for the socket:

 Call Trace:
  [<ffffffff80255a12>] inet_bind_hash+0x33/0x5f
  [<ffffffff80257180>] inet_csk_get_port+0x216/0x268
  [<ffffffff8026bcc9>] inet_autobind+0x22/0x8f
  [<ffffffff80049140>] inet_sendmsg+0x27/0x57
  [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
  [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
  [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
  [<ffffffff8001fb49>] __pollwait+0x0/0xdd
  [<ffffffff8008d533>] default_wake_function+0x0/0xe
  [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
  [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
  [<ffffffff80066538>] thread_return+0x13a/0x174
  [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
  [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
  [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
  [<ffffffff800622dd>] tracesys+0xd5/0xe0

3) sendmsg fails at last with -EPIPE (=> 'write' returns -EPIPE in userspace):

F: tcp_sendmsg1 -EPIPE: sk=ffff81000bda00d0, sport=49847, old_state=7, new_state=7, sk_err=0, sk_shutdown=3

Call Trace:
 [<ffffffff80027557>] tcp_sendmsg+0xcb/0xe87
 [<ffffffff80033300>] release_sock+0x10/0xae
 [<ffffffff8016f20f>] vgacon_cursor+0x0/0x1a7
 [<ffffffff8026bd32>] inet_autobind+0x8b/0x8f
 [<ffffffff8003a9d9>] do_sock_write+0xae/0xea
 [<ffffffff80226ac7>] sock_writev+0xdc/0xf6
 [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe
 [<ffffffff8001fb49>] __pollwait+0x0/0xdd
 [<ffffffff8008d533>] default_wake_function+0x0/0xe
 [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e
 [<ffffffff800f0b49>] do_readv_writev+0x163/0x274
 [<ffffffff80066538>] thread_return+0x13a/0x174
 [<ffffffff800145d8>] tcp_poll+0x0/0x1c9
 [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3
 [<ffffffff800f0dd0>] sys_writev+0x49/0xe4
 [<ffffffff800622dd>] tracesys+0xd5/0xe0

tcp_sendmsg()
...
        /* Wait for a connection to finish. */
        if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
                int old_state = sk->sk_state;
                if ((err = sk_stream_wait_connect(sk, &timeo)) != 0) {
if (f_d && (err == -EPIPE)) {
        printk("F: tcp_sendmsg1 -EPIPE: sk=%p, sport=%u, old_state=%d, new_state=%d, "
                "sk_err=%d, sk_shutdown=%d\n",
                sk, ntohs(inet_sk(sk)->sport), old_state, sk->sk_state,
                sk->sk_err, sk->sk_shutdown);
        dump_stack();
}
                        goto out_err;
                }
        }
...

4) Then the process (socket owner) understands that it's time to close
   that socket and does that (and thus triggers sending reset packet):

Call Trace:
...
 [<ffffffff80032077>] dev_queue_xmit+0x343/0x3d6
 [<ffffffff80034698>] ip_output+0x351/0x384
 [<ffffffff80251ae9>] dst_output+0x0/0xe
 [<ffffffff80036ec6>] ip_queue_xmit+0x567/0x5d2
 [<ffffffff80095700>] vprintk+0x21/0x33
 [<ffffffff800070f0>] check_poison_obj+0x2e/0x206
 [<ffffffff80013587>] poison_obj+0x36/0x45
 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
 [<ffffffff80023481>] dbg_redzone1+0x1c/0x25
 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d
 [<ffffffff8000ca94>] cache_alloc_debugcheck_after+0x189/0x1c8
 [<ffffffff80023405>] tcp_transmit_skb+0x764/0x786
 [<ffffffff8025df8a>] tcp_send_active_reset+0xf9/0x14d
 [<ffffffff80258ff1>] tcp_close+0x39a/0x960
 [<ffffffff8026be12>] inet_release+0x69/0x80
 [<ffffffff80059b31>] sock_release+0x4f/0xcf
 [<ffffffff80059d4c>] sock_close+0x2c/0x30
 [<ffffffff800133c9>] __fput+0xac/0x197
 [<ffffffff800252bc>] filp_close+0x59/0x61
 [<ffffffff8001eff6>] sys_close+0x85/0xc7
 [<ffffffff800622dd>] tracesys+0xd5/0xe0

So, in brief:

* a received packet for socket in TCP_CLOSE_WAIT state triggers
  tcp_reset() which clears inet_sk(sk)->num and put socket into
  TCP_CLOSE state

* an attempt to write to that socket forces inet_autobind() to get a
  new port (but the write itself fails with -EPIPE)

* tcp_close() called for socket in TCP_CLOSE state sends an active
  reset via socket with newly allocated port

This adds an additional check in tcp_close() for already closed
sockets. We do not want to send anything to closed sockets.

Signed-off-by: Konstantin Khorenko <khorenko@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-24 21:54:58 -07:00
Andrew Morton deb0d7c740 net: fix "netpoll: Allow netpoll_setup/cleanup recursion"
Remove rtnl_unlock() which had no corresponding rtnl_lock().

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-24 20:33:04 -07:00
Timo Teräs b1312c89f0 xfrm: check bundle policy existance before dereferencing it
Fix the bundle validation code to not assume having a valid policy.
When we have multiple transformations for a xfrm policy, the bundle
instance will be a chain of bundles with only the first one having
the policy reference. When policy_genid is bumped it will expire the
first bundle in the chain which is equivalent of expiring the whole
chain.

Reported-bisected-and-tested-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-24 14:35:00 -07:00
Juuso Oikarinen 98d2ff8bec nl80211: Add option to adjust transmit power
This patch adds transmit power setting type and transmit power level attributes
to NL80211_CMD_SET_WIPHY in order to facilitate adjusting of the transmit power
level of the device.

The added attributes allow selection of automatic, limited or fixed transmit
power level, with the level definable in signed mBm format.

Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-24 15:42:37 -04:00
Juuso Oikarinen fa61cf70a6 cfg80211/mac80211: Update set_tx_power to use mBm instead of dBm units
In preparation for a TX power setting interface in the nl80211, change the
.set_tx_power function to use mBm units instead of dBm for greater accuracy and
smaller power levels.

Also, already in advance move the tx_power_setting enumeration to nl80211.

This change affects the .tx_set_power function prototype. As a result, the
corresponding changes are needed to modules using it. These are mac80211,
iwmc3200wifi and rndis_wlan.

Cc: Samuel Ortiz <samuel.ortiz@intel.com>
Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Acked-by: Samuel Ortiz <samuel.ortiz@intel.com>
Acked-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-24 15:42:33 -04:00
John W. Linville c937019761 mac80211: avoid scheduling while atomic in mesh_rx_plink_frame
While mesh_rx_plink_frame holds sta->lock...

mesh_rx_plink_frame ->
	mesh_plink_inc_estab_count ->
		ieee80211_bss_info_change_notify

...but ieee80211_bss_info_change_notify is allowed to sleep.  A driver
taking advantage of that allowance can cause a scheduling while
atomic bug.  Similar paths exist for mesh_plink_dec_estab_count,
so work around those as well.

http://bugzilla.kernel.org/show_bug.cgi?id=16099

Also, correct a minor kerneldoc comment error (mismatched function names).

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: stable@kernel.org
2010-06-24 15:42:30 -04:00
John W. Linville de66bfd85c minstrel_ht: move minstrel_mcs_groups declaration to header file
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
2010-06-24 15:42:18 -04:00
John W. Linville 670b7f11ff wireless: mark reg_mutex as static
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-24 15:42:12 -04:00
John W. Linville d5ece2150a minstrel_ht: make *idx unsigned in minstrel_downgrade_rate
net/mac80211/rc80211_minstrel_ht.c:440:46: warning: incorrect type in argument 2 (different signedness)
net/mac80211/rc80211_minstrel_ht.c:440:46:    expected int *idx
net/mac80211/rc80211_minstrel_ht.c:440:46:    got unsigned int *<noident>
net/mac80211/rc80211_minstrel_ht.c:446:46: warning: incorrect type in argument 2 (different signedness)
net/mac80211/rc80211_minstrel_ht.c:446:46:    expected int *idx
net/mac80211/rc80211_minstrel_ht.c:446:46:    got unsigned int *<noident>

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
2010-06-24 15:41:26 -04:00
John W. Linville 292b4df62a mac80211: don't shadow mgmt variable in ieee80211_rx_h_action
net/mac80211/rx.c:2059:39: warning: symbol 'mgmt' shadows an earlier one
net/mac80211/rx.c:1916:31: originally declared here

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-24 11:13:56 -04:00
David S. Miller 8244132ea8 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/ipv4/ip_output.c
2010-06-23 18:26:27 -07:00
Jiri Olsa 7b2ff18ee7 net - IP_NODEFRAG option for IPv4 socket
this patch is implementing IP_NODEFRAG option for IPv4 socket.
The reason is, there's no other way to send out the packet with user
customized header of the reassembly part.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-23 13:16:38 -07:00
Eric Dumazet 406818ff34 bridge: 64bit rx/tx counters
Use u64_stats_sync infrastructure to provide 64bit rx/tx
counters even on 32bit hosts.

It is safe to use a single u64_stats_sync for rx and tx,
because BH is disabled on both, and we use per_cpu data.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-23 13:00:48 -07:00
John Fastabend 6afff0caa7 net: consolidate netif_needs_gso() checks
netif_needs_gso() is checked twice in the TX path once,
before submitting the skb to the qdisc and once after
it is dequeued from the qdisc just before calling
ndo_hard_start().  This opens a window for a user to
change the gso/tso or tx checksum settings that can
cause netif_needs_gso to be true in one check and false
in the other.

Specifically, changing TX checksum setting may cause
the warning in skb_gso_segment() to be triggered if
the checksum is calculated earlier.

This consolidates the netif_needs_gso() calls so that
the stack only checks if gso is needed in
dev_hard_start_xmit().

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-23 12:58:41 -07:00
Trond Myklebust b76ce56192 SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()
If the attempt to read the calldir fails, then instead of storing the read
bytes, we currently discard them. This leads to a garbage final result when
upon re-entry to the same routine, we read the remaining bytes.

Fixes the regression in bugzilla number 16213. Please see
    https://bugzilla.kernel.org/show_bug.cgi?id=16213

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2010-06-22 13:21:18 -04:00
Arnd Hannemann fe6fb55285 netfilter: fix simple typo in KConfig for netfiltert xt_TEE
Destination was spelled wrong in KConfig.

Signed-off-by: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-22 08:22:21 +02:00
Randy Dunlap 600069daf7 netfilter: xt_IDLETIMER needs kdev_t.h
Add header file to fix build error:
net/netfilter/xt_IDLETIMER.c:276: error: implicit declaration of function 'MKDEV'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-22 08:13:31 +02:00
Nick Chalk 26ec037f98 IPVS: one-packet scheduling
Allow one-packet scheduling for UDP connections. When the fwmark-based or
normal virtual service is marked with '-o' or '--ops' options all
connections are created only to schedule one packet. Useful to schedule UDP
packets from same client port to different real servers. Recommended with
RR or WRR schedulers (the connections are not visible with ipvsadm -L).

Signed-off-by: Nick Chalk <nick@loadbalancer.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-22 08:07:01 +02:00
Herbert Xu 26cde9f7e2 udp: Fix bogus UFO packet generation
It has been reported that the new UFO software fallback path
fails under certain conditions with NFS.  I tracked the problem
down to the generation of UFO packets that are smaller than the
MTU.  The software fallback path simply discards these packets.

This patch fixes the problem by not generating such packets on
the UFO path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-21 13:57:34 -07:00
Juuso Oikarinen f90754c15f mac80211: Add interface for driver to temporarily disable dynamic ps
This mechanism introduced in this patch applies (at least) for hardware
designs using a single shared antenna for both WLAN and BT. In these designs,
the antenna must be toggled between WLAN and BT.

In those hardware, managing WLAN co-existence with Bluetooth requires WLAN
full power save whenever there is Bluetooth activity in order for WLAN to be
able to periodically relinquish the antenna to be used for BT. This is because
BT can only access the shared antenna when WLAN is idle or asleep.

Some hardware, for instance the wl1271, are able to indicate to the host
whenever there is BT traffic. In essence, the hardware will send an indication
to the host whenever there is, for example, SCO traffic or A2DP traffic, and
will send another indication when the traffic is over.

The hardware gets information of Bluetooth traffic via hardware co-existence
control lines - these lines are used to negotiate the shared antenna
ownership. The hardware will give the antenna to BT whenever WLAN is sleeping.

This patch adds the interface to mac80211 to facilitate temporarily disabling
of dynamic power save as per request of the WLAN driver. This interface will
immediately force WLAN to full powersave, hence allowing BT coexistence as
described above.

In these kind of shared antenna desings, when WLAN powersave is fully disabled,
Bluetooth will not work simultaneously with WLAN at all. This patch does not
address that problem. This interface will not change PSM state, so if PSM is
disabled it will remain so. Solving this problem requires knowledge about BT
state, and is best done in user-space.

Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-21 15:39:59 -04:00
Gertjan van Wingerde fb63bc4177 mac80211: Fix compile warning in scan.c.
Fix the following compile warning:

CC [M]  net/mac80211/scan.o
net/mac80211/scan.c: In function 'ieee80211_request_internal_scan':
net/mac80211/scan.c:749:23: warning: comparison between 'enum nl80211_band' and 'enum ieee80211_band'

caused by the local variable band not being of the proper 'ieee80211_band' type.

Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-21 15:39:59 -04:00
Sjur Braendeland 69ad78208e caif: Add debug connection type for CAIF.
Added new CAIF protocol type CAIFPROTO_DEBUG for accessing
CAIF debug on the ST Ericsson modems.

There are two debug servers on the modem, one for radio related
debug (CAIF_RADIO_DEBUG_SERVICE) and the other for
communication/application related debug (CAIF_COM_DEBUG_SERVICE).

The debug connection can contain trace debug printouts or
interactive debug used for debugging and test.

Debug connections can be of type STREAM or SEQPACKET.

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20 19:46:07 -07:00
Sjur Braendeland 2aa40aef9d caif: Use link layer MTU instead of fixed MTU
Previously CAIF supported maximum transfer size of ~4050.
The transfer size is now calculated dynamically based on the
link layers mtu size.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20 19:46:06 -07:00
Sjur Braendeland a7da1f55a8 caif: Bugfix - RFM must support segmentation.
CAIF Remote File Manager may send or receive more than 4050 bytes.
Due to this The CAIF RFM service have to support segmentation.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20 19:46:05 -07:00
Sjur Braendeland b1c74247b9 caif: Bugfix not all services uses flow-ctrl.
Flow control is not used by all CAIF services.
The usage of flow control is now part of the gerneal
initialization function for CAIF Services.

Signed-off-by: Sjur Braendeland@stericsson.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-20 19:46:05 -07:00
Johannes Berg 543708be32 mac80211: fix sw scan bracketing
Currently, detection in hwsim and ath9k can
detect that two sw scans are in flight at the
same time, which isn't really true. It is
caused by a race condition, because the scan
complete callback is called too late, after
the lock has been dropped, so that a new scan
can be started before it is called.

It is also called too early semantically, as
it is currently called _after_ the return to
the operating channel -- it should be before
so that drivers know this is the operating
channel again.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-18 15:11:14 -04:00
Uwe Kleine-König 2fcc9f731b wireless: move regulatory_init to .init.text
regulatory_init is only called by cfg80211_init which is in .init.text,
too.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-18 15:11:13 -04:00
Uwe Kleine-König f884e3879b cfg80211: move cfg80211_exit to .exit.text
cfg80211_exit is only used as module_exit function, so it can go to
.exit.text saving a few bytes when CONFIG_CFG80211=y.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-06-18 15:11:13 -04:00
David S. Miller bb9c03d8a6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2010-06-17 14:19:06 -07:00
stephen hemminger 25442e06d2 bridge: fdb cleanup runs too often
It is common in end-node, non STP bridges to set forwarding
delay to zero; which causes the forwarding database cleanup
to run every clock tick. Change to run only as soon as needed
or at next ageing timer interval which ever is sooner.

Use round_jiffies_up macro rather than attempting round up
by changing value.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-17 13:49:14 -07:00
John W. Linville abf52f86aa Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
Conflicts:
	net/mac80211/mlme.c
2010-06-17 16:21:14 -04:00
Patrick McHardy c68cd6cc21 netfilter: nf_nat: support user-specified SNAT rules in LOCAL_IN
2.6.34 introduced 'conntrack zones' to deal with cases where packets
from multiple identical networks are handled by conntrack/NAT. Packets
are looped through veth devices, during which they are NATed to private
addresses, after which they can continue normally through the stack
and possibly have NAT rules applied a second time.

This works well, but is needlessly complicated for cases where only
a single SNAT/DNAT mapping needs to be applied to these packets. In that
case, all that needs to be done is to assign each network to a seperate
zone and perform NAT as usual. However this doesn't work for packets
destined for the machine performing NAT itself since its corrently not
possible to configure SNAT mappings for the LOCAL_IN chain.

This patch adds a new INPUT chain to the NAT table and changes the
targets performing SNAT to be usable in that chain.

Example usage with two identical networks (192.168.0.0/24) on eth0/eth1:

iptables -t raw -A PREROUTING -i eth0 -j CT --zone 1
iptables -t raw -A PREROUTING -i eth0 -j MARK --set-mark 1
iptables -t raw -A PREROUTING -i eth1 -j CT --zone 2
iptabels -t raw -A PREROUTING -i eth1 -j MARK --set-mark 2

iptables -t nat -A INPUT       -m mark --mark 1 -j NETMAP --to 10.0.0.0/24
iptables -t nat -A POSTROUTING -m mark --mark 1 -j NETMAP --to 10.0.0.0/24
iptables -t nat -A INPUT       -m mark --mark 2 -j NETMAP --to 10.0.1.0/24
iptables -t nat -A POSTROUTING -m mark --mark 2 -j NETMAP --to 10.0.1.0/24

iptables -t raw -A PREROUTING -d 10.0.0.0/24 -j CT --zone 1
iptables -t raw -A OUTPUT     -d 10.0.0.0/24 -j CT --zone 1
iptables -t raw -A PREROUTING -d 10.0.1.0/24 -j CT --zone 2
iptables -t raw -A OUTPUT     -d 10.0.1.0/24 -j CT --zone 2

iptables -t nat -A PREROUTING -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A OUTPUT     -d 10.0.0.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A PREROUTING -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24
iptables -t nat -A OUTPUT     -d 10.0.1.0/24 -j NETMAP --to 192.168.0.0/24

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-17 06:12:26 +02:00
David S. Miller 3924773a5a net: Export cred_to_ucred to modules.
AF_UNIX references this, and can be built as a module,
so...

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 16:18:25 -07:00
Eric W. Biederman 6616f7888c af_unix: Allow connecting to sockets in other network namespaces.
Remove the restriction that only allows connecting to a unix domain
socket identified by unix path that is in the same network namespace.

Crossing network namespaces is always tricky and we did not support
this at first, because of a strict policy of don't mix the namespaces.
Later after Pavel proposed this we did not support this because no one
had performed the audit to make certain using unix domain sockets
across namespaces is safe.

What fundamentally makes connecting to af_unix sockets in other
namespaces is safe is that you have to have the proper permissions on
the unix domain socket inode that lives in the filesystem.  If you
want strict isolation you just don't create inodes where unfriendlys
can get at them, or with permissions that allow unfriendlys to open
them.  All nicely handled for us by the mount namespace and other
standard file system facilities.

I looked through unix domain sockets and they are a very controlled
environment so none of the work that goes on in dev_forward_skb to
make crossing namespaces safe appears needed, we are not loosing
controll of the skb and so do not need to set up the skb to look like
it is comming in fresh from the outside world.  Further the fields in
struct unix_skb_parms should not have any problems crossing network
namespaces.

Now that we handle SCM_CREDENTIALS in a way that gives useable values
across namespaces.  There does not appear to be any operational
problems with encouraging the use of unix domain sockets across
containers either.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:58:17 -07:00
Eric W. Biederman 7361c36c52 af_unix: Allow credentials to work across user and pid namespaces.
In unix_skb_parms store pointers to struct pid and struct cred instead
of raw uid, gid, and pid values, then translate the credentials on
reception into values that are meaningful in the receiving processes
namespaces.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:58:16 -07:00
Eric W. Biederman 257b5358b3 scm: Capture the full credentials of the scm sender.
Start capturing not only the userspace pid, uid and gid values of the
sending process but also the struct pid and struct cred of the sending
process as well.

This is in preparation for properly supporting SCM_CREDENTIALS for
sockets that have different uid and/or pid namespaces at the different
ends.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:55:56 -07:00
Eric W. Biederman b47030c71d af_netlink: Add needed scm_destroy after scm_send.
scm_send occasionally allocates state in the scm_cookie, so I have
modified netlink_sendmsg to guarantee that when scm_send succeeds
scm_destory will be called to free that state.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:55:56 -07:00
Eric W. Biederman 109f6e39fa af_unix: Allow SO_PEERCRED to work across namespaces.
Use struct pid and struct cred to store the peer credentials on struct
sock.  This gives enough information to convert the peer credential
information to a value relative to whatever namespace the socket is in
at the time.

This removes nasty surprises when using SO_PEERCRED on socket
connetions where the processes on either side are in different pid and
user namespaces.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:55:55 -07:00
Eric W. Biederman 3f551f9436 sock: Introduce cred_to_ucred
To keep the coming code clear and to allow both the sock
code and the scm code to share the logic introduce a
fuction to translate from struct cred to struct ucred.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-16 14:55:35 -07:00