The following commit:
b6c40d68ff
net: only invoke dev->change_rx_flags when device is UP
tried to fix a problem with VLAN devices and promiscuouse flag setting.
The issue was that VLAN device was setting a flag on an interface that
was down, thus resulting in bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.
A later commit:
deede2fabe
vlan: Don't propagate flag changes on down interfaces
fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.
The problem we have now is that if we have create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices. A simple examle of the scenario is the
following:
eth0----> bond0 ----> bridge0 ---> vlan50
If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge. As a
result, packets with vlan50 will be dropped by the interface.
The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation. As a result we can remove the generic solution
introduced in b6c40d68ff and leave
it to the individual devices to decide whether they will block
flag propagation or not.
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Suggested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CPUs can ask for local route via ip_route_input_noref() concurrently.
if nh_rth_input is not cached yet, CPUs will proceed to allocate
equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
via rt_cache_route()
Most of the time they succeed, but on occasion the following two lines:
orig = *p;
prev = cmpxchg(p, orig, rt);
in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
so dst is leaking. dst_destroy() is never called and 'lo' device
refcnt doesn't go to zero, which can be seen in the logs as:
unregister_netdevice: waiting for lo to become free. Usage count = 1
Adding mdelay() between above two lines makes it easily reproducible.
Fix it similar to nh_pcpu_rth_output case.
Fixes: d2d68ba9fe ("ipv4: Cache input routes in fib_info nexthops.")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang says:
====================
r8152 bug fixes
For the patch #3, I add netif_tx_lock() before checking the
netif_queue_stopped(). Besides, I add checking the skb queue
length before waking the tx queue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The data from the hardware should be little endian. Correct the
declaration.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The maximum packet number which a tx aggregation buffer could contain
is the tx_qlen.
tx_qlen = buffer size / (packet size + descriptor size).
If the tx buffer is empty and the queued packets are more than the
maximum value which is defined above, stop the tx queue. Wake the
tx queue if tx queue is stopped and the queued packets are less than
tx_qlen.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the code for sending the packet in the rtl8152_start_xmit().
Let rtl8152_start_xmit() to queue the packet only, and schedule a
tasklet to send the queued packets. This simplify the code and make
sure all the packet would be sent by the original order.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tx/rx would access the memory which is out of the desired range.
Modify the method of checking the end of the memory to avoid it.
For r8152_tx_agg_fill(), the variable remain may become negative.
However, the declaration is unsigned, so the while loop wouldn't
break when reaching the end of the desied memory. Although to change
the declaration from unsigned to signed is enough to fix it, I also
modify the checking method for safe. Replace
remain = rx_buf_sz - sizeof(*tx_desc) -
(u32)((void *)tx_data - agg->head);
with
remain = rx_buf_sz - (int)(tx_agg_align(tx_data) - agg->head);
to make sure the variable remain is always positive. Then, the
overflow wouldn't happen.
For rx_bottom(), the rx_desc should not be used to calculate the
packet length before making sure the rx_desc is in the desired range.
Change the checking to two parts. First, check the descriptor is in
the memory. The other, using the descriptor to find out the packet
length and check if the packet is in the memory.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch exposes default_erl as a TPG attribute so that it may be
set TPG wide in demo-mode, but still allow the existing NodeACL
attribute to be overridden on a per initiator basis.
Reported-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Split up the various ALUA states into individual attributes to
make parsing easier and adhere to the one value per attribute
sysfs principle.
(nab: Convert strict_strtoul -> kstrtoul usage)
Signed-off-by: Hannes Reinecke <hare@suse.de>
The supported ALUA states might be different for individual
devices, so store it in a separate field.
(nab: Remove unnecessary line continuation)
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Rename ALUA_ACCESS_STATE_OPTMIZED to
ALUA_ACCESS_STATE_OPTIMIZED.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Since acpi_bus_get_device() returns plain int and not acpi_status,
ACPI_FAILURE() should not be used for checking its return value. Fix
that.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
When a codec is resumed, it keeps the power on while the resuming
phase via hda_keep_power_on(), then turns down via
snd_hda_power_down(). At that point, snd_hda_power_down() notifies
the power down to the controller, and this may confuse the refcount if
the codec was already powered up before the resume.
In the end result, the controller goes to runtime suspend even before
the codec is kicked off to the power save, and the communication
stalls happens.
The fix is to add the power-up notification together with
hda_keep_power_on(), and clears the flag appropriately.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The FLL must be placed into free-run mode before disabling
to allow it to entirely shut down.
Signed-off-by: Richard Fitzgerald <rf@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org
This regression has been introduced in
commit 4fe8590a92
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Sep 4 18:25:22 2013 +0300
drm/i915: Use adjusted_mode appropriately when computing watermarks
I guess we should renable the enabled local variable into something a
notch more descriptive, but that's something for -next.
The effect on my i945gme netbook is pretty severe amounts of underruns
- usually the very first pixel gets used for the entire screeen.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
You're looking at a casual headset patch,
for a specific hardware it will match,
and suddenly, the headset jack will work,
so please apply this simple quirk!
BugLink: https://bugs.launchpad.net/bugs/1253038
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Addresses
"[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE".
In the first occurence it was used to try to be nice while releasing the
mmap_sem and retrying the fault to work around a locking inversion.
The second occurence was never used.
There has been some discussion whether we should change the locking order to
mmap_sem -> bo_reserve. This patch doesn't address that issue, and leaves
that locking order undefined. The solution that we release the mmap_sem if
tryreserve fails and wait for the buffer to become unreserved is something
we want in any case, and follows how the core vm system waits for pages
to be come unlocked while releasing the mmap_sem.
The code also outlines what needs to be changed if we want to establish the
locking order as mmap_sem -> bo::reserve.
One slight issue that remains with this code is that the fault handler might
be prone to starvation if another thread countinously reserves the buffer.
IMO that usage pattern is highly unlikely.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
If ttm_bo_move_memcpy was instructed to move a non-populated ttm to
io memory, it would first populate the ttm, then move the data and then
destroy the ttm. That's stupid. However, some drivers might have relied on
this to clear io memory from old stuff. So instead of a NOP, which would
be the most efficient, just clear the destination.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
We should not be using jump labels before they were initialized. Push back
the callback to until after jump label initialization.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
For all uapi headers, need use "_UAPI" prefix for its guard macro
(which will be stripped by "scripts/headers_installer.sh").
Also remove redundant files (bitsperlong.h, errno.h, fcntl.h, ioctl.h,
ioctls.h, ipcbuf.h, kvm_para.h, mman.h, poll.h, resource.h, siginfo.h,
statfs.h, and unistd.h) which are already in Kbuild.
Also be sure that all "#endif" only have one empty line above, and each
file has guard macro.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Hans-Christian Egtvedt <hegtvedt@cisco.com>
This patch fixes following error (for big kernels):
---8<---
arch/avr32/boot/u-boot/head.o: In function `no_tag_table':
(.init.text+0x44): relocation truncated to fit: R_AVR32_22H_PCREL against symbol `panic' defined in .text.unlikely section in kernel/built-in.o
arch/avr32/kernel/built-in.o: In function `bad_return':
(.ex.text+0x236): relocation truncated to fit: R_AVR32_22H_PCREL against symbol `panic' defined in .text.unlikely section in kernel/built-in.o
--->8---
It comes up when the kernel increases and 'panic()' is too far away to fit in
the +/- 2MiB range. Which in turn issues from the 21-bit displacement in
'br{cond4}' mnemonic which is one of the two ways to do jumps (rjmp has just
10-bit displacement and therefore a way smaller range). This fact was stated
before in 8d29b7b9f8.
One solution to solve this is to add a local storage for the symbol address
and just load the $pc with that value.
Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: stable@vger.kernel.org
Before the CRT was (fully) set up in kernel_entry (bss cleared before in
_start, but also not before jump to panic() in no_tag_table case).
This patch fixes this up to have a fully working CRT when branching to panic()
in no_tag_table.
Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: stable@vger.kernel.org
Drop percpu_ida.o from lib-y since it is also listed in obj-y
and it doesn't need to be listed in both places.
Move percpu-refcount.o from lib-y to obj-y to fix build errors
in target_core_mod:
ERROR: "percpu_ref_cancel_init" [drivers/target/target_core_mod.ko] undefined!
ERROR: "percpu_ref_kill_and_confirm" [drivers/target/target_core_mod.ko] undefined!
ERROR: "percpu_ref_init" [drivers/target/target_core_mod.ko] undefined!
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes iscsit_sequence_cmd() logic to no longer reject
non-immediate CmdSNs that exceed MaxCmdSN with a protocol error,
but instead silently ignore them.
This is done to correctly follow RFC-3720 Section 3.2.2.1:
For non-immediate commands, the CmdSN field can take any
value from ExpCmdSN to MaxCmdSN inclusive. The target MUST silently
ignore any non-immediate command outside of this range or non-
immediate duplicates within the range.
Reported-by: Santosh Kulkarni <santosh.kulkarni@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch converts a handful of iscsi_session statistics to type
atomic_long_t, instead of using iscsi_session->session_stats_lock
when incrementing these values.
More importantly, go ahead and drop the spinlock usage within
iscsit_setup_scsi_cmd(), iscsit_check_dataout_hdr(),
iscsit_send_datain(), and iscsit_build_rsp_pdu() fast-path code.
(Squash in Roland's target: Remove write-only stats fields and lock
from struct se_node_acl)
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fix static checker complaint that stream is not checked in
squashfs_decompressor_destroy().
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reviewed-by: Minchan Kim <minchan@kernel.org>
This introduces an implementation of squashfs_readpage_block()
that directly decompresses into the page cache.
This uses the previously added page handler abstraction to push
down the necessary kmap_atomic/kunmap_atomic operations on the
page cache buffers into the decompressors. This enables
direct copying into the page cache without using the slow
kmap/kunmap calls.
The code detects when multiple threads are racing in
squashfs_readpage() to decompress the same block, and avoids
this regression by falling back to using an intermediate
buffer.
This patch enhances the performance of Squashfs significantly
when multiple processes are accessing the filesystem simultaneously
because it not only reduces memcopying, but it more importantly
eliminates the lock contention on the intermediate buffer.
Using single-thread decompression.
dd if=file1 of=/dev/null bs=4096 &
dd if=file2 of=/dev/null bs=4096 &
dd if=file3 of=/dev/null bs=4096 &
dd if=file4 of=/dev/null bs=4096
Before:
629145600 bytes (629 MB) copied, 45.8046 s, 13.7 MB/s
After:
629145600 bytes (629 MB) copied, 9.29414 s, 67.7 MB/s
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Restructure squashfs_readpage() splitting it into separate
functions for datablocks, fragments and sparse blocks.
Move the memcpying (from squashfs cache entry) implementation of
squashfs_readpage_block into file_cache.c
This allows different implementations to be supported.
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Further generalise the decompressors by adding a page handler
abstraction. This adds helpers to allow the decompressors
to access and process the output buffers in an implementation
independant manner.
This allows different types of output buffer to be passed
to the decompressors, with the implementation specific
aspects handled at decompression time, but without the
knowledge being held in the decompressor wrapper code.
This will allow the decompressors to handle Squashfs
cache buffers, and page cache pages.
This patch adds the abstraction and an implementation for
the caches.
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Add a multi-threaded decompression implementation which uses
percpu variables.
Using percpu variables has advantages and disadvantages over
implementations which do not use percpu variables.
Advantages:
* the nature of percpu variables ensures decompression is
load-balanced across the multiple cores.
* simplicity.
Disadvantages: it limits decompression to one thread per core.
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Now squashfs have used for only one stream buffer for decompression
so it hurts parallel read performance so this patch supports
multiple decompressor to enhance performance parallel I/O.
Four 1G file dd read on KVM machine which has 2 CPU and 4G memory.
dd if=test/test1.dat of=/dev/null &
dd if=test/test2.dat of=/dev/null &
dd if=test/test3.dat of=/dev/null &
dd if=test/test4.dat of=/dev/null &
old : 1m39s -> new : 9s
* From v1
* Change comp_strm with decomp_strm - Phillip
* Change/add comments - Phillip
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
The decompressor interface and code was written from
the point of view of single-threaded operation. In doing
so it mixed a lot of single-threaded implementation specific
aspects into the decompressor code and elsewhere which makes it
difficult to seamlessly support multiple different decompressor
implementations.
This patch does the following:
1. It removes compressor_options parsing from the decompressor
init() function. This allows the decompressor init() function
to be dynamically called to instantiate multiple decompressors,
without the compressor options needing to be read and parsed each
time.
2. It moves threading and all sleeping operations out of the
decompressors. In doing so, it makes the decompressors
non-blocking wrappers which only deal with interfacing with
the decompressor implementation.
3. It splits decompressor.[ch] into decompressor generic functions
in decompressor.[ch], and moves the single threaded
decompressor implementation into decompressor_single.c.
The result of this patch is Squashfs should now be able to
support multiple decompressors by adding new decompressor_xxx.c
files with specialised implementations of the functions in
decompressor_single.c
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reviewed-by: Minchan Kim <minchan@kernel.org>
It isn't safe to call it without holding the vblk->vq_lock.
Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Fixed another condition of virtqueue_kick() not holding the lock.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It appears that driver runs into a problem here if fibsize is too small
because we allocate user_srbcmd with fibsize size only but later we
access it until user_srbcmd->sg.count to copy it over to srbcmd.
It is not correct to test (fibsize < sizeof(*user_srbcmd)) because this
structure already includes one sg element and this is not needed for
commands without data. So, we would recommend to add the following
(instead of test for fibsize == 0).
Signed-off-by: Mahesh Rajashekhara <Mahesh.Rajashekhara@pmcs.com>
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"
1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols we passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.
We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.
In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.
2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.
3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.
4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.
5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.
6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.
7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.
8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumzaet, reported by Dave Jones.
9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.
10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.
11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.
12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.
13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.
14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.
15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.
16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.
17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.
18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.
19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.
20) Fix route leak in VTI tunnel transmit, from Fan Du.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...