Commit Graph

722253 Commits

Author SHA1 Message Date
Palmer Dabbelt c901e45a99 RISC-V: `sfence.vma` orderes the instruction cache
This is just a comment change, but it's one that bit me on the mailing
list.  It turns out that issuing a `sfence.vma` enforces instruction
cache ordering in addition to TLB ordering.  This isn't explicitly
called out in the ISA manual, but Andrew will be making that more clear
in a future revision.

CC: Andrew Waterman <andrew@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-28 14:06:17 -08:00
Palmer Dabbelt 21db403660 RISC-V: Add READ_ONCE in arch_spin_is_locked()
This was just incorrect in the original version.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-28 14:05:04 -08:00
Palmer Dabbelt 9347ce54cd RISC-V: __test_and_op_bit_ord should be strongly ordered
I mis-read the documentation.  After looking at it again the
documentation is actually as clear as it can be, it's just that I didn't
actually read it in order and therefor did the wrong thing.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-28 14:04:05 -08:00
Palmer Dabbelt 3343eb6806 RISC-V: Remove smb_mb__{before,after}_spinlock()
These are obselete.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-28 14:03:55 -08:00
Palmer Dabbelt 61a60d35b7 RISC-V: Remove __smp_bp__{before,after}_atomic
These duplicate the asm-generic definitions are therefor aren't useful.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-28 14:03:48 -08:00
Palmer Dabbelt 8286d51a6c RISC-V: Comment on why {,cmp}xchg is ordered how it is
This is another memory model FIXME.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-28 14:03:29 -08:00
Palmer Dabbelt 4650d02ad2 RISC-V: Remove unused arguments from ATOMIC_OP
Our atomics are generated from a complicated series of preprocessor
macros, each of which is slightly different from the last.  When writing
the macros I'd accidentally left some unused arguments floating around.
This patch removes the unused macro arguments.

Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
2017-11-28 13:53:24 -08:00
Jiri Pirko d51aae68b1 net: sched: cbq: create block for q->link.block
q->link.block is not initialized, that leads to EINVAL when one tries to
add filter there. So initialize it properly.

This can be reproduced by:
$ tc qdisc add dev eth0 root handle 1: cbq avpkt 1000 rate 1000Mbit bandwidth 1000Mbit
$ tc filter add dev eth0 parent 1: protocol ip prio 100 u32 match ip protocol 0 0x00 flowid 1:1

Reported-by: Jaroslav Aster <jaster@redhat.com>
Reported-by: Ivan Vecera <ivecera@redhat.com>
Fixes: 6529eaba33 ("net: sched: introduce tcf block infractructure")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 16:04:26 -05:00
Colin Ian King 0195a21079 atm: suni: remove extraneous space to fix indentation
Remove a leading space, fixes indentation

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 16:03:09 -05:00
Colin Ian King 6c90654270 atm: lanai: use %p to format kernel addresses instead of %x
Don't use %x and casting to print out a kernel address, instead use %p
and remove the casting.  Cleans up smatch warnings:

drivers/atm/lanai.c:1589 service_buffer_allocate() warn: argument 2 to
%08lX specifier is cast from pointer
drivers/atm/lanai.c:2221 lanai_dev_open() warn: argument 4 to %lx
specifier is cast from pointer

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 16:02:44 -05:00
Jorgen Hansen 4a5def7f6a VSOCK: Don't set sk_state to TCP_CLOSE before testing it
A recent commit (3b4477d2dc) converted the sk_state to use
TCP constants. In that change, vmci_transport_handle_detach
was changed such that sk->sk_state was set to TCP_CLOSE before
we test whether it is TCP_SYN_SENT. This change moves the
sk_state change back to the original locations in that function.

Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 16:01:50 -05:00
Colin Ian King 22dac9f1fd atm: fore200e: use %pK to format kernel addresses instead of %x
Don't use %x and casting to print out a kernel address, instead use the
%pK and remove the casting.  Cleans up smatch warning:

drivers/atm/fore200e.c:3093 fore200e_proc_read() warn: argument 3 to %08x
specifier is cast from pointer

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 16:00:55 -05:00
Colin Ian King c95c3fe5c7 ambassador: fix incorrect indentation of assignment statement
Remove one extraneous level of indentation on assignment statement.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:58:42 -05:00
Xin Long fc39c38bdc vxlan: use __be32 type for the param vni in __vxlan_fdb_delete
All callers of __vxlan_fdb_delete pass vni with __be32 type, and
this param should be declared as __be32 type.

Fixes: 3ad7a4b141 ("vxlan: support fdb and learning in COLLECT_METADATA mode")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:57:12 -05:00
Xin Long 5eb3d22a8a bonding: use nla_get_u64 to extract the value for IFLA_BOND_AD_ACTOR_SYSTEM
bond_opt_initval expects a u64 type param, it's better to use
nla_get_u64 to extract the value here, to eliminate a sparse
endianness mismatch warning.

Fixes: 171a42c38c ("bonding: add netlink support for sys prio, actor sys mac, and port key")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:56:40 -05:00
Xin Long a8dd397903 sctp: use right member as the param of list_for_each_entry
Commit d04adf1b35 ("sctp: reset owner sk for data chunks on out queues
when migrating a sock") made a mistake that using 'list' as the param of
list_for_each_entry to traverse the retransmit, sacked and abandoned
queues, while chunks are using 'transmitted_list' to link into these
queues.

It could cause NULL dereference panic if there are chunks in any of these
queues when peeling off one asoc.

So use the chunk member 'transmitted_list' instead in this patch.

Fixes: d04adf1b35 ("sctp: reset owner sk for data chunks on out queues when migrating a sock")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:55:44 -05:00
Paolo Abeni f85729d07c sch_sfq: fix null pointer dereference at timer expiration
While converting sch_sfq to use timer_setup(), the commit cdeabbb881
("net: sched: Convert timers to use timer_setup()") forgot to
initialize the 'sch' field. As a result, the timer callback tries to
dereference a NULL pointer, and the kernel does oops.

Fix it initializing such field at qdisc creation time.

Fixes: cdeabbb881 ("net: sched: Convert timers to use timer_setup()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:54:05 -05:00
Jakub Kicinski 25415cec50 cls_bpf: don't decrement net's refcount when offload fails
When cls_bpf offload was added it seemed like a good idea to
call cls_bpf_delete_prog() instead of extending the error
handling path, since the software state is fully initialized
at that point.  This handling of errors without jumping to
the end of the function is error prone, as proven by later
commit missing that extra call to __cls_bpf_delete_prog().

__cls_bpf_delete_prog() is now expected to be invoked with
a reference on exts->net or the field zeroed out.  The call
on the offload's error patch does not fullfil this requirement,
leading to each error stealing a reference on net namespace.

Create a function undoing what cls_bpf_set_parms() did and
use it from __cls_bpf_delete_prog() and the error path.

Fixes: aae2c35ec8 ("cls_bpf: use tcf_exts_get_net() before call_rcu()")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 15:49:44 -05:00
Ulf Hansson 250dcd1146 mmc: sdhci: Avoid swiotlb buffer being full
The commit de3ee99b09 ("mmc: Delete bounce buffer handling") deletes the
bounce buffer handling, but also causes the max_req_size for sdhci to be
increased, in case when max_segs == 1. This causes errors for sdhci-pci
Ricoh variant, about the swiotlb buffer to become full.

Fix the issue, by taking IO_TLB_SEGSIZE and IO_TLB_SHIFT into account when
deciding the max_req_size for sdhci.

Reported-by: Jiri Slaby <jslaby@suse.cz>
Fixes: de3ee99b09 ("mmc: Delete bounce buffer handling")
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
2017-11-28 20:29:24 +01:00
Mark Rutland f81a348728 arm64: mm: cleanup stale AIVIVT references
Since commit:

  155433cb36 ("arm64: cache: Remove support for ASID-tagged VIVT I-caches")

... the kernel no longer cares about AIVIVT I-caches, as these were
removed from the architecture.

This patch removes the stale references to such I-caches.

The comment in flush_context() is also updated to clarify when and where
the TLB invalidation occurs.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-28 18:13:18 +00:00
Linus Torvalds 43f462f1c2 previous part 2 tag + ttm regression fix, i915,vc4,core,uapi fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaF5aSAAoJEAx081l5xIa+TgcP/ijY7I5K7uJXq+KwCThM2g2Z
 8MW0QM8u55Mk6PdNRQafVZSP6S/tyWS3gtjW2CmB6UFazNiQzJiVdoxeuKJerwob
 hyciMaYiEJ1x4Z4dJUxv7dtfdDH0duqES+rPE9znCvpW/PaR+6ohobVL2tH8QVRO
 884QHTvmABU8xmfzmpViiLdrjNQaZtAzNMl0mD07NlfAI3bNpE/UIVd+vm1ADDPl
 avZZHjyAZFgiM9anuXPGpwOcA5LSiAkUHOKZMwfj5FOhEJjAwZy0z50Jnw/Wo7OX
 N8ymDk7vRv/Q/stOk2m/yMuoDrEtG3os4L0cyDXFIumEVVsqE7Y5WMw5tvDULw6E
 WaSYr+F7t0e9OwB6w5yKRp+t97lKK1O7KZ0HA8NW0EgERHD+8/XLojr8BBAqJqxH
 mo3DVMfU7fmm7uOIBrjHGdkyWEni/Bqk/Vxo6rOTKVeRYWiCA4fNHvM7TN7h8DZA
 VlDEHB3l2k44T0ONE4vo/LgEg1Ta7B3whv0qKykYbcNK8scEBU5iV1znT+zRzJYY
 /cwuT+BxfTgXCKAveMi6FKvjvIohR9TLyj7BS6/QUK4mD+9V5AnERcorZoO6/8qY
 qiPjVDvN1BNrueyHRg162AlRXqxnvt8LFdVt2QIn8kAuXHbXOn6RMUMP49OLGlB3
 g0hpJ0MOwuHUKQcnW60d
 =3TmE
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15-part2-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:

 - TTM regression fix for some virt gpus (bochs vga)

 - a few i915 stable fixes

 - one vc4 fix

 - one uapi fix

* tag 'drm-for-v4.15-part2-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/ttm: don't attempt to use hugepages if dma32 requested (v2)
  drm/vblank: Pass crtc_id to page_flip_ioctl.
  drm/i915: Fix init_clock_gating for resume
  drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
  drm/i915: Clear breadcrumb node when cancelling signaling
  drm/i915/gvt: ensure -ve return value is handled correctly
  drm/i915: Re-register PMIC bus access notifier on runtime resume
  drm/i915: Fix false-positive assert_rpm_wakelock_held in i915_pmic_bus_access_notifier v2
  drm/edid: Don't send non-zero YQ in AVI infoframe for HDMI 1.x sinks
  drm/vc4: Account for interrupts in flight
2017-11-28 10:01:15 -08:00
Takashi Iwai 3c02a6d946 Revert "ALSA: usb-audio: Fix potential zero-division at parsing FU"
The commit 8428a8ebde ("ALSA: usb-audio: Fix potential zero-division
at parsing FU") is utterly bogus and breaks the case with csize=1
instead of fixing anything.  Just take it back again.

Reported-by: Jörg Otte <jrg.otte@gmail.com>
Fixes: 8428a8ebde ("ALSA: usb-audio: Fix potential zero-division at parsing FU"
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-28 09:34:36 -08:00
Eric Sandeen 712d361d59 xfs: calculate correct offset in xfs_scrub_quota_item
It's only used for tracepoints so it's relatively harmless,
but the offset is calculated incorrectly in xfs_scrub_quota_item.

qi_dqperchunk is the nr. of dquots per "chunk" which we have
conveniently *cough* defined to always be 1 FSB.  Therefore
block_offset * qi_dqperchunk == first id in that chunk,
and so offset = id / qi_dqperchunk

id * dqperchunk is ... meaningless.

Fixes-coverity-id: 1423965
Fixes: c2fc338c ("xfs: scrub quota information")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-28 08:57:11 -08:00
Eric Sandeen eda6bc27cc xfs: fix uninitialized variable in xfs_scrub_quota
On the first pass through the while(1) loop, we get to
xfs_scrub_should_terminate() which can test the uninitialized
error variable.

Fixes-coverity-id: 1423737
Fixes: c2fc338c ("xfs: scrub quota information")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-28 08:57:11 -08:00
Eric Sandeen d41c6172bd xfs: fix leaks on corruption errors in xfs_bmap.c
Use _GOTO instead of _RETURN so we can free the allocated
cursor on error.

Fixes: bf80628 ("xfs: remove xfs_bmse_shift_one")
Fixes-coverity-id: 1423813, 1423676
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-28 08:57:11 -08:00
Michal Hocko d210a9874b xfs: fortify xfs_alloc_buftarg error handling
percpu_counter_init failure path doesn't clean up &btp->bt_lru list.
Call list_lru_destroy in that error path. Similarly register_shrinker
error path is not handled.

While it is unlikely to trigger these error path, it is not impossible
especially the later might fail with large NUMAs.  Let's handle the
failure to make the code more robust.

Noticed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-11-28 08:57:11 -08:00
Minwoo Im 7e5dd57ef3 nvme-pci: fix NULL pointer dereference in nvme_free_host_mem()
Following condition which will cause NULL pointer dereference will
occur in nvme_free_host_mem() when it tries to remove pci device via
nvme_remove() especially after a failure of host memory allocation for HMB.

    "(host_mem_descs == NULL) && (nr_host_mem_descs != 0)"

It's because __nr_host_mem_descs__ is not cleared to 0 unlike
__host_mem_descs__ is so.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-11-28 08:49:26 -08:00
Max Gurtovoy eb1bd249ba nvme-rdma: fix memory leak during queue allocation
In case nvme_rdma_wait_for_cm timeout expires before we get
an established or rejected event (rdma_connect succeeded) from
rdma_cm, we end up with leaking the ib transport resources for
dedicated queue. This scenario can easily reproduced using traffic
test during port toggling.
Also, in order to protect from parallel ib queue destruction, that
may be invoked from different context's, introduce new flag that
stands for transport readiness. While we're here, protect also against
a situation that we can receive rdma_cm events during ib queue destruction.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-11-28 08:49:22 -08:00
Martin Schwidefsky 9d0ca444d0 s390/gs: add compat regset for the guarded storage broadcast control block
git commit e525f8a6e6
"s390/gs: add regset for the guarded storage broadcast control block"
added the missing regset to the s390_regsets array but failed to add it
to the s390_compat_regsets array.

Fixes: e525f8a6e6 ("add compat regset for the guarded storage broadcast control block")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-28 17:33:15 +01:00
Filipe Manana ea37d5998b Btrfs: incremental send, fix wrong unlink path after renaming file
Under some circumstances, an incremental send operation can issue wrong
paths for unlink commands related to files that have multiple hard links
and some (or all) of those links were renamed between the parent and send
snapshots. Consider the following example:

Parent snapshot

 .                                                      (ino 256)
 |---- a/                                               (ino 257)
 |     |---- b/                                         (ino 259)
 |     |     |---- c/                                   (ino 260)
 |     |     |---- f2                                   (ino 261)
 |     |
 |     |---- f2l1                                       (ino 261)
 |
 |---- d/                                               (ino 262)
       |---- f1l1_2                                     (ino 258)
       |---- f2l2                                       (ino 261)
       |---- f1_2                                       (ino 258)

Send snapshot

 .                                                      (ino 256)
 |---- a/                                               (ino 257)
 |     |---- f2l1/                                      (ino 263)
 |             |---- b2/                                (ino 259)
 |                   |---- c/                           (ino 260)
 |                   |     |---- d3                     (ino 262)
 |                   |           |---- f1l1_2           (ino 258)
 |                   |           |---- f2l2_2           (ino 261)
 |                   |           |---- f1_2             (ino 258)
 |                   |
 |                   |---- f2                           (ino 261)
 |                   |---- f1l2                         (ino 258)
 |
 |---- d                                                (ino 261)

When computing the incremental send stream the following steps happen:

1) When processing inode 261, a rename operation is issued that renames
   inode 262, which currently as a path of "d", to an orphan name of
   "o262-7-0". This is done because in the send snapshot, inode 261 has
   of its hard links with a path of "d" as well.

2) Two link operations are issued that create the new hard links for
   inode 261, whose names are "d" and "f2l2_2", at paths "/" and
   "o262-7-0/" respectively.

3) Still while processing inode 261, unlink operations are issued to
   remove the old hard links of inode 261, with names "f2l1" and "f2l2",
   at paths "a/" and "d/". However path "d/" does not correspond anymore
   to the directory inode 262 but corresponds instead to a hard link of
   inode 261 (link command issued in the previous step). This makes the
   receiver fail with a ENOTDIR error when attempting the unlink
   operation.

The problem happens because before sending the unlink operation, we failed
to detect that inode 262 was one of ancestors for inode 261 in the parent
snapshot, and therefore we didn't recompute the path for inode 262 before
issuing the unlink operation for the link named "f2l2" of inode 262. The
detection failed because the function "is_ancestor()" only follows the
first hard link it finds for an inode instead of all of its hard links
(as it was originally created for being used with directories only, for
which only one hard link exists). So fix this by making "is_ancestor()"
follow all hard links of the input inode.

A test case for fstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-11-28 17:15:30 +01:00
Eric Dumazet 15fe076ede net/packet: fix a race in packet_bind() and packet_notifier()
syzbot reported crashes [1] and provided a C repro easing bug hunting.

When/if packet_do_bind() calls __unregister_prot_hook() and releases
po->bind_lock, another thread can run packet_notifier() and process an
NETDEV_UP event.

This calls register_prot_hook() and hooks again the socket right before
first thread is able to grab again po->bind_lock.

Fixes this issue by temporarily setting po->num to 0, as suggested by
David Miller.

[1]
dev_remove_pack: ffff8801bf16fa80 not found
------------[ cut here ]------------
kernel BUG at net/core/dev.c:7945!  ( BUG_ON(!list_empty(&dev->ptype_all)); )
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
device syz0 entered promiscuous mode
CPU: 0 PID: 3161 Comm: syzkaller404108 Not tainted 4.14.0+ #190
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801cc57a500 task.stack: ffff8801cc588000
RIP: 0010:netdev_run_todo+0x772/0xae0 net/core/dev.c:7945
RSP: 0018:ffff8801cc58f598 EFLAGS: 00010293
RAX: ffff8801cc57a500 RBX: dffffc0000000000 RCX: ffffffff841f75b2
RDX: 0000000000000000 RSI: 1ffff100398b1ede RDI: ffff8801bf1f8810
device syz0 entered promiscuous mode
RBP: ffff8801cc58f898 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801bf1f8cd8
R13: ffff8801cc58f870 R14: ffff8801bf1f8780 R15: ffff8801cc58f7f0
FS:  0000000001716880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020b13000 CR3: 0000000005e25000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:106
 tun_detach drivers/net/tun.c:670 [inline]
 tun_chr_close+0x49/0x60 drivers/net/tun.c:2845
 __fput+0x333/0x7f0 fs/file_table.c:210
 ____fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x199/0x270 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x9bb/0x1ae0 kernel/exit.c:865
 do_group_exit+0x149/0x400 kernel/exit.c:968
 SYSC_exit_group kernel/exit.c:979 [inline]
 SyS_exit_group+0x1d/0x20 kernel/exit.c:977
 entry_SYSCALL_64_fastpath+0x1f/0x96
RIP: 0033:0x44ad19

Fixes: 30f7ea1c2b ("packet: race condition in packet_bind")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Francesco Ruggeri <fruggeri@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:13:30 -05:00
Mike Maloney 57f015f5ec packet: fix crash in fanout_demux_rollover()
syzkaller found a race condition fanout_demux_rollover() while removing
a packet socket from a fanout group.

po->rollover is read and operated on during packet_rcv_fanout(), via
fanout_demux_rollover(), but the pointer is currently cleared before the
synchronization in packet_release().   It is safer to delay the cleanup
until after synchronize_net() has been called, ensuring all calls to
packet_rcv_fanout() for this socket have finished.

To further simplify synchronization around the rollover structure, set
po->rollover in fanout_add() only if there are no errors.  This removes
the need for rcu in the struct and in the call to
packet_getsockopt(..., PACKET_ROLLOVER_STATS, ...).

Crashing stack trace:
 fanout_demux_rollover+0xb6/0x4d0 net/packet/af_packet.c:1392
 packet_rcv_fanout+0x649/0x7c8 net/packet/af_packet.c:1487
 dev_queue_xmit_nit+0x835/0xc10 net/core/dev.c:1953
 xmit_one net/core/dev.c:2975 [inline]
 dev_hard_start_xmit+0x16b/0xac0 net/core/dev.c:2995
 __dev_queue_xmit+0x17a4/0x2050 net/core/dev.c:3476
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3509
 neigh_connected_output+0x489/0x720 net/core/neighbour.c:1379
 neigh_output include/net/neighbour.h:482 [inline]
 ip6_finish_output2+0xad1/0x22a0 net/ipv6/ip6_output.c:120
 ip6_finish_output+0x2f9/0x920 net/ipv6/ip6_output.c:146
 NF_HOOK_COND include/linux/netfilter.h:239 [inline]
 ip6_output+0x1f4/0x850 net/ipv6/ip6_output.c:163
 dst_output include/net/dst.h:459 [inline]
 NF_HOOK.constprop.35+0xff/0x630 include/linux/netfilter.h:250
 mld_sendpack+0x6a8/0xcc0 net/ipv6/mcast.c:1660
 mld_send_initial_cr.part.24+0x103/0x150 net/ipv6/mcast.c:2072
 mld_send_initial_cr net/ipv6/mcast.c:2056 [inline]
 ipv6_mc_dad_complete+0x99/0x130 net/ipv6/mcast.c:2079
 addrconf_dad_completed+0x595/0x970 net/ipv6/addrconf.c:4039
 addrconf_dad_work+0xac9/0x1160 net/ipv6/addrconf.c:3971
 process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2113
 worker_thread+0x223/0x1990 kernel/workqueue.c:2247
 kthread+0x35e/0x430 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:432

Fixes: 0648ab70af ("packet: rollover prepare: per-socket state")
Fixes: 509c7a1ecc ("packet: avoid panic in packet_getsockopt()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Mike Maloney <maloney@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:13:30 -05:00
David S. Miller a51a40b7ab Merge branch 'sctp-fix-sparse-errors'
Xin Long says:

====================
sctp: fix some other sparse errors

After the last fixes for sparse errors, there are still three sparse
errors in sctp codes, two of them are type cast, and the other one
is using extern.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:00:14 -05:00
Xin Long 1ba896f6f5 sctp: remove extern from stream sched
Now each stream sched ops is defined in different .c file and
added into the global ops in another .c file, it uses extern
to make this work.

However extern is not good coding style to get them in and
even make C=2 reports errors for this.

This patch adds sctp_sched_ops_xxx_init for each stream sched
ops in their .c file, then get them into the global ops by
calling them when initializing sctp module.

Fixes: 637784ade2 ("sctp: introduce priority based stream scheduler")
Fixes: ac1ed8b82c ("sctp: introduce round robin stream scheduler")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:00:13 -05:00
Xin Long af2697a027 sctp: force the params with right types for sctp csum apis
Now sctp_csum_xxx doesn't really match the param types of these common
csum apis. As sctp_csum_xxx is defined in sctp/checksum.h, many sparse
errors occur when make C=2 not only with M=net/sctp but also with other
modules that include this header file.

This patch is to force them fit in csum apis with the right types.

Fixes: e6d8b64b34 ("net: sctp: fix and consolidate SCTP checksumming code")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:00:13 -05:00
Xin Long 08f46070dd sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail
This patch is to force SCTP_ERROR_INV_STRM with right type to
fit in sctp_chunk_fail to avoid the sparse error.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 11:00:13 -05:00
Vasyl Gomonovych f95d5bf03b lmc: Use memdup_user() as a cleanup
Fix coccicheck warning which recommends to use memdup_user():
drivers/net/wan/lmc/lmc_main.c:497:27-34: WARNING opportunity for memdup_user
Generated by: scripts/coccinelle/memdup_user/memdup_user.cocci

Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:57:54 -05:00
Christophe JAILLET dea521a2b9 bnxt_en: Fix an error handling path in 'bnxt_get_module_eeprom()'
Error code returned by 'bnxt_read_sfp_module_eeprom_info()' is handled a
few lines above when reading the A0 portion of the EEPROM.
The same should be done when reading the A2 portion of the EEPROM.

In order to correctly propagate an error, update 'rc' in this 2nd call as
well, otherwise 0 (success) is returned.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:55:22 -05:00
Antoine Tenart 952b6b3b07 net: phy: marvell10g: fix the PHY id mask
The Marvell 10G PHY driver supports different hardware revisions, which
have their bits 3..0 differing. To get the correct revision number these
bits should be ignored. This patch fixes this by using the already
defined MARVELL_PHY_ID_MASK (0xfffffff0) instead of the custom
0xffffffff mask.

Fixes: 20b2af32ff ("net: phy: add Marvell Alaska X 88X3310 10Gigabit PHY support")
Suggested-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:30:38 -05:00
David S. Miller f40b55ab63 Merge branch 'mvpp2-fixes'
Antoine Tenart says:

====================
net: mvpp2: set of fixes

This series fixes various issues with the Marvell PPv2 driver. The
patches are sent together to avoid any possible conflict. The series is
based on today's net tree.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:09:52 -05:00
Antoine Tenart 76e583c5f5 net: mvpp2: check ethtool sets the Tx ring size is to a valid min value
This patch fixes the Tx ring size checks when using ethtool, by adding
an extra check in the PPv2 check_ringparam_valid helper. The Tx ring
size cannot be set to a value smaller than the minimum number of
descriptors needed for TSO.

Fixes: 1d17db08c0 ("net: mvpp2: limit TSO segments and use stop/wake thresholds")
Suggested-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:09:51 -05:00
Yan Markman e749aca84b net: mvpp2: do not disable GMAC padding
Short fragmented packets may never be sent by the hardware when padding
is disabled. This patch stop modifying the GMAC padding bits, to leave
them to their reset value (disabled).

Fixes: 3919357fb0 ("net: mvpp2: initialize the GMAC when using a port")
Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:09:51 -05:00
Antoine Tenart 26146b0e6b net: mvpp2: cleanup probed ports in the probe error path
This patches fixes the probe error path by cleaning up probed ports, to
avoid leaving registered net devices when the driver failed to probe.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:09:50 -05:00
Antoine Tenart ba2d8d887d net: mvpp2: fix the txq_init error path
When an allocation in the txq_init path fails, the allocated buffers
end-up being freed twice: in the txq_init error path, and in txq_deinit.
This lead to issues as txq_deinit would work on already freed memory
regions:

    kernel BUG at mm/slub.c:3915!
    Internal error: Oops - BUG: 0 [#1] PREEMPT SMP

This patch fixes this by removing the txq_init own error path, as the
txq_deinit function is always called on errors. This was introduced by
TSO as way more buffers are allocated.

Fixes: 186cd4d4e4 ("net: mvpp2: software tso support")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 10:09:50 -05:00
Chao Yu 1a6152d36d quota: propagate error from __dquot_initialize
In commit 6184fc0b8d ("quota: Propagate error from ->acquire_dquot()"),
we have propagated error from __dquot_initialize to caller, but we forgot
to handle such error in add_dquot_ref(), so, currently, during quota
accounting information initialization flow, if we failed for some of
inodes, we just ignore such error, and do account for others, which is
not a good implementation.

In this patch, we choose to let user be aware of such error, so after
turning on quota successfully, we can make sure all inodes disk usage
can be accounted, which will be more reasonable.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-11-28 16:08:08 +01:00
David S. Miller e2549970f4 Merge branch 'mlxsw-GRE-offloading-fixes'
Jiri Pirko says:

====================
mlxsw: GRE offloading fixes

Petr says:

This patchset fixes a couple bugs in offloading GRE tunnels in mlxsw
driver.

Patch #1 fixes a problem that local routes pointing at a GRE tunnel
device are offloaded even if that netdevice is down.

Patch #2 detects that as a result of moving a GRE netdevice to a
different VRF, two tunnels now have a conflict of local addresses,
something that the mlxsw driver can't offload.

Patch #3 fixes a FIB abort caused by forming a route pointing at a
GRE tunnel that is eligible for offloading but already onloaded.

Patch #4 fixes a problem that next hops migrated to a new RIF kept the
old RIF reference, which went dangling shortly afterwards.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 09:55:48 -05:00
Petr Machata 09dbf6297f mlxsw: spectrum_router: Update nexthop RIF on update
The function mlxsw_sp_nexthop_rif_update() walks the list of nexthops
associated with a RIF, and updates the corresponding entries in the
switch. It is used in particular when a tunnel underlay netdevice moves
to a different VRF, and all the nexthops are migrated over to a new RIF.
The problem is that each nexthop holds a reference to its RIF, and that
is not updated. So after the old RIF is gone, further activity on these
nexthops (such as downing the underlay netdevice) dereferences a
dangling pointer.

Fix the issue by updating rif of impacted nexthops before calling
mlxsw_sp_nexthop_rif_update().

Fixes: 0c5f1cd5ba ("mlxsw: spectrum_router: Generalize __mlxsw_sp_ipip_entry_update_tunnel()")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 09:55:48 -05:00
Petr Machata d97cda5f46 mlxsw: spectrum_router: Handle encap to demoted tunnels
Some tunnels that are offloadable on their own can nonetheless be
demoted to slow path if their local address is in conflict with that of
another tunnel. When a route is formed for such a tunnel,
mlxsw_sp_nexthop_ipip_init() fails to find the corresponding IPIP entry,
and that triggers a FIB abort.

Resolve the problem by not assuming that a tunnel for which
mlxsw_sp_ipip_ops.can_offload() holds also automatically has an IPIP
entry.

Fixes: af641713e9 ("mlxsw: spectrum_router: Onload conflicting tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 09:55:47 -05:00
Petr Machata cab43d9c87 mlxsw: spectrum_router: Demote tunnels on VRF migration
The mlxsw driver currently doesn't offload GRE tunnels if they have the
same local address and use the same underlay VRF. When such a situation
arises, the tunnels in conflict are demoted to slow path.

However, the current code only verifies this condition on tunnel
creation and tunnel change, not when a tunnel is moved to a different
VRF. When the tunnel has no bound device, underlay and overlay are the
same. Thus moving a tunnel moves the underlay as well, and that can
cause local address conflict.

So modify mlxsw_sp_netdevice_ipip_ol_vrf_event() to check if there are
any conflicting tunnels, and demote them if yes.

Fixes: af641713e9 ("mlxsw: spectrum_router: Onload conflicting tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 09:55:47 -05:00
Petr Machata 57c77ce470 mlxsw: spectrum_router: Offload decap only for up tunnels
When a new local route is added, an IPIP entry is looked up to determine
whether the route should be offloaded as a tunnel decap or as a trap.
That decision should take into account whether the tunnel netdevice in
question is actually IFF_UP, and only install a decap offload if it is.

Fixes: 0063587d35 ("mlxsw: spectrum: Support decap-only IP-in-IP tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-28 09:55:47 -05:00