commit bb523b406c849eef8f265a07cd7f320f1f177743 upstream
Turn fault_in_pages_{readable,writeable} into versions that return the
number of bytes not faulted in, similar to copy_to_user, instead of
returning a non-zero value when any of the requested pages couldn't be
faulted in. This supports the existing users that require all pages to
be faulted in as well as new users that are happy if any pages can be
faulted in.
Rename the functions to fault_in_{readable,writeable} to make sure
this change doesn't silently break things.
Neither of these functions is entirely trivial and it doesn't seem
useful to inline them, so move them to mm/gup.c.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes the kernelci error:
"ERROR: modpost: module configfs uses symbol kern_path from namespace
VFS_internal_I_am_really_a_filesystem_and_am_NOT_a_driver, but does not import it."
Fixes: 0a77fca3aa ("ANDROID: GKI: set vfs-only exports into their own namespace")
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: Ib4ab1b83c8c8c996b1f15c419fb8ce0549832699
If the file preallocated blocks and fsync'ed, we should not truncate them during
roll-forward recovery which will recover i_size correctly back.
Bug: 223740163
Fixes: d4dd19ec1ea0 ("f2fs: do not expose unwritten blocks to user by DIO")
Cc: <stable@vger.kernel.org> # 5.17+
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(cherry picked from commit 4d8ec91208196e0e19195f1e7d6be9de5873f242)
Change-Id: I5e974a564667115455b53a18f31902a875d86dee
commit 85d825dbf4899a69407338bae462a59aa9a37326 upstream.
If the file system does not use bigalloc, calculating the overhead is
cheap, so force the recalculation of the overhead so we don't have to
trust the precalculated overhead in the superblock.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10b01ee92df52c8d7200afead4d5e5f55a5c58b1 upstream.
The kernel calculation was underestimating the overhead by not taking
into account the reserved gdt blocks. With this change, the overhead
calculated by the kernel matches the overhead calculation in mke2fs.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2da376228a2427501feb9d15815a45dbdbdd753e upstream.
Syzbot found an issue [1] in ext4_fallocate().
The C reproducer [2] calls fallocate(), passing size 0xffeffeff000ul,
and offset 0x1000000ul, which, when added together exceed the
bitmap_maxbytes for the inode. This triggers a BUG in
ext4_ind_remove_space(). According to the comments in this function
the 'end' parameter needs to be one block after the last block to be
removed. In the case when the BUG is triggered it points to the last
block. Modify the ext4_punch_hole() function and add constraint that
caps the length to satisfy the one before laster block requirement.
LINK: [1] https://syzkaller.appspot.com/bug?id=b80bd9cf348aac724a4f4dff251800106d721331
LINK: [2] https://syzkaller.appspot.com/text?tag=ReproC&x=14ba0238700000
Fixes: a4bb6b64e3 ("ext4: enable "punch hole" functionality")
Reported-by: syzbot+7a806094edd5d07ba029@syzkaller.appspotmail.com
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Link: https://lore.kernel.org/r/20220331200515.153214-1-tadeusz.struk@linaro.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2b0b205d125f27cddfb4f7280e39affdaf46686 upstream.
We got issue as follows:
[home]# fsck.ext4 -fn ram0yb
e2fsck 1.45.6 (20-Mar-2020)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Symlink /p3/d14/d1a/l3d (inode #3494) is invalid.
Clear? no
Entry 'l3d' in /p3/d14/d1a (3383) has an incorrect filetype (was 7, should be 0).
Fix? no
As the symlink file size does not match the file content. If the writeback
of the symlink data block failed, ext4_finish_bio() handles the end of IO.
However this function fails to mark the buffer with BH_write_io_error and
so when unmount does journal checkpoint it cannot detect the writeback
error and will cleanup the journal. Thus we've lost the correct data in the
journal area. To solve this issue, mark the buffer as BH_write_io_error in
ext4_finish_bio().
Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220321144438.201685-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad5cd4f4ee4d5fcdb1bfb7a0c073072961e70783 upstream.
Since the initial introduction of (posix) fallocate back at the turn of
the century, it has been possible to use this syscall to change the
user-visible contents of files. This can happen by extending the file
size during a preallocation, or through any of the newer modes (punch,
zero, collapse, insert range). Because the call can be used to change
file contents, we should treat it like we do any other modification to a
file -- update the mtime, and drop set[ug]id privileges/capabilities.
The VFS function file_modified() does all this for us if pass it a
locked inode, so let's make fallocate drop permissions correctly.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220308185043.GA117678@magnolia
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f24d5a579d1eace79d505b148808a850b417d4c upstream.
This is a fix for commit f6795053da ("mm: mmap: Allow for "high"
userspace addresses") for hugetlb.
This patch adds support for "high" userspace addresses that are
optionally supported on the system and have to be requested via a hint
mechanism ("high" addr parameter to mmap).
Architectures such as powerpc and x86 achieve this by making changes to
their architectural versions of hugetlb_get_unmapped_area() function.
However, arm64 uses the generic version of that function.
So take into account arch_get_mmap_base() and arch_get_mmap_end() in
hugetlb_get_unmapped_area(). To allow that, move those two macros out
of mm/mmap.c into include/linux/sched/mm.h
If these macros are not defined in architectural code then they default
to (TASK_SIZE) and (base) so should not introduce any behavioural
changes to architectures that do not define them.
For the time being, only ARM64 is affected by this change.
Catalin (ARM64) said
"We should have fixed hugetlb_get_unmapped_area() as well when we added
support for 52-bit VA. The reason for commit f6795053da was to
prevent normal mmap() from returning addresses above 48-bit by default
as some user-space had hard assumptions about this.
It's a slight ABI change if you do this for hugetlb_get_unmapped_area()
but I doubt anyone would notice. It's more likely that the current
behaviour would cause issues, so I'd rather have them consistent.
Basically when arm64 gained support for 52-bit addresses we did not
want user-space calling mmap() to suddenly get such high addresses,
otherwise we could have inadvertently broken some programs (similar
behaviour to x86 here). Hence we added commit f6795053da. But we
missed hugetlbfs which could still get such high mmap() addresses. So
in theory that's a potential regression that should have bee addressed
at the same time as commit f6795053da (and before arm64 enabled
52-bit addresses)"
Link: https://lkml.kernel.org/r/ab847b6edb197bffdfe189e70fb4ac76bfe79e0d.1650033747.git.christophe.leroy@csgroup.eu
Fixes: f6795053da ("mm: mmap: Allow for "high" userspace addresses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> [5.0.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b3d4650d82c71b9c9a8184de9e8bb656012b289e ]
When asked to create a path ending '/', but which is not to be a
directory (LOOKUP_DIRECTORY not set), filename_create() will never try
to create the file. If it doesn't exist, -ENOENT is reported.
However, it still passes LOOKUP_CREATE|LOOKUP_EXCL to the filesystems
->lookup() function, even though there is no intent to create. This is
misleading and can cause incorrect behaviour.
If you try
ln -s foo /path/dir/
where 'dir' is a directory on an NFS filesystem which is not currently
known in the dcache, this will fail with ENOENT.
But as the name is not in the dcache, nfs_lookup gets called with
LOOKUP_CREATE|LOOKUP_EXCL and so it returns NULL without performing any
lookup, with the expectation that a subsequent call to create the target
will be made, and the lookup can be combined with the creation. In the
case with a trailing '/' and no LOOKUP_DIRECTORY, that call is never
made. Instead filename_create() sees that the dentry is not (yet)
positive and returns -ENOENT - even though the directory actually
exists.
So only set LOOKUP_CREATE|LOOKUP_EXCL if there really is an intent to
create, and use the absence of these flags to decide if -ENOENT should
be returned.
Note that filename_parentat() is only interested in LOOKUP_REVAL, so we
split that out and store it in 'reval_flag'. __lookup_hash() then gets
reval_flag combined with whatever create flags were determined to be
needed.
Reviewed-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 932aba1e169090357a77af18850a10c256b50819 ]
struct stat (defined in arch/x86/include/uapi/asm/stat.h) has 32-bit
st_dev and st_rdev; struct compat_stat (defined in
arch/x86/include/asm/compat.h) has 16-bit st_dev and st_rdev followed by
a 16-bit padding.
This patch fixes struct compat_stat to match struct stat.
[ Historical note: the old x86 'struct stat' did have that 16-bit field
that the compat layer had kept around, but it was changes back in 2003
by "struct stat - support larger dev_t":
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=e95b2065677fe32512a597a79db94b77b90c968d
and back in those days, the x86_64 port was still new, and separate
from the i386 code, and had already picked up the old version with a
16-bit st_dev field ]
Note that we can't change compat_dev_t because it is used by
compat_loop_info.
Also, if the st_dev and st_rdev values are 32-bit, we don't have to use
old_valid_dev to test if the value fits into them. This fixes
-EOVERFLOW on filesystems that are on NVMe because NVMe uses the major
number 259.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 994fd530a512597ffcd713b0f6d5bc916c5698f0 ]
Use the IOCB_DIRECT indicator flag on the I/O context rather than checking to
see if the file was opened O_DIRECT.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 428f651cb80b227af47fc302e4931791f2fb4741 upstream.
Before this patch, function read_rindex_entry called compute_bitstructs
before it allocated a glock for the rgrp. But if compute_bitstructs found
a problem with the rgrp, it called gfs2_consist_rgrpd, and that called
gfs2_dump_glock for rgd->rd_gl which had not yet been assigned.
read_rindex_entry
compute_bitstructs
gfs2_consist_rgrpd
gfs2_dump_glock <---------rgd->rd_gl was not set.
This patch changes read_rindex_entry so it assigns an rgrp glock before
calling compute_bitstructs so gfs2_dump_glock does not reference an
unassigned pointer. If an error is discovered, the glock must also be
put, so a new goto and label were added.
Reported-by: syzbot+c6fd14145e2f62ca0784@syzkaller.appspotmail.com
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2d86293c70750e4331e9616aded33ab6b47c299d ]
Now that the VFS will do something with the return values from
->sync_fs, make ours pass on error codes.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5679897eb104cec9e99609c3f045a0c20603da4c ]
Strangely, sync_filesystem ignores the return code from the ->sync_fs
call, which means that syscalls like syncfs(2) never see the error.
This doesn't seem right, so fix that.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e03a36bdff4709c1bbf0f57f60ae3f776d51adf ]
Get rid of the indirections and just provide a sync_bdevs
helper for the generic sync code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211019062530.2174626-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70164eb6ccb76ab679b016b4b60123bf4ec6c162 ]
Instead offer a new sync_blockdev_nowait helper for the !wait case.
This new helper is exported as it will grow modular callers in a bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211019062530.2174626-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9a208ba5c9afa62c7b1e9c6f5e783066e84e2d3c ]
There is no clear benefit in having this helper vs just open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211019062530.2174626-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changes in 5.15.35
drm/amd/display: Add pstate verification and recovery for DCN31
drm/amd/display: Fix p-state allow debug index on dcn31
hamradio: defer 6pack kfree after unregister_netdev
hamradio: remove needs_free_netdev to avoid UAF
cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function
ACPI: processor idle: Check for architectural support for LPI
ACPI: processor idle: Allow playing dead in C3 state
ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40
btrfs: remove unused parameter nr_pages in add_ra_bio_pages()
btrfs: remove no longer used counter when reading data page
btrfs: remove unused variable in btrfs_{start,write}_dirty_block_groups()
soc: qcom: aoss: Expose send for generic usecase
dt-bindings: net: qcom,ipa: add optional qcom,qmp property
net: ipa: request IPA register values be retained
btrfs: release correct delalloc amount in direct IO write path
ALSA: core: Add snd_card_free_on_error() helper
ALSA: sis7019: Fix the missing error handling
ALSA: ali5451: Fix the missing snd_card_free() call at probe error
ALSA: als300: Fix the missing snd_card_free() call at probe error
ALSA: als4000: Fix the missing snd_card_free() call at probe error
ALSA: atiixp: Fix the missing snd_card_free() call at probe error
ALSA: au88x0: Fix the missing snd_card_free() call at probe error
ALSA: aw2: Fix the missing snd_card_free() call at probe error
ALSA: azt3328: Fix the missing snd_card_free() call at probe error
ALSA: bt87x: Fix the missing snd_card_free() call at probe error
ALSA: ca0106: Fix the missing snd_card_free() call at probe error
ALSA: cmipci: Fix the missing snd_card_free() call at probe error
ALSA: cs4281: Fix the missing snd_card_free() call at probe error
ALSA: cs5535audio: Fix the missing snd_card_free() call at probe error
ALSA: echoaudio: Fix the missing snd_card_free() call at probe error
ALSA: emu10k1x: Fix the missing snd_card_free() call at probe error
ALSA: ens137x: Fix the missing snd_card_free() call at probe error
ALSA: es1938: Fix the missing snd_card_free() call at probe error
ALSA: es1968: Fix the missing snd_card_free() call at probe error
ALSA: fm801: Fix the missing snd_card_free() call at probe error
ALSA: galaxy: Fix the missing snd_card_free() call at probe error
ALSA: hdsp: Fix the missing snd_card_free() call at probe error
ALSA: hdspm: Fix the missing snd_card_free() call at probe error
ALSA: ice1724: Fix the missing snd_card_free() call at probe error
ALSA: intel8x0: Fix the missing snd_card_free() call at probe error
ALSA: intel_hdmi: Fix the missing snd_card_free() call at probe error
ALSA: korg1212: Fix the missing snd_card_free() call at probe error
ALSA: lola: Fix the missing snd_card_free() call at probe error
ALSA: lx6464es: Fix the missing snd_card_free() call at probe error
ALSA: maestro3: Fix the missing snd_card_free() call at probe error
ALSA: oxygen: Fix the missing snd_card_free() call at probe error
ALSA: riptide: Fix the missing snd_card_free() call at probe error
ALSA: rme32: Fix the missing snd_card_free() call at probe error
ALSA: rme9652: Fix the missing snd_card_free() call at probe error
ALSA: rme96: Fix the missing snd_card_free() call at probe error
ALSA: sc6000: Fix the missing snd_card_free() call at probe error
ALSA: sonicvibes: Fix the missing snd_card_free() call at probe error
ALSA: via82xx: Fix the missing snd_card_free() call at probe error
ALSA: usb-audio: Cap upper limits of buffer/period bytes for implicit fb
ALSA: nm256: Don't call card private_free at probe error path
drm/msm: Add missing put_task_struct() in debugfs path
firmware: arm_scmi: Remove clear channel call on the TX channel
memory: atmel-ebi: Fix missing of_node_put in atmel_ebi_probe
Revert "ath11k: mesh: add support for 256 bitmap in blockack frames in 11ax"
firmware: arm_scmi: Fix sorting of retrieved clock rates
media: rockchip/rga: do proper error checking in probe
SUNRPC: Fix the svc_deferred_event trace class
net/sched: flower: fix parsing of ethertype following VLAN header
veth: Ensure eth header is in skb's linear part
gpiolib: acpi: use correct format characters
cifs: release cached dentries only if mount is complete
net: mdio: don't defer probe forever if PHY IRQ provider is missing
mlxsw: i2c: Fix initialization error flow
net/sched: fix initialization order when updating chain 0 head
net: dsa: felix: suppress -EPROBE_DEFER errors
net: ethernet: stmmac: fix altr_tse_pcs function when using a fixed-link
net/sched: taprio: Check if socket flags are valid
cfg80211: hold bss_lock while updating nontrans_list
netfilter: nft_socket: make cgroup match work in input too
drm/msm: Fix range size vs end confusion
drm/msm/dsi: Use connector directly in msm_dsi_manager_connector_init()
drm/msm/dp: add fail safe mode outside of event_mutex context
net/smc: Fix NULL pointer dereference in smc_pnet_find_ib()
scsi: pm80xx: Mask and unmask upper interrupt vectors 32-63
scsi: pm80xx: Enable upper inbound, outbound queues
scsi: iscsi: Move iscsi_ep_disconnect()
scsi: iscsi: Fix offload conn cleanup when iscsid restarts
scsi: iscsi: Fix endpoint reuse regression
scsi: iscsi: Fix conn cleanup and stop race during iscsid restart
scsi: iscsi: Fix unbound endpoint error handling
sctp: Initialize daddr on peeled off socket
netfilter: nf_tables: nft_parse_register can return a negative value
ALSA: ad1889: Fix the missing snd_card_free() call at probe error
ALSA: mtpav: Don't call card private_free at probe error path
io_uring: move io_uring_rsrc_update2 validation
io_uring: verify that resv2 is 0 in io_uring_rsrc_update2
io_uring: verify pad field is 0 in io_get_ext_arg
testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set
ALSA: usb-audio: Increase max buffer size
ALSA: usb-audio: Limit max buffer and period sizes per time
perf tools: Fix misleading add event PMU debug message
macvlan: Fix leaking skb in source mode with nodst option
net: ftgmac100: access hardware register after clock ready
nfc: nci: add flush_workqueue to prevent uaf
cifs: potential buffer overflow in handling symlinks
dm mpath: only use ktime_get_ns() in historical selector
vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used
net: bcmgenet: Revert "Use stronger register read/writes to assure ordering"
block: fix offset/size check in bio_trim()
drm/amd: Add USBC connector ID
btrfs: fix fallocate to use file_modified to update permissions consistently
btrfs: do not warn for free space inode in cow_file_range
drm/amdgpu: conduct a proper cleanup of PDB bo
drm/amdgpu/gmc: use PCI BARs for APUs in passthrough
drm/amd/display: fix audio format not updated after edid updated
drm/amd/display: FEC check in timing validation
drm/amd/display: Update VTEM Infopacket definition
drm/amdkfd: Fix Incorrect VMIDs passed to HWS
drm/amdgpu/vcn: improve vcn dpg stop procedure
drm/amdkfd: Check for potential null return of kmalloc_array()
Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
PCI: hv: Propagate coherence from VMbus device to PCI device
Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
scsi: target: tcmu: Fix possible page UAF
scsi: lpfc: Fix queue failures when recovering from PCI parity error
scsi: ibmvscsis: Increase INITIAL_SRP_LIMIT to 1024
net: micrel: fix KS8851_MLL Kconfig
ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
gpu: ipu-v3: Fix dev_dbg frequency output
regulator: wm8994: Add an off-on delay for WM8994 variant
arm64: alternatives: mark patch_alternative() as `noinstr`
tlb: hugetlb: Add more sizes to tlb_remove_huge_tlb_entry
net: axienet: setup mdio unconditionally
Drivers: hv: balloon: Disable balloon and hot-add accordingly
net: usb: aqc111: Fix out-of-bounds accesses in RX fixup
myri10ge: fix an incorrect free for skb in myri10ge_sw_tso
spi: cadence-quadspi: fix protocol setup for non-1-1-X operations
drm/amd/display: Enable power gating before init_pipes
drm/amd/display: Revert FEC check in validation
drm/amd/display: Fix allocate_mst_payload assert on resume
drbd: set QUEUE_FLAG_STABLE_WRITES
scsi: mpt3sas: Fail reset operation if config request timed out
scsi: mvsas: Add PCI ID of RocketRaid 2640
scsi: megaraid_sas: Target with invalid LUN ID is deleted during scan
drivers: net: slip: fix NPD bug in sl_tx_timeout()
io_uring: zero tag on rsrc removal
io_uring: use nospec annotation for more indexes
perf/imx_ddr: Fix undefined behavior due to shift overflowing the constant
mm/secretmem: fix panic when growing a memfd_secret
mm, page_alloc: fix build_zonerefs_node()
mm: fix unexpected zeroed page mapping with zram swap
mm: kmemleak: take a full lowmem check in kmemleak_*_phys()
KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loaded
SUNRPC: Fix NFSD's request deferral on RDMA transports
memory: renesas-rpc-if: fix platform-device leak in error path
gcc-plugins: latent_entropy: use /dev/urandom
cifs: verify that tcon is valid before dereference in cifs_kill_sb
ath9k: Properly clear TX status area before reporting to mac80211
ath9k: Fix usage of driver-private space in tx_info
btrfs: fix root ref counts in error handling in btrfs_get_root_ref
btrfs: mark resumed async balance as writing
ALSA: hda/realtek: Add quirk for Clevo PD50PNT
ALSA: hda/realtek: add quirk for Lenovo Thinkpad X12 speakers
ALSA: pcm: Test for "silence" field in struct "pcm_format_data"
nl80211: correctly check NL80211_ATTR_REG_ALPHA2 size
ipv6: fix panic when forwarding a pkt with no in6 dev
drm/amd/display: don't ignore alpha property on pre-multiplied mode
drm/amdgpu: Enable gfxoff quirk on MacBook Pro
x86/tsx: Use MSR_TSX_CTRL to clear CPUID bits
x86/tsx: Disable TSX development mode at boot
genirq/affinity: Consider that CPUs on nodes can be unbalanced
tick/nohz: Use WARN_ON_ONCE() to prevent console saturation
ARM: davinci: da850-evm: Avoid NULL pointer dereference
dm integrity: fix memory corruption when tag_size is less than digest size
i2c: dev: check return value when calling dev_set_name()
smp: Fix offline cpu check in flush_smp_call_function_queue()
i2c: pasemi: Wait for write xfers to finish
dt-bindings: net: snps: remove duplicate name
timers: Fix warning condition in __run_timers()
dma-direct: avoid redundant memory sync for swiotlb
drm/i915: Sunset igpu legacy mmap support based on GRAPHICS_VER_FULL
cpu/hotplug: Remove the 'cpu' member of cpuhp_cpu_state
soc: qcom: aoss: Fix missing put_device call in qmp_get
net: ipa: fix a build dependency
cpufreq: intel_pstate: ITMT support for overclocked system
ax25: add refcount in ax25_dev to avoid UAF bugs
ax25: fix reference count leaks of ax25_dev
ax25: fix UAF bugs of net_device caused by rebinding operation
ax25: Fix refcount leaks caused by ax25_cb_del()
ax25: fix UAF bug in ax25_send_control()
ax25: fix NPD bug in ax25_disconnect
ax25: Fix NULL pointer dereferences in ax25 timers
ax25: Fix UAF bugs in ax25 timers
Linux 5.15.35
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0dd9eaea7f977df42b0a5b9cb9043c879f62718b
Changes in 5.15.34
lib/logic_iomem: correct fallback config references
um: fix and optimize xor select template for CONFIG64 and timetravel mode
rtc: wm8350: Handle error for wm8350_register_irq
nbd: add error handling support for add_disk()
nbd: Fix incorrect error handle when first_minor is illegal in nbd_dev_add
nbd: Fix hungtask when nbd_config_put
nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
kfence: count unexpectedly skipped allocations
kfence: move saving stack trace of allocations into __kfence_alloc()
kfence: limit currently covered allocations when pool nearly full
KVM: x86/pmu: Use different raw event masks for AMD and Intel
KVM: SVM: Fix kvm_cache_regs.h inclusions for is_guest_mode()
KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRs
KVM: x86/pmu: Fix and isolate TSX-specific performance event logic
KVM: x86/emulator: Emulate RDPID only if it is enabled in guest
drm: Add orientation quirk for GPD Win Max
ath5k: fix OOB in ath5k_eeprom_read_pcal_info_5111
drm/amd/display: Add signal type check when verify stream backends same
drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj
drm/amd/display: Fix memory leak
drm/amd/display: Use PSR version selected during set_psr_caps
usb: gadget: tegra-xudc: Do not program SPARAM
usb: gadget: tegra-xudc: Fix control endpoint's definitions
usb: cdnsp: fix cdnsp_decode_trb function to properly handle ret value
ptp: replace snprintf with sysfs_emit
drm/amdkfd: Don't take process mutex for svm ioctls
powerpc: dts: t104xrdb: fix phy type for FMAN 4/5
ath11k: fix kernel panic during unload/load ath11k modules
ath11k: pci: fix crash on suspend if board file is not found
ath11k: mhi: use mhi_sync_power_up()
net/smc: Send directly when TCP_CORK is cleared
drm/bridge: Add missing pm_runtime_put_sync
bpf: Make dst_port field in struct bpf_sock 16-bit wide
scsi: mvsas: Replace snprintf() with sysfs_emit()
scsi: bfa: Replace snprintf() with sysfs_emit()
drm/v3d: fix missing unlock
power: supply: axp20x_battery: properly report current when discharging
mt76: mt7921: fix crash when startup fails.
mt76: dma: initialize skip_unmap in mt76_dma_rx_fill
cfg80211: don't add non transmitted BSS to 6GHz scanned channels
libbpf: Fix build issue with llvm-readelf
ipv6: make mc_forwarding atomic
net: initialize init_net earlier
powerpc: Set crashkernel offset to mid of RMA region
drm/amdgpu: Fix recursive locking warning
scsi: smartpqi: Fix kdump issue when controller is locked up
PCI: aardvark: Fix support for MSI interrupts
iommu/arm-smmu-v3: fix event handling soft lockup
usb: ehci: add pci device support for Aspeed platforms
PCI: endpoint: Fix alignment fault error in copy tests
tcp: Don't acquire inet_listen_hashbucket::lock with disabled BH.
PCI: pciehp: Add Qualcomm quirk for Command Completed erratum
scsi: mpi3mr: Fix reporting of actual data transfer size
scsi: mpi3mr: Fix memory leaks
powerpc/set_memory: Avoid spinlock recursion in change_page_attr()
power: supply: axp288-charger: Set Vhold to 4.4V
net/mlx5e: Disable TX queues before registering the netdev
usb: dwc3: pci: Set the swnode from inside dwc3_pci_quirks()
iwlwifi: mvm: Correctly set fragmented EBS
iwlwifi: mvm: move only to an enabled channel
drm/msm/dsi: Remove spurious IRQF_ONESHOT flag
ipv4: Invalidate neighbour for broadcast address upon address addition
dm ioctl: prevent potential spectre v1 gadget
dm: requeue IO if mapping table not yet available
drm/amdkfd: make CRAT table missing message informational only
vfio/pci: Stub vfio_pci_vga_rw when !CONFIG_VFIO_PCI_VGA
scsi: pm8001: Fix pm80xx_pci_mem_copy() interface
scsi: pm8001: Fix pm8001_mpi_task_abort_resp()
scsi: pm8001: Fix task leak in pm8001_send_abort_all()
scsi: pm8001: Fix tag leaks on error
scsi: pm8001: Fix memory leak in pm8001_chip_fw_flash_update_req()
mt76: mt7915: fix injected MPDU transmission to not use HW A-MSDU
powerpc/64s/hash: Make hash faults work in NMI context
mt76: mt7615: Fix assigning negative values to unsigned variable
scsi: aha152x: Fix aha152x_setup() __setup handler return value
scsi: hisi_sas: Free irq vectors in order for v3 HW
scsi: hisi_sas: Limit users changing debugfs BIST count value
net/smc: correct settings of RMB window update limit
mips: ralink: fix a refcount leak in ill_acc_of_setup()
macvtap: advertise link netns via netlink
tuntap: add sanity checks about msg_controllen in sendmsg
Bluetooth: Fix not checking for valid hdev on bt_dev_{info,warn,err,dbg}
Bluetooth: use memset avoid memory leaks
bnxt_en: Eliminate unintended link toggle during FW reset
PCI: endpoint: Fix misused goto label
MIPS: fix fortify panic when copying asm exception handlers
powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E
powerpc/secvar: fix refcount leak in format_show()
scsi: libfc: Fix use after free in fc_exch_abts_resp()
can: isotp: set default value for N_As to 50 micro seconds
can: etas_es58x: es58x_fd_rx_event_msg(): initialize rx_event_msg before calling es58x_check_msg_len()
riscv: Fixed misaligned memory access. Fixed pointer comparison.
net: account alternate interface name memory
net: limit altnames to 64k total
net/mlx5e: Remove overzealous validations in netlink EEPROM query
net: sfp: add 2500base-X quirk for Lantech SFP module
usb: dwc3: omap: fix "unbalanced disables for smps10_out1" on omap5evm
mt76: fix monitor mode crash with sdio driver
xtensa: fix DTC warning unit_address_format
MIPS: ingenic: correct unit node address
Bluetooth: Fix use after free in hci_send_acl
netfilter: conntrack: revisit gc autotuning
netlabel: fix out-of-bounds memory accesses
ceph: fix inode reference leakage in ceph_get_snapdir()
ceph: fix memory leak in ceph_readdir when note_last_dentry returns error
lib/Kconfig.debug: add ARCH dependency for FUNCTION_ALIGN option
init/main.c: return 1 from handled __setup() functions
minix: fix bug when opening a file with O_DIRECT
clk: si5341: fix reported clk_rate when output divider is 2
staging: vchiq_arm: Avoid NULL ptr deref in vchiq_dump_platform_instances
staging: vchiq_core: handle NULL result of find_service_by_handle
phy: amlogic: phy-meson-gxl-usb2: fix shared reset controller use
phy: amlogic: meson8b-usb2: Use dev_err_probe()
phy: amlogic: meson8b-usb2: fix shared reset control use
clk: rockchip: drop CLK_SET_RATE_PARENT from dclk_vop* on rk3568
cpufreq: CPPC: Fix performance/frequency conversion
opp: Expose of-node's name in debugfs
staging: wfx: fix an error handling in wfx_init_common()
w1: w1_therm: fixes w1_seq for ds28ea00 sensors
NFSv4.2: fix reference count leaks in _nfs42_proc_copy_notify()
NFSv4: Protect the state recovery thread against direct reclaim
habanalabs: fix possible memory leak in MMU DR fini
xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
clk: ti: Preserve node in ti_dt_clocks_register()
clk: Enforce that disjoints limits are invalid
SUNRPC/call_alloc: async tasks mustn't block waiting for memory
SUNRPC/xprt: async tasks mustn't block waiting for memory
SUNRPC: remove scheduling boost for "SWAPPER" tasks.
NFS: swap IO handling is slightly different for O_DIRECT IO
NFS: swap-out must always use STABLE writes.
x86: Annotate call_on_stack()
x86/Kconfig: Do not allow CONFIG_X86_X32_ABI=y with llvm-objcopy
serial: samsung_tty: do not unlock port->lock for uart_write_wakeup()
virtio_console: eliminate anonymous module_init & module_exit
jfs: prevent NULL deref in diFree
SUNRPC: Fix socket waits for write buffer space
NFS: nfsiod should not block forever in mempool_alloc()
NFS: Avoid writeback threads getting stuck in mempool_alloc()
selftests: net: Add tls config dependency for tls selftests
parisc: Fix CPU affinity for Lasi, WAX and Dino chips
parisc: Fix patch code locking and flushing
mm: fix race between MADV_FREE reclaim and blkdev direct IO read
rtc: mc146818-lib: change return values of mc146818_get_time()
rtc: Check return value from mc146818_get_time()
rtc: mc146818-lib: fix RTC presence check
drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire()
Drivers: hv: vmbus: Fix potential crash on module unload
Revert "NFSv4: Handle the special Linux file open access mode"
NFSv4: fix open failure with O_ACCMODE flag
scsi: sr: Fix typo in CDROM(CLOSETRAY|EJECT) handling
scsi: core: Fix sbitmap depth in scsi_realloc_sdev_budget_map()
scsi: zorro7xx: Fix a resource leak in zorro7xx_remove_one()
vdpa/mlx5: Rename control VQ workqueue to vdpa wq
vdpa/mlx5: Propagate link status from device to vdpa driver
vdpa: mlx5: prevent cvq work from hogging CPU
net: sfc: add missing xdp queue reinitialization
net/tls: fix slab-out-of-bounds bug in decrypt_internal
vrf: fix packet sniffing for traffic originating from ip tunnels
skbuff: fix coalescing for page_pool fragment recycling
ice: Clear default forwarding VSI during VSI release
mctp: Fix check for dev_hard_header() result
net: ipv4: fix route with nexthop object delete warning
net: stmmac: Fix unset max_speed difference between DT and non-DT platforms
drm/imx: imx-ldb: Check for null pointer after calling kmemdup
drm/imx: Fix memory leak in imx_pd_connector_get_modes
drm/imx: dw_hdmi-imx: Fix bailout in error cases of probe
regulator: rtq2134: Fix missing active_discharge_on setting
regulator: atc260x: Fix missing active_discharge_on setting
arch/arm64: Fix topology initialization for core scheduling
bnxt_en: Synchronize tx when xdp redirects happen on same ring
bnxt_en: reserve space inside receive page for skb_shared_info
bnxt_en: Prevent XDP redirect from running when stopping TX queue
sfc: Do not free an empty page_ring
RDMA/mlx5: Don't remove cache MRs when a delay is needed
RDMA/mlx5: Add a missing update of cache->last_add
IB/cm: Cancel mad on the DREQ event when the state is MRA_REP_RCVD
IB/rdmavt: add lock to call to rvt_error_qp to prevent a race condition
sctp: count singleton chunks in assoc user stats
dpaa2-ptp: Fix refcount leak in dpaa2_ptp_probe
ice: Set txq_teid to ICE_INVAL_TEID on ring creation
ice: Do not skip not enabled queues in ice_vc_dis_qs_msg
ipv6: Fix stats accounting in ip6_pkt_drop
ice: synchronize_rcu() when terminating rings
ice: xsk: fix VSI state check in ice_xsk_wakeup()
net: openvswitch: don't send internal clone attribute to the userspace.
net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address()
net: openvswitch: fix leak of nested actions
rxrpc: fix a race in rxrpc_exit_net()
net: sfc: fix using uninitialized xdp tx_queue
net: phy: mscc-miim: reject clause 45 register accesses
qede: confirm skb is allocated before using
spi: bcm-qspi: fix MSPI only access with bcm_qspi_exec_mem_op()
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
drbd: Fix five use after free bugs in get_initial_state
scsi: ufs: ufshpb: Fix a NULL check on list iterator
io_uring: nospec index for tags on files update
io_uring: don't touch scm_fp_list after queueing skb
SUNRPC: Handle ENOMEM in call_transmit_status()
SUNRPC: Handle low memory situations in call_status()
SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec()
iommu/omap: Fix regression in probe for NULL pointer dereference
perf: arm-spe: Fix perf report --mem-mode
perf tools: Fix perf's libperf_print callback
perf session: Remap buf if there is no space for event
arm64: Add part number for Arm Cortex-A78AE
scsi: mpt3sas: Fix use after free in _scsih_expander_node_remove()
scsi: ufs: ufs-pci: Add support for Intel MTL
Revert "mmc: sdhci-xenon: fix annoying 1.8V regulator warning"
mmc: block: Check for errors after write on SPI
mmc: mmci: stm32: correctly check all elements of sg list
mmc: renesas_sdhi: don't overwrite TAP settings when HS400 tuning is complete
mmc: core: Fixup support for writeback-cache for eMMC and SD
lz4: fix LZ4_decompress_safe_partial read out of bound
highmem: fix checks in __kmap_local_sched_{in,out}
mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
mm/mempolicy: fix mpol_new leak in shared_policy_replace
io_uring: don't check req->file in io_fsync_prep()
io_uring: defer splice/tee file validity check until command issue
io_uring: implement compat handling for IORING_REGISTER_IOWQ_AFF
io_uring: fix race between timeout flush and removal
x86/pm: Save the MSR validity status at context setup
x86/speculation: Restore speculation related MSRs during S3 resume
perf/x86/intel: Update the FRONTEND MSR mask on Sapphire Rapids
btrfs: fix qgroup reserve overflow the qgroup limit
btrfs: prevent subvol with swapfile from being deleted
spi: core: add dma_map_dev for __spi_unmap_msg()
arm64: patch_text: Fixup last cpu should be master
RDMA/hfi1: Fix use-after-free bug for mm struct
gpio: Restrict usage of GPIO chip irq members before initialization
x86/msi: Fix msi message data shadow struct
x86/mm/tlb: Revert retpoline avoidance approach
perf/x86/intel: Don't extend the pseudo-encoding to GP counters
ata: sata_dwc_460ex: Fix crash due to OOB write
perf: qcom_l2_pmu: fix an incorrect NULL check on list iterator
perf/core: Inherit event_caps
irqchip/gic-v3: Fix GICR_CTLR.RWP polling
fbdev: Fix unregistering of framebuffers without device
amd/display: set backlight only if required
SUNRPC: Prevent immediate close+reconnect
drm/panel: ili9341: fix optional regulator handling
drm/amdgpu/display: change pipe policy for DCN 2.1
drm/amdgpu/smu10: fix SoC/fclk units in auto mode
drm/amdgpu/vcn: Fix the register setting for vcn1
drm/nouveau/pmu: Add missing callbacks for Tegra devices
drm/amdkfd: Create file descriptor after client is added to smi_clients list
drm/amdgpu: don't use BACO for reset in S3
KVM: SVM: Allow AVIC support on system w/ physical APIC ID > 255
net/smc: send directly on setting TCP_NODELAY
Revert "selftests: net: Add tls config dependency for tls selftests"
bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
selftests/bpf: Fix u8 narrow load checks for bpf_sk_lookup remote_port
rtc: mc146818-lib: fix signedness bug in mc146818_get_time()
SUNRPC: Don't call connect() more than once on a TCP socket
Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"
perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
perf python: Fix probing for some clang command line options
tools build: Filter out options and warnings not supported by clang
tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts
dmaengine: Revert "dmaengine: shdma: Fix runtime PM imbalance on error"
KVM: avoid NULL pointer dereference in kvm_dirty_ring_push
Revert "net/mlx5: Accept devlink user input after driver initialization complete"
ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
selftests: cgroup: Test open-time credential usage for migration checks
selftests: cgroup: Test open-time cgroup namespace usage for migration checks
mm: don't skip swap entry even if zap_details specified
Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb()
x86/bug: Prevent shadowing in __WARN_FLAGS
sched: Teach the forced-newidle balancer about CPU affinity limitation.
x86,static_call: Fix __static_call_return0 for i386
irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before descheduling
powerpc/64: Fix build failure with allyesconfig in book3s_64_entry.S
irqchip/gic, gic-v3: Prevent GSI to SGI translations
mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning
static_call: Don't make __static_call_return0 static
powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit
stacktrace: move filter_irq_stacks() to kernel/stacktrace.c
Linux 5.15.34
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I98049d0d8ebd427296418d31085bfde482ad30e7
To further exploit spatial locality, the aging prefers to walk page
tables to search for young PTEs and promote hot pages. A kill switch
will be added in the next patch to disable this behavior. When
disabled, the aging relies on the rmap only.
NB: this behavior has nothing similar with the page table scanning in
the 2.4 kernel [1], which searches page tables for old PTEs, adds cold
pages to swapcache and unmaps them.
To avoid confusion, the term "iteration" specifically means the
traversal of an entire mm_struct list; the term "walk" will be applied
to page tables and the rmap, as usual.
An mm_struct list is maintained for each memcg, and an mm_struct
follows its owner task to the new memcg when this task is migrated.
Given an lruvec, the aging iterates lruvec_memcg()->mm_list and calls
walk_page_range() with each mm_struct on this list to promote hot
pages before it increments max_seq.
When multiple page table walkers iterate the same list, each of them
gets a unique mm_struct; therefore they can run concurrently. Page
table walkers ignore any misplaced pages, e.g., if an mm_struct was
migrated, pages it left in the previous memcg will not be promoted
when its current memcg is under reclaim. Similarly, page table walkers
will not promote pages from nodes other than the one under reclaim.
This patch uses the following optimizations when walking page tables:
1. It tracks the usage of mm_struct's between context switches so that
page table walkers can skip processes that have been sleeping since
the last iteration.
2. It uses generational Bloom filters to record populated branches so
that page table walkers can reduce their search space based on the
query results, e.g., to skip page tables containing mostly holes or
misplaced pages.
3. It takes advantage of the accessed bit in non-leaf PMD entries when
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y.
4. It does not zigzag between a PGD table and the same PMD table
spanning multiple VMAs. IOW, it finishes all the VMAs within the
range of the same PMD table before it returns to a PGD table. This
improves the cache performance for workloads that have large
numbers of tiny VMAs [2], especially when CONFIG_PGTABLE_LEVELS=5.
Server benchmark results:
Single workload:
fio (buffered I/O): no change
Single workload:
memcached (anon): +[5.5, 7.5]%
Ops/sec KB/sec
patch1-7: 1014393.57 39455.42
patch1-8: 1078507.59 41949.15
Configurations:
no change
Client benchmark results:
kswapd profiles:
patch1-7
45.54% lzo1x_1_do_compress (real work)
9.56% page_vma_mapped_walk
6.70% _raw_spin_unlock_irq
2.78% ptep_clear_flush
2.47% do_raw_spin_lock
2.22% __zram_bvec_write
1.87% lru_gen_look_around
1.78% memmove
1.77% obj_malloc
1.44% free_unref_page_list
patch1-8
47.02% lzo1x_1_do_compress (real work)
6.73% page_vma_mapped_walk
6.14% _raw_spin_unlock_irq
3.39% walk_pte_range
2.63% ptep_clear_flush
2.29% __zram_bvec_write
2.10% do_raw_spin_lock
1.81% memmove
1.73% obj_malloc
1.53% free_unref_page_list
Configurations:
no change
[1] https://lwn.net/Articles/23732/
[2] https://source.android.com/devices/tech/debug/scudo
Link: https://lore.kernel.org/lkml/20220309021230.721028-9-yuzhao@google.com/
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Steven Barrett <steven@liquorix.net>
Acked-by: Suleiman Souhlal <suleiman@google.com>
Tested-by: Daniel Byrne <djbyrne@mtu.edu>
Tested-by: Donald Carr <d@chaos-reins.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Tested-by: Shuang Zhai <szhai2@cs.rochester.edu>
Tested-by: Sofia Trinh <sofia.trinh@edi.works>
Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Bug: 227651406
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Change-Id: I5a3c97cf8ebf8d65d5f9528cd979a637c190053e
Evictable pages are divided into multiple generations for each lruvec.
The youngest generation number is stored in lrugen->max_seq for both
anon and file types as they are aged on an equal footing. The oldest
generation numbers are stored in lrugen->min_seq[] separately for anon
and file types as clean file pages can be evicted regardless of swap
constraints. These three variables are monotonically increasing.
Generation numbers are truncated into order_base_2(MAX_NR_GENS+1) bits
in order to fit into the gen counter in page->flags. Each truncated
generation number is an index to lrugen->lists[]. The sliding window
technique is used to track at least MIN_NR_GENS and at most
MAX_NR_GENS generations. The gen counter stores a value within [1,
MAX_NR_GENS] while a page is on one of lrugen->lists[]. Otherwise it
stores 0.
There are two conceptually independent procedures: "the aging", which
produces young generations, and "the eviction", which consumes old
generations. They form a closed-loop system, i.e., "the page reclaim".
Both procedures can be invoked from userspace for the purposes of
working set estimation and proactive reclaim. These features are
required to optimize job scheduling (bin packing) in data centers. The
variable size of the sliding window is designed for such use cases
[1][2].
To avoid confusion, the terms "hot" and "cold" will be applied to the
multi-gen LRU, as a new convention; the terms "active" and "inactive"
will be applied to the active/inactive LRU, as usual.
The protection of hot pages and the selection of cold pages are based
on page access channels and patterns. There are two access channels:
one through page tables and the other through file descriptors. The
protection of the former channel is by design stronger because:
1. The uncertainty in determining the access patterns of the former
channel is higher due to the approximation of the accessed bit.
2. The cost of evicting the former channel is higher due to the TLB
flushes required and the likelihood of encountering the dirty bit.
3. The penalty of underprotecting the former channel is higher because
applications usually do not prepare themselves for major page
faults like they do for blocked I/O. E.g., GUI applications
commonly use dedicated I/O threads to avoid blocking the rendering
threads.
There are also two access patterns: one with temporal locality and the
other without. For the reasons listed above, the former channel is
assumed to follow the former pattern unless VM_SEQ_READ or
VM_RAND_READ is present; the latter channel is assumed to follow the
latter pattern unless outlying refaults have been observed [3][4].
The next patch will address the "outlying refaults". Three macros,
i.e., LRU_REFS_WIDTH, LRU_REFS_PGOFF and LRU_REFS_MASK, used later are
added in this patch to make the entire patchset less diffy.
A page is added to the youngest generation on faulting. The aging
needs to check the accessed bit at least twice before handing this
page over to the eviction. The first check takes care of the accessed
bit set on the initial fault; the second check makes sure this page
has not been used since then. This protocol, AKA second chance,
requires a minimum of two generations, hence MIN_NR_GENS.
[1] https://dl.acm.org/doi/10.1145/3297858.3304053
[2] https://dl.acm.org/doi/10.1145/3503222.3507731
[3] https://lwn.net/Articles/495543/
[4] https://lwn.net/Articles/815342/
Link: https://lore.kernel.org/lkml/20220309021230.721028-6-yuzhao@google.com/
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Steven Barrett <steven@liquorix.net>
Acked-by: Suleiman Souhlal <suleiman@google.com>
Tested-by: Daniel Byrne <djbyrne@mtu.edu>
Tested-by: Donald Carr <d@chaos-reins.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Tested-by: Shuang Zhai <szhai2@cs.rochester.edu>
Tested-by: Sofia Trinh <sofia.trinh@edi.works>
Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Bug: 227651406
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Change-Id: I333ec6a1d2abfa60d93d6adc190ed3eefe441512
commit a690e5f2db4d1dca742ce734aaff9f3112d63764 upstream.
When btrfs balance is interrupted with umount, the background balance
resumes on the next mount. There is a potential deadlock with FS freezing
here like as described in commit 26559780b953 ("btrfs: zoned: mark
relocation as writing"). Mark the process as sb_writing to avoid it.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
CC: stable@vger.kernel.org # 4.9+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 168a2f776b9762f4021421008512dd7ab7474df1 upstream.
In btrfs_get_root_ref(), when btrfs_insert_fs_root() fails,
btrfs_put_root() can happen for two reasons:
- the root already exists in the tree, in that case it returns the
reference obtained in btrfs_lookup_fs_root()
- another error so the cleanup is done in the fail label
Calling btrfs_put_root() unconditionally would lead to double decrement
of the root reference possibly freeing it in the second case.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Fixes: bc44d7c4b2 ("btrfs: push btrfs_grab_fs_root into btrfs_get_fs_root")
CC: stable@vger.kernel.org # 5.10+
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b6c58458ee3206dde345fce327a4cb83e69caf9 upstream.
On umount, cifs_sb->tlink_tree might contain entries that do not represent
a valid tcon.
Check the tcon for error before we dereference it.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reported-by: Xiaoli Feng <xifeng@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8f0a24801bb44aa58496945aabb904c729176772 ]
Automatically default rsrc tag in io_queue_rsrc_removal(), it's safer
than leaving it there and relying on the rest of the code to behave and
not use it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1cf262a50df17478ea25b22494dcc19f3a80301f.1649336342.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a7d16d9a07bbcb7dcd5214a1bea75c808830bc0d ]
This is a long time leftover from when I originally added the free space
inode, the point was to catch cases where we weren't honoring the NOCOW
flag. However there exists a race with relocation, if we allocate our
free space inode in a block group that is about to be relocated, we
could trigger the COW path before the relocation has the opportunity to
find the extents and delete the free space cache. In production where
we have auto-relocation enabled we're seeing this WARN_ON_ONCE() around
5k times in a 2 week period, so not super common but enough that it's at
the top of our metrics.
We're properly handling the error here, and with us phasing out v1 space
cache anyway just drop the WARN_ON_ONCE.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 05fd9564e9faf0f23b4676385e27d9405cef6637 ]
Since the initial introduction of (posix) fallocate back at the turn of
the century, it has been possible to use this syscall to change the
user-visible contents of files. This can happen by extending the file
size during a preallocation, or through any of the newer modes (punch,
zero range). Because the call can be used to change file contents, we
should treat it like we do any other modification to a file -- update
the mtime, and drop set[ug]id privileges/capabilities.
The VFS function file_modified() does all this for us if pass it a
locked inode, so let's make fallocate drop permissions correctly.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64c4a37ac04eeb43c42d272f6e6c8c12bfcf4304 ]
Smatch printed a warning:
arch/x86/crypto/poly1305_glue.c:198 poly1305_update_arch() error:
__memcpy() 'dctx->buf' too small (16 vs u32max)
It's caused because Smatch marks 'link_len' as untrusted since it comes
from sscanf(). Add a check to ensure that 'link_len' is not larger than
the size of the 'link_str' buffer.
Fixes: c69c1b6eae ("cifs: implement CIFSParseMFSymlink()")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8a3ba9c143bf89c032deced8a686ffa53b46098 ]
Verify that the user does not pass in anything but 0 for this field.
Fixes: 992da01aa9 ("io_uring: change registration/upd/rsrc tagging ABI")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220412163042.2788062-3-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 565c5e616e8061b40a2e1d786c418a7ac3503a8d ]
Move validation to be more consistently straight after
copy_from_user. This is already done in io_register_rsrc_update and so
this removes that redundant check.
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220412163042.2788062-2-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d788e51636462e61c6883f7d96b07b06bc291650 ]
During cifs_kill_sb, we first dput all the dentries that we have cached.
However this function can also get called for mount failures.
So dput the cached dentries only if the filesystem mount is complete.
i.e. cifs_sb->root is populated.
Fixes: 5e9c89d43f ("cifs: Grab a reference for the dentry of the cached directory during the lifetime of the cache")
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6d82ad13c4110e73c7b0392f00534a1502a1b520 upstream.
Running generic/406 causes the following WARNING in btrfs_destroy_inode()
which tells there are outstanding extents left.
In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
extents with btrfs_delalloc_reserve_metadata() (or indirectly from
btrfs_delalloc_reserve_space(()). We then release the outstanding extents
with btrfs_delalloc_release_extents(). However, the "len" can be modified
in the COW case, which releases fewer outstanding extents than expected.
Fix it by calling btrfs_delalloc_release_extents() for the original length.
To reproduce the warning, the filesystem should be 1 GiB. It's
triggering a short-write, due to not being able to allocate a large
extent and instead allocating a smaller one.
WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
Modules linked in: btrfs blake2b_generic xor lzo_compress
lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
zsmalloc
CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
destroy_inode+0x33/0x70
dispose_list+0x43/0x60
evict_inodes+0x161/0x1b0
generic_shutdown_super+0x2d/0x110
kill_anon_super+0xf/0x20
btrfs_kill_super+0xd/0x20 [btrfs]
deactivate_locked_super+0x27/0x90
cleanup_mnt+0x12c/0x180
task_work_run+0x54/0x80
exit_to_user_mode_prepare+0x152/0x160
syscall_exit_to_user_mode+0x12/0x30
do_syscall_64+0x42/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f854a000fb7
Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
CC: stable@vger.kernel.org # 5.17
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d4a6b515c39f1f8763093e0f828959b2fbc2f45 upstream.
Clang's version of -Wunused-but-set-variable recently gained support for
unary operations, which reveals two unused variables:
fs/btrfs/block-group.c:2949:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable]
int num_started = 0;
^
fs/btrfs/block-group.c:3116:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable]
int num_started = 0;
^
2 errors generated.
These variables appear to be unused from their introduction, so just
remove them to silence the warnings.
Fixes: c9dc4c6578 ("Btrfs: two stage dirty block group writeout")
Fixes: 1bbc621ef2 ("Btrfs: allow block group cache writeout outside critical section in commit")
CC: stable@vger.kernel.org # 5.4+
Link: https://github.com/ClangBuiltLinux/linux/issues/1614
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad3fc7946b1829213bbdbb2b9ad0d124b31ae4a7 upstream.
After commit 92082d4097 ("btrfs: integrate page status update for
data read path into begin/end_page_read"), the 'nr' counter at
btrfs_do_readpage() is no longer used, we increment it but we never
read from it. So just remove it.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit f6ca862806.
It breaks the kernel abi so revert it for now. We will add it back
later at the next kabi update.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie2c8d32fbcab4289c1087ebfee3c1106cd637eef
This reverts commit 7ba958df64.
It breaks the kernel abi so revert it for now. We will add it back
later at the next kabi update.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I594a8403ac897c8aef507f7421ed8ff80935b267
This reverts commit 39fd0cc079.
It breaks the kernel abi so revert it for now. We will add it back
later at the next kabi update.
Bug: 161946584
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I0c97342054fad52bd957a09f48a176e71d76ec01
Changes in 5.15.33
Revert "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""
USB: serial: pl2303: add IBM device IDs
dt-bindings: usb: hcd: correct usb-device path
USB: serial: pl2303: fix GS type detection
USB: serial: simple: add Nokia phone driver
mm: kfence: fix missing objcg housekeeping for SLAB
hv: utils: add PTP_1588_CLOCK to Kconfig to fix build
HID: logitech-dj: add new lightspeed receiver id
HID: Add support for open wheel and no attachment to T300
xfrm: fix tunnel model fragmentation behavior
ARM: mstar: Select HAVE_ARM_ARCH_TIMER
virtio_console: break out of buf poll on remove
vdpa/mlx5: should verify CTRL_VQ feature exists for MQ
tools/virtio: fix virtio_test execution
ethernet: sun: Free the coherent when failing in probing
gpio: Revert regression in sysfs-gpio (gpiolib.c)
spi: Fix invalid sgs value
net:mcf8390: Use platform_get_irq() to get the interrupt
Revert "gpio: Revert regression in sysfs-gpio (gpiolib.c)"
spi: Fix erroneous sgs value with min_t()
Input: zinitix - do not report shadow fingers
af_key: add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register
net: dsa: microchip: add spi_device_id tables
selftests: vm: fix clang build error multiple output files
locking/lockdep: Avoid potential access of invalid memory in lock_class
drm/amdgpu: move PX checking into amdgpu_device_ip_early_init
drm/amdgpu: only check for _PR3 on dGPUs
iommu/iova: Improve 32-bit free space estimate
virtio-blk: Use blk_validate_block_size() to validate block size
tpm: fix reference counting for struct tpm_chip
usb: typec: tipd: Forward plug orientation to typec subsystem
USB: usb-storage: Fix use of bitfields for hardware data in ene_ub6250.c
xhci: fix garbage USBSTS being logged in some cases
xhci: fix runtime PM imbalance in USB2 resume
xhci: make xhci_handshake timeout for xhci_reset() adjustable
xhci: fix uninitialized string returned by xhci_decode_ctrl_ctx()
mei: me: disable driver on the ign firmware
mei: me: add Alder Lake N device id.
mei: avoid iterator usage outside of list_for_each_entry
bus: mhi: pci_generic: Add mru_default for Quectel EM1xx series
bus: mhi: Fix MHI DMA structure endianness
docs: sphinx/requirements: Limit jinja2<3.1
coresight: Fix TRCCONFIGR.QE sysfs interface
coresight: syscfg: Fix memleak on registration failure in cscfg_create_device
iio: afe: rescale: use s64 for temporary scale calculations
iio: inkern: apply consumer scale on IIO_VAL_INT cases
iio: inkern: apply consumer scale when no channel scale is available
iio: inkern: make a best effort on offset calculation
greybus: svc: fix an error handling bug in gb_svc_hello()
clk: rockchip: re-add rational best approximation algorithm to the fractional divider
clk: uniphier: Fix fixed-rate initialization
ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE
cifs: fix handlecache and multiuser
cifs: we do not need a spinlock around the tree access during umount
KEYS: fix length validation in keyctl_pkey_params_get_2()
KEYS: asymmetric: enforce that sig algo matches key algo
KEYS: asymmetric: properly validate hash_algo and encoding
Documentation: add link to stable release candidate tree
Documentation: update stable tree link
firmware: stratix10-svc: add missing callback parameter on RSU
firmware: sysfb: fix platform-device leak in error path
HID: intel-ish-hid: Use dma_alloc_coherent for firmware update
SUNRPC: avoid race between mod_timer() and del_timer_sync()
NFS: NFSv2/v3 clients should never be setting NFS_CAP_XATTR
NFSD: prevent underflow in nfssvc_decode_writeargs()
NFSD: prevent integer overflow on 32 bit systems
f2fs: fix to unlock page correctly in error path of is_alive()
f2fs: quota: fix loop condition at f2fs_quota_sync()
f2fs: fix to do sanity check on .cp_pack_total_block_count
remoteproc: Fix count check in rproc_coredump_write()
mm/mlock: fix two bugs in user_shm_lock()
pinctrl: ingenic: Fix regmap on X series SoCs
pinctrl: samsung: drop pin banks references on error paths
net: bnxt_ptp: fix compilation error
spi: mxic: Fix the transmit path
mtd: rawnand: protect access to rawnand devices while in suspend
can: ems_usb: ems_usb_start_xmit(): fix double dev_kfree_skb() in error path
can: m_can: m_can_tx_handler(): fix use after free of skb
can: usb_8dev: usb_8dev_start_xmit(): fix double dev_kfree_skb() in error path
jffs2: fix use-after-free in jffs2_clear_xattr_subsystem
jffs2: fix memory leak in jffs2_do_mount_fs
jffs2: fix memory leak in jffs2_scan_medium
mm: fs: fix lru_cache_disabled race in bh_lru
mm/pages_alloc.c: don't create ZONE_MOVABLE beyond the end of a node
mm: invalidate hwpoison page cache page in fault path
mempolicy: mbind_range() set_policy() after vma_merge()
scsi: core: sd: Add silence_suspend flag to suppress some PM messages
scsi: ufs: Fix runtime PM messages never-ending cycle
scsi: scsi_transport_fc: Fix FPIN Link Integrity statistics counters
scsi: libsas: Fix sas_ata_qc_issue() handling of NCQ NON DATA commands
qed: display VF trust config
qed: validate and restrict untrusted VFs vlan promisc mode
riscv: dts: canaan: Fix SPI3 bus width
riscv: Fix fill_callchain return value
riscv: Increase stack size under KASAN
Revert "Input: clear BTN_RIGHT/MIDDLE on buttonpads"
cifs: prevent bad output lengths in smb2_ioctl_query_info()
cifs: fix NULL ptr dereference in smb2_ioctl_query_info()
ALSA: cs4236: fix an incorrect NULL check on list iterator
ALSA: hda: Avoid unsol event during RPM suspending
ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lock
ALSA: hda/realtek: Fix audio regression on Mi Notebook Pro 2020
rtc: mc146818-lib: fix locking in mc146818_set_time
rtc: pl031: fix rtc features null pointer dereference
ocfs2: fix crash when mount with quota enabled
drm/simpledrm: Add "panel orientation" property on non-upright mounted LCD panels
mm: madvise: skip unmapped vma holes passed to process_madvise
mm: madvise: return correct bytes advised with process_madvise
Revert "mm: madvise: skip unmapped vma holes passed to process_madvise"
mm,hwpoison: unmap poisoned page before invalidation
mm/kmemleak: reset tag when compare object pointer
dm stats: fix too short end duration_ns when using precise_timestamps
dm: fix use-after-free in dm_cleanup_zoned_dev()
dm: interlock pending dm_io and dm_wait_for_bios_completion
dm: fix double accounting of flush with data
dm integrity: set journal entry unused when shrinking device
tracing: Have trace event string test handle zero length strings
drbd: fix potential silent data corruption
powerpc/kvm: Fix kvm_use_magic_page
PCI: fu740: Force 2.5GT/s for initial device probe
arm64: signal: nofpsimd: Do not allocate fp/simd context when not available
arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones
arm64: dts: qcom: sm8250: Fix MSI IRQ for PCIe1 and PCIe2
arm64: dts: ti: k3-am65: Fix gic-v3 compatible regs
arm64: dts: ti: k3-j721e: Fix gic-v3 compatible regs
arm64: dts: ti: k3-j7200: Fix gic-v3 compatible regs
arm64: dts: ti: k3-am64: Fix gic-v3 compatible regs
ASoC: SOF: Intel: Fix NULL ptr dereference when ENOMEM
Revert "ACPI: Pass the same capabilities to the _OSC regardless of the query flag"
ACPI: properties: Consistently return -ENOENT if there are no more references
coredump: Also dump first pages of non-executable ELF libraries
ext4: fix ext4_fc_stats trace point
ext4: fix fs corruption when tring to remove a non-empty directory with IO error
ext4: make mb_optimize_scan performance mount option work with extents
drivers: hamradio: 6pack: fix UAF bug caused by mod_timer()
samples/landlock: Fix path_list memory leak
landlock: Use square brackets around "landlock-ruleset"
mailbox: tegra-hsp: Flush whole channel
block: limit request dispatch loop duration
block: don't merge across cgroup boundaries if blkcg is enabled
drm/edid: check basic audio support on CEA extension block
fbdev: Hot-unplug firmware fb devices on forced removal
video: fbdev: sm712fb: Fix crash in smtcfb_read()
video: fbdev: atari: Atari 2 bpp (STe) palette bugfix
rfkill: make new event layout opt-in
ARM: dts: at91: sama7g5: Remove unused properties in i2c nodes
ARM: dts: at91: sama5d2: Fix PMERRLOC resource size
ARM: dts: exynos: fix UART3 pins configuration in Exynos5250
ARM: dts: exynos: add missing HDMI supplies on SMDK5250
ARM: dts: exynos: add missing HDMI supplies on SMDK5420
mgag200 fix memmapsl configuration in GCTL6 register
carl9170: fix missing bit-wise or operator for tx_params
pstore: Don't use semaphores in always-atomic-context code
thermal: int340x: Increase bitmap size
lib/raid6/test: fix multiple definition linking error
exec: Force single empty string when argv is empty
crypto: rsa-pkcs1pad - only allow with rsa
crypto: rsa-pkcs1pad - correctly get hash from source scatterlist
crypto: rsa-pkcs1pad - restore signature length check
crypto: rsa-pkcs1pad - fix buffer overread in pkcs1pad_verify_complete()
bcache: fixup multiple threads crash
PM: domains: Fix sleep-in-atomic bug caused by genpd_debug_remove()
DEC: Limit PMAX memory probing to R3k systems
media: gpio-ir-tx: fix transmit with long spaces on Orange Pi PC
media: venus: hfi_cmds: List HDR10 property as unsupported for v1 and v3
media: venus: venc: Fix h264 8x8 transform control
media: davinci: vpif: fix unbalanced runtime PM get
media: davinci: vpif: fix unbalanced runtime PM enable
btrfs: zoned: mark relocation as writing
btrfs: extend locking to all space_info members accesses
btrfs: verify the tranisd of the to-be-written dirty extent buffer
xtensa: define update_mmu_tlb function
xtensa: fix stop_machine_cpuslocked call in patch_text
xtensa: fix xtensa_wsr always writing 0
drm/syncobj: flatten dma_fence_chains on transfer
drm/nouveau/backlight: Fix LVDS backlight detection on some laptops
drm/nouveau/backlight: Just set all backlight types as RAW
drm/fb-helper: Mark screen buffers in system memory with FBINFO_VIRTFB
brcmfmac: firmware: Allocate space for default boardrev in nvram
brcmfmac: pcie: Release firmwares in the brcmf_pcie_setup error path
brcmfmac: pcie: Declare missing firmware files in pcie.c
brcmfmac: pcie: Replace brcmf_pcie_copy_mem_todev with memcpy_toio
brcmfmac: pcie: Fix crashes due to early IRQs
drm/i915/opregion: check port number bounds for SWSCI display power state
drm/i915/gem: add missing boundary check in vm_access
PCI: imx6: Allow to probe when dw_pcie_wait_for_link() fails
PCI: pciehp: Clear cmd_busy bit in polling mode
PCI: xgene: Revert "PCI: xgene: Fix IB window setup"
regulator: qcom_smd: fix for_each_child.cocci warnings
selinux: access superblock_security_struct in LSM blob way
selinux: check return value of sel_make_avc_files
crypto: ccp - Ensure psp_ret is always init'd in __sev_platform_init_locked()
hwrng: cavium - Check health status while reading random data
hwrng: cavium - HW_RANDOM_CAVIUM should depend on ARCH_THUNDER
crypto: sun8i-ss - really disable hash on A80
crypto: authenc - Fix sleep in atomic context in decrypt_tail
crypto: mxs-dcp - Fix scatterlist processing
selinux: Fix selinux_sb_mnt_opts_compat()
thermal: int340x: Check for NULL after calling kmemdup()
crypto: octeontx2 - remove CONFIG_DM_CRYPT check
spi: tegra114: Add missing IRQ check in tegra_spi_probe
spi: tegra210-quad: Fix missin IRQ check in tegra_qspi_probe
stack: Constrain and fix stack offset randomization with Clang builds
arm64/mm: avoid fixmap race condition when create pud mapping
blk-cgroup: set blkg iostat after percpu stat aggregation
selftests/x86: Add validity check and allow field splitting
selftests/sgx: Treat CC as one argument
crypto: rockchip - ECB does not need IV
audit: log AUDIT_TIME_* records only from rules
EVM: fix the evm= __setup handler return value
crypto: ccree - don't attempt 0 len DMA mappings
crypto: hisilicon/sec - fix the aead software fallback for engine
spi: pxa2xx-pci: Balance reference count for PCI DMA device
hwmon: (pmbus) Add mutex to regulator ops
hwmon: (sch56xx-common) Replace WDOG_ACTIVE with WDOG_HW_RUNNING
nvme: cleanup __nvme_check_ids
nvme: fix the check for duplicate unique identifiers
block: don't delete queue kobject before its children
PM: hibernate: fix __setup handler error handling
PM: suspend: fix return value of __setup handler
spi: spi-zynqmp-gqspi: Handle error for dma_set_mask
hwrng: atmel - disable trng on failure path
crypto: sun8i-ss - call finalize with bh disabled
crypto: sun8i-ce - call finalize with bh disabled
crypto: amlogic - call finalize with bh disabled
crypto: gemini - call finalize with bh disabled
crypto: vmx - add missing dependencies
clocksource/drivers/timer-ti-dm: Fix regression from errata i940 fix
clocksource/drivers/exynos_mct: Refactor resources allocation
clocksource/drivers/exynos_mct: Handle DTS with higher number of interrupts
clocksource/drivers/timer-microchip-pit64b: Use notrace
clocksource/drivers/timer-of: Check return value of of_iomap in timer_of_base_init()
arm64: prevent instrumentation of bp hardening callbacks
KEYS: trusted: Fix trusted key backends when building as module
KEYS: trusted: Avoid calling null function trusted_key_exit
ACPI: APEI: fix return value of __setup handlers
crypto: ccp - ccp_dmaengine_unregister release dma channels
crypto: ccree - Fix use after free in cc_cipher_exit()
hwrng: nomadik - Change clk_disable to clk_disable_unprepare
hwmon: (pmbus) Add Vin unit off handling
clocksource: acpi_pm: fix return value of __setup handler
io_uring: don't check unrelated req->open.how in accept request
io_uring: terminate manual loop iterator loop correctly for non-vecs
watch_queue: Fix NULL dereference in error cleanup
watch_queue: Actually free the watch
f2fs: fix to enable ATGC correctly via gc_idle sysfs interface
sched/debug: Remove mpol_get/put and task_lock/unlock from sched_show_numa
sched/core: Export pelt_thermal_tp
sched/uclamp: Fix iowait boost escaping uclamp restriction
rseq: Remove broken uapi field layout on 32-bit little endian
perf/core: Fix address filter parser for multiple filters
perf/x86/intel/pt: Fix address filter config for 32-bit kernel
sched/fair: Improve consistency of allowed NUMA balance calculations
f2fs: fix missing free nid in f2fs_handle_failed_inode
nfsd: more robust allocation failure handling in nfsd_file_cache_init
sched/cpuacct: Fix charge percpu cpuusage
sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
f2fs: fix to avoid potential deadlock
btrfs: fix unexpected error path when reflinking an inline extent
f2fs: fix compressed file start atomic write may cause data corruption
selftests, x86: fix how check_cc.sh is being invoked
drivers/base/memory: add memory block to memory group after registration succeeded
kunit: make kunit_test_timeout compatible with comment
pinctrl: samsung: Remove EINT handler for Exynos850 ALIVE and CMGP gpios
media: staging: media: zoran: fix usage of vb2_dma_contig_set_max_seg_size
media: camss: csid-170: fix non-10bit formats
media: camss: csid-170: don't enable unused irqs
media: camss: csid-170: set the right HALT_CMD when disabled
media: camss: vfe-170: fix "VFE halt timeout" error
media: staging: media: imx: imx7-mipi-csis: Make subdev name unique
media: v4l2-mem2mem: Apply DST_QUEUE_OFF_BASE on MMAP buffers across ioctls
media: mtk-vcodec: potential dereference of null pointer
media: imx: imx8mq-mipi-csi2: remove wrong irq config write operation
media: imx: imx8mq-mipi_csi2: fix system resume
media: bttv: fix WARNING regression on tunerless devices
media: atmel: atmel-sama7g5-isc: fix ispck leftover
ASoC: sh: rz-ssi: Drop calling rz_ssi_pio_recv() recursively
ASoC: codecs: Check for error pointer after calling devm_regmap_init_mmio
ASoC: xilinx: xlnx_formatter_pcm: Handle sysclk setting
ASoC: simple-card-utils: Set sysclk on all components
media: coda: Fix missing put_device() call in coda_get_vdoa_data
media: meson: vdec: potential dereference of null pointer
media: hantro: Fix overfill bottom register field name
media: ov6650: Fix set format try processing path
media: v4l: Avoid unaligned access warnings when printing 4cc modifiers
media: ov5648: Don't pack controls struct
media: aspeed: Correct value for h-total-pixels
video: fbdev: matroxfb: set maxvram of vbG200eW to the same as vbG200 to avoid black screen
video: fbdev: controlfb: Fix COMPILE_TEST build
video: fbdev: smscufx: Fix null-ptr-deref in ufx_usb_probe()
video: fbdev: atmel_lcdfb: fix an error code in atmel_lcdfb_probe()
video: fbdev: fbcvt.c: fix printing in fb_cvt_print_name()
ARM: dts: Fix OpenBMC flash layout label addresses
firmware: qcom: scm: Remove reassignment to desc following initializer
ARM: dts: qcom: ipq4019: fix sleep clock
soc: qcom: rpmpd: Check for null return of devm_kcalloc
soc: qcom: ocmem: Fix missing put_device() call in of_get_ocmem
soc: qcom: aoss: remove spurious IRQF_ONESHOT flags
arm64: dts: qcom: sdm845: fix microphone bias properties and values
arm64: dts: qcom: sm8250: fix PCIe bindings to follow schema
arm64: dts: broadcom: bcm4908: use proper TWD binding
arm64: dts: qcom: sm8150: Correct TCS configuration for apps rsc
arm64: dts: qcom: sm8350: Correct TCS configuration for apps rsc
firmware: ti_sci: Fix compilation failure when CONFIG_TI_SCI_PROTOCOL is not defined
soc: ti: wkup_m3_ipc: Fix IRQ check in wkup_m3_ipc_probe
ARM: dts: sun8i: v3s: Move the csi1 block to follow address order
vsprintf: Fix potential unaligned access
ARM: dts: imx: Add missing LVDS decoder on M53Menlo
media: mexon-ge2d: fixup frames size in registers
media: video/hdmi: handle short reads of hdmi info frame.
media: ti-vpe: cal: Fix a NULL pointer dereference in cal_ctx_v4l2_init_formats()
media: em28xx: initialize refcount before kref_get
media: usb: go7007: s2250-board: fix leak in probe()
media: cedrus: H265: Fix neighbour info buffer size
media: cedrus: h264: Fix neighbour info buffer size
ASoC: codecs: rx-macro: fix accessing compander for aux
ASoC: codecs: rx-macro: fix accessing array out of bounds for enum type
ASoC: codecs: va-macro: fix accessing array out of bounds for enum type
ASoC: codecs: wc938x: fix accessing array out of bounds for enum type
ASoC: codecs: wcd938x: fix kcontrol max values
ASoC: codecs: wcd934x: fix kcontrol max values
ASoC: codecs: wcd934x: fix return value of wcd934x_rx_hph_mode_put
media: v4l2-core: Initialize h264 scaling matrix
media: ov5640: Fix set format, v4l2_mbus_pixelcode not updated
selftests/lkdtm: Add UBSAN config
lib: uninline simple_strntoull() as well
vsprintf: Fix %pK with kptr_restrict == 0
uaccess: fix nios2 and microblaze get_user_8()
ASoC: rt5663: check the return value of devm_kzalloc() in rt5663_parse_dp()
soc: mediatek: pm-domains: Add wakeup capacity support in power domain
mmc: sdhci_am654: Fix the driver data of AM64 SoC
ASoC: ti: davinci-i2s: Add check for clk_enable()
ALSA: spi: Add check for clk_enable()
arm64: dts: ns2: Fix spi-cpol and spi-cpha property
arm64: dts: broadcom: Fix sata nodename
printk: fix return value of printk.devkmsg __setup handler
ASoC: mxs-saif: Handle errors for clk_enable
ASoC: atmel_ssc_dai: Handle errors for clk_enable
ASoC: dwc-i2s: Handle errors for clk_enable
ASoC: soc-compress: prevent the potentially use of null pointer
memory: emif: Add check for setup_interrupts
memory: emif: check the pointer temp in get_device_details()
ALSA: firewire-lib: fix uninitialized flag for AV/C deferred transaction
arm64: dts: rockchip: Fix SDIO regulator supply properties on rk3399-firefly
m68k: coldfire/device.c: only build for MCF_EDMA when h/w macros are defined
media: stk1160: If start stream fails, return buffers with VB2_BUF_STATE_QUEUED
media: vidtv: Check for null return of vzalloc
ASoC: atmel: Add missing of_node_put() in at91sam9g20ek_audio_probe
ASoC: wm8350: Handle error for wm8350_register_irq
ASoC: fsi: Add check for clk_enable
video: fbdev: omapfb: Add missing of_node_put() in dvic_probe_of
media: saa7134: fix incorrect use to determine if list is empty
ivtv: fix incorrect device_caps for ivtvfb
ASoC: atmel: Fix error handling in snd_proto_probe
ASoC: rockchip: i2s: Fix missing clk_disable_unprepare() in rockchip_i2s_probe
ASoC: SOF: Add missing of_node_put() in imx8m_probe
ASoC: mediatek: use of_device_get_match_data()
ASoC: mediatek: mt8192-mt6359: Fix error handling in mt8192_mt6359_dev_probe
ASoC: rk817: Fix missing clk_disable_unprepare() in rk817_platform_probe
ASoC: dmaengine: do not use a NULL prepare_slave_config() callback
ASoC: mxs: Fix error handling in mxs_sgtl5000_probe
ASoC: fsl_spdif: Disable TX clock when stop
ASoC: imx-es8328: Fix error return code in imx_es8328_probe()
ASoC: SOF: Intel: enable DMI L1 for playback streams
ASoC: msm8916-wcd-digital: Fix missing clk_disable_unprepare() in msm8916_wcd_digital_probe
mmc: davinci_mmc: Handle error for clk_enable
ASoC: atmel: Fix error handling in sam9x5_wm8731_driver_probe
ASoC: msm8916-wcd-analog: Fix error handling in pm8916_wcd_analog_spmi_probe
ASoC: codecs: wcd934x: Add missing of_node_put() in wcd934x_codec_parse_data
ASoC: amd: Fix reference to PCM buffer address
ARM: configs: multi_v5_defconfig: re-enable CONFIG_V4L_PLATFORM_DRIVERS
ARM: configs: multi_v5_defconfig: re-enable DRM_PANEL and FB_xxx
drm/meson: osd_afbcd: Add an exit callback to struct meson_afbcd_ops
drm/meson: Make use of the helper function devm_platform_ioremap_resourcexxx()
drm/meson: split out encoder from meson_dw_hdmi
drm/meson: Fix error handling when afbcd.ops->init fails
drm/bridge: Fix free wrong object in sii8620_init_rcp_input_dev
drm/bridge: Add missing pm_runtime_disable() in __dw_mipi_dsi_probe
drm/bridge: nwl-dsi: Fix PM disable depth imbalance in nwl_dsi_probe
drm: bridge: adv7511: Fix ADV7535 HPD enablement
ath10k: fix memory overwrite of the WoWLAN wakeup packet pattern
drm/v3d/v3d_drv: Check for error num after setting mask
drm/panfrost: Check for error num after setting mask
libbpf: Fix possible NULL pointer dereference when destroying skeleton
bpftool: Only set obj->skeleton on complete success
udmabuf: validate ubuf->pagecount
bpf: Fix UAF due to race between btf_try_get_module and load_module
drm/selftests/test-drm_dp_mst_helper: Fix memory leak in sideband_msg_req_encode_decode
selftests: bpf: Fix bind on used port
Bluetooth: btintel: Fix WBS setting for Intel legacy ROM products
Bluetooth: hci_serdev: call init_rwsem() before p->open()
mtd: onenand: Check for error irq
mtd: rawnand: gpmi: fix controller timings setting
drm/edid: Don't clear formats if using deep color
drm/edid: Split deep color modes between RGB and YUV444
ionic: fix type complaint in ionic_dev_cmd_clean()
ionic: start watchdog after all is setup
ionic: Don't send reset commands if FW isn't running
drm/nouveau/acr: Fix undefined behavior in nvkm_acr_hsfw_load_bl()
drm/amd/display: Fix a NULL pointer dereference in amdgpu_dm_connector_add_common_modes()
drm/amd/pm: return -ENOTSUPP if there is no get_dpm_ultimate_freq function
net: phy: at803x: move page selection fix to config_init
selftests/bpf: Normalize XDP section names in selftests
selftests/bpf/test_xdp_redirect_multi: use temp netns for testing
ath9k_htc: fix uninit value bugs
RDMA/core: Set MR type in ib_reg_user_mr
KVM: PPC: Fix vmx/vsx mixup in mmio emulation
selftests/net: timestamping: Fix bind_phc check
i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
i40e: respect metadata on XSK Rx to skb
igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directly
ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skb
ixgbe: respect metadata on XSK Rx to skb
power: reset: gemini-poweroff: Fix IRQ check in gemini_poweroff_probe
ray_cs: Check ioremap return value
powerpc: dts: t1040rdb: fix ports names for Seville Ethernet switch
KVM: PPC: Book3S HV: Check return value of kvmppc_radix_init
powerpc/perf: Don't use perf_hw_context for trace IMC PMU
mt76: connac: fix sta_rec_wtbl tag len
mt76: mt7915: use proper aid value in mt7915_mcu_wtbl_generic_tlv in sta mode
mt76: mt7915: use proper aid value in mt7915_mcu_sta_basic_tlv
mt76: mt7921: fix a leftover race in runtime-pm
mt76: mt7615: fix a leftover race in runtime-pm
mt76: mt7603: check sta_rates pointer in mt7603_sta_rate_tbl_update
mt76: mt7615: check sta_rates pointer in mt7615_sta_rate_tbl_update
ptp: unregister virtual clocks when unregistering physical clock.
net: dsa: mv88e6xxx: Enable port policy support on 6097
mac80211: Remove a couple of obsolete TODO
mac80211: limit bandwidth in HE capabilities
scripts/dtc: Call pkg-config POSIXly correct
livepatch: Fix build failure on 32 bits processors
net: asix: add proper error handling of usb read errors
i2c: bcm2835: Use platform_get_irq() to get the interrupt
i2c: bcm2835: Fix the error handling in 'bcm2835_i2c_probe()'
mtd: mchp23k256: Add SPI ID table
mtd: mchp48l640: Add SPI ID table
igc: avoid kernel warning when changing RX ring parameters
igb: refactor XDP registration
PCI: aardvark: Fix reading MSI interrupt number
PCI: aardvark: Fix reading PCI_EXP_RTSTA_PME bit on emulated bridge
RDMA/rxe: Check the last packet by RXE_END_MASK
libbpf: Fix signedness bug in btf_dump_array_data()
cxl/core: Fix cxl_probe_component_regs() error message
cxl/regs: Fix size of CXL Capability Header Register
net:enetc: allocate CBD ring data memory using DMA coherent methods
libbpf: Fix compilation warning due to mismatched printf format
drm/bridge: dw-hdmi: use safe format when first in bridge chain
libbpf: Use dynamically allocated buffer when receiving netlink messages
power: supply: ab8500: Fix memory leak in ab8500_fg_sysfs_init
HID: i2c-hid: fix GET/SET_REPORT for unnumbered reports
iommu/ipmmu-vmsa: Check for error num after setting mask
drm/bridge: anx7625: Fix overflow issue on reading EDID
bpftool: Fix the error when lookup in no-btf maps
drm/amd/pm: enable pm sysfs write for one VF mode
drm/amd/display: Add affected crtcs to atomic state for dsc mst unplug
libbpf: Fix memleak in libbpf_netlink_recv()
IB/cma: Allow XRC INI QPs to set their local ACK timeout
dax: make sure inodes are flushed before destroy cache
selftests: mptcp: add csum mib check for mptcp_connect
iwlwifi: mvm: Don't call iwl_mvm_sta_from_mac80211() with NULL sta
iwlwifi: mvm: don't iterate unadded vifs when handling FW SMPS req
iwlwifi: mvm: align locking in D3 test debugfs
iwlwifi: yoyo: remove DBGI_SRAM address reset writing
iwlwifi: Fix -EIO error code that is never returned
iwlwifi: mvm: Fix an error code in iwl_mvm_up()
mtd: rawnand: pl353: Set the nand chip node as the flash node
drm/msm/dp: populate connector of struct dp_panel
drm/msm/dp: stop link training after link training 2 failed
drm/msm/dp: always add fail-safe mode into connector mode list
drm/msm/dsi: Use "ref" fw clock instead of global name for VCO parent
drm/msm/dsi/phy: fix 7nm v4.0 settings for C-PHY mode
drm/msm/dpu: add DSPP blocks teardown
drm/msm/dpu: fix dp audio condition
dm crypt: fix get_key_size compiler warning if !CONFIG_KEYS
vfio/pci: fix memory leak during D3hot to D0 transition
vfio/pci: wake-up devices around reset functions
scsi: fnic: Fix a tracing statement
scsi: pm8001: Fix command initialization in pm80XX_send_read_log()
scsi: pm8001: Fix command initialization in pm8001_chip_ssp_tm_req()
scsi: pm8001: Fix payload initialization in pm80xx_set_thermal_config()
scsi: pm8001: Fix le32 values handling in pm80xx_set_sas_protocol_timer_config()
scsi: pm8001: Fix payload initialization in pm80xx_encrypt_update()
scsi: pm8001: Fix le32 values handling in pm80xx_chip_ssp_io_req()
scsi: pm8001: Fix le32 values handling in pm80xx_chip_sata_req()
scsi: pm8001: Fix NCQ NON DATA command task initialization
scsi: pm8001: Fix NCQ NON DATA command completion handling
scsi: pm8001: Fix abort all task initialization
RDMA/mlx5: Fix the flow of a miss in the allocation of a cache ODP MR
drm/amd/display: Remove vupdate_int_entry definition
TOMOYO: fix __setup handlers return values
power: supply: sbs-charger: Don't cancel work that is not initialized
ext2: correct max file size computing
drm/tegra: Fix reference leak in tegra_dsi_ganged_probe
power: supply: bq24190_charger: Fix bq24190_vbus_is_enabled() wrong false return
scsi: hisi_sas: Change permission of parameter prot_mask
drm/bridge: cdns-dsi: Make sure to to create proper aliases for dt
bpf, arm64: Call build_prologue() first in first JIT pass
bpf, arm64: Feed byte-offset into bpf line info
xsk: Fix race at socket teardown
RDMA/irdma: Fix netdev notifications for vlan's
RDMA/irdma: Fix Passthrough mode in VM
RDMA/irdma: Remove incorrect masking of PD
gpu: host1x: Fix a memory leak in 'host1x_remove()'
libbpf: Skip forward declaration when counting duplicated type names
powerpc/mm/numa: skip NUMA_NO_NODE onlining in parse_numa_properties()
powerpc/Makefile: Don't pass -mcpu=powerpc64 when building 32-bit
KVM: x86: Fix emulation in writing cr8
KVM: x86/emulator: Defer not-present segment check in __load_segment_descriptor()
hv_balloon: rate-limit "Unhandled message" warning
i2c: xiic: Make bus names unique
power: supply: wm8350-power: Handle error for wm8350_register_irq
power: supply: wm8350-power: Add missing free in free_charger_irq
IB/hfi1: Allow larger MTU without AIP
RDMA/core: Fix ib_qp_usecnt_dec() called when error
PCI: Reduce warnings on possible RW1C corruption
net: axienet: fix RX ring refill allocation failure handling
drm/msm/a6xx: Fix missing ARRAY_SIZE() check
mips: DEC: honor CONFIG_MIPS_FP_SUPPORT=n
MIPS: Sanitise Cavium switch cases in TLB handler synthesizers
powerpc/sysdev: fix incorrect use to determine if list is empty
powerpc/64s: Don't use DSISR for SLB faults
mfd: mc13xxx: Add check for mc13xxx_irq_request
libbpf: Unmap rings when umem deleted
selftests/bpf: Make test_lwt_ip_encap more stable and faster
platform/x86: huawei-wmi: check the return value of device_create_file()
scsi: mpt3sas: Fix incorrect 4GB boundary check
powerpc: 8xx: fix a return value error in mpc8xx_pic_init
vxcan: enable local echo for sent CAN frames
ath10k: Fix error handling in ath10k_setup_msa_resources
mips: cdmm: Fix refcount leak in mips_cdmm_phys_base
MIPS: RB532: fix return value of __setup handler
MIPS: pgalloc: fix memory leak caused by pgd_free()
mtd: rawnand: atmel: fix refcount issue in atmel_nand_controller_init
power: ab8500_chargalg: Use CLOCK_MONOTONIC
RDMA/irdma: Prevent some integer underflows
Revert "RDMA/core: Fix ib_qp_usecnt_dec() called when error"
RDMA/mlx5: Fix memory leak in error flow for subscribe event routine
bpf, sockmap: Fix memleak in sk_psock_queue_msg
bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
bpf, sockmap: Fix more uncharged while msg has more_data
bpf, sockmap: Fix double uncharge the mem of sk_msg
samples/bpf, xdpsock: Fix race when running for fix duration of time
USB: storage: ums-realtek: fix error code in rts51x_read_mem()
drm/i915/display: Fix HPD short pulse handling for eDP
netfilter: flowtable: Fix QinQ and pppoe support for inet table
mt76: mt7921: fix mt7921_queues_acq implementation
can: isotp: sanitize CAN ID checks in isotp_bind()
can: isotp: return -EADDRNOTAVAIL when reading from unbound socket
can: isotp: support MSG_TRUNC flag when reading from socket
bareudp: use ipv6_mod_enabled to check if IPv6 enabled
ibmvnic: fix race between xmit and reset
af_unix: Fix some data-races around unix_sk(sk)->oob_skb.
selftests/bpf: Fix error reporting from sock_fields programs
Bluetooth: hci_uart: add missing NULL check in h5_enqueue
Bluetooth: call hci_le_conn_failed with hdev lock in hci_le_conn_failed
Bluetooth: btmtksdio: Fix kernel oops in btmtksdio_interrupt
ipv4: Fix route lookups when handling ICMP redirects and PMTU updates
af_netlink: Fix shift out of bounds in group mask calculation
i2c: meson: Fix wrong speed use from probe
netfilter: conntrack: Add and use nf_ct_set_auto_assign_helper_warned()
i2c: mux: demux-pinctrl: do not deactivate a master that is not active
powerpc/pseries: Fix use after free in remove_phb_dynamic()
selftests/bpf/test_lirc_mode2.sh: Exit with proper code
PCI: Avoid broken MSI on SB600 USB devices
net: bcmgenet: Use stronger register read/writes to assure ordering
tcp: ensure PMTU updates are processed during fastopen
openvswitch: always update flow key after nat
net: dsa: fix panic on shutdown if multi-chip tree failed to probe
tipc: fix the timer expires after interval 100ms
mfd: asic3: Add missing iounmap() on error asic3_mfd_probe
ice: fix 'scheduling while atomic' on aux critical err interrupt
ice: don't allow to run ice_send_event_to_aux() in atomic ctx
drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool
kernel/resource: fix kfree() of bootmem memory again
staging: r8188eu: convert DBG_88E_LEVEL call in hal/rtl8188e_hal_init.c
staging: r8188eu: release_firmware is not called if allocation fails
mxser: fix xmit_buf leak in activate when LSR == 0xff
fsi: scom: Fix error handling
fsi: scom: Remove retries in indirect scoms
pwm: lpc18xx-sct: Initialize driver data and hardware before pwmchip_add()
pps: clients: gpio: Propagate return value from pps_gpio_probe
fsi: Aspeed: Fix a potential double free
misc: alcor_pci: Fix an error handling path
cpufreq: qcom-cpufreq-nvmem: fix reading of PVS Valid fuse
soundwire: intel: fix wrong register name in intel_shim_wake
clk: qcom: ipq8074: fix PCI-E clock oops
dmaengine: idxd: check GENCAP config support for gencfg register
dmaengine: idxd: change bandwidth token to read buffers
dmaengine: idxd: restore traffic class defaults after wq reset
iio: mma8452: Fix probe failing when an i2c_device_id is used
serial: 8250_aspeed_vuart: add PORT_ASPEED_VUART port type
staging:iio:adc:ad7280a: Fix handing of device address bit reversing.
pinctrl: renesas: r8a77470: Reduce size for narrow VIN1 channel
pinctrl: renesas: checker: Fix miscalculation of number of states
clk: qcom: ipq8074: Use floor ops for SDCC1 clock
phy: dphy: Correct lpx parameter and its derivatives(ta_{get,go,sure})
phy: phy-brcm-usb: fixup BCM4908 support
serial: 8250_mid: Balance reference count for PCI DMA device
serial: 8250_lpss: Balance reference count for PCI DMA device
NFS: Use of mapping_set_error() results in spurious errors
serial: 8250: Fix race condition in RTS-after-send handling
iio: adc: Add check for devm_request_threaded_irq
habanalabs: Add check for pci_enable_device
NFS: Return valid errors from nfs2/3_decode_dirent()
staging: r8188eu: fix endless loop in recv_func
dma-debug: fix return value of __setup handlers
clk: imx7d: Remove audio_mclk_root_clk
clk: imx: off by one in imx_lpcg_parse_clks_from_dt()
clk: at91: sama7g5: fix parents of PDMCs' GCLK
clk: qcom: clk-rcg2: Update logic to calculate D value for RCG
clk: qcom: clk-rcg2: Update the frac table for pixel clock
dmaengine: hisi_dma: fix MSI allocate fail when reload hisi_dma
remoteproc: qcom: Fix missing of_node_put in adsp_alloc_memory_region
remoteproc: qcom_wcnss: Add missing of_node_put() in wcnss_alloc_memory_region
remoteproc: qcom_q6v5_mss: Fix some leaks in q6v5_alloc_memory_region
nvdimm/region: Fix default alignment for small regions
clk: actions: Terminate clk_div_table with sentinel element
clk: loongson1: Terminate clk_div_table with sentinel element
clk: hisilicon: Terminate clk_div_table with sentinel element
clk: clps711x: Terminate clk_div_table with sentinel element
clk: Fix clk_hw_get_clk() when dev is NULL
clk: tegra: tegra124-emc: Fix missing put_device() call in emc_ensure_emc_driver
mailbox: imx: fix crash in resume on i.mx8ulp
NFS: remove unneeded check in decode_devicenotify_args()
staging: mt7621-dts: fix LEDs and pinctrl on GB-PC1 devicetree
staging: mt7621-dts: fix formatting
staging: mt7621-dts: fix pinctrl properties for ethernet
staging: mt7621-dts: fix GB-PC2 devicetree
pinctrl: mediatek: Fix missing of_node_put() in mtk_pctrl_init
pinctrl: mediatek: paris: Fix PIN_CONFIG_BIAS_* readback
pinctrl: mediatek: paris: Fix "argument" argument type for mtk_pinconf_get()
pinctrl: mediatek: paris: Fix pingroup pin config state readback
pinctrl: mediatek: paris: Skip custom extra pin config dump for virtual GPIOs
pinctrl: microchip sgpio: use reset driver
pinctrl: microchip-sgpio: lock RMW access
pinctrl: nomadik: Add missing of_node_put() in nmk_pinctrl_probe
pinctrl/rockchip: Add missing of_node_put() in rockchip_pinctrl_probe
tty: hvc: fix return value of __setup handler
kgdboc: fix return value of __setup handler
serial: 8250: fix XOFF/XON sending when DMA is used
virt: acrn: obtain pa from VMA with PFNMAP flag
virt: acrn: fix a memory leak in acrn_dev_ioctl()
kgdbts: fix return value of __setup handler
firmware: google: Properly state IOMEM dependency
driver core: dd: fix return value of __setup handler
jfs: fix divide error in dbNextAG
netfilter: nf_conntrack_tcp: preserve liberal flag in tcp options
SUNRPC don't resend a task on an offlined transport
NFSv4.1: don't retry BIND_CONN_TO_SESSION on session error
kdb: Fix the putarea helper function
perf stat: Fix forked applications enablement of counters
clk: qcom: gcc-msm8994: Fix gpll4 width
vsock/virtio: initialize vdev->priv before using VQs
vsock/virtio: read the negotiated features before using VQs
vsock/virtio: enable VQs early on probe
clk: Initialize orphan req_rate
xen: fix is_xen_pmu()
net: enetc: report software timestamping via SO_TIMESTAMPING
net: hns3: fix bug when PF set the duplicate MAC address for VFs
net: hns3: fix port base vlan add fail when concurrent with reset
net: hns3: add vlan list lock to protect vlan list
net: hns3: format the output of the MAC address
net: hns3: refine the process when PF set VF VLAN
net: phy: broadcom: Fix brcm_fet_config_init()
selftests: test_vxlan_under_vrf: Fix broken test case
NFS: Don't loop forever in nfs_do_recoalesce()
net: hns3: clean residual vf config after disable sriov
net: sparx5: depends on PTP_1588_CLOCK_OPTIONAL
qlcnic: dcb: default to returning -EOPNOTSUPP
net/x25: Fix null-ptr-deref caused by x25_disconnect
net: sparx5: switchdev: fix possible NULL pointer dereference
octeontx2-af: initialize action variable
net: prefer nf_ct_put instead of nf_conntrack_put
net/sched: act_ct: fix ref leak when switching zones
NFSv4/pNFS: Fix another issue with a list iterator pointing to the head
net: dsa: bcm_sf2_cfp: fix an incorrect NULL check on list iterator
fs: fd tables have to be multiples of BITS_PER_LONG
lib/test: use after free in register_test_dev_kmod()
fs: fix fd table size alignment properly
LSM: general protection fault in legacy_parse_param
regulator: rpi-panel: Handle I2C errors/timing to the Atmel
crypto: hisilicon/qm - cleanup warning in qm_vf_read_qos
gcc-plugins/stackleak: Exactly match strings instead of prefixes
pinctrl: npcm: Fix broken references to chip->parent_device
rcu: Mark writes to the rcu_segcblist structure's ->flags field
block/bfq_wf2q: correct weight to ioprio
crypto: xts - Add softdep on ecb
crypto: hisilicon/sec - not need to enable sm4 extra mode at HW V3
block, bfq: don't move oom_bfqq
selinux: use correct type for context length
arm64: module: remove (NOLOAD) from linker script
selinux: allow FIOCLEX and FIONCLEX with policy capability
loop: use sysfs_emit() in the sysfs xxx show()
Fix incorrect type in assignment of ipv6 port for audit
irqchip/qcom-pdc: Fix broken locking
irqchip/nvic: Release nvic_base upon failure
fs/binfmt_elf: Fix AT_PHDR for unusual ELF files
bfq: fix use-after-free in bfq_dispatch_request
ACPICA: Avoid walking the ACPI Namespace if it is not there
lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3
Revert "Revert "block, bfq: honor already-setup queue merges""
ACPI/APEI: Limit printable size of BERT table data
PM: core: keep irq flags in device_pm_check_callbacks()
parisc: Fix handling off probe non-access faults
nvme-tcp: lockdep: annotate in-kernel sockets
spi: tegra20: Use of_device_get_match_data()
atomics: Fix atomic64_{read_acquire,set_release} fallbacks
locking/lockdep: Iterate lock_classes directly when reading lockdep files
ext4: correct cluster len and clusters changed accounting in ext4_mb_mark_bb
ext4: fix ext4_mb_mark_bb() with flex_bg with fast_commit
sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE
ext4: don't BUG if someone dirty pages without asking ext4 first
f2fs: fix to do sanity check on curseg->alloc_type
NFSD: Fix nfsd_breaker_owns_lease() return values
f2fs: don't get FREEZE lock in f2fs_evict_inode in frozen fs
btrfs: harden identification of a stale device
btrfs: make search_csum_tree return 0 if we get -EFBIG
f2fs: use spin_lock to avoid hang
f2fs: compress: fix to print raw data size in error path of lz4 decompression
Adjust cifssb maximum read size
ntfs: add sanity check on allocation size
media: staging: media: zoran: move videodev alloc
media: staging: media: zoran: calculate the right buffer number for zoran_reap_stat_com
media: staging: media: zoran: fix various V4L2 compliance errors
media: atmel: atmel-isc-base: report frame sizes as full supported range
media: ir_toy: free before error exiting
ASoC: sh: rz-ssi: Make the data structures available before registering the handlers
ASoC: SOF: Intel: match sdw version on link_slaves_found
media: imx-jpeg: Prevent decoding NV12M jpegs into single-planar buffers
media: iommu/mediatek-v1: Free the existed fwspec if the master dev already has
media: iommu/mediatek: Return ENODEV if the device is NULL
media: iommu/mediatek: Add device_link between the consumer and the larb devices
video: fbdev: nvidiafb: Use strscpy() to prevent buffer overflow
video: fbdev: w100fb: Reset global state
video: fbdev: cirrusfb: check pixclock to avoid divide by zero
video: fbdev: omapfb: acx565akm: replace snprintf with sysfs_emit
ARM: dts: qcom: fix gic_irq_domain_translate warnings for msm8960
ARM: dts: bcm2837: Add the missing L1/L2 cache information
ASoC: madera: Add dependencies on MFD
media: atomisp_gmin_platform: Add DMI quirk to not turn AXP ELDO2 regulator off on some boards
media: atomisp: fix dummy_ptr check to avoid duplicate active_bo
ARM: ftrace: avoid redundant loads or clobbering IP
ARM: dts: imx7: Use audio_mclk_post_div instead audio_mclk_root_clk
arm64: defconfig: build imx-sdma as a module
video: fbdev: omapfb: panel-dsi-cm: Use sysfs_emit() instead of snprintf()
video: fbdev: omapfb: panel-tpo-td043mtea1: Use sysfs_emit() instead of snprintf()
video: fbdev: udlfb: replace snprintf in show functions with sysfs_emit
ARM: dts: bcm2711: Add the missing L1/L2 cache information
ASoC: soc-core: skip zero num_dai component in searching dai name
media: imx-jpeg: fix a bug of accessing array out of bounds
media: cx88-mpeg: clear interrupt status register before streaming video
uaccess: fix type mismatch warnings from access_ok()
lib/test_lockup: fix kernel pointer check for separate address spaces
ARM: tegra: tamonten: Fix I2C3 pad setting
ARM: mmp: Fix failure to remove sram device
ASoC: amd: vg: fix for pm resume callback sequence
video: fbdev: sm712fb: Fix crash in smtcfb_write()
media: i2c: ov5648: Fix lockdep error
media: Revert "media: em28xx: add missing em28xx_close_extension"
media: hdpvr: initialize dev->worker at hdpvr_register_videodev
ASoC: Intel: sof_sdw: fix quirks for 2022 HP Spectre x360 13"
tracing: Have TRACE_DEFINE_ENUM affect trace event types as well
mmc: host: Return an error when ->enable_sdio_irq() ops is missing
media: atomisp: fix bad usage at error handling logic
ALSA: hda/realtek: Add alc256-samsung-headphone fixup
KVM: x86: Reinitialize context if host userspace toggles EFER.LME
KVM: x86/mmu: Move "invalid" check out of kvm_tdp_mmu_get_root()
KVM: x86/mmu: Zap _all_ roots when unmapping gfn range in TDP MMU
KVM: x86/mmu: Check for present SPTE when clearing dirty bit in TDP MMU
KVM: x86: hyper-v: Drop redundant 'ex' parameter from kvm_hv_send_ipi()
KVM: x86: hyper-v: Drop redundant 'ex' parameter from kvm_hv_flush_tlb()
KVM: x86: hyper-v: Fix the maximum number of sparse banks for XMM fast TLB flush hypercalls
KVM: x86: hyper-v: HVCALL_SEND_IPI_EX is an XMM fast hypercall
powerpc/kasan: Fix early region not updated correctly
powerpc/lib/sstep: Fix 'sthcx' instruction
powerpc/lib/sstep: Fix build errors with newer binutils
powerpc: Add set_memory_{p/np}() and remove set_memory_attr()
powerpc: Fix build errors with newer binutils
drm/dp: Fix off-by-one in register cache size
drm/i915: Treat SAGV block time 0 as SAGV disabled
drm/i915: Fix PSF GV point mask when SAGV is not possible
drm/i915: Reject unsupported TMDS rates on ICL+
scsi: qla2xxx: Refactor asynchronous command initialization
scsi: qla2xxx: Implement ref count for SRB
scsi: qla2xxx: Fix stuck session in gpdb
scsi: qla2xxx: Fix warning message due to adisc being flushed
scsi: qla2xxx: Fix scheduling while atomic
scsi: qla2xxx: Fix premature hw access after PCI error
scsi: qla2xxx: Fix wrong FDMI data for 64G adapter
scsi: qla2xxx: Fix warning for missing error code
scsi: qla2xxx: Fix device reconnect in loop topology
scsi: qla2xxx: edif: Fix clang warning
scsi: qla2xxx: Fix T10 PI tag escape and IP guard options for 28XX adapters
scsi: qla2xxx: Add devids and conditionals for 28xx
scsi: qla2xxx: Check for firmware dump already collected
scsi: qla2xxx: Suppress a kernel complaint in qla_create_qpair()
scsi: qla2xxx: Fix disk failure to rediscover
scsi: qla2xxx: Fix incorrect reporting of task management failure
scsi: qla2xxx: Fix hang due to session stuck
scsi: qla2xxx: Fix missed DMA unmap for NVMe ls requests
scsi: qla2xxx: Fix N2N inconsistent PLOGI
scsi: qla2xxx: Fix stuck session of PRLI reject
scsi: qla2xxx: Reduce false trigger to login
scsi: qla2xxx: Use correct feature type field during RFF_ID processing
platform: chrome: Split trace include file
KVM: x86: Check lapic_in_kernel() before attempting to set a SynIC irq
KVM: x86: Avoid theoretical NULL pointer dereference in kvm_irq_delivery_to_apic_fast()
KVM: x86: Forbid VMM to set SYNIC/STIMER MSRs when SynIC wasn't activated
KVM: Prevent module exit until all VMs are freed
KVM: x86: fix sending PV IPI
KVM: SVM: fix panic on out-of-bounds guest IRQ
ubifs: rename_whiteout: Fix double free for whiteout_ui->data
ubifs: Fix deadlock in concurrent rename whiteout and inode writeback
ubifs: Add missing iput if do_tmpfile() failed in rename whiteout
ubifs: Rename whiteout atomically
ubifs: Fix 'ui->dirty' race between do_tmpfile() and writeback work
ubifs: Rectify space amount budget for mkdir/tmpfile operations
ubifs: setflags: Make dirtied_ino_d 8 bytes aligned
ubifs: Fix read out-of-bounds in ubifs_wbuf_write_nolock()
ubifs: Fix to add refcount once page is set private
ubifs: rename_whiteout: correct old_dir size computing
nvme: allow duplicate NSIDs for private namespaces
nvme: fix the read-only state for zoned namespaces with unsupposed features
wireguard: queueing: use CFI-safe ptr_ring cleanup function
wireguard: socket: free skb in send6 when ipv6 is disabled
wireguard: socket: ignore v6 endpoints when ipv6 is disabled
XArray: Fix xas_create_range() when multi-order entry present
can: mcba_usb: mcba_usb_start_xmit(): fix double dev_kfree_skb in error path
can: mcba_usb: properly check endpoint type
can: mcp251xfd: mcp251xfd_register_get_dev_id(): fix return of error value
XArray: Update the LRU list in xas_split()
modpost: restore the warning message for missing symbol versions
rtc: check if __rtc_read_time was successful
gfs2: gfs2_setattr_size error path fix
gfs2: Make sure FITRIM minlen is rounded up to fs block size
net: hns3: fix the concurrency between functions reading debugfs
net: hns3: fix software vlan talbe of vlan 0 inconsistent with hardware
rxrpc: fix some null-ptr-deref bugs in server_key.c
rxrpc: Fix call timer start racing with call destruction
mailbox: imx: fix wakeup failure from freeze mode
crypto: arm/aes-neonbs-cbc - Select generic cbc and aes
watch_queue: Free the page array when watch_queue is dismantled
pinctrl: pinconf-generic: Print arguments for bias-pull-*
watchdog: rti-wdt: Add missing pm_runtime_disable() in probe function
net: sparx5: uses, depends on BRIDGE or !BRIDGE
pinctrl: nuvoton: npcm7xx: Rename DS() macro to DSTR()
pinctrl: nuvoton: npcm7xx: Use %zu printk format for ARRAY_SIZE()
ASoC: mediatek: mt6358: add missing EXPORT_SYMBOLs
ubi: Fix race condition between ctrl_cdev_ioctl and ubi_cdev_ioctl
ARM: iop32x: offset IRQ numbers by 1
block: Fix the maximum minor value is blk_alloc_ext_minor()
io_uring: fix memory leak of uid in files registration
riscv module: remove (NOLOAD)
ACPI: CPPC: Avoid out of bounds access when parsing _CPC data
vhost: handle error while adding split ranges to iotlb
spi: Fix Tegra QSPI example
platform/chrome: cros_ec_typec: Check for EC device
can: isotp: restore accidentally removed MSG_PEEK feature
proc: bootconfig: Add null pointer check
drm/connector: Fix typo in documentation
scsi: qla2xxx: Add qla2x00_async_done() for async routines
staging: mt7621-dts: fix pinctrl-0 items to be size-1 items on ethernet
arm64: mm: Drop 'const' from conditional arm64_dma_phys_limit definition
ASoC: soc-compress: Change the check for codec_dai
Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""
tracing: Have type enum modifications copy the strings
net: add skb_set_end_offset() helper
net: preserve skb_end_offset() in skb_unclone_keeptruesize()
mm/mmap: return 1 from stack_guard_gap __setup() handler
ARM: 9187/1: JIVE: fix return value of __setup handler
mm/memcontrol: return 1 from cgroup.memory __setup() handler
mm/usercopy: return 1 from hardened_usercopy __setup() handler
af_unix: Support POLLPRI for OOB.
bpf: Adjust BPF stack helper functions to accommodate skip > 0
bpf: Fix comment for helper bpf_current_task_under_cgroup()
mmc: rtsx: Use pm_runtime_{get,put}() to handle runtime PM
dt-bindings: mtd: nand-controller: Fix the reg property description
dt-bindings: mtd: nand-controller: Fix a comment in the examples
dt-bindings: spi: mxic: The interrupt property is not mandatory
dt-bindings: memory: mtk-smi: No need mediatek,larb-id for mt8167
dt-bindings: pinctrl: pinctrl-microchip-sgpio: Fix example
ubi: fastmap: Return error code if memory allocation fails in add_aeb()
ASoC: SOF: Intel: Fix build error without SND_SOC_SOF_PCI_DEV
ASoC: topology: Allow TLV control to be either read or write
perf vendor events: Update metrics for SkyLake Server
media: ov6650: Add try support to selection API operations
media: ov6650: Fix crop rectangle affected by set format
spi: mediatek: support tick_delay without enhance_timing
ARM: dts: spear1340: Update serial node properties
ARM: dts: spear13xx: Update SPI dma properties
arm64: dts: ls1043a: Update i2c dma properties
arm64: dts: ls1046a: Update i2c node dma properties
um: Fix uml_mconsole stop/go
docs: sysctl/kernel: add missing bit to panic_print
openvswitch: Fixed nd target mask field in the flow dump.
torture: Make torture.sh help message match reality
n64cart: convert bi_disk to bi_bdev->bd_disk fix build
mmc: rtsx: Let MMC core handle runtime PM
mmc: rtsx: Fix build errors/warnings for unused variable
KVM: x86/mmu: do compare-and-exchange of gPTE via the user address
iommu/dma: Skip extra sync during unmap w/swiotlb
iommu/dma: Fold _swiotlb helpers into callers
iommu/dma: Check CONFIG_SWIOTLB more broadly
swiotlb: Support aligned swiotlb buffers
iommu/dma: Account for min_align_mask w/swiotlb
coredump: Snapshot the vmas in do_coredump
coredump: Remove the WARN_ON in dump_vma_snapshot
coredump/elf: Pass coredump_params into fill_note_info
coredump: Use the vma snapshot in fill_files_note
PCI: xgene: Revert "PCI: xgene: Use inbound resources for setup"
Linux 5.15.33
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id62bd8a22d0bfa7c2096539d253ffce804bed017
commit 60021bd754c6ca0addc6817994f20290a321d8d6 upstream.
A subvolume with an active swapfile must not be deleted otherwise it
would not be possible to deactivate it.
After the subvolume is deleted, we cannot swapoff the swapfile in this
deleted subvolume because the path is unreachable. The swapfile is
still active and holding references, the filesystem cannot be unmounted.
The test looks like this:
mkfs.btrfs -f $dev > /dev/null
mount $dev $mnt
btrfs sub create $mnt/subvol
touch $mnt/subvol/swapfile
chmod 600 $mnt/subvol/swapfile
chattr +C $mnt/subvol/swapfile
dd if=/dev/zero of=$mnt/subvol/swapfile bs=1K count=4096
mkswap $mnt/subvol/swapfile
swapon $mnt/subvol/swapfile
btrfs sub delete $mnt/subvol
swapoff $mnt/subvol/swapfile # failed: No such file or directory
swapoff --all
unmount $mnt # target is busy.
To prevent above issue, we simply check that whether the subvolume
contains any active swapfile, and stop the deleting process. This
behavior is like snapshot ioctl dealing with a swapfile.
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Kaiwen Hu <kevinhu@synology.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b642b52d0b50f4d398cb4293f64992d0eed2e2ce upstream.
We use extent_changeset->bytes_changed in qgroup_reserve_data() to record
how many bytes we set for EXTENT_QGROUP_RESERVED state. Currently the
bytes_changed is set as "unsigned int", and it will overflow if we try to
fallocate a range larger than 4GiB. The result is we reserve less bytes
and eventually break the qgroup limit.
Unlike regular buffered/direct write, which we use one changeset for
each ordered extent, which can never be larger than 256M. For
fallocate, we use one changeset for the whole range, thus it no longer
respects the 256M per extent limit, and caused the problem.
The following example test script reproduces the problem:
$ cat qgroup-overflow.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount $DEV $MNT
# Set qgroup limit to 2GiB.
btrfs quota enable $MNT
btrfs qgroup limit 2G $MNT
# Try to fallocate a 3GiB file. This should fail.
echo
echo "Try to fallocate a 3GiB file..."
fallocate -l 3G $MNT/3G.file
# Try to fallocate a 5GiB file.
echo
echo "Try to fallocate a 5GiB file..."
fallocate -l 5G $MNT/5G.file
# See we break the qgroup limit.
echo
sync
btrfs qgroup show -r $MNT
umount $MNT
When running the test:
$ ./qgroup-overflow.sh
(...)
Try to fallocate a 3GiB file...
fallocate: fallocate failed: Disk quota exceeded
Try to fallocate a 5GiB file...
qgroupid rfer excl max_rfer
-------- ---- ---- --------
0/5 5.00GiB 5.00GiB 2.00GiB
Since we have no control of how bytes_changed is used, it's better to
set it to u64.
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Ethan Lien <ethanlien@synology.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e677edbcabee849bfdd43f1602bccbecf736a646 upstream.
io_flush_timeouts() assumes the timeout isn't in progress of triggering
or being removed/canceled, so it unconditionally removes it from the
timeout list and attempts to cancel it.
Leave it on the list and let the normal timeout cancelation take care
of it.
Cc: stable@vger.kernel.org # 5.5+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f5e4b83b37a96e3643951588ed7176b9b187c0a upstream.
Similarly to the way it is done im mbind syscall.
Cc: stable@vger.kernel.org # 5.14
Fixes: fe76421d1d ("io_uring: allow user configurable IO thread CPU affinity")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3e4bc23d5470b2beb7cc42a86b6a3e75b704c15 upstream.
In preparation for not using the file at prep time, defer checking if this
file refers to a valid io_uring instance until issue time.
This also means we can get rid of the cleanup flag for splice and tee.
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ec858afda857e361182ceafc3d2ba2b164b8e889 upstream.
This is a leftover from the really old days where we weren't able to
track and error early if we need a file and it wasn't assigned. Kill
the check.
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a07211e3001435fe8591b992464cd8d5e3c98c5a ]
It's safer to not touch scm_fp_list after we queued an skb to which it
was assigned, there might be races lurking if we screw subtle sync
guarantees on the io_uring side.
Fixes: 6b06314c47 ("io_uring: add file set registration")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 34bb77184123ae401100a4d156584f12fa630e5c ]
Don't forget to array_index_nospec() for indexes before updating rsrc
tags in __io_sqe_files_update(), just use already safe and precalculated
index @i.
Fixes: c3bdad0271 ("io_uring: add generic rsrc update with tags")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0bae835b63c53f86cdc524f5962e39409585b22c ]
In a low memory situation, allow the NFS writeback code to fail without
getting stuck in infinite loops in mempool_alloc().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 515dcdcd48736576c6f5c197814da6f81c60a21e ]
The concern is that since nfsiod is sometimes required to kick off a
commit, it can get locked up waiting forever in mempool_alloc() instead
of failing gracefully and leaving the commit until later.
Try to allocate from the slab first, with GFP_KERNEL | __GFP_NORETRY,
then fall back to a non-blocking attempt to allocate from the memory
pool.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a53046291020ec41e09181396c1e829287b48d47 ]
Add validation check for JFS_IP(ipimap)->i_imap to prevent a NULL deref
in diFree since diFree uses it without do any validations.
When function jfs_mount calls diMount to initialize fileset inode
allocation map, it can fail and JFS_IP(ipimap)->i_imap won't be
initialized. Then it calls diFreeSpecial to close fileset inode allocation
map inode and it will flow into jfs_evict_inode. Function jfs_evict_inode
just validates JFS_SBI(inode->i_sb)->ipimap, then calls diFree. diFree use
JFS_IP(ipimap)->i_imap directly, then it will cause a NULL deref.
Reported-by: TCS Robot <tcs_robot@tencent.com>
Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c265de257f558a05c1859ee9e3fed04883b9ec0e ]
The commit handling code is not safe against memory-pressure deadlocks
when writing to swap. In particular, nfs_commitdata_alloc() blocks
indefinitely waiting for memory, and this can consume all available
workqueue threads.
swap-out most likely uses STABLE writes anyway as COND_STABLE indicates
that a stable write should be used if the write fits in a single
request, and it normally does. However if we ever swap with a small
wsize, or gather unusually large numbers of pages for a single write,
this might change.
For safety, make it explicit in the code that direct writes used for swap
must always use FLUSH_STABLE.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64158668ac8b31626a8ce48db4cad08496eb8340 ]
1/ Taking the i_rwsem for swap IO triggers lockdep warnings regarding
possible deadlocks with "fs_reclaim". These deadlocks could, I believe,
eventuate if a buffered read on the swapfile was attempted.
We don't need coherence with the page cache for a swap file, and
buffered writes are forbidden anyway. There is no other need for
i_rwsem during direct IO. So never take it for swap_rw()
2/ generic_write_checks() explicitly forbids writes to swap, and
performs checks that are not needed for swap. So bypass it
for swap_rw().
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3e17898aca293a24dae757a440a50aa63ca29671 ]
If memory allocation triggers a direct reclaim from the state recovery
thread, then we can deadlock. Use memalloc_nofs_save/restore to ensure
that doesn't happen.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7f114edd54326f730a754547e7cfb197b5bc132 ]
[You don't often get email from xiongx18@fudan.edu.cn. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.]
The reference counting issue happens in two error paths in the
function _nfs42_proc_copy_notify(). In both error paths, the function
simply returns the error code and forgets to balance the refcount of
object `ctx`, bumped by get_nfs_open_context() earlier, which may
cause refcount leaks.
Fix it by balancing refcount of the `ctx` object before the function
returns in both error paths.
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ce3c0d26c42d279b6c378a03cd6a61d828f19ca ]
Testcase:
1. create a minix file system and mount it
2. open a file on the file system with O_RDWR|O_CREAT|O_TRUNC|O_DIRECT
3. open fails with -EINVAL but leaves an empty file behind. All other
open() failures don't leave the failed open files behind.
It is hard to check the direct_IO op before creating the inode. Just as
ext4 and btrfs do, this patch will resolve the issue by allowing to
create the file with O_DIRECT but returning error when writing the file.
Link: https://lkml.kernel.org/r/20220107133626.413379-1-qhjin.dev@gmail.com
Signed-off-by: Qinghua Jin <qhjin.dev@gmail.com>
Reported-by: Colin Ian King <colin.king@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f639d9867eea647005dc824e0e24f39ffc50d4e4 ]
Reset the last_readdir at the same time, and add a comment explaining
why we don't free last_readdir when dir_emit returns false.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 322794d3355c33adcc4feace0045d85a8e4ed813 ]
The ceph_get_inode() will search for or insert a new inode into the
hash for the given vino, and return a reference to it. If new is
non-NULL, its reference is consumed.
We should release the reference when in error handing cases.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 390031c942116d4733310f0684beb8db19885fe6 upstream.
Matthew Wilcox reported that there is a missing mmap_lock in
file_files_note that could possibly lead to a user after free.
Solve this by using the existing vma snapshot for consistency
and to avoid the need to take the mmap_lock anywhere in the
coredump code except for dump_vma_snapshot.
Update the dump_vma_snapshot to capture vm_pgoff and vm_file
that are neeeded by fill_files_note.
Add free_vma_snapshot to free the captured values of vm_file.
Reported-by: Matthew Wilcox <willy@infradead.org>
Link: https://lkml.kernel.org/r/20220131153740.2396974-1-willy@infradead.org
Cc: stable@vger.kernel.org
Fixes: a07279c9a8 ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot")
Fixes: 2aa362c49c ("coredump: extend core dump note section to contain file names of mapped files")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9ec7d3230717b4fe9b6c7afeb4811909c23fa1d7 upstream.
Instead of individually passing cprm->siginfo and cprm->regs
into fill_note_info pass all of struct coredump_params.
This is preparation to allow fill_files_note to use the existing
vma snapshot.
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49c1866348f364478a0c4d3dd13fd08bb82d3a5b upstream.
The condition is impossible and to the best of my knowledge has never
triggered.
We are in deep trouble if that conditions happens and we walk past
the end of our allocated array.
So delete the WARN_ON and the code that makes it look like the kernel
can handle the case of walking past the end of it's vma_meta array.
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95c5436a4883841588dae86fb0b9325f47ba5ad3 upstream.
Move the call of dump_vma_snapshot and kvfree(vma_meta) out of the
individual coredump routines into do_coredump itself. This makes
the code less error prone and easier to maintain.
Make the vma snapshot available to the coredump routines
in struct coredump_params. This makes it easier to
change and update what is captures in the vma snapshot
and will be needed for fixing fill_file_notes.
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bed5b60bf67ccd8957b8c0558fead30c4a3f5d3f upstream.
kzalloc is a memory allocation function which can return NULL when some
internal memory errors happen. It is safer to add null pointer check.
Link: https://lkml.kernel.org/r/20220329104004.2376879-1-lv.ruyi@zte.com.cn
Cc: stable@vger.kernel.org
Fixes: c1a3c36017 ("proc: bootconfig: Add /proc/bootconfig to show boot config list")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c86d18f4aa93e0e66cda0e55827cd03eea6bc5f8 upstream.
When there are no files for __io_sqe_files_scm() to process in the
range, it'll free everything and return. However, it forgets to put uid.
Fixes: 08a451739a ("io_uring: allow sparse fixed file sets")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/accee442376f33ce8aaebb099d04967533efde92.1648226048.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27ca8273fda398638ca994a207323a85b6d81190 upstream.
Per fstrim(8) we must round up the minlen argument to the fs block size.
The current calculation doesn't take into account devices that have a
discard granularity and requested minlen less than 1 fs block, so the
value can get shifted away to zero in the translation to fs blocks.
The zero minlen passed to gfs2_rgrp_send_discards() then allows
sb_issue_discard() to be called with nr_sects == 0 which returns -EINVAL
and results in gfs2_rgrp_send_discards() returning -EIO.
Make sure minlen is never < 1 fs block by taking the max of the
requested minlen and the fs block size before comparing to the device's
discard granularity and shifting to fs blocks.
Fixes: 076f0faa76 ("GFS2: Fix FITRIM argument handling")
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7336905a89f19173bf9301cd50a24421162f417c upstream.
When gfs2_setattr_size() fails, it calls gfs2_rs_delete(ip, NULL) to get
rid of any reservations the inode may have. Instead, it should pass in
the inode's write count as the second parameter to allow
gfs2_rs_delete() to figure out if the inode has any writers left.
In a next step, there are two instances of gfs2_rs_delete(ip, NULL) left
where we know that there can be no other users of the inode. Replace
those with gfs2_rs_deltree(&ip->i_res) to avoid the unnecessary write
count check.
With that, gfs2_rs_delete() is only called with the inode's actual write
count, so get rid of the second parameter.
Fixes: a097dc7e24 ("GFS2: Make rgrp reservations part of the gfs2_inode structure")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 705757274599e2e064dd3054aabc74e8af31a095 upstream.
When renaming the whiteout file, the old whiteout file is not deleted.
Therefore, we add the old dentry size to the old dir like XFS.
Otherwise, an error may be reported due to `fscki->calc_sz != fscki->size`
in check_indes.
Fixes: 9e0a1fff8d ("ubifs: Implement RENAME_WHITEOUT")
Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b67db8a6ca83e6ff90b756d3da0c966f61cd37b upstream.
MM defined the rule [1] very clearly that once page was set with PG_private
flag, we should increment the refcount in that page, also main flows like
pageout(), migrate_page() will assume there is one additional page
reference count if page_has_private() returns true. Otherwise, we may
get a BUG in page migration:
page:0000000080d05b9d refcount:-1 mapcount:0 mapping:000000005f4d82a8
index:0xe2 pfn:0x14c12
aops:ubifs_file_address_operations [ubifs] ino:8f1 dentry name:"f30e"
flags: 0x1fffff80002405(locked|uptodate|owner_priv_1|private|node=0|
zone=1|lastcpupid=0x1fffff)
page dumped because: VM_BUG_ON_PAGE(page_count(page) != 0)
------------[ cut here ]------------
kernel BUG at include/linux/page_ref.h:184!
invalid opcode: 0000 [#1] SMP
CPU: 3 PID: 38 Comm: kcompactd0 Not tainted 5.15.0-rc5
RIP: 0010:migrate_page_move_mapping+0xac3/0xe70
Call Trace:
ubifs_migrate_page+0x22/0xc0 [ubifs]
move_to_new_page+0xb4/0x600
migrate_pages+0x1523/0x1cc0
compact_zone+0x8c5/0x14b0
kcompactd+0x2bc/0x560
kthread+0x18c/0x1e0
ret_from_fork+0x1f/0x30
Before the time, we should make clean a concept, what does refcount means
in page gotten from grab_cache_page_write_begin(). There are 2 situations:
Situation 1: refcount is 3, page is created by __page_cache_alloc.
TYPE_A - the write process is using this page
TYPE_B - page is assigned to one certain mapping by calling
__add_to_page_cache_locked()
TYPE_C - page is added into pagevec list corresponding current cpu by
calling lru_cache_add()
Situation 2: refcount is 2, page is gotten from the mapping's tree
TYPE_B - page has been assigned to one certain mapping
TYPE_A - the write process is using this page (by calling
page_cache_get_speculative())
Filesystem releases one refcount by calling put_page() in xxx_write_end(),
the released refcount corresponds to TYPE_A (write task is using it). If
there are any processes using a page, page migration process will skip the
page by judging whether expected_page_refs() equals to page refcount.
The BUG is caused by following process:
PA(cpu 0) kcompactd(cpu 1)
compact_zone
ubifs_write_begin
page_a = grab_cache_page_write_begin
add_to_page_cache_lru
lru_cache_add
pagevec_add // put page into cpu 0's pagevec
(refcnf = 3, for page creation process)
ubifs_write_end
SetPagePrivate(page_a) // doesn't increase page count !
unlock_page(page_a)
put_page(page_a) // refcnt = 2
[...]
PB(cpu 0)
filemap_read
filemap_get_pages
add_to_page_cache_lru
lru_cache_add
__pagevec_lru_add // traverse all pages in cpu 0's pagevec
__pagevec_lru_add_fn
SetPageLRU(page_a)
isolate_migratepages
isolate_migratepages_block
get_page_unless_zero(page_a)
// refcnt = 3
list_add(page_a, from_list)
migrate_pages(from_list)
__unmap_and_move
move_to_new_page
ubifs_migrate_page(page_a)
migrate_page_move_mapping
expected_page_refs get 3
(migration[1] + mapping[1] + private[1])
release_pages
put_page_testzero(page_a) // refcnt = 3
page_ref_freeze // refcnt = 0
page_ref_dec_and_test(0 - 1 = -1)
page_ref_unfreeze
VM_BUG_ON_PAGE(-1 != 0, page)
UBIFS doesn't increase the page refcount after setting private flag, which
leads to page migration task believes the page is not used by any other
processes, so the page is migrated. This causes concurrent accessing on
page refcount between put_page() called by other process(eg. read process
calls lru_cache_add) and page_ref_unfreeze() called by migration task.
Actually zhangjun has tried to fix this problem [2] by recalculating page
refcnt in ubifs_migrate_page(). It's better to follow MM rules [1], because
just like Kirill suggested in [2], we need to check all users of
page_has_private() helper. Like f2fs does in [3], fix it by adding/deleting
refcount when setting/clearing private for a page. BTW, according to [4],
we set 'page->private' as 1 because ubifs just simply SetPagePrivate().
And, [5] provided a common helper to set/clear page private, ubifs can
use this helper following the example of iomap, afs, btrfs, etc.
Jump [6] to find a reproducer.
[1] https://lore.kernel.org/lkml/2b19b3c4-2bc4-15fa-15cc-27a13e5c7af1@aol.com
[2] https://www.spinics.net/lists/linux-mtd/msg04018.html
[3] http://lkml.iu.edu/hypermail/linux/kernel/1903.0/03313.html
[4] https://lore.kernel.org/linux-f2fs-devel/20210422154705.GO3596236@casper.infradead.org
[5] https://lore.kernel.org/all/20200517214718.468-1-guoqing.jiang@cloud.ionos.com
[6] https://bugzilla.kernel.org/show_bug.cgi?id=214961
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f2262a334641e05f645364d5ade1f565c85f20b upstream.
Function ubifs_wbuf_write_nolock() may access buf out of bounds in
following process:
ubifs_wbuf_write_nolock():
aligned_len = ALIGN(len, 8); // Assume len = 4089, aligned_len = 4096
if (aligned_len <= wbuf->avail) ... // Not satisfy
if (wbuf->used) {
ubifs_leb_write() // Fill some data in avail wbuf
len -= wbuf->avail; // len is still not 8-bytes aligned
aligned_len -= wbuf->avail;
}
n = aligned_len >> c->max_write_shift;
if (n) {
n <<= c->max_write_shift;
err = ubifs_leb_write(c, wbuf->lnum, buf + written,
wbuf->offs, n);
// n > len, read out of bounds less than 8(n-len) bytes
}
, which can be catched by KASAN:
=========================================================
BUG: KASAN: slab-out-of-bounds in ecc_sw_hamming_calculate+0x1dc/0x7d0
Read of size 4 at addr ffff888105594ff8 by task kworker/u8:4/128
Workqueue: writeback wb_workfn (flush-ubifs_0_0)
Call Trace:
kasan_report.cold+0x81/0x165
nand_write_page_swecc+0xa9/0x160
ubifs_leb_write+0xf2/0x1b0 [ubifs]
ubifs_wbuf_write_nolock+0x421/0x12c0 [ubifs]
write_head+0xdc/0x1c0 [ubifs]
ubifs_jnl_write_inode+0x627/0x960 [ubifs]
wb_workfn+0x8af/0xb80
Function ubifs_wbuf_write_nolock() accepts that parameter 'len' is not 8
bytes aligned, the 'len' represents the true length of buf (which is
allocated in 'ubifs_jnl_xxx', eg. ubifs_jnl_write_inode), so
ubifs_wbuf_write_nolock() must handle the length read from 'buf' carefully
to write leb safely.
Fetch a reproducer in [Link].
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214785
Reported-by: Chengsong Ke <kechengsong@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b83ec057db16b4d0697dc21ef7a9743b6041f72 upstream.
Make 'ui->data_len' aligned with 8 bytes before it is assigned to
dirtied_ino_d. Since 8871d84c8f8b0c6b("ubifs: convert to fileattr")
applied, 'setflags()' only affects regular files and directories, only
xattr inode, symlink inode and special inode(pipe/char_dev/block_dev)
have none- zero 'ui->data_len' field, so assertion
'!(req->dirtied_ino_d & 7)' cannot fail in ubifs_budget_space().
To avoid assertion fails in future evolution(eg. setflags can operate
special inodes), it's better to make dirtied_ino_d 8 bytes aligned,
after all aligned size is still zero for regular files.
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6dab6607d4681d227905d5198710b575dbdb519 upstream.
UBIFS should make sure the flash has enough space to store dirty (Data
that is newer than disk) data (in memory), space budget is exactly
designed to do that. If space budget calculates less data than we need,
'make_reservation()' will do more work(return -ENOSPC if no free space
lelf, sometimes we can see "cannot reserve xxx bytes in jhead xxx, error
-28" in ubifs error messages) with ubifs inodes locked, which may effect
other syscalls.
A simple way to decide how much space do we need when make a budget:
See how much space is needed by 'make_reservation()' in ubifs_jnl_xxx()
function according to corresponding operation.
It's better to report ENOSPC in ubifs_budget_space(), as early as we can.
Fixes: 474b93704f ("ubifs: Implement O_TMPFILE")
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60eb3b9c9f11206996f57cb89521824304b305ad upstream.
'ui->dirty' is not protected by 'ui_mutex' in function do_tmpfile() which
may race with ubifs_write_inode[wb_workfn] to access/update 'ui->dirty',
finally dirty space is released twice.
open(O_TMPFILE) wb_workfn
do_tmpfile
ubifs_budget_space(ino_req = { .dirtied_ino = 1})
d_tmpfile // mark inode(tmpfile) dirty
ubifs_jnl_update // without holding tmpfile's ui_mutex
mark_inode_clean(ui)
if (ui->dirty)
ubifs_release_dirty_inode_budget(ui) // release first time
ubifs_write_inode
mutex_lock(&ui->ui_mutex)
ubifs_release_dirty_inode_budget(ui)
// release second time
mutex_unlock(&ui->ui_mutex)
ui->dirty = 0
Run generic/476 can reproduce following message easily
(See reproducer in [Link]):
UBIFS error (ubi0:0 pid 2578): ubifs_assert_failed [ubifs]: UBIFS assert
failed: c->bi.dd_growth >= 0, in fs/ubifs/budget.c:554
UBIFS warning (ubi0:0 pid 2578): ubifs_ro_mode [ubifs]: switched to
read-only mode, error -22
Workqueue: writeback wb_workfn (flush-ubifs_0_0)
Call Trace:
ubifs_ro_mode+0x54/0x60 [ubifs]
ubifs_assert_failed+0x4b/0x80 [ubifs]
ubifs_release_budget+0x468/0x5a0 [ubifs]
ubifs_release_dirty_inode_budget+0x53/0x80 [ubifs]
ubifs_write_inode+0x121/0x1f0 [ubifs]
...
wb_workfn+0x283/0x7b0
Fix it by holding tmpfile ubifs inode lock during ubifs_jnl_update().
Similar problem exists in whiteout renaming, but previous fix("ubifs:
Rename whiteout atomically") has solved the problem.
Fixes: 474b93704f ("ubifs: Implement O_TMPFILE")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214765
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 278d9a243635f26c05ad95dcf9c5a593b9e04dc6 upstream.
Currently, rename whiteout has 3 steps:
1. create tmpfile(which associates old dentry to tmpfile inode) for
whiteout, and store tmpfile to disk
2. link whiteout, associate whiteout inode to old dentry agagin and
store old dentry, old inode, new dentry on disk
3. writeback dirty whiteout inode to disk
Suddenly power-cut or error occurring(eg. ENOSPC returned by budget,
memory allocation failure) during above steps may cause kinds of problems:
Problem 1: ENOSPC returned by whiteout space budget (before step 2),
old dentry will disappear after rename syscall, whiteout file
cannot be found either.
ls dir // we get file, whiteout
rename(dir/file, dir/whiteout, REANME_WHITEOUT)
ENOSPC = ubifs_budget_space(&wht_req) // return
ls dir // empty (no file, no whiteout)
Problem 2: Power-cut happens before step 3, whiteout inode with 'nlink=1'
is not stored on disk, whiteout dentry(old dentry) is written
on disk, whiteout file is lost on next mount (We get "dead
directory entry" after executing 'ls -l' on whiteout file).
Now, we use following 3 steps to finish rename whiteout:
1. create an in-mem inode with 'nlink = 1' as whiteout
2. ubifs_jnl_rename (Write on disk to finish associating old dentry to
whiteout inode, associating new dentry with old inode)
3. iput(whiteout)
Rely writing in-mem inode on disk by ubifs_jnl_rename() to finish rename
whiteout, which avoids middle disk state caused by suddenly power-cut
and error occurring.
Fixes: 9e0a1fff8d ("ubifs: Implement RENAME_WHITEOUT")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 716b4573026bcbfa7b58ed19fe15554bac66b082 upstream.
whiteout inode should be put when do_tmpfile() failed if inode has been
initialized. Otherwise we will get following warning during umount:
UBIFS error (ubi0:0 pid 1494): ubifs_assert_failed [ubifs]: UBIFS
assert failed: c->bi.dd_growth == 0, in fs/ubifs/super.c:1930
VFS: Busy inodes after unmount of ubifs. Self-destruct in 5 seconds.
Fixes: 9e0a1fff8d ("ubifs: Implement RENAME_WHITEOUT")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Suggested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40a8f0d5e7b3999f096570edab71c345da812e3e upstream.
'whiteout_ui->data' will be freed twice if space budget fail for
rename whiteout operation as following process:
rename_whiteout
dev = kmalloc
whiteout_ui->data = dev
kfree(whiteout_ui->data) // Free first time
iput(whiteout)
ubifs_free_inode
kfree(ui->data) // Double free!
KASAN reports:
==================================================================
BUG: KASAN: double-free or invalid-free in ubifs_free_inode+0x4f/0x70
Call Trace:
kfree+0x117/0x490
ubifs_free_inode+0x4f/0x70 [ubifs]
i_callback+0x30/0x60
rcu_do_batch+0x366/0xac0
__do_softirq+0x133/0x57f
Allocated by task 1506:
kmem_cache_alloc_trace+0x3c2/0x7a0
do_rename+0x9b7/0x1150 [ubifs]
ubifs_rename+0x106/0x1f0 [ubifs]
do_syscall_64+0x35/0x80
Freed by task 1506:
kfree+0x117/0x490
do_rename.cold+0x53/0x8a [ubifs]
ubifs_rename+0x106/0x1f0 [ubifs]
do_syscall_64+0x35/0x80
The buggy address belongs to the object at ffff88810238bed8 which
belongs to the cache kmalloc-8 of size 8
==================================================================
Let ubifs_free_inode() free 'whiteout_ui->data'. BTW, delete unused
assignment 'whiteout_ui->data_len = 0', process 'ubifs_evict_inode()
-> ubifs_jnl_delete_inode() -> ubifs_jnl_write_inode()' doesn't need it
(because 'inc_nlink(whiteout)' won't be excuted by 'goto out_release',
and the nlink of whiteout inode is 0).
Fixes: 9e0a1fff8d ("ubifs: Implement RENAME_WHITEOUT")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 714fbf2647b1a33d914edd695d4da92029c7e7c0 ]
ntfs_read_inode_mount invokes ntfs_malloc_nofs with zero allocation
size. It triggers one BUG in the __ntfs_malloc function.
Fix this by adding sanity check on ni->attr_list_size.
Link: https://lkml.kernel.org/r/20220120094914.47736-1-dzm91@hust.edu.cn
Reported-by: syzbot+3c765c5248797356edaa@syzkaller.appspotmail.com
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Acked-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 06a466565d54a1a42168f9033a062a3f5c40e73b ]
When session gets reconnected during mount then read size in super block fs context
gets set to zero and after negotiate, rsize is not modified which results in
incorrect read with requested bytes as zero. Fixes intermittent failure
of xfstest generic/240
Note that stable requires a different version of this patch which will be
sent to the stable mailing list.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d284af43f703760e261b1601378a0c13a19d5f1f ]
In lz4_decompress_pages(), if size of decompressed data is not equal to
expected one, we should print the size rather than size of target buffer
for decompressed data, fix it.
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03ddb19d2ea745228879b9334f3b550c88acb10a ]
We can either fail to find a csum entry at all and return -ENOENT, or we
can find a range that is close, but return -EFBIG. In essence these
both mean the same thing when we are doing a lookup for a csum in an
existing range, we didn't find a csum. We want to treat both of these
errors the same way, complain loudly that there wasn't a csum. This
currently happens anyway because we do
count = search_csum_tree();
if (count <= 0) {
// reloc and error handling
}
However it forces us to incorrectly treat EIO or ENOMEM errors as on
disk corruption. Fix this by returning 0 if we get either -ENOENT or
-EFBIG from btrfs_lookup_csum() so we can do proper error handling.
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 770c79fb65506fc7c16459855c3839429f46cb32 ]
Identifying and removing the stale device from the fs_uuids list is done
by btrfs_free_stale_devices(). btrfs_free_stale_devices() in turn
depends on device_path_matched() to check if the device appears in more
than one btrfs_device structure.
The matching of the device happens by its path, the device path. However,
when device mapper is in use, the dm device paths are nothing but a link
to the actual block device, which leads to the device_path_matched()
failing to match.
Fix this by matching the dev_t as provided by lookup_bdev() instead of
plain string compare of the device paths.
Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50719bf3442dd6cd05159e9c98d020b3919ce978 ]
These have been incorrect since the function was introduced.
A proper kerneldoc comment is added since this function, though
static, is part of an external interface.
Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc5095747edfb054ca2068d01af20be3fcc3634f ]
[un]pin_user_pages_remote is dirtying pages without properly warning
the file system in advance. A related race was noted by Jan Kara in
2018[1]; however, more recently instead of it being a very hard-to-hit
race, it could be reliably triggered by process_vm_writev(2) which was
discovered by Syzbot[2].
This is technically a bug in mm/gup.c, but arguably ext4 is fragile in
that if some other kernel subsystem dirty pages without properly
notifying the file system using page_mkwrite(), ext4 will BUG, while
other file systems will not BUG (although data will still be lost).
So instead of crashing with a BUG, issue a warning (since there may be
potential data loss) and just mark the page as clean to avoid
unprivileged denial of service attacks until the problem can be
properly fixed. More discussion and background can be found in the
thread starting at [2].
[1] https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
[2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM@google.com
Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db@syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/YiDS9wVfq4mM2jGK@mit.edu
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bfdc502a4a4c058bf4cbb1df0c297761d528f54d ]
In case of flex_bg feature (which is by default enabled), extents for
any given inode might span across blocks from two different block group.
ext4_mb_mark_bb() only reads the buffer_head of block bitmap once for the
starting block group, but it fails to read it again when the extent length
boundary overflows to another block group. Then in this below loop it
accesses memory beyond the block group bitmap buffer_head and results
into a data abort.
for (i = 0; i < clen; i++)
if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) == !state)
already++;
This patch adds this functionality for checking block group boundary in
ext4_mb_mark_bb() and update the buffer_head(bitmap_bh) for every different
block group.
w/o this patch, I was easily able to hit a data access abort using Power platform.
<...>
[ 74.327662] EXT4-fs error (device loop3): ext4_mb_generate_buddy:1141: group 11, block bitmap and bg descriptor inconsistent: 21248 vs 23294 free clusters
[ 74.533214] EXT4-fs (loop3): shut down requested (2)
[ 74.536705] Aborting journal on device loop3-8.
[ 74.702705] BUG: Unable to handle kernel data access on read at 0xc00000005e980000
[ 74.703727] Faulting instruction address: 0xc0000000007bffb8
cpu 0xd: Vector: 300 (Data Access) at [c000000015db7060]
pc: c0000000007bffb8: ext4_mb_mark_bb+0x198/0x5a0
lr: c0000000007bfeec: ext4_mb_mark_bb+0xcc/0x5a0
sp: c000000015db7300
msr: 800000000280b033
dar: c00000005e980000
dsisr: 40000000
current = 0xc000000027af6880
paca = 0xc00000003ffd5200 irqmask: 0x03 irq_happened: 0x01
pid = 5167, comm = mount
<...>
enter ? for help
[c000000015db7380] c000000000782708 ext4_ext_clear_bb+0x378/0x410
[c000000015db7400] c000000000813f14 ext4_fc_replay+0x1794/0x2000
[c000000015db7580] c000000000833f7c do_one_pass+0xe9c/0x12a0
[c000000015db7710] c000000000834504 jbd2_journal_recover+0x184/0x2d0
[c000000015db77c0] c000000000841398 jbd2_journal_load+0x188/0x4a0
[c000000015db7880] c000000000804de8 ext4_fill_super+0x2638/0x3e10
[c000000015db7a40] c0000000005f8404 get_tree_bdev+0x2b4/0x350
[c000000015db7ae0] c0000000007ef058 ext4_get_tree+0x28/0x40
[c000000015db7b00] c0000000005f6344 vfs_get_tree+0x44/0x100
[c000000015db7b70] c00000000063c408 path_mount+0xdd8/0xe70
[c000000015db7c40] c00000000063c8f0 sys_mount+0x450/0x550
[c000000015db7d50] c000000000035770 system_call_exception+0x4a0/0x4e0
[c000000015db7e10] c00000000000c74c system_call_common+0xec/0x250
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/2609bc8f66fc15870616ee416a18a3d392a209c4.1644992609.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a5c0e2fdf7cea535ba03259894dc184e5a4c2800 ]
ext4_mb_mark_bb() currently wrongly calculates cluster len (clen) and
flex_group->free_clusters. This patch fixes that.
Identified based on code review of ext4_mb_mark_bb() function.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/a0b035d536bafa88110b74456853774b64c8ac40.1644992609.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0da1d5002745cdc721bc018b582a8a9704d56c42 ]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=197921
As pointed out in the discussion of buglink, we cannot calculate AT_PHDR
as the sum of load_addr and exec->e_phoff.
: The AT_PHDR of ELF auxiliary vectors should point to the memory address
: of program header. But binfmt_elf.c calculates this address as follows:
:
: NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
:
: which is wrong since e_phoff is the file offset of program header and
: load_addr is the memory base address from PT_LOAD entry.
:
: The ld.so uses AT_PHDR as the memory address of program header. In normal
: case, since the e_phoff is usually 64 and in the first PT_LOAD region, it
: is the correct program header address.
:
: But if the address of program header isn't equal to the first PT_LOAD
: address + e_phoff (e.g. Put the program header in other non-consecutive
: PT_LOAD region), ld.so will try to read program header from wrong address
: then crash or use incorrect program header.
This is because exec->e_phoff
is the offset of PHDRs in the file and the address of PHDRs in the
memory may differ from it. This patch fixes the bug by calculating the
address of program headers from PT_LOADs directly.
Signed-off-by: Akira Kawata <akirakawata1@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220127124014.338760-2-akirakawata1@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d888c83fcec75194a8a48ccd283953bdba7b2550 ]
Jason Donenfeld reports that my commit 1c24a186398f ("fs: fd tables have
to be multiples of BITS_PER_LONG") doesn't work, and the reason is an
embarrassing brown-paper-bag bug.
Yes, we want to align the number of fds to BITS_PER_LONG, and yes, the
reason they might not be aligned is because the incoming 'max_fd'
argument might not be aligned.
But aligining the argument - while simple - will cause a "infinitely
big" maxfd (eg NR_OPEN_MAX) to just overflow to zero. Which most
definitely isn't what we want either.
The obvious fix was always just to do the alignment last, but I had
moved it earlier just to make the patch smaller and the code look
simpler. Duh. It certainly made _me_ look simple.
Fixes: 1c24a186398f ("fs: fd tables have to be multiples of BITS_PER_LONG")
Reported-and-tested-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Fedor Pchelkin <aissur0002@gmail.com>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1c24a186398f59c80adb9a967486b65c1423a59d ]
This has always been the rule: fdtables have several bitmaps in them,
and as a result they have to be sized properly for bitmaps. We walk
those bitmaps in chunks of 'unsigned long' in serveral cases, but even
when we don't, we use the regular kernel bitops that are defined to work
on arrays of 'unsigned long', not on some byte array.
Now, the distinction between arrays of bytes and 'unsigned long'
normally only really ends up being noticeable on big-endian systems, but
Fedor Pchelkin and Alexey Khoroshilov reported that copy_fd_bitmaps()
could be called with an argument that wasn't even a multiple of
BITS_PER_BYTE. And then it fails to do the proper copy even on
little-endian machines.
The bug wasn't in copy_fd_bitmap(), but in sane_fdtable_size(), which
didn't actually sanitize the fdtable size sufficiently, and never made
sure it had the proper BITS_PER_LONG alignment.
That's partly because the alignment historically came not from having to
explicitly align things, but simply from previous fdtable sizes, and
from count_open_files(), which counts the file descriptors by walking
them one 'unsigned long' word at a time and thus naturally ends up doing
sizing in the proper 'chunks of unsigned long'.
But with the introduction of close_range(), we now have an external
source of "this is how many files we want to have", and so
sane_fdtable_size() needs to do a better job.
This also adds that explicit alignment to alloc_fdtable(), although
there it is mainly just for documentation at a source code level. The
arithmetic we do there to pick a reasonable fdtable size already aligns
the result sufficiently.
In fact,clang notices that the added ALIGN() in that function doesn't
actually do anything, and does not generate any extra code for it.
It turns out that gcc ends up confusing itself by combining a previous
constant-sized shift operation with the variable-sized shift operations
in roundup_pow_of_two(). And probably due to that doesn't notice that
the ALIGN() is a no-op. But that's a (tiny) gcc misfeature that doesn't
matter. Having the explicit alignment makes sense, and would actually
matter on a 128-bit architecture if we ever go there.
This also adds big comments above both functions about how fdtable sizes
have to have that BITS_PER_LONG alignment.
Fixes: 60997c3d45 ("close_range: add CLOSE_RANGE_UNSHARE")
Reported-by: Fedor Pchelkin <aissur0002@gmail.com>
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Link: https://lore.kernel.org/all/20220326114009.1690-1-aissur0002@gmail.com/
Tested-and-acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c9d845f0612e5bcd23456a2ec43be8ac43458f1 ]
In nfs4_callback_devicenotify(), if we don't find a matching entry for
the deviceid, we're left with a pointer to 'struct nfs_server' that
actually points to the list of super blocks associated with our struct
nfs_client.
Furthermore, even if we have a valid pointer, nothing pins the super
block, and so the struct nfs_server could end up getting freed while
we're using it.
Since all we want is a pointer to the struct pnfs_layoutdriver_type,
let's skip all the iteration over super blocks, and just use APIs to
find the layout driver directly.
Reported-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Fixes: 1be5683b03 ("pnfs: CB_NOTIFY_DEVICEID")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d02d81efc7564b4d5446a02e0214a164cf00b1f3 ]
If __nfs_pageio_add_request() fails to add the request, it will return
with either desc->pg_error < 0, or mirror->pg_recoalesce will be set, so
we are guaranteed either to exit the function altogether, or to loop.
However if there is nothing left in mirror->pg_list to coalesce, we must
exit, so make sure that we clear mirror->pg_recoalesce every time we
loop.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 70536bf4eb ("NFS: Clean up reset of the mirror accounting variables")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d15d121cc2ad4d016a7dc1493132a9696f91fc5 ]
There is no reason to retry the operation if a session error had
occurred in such case result structure isn't filled out.
Fixes: dff58530c4 ("NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>