| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
bpf, cpumap: Make sure kthread is running before map update returns
The following warning was reported when running stress-mode enabled
xdp_redirect_cpu with some RT threads:
------------[ cut here ]------------
WARNING: CPU: 4 PID: 65 at kernel/bpf/cpumap.c:135
CPU: 4 PID: 65 Comm: kworker/4:1 Not tainted 6.5.0-rc2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Workqueue: events cpu_map_kthread_stop
RIP: 0010:put_cpu_map_entry+0xda/0x220
......
Call Trace:
<TASK>
? show_regs+0x65/0x70
? __warn+0xa5/0x240
......
? put_cpu_map_entry+0xda/0x220
cpu_map_kthread_stop+0x41/0x60
process_one_work+0x6b0/0xb80
worker_thread+0x96/0x720
kthread+0x1a5/0x1f0
ret_from_fork+0x3a/0x70
ret_from_fork_asm+0x1b/0x30
</TASK>
The root cause is the same as commit 436901649731 ("bpf: cpumap: Fix memory
leak in cpu_map_update_elem"). The kthread is stopped prematurely by
kthread_stop() in cpu_map_kthread_stop(), and kthread() doesn't call
cpu_map_kthread_run() at all but XDP program has already queued some
frames or skbs into ptr_ring. So when __cpu_map_ring_cleanup() checks
the ptr_ring, it will find it was not emptied and report a warning.
An alternative fix is to use __cpu_map_ring_cleanup() to drop these
pending frames or skbs when kthread_stop() returns -EINTR, but it may
confuse the user, because these frames or skbs have been handled
correctly by XDP program. So instead of dropping these frames or skbs,
just make sure the per-cpu kthread is running before
__cpu_map_entry_alloc() returns.
After apply the fix, the error handle for kthread_stop() will be
unnecessary because it will always return 0, so just remove it. |
| In the Linux kernel, the following vulnerability has been resolved:
net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume()
Syzbot reported a bug as following:
=====================================================
BUG: KMSAN: uninit-value in qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230
qrtr_tx_resume+0x185/0x1f0 net/qrtr/af_qrtr.c:230
qrtr_endpoint_post+0xf85/0x11b0 net/qrtr/af_qrtr.c:519
qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108
call_write_iter include/linux/fs.h:2189 [inline]
aio_write+0x63a/0x950 fs/aio.c:1600
io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019
__do_sys_io_submit fs/aio.c:2078 [inline]
__se_sys_io_submit+0x293/0x770 fs/aio.c:2048
__x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Uninit was created at:
slab_post_alloc_hook mm/slab.h:766 [inline]
slab_alloc_node mm/slub.c:3452 [inline]
__kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491
__do_kmalloc_node mm/slab_common.c:967 [inline]
__kmalloc_node_track_caller+0x114/0x3b0 mm/slab_common.c:988
kmalloc_reserve net/core/skbuff.c:492 [inline]
__alloc_skb+0x3af/0x8f0 net/core/skbuff.c:565
__netdev_alloc_skb+0x120/0x7d0 net/core/skbuff.c:630
qrtr_endpoint_post+0xbd/0x11b0 net/qrtr/af_qrtr.c:446
qrtr_tun_write_iter+0x270/0x400 net/qrtr/tun.c:108
call_write_iter include/linux/fs.h:2189 [inline]
aio_write+0x63a/0x950 fs/aio.c:1600
io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019
__do_sys_io_submit fs/aio.c:2078 [inline]
__se_sys_io_submit+0x293/0x770 fs/aio.c:2048
__x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
It is because that skb->len requires at least sizeof(struct qrtr_ctrl_pkt)
in qrtr_tx_resume(). And skb->len equals to size in qrtr_endpoint_post().
But size is less than sizeof(struct qrtr_ctrl_pkt) when qrtr_cb->type
equals to QRTR_TYPE_RESUME_TX in qrtr_endpoint_post() under the syzbot
scenario. This triggers the uninit variable access bug.
Add size check when qrtr_cb->type equals to QRTR_TYPE_RESUME_TX in
qrtr_endpoint_post() to fix the bug. |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Check for NOT_READY flag state after locking
Currently the check for NOT_READY flag is performed before obtaining the
necessary lock. This opens a possibility for race condition when the flow
is concurrently removed from unready_flows list by the workqueue task,
which causes a double-removal from the list and a crash[0]. Fix the issue
by moving the flag check inside the section protected by
uplink_priv->unready_flows_lock mutex.
[0]:
[44376.389654] general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
[44376.391665] CPU: 7 PID: 59123 Comm: tc Not tainted 6.4.0-rc4+ #1
[44376.392984] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[44376.395342] RIP: 0010:mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
[44376.396857] Code: 00 48 8b b8 68 ce 02 00 e8 8a 4d 02 00 4c 8d a8 a8 01 00 00 4c 89 ef e8 8b 79 88 e1 48 8b 83 98 06 00 00 48 8b 93 90 06 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 90 06
[44376.399167] RSP: 0018:ffff88812cc97570 EFLAGS: 00010246
[44376.399680] RAX: dead000000000122 RBX: ffff8881088e3800 RCX: ffff8881881bac00
[44376.400337] RDX: dead000000000100 RSI: ffff88812cc97500 RDI: ffff8881242f71b0
[44376.401001] RBP: ffff88811cbb0940 R08: 0000000000000400 R09: 0000000000000001
[44376.401663] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88812c944000
[44376.402342] R13: ffff8881242f71a8 R14: ffff8881222b4000 R15: 0000000000000000
[44376.402999] FS: 00007f0451104800(0000) GS:ffff88852cb80000(0000) knlGS:0000000000000000
[44376.403787] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[44376.404343] CR2: 0000000000489108 CR3: 0000000123a79003 CR4: 0000000000370ea0
[44376.405004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[44376.405665] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[44376.406339] Call Trace:
[44376.406651] <TASK>
[44376.406939] ? die_addr+0x33/0x90
[44376.407311] ? exc_general_protection+0x192/0x390
[44376.407795] ? asm_exc_general_protection+0x22/0x30
[44376.408292] ? mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
[44376.408876] __mlx5e_tc_del_fdb_peer_flow+0xbc/0xe0 [mlx5_core]
[44376.409482] mlx5e_tc_del_flow+0x42/0x210 [mlx5_core]
[44376.410055] mlx5e_flow_put+0x25/0x50 [mlx5_core]
[44376.410529] mlx5e_delete_flower+0x24b/0x350 [mlx5_core]
[44376.411043] tc_setup_cb_reoffload+0x22/0x80
[44376.411462] fl_reoffload+0x261/0x2f0 [cls_flower]
[44376.411907] ? mlx5e_rep_indr_setup_ft_cb+0x160/0x160 [mlx5_core]
[44376.412481] ? mlx5e_rep_indr_setup_ft_cb+0x160/0x160 [mlx5_core]
[44376.413044] tcf_block_playback_offloads+0x76/0x170
[44376.413497] tcf_block_unbind+0x7b/0xd0
[44376.413881] tcf_block_setup+0x17d/0x1c0
[44376.414269] tcf_block_offload_cmd.isra.0+0xf1/0x130
[44376.414725] tcf_block_offload_unbind+0x43/0x70
[44376.415153] __tcf_block_put+0x82/0x150
[44376.415532] ingress_destroy+0x22/0x30 [sch_ingress]
[44376.415986] qdisc_destroy+0x3b/0xd0
[44376.416343] qdisc_graft+0x4d0/0x620
[44376.416706] tc_get_qdisc+0x1c9/0x3b0
[44376.417074] rtnetlink_rcv_msg+0x29c/0x390
[44376.419978] ? rep_movs_alternative+0x3a/0xa0
[44376.420399] ? rtnl_calcit.isra.0+0x120/0x120
[44376.420813] netlink_rcv_skb+0x54/0x100
[44376.421192] netlink_unicast+0x1f6/0x2c0
[44376.421573] netlink_sendmsg+0x232/0x4a0
[44376.421980] sock_sendmsg+0x38/0x60
[44376.422328] ____sys_sendmsg+0x1d0/0x1e0
[44376.422709] ? copy_msghdr_from_user+0x6d/0xa0
[44376.423127] ___sys_sendmsg+0x80/0xc0
[44376.423495] ? ___sys_recvmsg+0x8b/0xc0
[44376.423869] __sys_sendmsg+0x51/0x90
[44376.424226] do_syscall_64+0x3d/0x90
[44376.424587] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[44376.425046] RIP: 0033:0x7f045134f887
[44376.425403] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
ubifs: ubifs_releasepage: Remove ubifs_assert(0) to valid this process
There are two states for ubifs writing pages:
1. Dirty, Private
2. Not Dirty, Not Private
The normal process cannot go to ubifs_releasepage() which means there
exists pages being private but not dirty. Reproducer[1] shows that it
could occur (which maybe related to [2]) with following process:
PA PB PC
lock(page)[PA]
ubifs_write_end
attach_page_private // set Private
__set_page_dirty_nobuffers // set Dirty
unlock(page)
write_cache_pages[PA]
lock(page)
clear_page_dirty_for_io(page) // clear Dirty
ubifs_writepage
do_truncation[PB]
truncate_setsize
i_size_write(inode, newsize) // newsize = 0
i_size = i_size_read(inode) // i_size = 0
end_index = i_size >> PAGE_SHIFT
if (page->index > end_index)
goto out // jump
out:
unlock(page) // Private, Not Dirty
generic_fadvise[PC]
lock(page)
invalidate_inode_page
try_to_release_page
ubifs_releasepage
ubifs_assert(c, 0)
// bad assertion!
unlock(page)
truncate_pagecache[PB]
Then we may get following assertion failed:
UBIFS error (ubi0:0 pid 1683): ubifs_assert_failed [ubifs]:
UBIFS assert failed: 0, in fs/ubifs/file.c:1513
UBIFS warning (ubi0:0 pid 1683): ubifs_ro_mode [ubifs]:
switched to read-only mode, error -22
CPU: 2 PID: 1683 Comm: aa Not tainted 5.16.0-rc5-00184-g0bca5994cacc-dirty #308
Call Trace:
dump_stack+0x13/0x1b
ubifs_ro_mode+0x54/0x60 [ubifs]
ubifs_assert_failed+0x4b/0x80 [ubifs]
ubifs_releasepage+0x67/0x1d0 [ubifs]
try_to_release_page+0x57/0xe0
invalidate_inode_page+0xfb/0x130
__invalidate_mapping_pages+0xb9/0x280
invalidate_mapping_pagevec+0x12/0x20
generic_fadvise+0x303/0x3c0
ksys_fadvise64_64+0x4c/0xb0
[1] https://bugzilla.kernel.org/show_bug.cgi?id=215373
[2] https://linux-mtd.infradead.narkive.com/NQoBeT1u/patch-rfc-ubifs-fix-assert-failed-in-ubifs-set-page-dirty |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: reject unhashed sockets in bpf_sk_assign
The semantics for bpf_sk_assign are as follows:
sk = some_lookup_func()
bpf_sk_assign(skb, sk)
bpf_sk_release(sk)
That is, the sk is not consumed by bpf_sk_assign. The function
therefore needs to make sure that sk lives long enough to be
consumed from __inet_lookup_skb. The path through the stack for a
TCPv4 packet is roughly:
netif_receive_skb_core: takes RCU read lock
__netif_receive_skb_core:
sch_handle_ingress:
tcf_classify:
bpf_sk_assign()
deliver_ptype_list_skb:
deliver_skb:
ip_packet_type->func == ip_rcv:
ip_rcv_core:
ip_rcv_finish_core:
dst_input:
ip_local_deliver:
ip_local_deliver_finish:
ip_protocol_deliver_rcu:
tcp_v4_rcv:
__inet_lookup_skb:
skb_steal_sock
The existing helper takes advantage of the fact that everything
happens in the same RCU critical section: for sockets with
SOCK_RCU_FREE set bpf_sk_assign never takes a reference.
skb_steal_sock then checks SOCK_RCU_FREE again and does sock_put
if necessary.
This approach assumes that SOCK_RCU_FREE is never set on a sk
between bpf_sk_assign and skb_steal_sock, but this invariant is
violated by unhashed UDP sockets. A new UDP socket is created
in TCP_CLOSE state but without SOCK_RCU_FREE set. That flag is only
added in udp_lib_get_port() which happens when a socket is bound.
When bpf_sk_assign was added it wasn't possible to access unhashed
UDP sockets from BPF, so this wasn't a problem. This changed
in commit 0c48eefae712 ("sock_map: Lift socket state restriction
for datagram sockets"), but the helper wasn't adjusted accordingly.
The following sequence of events will therefore lead to a refcount
leak:
1. Add socket(AF_INET, SOCK_DGRAM) to a sockmap.
2. Pull socket out of sockmap and bpf_sk_assign it. Since
SOCK_RCU_FREE is not set we increment the refcount.
3. bind() or connect() the socket, setting SOCK_RCU_FREE.
4. skb_steal_sock will now set refcounted = false due to
SOCK_RCU_FREE.
5. tcp_v4_rcv() skips sock_put().
Fix the problem by rejecting unhashed sockets in bpf_sk_assign().
This matches the behaviour of __inet_lookup_skb which is ultimately
the goal of bpf_sk_assign(). |
| In the Linux kernel, the following vulnerability has been resolved:
eth: alx: take rtnl_lock on resume
Zbynek reports that alx trips an rtnl assertion on resume:
RTNL: assertion failed at net/core/dev.c (2891)
RIP: 0010:netif_set_real_num_tx_queues+0x1ac/0x1c0
Call Trace:
<TASK>
__alx_open+0x230/0x570 [alx]
alx_resume+0x54/0x80 [alx]
? pci_legacy_resume+0x80/0x80
dpm_run_callback+0x4a/0x150
device_resume+0x8b/0x190
async_resume+0x19/0x30
async_run_entry_fn+0x30/0x130
process_one_work+0x1e5/0x3b0
indeed the driver does not hold rtnl_lock during its internal close
and re-open functions during suspend/resume. Note that this is not
a huge bug as the driver implements its own locking, and does not
implement changing the number of queues, but we need to silence
the splat. |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: target: Fix multiple LUN_RESET handling
This fixes a bug where an initiator thinks a LUN_RESET has cleaned up
running commands when it hasn't. The bug was added in commit 51ec502a3266
("target: Delete tmr from list before processing").
The problem occurs when:
1. We have N I/O cmds running in the target layer spread over 2 sessions.
2. The initiator sends a LUN_RESET for each session.
3. session1's LUN_RESET loops over all the running commands from both
sessions and moves them to its local drain_task_list.
4. session2's LUN_RESET does not see the LUN_RESET from session1 because
the commit above has it remove itself. session2 also does not see any
commands since the other reset moved them off the state lists.
5. sessions2's LUN_RESET will then complete with a successful response.
6. sessions2's inititor believes the running commands on its session are
now cleaned up due to the successful response and cleans up the running
commands from its side. It then restarts them.
7. The commands do eventually complete on the backend and the target
starts to return aborted task statuses for them. The initiator will
either throw a invalid ITT error or might accidentally lookup a new
task if the ITT has been reallocated already.
Fix the bug by reverting the patch, and serialize the execution of
LUN_RESETs and Preempt and Aborts.
Also prevent us from waiting on LUN_RESETs in core_tmr_drain_tmr_list,
because it turns out the original patch fixed a bug that was not
mentioned. For LUN_RESET1 core_tmr_drain_tmr_list can see a second
LUN_RESET and wait on it. Then the second reset will run
core_tmr_drain_tmr_list and see the first reset and wait on it resulting in
a deadlock. |
| In the Linux kernel, the following vulnerability has been resolved:
cpufreq: amd-pstate: fix global sysfs attribute type
In commit 3666062b87ec ("cpufreq: amd-pstate: move to use bus_get_dev_root()")
the "amd_pstate" attributes where moved from a dedicated kobject to the
cpu root kobject.
While the dedicated kobject expects to contain kobj_attributes the root
kobject needs device_attributes.
As the changed arguments are not used by the callbacks it works most of
the time.
However CFI will detect this issue:
[ 4947.849350] CFI failure at dev_attr_show+0x24/0x60 (target: show_status+0x0/0x70; expected type: 0x8651b1de)
...
[ 4947.849409] Call Trace:
[ 4947.849410] <TASK>
[ 4947.849411] ? __warn+0xcf/0x1c0
[ 4947.849414] ? dev_attr_show+0x24/0x60
[ 4947.849415] ? report_cfi_failure+0x4e/0x60
[ 4947.849417] ? handle_cfi_failure+0x14c/0x1d0
[ 4947.849419] ? __cfi_show_status+0x10/0x10
[ 4947.849420] ? handle_bug+0x4f/0x90
[ 4947.849421] ? exc_invalid_op+0x1a/0x60
[ 4947.849422] ? asm_exc_invalid_op+0x1a/0x20
[ 4947.849424] ? __cfi_show_status+0x10/0x10
[ 4947.849425] ? dev_attr_show+0x24/0x60
[ 4947.849426] sysfs_kf_seq_show+0xa6/0x110
[ 4947.849433] seq_read_iter+0x16c/0x4b0
[ 4947.849436] vfs_read+0x272/0x2d0
[ 4947.849438] ksys_read+0x72/0xe0
[ 4947.849439] do_syscall_64+0x76/0xb0
[ 4947.849440] ? do_user_addr_fault+0x252/0x650
[ 4947.849442] ? exc_page_fault+0x7a/0x1b0
[ 4947.849443] entry_SYSCALL_64_after_hwframe+0x72/0xdc |
| In the Linux kernel, the following vulnerability has been resolved:
ring-buffer: Sync IRQ works before buffer destruction
If something was written to the buffer just before destruction,
it may be possible (maybe not in a real system, but it did
happen in ARCH=um with time-travel) to destroy the ringbuffer
before the IRQ work ran, leading this KASAN report (or a crash
without KASAN):
BUG: KASAN: slab-use-after-free in irq_work_run_list+0x11a/0x13a
Read of size 8 at addr 000000006d640a48 by task swapper/0
CPU: 0 PID: 0 Comm: swapper Tainted: G W O 6.3.0-rc1 #7
Stack:
60c4f20f 0c203d48 41b58ab3 60f224fc
600477fa 60f35687 60c4f20f 601273dd
00000008 6101eb00 6101eab0 615be548
Call Trace:
[<60047a58>] show_stack+0x25e/0x282
[<60c609e0>] dump_stack_lvl+0x96/0xfd
[<60c50d4c>] print_report+0x1a7/0x5a8
[<603078d3>] kasan_report+0xc1/0xe9
[<60308950>] __asan_report_load8_noabort+0x1b/0x1d
[<60232844>] irq_work_run_list+0x11a/0x13a
[<602328b4>] irq_work_tick+0x24/0x34
[<6017f9dc>] update_process_times+0x162/0x196
[<6019f335>] tick_sched_handle+0x1a4/0x1c3
[<6019fd9e>] tick_sched_timer+0x79/0x10c
[<601812b9>] __hrtimer_run_queues.constprop.0+0x425/0x695
[<60182913>] hrtimer_interrupt+0x16c/0x2c4
[<600486a3>] um_timer+0x164/0x183
[...]
Allocated by task 411:
save_stack_trace+0x99/0xb5
stack_trace_save+0x81/0x9b
kasan_save_stack+0x2d/0x54
kasan_set_track+0x34/0x3e
kasan_save_alloc_info+0x25/0x28
____kasan_kmalloc+0x8b/0x97
__kasan_kmalloc+0x10/0x12
__kmalloc+0xb2/0xe8
load_elf_phdrs+0xee/0x182
[...]
The buggy address belongs to the object at 000000006d640800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 584 bytes inside of
freed 1024-byte region [000000006d640800, 000000006d640c00)
Add the appropriate irq_work_sync() so the work finishes before
the buffers are destroyed.
Prior to the commit in the Fixes tag below, there was only a
single global IRQ work, so this issue didn't exist. |
| In the Linux kernel, the following vulnerability has been resolved:
sctp: add a refcnt in sctp_stream_priorities to avoid a nested loop
With this refcnt added in sctp_stream_priorities, we don't need to
traverse all streams to check if the prio is used by other streams
when freeing one stream's prio in sctp_sched_prio_free_sid(). This
can avoid a nested loop (up to 65535 * 65535), which may cause a
stuck as Ying reported:
watchdog: BUG: soft lockup - CPU#23 stuck for 26s! [ksoftirqd/23:136]
Call Trace:
<TASK>
sctp_sched_prio_free_sid+0xab/0x100 [sctp]
sctp_stream_free_ext+0x64/0xa0 [sctp]
sctp_stream_free+0x31/0x50 [sctp]
sctp_association_free+0xa5/0x200 [sctp]
Note that it doesn't need to use refcount_t type for this counter,
as its accessing is always protected under the sock lock.
v1->v2:
- add a check in sctp_sched_prio_set to avoid the possible prio_head
refcnt overflow. |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix deadlock in tc route query code
Cited commit causes ABBA deadlock[0] when peer flows are created while
holding the devcom rw semaphore. Due to peer flows offload implementation
the lock is taken much higher up the call chain and there is no obvious way
to easily fix the deadlock. Instead, since tc route query code needs the
peer eswitch structure only to perform a lookup in xarray and doesn't
perform any sleeping operations with it, refactor the code for lockless
execution in following ways:
- RCUify the devcom 'data' pointer. When resetting the pointer
synchronously wait for RCU grace period before returning. This is fine
since devcom is currently only used for synchronization of
pairing/unpairing of eswitches which is rare and already expensive as-is.
- Wrap all usages of 'paired' boolean in {READ|WRITE}_ONCE(). The flag has
already been used in some unlocked contexts without proper
annotations (e.g. users of mlx5_devcom_is_paired() function), but it wasn't
an issue since all relevant code paths checked it again after obtaining the
devcom semaphore. Now it is also used by mlx5_devcom_get_peer_data_rcu() as
"best effort" check to return NULL when devcom is being unpaired. Note that
while RCU read lock doesn't prevent the unpaired flag from being changed
concurrently it still guarantees that reader can continue to use 'data'.
- Refactor mlx5e_tc_query_route_vport() function to use new
mlx5_devcom_get_peer_data_rcu() API which fixes the deadlock.
[0]:
[ 164.599612] ======================================================
[ 164.600142] WARNING: possible circular locking dependency detected
[ 164.600667] 6.3.0-rc3+ #1 Not tainted
[ 164.601021] ------------------------------------------------------
[ 164.601557] handler1/3456 is trying to acquire lock:
[ 164.601998] ffff88811f1714b0 (&esw->offloads.encap_tbl_lock){+.+.}-{3:3}, at: mlx5e_attach_encap+0xd8/0x8b0 [mlx5_core]
[ 164.603078]
but task is already holding lock:
[ 164.603617] ffff88810137fc98 (&comp->sem){++++}-{3:3}, at: mlx5_devcom_get_peer_data+0x37/0x80 [mlx5_core]
[ 164.604459]
which lock already depends on the new lock.
[ 164.605190]
the existing dependency chain (in reverse order) is:
[ 164.605848]
-> #1 (&comp->sem){++++}-{3:3}:
[ 164.606380] down_read+0x39/0x50
[ 164.606772] mlx5_devcom_get_peer_data+0x37/0x80 [mlx5_core]
[ 164.607336] mlx5e_tc_query_route_vport+0x86/0xc0 [mlx5_core]
[ 164.607914] mlx5e_tc_tun_route_lookup+0x1a4/0x1d0 [mlx5_core]
[ 164.608495] mlx5e_attach_decap_route+0xc6/0x1e0 [mlx5_core]
[ 164.609063] mlx5e_tc_add_fdb_flow+0x1ea/0x360 [mlx5_core]
[ 164.609627] __mlx5e_add_fdb_flow+0x2d2/0x430 [mlx5_core]
[ 164.610175] mlx5e_configure_flower+0x952/0x1a20 [mlx5_core]
[ 164.610741] tc_setup_cb_add+0xd4/0x200
[ 164.611146] fl_hw_replace_filter+0x14c/0x1f0 [cls_flower]
[ 164.611661] fl_change+0xc95/0x18a0 [cls_flower]
[ 164.612116] tc_new_tfilter+0x3fc/0xd20
[ 164.612516] rtnetlink_rcv_msg+0x418/0x5b0
[ 164.612936] netlink_rcv_skb+0x54/0x100
[ 164.613339] netlink_unicast+0x190/0x250
[ 164.613746] netlink_sendmsg+0x245/0x4a0
[ 164.614150] sock_sendmsg+0x38/0x60
[ 164.614522] ____sys_sendmsg+0x1d0/0x1e0
[ 164.614934] ___sys_sendmsg+0x80/0xc0
[ 164.615320] __sys_sendmsg+0x51/0x90
[ 164.615701] do_syscall_64+0x3d/0x90
[ 164.616083] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 164.616568]
-> #0 (&esw->offloads.encap_tbl_lock){+.+.}-{3:3}:
[ 164.617210] __lock_acquire+0x159e/0x26e0
[ 164.617638] lock_acquire+0xc2/0x2a0
[ 164.618018] __mutex_lock+0x92/0xcd0
[ 164.618401] mlx5e_attach_encap+0xd8/0x8b0 [mlx5_core]
[ 164.618943] post_process_attr+0x153/0x2d0 [
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
octeontx2-pf: mcs: Fix NULL pointer dereferences
When system is rebooted after creating macsec interface
below NULL pointer dereference crashes occurred. This
patch fixes those crashes by using correct order of teardown
[ 3324.406942] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 3324.415726] Mem abort info:
[ 3324.418510] ESR = 0x96000006
[ 3324.421557] EC = 0x25: DABT (current EL), IL = 32 bits
[ 3324.426865] SET = 0, FnV = 0
[ 3324.429913] EA = 0, S1PTW = 0
[ 3324.433047] Data abort info:
[ 3324.435921] ISV = 0, ISS = 0x00000006
[ 3324.439748] CM = 0, WnR = 0
....
[ 3324.575915] Call trace:
[ 3324.578353] cn10k_mdo_del_secy+0x24/0x180
[ 3324.582440] macsec_common_dellink+0xec/0x120
[ 3324.586788] macsec_notify+0x17c/0x1c0
[ 3324.590529] raw_notifier_call_chain+0x50/0x70
[ 3324.594965] call_netdevice_notifiers_info+0x34/0x7c
[ 3324.599921] rollback_registered_many+0x354/0x5bc
[ 3324.604616] unregister_netdevice_queue+0x88/0x10c
[ 3324.609399] unregister_netdev+0x20/0x30
[ 3324.613313] otx2_remove+0x8c/0x310
[ 3324.616794] pci_device_shutdown+0x30/0x70
[ 3324.620882] device_shutdown+0x11c/0x204
[ 966.664930] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 966.673712] Mem abort info:
[ 966.676497] ESR = 0x96000006
[ 966.679543] EC = 0x25: DABT (current EL), IL = 32 bits
[ 966.684848] SET = 0, FnV = 0
[ 966.687895] EA = 0, S1PTW = 0
[ 966.691028] Data abort info:
[ 966.693900] ISV = 0, ISS = 0x00000006
[ 966.697729] CM = 0, WnR = 0
[ 966.833467] Call trace:
[ 966.835904] cn10k_mdo_stop+0x20/0xa0
[ 966.839557] macsec_dev_stop+0xe8/0x11c
[ 966.843384] __dev_close_many+0xbc/0x140
[ 966.847298] dev_close_many+0x84/0x120
[ 966.851039] rollback_registered_many+0x114/0x5bc
[ 966.855735] unregister_netdevice_many.part.0+0x14/0xa0
[ 966.860952] unregister_netdevice_many+0x18/0x24
[ 966.865560] macsec_notify+0x1ac/0x1c0
[ 966.869303] raw_notifier_call_chain+0x50/0x70
[ 966.873738] call_netdevice_notifiers_info+0x34/0x7c
[ 966.878694] rollback_registered_many+0x354/0x5bc
[ 966.883390] unregister_netdevice_queue+0x88/0x10c
[ 966.888173] unregister_netdev+0x20/0x30
[ 966.892090] otx2_remove+0x8c/0x310
[ 966.895571] pci_device_shutdown+0x30/0x70
[ 966.899660] device_shutdown+0x11c/0x204
[ 966.903574] __do_sys_reboot+0x208/0x290
[ 966.907487] __arm64_sys_reboot+0x20/0x30
[ 966.911489] el0_svc_handler+0x80/0x1c0
[ 966.915316] el0_svc+0x8/0x180
[ 966.918362] Code: f9400000 f9400a64 91220014 f94b3403 (f9400060)
[ 966.924448] ---[ end trace 341778e799c3d8d7 ]--- |
| In the Linux kernel, the following vulnerability has been resolved:
drivers: base: Free devm resources when unregistering a device
In the current code, devres_release_all() only gets called if the device
has a bus and has been probed.
This leads to issues when using bus-less or driver-less devices where
the device might never get freed if a managed resource holds a reference
to the device. This is happening in the DRM framework for example.
We should thus call devres_release_all() in the device_del() function to
make sure that the device-managed actions are properly executed when the
device is unregistered, even if it has neither a bus nor a driver.
This is effectively the same change than commit 2f8d16a996da ("devres:
release resources on device_del()") that got reverted by commit
a525a3ddeaca ("driver core: free devres in device_release") over
memory leaks concerns.
This patch effectively combines the two commits mentioned above to
release the resources both on device_del() and device_release() and get
the best of both worlds. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: ipset: Rework long task execution when adding/deleting entries
When adding/deleting large number of elements in one step in ipset, it can
take a reasonable amount of time and can result in soft lockup errors. The
patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of
consecutive elements to add/delete") tried to fix it by limiting the max
elements to process at all. However it was not enough, it is still possible
that we get hung tasks. Lowering the limit is not reasonable, so the
approach in this patch is as follows: rely on the method used at resizing
sets and save the state when we reach a smaller internal batch limit,
unlock/lock and proceed from the saved state. Thus we can avoid long
continuous tasks and at the same time removed the limit to add/delete large
number of elements in one step.
The nfnl mutex is held during the whole operation which prevents one to
issue other ipset commands in parallel. |
| In the Linux kernel, the following vulnerability has been resolved:
cifs: fix mid leak during reconnection after timeout threshold
When the number of responses with status of STATUS_IO_TIMEOUT
exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect
the connection. But we do not return the mid, or the credits
returned for the mid, or reduce the number of in-flight requests.
This bug could result in the server->in_flight count to go bad,
and also cause a leak in the mids.
This change moves the check to a few lines below where the
response is decrypted, even of the response is read from the
transform header. This way, the code for returning the mids
can be reused.
Also, the cifs_reconnect was reconnecting just the transport
connection before. In case of multi-channel, this may not be
what we want to do after several timeouts. Changed that to
reconnect the session and the tree too.
Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name
MAX_STATUS_IO_TIMEOUT. |
| In the Linux kernel, the following vulnerability has been resolved:
crypto: af_alg - Fix missing initialisation affecting gcm-aes-s390
Fix af_alg_alloc_areq() to initialise areq->first_rsgl.sgl.sgt.sgl to point
to the scatterlist array in areq->first_rsgl.sgl.sgl.
Without this, the gcm-aes-s390 driver will oops when it tries to do
gcm_walk_start() on req->dst because req->dst is set to the value of
areq->first_rsgl.sgl.sgl by _aead_recvmsg() calling
aead_request_set_crypt().
The problem comes if an empty ciphertext is passed: the loop in
af_alg_get_rsgl() just passes straight out and doesn't set areq->first_rsgl
up.
This isn't a problem on x86_64 using gcmaes_crypt_by_sg() because, as far
as I can tell, that ignores req->dst and only uses req->src[*].
[*] Is this a bug in aesni-intel_glue.c?
The s390x oops looks something like:
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000a00000000 TEID: 0000000a00000803
Fault in home space mode while using kernel ASCE.
AS:00000000a43a0007 R3:0000000000000024
Oops: 003b ilc:2 [#1] SMP
...
Call Trace:
[<000003ff7fc3d47e>] gcm_walk_start+0x16/0x28 [aes_s390]
[<00000000a2a342f2>] crypto_aead_decrypt+0x9a/0xb8
[<00000000a2a60888>] aead_recvmsg+0x478/0x698
[<00000000a2e519a0>] sock_recvmsg+0x70/0xb0
[<00000000a2e51a56>] sock_read_iter+0x76/0xa0
[<00000000a273e066>] vfs_read+0x26e/0x2a8
[<00000000a273e8c4>] ksys_read+0xbc/0x100
[<00000000a311d808>] __do_syscall+0x1d0/0x1f8
[<00000000a312ff30>] system_call+0x70/0x98
Last Breaking-Event-Address:
[<000003ff7fc3e6b4>] gcm_aes_crypt+0x104/0xa68 [aes_s390] |
| In the Linux kernel, the following vulnerability has been resolved:
tunnels: fix kasan splat when generating ipv4 pmtu error
If we try to emit an icmp error in response to a nonliner skb, we get
BUG: KASAN: slab-out-of-bounds in ip_compute_csum+0x134/0x220
Read of size 4 at addr ffff88811c50db00 by task iperf3/1691
CPU: 2 PID: 1691 Comm: iperf3 Not tainted 6.5.0-rc3+ #309
[..]
kasan_report+0x105/0x140
ip_compute_csum+0x134/0x220
iptunnel_pmtud_build_icmp+0x554/0x1020
skb_tunnel_check_pmtu+0x513/0xb80
vxlan_xmit_one+0x139e/0x2ef0
vxlan_xmit+0x1867/0x2760
dev_hard_start_xmit+0x1ee/0x4f0
br_dev_queue_push_xmit+0x4d1/0x660
[..]
ip_compute_csum() cannot deal with nonlinear skbs, so avoid it.
After this change, splat is gone and iperf3 is no longer stuck. |
| In the Linux kernel, the following vulnerability has been resolved:
dm cache: Fix UAF in destroy()
Dm_cache also has the same UAF problem when dm_resume()
and dm_destroy() are concurrent.
Therefore, cancelling timer again in destroy(). |
| In the Linux kernel, the following vulnerability has been resolved:
drm: amd: display: Fix memory leakage
This commit fixes memory leakage in dc_construct_ctx() function. |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Fix crash when I/O abort times out
While performing CPU hotplug, a crash with the following stack was seen:
Call Trace:
qla24xx_process_response_queue+0x42a/0x970 [qla2xxx]
qla2x00_start_nvme_mq+0x3a2/0x4b0 [qla2xxx]
qla_nvme_post_cmd+0x166/0x240 [qla2xxx]
nvme_fc_start_fcp_op.part.0+0x119/0x2e0 [nvme_fc]
blk_mq_dispatch_rq_list+0x17b/0x610
__blk_mq_sched_dispatch_requests+0xb0/0x140
blk_mq_sched_dispatch_requests+0x30/0x60
__blk_mq_run_hw_queue+0x35/0x90
__blk_mq_delay_run_hw_queue+0x161/0x180
blk_execute_rq+0xbe/0x160
__nvme_submit_sync_cmd+0x16f/0x220 [nvme_core]
nvmf_connect_admin_queue+0x11a/0x170 [nvme_fabrics]
nvme_fc_create_association.cold+0x50/0x3dc [nvme_fc]
nvme_fc_connect_ctrl_work+0x19/0x30 [nvme_fc]
process_one_work+0x1e8/0x3c0
On abort timeout, completion was called without checking if the I/O was
already completed.
Verify that I/O and abort request are indeed outstanding before attempting
completion. |