| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
efi/unaccepted: touch soft lockup during memory accept
Commit 50e782a86c98 ("efi/unaccepted: Fix soft lockups caused by
parallel memory acceptance") has released the spinlock so other CPUs can
do memory acceptance in parallel and not triggers softlockup on other
CPUs.
However the softlock up was intermittent shown up if the memory of the
TD guest is large, and the timeout of softlockup is set to 1 second:
RIP: 0010:_raw_spin_unlock_irqrestore
Call Trace:
? __hrtimer_run_queues
<IRQ>
? hrtimer_interrupt
? watchdog_timer_fn
? __sysvec_apic_timer_interrupt
? __pfx_watchdog_timer_fn
? sysvec_apic_timer_interrupt
</IRQ>
? __hrtimer_run_queues
<TASK>
? hrtimer_interrupt
? asm_sysvec_apic_timer_interrupt
? _raw_spin_unlock_irqrestore
? __sysvec_apic_timer_interrupt
? sysvec_apic_timer_interrupt
accept_memory
try_to_accept_memory
do_huge_pmd_anonymous_page
get_page_from_freelist
__handle_mm_fault
__alloc_pages
__folio_alloc
? __tdx_hypercall
handle_mm_fault
vma_alloc_folio
do_user_addr_fault
do_huge_pmd_anonymous_page
exc_page_fault
? __do_huge_pmd_anonymous_page
asm_exc_page_fault
__handle_mm_fault
When the local irq is enabled at the end of accept_memory(), the
softlockup detects that the watchdog on single CPU has not been fed for
a while. That is to say, even other CPUs will not be blocked by
spinlock, the current CPU might be stunk with local irq disabled for a
while, which hurts not only nmi watchdog but also softlockup.
Chao Gao pointed out that the memory accept could be time costly and
there was similar report before. Thus to avoid any softlocup detection
during this stage, give the softlockup a flag to skip the timeout check
at the end of accept_memory(), by invoking touch_softlockup_watchdog(). |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Check bloom filter map value size
This patch adds a missing check to bloom filter creating, rejecting
values above KMALLOC_MAX_SIZE. This brings the bloom map in line with
many other map types.
The lack of this protection can cause kernel crashes for value sizes
that overflow int's. Such a crash was caught by syzkaller. The next
patch adds more guard-rails at a lower level. |
| In the Linux kernel, the following vulnerability has been resolved:
block: fix overflow in blk_ioctl_discard()
There is no check for overflow of 'start + len' in blk_ioctl_discard().
Hung task occurs if submit an discard ioctl with the following param:
start = 0x80000000000ff000, len = 0x8000000000fff000;
Add the overflow validation now. |
| In the Linux kernel, the following vulnerability has been resolved:
ARM: 9381/1: kasan: clear stale stack poison
We found below OOB crash:
[ 33.452494] ==================================================================
[ 33.453513] BUG: KASAN: stack-out-of-bounds in refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec
[ 33.454660] Write of size 164 at addr c1d03d30 by task swapper/0/0
[ 33.455515]
[ 33.455767] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.1.25-mainline #1
[ 33.456880] Hardware name: Generic DT based system
[ 33.457555] unwind_backtrace from show_stack+0x18/0x1c
[ 33.458326] show_stack from dump_stack_lvl+0x40/0x4c
[ 33.459072] dump_stack_lvl from print_report+0x158/0x4a4
[ 33.459863] print_report from kasan_report+0x9c/0x148
[ 33.460616] kasan_report from kasan_check_range+0x94/0x1a0
[ 33.461424] kasan_check_range from memset+0x20/0x3c
[ 33.462157] memset from refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec
[ 33.463064] refresh_cpu_vm_stats.constprop.0 from tick_nohz_idle_stop_tick+0x180/0x53c
[ 33.464181] tick_nohz_idle_stop_tick from do_idle+0x264/0x354
[ 33.465029] do_idle from cpu_startup_entry+0x20/0x24
[ 33.465769] cpu_startup_entry from rest_init+0xf0/0xf4
[ 33.466528] rest_init from arch_post_acpi_subsys_init+0x0/0x18
[ 33.467397]
[ 33.467644] The buggy address belongs to stack of task swapper/0/0
[ 33.468493] and is located at offset 112 in frame:
[ 33.469172] refresh_cpu_vm_stats.constprop.0+0x0/0x2ec
[ 33.469917]
[ 33.470165] This frame has 2 objects:
[ 33.470696] [32, 76) 'global_zone_diff'
[ 33.470729] [112, 276) 'global_node_diff'
[ 33.471294]
[ 33.472095] The buggy address belongs to the physical page:
[ 33.472862] page:3cd72da8 refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x41d03
[ 33.473944] flags: 0x1000(reserved|zone=0)
[ 33.474565] raw: 00001000 ed741470 ed741470 00000000 00000000 00000000 ffffffff 00000001
[ 33.475656] raw: 00000000
[ 33.476050] page dumped because: kasan: bad access detected
[ 33.476816]
[ 33.477061] Memory state around the buggy address:
[ 33.477732] c1d03c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 33.478630] c1d03c80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
[ 33.479526] >c1d03d00: 00 04 f2 f2 f2 f2 00 00 00 00 00 00 f1 f1 f1 f1
[ 33.480415] ^
[ 33.481195] c1d03d80: 00 00 00 00 00 00 00 00 00 00 04 f3 f3 f3 f3 f3
[ 33.482088] c1d03e00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[ 33.482978] ==================================================================
We find the root cause of this OOB is that arm does not clear stale stack
poison in the case of cpuidle.
This patch refer to arch/arm64/kernel/sleep.S to resolve this issue.
From cited commit [1] that explain the problem
Functions which the compiler has instrumented for KASAN place poison on
the stack shadow upon entry and remove this poison prior to returning.
In the case of cpuidle, CPUs exit the kernel a number of levels deep in
C code. Any instrumented functions on this critical path will leave
portions of the stack shadow poisoned.
If CPUs lose context and return to the kernel via a cold path, we
restore a prior context saved in __cpu_suspend_enter are forgotten, and
we never remove the poison they placed in the stack shadow area by
functions calls between this and the actual exit of the kernel.
Thus, (depending on stackframe layout) subsequent calls to instrumented
functions may hit this stale poison, resulting in (spurious) KASAN
splats to the console.
To avoid this, clear any stale poison from the idle thread for a CPU
prior to bringing a CPU online.
From cited commit [2]
Extend to check for CONFIG_KASAN_STACK
[1] commit 0d97e6d8024c ("arm64: kasan: clear stale stack poison")
[2] commit d56a9ef84bd0 ("kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK") |
| IBM Security Verify Information Queue 10.0.5, 10.0.6, 10.0.7, and 10.0.8 could allow a remote user to cause a denial of service due to improper handling of special characters that could lead to uncontrolled resource consumption. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/mediatek: Init `ddp_comp` with devm_kcalloc()
In the case where `conn_routes` is true we allocate an extra slot in
the `ddp_comp` array but mtk_drm_crtc_create() never seemed to
initialize it in the test case I ran. For me, this caused a later
crash when we looped through the array in mtk_drm_crtc_mode_valid().
This showed up for me when I booted with `slub_debug=FZPUA` which
poisons the memory initially. Without `slub_debug` I couldn't
reproduce, presumably because the later code handles the value being
NULL and in most cases (not guaranteed in all cases) the memory the
allocator returned started out as 0.
It really doesn't hurt to initialize the array with devm_kcalloc()
since the array is small and the overhead of initting a handful of
elements to 0 is small. In general initting memory to zero is a safer
practice and usually it's suggested to only use the non-initting alloc
functions if you really need to.
Let's switch the function to use an allocation function that zeros the
memory. For me, this avoids the crash. |
| IBM Security Verify Information Queue 10.0.5, 10.0.6, 10.0.7, and 10.0.8
uses weaker than expected cryptographic algorithms that could allow an attacker to decrypt highly sensitive information. |
| In the Linux kernel, the following vulnerability has been resolved:
r8169: Fix possible ring buffer corruption on fragmented Tx packets.
An issue was found on the RTL8125b when transmitting small fragmented
packets, whereby invalid entries were inserted into the transmit ring
buffer, subsequently leading to calls to dma_unmap_single() with a null
address.
This was caused by rtl8169_start_xmit() not noticing changes to nr_frags
which may occur when small packets are padded (to work around hardware
quirks) in rtl8169_tso_csum_v2().
To fix this, postpone inspecting nr_frags until after any padding has been
applied. |
| In the Linux kernel, the following vulnerability has been resolved:
tools/nolibc/stdlib: fix memory error in realloc()
Pass user_p_len to memcpy() instead of heap->len to prevent realloc()
from copying an extra sizeof(heap) bytes from beyond the allocated
region. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: ath12k: fix out-of-bound access of qmi_invoke_handler()
Currently, there is no terminator entry for ath12k_qmi_msg_handlers hence
facing below KASAN warning,
==================================================================
BUG: KASAN: global-out-of-bounds in qmi_invoke_handler+0xa4/0x148
Read of size 8 at addr ffffffd00a6428d8 by task kworker/u8:2/1273
CPU: 0 PID: 1273 Comm: kworker/u8:2 Not tainted 5.4.213 #0
Workqueue: qmi_msg_handler qmi_data_ready_work
Call trace:
dump_backtrace+0x0/0x20c
show_stack+0x14/0x1c
dump_stack+0xe0/0x138
print_address_description.isra.5+0x30/0x330
__kasan_report+0x16c/0x1bc
kasan_report+0xc/0x14
__asan_load8+0xa8/0xb0
qmi_invoke_handler+0xa4/0x148
qmi_handle_message+0x18c/0x1bc
qmi_data_ready_work+0x4ec/0x528
process_one_work+0x2c0/0x440
worker_thread+0x324/0x4b8
kthread+0x210/0x228
ret_from_fork+0x10/0x18
The address belongs to the variable:
ath12k_mac_mon_status_filter_default+0x4bd8/0xfffffffffffe2300 [ath12k]
[...]
==================================================================
Add a dummy terminator entry at the end to assist the qmi_invoke_handler()
in traversing up to the terminator entry without accessing an
out-of-boundary index.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix verifier assumptions about socket->sk
The verifier assumes that 'sk' field in 'struct socket' is valid
and non-NULL when 'socket' pointer itself is trusted and non-NULL.
That may not be the case when socket was just created and
passed to LSM socket_accept hook.
Fix this verifier assumption and adjust tests. |
| In the Linux kernel, the following vulnerability has been resolved:
tracefs: Reset permissions on remount if permissions are options
There's an inconsistency with the way permissions are handled in tracefs.
Because the permissions are generated when accessed, they default to the
root inode's permission if they were never set by the user. If the user
sets the permissions, then a flag is set and the permissions are saved via
the inode (for tracefs files) or an internal attribute field (for
eventfs).
But if a remount happens that specify the permissions, all the files that
were not changed by the user gets updated, but the ones that were are not.
If the user were to remount the file system with a given permission, then
all files and directories within that file system should be updated.
This can cause security issues if a file's permission was updated but the
admin forgot about it. They could incorrectly think that remounting with
permissions set would update all files, but miss some.
For example:
# cd /sys/kernel/tracing
# chgrp 1002 current_tracer
# ls -l
[..]
-rw-r----- 1 root root 0 May 1 21:25 buffer_size_kb
-rw-r----- 1 root root 0 May 1 21:25 buffer_subbuf_size_kb
-r--r----- 1 root root 0 May 1 21:25 buffer_total_size_kb
-rw-r----- 1 root lkp 0 May 1 21:25 current_tracer
-rw-r----- 1 root root 0 May 1 21:25 dynamic_events
-r--r----- 1 root root 0 May 1 21:25 dyn_ftrace_total_info
-r--r----- 1 root root 0 May 1 21:25 enabled_functions
Where current_tracer now has group "lkp".
# mount -o remount,gid=1001 .
# ls -l
-rw-r----- 1 root tracing 0 May 1 21:25 buffer_size_kb
-rw-r----- 1 root tracing 0 May 1 21:25 buffer_subbuf_size_kb
-r--r----- 1 root tracing 0 May 1 21:25 buffer_total_size_kb
-rw-r----- 1 root lkp 0 May 1 21:25 current_tracer
-rw-r----- 1 root tracing 0 May 1 21:25 dynamic_events
-r--r----- 1 root tracing 0 May 1 21:25 dyn_ftrace_total_info
-r--r----- 1 root tracing 0 May 1 21:25 enabled_functions
Everything changed but the "current_tracer".
Add a new link list that keeps track of all the tracefs_inodes which has
the permission flags that tell if the file/dir should use the root inode's
permission or not. Then on remount, clear all the flags so that the
default behavior of using the root inode's permission is done for all
files and directories. |
| In the Linux kernel, the following vulnerability has been resolved:
rpmsg: char: Fix race between the release of rpmsg_ctrldev and cdev
struct rpmsg_ctrldev contains a struct cdev. The current code frees
the rpmsg_ctrldev struct in rpmsg_ctrldev_release_device(), but the
cdev is a managed object, therefore its release is not predictable
and the rpmsg_ctrldev could be freed before the cdev is entirely
released, as in the backtrace below.
[ 93.625603] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x7c
[ 93.636115] WARNING: CPU: 0 PID: 12 at lib/debugobjects.c:488 debug_print_object+0x13c/0x1b0
[ 93.644799] Modules linked in: veth xt_cgroup xt_MASQUERADE rfcomm algif_hash algif_skcipher af_alg uinput ip6table_nat fuse uvcvideo videobuf2_vmalloc venus_enc venus_dec videobuf2_dma_contig hci_uart btandroid btqca snd_soc_rt5682_i2c bluetooth qcom_spmi_temp_alarm snd_soc_rt5682v
[ 93.715175] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G B 5.4.163-lockdep #26
[ 93.723855] Hardware name: Google Lazor (rev3 - 8) with LTE (DT)
[ 93.730055] Workqueue: events kobject_delayed_cleanup
[ 93.735271] pstate: 60c00009 (nZCv daif +PAN +UAO)
[ 93.740216] pc : debug_print_object+0x13c/0x1b0
[ 93.744890] lr : debug_print_object+0x13c/0x1b0
[ 93.749555] sp : ffffffacf5bc7940
[ 93.752978] x29: ffffffacf5bc7940 x28: dfffffd000000000
[ 93.758448] x27: ffffffacdb11a800 x26: dfffffd000000000
[ 93.763916] x25: ffffffd0734f856c x24: dfffffd000000000
[ 93.769389] x23: 0000000000000000 x22: ffffffd0733c35b0
[ 93.774860] x21: ffffffd0751994a0 x20: ffffffd075ec27c0
[ 93.780338] x19: ffffffd075199100 x18: 00000000000276e0
[ 93.785814] x17: 0000000000000000 x16: dfffffd000000000
[ 93.791291] x15: ffffffffffffffff x14: 6e6968207473696c
[ 93.796768] x13: 0000000000000000 x12: ffffffd075e2b000
[ 93.802244] x11: 0000000000000001 x10: 0000000000000000
[ 93.807723] x9 : d13400dff1921900 x8 : d13400dff1921900
[ 93.813200] x7 : 0000000000000000 x6 : 0000000000000000
[ 93.818676] x5 : 0000000000000080 x4 : 0000000000000000
[ 93.824152] x3 : ffffffd0732a0fa4 x2 : 0000000000000001
[ 93.829628] x1 : ffffffacf5bc7580 x0 : 0000000000000061
[ 93.835104] Call trace:
[ 93.837644] debug_print_object+0x13c/0x1b0
[ 93.841963] __debug_check_no_obj_freed+0x25c/0x3c0
[ 93.846987] debug_check_no_obj_freed+0x18/0x20
[ 93.851669] slab_free_freelist_hook+0xbc/0x1e4
[ 93.856346] kfree+0xfc/0x2f4
[ 93.859416] rpmsg_ctrldev_release_device+0x78/0xb8
[ 93.864445] device_release+0x84/0x168
[ 93.868310] kobject_cleanup+0x12c/0x298
[ 93.872356] kobject_delayed_cleanup+0x10/0x18
[ 93.876948] process_one_work+0x578/0x92c
[ 93.881086] worker_thread+0x804/0xcf8
[ 93.884963] kthread+0x2a8/0x314
[ 93.888303] ret_from_fork+0x10/0x18
The cdev_device_add/del() API was created to address this issue (see
commit '233ed09d7fda ("chardev: add helper function to register char
devs with a struct device")'), use it instead of cdev add/del(). |
| In the Linux kernel, the following vulnerability has been resolved:
net: fix information leakage in /proc/net/ptype
In one net namespace, after creating a packet socket without binding
it to a device, users in other net namespaces can observe the new
`packet_type` added by this packet socket by reading `/proc/net/ptype`
file. This is minor information leakage as packet socket is
namespace aware.
Add a net pointer in `packet_type` to keep the net namespace of
of corresponding packet socket. In `ptype_seq_show`, this net pointer
must be checked when it is not NULL. |
| In the Linux kernel, the following vulnerability has been resolved:
net: ieee802154: ca8210: Stop leaking skb's
Upon error the ieee802154_xmit_complete() helper is not called. Only
ieee802154_wake_queue() is called manually. We then leak the skb
structure.
Free the skb structure upon error before returning. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
After commit 2fd3fb0be1d1 ("kasan, vmalloc: unpoison VM_ALLOC pages
after mapping"), non-VM_ALLOC mappings will be marked as accessible
in __get_vm_area_node() when KASAN is enabled. But now the flag for
ringbuf area is VM_ALLOC, so KASAN will complain out-of-bound access
after vmap() returns. Because the ringbuf area is created by mapping
allocated pages, so use VM_MAP instead.
After the change, info in /proc/vmallocinfo also changes from
[start]-[end] 24576 ringbuf_map_alloc+0x171/0x290 vmalloc user
to
[start]-[end] 24576 ringbuf_map_alloc+0x171/0x290 vmap user |
| In the Linux kernel, the following vulnerability has been resolved:
perf/x86/intel/pt: Fix crash with stop filters in single-range mode
Add a check for !buf->single before calling pt_buffer_region_size in a
place where a missing check can cause a kernel crash.
Fixes a bug introduced by commit 670638477aed ("perf/x86/intel/pt:
Opportunistically use single range output mode"), which added a
support for PT single-range output mode. Since that commit if a PT
stop filter range is hit while tracing, the kernel will crash because
of a null pointer dereference in pt_handle_status due to calling
pt_buffer_region_size without a ToPA configured.
The commit which introduced single-range mode guarded almost all uses of
the ToPA buffer variables with checks of the buf->single variable, but
missed the case where tracing was stopped by the PT hardware, which
happens when execution hits a configured stop filter.
Tested that hitting a stop filter while PT recording successfully
records a trace with this patch but crashes without this patch. |
| In the Linux kernel, the following vulnerability has been resolved:
tipc: improve size validations for received domain records
The function tipc_mon_rcv() allows a node to receive and process
domain_record structs from peer nodes to track their views of the
network topology.
This patch verifies that the number of members in a received domain
record does not exceed the limit defined by MAX_MON_DOMAIN, something
that may otherwise lead to a stack overflow.
tipc_mon_rcv() is called from the function tipc_link_proto_rcv(), where
we are reading a 32 bit message data length field into a uint16. To
avert any risk of bit overflow, we add an extra sanity check for this in
that function. We cannot see that happen with the current code, but
future designers being unaware of this risk, may introduce it by
allowing delivery of very large (> 64k) sk buffers from the bearer
layer. This potential problem was identified by Eric Dumazet.
This fixes CVE-2022-0435 |
| In the Linux kernel, the following vulnerability has been resolved:
ARM: 9170/1: fix panic when kasan and kprobe are enabled
arm32 uses software to simulate the instruction replaced
by kprobe. some instructions may be simulated by constructing
assembly functions. therefore, before executing instruction
simulation, it is necessary to construct assembly function
execution environment in C language through binding registers.
after kasan is enabled, the register binding relationship will
be destroyed, resulting in instruction simulation errors and
causing kernel panic.
the kprobe emulate instruction function is distributed in three
files: actions-common.c actions-arm.c actions-thumb.c, so disable
KASAN when compiling these files.
for example, use kprobe insert on cap_capable+20 after kasan
enabled, the cap_capable assembly code is as follows:
<cap_capable>:
e92d47f0 push {r4, r5, r6, r7, r8, r9, sl, lr}
e1a05000 mov r5, r0
e280006c add r0, r0, #108 ; 0x6c
e1a04001 mov r4, r1
e1a06002 mov r6, r2
e59fa090 ldr sl, [pc, #144] ;
ebfc7bf8 bl c03aa4b4 <__asan_load4>
e595706c ldr r7, [r5, #108] ; 0x6c
e2859014 add r9, r5, #20
......
The emulate_ldr assembly code after enabling kasan is as follows:
c06f1384 <emulate_ldr>:
e92d47f0 push {r4, r5, r6, r7, r8, r9, sl, lr}
e282803c add r8, r2, #60 ; 0x3c
e1a05000 mov r5, r0
e7e37855 ubfx r7, r5, #16, #4
e1a00008 mov r0, r8
e1a09001 mov r9, r1
e1a04002 mov r4, r2
ebf35462 bl c03c6530 <__asan_load4>
e357000f cmp r7, #15
e7e36655 ubfx r6, r5, #12, #4
e205a00f and sl, r5, #15
0a000001 beq c06f13bc <emulate_ldr+0x38>
e0840107 add r0, r4, r7, lsl #2
ebf3545c bl c03c6530 <__asan_load4>
e084010a add r0, r4, sl, lsl #2
ebf3545a bl c03c6530 <__asan_load4>
e2890010 add r0, r9, #16
ebf35458 bl c03c6530 <__asan_load4>
e5990010 ldr r0, [r9, #16]
e12fff30 blx r0
e356000f cm r6, #15
1a000014 bne c06f1430 <emulate_ldr+0xac>
e1a06000 mov r6, r0
e2840040 add r0, r4, #64 ; 0x40
......
when running in emulate_ldr to simulate the ldr instruction, panic
occurred, and the log is as follows:
Unable to handle kernel NULL pointer dereference at virtual address
00000090
pgd = ecb46400
[00000090] *pgd=2e0fa003, *pmd=00000000
Internal error: Oops: 206 [#1] SMP ARM
PC is at cap_capable+0x14/0xb0
LR is at emulate_ldr+0x50/0xc0
psr: 600d0293 sp : ecd63af8 ip : 00000004 fp : c0a7c30c
r10: 00000000 r9 : c30897f4 r8 : ecd63cd4
r7 : 0000000f r6 : 0000000a r5 : e59fa090 r4 : ecd63c98
r3 : c06ae294 r2 : 00000000 r1 : b7611300 r0 : bf4ec008
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 32c5387d Table: 2d546400 DAC: 55555555
Process bash (pid: 1643, stack limit = 0xecd60190)
(cap_capable) from (kprobe_handler+0x218/0x340)
(kprobe_handler) from (kprobe_trap_handler+0x24/0x48)
(kprobe_trap_handler) from (do_undefinstr+0x13c/0x364)
(do_undefinstr) from (__und_svc_finish+0x0/0x30)
(__und_svc_finish) from (cap_capable+0x18/0xb0)
(cap_capable) from (cap_vm_enough_memory+0x38/0x48)
(cap_vm_enough_memory) from
(security_vm_enough_memory_mm+0x48/0x6c)
(security_vm_enough_memory_mm) from
(copy_process.constprop.5+0x16b4/0x25c8)
(copy_process.constprop.5) from (_do_fork+0xe8/0x55c)
(_do_fork) from (SyS_clone+0x1c/0x24)
(SyS_clone) from (__sys_trace_return+0x0/0x10)
Code: 0050a0e1 6c0080e2 0140a0e1 0260a0e1 (f801f0e7) |
| In the Linux kernel, the following vulnerability has been resolved:
m68k: Fix spinlock race in kernel thread creation
Context switching does take care to retain the correct lock owner across
the switch from 'prev' to 'next' tasks. This does rely on interrupts
remaining disabled for the entire duration of the switch.
This condition is guaranteed for normal process creation and context
switching between already running processes, because both 'prev' and
'next' already have interrupts disabled in their saved copies of the
status register.
The situation is different for newly created kernel threads. The status
register is set to PS_S in copy_thread(), which does leave the IPL at 0.
Upon restoring the 'next' thread's status register in switch_to() aka
resume(), interrupts then become enabled prematurely. resume() then
returns via ret_from_kernel_thread() and schedule_tail() where run queue
lock is released (see finish_task_switch() and finish_lock_switch()).
A timer interrupt calling scheduler_tick() before the lock is released
in finish_task_switch() will find the lock already taken, with the
current task as lock owner. This causes a spinlock recursion warning as
reported by Guenter Roeck.
As far as I can ascertain, this race has been opened in commit
533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()")
but I haven't done a detailed study of kernel history so it may well
predate that commit.
Interrupts cannot be disabled in the saved status register copy for
kernel threads (init will complain about interrupts disabled when
finally starting user space). Disable interrupts temporarily when
switching the tasks' register sets in resume().
Note that a simple oriw 0x700,%sr after restoring sr is not enough here
- this leaves enough of a race for the 'spinlock recursion' warning to
still be observed.
Tested on ARAnyM and qemu (Quadra 800 emulation). |