| In the Linux kernel, the following vulnerability has been resolved:
RDMA/bnxt_re: Prevent handling any completions after qp destroy
HW may generate completions that indicate the QP is destroyed. The
driver should not schedule any more completion handlers for this QP
after the QP is destroyed. Since CQs are active during the QP destroy,
the driver may still schedule completion handlers. This can cause a
race where destroy_cq and poll_cq run simultaneously.
Below is a snippet of a kernel panic seen while doing bnxt_re driver
load/unload in a loop. It indicates a poll after the CQ has been freed.
[77786.481636] Call Trace:
[77786.481640] <TASK>
[77786.481644] bnxt_re_poll_cq+0x14a/0x620 [bnxt_re]
[77786.481658] ? kvm_clock_read+0x14/0x30
[77786.481693] __ib_process_cq+0x57/0x190 [ib_core]
[77786.481728] ib_cq_poll_work+0x26/0x80 [ib_core]
[77786.481761] process_one_work+0x1e5/0x3f0
[77786.481768] worker_thread+0x50/0x3a0
[77786.481785] ? __pfx_worker_thread+0x10/0x10
[77786.481790] kthread+0xe2/0x110
[77786.481794] ? __pfx_kthread+0x10/0x10
[77786.481797] ret_from_fork+0x2c/0x50
To avoid this, complete all completion handlers before returning from the
destroy QP verb. If free_cq is called soon after destroy_qp, the IB stack
will cancel the CQ work before invoking the destroy_cq verb, and
this prevents the race mentioned above. |
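For illustration only (not the actual bnxt_re patch): a minimal sketch of "complete all completion handlers before destroy QP returns", assuming each CQ's handler runs from a hypothetical struct work_struct cq_work; all names below are stand-ins.
```c
#include <linux/workqueue.h>

struct example_cq {                     /* stand-in for the driver's CQ */
	struct work_struct cq_work;     /* completion handler work */
};

struct example_qp {                     /* stand-in for the driver's QP */
	struct example_cq *scq;         /* send CQ */
	struct example_cq *rcq;         /* receive CQ */
};

/* Wait for any in-flight completion handlers on both CQs before the
 * destroy-QP verb returns, so nothing can poll on behalf of a dead QP. */
static void example_destroy_qp_quiesce(struct example_qp *qp)
{
	flush_work(&qp->scq->cq_work);
	if (qp->rcq != qp->scq)
		flush_work(&qp->rcq->cq_work);
}
```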
| In the Linux kernel, the following vulnerability has been resolved:
igb: clean up in all error paths when enabling SR-IOV
After commit 50f303496d92 ("igb: Enable SR-IOV after reinit"), removing
the igb module could hang or crash (depending on the machine) when the
module has been loaded with the max_vfs parameter set to some value != 0.
On one test machine with a dual-port 82580, this hang occurred:
[ 232.480687] igb 0000:41:00.1: removed PHC on enp65s0f1
[ 233.093257] igb 0000:41:00.1: IOV Disabled
[ 233.329969] pcieport 0000:40:01.0: AER: Multiple Uncorrected (Non-Fatal) err0
[ 233.340302] igb 0000:41:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[ 233.352248] igb 0000:41:00.0: device [8086:1516] error status/mask=00100000
[ 233.361088] igb 0000:41:00.0: [20] UnsupReq (First)
[ 233.368183] igb 0000:41:00.0: AER: TLP Header: 40000001 0000040f cdbfc00c c
[ 233.376846] igb 0000:41:00.1: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[ 233.388779] igb 0000:41:00.1: device [8086:1516] error status/mask=00100000
[ 233.397629] igb 0000:41:00.1: [20] UnsupReq (First)
[ 233.404736] igb 0000:41:00.1: AER: TLP Header: 40000001 0000040f cdbfc00c c
[ 233.538214] pci 0000:41:00.1: AER: can't recover (no error_detected callback)
[ 233.538401] igb 0000:41:00.0: removed PHC on enp65s0f0
[ 233.546197] pcieport 0000:40:01.0: AER: device recovery failed
[ 234.157244] igb 0000:41:00.0: IOV Disabled
[ 371.619705] INFO: task irq/35-aerdrv:257 blocked for more than 122 seconds.
[ 371.627489] Not tainted 6.4.0-dirty #2
[ 371.632257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this.
[ 371.641000] task:irq/35-aerdrv state:D stack:0 pid:257 ppid:2 f0
[ 371.650330] Call Trace:
[ 371.653061] <TASK>
[ 371.655407] __schedule+0x20e/0x660
[ 371.659313] schedule+0x5a/0xd0
[ 371.662824] schedule_preempt_disabled+0x11/0x20
[ 371.667983] __mutex_lock.constprop.0+0x372/0x6c0
[ 371.673237] ? __pfx_aer_root_reset+0x10/0x10
[ 371.678105] report_error_detected+0x25/0x1c0
[ 371.682974] ? __pfx_report_normal_detected+0x10/0x10
[ 371.688618] pci_walk_bus+0x72/0x90
[ 371.692519] pcie_do_recovery+0xb2/0x330
[ 371.696899] aer_process_err_devices+0x117/0x170
[ 371.702055] aer_isr+0x1c0/0x1e0
[ 371.705661] ? __set_cpus_allowed_ptr+0x54/0xa0
[ 371.710723] ? __pfx_irq_thread_fn+0x10/0x10
[ 371.715496] irq_thread_fn+0x20/0x60
[ 371.719491] irq_thread+0xe6/0x1b0
[ 371.723291] ? __pfx_irq_thread_dtor+0x10/0x10
[ 371.728255] ? __pfx_irq_thread+0x10/0x10
[ 371.732731] kthread+0xe2/0x110
[ 371.736243] ? __pfx_kthread+0x10/0x10
[ 371.740430] ret_from_fork+0x2c/0x50
[ 371.744428] </TASK>
The reproducer was a simple script:
#!/bin/sh
for i in `seq 1 5`; do
modprobe -rv igb
modprobe -v igb max_vfs=1
sleep 1
modprobe -rv igb
done
It turned out that this could only be reproduced on 82580 (quad and
dual-port), but not on 82576, i350 and i210. Further debugging showed
that igb_enable_sriov()'s call to pci_enable_sriov() is failing, because
dev->is_physfn is 0 on 82580.
Prior to commit 50f303496d92 ("igb: Enable SR-IOV after reinit"),
igb_enable_sriov() jumped into the "err_out" cleanup branch. After this
commit it only returned the error code.
So the cleanup didn't take place, and the incorrect VF setup in the
igb_adapter structure fooled the igb driver into assuming that VFs have
been set up where no VF actually existed.
Fix this problem by cleaning up again if pci_enable_sriov() fails. |
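A hedged sketch of the cleanup pattern described; struct example_adapter and its fields are simplified stand-ins for struct igb_adapter, not the exact driver code.
```c
#include <linux/pci.h>
#include <linux/slab.h>

struct example_adapter {                /* stand-in for struct igb_adapter */
	void *vf_data;
	unsigned int vfs_allocated_count;
};

/* If pci_enable_sriov() fails, undo the VF bookkeeping instead of only
 * returning the error and leaving state that makes the driver believe
 * VFs exist. */
static int example_enable_sriov(struct pci_dev *pdev,
				struct example_adapter *adapter, int num_vfs)
{
	int err = pci_enable_sriov(pdev, num_vfs);

	if (err) {
		kfree(adapter->vf_data);
		adapter->vf_data = NULL;
		adapter->vfs_allocated_count = 0;
	}
	return err;
}
```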
| In the Linux kernel, the following vulnerability has been resolved:
media: rcar_fdp1: Fix refcount leak in probe and remove function
rcar_fcp_get() takes a reference, which should be balanced with
rcar_fcp_put(). Add the missing rcar_fcp_put() in fdp1_remove() and in
the error paths of fdp1_probe() to fix this.
[hverkuil: resolve merge conflict, remove() is now void] |
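A hedged sketch of the balancing pattern; struct example_fdp1 is a stand-in for the driver's device structure.
```c
#include <media/rcar-fcp.h>

struct example_fdp1 {                   /* stand-in for the fdp1 device struct */
	struct rcar_fcp_device *fcp;
};

/* The reference taken by rcar_fcp_get() must be dropped both in remove()
 * and in every probe() error path reached after the get. */
static void example_fdp1_remove(struct example_fdp1 *fdp1)
{
	rcar_fcp_put(fdp1->fcp);        /* the previously missing put */
}
```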
| In the Linux kernel, the following vulnerability has been resolved:
perf/x86: Fix NULL event access and potential PEBS record loss
When intel_pmu_drain_pebs_icl() is called to drain PEBS records,
perf_event_overflow() may be called to process the last PEBS record.
However, perf_event_overflow() can trigger interrupt throttling and
stop all events of the group, as the call chain below shows.
perf_event_overflow()
-> __perf_event_overflow()
->__perf_event_account_interrupt()
-> perf_event_throttle_group()
-> perf_event_throttle()
-> event->pmu->stop()
-> x86_pmu_stop()
The side effect of stopping the events is that all corresponding event
pointers in the cpuc->events[] array are cleared to NULL.
Assume there are two PEBS events (event a and event b) in a group. When
intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the
last PEBS record of PEBS event a, interrupt throttle is triggered and
all pointers of event a and event b are cleared to NULL. Then
intel_pmu_drain_pebs_icl() tries to process the last PEBS record of
event b and encounters NULL pointer access.
To avoid this issue, move the cpuc->events[] clearing from x86_pmu_stop()
to x86_pmu_del(). This is safe since cpuc->active_mask or
cpuc->pebs_enabled is always checked before accessing the event pointer
in cpuc->events[]. |
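A hedged sketch of the relocation described; the real x86_pmu_stop()/x86_pmu_del() do considerably more, and struct example_cpu_hw_events only stands in for struct cpu_hw_events.
```c
struct perf_event;                      /* opaque here */

struct example_cpu_hw_events {          /* stand-in for struct cpu_hw_events */
	struct perf_event *events[64];
};

/* stop(): the hardware counter is stopped, but the event pointer is no
 * longer cleared here, because the PEBS drain path may still need it. */
static void example_pmu_stop(struct example_cpu_hw_events *cpuc, int idx)
{
	/* ... stop the counter ... */
	/* cpuc->events[idx] = NULL;   <-- removed from stop() */
}

/* del(): clear the pointer only when the event is deleted from the PMU;
 * cpuc->active_mask / cpuc->pebs_enabled are checked before any access,
 * so keeping the stale pointer until here is safe. */
static void example_pmu_del(struct example_cpu_hw_events *cpuc, int idx)
{
	cpuc->events[idx] = NULL;
}
```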
| In the Linux kernel, the following vulnerability has been resolved:
md: init bioset in mddev_init
IO operations may be needed before md_run(), such as updating metadata
after a sysfs write. Without the bioset, this triggers a NULL pointer
dereference as shown below:
BUG: kernel NULL pointer dereference, address: 0000000000000020
Call Trace:
md_update_sb+0x658/0xe00
new_level_store+0xc5/0x120
md_attr_store+0xc9/0x1e0
sysfs_kf_write+0x6f/0xa0
kernfs_fop_write_iter+0x141/0x2a0
vfs_write+0x1fc/0x5a0
ksys_write+0x79/0x180
__x64_sys_write+0x1d/0x30
x64_sys_call+0x2818/0x2880
do_syscall_64+0xa9/0x580
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Reproducer
```
mdadm -CR /dev/md0 -l1 -n2 /dev/sd[cd]
echo inactive > /sys/block/md0/md/array_state
echo 10 > /sys/block/md0/md/new_level
```
mddev_init() can only be called once per mddev, so there is no longer any
need to test whether the bioset has been initialized. |
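A hedged sketch of setting up the bioset at mddev init time so it exists before md_run(); bioset_init(), BIO_POOL_SIZE and BIOSET_NEED_BVECS are the regular block-layer API, while struct example_mddev is a simplified stand-in.
```c
#include <linux/bio.h>

struct example_mddev {                  /* stand-in for struct mddev */
	struct bio_set bio_set;
};

/* Initialize the bioset when the mddev is created rather than in md_run(),
 * so metadata writes triggered via sysfs before md_run() have a bioset. */
static int example_mddev_init(struct example_mddev *mddev)
{
	return bioset_init(&mddev->bio_set, BIO_POOL_SIZE, 0,
			   BIOSET_NEED_BVECS);
}
```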
| In the Linux kernel, the following vulnerability has been resolved:
media: camss: Clean up received buffers on failed start of streaming
The received buffers must be returned if streaming cannot be started. For
instance, media_pipeline_start() may fail with EPIPE if link validation
between entities does not pass, and in such a case the user gets a kernel
warning:
WARNING: CPU: 1 PID: 520 at drivers/media/common/videobuf2/videobuf2-core.c:1592 vb2_start_streaming+0xec/0x160
<snip>
Call trace:
vb2_start_streaming+0xec/0x160
vb2_core_streamon+0x9c/0x1a0
vb2_ioctl_streamon+0x68/0xbc
v4l_streamon+0x30/0x3c
__video_do_ioctl+0x184/0x3e0
video_usercopy+0x37c/0x7b0
video_ioctl2+0x24/0x40
v4l2_ioctl+0x4c/0x70
The fix is to correct the error path in video_start_streaming() of camss. |
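A hedged sketch of the error-path pattern: give back every buffer already handed to the driver when start_streaming fails, so vb2 does not warn. The list and lock names are assumptions, not the exact camss fields.
```c
#include <linux/list.h>
#include <linux/spinlock.h>
#include <media/videobuf2-v4l2.h>

struct example_buffer {                 /* stand-in buffer wrapper */
	struct vb2_v4l2_buffer vb;
	struct list_head queue;
};

/* On a failed start (e.g. media_pipeline_start() returning -EPIPE), return
 * all received buffers to vb2; otherwise vb2_start_streaming() warns about
 * buffers the driver still owns. */
static void example_flush_buffers(struct list_head *pending, spinlock_t *lock)
{
	struct example_buffer *buf, *tmp;
	unsigned long flags;

	spin_lock_irqsave(lock, flags);
	list_for_each_entry_safe(buf, tmp, pending, queue) {
		list_del(&buf->queue);
		vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_QUEUED);
	}
	spin_unlock_irqrestore(lock, flags);
}
```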
| In the Linux kernel, the following vulnerability has been resolved:
isdn: mISDN: hfcsusb: fix memory leak in hfcsusb_probe()
In hfcsusb_probe(), the memory allocated for ctrl_urb gets leaked when
setup_instance() fails with an error code. Fix that by freeing the urb
before freeing the hw structure. Also change the error paths to use the
goto ladder style.
Compile tested only. Issue found using a prototype static analysis tool. |
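A hedged sketch of the goto-ladder style described; the structure, labels and example_setup_instance() are illustrative stand-ins.
```c
#include <linux/slab.h>
#include <linux/usb.h>

struct example_hw {                     /* stand-in for the hfcsusb hw struct */
	struct urb *ctrl_urb;
};

static int example_setup_instance(struct example_hw *hw)
{
	return 0;                       /* placeholder for setup_instance() */
}

static int example_probe(void)
{
	struct example_hw *hw;
	int err;

	hw = kzalloc(sizeof(*hw), GFP_KERNEL);
	if (!hw)
		return -ENOMEM;

	hw->ctrl_urb = usb_alloc_urb(0, GFP_KERNEL);
	if (!hw->ctrl_urb) {
		err = -ENOMEM;
		goto err_free_hw;
	}

	err = example_setup_instance(hw);
	if (err)
		goto err_free_urb;      /* previously the urb leaked here */

	return 0;

err_free_urb:
	usb_free_urb(hw->ctrl_urb);
err_free_hw:
	kfree(hw);
	return err;
}
```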
| In the Linux kernel, the following vulnerability has been resolved:
fs: drop peer group ids under namespace lock
When cleaning up peer group ids in the failure path we need to make sure
to hold on to the namespace lock. Otherwise another thread might just
turn the mount from a shared into a non-shared mount concurrently. |
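As a general illustration of the locking requirement (every name below is hypothetical, not the real mount code): the peer-group-id cleanup has to run while the lock protecting the shared/non-shared state of the mount is still held.
```c
#include <linux/mutex.h>

struct example_ns    { struct mutex lock; };    /* stands in for the namespace lock */
struct example_mount { int group_id; };

static void example_cleanup_group_ids(struct example_mount *mnt)
{
	mnt->group_id = 0;              /* placeholder for the real cleanup */
}

/* Keep the lock held across the failure-path cleanup so another thread
 * cannot flip the mount between shared and non-shared underneath us. */
static void example_attach_failure_path(struct example_ns *ns,
					struct example_mount *mnt)
{
	mutex_lock(&ns->lock);
	example_cleanup_group_ids(mnt);
	mutex_unlock(&ns->lock);
}
```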
| In the Linux kernel, the following vulnerability has been resolved:
tracing/user_events: Ensure write index cannot be negative
The write index indicates which event the data is for and accesses a
per-file array. The index is passed by user processes during write()
calls as the first 4 bytes. Ensure that it cannot be negative by
returning -EINVAL to prevent out of bounds accesses.
Update ftrace self-test to ensure this occurs properly. |
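A hedged sketch of the check described; the surrounding write() handling is simplified.
```c
#include <linux/uaccess.h>

/* The first 4 bytes of a write() are the event index into a per-file
 * array, so reject negative values before the index is ever used. */
static ssize_t example_parse_index(const char __user *ubuf, size_t count,
				   int *idx)
{
	if (count < sizeof(*idx))
		return -EINVAL;

	if (copy_from_user(idx, ubuf, sizeof(*idx)))
		return -EFAULT;

	if (*idx < 0)                   /* the added bound check */
		return -EINVAL;

	return sizeof(*idx);
}
```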
| In the Linux kernel, the following vulnerability has been resolved:
can: j1939: prevent deadlock by moving j1939_sk_errqueue()
This commit addresses a deadlock situation that can occur in certain
scenarios, such as when running data TP/ETP transfer and subscribing to
the error queue while receiving a net down event. The deadlock involves
locks in the following order:
3
j1939_session_list_lock -> active_session_list_lock
j1939_session_activate
...
j1939_sk_queue_activate_next -> sk_session_queue_lock
...
j1939_xtp_rx_eoma_one
2
j1939_sk_queue_drop_all -> sk_session_queue_lock
...
j1939_sk_netdev_event_netdown -> j1939_socks_lock
j1939_netdev_notify
1
j1939_sk_errqueue -> j1939_socks_lock
__j1939_session_cancel -> active_session_list_lock
j1939_tp_rxtimer
       CPU0                                CPU1
       ----                                ----
  lock(&priv->active_session_list_lock);
                                           lock(&jsk->sk_session_queue_lock);
                                           lock(&priv->active_session_list_lock);
  lock(&priv->j1939_socks_lock);
The solution implemented in this commit is to move the
j1939_sk_errqueue() call out of the active_session_list_lock context,
thus preventing the deadlock situation. |
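A hedged sketch of the general pattern (everything except j1939_sk_errqueue()'s role is a placeholder): decide what must be reported while active_session_list_lock is held, and make the error-queue call only after it is dropped.
```c
#include <linux/spinlock.h>

struct example_priv    { spinlock_t active_session_list_lock; };
struct example_session { int state; };

static bool example_do_cancel(struct example_session *session)
{
	return true;                    /* placeholder for the cancel work */
}

static void example_sk_errqueue(struct example_session *session)
{
	/* placeholder for j1939_sk_errqueue(), which takes j1939_socks_lock */
}

static void example_cancel_session(struct example_priv *priv,
				   struct example_session *session)
{
	bool notify;

	spin_lock_bh(&priv->active_session_list_lock);
	notify = example_do_cancel(session);
	spin_unlock_bh(&priv->active_session_list_lock);

	if (notify)
		example_sk_errqueue(session);   /* now runs without the list lock */
}
```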
| In the Linux kernel, the following vulnerability has been resolved:
drm/msm: fix NULL-deref on irq uninstall
In case of early initialisation errors and on platforms that do not use
the DPU controller, the deinitialisation code can be called with the kms
pointer set to NULL.
Patchwork: https://patchwork.freedesktop.org/patch/525104/ |
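A hedged sketch of the guard described; the function shape is illustrative, not the exact msm_drm code.
```c
struct example_kms;                     /* stand-in for the msm KMS object */

/* On platforms without a DPU/MDP KMS, or after an early init failure, the
 * kms pointer can be NULL, so bail out before dereferencing it. */
static void example_irq_uninstall(struct example_kms *kms)
{
	if (!kms)                       /* the added NULL check */
		return;

	/* ... disable and free the KMS interrupt here ... */
}
```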
| In the Linux kernel, the following vulnerability has been resolved:
serial: sprd: Fix DMA buffer leak issue
Release the DMA buffer when _probe() returns a failure, to avoid a memory leak. |
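A hedged sketch of the error-path pattern; the buffer fields and size are assumptions rather than the actual sprd driver layout.
```c
#include <linux/dma-mapping.h>

/* Anything allocated with dma_alloc_coherent() before the failing step in
 * probe() must be released on the error path. */
static int example_probe_error_path(struct device *dev, void *dma_virt,
				    dma_addr_t dma_phys, size_t len, int err)
{
	if (err && dma_virt)
		dma_free_coherent(dev, len, dma_virt, dma_phys);
	return err;
}
```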
| In the Linux kernel, the following vulnerability has been resolved:
scsi: target: core: Fix target_cmd_counter leak
The target_cmd_counter struct allocated via target_alloc_cmd_counter() is
never freed, resulting in leaks across various transport types, e.g.:
unreferenced object 0xffff88801f920120 (size 96):
comm "sh", pid 102, jiffies 4294892535 (age 713.412s)
hex dump (first 32 bytes):
07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 38 01 92 1f 80 88 ff ff ........8.......
backtrace:
[<00000000e58a6252>] kmalloc_trace+0x11/0x20
[<0000000043af4b2f>] target_alloc_cmd_counter+0x17/0x90 [target_core_mod]
[<000000007da2dfa7>] target_setup_session+0x2d/0x140 [target_core_mod]
[<0000000068feef86>] tcm_loop_tpg_nexus_store+0x19b/0x350 [tcm_loop]
[<000000006a80e021>] configfs_write_iter+0xb1/0x120
[<00000000e9f4d860>] vfs_write+0x2e4/0x3c0
[<000000008143433b>] ksys_write+0x80/0xb0
[<00000000a7df29b2>] do_syscall_64+0x42/0x90
[<0000000053f45fb8>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Free the structure alongside the corresponding iscsit_conn / se_sess
parent. |
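A hedged sketch of the balancing free; struct example_cmd_counter only stands in for the real structure, which may need additional teardown (for example a percpu_ref exit) before it can be freed.
```c
#include <linux/slab.h>

struct example_cmd_counter {            /* stand-in for struct target_cmd_counter */
	int refs;                       /* placeholder contents */
};

/* Release the counter when its owning se_sess / iscsit_conn is torn down,
 * mirroring the allocation done at session setup time. */
static void example_free_cmd_counter(struct example_cmd_counter *counter)
{
	kfree(counter);
}
```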
| In the Linux kernel, the following vulnerability has been resolved:
landlock: Fix handling of disconnected directories
Disconnected files or directories can appear when they are visible and
opened from a bind mount, but have been renamed or moved from the source
of the bind mount in a way that makes them inaccessible from the mount
point (i.e. out of scope).
Previously, access rights tied to files or directories opened through a
disconnected directory were collected by walking the related hierarchy
down to the root of the filesystem, without taking into account the
mount point because it couldn't be found. This could lead to
inconsistent access results, potential access right widening, and
hard-to-debug renames, especially since such paths cannot be printed.
For a sandboxed task to create a disconnected directory, it needs to
have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to
the underlying source of the bind mount, and read access to the related
mount point. Because a sandboxed task cannot acquire more access
rights than those defined by its Landlock domain, this could lead to
inconsistent access rights due to missing permissions that should be
inherited from the mount point hierarchy, while inheriting permissions
from the filesystem hierarchy hidden by this mount point instead.
Landlock now handles files and directories opened from disconnected
directories by taking into account the filesystem hierarchy when the
mount point is not found in the hierarchy walk, and also always taking
into account the mount point from which these disconnected directories
were opened. This ensures that a rename is not allowed if it would
widen access rights [1].
The rationale is that, even if disconnected hierarchies might not be
visible or accessible to a sandboxed task, relying on the collected
access rights from them improves the guarantee that access rights will
not be widened during a rename because of the access right comparison
between the source and the destination (see LANDLOCK_ACCESS_FS_REFER).
It may look like this would grant more access on disconnected files and
directories, but the security policies are always enforced for all the
evaluated hierarchies. This new behavior should be less surprising to
users and safer from an access control perspective.
Remove a wrong WARN_ON_ONCE() canary in collect_domain_accesses() and
fix the related comment.
Because opened files have their access rights stored in the related file
security properties, there is no impact for disconnected or unlinked
files. |
| In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Move representor neigh cleanup to profile cleanup_tx
For IP tunnel encapsulation in ECMP (Equal-Cost Multipath) mode, as
the flow is duplicated to the peer eswitch, the related neighbour
information on the peer uplink representor is created as well.
In the cited commit, eswitch devcom unpair is moved to uplink unload
API, specifically the profile->cleanup_tx. If there is an encap rule
offloaded in ECMP mode, when one eswitch does unpair (because of
unloading the driver, for instance), and the peer rule from the peer
eswitch is going to be deleted, the use-after-free error is triggered
while accessing neigh info, as it is already cleaned up in uplink's
profile->disable, which runs before its profile->cleanup_tx.
To fix this issue, move the neigh cleanup to the profile's cleanup_tx
callback, after mlx5e_cleanup_uplink_rep_tx is called. The neigh
init is moved to init_tx for symmetry.
[ 2453.376299] BUG: KASAN: slab-use-after-free in mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core]
[ 2453.379125] Read of size 4 at addr ffff888127af9008 by task modprobe/2496
[ 2453.381542] CPU: 7 PID: 2496 Comm: modprobe Tainted: G B 6.4.0-rc7+ #15
[ 2453.383386] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 2453.384335] Call Trace:
[ 2453.384625] <TASK>
[ 2453.384891] dump_stack_lvl+0x33/0x50
[ 2453.385285] print_report+0xc2/0x610
[ 2453.385667] ? __virt_addr_valid+0xb1/0x130
[ 2453.386091] ? mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core]
[ 2453.386757] kasan_report+0xae/0xe0
[ 2453.387123] ? mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core]
[ 2453.387798] mlx5e_rep_neigh_entry_release+0x109/0x3a0 [mlx5_core]
[ 2453.388465] mlx5e_rep_encap_entry_detach+0xa6/0xe0 [mlx5_core]
[ 2453.389111] mlx5e_encap_dealloc+0xa7/0x100 [mlx5_core]
[ 2453.389706] mlx5e_tc_tun_encap_dests_unset+0x61/0xb0 [mlx5_core]
[ 2453.390361] mlx5_free_flow_attr_actions+0x11e/0x340 [mlx5_core]
[ 2453.391015] ? complete_all+0x43/0xd0
[ 2453.391398] ? free_flow_post_acts+0x38/0x120 [mlx5_core]
[ 2453.392004] mlx5e_tc_del_fdb_flow+0x4ae/0x690 [mlx5_core]
[ 2453.392618] mlx5e_tc_del_fdb_peers_flow+0x308/0x370 [mlx5_core]
[ 2453.393276] mlx5e_tc_clean_fdb_peer_flows+0xf5/0x140 [mlx5_core]
[ 2453.393925] mlx5_esw_offloads_unpair+0x86/0x540 [mlx5_core]
[ 2453.394546] ? mlx5_esw_offloads_set_ns_peer.isra.0+0x180/0x180 [mlx5_core]
[ 2453.395268] ? down_write+0xaa/0x100
[ 2453.395652] mlx5_esw_offloads_devcom_event+0x203/0x530 [mlx5_core]
[ 2453.396317] mlx5_devcom_send_event+0xbb/0x190 [mlx5_core]
[ 2453.396917] mlx5_esw_offloads_devcom_cleanup+0xb0/0xd0 [mlx5_core]
[ 2453.397582] mlx5e_tc_esw_cleanup+0x42/0x120 [mlx5_core]
[ 2453.398182] mlx5e_rep_tc_cleanup+0x15/0x30 [mlx5_core]
[ 2453.398768] mlx5e_cleanup_rep_tx+0x6c/0x80 [mlx5_core]
[ 2453.399367] mlx5e_detach_netdev+0xee/0x120 [mlx5_core]
[ 2453.399957] mlx5e_netdev_change_profile+0x84/0x170 [mlx5_core]
[ 2453.400598] mlx5e_vport_rep_unload+0xe0/0xf0 [mlx5_core]
[ 2453.403781] mlx5_eswitch_unregister_vport_reps+0x15e/0x190 [mlx5_core]
[ 2453.404479] ? mlx5_eswitch_register_vport_reps+0x200/0x200 [mlx5_core]
[ 2453.405170] ? up_write+0x39/0x60
[ 2453.405529] ? kernfs_remove_by_name_ns+0xb7/0xe0
[ 2453.405985] auxiliary_bus_remove+0x2e/0x40
[ 2453.406405] device_release_driver_internal+0x243/0x2d0
[ 2453.406900] ? kobject_put+0x42/0x2d0
[ 2453.407284] bus_remove_device+0x128/0x1d0
[ 2453.407687] device_del+0x240/0x550
[ 2453.408053] ? waiting_for_supplier_show+0xe0/0xe0
[ 2453.408511] ? kobject_put+0xfa/0x2d0
[ 2453.408889] ? __kmem_cache_free+0x14d/0x280
[ 2453.409310] mlx5_rescan_drivers_locked.part.0+0xcd/0x2b0 [mlx5_core]
[ 2453.409973] mlx5_unregister_device+0x40/0x50 [mlx5_core]
[ 2453.410561] mlx5_uninit_one+0x3d/0x110 [mlx5_core]
[ 2453.411111] remove_one+0x89/0x130 [mlx5_core]
[ 24
---truncated--- |
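A hedged sketch of the reordering described; only mlx5e_cleanup_uplink_rep_tx is taken from the description, the rest of the names are placeholders.
```c
struct example_rep_priv;                /* stand-in for the uplink rep private data */

static void example_cleanup_uplink_rep_tx(struct example_rep_priv *rpriv)
{
	/* placeholder for mlx5e_cleanup_uplink_rep_tx() */
}

static void example_rep_neigh_cleanup(struct example_rep_priv *rpriv)
{
	/* placeholder for the representor neigh cleanup */
}

/* Do the neigh cleanup from the profile's cleanup_tx callback, after the
 * uplink rep TX cleanup, instead of from profile->disable (which runs
 * earlier and left the eswitch unpair path dereferencing freed neigh info). */
static void example_uplink_cleanup_tx(struct example_rep_priv *rpriv)
{
	example_cleanup_uplink_rep_tx(rpriv);
	example_rep_neigh_cleanup(rpriv);       /* moved here from ->disable */
}
```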
| In the Linux kernel, the following vulnerability has been resolved:
vfio/type1: fix cap_migration information leak
Fix an information leak where an uninitialized hole in struct
vfio_iommu_type1_info_cap_migration on the stack is exposed to userspace.
The definition of struct vfio_iommu_type1_info_cap_migration contains a hole as
shown in this pahole(1) output:
struct vfio_iommu_type1_info_cap_migration {
        struct vfio_info_cap_header header;                /*     0     8 */
        __u32                       flags;                 /*     8     4 */

        /* XXX 4 bytes hole, try to pack */

        __u64                       pgsize_bitmap;         /*    16     8 */
        __u64                       max_dirty_bitmap_size; /*    24     8 */

        /* size: 32, cachelines: 1, members: 4 */
        /* sum members: 28, holes: 1, sum holes: 4 */
        /* last cacheline: 32 bytes */
};
The cap_mig variable is filled in without initializing the hole:
static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu,
                                           struct vfio_info_cap *caps)
{
        struct vfio_iommu_type1_info_cap_migration cap_mig;

        cap_mig.header.id = VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION;
        cap_mig.header.version = 1;

        cap_mig.flags = 0;
        /* support minimum pgsize */
        cap_mig.pgsize_bitmap = (size_t)1 << __ffs(iommu->pgsize_bitmap);
        cap_mig.max_dirty_bitmap_size = DIRTY_BITMAP_SIZE_MAX;

        return vfio_info_add_capability(caps, &cap_mig.header, sizeof(cap_mig));
}
The structure is then copied to a temporary location on the heap. At this point
it's already too late and ioctl(VFIO_IOMMU_GET_INFO) copies it to userspace
later:
int vfio_info_add_capability(struct vfio_info_cap *caps,
                             struct vfio_info_cap_header *cap, size_t size)
{
        struct vfio_info_cap_header *header;

        header = vfio_info_cap_add(caps, size, cap->id, cap->version);
        if (IS_ERR(header))
                return PTR_ERR(header);

        memcpy(header + 1, cap + 1, size - sizeof(*header));

        return 0;
}
This issue was found by code inspection. |
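Building on the snippets above, the usual fix for this class of leak is to zero the on-stack structure (padding included) before filling it in; whether the actual patch uses memset() or an empty initializer is not stated here, so treat this as a sketch.
```c
static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu,
					   struct vfio_info_cap *caps)
{
	struct vfio_iommu_type1_info_cap_migration cap_mig;

	/* Zero the whole struct, including the 4-byte hole after 'flags',
	 * so no stack data can reach userspace. */
	memset(&cap_mig, 0, sizeof(cap_mig));

	cap_mig.header.id = VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION;
	cap_mig.header.version = 1;

	cap_mig.flags = 0;
	/* support minimum pgsize */
	cap_mig.pgsize_bitmap = (size_t)1 << __ffs(iommu->pgsize_bitmap);
	cap_mig.max_dirty_bitmap_size = DIRTY_BITMAP_SIZE_MAX;

	return vfio_info_add_capability(caps, &cap_mig.header, sizeof(cap_mig));
}
```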
| In the Linux kernel, the following vulnerability has been resolved:
drm/panthor: Fix UAF on kernel BO VA nodes
If the MMU is down, panthor_vm_unmap_range() might return an error.
We expect the page table to be updated still, and if the MMU is blocked,
the rest of the GPU should be blocked too, so there is no risk of accessing
physical memory that has been returned to the system (which the current code
doesn't guard against anyway).
Proceed with the rest of the cleanup instead of bailing out and leaving
the va_node inserted in the drm_mm, which leads to a UAF when other
adjacent nodes are removed from the drm_mm tree. |
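A hedged sketch of the control-flow change described; apart from drm_mm_remove_node(), the names are placeholders.
```c
#include <drm/drm_mm.h>
#include <linux/printk.h>

struct example_bo {                     /* stand-in for the kernel BO */
	struct drm_mm_node va_node;
};

static int example_vm_unmap_range(struct example_bo *bo)
{
	return 0;                       /* placeholder for panthor_vm_unmap_range() */
}

/* Even if the unmap fails because the MMU is down, still remove the VA node
 * from the drm_mm so later removals of neighbouring nodes do not walk into
 * freed memory. */
static void example_kernel_bo_destroy(struct example_bo *bo)
{
	int ret = example_vm_unmap_range(bo);

	if (ret)
		pr_warn("unmap failed (%d), continuing cleanup\n", ret);

	drm_mm_remove_node(&bo->va_node);       /* always executed now */
}
```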
| In the Linux kernel, the following vulnerability has been resolved:
accel/ivpu: Fix race condition when unbinding BOs
Fix 'Memory manager not clean during takedown' warning that occurs
when ivpu_gem_bo_free() removes the BO from the BOs list before it
gets unmapped. Then file_priv_unbind() triggers a warning in
drm_mm_takedown() during context teardown.
Protect the unmapping sequence with bo_list_lock to ensure the BO is
always fully unmapped when removed from the list. This ensures the BO
is either fully unmapped at context teardown time or present on the
list and unmapped by file_priv_unbind(). |
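A hedged sketch of the locking pattern described; the structure and helper names are placeholders.
```c
#include <linux/list.h>
#include <linux/mutex.h>

struct example_bo {                     /* stand-in for the ivpu BO */
	struct list_head bo_list_node;
	bool mapped;
};

struct example_file_priv {
	struct mutex bo_list_lock;
	struct list_head bo_list;
};

static void example_bo_unmap(struct example_bo *bo)
{
	bo->mapped = false;             /* placeholder for the real unmap */
}

/* Unmap and unlink under the same lock, so a BO is either still on the list
 * (and will be unmapped by file_priv_unbind()) or already fully unmapped by
 * the time it has been removed. */
static void example_bo_free(struct example_file_priv *fpriv,
			    struct example_bo *bo)
{
	mutex_lock(&fpriv->bo_list_lock);
	example_bo_unmap(bo);
	list_del(&bo->bo_list_node);
	mutex_unlock(&fpriv->bo_list_lock);
}
```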
| In the Linux kernel, the following vulnerability has been resolved:
arm64/pageattr: Propagate return value from __change_memory_common
The rodata=on security measure requires that any code path which does
vmalloc -> set_memory_ro/set_memory_rox must protect the linear map alias
too. Therefore, if such a call fails, we must abort set_memory_* and the
caller must take appropriate action; currently we are suppressing the error, and
there is a real chance of such an error arising post commit a166563e7ec3
("arm64: mm: support large block mapping when rodata=full"). Therefore,
propagate any error to the caller. |
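A hedged sketch of the propagation; __change_memory_common() is internal to arch/arm64, so the signature used here is an assumption.
```c
/* Stand-in for the internal __change_memory_common(); signature assumed. */
static int example_change_memory_common(unsigned long addr, int numpages)
{
	return 0;                       /* placeholder */
}

/* Return the result of updating the linear-map alias instead of discarding
 * it, so set_memory_*() callers can see and handle the failure. */
static int example_set_memory_attr(unsigned long addr, int numpages)
{
	return example_change_memory_common(addr, numpages);
}
```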
| In the Linux kernel, the following vulnerability has been resolved:
binder: fix UAF of alloc->vma in race with munmap()
[ cmllamas: clean forward port from commit 015ac18be7de ("binder: fix
UAF of alloc->vma in race with munmap()") in 5.10 stable. It is needed
in mainline after the revert of commit a43cfc87caaf ("android: binder:
stop saving a pointer to the VMA") as pointed out by Liam. The commit
log and tags have been tweaked to reflect this. ]
In commit 720c24192404 ("ANDROID: binder: change down_write to
down_read") binder assumed the mmap read lock is sufficient to protect
alloc->vma inside binder_update_page_range(). This used to be accurate
until commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in
munmap"), which now downgrades the mmap_lock after detaching the vma
from the rbtree in munmap(). Then it proceeds to teardown and free the
vma with only the read lock held.
This means that accesses to alloc->vma in binder_update_page_range() now
will race with vm_area_free() in munmap() and can cause a UAF as shown
in the following KASAN trace:
==================================================================
BUG: KASAN: use-after-free in vm_insert_page+0x7c/0x1f0
Read of size 8 at addr ffff16204ad00600 by task server/558
CPU: 3 PID: 558 Comm: server Not tainted 5.10.150-00001-gdc8dcf942daa #1
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2a0
show_stack+0x18/0x2c
dump_stack+0xf8/0x164
print_address_description.constprop.0+0x9c/0x538
kasan_report+0x120/0x200
__asan_load8+0xa0/0xc4
vm_insert_page+0x7c/0x1f0
binder_update_page_range+0x278/0x50c
binder_alloc_new_buf+0x3f0/0xba0
binder_transaction+0x64c/0x3040
binder_thread_write+0x924/0x2020
binder_ioctl+0x1610/0x2e5c
__arm64_sys_ioctl+0xd4/0x120
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Allocated by task 559:
kasan_save_stack+0x38/0x6c
__kasan_kmalloc.constprop.0+0xe4/0xf0
kasan_slab_alloc+0x18/0x2c
kmem_cache_alloc+0x1b0/0x2d0
vm_area_alloc+0x28/0x94
mmap_region+0x378/0x920
do_mmap+0x3f0/0x600
vm_mmap_pgoff+0x150/0x17c
ksys_mmap_pgoff+0x284/0x2dc
__arm64_sys_mmap+0x84/0xa4
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
Freed by task 560:
kasan_save_stack+0x38/0x6c
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x164
kasan_slab_free+0x14/0x20
kmem_cache_free+0xc4/0x34c
vm_area_free+0x1c/0x2c
remove_vma+0x7c/0x94
__do_munmap+0x358/0x710
__vm_munmap+0xbc/0x130
__arm64_sys_munmap+0x4c/0x64
el0_svc_common.constprop.0+0xac/0x270
do_el0_svc+0x38/0xa0
el0_svc+0x1c/0x2c
el0_sync_handler+0xe8/0x114
el0_sync+0x180/0x1c0
[...]
==================================================================
To prevent the race above, revert to taking the mmap write lock inside
binder_update_page_range(). One might expect an increase in mmap lock
contention. However, binder already serializes these calls via the top-level
alloc->mutex. Also, no performance impact was observed when running the
binder benchmark tests. |
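A hedged sketch of the lock choice only; the body of binder_update_page_range() is elided and the function shape is simplified.
```c
#include <linux/mm.h>

/* Take the mmap lock for writing so the VMA cannot be torn down by a
 * concurrent munmap() while binder inserts pages into it. */
static void example_update_page_range(struct mm_struct *mm)
{
	mmap_write_lock(mm);            /* was a read lock before the fix */

	/* ... look up alloc->vma and vm_insert_page() into it ... */

	mmap_write_unlock(mm);
}
```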