CVE-2024-38306
MediumIn the Linux kernel, the following vulnerability has been resolved: btrfs: protect folio::private when attaching extent buffer folios [BUG] Since v6.8 there are rare kernel crashes reported by various people, the common factor is bad page status error messages like this: BUG: Bad page state in process kswapd0 pfn:d6e840 page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c pfn:0xd6e840 aops:btree_aops ino:1 flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff) page_type: 0xffffffff() raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0 raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: non-NULL mapping [CAUSE] Commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") changes the sequence when allocating a new extent buffer. Previously we always called grab_extent_buffer() under mapping->i_private_lock, to ensure the safety on modification on folio::private (which is a pointer to extent buffer for regular sectorsize). This can lead to the following race: Thread A is trying to allocate an extent buffer at bytenr X, with 4 4K pages, meanwhile thread B is trying to release the page at X + 4K (the second page of the extent buffer at X). Thread A | Thread B -----------------------------------+------------------------------------- | btree_release_folio() | | This is for the page at X + 4K, | | Not page X. | | alloc_extent_buffer() | |- release_extent_buffer() |- filemap_add_folio() for the | | |- atomic_dec_and_test(eb->refs) | page at bytenr X (the first | | | | page). | | | | Which returned -EEXIST. | | | | | | | |- filemap_lock_folio() | | | | Returned the first page locked. | | | | | | | |- grab_extent_buffer() | | | | |- atomic_inc_not_zero() | | | | | Returned false | | | | |- folio_detach_private() | | |- folio_detach_private() for X | |- folio_test_private() | | |- folio_test_private() | Returned true | | | Returned true |- folio_put() | |- folio_put() Now there are two puts on the same folio at folio X, leading to refcount underflow of the folio X, and eventually causing the BUG_ON() on the page->mapping. The condition is not that easy to hit: - The release must be triggered for the middle page of an eb If the release is on the same first page of an eb, page lock would kick in and prevent the race. - folio_detach_private() has a very small race window It's only between folio_test_private() and folio_clear_private(). That's exactly when mapping->i_private_lock is used to prevent such race, and commit 09e6cef19c9f ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") screwed that up. At that time, I thought the page lock would kick in as filemap_release_folio() also requires the page to be locked, but forgot the filemap_release_folio() only locks one page, not all pages of an extent buffer. [FIX] Move all the code requiring i_private_lock into attach_eb_folio_to_filemap(), so that everything is done with proper lock protection. Furthermore to prevent future problems, add an extra lockdep_assert_locked() to ensure we're holding the proper lock. To reproducer that is able to hit the race (takes a few minutes with instrumented code inserting delays to alloc_extent_buffer()): #!/bin/sh drop_caches () { while(true); do echo 3 > /proc/sys/vm/drop_caches echo 1 > /proc/sys/vm/compact_memory done } run_tar () { while(true); do for x in `seq 1 80` ; do tar cf /dev/zero /mnt > /dev/null & done wait done } mkfs.btrfs -f -d single -m single ---truncated---
CVSS 3.1 score
4.7
CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:N/A:H
Weakness type
CWE-362CVE-2024-38306 is a Race Condition vulnerability
What is Race Condition?
The product contains a code sequence that can run concurrently with other code, creating unexpected states. Learn more on MITRE CWE
References
The following references provide additional information about CVE-2024-38306 including vendor advisories, patch commits, exploit details, and third-party analysis. Links are sourced from the NIST NVD database.
-
PatchKernel patch commithttps://git.kernel.org/stable/c/952f048eb901881a7cc6f7c1368b53cd386ead7b
-
PatchKernel patch commithttps://git.kernel.org/stable/c/f3a5367c679d31473d3fbb391675055b4792c309
Frequently asked questions
-
What is CVE-2024-38306?
CVE-2024-38306 is a Medium severity Linux kernel vulnerability with a CVSS score of 4.7 out of 10 , classified as a Race Condition flaw (CWE-362) . CVE-2024-38306 has not been confirmed as actively exploited and is not listed in the CISA KEV catalog.
-
What is the CVSS score for CVE-2024-38306?
CVE-2024-38306 has a CVSS score of 4.7 out of 10, rated Medium severity (CVSS 3.1). The vector string is
CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:N/A:H. -
Is there a patch available for CVE-2024-38306?
No patch is currently available for CVE-2024-38306. Monitor the NIST NVD and your Linux distribution's security advisories for updates.
-
Is CVE-2024-38306 actively exploited?
No — CVE-2024-38306 has not been confirmed as actively exploited. It is not listed in the CISA Known Exploited Vulnerabilities (KEV) catalog.
-
What is Race Condition (CWE-362)?
The product contains a code sequence that can run concurrently with other code, creating unexpected states. View CWE-362 on MITRE CWE →