CXL Collab Sync
CXL Linux Sync: Ground Rules
- Do not share confidential information
- Do not share confidential product details
- Do not disclose CXL consortium confidential information
- Do discuss any Linux questions about released CXL specifications
- Do use Discord as a supplement for this sync meeting for quick questions
- Do follow up on linux-cxl@vger.kernel.org for longer questions / debug
- https://pmem.io/ndctl/collab/
February 17, 2026
Agenda
- Opens
- cxl-cli
- QEMU
- v7.0 rc fixes
- v7.1 merge window
- v7.2 and beyond
The full transcript is posted on the CXL Discord channel.
Opens
- JonathanC: heads-up on upcoming work on RAS handling.
- Anisa: DCD - FAMFS is the first user of DCD in the simplest way, a single region.
- GregP: discussion about AI-generated reviews. Welcomed if the sender has reviewed the review and finds it valid.
- GregP: looking at external driver commonalities and refactoring. External drivers are type2, vfio, pmem, and reset. Do we pull the exports out ahead of time and make them common? GregP will pull the create region func first and post for comments.
- GregP: finding race conditions while testing. They seem to be in the refactored code, not the existing code. Try with MingL and DaveJ’s latest race condition patchsets.
- Are there any vendors needing CXL physical hotplug to work properly upstream? Context: folks are disabling ACS to make it work. Any customers, or only a validation requirement? (Access Control Services - blocks CXL pm init for unexpected id.) Users may be able to turn off this check specifically in ACS.
- JohnG: working on a set of DCD clarifications, expected to become an ECN and a set of requirements around SW cache coherency. May be consortium confidential. Both are for consortium members (or others) to ask about and review.
CXL CLI
NDCTL v84 - gathering for a Q1 release to support things through 7.0 kernel
(non-CXL work mostly filtered out)
- Welcoming reviews:
John’s FAMFS Set:
- v4 daxctl: Add support for famfs mode
- v4 test/daxctl-famfs.sh: test famfs mode transitions https://lore.kernel.org/all/0100019bd34040d9-0b6e9e4c-ecd4-464d-ab9d-88a251215442-000000@email.amazonses.com/
- v4 cxl/cli: enforce HPA-descending teardown (PawelM) https://lore.kernel.org/all/20260130121638.169160-1-pawel.mielimonka@fujitsu.com/
- v2 test/cxl-poison.sh: test unaligned address translations in cxl_poison events (AlisonS) https://lore.kernel.org/all/20260115200241.522809-1-alison.schofield@intel.com/
- v2 test/cxl-poison.sh: replace sysfs usage with cxl-cli cmds (AlisonS) https://lore.kernel.org/all/20260214023239.1352245-1-alison.schofield@intel.com/
- Waiting revisions:
- FAMFS - awaiting update to unit test and fix for unit test fail (JohnG)
- Pending for v84: (many hiding on internal pending, awaiting a Coverity scan)
Ben’s Error Inject Set:
- Documentation: Add docs for protocol and poison injection commands
- cxl/list: Add injectable errors in output
- cxl: Add poison injection/clear commands
- cxl: Add inject-protocol-error command
- libcxl: Add poison injection support
- libcxl: Add CXL protocol errors
- libcxl: Add debugfs path to CXL context
DaveJ’s ELC support:
- cxl: add support for extended linear cache
New unit tests:
- cxl/test: add test for extended linear cache support
- cxl/test: add cxl-translate.sh unit test
Updated unit tests:
- test/cxl-topology.sh: test switch port target lists
- test/cxl-poison.sh: add support for poison test for ELC
- test/cxl-poison.sh: move cxl-poison.sh to use cxl_test auto region
- test/cxl-poison.sh: fix cxl-poison.sh to detect the correct elc sysfs attrib
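The "enforce HPA-descending teardown" item under review above suggests destroying regions from the highest host physical address (HPA) down to the lowest. A minimal sketch of that ordering follows. The `Region` type, its field names, and the sample addresses are hypothetical and are not taken from cxl-cli or the patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    # Hypothetical stand-in for a CXL region; not the libcxl structure.
    name: str
    hpa_base: int  # start of the region in host physical address space
    size: int


def teardown_order(regions):
    """Return regions sorted so the highest HPA base is torn down first."""
    return sorted(regions, key=lambda r: r.hpa_base, reverse=True)


# Made-up sample topology, listed in arbitrary order.
regions = [
    Region("region1", 0x2_0000_0000, 0x4000_0000),
    Region("region0", 0x1_0000_0000, 0x4000_0000),
    Region("region2", 0x3_0000_0000, 0x4000_0000),
]
order = [r.name for r in teardown_order(regions)]
```

Sorting once up front keeps the destroy loop itself trivial: it just walks the returned list.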
QEMU
- Tried to merge 20, got 2!!!
- Rework of phys port control, allows reset testing.
- JC wants to drop:
- compliance mailbox: no one uses it and it does nothing
- i2c MCTP: not needed, can use USB now
v7.0 rc fixes
- Fix nvdimm_bus race by cxl_nvdimm. (DaveJ)
https://lore.kernel.org/linux-cxl/20260213224038.549798-1-dave.jiang@intel.com/
- v3 under review. Some 0-day issues.
- Fix deadlock in cxl_memdev_autoremove() on attach failure. (Gregory)
https://lore.kernel.org/linux-cxl/20260211192228.2148713-1-gourry@gourry.net/
- Ready for cxl/fixes apply
- Avoid DVSEC fallback after region teardown. (Smita)
https://lore.kernel.org/linux-cxl/20260212223800.23624-1-Smita.KoralahalliChannabasappa@amd.com/
- v1 Need review
- Fix port enumeration failure. (Ming)
https://lore.kernel.org/linux-cxl/20260212223800.23624-1-Smita.KoralahalliChannabasappa@amd.com/
- v3 needs review
- Fix leakage in __construct_region(). (Davidlohr)
https://lore.kernel.org/linux-cxl/20260202191330.245608-1-dave@stgolabs.net/
- v1 needs review
- Fix testing of decoder flags as bitmasks. (Alison)
https://lore.kernel.org/linux-cxl/20260206181404.1025991-1-alison.schofield@intel.com/
- Pending v2 to split the patch, could use more review tags.
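The "Fix testing of decoder flags as bitmasks" item above concerns how flag words are tested. As a generic illustration of that class of problem, and not a reproduction of the patch, the sketch below contrasts an equality test with a mask test. The flag names and bit values are hypothetical and do not match the kernel's definitions.

```python
from enum import IntFlag


class DecoderFlags(IntFlag):
    # Hypothetical flag bits for illustration only.
    RAM = 1 << 0
    PMEM = 1 << 1
    LOCK = 1 << 2
    ENABLE = 1 << 3


def has_all(flags, mask):
    """True only when every bit in mask is set in flags."""
    return (flags & mask) == mask


def has_any(flags, mask):
    """True when at least one bit in mask is set in flags."""
    return (flags & mask) != 0


flags = DecoderFlags.RAM | DecoderFlags.ENABLE

# Equality only matches when no other bit is set, so it misses RAM here.
naive_is_ram = flags == DecoderFlags.RAM
# Masking isolates the bit of interest regardless of the other bits.
mask_is_ram = has_all(flags, DecoderFlags.RAM)
```

Keeping "all bits set" and "any bit set" as separate helpers makes the intent explicit when a mask covers more than one flag.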
v7.1 merge window
- Coordinate Soft Reserved handling with CXL and HMEM. (Smita)
https://lore.kernel.org/linux-cxl/20260210064501.157591-1-Smita.KoralahalliChannabasappa@amd.com/
- v6 needs review
- CXL port error protocol handling and logging. (Terry)
https://lore.kernel.org/linux-cxl/20260203025244.3093805-1-terry.bowman@amd.com/
- Pending v16
- Patch series split into 3 parts with 1 and 2 in 7.0 now.
- Need tags from Bjorn for PCI bits
- Pull region specific logic into new files. (Gregory)
https://lore.kernel.org/linux-cxl/20260211204206.2171525-1-gourry@gourry.net/
- v3 needs review
- Type 2 device basic support. (Alejandro)
https://lore.kernel.org/linux-cxl/20260201155438.2664640-1-alejandro.lucero-palau@amd.com/
- v23 needs review
- Also waiting on testing feedback from PJ
- No functional dependency. In Dan’s queue to review.
- Explicit DAX driver selection and hotplug. (Gregory)
https://lore.kernel.org/linux-cxl/20260129210442.3951412-1-gourry@gourry.net/
- Waiting on v2 reworks before review
- Delay insert iomem resource. (Alison)
https://lore.kernel.org/linux-cxl/20260212062250.1219043-1-alison.schofield@intel.com/
- Need review
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
https://lore.kernel.org/linux-cxl/20250306232239.2609017-1-alison.schofield@intel.com/
- Pending v3
- LSA 2.1 support for CXL pmem (Neeraj)
https://lore.kernel.org/linux-cxl/20260123113112.3488381-1-s.neeraj@samsung.com/#r
- Need response to question from Dan: https://lore.kernel.org/linux-cxl/6979522dc1916_1d331009@dwillia2-mobl4.notmuch/
- Need to address cxl_test cxl-topology.sh regression test issue
- Question for call: is the no interleave limitation OK? Neeraj: what is the impact of adding interleave with first merge?
v7.2 and beyond
- Zero sized decoder (VishalA)
https://lore.kernel.org/linux-cxl/20251015024019.1189713-1-vaslot@nvidia.com/
- No activity since 10/25?
- VFIO CXL type2 support RFCv2 (Manish)
https://lore.kernel.org/linux-cxl/20251209165019.2643142-1-mhonap@nvidia.com/T/#t
- Manish has a new set coming for review.
- CXL type2 reset support v4 (Srirangan)
https://lore.kernel.org/linux-cxl/20260120222610.2227109-1-smadhavan@nvidia.com/
- Pending v5? Not much response from submitter on the series.
- Add support for multiple DC regions RFC (Anisa)
https://lore.kernel.org/linux-cxl/20260115102819.00006d55.alireza.sanaee@huawei.com/T/#t
- pending DCD series as well. Anisa has taken over.
- Anisa to send out RFC with discussion of approach.
January 20, 2026
Agenda
- Opens
- cxl-cli
- QEMU
- v6.19 rc fixes
- v7.0 merge window
- v7.1 and beyond
Opens
- Dan suggested, and Jonathan seconded, that we use this forum for tech topics. We typically have the last 30 mins to dive into such topics.
- Today, ManishH shared his work and plans for the vfio-cxl driver.
CXL CLI
NDCTL v84 - gathering for a Q1 release to support things through 7.0 kernel
- Welcoming reviews:
- v6 Add error injection support (BenC) https://lore.kernel.org/nvdimm/20260109160720.1823-1-Benjamin.Cheatham@amd.com/
- v2 cxl/test: test unaligned address translations in cxl_poison events (AlisonS) https://lore.kernel.org/nvdimm/20260115200241.522809-1-alison.schofield@intel.com/
- v2 cxl/cli: HPA-ordered destroy-region teardown (PawelM) https://lore.kernel.org/linux-cxl/20260120143212.3006273-1-pawel.mielimonka@fujitsu.com/
- v2 daxctl: replace basename() usage with new path_basename() (AlisonS) https://lore.kernel.org/nvdimm/20260116043056.542346-1-alison.schofield@intel.com/
- util/sysfs: add hint for missing root privileges on sysfs access (AlisonS) https://lore.kernel.org/nvdimm/b74bfd8623fcfc4cf1078991b22b8c899147f5fb.1768530600.git.alison.schofield@intel.com/
- Waiting revisions:
- Pending for v84:
- test/cxl-topology.sh: test switch port target lists
- cxl/test: add support for poison test for ELC
- cxl/test: add test for extended linear cache support
- cxl/test: move cxl-poison.sh to use cxl_test auto region
- cxl/test: fix cxl-poison.sh to detect the correct elc sysfs attrib
- cxl: add cxl-translate.sh unit test
- ndctl/test: fully reset nfit_test in pmem-ns unit test
- add support for extended linear cache
- README.md: exclude unsupported distros from Repology badge
QEMU
Soft freeze in March.
- event, memory repair, fixes awaiting merge
- back-invalidate is new, tested, ready to merge (DavidL)
- next up: phys port control for switches
- Address lookup changes
- GregP looking for a KVM support fix; it’s in the works by Ali
v6.19 rc fixes
- Fixes merged in for rc6
v7.0 merge window
- Pending in cxl/next:
- 345f23df5d08 cxl/region: Use do_div() for 64-bit modulo operation
- 78b50b598462 cxl/region: Translate HPA to DPA and memdev in unaligned regions
- 19885bd10755 cxl/region: Translate DPA->HPA in unaligned MOD3 regions
- 5c604e7a9f6c cxl/core: Fix cxl_dport debugfs EINJ entries
- b4692385bb68 cxl/acpi: Remove cxl_acpi_set_cache_size()
- 0db2344eb8a8 cxl/hdm: Fix newline character in dev_err() messages
- 7b4f9743fbbf cxl/pci: Remove outdated FIXME comment and BUILD_BUG_ON
- fa19611f96fd Documentation/driver-api/cxl: device hotplug section
- 63e5a6294dad Documentation/driver-api/cxl: BIOS/EFI expectation update
- Make ELOG and GHES log and trace consistently (Fabio)
https://lore.kernel.org/linux-cxl/CAJZ5v0g80j4iFMXYDKek8VBYsa0g35avvw+UK6RxutcmxSX+WA@mail.gmail.com/T/#t
- Picked up by Rafael
- Linus declared rc8. All series that want to merge should be settled by next week.
- Enable CXL PCIe port protocol error handling and logging (Terry)
https://lore.kernel.org/linux-cxl/20260114182055.46029-1-terry.bowman@amd.com/T/#t
- Pending v15 with some discussions still ongoing on v14
- Intend to split up for manageable review and merge.
- Support soft reserve (Smita)
https://lore.kernel.org/linux-cxl/20250822034202.26896-1-Smita.KoralahalliChannabasappa@amd.com/T/#t
- Pending next rev?
- Zen5 PRM translation support (Robert)
https://lore.kernel.org/linux-cxl/20260112111707.794526-1-rrichter@amd.com/T/#t
- Waiting on requested addition for the Convention doc from Dan
- Upstream PRM usage objection from PeterZ settled?
- Introduce cxl_region_driver field for cxl_region (Gregory)
https://lore.kernel.org/linux-cxl/aWfe-r7uEV-ajfhX@gourry-fedora-PF4VCD3F/T/#t
- Needs review
- Maybe hold off on patch 1, the ABI intro. Take 2,3 that move code only.
v7.1 and beyond
- Type 2 accelerator basic support v22 (Alejandro)
https://lore.kernel.org/linux-cxl/20251205115248.772945-1-alejandro.lucero-palau@amd.com/T/#t
- Pending debug with PJ before v23
- Alejandro asking for eyes on region config from BIOS
- Support Low Memory Hole v6 (Fabio)
- Pending resolve of cxl_test support
- Will move region driver to module, outside of cxl core
- LSA 2.1 label support for CXL v5 (Neeraj)
https://lore.kernel.org/linux-cxl/20260109124437.4025893-1-s.neeraj@samsung.com/T/#t
- Need acks from Ira for nvdimm changes
- Review ongoing
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
https://lore.kernel.org/linux-cxl/20250306232239.2609017-1-alison.schofield@intel.com/
- Pending v3
- VFIO CXL type2 support RFCv2 (Manish)
https://lore.kernel.org/linux-cxl/20251209165019.2643142-1-mhonap@nvidia.com/T/#t
- Needs review
- CXL type2 reset support v3 (Srirangan)
https://lore.kernel.org/linux-cxl/aW1ercZMyqh2Ej5F@aschofie-mobl2.lan/T/#t
- Needs review
- Add support for multiple DC regions RFC (Anisa)
https://lore.kernel.org/linux-cxl/20260115102819.00006d55.alireza.sanaee@huawei.com/T/#t
- Of course pending DCD series as well
- Anisa - Happy to take over, rebase, add Fan’s fixes. Needs more work done on solidifying how to expose this.
- Dan - risk is merging ABI that we are not committed to.
- Anisa to send out RFC with discussion of approach.
- CXL .cache device support RFCv2 (Ben)
https://lore.kernel.org/linux-cxl/20251111214032.8188-1-Benjamin.Cheatham@amd.com/T/#t
- Needs confirmed user for support
- Going through reviews
Patch series of interest to CXL
- FAMFS v7 (John) https://lore.kernel.org/linux-cxl/0100019bdbf77d8b-fc329dba-dc0d-4233-9b6a-b45e3e271727-000000@email.amazonses.com/T/#t
- Move dax_pgoff_to_phys() v4 (John) https://lore.kernel.org/linux-cxl/20260114213209.29453-2-john@groves.net/T/#t
- N_PRIVATE support RFCv3 (Gregory) https://lore.kernel.org/linux-cxl/20260108203755.1163107-1-gourry@gourry.net/T/#t
- Add runtime hotplug state control v2 (Gregory) https://lore.kernel.org/linux-cxl/df04f6e2-39ee-46f0-984d-54dcba16a011@kernel.org/T/#t
- Identify the accurate NUMA ID of CFMWS v2 (Cui) https://lore.kernel.org/linux-cxl/20260106031042.1606729-1-cuichao1753@phytium.com.cn/T/#t
October 28, 2025
Agenda
- Opens
- cxl-cli
- QEMU
- v6.18 rc fixes
- v6.19 merge window
- v6.20 and beyond
Opens
- FAMFS JohnG
- Adding sw managed cache coherency, leveraging libpmem?
- Jonathan - caution about archs that do not describe CXL flush behavior because CXL did not exist.
- GregP led discussion of new Anon ZONE_DEVICE allocator
- whether it should use existing DAX + memory_hotplug
- whether it should use something like hugetlb allocator
- whether pgmap->alloc_folio makes sense
- whether existing buddy-allocator could be extended for “arenas”
- May discuss more at Plumbers
CXL CLI
NDCTL v83 released September 30th
- https://github.com/pmem/ndctl/releases/tag/v83
NDCTL v84 and beyond
- Welcoming reviews:
- ndctl: v2 Add error injection support (BenC)
- cxl/test: add cxl-translate unit test (AlisonS)
- Introduce sanitize-memdev functionality (DavidLohrB)
- Add support for extended linear cache (DaveJ)
- Waiting revisions:
- test: fail on unexpected kernel error & warning, not just “Call Trace” (MarcH)
- test/common: document magic number CXL TEST QOS CLASS=42 (MarcH)
- test/monitor.sh: replace sleep with event driven wait (AlisonS)
- Merged to pending for v84:
- README.md: exclude unsupported distros from Repology badge (AlisonS)
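The "fail on unexpected kernel error & warning, not just Call Trace" item above is about widening what a test treats as a failure in the kernel log. A rough sketch of the idea follows. The pattern list, the allow-list mechanism, and the sample log lines are all assumptions made for illustration; they are not taken from the ndctl test suite.

```python
import re

# Hypothetical failure patterns; a real test suite would tune this list.
FAIL_PATTERNS = [
    re.compile(p)
    for p in (r"Call Trace", r"\bWARNING:", r"\bBUG:", r"\berror\b")
]


def unexpected_lines(log_text, allowed=()):
    """Return log lines that match a failure pattern and are not allow-listed."""
    hits = []
    for line in log_text.splitlines():
        if any(fragment in line for fragment in allowed):
            continue  # the test expected this message
        if any(p.search(line) for p in FAIL_PATTERNS):
            hits.append(line)
    return hits


# Made-up log excerpt; none of these lines come from a real run.
sample = """\
[  1.0] cxl_test: loading
[  2.0] WARNING: CPU: 0 PID: 42 at example/path.c:100
[  3.0] cxl_mem mem0: expected error injected by test
[  4.0] cxl_port port1: error -6 enumerating dports
"""
hits = unexpected_lines(sample, allowed=("expected error injected by test",))
```

In this sample a scan for "Call Trace" alone would report nothing, while the wider pattern list flags two lines and the allow-list keeps the deliberately injected error from failing the test.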
QEMU
- last pull request missed some cxl changes, expect in a second set, will include event, sanitize,
- hw/cxl: Add a performant (and correct) path for the non interleaved cases.
- DavidL is seeing a crash using CXL mem and vfio, running with KVM. Jonathan says this is not supported: don’t do that with KVM. No guardrails around that yet.
v6.18 rc fixes
- A number of fixes for 6.18-rc2 merged. Mostly extended linear cache related.
- Generic Initiator device handle fix (Shuai)
- Queued
v6.19 merge window
- Will start cxl/next on 6.18-rc4 next week. Below are ready to be queued
- Remove page-allocator quirk section for CXL doc (Gregory)
- Remove devm_cxl_port_enumerate_dports (Ming)
- Fix typo in cdat.c (Alok)
- Add a loadable module for address translation series (Alison)
- Add managed SOFT RESERVE resource handling (Smita)
https://lore.kernel.org/linux-cxl/aQAmhrS3Im21m_jw@aschofie-mobl2.lan/T/#t
- v3 discussion ongoing
- Pending v4
- Enable CXL PCIe port protocol error handling and logging (Terry)
https://lore.kernel.org/linux-cxl/20250925223440.3539069-1-terry.bowman@amd.com/T/#t
- Pending v13
- Type2 device support (Alejandro)
https://lore.kernel.org/linux-cxl/9a3eed68-9394-4f87-a204-4f2a0caf496e@intel.com/T/#t
- v19 review ongoing
- Now has dependency on Terry’s protocol error set
- Pending v20
- Can use a check from Dan
- Low Mem Hole (Fabio)
https://lore.kernel.org/linux-cxl/20251006155836.791418-1-fabio.m.de.francesco@linux.intel.com/T/#t
- Convention doc in cxl/next
- v5 review ongoing
- cxl_test support discussion ongoing
- ACPI PRM Address Translation support - Zen5 (Robert)
https://lore.kernel.org/linux-cxl/aNITd1fXcBxKM5mF@gourry-fedora-PF4VCD3F/T/#t
- needs convention doc
- pending v4
- CXL LSA 2.1 labeling support (Neeraj)
https://lore.kernel.org/linux-cxl/aNMnmdOY4g5PRpxY@aschofie-mobl2.lan/T/#t
- pending v4
- Can use some review
- Support zero sized decoder (Vishal A)
https://lore.kernel.org/linux-cxl/20251015024019.1189713-1-vaslot@nvidia.com/T/#t
- pending v2
- Add handling of locked CXL decoders (Dave)
https://lore.kernel.org/linux-cxl/637292ff-0cca-41bd-8ce9-4e38d6b1ff1b@intel.com/T/#t
- review ongoing
- hmat_register_target() lockdep issue (Dave)
https://lore.kernel.org/linux-cxl/20251017212105.4069510-1-dave.jiang@intel.com/T/#t
- v3 review ongoing
- Add support to indicate extended linear cache is present via sysfs attribute (Dave)
https://lore.kernel.org/linux-cxl/20251028144125.0000133b@huawei.com/T/#t
- v3 review ongoing
- Adjust extended linear cache failure emission (Dave)
https://lore.kernel.org/linux-cxl/20251003185509.3215900-1-dave.jiang@intel.com/
- v2 needs review
- Support multi-level interleaving with smaller granularities for lower levels (Robert)
https://lore.kernel.org/linux-cxl/20251028094754.72816-1-rrichter@amd.com/
- needs review
- platform exists, BIOS setup regions only ATM
- look at Alison’s Allow 6 & 12 way regions on 3-way HB interleave patch
- Make ELOG and GHES log and trace consistently (Fabio)
https://lore.kernel.org/linux-cxl/SJ1PR11MB60836FB0D4D8EE564759F7E3FCFCA@SJ1PR11MB6083.namprd11.prod.outlook.com/T/#t
- v6 review ongoing
- Translate DPA->HPA in unaligned MOD3 regions (Alison)
https://lore.kernel.org/linux-cxl/20251014062850.727428-1-alison.schofield@intel.com/
- v2 needs review
- Jonathan maybe waiting for v3 w bot fix :(
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
https://lore.kernel.org/linux-cxl/20250306232239.2609017-1-alison.schofield@intel.com/
- Pending v3
- look at Robert’s multi-level interleave patch
- Coherent Cache Management System (Jonathan)
https://lore.kernel.org/linux-cxl/20251023133136.00006cdd@huawei.com/T/#t
- pending v5
- Will go through ARM SOC tree
- No one wants it?
- CXL.mem error isolation support (Ben)
https://lore.kernel.org/linux-cxl/20250730214718.10679-1-Benjamin.Cheatham@amd.com/T/#t
- Review ongoing
- No expected user yet
- CXL reset support for devices. (Srirangan / Vishal A)
https://lore.kernel.org/linux-cxl/20250221043906.1593189-1-smadhavan@nvidia.com/
- Pending v3
- VishalA has taken over
v6.20 and beyond
- Initial CXL.cache device support (Ben)
- Testing and planning a RFC-v2.
- Initial support for Back-Invalidate (DavidLohrB)
- Has dependencies on Type2. Consider cherry-picking a couple of T2 patches related to region creation if T2 not landing soon.
- Hotness Driver (Jonathan)
- DCD
- Continue to wait for an upstream user to take over
- JohnG: most likely famfs is that first upstream user
- fwctl support for CCI switch (Jonathan)
September 30, 2025
- Opens
- cxl-cli
- QEMU
- v6.18 merge window
- v6.18 rc fixes
- v6.19 merge window
- v6.20 and beyond
Opens
- John Groves - submitting plumbers topics:
- DCD: the namespace for composable memory.
- FAMFS Update - DAX challenges and use cases
- John Groves to post a detailed description of DAX issues with FAMFS.
- VishalA working a patch to not fail on 0 size committed and locked decoders
- DaveJ asks that you base patches on the RC that cxl/next is based upon, not upon cxl/next directly. Include the base commit in the patch. Comment non-compliance ;)
CXL CLI
NDCTL v83
- Release is WIP, perhaps today
- Last commit removes libtracefs build dependency that broke v80,81,82.
- https://github.com/pmem/ndctl/commits/pending/
NDCTL v84 and beyond
- Reviews welcome:
- cxl/test: add cxl-translate unit test (expect need for 6.18 kernel) (AlisonS)
- Introduce sanitize-memdev functionality (DavidLohrB)
- Revisions welcome:
- ndctl: v2 Add error injection support (BenC)
- test: fail on unexpected kernel error & warning, not just “Call Trace” (MarcH)
- ndctl,cxl/test: Add a common unit test for creating pmem namespaces (AlisonS)
- test/monitor.sh: replace sleep with event driven wait (AlisonS)
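The "replace sleep with event driven wait" item above swaps a fixed delay for a wait that returns as soon as the awaited condition is signalled. The actual change is to a shell test; the sketch below only illustrates the general pattern using Python's standard `threading.Event`, and the names and timings are invented for the example.

```python
import threading
import time


def wait_for_signal(event, timeout_s):
    """Block until event is set or timeout_s elapses; return True if it was set."""
    return event.wait(timeout=timeout_s)


monitor_ready = threading.Event()
# Stand-in for the monitored process announcing readiness after a short delay.
notifier = threading.Timer(0.05, monitor_ready.set)

start = time.monotonic()
notifier.start()
signalled = wait_for_signal(monitor_ready, timeout_s=5.0)
elapsed = time.monotonic() - start
notifier.join()
```

The timeout still bounds the worst case, but the common case finishes as soon as the signal arrives instead of always paying the full delay.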
QEMU
- Features pending next release, reviews welcome.
v6.18 merge window
- Window open this week. Will send PR end of week or early next week.
v6.18 rc fixes
- Avoid missing port component registers setup (Ming)
- Can use review.
v6.19 merge window
- Add managed SOFT RESERVE resource handling (Smita)
- v3 now available for review
- Enable CXL PCIe port protocol error handling and logging (Terry)
- v12 going through reviews
- Type2 device support (Alejandro)
- pending v19?
- please review
- Low Mem Hole (Fabio)
- Convention doc in cxl/next
- v5 to be posted soon
- ACPI PRM Address Translation support - Zen5 (Robert)
- needs review
- needs convention doc
- pending v4
- CXL LSA 2.1 labeling support (Neeraj)
- v3 needs review
- CXL.mem error isolation support (Ben)
- Is there a pending use case?
- CXL reset support for devices. (Srirangan)
- Pending v3
- VishalA has taken over
- Remove devm_cxl_port_enumerate_dports() (Ming)
- Queued to cxl/next after merge window
- Make ELOG and GHES log and trace consistently (Fabio)
- v5 to be posted soon
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
- Pending v3
- Translate DPA->HPA in unaligned MOD3 regions (Alison)
- v1 posted, need review
- CXL: Add a loadable module for address translation (Alison)
- v2 needs review
v6.20 and beyond
- Initial CXL.cache device support (Ben)
- Testing and planning a RFC-v2.
- Hotness Driver (Jonathan)
- DavidLohrB - are you modifying your approach to use K promote B stuff?
- David has a proposal to co-exist.
- QEMU support? Jonathan has something functional, welcomes more.
- non-x86 cache flushing (“wbinv”) (Jonathan)
- v4 WIP, has substantial feedback from DanW. Got ACK from ARM folks.
- DCD
- User? IraW posted a rebase for Micron person who asked on Discord.
- John Groves submitting plumbers topic for DCD: the namespace for composable memory.
- cxl: Initial support for Back-Invalidate (DavidLohrB)
- Available to review. Expect a session at Plumbers too.
- fwctl support for CCI switch (Jonathan)
September 2025
- Opens
- cxl-cli
- QEMU
- v6.17 rc fixes
- v6.18 merge window
- v6.19 and beyond
Opens
- DanW - Early topic acceptance mid month.
- JonathanC - Plumbers device memory microconf topic submit deadline Sept30.
- DavidLohrB - back invalidate and cxl.cache topic plumbers proposal coming soon
- famfs (JohnGroves) - needs to clean up an issue with Alistair’s DAX set. DAX never clears the page mapping or folio mapping. John expects to send a fixup. Also a topic for the next DCD call.
CXL CLI
NDCTL v83
- See queue in https://github.com/pmem/ndctl/tree/pending
- expect September release
NDCTL v83 (maybe) and beyond:
- Introduce sanitize-memdev functionality (DavidLohrB)
- Needs review
- Received a couple of “it works for me” replies but no review tags.
- ndctl: v2 Add error injection support (BenC)
- status ?
- test: fail on unexpected kernel error & warning, not just “Call Trace” (MarcH)
- Needs next rev on check dmesg piece
- Needs next rev on kmesg piece
- test/cxl-poison.sh: test inject and clear poison by HPA (AlisonS)
- Needs review
- cxl/test: add cxl_translate unit test (AlisonS)
- Needs review
- ndctl,cxl/test: Add a common unit test for creating pmem namespaces (AlisonS)
- Pending a v2
- test/monitor.sh: replace sleep with event driven wait (AlisonS)
- Pending a v4
QEMU
- Fan support mv’d to hobby level
- Looking for more reviewers - wonderful role, not for poor souls
- 10.1 released
- 10.2
- event record and RAS features
- ARM SBSA support needs reviewers
- a bunch more in the works
v6.17 rc fixes
- None
v6.18 merge window
- Add managed SOFT RESERVE resource handling (Smita)
- New patch series, v1 under review
- Enable CXL PCIe port protocol error handling and logging (Terry)
- v11 going through reviews
- Delayed dport creation (Dave)
- v9 needs review
- Update CXL access coordinates to node directly (Dave)
- v3 needs acks from Rafael
- Update maintainers
- Type2 device support (Alejandro)
- Main issues brought up by Alejandro at this link:
- Waiting on response: https://lore.kernel.org/linux-cxl/e74a66db-6067-4f8d-9fb1-fe4f80357899@amd.com/T/#me74adadf01d65ea15b5ef92a3947f8730f06ec93
- Wants to address those things before posting next version
- Low Mem Hole (Fabio)
- Posted CXL convention doc, going through review
- v4 under review
- Zen5 translate part 2 (Robert)
- need review
- need convention doc
- Robert gave overview on layers of patches, refactors to Zen5 support.
- CXL.mem error isolation support (Ben)
- need review
- Is there a pending use case?
- CXL LSA 2.1 labeling support (Neeraj)
- v2 needs review and response to review comments
- CXL reset support for devices. (Srirangan)
- Pending v3
- still active?
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
- Pending v3
- (RFC) Translate DPA->HPA in unaligned MOD3 regions (Alison)
- Pending v1
- Make ELOG and GHES log and trace consistently (Fabio)
- Pending v5
- CXL: Add a loadable module for address translation (Alison)
- v2 needs review
- anything else missed?
v6.19 and beyond
- Initial CXL.cache device support (Ben)
- Hotness Driver (Jonathan)
- Driver level is not critical path; what we do with hotness data at the kernel level is.
- non-x86 cache flushing (“wbinv”) (Jonathan)
- Cache coherency management subsystem
- non x86 folks, please try it out.
- DCD
- next call will not be cancelled
- vfio-cxl type 2 (Zhi)
- Still pending v2 RFC. Abandoned?
- cxl: Initial support for Back-Invalidate (DavidLohrB)
- fwctl
- Jonathan shared an update on the future of fwctl. A new kconfig is needed for CCI switch support. These go beyond the security scope of current FWCTL. Plan is to convert the commands to Features in order to utilize FWCTL. Some ops will be encapsulated as Feature commands.
August 2025
- Skipped
July 2025
- Opens
- cxl-cli
- QEMU
- v6.16 rc fixes
- v6.17 merge window
- v6.18 and beyond
Opens
- JohnG: wrt dax device w dax extents, hope to get a smaller group call. Ira going to do a poll to select time and set up.
- JohnG: famfs v2 patches, need to fixup dax, heads up need dax developer help
- DavidLohr: Background handling discussions ongoing. More feedback welcome.
CXL CLI
- NDCTL v82 was released June 12 https://github.com/pmem/ndctl/releases/tag/v82
- Patch Queue for v83:
- ndctl: Add missing test dependencies and other fixups (DanW)
- Set applied to pending
- Introduce sanitize-memdev functionality (DavidLohrB)
- Received a couple of “it works for me” replies but no review tags.
- ndctl: v2 Add error injection support (BenC)
- Needs review
- cxl: Add helper function to verify port is in memdev hierarchy (DaveJ)
- next rev pending
- test: fail on unexpected kernel error & warning, not just “Call Trace” (MarcH)
- Needs review
- test/cxl-poison.sh: test inject and clear poison by HPA (AlisonS)
- Pending a v2 with added cases but reviews still welcome on v1.
- Documentation: cxl,daxctl,ndctl add --list-cmds info (RongT)
- Pending Alison to apply - is good.
- test/monitor.sh: replace sleep with event driven wait (AlisonS)
- Pending a v4
- ndctl: Various typos fix in Documentation/, cxl/, ndctl/, … (YiZ)
- Pending a v3
- ndctl: Dynamic Capacity additions for cxl-cli (IraW)
- Deferred but not forgotten
QEMU
Jonathan not in mtg. DaveJ covering…
- QEMU 10.1 soft freeze is on the 15th July (1 week from today).
Queued up waiting for Michael Tsirkin to get to:
- FM-API DCD support (Anisa)
Waiting for ARM maintainers:
- ARM-virt - one open question around the address space allocator used for RCRBs.
In good state so maybe if we get enough review we can try to slip in late this week: (please review!)
- 3.2 Event injection updates (Shiju)
- Maintenance commands (Davidlohr and Shiju)
Longer term stuff
- Interest in an upstream MHD implementation, so revisit inter ‘host’ communication path (Gregory)
- MCTP over USB - worked for Anisa so need to resolve remaining issues (MTU not being respected from device to host) and separate from stalled MCTP over I2C
- CHMU. Works etc, but little point in upstreaming yet.
- ARM SBSA reference platform support (separate RC) - Waiting for SBSA and PCI maintainers to review.
- Performance path for non interleaved case. Useful, needs cleaning up and tear down support -> Similar support needed for virtualized DCD.
- Various other sets awaiting new versions.
v6.16 rc fixes
- rc4 PR with some fixes accepted
- CXL Feature: Using full data transfer only when offset is 0 (Ming)
- Waiting on Jonathan to inquire spec clarification with the consortium
- Fix wrong dpa checking in PPR operation (Ming)
v6.17 merge window in cxl/next
- Documentation/driver-api/cxl: Introduce conventions.rst
- Documentation: cxl: fix typos and improve clarity in memory-devices.rst
- cxl/pci: Replace mutex_lock_io() w mutex_lock() for mailbox access
- cxl_test: Limit location for fake CFMWS to mappable range
- cxl/EDAC: use correct format specifier for u32 value
- make cxl_bus_type constant
- Remove core/acpi.c and ACPI dependency on the core for extended linear cache size
v6.17 merge window pending review
- Type2 device support (Alejandro)
- v17 going through reviews
- Add managed SOFT RESERVE resource handling (Smita)
- Pending v5
- Enable CXL PCIe port protocol error handling and logging (Terry)
- v10 going through reviews, v11 in the works
- Delayed dport creation (Dave)
- v5 going through review, v6 in the works
- Introduce DEFINE_ACQUIRE() (Dan)
- Pending v2
- Immutable branch for definition patch on cxl git
- Initialize eiw and eig (Purva)
- Pending v2
- Low Mem Hole (Fabio)
- Posted CXL convention doc, going through review
- new rev in the works
- Zen5 translate part 2 (Robert)
- expect revs to roll out with functionality in chunks like: region code refactor + rework extended linear cache + zen5 code
- CXL reset support for devices. (Srirangan)
- Pending v3
- cxl: Support Poison Inject & Clear by Region Offset (Alison)
- Pending v3 with Jonathan’s feedback, but more v2 comments welcome
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
- Pending v3
- (RFC) Translate DPA->HPA in unaligned MOD3 regions (Alison)
- Pending v1
- Make ELOG and GHES log and trace consistently (Fabio)
- Pending v5 with updates per Jonathan’s review
v6.18 and beyond
- CXL Nvdimm labels (Neeraj)
- RFC going through reviews
- Hotness Driver (Jonathan)
- Not revisited since last meeting. Need to repost with a cleaner solution for register mapping in the core driver. CHMU is the first regloc-addressed thing that has hugely variable size, so need to go poke inside to find out how big it is.
- non-x86 cache flushing (“wbinv”) (Jonathan)
- Cache flushing for non-x86. Descended into a discussion of problems with use of WBINVD on x86, so little useful discussion of what the set actually does. Some minor issues, so I’ll do a v3 late this week (seems unlikely to make 6.17!). Review welcome.
- DCD (Ira)
- Anything new since June?
- vfio-cxl type 2 (Zhi)
- Still pending v2 RFC
June 2025
- Opens
- cxl-cli
- QEMU
- v6.16 rc fixes
- v6.17 merge window
- v6.18 and beyond
Opens
- CXL device life time (Dan)
- John Groves – firmware download/activate issue
- Can’t (may not) complete within 2sec timeout - want to run as background cmd
- revive background abort cmd patch?
- 10 sec is upper bound for what they need
- If abort - what state does that leave the card
- unknown
- device is still working during download/flash
- orig patch was for user space cmds -> was racy
- this use case might be ok
- mainly need to ensure that the state returned is correct
- Spec says one can return background command started - for firmware activate
- can’t support background in general. lose communication
- This is a true background operation
- hardware with abort does not need a timeout but need one if the hardware does not support abort
- could fw activate be a state to poll?
- would require a new opcode
- abort seems messy
- you still need to reset so no need to abort just wait and reset
- there is at least 1 cmd which polls to avoid background abort
- there is no reason to abort until another command comes
- this is user triggered
- Jonathan handle this with Davidlohr because the device supports abort
- New device would need a new mechanism - like sanitize
- Dan prefers a foreground operation which just polls for completion
- would need a new status poll (not generic background)
- Alternate we extend the timeout for firmware update to 1 min (eternity…)
- wait in the shutdown flow?
- Patch set coming w/ Davidlohr’s set (rejiggered)
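A rough sketch of the foreground-poll option from the firmware activate discussion above, for illustration only. `poll_until_done`, the `read_status` callback and the "done" status string are hypothetical names, not the CXL mailbox interface; the 60 second default mirrors the "extend the timeout to 1 min" alternative.

```python
import time


def poll_until_done(read_status, timeout_s=60.0, interval_s=0.5,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll a (hypothetical) status callback until it reports completion.

    Returns True when read_status() returns "done", False when the
    deadline passes first. Nothing is aborted on timeout; the caller
    decides whether to keep waiting or reset the device.
    """
    deadline = clock() + timeout_s
    while True:
        if read_status() == "done":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The clock and sleep functions are parameters only so the loop can be exercised without real delays; a device that supports abort would not need the deadline at all, per the notes.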
cxl-cli
- v82 release is expected to include only what is in pending today.[1] Any patches left out are either pending an update or lack a Reviewed-by tag. [1] https://github.com/pmem/ndctl/commits/pending
QEMU
- FMAPI related stuff going on
- FMAPI over USB working
- should be easier to test with this
- phys switch port control stuff
- CHMU is up on gitlab
- nearly feature complete
- some arm support
- reference machine model but machine may not be maintained
- feedback from arm/qemu maintainers needed
v6.16 rc fixes (Applied to cxl/fixes)
- fix return value in cxlctl_validate_set_features()
v6.17 merge window
cxl/next applied
- Documentation/driver-api/cxl: Introduce conventions.rst
- Documentation: cxl: fix typos and improve clarity in memory-devices.rst
- cxl/pci: Replace mutex_lock_io() w mutex_lock() for mailbox access
- cxl_test: Limit location for fake CFMWS to mappable range
- Fix the min_scrub_cycle of a region miscalculation
- why not 6.16?
- obscure but is a bug
- will move to 6.16
cxl/next targets
- Type2 device support (Alejandro)
- Pending v17
- 2 issues
- conflicts Dan pointed out
- problems with accelerators call with objects which are not there
- pio buffers are in CXL - lower latency
- could there be a callback to say the mem device is coming down out from under the accelerator?
- where could this come from? perhaps cxl module removal?
- link go down?
- talk offline with Dan
- make gross/violent but safe then clean up later
- Add managed SOFT RESERVE resource handling (Smita)
- Pending v5
- Enable CXL PCIe port protocol error handling and logging (Terry)
- Pending v10 - end of the week…
- will revisit locks/reference counts
- Will need to get new Bjorn tags.
- Delayed port enumeration (Dave)
- Pending v4
- Need to consider Robert’s request of providing dport port_num via sysfs
- What is the reason to require this?
- very hard to debug without this
- can’t we export hardware ID?
- can’t because they are not struct device…
- make them devices?
- surface on PCI device? (wrong pci device)
- allocate the dports at the time we are numbering the ports
- Dave will revisit this
- Remove core/acpi.c (Dave)
- Pending v2
- v3 posted - please review
- Introduce DEFINE_ACQUIRE() (Dan)
- Going through discussions
- Pending v2?
- v2 will come with Peter Z’s suggestions
- Using full data transfer only when offset is 0 (Ming)
- Waiting on Jonathan to hear back from consortium on spec language interpretation
- Jonathan will have a look
- John G to look too
- Initialize eiw and eig (Purva)
- Pending v2
- Low Mem Hole (Fabio)
- Creating CXL convention doc
- Zen5 translate part 2 (Robert)
- pending next rev?
- ECN?
- trying to combine with extended linear caching code already upstream
- remove platform specific changes - make more generic
- CXL reset support for devices. (Srirangan)
- Pending v3
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
- Pending v3
- (RFC) Translate DPA->HPA in unaligned MOD3 regions (Alison)
- Needs review and will need an ECN or the like also.
v6.18 and beyond
- DCD (Ira)
- v9 posted, still waiting for a use case
- Jonathan - patch set still applies
- Dan’s appetite for having a sparse device dax is limited
- don’t want this to become another ‘hugetlbfs’
- have another call outside the collab meeting
- vfio-cxl type 2 (Zhi)
- Hotness Driver (Jonathan)
- split the work…
- non-x86 cache flushing (“wbinv”) (Jonathan)
- don’t have a user space ABI so use a kref…
May 2025
- Opens
- cxl-cli
- QEMU
- v6.15 rc fixes
- v6.16 merge window
- v6.17 and beyond
Opens
RobertR: ECN update wrt addr trans series? Can’t talk about the confidential side of the proposal. Linux side: file code-first ECNs (like ACPI) for what we want FW/BIOS to provide. Doing similar for the mem hole problem. Code first means Linux writes the rules Linux needs added to the CXL spec. ACPI ECN examples: ACPI0017, extended linear cache. By starting the discussion in the open, on the Linux mailing list, it is not encumbered by consortium confidentiality.
RobertR: patches address AMD specific addr trans, do we have other users? Should we be pushing a generic solution now? Ans: stay specific now.
FanN: Issue (device probe) using DCD patch set. Has worked around it. FanN to post on cxl mailing list.
DanW: cxl reset - does use case include issuing a reset from userspace?
cxl-cli / user tools
Collecting patches for a v82 release at EOQ 2, align w kernel 6.15.
- ndctl: Add support and test for CXL Features support (DaveJ)
- Needs review tags
- ndctl: Introduce sanitize-memdev functionality (DavidLohr)
- David is pinging the user who asked about it earlier this year
- ndctl: Add inject-error command (Ben)
- ? Pending an update from Ben considering Junhyeok prior set ?
- ndctl: Dynamic Capacity additions for cxl-cli (Ira)
- Deferred but not forgotten
QEMU
Jonathan’s Discord Update (He’s enjoying fine food in Lisbon)
- Most of left over stuff that was queued for 10.0 is now queued by MST. One patch dropped as compile issue.
- Tcg bug introduced in some tlb cleanup work. Affecting code running from cxl mem and some other cases.
- Arm support v13 posted.
- Dcd fmapi updated series on list (Jonathan hasn’t looked at yet).
v6.15 RC fixes
- RC4 PR done
- No more fixes PR unless extremely urgent.
v6.16 merge window - queued
- Remove always true condition for cxlctl_validate_hw_command()
- Verify CHBS length for CXL2.0
- Ignore interleave granularity when ways=1
- Address missing MODULE_DESCRIPTION warnings for cxl_test
- Cleanups and refactors part 1 for Zen5 translation support
- Cleanup debug printk for cxl_dpa_alloc()
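For readers unfamiliar with the interleave encodings behind "Ignore interleave granularity when ways=1", a minimal sketch follows. The two encoders follow the decoder encoding as I understand it from the Linux helpers of similar names (power-of-2 ways encode as log2, 3/6/12 ways encode as 8/9/10, granularity encodes as log2 minus 8). `config_matches` is a hypothetical illustration of why granularity is irrelevant with a single way; it is not the queued patch.

```python
def ways_to_eiw(ways):
    """Encode interleave ways (1, 2, 4, 8, 16, 3, 6, 12) as an EIW value."""
    if ways in (1, 2, 4, 8, 16):
        return ways.bit_length() - 1             # log2 for power-of-2 ways
    if ways in (3, 6, 12):
        return (ways // 3).bit_length() - 1 + 8  # 3-way multiples start at 8
    raise ValueError(f"unsupported interleave ways: {ways}")


def granularity_to_eig(granularity):
    """Encode interleave granularity (256 B to 16 KiB, power of 2) as EIG."""
    if not 256 <= granularity <= 16384 or granularity & (granularity - 1):
        raise ValueError(f"unsupported granularity: {granularity}")
    return granularity.bit_length() - 1 - 8      # log2(granularity) - 8


def config_matches(a, b):
    """Compare two (ways, granularity) pairs (hypothetical helper).

    With ways == 1 there is nothing to interleave, so a granularity
    difference is ignored.
    """
    (ways_a, gran_a), (ways_b, gran_b) = a, b
    if ways_a != ways_b:
        return False
    return ways_a == 1 or gran_a == gran_b
```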
v6.16 merge window - considering
- type2 support (Alejandro)
- v15 posted, v16 coming w rebase on rc4
- Boot to Bash documentation (Gregory)
- v3 posted. Review tags please. Plan is to merge as is and expect incremental fixups can follow
- CXL Maturity Map update (Alison)
- v2 posted. Review tags please.
- RAS features drivers (Shiju)
- v4 posted, ready for merge?
v6.17
- Using full data transfer only when offset is 0 (Ming)
- Waiting on Jonathan to hear back from consortium on spec language interpretation
- Native port protocol error handling and logging (Terry)
- Pending v9 will need to get new Bjorn tags.
- Soft Reserve handling (Terry -> Smita)
- Pending v4
- Introduce DEFINE_ACQUIRE() (Dan)
- Going through discussions
- Delayed port enumeration (Dave)
- v2 posted, going through reviews
- Can Robert check and see if that resolves his dport num issue reported
- Initialize eiw and eig (Purva)
- Pending v2
- Low Mem Hole (Fabio)
- Waiting on ECN to post next rev
- Zen5 translate part 2 (Robert)
- pending next rev?
- CXL reset support for devices. (Srirangan)
- Pending v3
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
- Pending v3
- (RFC) Translate DPA->HPA in unaligned MOD3 regions (Alison)
- Needs review and will need an ECN or the like also.
v6.18 and beyond
- DCD (Ira)
- v9 posted, still waiting for a use case
- vfio-cxl type 2 (Zhi)
- Hotness Driver (Jonathan)
- non-x86 cache flushing (“wbinv”) (Jonathan)
April 2025
- Opens
- cxl-cli
- QEMU
- v6.15 rc fixes
- v6.16 merge window
- v6.17 and beyond
Opens
- DCD (Ira)
- Fan tried a qemu test – failing
- Is using the latest stuff
- Ira does not have a lot of time
- Jonathan will be taking this forward as a fork
- please review it!
- Is DAX ok?
- why is this different than other features which have landed well ahead of hardware?
- Low Memory Hole enumeration (Fabio)
- Robert wanted some changes (different direction)
- more isolation within the implementation for special features
- Address translation rework has a lot of conflicts
- hard to follow
- proposal to have a check if the LMH applies then use the SPA range
- Dan is missing the conflicts – refactoring is ok
- LMH is a small change - different from a whole new addressing space
- what happens when a 3rd, 4th… etc show up?
- don’t be surprised by these things(?)
- why does this quirk need to be delayed by larger changes?
- some code conflicts
- but does LMH break the new code?
- Robert - extended linear caching is harder to abstract and LMH makes that harder
- wants some code isolation
- flat2lm messes this up too - LMH is yet another thing
- is the refactoring for flat2lm done?
- not yet
- the refactoring should make LMH fit easier
- makes SPA != HPA => use for LMH
- part 2 of Robert’s series would do this.
- this has been posted. “Address translation part 1 and 2”
- part 1 does not conflict as much
- Could be helpful to get this landed to clear the backlog
- part 2 mostly needs to be resolved -> hard
- Linux has suffered from platforms taking liberties (invented on the fly)
- there has to be a conversation somewhere on these special configs
- can we get some rules around these things
- examples
- no CFMWS for type2
- SPA vs HPA
- John G. – FAMfs RFC v6.14 out soon. 6.15 rework coming
- Gregory working on boot to bash stuff to put in documentation
- need opinions on this
- in a personal-public github
cxl-cli / user tools
- v81 was released end of Q1.
- Collecting features for a v82 at end of Q2, aligned w 6.15.
- ndctl: Add support and test for CXL Features support (DaveJ)
- Needs review. Driver support is in.
- ndctl: Introduce sanitize-memdev functionality (DavidLohr)
- Needs review. Driver support is in.
- ndctl: Add inject-error command (Ben)
- Pending an update from Ben
- ndctl: Dynamic Capacity additions for cxl-cli (Ira)
- Awaiting driver decision
- might need some clean up on the base commit
- but it is out there
QEMU
- 10.0 out today or next week
- fairly minor features will land after that
- arm support waiting for review
- FM-API review
- FM in qemu A controlling devices in qemu B
- RFC - test FM commands through MCTP
- uses QMP to notify qemu B
- MCTP messages is in shared buffer
- Need feedback from upstream -> may need a socket vs shared buffer
- ‘whatever works’ …
- FM in host could work -> nice to have kernel stack formulate MCTP
- what blocks MCTP
- open BMC is blocked by lack of tests
- need to know what happens with malformed packets
- long way around is to use a PCI (with a distro) -> i2c emulated device
- could abuse this work.
- could just use ARM for MCTP with open BMC
v6.15 rc fixes
- Pending cxl/fixes
- GPF DVSEC fixes (Ming)
- Waiting on more review tags
- CXL Features: Address set_feature and offset flag (Ming)
- email sent
- CXL Features: Set out_len in set_feature failure case (Ming)
- Skip Mem_En check for RCD and RCH ports (Smita)
v6.16 merge window
- Pending cxl/next
- Ignore interleave granularity when ways=1 (Gregory)
- Verify CHBS length for CXL 2.0 (Zhijian)
- Remove always true condition for cxlctl_validate_hw_command() (DaveJ)
- Waiting on more review tags
- CXL type2 support (Alejandro)
- Going through v13 review
- Enable CXL PCIe port protocol error handling and logging (Terry)
- Going through v8 review
- working on it but ioresource is higher priority
- AMD Zen5 address translation support (Robert)
- Going through v2 review
- will send part 1 first and can focus on that now
- Managed SOFT RESERVE resource handling (Terry)
- Going through v3 review
- build bot issues; v4 coming.
- Enable region creation on x86 with low memory hole (Fabio)
- Discussion ongoing
- Focus on clean ups (part 1 series) then decide on LMH
- Delay dport initialization (DaveJ)
- Going through v1 review
- CXL reset support for devices. (Srirangan)
- Going through v2 review
- PCIe subsystem review
- v3 needed but awaiting more comments prior
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
- Pending a v2 update
- Translate DPA->HPA in unaligned MOD3 regions (Alison)
- Needs review
- label RFC but please review anyway
- priority vs LMH/part 1 rework/part 2?
- Gregory does not see anything obvious but will take a quick look
- may be subtle interleave position issues
- any 3-way region will be unaligned
- Update CXL maturity map. (Alison)
- Need review?
- the maturity map needs more review but not Alison’s patch itself
- Please update the maturity map as part of documenting any changes one submits
- RAS features drivers
- ACPI should land in 6.16 too
- locking bugs -> fixed in v3 just posted
v6.17 and beyond
- DCD support (Ira)
- v9 posted
- Dan’s discomfort
- Dan did a public demo almost 2 years ago -> all disappeared
- device dax does not have review scalability
- low priority for Dan
- would like to have an end user stand up
- what about the FM development
- this has stopped being a priority
- John Groves - was shown a demo - not public… yet
- sharable memory
- Has a person who has been testing this
- emulation is probably at a point that John could test this
- AR : John look at the interface and contribute what they need with tagging to make this work for them. With a real use case.
- Jonathan preferred some of the older interfaces but these have all been ok
- Also need a virtualization story
- virtio plan is gone
- cxl emulation is in
- keep this alive on top of type2
- Jonathan will continue to have a staging branch - as stated above
- Gregory
- is dax the right interface for virtualization
- among us we have users
- Yannis - Is this chicken and egg?
- there are demos
- Is the interface what we want to support?
- is it sufficient?
- Dan we know this is not a slam dunk interface
- does it matter if dax may go away?
- dax was advocated at LSFmm - so not going away
- vfio-cxl type 2 (Zhi)
- next version?
- Hotness Driver (Jonathan)
- focus is on emulation (qemu) first
- non-x86 cache flushing (“wbinv”)
- need review
- arm focused but should work anywhere
- used device classes (show in sysfs)
- but no user interface now
- could be used for specific flushes
- various methods
March 2025
- Opens
- cxl-cli
- QEMU
- v6.14 rc fixes
- v6.15 merge window
- v6.16 and beyond
Opens
- Mixed granularity in x3 regions (Alison)
- posted on the list 6&12 way regions
- fix is limited
- this is fine but could this be simpler by relaxing ordering constraints?
- perhaps this makes no difference at all?
- way back - mismatch - interleave was “backwards”
- coarse vs fine
- if there are other use cases of x3 - please review her patch
- LSFmm – Intel attendees?
- DCD?
- Dan should be there
- CXL specific session?
- 1 hour might be better than 1/2
- device specific stuff maybe in the hallway track?
- Gregory to focus on external to the driver
- General CXL discussions 80% on external with maybe 20% internal driver stuff
- chime in on the ML on what you would like to see
cxl-cli / user tools
- v81 is queued up with misc fixups.
- New features on the list:
Davidlohr
- ndctl: Introduce sanitize-memdev functionality (Davidlohr) Dave J.
- fwctl changes? Pending an update from David
- ndctl: Dynamic Capacity additions for cxl-cli (Ira) Simmering waiting for entry into cxl/next
- ndctl: Add support and test for CXL Features support (DaveJ) Simmering waiting for all the pieces to come together for test and review
- ndctl: Add inject-error command (Ben) Pending an update from Ben
QEMU
Missed Michael’s merge window
- queued up for next cycle
- DCD FM api stuff
- interacts with Gregory FM stuff
- ideally would be connected eventually
- Who is using DCD/qemu
- Jonathan has always tested DCD with qemu
- Terry, Adam, and John Groves all using qemu DCD
- Adam – DCD would be nice for compression
- Yiannis – some concerns due to lack of use case for DCD
- John G. allocations from 0 within same tag - similar to storage but not exactly the same
- ARM support review needed
- got blocked way back for device tree support - relaxed
- hotpage support
- please chime in if you are interested
v6.14 rc fixes
none
6.15 in cxl/next
Already in from last meeting:
- Add support for Global Persistent Flush (GPF)
- Cleanup of DPA partition metadata handling
- Removed unused CXL partition values
- Refactor user ioctl command path from mds to cxl_mailbox
- Add logging support for CXL CPER endpoint and port protocol errors
- Remove redundant gp_port init
Newly added:
- Cleanup of gotos using guard() series
- Validation of CXL device serial number
- CXL ABI documentation update/fixups
- CXL Features support (First part of CXL FWCTL support)
- FWCTL specific CXL bits will be pushed by Jason G.
- Additional support for dirty shutdowns
- Extended Linear Cache enumeration and RAS support
- Last 2 patches from Smita for firmware first error logging
- cxl_test to support 3-way capable CFMWS
- Documentation fix to remove “mixed mode”
6.15 merge window considerations
- 6.15 merge window closed for large series.
- May still take small changes or fixes that are urgent.
v6.16 and beyond
- cxl: Add address translation support and enable AMD Zen5 platforms (Robert)
- v4 of part1 in review
- v3 of part2 pending?
- Update soft reserved resource handling (Nathan -> Terry)
- v3 pending
- next version this week
- CXL PCIe port protocol error handling and logging (Terry)
- v8 pending?
- tomorrow…
- Support CXL memory RAS features - EDAC (Shiju)
- pending v2? (AKA v24 … ;-)
- core support landed
- yay! can do this now
- Support background operation abort requests (Davidlohr)
- pending v2?
- Enable Region creation on x86 with Low Mem Hole (Fabio)
- v3 posted, under review. Would like this queued for cxl/next after merge window.
- Type2 device support (Alejandro)
- v11 posted. Under review. Would like this queued for cxl/next next merge window.
- Rest of DCD series (Ira)
- Pending type2 acceptance
- look for another version after the merge window
- will be watered down from previous version
- need use cases beyond the spec
- The cost of not merging this is that nothing is being built above it
- Several CPU vendors talking about hot add/remove
- is DCD a way to get around this?
- flush cost limits things
- a motivation is to decouple decoder programming from on/off lining memory
- entire provisioning mechanism is DCD with CXL 3 - endpoint
- use case order is important as well
- Linus/mm folks won’t care – But Dan wants the folks shipping products to stand up.
- Allow 6 & 12 way regions on 3-way HB interleaves (Alison)
- Pending v3?
- already discussed above (in opens)
- Translate DPA->HPA in unaligned MOD3 regions (Alison)
- v1 needs review
- cxl: factor out cxl_await_range_active() and cxl_media_ready() (Zhi)
- Pending next rev?
- Add cxl reset support (Srirangan)
- Pending review
- PCIe folks looking at this…
- Cleanup add_port_attach_ep() “cleanup” confusion (Dan)
- Dan needs to review
RFC
- vfio-cxl type 2 (Zhi)
- Pending next rev
- Zhi could not make the call
- Hotness Driver (Jonathan)
- lots to do here
- Combining all hotness features CXL and beyond - Discuss at LSFmm
- DAMON… NO… ?
- one option on the table
- save it for LSFmm
- boot to bash
- Gregory’s documentation journey
- kernel docs?
- AI will pick it up from there.
- CEDT recipes?
- good feedback on this
- LSFmm session will focus on how Linux expects things to be configured
- memory blocks vs region alignment -> lose memory
- Theory vs how Linux really works
- All of the complication comes from BIOS and OS interactions after BIOS sets things up
- Need OS first!!!
- ACPI tables need to be correct.
- backwards compatibility...
- please tear the docs apart if you think they are wrong
- FAMfs is running under FUSE!
- patch to libfuse
- may have a branch before LSF
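The "memory blocks vs region alignment -> lose memory" point from the boot to bash discussion above can be made concrete with a small calculation. This is a sketch under one assumption: only whole, block-aligned memory blocks can be used, so any partial block at either end of a region is lost. `usable_range` is a hypothetical helper and the sizes are made up for illustration.

```python
GIB = 1 << 30


def usable_range(base, size, block_size):
    """Trim [base, base + size) to whole blocks of block_size bytes.

    Returns (aligned_base, usable_bytes, lost_bytes).
    """
    start = -(-base // block_size) * block_size       # round base up
    end = ((base + size) // block_size) * block_size  # round end down
    usable = max(0, end - start)
    return start, usable, size - usable


# A 4 GiB region at 5 GiB with 2 GiB blocks keeps only the [6 GiB, 8 GiB) block.
start, usable, lost = usable_range(5 * GIB, 4 * GIB, 2 * GIB)
print(start // GIB, usable // GIB, lost // GIB)
```

With the same region placed at 4 GiB instead, nothing is lost, which is why the placement chosen by BIOS matters.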
February 2025
- Opens
- cxl-cli
- QEMU
- v6.14 rc fixes
- v6.15 merge window
- v6.16 and beyond
Opens
- Bueler?
cxl-cli / user tools
- v81 is open with misc fixups
- Need review on build and coverity fixups on list
- New features in review:
- ndctl: Introduce sanitize-memdev functionality (Davidlohr) https://lore.kernel.org/linux-cxl/20240928211643.140264-1-dave@stgolabs.net/
- ndctl: Dynamic Capacity additions for cxl-cli (Ira) https://lore.kernel.org/nvdimm/20241214-dcd-region2-v4-0-36550a97f8e2@intel.com
- ndctl: Add support and test for CXL Features support (DaveJ) https://lore.kernel.org/linux-cxl/20250207234718.2387622-1-dave.jiang@intel.com/
- ndctl: Add inject-error command (Ben) https://lore.kernel.org/nvdimm/20250108215749.181852-1-Benjamin.Cheatham@amd.com/
QEMU
- inside merge window
- Fujitsu clean up
- ARM virt support reposted
- 2 Samsung series
- Hotlist monitoring
v6.14 rc fixes
none
6.15 in cxl/next
- Add support for Global Persistent Flush (GPF)
- Cleanup of DPA partition metadata handling
- Removed unused CXL partition values
- Refactor user ioctl command path from mds to cxl_mailbox
- Add logging support for CXL CPER endpoint and port protocol errors
- Remove redundant gp_port init
6.15 merge window
- Rest of DCD series (Ira)
- pending v9
- much discussion on actual use cases
- AI: Ira to schedule another call between Dan, John, Jonathan and Ira…
- Support background operation abort requests (Davidlohr)
- Pending v2
- will rebase and send soon
- CXL PCIe port protocol error handling and logging (Terry)
- v7 posted, review on going
- Some devices/drivers have been happily CXL-unaware (prtdrv)
- should PCI subsystem throw errors to ‘cxl land’?
- CXL system must be loaded for processing these errors
- Alternate: make PCI system more cxl aware
- new file is ok
- 2 fifos CPER/OS first
- fifo overflows if cxl is not loaded (user ‘asked for it’)
- mapping between PCI/CXL device
- AER -> fifo -> wq -> pciaer -> aer src info???
- AER -> fifo -> cxl core?
- AER statistics (CXL counters?)
- Jonathan to post reference
- Type2 device support (Alejandro)
- v10 posted, review on going
- memdev state vs device state
- need to have Alejandro to discuss further
- Trace FW-First CXL Protocol Errors (Smita)
- 1-4 in cxl/next. 5&6 needs more work
- Q: looks like this needs to be in cxl core?
- yes it is not an endpoint but a port object
- cxl: Add address translation support and enable AMD Zen5 platforms (Robert)
- Part 1&2 v2 posted
- Review on going
- Reference Low memory hole: was generic to ‘some platform may do this’
- specific AMD file? can this be more generic?
- has anyone else done this? … no
- specification help?
- Update soft reserved resource handling (Nathan, Alison)
- v2 is posted. Review on going?
- Hildenbrand(sp?) had comments
- Introduce generic EDAC RAS control feature driver (Shiju)
- v19 posted. Review ongoing?
- v19 discussion : Boris is unconvinced about the API
- reverse engineer tracepoints?
- marshal into ioctl
- marshal into tracepoint (format different)
- use sysfs
- lockdown kernels disable debugfs
- Must be in EDAC - Boris
- v20 posted
- Dan to look at CXL bits
- online repair - must see an error ‘this boot’
- call for memory device manf. to look at the API and weigh in
- can parameters just fine
- FWCTL CXL (Dave)
- pending v6
- Add exclusive caching enumeration and RAS support (Dave)
- v3 posted, minor changes requested from Ming. Need Dan’s review
- Enable Region creation on x86 with Low Mem Hole (Fabio)
- v2 posted, review ongoing.
- some assertion this should wait for Robert’s stuff?
- Can we apply this?
- this could affect Robert’s patches with a small conflict - Gregory
- cxl: factor out cxl_await_range_active() and cxl_media_ready() (Zhi)
- pending v3?
- will respin
- vfio-cxl type 2 (Zhi)
- will rebase on type 2 v10+
- Use guard() instead of rwsem locking cleanup series (Ming)
- v2 review on going
- should be simple; get reviewing folks!
- cxl/pmem: debug invalid serial number data (Yuquan)
- v3 review on going
- Add cxl reset support (Srirangan)
- review on going
- Dirty shutdown followups (Davidlohr)
- Cleanup add_port_attach_ep() “cleanup” confusion (Dan)
- pending v3?
- forgotten… will be remembered…
6.16 and beyond
- Hotness driver (Jonathan)
- might need a sub-call on this; Jonathan to schedule
- vfio-cxl? (Zhi)
- see above…
January 2025
- Opens
- cxl-cli
- QEMU
- v6.14 merge window
- v6.14 rc fixes
- v6.15 merge window
- v6.16 and beyond
Opens
- 1.5 hour meeting?
- Feature velocity is slowing
- Upstream support needs to land well ahead of distro acceptance
- Some vendors may not release hardware without ecosystem support
- lack of hardware does not mean a feature is not important
- DCD is important but type 2 seems to be more important because devices are out and real
- The core had an issue which probably should have been cleaned up a while ago
- both DCD and type 2 were trying to not ‘disrupt’ the status quo
- Generally clean up first is good
- In this case there are other issues with both sets so it might be fine… this time
- Cross-subsystem ties?
- Jonathan would like to see at least one of these features queued for cxl-next
- More reviews!
- CXL will now be ‘leaking’ out into the rest of the kernel.
- Use cases may need to make CXL core changes… without redoing the entire thing.
- type 2 is higher priority than error handling. Many folks are doing FW first.
- there are some FW first patches on the tail end of the port error handling series.
- Maybe those should come first or as a separate patch set?
- AFAWK they don’t conflict with type 2
- DCD can go in with device dax. memfd is a future question. should device dax move toward memfd?
- famfs needs device dax. without memfd changes it can’t replace device dax.
- memfd is currently geared toward anonymous memory and device dax provides better super block support
- also tagging support in memfd would also be a bigger change
- memfd support may be growing some persistence so there may be conflicts in the support and a decision may need to be made.
- public? guest memfd call is public
- Gowans is working on this
- CC can’t consume device dax. which also pushes toward memfd
- How would CC handle shared memory? – can’t be anonymous
- shared confidential is down the road – we are not ready for it
- ratelimit AER
- https://lore.kernel.org/linux-pci/cover.1736341506.git.karolina.stolarek@oracle.com/
- https://lore.kernel.org/linux-pci/20250115074301.3514927-1-pandoh@google.com/
- Pradeep to check with Terry
- CXL reset
- patch set should apply to all devices
- Don’t the SBR patches already do this?
- This would destroy the regions
- We think so
- If this is adding a new reset to that then we should be ok
- CXL reset is different
- It does work through sysfs
- expected use case would be to go through type 2 driver and it could ensure memory flushes
- Does this need a new version?
- Still RFC
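For the "ratelimit AER" open above, a minimal sketch of an interval/burst limiter, similar in spirit to the kernel's generic ratelimit state. It is not taken from either linked patch set; the class name, fields and the numbers in the usage note are assumptions for illustration, and which AER messages get limited is not covered.

```python
class RateLimit:
    """Allow at most `burst` events per `interval` seconds; count the rest."""

    def __init__(self, interval, burst):
        self.interval = interval
        self.burst = burst
        self.window_start = None
        self.emitted = 0
        self.suppressed = 0

    def allow(self, now):
        # Start a new window on the first event or once the interval has elapsed.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.emitted = 0
        if self.emitted < self.burst:
            self.emitted += 1
            return True
        self.suppressed += 1
        return False
```

With `RateLimit(interval=5.0, burst=2)`, a storm of correctable errors would log two messages per five second window and count the remainder as suppressed, so the count can still be reported later.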
cxl-cli / user tools
- v81 is open with misc fixups and unit test updates at the moment
- New features need review:
- DCD needs review (Ira) https://lore.kernel.org/nvdimm/20241115-dcd-region2-v3-0-585d480ccdab@intel.com/
- Sanitize memdev needs review (Davidlohr) https://lore.kernel.org/linux-cxl/20240928211643.140264-1-dave@stgolabs.net/
- Inject error (Ben) https://lore.kernel.org/nvdimm/20250108215749.181852-1-Benjamin.Cheatham@amd.com/
QEMU
- one fix around MSI
- clean up some error paths
- new staging tree should be out in a day or 2
- hot miss – HMU
- roughly speaking one can run a real work load and get real data
- 10% speed (4min to boot)
- infinite counters (no one will build this) – could add knob to change counters
- framework is in place
- if other HW vendors like to upstream ‘real’ behavior
- would be nice to have a range of implementations
In v6.14 merge window
- ACPI/HMAT: Move HMAT messages to pr_debug()
- cxl/pci: Add CXL Type 1/2 support to cxl_dvsec_rr_decode()
- CXL events updates for spec r3.1 (series)
cxl-next may still consider the following. Would like to close cxl-next by Wed/Thurs.
- cxl/pci: Support Global Persistent Flush (GPF)
- Need review tags
- DPA partition meta data cleanup
- Need v2 and review tags
v6.14 rc fixes
- None so far
v6.15 merge window
- Rest of DCD series (Ira)
- Pending v8
- pending on DPA partition cleanups from Dan
- week?
- Support background operation abort requests (Davidlohr)
- Pending v2
- user space would abort previous operation
- v2 on the way
- CXL PCIe port protocol error handling and logging (Terry)
- v5 posted, review on going?
- yea
- Type2 device support (Alejandro)
- v9 posted, need review
- Also pending on DPA partition cleanups from Dan
- is there a way to check if an allocation came from devm?
- xfc is a network driver
- put cxl side of the driver in drivers/cxl
- Alejandro believes he has a solution
- VFIO also has this problem
- export a new function so the VFIO can do the release
- convert some core calls to non-devm
- patch set has been tested with 2 different drivers and AMD
- should be very stable
- cxl-test has not been run though
- but also can’t break the cxl-test build
- should also have a basic cxl smoke test
- would love that every feature has a new cxl-test but not always required
- Trace FW-First CXL Protocol Errors (Smita)
- v5 posted. Is linux-efi picking up the series?
- cxl: Add address translation support and enable AMD Zen5 platforms
- Review on going
- Add device reporting poison handler (Shiyang)
- Will there be v5? No movement since last September
- rasdaemon folks are engaged
- rasdaemon will find the tracepoints
- what about corrected errors. would need to soft offline
- who does that? don’t know yet.
- Update soft reserved resource handling (Nathan, Alison)
- v2 is posted. Review on going?
- Introduce generic EDAC RAS control feature driver (Shiju)
- v18 posted. Review ongoing?
- v19 coming
- complexity around the interface caught on merge
- specifically memory sparing – because DPA is not stable
- add PoC in user space to show the usage of the API
- need vendors to step up to show this is not a single vendor solution
- does this just belong more on the CXL side vs EDAC
- but Boris wanted it in EDAC – for unified interfaces
- maybe write a whitepaper to help explain – it is in the documentation
- when is it safe to use the interfaces? ie after a boot?
- safety rules vary a bit
- error record must correspond to the current boot
- can a device really do this… ‘atomic swap’
- soft-hibernate idea
- just document it – device is to take care of it
- feature query
- xarray will carry on forever … should not see that many errors
- PPR will have separate support
- FWCTL CXL (Dave)
- Almost done with cxl cli support and cxl unit test for using ioctls. v1 will be posted once that is done.
- Add exclusive caching enumeration and RAS support (Dave)
- v3 posted, minor changes requested from Ming. Need Dan’s review
- all tags from Jonathan
- Enable Region creation on x86 with Low Mem Hole (Fabio)
- v2 posted, review ongoing.
- Need to sync with Robert’s address translation series?
- not sure how to do what Robert suggests
- cxl: factor out cxl_await_range_active() and cxl_media_ready() (Zhi)
- v2 posted, Need more review tags
- Cleanup add_port_attach_ep() “cleanup” confusion (Dan)
- pending v3?
6.16 and beyond
- Hotness driver (Jonathan)
- Updates?
- vfio-cxl?
- Any new updates?
Admin Issues
- Anyone want to host the next meeting?
December 2024
- Opens
- cxl-cli
- QEMU
- v6.13 rc fixes
- v6.14 merge window