CXL Collab Sync
CXL Linux Sync: Ground Rules
- Do not share confidential information
- Do not share confidential product details
- Do not disclose CXL consortium confidential information
- Do discuss any Linux questions about released CXL specifications
- Do use Discord as a supplement for this sync meeting for quick questions
- Do follow up on linux-cxl@vger.kernel.org for longer questions / debug
- https://pmem.io/ndctl/collab/
March 2024
- Opens
- FAMFS update
- QEMU
- cxl-cli
- v6.8 Fixes
- v6.9 Queue
- Future
QEMU
- 8 week merge cycle still open
- Pre-reqs pending
- SPDM from Alistair
- MCTP
- Bounce buffer fix for DMA to CXL memory
- Generic port pending generic initiators from NVIDIA
- DCD Emulation Update
- Feedback incorporated
- Superset extend/release as well as partial extend/release supported
- Tests passing
- Formalizing introspection may make sense in the future
- Might be too late for cycle ending in a week or so
- Greg: Multi-head interactions with DCD emulation? Can it be added incrementally?
- Investigate a shared base device for single/multi-head implementations to share
- Greg to RFC what he has
- Generic Port
- Background command support queued behind DCD for convenience
- MCTP over I2C still in process
- ARM support still pending device-tree interactions, help welcome
- Firmware first error handling, not impossible to upstream but not a priority
- CPMU
- FM-API help to flesh out the command support welcome
- Non-interleaved high performance CXL memory emulation (how to represent the performance)
- QEMU only emits x1 lowest bandwidth link speed
FAMFS
- Review, thanks Jonathan
- DEVICE-DAX IOMAP review needed
- PMEM support may be dropped in favor of just DAX
- Christian Brauner has advice on how to open dev-dax
- Fault counters to be removed
- FAMFS held up to performance benchmarking
- Superblock identified capacity
- Initial use case: provide access to a large shared pool with readonly clients
cxl-cli
- List Media Errors (Poison): pending review
- QoS class changes; pending
- v79
v6.8 Fixes
- 3, 6, 12 XOR interleave math fix: pending feedback
- SSBLIS Fix: ready to queue
- CXL QOS Sysfs fixes / simplification: merged
- Fix “HPA out of order” region assembly fix: merged
- Fix “no NUMA configuration found”: merged
- Crash on repeated AER signaling: merged
- cxl_test build fix: merged
- Stop requiring MSI/MSIx: merged
- Fix x16 Region HPA allocation: merged
- Fix duplicate messages in CPER handling: merged
v6.9 Queue
- CXL QOS to NUMA: pending merge
- Weighted Interleave: queued in mm-unstable
- DAX support on modern ARM: pending merge
- CXL CPER Protocol Errors to Trace Events: pending review
- CXL EINJ: pending merge check ACPICA
- CXL Userspace Unit Tests: pending review
- CDAT Cleanups: queued
- CXL test save/restore: pending non-RFC posting
- Use sysfs_emit() throughout: queued
- cond_guard() and related cleanups: pending next posting
- scoped_cond_guard() usages pending for v6.9
Future
- Component State Dump interaction with event clearing
- how much data is in a CSD, how much blob can trace event support
- CXL Scrub Feature: more review needed
- DRAM Scrub necessary over time
- Tradeoffs of reliability vs scrub cost
- want hotplug support
- Address Range Scrub, on demand scrub
- new patchset in process
- sync with RAS API folks on reusability
- OpenCompute model of out of band control might be in conflict with embedded use cases
- RAS API does not supply stop-scrub on inband interface
- CXL Switch Port Error Handling: pending initial posting
- CXL Root Port (RCEC Notified): Error Handling: pending initial posting
- DCD: pending next revision
- DPA to HPA translation for events
- Type-2 Preview: still awaiting a consumer
- CCI Refactor for Switch CCI, RAS API, Type-2: pending next posting
- MMPT in Jonathan’s queue
February 2024
- Opens
- LSF/MM CFP: deadline March 1st
- QEMU
- cxl-cli
- v6.8 Fixes
- v6.9 Queue
- Future
QEMU
- Status
- 2 patch sets to pick up; bunch of fixes
- Not clear spec versions so update those to 3.1
- Fan’s next DCD version; close
- Some minor issues
- Would like to land in the 9.0 cycle (approx. end of March)
- Some things depend on these so want to land them first
- MHD won’t make March
- TCG/KVM mess
- Bug report on list; Not as minor as thought
- Slow path does not cover everything unfortunately
- May be some other issues
- Random crashes (might be page tables or ??)
- Alternative is to implement performance path
- Treat as normal RAM
- Can’t do interleave with lots of memory regions (ways?)
- For now… Don’t use emulated CXL memory
- Fan said it would work for some cases?
- Kernel code is now putting things in the right numa nodes
- Kernel may have been using swap
- Should x86 use memblock?
- Jonathan does not think it will help
- Re-read cdat?
- EFI soft reserved causes x86 to keep the info around
- ‘numa keep meminfo’ or something like that
- AMD CPER pushed out
- Jonathan would like a HEST table from x86 if someone could provide that
cxl-cli
- List Media Errors (Poison): pending review
- QoS class changes; pending
- Porcelain patches welcome
- How can we make things easier?
- Automate cxl create region for largest regions it can figure out
Should Linux be the BMC?
- Open BMC has a lot of drivers
- need guard rails
- Might be useful and to share code
- BMC only use cases are questionable
- How do we ID which is which?
- Kconfig CXL_BMC_SUPPORT?
- Similar to raw command support
v6.8 Fixes
- CXL QOS Sysfs fixes / simplification: pending next posting
- Fix “HPA out of order” region assembly fix: ready to queue
- Fix “no NUMA configuration found”: queued for v6.8-rc4
- Crash on repeated AER signaling: queued for v6.8-rc4
- cxl_test build fix: merged v6.8-rc2
- Stop requiring MSI/MSIx: merged v6.8-rc2
- Fix x16 Region HPA allocation: merged v6.8-rc2
- Fix sleeping lock in CPER handling: pending next posting
- Fix duplicate messages in CPER handling: Going through EFI tree
AER fatal panic - wide range of handling
- Policy change to discuss with community
- Instead of hoping we should panic?
- But if DAX just kill the process (invalidate mappings)
- But how much running around should we do?
- Hope was that force remove of driver would do a pr_warn() [let panic on warn crash]
- Need more real world feedback
v6.9 Queue
- CXL QOS to NUMA: pending review
- Weighted Interleave: queued in mm-unstable
- DAX support on modern ARM: pending final review
- CXL CPER Protocol Errors to Trace Events: pending review
- CXL EINJ: pending resolution of ACPICA dependency
- CXL Userspace Unit Tests: pending next posting
- CDAT Cleanups: queued
- CXL test save/restore: pending non-RFC posting
- Use sysfs_emit() throughout: queued
- cond_guard() and related cleanups: pending next posting
Future
- CXL Scrub Feature: more review needed
- CXL Switch Port Error Handling: pending initial posting
- CXL Root Port (RCEC Notified): Error Handling: pending initial posting
- DCD: pending next revision
- DPA to HPA translation for events
- Type-2 Preview: still awaiting a consumer
- CCI Refactor for Switch CCI, RAS API, Type-2: pending next posting
November/December 2023
- Opens
- Plumbers Takeaways
- QEMU
- cxl-cli
- v6.7 Fixes
- v6.8 Queue
Opens
- Interleave ratios: MVP
- mempolicy based to start
- cgroups deferred for a later fight
Plumbers Takeaways
- Greg’s interleave document
- 5 types: BIOS, OS, mempolicy (homogenous or heterogeneous)
- LWN Article for reach? Follow in the style of Mel’s NUMA article
- UKunit: Userspace unit testing of kernel code
- limitation on what can be mocked with Kunit
- https://github.com/jimharris/ukunit
- Davidlohr to post notes
- Port device RAS support
- Move PCIe port bus driver logic into PCIe driver/core to start as library
- AER handler callback to the endpoint driver
- Break the pcie portbus driver dependency
- Hotplug range register problem resolution
QEMU
- mst picked up more than expected, including CCI support, into 8.2-rc1
- ira’s cdat fixes posted
- scrub control: both QEMU and kernel patches posted
- Integrate with ACPI scrub control as a subsystem shared with CXL
- Alistair’s SPDM work progression
cxl-cli
- concern for first-time users
- dnf install cxl-cli
- cxl list -RX
- v79 release imminent
- corresponding to v6.7 updates
- [hotplug range register support?](https://lore.kernel.org/linux-cxl/ZCRhhUDcmypVKu0X@memverge.com/)
- disable device mem_enable modify range register + re-enable
- how to handle zero based DVSEC range register
v6.7-fixes
- locking fixups
v6.8
- Interleave syscall
- John: don’t force people to go through BIOS for interleave
- Michal: looking for mempolicy2() support
- Greg: also working on third-party mempolicy syscall via pidfd (minus mbind/homenode)
- once syscalls are in interleave weights can be layered on top without ABI changes
- numactl changes would be nice to have
“Halloween” 2023
- Opens:
- QEMU
- cxl-cli
QEMU
- Multiple HDM decoder support landed
- Compilation issues slowed down a topic
- Mailbox CCI rework sent out
- Difficult to test MCTP infrastructure
- Fan in process of next DCD posting
- FMAPI support on top of DCD (“add” support, test interfaces included “real” tooling wanted)
- QEMU support for changing QOS class information?
- weighted interleave investigation
- generic target support needed
cxl-cli
- sanitize command unit test (for v80, depends on v6.7)
- poison listing support (for v79 kernel support in v6.5)
- automatic region position determination for create-region (--strict option for recovery of old behavior)
v6.7
v6.8
- DCD next revision pending
- Spec pipecleaning in progress
- Node Weights and Weighted Interleave - Gregory Price
- John: Tier preference vs local preference?
- Gregory: bandwidth vs latency tiering conflicts
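For readers new to the weighted interleave idea discussed above, here is a minimal sketch of the placement pattern it implies: each node receives a number of consecutive pages equal to its weight before the allocator moves to the next node. This is an illustration only, not the kernel implementation; the node ids, the weights, and the function name `weighted_interleave` are hypothetical.

```python
from itertools import islice


def weighted_interleave(weights):
    """Yield node ids forever, giving each node `weight` pages per round.

    `weights` maps a NUMA node id to a positive integer weight.
    Illustrative model only; it does not mirror kernel data structures.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty map of positive integers")
    while True:
        for node, weight in sorted(weights.items()):
            for _ in range(weight):
                yield node


# Hypothetical example: node 0 (local DRAM) weighted 3, node 1 (CXL) weighted 1.
example_weights = {0: 3, 1: 1}
placement = list(islice(weighted_interleave(example_weights), 8))
```

With these example weights, three out of every four pages land on node 0, which is the kind of bandwidth-proportional split the weights are meant to express.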
October 2023
- Opens:
- Jim: QEMU dport conflicting connections, (1) HB (1) 1 RP (2?) Switches (4) Endpoints (Who detects impossible configs?)
- Gregory: port to region confusion (make create-region smarter)
- Vincent: multi-function upstream ports? Yes, for PCIE, does CXL mandate function0?
- Steve: RCH link width / speed enumeration (emit via CXL objects?) Jonathan RCIEP examples of emitting attributes, virtual switch?
- Jonathan: Dynamic NUMA node creation
- 0-size NUMA node entries in SRAT already shipping
- QEMU
- cxl-cli
- v6.6 Fixes
- v6.7 Queue
QEMU
- Cleanup sets upstream
- mst has QTG in the backlog
- backlog of PCI bits
- switch serial number on upstream port
- multi-HDM decoders
- mailbox rework for Switch CCI + MCTP over I2C (difficult to add aspeed to x86 machine model)
- DCD: working through reported issues wrt kernel patches
- Fabric management ambiguities
- MCTP representation of MLDs? Single-MLDs when plugged in as an SLD.
- FMAPI binding when sending to a switch, not Type-3, except for the general commands like identify
- I.e. use type-3 binding except opcodes 0x4000+ when talking to a switch
cxl-cli
- Poison List Retrieval
- Towards CXL continuous integration
- Vishal: set alert config patches queued up
v6.6 fixes
- v6.6-rc3 update
- Fix shutdown order
- awaiting testing
- need to rework mbox irq to be threaded or an atomic flag
- Soft Reserved Conflict / Lifetime
- Auto-assembly Rework
- Jim: Granularity fix top down is confusing switch settings
- Davidlohr: Type-2 crash interaction with security shutdown order?
v6.7+
- RCH EH
- QTG
- QTG to HMEM
- Switch CCI
- Davidlohr: background status publishing to userspace? Bind VPB, Sanitize via Tunnel?
- Jonathan: Punt until someone with BMC background can help drive
- Jonathan: Possibly some NVME MCTP work to draft behind
- Jonathan: start with safe commands to get framework started
- Gregory: multi-headed SLD testing validating the approach of an independent mailbox core (QEMU)
- SPDM / Auth
- SPDM BoF Planned for Plumbers in November
- memmap on memory
- mempolicy proposals:
- multi-tier
- mempolicy2
- mempolicyNM
- [weighted interleave]
- Informal Plumbers BoF
September 2023
- Opens:
- John: CXL memory online by default memhp_default_state=offline not working?
- QEMU
- cxl-cli
- v6.6 Fixes
- v6.7 Queue
QEMU
- Merge window induced slowness
- Round-up of fixlets sent up
- Multiple HDM Decoder support for endpoints posted
- Serial number update
- Maintainer feedback administrivia cleanups
- Sort out revision numbers for spec version comments
- advocate with your rep about caching old copies at spec-landing
- MCTP I2C from NVME
- Single Aspeed i2c controller driver has support
- POC quality / out-of-tree support until server class driver arrives
- DCD Update
- waiting for kernel-side code resolution
- Get Extent List for unaccepted memory, track pending state in the implementation
- cxl-test may need updates too
- MHD Update
- Joint effort with SK Hynix, custom command set
- Proto-DCD
- Single logical device
- Software Development Vehicle
- CPMU, ARM, Compliance, Type-2
- SPDM Interest
- WDC looking at library-izing it, still looking to support an external agent
- FM API (MCTP Mailboxes + Switch CCI + MHD Mailbox)
v6.6 Fixes
- CXL RAS Enabling
- Region Granularity Setup
- Region Decoder Discover
v6.7 Queue
- RCH EH (under)
- Kernel SPDM
- WDC showing up to help
- Invite to CXL sync? Invited to “devsec”
August 2023
- Opens:
- Linux Plumbers CXL Microconference CFP
- uConf proposals close at the end of August
QEMU Update
- Not a huge amount going in this merge: docs and fixes. Multiple HDM decoders should be going in this merge.
- Lot of stuff is backed up by the mailbox rework
- Jonathan’s GitLab has DCD preview queued up.
- Ira did some testing and fixes were merged in latest version
- Jonathan might have broken it with rebasing. So just a reminder that this is work in progress.
- MCTP support over I2C: support is coming from NVME-MI; this work is similar to FM-API
cxl-cli update:
v6.5 Fixes Queue
v6.6 Queue
- RCH Error handling
- Terry working on it right now. Was waiting on a response from Dan, which should have arrived yesterday.
- Will pick that work back up
- Type2
- Davidlohr to submit the fix for type2 init collision. (Merged)
- Dan rebasing patches. There is conflict here with the Switch CCI work. See below.
- DCD
- Ira is reworking the patch set quite a bit.
- Fan’s QEMU DCD work is being used
- Cxl-test being added for better regression testing
- Cxl-test event processing was changed
- New DAX device work needed to handle sparse extents within the dax region
- Interleaving is in the back of his head and Navneet has been looking into this. However, interleaving is not slated for this initial work
- Jonathan - concerned that interleaving should not be precluded
- Leave in comments about where interleaving would fit in.
- Interleaving is the next major feature…
- QEMU - DCD merge would be at least 6.7 aligned.
- Switch CCI (Jonathan)
- Opens around what we do for user space – almost every command is destructive
- Maybe just CXL raw commands are required?
- Patch set has been a pain to rebase on type 2 from Dan
- Would really like review / feedback
- Davidlohr would like to merge the ‘moving around code’ sooner
- Would help with the type 2 conflicts
- It is hard to generalize the code without this second user
- Not critical for 6.6
- would like to see an early merge slated for 6.7
- In the end – Security questions are major gating factor
- Memory tiering in general
- CDAT vs HMAT
- ‘Distance’ calculations vary
- Patch set: ‘Mem tiering calculating abstract distance from ACPI’ (v6.7 material)
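To make the "abstract distance" discussion concrete, the sketch below shows one plausible way a distance could be derived from firmware-reported latency and bandwidth: scale a DRAM baseline up as latency grows and as bandwidth shrinks. This is an assumption for illustration and is not taken from the patch set; the baseline constant, the field names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical baseline distance assigned to reference DRAM (not a kernel value).
DRAM_BASELINE_DISTANCE = 1000


@dataclass(frozen=True)
class MemPerf:
    """Access characteristics as firmware tables might report them (assumed units)."""
    read_latency_ns: int
    write_latency_ns: int
    read_bw_mbps: int
    write_bw_mbps: int


def abstract_distance(dev: MemPerf, dram: MemPerf) -> int:
    """Scale the baseline by the latency ratio and the inverse bandwidth ratio."""
    dev_lat = dev.read_latency_ns + dev.write_latency_ns
    ref_lat = dram.read_latency_ns + dram.write_latency_ns
    dev_bw = dev.read_bw_mbps + dev.write_bw_mbps
    ref_bw = dram.read_bw_mbps + dram.write_bw_mbps
    if min(dev_lat, ref_lat, dev_bw, ref_bw) <= 0:
        raise ValueError("latency and bandwidth must be positive")
    # Integer math: multiply first, divide once, to avoid early rounding.
    return DRAM_BASELINE_DISTANCE * dev_lat * ref_bw // (ref_lat * dev_bw)


# Hypothetical numbers: the CXL device has 2.5x the latency and half the bandwidth.
dram = MemPerf(100, 100, 100_000, 100_000)
cxl = MemPerf(250, 250, 50_000, 50_000)
dram_distance = abstract_distance(dram, dram)
cxl_distance = abstract_distance(cxl, dram)
```

A larger result means "further" from the CPU, so a device with these sample numbers would sort into a slower tier than DRAM. Different weightings of latency against bandwidth give different orderings, which is one reason distance calculations vary.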
FM general topics
- We said we would talk about FM things in this meeting…
- Is there something at plumbers? Yes there is.
- Plumbers BoF for FM stuff?
Question from discord:
- John: “numa ratio policy patch?”
- Jonathan will try and dig in to see where the patches are
- We are talking at a VMA level.
QEMU Update
cxl-cli update
v6.5 Fixes Queue
- Region autodiscovery fixes
- x1 granularity calculation fix: minor fixups requested
- switch decoder allocation: minor fixups requested
- Hotplug fixes
- Cleanup softreserve on takeover: awaiting review
- Reuse SRAT proximity domain: pinged x86
- CXL _OSC AER Fixup: minor fixups requested
v6.6 Queue
- Queue closes August 18th
- RCH Error handling: fixes requested
- QTG enabling
- ACPI HMAT Generic Port support: awaiting merge
- Surface QTG ID info: awaiting merge
- CDAT Parsing: awaiting merge
- Finish Type2 enabling
- Fix security init collision: different approach requested
- Rebase remaining Type2 HDM API
- DCD: awaiting next rev
- Switch CCI: awaiting review
July 2023
ndctl / cxl-cli update
- v78 - minor fixups only - will go out this week
- v79
- Firmware update (no outstanding comments)
- Poison injection (awaiting new rev)
- Others?
…further notes not captured.
June 2023
- Opens:
- OpenBMC collaboration
- Labels / Persistent Naming (6.3 issue?)
- Add a CXL-CLI Item to the Agenda
- QEMU Update
- v6.4 Fixes
- v6.5 Merge Queue
- Post v6.5 material
QEMU Update
- QEMU DCD Support?
- MLD Support
- CCI layering work for OpenBMC collab
- I2C ACPI aspeed controller (upstream questionable)
v6.4 Fixes
- DAX Use After Free
- SRAT vs CFMWS Fixup (pending next rev and x86 review)
- Cache Management Discussion
v6.5 Merge Queue
- RCH Error Handling (awaiting v6 posting)
- Follow-up: RDPAS vs Root Port Scanning?
- Background command support (baseline pushed, awaiting consumer)
- Sanitization (pending review)
- Firmware update (awaiting final review)
- CXL perf monitoring (awaiting push to cxl-next)
Post v6.5
- QoS Class support (pre-reqs heading for v6.5)
- CDAT + QTG _DSM integration (pending review)
- Standalone CXL IDE
- Switch CCI
- memory_failure() for CXL events
- Type-2 Region Creation (awaiting review)
- Scan Media
- background dependency
- Dynamic Capacity Device support (awaiting next rev)
- Sparse DAX Region infrastructure
- DCD event plumbing
May 2023
- Opens:
- rasdaemon patches need review
- LSF/MM takeaways
- QEMU Update
- v6.4 pull summary
- v6.5 Queue
LSF/MM takeaways
- CXL 3.0 specification update review well received
- Discussed nodes vs zones and mempolicy vs mmap flags, nodes+mempolicy continues as the path forward
- Fabric manager: several efforts in flight (one in rust one in golang, OCP and OFA efforts as well)
- Live migration: CXL as a transport for migration, opportunity for migrate in place
QEMU Update
- Several patchkits ready and awaiting final merge:
- volatile memory
- poison handling
- events
- DCD support starting to surface
- Initial test results of the pre-RFC implementation look good
- QMP based interface
v6.4 pull summary
- DOE rework (queued)
- Poison retrieval (pending review)
- Forward and reverse address translation (DPA <==> HPA)
- Poison inject and clear (awaiting next rev)
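As background for the forward and reverse address translation item above, the sketch below models the arithmetic for a plain modulo-interleaved region. It is a simplified illustration and not the kernel code. It assumes power-of-two interleave ways, no XOR interleave math, and a DPA expressed as an offset from the start of the device's contribution to the region. The function and parameter names are hypothetical.

```python
def dpa_to_hpa(dpa, position, ways, granularity, region_base):
    """Forward translation: device offset -> host physical address.

    `position` is the device's slot in the interleave set (0 .. ways-1).
    """
    if not 0 <= position < ways:
        raise ValueError("position must be within the interleave set")
    chunk, offset = divmod(dpa, granularity)
    # Each device-side chunk is spread `ways` chunks apart in host address space.
    return region_base + (chunk * ways + position) * granularity + offset


def hpa_to_dpa(hpa, ways, granularity, region_base):
    """Reverse translation: host physical address -> (position, device offset)."""
    if hpa < region_base:
        raise ValueError("hpa is below the region base")
    chunk, offset = divmod(hpa - region_base, granularity)
    device_chunk, position = divmod(chunk, ways)
    return position, device_chunk * granularity + offset


# Hypothetical 2-way region with 256-byte granularity.
REGION_BASE = 0x1000_0000
hpa = dpa_to_hpa(0x1100, position=1, ways=2, granularity=256, region_base=REGION_BASE)
round_trip = hpa_to_dpa(hpa, ways=2, granularity=256, region_base=REGION_BASE)
```

The round trip recovers both the interleave position and the device offset. Events and poison records report device-side addresses, while memory_failure() style handling needs host addresses, so both directions are needed.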
v6.5 queue
- Background command support (pending review)
- QoS Class support (pending review)
- CDAT + QTG _DSM integration (pending review)
- CXL perf monitoring (awaiting perf acks)
- Dynamic Capacity Device support (awaiting next rev)
- Sparse DAX Region infrastructure
- DCD event plumbing
- Firmware update (pending review)
- v2 posted with review feedback incorporated
- man page added to the cxl-cli patchkit
- RAS Capability Tracing on RCH AER events (awaiting next rev)
- Standalone CXL IDE
- PCIE SPDM pre-requisite
- KEYP table enabling
- Switch CCI
- memory_failure() for CXL events
- Type-2 Region Creation (awaiting first rev)
- Scan Media
- background dependency
April 2023
- Opens:
- QEMU Update
- v6.3 Fixes
- v6.4 Queue
- v6.5 Queue
v6.3 Fixes
- Decoder Enumeration Fixes (queued)
v6.4 Queue
- DOE rework (queued)
- Poison retrieval (pending review)
- Forward and reverse address translation (DPA <==> HPA)
- Poison inject and clear (awaiting next rev)
- CXL perf monitoring (awaiting perf acks)
v6.5 Queue
- CDAT + QTG _DSM integration (review pending)
- Dynamic Capacity Device support (awaiting next rev)
- Sparse DAX Region infrastructure
- DCD event plumbing
- Firmware Update (awaiting first rev)
- RAS Capability Tracing on RCH AER events (awaiting next rev)
- Standalone CXL IDE
- PCIE SPDM pre-requisite
- KEYP table enabling
- Switch CCI
- memory_failure() for CXL events
- Type-2 Region Creation (awaiting first rev)
- Scan Media
- background dependency
March 2023
- Opens:
- cxl/hdm: Fix hdm decoder init by adding COMMIT field check
- HDM-D/DB Kernel-internal region creation
- QEMU Update
- v6.4 Queue
v6.4 Queue
- DOE rework
- Poison retrieval
- Forward and reverse address translation (DPA <==> HPA)
- Poison inject and clear
- Scan Media
- background dependency
- Background command support
- Dynamic Capacity Device support
- Sparse DAX Region infrastructure
- DCD event plumbing
- Firmware Update
- CDAT + QTG _DSM integration
- CXL perf monitoring
- RAS Capability Tracing on RCH AER events
- Standalone CXL IDE
- PCIE SPDM pre-requisite
- Switch CCI
- memory_failure() for CXL events
- Maintenance Feature Support (DRAM PPR) (BMC only?)
Notes
- Question about kernel code modularity for accelerator drivers
- Expectation is that it is a bug if CXL core code cannot be reused for devices outside of the class-device definition
- DCD Sharing may be the first user of HDM-DB functionality in the kernel, QEMU model for this in scoping
- Multi-head (not yet MLD) device support in the works for QEMU
- QEMU gaining a fix for clearing the HDM decoder COMMITTED bit when deactivating decoders
- Poison
- Poison inject can be done unconditionally, rely on “injected” indication to delineate real vs simulated hardware problems
- open question: should the driver taint the kernel on inject? No, ACPI EINJ does not
- Poison list: emit trace event on inject event? Maybe already covered by another event record
February 2023
- Opens:
- CXL DVSEC emulation fixes
- QEMU Update
- v6.3 Merge Window
- v6.4 Queue
v6.3 Merge Window
- Move tracepoints to cxl_core
- Export CXL _OSC error control result
- CXL Events to Linux Trace Events (including interrupts)
- HDM decoder emulation
- Default “Soft Reserved” (EFI_MEMORY_SP) handling policy (kernel)
- Volatile Region Discovery
- Volatile Region Provisioning
- Set timestamp
v6.4 Queue
- Poison inject and clear
- Forward and reverse address translation (DPA <==> HPA)
- Poison retrieval
- memory_failure() for CXL events
- Dynamic Capacity Device support
- Sparse DAX Region infrastructure
- DCD event plumbing
- CDAT + QTG _DSM integration
- DOE rework
- Standalone CXL IDE
- PCIE SPDM pre-requisite
- RAS Capability Tracing on RCH AER events
- Maintenance Feature Support (DRAM PPR)
- CXL perf monitoring
- Switch CCI
Notes:
- QEMU:
- Several patch kits in flight: https://gitlab.com/jic23/qemu/-/commits/cxl-2023-02-21/
- AER Discussion:
- What about CXL Reset for recovery?
- May be more relevant for future Type-2 devices than Type-3
- Add another PCI error recovery reset type?
- Map FLR => CXL Reset?
- PCI core supports per-device reset methods
- DCD
- Look at MLD support before Switch CCI support
- CXL perf monitoring
- https://lore.kernel.org/r/20221018121318.22385-1-Jonathan.Cameron@huawei.com
- FW Update
- depends on background command support
- revisit for v6.4
- Scan Media
- revisit for v6.4
January 2023
Agenda 01/24
- Opens:
- DAX-page request API rework
- FM Project? LSF/MM topic
- Type-3 volatile
- QEMU Update
- v6.2 Merge Window
- v6.2-rc Fixes
- v6.3 Status
- v6.3+ Future Work
v6.2 Merge Window
- Cache invalidation for region physical invalidation scenarios
- DOE kernel/user access collision detection
- RCH preparation patches
- RCH Support (including DVSEC Range Register enumeration)
- Security commands (including background commands)
- RAS Capability Tracing on VH AER events
- XOR Interleave Math support
- cxl_pmem_wq removal
- EFI CPER record parsing for CXL error records
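For context on the XOR interleave math item in the list above, the sketch below shows the general idea as understood here: each interleave selector bit is the parity (XOR) of the host address bits picked out by one mask. It covers only power-of-two way counts; 3, 6, and 12 way configurations need additional handling not shown. The mask values and the function name are hypothetical, and this is not the kernel implementation.

```python
def xor_target_index(hpa, xormaps):
    """Compute an interleave target index from parity of masked address bits.

    `xormaps` holds one bitmask per selector bit; selector bit i is the XOR
    of all HPA bits set in xormaps[i]. Power-of-two way counts only.
    """
    index = 0
    for bit, xormap in enumerate(xormaps):
        parity = bin(hpa & xormap).count("1") & 1
        index |= parity << bit
    return index


# Hypothetical masks for a 4-way set: selector bit 0 XORs HPA bits 8 and 13,
# selector bit 1 XORs HPA bits 9 and 14.
EXAMPLE_XORMAPS = [0x2100, 0x4200]
targets = [xor_target_index(hpa, EXAMPLE_XORMAPS)
           for hpa in (0x0100, 0x2100, 0x4300, 0x0200)]
```

Mixing higher address bits into the selector spreads strided access patterns across targets more evenly than a plain modulo on the low bits would.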
v6.2-rc Fixes
Merged in cxl/fixes:
- RAS UE addr mis-assignment
Pending merge:
- Fix nvdimm unregistration
v6.3 Status
Merged in cxl/next:
- Move tracepoints to cxl_core
- Export CXL _OSC error control result
Pending merge:
- CXL Events to Linux Trace Events (including interrupts)
- Poison inject and clear
- Forward and reverse address translation (DPA <==> HPA)
- Poison retrieval
- HDM decoder emulation
Awaiting next (or first) posting:
- RAS Capability Tracing on RCH AER events
- Volatile Region Discovery
- Volatile Region Provisioning
- CDAT + QTG _DSM integration
- Set timestamp
- memory_failure() for CXL events
- DOE rework
v6.3+ Future Work
- Default “Soft Reserved” (EFI_MEMORY_SP) handling policy (cxl-cli + daxctl)
- Dynamic Capacity Device support
- Sparse DAX Region infrastructure
- DCD event plumbing
- Standalone CXL IDE
- PCIE SPDM pre-requisite
- Maintenance Feature Support (DRAM PPR)
- CXL perf monitoring
FM Future
- MLD Mailbox support for DCD event injection
- Switch mailbox CCI
- Multi-head device mailbox tunneling
QEMU
- Start new threads for debug issues not on patches
- Greg’s volatile region setup testing
- Passthrough decoder checks
- SPDM still pending
November 2022
Agenda 11/29
- Opens:
- FSDAX ->notify_failure() regression work still pending
- Others?
- Fixes merged for v6.1-rc4
- v6.2 merge window status
- Post v6.2 Features
v6.1-rc4 Fixes
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tag/?h=v6.1-rc4
Merged:
- Mailbox input payload fix
- Decoder commit crash
- LSA payload handling fix
- CFMWS NUMA Node setup
- Fix switch attached to single-port host-bridge
- BUG in create-region when no more intermediate port decoders available
- Fix region object memory leak
- Fix memdev object memory leak
- cxl_pmem static analysis fix
v6.2 Merge Window Status
Merged:
- Cache invalidation for region physical invalidation scenarios
- DOE kernel/user access collision detection
- RCH preparation patches
In the queue (has review):
- RCH Support (including DVSEC Range Register enumeration)
- Security commands (including background commands)
- CXL Events to Linux Trace Events (including interrupts)
- RAS Capability Tracing on RCH and VH AER events
In the queue (needs review):
- XOR Interleave Math support
- Forward and reverse address translation (DPA <==> HPA)
- Poison retrieval
- cxl_pmem_wq removal
- EFI CPER record parsing for CXL error records
At risk:
- Volatile Region Discovery
- Volatile Region Provisioning
- CDAT + QTG _DSM integration
- Poison inject and clear
- CXL perf monitoring
Post v6.2 Features
- MLD Mailbox support for DCD event injection
- Dynamic Capacity Device support
- Sparse DAX Region infrastructure
- DCD event plumbing
- Switch mailbox CCI
- Multi-head device mailbox tunneling
- Standalone CXL IDE
- PCIE SPDM pre-requisite
- Maintenance Feature Support (DRAM PPR)
- Default “Soft Reserved” (EFI_MEMORY_SP) handling policy (cxl-cli + daxctl)
October 2022
Agenda 10/25
- Opens:
- FSDAX page reference counting rework (merged in mm-unstable)
- FSDAX ->notify_failure() regression work still pending
- Code First ECR: ‘SP’ attribute in SRAT
- QEMU emulation status update
- Others?
- Fixes pending for v6.1-rc
- Features in flight for v6.2
- Rough plans for post v6.2 work
v6.1 Fixes
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=fixes
Queued:
- Mailbox input payload fix
- Decoder commit crash
- LSA payload handling fix
- CFMWS NUMA Node setup
Pending:
- Fix switch attached to single-port host-bridge
- BUG in create-region when no more intermediate port decoders available
v6.2 Features
In rough priority order, feedback welcome:
- RCH Support (including DVSEC Range Register enumeration)
- Cache invalidation for region physical invalidation scenarios
- RAS Capability Tracing on RCH and VH AER events
- CXL Events to Linux Trace Events (including interrupts)
- EFI CPER record parsing for CXL error records
- Forward and reverse address translation (DPA <==> HPA)
- Volatile Region Discovery
- Volatile Region Provisioning
- Security commands (including background commands)
- CXL perf monitoring
- Miscellaneous cleanups and renames
Post v6.2 Features
- Dynamic Capacity Device support
- Sparse DAX Region infrastructure
- DCD event plumbing
- Maintenance Feature Support (DRAM PPR)
- Switch mailbox CCI
- Multi-head device mailbox tunneling
- Default “Soft Reserved” (EFI_MEMORY_SP) handling policy (cxl-cli + daxctl)
August 2022
Agenda 8/30
- Opens:
- FSDAX ->notify_failure() fixes
- FSDAX page reference counting rework
- Linux v6.0-rc1 and ndctl (ndctl, daxctl, cxl-cli) v74 released
- Fix and Feature queue for v6.0-rc, v6.1 and ndctl-v75
- Rough plans for post v6.1 work for CXL 3.0 enabling
Recently released
- Kernel:
- DPA Space Accounting
- PMEM Region Provisioning
- DOE Support in PCI core
- CDAT retrieval (for debug)
- User tooling:
- cxl create-region
- cxl reserve/free-dpa
- cxl list -vvv
Next fixes and features
- ‘arch_flush_memregion()’
- Fix validation of x1 switch topologies
- Volatile region provisioning
- Region labels
- Security commands support
- Trace events for CXL events (including interrupts)
- ‘cxl monitor’ command
- CXL AER handling
- Address translation
Future work
- Performance monitoring
- Maintenance Feature Support (DRAM PPR)
- Dynamic Capacity Device support
- Default “Soft Reserved” (EFI_MEMORY_SP) handling policy
July 2022
Agenda 7/26
- Opens:
- FSDAX page reference counting rework
- What is queued for v6.0 (and ndctl-v74)?
- Late v6.0 updates
- Post v6.0 work
Queued for v6.0
- DOE Support in PCI core
- CDAT retrieval (for debug)
- DPA Space Accounting
- PMEM Region Provisioning
In review for v6.0
- Interleave granularity fixes
- Fix host-bridge x1 interleave constraint
- Fix region granularity > host-bridge granularity handling (scale factors must match)
Post v6.0 material
- Pre-existing region enumeration
- Volatile region provisioning
- XORMAP interleave support
- Trace Events for CXL Events
- List Poison
- Scan Media
- Address translation
- Region persistence in labels
- Region enumeration via labels
June 2022
Agenda: 6/28
- Opens:
- CXL Device Tree Support
- MEM_HWINIT_MODE=0
- QEMU mainline CXL support is live
- What is in review for v5.20 (and ndctl-v74)
- What else might make v5.20?
- What is post v5.20 material?
v5.20 in review
v5.20 on deck
- Pre-existing region enumeration
- Region persistence in labels
- Region enumeration via labels
- Address translation foundation
Post v5.20 material
- List Poison
- Scan Media
- XORMAP interleave support
- Trace Events for CXL Events
- Address translation (in cxl-cli) for all kernel supported Events, List Poison, and Scan Media
May 2022
Agenda: 5/31
- What is in v5.19?
- What is on deck for v5.20?
- What is post v5.20 material?
- Opens
v5.19 / ndctl-v73
- Kernel
- lockdep annotations
- CXL _OSC (native CXL hotplug + error “handling”)
- Disable suspend
- Mem_enable fixes
v5.20 / ndctl-v74
- Kernel
- Region Provisioning
- DOE Core
- CXL CDAT Retrieval
- Event record handling core
- Scan Media records
- Event Interrupts
- Background command timesharing
- Userspace
- ‘cxl create-region’
- Region listing support
- Scan media / Event records to json
- Address translation
Post v5.20 / v6.0
- Kernel
- SPDM Attestation
- IDE
- Security commands
- Userspace
- Attestation helper process
- CXL Device-DAX Policy