CXL Linux Sync: Ground Rules
- Do not share confidential information
- Do not share confidential product details
- Do not disclose CXL consortium confidential information
- Do discuss any Linux questions about released CXL specifications:
- Do use Discord as a supplement for this sync meeting for quick questions
- Do follow-up on linux-cxl@vger.kernel.org for longer questions / debug
- https://pmem.io/ndctl/collab/
May 2025
- Opens
- cxl-cli
- QEMU
- v6.15 rc fixes
- v6.16 merge window
- v6.17 and beyond
Opens
RobertR: ECN update wrt addr trans series? Can’t talk confidential
side of proposal. Linux side - file code first ECNs (like ACPI) what
we want FW/BIOS to provide. Doing similar for mem hole problem.
Code first means Linux writes rules Linux needs added to CXL spec.
ACPI ECN examples, ACPI0017, extended linear cache.
By starting discussion in open, on Linux mailing list, not encumbered
for consortium confidentiality.
RobertR: patches address AMD specific addr trans, do we have other
users? Should we be pushing a generic solution now? Ans: stay
specific now.
FanN: Issue (device probe) using DCD patch set. Has worked around it.
FanN to post on cxl mailing list.
DanW: cxl reset - does use case include issuing a reset from userspace?
Collecting patches for a v82 release at EOQ 2, align w kernel 6.15.
- ndctl: Add support and test for CXL Features support (DaveJ)
- ndctl: Introduce sanitize-memdev functionality (DavidLohr)
- David pinging user who asked about in earlier this year
- ndctl: Add inject-error command (Ben)
- ? Pending an update from Ben considering Junhyeok prior set ?
- ndctl: Dynamic Capacity additions for cxl-cli (Ira)
- Deferred but not forgotten
QEMU
Jonathan’s Discord Update (He’s enjoying fine food in Lisbon)
- Most of left over stuff that was queued for 10.0 is now queued by MST.
One patch dropped as compile issue.
- Tcg bug introduced in some tlb cleanup work. Affecting code running from
cxl mem and some other cases.
- Arm support v13 posted.
- Dcd fmapi updated series on list (Jonathan hasn’t looked at yet).
v6.15 RC fixes
- RC4 PR done
- No more fixes PR unless extremely urgent.
v6.16 merge window - queued
- Remove always true condition for cxlctl_validate_hw_command()
- Verify CHBS length for CXL2.0
- Ignore interleave granularity when ways=1
- Address missing MODULE_DESCRIPTION warnings for cxl_test
- Cleanups and refactors part 1 for Zen5 translation support
- Cleanup debug printk for cxl_dpa_alloc()
v6.16 merge window - considering
- type2 support (Alejandro)
- v15 posted, v16 coming w rebase on rc4
- Boot to Bash documentation (Gregory)
- v3 posted. Review tags please.
Plan is to merge as is and expect incremental fixups can follow
- CXL Maturity Map update (Alison)
- v2 posted. Review tags please.
- RAS features drivers (Shiju)
- v4 posted, ready for merge?
v6.17
- Using full data transfer only when offset is 0 (Ming)
- Waiting on Jonathan to hear back from consortium on spec language interpretation
- Native port protocol error handling and logging (Terry)
- Pending v9
will need to get new Bjorn tags.
- Soft Reserve handling (Terry–>Smitha)
- Introduce DEFINE_ACQUIRE() (Dan)
- Going through discussions
- Delayed port enumeration (Dave)
- v2 posted, going through reviews
- Can Robert check and see if that resolves his dport num issue reported
- Initialize eiw and eig (Purva)
- Low Mem Hole (Fabio)
- Waiting on ECN to post next rev
- Zen5 translate part 2 (Robert)
- CXL reset support for devices. (Srirangan)
- Allow 6 & 12 way regions on 3-way HB interleave (Alison)
- (RFC) Translate DPA->HPA in unaligned MOD3 regions (Alison)
- Needs review and will need an ECN or the like also.
v6.18 and beyond
- DCD (Ira)
- v9 posted, still waiting for a use case
- vfio-cxl type 2 (Zhi)
- Hotness Driver (Jonathan)
- non-x86 cache flushing (“wbinv”) (Jonathan)
April 2025
- Opens
- cxl-cli
- QEMU
- v6.15 rc fixes
- v6.16 merge window
- v6.17 and beyond
Opens
- DCD (Ira)
- Fan tried a qemu test – failing
- Is using the latest stuff
- Ira does not have a lot of time
- Jonathan will be taking this forward as a fork
- please review it!
- Is DAX ok?
- why is this different than other features which have landed well ahead of hardware?
- Low Memory Hole enumeration (Fabio)
- Robert wanted some changes (different direction)
- more isolation within the implementation for special features
- Address translation rework has a lot of conflicts
- proposal to have a check if the LMH applies then use the SPA range
- Dan is missing the conflicts – refactoring is ok
- LMH is a small change - different from a whole new addressing space
- what happens when a 3rd, 4th… etc show up?
- don’t be surprised by these things(?)
- why does this quirk need to be delayed by larger changes?
- some code conflicts
- but does LMH break the new code?
- Robert - extended linear caching is harder to abstract and LMH makes that harder
- wants some code isolation
- flat2lm messes this up too - LMH is yet another thing
- is the refactoring for flat2lm done?
- the refactoring should make LMH fit easier
- makes SPA != HPA => use for LMH
- part 2 of Roberts series would do this.
- this has been posted. “Address translation part 1 and 2”
- part 1 does not conflict as much
- Could be helpful to get this landed to clear the backlog
- part 2 mostly needs to be resolved -> hard
- Linux has suffered from platforms taking liberties (inveted on the fly)
- there has to be a conversation somewhere on these special configs
- can we get some rules around these things
- examples
- no CFMWS for type2
- SPA vs HPA
- John G. – FAMfs RFC v6.14 out soon. 6.15 rework comming
- Gregory working on boot to bash stuff to put in documentation
- need opinions on this
- in a personal-public github
- v81 was released end of Q1.
- Collecting features for a v82 at end of Q2, aligned w 6.15.
- ndctl: Add support and test for CXL Features support (DaveJ)
- Needs review. Driver support is in.
- ndctl: Introduce sanitize-memdev functionality (DavidLohr)
- Needs review. Driver support is in.
- ndctl: Add inject-error command (Ben)
- Pending an update from Ben
- ndctl: Dynamic Capacity additions for cxl-cli (Ira)
- Awaiting driver decision
- might need some clean up on the base commit
- but it is out there
QEMU
- 10.0 out today or next week
- fairly minor features will land after that
- arm support waiting for review
- FM-API review
- FM in qemu A controlling devices in qemu B
- RFC - test FM commands through MCTP
- uses QMP to notify qemu B
- MCTP messages is in shared buffer
- Need feedback from upstream -> may need a socket vs shared buffer
- ‘whatever works’ …
- FM in host could work -> nice to have kernel stack formulate MCTP
- what blocks MCTP
- open BMC is blocked by lack of tests
- need to know what happens with malformed packets
- long way around is to use a PCI (with a distro) -> i2c emulated device
- could just use ARM for MCTP with open BMC
v6.15 rc fixes
- Pending cxl/fixes
-
GPF DVSEC fixes (Ming)
- Waiting on more review tags
- CXL Features: Address set_feature and offset flag (Ming)
- CXL Features: Set out_len in set_feature failure case (Ming)
- Skip Mem_En check for RCD and RCH ports (Smita)
v6.16 merge window
v6.17 and beyond
- DCD support (Ira)
- v9 posted
- Dan’s uncomfort
- Dan did a public demo almost 2 years ago -> all dissapeared
- device dax does not have review scalability
- would like to have an end user stand up
- what about the FM development
- this has stopped being a priority
- John Groves - was shown a demo - not public… yet
- sharable memory
- Has a person who has been testing this
- emulation is probably at a point that John could test this
- AR : John look at the interface and contribute what they need with tagging to make this work for them. With a real use case.
- Johnathan prefered some of the older interfaces but these have all been ok
- Also need a virtualization story
- virtio plan is gone
- keep this alive on top of type2
- Johnathan will continue to have a staging branch - as stated above
- Gregory
- is dax the right interface for virtualization
- among us we have users
- Yannis - Is this chicken and egg?
- Is the interface what we want to support?
- is it sufficient?
- Dan we know this is not a slam dunk interface
- does it matter if dax may go away?
- dax was advocated at LSFmm - so not going away
- vfio-cxl type 2 (Zhi)
- Hotness Driver (Jonathan)
- focus is on emulation (qemu) first
- non-x86 cache flushing (“wbinv”)
- need review
- arm focused but should work anywhere
- used device classes (show in sysfs)
- but no user interface now
- could be used for specific flushes
March 2025
- Opens
- cxl-cli
- QEMU
- v6.14 rc fixes
- v6.15 merge window
- v6.16 and beyond
Opens
- Mixed granularity in x3 regions (Alison)
- posted on the list 6&12 way regions
- fix is limited
- this is fine but could this be simpler by relaxing ordering constraints?
- perhaps this makes no difference at all?
- way back - mismatch - interleave was “backwards”
- if there are other use cases of x3 - please review her patch
- LSFmm – Intel attendees?
- DCD?
- Dan should be there
- CXL specific session?
- 1 hour might be better than 1/2
- device specific stuff maybe in the hallway track?
- Gregory to focus on external to the driver
- General CXL discussions 80% on external with maybe 20% internal driver stuff
- chime in on the ML on what you would like to see
- v81 is queued up with misc fixups.
- New features on the list:
Davidlohr
- ndctl: Introduce sanitize-memdev functionality (Davidlohr)
Dave J.
- fwctl changes?
Pending an update from David
- ndctl: Dynamic Capacity additions for cxl-cli (Ira)
Simmering waiting for entry into cxl/next
- ndctl: Add support and test for CXL Features support (DaveJ)
Simmering waiting for all the pieces to come together for test and review
- ndctl: Add inject-error command (Ben)
Pending an update from Ben
QEMU
Missed Michael’s merge window
- queued up for next cycle
- DCD FM api stuff
- interacts with Gregory FM stuff
- idealy would be connected eventually
- Who is using DCD/qemu
- Jonathan has always tested DCD with qemu
- Terry, Adam, and John Groves all using qemu DCD
- Adam – DCD would be nice for compression
- Yiannis – some concerns due to lack of use case for DCD
- John G. allocations from 0 within same tag - similar to storage but not exactly the same
- ARM support review needed
- got blocked way back for device tree support - relaxed
- hotpage support
- please chime in if you are interested
v6.14 rc fixes
none
6.15 in cxl/next
Already in from last meeting:
- Add support for Global Persistent Flush (GPF)
- Cleanup of DPA partition metadata handling
- Removed unused CXL partition values
- Refactor user ioctl command path from mds to cxl_mailbox
- Add logging support for CXL CPER endpoint and port protocol errors
- Remove redundant gp_port init
Newly added:
- Cleanup of gotos using guard() series
- Validation of CXL device serial number
- CXL ABI documentation update/fixups
- CXL Features support (First part of CXL FWCTL support)
- FWCTL specific CXL bits will be pushed by Jason G.
- Additional support for dirty shutdowns
- Extended Linear Cache enumeration and RAS support
- Last 2 patches from Smita for firmware first error logging
- cxl_test to support 3-way capable CFMWS
- Documentation fix to remove “mixed mode”
6.15 merge window considerations
- 6.15 merge window closed for large series.
- May still take small changes or fixes that are urgent.
v6.16 and beyond
- cxl: Add address translation support and enable AMD Zen5 platforms (Robert)
- v4 of part1 in review
- v3 of part2 pending?
- Update soft reserved resource handling (Nathan -> Terry)
- v3 pending
- next version this week
- CXL PCIe port protocol error handling and logging (Terry)
- Support CXL memory RAS features - EDAC (Shiju)
- pending v2? (AKA v24 … ;-)
- core support landed
- Support background operation abort requests (Davidlohr)
- Enable Region creation on x86 with Low Mem Hole (Fabio)
- v3 posted, under review. Would like this queued for cxl/next after merge
window.
- Type2 device support (Alejandro)
- v11 posted. Under review. Would like this queued for cxl/next next merge
window.
- Rest of DCD series (Ira)
- Pending type2 acceptance
- look for another version after the merge window
- will be watered down from previous version
- need use cases beyond the spec
- The cost of not merging this is that nothing is being built above it
- Several CPU vendors talking about hot add/remove
- is DCD a way to get around this?
- flush cost limits things
- a motivation is to decouple decoder programming from on/off lining memory
- entire provisioning mechanism is DCD with CXL 3 - endpoint
- use case order is important as well
- Linus/mm folks won’t care – But Dan wants the folks shipping products to stand up.
- Allow 6 & 12 way regions on 3-way HB interleaves (Alison)
- Pending v3?
- already discussed above (in opens)
- Translate DPA->HPA in unaligned MOD3 regions (Alison)
- cxl: factor out cxl_await_range_active() and cxl_media_ready() (Zhi)
- Add cxl reset support (Srirangan)
- Pending review
- PCIe folks looking at this…
- Cleanup add_port_attach_ep() “cleanup” confusion (Dan)
RFC
- vfio-cxl type 2 (Zhi)
- Pending next rev
- Zhi could not make the call
- Hotness Driver (Jonathan)
- lots to do here
- Combining all hotness features CXL and beyond - Discuss at LSFmm
- DAMON… NO… ?
- save it for LSFmm
- boot to bash
- Gregory’s documentation journey
- kernel docs?
- AI will pick it up from there.
- CEDT recipes?
- good feedback on this
- LSFmm session will focus on how Linux expects things to be configured
- memory blocks vs region alignment -> lose memory
- Theory vs how Linux really works
- All of the complication comes from BIOS and OS interactions after BIOS sets things up
- Need OS first!!!
- ACPI tables need to be correct.
-
backwards compatibility...
- please tear the docs appart if you think it is wrong
- FAMfs is running under FUSE!
- patch to libfuse
- may have a branch before LSF
February 2025
- Opens
- cxl-cli
- QEMU
- v6.14 rc fixes
- v6.15 merge window
- v6.16 and beyond
Opens
- v81 is open with misc fixups
- Need review on build and coverity fixups on list
- New features in review:
- ndctl: Introduce sanitize-memdev functionality (Davidlohr)
https://lore.kernel.org/linux-cxl/20240928211643.140264-1-dave@stgolabs.net/
- ndctl: Dynamic Capacity additions for cxl-cli (Ira)
https://lore.kernel.org/nvdimm/20241214-dcd-region2-v4-0-36550a97f8e2@intel.com
- ndctl: Add support and test for CXL Features support (DaveJ)
https://lore.kernel.org/linux-cxl/20250207234718.2387622-1-dave.jiang@intel.com/
- ndctl: Add inject-error command (Ben)
https://lore.kernel.org/nvdimm/20250108215749.181852-1-Benjamin.Cheatham@amd.com/
QEMU
- inside merge window
- Fujitsu clean up
- ARM virt support reposted
- 2 samsung series’
- Hotlist monitoring
v6.14 rc fixes
none
6.15 in cxl/next
- Add support for Global Persistent Flush (GPF)
- Cleanup of DPA partition metadata handling
- Removed unused CXL partition values
- Refactor user ioctl command path from mds to cxl_mailbox
- Add logging support for CXL CPER endpoint and port protocol errors
- Remove redundant gp_port init
6.15 merge window
- Rest of DCD series (Ira)
- pending v9
- much discussion on actual use cases
- AI: Ira to schedule another call between Dan, John, Jonathan and Ira…
- Support background operation abort requests (Davidlohr)
- Pending v2
- will rebase and send soon
- CXL PCIe port protocol error handling and logging (Terry)
- v7 posted, review on going
- Some devices/drivers have been happily CXL-unaware (prtdrv)
- should PCI subsystem throw errors to ‘cxl land’?
- CXL system must be loaded for processing these errors
- Alternate: make PCI system more cxl aware
- new file is ok
- 2 fifos CPER/OS first
- fifo overflows if cxl is not loaded (user ‘asked for it’)
- mapping between PCI/CXL device
- AER -> fifo -> wq -> pciaer -> aer src info???
- AER -> fifo -> cxl core?
- AER statistics (CXL counters?)
- Jonathan to post reference
- Type2 device support (Alejandro)
- v10 posted, review on going
- memdev state vs device state
- need to have Alejandro to discuss further
- Trace FW-First CXL Protocol Errors (Smita)
- 1-4 in cxl/next. 5&6 needs more work
- Q: looks like this needs to be in cxl core?
- yes it is not an endpoint but a port object
- cxl: Add address translation support and enable AMD Zen5 platforms (Robert)
- Part 1&2 v2 posted
- Review on going
- Reference Low memory hole: was generic to ‘some platform may do this’
- specific AMD file? can this be more generic?
- has anyone else done this? … no
- specification help?
- Update soft reserved resource handling (Nathan, Alison)
- v2 is posted. Review on going?
- Hildenbrand(sp?) had comments
- Introduce generic EDAC RAS control feature driver (Shiju)
- v19 posted. Review ongoing?
- v19 discussion : Boris is unconvinced about the API
- reverse engineer tracepoints?
- marshal into ioctl
- marshal into tracepoint (format different)
- use sysfs
- lockdown kernels disable debugfs
- Must be in EDAC - Boris
- v20 posted
- Dan to look at CXL bits
- online repair - must see an error ‘this boot’
- call for memory device manf. to look at the API and weigh in
- can parameters just fine
- FWCTL CXL (Dave)
- Add exclusive caching enumeration and RAS support (Dave)
- v3 posted, minor changes requested from Ming. Need Dan’s review
- Enable Region creation on x86 with Low Mem Hole (Fabio)
- v2 posted, review ongoing.
- some assertion this should wait for Roberts stuff?
- Can we apply this?
- this could affect Roberts patches with a small conflict - Gregory
- cxl: factor out cxl_await_range_active() and cxl_media_ready() (Zhi)
- vfio-cxl type 2 (Zhi)
- will rebase on type 2 v10+
- Use guard() instead of rwsem locking cleanup series (Ming)
- v2 review on going
- should be simple; get reveiwng folks!
- cxl/pmem: debug invalid serial number data (Yuquan)
- Add cxl reset support (Srirangan)
- Dirty shutdown followups (Davidlohr)
- Cleanup add_port_attach_ep() “cleanup” confusion (Dan)
- pending v3?
- forgotten… will be remembered…
6.16 and beyond
- Hotness driver (Jonathan)
- might need a sub-call on this; Jonathan to schedule
- vfio-cxl? (Zhi)
January 2025
- Opens
- cxl-cli
- QEMU
- v6.14 merge window
- v6.14 rc fixes
- v6.15 merge window
- v6.16 and beyond
Opens
- 1.5 hour meeting?
- Feature velocity is slowing
- Upstream support needs to land well ahead of distro acceptance
- Some vendors may not release hardware without ecosystem support
- lack of hardware does not mean a feature is not important
- DCD is important but type 2 seems to be more important because devices are out and real
- The core had an issue which probably should have been cleaned up a while ago
- both DCD and type 2 were trying to not ‘disrupt’ the status quo
- Generally clean up first is good
- In this case there are other issues with both sets so it might be fine… this time
- Cross-subsystem ties?
- Jonathan would like to see at least one of these features queued for cxl-next
- More reviews!
- CXL will now be ‘leaking’ out into the rest of the kernel.
- Use cases may need to make CXL core changes… without redoing the entire thing.
- type 2 is higher prioity than error handling. may folks are doing FW first.
- there are some FW first patches on the tail end of the port error handling series.
- Maybe those should come first or as a separate patch set?
- AFAWK they don’t conflict with type 2
- DCD can go in with device dax. memfd is a future question. should device dax move toward memfd?
- famfs needs device dax. without memfd changes it can’t replace device dax.
- memfd is currently geared toward anonymous memory and device dax provides better super block support
- also tagging support in memfd would also be a bigger change
- memfd support may be growing some persistence so there may be conflicts in the support and a decision may need to be made.
- public? guest memfd call is public
- Gowans is working on this
- CC can’t consume device dax. which also pushes toward memfd
- How would CC handle shared memory? – can’t be anonymous
- shared confidential is down the road – we are not ready for it
- ratelimit AER
- https://lore.kernel.org/linux-pci/cover.1736341506.git.karolina.stolarek@oracle.com/
- https://lore.kernel.org/linux-pci/20250115074301.3514927-1-pandoh@google.com/
- Pradeep to check with Terry
- CXL reset
- patch set should apply to all devices
- Don’t the SBR patches already do this?
- This would destroy the regions
- We think so
- If this is adding a new reset to that then we should be ok
- CXL reset is different
- It does work through sysfs
- expected use case would be to go through type 2 driver and it could ensure memory flushes
- Does this need a new version?
- v81 is open with misc fixups and unit test updates at the moment
- New features need review:
- DCD needs review (Ira)
https://lore.kernel.org/nvdimm/20241115-dcd-region2-v3-0-585d480ccdab@intel.com/
- Sanitize memdev needs review (Davidlohr)
https://lore.kernel.org/linux-cxl/20240928211643.140264-1-dave@stgolabs.net/
- Inject error (Ben)
https://lore.kernel.org/nvdimm/20250108215749.181852-1-Benjamin.Cheatham@amd.com/
QEMU
- one fix around MSI
- clean up some error paths
- new staging tree should be out in a day or 2
- hot miss – HMU
- roughly speaking one can run a real work load and get real data
- 10% speed (4min to boot)
- infinate counters (no one will build this) – could add knob to change counters
- framework is in place
- if other HW vendors like to upstream ‘real’ behavior
- would be nice to have a range of implementations
In v6.14 merge window
- ACPI/HMAT: Move HMAT messages to pr_debug()
- cxl/pci: Add CXL Type 1/2 support to cxl_dvsec_rr_decode()
- CXL events updates for spec r3.1 (series)
cxl-next may still consider the following. Would like to close cxl-next by Wed/Thurs.
- cxl/pci: Support Global Persistent Flush (GPF)
- DPA partition meta data cleanup
v6.14 rc fixes
v6.15 merge window
- Rest of DCD series (Ira)
- Pending v8
- pending on DPA partition cleanups from Dan
- week?
- Support background operation abort requests (Davidlohr)
- Pending v2
- user space would abort previous operation
- v2 on the way
- CXL PCIe port protocol error handling and logging (Terry)
- v5 posted, review on going?
- yea
- Type2 device support (Alejandro)
- v9 posted, need review
- Also pending on DPA partition cleanups from Dan
- is there a way to check if an allocation came from devm?
- xfc is a network driver
- put cxl side of the driver in drivers/cxl
- Alejandro believes he has a solution
- VFIO also has this problem
- export a new function so the VFIO can do the release
- convert some core calls to non-devm
- patch set has been tested with 2 different drivers and AMD
- should be very stable
- cxl-test has not been run though
- but also can’t break the cxl-test build
- should also have a basic cxl-smoke test-test
- would love that every feature has a new cxl-test but not always required
- Trace FW-First CXL Protocol Errors (Smita)
- v5 posted. Is linux-efi picking up the series?
- cxl: Add address translation support and enable AMD Zen5 platforms
- Add device reporting poison handler (Shiyang)
- Will there be v5? No movement since last September
- rasdaemon folks are engaged
- rasdaemon will find the tracepoints
- what about corrected errors. would need to soft offline
- who does that? don’t know yet.
- Update soft reserved resource handling (Nathan, Alison)
- v2 is posted. Review on going?
- Introduce generic EDAC RAS control feature driver (Shiju)
- v18 posted. Review ongoing?
- v19 comming
- complexity around the interface caught on merge
- specifically memory sparing – because DPA is not stable
- add PoC in user space to show the usage of the API
- need vendors to step up to show this is not a single vendor solution
- does this just belong more on the CXL side vs EDAC
- but Boris wanted it in EDAC – for unified interfaces
- maybe write a whitepaper to help explain – it is in the documentation
- when is it safe to use the interfaces? ie after a boot?
- safty rules vary a bit
- error record must corespond to the current boot
- can a device really do this… ‘atomic swap’
- soft-hibernate idea
- just document it – device is to take care of it
- feature query
- xarray will carry on forever … should not see that many errors
- PPR will have separate support
- FWCTL CXL (Dave)
- Almost done with cxl cli support and cxl unit test for using ioctls.
v1 will be posted once that is done.
- Add exclusive caching enumeration and RAS support (Dave)
- v3 posted, minor changes requested from Ming. Need Dan’s review
- all tags from Jonathan
- Enable Region creation on x86 with Low Mem Hole (Fabio)
- v2 posted, review ongoing.
- Need to sync with Robert’s address translation series?
- not sure how to do what Robert suggests
- cxl: factor out cxl_await_range_active() and cxl_media_ready() (Zhi)
- v2 posted, Need more review tags
- Cleanup add_port_attach_ep() “cleanup” confusion (Dan)
6.16 and beyond
- Hotness driver (Jonathan)
- vfio-cxl?
Admin Issues
- Anyone want to host the next meeting?
December 2024
- Opens
- cxl-cli
- QEMU
- v6.13 rc fixes
- v6.14 merge window
Opens
## cxl-cli / user tools
* v81 open: misc fixups and unit test updates
* DCD cxl cli series needs review (Ira)
- Ira update?
- alter test to pass with known lockdep issue
* Sanitize memdev functionalities (DavidLorh)
- Need review
- sanitize and secure will be mutually exclusive
## QEMU
* in stabalization phase, nothing now
* some fixes for generic ports
* future
- low hanging fruit - minor features
- 3,6,12 interleave
- CHMU feature
- no real hardware emulation due to IP issues
- probably better than real hardware
## 6.13 rc fixes
* cxl/pci: Check dport->regs.rcd_pcie_cap availability before accessing
- Ready to queue
* cxl/region: Fix region creation for greater than x2 switches
- Ready to queue
* Fix potential bogus return value upon successful probing
- Davidlohr
- https://lore.kernel.org/all/20241115170032.108445-1-dave@stgolabs.net/
## 6.14 merge window
* Rest of DCD series (Ira)
- Cleanup parts landed in 6.13.
- v7 posted. Pending review from Dan
- lockdep issue dev_uevent - uevent_show()
* Support background operation abort requests (Davidlohr)
- v1 review needed
* CXL PCIe port protocol error handling and logging (Terry)
- v3 posted, review on going?
- v4 1.1 test issues - defer 1.1 changes? (yes)
- PCIe core are all 2.0 changes
- no regressions on 1.1
- 1.1 could be made to use the same flow as VH (but more time is needed)
- would be easier for Bjorn to review as well
- Jonathan nervous about differences between PCIe/CXL iterators
* Type2 device support (Alejandro)
- v7 posted, need review
- please concentrate on patches without tags
- prelim patches for Dave to land first
- Reviewed by Jonathan
- https://patchwork.kernel.org/project/cxl/patch/20241203162112.5088-1-alucerop@amd.com/
* Trace FW-First CXL Protocol Errors (Smita)
- v3 posted, review on going?
* Add device reporting poison handler (Shiyang)
- Waiting on v5?
* Update soft reserved resource handling (Nathan, Alison)
- v1 is posted (w RFC feedback) and needs review, esp DAX notifier
- share notifier between dcd code and soft reserved
- clean up later if needed
- Do we need to add checking or handling for DCD?
* Introduce generic EDAC RAS control feature driver (Shiju)
- v17 posted. Review ongoing with Boris
- Needs to sync "feature" calls with CXL fwctl
- please don't block EDAC on fwctl (no)
- But will sync code paths
* FWCTL CXL (Dave)
- Create a features driver/device to handle all feature support. Driver/device needed to prevent FWCTL from being loaded by CXL core. Need to coordinate with Shiju.
- Drop RFC and pending v1 to be posted upstream
* Add exclusive caching enumeration and RAS support (Dave)
- v1 posted, need review.
* Enable Region creation on x86 with Low Mem Hole (Fabio)
- v1 posted, need review.
- This is platform specific x86'ism
- there was at least one other memory hole example
- do we have that concrete example?
- will need to get approval for release
* Update event records to CXL spec r3.1 (Shiju)
- v4 posted, need review
- tracepoints are now too long for libtracept (> PAGESIZE)
- Working with Steven to determine correct fix
- uuid support to be added too
* Rename ACPI_CEDT_CFMWS_RESTRICT_TYPE2/TYPE3 (Ying)
- Looks like needs a conclusion with the discussion
* Cleanup add_port_attach_ep() "cleanup" confusion (Dan)
- v2 posted, need review?
## 6.15 and beyond
* Hotness driver
- basic enablement - tracing driver
- send to userspace for user to decode
- in kernel use?
- convert PA to user app VA?
- DAEMON? - Jonathan thinks "no"
* vfio-cxl?
- Any new updates?
## Admin issues
* Should we make this 1.5 hour, 2?
* Share the load of running meeting?
# November 2024
* Opens
* cxl-cli
* QEMU
* v6.13 fixes
* v6.14 merge window
## Opens
* whats next? How to support DCD in VMs? Continue discussion from LPC.
## cxl-cli / user tools
* v81 open: misc fixups and unit test updates, expect release w. kernel 6.13
* DCD cxl cli series
- need review
## QEMU
* merge window finished next week
* topology discovery
* lane details (speed/width)
* payload checks missing which could attach the host
* fixed.
# Next stuff
* 3/6/12 interleave upstream
## 6.13 fixes
* cxl/region: Fix region creation for greater than x2 switches (Huaisheng)
- waiting for v2
## 6.13 merge window (in next)
* Constify range_contains() input params
* Add CXL 1.1 device link status support for lspci
* Downgrade warning message to debug in cxl_probe_component_regs()
* Add printf specifier '$pra' for 'struct range'
* Add cleanup/prep code (first 6 patches) from DCD series
* Cleanup add_port_attach_ep() "cleanup" confusion (Dan)
- v3 soon
## 6.14 and beyond
* Rest of DCD series (Ira)
- Pending review from Dan
* Support background operation abort requests (Davidlohr)
- Review needed
* CXL PCIe port protocol error handling and logging (Terry)
- Review on going?
- v2 posted -> v3 comming
- later patches need more looks
- pci does not look at upstream ports but Terry's patch set did
- Fix in pci and would make cxl easier? (AER info)
- Would be nice for the PCI folks to explain the current PCI behavior (ping Lukas?)
- CXL protocol errors are reported through AER
- AER driver determines CXL device and passes to CXL for further decode
- uncorrectable error (fatal or non-fatal) -> panic
- report info first (pci link still up so should still be able to use link)
- We don't know who is using it (acclerators [through specific driver] might be able to recover)
- Don't always panic but... the details are use case specific
* Type2 device support (Alejandro)
- Waiting for v5?
- v5 soon
- later patches need more looks
- minor fixes and more testing
* Trace FW-First CXL Protocol Errors (Smita)
- Waiting for v3? -> yes
- does call into the AER handler
- patch sets can't land independent for fatal error case
- Terry's lands first for Smita to use
* Defer probe when memdev fails to find correct port (Gregory)
- Waiting for v2?
- Obsoleted with Dan's patch set
- Dropping
* Add device reporting poison handler (Shiyang)
- Discussion still on going?
- Need to route to ras daemon?
- ras daemon deals with so many vendor specifics it is fine to let ras daemon deal with it.
* Update soft reserved resource handling (Nathan, Alison)
- Alison testing has not been reproduced by Nathan
- Do we need to add checking or handling for DCD?
- Notifications of the dax driver still being looked into
- need to notify of soft reserved after region created?
* Introduce generic EDAC RAS control feature driver (Shiju)
- Review needed for CXL bits
- Boris review for sysfs (has not said he likes it yet)
- memory repair stuff CXL spec a bit of a mess
- one device per feature kind of clean
## RFCs?
* FWCTL CXL
- v2 coming. Jason requested posting sooner than later.
* Extended Linear Cache support
- v2 coming. Looking for ideas on properly detecting MMIO hole in kernel
* vfio-cxl?
- new summary out
- planning to send a new patch set based on Alejandros v5
- white paper?
- Leverage device ACL registers if possible
- fall back to vendor driver
## open discussion
* Jonathan ACPI discussion
* Hotplug CXL fixed memory window review -> going through mm tree
- Mike Rapaport review should be good
- not critical yet
* DCD set up
- expose DCD in a VM
- long ago discussion from Jonathan
- Find device by Tag (VM and bare metal should work the same)
- John provide an on-list conversation
# October 2024
* Opens
- Thanks for the LPC-MC organizers!!!
-
* cxl-cli
* QEMU
* v6.12 fixes
* v6.13 queue
* v6.13+
* Vfio-cxl
## cxl-cli / user tools
* v80 released and working a build issue w CentOS Stream10
* v81 open: expecting DCD, pmem_ns, and misc unit test patches
## QEMU
* 3 sets out for next
- generic ports
- speed control
- fixes
* 3/6/12 interleave outstanding
* MCTP payload sizes might get sliped in
* NVME changes
- blocker was tests
## 6.12 fixes
* v6.12 PR completed
* Fix CXL device SBDF calculation (in RC3)
- https://lore.kernel.org/all/6701760b2e390_16041829420@iweiny-mobl.notmuch/
* Fix Trace DRAM Event Record (
- https://lore.kernel.org/linux-cxl/05305df495904b9f99fcb52f67a66762@huawei.com/T/#t
* Fix CXL port initialization order when the subsystem is built-in
- https://lore.kernel.org/linux-cxl/0c945d60-de62-06a5-c636-1cec3b5f516c@amd.com/T/#t
- [Johnathan: Comments in the makefile]
- spelling fix -- will spin for now
- Ack from GregKH would be nice
* Fix KASAN error in cxl-test
-
* Other trees
* Poll DOE Busy bit for up to 1 second in pci_doe_send_req
- https://lore.kernel.org/linux-cxl/20241013155834.GA607803@bhelgaas/T/#mfe07920c273fe28eb9acce1b2bb509a69860e6de
* Support missing events in 3.1 spec - 6.13
- https://lore.kernel.org/linux-cxl/20241014141551.GA17702@willie-the-truck/T/#t
## 6.13 queue
* EDAC scrub
- https://lore.kernel.org/linux-cxl/20241009124120.1124-1-shiju.jose@huawei.com/T/#m6f07b22ae7a00dd7ecf0095cff2683b88aa21ea7
* Enable CXL PCIe port protocol error handling and logging
- https://lore.kernel.org/linux-cxl/d9d87f72-6273-4adc-934c-e25ee79bb8c7@amd.com/T/#mc0de8ae4187b55227201bae525596d49cad196ba
* Smita - CPER changes
* DCD: Add support for Dynamic Capacity Devices (DCD)
- v4 is out, v5 soon
- https://lore.kernel.org/linux-cxl/6707f33c89730_4042929481@iweiny-mobl.notmuch/T/#m6267352b7ea3215257311a05cb058aaf7e12b522
## 6.13 order? -- What should we focus on?
1 DCD - dax review - if this is fine.
- remove printk stuff
2 Port protocol error handling...
- Need community agreement
3 Smita's patches
4 RFC for soft reserved - dax again...
4 (parallel) EDAC scrub not a lot more cxl review needed
## 6.13+ queue
* Type 2 patch sets
- (Alejandro) https://lore.kernel.org/linux-cxl/20240907081836.5801-1-alejandro.lucero-palau@amd.com/T/#mf3427aeb28f1eecc5df167cfd15af22aaec4a1d5
- (Ying) https://lore.kernel.org/linux-cxl/20240925024647.46735-1-ying.huang@intel.com/T/#mb2484ab566353dbf58bd905d7affb1549bb15fad
* VFIO-type 2
- https://lore.kernel.org/linux-cxl/4230fba5-030c-49ef-799e-f4138b1c9f7d@amd.com/T/#mcac344899aacd07f98c4132d25d2ad29a6da9b09
- https://lore.kernel.org/linux-cxl/20240921071440.1915876-1-zhiw@nvidia.com/T/#m7e6f2b0ac88102095120a56d55341477b74a737a
- Dan: Trend is bare-metal sets up resources and hands them to the guest
- PCI BAR mapping does not change for the guest so why would this be different?
- Also have to consider the CXL.cache case for type-2. This is more demanding.
- What kind of comunication between guest driver and VMM for CXL.cache?
- Alejandro working on a doc to share with the mailing list.
- Look at resizable BARs in VFIO.
## Init order talks
Nothing which stops an accelerator to skip the cxl_mem driver. And there are reasons not to.
## ARM stuff
non-x86 how do we do cache flush (invalidate all)
Every ARM implementation seems to do this differently
Will need more infrastructure -- a subsystem to cache flush
PRM spec?
Dan's unicorn wish is for CXL spec to add this...
# September 2024
Skipped due to LPC
# August 2024
* Opens
* QEMU
* v6.11 fixes
* v6.12 queue
* v6.12+
## QEMU
## cxl-cli / user tools
## 6.11 fixes
* cxl/region: Remove lock from memory notifier callback
- Maybe destined for 6.12 merge window?
* Fixes for hdm decoder initialization from DVSEC ranges
- Need review tags
## 6.12
* cxl: Region bandwidth calculation for targets with shared upstream link
- Waiting on Dan's further review
* cxl/port: Use _-free() to drop put_device() for cxl_port
- Need reviews
* cxl: add Type2 device support
- Pending v3 from Alejandro, discussion on going
* Address translation for HDM decoding
- Pending v3 from Robert
* cxl/region: Remove soft reserve resource at region construction
* acpi/ghes, cper, cxl: Trace FW-First CXL Protocol Errors
- Waiting on next rev from Smita. No activity since May. Is this still happening?
* DCD: Add support for Dynamic Capacity Devices (DCD)
- v3 is out, please review
# 6.12+
* Add RAS support for CXL root ports, CXL downstream switch ports, and CXL upstream switch ports
* Terry, new updates?
* Fan is testing
* Scrub Subsystem via EDAC
* v11 posted
* Shiju/Jonathan, new updates?
* Extended-linear memory-side-cache HMAT Proposal
* Approved?
* ACPICA pull request merged for new HMAT definition
* WIP
* CXL Error Isolation
* awaiting a user, paused
* CXL PMU support for switches
* Jonathan, updates?
* cxl: avoid duplicating report from MCE & device
- Nothing new since last meeting
* FWCTL CXL support
- Awaiting v2 and address Jason and Jonathan's comments
# July 2024
* Opens
* invitation for tech topics for Plumbers uConf spillover
* device-passthrough followup
* vm capability to find memory by tag?
* passthrough DCD (existing CXL emulation) vs passthrough DAX region (new emulation)
* are there nested use cases for passing through pooled memory?
* pooled memory for VMs pre-DCD is not widespread, but chicken / egg problem to become a prom
* QEMU
* cxl-cli
* v6.11 fixes
* v6.12 queue
* v6.12+
## QEMU
* DCD landed since last meeting (basic infrastructure for kernel enabling)
* MST pull request for (sanitize, scan media, get feature (scrub/ecs), libspdm NVME target)
* generic port support will circle back
* PCI CMA not using libspdm found a libspdm bug as a result
* MCTP support?
* only aspeed i2c controller support it and awkward to support on x86 (ACPI)
* PCI -> DT -> I2C?
* Request for a PCI VDM MCTP that is easier to emulate? Inventing one probably too much effort
* virtio-i2c not mctp capable
## cxl-cli / user tools
* [ndctl PATCH v13 0/8] Support poison list retrieval
- [Patches 1 & 3 need tags](https://patchwork.kernel.org/project/cxl/list/?series=868958)
* ndctl v80 waits for poison list
* coincident with v6.11
## 6.11 fixes
* None, currently in 6.11 merge window. Will do PR tomorrow (Wednesday)
## 6.12
* cxl: add poison creation event handler
- Waiting on comments from Dan?
* cxl: Region bandwidth calculation for targets with shared upstream link
- Waiting on further review from Dan.
* cxl: add Type2 device support
- Going through reviews
* Address translation for HDM decoding
* Waiting on next revision from author:
* Export cxl1.1 device link status register value to pci device sysfs.
* cxl/region: Remove soft reserve resource at region construction
* acpi/ghes, cper, cxl: Trace FW-First CXL Protocol Errors
# 6.12+
* DCD: Add support for Dynamic Capacity Devices (DCD)
* Ira: Reordering patches, asking for new tags
* 'more bit' and multiple regions per partition included
* Add RAS support for CXL root ports, CXL downstream switch ports, and CXL upstream switch ports
* Terry: Incorporating RFC feedback to follow RCH downstream port flow
* Scrub Subsystem review
* Shiju / Jonathan: bringing it inline with EDAC requirements from Borislav
* ECS and PPR to be included
* Extended-linear memory-side-cache HMAT Proposal
* Dan: on track for approval
* CXL Error Isolation
* awaiting a user
* CXL PMU support for switches
* Jonathan: stalled behind pcieport rework
* Bind problem for switches relative to class code (pci conventional bridge code)
* cxl: avoid duplicating report from MCE & device
* Shiyang: are duplicated reports a problem in practice? Maybe when "repair" becomes more widespread
- Going through reviews
* FWCTL CXL support
* RFC posted, Jason says looks ok, need CXL review on policies
* Mailbox refactoring
* make cxl_dev_state public? wait to see use cases
* Spreading CFMWS x86 policy to other archs: numa_memblk for more archs
# June 2024
* Opens
* QEMU
* cxl-cli
* v6.10 fixes
* v6.11 queue
* v6.11+
## QEMU
* Big endian conflicts for generic port... fix triggered signficant review feedback
* Waiting for some merge confusion to resolve before pushing more
* Next up
- Misc fixes (no multiple memory backends)
- Scan media
- Sanitize fixes
- 3,6,12,16 way interleave support
- 'Get Feature' support
- Introduce comprehensive variability for generic port for test coverage (bottlenecks in multiple places)
- Investigate fine grained CDAT configurability
* MCTP backed up on an NVME dev branch
* DMA Bounce Buffering pending on separate tree
* FW Update: v3 pending clarification feedback
* FW First Error injection support, is now a generic capability
- ARM-only at the moment
- x86 review / help needed with emulating SCI mechanics to signal errors
* DCD should land this cycle
## cxl-cli / user tools
* cxl/test: add cxl_translate unit test
- [New patch](http://lore.kernel.org/20240624210644.495563-1-alison.schofield@intel.com), needs review:
* cxl/test: Add test case for region info to cxl-events.sh
- [Needs additional review](http://lore.kernel.org/20240328043727.2186722-1-alison.schofield@intel.com)
## 6.10 fixes
* cxl/mem: Fix no cxl_nvd during pmem region auto-assembling
- locally queued
* cxl/region: Avoid null pointer dereference in region lookup
- Ready to be queued, review still welcome
* Check interleave capability
- ready to be queued
* XOR Math Fixups: translation & position
- Need review tags, top priority for fixes pull
- Looking to merge for v6.10-rc6
## 6.11 merge window
* cxl: add missing MODULE_DESCRIPTION() macros
- locally queued
* cxl/region: Support to calculate memory tier abstract distance
- locally queued, review still welcome
## Need additional reviewing and tags
* cxl: add poison creation event handler
* acpi/ghes, cper, cxl: Trace FW-First CXL Protocol Errors
* Documentation: CXL Maturity Map
- awaiting v2 posting
* Export cxl1.1 device link status register value to pci device sysfs.
- Have tags from Jonathan. Dan can you take a look?
- lspci already has support on the pending ABI
* cxl: Region bandwidth calculation for targets with shared upstream link
- Have tags from Jonathan. Pending test tag from Jonathan. Some additional minor updates. Can use additional review.
## Waiting on next revision from author
* cxl/acpi: Warn on unsupported platform config detection
* cxl/region: Remove soft reserve resource at region construction
# 6.11+
* DCD: Add support for Dynamic Capacity Devices (DCD)
- Cleaning up lifetime of extent devices
- 'More' bit support in progress
- ndctl enhancements (extent listing suppport)
- Misc cleanups
* Add RAS support for CXL root ports, CXL downstream switch ports, and CXL upstream switch ports
- Option 2 consideration
* Scrub Subsystem review
* Shiju
- Any new updates since last meeting?
- new version pending, all under EDAC, but modernized to be a 'struct bus_type'
- Need per-device *and* per-region scrubbing
- CXL regions and memdevs registered on the EDAC bus
- DDR ECC6 does have per-dimm controls, so *maybe* there is a DDR-DPA concept
- Respond to high correctable events with increasing the scrub rate? (POC)
- DPAs are not physical? ...but system-software pretends they're physical
* Extended-linear memory-side-cache HMAT Proposal
* Dan
* CXL Error Isolation
* awaiting a user
* CXL PMU support for switches
* Jonathan
* cxl: avoid duplicating report from MCE & device
* Shiyang
- Going through reviews
* FWCTL CXL support
* DaveJ working on RFC, get/set features support
# May 2024
* Opens
* LSF/MM takeaways
* famfs status
* FUSE maintainers are interested in supporting famfs
* Passthrough for inodes and faults working
* famfs needs some support on top, iomap support still needed
* memfd support?
* filesystem in Rust discussion
* weighted interleave (now fully upstream!) takeaways
* container developers are a bit more tolerant of details
* cluster owners want to know a bit more
* folks open to improvement to numactl
* Jonathan: get in touch with Brice about hwloc updates?
* Peter Xu hugetlb fixups
* LPC: uConf accepted, get submissions in and register with CXL uConf as planned attendee
* Vienna Sept 16th
* CFP deadline early July
* CXL Maturity Map
* QEMU
* cxl-cli
* v6.10 fixes
* v6.11 queue
* v6.11+
## QEMU
* Fan posted new version of QEMU-DCD support
* Generic port support got review and prompted docs updates
* MHD support making way for DCD support to clear first
## cxl-cli / user tools
* v79 release since last meeting, and v80 open
## 6.10
* cxl/region: Fix memregion leaks in devm_cxl_add_region()
* Ready for queue
* [/4] XOR Math Fixups: translation & position
* [/2] cxl/region: Fix potential invalid pointer dereference
* cxl: Add interleave capability check
* cxl/pci: Convert PCIBIOS_* return codes to errnos
## 6.11 queue
* cxl/events: Use a common struct for DRAM and General Media events
* Ready for queue
* cxl/region: Convert cxl_pmem_region_alloc to scope-based resource management
* Ready for queue
* [/2] cxl: add poison creation event handler
* [/3] Display cxl1.1 device link status
* [/26] DCD: Add support for Dynamic Capacity Devices (DCD)
* Wait for next rev
## 6.11+
* Scrub Subsystem review
* ACPI RAS2 user showed up on the list
* Converging on "new EDAC" today only "mc" and "pci" devices, switch to properly parented devices
* class device to associated memory regions to scrub interfaces
* why does userpace want to control this and how
* would there be an auto-scrub control in userspace? e.g. Increase scrubbing on threshold crossed
* Extended-linear memory-side-cache HMAT Proposal
* CXL Error Isolation
* awaiting a user
* Switch Port Error handling pending
* CXL PMU support for switches
* Refactored portdrv to aux-bus RFC to be posted soon
# April 2024 (2)
* Opens
* FAMFS RFC v2 posted
* QEMU
* cxl-cli
* v6.9 Fixes
* v6.10 queue
* v6.10+
## QEMU
* DCD Feedback for 9.1 dev cycle
* last mile feedback but looking positive for this cycle
* Generic Port feedback also started
* Reset handling Fixups
* For AER flow testing
* SBR masking emulation testing
* Useful for kexec testing
* CXL Reset digression
* Interleave testing
* Commit error checking and interleave functionality
* precursor - DMA bounce for TCG, MCTP still pending (NVMe-MI dependency)
* MHD Support for Orchestrator flow testing
* SPDM is a precursor for QEMU tree backlog
## cxl-cli / user tools
* Media error patch pending awaiting final review
* Minor fixups
* rasdaemon has CXL patches pending to log events to DB
* OCP is working on expanding CPER record types for hyperscale RAS use cases
## v6.9
* [0/2] cxl: add interleave capability check
* Waiting for next rev, likely 6.10 material
* cxl: Fix cxl_endpoint_get_perf_coordinate() support for RCH
* Pending for 6.9-rc7, Thanks Robert!
## v6.10 Queue
* cxl: Calculate region bandwidth of targets with shared upstream link
* Need review
* [0/4] PCI: Add Secondary Bus Reset (SBR) support for CXL
* Pending review from Dan and Bjorn
* cxl/hdm: Debug, use decoder name function
* Queued.
* [0/4] efi/cxl-cper: Report CXL CPER events through tracing
* Waiting for Rafael review tag. Probably for 6.11
* Ping Tony to take a look
* [0/3] Display cxl1.1 device link status
* Waiting for next rev and review
* [0/2] Add log related mailbox commands
* Pending queueing
* [0/4] Add DPA->HPA translation to dram & general_media
* Wait for next rev
* likely 6.10 material
* cxl/test: Enhance event testing
* Queued
* cxl/hdm: Add debug message for invalid interleave granularity
* Queued
* cxl/acpi: Cleanup __cxl_parse_cfmws()
* Pending next rev
* cxl/acpi.c: Add buggy BIOS hint for CXL ACPI lookup failure
* BIOS fix provided, no need to add this hint
* MAINTAINERS: repair file entry in COMPUTE EXPRESS LINK
* Pending next rev
* cxl/cxl-event: include missing <linux/types.h> and <linux/uuid.h>
* Queued
* [0/2] cxl: add poison creation event handler
* Pending next rev
* cxl: Fix use of phys_to_target_node() for x86
* Queued
## v6.10+
* [00/26] DCD: Add support for Dynamic Capacity Devices (DCD)
* Wait for next rev
* Scrub Subsystem review
* ACPI RAS2 user showed up on the list
* CXL Error Isolation
* awaiting a user
* XOR translation
* Switch Port Error handling pending
* CXL PMU support for switches
# April 2024 (1)
* Opens
* CXL Maintainer Update
* QEMU
* QEMU is in the quiet period for next 2-3 weeks
* TCG memory update, do not use virtio-storage that might put a DMA buffer in CXL memory
* Generic port in staging, cleaned up version to be posted soon
* DCD Update
* DVSEC Ranges? What to do if no static capacity: cxl_await_media_ready() (Linux currently needs active + valid at a minimum)
* Might need a spec clarification for pure DCD devices
* v6 pending review
* Release of pending extents => no force removal flow in QEMU for now
* cxl-cli
* v79 awaiting final review on poison listing changes
* Alison to be maintainer / patch wrangler post v79 release
* libcxl-mi modeled on NVME-mi? MCTP investigation
* BMC Switch CCI + MCTP unification, but maybe not inband
* v6.9 Fixes
* v6.10 Queue
* Future
# March 2024
* Opens
* FAMFS update
* QEMU
* cxl-cli
* v6.8 Fixes
* v6.9 Queue
* Future
## QEMU
* 8 week merge cycle still open
* Pre-reqs pending
* SPDM from Alistair
* MCTP
* Bounce buffer fix for DMA to CXL memory
* Generic port pending generic initiators from NVIDIA
* DCD Emulation Update
* Feedback incorporated
* Superset extend/release as well partial extend/release supported
* Tests passing
* Formalizing introspection may make sense in the future
* Might be too late for cycle ending in a week or so
* Greg: Multi-head interactions with DCD emulation? Can it be added incrementally?
* Investigate a shared base device for single/muti-head implementations to share
* Greg to RFC what he has
* Generic Port
* Background command support queued behind DCD for convenience
* MCTP over I2C still in process
* ARM support still pending device-tree interactions, help welcome
* Firmware first error handling, not impossible to upstream but not a priority
* CPMU
* FM-API help to flesh out the command support welcome
* Non-interleaved high performance CXL memory emulation (how to represent the performance)
* QEMU only emits x1 lowest bandwidth link speed
## FAMFS
* Review, thanks Jonathan
* DEVICE-DAX IOMAP review needed
* PMEM support may be dropped in favor of just DAX
* Christian Brauner has advice on how to open dev-dax
* Fault counters to be removed
* FAMFS held up to performance benchmarking
* Superblock identified capacity
* Initial use case: provide access to a large shared pool with readonly clients
## cxl-cli
* [List Media Errors (Poison)](http://lore.kernel.org/r/cover.1705534719.git.alison.schofield@intel.com): pending review
* QoS class changes; pending
* v79
## v6.8 Fixes
* 3, 6, 12 XOR interleave math fix: pending feedback
* [SSBLIS Fix](http://lore.kernel.org/r/20240301210948.1298075-1-dave.jiang@intel.com): ready to queue
* [CXL QOS Sysfs fixes / simplification](https://patchwork.kernel.org/project/cxl/list/?series=823300): merged
* [Fix "HPA out of order" region assembly fix](https://patchwork.kernel.org/project/cxl/list/?series=821883): merged
* [Fix "no NUMA configuration found"](https://lore.kernel.org/r/99dcb3ae87e04995e9f293f6158dc8fa0749a487.1705085543.git.alison.schofield@intel.com): merged
* [Crash on repeated AER signaling](https://lore.kernel.org/r/20240129131856.2458980-1-ming4.li@intel.com): merged
* [cxl_test build fix](https://lore.kernel.org/r/170543983780.460832.10920261849128601697.stgit@dwillia2-xfh.jf.intel.com): merged
* [Stop requiring MSI/MSIx](https://lore.kernel.org/r/20240117-dont-fail-irq-v2-1-f33f26b0e365@intel.com): merged
* [Fix x16 Region HPA allocation](http://lore.kernel.org/r/20240124091527.8469-1-caoqq@fujitsu.com): merged
* [Fix duplicate messages in CPER handling](http://lore.kernel.org/r/20240131-cxl-cper-fixups-v1-0-335c85b1d77b@intel.com): merged
## v6.9 Queue
* [CXL QOS to NUMA](http://lore.kernel.org/r/170568485801.1008395.12244787918793980621.stgit@djiang5-mobl3): pending merge
* [Weighted Interleave](https://lore.kernel.org/all/20240202170238.90004-5-gregory.price@memverge.com/): queued in mm-unstable
* [DAX support on modern ARM](http://lore.kernel.org/r/20240202210019.88022-1-mathieu.desnoyers@efficios.com): pending merge
* [CXL CPER Protocol Errors to Trace Events](http://lore.kernel.org/r/20240109034755.100555-1-Smita.KoralahalliChannabasappa@amd.com): pending review
* [CXL EINJ](https://lore.kernel.org/all/20240115172007.309547-1-Benjamin.Cheatham@amd.com/): pending merge check ACPICA
* [CXL Userspace Unit Tests](http://lore.kernel.org/r/170171841563.162223.2230646078958595847.stgit@ubuntu): pending review
* [CDAT Cleanups](https://lore.kernel.org/all/20240108114833.241710-1-rrichter@amd.com/): queued
* [CXL test save/restore](http://lore.kernel.org/r/65a980249f50f_3b8e294a3@dwillia2-xfh.jf.intel.com.notmuch): pending non-RFC posting
* [Use sysfs_emit(): throughout](https://lore.kernel.org/r/20240112062709.2490947-1-ruansy.fnst@fujitsu.com): queued
* [cond_guard(): and related cleanups](http://lore.kernel.org/r/20240205142613.23914-1-fabio.maria.de.francesco@linux.intel.com): pending next posting
* scoped_cond_guard() usages pending for v6.9
## Future
* [Component State Dump](https://lore.kernel.org/linux-mm/20240222172350.512-2-sthanneeru.opensrc@micron.com/T/) interaction with event clearing
* how much data is in a CSD, how much blob can trace event support
* [CXL Scrub Feature](http://lore.kernel.org/r/20240111131741.1356-1-shiju.jose@huawei.com): more review needed
* DRAM Scrub necessary over time
* Tradeoffs of reliability vs scrub cost
* want hotplug support
* Address Range Scrub, on demand scrub
* new patchset in process
* sync with RAS API folks on reusability
* OpenCompute model of out of band control might be in conflict with embedded use cases
* RAS API does not supply stop-scrub on inband interface
* CXL Switch Port Error Handling: pending initial posting
* CXL Root Port (RCEC Notified): Error Handling: pending initial posting
* DCD: pending next revision
* DPA to HPA translation for events
* Type-2 Preview: still awaiting a consumer
* CCI Refactor for Switch CCI, RAS API, Type-2: pending next posting
* MMPT in Jonathan's queue
# February 2024
* Opens
* [LSF/MM CFP](https://lore.kernel.org/bpf/4343d07b-b1b2-d43b-c201-a48e89145e5c@iogearbox.net/): deadline March 1st
* QEMU
* cxl-cli
* v6.8 Fixes
* v6.9 Queue
* Future
## QEMU
* Status
* 2 patch sets to pick up; bunch of fixes
* Not clear spec versions so update those to 3.1
* Fan's next DCD version; close
* Some minor issues
* Would like to land 9.0 cycle (Aprox end of March)
* Some things depend on these so want to land them first
* MHD won't make March
* TCG/KVM mess
* Bug report on list; Not as minor as thought
* Slow path does not cover everything unfortunately
* May be some other issues
* Random crashes (might be page tables or ??)
* Alternative is to implement performance path
* Treat as normal RAM
* Can't do interleave with lots of memory regions (ways?)
* For now... Don't use emulated CXL memory
* Fan said it would work for some cases?
* Kernel code is now putting things in the right numa nodes
* Kernel may have been using swap
* Should x86 use memblock?
* Jonathan does not think it will help
* Re-read cdat?
* EFI soft reserved causes x86 to keep the info around
* 'numa keep meminfo' or something like that
* AMD CPER pushed out
* Jonathan would like a HEST table from x86 if someone could provide that
## cxl-cli
* [List Media Errors (Poison)](http://lore.kernel.org/r/cover.1705534719.git.alison.schofield@intel.com): pending review
* QoS class changes; pending
* Porcelain patches welcome
* How can we make things easier?
* Automate cxl create region for largest regions it can figure out
## Should Linux be the BMC?
* Open BMC has a lot of drivers
* need guard rails
* Might be useful and to share code
* BMC only use cases are questionable
* How do we ID which is which?
* Kconfig CXL_BMC_SUPPORT?
* Similart to raw command support
## v6.8 Fixes
* [CXL QOS Sysfs fixes / simplification](https://patchwork.kernel.org/project/cxl/list/?series=823300): pending next posting
* [Fix "HPA out of order" region assembly fix](https://patchwork.kernel.org/project/cxl/list/?series=821883): ready to queue
* [Fix "no NUMA configuration found"](https://lore.kernel.org/r/99dcb3ae87e04995e9f293f6158dc8fa0749a487.1705085543.git.alison.schofield@intel.com): queued for v6.8-rc4
* [Crash on repeated AER signaling](https://lore.kernel.org/r/20240129131856.2458980-1-ming4.li@intel.com): queued for v6.8-rc4
* [cxl_test build fix](https://lore.kernel.org/r/170543983780.460832.10920261849128601697.stgit@dwillia2-xfh.jf.intel.com): merged v6.8-rc2
* [Stop requiring MSI/MSIx](https://lore.kernel.org/r/20240117-dont-fail-irq-v2-1-f33f26b0e365@intel.com): merged v6.8-rc2
* [Fix x16 Region HPA allocation](http://lore.kernel.org/r/20240124091527.8469-1-caoqq@fujitsu.com): merged v6.8-rc2
* [Fix sleeping lock in CPER handling](http://lore.kernel.org/r/20240202-cxl-cper-smatch-v1-1-7a4103c7f5a0@intel.com): pending next posting
* [Fix duplicate messages in CPER handling](http://lore.kernel.org/r/20240131-cxl-cper-fixups-v1-0-335c85b1d77b@intel.com): Going through EFI tree
## AER fatal panic - wide range of handleing
* Policy change to discuss with comunity
* Instead of hoping we should panic?
* But if DAX just kill the process (invalidate mappings)
* But how much running around should we do?
* Hope was that force remove of driver would do a pr_warn() [let panic on warn crash]
* Need more real world feedback
## v6.9 Queue
* [CXL QOS to NUMA](http://lore.kernel.org/r/170568485801.1008395.12244787918793980621.stgit@djiang5-mobl3): pending review
* [Weighted Interleave](https://lore.kernel.org/all/20240202170238.90004-5-gregory.price@memverge.com/): queued in mm-unstable
* [DAX support on modern ARM](http://lore.kernel.org/r/20240202210019.88022-1-mathieu.desnoyers@efficios.com): pending final review
* [CXL CPER Protocol Errors to Trace Events](http://lore.kernel.org/r/20240109034755.100555-1-Smita.KoralahalliChannabasappa@amd.com): pending review
* [CXL EINJ](https://lore.kernel.org/all/20240115172007.309547-1-Benjamin.Cheatham@amd.com/): pending resolution of ACPICA dependency
* [CXL Userspace Unit Tests](http://lore.kernel.org/r/170171841563.162223.2230646078958595847.stgit@ubuntu): pending next posting
* [CDAT Cleanups](https://lore.kernel.org/all/20240108114833.241710-1-rrichter@amd.com/): queued
* [CXL test save/restore](http://lore.kernel.org/r/65a980249f50f_3b8e294a3@dwillia2-xfh.jf.intel.com.notmuch): pending non-RFC posting
* [Use sysfs_emit(): throughout](https://lore.kernel.org/r/20240112062709.2490947-1-ruansy.fnst@fujitsu.com): queued
* [cond_guard(): and related cleansups](http://lore.kernel.org/r/20240205142613.23914-1-fabio.maria.de.francesco@linux.intel.com): pending next posting
## Future
* [CXL Scrub Feature](http://lore.kernel.org/r/20240111131741.1356-1-shiju.jose@huawei.com): more review needed
* CXL Switch Port Error Handling: pending initial posting
* CXL Root Port (RCEC Notified): Error Handling: pending initial posting
* DCD: pending next revision
* DPA to HPA translation for events
* Type-2 Preview: still awaiting a consumer
* CCI Refactor for Switch CCI, RAS API, Type-2: pending next posting
# November/December 2023
* Opens
* Plumbers Takeaways
* QEMU
* cxl-cli
* v6.7 Fixes
* v6.8 Queue
## Opens
* Interleave ratios: MVP
* mempolicy based to start
* cgroups deferred for a later fight
## Plumbers Takeaways
* Greg's interleave document
* 5 types: BIOS, OS, mempolicy (homogenous or heterogeneous)
* LWN Article for reach? Follow in the style of Mel's NUMA article
* UKunit: Userspace unit testing of kernel code
* limitation on what can be mocked with Kunit
* https://github.com/jimharris/ukunit
* Davidlohr to post notes
* Port device RAS support
* Move PCIe port bus driver logic into PCIe driver/core to start as library
* AER handler callback to the endpoint driver
* Break the pcie portbus driver dependency
* Hotplug range register problem resolution
## QEMU
* mst picked up more than expected includng CCI support into 8.2-rc1
* ira's cdat fixes posted
* scrub control: both QEMU and kernel patches posted
* Integrate with ACPI scrub control as a subsystem shared with CXL
* Alistair's SPDM work progression
## cxl-cli
* concern for first-time users
* dnf install cxl-cli
* cxl list -RX
* v79 release imminent
* corresponding to v6.7 updates
* [hotplug range register support?](https://lore.kernel.org/linux-cxl/ZCRhhUDcmypVKu0X@memverge.com/]
* disable device mem_enable modify range register + re-enable
* how to handle zero based DVSEC range register
## v6.7-fixes
* locking fixups
## v6.8
* Interleave syscall
* John: don't force people to go through BIOS for interleave
* Michal: looking for mempolicy2() support
* Greg: also working on thrid-party mempolicy syscall via pidfd (minus mbind/homenode)
* once syscalls are in interleave weights can be layered on top without ABI changes
* numactl changes would be nice to have
# "Halloween" 2023
* Opens:
* QEMU
* cxl-cli
## QEMU
* Multiple HDM decoder support landed
* Compilation issues slowed down a topic
* Mailbox CCI rework sent out
* Difficult to test MCTP infrastructure
* Fan in process of next DCD posting
* FMAPI support on top of DCD ("add" support, test interfaces included "real" tooling wanted)
* QEMU support for changing QOS class information?
* weighted interleave investigation
* generic target support needed
## cxl-cli
* sanitize command unit test for (for v80 depends on v6.7)
* poison listing support (for v79 kernel support in v6.5)
* automatic region position determination for create-region (--strict option for recovery of old behavior)
## v6.7
## v6.8
* DCD next revision pending
* Spec pipecleaning in progress
* [Node Weights and Weighted Interleave - Gregory Price](https://lore.kernel.org/linux-mm/20231031003810.4532-1-gregory.price@memverge.com/)
* John: Tier preference vs local preference?
* Gregory: bandwidth vs latency tiering conflicts
# October 2023
* Opens:
* Jim: QEMU dport conflicting connections, (1) HB (1) 1 RP (2?) Switches (4) Endpoints (Who detects impossible configs?)
* Gregory: port to region confusion (make create-region smarter)
* Vincent: multi-function upstream ports? Yes, for PCIE, does CXL mandate function0?
* Steve: RCH link width / speed enumeration (emit via CXL objects?) Jonathan RCIEP examples of emitting attributes, virtual switch?
* Jonathan: Dynamic NUMA node creation
* 0-size NUMA node entries in SRAT already shipping
* QEMU
* cxl-cli
* v6.6 Fixes
* v6.7 Queue
## QEMU
* Cleanup sets upstream
* mst has QTG in the backlog
* backlog of PCI bits
* switch serial number on upstream port
* multi-HDM decoders
* mailbox rework for Switch CCI + MCTP over I2C (difficult to add aspeed to x86 machine model)
* DCD: working through reported issues wrt kernel patches
* Fabric management ambiguities
* MCTP representation of MLDs? Single-MLDs when plugged in as an SLD.
* FMAPI binding when sending to a switch, not Type-3, except for the general commands like identify
* I.e. use type-3 binding except opcodes 0x4000+ when talking to a switch
## cxl-cli
* [Poison List Retrieval](https://patchwork.kernel.org/project/cxl/cover/cover.1696196382.git.alison.schofield@intel.com/)
* [Towards CXL continuous integration](https://github.com/facebookincubator/kernel-patches-daemon)
* Vishal: [set alert config patches](https://lore.kernel.org/linux-cxl/20230918045514.6709-1-jehoon.park@samsung.com/) queued up
## v6.6 fixes
* [v6.6-rc3 update](https://lore.kernel.org/linux-cxl/650f60a224347_124e92943@dwillia2-mobl3.amr.corp.intel.com.notmuch/)
* [Fix shutdown order](https://patchwork.kernel.org/project/cxl/cover/169602896768.904193.11292185494339980455.stgit@dwillia2-xfh.jf.intel.com/)
* awaiting testing
* need to rework mbox irq to be threaded or an atomic flag
* [Soft Reserved Conflict / Lifetime](https://patchwork.kernel.org/project/cxl/cover/cover.1692638817.git.alison.schofield@intel.com/)
* Auto-assembly Rework
* Jim: Granularity fix top down is confusing switch settings
* Davidlohr: Type-2 crash interaction with security shutdown order?
## v6.7+
* [RCH EH](https://patchwork.kernel.org/project/cxl/cover/20230927154339.1600738-1-rrichter@amd.com/)
* [QTG](https://patchwork.kernel.org/project/cxl/cover/168695160531.3031571.4875512229068707023.stgit@djiang5-mobl3/)
* [QTG to HMEM](https://patchwork.kernel.org/project/cxl/list/?series=759643)
* [Switch CCI](https://patchwork.kernel.org/project/cxl/cover/20230804115414.14391-1-Jonathan.Cameron@huawei.com/)
* Davidlohr: background status publishing to userspace? Bind VPB, Sanitize via Tunnel?
* Jonathan: Punt until someone with BMC background can help drive
* Jonathan: Possibly some NVME MCTP work to draft behind
* Jonathan: start with safe commands to get framework started
* Gregory: multi-headed SLD testing validating the approach of an independent mailbox core (QEMU)
* [SPDM / Auth](https://patchwork.kernel.org/project/cxl/cover/cover.1695921656.git.lukas@wunner.de/)
* SPDM BoF Planned for Plumbers in November
* [memmap on memory](https://patchwork.kernel.org/project/cxl/cover/20230928-vv-kmem_memmap-v4-0-6ff73fec519a@intel.com/)
* mempolicy proposals:
* [multi-tier](https://patchwork.kernel.org/project/cxl/cover/20230927095002.10245-1-ravis.opensrc@micron.com/)
* [mempolicy2](https://patchwork.kernel.org/project/cxl/cover/20231003002156.740595-1-gregory.price@memverge.com/)
* [mempolicyNM](https://patchwork.kernel.org/project/linux-mm/patch/20220607171949.85796-1-hannes@cmpxchg.org/)
* [weighted interleave]
* Informal Plumbers BoF
# September 2023
* Opens:
* John: CXL memory online by default memhp_default_state=offline not working?
* QEMU
* cxl-cli
* v6.6 Fixes
* v6.7 Queue
## QEMU
* Merge window induced slowness
* Round-up of fixlets sent up
* Multiple HDM Decoder support for endpoints posted
* Serial number update
* Maintainer feedback administrivia cleanups
* Sort out revision numbers for spec version comments
* advocate with your rep about caching old copies at spec-landing
* MCTP I2C from NVME
* Single Aspeed i2c controller driver has support
* POC quality / out-of-tree support until server class driver arrives
* DCD Update
* waiting for kernel-side code resolution
* Get Extent List for unaccepted memory, track pending state in the implementation
* cxl-test may need updates too
* MHD Update
* Joint effort with SK Hynix, custom command set
* Proto-DCD
* Single logical device
* Software Development Vehicle
* CPMU, ARM, Compliance, Type-2
* SPDM Interest
* WDC looking at library-izing it, still looking to support and external agent
* FM API (MCTP Mailboxes + Switch CCI + MHD Mailbox)
## v6.6 Fixes
* CXL RAS Enabling
* [Region Granularity Setup](https://patchwork.kernel.org/project/cxl/patch/20230822180928.117596-1-alison.schofield@intel.com/)
* [Region Decoder Discover](https://patchwork.kernel.org/project/cxl/patch/20230822014303.110509-1-alison.schofield@intel.com/)
## v6.7 Queue
* RCH EH (under)
* Kernel SPDM
* WDC showing up to help
* Invite to CXL sync? Invited to "devsec"
# August 2023
* Opens:
* [Linux Plumbers CXL Microconference CFP](http://lore.kernel.org/r/a4c2gx2tnm4ckax7qkx2trnvmqjssfytc45sb2zikuayd2marc@rpsjp4icgsvn)
* uConf proposals close at end of the August
## QEMU Update
* Not a huge amount going in this merge, doc, fixes Multiple HDM decoders should be going in this merge.
* Lot of stuff is backed up by the mailbox rework
* Jonathans gitlab has [DCD preview queued up](https://gitlab.com/jic23/qemu).
* Ira did some testing and fixes were merged in latest version
* Jonathan might have broken it with rebasing. So just a reminder that this is work in progress.
* MCTP support over I2c... Support is coming from NVME-MI this work is similar to FM-API
## cxl-cli update:
* [v78 release](http://lore.kernel.org/r/8a83f1832c95e327a4695b607729102216a3e2f0.camel@intel.com)
* [Hotplug helper proposal](http://lore.kernel.org/r/20230807213635.3633907-1-terry.bowman@amd.com)
## v6.5 Fixes Queue
* [rc4 updates](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tag/?h=cxl-fixes-6.5-rc4)
* [rc5 updates](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/tag/?h=cxl-fixes-6.5-rc5)
## v6.6 Queue
* RCH Error handling
* Terry working on it right now. Was waiting on response from Dan which should be there yesterday.
* Will pick that work back up
* Type2
* Davidlohr to submit the fix for type2 init collision. (Merged)
* Dan rebasing patches. There is conflict here with the Switch CCI work. See below.
* DCD
* Ira is reworking the patch set quite a bit.
* Fan’s QEMU DCD work is being used
* Cxl-test being added for better regression testing
* Cxl-test event processing was changed
* New DAX device work needed to handle sparse extents within the dax region
* Interleaving is in the back of his head and Navneet has been looking into this. However, interleaving is not slated for this initial work
* Jonathan - concerned that interleaving should not to be precluded
* Leave in comments about where interleaving would fit in.
* Interleaving is the next major feature…
* QEMU - DCD merge would be at least 6.7 aligned.
* Switch CCI (Jonathan)
* Opens around what we do for user space – almost every command is destructive
* Maybe just CXL raw commands are required?
* Patch set has been a pain to rebase on type 2 from Dan
* Would really like review / feedback
* Davidlohr would like to merge the ‘moving around code’ sooner
* Would help with the type 2 conflicts
* It is hard to generalize the code without this second user
* Not critical for 6.6
* would like to see an early merge slated for 6.7
* In the end – Security questions are major gating factor
* Memory tiring in general
* CDAT vs HMAT
* ‘Distance’ calculations vary
* Patch set: ‘Mem tiring calculating abstract distance from ACPI’ (v6.7 material)
## FM general topics
* We said we would talk about FM things in this meeting…
* Is there something at plumbers? Yes there is.
* Plumbers BoF for FM stuff?
## Question from discord:
* John: "numa ratio policy patch?"
* Jonathan will try and dig in to see where the patches are
* We are talking at a VMA level.
## QEMU Update
## cxl-cli update
* [Hotplug helper, and expanding cxl-cli beyond C](http://lore.kernel.org/r/20230807213635.3633907-1-terry.bowman@amd.com)
## v6.5 Fixes Queue
* Region autodiscovery fixes
* [x1 granularity calculation fix](https://patchwork.kernel.org/project/cxl/list/?series=773298): minor fixups requested
* [switch decoder allocation](https://patchwork.kernel.org/project/cxl/list/?series=773274): minor fixups requested
* Hotplug fixes
* [Cleanup softreserve on takeover](https://patchwork.kernel.org/project/cxl/list/?series=773250): awaiting review
* [Reuse SRAT proximity domain](https://patchwork.kernel.org/project/cxl/list/?series=764146): pinged x86
* [CXL _OSC AER Fixup](https://patchwork.kernel.org/project/cxl/list/?series=772827): minor fixups requested
## v6.6 Queue
* Queue closes August 18th
* [RCH Error handling](https://patchwork.kernel.org/project/cxl/list/?series=761698): fixes requested
* QTG enabling
* [ACPI HMAT Generic Port support](https://patchwork.kernel.org/project/cxl/list/?series=759643): awaiting merge
* [Surface QTG ID info](https://patchwork.kernel.org/project/cxl/list/?series=758023): awaiting merge
* [CDAT Parsing](https://patchwork.kernel.org/project/cxl/list/?series=757264): awaiting merge
* Finish Type2 enabling
* [Fix security init collision](https://patchwork.kernel.org/project/cxl/list/?series=770745): different approach requested
* [Rebase remaining Type2 HDM API](http://lore.kernel.org/r/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com)
* [DCD](https://patchwork.kernel.org/project/cxl/list/?series=757239): awaiting next rev
* [Switch CCI](https://patchwork.kernel.org/project/cxl/list/?series=773085): awaiting review
# July 2023
## ndctl / cxl-cli update
* v78 - minor fixups only - will go out this week
* v79
* Firmware update (no outstanding comments)
* Poison injection (awaiting new rev)
* Others?
...further notes not captured.
# June 2023
* Opens:
* OpenBMC collaboration
* Labels / Persistent Naming (6.3 issue?)
* Add a CXL-CLI Item to the Agenda
* QEMU Update
* v6.4 Fixes
* v6.5 Merge Queue
* Post v6.5 material
## QEMU Update
* QEMU DCD Support?
* MLD Support
* CCI layering work for OpenBMC collab
* I2C ACPI aspeed controller (upstream questionable)
## v6.4 Fixes
* [DAX Use After Free](https://lore.kernel.org/linux-cxl/168577282846.1672036.13848242151310480001.stgit@dwillia2-xfh.jf.intel.com/)
* [SRAT vs CFMWS Fixup](https://lore.kernel.org/linux-cxl/cover.1684448934.git.alison.schofield@intel.com/)(pending next rev and x86 review)
* [Cache Management Discussion](http://lore.kernel.org/r/648220cdade2_1433ac2949b@dwillia2-xfh.jf.intel.com.notmuch)
## v6.5 Merge Queue
* [RCH Error Handling](https://lore.kernel.org/linux-cxl/20230607221651.2454764-1-terry.bowman@amd.com/)(awaiting v6 posting)
* Follow-up: RDPAS vs Root Port Scanning?
* [Background command support](http://lore.kernel.org/r/20230421092321.12741-1-dave@stgolabs.net/)(baseline pushed, awaiting consumer)
* [Sanitization](http://lore.kernel.org/r/20230612181038.14421-1-dave@stgolabs.net)(pending review)
* [Firmware udpate](http://lore.kernel.org/r/20230602-vv-fw_update-v3-0-869f82069c95@intel.com)(awaiting final review)
* [CXL perf monitoring](http://lore.kernel.org/r/20230303175022.10806-1-Jonathan.Cameron@huawei.com)(awaiting push to cxl-next)
## Post v6.5
* [QoS Class support](http://lore.kernel.org/r/168382784460.3510737.9571643715488757272.stgit@djiang5-mobl3)(pre-reqs heading for v6.5)
* [CDAT + QTG _DSM integration](http://lore.kernel.org/r/168088732996.1441063.10107817505475386072.stgit@djiang5-mobl3)(pending review)
* Standalone CXL IDE
* [PCIE SPDM pre-requisite](https://github.com/l1k/linux/commits/doe)
* [KEYP table enabling](https://cdrdv2-public.intel.com/732838/732838_Root%20Complex%20IDE%20Programming%20Guide_Rev1p0.pdf)
* [Switch CCI](http://lore.kernel.org/r/20221025104243.20836-1-Jonathan.Cameron@huawei.com)
* memory_failure() for CXL events
* [Type-2 Region Creation](http://lore.kernel.org/r/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com) (awaiting review)
* Scan Media
* background dependency
* [Dynamic Capacity Device support](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.5/dcd-preview)(awaiting next rev)
* Sparse DAX Region infrastructure
* DCD event plumbing
# May 2023
* Opens:
* rasdaemon patches need review
* LSF/MM takeaways
* QEMU Update
* v6.4 pull summary
* v6.5 Queue
## LSF/MM takeaways
* CXL 3.0 specification update review well received
* Discussed nodes vs zones and mempolicy vs mmap flags, nodes+mempolicy continues as the path forward
* Fabric manager: several efforts in flight (one in rust one in golang, OCP and OFA efforts as well)
* Live migration: CXL as a transport for migration, opportunity for migrate in place
## QEMU Update
* Several patchkits ready and awaiting final merge:
* volatile memory
* poison handling
* events
* DCD support starting to surface
* Initial test results of the pre-RFC implementation look good
* QMP based interface
## v6.4 pull summary
* [DOE rework](https://lore.kernel.org/all/cover.1678543498.git.lukas@wunner.de/)(queued)
* [Poison retrieval](http://lore.kernel.org/r/cover.1679284567.git.alison.schofield@intel.com)(pending review)
* Forward and reverse address translation (DPA <==> HPA)
* [Poison inject and clear](http://lore.kernel.org/r/cover.1678471465.git.alison.schofield@intel.com)(awaiting next rev)
## v6.5 queue
* [Background command support](http://lore.kernel.org/r/20230421092321.12741-1-dave@stgolabs.net/)(pending review)
* [QoS Class support](http://lore.kernel.org/r/168382784460.3510737.9571643715488757272.stgit@djiang5-mobl3)(pending review)
* [CDAT + QTG _DSM integration](http://lore.kernel.org/r/168088732996.1441063.10107817505475386072.stgit@djiang5-mobl3)(pending review)
* [CXL perf monitoring](http://lore.kernel.org/r/20230303175022.10806-1-Jonathan.Cameron@huawei.com)(awaiting perf acks)
* [Dynamic Capacity Device support](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.5/dcd-preview)(awaiting next rev)
* Sparse DAX Region infrastructure
* DCD event plumbing
* [Firmware udpate](http://lore.kernel.org/r/20230421-vv-fw_update-v1-0-22468747d72f@intel.com))(pending review)
* v2 posted with review feedback incorporated
* man page added to the cxl-cli patchkit
* [RAS Capability Tracing on RCH AER events](http://lore.kernel.org/r/20221021185615.605233-1-terry.bowman@amd.com)(awaiting next rev)
* Standalone CXL IDE
* PCIE SPDM pre-requisite
* [KEYP table enabling](https://cdrdv2-public.intel.com/732838/732838_Root%20Complex%20IDE%20Programming%20Guide_Rev1p0.pdf)
* [Switch CCI](http://lore.kernel.org/r/20221025104243.20836-1-Jonathan.Cameron@huawei.com)
* memory_failure() for CXL events
* [Type-2 Region Creation](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.4/cxl-type-2)(awaiting first rev)
* Scan Media
* background dependency
# April 2023
* Opens:
* QEMU Update
* v6.3 Fixes
* v6.4 Queue
* v6.5 Queue
## v6.3 Fixes
* [Decoder Enumeration Fixes](http://lore.kernel.org/r/168149842935.792294.13212627946146993066.stgit@dwillia2-xfh.jf.intel.com)(queued)
## v6.4 Queue
* [DOE rework](https://lore.kernel.org/all/cover.1678543498.git.lukas@wunner.de/)(queued)
* [Poison retrieval](http://lore.kernel.org/r/cover.1679284567.git.alison.schofield@intel.com)(pending review)
* Forward and reverse address translation (DPA <==> HPA)
* [Poison inject and clear](http://lore.kernel.org/r/cover.1678471465.git.alison.schofield@intel.com)(awaiting next rev)
* [CXL perf monitoring](http://lore.kernel.org/r/20230303175022.10806-1-Jonathan.Cameron@huawei.com)(awaiting perf acks)
## v6.5 Queue
* [CDAT + QTG _DSM integration](http://lore.kernel.org/r/168088732996.1441063.10107817505475386072.stgit@djiang5-mobl3)(review pending)
* [Dynamic Capacity Device support](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.5/dcd-preview)(awaiting next rev)
* Sparse DAX Region infrastructure
* DCD event plumbing
* Firmware Update (awaiting first rev)
* [RAS Capability Tracing on RCH AER events](http://lore.kernel.org/r/20221021185615.605233-1-terry.bowman@amd.com)(awaiting next rev)
* Standalone CXL IDE
* PCIE SPDM pre-requisite
* [KEYP table enabling](https://cdrdv2-public.intel.com/732838/732838_Root%20Complex%20IDE%20Programming%20Guide_Rev1p0.pdf)
* [Switch CCI](http://lore.kernel.org/r/20221025104243.20836-1-Jonathan.Cameron@huawei.com)
* memory_failure() for CXL events
* [Type-2 Region Creation](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.4/cxl-type-2)(awaiting first rev)
* Scan Media
* background dependency
# March 2023
* Opens:
* [cxl/hdm: Fix hdm decoder init by adding COMMIT field check](http://lore.kernel.org/r/20230228224014.1402545-1-fan.ni@samsung.com)
* HDM-D/DB Kernel-internal region creation
* QEMU Update
* v6.4 Queue
## v6.4 Queue
* [DOE rework](https://lore.kernel.org/all/cover.1678543498.git.lukas@wunner.de/)
* [Poison retrieval](http://lore.kernel.org/r/cover.1679284567.git.alison.schofield@intel.com)
* Forward and reverse address translation (DPA <==> HPA)
* [Poison inject and clear](http://lore.kernel.org/r/cover.1678471465.git.alison.schofield@intel.com)
* Scan Media
* background dependency
* [Background command support](https://lore.kernel.org/all/20230224194652.1990604-1-dave@stgolabs.net/)
* Dynamic Capacity Device support
* Sparse DAX Region infrastructure
* DCD event plumbing
* Firmware Update
* CDAT + QTG _DSM integration
* [CXL perf monitoring](http://lore.kernel.org/r/20230303175022.10806-1-Jonathan.Cameron@huawei.com)
* [RAS Capability Tracing on RCH AER events](http://lore.kernel.org/r/20221021185615.605233-1-terry.bowman@amd.com)
* Standalone CXL IDE
* PCIE SPDM pre-requisite
* [Switch CCI](http://lore.kernel.org/r/20221025104243.20836-1-Jonathan.Cameron@huawei.com)
* memory_failure() for CXL events
* Maintenance Feature Support (DRAM PPR) (BMC only?)
## Notes
* Question about kernel code modularity for accelerator drivers
* Expectation is that it is a bug if CXL core code cannnot be reused for devices outside of the class-device definition
* DCD Sharing may be the first user of HDM-DB functionality in the kernel, QEMU model for this in scoping
* Multi-head (not yet MLD) device support in the works for QEMU
* QEMU gaining a fix for clearing the HDM decoder COMMITTED bit when deactivating decoders
* Poison
* Poison inject can be done unconditionally, rely on "injected" indication to delineate real vs simulated hardware problems
* open question: should the driver taint the kernel on inject? No, ACPI EINJ does not
* Poison list: emit trace event on inject event? Maybe already covered by another event record
# February 2023
* Opens:
* CXL DVSEC emulation fixes
* QEMU Update
* v6.3 Merge Window
* v6.4 Queue
## v6.3 Merge Window
* Move tracepoints to cxl_core
* Export CXL _OSC error control result
* CXL Events to Linux Trace Events (including interrupts)
* HDM decoder emulation
* Default "Soft Reserved" (EFI_MEMORY_SP) handling policy (kernel)
* Volatile Region Discovery
* Volatile Region Provisioning
* Set timestamp
## v6.4 Queue
* Poison inject and clear
* Forward and reverse address translation (DPA <==> HPA)
* Poison retrieval
* memory_failure() for CXL events
* Dynamic Capacity Device support
* Sparse DAX Region infrastructure
* DCD event plumbing
* CDAT + QTG _DSM integration
* DOE rework
* Standalone CXL IDE
* PCIE SPDM pre-requisite
* RAS Capability Tracing on RCH AER events
* Maintenance Feature Support (DRAM PPR)
* CXL perf monitoring
* Switch CCI
## Notes:
* QEMU:
* Several patch kits in flight: https://gitlab.com/jic23/qemu/-/commits/cxl-2023-02-21/
* AER Discussion:
* What about CXL Reset for recovery?
* May be more relevant for future Type-2 devices than Type-3
* Add another PCI error recovery reset type?
* Map FLR => CXL Reset?
* PCI core supports per-device reset methods
* DCD
* Look at MLD support before Switch CCI support
* CXL perf monitoring
* https://lore.kernel.org/r/20221018121318.22385-1-Jonathan.Cameron@huawei.com
* FW Update
* depends on background command support
* revisit for v6.4
* Scan Media
* revisit for v6.4
## January 2023
# Agenda 01/24
* Opens:
* DAX-page request API rework
* FM Project? LSF/MM topic
* Type-3 volatile
* QEMU Update
* v6.2 Merge Window
* v6.2-rc Fixes
* v6.3 Status
* v6.3+ Future Work
# v6.2 Merge Window
* Cache invalidation for region physical invalidation scenarios
* DOE kernel/user access collision detection
* RCH preparation patches
* RCH Support (including DVSEC Range Register enumeration)
* Security commands (including background commands)
* RAS Capability Tracing on VH AER events
* XOR Interleave Math support
* cxl_pmem_wq removal
* EFI CPER record parsing for CXL error records
# v6.2-rc Fixes
Merged in cxl/fixes:
* RAS UE addr mis-assignment
Pending merge:
* Fix nvdimm unregistration
# v6.3 Status
Merged in cxl/next:
* Move tracepoints to cxl_core
* Export CXL _OSC error control result
Pending merge:
* CXL Events to Linux Trace Events (including interrupts)
* Poison inject and clear
* Forward and reverse address translation (DPA <==> HPA)
* Poison retrieval
* HDM decoder emulation
Awaiting next (or first) posting:
* RAS Capability Tracing on RCH AER events
* Volatile Region Discovery
* Volatile Region Provisioning
* CDAT + QTG _DSM integration
* Set timestamp
* memory_failure() for CXL events
* DOE rework
# v6.3+ Future Work
* Default "Soft Reserved" (EFI_MEMORY_SP) handling policy (cxl-cli + daxctl)
* Dynamic Capacity Device support
* Sparse DAX Region infrastructure
* DCD event plumbing
* Standalone CXL IDE
* PCIE SPDM pre-requisite
* Maintenance Feature Support (DRAM PPR)
* CXL perf monitoring
# FM Future
* MLD Mailbox support for DCD event injection
* Switch mailbox CCI
* Multi-head device mailbox tunneling
# QEMU
* Start new threads for debug issues not on patches
* Greg's volatile region setup testing
* Passthrough decoder checks
* SPDM still pending
## November 2022
# Agenda 11/29
* Opens:
* FSDAX ->notify_failure() regression work still pending
* Others?
* Fixes merged for v6.1-rc4
* v6.2 merge window status
* Post v6.2 Features
# v6.1-rc4 Fixes
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tag/?h=v6.1-rc4](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tag/?h=v6.1-rc4)
Merged:
* Mailbox input payload fix
* Decoder commit crash
* LSA payload handling fix
* CFMWS NUMA Node setup
* Fix switch attached to single-port host-bridge
* BUG in create-region when no more intermediate port decoders available
* Fix region object memory leak
* Fix memdev object memory leak
* cxl_pmem static analysis fix
# v6.2 Merge Window Status
Merged:
* Cache invalidation for region physical invalidation scenarios
* DOE kernel/user access collision detection
* RCH preparation patches
In the queue (has review):
* RCH Support (including DVSEC Range Register enumeration)
* Security commands (including background commands)
* CXL Events to Linux Trace Events (including interrupts)
* RAS Capability Tracing on RCH and VH AER events
In the queue (needs review):
* XOR Interleave Math support
* Forward and reverse address translation (DPA <==> HPA)
* Poison retrieval
* cxl_pmem_wq removal
* EFI CPER record parsing for CXL error records
At risk:
* Volatile Region Discovery
* Volatile Region Provisioning
* CDAT + QTG _DSM integration
* Poison inject and clear
* CXL perf monitoring
# Post v6.2 Features
* MLD Mailbox support for DCD event injection
* Dynamic Capacity Device support
* Sparse DAX Region infrastructure
* DCD event plumbing
* Switch mailbox CCI
* Multi-head device mailbox tunneling
* Standalone CXL IDE
* PCIE SPDM pre-requisite
* Maintenance Feature Support (DRAM PPR)
* Default "Soft Reserved" (EFI_MEMORY_SP) handling policy (cxl-cli + daxctl)
## October 2022
# Agenda 10/25
* Opens:
* FSDAX page reference counting rework (merged in mm-unstable)
* FSDAX ->notify_failure() regression work still pending
* Code First ECR: ['SP' attribute in SRAT](https://bugzilla.tianocore.org/show_bug.cgi?id=4062)
* QEMU emulation status update
* Others?
* Fixes pending for v6.1-rc
* Features in flight for v6.2
* Rough plans for post v6.2 work
# v6.1 Fixes
[https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=fixes](https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=fixes)
Queued:
* Mailbox input payload fix
* Decoder commit crash
* LSA payload handling fix
* CFMWS NUMA Node setup
Pending:
* Fix switch attached to single-port host-bridge
* BUG in create-region when no more intermediate port decoders available
# v6.2 Features
In rough priority order, feedback welcome:
* RCH Support (including DVSEC Range Register enumeration)
* Cache invalidation for region physical invalidation scenarios
* RAS Capability Tracing on RCH and VH AER events
* CXL Events to Linux Trace Events (including interrupts)
* EFI CPER record parsing for CXL error records
* Forward and reverse address translation (DPA <==> HPA)
* Volatile Region Discovery
* Volatile Region Provisioning
* Security commands (including background commands)
* CXL perf monitoring
* Miscellaneous cleanups and renames
# Post v6.2 Features
* Dynamic Capacity Device support
* Sparse DAX Region infrastructure
* DCD event plumbing
* Maintenance Feature Support (DRAM PPR)
* Switch mailbox CCI
* Multi-head device mailbox tunneling
* Default "Soft Reserved" (EFI_MEMORY_SP) handling policy (cxl-cli + daxctl)
## August 2022
# Agenda 8/30
* Opens:
* FSDAX ->notify_failure() fixes
* FSDAX page reference counting rework
* Linux v6.0-rc1 and ndctl (ndctl, daxctl, cxl-cli) v74 released
* Fix and Feature queue for v6.0-rc, v6.1 and ndctl-v75
* Rough plans for post v6.1 work for CXL 3.0 enabling
# Recently released
* Kernel:
* DPA Space Accounting
* PMEM Region Provisioning
* DOE Support in PCI core
* CDAT retrieval (for debug)
* User tooling:
* cxl create-region
* cxl reserve/free-dpa
* cxl list -vvv
# Next fixes and features
* 'arch_flush_memregion()'
* Fix validation of x1 switch topologies
* Volatile region provisioning
* Region labels
* Security commands support
* Trace events for CXL events (including interrupts)
* 'cxl monitor' command
* CXL AER handling
* Address translation
# Future work
* Performance monitoring
* Maintenance Feature Support (DRAM PPR)
* Dynamic Capacity Device support
* Default "Soft Reserved" (EFI_MEMORY_SP) handling policy
## July 2022
# Agenda 7/26
* Opens:
* FSDAX page reference counting rework
* What is queued for v6.0 (and ndctl-v74)?
* Late v6.0 updates
* Post v6.0 work
# Queued for v6.0
* DOE Support in PCI core
* CDAT retrieval (for debug)
* DPA Space Accounting
* PMEM Region Provisioning
# In review for v6.0
* Interleave granularity fixes
* Fix host-bridge x1 interleave constraint
* [Fix region granularity > host-bridge granularity handling](https://lore.kernel.org/linux-cxl/165853778028.2430596.7493880465382850752.stgit@dwillia2-xfh.jf.intel.com/) (scale factors must match)
# Post v6.0 material
* Pre-existing region enumeration
* Volatile region provisioning
* XORMAP interleave support
* Trace Events for CXL Events
* List Poison
* Scan Media
* Address translation
* Region persistence in labels
* Region enumeration via labels
## June 2022
# Agenda: 6/28
* Opens:
* CXL Device Tree Support
* MEM_HWINIT_MODE=0
* QEMU mainline CXL support is live
* What is in review for v5.20 (and ndctl-v74)
* What else might make v5.20?
* What is post v5.20 material?
# v5.20 in review
* [DOE + CDAT](https://lore.kernel.org/linux-cxl/20220628041527.742333-1-ira.weiny@intel.com/)
* [PMEM Region Provisioning](https://lore.kernel.org/linux-cxl/165603869943.551046.3498980330327696732.stgit@dwillia2-xfh/)
# v5.20 on deck
* Pre-existing region enumeration
* Region persistence in labels
* Region enumeration via labels
* Address translation foundation
# Post v5.20 material
* List Poison
* Scan Media
* XORMAP interleave support
* Trace Events for CXL Events
* Address translation (in cxl-cli) for all kernel supported Events, List Poison, and Scan Media
## May 2022
# Agenda: 5/31
* What is in v5.19?
* What is on deck for v5.20?
* What is post v5.20 material?
* Opens
# v5.19 / ndctl-v73
* Kernel
* lockdep annotations
* CXL _OSC (native CXL hotplug + error "handling")
* Disable suspend
* Mem_enable fixes
# v5.20 / ndctl-v74
* Kernel
* Region Provisioning
* DOE Core
* CXL CDAT Retrieval
* Event record handling core
* Scan Media records
* Event Interrupts
* Background command timesharing
* Userpace
* 'cxl create-region'
* Region listing support
* Scan media / Event records to json
* Address translation
# Post v5.20 / v6.0
* Kernel
* SPDM Attestation
* IDE
* Security commands
* Userspace
* Attestation helper process
* CXL Device-DAX Policy