LLVM/project 4a24c68 llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/ARM setcc-logic.ll

[DAGCombiner] Fold (or (seteq X, 0), (seteq X, -1)) to (setult (add X, 1), 2) (#192183)

This is the De Morgan dual of the existing fold:
    (and (setne X, 0), (setne X, -1)) --> (setuge (add X, 1), 2)

The or-of-equalities version checks if X is either 0 or -1, which is
equivalent to (X+1) < 2 (unsigned). This reduces two comparisons and
an or to one add and one comparison.
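
The equivalence can be checked outside LLVM. The following is a minimal sketch in plain C++, not the DAGCombiner code itself; the function names are made up for illustration. It compares the two forms on 32-bit values and exhaustively on a 16-bit analogue.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two equality comparisons joined by an or.
bool isZeroOrAllOnes(uint32_t x) { return x == 0u || x == UINT32_MAX; }

// Folded form: one add and one unsigned comparison. x + 1 wraps to 0
// when x is all ones and equals 1 when x is 0; any other x gives >= 2.
bool isZeroOrAllOnesFolded(uint32_t x) { return x + 1u < 2u; }

// Exhaustively checks the same identity on every 16-bit value.
bool checkFoldFor16Bit() {
  for (uint32_t i = 0; i <= UINT16_MAX; ++i) {
    uint16_t x = static_cast<uint16_t>(i);
    bool original = x == 0 || x == UINT16_MAX;
    // The cast back to uint16_t models the wrapping 16-bit add.
    bool folded = static_cast<uint16_t>(x + 1) < 2;
    assert(original == folded);
    if (original != folded)
      return false;
  }
  return true;
}
```

The unsigned comparison is what makes this work: with a signed comparison, negative values of X+1 would also compare below 2.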

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
Delta  File
+14 -0  llvm/test/CodeGen/ARM/setcc-logic.ll
+4 -4  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+18 -4  2 files

LLVM/project 7780e54 llvm/include/llvm/CodeGen AsmPrinterAnalysis.h

[AsmPrinter] Fix AsmPrinterAnalysis::Result::invalidate to take PreservedAnalyses by const reference (#191742)

The invalidate method was taking PreservedAnalyses by value instead of
by const reference, causing an unnecessary copy on every invalidation
query. All other analysis invalidate methods in LLVM use const
reference.
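
The cost difference can be illustrated with a small standalone sketch. `CountingSet` below is a hypothetical stand-in for an object that is costly to copy, not LLVM's `PreservedAnalyses`, and the two functions only mimic the shape of an invalidate-style query.

```cpp
#include <cassert>

// Hypothetical stand-in that counts how many times it is copied.
struct CountingSet {
  static int copies;
  CountingSet() = default;
  CountingSet(const CountingSet &) { ++copies; }
};
int CountingSet::copies = 0;

// By-value parameter: every call with an lvalue argument copies it.
bool invalidateByValue(CountingSet pa) {
  (void)pa;
  return false;
}

// By-const-reference parameter: the caller's object is read in place.
bool invalidateByConstRef(const CountingSet &pa) {
  (void)pa;
  return false;
}

// Returns the number of copies made by one call of each flavor.
int copiesForOneCallEach() {
  CountingSet pa;
  CountingSet::copies = 0;
  invalidateByValue(pa); // one copy for the parameter
  int afterValue = CountingSet::copies;
  invalidateByConstRef(pa); // no further copy
  assert(CountingSet::copies == afterValue);
  return CountingSet::copies;
}
```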

Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
Delta  File
+1 -1  llvm/include/llvm/CodeGen/AsmPrinterAnalysis.h
+1 -1  1 file

LLVM/project 2dc9e4d clang/include/clang/AST ASTContext.h, clang/lib/AST ASTContext.cpp ItaniumMangle.cpp

[clang] implement CWG2064: ignore value dependence for decltype

The 'decltype' of a value-dependent (but non-type-dependent) expression should
be known, so this patch makes such types non-opaque instead.

This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.

This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.

Fixes #8740
Fixes #61818
Fixes #190388
Delta  File
+887 -175  clang/lib/AST/ASTContext.cpp
+287 -12  clang/test/SemaTemplate/instantiation-dependence.cpp
+151 -93  clang/lib/AST/ItaniumMangle.cpp
+76 -68  clang/lib/AST/Type.cpp
+77 -48  clang/lib/Sema/SemaTemplate.cpp
+93 -16  clang/include/clang/AST/ASTContext.h
+1,571 -412  79 files not shown
+2,279 -750  85 files

LLVM/project 326a9fa lld/MachO ConcatOutputSection.h ConcatOutputSection.cpp, lld/MachO/Arch ARM64.cpp

[lld][MachO] Key branch-extension thunks on (referent, addend) (#191808)

TextOutputSection::finalize ignored branch relocation addends. Two call
sites branching to the same symbol with different addends therefore
collapsed onto a single thunk.

Key thunkMap on (isec, value, addend) so two call sites with different
addends get independent thunks. The addend is encoded in the thunk's
relocs and is zeroed at the call site after the callee is redirected to
the thunk. Thunk names carry a `+N` suffix when the addend is non-zero.
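
The keying change can be sketched in standalone C++. This is not lld's code: `ThunkKey`, `ThunkTable`, the integer `sectionId` standing in for `isec`, and the `.thunk` naming scheme are all assumptions made for illustration. Only the idea of including the addend in the key and appending `+N` for a non-zero addend comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical key: (section, offset in section, relocation addend).
struct ThunkKey {
  int sectionId;
  uint64_t value;
  int64_t addend;
  bool operator<(const ThunkKey &o) const {
    return std::tie(sectionId, value, addend) <
           std::tie(o.sectionId, o.value, o.addend);
  }
};

class ThunkTable {
public:
  // Returns the thunk name for this key, creating it on first use.
  const std::string &getOrCreate(const std::string &sym, ThunkKey key) {
    auto it = thunks.find(key);
    if (it == thunks.end()) {
      std::string name = sym + ".thunk"; // naming scheme is an assumption
      if (key.addend != 0)
        name += "+" + std::to_string(key.addend);
      it = thunks.emplace(key, std::move(name)).first;
    }
    return it->second;
  }
  std::size_t size() const { return thunks.size(); }

private:
  std::map<ThunkKey, std::string> thunks;
};

// Two call sites with different addends get two thunks; a repeated
// (referent, addend) pair reuses the existing one.
std::size_t thunksForTwoAddends() {
  ThunkTable table;
  table.getOrCreate("_foo", {1, 0x40, 0});
  table.getOrCreate("_foo", {1, 0x40, 8});
  table.getOrCreate("_foo", {1, 0x40, 8});
  assert(table.size() == 2);
  return table.size();
}
```

If the addend were left out of the key, the second and third calls would land on the thunk created by the first, which is the collapse the commit describes.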
Delta  File
+80 -0  lld/test/MachO/arm64-thunk-branch-addend.s
+45 -17  lld/MachO/ConcatOutputSection.h
+32 -14  lld/MachO/ConcatOutputSection.cpp
+6 -4  lld/MachO/Arch/ARM64.cpp
+2 -1  lld/MachO/Target.h
+1 -1  lld/MachO/InputSection.cpp
+166 -37  1 file not shown
+168 -37  7 files

LLVM/project 7514309 llvm/lib/CodeGen AtomicExpandPass.cpp, llvm/test/CodeGen/AMDGPU unsupported-atomics.ll

[AtomicExpandPass] Improve atomic expand error messages (#188380)

AtomicExpandPass tells you that an operation is not supported but not why.
Delta  File
+67 -23  llvm/lib/CodeGen/AtomicExpandPass.cpp
+25 -0  llvm/test/CodeGen/NVPTX/atomic-alignment.err.ll
+9 -9  llvm/test/CodeGen/AMDGPU/unsupported-atomics.ll
+8 -8  llvm/test/CodeGen/NVPTX/atomicrmw-expand.err.ll
+2 -2  llvm/test/CodeGen/NVPTX/load-store-atomic.err.ll
+2 -2  llvm/test/CodeGen/NVPTX/atomics-b128.ll
+113 -44  1 file not shown
+114 -45  7 files

LLVM/project 18bed37 offload CMakeLists.txt, offload/plugins-nextgen/cuda CMakeLists.txt

[offload][OpenMP] Require CUDA 11.8 (#191100)
Delta  File
+21 -0  offload/plugins-nextgen/cuda/src/rtl.cpp
+6 -5  openmp/docs/Building.md
+5 -1  offload/plugins-nextgen/cuda/CMakeLists.txt
+4 -0  offload/CMakeLists.txt
+1 -1  offload/test/CMakeLists.txt
+1 -1  offload/unittests/CMakeLists.txt
+38 -8  6 files

LLVM/project cf53623 clang/lib/Frontend CompilerInstance.cpp, clang/lib/Lex PPDirectives.cpp Preprocessor.cpp

Reapply "[ObjC][Preprocessor] Handle @import directive as a pp-directive" (#189174)

This PR reapplies https://github.com/llvm/llvm-project/pull/157726.

Depends: https://github.com/llvm/llvm-project/pull/107168
This patch handles `@import` as a preprocessing directive. As of this
patch, the following import directive is ill-formed:
```
@import Foo\n;
```

---------

Signed-off-by: yronglin <yronglin777 at gmail.com>
Delta  File
+71 -24  clang/lib/Lex/PPDirectives.cpp
+1 -92  clang/lib/Lex/Preprocessor.cpp
+42 -0  clang/test/Modules/objc-at-import.m
+33 -3  clang/lib/Lex/Lexer.cpp
+14 -13  clang/lib/Frontend/CompilerInstance.cpp
+7 -11  clang/lib/Lex/DependencyDirectivesScanner.cpp
+168 -143  9 files not shown
+188 -178  15 files

LLVM/project e0e2c8d clang/lib/CIR/CodeGen CIRGenClass.cpp, clang/test/CIR/CodeGen base-init-eh.cpp

[CIR] Implement EH handling for base class initializer (#192358)

This implements exception handling when a base class initializer is
called from a derived class's constructor. The cleanup handler to call
the base class dtor was already implemented. We just needed to push the
cleanup on the EH stack.
Delta  File
+129 -0  clang/test/CIR/CodeGen/base-init-eh.cpp
+2 -1  clang/lib/CIR/CodeGen/CIRGenClass.cpp
+131 -1  2 files

LLVM/project ed55c8e clang/include/clang/AST ASTContext.h, clang/lib/AST ASTContext.cpp ItaniumMangle.cpp

[clang] implement CWG2064: ignore value dependence for decltype

The 'decltype' of a value-dependent (but non-type-dependent) expression should
be known, so this patch makes such types non-opaque instead.

This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.

This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.

Fixes #8740
Fixes #61818
Fixes #190388
Delta  File
+887 -175  clang/lib/AST/ASTContext.cpp
+287 -12  clang/test/SemaTemplate/instantiation-dependence.cpp
+151 -93  clang/lib/AST/ItaniumMangle.cpp
+76 -68  clang/lib/AST/Type.cpp
+77 -48  clang/lib/Sema/SemaTemplate.cpp
+93 -16  clang/include/clang/AST/ASTContext.h
+1,571 -412  76 files not shown
+2,270 -745  82 files

LLVM/project ce435dd llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

Address comments from https://github.com/llvm/llvm-project/pull/188658

Change-Id: Ia94c567a753168c1ffa16dc5d91195e7dd0ba044
Delta  File
+114 -114  llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+3 -3  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+117 -117  2 files

LLVM/project 2537596 clang/include/clang/Basic DiagnosticLexKinds.td Module.h, clang/include/clang/Lex ModuleMap.h

[clang][modules] Diagnose headers owned by multiple modules (#188538)

Add -Wduplicate-header-ownership, an off-by-default warning that fires
at include time when a header is owned by multiple top-level modules.
This helps catch overlapping module maps that can cause confusing module
resolution.

Assisted-by: claude-opus-4.6
Delta  File
+159 -0  clang/test/Modules/duplicate-header-ownership.c
+105 -12  clang/lib/Lex/ModuleMap.cpp
+23 -4  clang/include/clang/Lex/ModuleMap.h
+7 -0  clang/include/clang/Basic/DiagnosticLexKinds.td
+3 -0  clang/include/clang/Basic/Module.h
+297 -16  5 files

LLVM/project 8d5a719 clang/include/clang/ScalableStaticAnalysisFramework/Analyses/PointerFlow PointerFlow.h, clang/lib/ScalableStaticAnalysisFramework/Analyses SSAFAnalysesCommon.cpp

clean up code
Delta  File
+45 -66  clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowExtractor.cpp
+14 -13  clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlow.cpp
+1 -1  clang/include/clang/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlow.h
+1 -1  clang/lib/ScalableStaticAnalysisFramework/Analyses/SSAFAnalysesCommon.cpp
+1 -1  clang/test/Analysis/Scalable/PointerFlow/tu-summary-serialization.test
+62 -82  5 files

LLVM/project 561cf0c llvm/include/llvm Pass.h, llvm/include/llvm/IR PassTimingInfo.h

[NFC] Move TimePasses globals from Pass.h to PassTimingInfo.h (#192352)

They don't belong in the legacy pass manager-specific header: they apply
to both pass managers, and the pass manager isn't the right layer to
put the bools anyway.
Delta  File
+0 -11  llvm/include/llvm/Pass.h
+11 -0  llvm/include/llvm/IR/PassTimingInfo.h
+1 -4  llvm/lib/IRReader/IRReader.cpp
+1 -0  llvm/unittests/IR/TimePassesTest.cpp
+1 -0  llvm/lib/Target/AMDGPU/AMDGPUSplitModule.cpp
+1 -0  llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+15 -15  4 files not shown
+19 -15  10 files

LLVM/project 975fda5 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Add comment

Change-Id: I2180bba631fe4a01ed3c3fbcfa8c19cbefa84133
Delta  File
+1 -0  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+1 -0  1 file

LLVM/project 5be815a llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

clang-format

Change-Id: I534b1a979f55339a814ef3416c2492252845add5
Delta  File
+6 -3  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+6 -3  1 file

LLVM/project f892036 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.h

Add a comment

Change-Id: I447f7f1fb185b18924cfd98249b5a0a05fef2484
Delta  File
+7 -0  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+7 -0  1 file

LLVM/project 996914c llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Add back tryLatency

Change-Id: I12d4f255c48ed77ba927eb3b192e5903f1f5e24f
Delta  File
+7 -1  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+7 -1  1 file

LLVM/project 17f284c llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Make fence heuristic work bottom-up

Change-Id: I629cbc8905b87a962e8b123287e5f60a3154df6b
Delta  File
+19 -17  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+19 -17  1 file

LLVM/project 9cc5d20 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp AMDGPUCoExecSchedStrategy.h, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

[AMDGPU] Add MemoryPipeline scheduling to Coexec sched

Change-Id: I52c476834155823d1ba998cdbbcb3ad6a7e6f2f5
Delta  File
+323 -0  llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+77 -23  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+18 -0  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+418 -23  3 files

LLVM/project 48408f9 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Remove unused function

Change-Id: I9f2de1497f793d2848dedaf645e21e07a4ba82d6
Delta  File
+0 -60  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+0 -60  1 file

LLVM/project a3af640 flang/include/flang/Evaluate tools.h, flang/lib/Evaluate tools.cpp

[flang][cuda] Avoid false positive on multi device symbol with components (#192177)

Semantics was wrongly flagging derived-type components as two
device-resident objects. Update how we collect symbols and count the
number of device-resident objects.
Delta  File
+34 -0  flang/lib/Evaluate/tools.cpp
+26 -0  flang/test/Lower/CUDA/cuda-data-transfer.cuf
+11 -0  flang/include/flang/Evaluate/tools.h
+2 -1  flang/lib/Semantics/check-cuda.cpp
+73 -1  4 files

LLVM/project f834a48 clang/lib/CIR/CodeGen CIRGenCall.cpp, clang/test/CIR/CodeGen trivial-abi.cpp

[CIR][ABI] Handle callee-destructed params for trivial_abi (#191257)

Replace errorNYI for isParamDestroyedInCallee with a working
implementation: create aggregate temp, mark externally destructed,
emit expr. Unblocks [[trivial_abi]] types on Itanium ABI.

Adds trivial-abi.cpp test covering 17 cases from
CodeGenCXX/trivial_abi.cpp with CIR/LLVM/OGCG checks.

Made with [Cursor](https://cursor.com)
Delta  File
+316 -0  clang/test/CIR/CodeGen/trivial-abi.cpp
+20 -5  clang/lib/CIR/CodeGen/CIRGenCall.cpp
+336 -5  2 files

LLVM/project 0a4d3b3 clang/test/CIR/CodeGen attr-noundef.cpp, clang/test/CIR/CodeGenCXX uncopyable-args.cpp x86_64-arguments.cpp

[CIR][ABI][NFC] Add x86_64 ABI parity tests (#191259)

Add three test files for CIR ABI parity on x86_64, all with
CIR/LLVM/OGCG checks:

- uncopyable-args.cpp: 24 functions covering non-copyable and
  move-only types (trivial, default-ctor, move-ctor, etc.)
- x86_64-arguments.cpp: 26 functions covering C++ struct passing,
  inheritance, member pointers, empty bases, packed structs
- attr-noundef.cpp: 26 functions covering noundef placement on
  structs, unions, vectors, member pointers, _BitInt

Made with [Cursor](https://cursor.com)
Delta  File
+464 -0  clang/test/CIR/CodeGenCXX/uncopyable-args.cpp
+252 -0  clang/test/CIR/CodeGenCXX/x86_64-arguments.cpp
+235 -0  clang/test/CIR/CodeGen/attr-noundef.cpp
+951 -0  3 files

LLVM/project b2af653 clang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/CodeGen CIRGenStmt.cpp CIRGenClass.cpp

[CIR][NFC] Convert MissingFeatures::requiresCleanups to errorNYI (#192350)

This change adds errorNYI calls in two places where we previously had
requiresCleanups() missing feature markers, adds a more specific
missing feature marker for loops, removes one requiresCleanups() where
the handling was already implemented, and deletes a bunch of missing
feature markers that were never used.
Delta  File
+4 -4  clang/lib/CIR/CodeGen/CIRGenStmt.cpp
+4 -3  clang/lib/CIR/CodeGen/CIRGenClass.cpp
+1 -6  clang/include/clang/CIR/MissingFeatures.h
+0 -2  clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+9 -15  4 files

LLVM/project 5b979f5 bolt/lib/Passes Instrumentation.cpp

[BOLT][Passes] use ADT containers for instrumentation spanning tree. (#192289)

Swap `std::unordered_map<…, std::set<…>>` for
`DenseMap<…, SmallVector<…>>` in `Instrumentation::instrumentFunction`
and switch read paths from `STOutSet[&BB]` to `find()`. This removes
per-set heap allocations, stops inserting empty buckets on every probe,
and replaces linear `is_contained()` scans over a red-black tree with
linear scans over inline `SmallVector` storage (most basic blocks have
at most a couple of spanning-tree out-edges). NFC.
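
The "empty buckets on every probe" point can be shown with standard containers. The sketch below uses `std::unordered_map` and `std::vector` as stand-ins for the ADT containers, and the names `OutEdges`, `hasEdgeSubscript`, and `hasEdgeFind` are hypothetical; it is not the BOLT code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Block id -> ids of its spanning-tree successors (stand-in types).
using OutEdges = std::unordered_map<int, std::vector<int>>;

// Probing with operator[] default-constructs an entry for a missing key.
bool hasEdgeSubscript(OutEdges &edges, int from, int to) {
  const std::vector<int> &out = edges[from];
  return std::find(out.begin(), out.end(), to) != out.end();
}

// Probing with find() leaves the map untouched.
bool hasEdgeFind(const OutEdges &edges, int from, int to) {
  auto it = edges.find(from);
  if (it == edges.end())
    return false;
  const std::vector<int> &out = it->second;
  return std::find(out.begin(), out.end(), to) != out.end();
}

// Returns the map size after probing a missing key both ways.
std::size_t entriesAfterProbing() {
  OutEdges edges;
  edges[1].push_back(2);
  bool viaFind = hasEdgeFind(edges, 7, 1);
  assert(edges.size() == 1); // find() added nothing
  bool viaSubscript = hasEdgeSubscript(edges, 7, 1);
  assert(viaFind == viaSubscript); // same answer either way
  return edges.size();             // operator[] added an empty entry
}
```

Both probes give the same answer, so the switch is behavior-preserving; only the side effect on the map differs.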
Delta  File
+12 -7  bolt/lib/Passes/Instrumentation.cpp
+12 -7  1 file

LLVM/project eab567a llvm/lib/Target/AMDGPU SIInstructions.td, llvm/test/CodeGen/AMDGPU sub.v2i16.ll add.v2i16.ll

[AMDGPU] Add true16 patterns for build_vector (vgpr, 0) (#192147)

It is shorter than the VOP3 `and` instruction and in some cases
can save a second move.
Delta  File
+8 -10  llvm/test/CodeGen/AMDGPU/sub.v2i16.ll
+8 -10  llvm/test/CodeGen/AMDGPU/add.v2i16.ll
+7 -5  llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll
+4 -8  llvm/test/CodeGen/AMDGPU/flat-saddr-load.ll
+10 -0  llvm/lib/Target/AMDGPU/SIInstructions.td
+2 -8  llvm/test/CodeGen/AMDGPU/divergence-driven-buildvector.ll
+39 -41  4 files not shown
+48 -50  10 files

LLVM/project d430d89 clang/lib/CIR/Dialect/Transforms LoweringPrepare.cpp, clang/test/CIR/CodeGenCUDA device-stub.cu

[CIR][CUDA] Do Runtime Kernel Registration (#188926)

Related:
https://github.com/issues/assigned?issue=llvm%7Cllvm-project%7C179278,
https://github.com/llvm/llvm-project/issues/175871

More registration shenanigans: generates `__cuda_register_globals`,
which registers the kernels that carry `__global__` qualifiers with the
runtime and associates them with the fatbin.

Generated equivalent runtime code:

``` C
  // Called once per kernel to register it with the CUDA runtime.
  void __cuda_register_globals(void **fatbinHandle) {
      __cudaRegisterFunction(
          fatbinHandle,
          (const char *)&_Z25__device_stub__kernelfunciii, // host-side stub ptr
          (char *)__cuda_kernelname_str,                   // device-side mangled name

    [13 lines not shown]
Delta  File
+119 -2  clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+28 -2  clang/test/CIR/CodeGenCUDA/device-stub.cu
+147 -4  2 files

LLVM/project 5dc1fd4 clang/lib/CIR/CodeGen CIRGenCall.cpp, clang/test/CIR/CodeGen amdgpu-call-addrspace-cast.cpp

[CIR] Add address space casts for pointer arguments when creating a call (#192303)

This patch checks whether the expected type for an argument is the same
as the actual type. If both types are pointers but in different address
spaces, it adds an address space cast to make the pointer types match.

Assisted-by: Cursor / Claude Opus 4.6
Delta  File
+47 -0  clang/test/CIR/CodeGen/amdgpu-call-addrspace-cast.cpp
+12 -0  clang/lib/CIR/CodeGen/CIRGenCall.cpp
+59 -0  2 files

LLVM/project bbc6a54 mlir/include/mlir/Dialect/XeGPU/IR XeGPUOps.td, mlir/lib/Dialect/XeGPU/IR XeGPUOps.cpp

[MLIR][XeGPU] Remove create tdesc & update offset op from xegpu dialect (#182804)

This PR removes create_tdesc and update_offset ops from the XeGPU
dialect, as scatter load/store/prefetch now accept memref+offsets
directly.
Delta  File
+132 -300  mlir/test/Dialect/XeGPU/invalid.mlir
+1 -287  mlir/test/Dialect/XeGPU/ops.mlir
+14 -219  mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+1 -202  mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
+0 -168  mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp
+1 -150  mlir/test/Dialect/XeGPU/xegpu-blocking.mlir
+149 -1,326  12 files not shown
+187 -1,758  18 files

LLVM/project 41d96d5 bolt/lib/Rewrite RewriteInstance.cpp

[BOLT] Add guardrails around reading malformed input
Delta  File
+27 -8  bolt/lib/Rewrite/RewriteInstance.cpp
+27 -8  1 file