LLVM/project ee0b9c6 clang/include/clang/Basic arm_sve.td, clang/include/clang/Interpreter IncrementalExecutor.h

Add missing newlines at EOF; NFC (#203483)
Delta File
+1-1 clang/include/clang/Basic/arm_sve.td
+1-1 clang/include/clang/Interpreter/IncrementalExecutor.h
+1-1 clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+1-1 clang/unittests/AST/ASTExprTest.cpp
+1-1 llvm/unittests/Transforms/IPO/AttributorTestBase.h
+1-1 llvm/unittests/Transforms/IPO/MergeFunctionsTest.cpp
+6-6 2 files not shown
+8-8 8 files

LLVM/project 97a79ee llvm/lib/CodeGen/GlobalISel IRTranslator.cpp, llvm/test/CodeGen/AArch64/GlobalISel translate-gep.ll

[GlobalISel] Avoid redundant copy for zero-offset GEPs (#203029)

Handle zero-offset GEPs early to avoid creating a separate vreg for the
GEP result and copying the base pointer into it.

Improves CTMark geomean -0.14% on aarch64-O0-g, with consumer-typeset
-0.86%.

https://llvm-compile-time-tracker.com/compare.php?from=2de2edb943fe1b83d79bdffa03606eb8c5452e9b&to=d3d5a4af0e7a58ea7b3a1e8c02b34fa380695e62&stat=instructions%3Au

Assisted-by: codex
Delta File
+18-24 llvm/test/CodeGen/Mips/GlobalISel/irtranslator/aggregate_struct_return.ll
+3-4 llvm/test/CodeGen/Mips/GlobalISel/irtranslator/sret_pointer.ll
+2-4 llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll
+3-0 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+1-2 llvm/test/CodeGen/AArch64/GlobalISel/translate-gep.ll
+27-34 5 files

LLVM/project 7ff58e4 llvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize/X86 nondetermisitic-widening-cost.ll

[LoopVectorize] Fix nondeterminism in loop-vectorize (#200833)

The nondeterministic iteration over `AddrDefs` (SmallPtrSet) causes
nondeterministic output for the test case in this patch (reduced from a
C codebase). One of two different outputs is generated arbitrarily,
chosen roughly equally.

Across the two outputs, the instruction
   `%3 = load i64, ptr %2, align 8`
has an associated cost of either 4 or 9. The instruction is visited
twice in `setCostBasedWideningDecision` in the `AddrDefs` loop: once
directly as an element of `AddrDefs`, and the other time indirectly in
the lambda `UpdateMemOpUserCost` as a User of another `AddrDefs`
element. Each time, `setWideningDecision` is called with a
different cost value; whichever call runs last sets the final value
(the earlier one is overwritten). Because `AddrDefs` iteration is
nondeterministic, the order of those two calls to `setWideningDecision`
is also nondeterministic, hence we see two different costs arbitrarily
between runs.
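
The overwrite-order effect described above can be illustrated with a small
standalone sketch. It does not use LLVM: `Visit`, `applyVisits`, and
`finalCost` are hypothetical names, and 4 and 9 are simply the two costs
quoted in the message, with no claim about which code path produces which.

```c++
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one setWideningDecision-style call:
// the instruction it targets and the cost it stores.
struct Visit {
  std::string Inst;
  int Cost;
};

// Apply the visits in the given order. Each store overwrites any earlier
// one for the same instruction, so the last visit wins.
std::map<std::string, int> applyVisits(const std::vector<Visit> &Order) {
  std::map<std::string, int> Decisions;
  for (const Visit &V : Order)
    Decisions[V.Inst] = V.Cost;
  return Decisions;
}

// Final cost recorded for Inst after all visits have run.
int finalCost(const std::vector<Visit> &Order, const std::string &Inst) {
  return applyVisits(Order).at(Inst);
}

// Same two visits, two iteration orders, two different results.
void demoOrderDependence() {
  std::vector<Visit> DirectFirst = {{"%3", 4}, {"%3", 9}};
  std::vector<Visit> DirectLast = {{"%3", 9}, {"%3", 4}};
  assert(finalCost(DirectFirst, "%3") == 9);
  assert(finalCost(DirectLast, "%3") == 4);
}
```

With a container whose iteration order depends on pointer values, both
orders can occur between runs, which matches the two outputs described.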

    [13 lines not shown]
Delta File
+147-0 llvm/test/Transforms/LoopVectorize/X86/nondetermisitic-widening-cost.ll
+2-2 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+149-2 2 files

LLVM/project fecfe32 clang/test/CodeGen/AArch64/neon add.c intrinsics.c

[Clang][CIR] Move AArch64 Add tests (nfc) (#203543)

Follow-up for https://github.com/llvm/llvm-project/pull/202005
Delta File
+247-1 clang/test/CodeGen/AArch64/neon/add.c
+0-247 clang/test/CodeGen/AArch64/neon/intrinsics.c
+247-248 2 files

LLVM/project 8608209 llvm/lib/ObjectYAML ELFEmitter.cpp, llvm/test/tools/yaml2obj/ELF bb-addr-map.yaml

[ObjectYAML] Make BBAddrMap encoder diagnostics format-neutral (#202524)

In preparation for sharing the yaml2obj BBAddrMap encoder with COFF.
1. Drop the now-dead `Section.Type == SHT_LLVM_BB_ADDR_MAP` guards (#146186).
2. Reword the two warnings that will move into the shared helper.
3. Fix a "PBOBBEntries" -> "PGOBBEntries" typo.
Delta File
+14-16 llvm/lib/ObjectYAML/ELFEmitter.cpp
+1-1 llvm/test/tools/yaml2obj/ELF/bb-addr-map.yaml
+15-17 2 files

LLVM/project acdfd6b clang/lib/AST/ByteCode Descriptor.cpp

[clang][bytecode] Char<Signed> doesn't need a descriptor ctor (#203748)

This was lost when adding Char for 8-bit data.
Delta File
+2-2 clang/lib/AST/ByteCode/Descriptor.cpp
+2-2 1 file

LLVM/project 0b6faee llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange inner-latch-lcssa-feeds-exit-condition.ll

[LoopInterchange] Reject inner-latch lcssa PHI feeding the exit condition (#202863)

In a multi-level nest, an lcssa PHI in the inner loop latch that feeds
the latch's exit condition can be left with a stale incoming block after
a subsequent interchange rewires the CFG, producing invalid IR. This
happened even when the outer latch had a single predecessor, where the
legality check returned early. Instead, reject the interchange when such
a PHI feeds the exit condition.

Fixes #202027
Delta File
+81-0 llvm/test/Transforms/LoopInterchange/inner-latch-lcssa-feeds-exit-condition.ll
+41-27 llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+122-27 2 files

LLVM/project 0e21d7c llvm/test/Analysis/DependenceAnalysis gcd-miv-addrec-wrap.ll

[DA] Add test for addrec can wrap in GCD MIV (NFC) (#203526)

This patch adds a test that should have been included in #186892. The
test demonstrates a case where the GCD MIV test would miss a dependency
if the presence of nsw flags were not checked.
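
Why wrapping matters for a GCD-style test can be shown with a scaled-down
sketch. This is not the test added by the patch and does not use LLVM; it
assumes 8-bit subscripts so brute force is cheap, and `euclidGcd`,
`gcdTestSaysIndependent`, and `accessesCollide` are hypothetical helpers.
For accesses `A[3*i]` and `A[3*j + 1]`, the GCD reasoning says no integer
solution exists, yet with 8-bit wraparound the subscripts do meet.

```c++
#include <cassert>
#include <cstdint>

// Euclid's algorithm on non-negative integers.
int euclidGcd(int A, int B) {
  while (B != 0) {
    int T = A % B;
    A = B;
    B = T;
  }
  return A;
}

// GCD reasoning for A[a*i] vs A[b*j + c]: the equation a*i - b*j = c has
// an integer solution only if gcd(a, b) divides c.
bool gcdTestSaysIndependent(int A, int B, int C) {
  return C % euclidGcd(A, B) != 0;
}

// Brute-force check over 256 x 256 iterations. With Wrap set, subscripts
// are truncated to 8 bits, modelling an addrec that is allowed to wrap.
bool accessesCollide(int A, int B, int C, bool Wrap) {
  for (int I = 0; I < 256; ++I)
    for (int J = 0; J < 256; ++J) {
      int Lhs = A * I;
      int Rhs = B * J + C;
      if (Wrap) {
        Lhs = static_cast<std::uint8_t>(Lhs);
        Rhs = static_cast<std::uint8_t>(Rhs);
      }
      if (Lhs == Rhs)
        return true;
    }
  return false;
}

void demoWrapBreaksGcdTest() {
  assert(gcdTestSaysIndependent(3, 3, 1));   // gcd is 3, which does not divide 1
  assert(!accessesCollide(3, 3, 1, false));  // correct without wrapping
  assert(accessesCollide(3, 3, 1, true));    // i = 171, j = 0: 513 wraps to 1
}
```

In this toy setting the GCD conclusion is only sound when the subscript
arithmetic cannot wrap, which is the role the message assigns to the nsw
check.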
Delta File
+73-0 llvm/test/Analysis/DependenceAnalysis/gcd-miv-addrec-wrap.ll
+73-0 1 file

LLVM/project 66e5a88 clang/lib/AST/ByteCode Function.h Function.cpp

[clang][bytecode] Add an `ExplicitThisParam` flag to `Function` (#203672)

We unfortunately have to check this for every function call, so don't
consult the decl every time here.
Delta File
+6-12 clang/lib/AST/ByteCode/Function.h
+5-0 clang/lib/AST/ByteCode/Function.cpp
+11-12 2 files

LLVM/project 713acbc llvm/include/llvm/MC DXContainerInfo.h

[NFC][MC] Initialize all fields of DebugName::Parameters in default constructor (#202701)

Initialize both the **Flags** and **NameLength** fields of the
**DebugNameHeader** structure.
Delta File
+1-1 llvm/include/llvm/MC/DXContainerInfo.h
+1-1 1 file

LLVM/project 43a0be0 llvm/lib/Target/X86 X86DomainReassignment.cpp, llvm/test/CodeGen/X86 domain-reassignment-closure-stats.mir

[X86] Record the enclosed register in X86DomainReassignment::buildClosure (#202534)

buildClosure recorded the seed register Reg in the function-wide
EnclosedEdges map on every worklist iteration instead of CurReg, the
register actually being added to the closure. EnclosedEdges therefore
only ever contained the seed of each closure.

The driver loop in runOnMachineFunction skips registers already present
in EnclosedEdges before starting a new closure. Because only seeds were
recorded, every non-seed member of an already-built closure looked like
a fresh seed, so a redundant closure was built for it and then
immediately discarded by the EnclosedInstrs cross-closure check. The
emitted code is unchanged; the pass just performed redundant work
proportional to closure size.

Key EnclosedEdges by CurReg so each enclosed register is recorded once.
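
The redundant work described above can be reproduced in a toy model. This
is a sketch, not the pass itself: registers are plain integers, `Graph` is
a hypothetical connectivity map, and the `RecordCurReg` switch exists only
to compare the old and new recording behaviour side by side.

```c++
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Reg = int;
// Hypothetical connectivity: each register maps to the registers it is
// linked to through shared instructions.
using Graph = std::map<Reg, std::vector<Reg>>;

// Walk everything reachable from Seed. With RecordCurReg set, every visited
// register is recorded in Enclosed; otherwise only Seed is recorded on each
// iteration, mirroring the behaviour the message describes as the bug.
void buildClosure(const Graph &G, Reg Seed, std::set<Reg> &Enclosed,
                  bool RecordCurReg) {
  std::vector<Reg> Worklist{Seed};
  std::set<Reg> Visited;
  while (!Worklist.empty()) {
    Reg CurReg = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(CurReg).second)
      continue;
    Enclosed.insert(RecordCurReg ? CurReg : Seed);
    auto It = G.find(CurReg);
    if (It == G.end())
      continue;
    for (Reg Next : It->second)
      Worklist.push_back(Next);
  }
}

// Driver loop: start a closure for every register not yet recorded and
// return how many closures were built.
int countClosuresBuilt(const Graph &G, bool RecordCurReg) {
  std::set<Reg> Enclosed;
  int Built = 0;
  for (const auto &Entry : G) {
    if (Enclosed.count(Entry.first))
      continue;
    buildClosure(G, Entry.first, Enclosed, RecordCurReg);
    ++Built;
  }
  return Built;
}

void demoClosureCounts() {
  Graph Chain = {{1, {2}}, {2, {1, 3}}, {3, {2}}}; // one connected component
  assert(countClosuresBuilt(Chain, /*RecordCurReg=*/true) == 1);
  assert(countClosuresBuilt(Chain, /*RecordCurReg=*/false) == 3);
}
```

In the three-register chain, recording only the seed makes the driver
rebuild the same closure once per member, so the extra work grows with
closure size.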

This was found as part of @jlebar's X86 LLVM bug hunt / FuzzX effort:


    [2 lines not shown]
Delta File
+92-0 llvm/test/CodeGen/X86/domain-reassignment-closure-stats.mir
+3-1 llvm/lib/Target/X86/X86DomainReassignment.cpp
+95-1 2 files

LLVM/project 0f572a5 clang/lib/AST/ByteCode Opcodes.td, clang/utils/TableGen ClangOpcodesEmitter.cpp

[clang][bytecode] Add an on-by-default `CanFail` flag to opcodes (#203671)

We have several opcodes that can't fail, so add a flag to them
indicating that they always return `true` anyway.

This simplifies the generated code from e.g.
```c++
PRESERVE_NONE
static bool Interp_Activate(InterpState &S, CodePtr &PC) {
  if (!Activate(S, PC))
    return false;
#if USE_TAILCALLS
  MUSTTAIL return InterpNext(S, PC);
#else
  return true;
#endif
}
```


    [12 lines not shown]
Delta File
+45-22 clang/lib/AST/ByteCode/Opcodes.td
+24-6 clang/utils/TableGen/ClangOpcodesEmitter.cpp
+69-28 2 files

LLVM/project 57a5646 llvm/test/CodeGen/AMDGPU fcanonicalize.ll llvm.amdgcn.sched.group.barrier.ll, llvm/test/CodeGen/X86 vector-interleaved-store-i16-stride-7.ll vector-interleaved-store-i16-stride-6.ll

Merge branch 'main' into users/kasuga-fj/da-add-test-addrec-wrap-gcdmiv
Delta File
+3,204-3,450 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+1,905-2,037 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+2,760-227 llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,813-654 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+812-846 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-5.ll
+1,357-0 llvm/test/CodeGen/AMDGPU/maximumnum.ll
+11,851-7,214 802 files not shown
+43,491-17,605 808 files

LLVM/project e170fb1 llvm/test/Analysis/DependenceAnalysis gcd-miv-addrec-wrap.ll

address review
Delta File
+3-3 llvm/test/Analysis/DependenceAnalysis/gcd-miv-addrec-wrap.ll
+3-3 1 file

LLVM/project 774f486 clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-extended-image.hip

[CIR][AMDGPU] Adds lowering for amdgcn extended image sample/gather4 builtins (#201761)

Add support for lowering the `__builtin_amdgcn_image_sample/gather4`
AMDGPU builtins to ClangIR.
This follows the existing Clang-to-LLVM-IR lowering in
`clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp`.

Upstreaming clangIR PR:
[llvm/clangir#2083](https://github.com/llvm/clangir/pull/2083)
Delta File
+374-0 clang/test/CIR/CodeGenHIP/builtins-amdgcn-extended-image.hip
+50-12 clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+424-12 2 files

LLVM/project c236ef5 llvm/include/llvm InitializePasses.h, llvm/include/llvm/CodeGen CFIFixup.h

[CodeGen][NewPM] Port cfi-fixup to new pass manager (#203692)

Standard work for `cfi-fixup`.
Delta File
+14-4 llvm/lib/CodeGen/CFIFixup.cpp
+9-2 llvm/include/llvm/CodeGen/CFIFixup.h
+1-1 llvm/include/llvm/InitializePasses.h
+1-1 llvm/include/llvm/Passes/MachinePassRegistry.def
+1-1 llvm/lib/CodeGen/CodeGen.cpp
+1-1 llvm/lib/CodeGen/TargetPassConfig.cpp
+27-10 5 files not shown
+32-11 11 files

LLVM/project ad611b6 llvm/lib/Target/X86 X86ISelLoweringCall.cpp, llvm/test/CodeGen/X86 abi-isel.ll

[X86] Do not hold GOT base for indirect call or absolute address (#203192)

Fixes:
https://github.com/llvm/llvm-project/pull/202370#discussion_r3384983368

Assisted-by: Claude Sonnet 4.6
Delta File
+32-40 llvm/test/CodeGen/X86/abi-isel.ll
+6-5 llvm/lib/Target/X86/X86ISelLoweringCall.cpp
+38-45 2 files

LLVM/project 05efe1b clang/test/Sema warn-lifetime-safety.cpp, clang/test/Sema/LifetimeSafety safety.cpp

rebase

Created using spr 1.3.7
Delta File
+3,204-3,450 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+1,905-2,037 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+3,716-0 clang/test/Sema/LifetimeSafety/safety.cpp
+0-3,653 clang/test/Sema/warn-lifetime-safety.cpp
+2,760-227 llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,813-654 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+13,398-10,021 1,541 files not shown
+92,209-41,253 1,547 files

LLVM/project 568122a clang/include/clang/Options Options.td, clang/lib/Driver/ToolChains AMDGPU.cpp AMDGPU.h

clang/AMDGPU: Split out target ID flags in TranslateArgs. (#203750)

Change how xnack and sramecc are processed. Introduce
-mxnack/-mno-xnack and -msramecc/-mno-sramecc flags.
When the target is first parsed in TranslateArgs, synthesize
the appropriate flag for the toolchain. This avoids
special case feature string fixups in getAMDGPUTargetFeatures,
and also avoids an extra parse of the target ID.

In the future this will also simplify tracking these ABI
modifiers in a module flag.

As a side effect, these flags can override the case where the
target ID carries no specifier. They do not fully replace
the target ID syntax, as there's no way to represent compiling
both modes for the same subtarget.
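
The idea of turning target ID specifiers into flags can be sketched in
isolation. This is an illustration only, not the code in `AMDGPU.cpp`:
`synthesizeFlags` is a hypothetical helper, the input is assumed to look
like `gfx908:sramecc+:xnack-`, and only the four flag names mentioned
above are produced.

```c++
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split a target ID such as "gfx908:sramecc+:xnack-" on ':' and map each
// recognised feature specifier to the matching -m<feature> or
// -mno-<feature> flag. A feature with no specifier yields no flag.
std::vector<std::string> synthesizeFlags(const std::string &TargetID) {
  std::vector<std::string> Flags;
  std::istringstream In(TargetID);
  std::string Part;
  std::getline(In, Part, ':'); // first component is the processor name
  while (std::getline(In, Part, ':')) {
    if (Part.size() < 2)
      continue;
    char Sign = Part.back();
    std::string Name = Part.substr(0, Part.size() - 1);
    bool KnownName = Name == "xnack" || Name == "sramecc";
    bool KnownSign = Sign == '+' || Sign == '-';
    if (!KnownName || !KnownSign)
      continue; // anything else is simply ignored in this sketch
    Flags.push_back(std::string(Sign == '+' ? "-m" : "-mno-") + Name);
  }
  return Flags;
}

void demoSynthesizeFlags() {
  std::vector<std::string> Expected = {"-msramecc", "-mno-xnack"};
  assert(synthesizeFlags("gfx908:sramecc+:xnack-") == Expected);
  assert(synthesizeFlags("gfx908").empty()); // no specifier, no flag
}
```

Once the specifiers exist as ordinary flags, later stages can read them
like any other option instead of re-parsing the target ID string.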

I didn't bother trying to forward these flags on the main command
line without being specified to the offload device, but I suppose

    [2 lines not shown]
Delta File
+149-0 clang/test/Driver/amdgpu-xnack-sramecc-flags.c
+24-27 clang/lib/Driver/ToolChains/AMDGPU.cpp
+9-4 clang/test/Driver/hip-target-id.hip
+6-4 clang/lib/Driver/ToolChains/AMDGPU.h
+3-2 clang/lib/Driver/ToolChains/HIPAMD.cpp
+4-0 clang/include/clang/Options/Options.td
+195-37 5 files not shown
+203-44 11 files

LLVM/project 08940b5 clang-tools-extra/clang-tidy/bugprone NotNullTerminatedResultCheck.cpp, clang-tools-extra/clang-tidy/modernize LoopConvertUtils.cpp

[clang-tidy][NFC] Apply const-correctness to code (#203823)
Delta File
+12-10 clang-tools-extra/clang-tidy/readability/ElseAfterReturnCheck.cpp
+10-10 clang-tools-extra/clang-tidy/bugprone/NotNullTerminatedResultCheck.cpp
+6-5 clang-tools-extra/clang-tidy/readability/EnumInitialValueCheck.cpp
+6-4 clang-tools-extra/clang-tidy/modernize/LoopConvertUtils.cpp
+4-4 clang-tools-extra/clang-tidy/readability/ImplicitBoolConversionCheck.cpp
+4-4 clang-tools-extra/clang-tidy/performance/UnnecessaryCopyInitializationCheck.cpp
+42-37 20 files not shown
+78-69 26 files

LLVM/project 6f916fe clang/docs/tools dump_ast_matchers.py

[ASTMatchers][Docs] print ignoring message only when class was not documented before (#203783)
Delta File
+4-1 clang/docs/tools/dump_ast_matchers.py
+4-1 1 file

LLVM/project c775d6e libcxx/include print

[libc++] Make the body of println(FILE*) dependent on the template parameter to avoid template instantiation (#200996)

Make the function parameter of the `std::print` call inside the
`std::println` overload taking `FILE*` dependent on the template
parameter to avoid eager instantiation.
Delta File
+2-2 libcxx/include/print
+2-2 1 file

LLVM/project bab217b mlir/lib/Bindings/Python IRCore.cpp, mlir/test/python context_shutdown.py

[mlir][python] Fix segfault at interpreter shutdown with entered contexts

The thread-local context stack (`PyThreadContextEntry::getStack()`)
holds `nb::object` references to Python Context, Location, and
InsertionPoint objects. When a Context is entered via `__enter__` but
never exited before the interpreter shuts down, these references
cause a segfault during process teardown.

The crash sequence:
1. User calls `ctx.__enter__()`, pushing a frame onto the
   `static thread_local vector<PyThreadContextEntry>`.
2. The script ends; CPython runs `Py_FinalizeEx()` which tears down
   the interpreter (clears modules, destroys remaining objects).
3. `main()` returns.
4. The C runtime destroys static/thread_local storage. On the main
   thread, thread_local variables have the same destruction timing
   as static storage — they are destroyed *after* main() returns.
5. The vector destructor runs, and each `PyThreadContextEntry`'s
   `nb::object` members call `Py_DECREF` — but the interpreter is

    [8 lines not shown]
Delta File
+26-0 mlir/test/python/context_shutdown.py
+9-0 mlir/lib/Bindings/Python/IRCore.cpp
+35-0 2 files

LLVM/project b113f5b utils/bazel/llvm-project-overlay/flang/tools/flang-driver BUILD.bazel

[Bazel] Fixes 625facd (#203814)

This fixes 625facd4375f6bfa5de501d0559bd262062e2dc3.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
Delta File
+1-0 utils/bazel/llvm-project-overlay/flang/tools/flang-driver/BUILD.bazel
+1-0 1 file

LLVM/project 80ae495 clang/lib/Sema SemaExprCXX.cpp

fixup! formatting
Delta File
+1-1 clang/lib/Sema/SemaExprCXX.cpp
+1-1 1 file

LLVM/project 06b52a0 clang/test/CIR/CodeGenHIP builtins-amdgcn-extended-image.hip

[CIR][AMDGPU] Adds missing test cases
Delta File
+28-4 clang/test/CIR/CodeGenHIP/builtins-amdgcn-extended-image.hip
+28-4 1 file

LLVM/project c9cdee0 clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-extended-image.hip

[CIR][AMDGPU] Adds lowering for amdgcn extended image sample/gather4 builtins
Delta File
+350-0 clang/test/CIR/CodeGenHIP/builtins-amdgcn-extended-image.hip
+50-12 clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+400-12 2 files

LLVM/project 3f55f00 mlir/lib/Bindings/Python IRCore.cpp, mlir/test/python context_shutdown.py

[mlir][python] Fix segfault at interpreter shutdown with entered contexts

The thread-local context stack (`PyThreadContextEntry::getStack()`)
holds `nb::object` references to Python Context, Location, and
InsertionPoint objects. When a Context is entered via `__enter__` but
never exited before the interpreter shuts down, these references
cause a segfault during process teardown.

The crash sequence:
1. User calls `ctx.__enter__()`, pushing a frame onto the
   `static thread_local vector<PyThreadContextEntry>`.
2. The script ends; CPython runs `Py_FinalizeEx()` which tears down
   the interpreter (clears modules, destroys remaining objects).
3. `main()` returns.
4. The C runtime destroys static/thread_local storage. On the main
   thread, thread_local variables have the same destruction timing
   as static storage — they are destroyed *after* main() returns.
5. The vector destructor runs, and each `PyThreadContextEntry`'s
   `nb::object` members call `Py_DECREF` — but the interpreter is

    [8 lines not shown]
Delta File
+27-0 mlir/test/python/context_shutdown.py
+11-0 mlir/lib/Bindings/Python/IRCore.cpp
+38-0 2 files

LLVM/project 05c8f9b clang/lib/AST/ByteCode Interp.cpp, clang/test/AST/ByteCode codegen-cxx2a.cpp

[clang][bytecode] Override constant context state in CallVar (#203747)

We do this for regular calls, so do it for variable calls as well. Also
remove two comments that no longer have any meaning.
Delta File
+26-0 clang/test/AST/ByteCode/codegen-cxx2a.cpp
+1-6 clang/lib/AST/ByteCode/Interp.cpp
+27-6 2 files

LLVM/project c730204 clang/lib/Sema SemaExprCXX.cpp, clang/test/CXX/drs cwg5xx.cpp

[Clang] Implement CWG 2282
Delta File
+41-26 clang/lib/Sema/SemaExprCXX.cpp
+8-5 clang/test/CXX/expr/expr.unary/expr.new/p14.cpp
+7-6 clang/test/SemaCXX/new-delete.cpp
+6-0 clang/test/SemaCXX/std-align-val-t-in-operator-new.cpp
+3-2 clang/test/CXX/drs/cwg5xx.cpp
+65-39 5 files