LLVM/project 5829bb9clang/test/CodeGen/X86 avx512fp16-builtins-constrained.c

[clang][x86] AVX512FP16 constrained sqrt test coverage (#168046)

_mm_sqrt_sh / _mm512_sqrt_ph - these were missed from #167692
DeltaFile
+85-0clang/test/CodeGen/X86/avx512fp16-builtins-constrained.c
+85-01 files

LLVM/project 00000dcllvm/test/Transforms/LoopInterchange pr43176-move-to-new-latch.ll pr43326.ll

[LoopInterchange] Fix tests with loops that have BTC=0. NFC. (#167748)

Precommit test fixups for #167113
DeltaFile
+36-34llvm/test/Transforms/LoopInterchange/pr43176-move-to-new-latch.ll
+26-26llvm/test/Transforms/LoopInterchange/pr43326.ll
+5-2llvm/test/Transforms/LoopInterchange/lcssa-phi-outer-latch.ll
+3-3llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
+4-2llvm/test/Transforms/LoopInterchange/pr57148.ll
+2-1llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-4.ll
+76-686 files

LLVM/project 72c69aellvm/lib/Target/AMDGPU SIInstrInfo.cpp AMDGPURegisterBankInfo.cpp

[AMDGPU] Make use of getFunction and getMF. NFC. (#167872)

DeltaFile
+14-15llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+9-9llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
+4-4llvm/lib/Target/AMDGPU/R600InstrInfo.cpp
+3-3llvm/lib/Target/AMDGPU/AMDGPUPreloadKernelArguments.cpp
+3-3llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+3-3llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+36-3722 files not shown
+67-6928 files

LLVM/project 3890c97llvm/lib/Transforms/Scalar InferAddressSpaces.cpp, llvm/test/Transforms/InferAddressSpaces/AMDGPU phinode-address-infer.ll

[InferAddressSpaces] Fix bad `addrspacecast` insertion for phinode (#163528)

The IR verifier will crash if any instruction is located before a
phi node. In some corner cases, the `infer-address-spaces` pass tried
to insert an `addrspacecast` before a phi node. Since the pass has
already determined that the operand pointer (the phi node's incoming
value) is in `NewAS`, it is safe to insert the `addrspacecast`
immediately after the point where that value is defined.
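
The placement rule described above can be sketched on a toy model. This is not the pass's code: `ToyBlock`, `phisComeFirst` and `insertCastAfterDef` are hypothetical names, a block is modelled as a plain list of mnemonics, and skipping over trailing phi nodes when the defining instruction is itself a phi is an assumption made here to keep the toy block well formed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy basic block (hypothetical): instruction mnemonics in program order.
using ToyBlock = std::vector<std::string>;

// Models the verifier rule: all "phi" entries must form a prefix of the block.
bool phisComeFirst(const ToyBlock &BB) {
  bool SeenNonPhi = false;
  for (const std::string &Inst : BB) {
    if (Inst != "phi")
      SeenNonPhi = true;
    else if (SeenNonPhi)
      return false;
  }
  return true;
}

// Inserts a cast after the defining instruction at DefIdx. If the definition
// is a phi, the cast goes after the last phi of the block (assumption).
// Returns the index at which the cast was inserted.
std::size_t insertCastAfterDef(ToyBlock &BB, std::size_t DefIdx) {
  std::size_t Pos = DefIdx + 1;
  while (Pos < BB.size() && BB[Pos] == "phi")
    ++Pos;
  BB.insert(BB.begin() + static_cast<std::ptrdiff_t>(Pos), "addrspacecast");
  return Pos;
}
```

Inserting at the definition keeps the cast out of the phi prefix of the using block, which is the property the verifier checks.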

Co-authored-by: Kerang Mao <krmao at birentech.com>
DeltaFile
+57-0llvm/test/Transforms/InferAddressSpaces/NVPTX/phinode-address-infer.ll
+55-0llvm/test/Transforms/InferAddressSpaces/AMDGPU/phinode-address-infer.ll
+39-0llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+151-03 files

LLVM/project c021e16llvm/lib/Target/AArch64 AArch64ISelLowering.cpp MachineSMEABIPass.cpp, llvm/test/CodeGen/AArch64 sme-dynamic-tls.ll

[AArch64][SME] Handle SME state around TLS-descriptor calls (#155608)

This patch ensures we switch out of streaming mode before TLS-descriptor
calls. ZA state will also be preserved when using the new SME ABI
lowering (`-aarch64-new-sme-abi`).

Fixes #152165
DeltaFile
+159-0llvm/test/CodeGen/AArch64/sme-dynamic-tls.ll
+30-4llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+12-2llvm/lib/Target/AArch64/MachineSMEABIPass.cpp
+2-2llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+2-2llvm/lib/Target/AArch64/AArch64InstrInfo.td
+205-105 files

LLVM/project c534701lldb/test/Shell/Commands command-disassemble-aarch64-extensions.s

fixup! [AArch64] Remove FEAT_TME assembly and ACLE support

Missed one reference
DeltaFile
+1-3lldb/test/Shell/Commands/command-disassemble-aarch64-extensions.s
+1-31 files

LLVM/project 3907e93clang/lib/Headers arm_acle.h, clang/test/CodeGen/AArch64 tme.cpp

[AArch64] Remove FEAT_TME assembly and ACLE support

The Transactional Memory Extension (TME) was introduced as part of
Armv9-A but has not been adopted by the ecosystem. This mirrors what
Arm has observed with similar extensions in other architectures.

Therefore, remove FEAT_TME assembly and ACLE code from llvm, because
support for TME has now been officially withdrawn, as noted here:

```
   FEAT_TME is withdrawn from all future versions of Arm®
   Architecture Reference Manual for A-profile architecture.
```

referenced in Known Issue D24093, documented here:
   https://developer.arm.com/documentation/102105/lb-05/
DeltaFile
+0-47llvm/test/MC/AArch64/tme-error.s
+0-44llvm/test/CodeGen/AArch64/tme.ll
+0-42clang/test/CodeGen/AArch64/tme.cpp
+0-39llvm/lib/Target/AArch64/AArch64InstrFormats.td
+0-24llvm/test/MC/AArch64/tme.s
+0-22clang/lib/Headers/arm_acle.h
+0-21815 files not shown
+4-31121 files

LLVM/project afcf7e8llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp

wip
DeltaFile
+95-81llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+95-811 files

LLVM/project 40a9e34llvm/lib/Target/AArch64 AArch64SystemOperands.td AArch64InstrFormats.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

[AArch64][llvm] Add support for Permission Overlay Extension 2 (FEAT_S1POE2) (#164912)

Add assembly/disassembly support for AArch64 `FEAT_S1POE2` (Stage 1
Permission Overlay Extension 2), as blogged about here:

* https://developer.arm.com/community/arm-community-blogs/b/architectures-and-processors-blog/posts/future-architecture-technologies-poe2-and-vmte

and as documented here:

* https://developer.arm.com/documentation/109697/2025_09/Future-Architecture-Technologies

Co-authored-by: Rodolfo Wottrich <rodolfo.wottrich at arm.com>
DeltaFile
+3,263-0llvm/test/MC/AArch64/arm-poe2.s
+176-0llvm/lib/Target/AArch64/AArch64SystemOperands.td
+96-0llvm/lib/Target/AArch64/AArch64InstrFormats.td
+87-0llvm/test/MC/AArch64/arm-poe2-tlbid.s
+78-2llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+72-0llvm/test/MC/AArch64/arm-poe2-tlbid-diagnostics.s
+3,772-212 files not shown
+3,961-618 files

LLVM/project 12bf9cdllvm/include/llvm/TableGen Main.h, llvm/lib/TableGen Main.cpp

[TableGen][NFCI] Change TableGenMain() to take function_ref.

It was switched from a function pointer to std::function in

TableGen: Make 2nd arg MainFn of TableGenMain(argv0, MainFn) optional.
f675ec6165ab6add5e57cd43a2e9fa1a9bc21d81

but there's no mention of any particular reason for that.
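
For readers unfamiliar with the distinction, here is a minimal sketch of a non-owning callable reference next to an optional callback parameter. `ToyFunctionRef` is a simplified, hypothetical stand-in for `llvm::function_ref` (it only handles callable objects such as lambdas), and `toyMain` with its `bool(int &)` callback is an illustrative assumption, not the real `TableGenMain()` signature.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

template <typename Fn> class ToyFunctionRef;

// Non-owning reference to a callable: stores a pointer to the callable and a
// thunk that knows its concrete type. It never copies or allocates.
template <typename Ret, typename... Args>
class ToyFunctionRef<Ret(Args...)> {
  Ret (*Thunk)(void *, Args...) = nullptr;
  void *Obj = nullptr;

  template <typename C> static Ret call(void *P, Args... As) {
    return (*static_cast<C *>(P))(std::forward<Args>(As)...);
  }

public:
  ToyFunctionRef() = default; // "no callback supplied"

  template <typename F, typename = std::enable_if_t<!std::is_same_v<
                            std::decay_t<F>, ToyFunctionRef>>>
  ToyFunctionRef(F &&Callable)
      : Thunk(&call<std::remove_reference_t<F>>),
        Obj(const_cast<void *>(
            static_cast<const void *>(std::addressof(Callable)))) {}

  Ret operator()(Args... As) const {
    return Thunk(Obj, std::forward<Args>(As)...);
  }
  explicit operator bool() const { return Thunk != nullptr; }
};

// Hypothetical driver: the callback is optional and is only used during the
// call, so a non-owning reference is enough. Returns 1 if the callback
// reports an error (true), 0 otherwise.
int toyMain(ToyFunctionRef<bool(int &)> MainFn = {}) {
  int Output = 0;
  if (MainFn && MainFn(Output))
    return 1;
  return 0;
}
```

Because the reference does not own the callable, it must not outlive it. That is fine for a parameter consumed within the call, which is the usage sketched here.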
DeltaFile
+6-8llvm/include/llvm/TableGen/Main.h
+2-4llvm/lib/TableGen/Main.cpp
+1-1llvm/utils/TableGen/Basic/TableGen.cpp
+9-133 files

LLVM/project ad20628llvm/include/llvm/TableGen TableGenBackend.h Main.h, llvm/lib/TableGen Main.cpp TableGenBackend.cpp

[TableGen] Split *GenRegisterInfo.inc.

Reduces memory usage compiling backend sources, most notably for
AMDGPU by ~98 MB per source on average.

AMDGPUGenRegisterInfo.inc is tens of megabytes in size now, and
is even larger downstream. At the same time, it is included in
nearly all backend sources, typically just for a small portion of
its content, resulting in compilation being unnecessarily
memory-hungry, which in turn stresses buildbots and wastes their
resources.

Splitting .inc files also helps avoid extra ccache misses
where changes in .td files don't cause changes in all parts of
what previously was a single .inc file.

It is thought that rather than building on top of the current
single-output-file design of TableGen, e.g., using `split-file`,
it would be preferable to recognise the need for multi-file

    [2 lines not shown]
DeltaFile
+55-31llvm/utils/TableGen/RegisterInfoEmitter.cpp
+52-24llvm/lib/TableGen/Main.cpp
+40-3llvm/include/llvm/TableGen/TableGenBackend.h
+14-5llvm/lib/TableGen/TableGenBackend.cpp
+16-1llvm/include/llvm/TableGen/Main.h
+8-1mlir/lib/Tools/mlir-tblgen/MlirTblgenMain.cpp
+185-658 files not shown
+201-7314 files

LLVM/project c8b601bclang/include/clang/Basic Builtins.def, clang/lib/AST ASTContext.cpp

[AMDGPU] Removal of 'e' handling
DeltaFile
+2-6clang/lib/AST/ASTContext.cpp
+0-1clang/include/clang/Basic/Builtins.def
+2-72 files

LLVM/project 8b8cba0llvm/lib/Target/AMDGPU GCNRegPressure.cpp, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir machine-scheduler-sink-trivial-remats.mir

[AMDGPU] Rematerialize VGPR candidates when SGPR spills to VGPR over the VGPR limit

Before, when selecting candidates to rematerialize, we would only
consider SGPR candidates when there was an excess of SGPR registers.

Failing to eliminate the excess would result in spills to VGPRs.
This is normally not an issue, unless spilling to VGPRs results in
excess VGPRs.

This patch does 2 things:
* It relaxes the GCNRPTarget success criteria: now we accept regions
  where we spill SGPRs to VGPRs, as long as this does not end up in
  excess VGPRs.
* It changes isSaveBeneficial to consider the excess VGPRs (which
  includes the SGPRs that would be spilled to VGPR).

With these changes, the compiler rematerializes VGPRs when the excess
SGPRs would result in VGPR excess.


    [4 lines not shown]
DeltaFile
+215-215llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+92-92llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats.mir
+11-13llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+318-3203 files

LLVM/project 8723fe5mlir/include/mlir/Dialect/Tosa/IR TosaTypesBase.td TosaOps.td, mlir/test/Dialect/Tosa ops.mlir

[mlir][tosa] Allow int64 index tensors in gather/scatter (#167894)

This commit ensures that gather and scatter operations with int64 index
tensors can be created. This aligns with the EXT_INT64 extension.
DeltaFile
+18-4mlir/test/Dialect/Tosa/ops.mlir
+2-4mlir/include/mlir/Dialect/Tosa/IR/TosaTypesBase.td
+2-2mlir/include/mlir/Dialect/Tosa/IR/TosaOps.td
+22-103 files

LLVM/project 8c3b5c0clang/include/clang/Basic Builtins.def, clang/lib/AST ASTContext.cpp

[AMDGPU] Removal of 'e' handling
DeltaFile
+2-6clang/lib/AST/ASTContext.cpp
+0-1clang/include/clang/Basic/Builtins.def
+2-72 files

LLVM/project 1c42262clang/include/clang/Basic Builtins.def, clang/lib/AST ASTContext.cpp

Moving removal of 'e' handling to a separate PR
DeltaFile
+6-2clang/lib/AST/ASTContext.cpp
+1-0clang/include/clang/Basic/Builtins.def
+7-22 files

LLVM/project 31b7f1fllvm/lib/CodeGen/GlobalISel InlineAsmLowering.cpp, llvm/test/CodeGen/AArch64/GlobalISel irtranslator-inline-asm.ll arm64-fallback.ll

[GlobalISel] Add support for value/constants as inline asm memory operand (#161501)

InlineAsmLowering rejected inline assembly with memory reference inputs
if the values passed to the inline asm weren't pointers. The DAG
lowering however handled them just fine.

This patch updates InlineAsmLowering to store such values on the stack,
and then use the stack pointer as the "indirect" version of the operand.
DeltaFile
+94-0llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
+93-0llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
+40-13llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp
+0-9llvm/test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
+227-224 files

LLVM/project 787f677libcxx/include/__filesystem u8path.h path.h

[libc++] proper guarding for locale usage in filesystem on Windows (#165470)

- Resolves build issues when localization support is disabled on
Windows.
- Resolves dependencies on localization in filesystem header
implementations.

Related PR #164602
Fixes #164074
DeltaFile
+9-7libcxx/include/__filesystem/u8path.h
+5-3libcxx/include/__filesystem/path.h
+14-102 files

LLVM/project b2a8188llvm/lib/Support ThreadPool.cpp, llvm/unittests/Support ThreadPool.cpp

Destroy tasks as they are run in the thread pool (#167852)

Without this, any RAII objects held in the task's captures aren't
destroyed in a similar fashion to the task being run. If those objects
in turn interact with the thread pool itself, chaos ensues. This comes
up quite naturally with RAII-objects used for synchronization such as
RAII-powered latches or releasing a mutex, etc.

A unit test is crafted that tries to very directly test that the logic
of the thread pool continues to hold even with an RAII object. This
isn't the only type of failure mode (a deadlock due to mutexes in the
captures can also occur), but seemed the easiest to test.
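
The idea can be sketched with a single-threaded toy. `ToyPool`, `Recorder` and `activeCountSeenByCapture` are hypothetical names, and the real thread pool's locking and bookkeeping are not modelled. The task object is reset right after it runs, so its captures are destroyed while the pool still counts the task as active.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Toy, single-threaded stand-in for a thread pool worker (hypothetical).
struct ToyPool {
  std::deque<std::function<void()>> Queue;
  int Active = 0; // tasks the pool still counts as running

  void runOne() {
    std::function<void()> Task = std::move(Queue.front());
    Queue.pop_front();
    ++Active;
    Task();
    Task = nullptr; // destroy the callable and its captures right here
    --Active;       // bookkeeping happens only after the captures are gone
  }
};

// Hypothetical RAII object: records how many tasks were active when it died.
struct Recorder {
  const ToyPool &Pool;
  int &Seen;
  Recorder(const ToyPool &P, int &S) : Pool(P), Seen(S) {}
  ~Recorder() { Seen = Pool.Active; }
};

int activeCountSeenByCapture() {
  ToyPool Pool;
  int Seen = -1;
  {
    auto Guard = std::make_shared<Recorder>(Pool, Seen);
    Pool.Queue.push_back([Guard] {});
  } // the queued task now holds the only reference to the Recorder
  Pool.runOne();
  return Seen;
}
```

In this sketch the `Recorder` observes an active count of 1, meaning it was destroyed before the pool considered the task finished.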
DeltaFile
+37-0llvm/unittests/Support/ThreadPool.cpp
+7-2llvm/lib/Support/ThreadPool.cpp
+44-22 files

LLVM/project d4893ecllvm/lib/Target/LoongArch LoongArchTargetMachine.cpp, llvm/test/CodeGen/LoongArch preferred-alignments.ll opt-pipeline.ll

[LoongArch] Enable LoopTermFold pass
DeltaFile
+48-3llvm/test/Transforms/LoopStrengthReduce/LoongArch/lsr-insns.ll
+3-3llvm/test/CodeGen/LoongArch/preferred-alignments.ll
+1-0llvm/test/CodeGen/LoongArch/opt-pipeline.ll
+1-0llvm/lib/Target/LoongArch/LoongArchTargetMachine.cpp
+53-64 files

LLVM/project 3277f6cllvm/lib/Analysis IVDescriptors.cpp, llvm/lib/Transforms/Vectorize LoopVectorize.cpp VPlanRecipes.cpp

[LV] Explicitly disable in-loop reductions for AnyOf and FindIV. nfc (#163541)

Currently, in-loop reductions for AnyOf and FindIV are not supported.
They were implicitly blocked. This happened because
RecurrenceDescriptor::getReductionOpChain could not detect their
recurrence chain. The reason is that RecurrenceDescriptor::getOpcode was
set to Instruction::Or, but the recurrence chains of AnyOf and FindIV do
not actually contain an Instruction::Or.

This patch explicitly disables in-loop reductions for AnyOf and FindIV
instead of relying on getReductionOpChain to implicitly prevent them.
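
As a rough source-level illustration, the two loops below have the shape usually associated with these recurrence kinds: the update is a compare feeding a select, with no `or` in the chain. The function names are made up for this sketch, and whether the vectorizer classifies a given loop this way depends on the IR it actually sees.

```cpp
#include <cassert>
#include <vector>

// AnyOf-style: the result switches to a loop-invariant value once any
// element satisfies the condition.
bool anyGreater(const std::vector<int> &A, int T) {
  bool Found = false;
  for (int V : A)
    Found = (V > T) ? true : Found;
  return Found;
}

// FindIV-style: the result is the last induction-variable value for which
// the condition held, or -1 if it never held.
int lastIndexGreater(const std::vector<int> &A, int T) {
  int Last = -1;
  for (int I = 0, E = static_cast<int>(A.size()); I != E; ++I)
    Last = (A[I] > T) ? I : Last;
  return Last;
}
```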
DeltaFile
+7-5llvm/lib/Analysis/IVDescriptors.cpp
+6-1llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+3-4llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+16-103 files

LLVM/project bf07226.github/workflows libcxx-build-containers.yml, libcxx/docs Contributing.rst

[libc++] Reorganize and fix the libc++ CI dockerfiles (#167530)

Instead of having one large Dockerfile building multiple images with
relatively confusing inheritance, explicitly have three standalone
Dockerfiles each building one image. Then, tie the three images together
using the docker-compose file which explicitly versions the base image
used by the Android and the Github Actions images.
DeltaFile
+0-329libcxx/utils/ci/Dockerfile
+148-0libcxx/utils/ci/docker/linux-builder-base.dockerfile
+114-0libcxx/utils/ci/docker/android-builder.dockerfile
+19-37libcxx/docs/Contributing.rst
+25-22.github/workflows/libcxx-build-containers.yml
+0-40libcxx/utils/ci/docker-compose.yml
+306-4282 files not shown
+382-4288 files

LLVM/project b5c459dmlir/include/mlir/Dialect/Linalg/Utils Utils.h, mlir/lib/Dialect/Linalg/Transforms Specialize.cpp

[Linalg] Add basic infra to add matchers for linalg.*conv*/*pool* ops (#163724)

-- This commit includes the basic infra/utilities to add matchers for
   linalg.*conv*/*pool* ops, such that given a `linalg.generic` op it
   identifies which linalg.*conv*/*pool* op it is.
-- It adds a few representative linalg.*conv*/*pool* ops to demo the
   matchers' capability and does so as part of the
   `linalg-specialize-generic-ops` pass.
-- The goal is directed towards addressing the aim of
   [[RFC] Op explosion in Linalg](https://discourse.llvm.org/t/rfc-op-explosion-in-linalg/82863)
   iteratively for `*conv*/*pooling*` ops.
-- This is part-1 of a series of PRs aimed to add matchers for
   Convolution ops.
-- For further details, refer to
   https://github.com/llvm/llvm-project/pull/163374#pullrequestreview-3341048722

Signed-off-by: Abhishek Varma <abhvarma at amd.com>
DeltaFile
+579-0mlir/lib/Dialect/Linalg/Utils/Utils.cpp
+119-0mlir/test/Dialect/Linalg/convolution/roundtrip-convolution.mlir
+50-0mlir/lib/Dialect/Linalg/Transforms/Specialize.cpp
+11-0mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
+759-04 files

LLVM/project cf0f5ccllvm/test/Transforms/LoopStrengthReduce/LoongArch lsr-insns.ll

update tests
DeltaFile
+29-35llvm/test/Transforms/LoopStrengthReduce/LoongArch/lsr-insns.ll
+29-351 files

LLVM/project e517a85llvm/lib/Target/LoongArch LoongArchTargetTransformInfo.cpp LoongArchTargetTransformInfo.h

[LoongArch] Override `isLSRCostLess` to set `Insns` as the first priority

Similar to several other targets, this commit overrides
`isLSRCostLess` to make the instruction count the first priority
when the LSR pass compares costs.

In addition, this commit accounts in `NumRegs` for the extra
temporary register that may be used. This matches RISC-V; see the
rationale in https://github.com/llvm/llvm-project/pull/92296.
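
A minimal sketch of such a comparison, assuming a cost record with the fields shown. `ToyLSRCost` and `isCostLess` are hypothetical stand-ins, and the ordering of the fields after the register count is an assumption for illustration, not taken from the patch.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-in for the LSR cost record.
struct ToyLSRCost {
  unsigned Insns;
  unsigned NumRegs;
  unsigned AddRecCost;
  unsigned NumIVMuls;
  unsigned NumBaseAdds;
  unsigned ImmCost;
  unsigned SetupCost;
  unsigned ScaleCost;
};

// Lexicographic comparison with the instruction count first. If base
// registers must be added up inside the loop, one extra temporary register
// is counted.
bool isCostLess(const ToyLSRCost &C1, const ToyLSRCost &C2) {
  unsigned C1NumRegs = C1.NumRegs + (C1.NumBaseAdds != 0);
  unsigned C2NumRegs = C2.NumRegs + (C2.NumBaseAdds != 0);
  return std::tie(C1.Insns, C1NumRegs, C1.AddRecCost, C1.NumIVMuls,
                  C1.NumBaseAdds, C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.Insns, C2NumRegs, C2.AddRecCost, C2.NumIVMuls,
                  C2.NumBaseAdds, C2.ScaleCost, C2.ImmCost, C2.SetupCost);
}
```

With this ordering, a solution using fewer instructions wins even if it needs more registers, and the register count only breaks ties.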
DeltaFile
+14-0llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
+3-0llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h
+17-02 files

LLVM/project 473601cllvm/lib/Target/AMDGPU GCNRegPressure.cpp

[NFC][AMDGPU] Refactor common code computing excess register pressure into RegExcess class
DeltaFile
+47-45llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+47-451 files

LLVM/project cee2f4ellvm/test/Transforms/LoopStrengthReduce/LoongArch lsr-insns.ll lit.local.cfg

[LoongArch][NFC] Add tests for LoopStrengthReduce pass

Copied from x86.
DeltaFile
+141-0llvm/test/Transforms/LoopStrengthReduce/LoongArch/lsr-insns.ll
+2-0llvm/test/Transforms/LoopStrengthReduce/LoongArch/lit.local.cfg
+143-02 files

LLVM/project 80ae168libcxx/docs VendorDocumentation.rst index.rst

[libcxx] [doc] Document the supported target versions of Windows (#167845)

The llvm-mingw toolchains default to `_WIN32_WINNT=0x601`, so this
configuration is covered by our CI build matrix.
DeltaFile
+6-0libcxx/docs/VendorDocumentation.rst
+1-1libcxx/docs/index.rst
+7-12 files

LLVM/project 9822905libcxx/docs VendorDocumentation.rst

[libcxx] [doc] Update the docs about LIBCXX_ENABLE_FILESYSTEM (#167843)

Since 1939eb3dc2330af6fb9609a7c3bd5276e127c9ce, std::filesystem is
enabled by default in MSVC builds too.
DeltaFile
+2-2libcxx/docs/VendorDocumentation.rst
+2-21 files

LLVM/project 6b16b31llvm/lib/Target/RISCV RISCVInstrInfoP.td, llvm/test/CodeGen/RISCV rvp-ext-rv64.ll

[llvm][RISCV] Support P extension CodeGen (#167882)

This patch adds CodeGen support for PADD_W, PSUB_W, PSADD_W, PSADDU_W,
PSSUB_W, PSSUBU_W, PAADD_W and PAADDU_W.
DeltaFile
+176-0llvm/test/CodeGen/RISCV/rvp-ext-rv64.ll
+14-0llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+190-02 files