LLVM/project ac00a11llvm/lib/Target/AMDGPU VOP3PInstructions.td, llvm/test/CodeGen/AMDGPU mfma-convergent.mir

[AMDGPU] Ensure v_mfma_scale_f32_{16x16x128|32x32x64}_f8f6f4 instructions are convergent (#178627)

The scaled variants of mfma instructions are not properly marked as
"convergent", so the machine-sink pass sinks them, which is
incorrect.

This patch ensures that the instructions get marked as "convergent". The
new test also covers other mfma variants, but only the scale variants
are mistreated without the changes from this patch.
DeltaFile
+478-0llvm/test/CodeGen/AMDGPU/mfma-convergent.mir
+3-2llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+481-22 files

LLVM/project dea1d29clang-tools-extra/docs clang-reorder-fields.rst index.rst

[clang-tools-extra][docs] Add documentation for clang-reorder-fields (#178446)

Add comprehensive documentation for the clang-reorder-fields tool,
addressing #35520. The tool has existed in the repository but was
previously undocumented.

The documentation includes:
- Basic usage examples for C and C++ structs/classes
- Constructor initializer list reordering
- Designated initializer support (C++20)
- Detailed limitations and caveats
- Command line option reference
- Common use cases (memory layout optimization, etc.)

Fixes #35520

---------

Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
DeltaFile
+423-0clang-tools-extra/docs/clang-reorder-fields.rst
+1-0clang-tools-extra/docs/index.rst
+424-02 files

LLVM/project a482eb7lldb/test/API/tools/lldb-dap/exception/runtime-instruments TestDAP_runtime_instruments.py

[lldb-dap] Conditionally check UBSan stack trace on Darwin only (#178655)

Non-Darwin platforms may have incorrect stop information location
heuristics. Enable the assertion once the UBSan stopInfo heuristic is updated.

I hit this locally. I don't see it failing on any CI bot, although it
should; most likely the Linux CI bots do not have the `compiler_rt`
runtime enabled. See
https://github.com/llvm/llvm-project/pull/177964#discussion_r2732271531
DeltaFile
+5-1lldb/test/API/tools/lldb-dap/exception/runtime-instruments/TestDAP_runtime_instruments.py
+5-11 files

LLVM/project 068c3d2llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp

[AMDGPU] Address review comment
DeltaFile
+1-2llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+1-21 files

LLVM/project 7dba90fllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll load-constant-i1.ll

[AMDGPU] Have VCC as a first-class member of the SGPR pool.

Add VCC and tuples using VCC to SGPR register classes.

We already support VCC as an allocatable register for 32-bit SGPR
operands, so it seems most natural to support it for register
tuple operands as well.

s106/s107 are still not allowed as aliases of vcc_lo/hi in
AsmParser.

The names given to the VCC tuples match those produced by SP3,
though it feels like there is room for improvement.

https://github.com/llvm/llvm-project/issues/62651
DeltaFile
+4,333-4,337llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+729-735llvm/test/CodeGen/AMDGPU/load-constant-i1.ll
+427-431llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+383-355llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+250-252llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+245-249llvm/test/CodeGen/AMDGPU/scc-clobbered-sgpr-to-vmem-spill.ll
+6,367-6,35912 files not shown
+7,024-6,92418 files

LLVM/project 43108bfllvm/runtimes CMakeLists.txt

[openmp] Build doxygen in bootstrapping builds (#178298)

When LLVM_ENABLE_DOXYGEN=ON, forward the `doxygen-openmp` build target
from the nested (default target) runtimes build. When
LLVM_BUILD_DOCS=ON, also trigger `doxygen-build` with `ninja doxygen`.
LLVM_INCLUDE_DOCS=ON is required in the runtimes build, which is the
default.

This is required to update the OpenMP doxygen documentation at
https://openmp.llvm.org/doxygen by the publish-doxygen-docs buildbot,
discussed here:
https://github.com/llvm/llvm-zorg/pull/716#pullrequestreview-3713032311
DeltaFile
+12-0llvm/runtimes/CMakeLists.txt
+12-01 files

LLVM/project 1da76c3llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp

[AMDGPU] add back missing parenthesis
DeltaFile
+12-11llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+12-111 files

LLVM/project 8fa4807llvm/lib/CodeGen ExpandIRInsts.cpp

[ExpandIRInsts] Simplify constant construction (NFC)

Don't go through IRBuilder for constants we can create with
APInt APIs.
DeltaFile
+5-8llvm/lib/CodeGen/ExpandIRInsts.cpp
+5-81 files

LLVM/project b1f845dmlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp

[MLIR][OpenMP] Fix unused variable warning for #137201 (#178659)

Fixes 4cc80831ea5d39c186fc29692556b762ffb6478b.
DeltaFile
+2-2mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+2-21 files

LLVM/project 9ae6d8fclang/test/Driver hipspv-toolchain.hip hipspv-link-static-library.hip

[Clang] Try to fix HIPSPV tests after #168043

Summary:
https://github.com/llvm/llvm-project/pull/168043 seems to not have
specified the target triple for the tests so different architectures
fail these tests. Try to set it manually. If this doesn't clear up the
bots I'll revert both.
DeltaFile
+6-6clang/test/Driver/hipspv-toolchain.hip
+2-0clang/test/Driver/hipspv-link-static-library.hip
+8-62 files

LLVM/project e509974.ci/buildbot worker.py, polly/ci polly-x86_64-linux-test-suite.py

[Polly][CI] Unconditionally delete test-suite build

The test-suite should be recompiled every time, even in incremental
builds.
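
A minimal sketch of the behaviour described above, not the actual change to `worker.py`: the helper name `clean_testsuite_build` and the directory name `testsuite.build` are hypothetical and chosen only for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def clean_testsuite_build(build_dir: Path) -> None:
    """Delete the test-suite build tree so the next run recompiles everything.

    Hypothetical helper; the real logic lives in the CI scripts touched by
    this commit.
    """
    # ignore_errors=True makes this a no-op when the directory is absent,
    # so the call is safe on both fresh and incremental builders.
    shutil.rmtree(build_dir, ignore_errors=True)


def demo() -> bool:
    """Create a stale build tree, clean it, and report whether it survived."""
    with tempfile.TemporaryDirectory() as tmp:
        build = Path(tmp) / "testsuite.build"  # hypothetical directory name
        build.mkdir()
        (build / "stale.o").write_text("old object")
        clean_testsuite_build(build)
        # Calling it again on the now-missing directory must not raise.
        clean_testsuite_build(build)
        return build.exists()
```

The point of deleting unconditionally, rather than only on clean builds, is that stale objects from a previous compiler revision can never be reused by the test-suite.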
DeltaFile
+4-0.ci/buildbot/worker.py
+3-0polly/ci/polly-x86_64-linux-test-suite.py
+7-02 files

LLVM/project ac039c5llvm/lib/CodeGen ExpandIRInsts.cpp, llvm/test/Transforms/ExpandIRInsts/X86 expand-fp-convert-small.ll

[ExpandIRInsts] Test fptoi expansion for small types

Allow testing fptoui/fptosi on half types, which are small enough
for alive2 to verify the result.

They currently pass for non-undef/poison input. (The fptoui
expansion is the same as fptosi, which is confusing, but not
incorrect, because the saturation it performs is not actually
required by fptoi.)
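
For readers unfamiliar with the term, the following is a rough sketch of what a saturating float-to-signed-integer conversion looks like. It is not the code emitted by ExpandIRInsts; the function name `fptosi_sat` and the choice to map NaN to 0 are assumptions made for illustration. Plain `fptosi`/`fptoui` leave out-of-range inputs as poison, which is why clamping is permitted but not required.

```python
import math


def fptosi_sat(x: float, bits: int) -> int:
    """Convert x to a signed integer of the given width, clamping at the limits.

    Illustrative only: NaN -> 0 is an assumption, not taken from the commit.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    # The comparisons also catch -inf and +inf before truncation is attempted.
    if x <= lo:
        return lo
    if x >= hi:
        return hi
    # In-range values are truncated towards zero, like fptosi.
    return math.trunc(x)
```

For in-range inputs this agrees with ordinary truncation; only out-of-range inputs, where `fptosi` gives no guarantee, are affected by the clamping.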
DeltaFile
+170-0llvm/test/Transforms/ExpandIRInsts/X86/expand-fp-convert-small.ll
+2-3llvm/lib/CodeGen/ExpandIRInsts.cpp
+172-32 files

LLVM/project 162267ellvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 get-active-lane-mask-extract.ll

[AArch64][SME2] Allow lowering to whilelo.x2 in non-streaming mode (#178399)

Since #145322 relaxed the SME predicate for the multi-register while
instructions, these instructions are allowed in non-streaming mode
when SME2 is available.

This patch removes the isStreaming() restriction from both
performActiveLaneMaskCombine & ReplaceGetActiveLaneMaskResults,
allowing the whilelo.x2 intrinsic to be used if SVE or streaming
SVE is available.
DeltaFile
+22-21llvm/test/CodeGen/AArch64/get-active-lane-mask-extract.ll
+6-5llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+28-262 files

LLVM/project 312078bclang/test/Sema format-strings.c

[Sema] Fix format-strings test on 32-bit Arm (#178450)

DeltaFile
+10-0clang/test/Sema/format-strings.c
+10-01 files

LLVM/project d2e18beclang/lib/Driver/ToolChains HIPSPV.cpp, clang/test/Driver hipspv-toolchain-rdc.hip hipspv-toolchain.hip

[Clang] Lift HIPSPV onto the new offload driver (#168043)

Update HIPSPV toolchain to support `--offload-new-driver`. Additionally,
tailor `llvm-spirv` invocation for
[chipStar](https://github.com/CHIP-SPV/chipStar) via
`spirv64-*-chipstar` offload triple.

Depends on one commit from #170467 and one from #170655.

---------

Co-authored-by: Henry Linjamäki <henry.mikael.linjamaki at intel.com>
Co-authored-by: Joseph Huber <huberjn at outlook.com>
DeltaFile
+98-42clang/test/Driver/hipspv-toolchain-rdc.hip
+77-16clang/test/Driver/hipspv-toolchain.hip
+59-11clang/lib/Driver/ToolChains/HIPSPV.cpp
+66-0clang/test/Driver/hipspv-toolchain-rdc-separate.hip
+34-8clang/test/Driver/hipspv-pass-plugin.hip
+28-10clang/test/Driver/hipspv-link-static-library.hip
+362-878 files not shown
+419-11114 files

LLVM/project 24e95bamlir/lib/Dialect/Linalg/Transforms Generalization.cpp, mlir/test/Dialect/Linalg generalize-named-ops.mlir

[mlir][Linalg] Preserve discardable/user-defined attributes during generalization (#178599)

-- As observed in a [downstream
project](https://github.com/iree-org/iree/pull/23294#discussion_r2734982998):
the named-to-generic linalg op conversion wasn't preserving
discardable attributes.
-- This commit fixes that.
-- Only a single test case is added, as the change applies to the
generalization of any named linalg op.

Signed-off-by: Abhishek Varma <abhvarma at amd.com>
DeltaFile
+18-0mlir/test/Dialect/Linalg/generalize-named-ops.mlir
+8-0mlir/lib/Dialect/Linalg/Transforms/Generalization.cpp
+26-02 files

LLVM/project 8a9f362mlir/lib/Dialect/Arith/IR ArithOps.cpp, mlir/test/Conversion/SCFToSPIRV signed-vector.mlir

[MLIR][Arith] Ensure ConstantOp validates signless integers for vectors (#177857)

Fixes #177818 
`arith::ConstantOp::isBuildableWith()` was only checking scalar integers
for signlessness, allowing signed vector element types to pass
validation incorrectly.
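
The shape of the fix can be sketched with a toy model. This is not the MLIR API: `IntType`, `VectorType` and `has_only_signless_ints` are hypothetical stand-ins, and the sketch only models the signlessness check, not the rest of `isBuildableWith()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    width: int
    signedness: str = "signless"  # "signless", "signed" or "unsigned"


@dataclass(frozen=True)
class VectorType:
    shape: tuple
    element: object


def has_only_signless_ints(ty) -> bool:
    """Return False if ty is, or contains, a signed/unsigned integer type."""
    # Look at the element type for vectors; a scalar-only check would let
    # vector<4xsi32> slip through, which is the bug described above.
    elem = ty.element if isinstance(ty, VectorType) else ty
    if isinstance(elem, IntType):
        return elem.signedness == "signless"
    # Non-integer types are outside the scope of this check.
    return True
```

With this check, `VectorType((4,), IntType(32, "signed"))` is rejected in the same way as the scalar `IntType(32, "signed")`.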

---------

Co-authored-by: Milos Poletanovic <mpoletanovic at syrmia.com>
DeltaFile
+12-0mlir/test/Conversion/SCFToSPIRV/signed-vector.mlir
+4-3mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+16-32 files

LLVM/project 4500a0dllvm/docs LangRef.rst

Fix a typo
DeltaFile
+1-1llvm/docs/LangRef.rst
+1-11 files

LLVM/project 4cc8083flang/test/Integration/OpenMP target-nesting-in-host-ops.f90 target-use-device-nested.f90, mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp

[MLIR][OpenMP] Simplify OpenMP device codegen (#137201)

After removing host operations from the device MLIR module, it is no
longer necessary to provide special codegen logic to prevent these
operations from causing compiler crashes or miscompilations.

This patch removes these now unnecessary code paths to simplify codegen
logic. Some MLIR tests are now replaced with Flang tests, since the
responsibility of dealing with host operations has been moved earlier in
the compilation flow.

MLIR tests holding target device modules are updated to no longer
include now unsupported host operations.
DeltaFile
+166-302mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+0-160mlir/test/Target/LLVMIR/openmp-target-nesting-in-host-ops.mlir
+87-0flang/test/Integration/OpenMP/target-nesting-in-host-ops.f90
+24-37mlir/test/Target/LLVMIR/omptarget-memcpy-align-metadata.mlir
+46-0flang/test/Integration/OpenMP/target-use-device-nested.f90
+0-46mlir/test/Target/LLVMIR/openmp-target-use-device-nested.mlir
+323-54512 files not shown
+484-73618 files

LLVM/project 1e4b4faflang/lib/Optimizer/OpenMP FunctionFiltering.cpp, flang/test/Lower/OpenMP host-eval.f90 declare-target-link-tarop-cap.f90

[Flang][OpenMP] Minimize host ops remaining in device compilation (#137200)

This patch updates the function filtering OpenMP pass intended to remove
host functions from the MLIR module created by Flang lowering when
targeting an OpenMP target device.

Host functions holding target regions must be kept, so that the target
regions within them can be translated for the device. The issue is that
non-target operations inside these functions cannot be discarded because
some of them hold information that is also relevant during target device
codegen. Specifically, mapping information resides outside of
`omp.target` regions.

This patch updates the previous behavior where all host operations were
preserved to then ignore all of those that are not actually needed by
target device codegen. This, in practice, means only keeping target
regions and mapping information needed by the device. Arguments for some
of these remaining operations are replaced by placeholder allocations
and `fir.undefined`, since they are only actually defined inside of the

    [4 lines not shown]
DeltaFile
+516-0flang/test/Transforms/OpenMP/function-filtering-host-ops.mlir
+350-2flang/lib/Optimizer/OpenMP/FunctionFiltering.cpp
+137-0flang/test/Transforms/OpenMP/function-filtering.mlir
+0-137flang/test/Transforms/omp-function-filtering.mlir
+37-18flang/test/Lower/OpenMP/host-eval.f90
+10-9flang/test/Lower/OpenMP/declare-target-link-tarop-cap.f90
+1,050-1662 files not shown
+1,053-1718 files

LLVM/project 3a95117llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

Explicitly check for 8-bit source type

Created using spr 1.3.7
DeltaFile
+74,257-82,975llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+26,135-30,267llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+9,044-11,203llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.320bit.ll
+5,872-6,681llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.256bit.ll
+2,674-3,346llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.128bit.ll
+1,521-1,873llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.96bit.ll
+119,503-136,345353 files not shown
+131,707-144,047359 files

LLVM/project 40ebbb6llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange reduction2mem.ll reduction2mem-limitation.ll

[LoopInterchange] Initialize new_var to InitValue on first iteration (#178370)

Fixed a bug found during testing:
- If it is the first iteration, `new_var` should be initialized to
'InitValue'.
DeltaFile
+1-1llvm/test/Transforms/LoopInterchange/reduction2mem.ll
+1-1llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+1-1llvm/test/Transforms/LoopInterchange/reduction2mem-limitation.ll
+3-33 files

LLVM/project deafb6bclang/lib/AST/ByteCode EvalEmitter.cpp

[clang][bytecode][NFC] Use `Block::deref()` in `EvalEmitter` (#178630)

Instead of doing the casting around `Block::data()` ourselves.
DeltaFile
+2-2clang/lib/AST/ByteCode/EvalEmitter.cpp
+2-21 files

LLVM/project 245043fclang/docs ReleaseNotes.rst, clang/lib/Sema SemaType.cpp

[Clang] avoid assertion in __underlying_type for enum redeclarations (#177984)

Fixes #177943

---

This patch addresses cases where `__underlying_type` is used with enum
redeclarations. The previously added assertion
(https://github.com/llvm/llvm-project/pull/155900) treated a missing
`int` on the referenced `EnumDecl` as an indicator of a _demoted
definition_, while this condition can also occur for redeclarations.
DeltaFile
+9-0clang/test/SemaCXX/underlying_type.cpp
+0-4clang/lib/Sema/SemaType.cpp
+1-0clang/docs/ReleaseNotes.rst
+10-43 files

LLVM/project e17374alldb/source/Host/windows MainLoopWindows.cpp, lldb/test/Shell/DAP TestSTDINConsole.test

[lldb-dap][windows] allow STDIN to be a console (#178642)

DeltaFile
+62-0lldb/test/Shell/DAP/TestSTDINConsole.test
+1-1lldb/source/Host/windows/MainLoopWindows.cpp
+63-12 files

LLVM/project 3f1386bllvm/lib/Target/AMDGPU SIInstrInfo.cpp

[AMDGPU] Add braces around a switch case. NFC. (#178637)

DeltaFile
+2-1llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+2-11 files

LLVM/project 73c7c56llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 sve-ldst-sext.ll sve-ldst-zext.ll

[LLVM][DAGCombiner] Look through freeze when combining extensions of loads (#175022)

Following on from https://github.com/llvm/llvm-project/pull/172484 I
have added support to tryToFoldExtOfLoad for looking through freezes, in
order to catch more cases of extending loads. This type of code is
sometimes seen being generated by the loop vectoriser. For now I've
limited this to cases where the load is only used by the freeze, since
otherwise it leads to worse code in some X86 tests.
DeltaFile
+435-0llvm/test/CodeGen/AArch64/sve-ldst-sext.ll
+426-0llvm/test/CodeGen/AArch64/sve-ldst-zext.ll
+36-19llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+31-0llvm/test/CodeGen/X86/2007-10-29-ExtendSetCC.ll
+4-8llvm/test/CodeGen/X86/avx512-ext.ll
+6-6llvm/test/CodeGen/X86/widen-load-of-small-alloca-with-zero-upper-half.ll
+938-339 files not shown
+966-4715 files

LLVM/project 3fb8601lldb/source/Interpreter Options.cpp

[lldb] Refactor command option printing (#178208)

So I have an easier time fixing #177570.

Changes I have made:
* Init a variable inside if statement to reduce scope.
* Added const to some variables.
* Early return if we print a single line, and dedent the "else" that
handles multiple lines.
* Only convert lldb's short codes into ansi codes once.
* Rename a couple of variables where they could have either referred to
the visible text or the raw data with the ansi codes in.
DeltaFile
+44-41lldb/source/Interpreter/Options.cpp
+44-411 files

LLVM/project aeee859llvm/lib/Target/AArch64 AArch64SystemOperands.td AArch64Features.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

[AArch64][llvm] Gate some `tlbip` insns with +tlbid or +d128

Change the gating of `tlbip` instructions containing `*E1IS*`, `*E1OS*`,
`*E2IS*` or `*E2OS*` to be used with `+tlbid` or `+d128`. This is because
the 2025 Armv9.7-A MemSys specification says:

```
All TLBIP *E1IS*, TLBIP*E1OS*, TLBIP*E2IS* and TLBIP*E2OS* instructions
that are currently dependent on FEAT_D128 are updated to be dependent
on FEAT_D128 or FEAT_TLBID
```
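
The gating rule quoted above can be sketched as a small predicate. This is an illustration, not the AsmParser code: the function name `tlbip_allowed` and the feature strings are hypothetical, and the fallback that every other `tlbip` operation stays gated on `+d128` alone is an assumption.

```python
# Substrings named in the specification text quoted above.
BROADCAST_MARKERS = ("E1IS", "E1OS", "E2IS", "E2OS")


def tlbip_allowed(op_name: str, features: set) -> bool:
    """Decide whether a TLBIP operation is accepted for the given features."""
    if any(marker in op_name.upper() for marker in BROADCAST_MARKERS):
        # *E1IS*, *E1OS*, *E2IS*, *E2OS*: either feature is enough.
        return "d128" in features or "tlbid" in features
    # Assumption: all remaining TLBIP operations still require +d128.
    return "d128" in features
```

For example, `tlbip_allowed("VAE1IS", {"tlbid"})` is accepted, while `tlbip_allowed("VAE1", {"tlbid"})` is not under the stated assumption.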
DeltaFile
+259-0llvm/test/MC/AArch64/tlbip-tlbid-or-d128.s
+110-110llvm/test/MC/AArch64/armv9a-sysp.s
+18-4llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+21-0llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+13-2llvm/lib/Target/AArch64/AArch64SystemOperands.td
+7-4llvm/lib/Target/AArch64/AArch64Features.td
+428-1204 files not shown
+449-12610 files

LLVM/project 4ebede7lldb/source/Host/windows MainLoopWindows.cpp, lldb/test/Shell/DAP TestSTDINConsole.test

Revert "[lldb-dap][windows] allow STDIN to be a console (#178409)" (#178641)

DeltaFile
+0-62lldb/test/Shell/DAP/TestSTDINConsole.test
+1-1lldb/source/Host/windows/MainLoopWindows.cpp
+1-632 files