LLVM/project 4b31f1e llvm/lib/Target/AArch64 AArch64A57FPLoadBalancing.cpp AArch64.h

[NewPM] Port for AArch64A57FPLoadBalancing (#190652)
Delta  File
+66 -46  llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
+9 -2  llvm/lib/Target/AArch64/AArch64.h
+2 -2  llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1 -0  llvm/lib/Target/AArch64/AArch64PassRegistry.def
+78 -50  4 files

LLVM/project bafb2cb clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaOpenMP.cpp

[clang][OpenMP] declare_target/local clause variable can't be in map clause (#190470)

In OpenMP 6.0, the 'local' clause was added to the declare_target
directive. Variables listed in the 'local' clause are considered to be
device-local. In addition, a new map clause restriction was added:
A device-local variable must not appear as a list item in a map clause.
See OpenMP 6.0 specification section 7.9.6, map Clause, Restrictions, p.
386.

Testing:
- New error-message test for device-local variables (declared in a
declare_target 'local' clause) that are used in map clauses.
  - ninja check-openmp
Delta  File
+70 -0  clang/test/OpenMP/declare_target_local_map_messages.cpp
+15 -0  clang/lib/Sema/SemaOpenMP.cpp
+2 -0  clang/include/clang/Basic/DiagnosticSemaKinds.td
+87 -0  3 files

LLVM/project f069b82 llvm/lib/Target/AArch64/GISel AArch64PostLegalizerCombiner.cpp

[NFC] Drop AArch64PostLegalizerCombiner dep on TargetPassConfig (#190569)

This will enable NewPM porting.

Replaced with the definition in

[AArch64PassConfig::getCSEConfig](https://github.com/llvm/llvm-project/blob/1d549d9a777a6faef6d425cb6482ab1fa6b91bb7/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp#L614)
Delta  File
+2 -5  llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerCombiner.cpp
+2 -5  1 files

LLVM/project f68868d llvm/lib/IR Value.cpp

Revert "[IR] Use iteration limit in stripPointerCastsAndOffsets" (#190839)

Reverts llvm/llvm-project#190472

Causes crashes:
https://github.com/llvm/llvm-project/pull/190472#issuecomment-4201843466
Delta  File
+7 -12  llvm/lib/IR/Value.cpp
+7 -12  1 files

LLVM/project 88af280 clang/include/clang/Basic HLSLIntrinsics.td, clang/lib/Headers/hlsl hlsl_intrinsics.h hlsl_intrinsic_helpers.h

[HLSL] Rewrite inline HLSL intrinsics into TableGen (#188362)

Partially addresses https://github.com/llvm/llvm-project/issues/188345.
This PR rewrites all applicable inline HLSL intrinsics from
`hlsl_intrinsics.h` into TableGen.

The unsigned `abs` from `hlsl_alias_intrinsics.h` is also rewritten into
TableGen since it can also be defined inline.

The `NonUniformResourceIndex` is moved from `hlsl_intrinsics.h` over to
`hlsl_alias_intrinsics.h` since it can be defined as an alias.

`__detail::.*_impl` helper functions that were one liners have been
removed, and their corresponding HLSL intrinsics have been defined in
TableGen using the `Body` field instead.

Note that rewriting `refract` in TableGen instead of templates
introduces some significant changes to error messages and also
introduces a new offload test suite failure in the fp16 test because a

    [10 lines not shown]
Delta  File
+0 -591  clang/lib/Headers/hlsl/hlsl_intrinsics.h
+325 -4  clang/include/clang/Basic/HLSLIntrinsics.td
+25 -33  clang/test/SemaHLSL/BuiltIns/refract-errors.hlsl
+10 -45  clang/lib/Headers/hlsl/hlsl_intrinsic_helpers.h
+13 -38  clang/test/SemaHLSL/BuiltIns/length-errors.hlsl
+24 -24  clang/test/CodeGenHLSL/builtins/ldexp.hlsl
+397 -735  12 files not shown
+487 -937  18 files

LLVM/project b3c093d llvm/test/tools/llvm-mca/RISCV/Inputs/rvv mask.s, llvm/test/tools/llvm-mca/RISCV/SiFiveP400/rvv mask.test

[RISCV][MCA] Do not use mask instructions that can potentially be optimized by uArch (#190820)

Context:
https://github.com/llvm/llvm-project/pull/189785#discussion_r3019282209

Some mask instructions have a form that can potentially be optimized by
HW implementation: `vmxor.mm vd, vs, vs` and `vmclr vd, vs`, for
instance. This patch avoids using such instructions in MCA tests.
Delta  File
+176 -176  llvm/test/tools/llvm-mca/RISCV/SiFiveP400/rvv/mask.test
+176 -176  llvm/test/tools/llvm-mca/RISCV/SiFiveP600/rvv/mask.test
+176 -176  llvm/test/tools/llvm-mca/RISCV/SiFiveX100/rvv/mask.test
+176 -176  llvm/test/tools/llvm-mca/RISCV/SpacemitX60/rvv/mask.test
+88 -88  llvm/test/tools/llvm-mca/RISCV/Inputs/rvv/mask.s
+792 -792  5 files

LLVM/project 3c11ae6 lldb/tools/driver lldb-mte-entitlements.plist

[lldb] Fixup MTE entitlement spelling
Delta  File
+2 -2  lldb/tools/driver/lldb-mte-entitlements.plist
+2 -2  1 files

LLVM/project 05f9c66 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 branch-on-bool.ll

[AArch64] Normalize (bool CC 1) to (bool NewCC 0) in LowerBR_CC (#189380)
Delta  File
+202 -0  llvm/test/CodeGen/AArch64/branch-on-bool.ll
+42 -1  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+244 -1  2 files

LLVM/project 12245c2 llvm/lib/IR Value.cpp

Revert "[IR] Use iteration limit in stripPointerCastsAndOffsets (#190472)"

This reverts commit b5e7dbb30ace6c9f7b7920462e209bb08e7ffa56.
Delta  File
+7 -12  llvm/lib/IR/Value.cpp
+7 -12  1 files

LLVM/project c7c9025 bolt/lib/Target/AArch64 AArch64MCPlusBuilder.cpp CMakeLists.txt, bolt/unittests/Core MCPlusBuilder.cpp

[BOLT][AArch64] Optimize the mov-imm-to-reg operation (#189304)

On AArch64, logical immediate instructions can encode certain special
immediate values. Even at `-O0`, the AArch64 backend does not generate
four instructions (movz, movk, movk, movk) to move such a special value
into a 64-bit register.

For example, to move the 64-bit value `0x0001000100010001` to `x0`, the
AArch64 backend would not choose a 4-instruction-sequence like
```
movz x0, 0x0001
movk x0, 0x0001, lsl 16
movk x0, 0x0001, lsl 32
movk x0, 0x0001, lsl 48
```
Instead, the AArch64 backend generates a single instruction
```
mov x0, 0x0001000100010001
```

    [10 lines not shown]
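
As a rough illustration of what makes such a value "special", the sketch below checks whether a 64-bit constant has the shape of an AArch64 logical (bitmask) immediate: an element of 2, 4, 8, 16, 32 or 64 bits, consisting of a rotated run of contiguous ones, replicated across the register. The name `is_logical_immediate` is hypothetical, and this is not the BOLT or LLVM implementation; it only models the encoding rule.

```python
def is_logical_immediate(value: int, width: int = 64) -> bool:
    """Return True if `value` has the shape of an AArch64 bitmask immediate.

    Hypothetical helper for illustration only; it models the encoding
    rule and is not taken from BOLT or LLVM.
    """
    mask = (1 << width) - 1
    value &= mask
    # All-zeros and all-ones cannot be encoded as logical immediates.
    if value == 0 or value == mask:
        return False

    # Shrink to the smallest element that, when replicated, gives `value`.
    size = width
    while size > 2:
        half = size // 2
        half_mask = (1 << half) - 1
        if (value & half_mask) != ((value >> half) & half_mask):
            break
        size = half

    elem_mask = (1 << size) - 1
    elem = value & elem_mask

    def is_contiguous_run(x: int) -> bool:
        # Strip trailing zeros, then check that what remains is 0b0...01...1.
        x >>= (x & -x).bit_length() - 1
        return (x & (x + 1)) == 0

    # A rotated run of ones is either a plain run, or its complement is
    # a plain run (the ones wrap around the element boundary).
    return is_contiguous_run(elem) or is_contiguous_run(~elem & elem_mask)
```

`0x0001000100010001` passes the check because it is the 16-bit element `0x0001` repeated four times, which is why a single `mov` (an alias of `orr` with the zero register) is enough.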
Delta  File
+97 -0  bolt/unittests/Core/MCPlusBuilder.cpp
+63 -24  bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+1 -0  bolt/lib/Target/AArch64/CMakeLists.txt
+161 -24  3 files

LLVM/project 5baec2c bolt/include/bolt/Profile DataAggregator.h, bolt/lib/Profile DataAggregator.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta  File
+86 -3  bolt/lib/Profile/DataAggregator.cpp
+6 -0  bolt/include/bolt/Profile/DataAggregator.h
+92 -3  2 files

LLVM/project e382a95 llvm/test/Transforms/LoopVectorize/WebAssembly memory-interleave.ll, llvm/test/Transforms/LoopVectorize/X86/CostModel interleaved-load-i8-stride-8.ll interleaved-load-i16-stride-8.ll

[LV] Update remaining tests to use VPlan cost output (NFC). (#190038)

Move remaining tests checking legacy cost output to check the VPlan's
cost model output.

In some cases, checks become much more compact (checking a single
interleave group cost vs checking the individual members which all have
the group's cost). In some cases, auto-generation consistently checks
all relevant VFs.

PR: https://github.com/llvm/llvm-project/pull/190038
Delta  File
+1,157 -452  llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
+123 -284  llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-8.ll
+123 -252  llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-8.ll
+111 -248  llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-7.ll
+111 -248  llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i16-stride-7.ll
+129 -212  llvm/test/Transforms/LoopVectorize/X86/CostModel/interleaved-load-i8-stride-6.ll
+1,754 -1,696  109 files not shown
+10,338 -7,066  115 files

LLVM/project 88a78f6 clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenClass.cpp

[CIR] Add support for variable sized array new. (#190656)

This change adds support for array new with variable size. This required
extending the cir.array.ctor operation to accept a value for the size
and a direct pointer to the element size instead of a pointer to an
array.

Assisted-by: Cursor / claude-4.6-opus-high
Assisted-by: Cursor / composer-2-fast
Delta  File
+218 -0  clang/test/CIR/IR/invalid-array-structor.cir
+146 -0  clang/test/CIR/CodeGen/new.cpp
+49 -45  clang/lib/CIR/CodeGen/CIRGenClass.cpp
+55 -0  clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+31 -3  clang/include/clang/CIR/Dialect/IR/CIROps.td
+20 -4  clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+519 -52  1 files not shown
+538 -52  7 files

LLVM/project d4ed2a7 llvm/test/CodeGen/AMDGPU rewrite-vgpr-mfma-to-agpr-spill-multi-store.ll

Trimmed test options and passes.
Delta  File
+2 -2  llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-spill-multi-store.ll
+2 -2  1 files

LLVM/project 9744f1b clang/lib/Headers wasm_simd128.h, cross-project-tests/intrinsic-header-tests wasm_simd128.c

[WebAssembly] Support promoting lower lanes of f16x8 to f32x4. (#129786)
Delta  File
+40 -15  llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+20 -0  llvm/test/CodeGen/WebAssembly/f16-intrinsics.ll
+9 -0  clang/lib/Headers/wasm_simd128.h
+6 -0  cross-project-tests/intrinsic-header-tests/wasm_simd128.c
+3 -0  llvm/test/MC/WebAssembly/simd-encodings.s
+2 -0  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+80 -15  6 files

LLVM/project a030dfb lldb/source/Commands CommandObjectThread.cpp, lldb/source/Target Thread.cpp

[lldb] Add --provider option to thread backtrace (#181071)
Delta  File
+402 -12  lldb/test/API/functionalities/scripted_frame_provider/pass_through_prefix/TestFrameProviderPassThroughPrefix.py
+188 -5  lldb/source/Commands/CommandObjectThread.cpp
+110 -0  lldb/test/API/functionalities/scripted_frame_provider/thread_filter/frame_provider.py
+94 -0  lldb/test/API/functionalities/scripted_frame_provider/thread_filter/TestFrameProviderThreadFilter.py
+65 -8  lldb/test/API/functionalities/scripted_frame_provider/pass_through_prefix/frame_provider.py
+32 -17  lldb/source/Target/Thread.cpp
+891 -42  10 files not shown
+959 -48  16 files

LLVM/project 3098b4d flang/include/flang/Optimizer/Transforms Passes.td Passes.h, flang/lib/Optimizer/Transforms LoopInvariantCodeMotion.cpp

[flang] Added LICM hoisting for nested regions. (#190696)

This patch adds a couple of experimental LICM modes
that allow hoisting operations from regions nested
inside a loop, e.g. when there is `fir.if` inside
`fir.do_loop`. The aggressive mode hoists all operations
that are safe to hoist. The cheap mode hoists only
"cheap" operations (currently, only `fir.convert`),
though the definition of "cheap" needs to be worked out.
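
The effect of hoisting out of a nested region can be pictured with an ordinary loop. The sketch below is a Python analogy, not FIR: the `if` stands in for `fir.if`, the `for` for `fir.do_loop`, and `float(scale)` for a loop-invariant `fir.convert`. The function names are hypothetical.

```python
def before_hoisting(values, scale):
    total = 0.0
    for v in values:
        if v > 0:
            # Loop-invariant conversion, recomputed on every taken branch.
            factor = float(scale)
            total += v * factor
    return total


def after_hoisting(values, scale):
    # The conversion is hoisted above the loop and computed once.
    factor = float(scale)
    total = 0.0
    for v in values:
        if v > 0:
            total += v * factor
    return total
```

Note that the hoisted version evaluates the conversion even when the branch is never taken, so this is only valid for operations that are safe to execute speculatively; that is presumably why the modes are restricted to "safe" or "cheap" operations.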
Delta  File
+341 -0  flang/test/Transforms/licm.fir
+117 -31  flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp
+13 -0  flang/include/flang/Optimizer/Transforms/Passes.td
+8 -0  flang/include/flang/Optimizer/Transforms/Passes.h
+479 -31  4 files

LLVM/project e6c262b llvm/lib/Target/AMDGPU AMDGPURewriteAGPRCopyMFMA.cpp, llvm/test/CodeGen/AMDGPU rewrite-vgpr-mfma-to-agpr-spill-multi-store.ll

[AMDGPU] Added debugging output/test for multiple store to spill slot.
Delta  File
+7 -3  llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-spill-multi-store.ll
+4 -0  llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+11 -3  2 files

LLVM/project 824fc35 mlir/lib/Dialect/Vector/IR VectorOps.cpp, mlir/test/Dialect/Linalg transform-op-mmt4d-to-fma.mlir

[mlir][vector] Constrain broadcast->shape_cast folding (#190230)

Fixes https://github.com/llvm/llvm-project/issues/190614.

Do not fold broadcast->shape_cast when that would result in switching
between the two distinct semantic modes of `vector.broadcast`, as
explained in https://github.com/llvm/llvm-project/issues/190614.

This fixes incorrect-result bugs in IREE:
https://github.com/iree-org/iree/issues/23952

---------

Signed-off-by: Benoit Jacob <benoit.jacob at amd.com>
Delta  File
+0 -69  mlir/test/Dialect/Linalg/transform-op-mmt4d-to-fma.mlir
+51 -0  mlir/test/Dialect/Vector/vector-multi-reduction-to-fma.mlir
+42 -6  mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+16 -0  mlir/test/Dialect/Vector/canonicalize/vector-to-shape-cast.mlir
+109 -75  4 files

LLVM/project 4913bd5 clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsage.cpp, clang/test/Analysis/Scalable/UnsafeBufferUsage/Inputs tu-summary.json tu-summary-bad-ptr-level.json

[ssaf][UnsafeBufferUsage] Add JSON serialization for UnsafeBufferUsage (#187156)

Implemented and registered a JSONFormat::FormatInfo for
UnsafeBufferUsage analysis

rdar://171920065

---------

Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
Delta  File
+123 -0  clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
+108 -0  clang/test/Analysis/Scalable/UnsafeBufferUsage/Inputs/tu-summary.json
+58 -0  clang/test/Analysis/Scalable/UnsafeBufferUsage/Inputs/tu-summary-bad-ptr-level.json
+58 -0  clang/test/Analysis/Scalable/UnsafeBufferUsage/Inputs/tu-summary-no-key.json
+55 -1  clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+53 -0  clang/test/Analysis/Scalable/UnsafeBufferUsage/Inputs/tu-summary-bad-element.json
+455 -1  7 files not shown
+528 -6  13 files

LLVM/project 920e46c llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Address review comments

Created using spr 1.3.6-beta.1
Delta  File
+84,299 -78,378  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293 -29,491  llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,754 -24,794  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,631 -20,343  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,843 -18,635  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,086 -16,499  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+240,906 -188,140  16,074 files not shown
+1,615,762 -879,134  16,080 files

LLVM/project e23a9fa llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
Delta  File
+84,299 -78,378  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293 -29,491  llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,754 -24,794  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,631 -20,343  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,843 -18,635  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,086 -16,499  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+240,906 -188,140  16,073 files not shown
+1,615,734 -879,133  16,079 files

LLVM/project 16a8316 llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AMDGPU urem64.ll udiv64.ll

[DAG] Use known-bits when creating umulh/smulh. (#160916)

This extends the creation of umulh/smulh instructions to handle cases
where one operand is a zext/sext and the other has enough known-zero or
sign bits to create a mulh. This can be useful when one of the operands
is hoisted out of a loop.
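
The arithmetic behind the unsigned case can be checked with plain integers. The sketch below is not the DAGCombiner code; it assumes a 32-bit narrow type and a 64-bit wide type, and the names `umulh` and `wide_mul_shift` are hypothetical. It shows that when the wide operand's upper half is known to be zero, the wide multiply followed by a shift yields the same result as a narrow high multiply.

```python
BITS = 32
MASK = (1 << BITS) - 1
WIDE_MASK = (1 << (2 * BITS)) - 1


def umulh(a: int, b: int) -> int:
    """High BITS bits of the unsigned product of two BITS-bit values."""
    return ((a & MASK) * (b & MASK)) >> BITS


def wide_mul_shift(a_narrow: int, c_wide: int) -> int:
    """Model of trunc(lshr(mul(zext(a), c), BITS)) in 2*BITS-bit arithmetic."""
    product = ((a_narrow & MASK) * (c_wide & WIDE_MASK)) & WIDE_MASK
    return (product >> BITS) & MASK


a = 0xDEADBEEF

# Upper 32 bits of c are zero, so the two forms agree.
c = 0x12345678
same = wide_mul_shift(a, c) == umulh(a, c)

# Once a bit is set in the upper half, the rewrite would no longer be valid.
c_bad = (1 << 32) | 5
differs = wide_mul_shift(a, c_bad) != umulh(a, c_bad)
```

The signed case follows the same idea with sign bits in place of known-zero bits.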
Delta  File
+115 -76  llvm/test/CodeGen/X86/combine-pmuldq.ll
+32 -112  llvm/test/CodeGen/Thumb2/mve-vmulh.ll
+13 -10  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+8 -8  llvm/test/CodeGen/AMDGPU/urem64.ll
+4 -4  llvm/test/CodeGen/AMDGPU/udiv64.ll
+2 -2  llvm/test/CodeGen/AMDGPU/sdiv64.ll
+174 -212  6 files

LLVM/project c6694e5 llvm/include/llvm/Transforms/Utils Cloning.h, llvm/lib/Transforms/IPO Inliner.cpp

Revert "[Inliner] Put inline history into IR as !inline_history metadata" (#190824)

Reverts llvm/llvm-project#190700

Causes timeouts:
https://github.com/llvm/llvm-project/pull/190700#issuecomment-4198496978
Delta  File
+0 -102  llvm/test/Transforms/Inline/inline-history.ll
+28 -57  llvm/lib/Transforms/Utils/InlineFunction.cpp
+36 -25  llvm/lib/Transforms/IPO/Inliner.cpp
+0 -61  llvm/test/Verifier/inline-history-metadata.ll
+26 -25  llvm/lib/Transforms/Utils/CloneFunction.cpp
+17 -19  llvm/include/llvm/Transforms/Utils/Cloning.h
+107 -289  13 files not shown
+213 -394  19 files

LLVM/project 3dfa021 lldb/source/API SBListener.cpp, lldb/source/Core Debugger.cpp

[lldb][NFC] Stop using ConstStrings with BroadcastEventSpec (#190660)

BroadcastEventSpec owns the broadcaster class it's configured to listen
for. Broadcasters usually advertise their broadcast class name with
StringRefs so there's no need to put them in the string pool.

The only exception here is SBListener. There are 2 methods that take
`const char *` values. However, that's handled when converting them to
StringRefs.
Delta  File
+5 -4  lldb/source/Core/Debugger.cpp
+2 -2  lldb/source/API/SBListener.cpp
+7 -6  2 files

LLVM/project 1d6ad6e clang/include/clang/CIR/Dialect/Builder CIRBaseBuilder.h, clang/test/CIR/CodeGen pointer-to-member-func.cpp

[CIR] Implement 'zero attr' creation of method (#190819)

This appears quite a bit in some benchmarks, and is seemingly something
we missed at one point. This patch just implements a 'zero-init' of a
pmf.
Delta  File
+8 -0  clang/test/CIR/CodeGen/pointer-to-member-func.cpp
+2 -0  clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+10 -0  2 files

LLVM/project 6398181 lldb/source/Plugins/Instruction/ARM EmulateInstructionARM.cpp, lldb/unittests/Instruction CMakeLists.txt

[lldb] Fix ARM STR T1 encoding using subtract instead of add, add test (#188614)

The STR Thumb T1 encoding had add=false instead of add=true, causing the
emulator to compute the store address as Rn - imm rather than Rn + imm.
This contradicts the ARM spec comment directly above.

Add a unit test that verifies the STR T1 encoding stores to the correct
address (base + offset).
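
The difference between the two behaviours is a single sign. The sketch below models the address computation for this encoding, assuming the ARM pseudocode form `imm32 = ZeroExtend(imm5:'00', 32)`; the function name `str_t1_address` is hypothetical and this is not the lldb emulator code.

```python
def str_t1_address(rn: int, imm5: int, add: bool = True) -> int:
    """Model the store address for STR (immediate, Thumb) encoding T1."""
    # The 5-bit immediate is scaled by 4: imm32 = ZeroExtend(imm5:'00', 32).
    imm32 = (imm5 & 0x1F) << 2
    # add=True gives Rn + imm32 (the fixed behaviour); add=False gives
    # Rn - imm32 (the behaviour the commit describes as the bug).
    offset_addr = rn + imm32 if add else rn - imm32
    return offset_addr & 0xFFFFFFFF


correct = str_t1_address(0x1000, 4)             # base + 16
buggy = str_t1_address(0x1000, 4, add=False)    # base - 16
```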
Delta  File
+136 -0  lldb/unittests/Instruction/ARM/TestARMEmulator.cpp
+3 -1  lldb/unittests/Instruction/CMakeLists.txt
+1 -1  lldb/source/Plugins/Instruction/ARM/EmulateInstructionARM.cpp
+140 -2  3 files

LLVM/project a974f0a llvm/utils/llvm-testing-tools pyproject.toml README.md, llvm/utils/llvm-testing-tools/src/llvm_testing_tools wrapper.py __init__.py

[TestingTools] Add new llvm-testing-tools package (#188888)

This allows for packaging split-file and FileCheck for distribution on
PyPI which will support libc++ wanting to use FileCheck/split-file for
more thorough testing.
Delta  File
+21 -0  llvm/utils/llvm-testing-tools/pyproject.toml
+16 -0  llvm/utils/llvm-testing-tools/src/llvm_testing_tools/wrapper.py
+6 -0  llvm/utils/llvm-testing-tools/README.md
+0 -0  llvm/utils/llvm-testing-tools/src/llvm_testing_tools/__init__.py
+43 -0  4 files

LLVM/project d243d55 flang/test/Lower/OpenMP taskloop.f90, mlir/include/mlir/Dialect/OpenMP OpenMPOps.td

[mlir][OpenMP] Rename omp.taskloop to omp.taskloop.wrapper (#188071)

Rename the loop wrapper operation to better distinguish it from the
context op (omp.taskloop.context), which handles outlining and runtime
calls. The new name makes the role of each operation clearer at a
glance.

RFC:
https://discourse.llvm.org/t/rfc-openmp-alloca-placement-for-openmp-loop-wrappers/89512/7

Patch 3/3

Assisted-by: Copilot, Claude Sonnet 4.6
Delta  File
+37 -37  mlir/test/Dialect/OpenMP/ops.mlir
+21 -21  mlir/test/Dialect/OpenMP/invalid.mlir
+21 -18  mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+13 -12  mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+10 -10  flang/test/Lower/OpenMP/taskloop.f90
+9 -9  mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+111 -107  33 files not shown
+185 -181  39 files

LLVM/project eb99187 lldb/test/Shell/Platform/AutoLoad/Darwin dsym-auto-load-modules-multiple.test

[lldb][test] Fix dsym-auto-load-modules-multiple.test (#190826)

We were compiling without debug-info causing the test to fail on macOS.
This was a silly oversight because I was mainly working on Linux when
working on the last iterations of the patch that added this test.
Delta  File
+5 -5  lldb/test/Shell/Platform/AutoLoad/Darwin/dsym-auto-load-modules-multiple.test
+5 -5  1 files