LLVM/project 04ca658clang/include/clang/Basic DiagnosticGroups.td

Reorganise-DiagGroups
DeltaFile
+3-4clang/include/clang/Basic/DiagnosticGroups.td
+3-41 files

LLVM/project 680a990llvm/lib/Transforms/Vectorize VPlanSLP.cpp VPlanSLP.h, llvm/unittests/Transforms/Vectorize VPlanSlpTest.cpp CMakeLists.txt

[VPlanSLP] Strip stub (#192635)

VPlanSLP hasn't seen much progress since it was checked in 7 years ago,
and it is unclear if there ever will be any progress. Strip it from the
tree to avoid confusion.
DeltaFile
+0-896llvm/unittests/Transforms/Vectorize/VPlanSlpTest.cpp
+0-528llvm/lib/Transforms/Vectorize/VPlanSLP.cpp
+0-145llvm/lib/Transforms/Vectorize/VPlanSLP.h
+0-8llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+0-5llvm/lib/Transforms/Vectorize/VPlan.h
+0-1llvm/unittests/Transforms/Vectorize/CMakeLists.txt
+0-1,5831 files not shown
+0-1,5847 files

LLVM/project 4975ad9llvm/lib/CodeGen/GlobalISel GISelValueTracking.cpp, llvm/test/CodeGen/AArch64/GlobalISel knownbits-urem.mir

[GlobalISel][KnownBits] Use KnownBits::urem for G_UREM (#193455)

This updates the implementation of G_UREM in GlobalISel to use
KnownBits::urem instead of reimplementing the logic.
Supersedes #189087.
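
A rough Python model of the kind of reasoning a shared urem helper performs. It does not reproduce `KnownBits::urem`. It only encodes two facts that any sound implementation can use: a remainder by a fully known power of two keeps the low bits of the dividend, and otherwise the remainder is no larger than the dividend and strictly smaller than a non-zero divisor. `Known` and `urem_known` are hypothetical names for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Known:
    """Known-bits fact for a fixed-width unsigned value (hypothetical model)."""
    zero: int   # mask of bits known to be 0
    one: int    # mask of bits known to be 1
    width: int

    def max_value(self) -> int:
        # Largest value consistent with the facts: every bit not known-zero is 1.
        return ~self.zero & ((1 << self.width) - 1)

    def leading_zeros(self) -> int:
        # Number of high bits guaranteed to be 0.
        return self.width - self.max_value().bit_length()


def urem_known(lhs: Known, rhs: Known) -> Known:
    """Known bits of `lhs urem rhs`, assuming both have the same width."""
    mask = (1 << lhs.width) - 1
    fully_known = (rhs.zero | rhs.one) == mask
    is_pow2 = rhs.one != 0 and (rhs.one & (rhs.one - 1)) == 0
    if fully_known and is_pow2:
        # x urem 2^k == x & (2^k - 1): low bits come from lhs, the rest are 0.
        low = rhs.one - 1
        return Known(zero=(lhs.zero & low) | (mask & ~low),
                     one=lhs.one & low, width=lhs.width)
    # Otherwise the result is <= lhs and < rhs, so it has at least as many
    # leading zeros as either operand.
    lz = max(lhs.leading_zeros(), rhs.leading_zeros())
    high = mask & ~((1 << (lhs.width - lz)) - 1)
    return Known(zero=high, one=0, width=lhs.width)
```

For example, with an unknown 8-bit dividend and the constant divisor 8, the model reports the top five bits of the result as known zero.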
DeltaFile
+101-0llvm/test/CodeGen/AArch64/GlobalISel/knownbits-urem.mir
+12-0llvm/lib/CodeGen/GlobalISel/GISelValueTracking.cpp
+113-02 files

LLVM/project 44753d8mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/test/Dialect/LLVMIR invalid.mlir

[MLIR][NVVM] SpecialRegister&PureSpecialRegister takes result type  (#195030)

Use concrete `I32` (default) and `I64` (clock64, globaltimer) instead of
generic `LLVM_Type` for special-register op results. The dialect
verifier now rejects mismatches up-front, and the Python op-binding
generator emits the inferred-result form, so callers can write
`nvvm.ThreadIdXOp()` with no arguments. Strict tightening: no valid
existing IR is rejected.
DeltaFile
+16-0mlir/test/Dialect/LLVMIR/invalid.mlir
+13-0mlir/test/python/dialects/nvvm.py
+6-3mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+35-33 files

LLVM/project 875d2c9llvm/lib/Transforms/Vectorize LoopVectorize.cpp LoopVectorizationPlanner.h

[LV][NFC] Factor out MinBWs of values from the cost model (#194492)

Move MinBWs out of the CM to the planner, as it doesn't depend on the
CM.
DeltaFile
+29-36llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+19-3llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+4-0llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.cpp
+52-393 files

LLVM/project ebd677bllvm/test/Transforms/LoopVectorize first-order-recurrence.ll reduction-inloop-uf4.ll

[LV] Re-generate check lines with UTC version 6. (NFC) (#195061)

The checks in the re-generated files cover chains of if.pred blocks, which
are prone to cascading renames. Re-generate with version 6 to avoid
unnecessary test changes due to renumbering.
DeltaFile
+1,424-1,376llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll
+234-230llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll
+1,658-1,6062 files

LLVM/project 6593f9dllvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp, llvm/test/CodeGen/SPIRV/pointers store-operand-ptr-to-struct.ll

[SPIR-V] Recover aggregate type for stores of undef/composite constants (#195003)

preprocessUndefs/preprocessCompositeConstants lower aggregate values to
spv_undef/spv_const_composite calls returning i32, stashing the original
type in AggrConstTypes
DeltaFile
+28-2llvm/test/CodeGen/SPIRV/pointers/store-operand-ptr-to-struct.ll
+18-1llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+46-32 files

LLVM/project 98e26bcllvm/lib/Target/AArch64 AArch64ConditionOptimizer.cpp, llvm/test/CodeGen/AArch64 aarch64-condopt-chaining.mir

[AArch64] ConditionOptimizer: replace intra-block scan with map-based algorithm (#190455)

The previous condopt implementation found the first two CSINC
instructions in a block and attempted one optimisation, ignoring other
possible pairs. It also performed extra forward and backward walks.

Replace the two-CSINC scan with a single forward walk maintaining a
DenseMap keyed by canonical (copy-traced) register. Any number of pairs
per block are now handled.
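
A toy Python model of the single-walk idea, for illustration only. The tuple encoding of instructions, the `find_pairs` name, and the rule that a second compare on the same register forms a pair are all assumptions of this sketch. The real pass works on MachineInstrs and applies legality checks that are not modelled here.

```python
def find_pairs(block):
    """Return index pairs of compares that test the same underlying register.

    `block` is a list of (opcode, dst, src) tuples; CMP entries use dst=None.
    """
    canonical = {}   # register -> register it was (transitively) copied from
    last_cmp = {}    # canonical register -> index of the latest unpaired compare
    pairs = []
    for idx, (opcode, dst, src) in enumerate(block):
        if opcode == "COPY":
            # Trace through copies so that aliases share one map key.
            canonical[dst] = canonical.get(src, src)
        elif opcode == "CMP":
            root = canonical.get(src, src)
            if root in last_cmp:
                pairs.append((last_cmp.pop(root), idx))
            else:
                last_cmp[root] = idx
    return pairs
```

Because the map holds one pending entry per canonical register, a single pass can report several pairs in the same block instead of stopping after the first.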
DeltaFile
+277-0llvm/test/CodeGen/AArch64/aarch64-condopt-chaining.mir
+76-71llvm/lib/Target/AArch64/AArch64ConditionOptimizer.cpp
+353-712 files

LLVM/project 6ccbdc3clang/unittests/ScalableStaticAnalysisFramework FindDecl.h ASTEntityMappingTest.cpp, clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsageTest.cpp

[clang][ssaf][NFC] Hoist findFnByName and findDeclByName (#195056)

Split from #194448

This was already approved in

https://github.com/llvm/llvm-project/pull/194448#pullrequestreview-4201251523
DeltaFile
+49-0clang/unittests/ScalableStaticAnalysisFramework/FindDecl.h
+1-31clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+10-21clang/unittests/ScalableStaticAnalysisFramework/ASTEntityMappingTest.cpp
+60-523 files

LLVM/project 1862fd2clang/unittests/Support TimeProfilerTest.cpp

[clang][unittests] Fix flaky PerformPendingInstantiations nesting in TimeProfilerTest (#193717)

buildTraceGraph already compensates for timer rounding that makes
PerformPendingInstantiations appear to be inside the previous event, but
only when it is nested exactly one level deep. The aarch64-darwin
buildbot produced three-level nesting for ConstantEvaluationC99, which
slipped through the normalization and broke the expected trace output.

Keep popping while PerformPendingInstantiations looks nested—we know it
is always a top-level event in these tests—instead of stopping at the
single-level case.
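
A minimal Python sketch of the normalization described above, not the test's actual `buildTraceGraph`. The event tuples, the `build_depths` name, and the depth-only output are assumptions made for illustration.

```python
def build_depths(events, always_top_level=frozenset()):
    """Assign a nesting depth to each event.

    `events` is a list of (name, start, end) tuples sorted by start time.
    Names in `always_top_level` are forced to depth 0.
    """
    stack = []   # end times of the events that are currently open
    out = []
    for name, start, end in events:
        # Normal rule: close every event that ended before this one starts.
        while stack and stack[-1] <= start:
            stack.pop()
        # Compensation for timer rounding: an event known to be top level
        # keeps popping apparent parents, however deep the nesting looks.
        if name in always_top_level:
            while stack:
                stack.pop()
        out.append((name, len(stack)))
        stack.append(end)
    return out
```

With three open events that all end at time 50 and a `PerformPendingInstantiations` event whose rounded start is 49, the uncompensated walk places it three levels deep, while the compensated walk places it at the top level.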

Followup to https://github.com/llvm/llvm-project/pull/138613.
DeltaFile
+3-3clang/unittests/Support/TimeProfilerTest.cpp
+3-31 files

LLVM/project 4fa68d7llvm/lib/DWP DWP.cpp, llvm/test/CodeGen/AArch64 arm64-extract-insert-varidx.ll

Merge branch 'main' into users/kparzysz/dsa-conflicts
DeltaFile
+945-7llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.ll
+263-261llvm/test/CodeGen/AArch64/arm64-extract-insert-varidx.ll
+343-111llvm/lib/DWP/DWP.cpp
+424-0llvm/test/CodeGen/LoongArch/lasx/vec-zext.ll
+423-0llvm/test/CodeGen/LoongArch/lasx/vec-sext.ll
+290-124mlir/lib/Bindings/Python/IRCore.cpp
+2,688-503271 files not shown
+11,421-2,769277 files

LLVM/project d13c658llvm/test/Transforms/LoopVectorize vectorize-once.ll if-pred-not-when-safe.ll

[LV][NFC] Remove unused -simplifycfg-*** option from tests (#195044)

The -simplifycfg-require-and-preserve-domtree=1 option used in two tests
had no effect.
DeltaFile
+1-1llvm/test/Transforms/LoopVectorize/vectorize-once.ll
+1-1llvm/test/Transforms/LoopVectorize/if-pred-not-when-safe.ll
+2-22 files

LLVM/project 2f96457flang/lib/Semantics check-directive-structure.h

Remove loopIV from DirectiveContext
DeltaFile
+0-4flang/lib/Semantics/check-directive-structure.h
+0-41 files

LLVM/project 2dbff02llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange phi-to-phi.ll

[LoopInterchange] Fix handling of PHI which refers to another PHI (#194364)

In the transformation phase, at first LoopInterchange moves several
instructions in the inner loop into the new latch block. The
instructions used as incoming values to the induction variables from the
latch block are the targets of this movement. Previously, this process
could result in an infinite loop when a PHI node refers to another PHI
node, as in the following example:

```
%i = phi i64 [ 0, %entry ], [ %i.inc, %latch ]
%j = phi i64 [ 0, %entry ], [ %i, %latch ]
```

The root cause was that `%i` was enqueued for processing because it is used
by `%j`.
This patch fixes the issue by preventing induction variables from being
enqueued into the movement list.
Fix #193733
DeltaFile
+62-0llvm/test/Transforms/LoopInterchange/phi-to-phi.ll
+7-7llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+69-72 files

LLVM/project d50dd86llvm/lib/Target/AMDGPU VOP2Instructions.td, llvm/test/CodeGen/AMDGPU strict_fmul.f64.ll

AMDGPU/GlobalISel: Fix G_STRICT_FMUL f64 selection on GFX12 (#195050)
DeltaFile
+91-0llvm/test/CodeGen/AMDGPU/strict_fmul.f64.ll
+1-1llvm/lib/Target/AMDGPU/VOP2Instructions.td
+92-12 files

LLVM/project dd99506clang/lib/CIR/CodeGen CIRGenClass.cpp, clang/test/CIR/CodeGenCXX virtual-base-cast.cpp

[CIR] Combined virtual + non-virtual base offset (#192617)

Refer: https://github.com/llvm/llvm-project/issues/192330

---------

Co-authored-by: Zile Xiong <xiongzile99 at gmail.com>
DeltaFile
+143-0clang/test/CIR/CodeGenCXX/virtual-base-cast.cpp
+8-4clang/lib/CIR/CodeGen/CIRGenClass.cpp
+151-42 files

LLVM/project 08bafa3llvm/test/CodeGen/SPIRV ctor-dtor-lowering.ll, llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_float_controls2 exec_mode_float_control_intel.ll

[NFC][SPIR-V] Re-enable spirv-val on tests that are passing validation (#195022)
DeltaFile
+1-4llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fp_no_return.ll
+1-3llvm/test/CodeGen/SPIRV/pointers/nested-struct-opaque-pointers.ll
+1-2llvm/test/CodeGen/SPIRV/ctor-dtor-lowering.ll
+1-2llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_masked_gather_scatter/vector-of-pointers-ptrtoint.ll
+1-2llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_masked_gather_scatter/masked-gather-scatter.ll
+1-1llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_float_controls2/exec_mode_float_control_intel.ll
+6-149 files not shown
+15-2315 files

LLVM/project 312d882llvm/include/llvm/ExecutionEngine/JITLink COFF.h, llvm/lib/ExecutionEngine/JITLink COFF.cpp COFFLinkGraphBuilder.cpp

[JITLink][COFF] Move GetImageBaseSymbol utility into public header. (#195041)

This utility may be useful for people writing
LinkGraphLinkingLayer::Plugins for COFF LinkGraphs, so this commit moves
it to a public header where it can easily be reused
(llvm/ExecutionEngine/JITLink/COFF.h).

Also adds unit tests for the utility.
DeltaFile
+109-0llvm/unittests/ExecutionEngine/JITLink/COFFLinkGraphTests.cpp
+19-0llvm/include/llvm/ExecutionEngine/JITLink/COFF.h
+18-0llvm/lib/ExecutionEngine/JITLink/COFF.cpp
+0-18llvm/lib/ExecutionEngine/JITLink/COFFLinkGraphBuilder.cpp
+0-12llvm/lib/ExecutionEngine/JITLink/COFFLinkGraphBuilder.h
+1-0llvm/lib/ExecutionEngine/JITLink/COFF_x86_64.cpp
+147-301 files not shown
+148-307 files

LLVM/project f7e133d.ci all_requirements.txt, lldb/test CMakeLists.txt requirements.txt

[lldb] Add Python cryptography package as new test dependency (#192434)

HTTPS tests for SymbolLocatorSymStore need the Python cryptography package.
DeltaFile
+5-2.ci/all_requirements.txt
+1-1lldb/test/CMakeLists.txt
+2-0lldb/test/requirements.txt
+8-33 files

LLVM/project cae34e9llvm/lib/Target/AMDGPU VOP2Instructions.td, llvm/test/CodeGen/AMDGPU strict_fmul.f64.ll

AMDGPU/GlobalISel: Fix G_STRICT_FMUL f64 selection on GFX12
DeltaFile
+91-0llvm/test/CodeGen/AMDGPU/strict_fmul.f64.ll
+1-1llvm/lib/Target/AMDGPU/VOP2Instructions.td
+92-12 files

LLVM/project a8eb65allvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp

[AMDGPU][NFC] Use LaneMaskConstants for waterfall loops in AMDGPURegBankLegalizeHelper (#190792)

Use `LaneMaskConstants` for generating waterfall loops in
`AMDGPURegBankLegalizeHelper`.
No Functionality Change.
DeltaFile
+9-18llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+9-181 files

LLVM/project f7d4032llvm/lib/ObjCopy ConfigManager.cpp

[ObjCopy] Reject compress-debug-sections for non-ELF (#191314)

`--compress-debug-sections` is currently an ELF-only option in the
[docs](https://llvm.org/docs/CommandGuide/llvm-objcopy.html#cmdoption-llvm-objcopy-compress-debug-sections)
but in `llvm-objcopy`, non-ELF backends were silently ignoring it, while
`--decompress-debug-sections` already
[reports](https://github.com/llvm/llvm-project/blob/89446086eaed6f07e2c122396570f2985cec62e5/llvm/lib/ObjCopy/ConfigManager.cpp#L32)
an unsupported-option error. This PR makes the behavior consistent by treating
`--compress-debug-sections` and `--compress-sections` as unsupported for non-ELF formats too.
DeltaFile
+17-7llvm/lib/ObjCopy/ConfigManager.cpp
+17-71 files

LLVM/project f966490llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU minmax3-tree-reduction.ll vector-reduce-smin.ll

[AMDGPU] Extend max3/min3 tree-reduction combine to cover ternary chains (#194845)

The tree-reduction combine for min/max currently triggers only on shapes
where both children of a node are same-opcode. This patch extends it to also
recognize cases where only one child is same-opcode and one-use, such as
max(max(a, b), c) feeding another max.

For example, with R = max(max(A, B), C) where A, B, and C are each
ternary chains of the form max(max(x, y), z), the current predicate does
not recognize the ternary-chain interiors as still combinable, so the
higher-level rules fire eagerly and produce max3 nodes with 2-op maxes
inside them. With the extended predicate, each ternary chain is allowed
to fold into a max3 first, after which the higher levels reduce cleanly
without leaving stranded 2-op maxes behind.
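
The intended end state can be sketched on a plain expression tree in Python. This is only an illustration: it folds bottom-up on nested tuples and does not model the DAG combiner's visitation order, its one-use checks, or the actual predicate in SIISelLowering.cpp. `fold_max3` and `count_ops` are hypothetical helpers.

```python
def fold_max3(node):
    """Fold max(max(a, b), c) into max3(a, b, c), innermost chains first."""
    if isinstance(node, str):          # leaf value
        return node
    op, *args = node
    args = [fold_max3(a) for a in args]
    if op == "max" and len(args) == 2:
        lhs, rhs = args
        if isinstance(lhs, tuple) and lhs[0] == "max":
            return ("max3", lhs[1], lhs[2], rhs)
        if isinstance(rhs, tuple) and rhs[0] == "max":
            return ("max3", lhs, rhs[1], rhs[2])
    return (op, *args)


def count_ops(node, counts=None):
    """Count operation nodes by opcode in a nested-tuple expression."""
    counts = {} if counts is None else counts
    if isinstance(node, tuple):
        counts[node[0]] = counts.get(node[0], 0) + 1
        for child in node[1:]:
            count_ops(child, counts)
    return counts
```

Applied to R = max(max(A, B), C) where A, B, and C are each max(max(x, y), z), the fold yields four max3 nodes and no remaining 2-op max.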

Adds six regression tests covering a 2-level ternary chain, a mixed
ternary+binary shape and vector examples.

Fix: LCOMPILER-2166
DeltaFile
+244-26llvm/test/CodeGen/AMDGPU/minmax3-tree-reduction.ll
+28-28llvm/test/CodeGen/AMDGPU/vector-reduce-smin.ll
+28-28llvm/test/CodeGen/AMDGPU/vector-reduce-smax.ll
+28-28llvm/test/CodeGen/AMDGPU/vector-reduce-umax.ll
+28-28llvm/test/CodeGen/AMDGPU/vector-reduce-umin.ll
+13-10llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+369-1486 files

LLVM/project 64c70c1llvm/lib/Target/AMDGPU AMDGPULowerVGPREncoding.cpp, llvm/test/CodeGen/AMDGPU vgpr-setreg-mode-swar.mir hazard-setreg-vgpr-msb-gfx1250.mir

[AMDGPU] Refactor setreg handling in the VGPR MSB lowering

The lowering can skip inserting S_SET_VGPR_MSB if the mode is set via
piggybacking. This now relies on the HW bug for correct behavior; if the
bug is ever fixed, the lowering will be incorrect.

SETREG is not a piggybacking target anymore. Instead piggybacking is
disabled if we have seen a SETREG since the last mode change.
DeltaFile
+117-48llvm/test/CodeGen/AMDGPU/vgpr-setreg-mode-swar.mir
+14-34llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
+9-3llvm/test/CodeGen/AMDGPU/hazard-setreg-vgpr-msb-gfx1250.mir
+140-853 files

LLVM/project b2dda8cclang/lib/Analysis/LifetimeSafety FactsGenerator.cpp, clang/test/Sema warn-lifetime-safety.cpp

[LifetimeSafety] Add placement new support (#194030)

Allows flow from placement new closely resembling standard library form.

Comes as part of the completion of #164963.
DeltaFile
+203-11clang/test/Sema/warn-lifetime-safety.cpp
+43-6clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+246-172 files

LLVM/project a94ad60llvm/lib/Target/AMDGPU AMDGPULowerVGPREncoding.cpp, llvm/test/CodeGen/AMDGPU vgpr-setreg-mode-swar.mir

[AMDGPU] Preserve old MSBs when handling SETREG (#191352)
DeltaFile
+1-1llvm/test/CodeGen/AMDGPU/vgpr-setreg-mode-swar.mir
+1-1llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
+2-22 files

LLVM/project 780a182llvm/lib/Target/LoongArch LoongArchLASXInstrInfo.td LoongArchLSXInstrInfo.td, llvm/test/CodeGen/LoongArch/lasx bitsel.ll

[LoongArch] Add patterns for vector bitwise selection (#193753)

Add instruction selection patterns for VBITSEL_V/XVBITSEL_V and
VBITSELI_B/XVBITSELI_B to match the canonical bitwise select idiom:

`(a & b) | (~a & c)`

This enables the backend to generate dedicated bitwise select
instructions instead of separate AND/ANDN/OR sequences.
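
A small Python check of the idiom itself: each result bit comes from `b` where the corresponding bit of `a` is set and from `c` where it is clear. The function names and the fixed `width` parameter are assumptions of this sketch. It says nothing about the operand order of the LoongArch instructions.

```python
def bitsel(a: int, b: int, c: int, width: int = 8) -> int:
    """Bitwise select: (a & b) | (~a & c), truncated to `width` bits."""
    mask = (1 << width) - 1
    return ((a & b) | (~a & c)) & mask


def bitsel_reference(a: int, b: int, c: int, width: int = 8) -> int:
    """Per-bit reference: pick the bit of b when a's bit is 1, else c's bit."""
    result = 0
    for i in range(width):
        source = b if (a >> i) & 1 else c
        result |= ((source >> i) & 1) << i
    return result
```

For instance, `bitsel(0xF0, 0xAA, 0x55)` takes the high nibble from `0xAA` and the low nibble from `0x55`, giving `0xA5`.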
DeltaFile
+5-15llvm/test/CodeGen/LoongArch/lasx/bitsel.ll
+5-15llvm/test/CodeGen/LoongArch/lsx/bitsel.ll
+11-0llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+11-0llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+32-304 files

LLVM/project 63ee54alibcxx/include print, libcxx/include/__ostream print.h

[libc++] Refactor std::print to allow for constant folding of the format part (#185459)

```
---------------------------------------------------------
Benchmark                             old             new
---------------------------------------------------------
std::print("Hello, World!")       43.6 ns         9.88 ns
```
DeltaFile
+0-113libcxx/test/libcxx/input.output/iostream.format/print.fun/vprint_unicode_windows.pass.cpp
+111-0libcxx/test/libcxx/input.output/iostream.format/print.fun/output_unicode_windows.pass.cpp
+35-46libcxx/include/print
+39-0libcxx/test/benchmarks/format/print.bench.cpp
+4-5libcxx/include/__ostream/print.h
+189-1645 files

LLVM/project f32da7bmlir/include/mlir/Dialect/SPIRV/IR SPIRVTosaTypes.td SPIRVTosaOps.td, mlir/lib/Dialect/SPIRV/IR SPIRVTosaOps.cpp

[mlir][spirv] Tighten SPIR-V TOSA convolution verification (#194592)

Add verifier coverage for SPIR-V TOSA convolution ops against the TOSA
shape and type constraints.

This adds shared TableGen shape predicates for Conv2D, Conv3D,
DepthwiseConv2D and TransposeConv2D, including batch/channel/bias
relationships. It also constrains integer convolution weights so i8 and
i16 inputs use i8 weights, matching the SPIR-V TOSA representation.

Add custom verifiers for the convolution output shape formulas,
including stride divisibility for regular convolutions and out_pad
bounds for TransposeConv2D. Tighten pad, stride and dilation attributes
to use non-negative or positive i32 attribute constraints where
required.
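
For orientation, the per-dimension output shape rules can be sketched in Python as follows. The formulas are the ones I understand the TOSA specification to give for regular and transposed convolutions. The function names, the argument layout, and the exact choice of which attributes must be positive versus non-negative are assumptions of this sketch, not the verifier's code.

```python
def conv_output_dim(in_size, kernel, pad_before, pad_after, *, stride, dilation):
    """Output size along one spatial dimension of a regular convolution."""
    if stride <= 0 or dilation <= 0:
        raise ValueError("stride and dilation must be positive")
    if pad_before < 0 or pad_after < 0:
        raise ValueError("padding must be non-negative")
    numerator = in_size - 1 + pad_before + pad_after - (kernel - 1) * dilation
    # Stride divisibility: the division must be exact.
    if numerator % stride != 0:
        raise ValueError("stride does not evenly divide the padded input extent")
    return numerator // stride + 1


def transpose_conv_output_dim(in_size, kernel, out_pad_before, out_pad_after, *, stride):
    """Output size along one spatial dimension of a transposed convolution."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    # out_pad may be negative, but must stay above -kernel.
    if out_pad_before <= -kernel or out_pad_after <= -kernel:
        raise ValueError("out_pad must be greater than -kernel")
    return (in_size - 1) * stride + out_pad_before + out_pad_after + kernel
```

Under these assumptions, a 3-wide kernel with padding 1 on each side and stride 1 preserves an input size of 32, while the same setup with stride 2 is rejected because 31 is not divisible by 2.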

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
DeltaFile
+228-60mlir/test/Dialect/SPIRV/IR/tosa-ops-verification.mlir
+194-0mlir/lib/Dialect/SPIRV/IR/SPIRVTosaOps.cpp
+78-10mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTosaTypes.td
+34-20mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTosaOps.td
+534-904 files

LLVM/project 39e30ccllvm/include/llvm/Support KnownFPClass.h, llvm/lib/Support KnownFPClass.cpp

[KnownFPClass] Refactor fmul & fdiv (NFC) (#191651)

- Remove the redundant `0 * Inf -> NaN` check and a NaN setting.

  The reason:
  ```rust
  1. Inf * Y   ->  {Inf, NaN}
  2.   0 * Y   ->  {Zero, NaN}
  3.   0 * Inf ->  NaN
  ```
But after `1.` and `2.` we already have `{Inf, NaN} ∩ {Zero, NaN} -> NaN`,
so `3.` is redundant and
[can be removed](https://github.com/llvm/llvm-project/pull/191651/changes#diff-eacbd8c8620db92a00453be14b3b433c618946ad7b57c10b039437641e9777c4L389):
```diff
   // +/-0 * +/-inf = nan
-  if ((KnownLHS.isKnownAlways(fcZero | fcNan) &&
-       KnownRHS.isKnownAlways(fcInf | fcNan)) ||
-      (KnownLHS.isKnownAlways(fcInf | fcNan) &&
```

    [4 lines not shown]
DeltaFile
+12-34llvm/lib/Support/KnownFPClass.cpp
+13-0llvm/include/llvm/Support/KnownFPClass.h
+25-342 files