LLVM/project bf2a97allvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp, llvm/test/Transforms/InstCombine/AMDGPU mbcnt.ll llvm.amdgcn.wave.shuffle.ll

AMDGPU: Add range attribute to mbcnt intrinsic callsites (#189191)

It seems the known bits handling added in
686987a540bc176bceaad43ffe530cb3e88796d5
is insufficient to perform many range-based optimizations. For some
reason, computeConstantRange doesn't fall back on KnownBits, and has a
separate, less-used form which tries to use computeKnownBits.
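
As a rough illustration of why known bits can be weaker than an explicit range, the sketch below compares the two for a value assumed to lie in [0, 32]. The bound of 32 and both helper names are assumptions chosen for the example; this is not the code in AMDGPUInstCombineIntrinsic.cpp.

```python
def known_bits_upper_bound(max_value: int) -> int:
    """Largest value still allowed when all we know is that every bit
    above the top set bit of max_value is zero."""
    return (1 << max_value.bit_length()) - 1


def can_fold_ult(upper_bound: int, constant: int) -> bool:
    """`x u< constant` folds to true only if every possible x is below constant."""
    return upper_bound < constant


ASSUMED_MAX = 32  # hypothetical inclusive maximum, for illustration only

# Known bits alone admit values up to 63, so `x u< 33` cannot be folded.
print(can_fold_ult(known_bits_upper_bound(ASSUMED_MAX), 33))
# An explicit range with maximum 32 proves the comparison.
print(can_fold_ult(ASSUMED_MAX, 33))
```

Any bound that is not one less than a power of two loses precision in the same way when expressed only as known-zero high bits.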
DeltaFile
+236-15llvm/test/Transforms/InstCombine/AMDGPU/mbcnt.ll
+22-22llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+22-2llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+1-1llvm/test/Transforms/InstCombine/AMDGPU/canonicalize-add-to-gep.ll
+281-404 files

LLVM/project 297a70cclang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGen global-decomp-decls.cpp

[CIR] Implement global decomposition declarations (#190364)

No real challenge to these, it is effectively a copy/paste of the
classic codegen as it just requires we properly emit the holding
variable. The rest falls out of the rest of our handling of variables.
DeltaFile
+114-0clang/test/CIR/CodeGen/global-decomp-decls.cpp
+5-6clang/lib/CIR/CodeGen/CIRGenModule.cpp
+119-62 files

LLVM/project c4281fdllvm/include/llvm/Support KnownFPClass.h, llvm/lib/Support KnownFPClass.cpp

[Support][ValueTraking] Improve KnownFPClass for fadd. Handle infinity signs (#190559)

Improve KnownFPClass reasoning for fadd:

- Refine NaN handling for infinities by checking opposite-sign cases:
  - `-inf` + `+inf` --> `nan`
  - `+inf` + `-inf` --> `nan`
  - `+inf` + `+inf` --> `+inf`
  - `-inf` + `-inf` --> `-inf`
- Introduce `cannotBeOrderedLessEqZero` as pair to
`cannotBeOrderedGreaterEqZero`.
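
The four infinity cases above follow from IEEE-754 addition and can be checked with a short Python sketch. It only illustrates the arithmetic; the helper name `fadd_inf_class` is hypothetical and is not part of KnownFPClass.

```python
import math


def fadd_inf_class(lhs: float, rhs: float) -> str:
    """Classify lhs + rhs as 'nan', '+inf', '-inf', or 'finite'."""
    result = lhs + rhs
    if math.isnan(result):
        return "nan"
    if math.isinf(result):
        return "+inf" if result > 0 else "-inf"
    return "finite"


# Only opposite-sign infinities produce NaN; same-sign pairs keep their sign.
for lhs in (math.inf, -math.inf):
    for rhs in (math.inf, -math.inf):
        print(f"{lhs:+} + {rhs:+} --> {fadd_inf_class(lhs, rhs)}")
```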
DeltaFile
+44-0llvm/test/Transforms/Attributor/nofpclass.ll
+11-0llvm/include/llvm/Support/KnownFPClass.h
+4-3llvm/lib/Support/KnownFPClass.cpp
+1-4llvm/test/Transforms/InstSimplify/known-never-nan.ll
+2-2llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+62-95 files

LLVM/project 8519f41llvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp

Update llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
DeltaFile
+1-1llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+1-11 files

LLVM/project 59e899eclang/lib/AST/ByteCode Interp.h Interp.cpp, clang/test/AST/ByteCode cxx26.cpp

[clang][bytecode] Don't unref constexpr-unknown references (#190177)

If the pointer for a reference is constexpr-unknown, use the pointer
itself instead of dereferencing it. Unfortunately, that means
constexpr-unknown pointers reach a lot more places than before.
DeltaFile
+45-1clang/lib/AST/ByteCode/Interp.h
+29-6clang/lib/AST/ByteCode/Interp.cpp
+20-2clang/test/AST/ByteCode/cxx26.cpp
+4-8clang/test/SemaCXX/constant-expression-p2280r4.cpp
+7-5clang/lib/AST/ByteCode/Compiler.cpp
+10-0clang/lib/AST/ByteCode/Pointer.h
+115-222 files not shown
+127-238 files

LLVM/project 2ccc941llvm/lib/Target/AMDGPU VOP2Instructions.td

[AMDGPU] Mark two instructions as DPMACC (#190391)

It appears these were accidentally missed in #170319
DeltaFile
+2-2llvm/lib/Target/AMDGPU/VOP2Instructions.td
+2-21 files

LLVM/project 74ad441llvm/test/DebugInfo/Generic debug-info-enum-dwarf2.ll incorrect-variable-debugloc1-dwarf2.ll

Split DWARF v2 tests to exclude 64-bit AIX targets (#189077)

64-bit AIX requires DWARF64 format, which was only introduced in DWARF
v3. DWARF v2 only supports 32-bit DWARF format, making it incompatible
with 64-bit AIX (the compiler throws a fatal error). These changes split
DWARF v2 tests into separate files that exclude 64-bit AIX targets while
still running on 32-bit AIX and other 64-bit platforms where DWARF v2 is
supported.
DeltaFile
+15-0llvm/test/DebugInfo/Generic/debug-info-enum-dwarf2.ll
+10-0llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1-dwarf2.ll
+6-0llvm/test/DebugInfo/Generic/restrict-dwarf2.ll
+2-4llvm/test/DebugInfo/Generic/debug-info-enum.ll
+2-1llvm/test/DebugInfo/Generic/incorrect-variable-debugloc1.ll
+2-1llvm/test/DebugInfo/Generic/restrict.ll
+37-66 files

LLVM/project 4670f59llvm/test/Analysis/DependenceAnalysis banerjee-overflow.ll

update
DeltaFile
+5-4llvm/test/Analysis/DependenceAnalysis/banerjee-overflow.ll
+5-41 files

LLVM/project 4376a41llvm/test/Transforms/LoopVectorize find-last-iv-sinkable-expr-epilogue.ll, llvm/test/Transforms/LoopVectorize/AArch64 epilog-iv-live-outs.ll find-last-iv-sinkable-expr-epilogue.ll

Address comments

Created using spr 1.3.7
DeltaFile
+257-0llvm/test/Transforms/LoopVectorize/AArch64/epilog-iv-live-outs.ll
+209-23mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+206-14mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+212-0llvm/test/Transforms/LoopVectorize/find-last-iv-sinkable-expr-epilogue.ll
+189-0mlir/test/Target/LLVMIR/nvvm/convert_s2f6x2.mlir
+172-0llvm/test/Transforms/LoopVectorize/AArch64/find-last-iv-sinkable-expr-epilogue.ll
+1,245-37135 files not shown
+2,509-1,044141 files

LLVM/project b6e7c47llvm/lib/CodeGen/SelectionDAG ScheduleDAGRRList.cpp, llvm/test/CodeGen/ARM pr190497.ll

[CodeGen] Ignore `ANNOTATION_LABEL` in scheduler (#190499)

This fixes a crash in `clang` for `armv7` targets when optimizations are
enabled.

Fixes #190497
DeltaFile
+39-0llvm/test/CodeGen/ARM/pr190497.ll
+1-0llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
+40-02 files

LLVM/project 0403639llvm/lib/Transforms/Vectorize VPlan.cpp, llvm/test/Transforms/LoopVectorize early_exit_with_outer_loop.ll

[VPlan] Skip successors outside any loop when updating LoopInfo. (#190553)

Successors outside of any loop do not contribute to the innermost loop,
skip them to avoid incorrect results due to
getSmallestCommonLoop(nullptr, X) returning nullptr.
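
The failure mode can be shown with a toy model in Python. Everything here is an assumption made for illustration: the `Loop` class, `smallest_common_loop`, and `common_loop_of_successors` are hypothetical stand-ins that only mimic the described behavior (a null argument yields a null result), not the code in VPlan.cpp.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Loop:
    name: str
    parent: Optional["Loop"] = None


def smallest_common_loop(a: Optional[Loop], b: Optional[Loop]) -> Optional[Loop]:
    """Toy stand-in: like the query described above, returns None if either side is None."""
    if a is None or b is None:
        return None
    chain = set()
    while a is not None:  # collect a and all of its enclosing loops
        chain.add(a)
        a = a.parent
    while b is not None and b not in chain:  # walk b outward until it meets the chain
        b = b.parent
    return b


def common_loop_of_successors(successor_loops: Iterable[Optional[Loop]],
                              skip_outside_loops: bool) -> Optional[Loop]:
    """Fold the successors' loops into one loop; None means 'outside any loop'."""
    result, seen_any = None, False
    for loop in successor_loops:
        if skip_outside_loops and loop is None:
            continue  # successors outside any loop do not contribute
        result = loop if not seen_any else smallest_common_loop(result, loop)
        seen_any = True
    return result


outer = Loop("outer")
inner = Loop("inner", parent=outer)
# Without skipping, one successor outside any loop wipes out the result.
print(common_loop_of_successors([inner, None], skip_outside_loops=False))
# With skipping, the block is still placed in the inner loop.
print(common_loop_of_successors([inner, None], skip_outside_loops=True))
```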
DeltaFile
+115-0llvm/test/Transforms/LoopVectorize/early_exit_with_outer_loop.ll
+15-8llvm/lib/Transforms/Vectorize/VPlan.cpp
+130-82 files

LLVM/project 05ff170llvm/lib/Transforms/InstCombine InstCombineShifts.cpp InstCombineCompares.cpp, llvm/test/Transforms/InstCombine icmp-shl-add-to-add.ll apint-shift.ll

[InstCombine] Fix #163110: Support peeling off matching shifts from icmp operands via canEvaluateShifted (#165975)

Consider a pattern like `icmp (shl nsw X, L), (add nsw (shl nsw Y, L),
K)`. When the constant K is a multiple of 2^L, this can be simplified to
`icmp X, (add nsw Y, K >> L)`.
This patch extends canEvaluateShifted to support `Instruction::Add` and
updates its signature to accept `Instruction::BinaryOps` instead of a
boolean. This change allows the function to distinguish between LShr and
AShr requirements, ensuring that information is preserved according to
the signedness and overflow flags (nsw/nuw) of the operands.
The logic is integrated into `foldICmpCommutative` to enable peeling off
matching shifts from both sides of a comparison even when an offset is
present.
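
The arithmetic behind the fold can be checked exhaustively at a small bit width. The sketch below is a brute-force check in Python, not the InstCombine implementation: it assumes 8-bit signed values, models `nsw` as "the exact result fits", and covers only the `slt` and `eq` predicates. The helper names are hypothetical.

```python
from itertools import product

BITS = 8  # small width so the exhaustive check stays fast
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def fits(value: int) -> bool:
    """True if value is representable as a signed BITS-bit integer (the nsw condition)."""
    return LO <= value <= HI


def check_fold(shift: int, k: int) -> bool:
    """Check icmp (X << L), ((Y << L) + K) against icmp X, (Y + (K >> L))."""
    assert k % (1 << shift) == 0, "K must be a multiple of 2^L"
    for x, y in product(range(LO, HI + 1), repeat=2):
        lhs, shifted_y = x << shift, y << shift
        rhs = shifted_y + k
        if not (fits(lhs) and fits(shifted_y) and fits(rhs)):
            continue  # an nsw flag would be violated, so the fold makes no claim
        new_rhs = y + (k >> shift)
        if (lhs < rhs) != (x < new_rhs) or (lhs == rhs) != (x == new_rhs):
            return False
    return True


print(check_fold(shift=2, k=12))
```

Because K is a multiple of 2^L, `K >> L` is an exact division, which is what lets the shift be peeled from both sides without changing the comparison.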

Fixes: #163110
DeltaFile
+311-0llvm/test/Transforms/InstCombine/icmp-shl-add-to-add.ll
+111-41llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+28-0llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+14-0llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+3-3llvm/test/Transforms/InstCombine/apint-shift.ll
+1-1llvm/test/Transforms/InstCombine/icmp-select.ll
+468-456 files

LLVM/project 3b02210llvm/utils/gn/secondary/lldb/source/Host BUILD.gn

[gn] fix mistake from 88f6b181b6ab2 (#190601)
DeltaFile
+1-1llvm/utils/gn/secondary/lldb/source/Host/BUILD.gn
+1-11 files

LLVM/project 4539d71llvm/lib/Target/AMDGPU AMDGPUResourceUsageAnalysis.cpp, llvm/test/CodeGen/AMDGPU resource-usage-asan-O0.ll

[AMDGPU] Preserve assumed stack size for ASan-instrumented functions at -O0
DeltaFile
+29-0llvm/test/CodeGen/AMDGPU/resource-usage-asan-O0.ll
+18-4llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+47-42 files

LLVM/project ded8e89llvm/test/CodeGen/AMDGPU amdgpu-sw-lower-lds-multi-static-dynamic-indirect-access-asan.ll amdgpu-sw-lower-lds-static-dynamic-indirect-access-asan.ll

[AMDGPU] Use ASan callback functions instead of inline checks in SW lower LDS pass
DeltaFile
+31-157llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-multi-static-dynamic-indirect-access-asan.ll
+8-119llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-dynamic-indirect-access-asan.ll
+6-117llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-dynamic-indirect-access-asan.ll
+3-118llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-lds-test-atomicrmw-asan.ll
+7-98llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-indirect-access-asan.ll
+4-89llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-dynamic-lds-test-asan.ll
+59-6987 files not shown
+121-96913 files

LLVM/project 64a0bd1llvm/lib/Transforms/Vectorize LoopVectorize.cpp LoopVectorizationPlanner.h

[LV] Return best VPlan together with VF from computeBestVF (NFC). (#190385)

computeBestVF iterates over all VPlans and picks the VF of the most
profitable VPlan. This VPlan is later needed for execution and
additional checks. Instead of retrieving it multiple times later, just
directly return it from computeBestVF.

This removes some redundant lookups.

PR: https://github.com/llvm/llvm-project/pull/190385
DeltaFile
+33-29llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+8-6llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+41-352 files

LLVM/project 4cce6f8llvm/test/tools/llvm-ir2vec/bindings ir2vec-initEmbedding.py ir2vec-getInstEmbMap.py, llvm/tools/llvm-ir2vec/Bindings PyIR2Vec.cpp

[llvm-ir2vec] Added Enum for ir2vec embedding mode (#190466)

Currently, initEmbedding() takes the mode as a string input. This PR
changes it to take the mode as an enum value.
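
A minimal sketch of the difference between the two API styles, in plain Python. The names `EmbeddingMode`, `init_embedding_from_str`, and `init_embedding_from_enum`, as well as the string values `"sym"` and `"fa"`, are assumptions for illustration and are not taken from the actual bindings.

```python
from enum import Enum


class EmbeddingMode(Enum):  # hypothetical enum, not the real binding
    SYMBOLIC = "sym"
    FLOW_AWARE = "fa"


def init_embedding_from_str(mode: str) -> EmbeddingMode:
    """String style: any string is accepted by the signature, so a typo is
    only caught when the value is looked up (ValueError)."""
    return EmbeddingMode(mode)


def init_embedding_from_enum(mode: EmbeddingMode) -> EmbeddingMode:
    """Enum style: callers can only name members that exist."""
    if not isinstance(mode, EmbeddingMode):
        raise TypeError(f"expected EmbeddingMode, got {type(mode).__name__}")
    return mode


print(init_embedding_from_enum(EmbeddingMode.SYMBOLIC))
```

With an enum, a misspelled member name fails at attribute lookup and is visible to linters and IDE completion, rather than surfacing as a runtime string mismatch.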
DeltaFile
+19-7llvm/test/tools/llvm-ir2vec/bindings/ir2vec-initEmbedding.py
+12-12llvm/tools/llvm-ir2vec/Bindings/PyIR2Vec.cpp
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getInstEmbMap.py
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncNames.py
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncEmbMap.py
+3-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncEmb.py
+43-231 files not shown
+46-247 files

LLVM/project f7cdebbllvm/lib/Transforms/Vectorize VPlanRecipes.cpp, llvm/unittests/Transforms/Vectorize VPlanTest.cpp

[VPlan] Mark unary ops as not having side-effects (NFC). (#190554)

Mark unary ops (currently only FNeg) as neither reading nor writing
memory, similar to binary and cast ops.

Should currently be NFC end-to-end.
DeltaFile
+10-0llvm/unittests/Transforms/Vectorize/VPlanTest.cpp
+2-1llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+12-12 files

LLVM/project 63231ebmlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Add new narrow FP convert Ops (#184291)

This change adds the following NVVM Ops for new narrow FP conversions
introduced in PTX 9.1:
- `convert.{f32x2/bf16x2}.to.s2f6x2`
- `convert.s2f6x2.to.bf16x2`
- `convert.bf16x2.to.f8x2` (extended for `f8E4M3FN` and `f8E5M2` types)
- `convert.{f16x2/bf16x2}.to.f6x2`
- `convert.{f16x2/bf16x2}.to.f4x2`

PTX ISA Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
DeltaFile
+209-23mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+206-14mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+189-0mlir/test/Target/LLVMIR/nvvm/convert_s2f6x2.mlir
+74-3mlir/test/Target/LLVMIR/nvvm/convert_fp6x2.mlir
+41-0mlir/test/Target/LLVMIR/nvvm/convert_fp8x2_invalid.mlir
+28-0mlir/test/Target/LLVMIR/nvvm/convert_fp4x2.mlir
+747-405 files not shown
+807-5711 files

LLVM/project e326ff2clang-tools-extra/clang-tidy/cppcoreguidelines ProTypeMemberInitCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Fix FP on cppcoreguidelines-pro-type-member-init with forward decl (#190521)

Fixes https://github.com/llvm/llvm-project/issues/155416.
DeltaFile
+24-0clang-tools-extra/test/clang-tidy/checkers/cppcoreguidelines/pro-type-member-init.cpp
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+2-1clang-tools-extra/clang-tidy/cppcoreguidelines/ProTypeMemberInitCheck.cpp
+31-13 files

LLVM/project ce1a9fdmlir/include/mlir/Dialect/ControlFlow/IR ControlFlowOps.td, mlir/include/mlir/Interfaces ControlFlowInterfaces.td

Reland "[mlir][reducer] Add eraseRedundantBlocksInRegion and getSuccessorForwardOperands API to BranchOpInterface" (#189253)

After fixing the undefined symbol and memory leak issues (see the
previous issue, https://github.com/llvm/llvm-project/pull/189150), this
PR relands https://github.com/llvm/llvm-project/pull/187864.
DeltaFile
+114-0mlir/lib/Reducer/ReductionTreePass.cpp
+65-0mlir/test/mlir-reduce/reduction-tree.mlir
+24-0mlir/lib/Dialect/ControlFlow/IR/ControlFlowOps.cpp
+10-0mlir/lib/Reducer/ReductionNode.cpp
+6-4mlir/include/mlir/Dialect/ControlFlow/IR/ControlFlowOps.td
+9-0mlir/include/mlir/Interfaces/ControlFlowInterfaces.td
+228-42 files not shown
+232-48 files

LLVM/project 5e14916bolt/lib/Profile DataReader.cpp, bolt/test empty-fdata-file.test

Early exit llvm-bolt when coming across empty data files (#176859)

perf2bolt generates empty fdata files for small binaries, and right now
BOLT checks for this while parsing by calling `((!hasBranchData() &&
!hasMemData()))`. Instead, exit early with an error message as soon as
the buffer finishes reading the data file.
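
The early-exit idea can be sketched as a check on the freshly read buffer, before any parsing starts. This is an illustrative Python sketch only: the function name `validate_profile_buffer` is hypothetical, and treating "empty" as a zero-length buffer is an assumption, not a statement about DataReader.cpp.

```python
def validate_profile_buffer(data: bytes, path: str) -> bytes:
    """Reject an empty profile right after it is read, before any parsing.

    Assumes 'empty' means a zero-length buffer.
    """
    if len(data) == 0:
        raise ValueError(f"empty profile data file: {path}")
    return data


try:
    validate_profile_buffer(b"", "example.fdata")
except ValueError as err:
    print(f"error: {err}")
```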
DeltaFile
+12-0bolt/test/empty-fdata-file.test
+9-3bolt/lib/Profile/DataReader.cpp
+21-32 files

LLVM/project 26697f4polly/lib/CodeGen IslExprBuilder.cpp, polly/test/CodeGen issue190459_1.ll

[Polly] Correct integer comparison bit width (#190493)

For making an integer comparable to bool, don't compare it to bool.

Bug occurred during the reduction of #190459
DeltaFile
+27-0polly/test/CodeGen/issue190459_1.ll
+2-1polly/lib/CodeGen/IslExprBuilder.cpp
+29-12 files

LLVM/project 1839b75llvm/cmake/modules HandleLLVMOptions.cmake

[runtimes] Skip custom linker validation for gpu/offload targets (#189933)

This fixes the `Host compiler does not support '-fuse-ld=lld'` error when
cross-building libclc for a gpu target. The CMake configure command is:
-DRUNTIMES_amdgcn-amd-amdhsa-llvm_LLVM_ENABLE_RUNTIMES=libclc \
-DLLVM_RUNTIME_TARGETS="amdgcn-amd-amdhsa-llvm"
libclc targets only support offload target cross-builds and can't link a
host executable. The configuration error is a false positive for offload.

This PR adds a baseline test to first check if the target can link an
executable. If it fails (typical for gpu/offload), we skip the custom
linker validation.
DeltaFile
+12-6llvm/cmake/modules/HandleLLVMOptions.cmake
+12-61 files

LLVM/project 3564570llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Fix formatting

Created using spr 1.3.7
DeltaFile
+9-11llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+9-111 files

LLVM/project 96b2a4ellvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Rebase

Created using spr 1.3.7
DeltaFile
+161,105-175,310llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+54,366-54,928llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+92,827-0llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419-41,418llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428-36,427llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+428,494-350,43134,353 files not shown
+4,984,546-3,040,38734,359 files

LLVM/project 58208a0llvm/test/Transforms/LoopVectorize find-last-iv-sinkable-expr-epilogue.ll, llvm/test/Transforms/LoopVectorize/AArch64 epilog-iv-live-outs.ll find-last-iv-sinkable-expr-epilogue.ll

[LV] Additional epilogue tests for find-iv and with uses of IV.(NFC) (#190548)

Additional test coverage for loops not yet supported, with sinkable
find-iv expressions (github.com/llvm/llvm-project/pull/183911) and uses
of the IV.

PR: https://github.com/llvm/llvm-project/pull/190548
DeltaFile
+257-0llvm/test/Transforms/LoopVectorize/AArch64/epilog-iv-live-outs.ll
+212-0llvm/test/Transforms/LoopVectorize/find-last-iv-sinkable-expr-epilogue.ll
+172-0llvm/test/Transforms/LoopVectorize/AArch64/find-last-iv-sinkable-expr-epilogue.ll
+641-03 files

LLVM/project c109dd1llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanPatternMatch.h

[VPlan] Refactor FindLastSelect matching to use m_Specific(PhiR) (NFC). (#190547)

Match the select operands directly against PhiR using m_Specific,
binding only the non-phi IV expression. This replaces the generic
TrueVal/FalseVal matching followed by an assert and conditional
extraction.

Split off from approved
https://github.com/llvm/llvm-project/pull/183911/ as suggested.
DeltaFile
+17-15llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+5-0llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+22-152 files

LLVM/project 4bd1facllvm/docs GettingStarted.rst

[llvm][docs] Fix typo (#190150)

This commit corrects a typo in the project documentation.
DeltaFile
+1-1llvm/docs/GettingStarted.rst
+1-11 files

LLVM/project 9ce30c8llvm/lib/ExecutionEngine/Orc/TargetProcess LibraryScanner.cpp

[Orc][LibResolver] Fix GNU/Hurd build (#184470)

GNU/Hurd does not put a PATH_MAX static constraint on path lengths. We can instead check the symlink length.
DeltaFile
+3-5llvm/lib/ExecutionEngine/Orc/TargetProcess/LibraryScanner.cpp
+3-51 files