LLVM/project 601f796mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Add missing rounding modes in fp16x2 conversions (#169005)

This change adds the `RN` and `RZ` rounding modes to the
`convert.f32x2.to.f16x2` and `convert.f32x2.to.bf16x2` Ops.

Tests are added in `convert_fp16x2.mlir` and
`invalid_convert_fp16x2.mlir`.
Tests with these Ops in `convert_stochastic_rounding.mlir` and
`invalid-convert-stochastic-rounding.mlir` have been removed or
modified.

PTX spec reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
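
As an illustration of how the two rounding modes differ, here is a minimal Python sketch, not the NVVM lowering itself. It assumes finite inputs within the binary16 range and relies on `struct`'s `"e"` format, which rounds to nearest even. The helper names (`f16_bits_rn`, `f16_bits_rz`, `convert_pair`) are hypothetical and only model the f16 case.

```python
import struct

def f16_bits_rn(x: float) -> int:
    """Bit pattern of x converted to binary16 with round-to-nearest-even (RN)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def f16_value(bits: int) -> float:
    """Decode a binary16 bit pattern back to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def f16_bits_rz(x: float) -> int:
    """Bit pattern of x converted to binary16 with round-toward-zero (RZ).

    Start from the RN result; if RN rounded away from zero, step back by
    one unit in the last place (magnitudes are ordered like their bits).
    """
    bits = f16_bits_rn(x)
    if abs(f16_value(bits)) > abs(x):
        bits -= 1
    return bits

def convert_pair(a: float, b: float, mode: str):
    """Convert two values with the chosen mode ("rn" or "rz")."""
    conv = {"rn": f16_bits_rn, "rz": f16_bits_rz}[mode]
    return conv(a), conv(b)

# A value slightly above the midpoint between 1.0 and the next f16 value.
x = 1.0 + 2**-11 + 2**-13
print([hex(v) for v in convert_pair(x, -x, "rn")])
print([hex(v) for v in convert_pair(x, -x, "rz")])
```

With this input, RN rounds up in magnitude while RZ truncates, so the two modes produce bit patterns that differ in the last place.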
DeltaFile
+122-28mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+87-0mlir/test/Target/LLVMIR/nvvm/convert_fp16x2.mlir
+2-66mlir/test/Target/LLVMIR/nvvm/convert_stochastic_rounding.mlir
+32-20mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+47-0mlir/test/Target/LLVMIR/nvvm/invalid_convert_fp16x2.mlir
+3-23mlir/test/Dialect/LLVMIR/nvvm/invalid-convert-stochastic-rounding.mlir
+293-1376 files

LLVM/project e7dec23llvm/lib/IR ReplaceConstant.cpp, llvm/test/CodeGen/AMDGPU lower-module-lds-constantexpr.ll same-lds-variable-multiple-use-in-one-phi-node.ll

[ReplaceConstant] Don't create instructions for the same constant multiple times in the same basic block (#169141)

Fixes #167500.
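
The idea in the title can be sketched as a cache keyed by basic block and constant. This is a hypothetical Python model, not the code in `ReplaceConstant.cpp`; the function name, the string representation of constants, and the instruction naming scheme are all assumptions made for illustration.

```python
def materialize_uses(uses):
    """Model expanding constants into instructions, once per block.

    uses: iterable of (block_name, constant) pairs, one per use.
    Returns (created, resolved): the instructions created in each block,
    and the instruction name that each use ends up referring to.
    """
    cache = {}     # (block, constant) -> instruction name
    created = {}   # block -> list of (instruction name, constant)
    resolved = []
    for block, const in uses:
        key = (block, const)
        if key not in cache:
            # First use of this constant in this block: create one instruction.
            name = f"%{block}.c{len(created.setdefault(block, []))}"
            created[block].append((name, const))
            cache[key] = name
        # Later uses in the same block reuse the cached instruction.
        resolved.append(cache[key])
    return created, resolved

created, resolved = materialize_uses([
    ("bb0", "gep(@lds, 4)"),
    ("bb0", "gep(@lds, 4)"),
    ("bb1", "gep(@lds, 4)"),
])
print(created)
print(resolved)
```

Two uses of the same constant in `bb0` share one instruction, while the use in `bb1` gets its own.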
DeltaFile
+47-41llvm/test/CodeGen/AMDGPU/lower-module-lds-constantexpr.ll
+51-0llvm/test/CodeGen/AMDGPU/same-lds-variable-multiple-use-in-one-phi-node.ll
+26-17llvm/test/CodeGen/AMDGPU/lower-kernel-lds-constexpr.ll
+16-5llvm/lib/IR/ReplaceConstant.cpp
+140-634 files

LLVM/project 6e9c978llvm/include/llvm/Support File.h

fix windows

Created using spr 1.3.7
DeltaFile
+7-4llvm/include/llvm/Support/File.h
+7-41 files

LLVM/project 541e13cllvm/test/CodeGen/LoongArch/lasx build-vector.ll scalar-to-vector.ll, llvm/test/CodeGen/LoongArch/lasx/ir-instruction insertelement.ll

update tests
DeltaFile
+10-40llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
+7-19llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
+4-6llvm/test/CodeGen/LoongArch/lasx/ir-instruction/insertelement.ll
+4-4llvm/test/CodeGen/LoongArch/lasx/scalar-to-vector.ll
+4-4llvm/test/CodeGen/LoongArch/lsx/scalar-to-vector.ll
+29-735 files

LLVM/project b3428bbllvm/lib/IR LLVMContextImpl.cpp

Add missing freeConstants() call for ConstantPtrAuths.

Fixes memory leak uncovered by #133533.
DeltaFile
+1-0llvm/lib/IR/LLVMContextImpl.cpp
+1-01 files

LLVM/project aa0d95fllvm/lib/Target/LoongArch LoongArchISelLowering.cpp

[LoongArch] Legalize BUILD_VECTOR into a broadcast when all non-undef elements are identical

When all non-undef elements of a BUILD_VECTOR are identical, it is
better to emit a broadcast than multiple insertions.

Some floating-point cases suffer performance regressions, so they are
excluded in this commit. The excluded cases are:

- only one element is non-undef,
- only two elements are non-undef, and one of them is at index 0,
- for the v8f32 vector type, the only two non-undef elements are at
indices (1,2), (1,3), or (2,3).
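
The conditions above can be written as a small predicate. This is a Python sketch of the decision as described in the message, not the code in `LoongArchISelLowering.cpp`. It assumes that undef elements are modeled as `None`, that the exclusions apply only to floating-point vectors, and that an all-undef vector is not broadcast; the function name and its flags are hypothetical.

```python
def should_broadcast(elems, is_float, is_v8f32=False):
    """Decide whether a BUILD_VECTOR should be emitted as a broadcast.

    elems: list of element values, with None standing for undef.
    """
    defined = [(i, e) for i, e in enumerate(elems) if e is not None]
    if not defined:
        return False  # assumption: nothing to broadcast
    if len({e for _, e in defined}) != 1:
        return False  # non-undef elements differ
    if not is_float:
        return True   # assumption: exclusions are floating-point only
    indices = tuple(i for i, _ in defined)
    if len(indices) == 1:
        return False  # only one element is non-undef
    if len(indices) == 2:
        if 0 in indices:
            return False  # one of the two non-undefs is at index 0
        if is_v8f32 and indices in {(1, 2), (1, 3), (2, 3)}:
            return False  # v8f32 special cases
    return True

print(should_broadcast([1.0, None, 1.0, 1.0], is_float=True))
print(should_broadcast([2.0, None, 2.0, None], is_float=True))
```

The first call has three identical non-undef elements and qualifies; the second has exactly two, one of them at index 0, and is excluded.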
DeltaFile
+31-5llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+31-51 files

LLVM/project bc0235cllvm/test/CodeGen/LoongArch/lasx build-vector.ll, llvm/test/CodeGen/LoongArch/lsx build-vector.ll

[LoongArch][NFC] Add tests for build_vector containing same elements except for undefs
DeltaFile
+231-18llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
+149-18llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
+380-362 files

LLVM/project fb18f75cross-project-tests/debuginfo-tests/dexter/dex/debugger DAP.py

[lldb-dap] Add breakpoints after debugger initialization in DExTer (#169744)

# Summary
This is a forward fix for test errors from
https://github.com/llvm/llvm-project/pull/163653.

The PR moved debugger initialization outside of
InitializeRequestHandler, and into Launch/AttachRequestHandlers to
support DAP sessions sharing debugger instances for dynamically created
targets. However, DExTer's DAP class appears to have set breakpoints
before the debugger was initialized, so none of the breakpoints were
resolved and the tests hung waiting for a breakpoint to be hit.
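
The ordering problem can be modeled with a toy example. This is a hypothetical Python sketch, not DExTer's `DAP.py` or the lldb-dap protocol handling: `FakeDebugger` and `DeferredBreakpoints` are invented names, and the "debugger" simply ignores breakpoints requested before it is initialized.

```python
class FakeDebugger:
    """Stand-in for a debugger that is only initialized at launch."""

    def __init__(self):
        self.initialized = False
        self.resolved = []

    def launch(self):
        self.initialized = True

    def set_breakpoint(self, location):
        # Before initialization the request cannot be resolved.
        if not self.initialized:
            return False
        self.resolved.append(location)
        return True


class DeferredBreakpoints:
    """Queue breakpoints until the debugger is ready, then send them."""

    def __init__(self, debugger):
        self.debugger = debugger
        self.pending = []

    def add(self, location):
        if self.debugger.initialized:
            self.debugger.set_breakpoint(location)
        else:
            self.pending.append(location)

    def flush(self):
        for location in self.pending:
            self.debugger.set_breakpoint(location)
        self.pending.clear()


dbg = FakeDebugger()
bps = DeferredBreakpoints(dbg)
bps.add("main.c:10")   # queued, debugger not yet initialized
dbg.launch()
bps.flush()            # now the queued breakpoint is resolved
print(dbg.resolved)
```

Sending the request directly before `launch()` would leave `resolved` empty, which is the toy equivalent of a test waiting on a breakpoint that never resolves.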

# Tests
```
bin/llvm-lit -v /home/qxy11/llvm/llvm-project/cross-project-tests/debuginfo-tests/dexter-tests/
```
DeltaFile
+15-8cross-project-tests/debuginfo-tests/dexter/dex/debugger/DAP.py
+15-81 files

LLVM/project bacca23libcxx/include/__mdspan mdspan.h extents.h, libcxx/test/libcxx/containers/views/mdspan nodiscard.verify.cpp

[libc++][mdspan] Applied `[[nodiscard]]` (#169326)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.
- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant
DeltaFile
+62-0libcxx/test/libcxx/containers/views/mdspan/nodiscard.verify.cpp
+25-18libcxx/include/__mdspan/mdspan.h
+10-10libcxx/test/libcxx/containers/views/mdspan/extents/assert.obs.pass.cpp
+6-4libcxx/include/__mdspan/extents.h
+103-324 files

LLVM/project 504b507mlir/include/mlir/IR PatternMatch.h, mlir/include/mlir/Transforms DialectConversion.h

[mlir][Transforms] Dialect conversion: Add support for `replaceUsesWithIf` (#169606)

This commit adds support for `replaceUsesWithIf` (and variants such as
`replaceAllUsesExcept`) to the `ConversionPatternRewriter`. This API is
supported only in no-rollback mode; an assertion is triggered in
rollback mode. (The assertion was previously missing, which confused
users because the API appeared to be supported while it was actually
not working properly.)

This commit brings us a bit closer towards removing
[this](https://github.com/llvm/llvm-project/blob/76ec25f729fcc7ae576caf21293cc393e68e7cf7/mlir/lib/Transforms/Utils/DialectConversion.cpp#L1214)
workaround.

Additional changes are needed to support this API in rollback mode. In
particular, no entries should be added to the `ConversionValueMapping`
for conditional replacements. It's unclear at this point if this API can
be supported in rollback mode, so this is deferred to later.
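
For readers unfamiliar with conditional replacement, here is a toy Python model of the semantics. It is not MLIR's API: the `Value` and `Op` classes are hypothetical, and the predicate receives an `(op, operand_index)` pair as a stand-in for a use.

```python
class Value:
    def __init__(self, name):
        self.name = name


class Op:
    def __init__(self, name, operands):
        self.name = name
        self.operands = list(operands)


def replace_uses_with_if(ops, old, new, should_replace):
    """Replace each use of `old` with `new` when the predicate accepts it.

    Returns the number of uses that were replaced.
    """
    replaced = 0
    for op in ops:
        for i, operand in enumerate(op.operands):
            if operand is old and should_replace(op, i):
                op.operands[i] = new
                replaced += 1
    return replaced


def replace_all_uses_except(ops, old, new, excepted_op):
    """Variant: replace every use of `old` except those in `excepted_op`."""
    return replace_uses_with_if(
        ops, old, new, lambda op, _index: op is not excepted_op)


a, b = Value("a"), Value("b")
keep = Op("keep", [a])
rewrite = Op("rewrite", [a, a])
count = replace_all_uses_except([keep, rewrite], a, b, keep)
print(count, [v.name for v in keep.operands], [v.name for v in rewrite.operands])
```

Because only some uses change, the old value stays live after the call, which hints at why a single old-to-new mapping entry does not describe a conditional replacement.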

This commit turns `replaceUsesWithIf` into a virtual function, so that

    [8 lines not shown]
DeltaFile
+53-25mlir/lib/Transforms/Utils/DialectConversion.cpp
+23-0mlir/test/Transforms/test-legalizer-no-rollback.mlir
+21-0mlir/include/mlir/Transforms/DialectConversion.h
+7-1mlir/test/lib/Dialect/Test/TestPatterns.cpp
+3-3mlir/include/mlir/IR/PatternMatch.h
+107-295 files

LLVM/project bd643bcflang/include/flang/Optimizer/Transforms Passes.h Passes.td, flang/lib/Optimizer/Transforms FIRToSCF.cpp

[flang] Use default constructor for FIRToSCF pass (#169741)

DeltaFile
+2-4flang/lib/Optimizer/Transforms/FIRToSCF.cpp
+0-1flang/include/flang/Optimizer/Transforms/Passes.h
+0-1flang/include/flang/Optimizer/Transforms/Passes.td
+2-63 files

LLVM/project b028daclibcxx/include queue, libcxx/test/libcxx/diagnostics queue.nodiscard.verify.cpp

[libc++][queue] Applied `[[nodiscard]]` (#169469)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant
DeltaFile
+19-7libcxx/test/libcxx/diagnostics/queue.nodiscard.verify.cpp
+9-7libcxx/include/queue
+28-142 files

LLVM/project 6910813mlir/lib/Transforms/Utils DialectConversion.cpp

address comments
DeltaFile
+4-3mlir/lib/Transforms/Utils/DialectConversion.cpp
+4-31 files

LLVM/project ceba82fllvm/lib/Transforms/Vectorize LoadStoreVectorizer.cpp, llvm/test/Transforms/LoadStoreVectorizer/AMDGPU vectorize-redund-loads.ll

[LoadStoreVectorizer] Fix one-element vector handling (#169671)

This is a follow-up to https://github.com/llvm/llvm-project/pull/168135
DeltaFile
+27-0llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
+4-4llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+31-42 files

LLVM/project 1ff5c89mlir/test/Analysis/DataFlow test-liveness-analysis.mlir, mlir/test/lib/Analysis/DataFlow TestLivenessAnalysis.cpp

[mlir][dataflow] Add argument print for test-liveness-analysis (#169625)

Print arguments in test-liveness-analysis to make the
remove-dead-values pass easier to debug.

---------

Co-authored-by: Mehdi Amini <joker.eph at gmail.com>
DeltaFile
+13-1mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir
+11-0mlir/test/lib/Analysis/DataFlow/TestLivenessAnalysis.cpp
+24-12 files

LLVM/project 48a9b07llvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp

[AMDGPU] Remove unused functions isSigned. NFC (#169750)

These have been unused since
https://github.com/llvm/llvm-project/pull/145483.
DeltaFile
+0-18llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+0-181 files

LLVM/project e2a29ecclang/lib/CodeGen CGExpr.cpp, clang/test/Driver fsanitize.c

[UBSan] Use -fsanitize-handler-preserve-all-regs in codegen

Pull Request: https://github.com/llvm/llvm-project/pull/168645
DeltaFile
+18-3compiler-rt/test/ubsan_minimal/TestCases/override-callback.c
+19-0llvm/test/Instrumentation/BoundsChecking/runtimes.ll
+14-4clang/test/Driver/fsanitize.c
+8-0clang/lib/CodeGen/CGExpr.cpp
+7-0llvm/lib/Passes/PassBuilder.cpp
+5-2llvm/include/llvm/Transforms/Instrumentation/BoundsChecking.h
+71-95 files not shown
+84-1411 files

LLVM/project 27a3e43llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/tools/llvm-dwarfdump/X86 simplified-template-names.s

address review get() -> native_handle() and constexpr invalid value

Created using spr 1.3.7
DeltaFile
+45,267-48,746llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+11,954-11,000llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+5,981-8,885llvm/test/CodeGen/AMDGPU/shufflevector.v4p0.v4p0.ll
+5,981-8,885llvm/test/CodeGen/AMDGPU/shufflevector.v4i64.v4i64.ll
+7,387-7,087llvm/test/tools/llvm-dwarfdump/X86/simplified-template-names.s
+5,060-5,874llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+81,630-90,47712,517 files not shown
+674,509-546,02212,523 files

LLVM/project 9782413clang/lib/CodeGen CGExpr.cpp, clang/test/Driver fsanitize.c

[UBSan] Use -fsanitize-handler-preserve-all-regs in codegen (#168645)

DeltaFile
+18-3compiler-rt/test/ubsan_minimal/TestCases/override-callback.c
+19-0llvm/test/Instrumentation/BoundsChecking/runtimes.ll
+14-4clang/test/Driver/fsanitize.c
+8-0clang/lib/CodeGen/CGExpr.cpp
+6-1llvm/lib/Transforms/Instrumentation/BoundsChecking.cpp
+5-2llvm/include/llvm/Transforms/Instrumentation/BoundsChecking.h
+70-105 files not shown
+84-1411 files

LLVM/project 396f4f9mlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp, mlir/lib/ExecutionEngine APFloatWrappers.cpp

[mlir][arith] Add support for `cmpf` to `ArithToAPFloat`
DeltaFile
+147-5mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+15-0mlir/test/Conversion/ArithToApfloat/arith-to-apfloat.mlir
+11-0mlir/lib/ExecutionEngine/APFloatWrappers.cpp
+4-0mlir/test/Integration/Dialect/Arith/CPU/test-apfloat-emulation.mlir
+177-54 files

LLVM/project b7eb988lldb/source/Utility RegisterValue.cpp

[lldb] Use InlHostByteOrder in RegisterValue::SetValueFromData (#169624)

The existing code can be simplified further.

---------

Co-authored-by: Matej Košík <matej.kosik at codasip.com>
DeltaFile
+1-3lldb/source/Utility/RegisterValue.cpp
+1-31 files

LLVM/project 8cc0259llvm/include/llvm/ExecutionEngine/Orc WaitingOnGraph.h, llvm/unittests/ExecutionEngine/Orc WaitingOnGraphTest.cpp

[ORC] Clear stale ElemToPendingSN entries in WaitingOnGraph. (#169747)

WaitingOnGraph::processReadyOrFailed was not clearing stale entries from
the ElemToPendingSN map. If symbols were removed from the
ExecutionSession and then re-added, later dependencies could refer to
the stale entries, triggering a use-after-free bug.

https://github.com/llvm/llvm-project/issues/169135
DeltaFile
+46-0llvm/unittests/ExecutionEngine/Orc/WaitingOnGraphTest.cpp
+18-4llvm/include/llvm/ExecutionEngine/Orc/WaitingOnGraph.h
+64-42 files

LLVM/project 227c15dmlir/lib/Dialect/XeGPU/Transforms XeGPUBlocking.cpp

simplify: Avoid calling getTileShape(Result&Operand) from getTileShape(op)
DeltaFile
+14-13mlir/lib/Dialect/XeGPU/Transforms/XeGPUBlocking.cpp
+14-131 files

LLVM/project 49516ballvm/tools/llvm-objdump SourcePrinter.cpp SourcePrinter.h

[llvm-objdump] Optimize live element tracking (#158763)

This patch significantly optimizes the LiveElementPrinter
by replacing a slow linear search with efficient hash map
lookups. It refactors the code to use a map-based system
for tracking live element addresses and managing column
assignments, leading to a major performance improvement
for large binaries.
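
The general technique can be sketched in Python as follows. This is not the `LiveElementPrinter` code: the `ColumnTracker` class, its method names, and the policy of reusing the lowest freed column are assumptions chosen to show dictionary lookups replacing a linear scan.

```python
import heapq


class ColumnTracker:
    """Assign display columns to live elements using hash-map lookups."""

    def __init__(self):
        self.column_of = {}     # element -> column, O(1) average lookup
        self.free_columns = []  # min-heap of columns that were released
        self.next_column = 0

    def activate(self, element):
        """Return the element's column, assigning one if it is new."""
        if element in self.column_of:
            return self.column_of[element]
        if self.free_columns:
            column = heapq.heappop(self.free_columns)
        else:
            column = self.next_column
            self.next_column += 1
        self.column_of[element] = column
        return column

    def deactivate(self, element):
        """Release the element's column so a later element can reuse it."""
        column = self.column_of.pop(element, None)
        if column is not None:
            heapq.heappush(self.free_columns, column)


tracker = ColumnTracker()
print(tracker.activate("a"), tracker.activate("b"))
tracker.deactivate("a")
print(tracker.activate("c"), tracker.activate("b"))
```

Each query costs one dictionary lookup regardless of how many elements are live, instead of scanning a list of all columns.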
DeltaFile
+203-56llvm/tools/llvm-objdump/SourcePrinter.cpp
+43-10llvm/tools/llvm-objdump/SourcePrinter.h
+5-5llvm/tools/llvm-objdump/llvm-objdump.cpp
+251-713 files

LLVM/project 31141b6llvm/docs LangRef.rst, llvm/lib/CodeGen PreISelIntrinsicLowering.cpp

Unsupport SW encoding

Created using spr 1.3.6-beta.1
DeltaFile
+4-60llvm/test/Transforms/PreISelIntrinsicLowering/protected-field-pointer.ll
+4-60llvm/test/Transforms/PreISelIntrinsicLowering/protected-field-pointer-addrspace1.ll
+3-23llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
+2-4llvm/docs/LangRef.rst
+13-1474 files

LLVM/project 4795f24mlir/lib/Dialect/XeGPU/Transforms XeGPUBlocking.cpp, mlir/test/Dialect/XeGPU xegpu-blocking.mlir

simplify: load_nd/store_nd/prefetch_nd use anchor layout for blocking
DeltaFile
+39-39mlir/test/Dialect/XeGPU/xegpu-blocking.mlir
+6-5mlir/lib/Dialect/XeGPU/Transforms/XeGPUBlocking.cpp
+45-442 files

LLVM/project 17d207aclang/lib/Analysis/FlowSensitive/Models UncheckedStatusOrAccessModel.cpp, clang/unittests/Analysis/FlowSensitive UncheckedStatusOrAccessModelTestFixture.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+102-0clang/lib/Analysis/FlowSensitive/Models/UncheckedStatusOrAccessModel.cpp
+84-0clang/unittests/Analysis/FlowSensitive/UncheckedStatusOrAccessModelTestFixture.cpp
+186-02 files

LLVM/project 6178706llvm/test/CodeGen/AMDGPU minimumnum.bf16.ll maximumnum.bf16.ll, llvm/test/CodeGen/RISCV/rvv vluxei.ll

Address review comments

Created using spr 1.3.6-beta.1
DeltaFile
+4,734-0llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vlseg-vsseg.s
+1,529-1,529llvm/test/tools/llvm-mca/RISCV/SpacemitX60/rvv-fp.s
+838-838llvm/test/CodeGen/AMDGPU/minimumnum.bf16.ll
+838-838llvm/test/CodeGen/AMDGPU/maximumnum.bf16.ll
+1,560-19llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+0-1,554llvm/test/CodeGen/RISCV/rvv/vluxei.ll
+9,499-4,7782,383 files not shown
+58,721-154,4962,389 files

LLVM/project 35ca906llvm/test/CodeGen/AMDGPU maximumnum.bf16.ll minimumnum.bf16.ll, llvm/test/CodeGen/RISCV/rvv vluxei.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
DeltaFile
+4,734-0llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vlseg-vsseg.s
+1,529-1,529llvm/test/tools/llvm-mca/RISCV/SpacemitX60/rvv-fp.s
+838-838llvm/test/CodeGen/AMDGPU/maximumnum.bf16.ll
+838-838llvm/test/CodeGen/AMDGPU/minimumnum.bf16.ll
+1,560-19llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+0-1,554llvm/test/CodeGen/RISCV/rvv/vluxei.ll
+9,499-4,7782,382 files not shown
+58,690-154,4682,388 files

LLVM/project ae01e29mlir/lib/Dialect/XeGPU/Transforms XeGPUBlocking.cpp, mlir/test/Dialect/XeGPU xegpu-blocking.mlir

simplify: load/store/prefetch/loadmatrix/storematrix use anchor layout for blocking
DeltaFile
+62-29mlir/lib/Dialect/XeGPU/Transforms/XeGPUBlocking.cpp
+34-40mlir/test/Dialect/XeGPU/xegpu-blocking.mlir
+96-692 files