LLVM/project aa61f21llvm/lib/Target/LoongArch LoongArchISelLowering.cpp

modify comment
DeltaFile
+1-1llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+1-11 files

LLVM/project 6abbbcallvm/lib/Transforms/AggressiveInstCombine AggressiveInstCombine.cpp, llvm/test/Transforms/AggressiveInstCombine umulh_carry4.ll umulh_ladder.ll

[AggressiveInstCombine] Match long high-half multiply (#168396)

This patch adds recognition of high-half multiply by parts into a single
larger multiply.

Considering a multiply made up of high and low parts, we can split the
multiply into:
  x * y == (xh*T + xl) * (yh*T + yl)
where `xh == x>>32` and `xl == x & 0xffffffff`. `T = 2^32`.
This expands to
  xh*yh*T*T + xh*yl*T + xl*yh*T + xl*yl
which I find helpful to draw as
  [  xh*yh  ]
       [  xh*yl  ]
       [  xl*yh  ]
            [  xl*yl  ]

We are looking for the "high" half, which is xh*yh + (xh*yl>>32) + (xl*yh>>32) +
carries. The carry makes this difficult and there are multiple ways of

    [15 lines not shown]
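
The decomposition above can be written out in plain C++ as a minimal sketch. It shows one way to collect the carry (summing the pieces that overlap bits 32..63), not necessarily any of the IR patterns the patch matches, which are not shown here. The function name `umulh_by_parts` is hypothetical.

```cpp
#include <cstdint>

// Illustrative only: high 64 bits of the 128-bit product x * y,
// built from 32-bit halves with T = 2^32.
uint64_t umulh_by_parts(uint64_t x, uint64_t y) {
  const uint64_t mask = 0xffffffffu;
  const uint64_t xl = x & mask, xh = x >> 32;
  const uint64_t yl = y & mask, yh = y >> 32;

  const uint64_t ll = xl * yl; // weight 1
  const uint64_t lh = xl * yh; // weight T
  const uint64_t hl = xh * yl; // weight T
  const uint64_t hh = xh * yh; // weight T*T

  // Everything that lands in bits 32..63 of the full product.
  // At most 3 * (2^32 - 1), so this sum cannot overflow 64 bits.
  const uint64_t cross = (ll >> 32) + (lh & mask) + (hl & mask);

  // cross >> 32 is the carry out of the low 64 bits.
  return hh + (lh >> 32) + (hl >> 32) + (cross >> 32);
}
```

A single wide multiply followed by a shift computes the same value, which is what makes the by-parts form a candidate for folding.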
DeltaFile
+3,019-0llvm/test/Transforms/AggressiveInstCombine/umulh_carry4.ll
+858-0llvm/test/Transforms/AggressiveInstCombine/umulh_ladder.ll
+755-0llvm/test/Transforms/AggressiveInstCombine/umulh_carry.ll
+530-0llvm/test/Transforms/AggressiveInstCombine/umulh_ladder4.ll
+324-0llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+5,486-05 files

LLVM/project bb9449dllvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine get_vector_length.ll

[InstCombine] Fold @llvm.experimental.get.vector.length when cnt <= max_lanes (#169293)

On RISC-V, some loops that the loop vectorizer vectorizes pre-LTO may
turn out to have the exact trip count exposed after LTO, see #164762.

If the trip count is small enough we can fold away the
@llvm.experimental.get.vector.length intrinsic based on this corollary
from the LangRef:

> If %cnt is less than or equal to %max_lanes, the return value is equal
to %cnt.

This on its own doesn't remove the @llvm.experimental.get.vector.length
in #164762 since we also need to teach computeKnownBits about
@llvm.experimental.get.vector.length and the sub recurrence, but this PR
is a starting point.

I've added this in InstCombine rather than InstSimplify since we may
need to insert a truncation (@llvm.experimental.get.vector.length can

    [3 lines not shown]
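
As a rough model of the fold condition, the sketch below assumes that %max_lanes is %vf for fixed-length vectors and %vf times vscale for scalable ones, and that a lower bound on vscale is known. The helper names and the idea of passing an upper bound for %cnt are assumptions for illustration; this is not the code added to InstCombineCalls.cpp.

```cpp
#include <cstdint>

// Hypothetical helper: the smallest value max_lanes can take.
// Assumes max_lanes = vf (fixed) or vf * vscale (scalable).
uint64_t minMaxLanes(uint64_t vf, bool scalable, uint64_t minVScale) {
  return scalable ? vf * minVScale : vf;
}

// Hypothetical helper: the intrinsic can be replaced by cnt itself when
// every possible cnt is known to be <= the smallest possible max_lanes.
bool canFoldToCount(uint64_t cntUpperBound, uint64_t vf, bool scalable,
                    uint64_t minVScale) {
  return cntUpperBound <= minMaxLanes(vf, scalable, minVScale);
}
```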
DeltaFile
+89-0llvm/test/Transforms/InstCombine/get_vector_length.ll
+21-0llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+110-02 files

LLVM/project 71f25eallvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchLASXInstrInfo.td

[LoongArch] Make rotl/rotr custom for lsx/lasx
DeltaFile
+60-0llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+5-0llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+5-0llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+1-0llvm/lib/Target/LoongArch/LoongArchISelLowering.h
+71-04 files

LLVM/project f1b04aellvm/test/CodeGen/LoongArch/lasx rotl-rotr.ll, llvm/test/CodeGen/LoongArch/lsx rotl-rotr.ll

add tests
DeltaFile
+36-71llvm/test/CodeGen/LoongArch/lasx/rotl-rotr.ll
+34-71llvm/test/CodeGen/LoongArch/lsx/rotl-rotr.ll
+70-1422 files

LLVM/project f1ddb2fllvm/test/CodeGen/LoongArch/lasx rotl-rotr.ll, llvm/test/CodeGen/LoongArch/lsx rotl-rotr.ll

[LoongArch][NFC] Pre-commit tests for vector rotl/rotr (#161115)

DeltaFile
+283-0llvm/test/CodeGen/LoongArch/lsx/rotl-rotr.ll
+283-0llvm/test/CodeGen/LoongArch/lasx/rotl-rotr.ll
+566-02 files

LLVM/project 326a1a4mlir/include/mlir/Dialect/XeGPU/IR XeGPUOps.td, mlir/lib/Dialect/XeGPU/IR XeGPUOps.cpp

[MLIR][XeGPU] Add anchor_layout and update propagation to honor user-specified layouts (#169267)

Introduce anchor layout for XeGPU anchor ops: load_nd, store_nd,
prefetch_nd, dpas, load, store, prefetch, load_matrix, store_matrix, and
atomic_rmw. Anchor layout is permanent, and is guaranteed to be honored
by XeGPU distribution and lowerings once specified.
1. Add anchor_layout for XeGPU anchor OPs: load_nd, store_nd,
prefetch_nd, dpas, load, store, prefetch, load_matrix, store_matrix, and
atomic_rmw.
2. Rename the layout attributes to anchor_layout for these ops: load, store,
load_matrix, store_matrix.
3. Update the layout propagation pass: only when the user doesn't specify an
anchor layout does the pass compute a default layout, set it as the anchor
op's permanent layout, and use that for propagation. If the user specified an
anchor layout, the pass takes the user-specified anchor layout.
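
The propagation rule can be modeled in a few lines. This is a toy sketch: `Layout`, `AnchorOp`, and `resolveAnchorLayout` are hypothetical stand-ins, and the real pass operates on XeGPU ops and layout attributes rather than strings.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a layout attribute.
using Layout = std::string;

// Hypothetical anchor op: the layout is empty when the user gave none.
struct AnchorOp {
  std::optional<Layout> anchorLayout;
};

// Returns the layout to propagate from this op. A user-specified layout
// is always kept; the default is stored only when none was given.
Layout resolveAnchorLayout(AnchorOp &op, const Layout &defaultLayout) {
  if (!op.anchorLayout)
    op.anchorLayout = defaultLayout;
  return *op.anchorLayout;
}
```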
DeltaFile
+329-119mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+267-177mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+41-40mlir/test/Dialect/XeGPU/propagate-layout.mlir
+13-11mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp
+10-10mlir/test/Dialect/XeGPU/propagate-layout-inst-data.mlir
+10-2mlir/lib/Dialect/XeGPU/Transforms/XeGPUWgToSgDistribute.cpp
+670-3591 files not shown
+670-3607 files

LLVM/project a9cc7fellvm/include/llvm/ProfileData SampleProf.h

[NFC][SampleFDO] Use const& to avoid copies (#164584)

Use const& in range-based for loop to avoid unnecessary copies
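
A small sketch of why this matters, using a hypothetical `Tracked` type that counts its own copies (nothing here is taken from SampleProf.h):

```cpp
#include <vector>

// Counts how many times a Tracked object has been copy-constructed.
static int copyCount = 0;

struct Tracked {
  int value = 0;
  explicit Tracked(int v) : value(v) {}
  Tracked(const Tracked &other) : value(other.value) { ++copyCount; }
};

// Iterating by value copies every element once.
int sumByValue(const std::vector<Tracked> &items) {
  int sum = 0;
  for (auto item : items)
    sum += item.value;
  return sum;
}

// Iterating by const reference reads the elements in place.
int sumByConstRef(const std::vector<Tracked> &items) {
  int sum = 0;
  for (const auto &item : items)
    sum += item.value;
  return sum;
}
```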
DeltaFile
+1-1llvm/include/llvm/ProfileData/SampleProf.h
+1-11 files

LLVM/project f6712b6libcxx/test/std/utilities/optional/optional.object/optional.object.ctor copy.pass.cpp move.pass.cpp

[libc++] Reformat `optional` constructor tests (#169231)

- Mass-reformat tests in
`std/utilities/optional/optional.object/optional.object.ctor` and
rearrange header `#include`s
- No functional changes
- Prelude for #169203
DeltaFile
+133-141libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/copy.pass.cpp
+110-117libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/move.pass.cpp
+94-113libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/rvalue_T.pass.cpp
+82-106libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/in_place_t.pass.cpp
+91-93libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/U.pass.cpp
+82-97libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/const_T.pass.cpp
+592-66711 files not shown
+978-1,11217 files

LLVM/project 82640a9clang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp Origins.cpp

Multi-origin changes
DeltaFile
+384-30clang/test/Sema/warn-lifetime-safety.cpp
+213-90clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+117-64clang/lib/Analysis/LifetimeSafety/Origins.cpp
+96-22clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+56-30clang/unittests/Analysis/LifetimeSafetyTest.cpp
+27-7clang/lib/Analysis/LifetimeSafety/LifetimeSafety.cpp
+893-2437 files not shown
+931-27013 files

LLVM/project 07a2dbaclang/test/Sema/AArch64 arm_sve_feature_dependent_sve___sme.c, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

Merge branch 'main' into users/ylzsx/precommit-rotr-custom
DeltaFile
+53,205-51,210llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+18,277-15,993llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+19,255-3,889llvm/test/CodeGen/RISCV/atomic-rmw.ll
+19,470-0clang/test/Sema/AArch64/arm_sve_feature_dependent_sve___sme.c
+5,981-8,885llvm/test/CodeGen/AMDGPU/shufflevector.v4p0.v4p0.ll
+5,981-8,885llvm/test/CodeGen/AMDGPU/shufflevector.v4i64.v4i64.ll
+122,169-88,86220,850 files not shown
+1,624,754-725,03320,856 files

LLVM/project 6696e0cclang/lib/AST/ByteCode Interp.cpp Interp.h

[clang][bytecode] Remove double diagnostic emission (#169658)

We emit this diagnostic from CheckPointerToIntegralCast() already, so
remove the emission from CastPointerIntegral().
DeltaFile
+4-4clang/lib/AST/ByteCode/Interp.cpp
+0-4clang/lib/AST/ByteCode/Interp.h
+4-82 files

LLVM/project fede947mlir/lib/Target/LLVMIR/Dialect/LLVMIR LLVMToLLVMIRTranslation.cpp, mlir/test/Dialect/LLVMIR invalid-cg-profile.mlir

[mlir][LLVMIR] Handle missing functions in CGProfile module flags (#169517)

This commit extends the CGProfile module flags export with support for missing function references. Previously, a missing reference caused a crash; it is now exported as a `null` value in the metadata node.
Fixes: https://github.com/llvm/llvm-project/issues/160717
DeltaFile
+10-10mlir/lib/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.cpp
+14-0mlir/test/Dialect/LLVMIR/invalid-cg-profile.mlir
+24-102 files

LLVM/project f43c59flibcxx/include/__mdspan mdspan.h, libcxx/test/libcxx/containers/views/mdspan nodiscard.verify.cpp

fix windows build

Created using spr 1.3.7
DeltaFile
+53-25mlir/lib/Transforms/Utils/DialectConversion.cpp
+62-0libcxx/test/libcxx/containers/views/mdspan/nodiscard.verify.cpp
+25-18libcxx/include/__mdspan/mdspan.h
+27-0llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
+19-7libcxx/test/libcxx/diagnostics/queue.nodiscard.verify.cpp
+23-0mlir/test/Transforms/test-legalizer-no-rollback.mlir
+209-5029 files not shown
+402-13335 files

LLVM/project 4099121clang/lib/Tooling/Transformer SourceCode.cpp, clang/unittests/Tooling SourceCodeTest.cpp

[clang][Tooling] Fix `getFileRange` returning a range spanning across macro arguments (#169757)

When the start and end token are both spelled in macro arguments, we
still want to reject the range if they come from two separate macro
arguments, as the original specified range is not precisely spelled in a
single sequence of characters in source.
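
A hypothetical macro makes the situation concrete. In the sketch below, the expanded expression starts in one macro argument and ends in another, so no single contiguous run of source characters spells it. Whether this exact shape appears in the patch's tests is not shown here.

```cpp
// Hypothetical macro, for illustration only.
#define PLUS(a, b) a + b

int sumViaMacro() {
  // Expands to `3 + 4`. The token `3` is spelled in the first argument,
  // `4` in the second, and `+` in the macro body. The source text between
  // them is `3, 4`, which is not the expression itself.
  return PLUS(3, 4);
}
```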
DeltaFile
+25-5clang/lib/Tooling/Transformer/SourceCode.cpp
+2-0clang/unittests/Tooling/SourceCodeTest.cpp
+27-52 files

LLVM/project 9fe4582mlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp, mlir/lib/ExecutionEngine APFloatWrappers.cpp

[mlir][arith] Add support for min/max to `ArithToAPFloat`
DeltaFile
+40-0mlir/test/Conversion/ArithToApfloat/arith-to-apfloat.mlir
+20-0mlir/lib/ExecutionEngine/APFloatWrappers.cpp
+8-0mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+4-0mlir/test/Integration/Dialect/Arith/CPU/test-apfloat-emulation.mlir
+72-04 files

LLVM/project 1748e23mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/include/mlir/Target/LLVMIR ModuleTranslation.h

[MLIR][Intrinsics] Add new MLIR API to automatically resolve overload types (#168188)

Add a createIntrinsicCall overload that accepts the return type and
arguments and automatically resolves the overload types rather than
requiring manual computation. This simplifies NVVM_PrefetchOp by removing
its conditional overload logic.
DeltaFile
+7-4mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+9-0mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h
+1-6mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+17-103 files

LLVM/project 410d05fmlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp, mlir/lib/ExecutionEngine APFloatWrappers.cpp

[mlir][arith] Add support for `negf` to `ArithToAPFloat`
DeltaFile
+44-0mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+10-0mlir/test/Conversion/ArithToApfloat/arith-to-apfloat.mlir
+9-0mlir/lib/ExecutionEngine/APFloatWrappers.cpp
+4-0mlir/test/Integration/Dialect/Arith/CPU/test-apfloat-emulation.mlir
+67-04 files

LLVM/project 601f796mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Add missing rounding modes in fp16x2 conversions (#169005)

This change adds the `RN` and `RZ` rounding modes to the
`convert.f32x2.to.f16x2` and `convert.f32x2.to.bf16x2` Ops.

Tests are added in `convert_fp16x2.mlir` and
`invalid_convert_fp16x2.mlir`.
Tests with these Ops in `convert_stochastic_rounding.mlir` and
`invalid-convert-stochastic-rounding.mlir` have been removed or
modified.

PTX spec reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
DeltaFile
+122-28mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+87-0mlir/test/Target/LLVMIR/nvvm/convert_fp16x2.mlir
+2-66mlir/test/Target/LLVMIR/nvvm/convert_stochastic_rounding.mlir
+32-20mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+47-0mlir/test/Target/LLVMIR/nvvm/invalid_convert_fp16x2.mlir
+3-23mlir/test/Dialect/LLVMIR/nvvm/invalid-convert-stochastic-rounding.mlir
+293-1376 files

LLVM/project e7dec23llvm/lib/IR ReplaceConstant.cpp, llvm/test/CodeGen/AMDGPU lower-module-lds-constantexpr.ll same-lds-variable-multiple-use-in-one-phi-node.ll

[ReplaceConstant] Don't create instructions for the same constant multiple times in the same basic block (#169141)

Fixes #167500.
DeltaFile
+47-41llvm/test/CodeGen/AMDGPU/lower-module-lds-constantexpr.ll
+51-0llvm/test/CodeGen/AMDGPU/same-lds-variable-multiple-use-in-one-phi-node.ll
+26-17llvm/test/CodeGen/AMDGPU/lower-kernel-lds-constexpr.ll
+16-5llvm/lib/IR/ReplaceConstant.cpp
+140-634 files

LLVM/project 6e9c978llvm/include/llvm/Support File.h

fix windows

Created using spr 1.3.7
DeltaFile
+7-4llvm/include/llvm/Support/File.h
+7-41 files

LLVM/project 541e13cllvm/test/CodeGen/LoongArch/lasx build-vector.ll scalar-to-vector.ll, llvm/test/CodeGen/LoongArch/lasx/ir-instruction insertelement.ll

update tests
DeltaFile
+10-40llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
+7-19llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
+4-6llvm/test/CodeGen/LoongArch/lasx/ir-instruction/insertelement.ll
+4-4llvm/test/CodeGen/LoongArch/lasx/scalar-to-vector.ll
+4-4llvm/test/CodeGen/LoongArch/lsx/scalar-to-vector.ll
+29-735 files

LLVM/project b3428bbllvm/lib/IR LLVMContextImpl.cpp

Add missing freeConstants() call for ConstantPtrAuths.

Fixes memory leak uncovered by #133533.
DeltaFile
+1-0llvm/lib/IR/LLVMContextImpl.cpp
+1-01 files

LLVM/project aa0d95fllvm/lib/Target/LoongArch LoongArchISelLowering.cpp

[LoongArch] Legalize BUILD_VECTOR into a broadcast when all non-undef elements are identical

When a BUILD_VECTOR consists of the same element (ignoring undefs),
it is better to emit a broadcast instead of multiple insertions.

Some floating-point cases suffer performance regressions, so those
specific cases are excluded in this commit. They are the cases where:

- only one element is non-undef,
- only two elements are non-undef, and one of them is at index 0,
- for the v8f32 vector type, the only two non-undefs are at index
(1,2)/(1,3)/(2,3).
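
The core check, "all non-undef elements are identical", can be sketched as follows. This is a toy model with `std::nullopt` standing in for an undef lane; the function name is hypothetical, it does not use SelectionDAG types, and it leaves out the floating-point exclusions listed above.

```cpp
#include <optional>
#include <vector>

// Returns the shared value when every defined lane holds the same value.
// Returns std::nullopt when the defined lanes differ or all lanes are undef.
std::optional<int>
commonDefinedElement(const std::vector<std::optional<int>> &lanes) {
  std::optional<int> splat;
  for (const auto &lane : lanes) {
    if (!lane)
      continue; // undef lanes match anything
    if (!splat)
      splat = lane; // first defined lane sets the candidate
    else if (*splat != *lane)
      return std::nullopt; // two different defined values: not a broadcast
  }
  return splat;
}
```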
DeltaFile
+31-5llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+31-51 files

LLVM/project bc0235cllvm/test/CodeGen/LoongArch/lasx build-vector.ll, llvm/test/CodeGen/LoongArch/lsx build-vector.ll

[LoongArch][NFC] Add tests for build_vector containing same elements except for undefs
DeltaFile
+231-18llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
+149-18llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
+380-362 files

LLVM/project fb18f75cross-project-tests/debuginfo-tests/dexter/dex/debugger DAP.py

[lldb-dap] Add breakpoints after debugger initialization in DExTer (#169744)

# Summary
This is a forward fix for test errors from
https://github.com/llvm/llvm-project/pull/163653.

The PR moved debugger initialization outside of
InitializeRequestHandler, and into Launch/AttachRequestHandlers to
support DAP sessions sharing debugger instances for dynamically created
targets. However, DExTer's DAP class seemed to set breakpoints before
the debugger was initialized, which caused the tests to hang waiting for
a breakpoint to hit due to none of the breakpoints getting resolved.

# Tests
```
bin/llvm-lit -v /home/qxy11/llvm/llvm-project/cross-project-tests/debuginfo-tests/dexter-tests/
```
DeltaFile
+15-8cross-project-tests/debuginfo-tests/dexter/dex/debugger/DAP.py
+15-81 files

LLVM/project bacca23libcxx/include/__mdspan mdspan.h extents.h, libcxx/test/libcxx/containers/views/mdspan nodiscard.verify.cpp

[libc++][mdspan] Applied `[[nodiscard]]` (#169326)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.
- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant
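
For illustration, here is the attribute on a hypothetical container-like class (not libc++'s `mdspan`). Observers like these have no side effects, so ignoring the result is almost always a mistake, and a conforming compiler is encouraged to warn about it.

```cpp
#include <cstddef>

// Hypothetical type, for illustration only.
class Grid {
public:
  Grid(std::size_t rows, std::size_t cols) : rows_(rows), cols_(cols) {}

  // Pure observers: calling them and dropping the result does nothing.
  [[nodiscard]] std::size_t size() const noexcept { return rows_ * cols_; }
  [[nodiscard]] bool empty() const noexcept { return size() == 0; }

private:
  std::size_t rows_;
  std::size_t cols_;
};
```

Writing `g.empty();` as a statement on its own would typically produce a warning, which is the kind of diagnostic a `.verify.cpp` test checks for.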
DeltaFile
+62-0libcxx/test/libcxx/containers/views/mdspan/nodiscard.verify.cpp
+25-18libcxx/include/__mdspan/mdspan.h
+10-10libcxx/test/libcxx/containers/views/mdspan/extents/assert.obs.pass.cpp
+6-4libcxx/include/__mdspan/extents.h
+103-324 files

LLVM/project 504b507mlir/include/mlir/IR PatternMatch.h, mlir/include/mlir/Transforms DialectConversion.h

[mlir][Transforms] Dialect conversion: Add support for `replaceUsesWithIf` (#169606)

This commit adds support for `replaceUsesWithIf` (and variants such as
`replaceAllUsesExcept`) to the `ConversionPatternRewriter`. This API is
supported only in no-rollback mode. An assertion is triggered in
rollback mode. (The missing assertion has been confusing for users
because it seemed that the API was supported, while it was actually not
working properly.)

This commit brings us a bit closer towards removing
[this](https://github.com/llvm/llvm-project/blob/76ec25f729fcc7ae576caf21293cc393e68e7cf7/mlir/lib/Transforms/Utils/DialectConversion.cpp#L1214)
workaround.

Additional changes are needed to support this API in rollback mode. In
particular, no entries should be added to the `ConversionValueMapping`
for conditional replacements. It's unclear at this point if this API can
be supported in rollback mode, so this is deferred to later.

This commit turns `replaceUsesWithIf` into a virtual function, so that

    [8 lines not shown]
DeltaFile
+53-25mlir/lib/Transforms/Utils/DialectConversion.cpp
+23-0mlir/test/Transforms/test-legalizer-no-rollback.mlir
+21-0mlir/include/mlir/Transforms/DialectConversion.h
+7-1mlir/test/lib/Dialect/Test/TestPatterns.cpp
+3-3mlir/include/mlir/IR/PatternMatch.h
+107-295 files

LLVM/project bd643bcflang/include/flang/Optimizer/Transforms Passes.h Passes.td, flang/lib/Optimizer/Transforms FIRToSCF.cpp

[flang] Use default constructor for FIRToSCF pass (#169741)

DeltaFile
+2-4flang/lib/Optimizer/Transforms/FIRToSCF.cpp
+0-1flang/include/flang/Optimizer/Transforms/Passes.h
+0-1flang/include/flang/Optimizer/Transforms/Passes.td
+2-63 files

LLVM/project b028daclibcxx/include queue, libcxx/test/libcxx/diagnostics queue.nodiscard.verify.cpp

[libc++][queue] Applied `[[nodiscard]]` (#169469)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant
DeltaFile
+19-7libcxx/test/libcxx/diagnostics/queue.nodiscard.verify.cpp
+9-7libcxx/include/queue
+28-142 files