LLVM/project 1aaa05f llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/lib/Transforms/Utils SimplifyLibCalls.cpp

Revert "[InstCombine] Combine llvm.sin/llvm.cos libcall pairs into llvm.sinco…"

This reverts commit efdb493e485ceaa7a80392de338b02d00e9b67e0.
Delta  File
+0 -421  llvm/test/Transforms/InstCombine/sincos.ll
+0 -77  llvm/test/Transforms/InstCombine/sincos-fpmath.ll
+0 -67  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+18 -32  llvm/test/Transforms/InstCombine/fdiv-cos-sin.ll
+11 -29  llvm/test/Transforms/InstCombine/fdiv-sin-cos.ll
+2 -20  llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+31 -646  4 files not shown
+56 -662  10 files

LLVM/project 6129794 llvm/test/CodeGen/PowerPC aix-complex.ll

[NFC][PowerPC] aix-complex.ll - regenerate test checks (#194576)

Makes it easier to show the diffs in the topological DAG work.
Delta  File
+199 -49  llvm/test/CodeGen/PowerPC/aix-complex.ll
+199 -49  1 files

LLVM/project 7ebd47b clang/lib/AST Type.cpp, clang/lib/CodeGen CGExprScalar.cpp

[Clang][AArch64] Fix codegen for SVE vector compare operations (#194013)

Overloaded operators `<`, `>`, `<=`, `>=`, `==`, and `!=` with SVE
integer vector operands emitted LLVM IR with a couple of issues:
* The `icmp` instruction always performed unsigned comparison, even for
signed operands.
* The result of the comparison was zero-extended, whereas the intent is
to follow established NEON conventions and sign-extend it.

This patch fixes these issues.
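
The two issues can be modelled on plain scalars. The sketch below is not the Clang code and does not use SVE types; `lanes_less_than_signed` and `lanes_less_than_unsigned` are hypothetical helpers that treat a `std::vector<std::int8_t>` as a vector of lanes, to show why the choice of signed comparison and sign extension changes the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Intended behaviour: compare lanes as signed values and sign-extend the
// 1-bit result, so a true lane is all-ones (-1) and a false lane is 0.
std::vector<std::int8_t> lanes_less_than_signed(
    const std::vector<std::int8_t> &a, const std::vector<std::int8_t> &b) {
  assert(a.size() == b.size());
  std::vector<std::int8_t> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = (a[i] < b[i]) ? std::int8_t{-1} : std::int8_t{0};
  return out;
}

// Behaviour described as buggy: compare the raw bit patterns as unsigned
// values and zero-extend the 1-bit result, so a true lane is 1.
std::vector<std::int8_t> lanes_less_than_unsigned(
    const std::vector<std::int8_t> &a, const std::vector<std::int8_t> &b) {
  assert(a.size() == b.size());
  std::vector<std::int8_t> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    bool lt = static_cast<std::uint8_t>(a[i]) < static_cast<std::uint8_t>(b[i]);
    out[i] = lt ? std::int8_t{1} : std::int8_t{0};
  }
  return out;
}
```

For the lane pair `-1 < 1`, the signed model yields `-1` (true), while the unsigned model sees `255 < 1` and yields `0`.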
Delta  File
+148 -148  clang/test/CodeGen/AArch64/sve-vector-compare-ops.c
+40 -40  clang/test/CodeGenCXX/aarch64-sve-vector-conditional-op.cpp
+14 -2  clang/lib/AST/Type.cpp
+1 -1  clang/lib/CodeGen/CGExprScalar.cpp
+203 -191  4 files

LLVM/project 9c6e273 clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-intrinsics.c

[clang][CIR] Add lowering for vrshr_ and vrshrq_ rounding intrinsics (#194229)

This PR adds lowering for the vector rounding shift right intrinsics,
i.e. `vrshr_*` and `vrshrq_*` [1]. It also moves the corresponding tests
from:
  * clang/test/CodeGen/AArch64/neon-intrinsics.c

to:
  * clang/test/CodeGen/AArch64/neon/intrinsics.c

The lowering follows the existing implementation in
CodeGen/TargetBuiltins/ARM.cpp.
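
For readers unfamiliar with the operation, a rounding shift right adds half of the divisor before shifting. The following is a scalar sketch of one 32-bit signed lane, not the CIR lowering itself; `rounding_shift_right` is a hypothetical name chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane of a rounding shift right by an immediate n:
// add 1 << (n - 1) and then shift, using a 64-bit intermediate so the
// addition cannot overflow.
std::int32_t rounding_shift_right(std::int32_t x, unsigned n) {
  assert(n >= 1 && n <= 32);
  std::int64_t rounded =
      static_cast<std::int64_t>(x) + (std::int64_t{1} << (n - 1));
  // Right shift of a negative value is an arithmetic shift (guaranteed
  // since C++20, and the behaviour of mainstream compilers before that).
  return static_cast<std::int32_t>(rounded >> n);
}
```

With this model, `rounding_shift_right(7, 1)` gives 4, whereas a plain `7 >> 1` gives 3.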

Part of #185382.

Reference:
[1] https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#vector-rounding-shift-right

Co-authored-by: Md Mouzam Arfi Hussain <arfihussain27 at gmail.com>
Delta  File
+0 -205  clang/test/CodeGen/AArch64/neon-intrinsics.c
+205 -0  clang/test/CodeGen/AArch64/neon/intrinsics.c
+29 -7  clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+234 -212  3 files

LLVM/project efdb493 llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/lib/Transforms/Utils SimplifyLibCalls.cpp

[InstCombine] Combine llvm.sin/llvm.cos libcall pairs into llvm.sincos (#184760)

Teach InstCombine to recognize pairs of `llvm.sin(x)` and `llvm.cos(x)`
intrinsic calls that share the same argument and replace them with a
single `llvm.sincos(x)` call, extracting the individual results.

The optimization works in two phases:

1. **SimplifyLibCalls**: Convert `sin`/`cos` C library calls (e.g.
   `sinf`, `cosf`, `sin`, `cos`, `sinl`, `cosl`) into `llvm.sin` /
   `llvm.cos` intrinsics when the call does not access memory (i.e. does
   not set `errno`). This normalization step brings library calls into
   the same form as compiler-generated intrinsics.

2. **InstCombineCalls**: When visiting an `llvm.sin` or `llvm.cos`
   intrinsic, scan the users of the shared argument for a matching
   counterpart. If found, emit a single `llvm.sincos` call placed right
   after the argument definition, replace both original calls, and erase
   the matched instruction.
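
   The search for a matching counterpart in step 2 can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `Op`, `Call`, and `find_counterpart` are hypothetical names, calls live in a flat list, and values are plain integer ids rather than LLVM `Value` objects with use lists.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

enum class Op { Sin, Cos, Other };

// Toy stand-in for an intrinsic call: an operation applied to a value id.
struct Call {
  Op op;
  int arg; // id of the argument value
};

// Given the index of a sin or cos call, look for a call of the opposite
// kind on the same argument. Returns its index, or nullopt if none exists.
std::optional<std::size_t> find_counterpart(const std::vector<Call> &calls,
                                            std::size_t index) {
  assert(index < calls.size());
  const Call &self = calls[index];
  if (self.op == Op::Other)
    return std::nullopt;
  const Op wanted = (self.op == Op::Sin) ? Op::Cos : Op::Sin;
  for (std::size_t i = 0; i < calls.size(); ++i)
    if (i != index && calls[i].op == wanted && calls[i].arg == self.arg)
      return i;
  return std::nullopt;
}
```

   When a counterpart is found, the real transform would replace both calls with the two results of one combined call; the toy model stops at the pairing step.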

Also remove the completed sincos TODO from Target/README.txt.
Delta  File
+421 -0  llvm/test/Transforms/InstCombine/sincos.ll
+77 -0  llvm/test/Transforms/InstCombine/sincos-fpmath.ll
+67 -0  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+32 -18  llvm/test/Transforms/InstCombine/fdiv-cos-sin.ll
+29 -11  llvm/test/Transforms/InstCombine/fdiv-sin-cos.ll
+20 -2  llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+646 -31  4 files not shown
+662 -56  10 files

LLVM/project cced408 openmp/runtime CMakeLists.txt, openmp/runtime/src kmp_tasking.cpp kmp_taskdeps.cpp

[OpenMP] Rename ompx_taskgraph->omp_taskgraph_experimental

This patch renames the option to enable taskgraph support in the
runtime from OMPX_TASKGRAPH to OMP_TASKGRAPH_EXPERIMENTAL, to reflect
the feature's official status in OpenMP 6.0, but also the feature's
current work-in-progress nature.

commit-id:fa62775a

Reviewers: ro-i

Reviewed By: ro-i

Pull Request: https://github.com/llvm/llvm-project/pull/194045
Delta  File
+28 -27  openmp/runtime/src/kmp_tasking.cpp
+7 -7  openmp/runtime/src/kmp_taskdeps.cpp
+5 -5  openmp/runtime/src/kmp.h
+3 -3  openmp/runtime/CMakeLists.txt
+2 -2  openmp/runtime/src/kmp_config.h.cmake
+2 -2  openmp/runtime/src/kmp_settings.cpp
+47 -46  11 files not shown
+60 -59  17 files

LLVM/project 38daaba llvm/lib/Target/X86 X86MCInstLower.cpp, llvm/test/CodeGen/X86 vector-bitreverse.ll gfni-rotates.ll

[X86] Add constant pool comments for (V)GF2P8AFFINEQB instructions (#194572)

Still need to do predicate/broadcast handling, but that's true for most instructions, and we need a decent general mechanism to handle them.
Delta  File
+26 -26  llvm/test/CodeGen/X86/vector-bitreverse.ll
+24 -24  llvm/test/CodeGen/X86/gfni-rotates.ll
+14 -14  llvm/test/CodeGen/X86/gfni-lzcnt.ll
+12 -12  llvm/test/CodeGen/X86/gfni-funnel-shifts.ll
+21 -0  llvm/lib/Target/X86/X86MCInstLower.cpp
+9 -9  llvm/test/CodeGen/X86/gfni-shifts.ll
+106 -85  1 files not shown
+114 -93  7 files

LLVM/project 3d1fa9e mlir/include/mlir/Dialect/SMT/IR SMTTypes.td, mlir/lib/Dialect/SMT/IR SMTTypes.cpp

[mlir][smt] Allow empty function domains (#193732)
Delta  File
+0 -31  mlir/unittests/Dialect/SMT/TypeTest.cpp
+15 -2  mlir/lib/Dialect/SMT/IR/SMTTypes.cpp
+6 -0  mlir/test/Dialect/SMT/basic.mlir
+1 -4  mlir/include/mlir/Dialect/SMT/IR/SMTTypes.td
+0 -1  mlir/unittests/Dialect/SMT/CMakeLists.txt
+22 -38  5 files

LLVM/project b46a51d llvm/lib/CodeGen/GlobalISel MachineIRBuilder.cpp, llvm/unittests/CodeGen/GlobalISel MachineIRBuilderTest.cpp

[GISel] Add operands check for G_INSERT_SUBVECTOR and G_EXTRACT_SUBVECTOR in buildInstr (#186021)
Delta  File
+101 -11  llvm/unittests/CodeGen/GlobalISel/MachineIRBuilderTest.cpp
+59 -0  llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
+160 -11  2 files

LLVM/project 43645be llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/lib/Target/RISCV/GISel RISCVInstructionSelector.cpp

[RISCV][GISel] Support select G_INSERT_SUBVECTOR (#171092)
Delta  File
+411 -0  llvm/test/CodeGen/RISCV/GlobalISel/rvv/insert-subvector.ll
+61 -1  llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp
+2 -1  llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+474 -2  3 files

LLVM/project ee8ca6b flang/lib/Optimizer/OpenMP MapInfoFinalization.cpp, flang/test/Transforms omp-map-info-finalization-usm.fir

Revert "[Flang][OpenMP] Clear close on descriptor members for box parents in USM" (#194568)

Reverts llvm/llvm-project#194287

Buildbot errors in https://lab.llvm.org/buildbot/#/builders/67/builds/3464;
a local revert fixed the issues.
Delta  File
+0 -49  offload/test/offloading/fortran/usm-box-parent-descriptor-close.f90
+12 -12  flang/test/Transforms/omp-map-info-finalization-usm.fir
+12 -6  flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+24 -67  3 files

LLVM/project ecb69fb llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchLSXInstrInfo.td, llvm/test/CodeGen/LoongArch ctpop-with-lsx.ll sextw-removal.ll

[LoongArch] Optimize for scalar type `ctpop` when lsx enabled (#166286)
Delta  File
+32 -55  llvm/test/CodeGen/LoongArch/ctpop-with-lsx.ll
+38 -22  llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+11 -17  llvm/test/CodeGen/LoongArch/sextw-removal.ll
+10 -17  llvm/test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll
+21 -3  llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+112 -114  5 files

LLVM/project f409c26 llvm/lib/Target/LoongArch LoongArchInstrInfo.td, llvm/test/CodeGen/LoongArch ctlz-cttz-ctpop.ll

[LoongArch] Add patterns to match `cto.w/d` when meeting i8/i16 types `not+cttz` (#166124)
Delta  File
+8 -12  llvm/test/CodeGen/LoongArch/ctlz-cttz-ctpop.ll
+4 -0  llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
+12 -12  2 files

LLVM/project 33117e7 llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/test/Transforms/InstCombine/AArch64 sve-intrinsic-comb-all-active-lanes-cvt.ll sve-intrinsic-comb-no-active-lanes.ll

[AArch64][ISel] Remove zero instruction for `rev` all true predicates (#192925)

This patch removes the redundant instruction to zero inactive lanes for
SVE2p1 `rev` intrinsics when all lanes are active.
Delta  File
+100 -0  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-comb-all-active-lanes-cvt.ll
+4 -8  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-comb-no-active-lanes.ll
+4 -0  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+108 -8  3 files

LLVM/project 0c17423 llvm/lib/Target/LoongArch LoongArchLASXInstrInfo.td LoongArchLSXInstrInfo.td, llvm/test/CodeGen/LoongArch/lasx ctpop-ctlz.ll

[LoongArch] Add patterns for `[x]vclo.{b/h/w/d}` instructions (#165985)
Delta  File
+4 -8  llvm/test/CodeGen/LoongArch/lsx/ctpop-ctlz.ll
+4 -8  llvm/test/CodeGen/LoongArch/lasx/ctpop-ctlz.ll
+6 -0  llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+6 -0  llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+20 -16  4 files

LLVM/project e8ca585 mlir/lib/Dialect/Arith/IR ArithOps.cpp

[mlir][arith] Remove redundant lambdas (NFC) (#194376)

Replace trivial lambda wrappers with direct function references. The
lambdas simply forwarded their arguments to existing functions, so
passing the function directly is clearer and more concise.
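
The kind of cleanup described can be shown on a small standalone case. This is an illustrative sketch, not code from ArithOps.cpp; `add_one`, `map_values`, `with_lambda`, and `with_function` are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

int add_one(int v) { return v + 1; }

// Applies fn to every element and returns the results.
template <typename Fn>
std::vector<int> map_values(const std::vector<int> &in, Fn fn) {
  std::vector<int> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(), fn);
  return out;
}

// Before: a lambda that only forwards its argument to add_one.
std::vector<int> with_lambda(const std::vector<int> &in) {
  return map_values(in, [](int v) { return add_one(v); });
}

// After: the function is passed directly; behaviour is unchanged.
std::vector<int> with_function(const std::vector<int> &in) {
  return map_values(in, add_one);
}
```

Passing the function by name works here because `add_one` is not overloaded; an overloaded or templated callee would still need a lambda or an explicit cast.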
Delta  File
+11 -29  mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+11 -29  1 files

LLVM/project 09cd294 clang/lib/Sema SemaOpenMP.cpp, clang/test/OpenMP target_device_omp_initial_invalid.c target_device_messages.cpp

[Clang][OpenMP] Validate omp_initial_device omp_invalid_device as device IDs (#193688)

The counterpart fix for clang (as was also done here:
[flang-fix](https://github.com/llvm/llvm-project/pull/193669)).

The incorrectly interpreted device values in the `target` directive
throw:
 
```
 error: argument to 'device' clause must be a non-negative integer value
    #pragma omp target device(-1)
                              ^~
error: argument to 'device' clause must be a non-negative integer value
    #pragma omp target device(omp_invalid_device)
                              ^~~~~~~~~~~~~~~~~~
```
Delta  File
+55 -0  clang/test/OpenMP/target_device_omp_initial_invalid.c
+29 -5  clang/lib/Sema/SemaOpenMP.cpp
+10 -6  clang/test/OpenMP/target_device_messages.cpp
+8 -4  clang/test/OpenMP/target_update_device_messages.cpp
+7 -3  clang/test/OpenMP/target_teams_distribute_device_messages.cpp
+7 -3  clang/test/OpenMP/target_teams_distribute_parallel_for_simd_device_messages.cpp
+116 -21  14 files not shown
+186 -51  20 files

LLVM/project a14cd6c llvm/lib/Target/LoongArch LoongArchLSXInstrInfo.td LoongArchLASXInstrInfo.td, llvm/test/CodeGen/LoongArch/lsx bitclr.ll bitrev.ll

[LoongArch] Support VBIT{CLR,SET,REV}I patterns for non-native element sizes

Extend vsplat_uimm_{pow2,inv_pow2} matching to allow specifying an explicit
element bit width, enabling recognition of splat patterns whose logical
element size differs from the vector's native element type.

Introduce templated selectVSplatUimm{Pow2,InvPow2} helpers with an optional
EltSize parameter, and add corresponding ComplexPattern definitions for
i8/i16/i32 element widths. This allows TableGen patterns to match cases such
as operating on v8i32/v4i64 vectors with masks derived from smaller element
sizes (e.g. i16).

With these changes, AND/OR/XOR operations using inverse power-of-two or
power-of-two splat masks are now correctly selected to VBITCLRI, VBITSETI,
and VBITREVI instructions instead of falling back to vector logical
operations with materialized constants.
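
The idea of matching a splat at an explicit element width can be sketched on a single 64-bit lane. This is a simplified model and not the `selectVSplatUimm{Pow2,InvPow2}` helpers themselves; `splat_pow2_bit` and `splat_inv_pow2_bit` are hypothetical names, and the real matching works on SelectionDAG nodes rather than raw integers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// If every eltBits-wide chunk of a 64-bit lane holds the same power of two,
// return the index of the set bit within a chunk; otherwise return nullopt.
std::optional<unsigned> splat_pow2_bit(std::uint64_t lane, unsigned eltBits) {
  assert(eltBits == 8 || eltBits == 16 || eltBits == 32 || eltBits == 64);
  const std::uint64_t mask =
      eltBits == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << eltBits) - 1);
  const std::uint64_t chunk = lane & mask;
  // Every chunk must repeat the lowest one.
  for (unsigned shift = eltBits; shift < 64; shift += eltBits)
    if (((lane >> shift) & mask) != chunk)
      return std::nullopt;
  // Exactly one set bit means the chunk is a power of two.
  if (chunk == 0 || (chunk & (chunk - 1)) != 0)
    return std::nullopt;
  unsigned bit = 0;
  while (((chunk >> bit) & 1) == 0)
    ++bit;
  return bit;
}

// An inverse power-of-two mask (all ones except one bit per chunk) is
// recognised by complementing the lane first.
std::optional<unsigned> splat_inv_pow2_bit(std::uint64_t lane,
                                           unsigned eltBits) {
  return splat_pow2_bit(~lane, eltBits);
}
```

For example, the i64 value `0x0001000100010001` is not a power of two as a 64-bit element, but viewed as four i16 chunks it is a splat of bit 0, which is the situation the commit message describes for masks derived from smaller element sizes.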
Delta  File
+35 -0  llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+27 -0  llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+8 -4  llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
+3 -6  llvm/test/CodeGen/LoongArch/lsx/bitclr.ll
+3 -6  llvm/test/CodeGen/LoongArch/lsx/bitrev.ll
+3 -6  llvm/test/CodeGen/LoongArch/lsx/bitset.ll
+79 -22  4 files not shown
+90 -41  10 files

LLVM/project 095cbda llvm/lib/Target/LoongArch LoongArchLASXInstrInfo.td

Address wanglei's comments
Delta  File
+18 -18  llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+18 -18  1 files

LLVM/project ac41786 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 sve-gather-scatter-addr-opts.ll

[AArch64][SDAG] Fix invalid index into STEP_VECTOR operand (#193236)

This commit fixes an invalid index into the STEP_VECTOR constant operand
that occurred when trying to find a more optimal addressing type.
 
Related Issue:
https://github.com/llvm/llvm-project/issues/193014#event-24715078836
Delta  File
+21 -0  llvm/test/CodeGen/AArch64/sve-gather-scatter-addr-opts.ll
+0 -11  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+21 -11  2 files

LLVM/project 2552635 llvm/test/CodeGen/LoongArch/lasx bitset.ll bitclr.ll, llvm/test/CodeGen/LoongArch/lsx bitclr.ll bitrev.ll

[LoongArch][NFC] Add tests for VBITCLRI, VBITSETI, and VBITREVI (#193718)
Delta  File
+96 -0  llvm/test/CodeGen/LoongArch/lasx/bitset.ll
+96 -0  llvm/test/CodeGen/LoongArch/lasx/bitclr.ll
+96 -0  llvm/test/CodeGen/LoongArch/lasx/bitrev.ll
+96 -0  llvm/test/CodeGen/LoongArch/lsx/bitclr.ll
+96 -0  llvm/test/CodeGen/LoongArch/lsx/bitrev.ll
+96 -0  llvm/test/CodeGen/LoongArch/lsx/bitset.ll
+576 -0  6 files

LLVM/project a54419a llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchLASXInstrInfo.td, llvm/test/CodeGen/LoongArch/lasx/ir-instruction avgfloor-ceil.ll

[LoongArch] Set `avg{floor/ceil}{s/u}` as legal for lsx and lasx (#165836)

Suggested-by: tangaac <tangyan01 at loongson.cn>
Link:
https://github.com/llvm/llvm-project/pull/161079#issuecomment-3420763377
Delta  File
+16 -64  llvm/test/CodeGen/LoongArch/lasx/ir-instruction/avgfloor-ceil.ll
+16 -64  llvm/test/CodeGen/LoongArch/lsx/ir-instruction/avgfloor-ceil.ll
+8 -0  llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+4 -0  llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+4 -0  llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+48 -128  5 files

LLVM/project f5b6e4f libcxx/include CMakeLists.txt, libcxx/include/__locale_dir locale_base_api.h

[libc++][NFC] Remove unused header <__support/xlocale/__nop_locale_mgmt.h> (#194316)
Delta  File
+0 -33  libcxx/include/__support/xlocale/__nop_locale_mgmt.h
+3 -0  libcxx/include/__locale_dir/locale_base_api.h
+0 -1  libcxx/include/CMakeLists.txt
+3 -34  3 files

LLVM/project a6cf1aa mlir/lib/Dialect/SPIRV/IR SPIRVOps.cpp, mlir/test/Dialect/SPIRV/IR structure-ops.mlir

[mlir][SPIR-V] Allow SpecConstantComposite constituents to reference other SpecConstantComposites (#193416)

The verifier for spirv.SpecConstantComposite previously assumed all
constituents were spirv.SpecConstant ops, which caused a crash when
referencing nested spirv.SpecConstantComposite ops.

Per the SPIR-V spec (s3.3.7, OpSpecConstantComposite), constituents
"must be the \<id\>s of other specialization constants, constant
declarations, or an OpUndef", which includes OpSpecConstantComposite.
Delta  File
+51 -0  mlir/test/Dialect/SPIRV/IR/structure-ops.mlir
+20 -6  mlir/lib/Dialect/SPIRV/IR/SPIRVOps.cpp
+71 -6  2 files

LLVM/project a726282 lldb/source/Commands CommandObjectThread.cpp CommandObjectType.cpp, lldb/source/Interpreter CommandInterpreter.cpp

[lldb] Remove full stop from AppendErrorWithFormat format strings (part 2) (#194352)

To fit the style guide:
https://llvm.org/docs/CodingStandards.html#error-and-warning-messages

I found these with:
* Find `(\.AppendErrorWithFormat\(([\s\r\n]+)?"(?:(?:\\.|[^"\\])*))\."`
and replace with `$1"` using Visual Studio Code.
* Putting a call to `validate_diagnostic` in `AppendErrorWithFormat`.
* Manual inspection.
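
The find-and-replace step can be reproduced outside an editor. The sketch below applies the same pattern and replacement with `std::regex` (ECMAScript grammar) instead of Visual Studio Code; `strip_trailing_full_stop` is a hypothetical helper name, and the pattern is copied from the message above.

```cpp
#include <regex>
#include <string>

// Removes a full stop that sits immediately before the closing quote of the
// format string passed to .AppendErrorWithFormat(...).
std::string strip_trailing_full_stop(const std::string &source) {
  static const std::regex pattern(
      R"re((\.AppendErrorWithFormat\(([\s\r\n]+)?"(?:(?:\\.|[^"\\])*))\.")re");
  // $1 is everything up to (but not including) the final full stop; the
  // closing quote is added back by the replacement text.
  return std::regex_replace(source, pattern, "$1\"");
}
```

A call such as `result.AppendErrorWithFormat("invalid thread index %u.", idx);` becomes `result.AppendErrorWithFormat("invalid thread index %u", idx);`, while format strings without a trailing full stop are left unchanged.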

Note that this change *does not* include a call to `validate_diagnostic`
because I do not know what's going to crash on platforms that I haven't
tested on.
Delta  File
+31 -32  lldb/source/Commands/CommandObjectThread.cpp
+13 -14  lldb/source/Commands/CommandObjectType.cpp
+13 -13  lldb/source/Commands/CommandObjectTarget.cpp
+10 -12  lldb/source/Commands/CommandObjectSource.cpp
+7 -7  lldb/source/Interpreter/CommandInterpreter.cpp
+3 -3  lldb/source/Commands/CommandObjectProcess.cpp
+77 -81  6 files not shown
+87 -91  12 files

LLVM/project 711a17d llvm/include/llvm/CodeGenTypes LowLevelType.h, llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp

[AArch64][GlobalISel] Lower BF16 FPTRUNC (#193941)

When the architecture's +bf16 feature is available, this is simple, as we
lower to a standard instruction. When it is not available, we need to expand to
a series of instructions that performs the necessary rounding. The code
to do that is a port of TargetLowering::expandFP_ROUND to GISel, minus
the float64 odd rounding via expandRoundInexactToOdd. f64 will follow in
a followup patch.
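
To give a feel for the rounding that a software expansion has to perform, here is a scalar sketch of a commonly used float-to-bfloat16 conversion with round-to-nearest-even. It is an illustration under assumptions, not a transcription of `TargetLowering::expandFP_ROUND` or of the GISel port; `float_to_bf16_bits` is a hypothetical name, and the NaN handling shown is one common choice.

```cpp
#include <cstdint>
#include <cstring>

// Converts a 32-bit float to the 16 bits of a bfloat16 value using
// round-to-nearest, ties-to-even on the discarded low 16 bits.
std::uint16_t float_to_bf16_bits(float value) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);

  // NaN: truncate and force a mantissa bit so the result stays a NaN.
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);

  // Add 0x7FFF plus the lowest kept bit: values above the halfway point
  // round up, and exact ties round towards an even result.
  const std::uint32_t lsb = (bits >> 16) & 1u;
  bits += 0x7FFFu + lsb;
  return static_cast<std::uint16_t>(bits >> 16);
}
```

For instance, `1.0f` (bits `0x3F800000`) maps to `0x3F80`, and the tie `0x3F808000` also rounds to the even result `0x3F80`.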

uitofp and sitofp are currently disabled, so that we can take this one
step at a time and check each part in turn. The LLT fp types attempt to
return true for ieee types without UseExtended for types of the correct
size, always returning false for non-standard types.
Delta  File
+77 -39  llvm/test/CodeGen/AArch64/bf16-instructions.ll
+46 -22  llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+47 -6  llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+29 -15  llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+32 -3  llvm/include/llvm/CodeGenTypes/LowLevelType.h
+7 -7  llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+238 -92  1 files not shown
+239 -92  7 files

LLVM/project a910a0b mlir/lib/Dialect/X86/Transforms VectorContractToPackedTypeDotProduct.cpp, mlir/test/Dialect/X86 vector-contract-to-packed-type-dotproduct.mlir

[mlir][x86] Fix - Replace `load` with `transfer_read` to support on tensor type. (#194543)

This patch replaces the `vector.load` operation with a
`vector.transfer_read` op, so that the rewrite lowers
`vector.contract` ops to `bf16_avx512_dp`.
Delta  File
+105 -0  mlir/test/Dialect/X86/vector-contract-to-packed-type-dotproduct.mlir
+10 -2  mlir/lib/Dialect/X86/Transforms/VectorContractToPackedTypeDotProduct.cpp
+115 -2  2 files

LLVM/project b6ed6b6 llvm/docs AMDGPUUsage.rst

Add operands doc
Delta  File
+13 -0  llvm/docs/AMDGPUUsage.rst
+13 -0  1 files

LLVM/project 9edf0e7 flang/include/flang/Optimizer/Analysis ArraySectionAnalyzer.h, flang/lib/Optimizer/Analysis ArraySectionAnalyzer.cpp

[flang] improve array section analysis for WHERE (#194399)

The array section analysis in the HLFIR pass in charge of WHERE lowering
was unable to tell that the LHS and RHS are the same array section when
the base is an assumed-shape array or when variables are used as indices.

This patch adds an optional callback to the array section
analysis to tell if two SSA values have the same value. This callback
is then implemented to report that two SSA values are the same only if
they are the result of equivalent operations with no memory effect (it is
OK for them to be non-speculatable) and with operands that have the same
value (recursively), or if they are loads from the same variable (which is
OK in the context of WHERE RHS/LHS thanks to Fortran 2023 10.1.4, which
guarantees that a variable referenced on both the RHS and LHS cannot be
modified by side effects in the RHS/LHS).
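
The recursive rule can be sketched on a toy value graph. This is a simplified model, not the MLIR callback from the patch: `Value` and `same_value` are hypothetical, operations are identified by a string that is assumed to include any attributes, and loads are identified by variable name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy SSA value: either a load of a named variable or the result of an
// operation applied to other values.
struct Value {
  std::string op;                      // e.g. "addi", "constant 1", "load"
  std::string variable;                // variable name, used by loads only
  std::vector<const Value *> operands; // inputs of non-load operations
  bool hasMemoryEffect = false;
};

// Two values are considered the same if they are loads of the same
// variable, or equivalent operations without memory effects whose operands
// are (recursively) the same.
bool same_value(const Value *a, const Value *b) {
  assert(a != nullptr && b != nullptr);
  if (a == b)
    return true;
  if (a->op != b->op)
    return false;
  if (a->op == "load")
    return a->variable == b->variable;
  if (a->hasMemoryEffect || b->hasMemoryEffect)
    return false;
  if (a->operands.size() != b->operands.size())
    return false;
  for (std::size_t i = 0; i < a->operands.size(); ++i)
    if (!same_value(a->operands[i], b->operands[i]))
      return false;
  return true;
}
```

In this model, two separate computations of `i + 1` from two loads of `i` compare equal, which corresponds to the subscript case the patch aims to recognise; treating two loads of one variable as equal is only sound in a context like the WHERE RHS/LHS case described above.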

Assisted by: Claude
Delta  File
+143 -0  flang/test/HLFIR/order_assignments/where-equivalent-subscripts.fir
+114 -15  flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp
+24 -3  flang/include/flang/Optimizer/Analysis/ArraySectionAnalyzer.h
+19 -5  flang/lib/Optimizer/Analysis/ArraySectionAnalyzer.cpp
+300 -23  4 files

LLVM/project 972d2c2 llvm/docs AMDGPUUsage.rst

Comments
Delta  File
+4 -9  llvm/docs/AMDGPUUsage.rst
+4 -9  1 files