[TTI][Vectorize] Migrate masked/gather-scatter/strided/expand-compress costing (NFCI) (#165532)
In #160470, there is a discussion about the possibility of exploring a
general approach for handling memory intrinsics.
API changes:
- Remove getMaskedMemoryOpCost, getGatherScatterOpCost,
getExpandCompressMemoryOpCost, getStridedMemoryOpCost from
Analysis/TargetTransformInfo.
- Add getMemIntrinsicInstrCost.
In BasicTTIImpl, map intrinsic IDs to the existing target implementations
until the legacy TTI hooks are retired.
- masked_load/store → getMaskedMemoryOpCost
- masked_/vp_gather/scatter → getGatherScatterOpCost
- masked_expandload/compressstore → getExpandCompressMemoryOpCost
- experimental_vp_strided_{load,store} → getStridedMemoryOpCost
TODO: add support for vp_load_ff.
No functional change intended; costs continue to route to the same
target-specific hooks.
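For illustration only, the ID-to-hook mapping listed above can be sketched as a plain switch. The `MemIntrinsic` enum and `legacyHookFor` are hypothetical stand-ins, not the real `llvm::Intrinsic` IDs or the actual BasicTTIImpl code; only the hook names come from this commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the intrinsic IDs named in the commit message.
// These are NOT the real llvm::Intrinsic enumerators.
enum class MemIntrinsic {
  MaskedLoad,
  MaskedStore,
  MaskedGather,
  MaskedScatter,
  VPGather,
  VPScatter,
  MaskedExpandLoad,
  MaskedCompressStore,
  VPStridedLoad,
  VPStridedStore,
};

// Returns the name of the legacy hook that each intrinsic is routed to,
// following the mapping listed in the commit message.
inline std::string legacyHookFor(MemIntrinsic ID) {
  switch (ID) {
  case MemIntrinsic::MaskedLoad:
  case MemIntrinsic::MaskedStore:
    return "getMaskedMemoryOpCost";
  case MemIntrinsic::MaskedGather:
  case MemIntrinsic::MaskedScatter:
  case MemIntrinsic::VPGather:
  case MemIntrinsic::VPScatter:
    return "getGatherScatterOpCost";
  case MemIntrinsic::MaskedExpandLoad:
  case MemIntrinsic::MaskedCompressStore:
    return "getExpandCompressMemoryOpCost";
  case MemIntrinsic::VPStridedLoad:
  case MemIntrinsic::VPStridedStore:
    return "getStridedMemoryOpCost";
  }
  assert(false && "unhandled intrinsic in this sketch");
  return "";
}
```

A single entry point that switches on the ID is what lets the four legacy hooks be retired one at a time without touching callers.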
[TTI] Use MemIntrinsicCostAttributes for getExpandCompressMemoryOpCost (#168677)
- Following #168029. This is a step toward a unified interface for
masked/gather-scatter/strided/expand-compress cost modeling.
- Replace the ad-hoc parameter list with a single attributes object.
API change:
```
- InstructionCost getExpandCompressMemoryOpCost(Opcode, DataTy,
- VariableMask, Alignment,
- CostKind, Inst);
+ InstructionCost getExpandCompressMemoryOpCost(MemIntrinsicCostAttributes,
+ CostKind);
```
Notes:
- NFCI: callers populate MemIntrinsicCostAttributes with the same
information as before.
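A minimal sketch of the parameter-object refactor described above, using a simplified model. `MemCostAttrs`, its fields, and the toy cost formula are assumptions made for illustration; they are not the real `MemIntrinsicCostAttributes` layout or any target's cost model.

```cpp
#include <cassert>

enum class CostKind { RecipThroughput, CodeSize };

// Hypothetical stand-in for MemIntrinsicCostAttributes. The field set is an
// assumption modeled on the old parameter list shown in the diff above.
struct MemCostAttrs {
  unsigned Opcode;
  unsigned NumElements;
  bool VariableMask;
  unsigned AlignBytes;
};

// Old shape: every piece of information is a separate parameter.
// The cost formula is made up purely so the sketch returns something.
int legacyExpandCompressCost(unsigned Opcode, unsigned NumElements,
                             bool VariableMask, unsigned AlignBytes,
                             CostKind Kind) {
  (void)Opcode;
  (void)AlignBytes;
  assert(NumElements > 0 && "vector must have at least one element");
  int PerElement = VariableMask ? 2 : 1;
  int Cost = static_cast<int>(NumElements) * PerElement;
  return Kind == CostKind::CodeSize ? 1 : Cost;
}

// New shape: callers pack the same information into one attributes object,
// so the result is unchanged (the NFCI property).
int expandCompressCost(const MemCostAttrs &A, CostKind Kind) {
  return legacyExpandCompressCost(A.Opcode, A.NumElements, A.VariableMask,
                                  A.AlignBytes, Kind);
}
```

With this shape, adding a new attribute later only touches the struct and the hooks that care about it, not every call site.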
[LLD] Add support for statically resolved vendor-specific RISCV relocations. (#169273)
This is achieved by using some of the bits of RelType to tag vendor namespaces. This change also adds a relocation iterator for RISCV that folds vendor namespaces into the RelType of the following relocation.
This patch is extracted from the implementation of RISCV vendor-specific relocations in the CHERIoT LLVM downstream: https://github.com/CHERIoT-Platform/llvm-project/commit/3d6d6f7d9480b590731cbcf4b4817e1fa3049854
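As a rough sketch of the tagging idea, assuming an 8-bit raw relocation type with the vendor namespace stored in the bits above it. The bit layout, the `RawReloc` struct, and all function names here are hypothetical and do not reflect LLD's actual encoding; in a real object file the vendor is identified by the symbol on the `R_RISCV_VENDOR` relocation, which this sketch reduces to a small integer ID.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using RelType = uint32_t;

// Assumed layout: low 8 bits hold the raw ELF relocation type, the bits
// above hold a vendor namespace ID (0 means "standard, no vendor").
const RelType kTypeMask = 0xFF;
const unsigned kVendorShift = 8;
// R_RISCV_VENDOR as I understand the RISC-V psABI numbering.
const RelType kRiscvVendor = 191;

inline RelType tagWithVendor(RelType Type, uint32_t VendorId) {
  assert(Type <= kTypeMask && "raw type must fit in the low bits");
  return (VendorId << kVendorShift) | Type;
}
inline uint32_t vendorOf(RelType T) { return T >> kVendorShift; }
inline RelType rawTypeOf(RelType T) { return T & kTypeMask; }

// Hypothetical input record: VendorId is only meaningful on a vendor marker.
struct RawReloc {
  RelType Type;
  uint32_t VendorId;
};

// Folds each vendor marker into the RelType of the relocation that follows
// it, so later code sees a single tagged type per relocation.
std::vector<RelType> foldVendorRelocs(const std::vector<RawReloc> &Relocs) {
  std::vector<RelType> Out;
  uint32_t Pending = 0;
  for (const RawReloc &R : Relocs) {
    if (R.Type == kRiscvVendor) {
      Pending = R.VendorId; // remember the namespace, emit nothing
      continue;
    }
    Out.push_back(tagWithVendor(R.Type, Pending));
    Pending = 0; // a marker applies to the next relocation only
  }
  return Out;
}
```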
[mlir][arith] Fix `arith.cmpf` lowering with unsupported FP types (#166684)
The `arith.cmpf` lowering pattern used to generate invalid IR when an
unsupported floating-point type was used.
[QualGroup][docs] Update meeting schedule and link for slides (#169458)
Summary
======
This PR updates the schedule for the online sync-up and updates the link
for past meeting slides.
Changes
======
* Remove the Wednesday schedule, since we did not have the meeting for
Americas-friendly timezones.
* Use a single folder for past meeting slides instead of individual
links.
Related Links
=========
* [Meeting materials for Qualification Working
Group](https://llvm.org/docs/QualGroup.html#meeting-materials)
* [Online
[4 lines not shown]
[libc++] Fix the locale base API on Linux with musl (#167980)
This pull request addresses an issue encountered when building
**libcxx** with certain configurations (`-D_LIBCPP_HAS_MUSL_LIBC` &
`-D__linux__`) that lack the `_GNU_SOURCE` definition. Specifically,
this issue arises if the system **musl libc** is built with
`_BSD_SOURCE` instead of `_GNU_SOURCE`. The resultant configuration
leads to problems with the "Strtonum functions" in the file
[libcxx/include/__locale_dir/support/linux.h](https://github.com/llvm/llvm-project/tree/master/libcxx/include/__locale_dir/support/linux.h),
affecting the following functions:
- `__strtof`
- `__strtod`
- `__strtold`
**Error messages displayed include**:
```console
error: no member named 'strtof_l' in the global namespace
```
[9 lines not shown]
[NFC] [DirectX] Make DirectX codegen test `CBufferAccess/gep-ce-two-uses.ll` more strict (#169855)
Continuation of PR #169848 to address PR comments.
This PR makes the test more strict by adding CHECKs to ensure the loads
are indeed using the same or different GEPs.
[VPlan] Handle scalar VPWidenPointerInd in convertToConcreteRecipes. (#169338)
In some cases, VPWidenPointerInductions become only used by scalars after
legalizeAndOptimizeInductions was already run, for example due to
some VPlan optimizations.
Move the code to scalarize VPWidenPointerInductions to a helper and use
it if needed.
This fixes a crash after #148274 in the added test case.
Fixes https://github.com/llvm/llvm-project/issues/169780
[bolt][aarch64] Change indirect call instrumentation snippet
The indirect call instrumentation snippet uses the x16 register in the exit
handler to branch to the destination target:
__bolt_instr_ind_call_handler_func:
msr nzcv, x1
ldp x0, x1, [sp], #16
ldr x16, [sp], #16
ldp x0, x1, [sp], #16
br x16 <-----
This patch changes the instrumentation snippet so that it calls the
instrumentation runtime library through an indirect call instruction, and
adds a wrapper that stores and loads the target value and the register used
by the original indirect instruction.
Example:
mov x16, foo
[79 lines not shown]
[MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr
This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.
[NFC] [DirectX] Update DirectX codegen test `CBufferAccess/gep-ce-two-uses.ll` due to changes to ReplaceConstant (#169848)
Fixes an LLVM DirectX codegen test after it broke due to #169141.
The CBuffer loads and GEPs are no longer duplicated when there are two
or more accesses within the same basic block.
This PR removes the duplicate check for CBuffer load and GEP from the
original test function `@f` and adds a new test function `@g` which
places duplicate CBuffer loads into separate basic blocks.
[lld][WebAssembly] Fix SEGFAULT when importing wrapped symbol (#169656)
When wrapping a symbol `foo` via `-wrap=foo`, we create the symbol
`__wrap_foo` that replaces all mentions of `foo`. This feature was
implemented for wasm-ld in commit a5ca34e.
So far, no valid signature has been attached to the undefined symbol,
leading to a nullptr dereference in the logic for creating the import
section. This change adds the correct signature to the wrapped symbol,
enabling the generation of an import for it.
[AArch64] Use SVE for fixed-length bf16 operations with +sve-b16b16 (#169329)
This can avoid the promotion bf16 -> f32 -> bf16 round trip (or costly
expansions).
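For context, the promotion round trip mentioned above can be sketched in scalar C++ as follows. This is an illustrative model of bf16 arithmetic done via f32, not the code LLVM generates; the function names are hypothetical and NaN inputs are deliberately not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE-754 float");

// bf16 is the upper 16 bits of an f32, so widening is a shift.
inline float bf16ToF32(uint16_t B) {
  uint32_t Bits = static_cast<uint32_t>(B) << 16;
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Narrowing rounds to nearest, ties to even, on the discarded low 16 bits.
// NaN is rejected because this simple rounding trick can corrupt it.
inline uint16_t f32ToBf16(float F) {
  assert(F == F && "NaN inputs are not handled by this sketch");
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  uint32_t Lsb = (Bits >> 16) & 1u;
  Bits += 0x7FFFu + Lsb;
  return static_cast<uint16_t>(Bits >> 16);
}

// One bf16 add done the "promoted" way: widen both operands, add in f32,
// then round the result back to bf16.
inline uint16_t addBf16ViaF32(uint16_t A, uint16_t B) {
  return f32ToBf16(bf16ToF32(A) + bf16ToF32(B));
}
```

Each operation pays for two widenings and one rounding step, which is the overhead a native bf16 instruction can skip.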
Revert "[ShrinkWrap] Modify shrink wrapping to accommodate functions terminated by no-return blocks" (#169852)
Reverts llvm/llvm-project#167548
As commented at
https://github.com/llvm/llvm-project/pull/167548#issuecomment-3587008602
this is causing miscompiles in two-stage RISC-V Clang/LLVM builds that
result in test failures on the builders.