[NFCI][bolt][test] Use AT&T syntax explicitly (#167225)
This enables building LLVM with `-mllvm -x86-asm-syntax=intel` in one's
Clang config files (i.e. a global preference for Intel syntax).
`-masm=att` is insufficient as it doesn't override a specification of `-mllvm -x86-asm-syntax`.
[clang][NVPTX] Add remaining float to fp16 conversions (#167641)
This change adds intrinsics and clang builtins for the remaining float
to fp16 conversions. This includes the following conversions:
- float to bf16x2 - satfinite variants
- float to f16x2 - satfinite variants
- float to bf16 - satfinite variants
- float to f16 - all variants
Tests are added in `convert-sm80.ll` and `convert-sm80-sf.ll` for the
intrinsics and in `builtins-nvptx.c` for the clang builtins.
[RISCV][NewPM] Port RISCVCodeGenPrepare to the new pass manager (#168381)
As suggested in the review for #160536, it would be good to follow up and
port the RISC-V passes to the new pass manager. This PR starts that
task. It provides the bare minimum necessary to run RISCVCodeGenPrepare
with `opt -passes=riscv-codegenprepare`. The approach used is modeled on
my observations of the AMDGPU backend and the recent work to port the
X86 passes.
The testing approach is to add a `-passes=riscv-foo` RUN line to at
least one test, if an appropriate test exists.
ELF,test: Test unversioned undefined symbols of index 0 and 1
My 2020 change that added versioned symbol recognition
(reviews.llvm.org/D80059) checks both VER_NDX_LOCAL and VER_NDX_GLOBAL,
though test coverage was missing. lld/test/ELF/dso-undef-extract-lazy.s
checks that the undefined symbol is indeed considered unversioned.
[libc] Fix -Wshorten-64-to-32 in fileop_test. (#168451)
Explicitly cast 0 to size_t type to match fread() return type. This
follows the pattern used elsewhere in this file, and fixes
-Wshorten-64-to-32 warnings when building the test.
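A minimal sketch of why the cast matters, assuming a test helper that deduces
its type from the expected value only. `expect_eq`, `Identity`, and the two
functions below are hypothetical names for illustration, not the libc test
framework's API:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: T is deduced from the first argument only, so the
// second argument is implicitly converted to T.
template <typename T> struct Identity { using type = T; };

template <typename T>
bool expect_eq(T Expected, typename Identity<T>::type Actual) {
  return Expected == Actual;
}

// Reads from an empty temporary file; fread() returns a size_t count.
// If the temporary file cannot be created, report 0 items read.
size_t read_from_empty_file() {
  std::FILE *F = std::tmpfile();
  if (!F)
    return 0;
  char Buf[8];
  size_t N = std::fread(Buf, 1, sizeof(Buf), F);
  std::fclose(F);
  return N;
}

bool check_empty_read() {
  // Writing expect_eq(0, read_from_empty_file()) would deduce T = int and
  // narrow the 64-bit size_t result to 32 bits, which is the kind of
  // conversion -Wshorten-64-to-32 reports. Casting the literal keeps both
  // sides at size_t.
  return expect_eq(size_t(0), read_from_empty_file());
}
```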
[orc-rt] Simplify Session shutdown. (#168664)
Moves all Session member variables dedicated to shutdown into a new
ShutdownInfo struct, and uses the presence / absence of this struct as
the flag to indicate that we've entered the "shutting down" state. This
simplifies the implementation of the shutdown process.
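The "presence of the struct is the flag" idea can be sketched as follows. This
is an illustrative, single-threaded sketch only: `SketchSession`, its members,
and the callback list are assumptions, not the orc-rt implementation.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

class SketchSession {
public:
  // No separate bool: the session is shutting down exactly when the
  // ShutdownInfo object exists.
  bool shuttingDown() const { return Shutdown != nullptr; }

  // Returns true if this call started the shutdown, false if a shutdown was
  // already in progress (the callback is queued in both cases).
  bool beginShutdown(std::function<void()> OnComplete) {
    bool Started = false;
    if (!Shutdown) {
      Shutdown = std::make_unique<ShutdownInfo>();
      Started = true;
    }
    Shutdown->OnComplete.push_back(std::move(OnComplete));
    return Started;
  }

  // Runs and clears the queued callbacks. The ShutdownInfo stays allocated,
  // so the session keeps reporting the "shutting down" state afterwards.
  void finishShutdown() {
    if (!Shutdown)
      return;
    std::vector<std::function<void()>> Callbacks =
        std::move(Shutdown->OnComplete);
    Shutdown->OnComplete.clear();
    for (auto &CB : Callbacks)
      CB();
  }

private:
  // All shutdown-only state lives here (hypothetical contents).
  struct ShutdownInfo {
    std::vector<std::function<void()>> OnComplete;
  };
  std::unique_ptr<ShutdownInfo> Shutdown;
};
```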
[MLIR][XeGPU] Allow create mem desc from 2d memref (#167767)
This PR relaxes create_mem_desc's restriction on the source memref,
allowing it to be a 2D memref.
[libclc] Use CLC atomic functions for legacy OpenCL atom/atomic builtins (#168325)
Main changes:
* OpenCL legacy atom/atomic builtins now call CLC atomic functions
(which use Clang __scoped_atomic_*), replacing previous Clang __sync_*
functions.
* Change memory order from seq_cst to relaxed; keep device scope (spec
permits broader than workgroup). LLVM IR for _Z8atom_decPU3AS1Vi in
amdgcn--amdhsa.bc:
Before:
%2 = atomicrmw volatile sub ptr addrspace(1) %0, i32 1
syncscope("agent") seq_cst
After:
%2 = atomicrmw volatile sub ptr addrspace(1) %0, i32 1
syncscope("agent") monotonic
* Also adds OpenCL 1.0 atom_* variants without volatile on the pointer.
They are added for backward compatibility.
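For readers less familiar with the ordering change, here is a rough host-side
analogy in standard C++. LLVM's `monotonic` ordering corresponds to C++
`memory_order_relaxed`. Standard C++ has no notion of sync scope, so the
device scope is not modeled, and both helper names are hypothetical:

```cpp
#include <atomic>

// Analogue of the old behavior: atomic decrement with sequentially
// consistent ordering. Returns the previous value, as atom_dec does.
int atomDecSeqCst(std::atomic<int> &Counter) {
  return Counter.fetch_sub(1, std::memory_order_seq_cst);
}

// Analogue of the new behavior: the decrement is still atomic, but it no
// longer orders surrounding memory accesses.
int atomDecRelaxed(std::atomic<int> &Counter) {
  return Counter.fetch_sub(1, std::memory_order_relaxed);
}
```

Both helpers return the same values for the same sequence of calls on one
counter. They differ only in the ordering guarantees they give for other
memory accesses.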
[LV]: Skip Epilogue scalable VF greater than RemainingIterations. (#156724)
Skip scalable epilogue VFs that are greater than RemainingIterations,
as is already done for fixed VFs.
Exclude scalable RemainingIterations from that comparison, because
SCEV currently cannot evaluate non-canonical vscale-based expressions.
Reapply "[Github] Update PR labeller to v6.0.1 (#167246)"
This reverts commit d772663a9f003a08ee76414397963c58e80b27d7.
This fixes the final issue with the labeller landing. There were
two remaining issues:
1. There was an extra quote on one of the globs
2. Some of the yaml keys were named incorrectly (should have been
plural)
[PowerPC] Add custom lowering for SADD overflow for i32 and i64 (#159255)
This patch improves the codegen for saddo on i32 and i64 in both 32-bit
and 64-bit modes by custom lowering. It implements signed-add overflow
detection using the `(x eqv y) & (sum xor x)` bit-level sequence.
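The bit-level sequence can be illustrated in portable C++. This is a sketch of
the arithmetic identity, not of the PowerPC lowering itself: `saddo32` and
`SAddResult32` are hypothetical names, `eqv` is written as `~(x ^ y)`, and
reading the overflow flag from the sign bit of the mask is an assumption about
how the result is extracted.

```cpp
#include <cstdint>

struct SAddResult32 {
  int32_t Sum;   // wrapped two's-complement sum
  bool Overflow; // true if the signed addition overflowed
};

SAddResult32 saddo32(int32_t X, int32_t Y) {
  // Do the arithmetic on unsigned values so that wraparound is well defined.
  uint32_t UX = static_cast<uint32_t>(X);
  uint32_t UY = static_cast<uint32_t>(Y);
  uint32_t USum = UX + UY;

  // (x eqv y): the sign bit is set when X and Y have the same sign.
  uint32_t Eqv = ~(UX ^ UY);
  // (sum xor x): the sign bit is set when the sum's sign differs from X's.
  uint32_t Mask = Eqv & (USum ^ UX);

  // Overflow happens exactly when both operands share a sign and the sum
  // does not, which is the sign bit of Mask.
  return {static_cast<int32_t>(USum), (Mask >> 31) != 0};
}
```

For example, `saddo32(INT32_MAX, 1)` reports overflow with a wrapped sum of
`INT32_MIN`, while `saddo32(INT32_MAX, INT32_MIN)` does not overflow because
the operands have different signs.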
[TableGen] Silence a warning (NFC)
/llvm-project/llvm/utils/TableGen/Common/CodeGenTarget.cpp:286:12:
error: variable 'SkippedInsts' set but not used [-Werror,-Wunused-but-set-variable]
unsigned SkippedInsts = 0;
^
1 error generated.
[TTI] Use MemIntrinsicCostAttributes for getMaskedMemoryOpCost (#168029)
- Split from #165532. This is a step toward a unified interface for
masked/gather-scatter/strided/expand-compress cost modeling.
- Replace the ad-hoc parameter list with a single attributes object.
API change:
```
- InstructionCost getMaskedMemoryOpCost(Opcode, Src, Alignment,
- AddressSpace, CostKind);
+ InstructionCost getMaskedMemoryOpCost(MemIntrinsicCostAttributes,
+ CostKind);
```
Notes:
- NFCI intended: callers populate MemIntrinsicCostAttributes with the
same information as before.
- Follow-up: migrate gather/scatter, strided, and expand/compress cost
queries to the same attributes-based entry point.
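The refactoring pattern described above (replacing a parameter list with one
attributes object while keeping behavior unchanged) can be sketched in
isolation. Everything here is hypothetical: `SketchMemCostAttrs`, its fields,
and the toy cost formula are stand-ins and do not reflect the actual
MemIntrinsicCostAttributes class or any target's cost model.

```cpp
#include <cstdint>

enum class SketchCostKind { RecipThroughput, CodeSize };

// Hypothetical attributes object bundling what used to be separate arguments.
struct SketchMemCostAttrs {
  unsigned Opcode;
  unsigned NumElements;
  unsigned AlignmentBytes;
  unsigned AddressSpace;
};

// New-style entry point: one attributes object plus the cost kind.
int64_t maskedCostNew(const SketchMemCostAttrs &A, SketchCostKind Kind) {
  // Toy formula for illustration only.
  int64_t PerLane = (Kind == SketchCostKind::CodeSize) ? 1 : 2;
  int64_t Penalty = (A.AlignmentBytes < 16) ? 1 : 0;
  return static_cast<int64_t>(A.NumElements) * PerLane + Penalty;
}

// Old-style entry point kept as a thin wrapper: it populates the attributes
// object with the same information as before, so results are unchanged.
int64_t maskedCostOld(unsigned Opcode, unsigned NumElements,
                      unsigned AlignmentBytes, unsigned AddressSpace,
                      SketchCostKind Kind) {
  return maskedCostNew({Opcode, NumElements, AlignmentBytes, AddressSpace},
                       Kind);
}
```

One benefit of this shape is that a later field can be added to the struct
without touching every caller's argument list.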