[AMDGPU] Do not emit function prologue on naked functions (#191398)
Summary:
Naked functions are intended to allow the user to write the entirety of
the function block, so we shouldn't include the `waitcnt` instructions
for them.
[SystemZTTI][CostModel] Improve SystemZ cost model for scalar Read-Modify-Write Sequence, Fix #189183 (#190350)
This patch improves the SystemZ cost model to identify Read-Modify-Write
sequences that can be folded into a single instruction (e.g., ASI, NI, OI).
If a load, a scalar arithmetic operation (ADD, SUB, AND, OR, XOR) with an
immediate, and a store all target the same memory location and have no
external uses, the cost of the arithmetic and store instructions should be 0.
This implementation does not cover the TTI::TCK_RecipThroughput CostKind,
as doing so causes a regression in non-power-2-subvector-extract.ll.
Fixes #189183 (see the issue for an example).
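The folding condition described above can be sketched as a small standalone
model. This is only an illustration, not the code in the patch: the
`RMWCandidate` struct, its fields, and the helper names are hypothetical
stand-ins for the IR queries the real cost model performs, and the base costs
are passed in as plain integers.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for IR concepts; these are not LLVM types.
enum class Opcode { Add, Sub, And, Or, Xor, Mul };

struct RMWCandidate {
  int LoadAddr;            // id of the memory location read by the load
  int StoreAddr;           // id of the memory location written by the store
  Opcode Op;               // scalar arithmetic op between load and store
  bool RHSIsImmediate;     // second operand of the op is a constant
  bool LoadHasOtherUses;   // loaded value is used outside the sequence
  bool ResultHasOtherUses; // arithmetic result is used besides the store
};

// Only the opcodes listed in the commit message are considered foldable.
bool isFoldableOpcode(Opcode Op) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Sub:
  case Opcode::And:
  case Opcode::Or:
  case Opcode::Xor:
    return true;
  default:
    return false;
  }
}

// True when load, op-with-immediate, and store hit the same location and
// none of the intermediate values escape the sequence.
bool isFoldableRMW(const RMWCandidate &C) {
  return C.LoadAddr == C.StoreAddr && isFoldableOpcode(C.Op) &&
         C.RHSIsImmediate && !C.LoadHasOtherUses && !C.ResultHasOtherUses;
}

// Combined cost of the arithmetic and store: 0 when the sequence folds,
// otherwise the sum of the (assumed) base costs.
int arithAndStoreCost(const RMWCandidate &C, int BaseArithCost,
                      int BaseStoreCost) {
  return isFoldableRMW(C) ? 0 : BaseArithCost + BaseStoreCost;
}
```

In this sketch the load keeps its own cost; only the arithmetic and the store
are treated as free when the whole sequence folds.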
---------
Co-authored-by: anoopkg6 <anoopkg6 at github.com>
[OpenMP][MLIR] Modify OpenMP Dialect lowering to support attach mapping
This PR adjusts the LLVM-IR lowering to support the new attach map type that the runtime
uses to link data and pointer together. In most cases this replaces the mapping based on
the older OMP_MAP_PTR_AND_OBJ map type and allows slightly more complicated ref_ptr/ptee
and attach semantics.
AMDGPU: Match fract from compare and select and minimum (#189082)
Implementing this with any of the minnum variants is overconstraining
for the actual use. Existing patterns use fmin, then have to manually
clamp NaN inputs to get NaN-propagating behavior. It's cleaner to
express this with a NaN-propagating operation to start with.
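As a rough scalar illustration of the difference, the sketch below contrasts
a compare-and-select form, which propagates NaN on its own, with an
fmin-based form that needs an explicit NaN fixup. It is a plain C++ stand-in,
not the IR patterns the patch matches: the function names and `kOneBelow` are
hypothetical, and infinite inputs are deliberately not handled.

```cpp
#include <cassert>
#include <cmath>

// Largest float strictly below 1.0; fract results are capped at this value.
const float kOneBelow = std::nextafter(1.0f, 0.0f);

// Compare-and-select form: when f is NaN the comparison is false, so the
// NaN held in f is returned without any extra handling.
float fractSelect(float x) {
  float f = x - std::floor(x);
  return f >= kOneBelow ? kOneBelow : f;
}

// fmin-based form: std::fmin returns the non-NaN operand, so a NaN input
// would produce kOneBelow unless it is patched up afterwards.
float fractFminClamped(float x) {
  float m = std::fmin(x - std::floor(x), kOneBelow);
  return std::isnan(x) ? x : m;
}
```

Both functions agree on finite and NaN inputs, but the second needs the
trailing `isnan` select that the first gets for free.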
[clang-tidy] Fix readability-identifier-naming FP with DefaultCase on function templates (#189788)
Closes #189755
---------
Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
Co-authored-by: Daniil Dudkin <unterumarmung at yandex.ru>
[Clang] Pass toolchain paths unconditionally in linker wrapper (#191311)
Summary:
Previously we used the auto-forwarding mechanism to handle options like
forwarding --cuda-path. The problem is that this went over the toolchain
options and that meant if someone used just bare `--offload-link` there
would be no CUDA or ROCm toolchain to figure out if we should forward
it. Just do this unconditionally for all toolchains; there's no harm in
setting it if it's unused.
Fixes: https://github.com/llvm/llvm-project/issues/190979
[acc] Support for Optional arguments in firstprivate recipes (#190079)
Add support for explicit or implicit firstprivates that are Fortran
Optional arguments.