[ORC] Simplify DylibManager::lookupSymbols, remove LookupRequest. (#197626)
DylibManager::lookupSymbols used to take an array of LookupRequests,
where each request specified a handle and a list of symbols to look up
within that handle.
This commit replaces the array of lookup requests with a single handle
and list of symbols passed directly to lookupSymbols.
In practice all clients were passing a singleton array anyway, and
simplifying this signature significantly simplifies implementations.
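The shape of the change can be sketched as follows. This is an illustrative
sketch only: `Handle`, `Address`, `SymbolList`, `resolveOne`,
`lookupSymbolsOld`, and `lookupSymbolsNew` are hypothetical stand-ins, not
the real ORC declarations, and the toy resolver exists only to make the
example runnable.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real ORC types.
using Handle = std::uint64_t;
using Address = std::uint64_t;
using SymbolList = std::vector<std::string>;

// Toy resolver so the sketch is runnable: fabricates an address per symbol.
Address resolveOne(Handle H, const std::string &Name) {
  return H * 1000 + Name.size();
}

// Old shape: an array of requests, each pairing a handle with its symbols.
struct LookupRequest {
  Handle H;
  SymbolList Symbols;
};

std::vector<std::vector<Address>>
lookupSymbolsOld(const std::vector<LookupRequest> &Requests) {
  std::vector<std::vector<Address>> Results;
  for (const LookupRequest &R : Requests) {
    std::vector<Address> Addrs;
    for (const std::string &S : R.Symbols)
      Addrs.push_back(resolveOne(R.H, S));
    Results.push_back(std::move(Addrs));
  }
  return Results;
}

// New shape: one handle and one symbol list, passed directly.
std::vector<Address> lookupSymbolsNew(Handle H, const SymbolList &Symbols) {
  std::vector<Address> Addrs;
  for (const std::string &S : Symbols)
    Addrs.push_back(resolveOne(H, S));
  return Addrs;
}
```

With a singleton request array, the old shape yields the same addresses as
the new one, just wrapped in an extra outer vector.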
[LLVM-Flang] Improve the realloc size for the write statement (#187662)
Information:
The "buffer_" is used to store data in memory, and then the entire
"buffer_" is written at once using the Write(..) function. For a write
statement inside a nested implied DO loop, buffer_ keeps growing
whenever the output length exceeds the current buffer size. Each growth
step reallocates memory and copies the entire buffer into the new
allocation, which is slow.
Implementation:
By reducing the number of reallocations, the performance could be
improved. Initially, we increase the size linearly, i.e., +65536 for
every new allocation. Then, when we cross 1 MB of buffer size, we
increase the size geometrically, i.e., 2x. Later, for more than 64MB, we
do 1.5x.
Issue: https://github.com/llvm/llvm-project/issues/163945
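The growth policy described above can be sketched as a small function. This
is a minimal sketch, not the runtime's actual code: the names `NextCapacity`,
`kLinearStep`, `kGeometricStart`, and `kSlowGrowthStart` are hypothetical, and
it assumes the linear step applies below 1 MB, 2x applies from 1 MB up to and
including 64 MB, and 1.5x applies above that.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants; the thresholds mirror the numbers quoted above.
constexpr std::size_t kLinearStep = 65536;                       // +65536 per step
constexpr std::size_t kGeometricStart = std::size_t{1} << 20;    // 1 MB
constexpr std::size_t kSlowGrowthStart = std::size_t{64} << 20;  // 64 MB

// Returns the first capacity >= needed that is reachable from `current`
// by repeatedly applying the three-phase growth rule.
constexpr std::size_t NextCapacity(std::size_t current, std::size_t needed) {
  std::size_t size = current;
  while (size < needed) {
    if (size < kGeometricStart) {
      size += kLinearStep;   // small buffers: linear growth
    } else if (size <= kSlowGrowthStart) {
      size *= 2;             // medium buffers: doubling
    } else {
      size += size / 2;      // large buffers: 1.5x
    }
  }
  return size;
}
```

Doubling in the middle range keeps the number of copies logarithmic in the
final size, while the 1.5x factor limits over-allocation for very large
buffers.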
[Flang][OpenMP] Support iterator modifiers in map and motion clauses
Support iterated array elements and array sections in map and motion clauses for
target data, target enter data, target exit data, and target update constructs.
Preserve mapper resolution for iterated entries, including explicit mappers,
user-defined default mappers, declare mapper entries, and implicit default
mappers.
This PR is stacked on top of #197047 and #197752.
This patch is part of the feature work for #188061.
Assisted by Copilot.
[mlir][complex] Emit fma for contracted complex.mul lowering (#196248)
When complex.mul has fastmath<contract>, lower it using explicit fused
multiply-add operations for the real and imaginary components.
The lowering changes from:
real = ar * br - ai * bi
imag = ai * br + ar * bi
expressed as mul/sub/add, to:
real = fma(ar, br, -(ai * bi))
imag = fma(ar, bi, ai * br)
This is only applied when contraction is allowed. Non-contracted
complex.mul continues to lower to separate fmul/fsub/fadd operations.
Fixed: https://github.com/llvm/llvm-project/issues/196246
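For reference, the two forms can be written out in scalar C++ with
`std::fma`. This is a sketch of the arithmetic only, not the MLIR lowering
itself; the function names `MulSeparate` and `MulContracted` are
hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Separate mul/sub/add form: every product is rounded before it is combined.
std::pair<double, double> MulSeparate(double ar, double ai,
                                      double br, double bi) {
  return {ar * br - ai * bi, ai * br + ar * bi};
}

// Contracted form: one product per component is fused into the add,
// so that product is not rounded separately.
std::pair<double, double> MulContracted(double ar, double ai,
                                        double br, double bi) {
  double real = std::fma(ar, br, -(ai * bi));
  double imag = std::fma(ar, bi, ai * br);
  return {real, imag};
}
```

For inputs where every intermediate is exactly representable, such as small
integers, both forms agree; in general they can differ in the last bit
because the fused product skips one rounding step.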
[clangd][Lex][NFC] Use valid non-ASCII identifiers in tests (#197826)
Several tests used "ab🙂cd" / "🙂cd" as multi-byte UTF-8 example
identifiers. The smiley, however, is not actually among the allowed
identifier characters, and Clang only accepts it as an extension (with a
warning).
Switch to identifiers that are valid per [lex.name]:
- "naïve", with a non-ASCII char in the middle of the identifier,
- "æon", with a non-ASCII char at the start of the identifier,
- "café", with a non-ASCII char at the end of the identifier.
The 2-byte characters are handled the same way as the original 4-byte
emoji; no functional change here.
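To illustrate the byte-length difference, here is a small sketch that
classifies UTF-8 lead bytes and counts code points. The helper names
`Utf8SeqLen` and `CountCodePoints` are hypothetical and are not part of
Clang or clangd.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Length in bytes of the UTF-8 sequence introduced by lead byte `b`,
// or 0 if `b` is a continuation byte or otherwise not a valid lead byte.
constexpr std::size_t Utf8SeqLen(unsigned char b) {
  if (b < 0x80) return 1;            // ASCII
  if ((b & 0xE0) == 0xC0) return 2;  // e.g. U+00EF, U+00E6, U+00E9
  if ((b & 0xF0) == 0xE0) return 3;
  if ((b & 0xF8) == 0xF0) return 4;  // e.g. U+1F642
  return 0;
}

// Counts code points by counting every byte that is not a continuation byte.
constexpr std::size_t CountCodePoints(std::string_view s) {
  std::size_t n = 0;
  for (unsigned char c : s)
    if ((c & 0xC0) != 0x80) ++n;
  return n;
}
```

Under this sketch, "naïve" is 6 bytes but 5 code points, while the emoji in
"ab🙂cd" alone takes 4 bytes; byte offsets and code-point offsets diverge in
both cases, which is the property the tests rely on.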
[flang][OpenMP][NFC] Share declare mapper helpers for iterator modifier lowering
Move mapper lookup and implicit default mapper creation into reusable
OpenMP lowering helpers so regular map lowering and iterator-generated
map entries can use the same resolution path.
This prepares Flang iterator modifier lowering for map and motion clauses
without changing the generated IR for existing non-iterator maps.
[AMDGPU] Fix waterfall inreg call args -- AGPR sources legalized for `V_READFIRSTLANE_B32` instr (#194890)
#146997 introduced a waterfall loop to handle illegal copies from
non-uniform sources. That logic has the following issue:
In cases where the register class is `AV`, the waterfall pumped that
operand straight into `v_readfirstlane_b32` and `v_cmp_eq_u32`, but
those instructions cannot read an AGPR. When the allocator picked an
AGPR, the verifier rejected the result.
Changes made: Copy the AGPR value to a VGPR register class before the
waterfall reads it.
Fixes LCOMPILER-2045.
[clang-repl] fix vtable symbol duplication error (closes #141039) (#185648)
In incremental mode, emitting vtables with ExternalLinkage causes
duplicate symbol errors. A single targeted change in getVTableLinkage()
fixes the issue by returning LinkOnceODRLinkage when
IncrementalExtensions is active. The JIT linker then keeps the first
definition and silently discards subsequent ones.
closes issue #141039
Co-authored-by: Emery Conrad <emery.conrad at chicagotrading.com>
[PowerPC] Enable custom lowering for bswap64 builtin on Power8 64 bits with improved parallelism (#187259)
The current implementation of `__builtin_bswap64` exposes little
instruction-level parallelism. This patch splits the 64-bit swap into
two 32-bit swaps, since the high and low 32-bit halves are independent
and can execute simultaneously.
Compared to the sequential approach, there are fewer instructions. These
changes should not alter the current assembly and there is default
fall-through for power9+.
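The decomposition can be expressed in portable C++ as follows. This is a
sketch of the idea only, not the PowerPC lowering; `Bswap32` and
`Bswap64ViaHalves` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Byte-swap a 32-bit value with shifts and masks.
constexpr std::uint32_t Bswap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Byte-swap a 64-bit value as two independent 32-bit swaps.
// The two halves have no data dependency on each other, so they can
// be computed in parallel; the swapped halves then trade places.
constexpr std::uint64_t Bswap64ViaHalves(std::uint64_t v) {
  std::uint32_t lo = static_cast<std::uint32_t>(v);
  std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
  return (static_cast<std::uint64_t>(Bswap32(lo)) << 32) | Bswap32(hi);
}

static_assert(Bswap64ViaHalves(0x0102030405060708ULL) ==
              0x0807060504030201ULL);
```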
---------
Co-authored-by: himadhith <himadhith.v at ibm.com>
[clang][NFC] Unify `MacroState` `isAmbiguous` and `getModuleInfo`
Every call to `MacroState::getModuleInfo` is paired with a call to `MacroState::isAmbiguous` in the same function. Rather than doing the same work twice, unify them into a single function, `getModuleInfo`, that returns both pieces of information in a new type, `ModuleMacroInfo`.
Unfortunately, `getModuleInfo` and `ModuleMacroInfo` already exist, so rename the existing ones to `getFullModuleInfo` and `FullModuleMacroInfo`, respectively, since the new type is a subset of the old type. The new type contains just the pieces consumers care about.
While we're there, use the range constructor of `llvm::DenseSet` instead of default constructing and calling `insert` in a loop.
[libc] Add all the toolchains needed for libc-shared-tests to the docker container. (#197735)
Toolchains include:
- gcc-7, 8, 9, 11
- qemu-static-user
- cross-build toolchain for aarch64, riscv64, ppc64le, including gcc,
g++, gmp, mpfr, mpc.
Container size before the change: ~ 500 MB
Container size after the change: ~ 1.3 - 1.4 GB
[flang] Diagnose BIND(C) procedures in submodules without ancestor interfaces (#194571)
This diagnoses `BIND(C)` procedures defined in submodules when their
interface is not declared in the ancestor module.
The check is added in `CheckBindC()` and covers plain `BIND(C)`,
explicit `NAME=`, empty/all-blank `NAME=`, valid ancestor-module
interfaces, and nested submodule cases.
Fixes #194570.
Co-authored-by: Sairudra More <moresair at pe31.hpc.amslabs.hpecorp.net>
[AtomicExpand] Add bitcasts when expanding store atomic vector
AtomicExpand fails for aligned `store atomic <n x T>` because it
does not find a compatible library call. This change adds appropriate
ptrtoint + bitcast so that the call can be lowered, mirroring the
load-side handling from #148900.
Require explicit yield in iterator op
Remove the implicit terminator trait from omp.iterator so iterator
modifiers must explicitly yield the value used to form the iterated list.
Add and update the verifier and tests accordingly.
Reject target map iterators without captures
Reject target map iterators until the follow-up capture-binding
representation is added since currently map_iterated on omp.target
only represents the dynamic map list and does not consider the
target-region arguments required by IsolatedFromAbove.
Simplify map iterator clause assembly
- Split MLIR map syntax into separate map_entries(...) and map_iterated(...),
removing the custom MapEntryList parser/printer.
- Moved omp.target map_iterated out of TargetOpRegion; it now prints
before the target region instead of as map_iterated_entries(...) after
the region.
- Renamed LLVMIR TODO helper to clause-style checkMap.
- Added DeclareMapperInfoOp builder from DeclareMapperInfoOperands
and updated Flang call sites so they do not need to spell out newly
added operands.