clang: Simplify OpenMP triple adjustment (#189265)
Previously this would find a list of offloading triples,
then later fill in the unknown components specifically for
OpenMP after the fact. Start normalizing the triples upfront,
before inserting into the set. Also stop special casing OpenMP
since there's no apparent reason to treat it differently from
other offload languages.
Also operate on the Triple rather than the string, and handle
the unset OS and environment separately.
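As a rough illustration of "normalize first, then insert into the set", here is a minimal standalone sketch. It does not use llvm::Triple; `ToyTriple`, `parseToyTriple`, `normalizeToyTriple` and `collectOffloadTriples` are hypothetical names, and the rule that an unset vendor/OS becomes "unknown" while an unset environment is simply omitted is an assumption made for the example, not the driver's actual behavior.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a target triple: arch-vendor-os-environment.
// This is NOT llvm::Triple; it only illustrates the ordering of the steps.
struct ToyTriple {
  std::array<std::string, 4> parts; // arch, vendor, os, environment
};

// Split "a-b-c-d" into up to four components; missing ones stay empty.
ToyTriple parseToyTriple(const std::string &text) {
  assert(!text.empty() && "expected a non-empty triple string");
  ToyTriple t;
  std::istringstream in(text);
  std::string piece;
  for (std::size_t i = 0;
       i < t.parts.size() && std::getline(in, piece, '-'); ++i)
    t.parts[i] = piece;
  return t;
}

// Assumed rule for this sketch: unset vendor and OS become "unknown";
// an unset environment is handled separately by leaving it out.
std::string normalizeToyTriple(const std::string &text) {
  ToyTriple t = parseToyTriple(text);
  if (t.parts[1].empty())
    t.parts[1] = "unknown";
  if (t.parts[2].empty())
    t.parts[2] = "unknown";
  std::string out = t.parts[0] + "-" + t.parts[1] + "-" + t.parts[2];
  if (!t.parts[3].empty())
    out += "-" + t.parts[3];
  return out;
}

// Normalize before inserting, so equivalent spellings collapse to one entry.
std::set<std::string>
collectOffloadTriples(const std::vector<std::string> &requested) {
  std::set<std::string> unique;
  for (const std::string &r : requested)
    unique.insert(normalizeToyTriple(r));
  return unique;
}
```

With this ordering, two spellings of the same toy triple (for example a bare architecture and its fully spelled-out form) end up as a single set entry, and no later fix-up pass is needed.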
Update the patch according to feedback:
- Remove the changes from IRBuilder, CallBase::getMemoryEffects() must be
enough.
- Update documentation, remove extra terms, fix wording.
- Add tests that check correct ordering.
- Update macro names in FloatingPointOps.def.
- If an instruction is isolated, it is considered strictfp.
[AArch64] More accurately model cost of partial reductions (#181707)
With #181706 using the cost-model to decide whether using partial
reductions is profitable, we need to more accurately represent the cost
of certain partial reduction operations:
* Reflect the fact that *MLALB/T instructions can be used for 16-bit ->
32-bit partial reductions (or *MLAL/MLAL2 for NEON).
* Calculate the cost of expanding the partial reduction in ISel for
reductions that don't have an explicit instruction, rather than
returning a random number. For sub-reductions we scale the cost to make
them slightly cheaper, so that they're still candidates for forming cdot
operations.
[flang] get rid of descriptor in scalar type is (#188762)
Select type lowering kept scalar selectors as descriptors inside
TYPE IS for derived types, leading to a declare using a fir.box.
This is not the canonical representation for such variables, which can
be tracked with a simple pointer. The code that remaps variables
appearing in data operations during lowering did not expect a
fir.declare to be emitted with a fir.box for such an entity (an assert
was hit in the added OpenACC test).
Align the lowering of derived type scalar selector with the handling of
intrinsic selector. While doing this, simplify the logic by using and
adding fir::BaseBoxAddr helpers to ensure that attributes such as
VOLATILE are correctly propagated (they matter more than keeping the
fir.ptr/fir.heap type that is not relevant for the selector that does
not have the POINTER/ALLOCATABLE attributes).
[WebAssembly] Lower extend v16i8 to v16i32 (#188936)
Split the input vector with an extend_low and high and then split the
results again with extend_low and high for a total of 6 instructions.
This removes 3 shuffles and a couple of extends.
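The two-level split can be modelled in plain C++ as below. This is only a scalar simulation of the lane arithmetic, not the backend lowering; `extendHalf` and `extendV16I8ToV16I32` are hypothetical helper names, and sign extension is assumed (the zero-extending case would have the same shape with unsigned lane types).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models one extend_low / extend_high step: widen either the low or the
// high half of the input lanes into a vector with half as many lanes.
template <typename Wide, typename Narrow, std::size_t N>
std::array<Wide, N / 2> extendHalf(const std::array<Narrow, N> &v, bool high) {
  std::array<Wide, N / 2> out{};
  const std::size_t base = high ? N / 2 : 0;
  for (std::size_t i = 0; i < N / 2; ++i)
    out[i] = static_cast<Wide>(v[base + i]); // sign-extends for signed lanes
  return out;
}

// v16i8 -> v16i32, represented as four v4i32 pieces.
// Step 1: 2 extends (i8x16 -> two i16x8).
// Step 2: 4 extends (each i16x8 -> two i32x4). Total: 6 extends.
std::array<std::array<std::int32_t, 4>, 4>
extendV16I8ToV16I32(const std::array<std::int8_t, 16> &in) {
  const std::array<std::int16_t, 8> lo16 = extendHalf<std::int16_t>(in, false);
  const std::array<std::int16_t, 8> hi16 = extendHalf<std::int16_t>(in, true);

  std::array<std::array<std::int32_t, 4>, 4> out{};
  out[0] = extendHalf<std::int32_t>(lo16, false); // input lanes 0..3
  out[1] = extendHalf<std::int32_t>(lo16, true);  // input lanes 4..7
  out[2] = extendHalf<std::int32_t>(hi16, false); // input lanes 8..11
  out[3] = extendHalf<std::int32_t>(hi16, true);  // input lanes 12..15
  assert(out.size() == 4 && "expected four v4i32 pieces");
  return out;
}
```

Each call to `extendHalf` stands for one extend instruction, which is where the count of six comes from.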
clang: Fix warnings with multiple offload arch args
Fix regression after ab885fdf5f67726ef564c34087e813f2ca861f5c.
Apparently driver tests do not enforce there are no warnings.
Oddly, I need to use -Werror for the specific error. If I use
just -Werror, I get an error that the -Werror is unused.
[lldb] Remove data_offset arg from GetModuleSpecifications (#188978)
- it is always passed as zero
- a lot of plugins aren't using it correctly
- the data extractor class already has the capability to look at a
subset of bytes
[lldb][FreeBSDKernel] Add missing error checks in DynamicLoader (#189250)
Add extra guards in case a function call fails. For example, the
result of `ReadMemory()` cannot be trusted when `error.Fail()` is true,
and this change ensures the code executes properly according to the
value of the error.
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
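The guard pattern described above looks roughly like the following standalone sketch. `ToyStatus`, `ToyProcess` and `readU32` are hypothetical stand-ins written for this example, not LLDB classes; only the idea of checking the error before using the buffer is taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an error object exposing a Fail() query.
struct ToyStatus {
  bool failed = false;
  bool Fail() const { return failed; }
};

// Hypothetical process whose "memory" is a byte vector starting at address 0.
struct ToyProcess {
  std::vector<std::uint8_t> memory;

  // Copies `size` bytes at `addr` into `buf`; sets `error` on out-of-range reads.
  std::size_t ReadMemory(std::uint64_t addr, void *buf, std::size_t size,
                         ToyStatus &error) const {
    if (addr > memory.size() || size > memory.size() - addr) {
      error.failed = true;
      return 0;
    }
    std::memcpy(buf, memory.data() + addr, size);
    error.failed = false;
    return size;
  }
};

// Reads a 32-bit value, refusing to report success unless the read did.
// `out` is only written when the function returns true.
bool readU32(const ToyProcess &process, std::uint64_t addr,
             std::uint32_t &out) {
  std::uint32_t value = 0;
  ToyStatus error;
  const std::size_t bytesRead =
      process.ReadMemory(addr, &value, sizeof(value), error);
  if (error.Fail() || bytesRead != sizeof(value))
    return false; // the buffer contents cannot be trusted here
  out = value;
  return true;
}
```

Callers then branch on the return value instead of consuming a possibly uninitialized buffer.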
[Flang][OpenMP] Support iterator modifier in depend clause
Lower the iterator modifier on the depend clause to omp.iterator.
Iterated depend objects emit `!omp.iterated<!llvm.ptr>`, using
`getDataOperandBaseAddr` to generate the base address and
`genIteratorCoordinate` to get the addr+offset. Non-iterated objects
in the depend clause still use the existing lowering path.
This patch is part of feature work for #188061.
Assisted with copilot.
Revert "[CoroSplit] Erase trivially dead allocas after spilling" (#189311)
I think I forgot to update the FrameData after erasing. I'll check it
locally.
Reverts llvm/llvm-project#189295
[ORC] LinkGraphLinkingLayer::registerDependencies improvements. (#189298)
This commit moves the bulk of
LinkGraphLinkingLayer::registerDependencies into a new static method,
LinkGraphLinkingLayer::calculateDepGroups, where the behavior can be
unit tested.
The new method returns a list of LinkGraphLinkingLayer::SymbolDepGroups:
```
struct SymbolDepGroup {
  SmallVector<jitlink::Symbol*> Defs;
  DenseSet<jitlink::Symbol*> Deps;
};
```
The existing registerDependencies method converts these into
orc::SymbolDependenceGroups for registration with the ExecutionSession.
[14 lines not shown]
[CoroSplit] Erase trivially dead allocas after spilling (#189295)
Erase these allocas so that they do not take up extra stack space at
-O0.
Close #57638
[NFC][test] Precommit test for pr188989 (#188667)
Precommit test for #188989.
This test case covers a scenario in the vector combine
foldShuffleToIdentity function where incorrect folding occurs when
different shuffle sequences share the same initial Use *. This issue
may be due to cost model differences and currently reproduces only on
LoongArch for this test case.