[OpenMP][Flang][MLIR] Skip trip count calculation when bounds are null (#176469)
Fixes a segfault when trip count values are null by skipping trip count
calculation when we cannot determine if it is safe to hoist out the
values.
Of note, I originally tried to modify `extractOnlyOmpNestedDir` to return
the first OpenMPConstruct directive, skipping over any earlier
directives (i.e. stores), which did work for the generic test case below:
```fortran
program minimal_repro
implicit none
integer :: i, m
integer :: res(10) = 0
!$omp target teams map(from:m,res) private(m)
m = 5
[59 lines not shown]
AMDGPU/GlobalISel: Regbanklegalize rules for G_UNMERGE_VALUES
Move G_UNMERGE_VALUES handling to AMDGPURegBankLegalizeRules.cpp.
Fix sgpr S16 unmerge by lowering using shift and using S32.
Previously sgpr S16 unmerge was selected using _lo16 and _hi16 subreg
indexes which are exclusive to vgpr register classes.
For the remaining cases we do a trivial mapping that assigns the same
reg bank, vgpr or sgpr, to all operands.
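The shift-based lowering described above can be illustrated outside the compiler with plain integer arithmetic. This is only a sketch of the idea, not the code from the patch: the function name `unmergeS32ToS16Halves` is hypothetical, and masking the low half is an assumption made here so that both results are clean 16-bit values held in 32-bit storage.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: split a 32-bit scalar into two 16-bit halves using
// only 32-bit operations, so no 16-bit subregister access is needed.
std::pair<uint32_t, uint32_t> unmergeS32ToS16Halves(uint32_t Src) {
  uint32_t Lo = Src & 0xFFFFu; // low half: mask off the upper 16 bits
  uint32_t Hi = Src >> 16;     // high half: logical shift right by 16
  return {Lo, Hi};
}
```

For example, `unmergeS32ToS16Halves(0x12345678u)` yields the pair `{0x5678, 0x1234}`.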
[CoroFrame][NFC] Reduce insertSpills size through a helper function (#177129)
This function can be pretty difficult to follow due to its size and how
much work it does. This commit moves a lambda capturing a lot of state
into a self-contained function.
It will allow subsequent patches to simplify code and delete variables.
[LifetimeSafety] Use source ranges instead of end locations in diagnostics (#177020)
### TL;DR
Update diagnostic location information to use full source ranges instead of just end locations for lifetime safety warnings.
[runtimes] Share doxygen handling with LLVM (#176948)
Hoist handling of Doxygen into the top-level cmake/ directory so it can
be shared between LLVM and RUNTIMES and a default/standalone runtimes
build can support building Doxygen documentation as well.
The openmp subproject currently supports doxygen documentation using an
`LLVM_ENABLE_PROJECTS=openmp` build, but not with
`LLVM_ENABLE_RUNTIMES=openmp` because of this missing boilerplate code
in the runtimes build. This is a step towards removing the
`LLVM_ENABLE_PROJECTS=openmp` build mode which was deprecated (#124014)
and already scheduled to be removed in LLVM 21 (#136314). Eventual
removal is planned with #176950.
Hoisting CMake code for shared use with runtimes has been done before in
e.g. #84641, 7017e6c9cfd2de3122ce9528f338a97d61e96373,
44e3365775101fec3fd355eda339282258d74415,
7017e6c9cfd2de3122ce9528f338a97d61e96373
[2 lines not shown]
[AArch64][PAC] Rework the expansion of AUT/AUTPAC pseudos
Refactor `AArch64AsmPrinter::emitPtrauthAuthResign` to improve
readability and fix the conditions when `emitPtrauthDiscriminator` is
allowed to clobber AddrDisc.
* do not clobber `AUTAddrDisc` when computing `AUTDiscReg` on resigning
if `AUTAddrDisc == PACAddrDisc`, as it would prevent passing a raw
64-bit value as the new discriminator
* move the code computing `ShouldCheck` and `ShouldTrap` conditions to a
separate function
[AArch64][PAC] Group arguments of emitPtrauthAuthResign (NFC) (#174002)
The caller of `AArch64AsmPrinter::emitPtrauthAuthResign` has to analyze
the operands of the MachineInstr being emitted and pass them explicitly
to this method, which leads to a large number of function arguments,
some of them optional.
This commit introduces `struct PtrAuthSchema` to pass semantically-
related parameters as a single argument and to better express the idea
that the second schema can only be passed or omitted as a whole.
Furthermore, `AUTVal` argument is renamed to `Pointer`, as unlike other
arguments with the `AUT` prefix, it does not relate to the authentication
schema, but represents a tied in-out operand used throughout the entire
expanded instruction sequence.
[mlir][mpi] adding MPI_Allgather and lowering to LLVM (#176937)
- Adding MPI_Allgather to MPI dialect
- Adding lowering to MPIToLLVM
- Also lowering MPI_Commsize (see also #140392)
wesnoth: updated to 1.18.6
Version 1.18.6
Multiplayer
* 5p - The Wilderlands:
* Fixed lag during AI turn
Translations
* Updated translations: Ancient Greek, Arabic, Bengali, British English, Catalan, Chinese (Simplified), Czech, Finnish, French, Galician, Hungarian, Polish, Spanish
User interface
* The load-game dialog can now see the directories used by the development version (1.19.2 and later)
[OpenMP][flang] Move `todo` for checking reduction support status on the GPU
Moves a `todo` to check for the current level of support for by-ref
reductions to the `FunctionFiltering` pass. This guarantees that the
check does not trigger when the same module is compiled twice: on the
CPU and on the GPU.
[OptBisect] Merge shouldRun logic of -opt-bisect and -opt-disable (#177122)
Hi everyone,
After the introduction of `-opt-disable` in,
one of its main limitations has been that it cannot be used together
with `-opt-bisect`, since `getGlobalPassGate()` returns either
`getOptDisabler()` or `getOptBisector()`, but not both. Allowing them to
work simultaneously would be useful for disabling individual passes
while still restricting the pipeline. This is especially relevant given
the recent updates to `-opt-bisect`, such as interval support.
For example, when a defect is caused by a particular pass but its impact
is masked by another, it can be difficult to identify the actual culprit
through bisecting alone. Being able to disable passes individually while
using `-opt-bisect` would make this process much more efficient.
In this PR, I have merged the logic of the two flags so that they can
interoperate. Specifically:
[11 lines not shown]
[AMDGPU][AsmParser] Forbid Fake16 instructions in Real16 mode (#176934)
We don't need to support both simultaneously in tests now that all
True16 instructions are supported.
[IR] Allow non-constant offsets in @llvm.vector.splice.{left,right} (#174693)
Following on from #170796, this PR implements the second part of
https://discourse.llvm.org/t/rfc-allow-non-constant-offsets-in-llvm-vector-splice/88974
by allowing non-constant offsets in the vector splice intrinsics.
Previously @llvm.vector.splice had a restriction enforced by the
verifier that the offset had to be known to be within the range of the
vector at compile time. Because we can't enforce this with non-constant
offsets, it's been relaxed so that offsets that would slide the vector
out of bounds return a poison value, similar to
insertelement/extractelement.
@llvm.vector.splice.left also previously only allowed offsets within the
range 0 <= Offset < N, but this has been relaxed to 0 <= Offset <= N so
that it's consistent with @llvm.vector.splice.right.
In lieu of the verifier checks that were removed, InstSimplify has been
taught to fold splices to poison when the offset is out of bounds.
[5 lines not shown]
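The relaxed offset rules can be modeled on ordinary containers. The sketch below is an assumption-laden illustration, not LLVM code: it assumes `splice.left` selects N elements starting at `Offset` from the concatenation of the two inputs, and that `splice.right` selects the window starting at `N - Offset`. `std::nullopt` stands in for poison, and the names `spliceLeft` and `spliceRight` are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

using Vec = std::vector<int>;

// Model of splice.left under the assumptions above: take N elements of
// concat(A, B) starting at Offset. Offsets outside 0 <= Offset <= N give
// "poison" (std::nullopt). Mismatched lengths are rejected by this model.
std::optional<Vec> spliceLeft(const Vec &A, const Vec &B, std::size_t Offset) {
  const std::size_t N = A.size();
  if (B.size() != N || Offset > N)
    return std::nullopt;
  Vec Concat(A);
  Concat.insert(Concat.end(), B.begin(), B.end());
  return Vec(Concat.begin() + Offset, Concat.begin() + Offset + N);
}

// Model of splice.right: the window starts Offset elements before the
// boundary between A and B, i.e. at index N - Offset of concat(A, B).
std::optional<Vec> spliceRight(const Vec &A, const Vec &B, std::size_t Offset) {
  const std::size_t N = A.size();
  if (B.size() != N || Offset > N)
    return std::nullopt;
  return spliceLeft(A, B, N - Offset);
}
```

With `A = {1,2,3,4}` and `B = {5,6,7,8}`, an offset of 1 gives `{2,3,4,5}` for the left form and `{4,5,6,7}` for the right form in this model. An offset of 4 is accepted by both, while an offset of 5 produces the poison stand-in.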
IR: Add !nofpclass metadata
This adds metadata analogous to the nofpclass attribute
to assert that values are not in a certain set of floating-point classes.
This allows the same information to be expressed if a function
argument is passed indirectly. This matches the bitmask encoding
of nofpclass.
I also think this should be allowed for stores to symmetrically handle
sret, but leave that for later.
Alternatively we could add a more expressive !fprange metadata,
but that would be much more complex. It's useful to match the attribute,
and more annotations can always be added.
Fixes #133560
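As a rough illustration of the bitmask encoding shared with the nofpclass attribute, the following standalone sketch classifies a `double` and tests it against a mask. The bit values follow the nofpclass table in the LLVM Language Reference. The names `FPClassBits`, `classify` and `violatesNoFPClass` are hypothetical, and the quiet/signaling NaN split assumes IEEE-754 binary64 with the quiet bit in the top mantissa position.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Bit assignments as documented for the nofpclass attribute.
enum FPClassBits : uint32_t {
  SNan = 1u << 0,
  QNan = 1u << 1,
  NegInf = 1u << 2,
  NegNormal = 1u << 3,
  NegSubnormal = 1u << 4,
  NegZero = 1u << 5,
  PosZero = 1u << 6,
  PosSubnormal = 1u << 7,
  PosNormal = 1u << 8,
  PosInf = 1u << 9,
};

// Return the single class bit that describes X.
uint32_t classify(double X) {
  const bool Neg = std::signbit(X);
  switch (std::fpclassify(X)) {
  case FP_NAN: {
    uint64_t Bits;
    std::memcpy(&Bits, &X, sizeof(Bits));
    // Assumption: bit 51 is the quiet bit (IEEE-754 binary64).
    return (Bits & (1ull << 51)) ? QNan : SNan;
  }
  case FP_INFINITE:
    return Neg ? NegInf : PosInf;
  case FP_ZERO:
    return Neg ? NegZero : PosZero;
  case FP_SUBNORMAL:
    return Neg ? NegSubnormal : PosSubnormal;
  default: // FP_NORMAL
    return Neg ? NegNormal : PosNormal;
  }
}

// True if X belongs to a class that the mask asserts cannot occur.
bool violatesNoFPClass(double X, uint32_t Mask) {
  return (classify(X) & Mask) != 0;
}
```

A mask of `NegInf | PosInf` (value 516) would assert that a value is never infinite, and a mask of `SNan | QNan` (value 3) that it is never a NaN.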