[flang] Downgrade an overly strict error to a warning (#187524)
Fortran allows a PURE subroutine to have a dummy argument with INTENT(IN
OUT). An actual argument that is associated with an INTENT(IN OUT) dummy
argument must be definable. Consequently, there's a hole in the language
that allows a PURE subroutine to modify arbitrary global state: the
argument could have a derived type with an impure FINAL subroutine, and
that FINAL subroutine could be invoked by an assignment to the dummy
argument. I consider this to be a mistake in the language design.
So the compiler was reporting this case as an error, although it is
indeed conforming usage, and not flagged by any other compiler.
Unfortunately, somebody has code that needs this usage to be accepted,
because (I presume) they can't modify the dummy argument to be
INTENT(IN).
Consequently, we'll need to allow this usage. But it will elicit a
warning, and the warning is on by default.
[X86][GISEL] Port X86PostLegalizerCombiner to npm (#182787)
Port X86PostLegalizerCombiner to npm as part of llvm/llvm-project#178192.
Also add a CLI option for the lpm X86PostLegalizerCombiner pass for testing.
[DTLTO] Speed up temporary file removal in the ThinLTO backend (#186988)
Deleting the temporary files produced by the DTLTO ThinLTO backend can
be expensive on Windows hosts. For a Clang link (Debug build with
sanitizers and instrumentation) using an optimized toolchain (PGO
non-LTO, llvmorg-22.1.0) on a Windows 11 Pro (Build 26200), AMD Family
25 @ ~4.5 GHz, 16 cores / 32 threads, 64 GB RAM machine, the mean
duration of the "Remove DTLTO temporary files" time trace scope was
1267.789 ms (measured over 10 runs).
This patch performs the deletions on a background thread, allowing them
to overlap with the remainder of the link to hide this cost.
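The idea of overlapping deletions with the rest of the work can be sketched as follows. This is a minimal Python illustration of the general technique, not the patch's C++ code; the helper names `remove_files_in_background` and `make_temp_files` are hypothetical.

```python
import os
import tempfile
import threading


def remove_files_in_background(paths):
    """Start deleting `paths` on a worker thread and return that thread."""
    to_delete = list(paths)  # Copy so later changes by the caller have no effect.

    def _remove_all():
        for path in to_delete:
            try:
                os.remove(path)
            except OSError:
                # Best effort: one missing or locked file must not stop the rest.
                pass

    worker = threading.Thread(target=_remove_all, name="temp-file-cleanup")
    worker.start()
    return worker


def make_temp_files(count):
    """Create `count` empty temporary files and return their paths."""
    paths = []
    for _ in range(count):
        fd, path = tempfile.mkstemp(suffix=".tmp")
        os.close(fd)
        paths.append(path)
    return paths


if __name__ == "__main__":
    temp_paths = make_temp_files(4)
    cleanup = remove_files_in_background(temp_paths)
    # The remainder of the work would run here, overlapping the deletions.
    cleanup.join()  # Wait for the cleanup to finish before exiting.
    print(all(not os.path.exists(p) for p in temp_paths))
```

The caller must join the worker before the process exits, otherwise some files may be left behind.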
Based on work by @romanova-ekaterina and @kbelochapka.
[clang][modules] Remove `Module::ASTFile` (#185994)
This removes the assumption that a deserialized module is backed by a
`FileEntry`. The `OptionalFileEntryRef` member is replaced with
`ModuleFile{Name,Key}`.
[CGP][PAC] Flip PHI and blends when all immediate modifiers are the same
GVN PRE, SimplifyCFG and possibly other passes may hoist calls to the
`@llvm.ptrauth.blend` intrinsic, introducing multiple duplicate call
instructions hidden behind a PHI node. This prevents the instruction
selector from generating safer code by absorbing the address and
immediate modifiers into separate operands of the AUT, PAC, etc. pseudo
instructions.
This patch makes the CodeGenPrepare pass detect when a discriminator is
computed as a PHI node whose incoming values are all blends with the
same immediate modifier. Each such discriminator value is replaced by a
single blend, whose address argument is computed by a PHI node.
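The rewrite can be illustrated on a toy model. The sketch below is Python, not LLVM code: `Blend`, `Phi` and `flip_phi_of_blends` are hypothetical stand-ins for the blend intrinsic call, a PHI node, and the transformation, and they assume values can be represented as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blend:
    """Toy stand-in for a call to @llvm.ptrauth.blend(address, immediate)."""
    address: object
    immediate: int


@dataclass(frozen=True)
class Phi:
    """Toy PHI node: a tuple of (predecessor block, incoming value) pairs."""
    incoming: tuple


def flip_phi_of_blends(phi):
    """Turn phi(blend(a1, imm), blend(a2, imm), ...) into
    blend(phi(a1, a2, ...), imm).

    Returns None when the pattern does not apply, i.e. when some incoming
    value is not a blend or the immediate modifiers differ.
    """
    values = [value for _, value in phi.incoming]
    if not values or not all(isinstance(v, Blend) for v in values):
        return None
    immediates = {v.immediate for v in values}
    if len(immediates) != 1:
        return None
    # The new PHI merges only the address components of the old blends.
    address_phi = Phi(tuple((block, v.address) for block, v in phi.incoming))
    return Blend(address=address_phi, immediate=immediates.pop())


if __name__ == "__main__":
    disc = Phi((("bb1", Blend("%addr1", 42)), ("bb2", Blend("%addr2", 42))))
    print(flip_phi_of_blends(disc))
```

After the flip, the immediate modifier is visible as a constant operand of a single blend rather than being hidden behind the PHI.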
[AMDGPU] Fix setreg handling in the VGPR MSB lowering
There are multiple issues with it:
1. It can skip inserting S_SET_VGPR_MSB if we set the mode via
piggybacking. We are now relying on the HW bug for correct
behavior. If/when the bug is fixed, the lowering will be incorrect.
2. We should just unconditionally update the MSBs if the immediate allows
it. We shall set the correct bits and keep the rest of the immediate
(that is done). There is no reasonable way for a user to change the
MSBs, nor does it do anything good to set them with SETREG and then
immediately overwrite them with S_SET_VGPR_MSB.
3. We can always update the immediate if Offset is zero.
4. Redundant mode changes are created, as seen in
hazard-setreg-vgpr-msb-gfx1250.mir.
5. Decoding of the immediate was also wrong with a non-zero offset
and did not factor in the MSB fixup offset handling.
With unconditional immediate update most of the time and not relying on
[12 lines not shown]
[AArch64][PAC] Rework discriminator analysis for calls and tail calls
Make use of fixupBlendComponents for AUTH_TCRETURN[_BTI] and for
BLRA[_RVMARKER] pseudos the same way it is done for AUT/PAC/AUTPAC.
This patch unifies discriminator analysis for DAGISel and GlobalISel
and improves cross-BB analysis in case of DAGISel.
[libc][stdio] Fix standard streams in overlay mode. (#187522)
https://github.com/llvm/llvm-project/pull/184669 changed the behavior of
standard streams in overlay mode, bringing in some symbols that are only
available in full build mode.
[AArch64][clang][llvm] Add support for Armv9.7-A lookup table intrinsics
Add support for the following Armv9.7-A Lookup Table (lut)
instruction intrinsics:
SVE2.3
```c
// Variant is also available for: _u8 _mf8
svint8_t svluti6[_s8](svint8x2_t table, svuint8_t indices);
```
SVE2.3 and SME2.3
```c
// Variants are also available for _u16_x2 and _f16_x2.
svint16_t svluti6_lane[_s16_x2](svint16x2_t table, svuint8_t indices, uint64_t imm_idx);
```
SME2.3
```c
[9 lines not shown]
libclc: Replace flush_if_daz implementation
The fallback non-canonicalize path didn't work. Use a more
straightforward implementation. Eventually this should use
the pattern from #172998.
[mlir][linalg] Fix vectorizer generating invalid vector.gather for 0-D tensor.extract (#187085)
Vectorizing a rank-0 `linalg.generic` whose body contains
`tensor.extract` with data-dependent indices hits the Gather
classification in `getTensorExtractMemoryAccessPattern` because
`isOutput1DVector` returns false for a 0-D result. This produces an
invalid `vector.gather` where operand #2 must be a vector of index
values but gets a scalar `index` instead.
The fix classifies a 0-D result as ScalarBroadcast rather than Gather, and
skips mask generation for 0-D in that path.
Add OpenMP version guard for linear modifier
- Add an OpenMP 5.2 version guard for the linear modifier to make sure
we don't set val for OpenMP 4.5 and 5.0, which support an explicit
linear modifier
- Update test to revert changes in version < 5.2
- Update declare simd test (add declare_simd for 6.0)
- Refactor logic in flang lowering
[LV] Move dereferenceability check from Legal to VPlan (NFC) (#185323)
Instead of checking dereferenceability early during
LoopVectorizationLegality, defer the check to VPlan construction via
areAllLoadsDereferenceable.
This is in preparation for supporting early exit vectorization of
non-dereferenceable loads, e.g. via speculative loads
(https://discourse.llvm.org/t/rfc-provide-intrinsics-for-speculative-loads/89692)
or first-faulting loads. Detection in VPlan allows easily replacing
potentially non-deref loads with other loads as needed.
PR: https://github.com/llvm/llvm-project/pull/185323
[flang][OpenMP] Introduce `WithReason<T>` for nest/sequence properties
This helper class contains an optional value and a "reason" message.
It replaces the uses of std::pair<optional<...>, Reason>.
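As a rough illustration of the shape of such a helper, here is a Python analog. It is only a sketch: the real class is a C++ template whose members are not shown here, so the field names, the `has_value` method, the use of a plain string for the reason, and the `parse_positive` example are all assumptions.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class WithReason(Generic[T]):
    """An optional value bundled with a message explaining the result."""
    value: Optional[T] = None
    reason: str = ""

    def has_value(self):
        return self.value is not None


def parse_positive(text):
    """Hypothetical query: a positive integer, or the reason there is none."""
    if text.isdigit() and int(text) > 0:
        return WithReason(value=int(text))
    return WithReason(reason=f"'{text}' is not a positive integer")


if __name__ == "__main__":
    for candidate in ("3", "abc"):
        result = parse_positive(candidate)
        print(result.value if result.has_value() else result.reason)
```

Compared with a bare pair, the named fields make it clear at each use site which half is the value and which is the explanation.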
Issue: https://github.com/llvm/llvm-project/issues/185287
[libc++][NFC] Remove redundant guard for `__cpp_lib_destroying_delete` (#187473)
In `<version>` and test files, `__cpp_lib_destroying_delete` is already
properly guarded with standard modes, so it's redundant to specify the
standard revision in `test_suite_guard`/`libcxx_guard`.
[libc++] Unify python shebangs (#187258)
As per PEP-0394[1], there is no real consensus over what binary names
Python has, specifically 'python' could be Python 3, Python 2, or not
exist.
However, everyone has a python3 interpreter and the scripts are all
written for Python 3. Unify the shebangs so that the ~50% of shebangs
that use python now use python3.
[1] https://peps.python.org/pep-0394/
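A shebang rewrite of this kind could be scripted along the following lines. This is a hypothetical sketch, not the script used for the change; the function name `unify_shebang` and the exact matching rule are assumptions.

```python
import re

# Plain "python" as the interpreter name: preceded by "/" or whitespace and
# followed by whitespace or end of line, so python2, python3 and python3.12
# are left alone.
_PLAIN_PYTHON = re.compile(r"(?<=[/\s])python(?=\s|$)")


def unify_shebang(text):
    """Return `text` with a first-line `python` shebang changed to `python3`."""
    first, sep, rest = text.partition("\n")
    if not first.startswith("#!"):
        return text  # No shebang, nothing to do.
    return _PLAIN_PYTHON.sub("python3", first, count=1) + sep + rest


if __name__ == "__main__":
    print(unify_shebang("#!/usr/bin/env python\nprint('hello')\n"), end="")
```

Only the first line is inspected, so occurrences of the word "python" elsewhere in a script are never touched.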