[flang][cuda] Preserve fir.rebox captured by cuf.kernel via CUDAKernelOpInterface (#193890)
Reland of #193837 (reverted in #193855), now using a marker op interface
to avoid the link cycle that broke `BUILD_SHARED_LIBS=ON` builds.
`SimplifyArrayCoorOp` folded `fir.rebox` into `fir.array_coor` across a
`cuf.kernel` boundary. CUF lowering needs the captured rebox to
materialize a managed-memory descriptor for the kernel; folding it away
makes the kernel dereference the host-side descriptor and crash with
`cudaErrorIllegalAddress`.
The fix adds `fir::CUDAKernelOpInterface`, a marker op interface
defined in FIRDialect and implemented by `cuf.kernel`. The
canonicalization guard queries the interface, so the `TypeIDResolver`
symbol lives in `libFIRDialect.so` and no `FIR -> CUF` link edge is
introduced.
[Flang][OpenMP] Validate `omp_initial_device` `omp_invalid_device` as device IDs (#193669)
Per OpenMP 5.2/6.0, the following are valid device values in a `#pragma
omp target` directive:
- `omp_initial_device` (-1): refers to the host CPU.
- `omp_invalid_device` (-2): an intentionally invalid device, used to
  trigger a runtime error.
For both of these values, flang fails with:
```
error: The device expression of the DEVICE clause must be a positive integer expression
!$OMP TARGET DEVICE(-1)
error: Must have INTEGER type, but is REAL(4)
!$OMP TARGET DEVICE(OMP_INVALID_DEVICE)
```
Issue: https://github.com/llvm/llvm-project/issues/192989
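The acceptance rule described above can be sketched as a small predicate. This is only an illustration in Python, not flang's semantic check: the function name is hypothetical, the two constants use the values quoted in this message, and treating every non-negative integer as acceptable is an assumption.

```python
# Values as quoted in the commit message above.
OMP_INITIAL_DEVICE = -1
OMP_INVALID_DEVICE = -2


def is_acceptable_device_id(value):
    """Return True if `value` is an integer device ID this sketch accepts.

    Hypothetical helper: accepts non-negative integers (assumed) plus the
    two named constants; rejects non-integers such as REAL values.
    """
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return value >= 0 or value in (OMP_INITIAL_DEVICE, OMP_INVALID_DEVICE)
```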
[libc++] Improvements to the benchmark runners (#194659)
- Run the ref workloads on SPEC
- Record the SPEC version in the machine info
- Allow filtering which benchmarks are run in run-benchbot
[VPlan] Optz WideCanIV with SIVSteps over CanIV (#191276)
Replace WideCanonicalIV with a ScalarIVSteps over the CanonicalIV when
only the first lane is used. This is a preparatory step in enabling
expansion of WideCanonicalIV into executable recipes.
[Lit] Open sub-processes with text=`False` (#194577)
This PR is part of a series of patches upgrading Lit's in-process
built-ins to be able to run with piped input/output and full redirection
support, and to allow custom in-process built-ins to be provided via the
Lit config. The remaining patches to Lit's test runner can be found here:
https://github.com/BStott6/llvm-project/compare/lit-inproc-builtins.
This is part of the Lit daemonized testing project:
https://discourse.llvm.org/t/88612
This PR makes Lit open all sub-processes with `text=False`, so that the
Python code will be able to read and write binary data to and from their
IO streams. This currently causes no functional change: when Lit
reads output from the sub-processes, it already handles `bytes` output
by decoding it. However, we will need to be able
to read binary data from a sub-process's STDIN if its output, which may
be binary, is piped into an in-process built-in, and we will need to be
able to write binary data to a sub-process's STDOUT if its input is
[5 lines not shown]
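As a rough illustration of what `text=False` means for the caller, the sketch below opens a child process with binary pipes and decodes only when a string is wanted. It is not Lit's code: `run_capture` and `to_text` are hypothetical helpers, and the child is just a Python one-liner that copies STDIN to STDOUT.

```python
import subprocess
import sys


def run_capture(argv, stdin_bytes=b""):
    """Run argv with binary pipes and return (stdout_bytes, stderr_bytes)."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=False,  # pipes carry bytes: no decoding, no newline translation
    )
    return proc.communicate(stdin_bytes)


def to_text(data):
    """Decode bytes for display; pass str through unchanged."""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


# Child process that copies its STDIN to its STDOUT byte for byte.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
out, err = run_capture(child, b"\x00\xffline\r\n")
```

Because the pipes are binary, `out` holds exactly the bytes that were sent, including the non-UTF-8 byte and the `\r\n` sequence.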
[Flang] Fix -Wopen-mp-* and -Wopen-acc-* flag spellings (#188434)
The CamelCase-to-hyphenated conversion was incorrectly splitting
"OpenMP" and "OpenACC" into "open-mp" and "open-acc", producing wrong -W
flag names like -Wopen-mp-usage instead of -Wopenmp-usage. Fix the
conversion to treat these as compound names, keep the old spellings as
deprecated aliases, and emit a warning when deprecated spellings are
used.
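A minimal sketch of the conversion issue, written in Python rather than flang's C++. The regular expression, the function names, and the CamelCase input `OpenMPUsage` are assumptions chosen to reproduce the spellings quoted above; flang's actual implementation may differ.

```python
import re

# Compound names that should stay in one piece (taken from the message above).
COMPOUND_NAMES = ("OpenMP", "OpenACC")

# Split points: lower/digit followed by upper, or the end of an acronym run.
_SPLIT = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def naive_hyphenate(name):
    """Split every CamelCase word boundary; yields the old spellings."""
    return _SPLIT.sub("-", name).lower()


def hyphenate(name):
    """Like naive_hyphenate, but keep compound names as a single word."""
    for compound in COMPOUND_NAMES:
        # "OpenMP" -> "Openmp", so no boundary is detected inside it.
        name = name.replace(compound, compound.capitalize())
    return _SPLIT.sub("-", name).lower()


def deprecated_alias(name):
    """Return the old spelling if it differs from the new one, else None."""
    old, new = naive_hyphenate(name), hyphenate(name)
    return old if old != new else None
```

With these definitions, `hyphenate("OpenMPUsage")` gives `openmp-usage`, while `deprecated_alias("OpenMPUsage")` gives the old `open-mp-usage` spelling that would be kept as an alias.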
---------
Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>
[SLP] Add tests for boundary case with MinProfitableStridedOps (#194507)
Currently we don't vectorize runtime strided loads when `VF == MinProfitableStridedOps`.
[mlir][tosa] Verify the output shape of tosa.mul and tosa.rescale (#193952)
Verifying the provided output shape against an expected shape helps
diagnose issues on op construction.
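The idea of checking a provided output shape against an expected one can be sketched as follows. This is a Python illustration, not the MLIR verifier: it assumes static shapes, equal-rank operands, and size-1 broadcasting for a multiply-like op, and both function names are hypothetical.

```python
def expected_mul_shape(lhs, rhs):
    """Broadcast two equal-rank static shapes; size-1 dims stretch (assumed)."""
    if len(lhs) != len(rhs):
        raise ValueError(f"rank mismatch: {lhs} vs {rhs}")
    result = []
    for a, b in zip(lhs, rhs):
        if a != b and 1 not in (a, b):
            raise ValueError(f"incompatible dims: {a} vs {b}")
        # If a is 1, take b; otherwise a == b or b == 1, so take a.
        result.append(b if a == 1 else a)
    return tuple(result)


def verify_output_shape(expected, provided):
    """Return None if the shapes agree, else a diagnostic string."""
    if tuple(expected) == tuple(provided):
        return None
    return f"expected output shape {tuple(expected)}, got {tuple(provided)}"
```

For example, operands of shape `(2, 1, 3)` and `(2, 4, 3)` give an expected shape of `(2, 4, 3)`, and a declared output of `(2, 1, 3)` produces a diagnostic instead of being accepted silently.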
[Lit] Change processRedirects to open all files in binary mode (#194368)
This PR is the second in a series of patches upgrading Lit's in-process
built-ins to be able to run with piped input/output and full redirection
support, and to allow custom in-process built-ins to be provided via the
Lit config. The remaining patches to Lit's test runner can be found here:
https://github.com/BStott6/llvm-project/compare/lit-inproc-builtins.
This is part of the Lit daemonized testing project:
https://discourse.llvm.org/t/88612.
This PR makes Lit's `processRedirects` function open all input/output
files in binary mode. This makes sure that in-process built-ins have the
expected behaviour when reading from and writing to them:
Newline translation is not required for any of the current in-process
built-ins. In fact, the in-process built-in for `echo`, which is the
only one that writes to `stdout`, explicitly re-opens the output file
with `newline=""` on Windows to avoid newline translation. Also,
[7 lines not shown]
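The effect of newline translation can be seen with a small Python sketch. It does not use Lit's `processRedirects`. The helpers are hypothetical, and `newline="\r\n"` is passed explicitly to mimic what text mode does by default on Windows, so the result is the same on any platform.

```python
import os
import tempfile


def write_text_mode(path, text, newline):
    """Write through a text-mode handle, which may translate newlines."""
    with open(path, "w", newline=newline) as f:
        f.write(text)


def write_binary_mode(path, data):
    """Write through a binary-mode handle; bytes land on disk unchanged."""
    with open(path, "wb") as f:
        f.write(data)


def read_raw(path):
    """Read the file back as raw bytes."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")

    # Text mode with "\r\n" translation, as on Windows by default.
    write_text_mode(path, "a\nb\n", newline="\r\n")
    translated = read_raw(path)

    # Binary mode: what was written is what is stored.
    write_binary_mode(path, b"a\nb\n")
    untouched = read_raw(path)
```

Here `translated` contains `\r\n` line endings even though only `\n` was written, while `untouched` matches the written bytes exactly.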
[AArch64][clang][llvm] Add support for Armv9.7-A lookup table intrinsics
Add support for the following Armv9.7-A Lookup Table (lut)
instruction intrinsics:
SVE2.3
```c
// Variants are also available for _u8 and _mf8.
svint8_t svluti6[_s8](svint8x2_t table, svuint8_t indices);
```
SVE2.3 and SME2.3
```c
// Variants are also available for _u16_x2 and _f16_x2.
svint16_t svluti6_lane[_s16_x2](svint16x2_t table, svuint8_t indices, uint64_t imm_idx);
```
SME2.3
```c
[9 lines not shown]
[mlir][memref] Pass TypeConverter to ConvertMemrefStore (#194356)
Commit 20b925a28a29 dropped the TypeConverter from ConvertMemrefStore
when adding the disableAtomicRMW flag. Restore it.