[flang][OpenMP] Fix crash when a sliced array is specified in a forall within a workshare construct (#170913)
This is a fix for two problems that caused a crash:
1. Thread-local variables are sometimes required to be parallelized.
Added a special case to handle this in
`LowerWorkshare.cpp:isSafeToParallelize`.
2. Race condition caused by a `nowait` added to the `omp.workshare` if
it is the last operation in a block. This allowed multiple threads to
execute the `omp.workshare` region concurrently. Since
_FortranAPushValue modifies a shared stack, this concurrent access
causes a crash. Disable the addition of `nowait` and rely on the
implicit barrier at the end of the `omp.workshare` region.
Fixes #143330
[mlir][bufferization] Fix use-after-free in ownership-based buffer deallocation (#184118)
When `handleInterface(RegionBranchOpInterface)` processes an op such as
`scf.for`, it calls `appendOpResults` to clone the op with extra
ownership result types and erase the original. The `Liveness` analysis
is computed once before the transformation begins and may still
reference the old (now-freed) result values.
If the same block contains a `BranchOpInterface` terminator (e.g.,
`cf.br`) after the structured loop, `handleInterface(BranchOpInterface)`
calls `getMemrefsToRetain`, which iterates `liveness.getLiveOut()`. That
set may contain stale `Value` objects pointing to the erased op's
results. Calling `isMemref()` on such a value dereferences freed memory,
triggering a crash.
Fix by adding a `valueMapping` map to `DeallocationState`. Before
erasing the old op in `handleInterface(RegionBranchOpInterface)`, record
the old-to-new result mapping via `state.mapValue`. The
`getLiveMemrefsIn` and `getMemrefsToRetain` helpers translate stale
[5 lines not shown]
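The old-to-new mapping idea described above can be sketched in standalone C++. This is only an illustration, not the code from the patch: `ValueId`, `DeallocationStateSketch`, `resolve` and `translateLiveOut` are hypothetical stand-ins, and following chains of replacements is an assumption that the commit message does not confirm.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an SSA value handle (the real mlir::Value is
// not an integer).
using ValueId = int;

struct DeallocationStateSketch {
  // Erased (old) value -> value that replaced it.
  std::unordered_map<ValueId, ValueId> valueMapping;

  // Record a replacement before the old op is erased.
  void mapValue(ValueId oldValue, ValueId newValue) {
    assert(oldValue != newValue && "a value cannot replace itself");
    valueMapping[oldValue] = newValue;
  }

  // Follow the mapping until reaching a value that was never replaced.
  // Assumption: replacements never form a cycle.
  ValueId resolve(ValueId value) const {
    auto it = valueMapping.find(value);
    while (it != valueMapping.end()) {
      value = it->second;
      it = valueMapping.find(value);
    }
    return value;
  }
};

// Translate a precomputed live-out set before inspecting its members, so
// that no erased value is ever touched.
std::vector<ValueId> translateLiveOut(const DeallocationStateSketch &state,
                                      const std::vector<ValueId> &liveOut) {
  std::vector<ValueId> result;
  result.reserve(liveOut.size());
  for (ValueId v : liveOut)
    result.push_back(state.resolve(v));
  return result;
}
```

The point of the sketch is that the analysis result is left untouched and stale entries are translated at the point of use.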
[mlir][Shape] Fix crash in BroadcastOp::fold when operand is ub.poison (#183931)
BroadcastOp::fold used an unchecked llvm::cast<DenseIntElementsAttr> on
each operand's folded attribute. The existing null-check only guarded
against a missing (unset) attribute, not against a non-null attribute of
a different type such as PoisonAttr (produced when an operand is
ub.poison).
Replace the unchecked casts with dyn_cast_or_null, bailing out with
nullptr (i.e. no fold) when any operand does not provide a
DenseIntElementsAttr.
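The difference between an unchecked cast and a checked, null-tolerant cast can be shown without LLVM headers. In this minimal sketch `dynamic_cast` plays the role of `dyn_cast_or_null`; the types `Attr`, `DenseIntAttr`, `PoisonAttr` and the function `collectShapes` are hypothetical and do not correspond to the MLIR classes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for folded operand attributes.
struct Attr {
  virtual ~Attr() = default;
};
struct DenseIntAttr : Attr {
  std::vector<int64_t> values;
  explicit DenseIntAttr(std::vector<int64_t> v) : values(std::move(v)) {}
};
struct PoisonAttr : Attr {};

// Returns the extents of every operand, or std::nullopt ("no fold") as soon
// as one operand is missing or is not a DenseIntAttr.
std::optional<std::vector<std::vector<int64_t>>>
collectShapes(const std::vector<const Attr *> &operands) {
  std::vector<std::vector<int64_t>> shapes;
  for (const Attr *attr : operands) {
    // dynamic_cast on a null pointer yields null, so a single check covers
    // both a missing attribute and an attribute of a different kind.
    const auto *dense = dynamic_cast<const DenseIntAttr *>(attr);
    if (!dense)
      return std::nullopt;
    shapes.push_back(dense->values);
  }
  return shapes;
}
```

A null-only check would let the `PoisonAttr` case through to the cast, which is the situation the commit describes.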
Add a regression test with a ub.poison operand.
Fixes #179679
[mlir][bytecode] Fix crash when reading DenseIntOrFPElementsAttr with unsupported element type (#184773)
When a bytecode type callback substitutes a type that does not implement
DenseElementTypeInterface (e.g., !test.i32 replacing i32), the bytecode
reader attempted to reconstruct a DenseIntOrFPElementsAttr with that
type. This unconditionally called getDenseElementBitWidth() which hit an
llvm_unreachable on unsupported types.
Fix this by validating the element type implements
DenseElementTypeInterface in readDenseIntOrFPElementsAttr before
proceeding. If the check fails, a proper diagnostic is emitted and
reading fails gracefully instead of crashing.
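As a rough illustration of validating first and failing with a diagnostic, here is a standalone sketch. `ElementTypeSketch`, `ReadResult` and `readDenseElementsSketch` are hypothetical names, and modelling "implements the interface" as an optional bit width is an assumption made only for this example.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical element-type descriptor: bitWidth is empty when the type
// does not support dense element storage.
struct ElementTypeSketch {
  std::string name;
  std::optional<unsigned> bitWidth;
};

struct ReadResult {
  bool ok;
  std::string diagnostic; // empty on success
  unsigned bitWidth;      // meaningful only when ok is true
};

// Check the precondition up front and report an error, instead of calling
// into code that would abort on an unsupported type.
ReadResult readDenseElementsSketch(const ElementTypeSketch &type) {
  if (!type.bitWidth)
    return {false, "unsupported element type: " + type.name, 0};
  return {true, "", *type.bitWidth};
}
```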
Fixes #128317
libclc: Define work_group_barrier
Previously only the old barrier name was implemented. Define this
as an indirection around the new name, and move it to common code.
The target implementations are already provided by __clc_work_group_barrier,
so targets were unnecessarily duplicating these.
This also fixes the default scope, which should be
memory_work_group_scope. Previously the implementation guessed that
if the flags included global memory, the scope should be
device, which is not the case.
DAG: Replace legal type check in EmitCopyFromReg (#177788)
It doesn't make sense that an illegal type would get here; a
CopyFromReg cannot be illegally typed. The only exception that
was hit here is in a handful of SystemZ inline assembly tests
for i128, which use untyped. They shouldn't; it should treat
v2i64 as legal instead. Just leave the untyped check for now.
[libc++] Remove `__wrap_iter::base()` (#179389)
Resolves #126442
- Converts all the relevant functions that used `.base()` into friends
- Fixes usage in `<regex>`
---------
Co-authored-by: A. Jiang <de34 at live.cn>
[mlir][sparse] Fix crash in sparsification when unary/binary present block captures sparse tensor argument (#184597)
`relinkBranch` in Sparsification.cpp assumed that any block argument
from the outer `linalg.generic` op encountered inside an inlined
semi-ring branch must be a dense tensor, and asserted accordingly.
However, the `present` block of a `sparse_tensor.unary` (or similar
semi-ring ops) is permitted to capture sparse tensor operands directly
via `isAdmissibleBranchExp`, which accepts any `BlockArgument` as
admissible.
The fix removes the incorrect assertion and extends the load generation
to handle sparse tensors using `genSubscript`, which already knows how
to return the value buffer and current value position via the loop
emitter. The `kSparseIterator` strategy (where `genSubscript` returns a
`TensorType`) is also handled by emitting a
`sparse_tensor.extract_value` op.
Fixes #91183
Reapply "[SPIRV] Emit intrinsics for globals only in function that references them (#178143 (#179268)) (#182552)
This reverts commit 395858d9f172ff1c61c661aa7c2a18b449daffa6.
This PR had been reverted due to an unrelated address-sanitizer failure.
[mlir][sparse] Fix crash in SparseAssembler when run after SparseTensorCodegen (#183896)
After --sparse-tensor-codegen, sparse tensor arguments are replaced by
memrefs and !sparse_tensor.storage_specifier types. The subsequent
--sparse-assembler pass calls getSparseTensorEncoding() to identify
sparse arguments to wrap/unwrap. However, getSparseTensorEncoding()
returns non-null for StorageSpecifierType as well as for sparse
RankedTensorType. Since StorageSpecifierType is not a RankedTensorType,
the subsequent cast<RankedTensorType> in convTypes() and convVals()
would crash with an assertion failure.
Fix by also checking isa<RankedTensorType>(type) in the passthrough
condition in both convTypes() and convVals(), so that
StorageSpecifierType arguments pass through unchanged.
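The strengthened passthrough condition can be sketched as a small predicate. The `TypeKind` and `TypeSketch` types below are hypothetical; `hasSparseEncoding` stands in for a non-null result from getSparseTensorEncoding(), and the actual condition in the patch may be written differently.

```cpp
#include <cassert>

// Hypothetical classification of argument types.
enum class TypeKind { RankedTensor, StorageSpecifier, MemRef };

struct TypeSketch {
  TypeKind kind;
  bool hasSparseEncoding; // models a non-null encoding lookup
};

// An argument passes through unchanged unless it is BOTH sparse-encoded and
// a ranked tensor. Checking only the encoding would send storage specifiers
// down the tensor-only path.
bool passesThrough(const TypeSketch &type) {
  return !type.hasSparseEncoding || type.kind != TypeKind::RankedTensor;
}
```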
Fixes #183776
[HLSL] Amend f32tof16() and f16tof32() tests (#179261)
Amend the codegen tests for f32tof16() and f16tof32() to include SPIRV
as a target in addition to DXIL.
Fixes #179257
Co-authored-by: Tim Corringham <tcorring at amd.com>
[AArch64][llvm] Gate some `tlbip` insns with +tlbid or +d128
Change the gating of `tlbip` instructions containing `*E1IS*`, `*E1OS*`,
`*E2IS*` or `*E2OS*` to be used with `+tlbid` or `+d128`. This is because
the 2025 Armv9.7-A MemSys specification says:
```
All TLBIP *E1IS*, TLBIP*E1OS*, TLBIP*E2IS* and TLBIP*E2OS* instructions
that are currently dependent on FEAT_D128 are updated to be dependent
on FEAT_D128 or FEAT_TLBID
```
[lldb][test] TestDataFormatterGenericOptional.py: remove obsolete skipIfs
Clang 7 and GCC 5 are pretty ancient. It is unlikely that any bot configurations still run them. Let's remove the skipIfs to reduce test noise.
[lldb][test] Clean up USE_LIBSTDCPP/USE_LIBCPP usage
This patch makes the two tests consistent with the rest of the formatter API tests (and is, in my opinion, easier to follow).
[Flang] Fix wrong compile-time error message, issue #178494. (#183878)
Fix the problem described in issue #178494. It will cover the failures
with S, SP, SS, BN, BZ, LZ, LZP, LZS, etc. It will resolve the test
failures in PR #183500.