[LV] Use isLegalMaskedLoadOrStore for interleaved accesses too (NFC) (#195243)
isLegalMaskedLoadOrStore is now the central place for querying target
capabilities for masked accesses. Access pattern legality checks are
hoisted outside of it.
[Flang][Semantics] Treat host/use-associated objects as externally visible. (#192892)
This patch fixes a false semantic error in Flang where function result
variables were incorrectly treated as externally visible in
pure-definability checks.
As a result, valid code assigning a pointer component of a function
result (as in flang/test/Semantics/pure-function-result-pointer.f90) was
rejected with “not definable in a pure subprogram.”
The fix updates _FindExternallyVisibleObject_ to treat function result
symbols as local, which matches Fortran semantics for function result
variables.
[Flang][OpenMP] Fix COPYIN of derived types with allocatable components at -O3 (#196063)
COPYIN of threadprivate derived types with allocatable components
segfaults at -O3 because the OpenMP runtime zero-fills per-thread
storage, leaving allocatable component descriptors with invalid
metadata. This patch skips the copy on the master thread (where source
and destination alias) and uses temporary_lhs assignment on worker
threads so the runtime initializes descriptors before the deep copy.
Assisted-by: Claude Opus 4.6
Fixes:
[https://github.com/llvm/llvm-project/issues/196134](https://github.com/llvm/llvm-project/issues/196134)
Minimal reproducing test case:
```
program repro_o3_segv
use omp_lib
implicit none
[64 lines not shown]
Reapply [AA] No synchronization effects for never-escaping identified local (#196923)
Relative to the previous attempt, this makes sure that the location does
not alias with the pointer operand first. If it aliases, then we need to
consider the direct ModRef effects of the instruction, not just the
synchronization effects.
-----
Fences and other synchronizing operations (such as atomic accesses
stronger than monotonic) are modelled as reading and writing all memory,
in order to enforce their implied ordering constraints.
Currently, this happens even for identified function locals that do not
escape. This patch excludes those objects.
Notably, we cannot reason based on captures-before here, because the
synchronizing operation still has an effect even if the object only
escapes later.
[2 lines not shown]
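The decision described above can be sketched as a toy model. This is not LLVM's AliasAnalysis API: `ToyInst`, `ToyLocation`, `toyGetModRef` and both enums are hypothetical names introduced for illustration, and the aliasing case simply falls back to a conservative answer.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not LLVM's real types.
enum class ModRefResult { NoModRef, Ref, Mod, ModRef };
enum class AliasResult { NoAlias, MayAlias };

struct ToyLocation {
  bool isIdentifiedFunctionLocal; // e.g. an alloca
  bool everEscapes; // escapes anywhere in the function, before or after
};

struct ToyInst {
  bool synchronizes;         // fence, or atomic stronger than monotonic
  bool hasPointerOperand;    // false for a plain fence
  ModRefResult directEffect; // effect on its own pointer operand
};

inline ModRefResult toyGetModRef(const ToyInst &inst, const ToyLocation &loc,
                                 AliasResult operandAlias) {
  // An instruction without a pointer operand cannot alias the location.
  assert(inst.hasPointerOperand || operandAlias == AliasResult::NoAlias);

  // Check aliasing with the pointer operand first: the direct effect of the
  // access matters here, so the never-escaping shortcut must not apply.
  if (operandAlias == AliasResult::MayAlias)
    return inst.synchronizes ? ModRefResult::ModRef : inst.directEffect;

  // A non-synchronizing access to a different location has no effect on it.
  if (!inst.synchronizes)
    return ModRefResult::NoModRef;

  // Synchronization cannot be observed through an object that no other
  // thread can ever reach.
  if (loc.isIdentifiedFunctionLocal && !loc.everEscapes)
    return ModRefResult::NoModRef;

  // Otherwise the operation is modelled as reading and writing all memory.
  return ModRefResult::ModRef;
}
```

Note that `everEscapes` deliberately covers the whole function rather than only the code before the instruction, mirroring the captures-before caveat above.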
[Dexter] Add basic structured script parsing (#193710)
See PSA:
https://discourse.llvm.org/t/psa-planned-changes-to-dexter/90402
This patch begins adding support for "structured scripts" to Dexter,
starting with some of the core classes and the ability to parse script
files. This patch does not add the ability to actually run scripts, or
any of the underlying functionality required to do so.
NB: This patch adds a dependency on PyYAML, which is specified in a new
requirements.txt file.
[mlir][dataflow] IntRange: Replace yield-based widening with per-state lattice budget (#196616)
IntegerRangeAnalysis can hang on `scf.while` loops with dynamic bounds: a
loop-carried range ratchets [0,0]->[0,1]->[0,2]->... by one per worklist
visit, requiring up to 2^31 iterations on i32. The new
`int-range-analysis-convergence.mlir` test reproduces this.
The ratchet lives at framework merge sites (region successors, callable
args) where the solver joins lattices via virtual
`Lattice::join(const AbstractSparseLattice &)`. The pre-existing
`isYieldedResult`/`isYieldedValue` heuristic in
`IntegerRangeAnalysis::visitOperation` doesn't help: it runs in the
transfer-function callback for inferrable-op results used by a
terminator, not on the merge path. It is also harmful where it fires: it
slams the range to maxRange on the *second* visit (after, say,
[1,1]->[1,2]), so naturally
bounded accumulators (e.g. `arith.minsi`-clamped iter args) widen to
[INT_MIN, INT_MAX].
[8 lines not shown]
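The ratchet and a budget-based cutoff can be illustrated with a self-contained toy. This is a sketch, not the MLIR implementation: `Range`, `BudgetedRange` and `analyzeLoop` are hypothetical names, and the budget policy (allow a fixed number of growing joins per lattice state, then jump to the maximal range) is an assumption inferred from the commit title.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Closed integer interval [lo, hi]; 64-bit fields avoid overflow in the toy.
struct Range {
  std::int64_t lo, hi;
  friend bool operator==(const Range &a, const Range &b) {
    return a.lo == b.lo && a.hi == b.hi;
  }
};

// Lattice state that tolerates a limited number of growing joins.
struct BudgetedRange {
  Range r;
  int changesLeft;

  // Joins `other` into this state; returns true if the state changed.
  bool join(const Range &other, const Range &maxRange) {
    assert(changesLeft >= 0);
    Range merged{std::min(r.lo, other.lo), std::max(r.hi, other.hi)};
    if (merged == r)
      return false;
    if (changesLeft == 0) {
      r = maxRange; // budget exhausted: give up precision to converge
      return true;
    }
    --changesLeft;
    r = merged;
    return true;
  }
};

struct LoopResult {
  int visits;
  Range range;
};

// Models `i = 0; while (i < bound) i = i + 1` and counts worklist visits
// until the loop-carried range stops changing.
inline LoopResult analyzeLoop(std::int64_t bound, int budget) {
  const Range maxRange{INT32_MIN, INT32_MAX};
  BudgetedRange iter{{0, 0}, budget};
  int visits = 0;
  for (;;) {
    ++visits;
    // Transfer function for `i + 1`, with the result clamped to `bound`.
    Range next{iter.r.lo + 1, std::min(iter.r.hi + 1, bound)};
    if (!iter.join(next, maxRange))
      return {visits, iter.r};
  }
}
```

In this toy, `analyzeLoop(1000, 1000000)` walks the range up one step at a time, while `analyzeLoop(1000, 4)` gives up after four growing joins and converges on the maximal range. A tightly bounded loop such as `analyzeLoop(3, 4)` converges within budget and keeps its precise range.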
[MLIR][GPU] Add gpu-lower-to-rocdl-pipeline meta-pass (#196751)
Add `gpu-lower-to-rocdl-pipeline` meta-pass which lowers common MLIR
dialects (gpu/arith/scf/vector) to binary, similar to the existing
XeVM/NVVM pipelines.
[MLIR] Make MLIRRegisterAllPasses depend on mlir-headers (#196913)
RegisterAllPasses.cpp pulls in dialect Passes.h and generated Passes.h.inc
files via TableGen targets that are tied to mlir-headers, but
add_mlir_library only adds mlir-generic-headers by default. This TU can
therefore compile before those generated headers are ready, and
registerAllPasses() can miss passes (e.g. sporadic gaps in
mlir-opt --help). Add DEPENDS mlir-headers to MLIRRegisterAllPasses in
mlir/lib/CMakeLists.txt so it waits for those outputs. Verified with
ninja mlir-opt and mlir-opt --help | grep -E
'nvvm-attach-target|rocdl-attach-target' (or similar stable upstream
passes in your tree).
Signed-off-by: Fujun Han <fujun.han at iluvatar.com>
Co-authored-by: Cursor <cursoragent at cursor.com>
[AtomicExpand] Add bitcasts when expanding load atomic vector (#148900)
AtomicExpand fails for aligned `load atomic <n x T>` because it
does not find a compatible library call. This change adds appropriate
bitcasts so that the call can be lowered. It also adds support for
128 bit lowering in tablegen to support SSE/AVX.
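As a source-level analogy for the bitcast step, the sketch below loads a two-float vector atomically as a single 64-bit integer and then reinterprets the bits. It is not the AtomicExpand code: `Float2`, `toBits`, `fromBits` and `atomicLoadFloat2` are hypothetical helpers, and the choice of acquire ordering is arbitrary.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

using Float2 = std::array<float, 2>;
static_assert(sizeof(Float2) == sizeof(std::uint64_t),
              "the vector must fit exactly in the integer carrier type");

// Reinterpret the vector's bits as an integer (analogous to an IR bitcast).
inline std::uint64_t toBits(const Float2 &v) {
  std::uint64_t bits;
  std::memcpy(&bits, &v, sizeof(bits));
  return bits;
}

// Reinterpret an integer's bits as the vector type.
inline Float2 fromBits(std::uint64_t bits) {
  Float2 v;
  std::memcpy(&v, &bits, sizeof(v));
  return v;
}

// The atomic operation itself only ever sees an integer of the right width;
// the vector type is recovered afterwards without changing any bits.
inline Float2 atomicLoadFloat2(const std::atomic<std::uint64_t> &slot) {
  return fromBits(slot.load(std::memory_order_acquire));
}
```

The point of the analogy is that the integer-typed atomic primitive stays unchanged, and only the value's type is converted around it.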
[LifetimeSafety] Improve `[[clang::lifetimebound]]` violation diagnostics (#196824)
Reports lifetimebound verification diagnostics at the attribute
location, so declarations with the attribute now point at the
declaration rather than only at the function definition.
[CMake] Don't pass --gc-sections to MSVC-style linkers when using clang's MSVC mode (#196393)
The PR concerns Clang with a GNU-like command-line interface on Windows.
The LLVM linker on Windows (lld-link.exe) does not understand the
--gc-sections option. The PR excludes that option when compiling on
Windows to remove a linker warning (and an error if warnings are treated
as such).