[Clang] [Sema] Improve handling of multidimensional subscript operator for builtin types (#187828)
This patch improves the diagnostic issued when a multidimensional
subscript operator is applied to a builtin type. It also updates several
code paths that asserted that a subscript operator always has exactly one
argument, so that they handle multiple arguments properly.
Fixes #187800.
[mlir][linalg] Fix UBSan division-by-zero in PackOp folding (#186271)
When tensor-cast folding propagates a zero constant into a dynamic tile
size, `FoldTensorCastPackOp` would proceed to fold the pack into an
`expand_shape` using that zero tile, causing undefined behaviour (integer
division/modulo by zero, caught by UBSan as SIGFPE).
1. Guard `FoldTensorCastPackOp`: bail out early if any of the resolved
tile sizes is zero, preventing the invalid fold entirely.
2. Restrict the `hasZeros` check in `commonVerifierPackAndUnPackOp` to
only inspect `Attribute` operands (statically-known zeros), not dynamic
`Value` operands. The verifier can only meaningfully reject zero tiles
that are statically visible; dynamic zeros are an inherently runtime
condition.
3. Add `assert(*constantTile != 0)` guards in `requirePaddingValue` and
`requirePaddingValueStrict` to document and enforce the precondition
[9 lines not shown]
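The guard-then-fold idea in the list above can be sketched in isolation. This is a minimal illustration, not the MLIR code: `numTiles` and `tryFoldTiledDim` are hypothetical helpers, and plain `int64_t` values stand in for resolved tile sizes. It assumes non-negative dimension sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not an MLIR API): number of tiles needed to cover
// `dimSize` elements with tiles of `tileSize` elements.
// Precondition: tileSize != 0, documented and enforced with an assert.
inline int64_t numTiles(int64_t dimSize, int64_t tileSize) {
  assert(tileSize != 0 && "tile size must be non-zero");
  return (dimSize + tileSize - 1) / tileSize;  // ceiling division
}

// Hypothetical fold: returns std::nullopt ("do not fold") when the resolved
// tile size is zero, so the division above is never reached with a zero.
inline std::optional<int64_t> tryFoldTiledDim(int64_t dimSize,
                                              int64_t tileSize) {
  if (tileSize == 0)
    return std::nullopt;  // bail out early instead of dividing by zero
  return numTiles(dimSize, tileSize);
}
```

The early return plays the role of the bail-out in item 1, while the assert mirrors the precondition guards described in item 3.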
AArch64 Tablegen Update (#179692)
- Use updated `.td` file instead of the `.def` file for the AArch64
builtin functions.
- The change has been tested with `clang-check` and `check-clang`, which
report no errors or warnings.
- The translation is mostly done using the translation script:
https://github.com/moodytongytong/random-python-script
Signed-off-by: moodytongytong
---------
Co-authored-by: Lang Hames <lhames at gmail.com>
Co-authored-by: Jackson Stogel <jtstogel at gmail.com>
Co-authored-by: Pranav Kant <prka at google.com>
[clang] Add -Wunused-but-set-global (#188291)
Commit fd11cf430e5a extended `-Wunused-but-set-variable` to static
globals. To make it easier for downstream projects to integrate this new
functionality, this commit introduces `-Wunused-but-set-global` so it
can be easily disabled as projects investigate and fix new findings.
[flang][OpenMP] Check if loop nest/sequence is well-formed (#188025)
Check if the code associated with a nest or sequence construct is well
formed. Emit diagnostic messages if not.
Make a clearer separation for checks of loop-nest-associated and loop-
sequence-associated constructs.
Unify structure of some of the more common messages.
Issue: https://github.com/llvm/llvm-project/issues/185287
[LoongArch][RISCV] Fix incorrect indexing of incoming byval arguments in tail call eligibility check (#188006)
The loop that validates byval arguments in
`isEligibleForTailCallOptimization()` incorrectly used the loop index
`i` when accessing `getIncomingByValArgs()`, even though `j` is the
index tracking the number of encountered byval arguments.
This mismatch could lead to out-of-bounds access or incorrect type
comparisons when non-byval arguments are interleaved with byval
arguments, causing the tail call eligibility check to fail or behave
incorrectly.
Fix this by using `j` consistently as the index into the incoming byval
argument list and only incrementing it after the bounds check.
This issue affects both LoongArch and RISCV backends, which share the
same logic pattern.
Fixes #187832
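The indexing pattern described above can be illustrated with a simplified, standalone sketch. The `OutgoingArg` struct and `byValArgsMatch` function are hypothetical and compare only sizes; the real backend check works on different data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an outgoing call argument.
struct OutgoingArg {
  bool isByVal;        // true if the argument is passed byval
  unsigned byValSize;  // size in bytes, meaningful only when isByVal is true
};

// Returns true when every outgoing byval argument matches the size of the
// corresponding incoming byval argument. `i` walks all outgoing arguments;
// `j` counts only the byval arguments seen so far and is the only index used
// for the incoming byval list.
inline bool byValArgsMatch(const std::vector<OutgoingArg> &outs,
                           const std::vector<unsigned> &incomingByValSizes) {
  std::size_t j = 0;
  for (std::size_t i = 0; i < outs.size(); ++i) {
    if (!outs[i].isByVal)
      continue;  // non-byval arguments do not advance j
    if (j >= incomingByValSizes.size())
      return false;  // bounds check before any access
    if (incomingByValSizes[j] != outs[i].byValSize)
      return false;
    ++j;  // increment only after the bounds check and comparison
    assert(j <= incomingByValSizes.size());
  }
  return true;
}
```

With interleaved arguments such as non-byval, byval(8), non-byval, byval(16), indexing the incoming list with `i` would read positions 1 and 3, whereas `j` correctly reads positions 0 and 1.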
[SLP] Mark candidate instruction as reduced value, if it is the operand of another reduced value
If the next candidate is an operand of one of the reduced value
candidates, it should also be marked as a reduced value, not as a
reduction operation, even if all other requirements are met.
This reduces compile time.
Reviewers: hiraditya, RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/188103
[Offload][OpenMP][libdevice] Make check to enter state machine architecture dependent (#188144)
The genericStateMachine call uses synchronize::thread, which is expected
to be implemented using a workgroup-level barrier.
On some architectures, a race condition can occur if threads in the same
warp as the main thread reach the barrier, so there is currently a
condition that keeps some threads from entering the state machine.
On Intel GPUs, however, all threads must reach the barrier for it to
complete; otherwise the threads in the state machine never make
progress.
This PR moves the condition into an architecture-dependent config so it
can work correctly for both kinds of hardware.
science/fastcap: update 2.0wr-011109 -> 2.0wr-010524
Changelog: https://people.freebsd.org/~rodrigo/fastcap-2.0wr-010524.changelog.txt
Major changes:
* Fixed an old fastcap bug that could seg. fault
* Rewrote the Makefile/config system to match FastHenry
Port changes:
* move to gmake and GCC for the build, instead of trying to patch for llvm
[flang][mlir][OpenMP] Add linear modifier (val, ref, uval) (#187142)
Add support for OpenMP linear modifiers `val`, `ref`, and `uval` as
defined in OpenMP 5.2 (5.4.6).
[acc] Lower acc if with multi-block host fallback via scf.execute_region (#188350)
Handle multi-block host fallback regions by wrapping them in
scf.execute_region, instead of rejecting with `not yet implemented:
region with multiple blocks`.
InstCombine: Fold out nanless canonicalize pattern (#172998)
Pattern match a wrapper around llvm.canonicalize which
weakens the semantics to not require quieting signaling
NaNs. Depending on the denormal mode and FP type, we can
either drop the pattern entirely or reduce it only to
a canonicalize call. I'm inventing this pattern to deal
with LLVM's lax canonicalization model in math library
code.
The math library code currently has explicit checks for
the denormal mode, and conditionally canonicalizes the
result if there is flushing. Semantically, this could be
directly replaced with a simple call to llvm.canonicalize,
but doing so would incur an additional cost when using
standard IEEE behavior. If we do not care about quieting
a signaling NaN, this should be a no-op unless the denormal
mode may flush. This will allow replacement of the
conditional code with a zero cost abstraction utility
[16 lines not shown]
[DSE] Use CycleInfo instead of LoopInfo (#188253)
DSE needs to reason about cycles in order to correctly handle
loop-carried dependencies. It currently does this by using LoopInfo and
performing a separate check for irreducible control flow.
Instead, we can use CycleInfo, which is like LoopInfo but also handles
irreducible cycles.
This requires computing CycleInfo (which, unlike LoopInfo, won't be
reused by surrounding passes), but ends up being neutral in terms of
overall compile time.
[lldb] Clear up GetModuleSpecifications return value confusion (#188276)
Some plugins were returning the number of specifications they had
added, while others were returning the total final number. Particularly
devious plugins (Minidump) were clearing the specification list
altogether. This resulted in nondeterministic failures (depending on
plugin initialization order) in TestSBModule.
This PR defines the problem away by having each plugin only return the
specifications it is responsible for. If the caller wants to merge them,
it is free to do so. This *might* be slightly less efficient, but this is
hardly hot code.
I'm not touching the ObjectFile::GetModuleSpecifications function (the
caller of all these functions) as the PR is big enough, although the
same approach might be warranted there as well.
Fixes https://github.com/llvm/llvm-project/issues/178625.
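The "each plugin returns only its own specifications" contract can be sketched as follows. This is an illustration under assumptions, not the lldb API: `SpecList`, `Plugin`, the three plugin functions, and `collectSpecs` are all hypothetical names, and strings stand in for module specifications.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins: a specification is just a string here.
using SpecList = std::vector<std::string>;
using Plugin = SpecList (*)();

// Each hypothetical plugin returns only the specifications it is responsible
// for. None of them sees or modifies a shared list.
inline SpecList elfPlugin() { return {"a.out (x86_64)"}; }
inline SpecList machoPlugin() { return {}; }
inline SpecList minidumpPlugin() { return {"core.dmp (x86_64)"}; }

// The caller decides whether and how to merge. Because no plugin can clear
// or count entries added by another, the merged contents do not depend on
// which plugins ran earlier (only the order of entries follows call order).
inline SpecList collectSpecs(const std::vector<Plugin> &plugins) {
  SpecList merged;
  for (Plugin plugin : plugins) {
    assert(plugin != nullptr && "plugin callback must be set");
    SpecList own = plugin();
    merged.insert(merged.end(), own.begin(), own.end());
  }
  return merged;
}
```

Calling `collectSpecs` with the plugins in any order yields the same number of specifications, which is the property the shared-list design did not guarantee.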