[Sema] Fix crash on invalid operator template-id (#181404)
Add checks in `GetNameFromUnqualifiedId` to handle invalid `TemplateId`
cases safely. This avoids a crash when handling an invalid template-id
during error recovery and allows normal error reporting to continue.
Fixes #177549
[LoopUnrollAndJam] Update test unroll-and-jam.ll (NFC) (#183520)
The test `unroll-and-jam.ll` has the following issues:
- Some functions use `%i` and `%I` as variable names, which UTC fails to
distinguish, causing it to update the assertions incorrectly.
- Some tests use parameters for loop bounds, which means they will start
failing in the near future due to the ongoing changes in DA.
To address these issues, this patch updates the test as follows:
- Renames certain variables to avoid the naming conflict.
- For the tests that will be affected by the DA changes, adds variants
with constant loop bounds.
[MLIR][Presburger][NFC] Don't add empty regions when unioning PWMA functions (#182468)
This will prevent exponential behaviour in lexicographic maximum
computation, where the `tiebreak` predicate is very likely to return
empty regions.
[LoopUnrollAndJam] Update test dependencies.ll (NFC) (#183509)
Recent ongoing work to fix correctness issues in DA will affect some
existing regression tests for passes that rely on it. As a result, the
original intent of several tests would be lost.
This patch updates `dependencies.ll` to avoid such issues and preserve
its intent. Specifically, this patch changes the loop bounds from
parameters to constants, which allows SCEV to infer no-wrap flags for
the addrecs. This patch also fixes other minor issues in the test, such
as adding pseudo-code comments and removing some `nuw` flags to avoid UB.
[CIR] Remove branch through cleanup fixups (#182953)
Because we are using a structured representation of cleanups in CIR, we
don't need to handle branching through cleanups during codegen. These
branches are created during CFG flattening instead. However, we had
already committed some code that copied the classic codegen behavior for
branching through cleanups. This change deletes that unneeded code.
The most significant change here is that when we encounter a return
statement we emit the return directly in the current location.
The coroutine implementation still creates a return block in the current
lexical scope and branches to that block. Cleaning up that
representation is left as future work.
The `popCleanupBlock` handling still has a significant amount of logic
that is carried over from the classic codegen implementation. It is left
in place until we can be sure we won't need it.
[MLIR][Python] Support op adaptor for Python-defined operations (#183528)
Previously, in #177782, we added support for dialect conversion and
generated an `OpAdaptor` subtype for every ODS-defined operation. In
this PR, we also generate `OpAdaptor` subtypes for Python-defined
operations, so that they can be used in dialect conversion as well.
[mlir][arith-to-spirv] Fix null dereference when converting trunci/extui with tensor types (#183654)
`getScalarOrVectorConstInt` only handles `VectorType` and `IntegerType`,
returning `nullptr` for any other type (e.g., a `RankedTensorType` that
slips through after type emulation maps `tensor<Nxi16>` to
`tensor<Nxi32>` with the same destination type). The callers in
`TruncIPattern` and `ExtUIPattern` passed this null value directly to
`spirv::BitwiseAndOp::create`, causing a null-pointer dereference in
`OperandStorage`.
Similarly, the signed-extension pattern passes the result of
`getScalarOrVectorConstInt` as a shift amount to
`ShiftLeftLogicalOp::create` without a null check.
Add `if (!mask)` / `if (!shiftSize)` guards that return a match
failure in all three cases, converting the crash into a proper
legalization failure.
Fixes #178214
[VPlan] Process instructions in reverse order when widening
It doesn't matter right now because we're using CM's decision, but
https://github.com/llvm/llvm-project/pull/182595 introduces some
scalarization (first-lane-only) opportunities that aren't known in CM,
and those require reverse iteration order to support, since they are
determined by VPUsers rather than operands.
[MLIR] Do not abort on invalid --mlir-debug-counter values (#181751)
Use `cl::Option::error()` diagnostics for invalid `--mlir-debug-counter`
arguments and exit with status 1 (no stack dump).
Added `mlir/test/mlir-opt/debugcounter-invalid-cl-options.mlir`
covering:
- non-numeric value (`-1n`)
- missing `=`
- missing `-skip`/`-count` suffix
Fixes #180117
[flang][cuda] Add support for cudaStreamDestroy (#183648)
Add a specific lowering and entry point for `cudaStreamDestroy`. Since
we keep the associated stream for some allocations, we need to reset it
when the stream is destroyed so we don't use it anymore.
[Clang][Hexagon] Add QURT as recognized OS in target triple (#183622)
Add support for QURT as a recognized OS type in the LLVM triple
system, and define the `__qurt__` predefined macro when targeting it.
[scudo] Add reallocarray C wrapper. (#183385)
`reallocarray()` is a POSIX extension to the C standard that wraps the
`realloc` function and adds `calloc`-like overflow detection. It is
available in glibc and some other standard library implementations. Add
`reallocarray` to the list of Scudo C wrappers, so that code that
depends on `reallocarray` being present continues to work.
build: correct `MSVC` and Windows mixup for `CLANG_BUILD_STATIC` (#183609)
The build incorrectly used `MSVC` to determine that we were building for
Windows (MS ABI). This prevented the use of the GNU driver for building
LLVM for Windows. Adjust the condition to `WIN32 AND NOT MINGW` to
correctly identify that we are building for the Windows MS ABI.
[scudo] Change header tagging for the secondary allocator (#182487)
When the secondary allocator allocates a new chunk, the allocation is
prepended with a chunk header (shared with the primary allocator)
and a large header (used only by the secondary).
Only the headers are tagged, not the data, and the headers are
tagged individually because different tags are used for them.
In the current implementation, while tagging the large header the unused
area is tagged along with it, so the allocator can tag up to a page (in
the worst case), which is costly and brings no security benefit (as the
area is unused).
With this fix we get rid of around 97-98% of the tagging for the
secondary allocator, measured with random benchmarks.
Co-authored-by: Christopher Ferris <cferris1000 at users.noreply.github.com>
[AArch64] Decompose FADD reductions with known zero elements (#167313)
FADDV is matched into FADDPv4f32 + FADDPv2i32p, but this can be relaxed
when one or more elements (usually the 4th) are known to be zero.
Before:
```
movi d1, #0000000000000000
mov v0.s[3], v1.s[0]
faddp v0.4s, v0.4s, v0.4s
faddp s0, v0.2s
```
After:
```
mov s1, v0.s[2]
faddp s0, v0.2s
fadd s0, s0, s1
```
[2 lines not shown]