[lldb][test] Fix Makefile for TestInlineSourceFiles.py (#194078)
The test did not build hidden.o with a custom target triple when
specified. Use CFLAGS from Makefile.rules to fix.
[Offload] Make kernel dynamic memory handling more generic (#194403)
Make sure we do not get unexpected NumThreads and NumBlocks values when
launching non-bare kernels, and generalize the computation of the
dynamic block memory allocation to handle multi-dimensional blocks.
The DynBlockMem fallback is never used in a non-bare context where
`NumBlocks[1]` and `NumBlocks[2]` are not 1, so the code was correct.
This patch makes that assumption explicit and future-proofs the code
in case we decide to allow multi-dimensional blocks for fallback dyn
block mem in some path.
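A minimal sketch of the idea, not the patch itself. It assumes the fallback size is a per-block byte count multiplied by the number of blocks across all three grid dimensions, and that the non-bare precondition can be expressed as an assertion. `Dim3`, `totalBlocks`, `dynBlockMemBytes`, `BytesPerBlock`, and `IsBareKernel` are hypothetical names; only `NumBlocks` comes from the message above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Dim3 = std::array<uint32_t, 3>;

// Number of blocks in the whole grid, counting all three dimensions.
constexpr uint64_t totalBlocks(const Dim3 &NumBlocks) {
  return uint64_t(NumBlocks[0]) * NumBlocks[1] * NumBlocks[2];
}

// Hypothetical fallback computation: BytesPerBlock for every block.
// Assumption: non-bare launches only ever use the first dimension, so
// that expectation is checked instead of being silently relied on.
inline uint64_t dynBlockMemBytes(const Dim3 &NumBlocks, uint64_t BytesPerBlock,
                                 bool IsBareKernel) {
  assert((IsBareKernel || (NumBlocks[1] == 1 && NumBlocks[2] == 1)) &&
         "non-bare kernels are expected to be one-dimensional");
  return totalBlocks(NumBlocks) * BytesPerBlock;
}
```

With a one-dimensional grid the product collapses to `NumBlocks[0] * BytesPerBlock`, which is why a one-dimensional computation would already give the right answer in that case.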
[Clang] Avoid an extra `FunctionPrototypeScope` for lambda trailing requires-clauses (#194068)
`ParseTrailingRequiresClause` currently always creates a synthetic
`FunctionPrototypeScope`. This is needed for ordinary function
declarators whose prototype scope has already ended, but it is wrong
for lambda trailing requires-clauses because they are parsed while the
lambda prototype scope is still active.
The extra counted scope gives parameters in nested requires-expressions
an incorrect function scope depth. Split the synthetic prototype-scope
setup from the trailing requires-clause parser so the lambda path can
parse the clause in the existing prototype scope.
Fixes: #123854
Fixes: #100774
[CIR] Fix eraseOp assertion in TryOp flattening with unreachable handlers (#193615)
When a try block has catch handlers but no throwing calls, the handler
regions are unreachable and the TryOp is erased. However, ops inside the
handler regions may reference values that were inlined from the try body
into the parent block, causing an assertion in `eraseOp` ("expected that
op has no uses").
This drops all defined value uses from handler regions before erasing
the TryOp.
Made with [Cursor](https://cursor.com)
[CIR] Emit frexp, modf, and powi builtins as library calls (#193795)
`__builtin_frexpf`, `__builtin_modf`, `__builtin_powi`, and related
builtins were incorrectly falling through to the `__builtin_isnan`
handler in `emitBuiltinExpr`, which calls `createBoolToInt` /
`createIsFPClass`. This produced a `cir.cast` with an integer result
type when the actual return type is floating-point, failing CIR
verification.
Break out of the switch so these builtins fall through to the
`isLibFunction()` path, which emits them as regular library calls.
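The control-flow shape of the bug can be illustrated with a small stand-alone switch. This is only a sketch: the `Builtin` enum, `lowerBuggy`, `lowerFixed`, and the returned strings are hypothetical stand-ins and do not reflect the real `emitBuiltinExpr` code.

```cpp
#include <string>

// Hypothetical subset of builtins, for illustration only.
enum class Builtin { Frexp, Modf, Powi, IsNan };

// Buggy shape: the frexp/modf/powi labels share the isnan handler.
inline std::string lowerBuggy(Builtin B) {
  switch (B) {
  case Builtin::Frexp:
  case Builtin::Modf:
  case Builtin::Powi:
  case Builtin::IsNan:
    return "fp-class check";
  }
  return "library call";
}

// Fixed shape: break out of the switch so the shared code after it
// (standing in for the library-call path) handles these builtins.
inline std::string lowerFixed(Builtin B) {
  switch (B) {
  case Builtin::Frexp:
  case Builtin::Modf:
  case Builtin::Powi:
    break;
  case Builtin::IsNan:
    return "fp-class check";
  }
  return "library call";
}
```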
Made with [Cursor](https://cursor.com)
[mlir][spirv] Add missing capabilities for CoopMatrix in TypeExtensionVisitor (#193803)
This adds missing capabilities when CoopMatrix is used with bf16 and
fp8.
Assisted-by: Codex
Don't pass RecipeBuilder
The legacy path calls `setRecipe` on all processed recipes but only queries
`getRecipe` for memory operations, which we don't touch during scalarization
because it happens after all memory recipes have been processed.
[VPlan] Scalarize to first-lane-only directly on VPlan
This is needed to enable subsequent https://github.com/llvm/llvm-project/pull/182595.
I don't think we can fully port all scalarization logic from the legacy
path to VPlan-based right now because that would require us to introduce
interleave groups much earlier in VPlan pipeline, and without that we
can't really `assert` this new decision matches the previous CM-based
one. And without those `assert`s it's really hard to ensure we properly
port all the previous logic.
As such, I decided to implement something much simpler that is enough
for #182595. However, we perform this transformation before delegating
to the old CM-based decision, so it **is** effective immediately and
takes precedence even for consecutive loads/stores right away.
Depends on https://github.com/llvm/llvm-project/pull/182592 but is stacked on
top of https://github.com/llvm/llvm-project/pull/182594 to enable linear
stacking for https://github.com/llvm/llvm-project/pull/182595.
[WebAssembly] Support f16x8.demote_f32x4_zero (#193564)
Add support for the f16x8.demote_f32x4_zero instruction. This
instruction converts a v4f32 vector to a v4f16 and pads the result with
zeros to fill the 128-bit register.
This enables efficient lowering of fptrunc operations from v4f32 to
v4f16 when the result is zero-extended or when only the low lanes are
needed. A DAG combine is included to recognize these patterns and fold
them into the new instruction.
[PowerPC] Simplify implementation of atomic loads (#191044)
The code for atomic loads is verbose. There are 10 different operations
and 4 memory sizes to support, which means 40 pseudo instructions are
used, with all the details repeated. This PR changes the following:
- Use a loop over the operations and the sizes to create the pseudo
instructions
- Add the memory size as the last operand of the pseudo instruction
- Update the C++ code to take advantage of the memory size in the
pseudo instruction
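The loop-based generation described in the list above can be sketched in plain C++. This is an illustration only, not the mechanism the patch uses: the `PseudoDesc` struct, the `buildPseudos` helper, the `ATOMIC_<op>_I<size>` naming scheme, and the bit widths in the usage note are all assumptions.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one generated pseudo instruction.
struct PseudoDesc {
  std::string Name;
  unsigned MemSizeInBits; // carried along like a trailing operand
};

// Build one pseudo per (operation, size) pair instead of spelling
// every combination out by hand.
inline std::vector<PseudoDesc>
buildPseudos(const std::vector<std::string> &Ops,
             const std::vector<unsigned> &SizesInBits) {
  std::vector<PseudoDesc> Result;
  for (const std::string &Op : Ops)
    for (unsigned Size : SizesInBits)
      Result.push_back({"ATOMIC_" + Op + "_I" + std::to_string(Size), Size});
  return Result;
}
```

Passing 10 operation names and 4 sizes yields the 40 entries mentioned above, and code that consumes an entry can read the size from the entry itself rather than inferring it from the name.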
[SPIR-V][NewPM] Register SPIRVPrepareFunctions and SPIRVPrepareGlobals with the new pass manager (#194024)
Rename the legacy pass IDs to spirv-prepare-functions and
spirv-prepare-globals for consistency with the other SPIR-V passes and
add opt-driven lit tests for both passes.
[DirectX] Emit unresolved ptr as i8* (#192086)
We cannot use dxilOpaquePtrReservedName in this test as that is the
wrong type for the null initializer.
[PowerPC] Enable using HwMode for instructions (#191051)
The HwMode is already used for operands representing an effective
address. It can also be used for general purpose registers but this is
not clear from the naming. This change
- introduces the hw-mode dependent register class `GxRC`, and the
associated register operands
- removes register class `ptr_rc_idx_by_hwmode`, and replaces the only
use with `gxrc`
- uses the `EQV` instruction as an example of how to use the new class
[lldb-server] Implement support for MultiBreakpoint packet
This is fairly straightforward, thanks to the helper functions created
in the previous commit.
https://github.com/llvm/llvm-project/pull/192910
[libc] Add sys/ucontext.h header (#194329)
POSIX historically provided <sys/ucontext.h> as an alias for
<ucontext.h>. Some software still includes the sys/ path. Added the
header as a simple wrapper that includes <ucontext.h>, gated to x86_64
alongside the existing ucontext support.