[WebAssembly] Reapply "[WebAssembly] Incorporate SCCs into WebAssemblyFixIrreducibleControlFlow" (#181755) (#184441)
Re-application of #181755.
Includes fixes to issues found after the original's merge.
[lldb] Use "assemble" instead of "compile" in formatter_bytecode.py (#184714)
Replace "compile" with "assemble" in formatter_bytecode. This is in
preparation for the addition of a Python to formatter bytecode compiler.
Having a single meaning for "compile" will be clearer.
[Hexagon] Add missing early architecture features to V81 processor (#183499)
V81 was missing ArchV5, ArchV55, ArchV60, and ArchV62 in its feature
list, causing instructions requiring these architecture versions to fail
during compilation.
(cherry picked from commit 6f736e2cd9933aed1c8b72f2fe370f7d60bd3709)
[lld][Hexagon] Fix findMaskR8 missing duplex support (#183936)
findMaskR8() lacked an isDuplex() check, unlike findMaskR6(),
findMaskR11(), and findMaskR16() which all handle duplex instructions.
When the assembler generates R_HEX_8_X on a duplex SA1_addi instruction
(e.g. `{ r0 = add(r0, ##target); memw(r1+#0) = r2 }`), the wrong mask
0x00001fe0 placed relocation bits at [12:5] instead of [25:20],
corrupting the low sub-instruction (e.g. memw became memb).
Add the isDuplex() check returning 0x03f00000, and add a comprehensive
test covering all duplex instruction x relocation type combinations
across findMaskR6, findMaskR8, findMaskR11, and findMaskR16.
(cherry picked from commit 9105d9c24949d8cf9b740cb874027351e7230e70)
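To illustrate why the mask choice matters, here is a minimal sketch of scattering a relocation value into the set bits of a mask. The helper name `scatterBits` and the constant names are hypothetical and this is not lld's actual code; only the two mask values come from the description above.

```cpp
#include <cstdint>

// Hypothetical helper: deposit the low bits of `value` into the positions
// where `mask` has a 1 bit, filling the lowest set bit first.
constexpr uint32_t scatterBits(uint32_t mask, uint32_t value) {
  uint32_t result = 0;
  unsigned next = 0; // index of the next unused bit of `value`
  for (unsigned bit = 0; bit < 32; ++bit) {
    if ((mask >> bit) & 1u) {
      result |= ((value >> next) & 1u) << bit;
      ++next;
    }
  }
  return result;
}

// Mask named as wrong in the description: selects bits [12:5].
constexpr uint32_t WrongMask = 0x00001fe0;
// Mask returned for duplex instructions after the fix: selects bits [25:20].
constexpr uint32_t DuplexMask = 0x03f00000;
```

With `DuplexMask`, only bits [25:20] of the instruction word can change, so bits [12:5] are left untouched.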
[Clang] Ensure child classes export inherited constructors from base classes (#182706)
Inherited constructors in `dllexport` classes are now exported for ABI-compatible cases,
matching MSVC behavior. Constructors with variadic arguments or callee-cleanup
parameters are not yet supported and produce a warning.
This aims to partially resolve https://github.com/llvm/llvm-project/issues/162640.
Assisted by : Cursor // Claude Opus 4.6
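A minimal sketch of the basic case described above, assuming a plain by-value `int` parameter counts as ABI-compatible. `EXPORT_API`, `Base`, `Derived`, and `makeDerived` are hypothetical names used only for illustration.

```cpp
// EXPORT_API is a hypothetical macro so the sketch also compiles on
// non-Windows targets, where it expands to nothing.
#if defined(_WIN32)
#define EXPORT_API __declspec(dllexport)
#else
#define EXPORT_API
#endif

struct EXPORT_API Base {
  explicit Base(int v) : value(v) {}
  int value;
};

// `using Base::Base;` makes Derived inherit Base(int). Per the description
// above, this is the kind of inherited constructor that is now exported.
struct EXPORT_API Derived : Base {
  using Base::Base;
};

// Constructs a Derived through the inherited constructor.
int makeDerived(int v) { return Derived(v).value; }
```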
Fix a bug in the watchpoint callback - in one case we weren't
returning anything from the callback. Fixing this on the off
chance that is what is causing the linux-only failure in this test
after PR:
https://github.com/llvm/llvm-project/pull/184272
[clang] Don't use `VarDecl` of local variables as `ManglingContextDecl` for lambdas (#179035)
Currently, in a C++20 modules context, a `VarDecl` of a local variable
can wrongly end up as a `ManglingContextDecl` for a lambda.
Fix this by removing `ContextKind::NonInlineInModulePurview` in
`Sema::getCurrentMangleNumberContext` and add
`IsExternallyVisibleInModulePurview` checks in the appropriate places:
- For externally visible functions defined in a module purview, add a
check to `isInInlineFunction`, renaming it to
`IsInFunctionThatRequiresMangling`
- For externally visible variables defined in a module purview, add a
new `ContextKind::ExternallyVisibleVariableInModulePurview` and an
appropriate check to the `VarDecl` case
Fixes #178893
---------
[4 lines not shown]
[C++20] [Modules] Set ManglingContextDecl when we need to mangle a lambda but it's nullptr (#177899)
Close https://github.com/llvm/llvm-project/issues/177385
The root cause is that when we decide to mangle a lambda in a
module interface while the ManglingContextDecl is nullptr, we did not
update ManglingContextDecl, so later uses of ManglingContextDecl
saw an invalid value.
(cherry picked from commit 772b15b3be153b1d2df910057af17926ea227243)
[flang][OpenMP] Utilities to get uppercase directive/clause names
It is a convention to use uppercase names of directives and clauses in
diagnostic messages, but getting such names is somewhat cumbersome:
```
parser::ToUpperCaseLetters(llvm::omp::getOpenMPDirectiveName(dirId));
parser::ToUpperCaseLetters(llvm::omp::getOpenMPClauseName(clauseId));
```
Implement `GetUpperName` (overloaded for clauses and directives) to
shorten it to
```
GetUpperName(dirId, version);
GetUpperName(clauseId, version);
```
This patch replaces existing instances of this pattern, adding the
use of OpenMP version where it was previously missing.
Revert "[flang][OpenMP] Fix lowering of LINEAR iteration variables" (#184843)
Reverts llvm/llvm-project#183794
It broke a couple of tests from Fujitsu testsuite.
[SPIRV] Replace `removeFromParent` with `eraseFromParent` for `ASSIGN_TYPE` (#184793)
The `ASSIGN_TYPE` instruction should not be referenced anymore at this
point. So we can free its memory.
Follow up of https://github.com/llvm/llvm-project/pull/182330
[green dragon] skip trigger on release branch stage1 RA jobs (#184653)
* This will allow us to set up clang-stage1-RA jobs as multi-branch
pipelines for release branches without setting up all the downstream
jobs.
I will need to cherry-pick the jenkinsfile to the release/22.x branch,
which I will do after this lands.
[clang][bytecode][HLSL][Matrix] Support `ConstantMatrixType` and more HLSL casts in the new constant interpreter for basic matrix constexpr evaluation in HLSL (#184840)
Forgot to change the target branch before merging. This PR is a
cherry-pick of the squashed-and-merged PR commit
b16aa4b7ec665911c74300cd7442659b70973d13 from #183424.
This PR fixes #182963
This PR is an extension of #178762 which has already been merged.
This PR adds support for `ConstantMatrixType` and the HLSL casts
`CK_HLSLArrayRValue`, `CK_HLSLMatrixTruncation`,
`CK_HLSLAggregateSplatCast`, and `CK_HLSLElementwiseCast` to the
bytecode constexpr evaluator.
The implementations of CK_HLSLAggregateSplatCast and
CK_HLSLElementwiseCast are incomplete, as they still need to support
struct and array types to enable use of the experimental new constant
interpreter on other existing HLSL constexpr tests. The completion of
the implementations of these casts will be tracked in a separate issue
[2 lines not shown]
[Clang] Fix 'gpuintrin.h' implementation of 'match_all'
Summary:
This implementation only worked if the lane mask passed in was uniform,
which goes against the expected usage: the user may want to check
whether a value is uniform *within* a mask subset. Also remove the
redundant sync_lanes; the ballots and shuffles already have
synchronizing behavior.
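The intended semantics can be modeled on the host as follows. This is only a sketch: `matchAllModel` and `NumLanes` are hypothetical names, the 64-lane width is an assumption, and returning the mask on success and 0 otherwise is assumed here rather than taken from the header.

```cpp
#include <array>
#include <cstdint>

constexpr unsigned NumLanes = 64; // assumed width for this model

// Host-side model: returns `laneMask` if every lane selected by `laneMask`
// holds the same value, and 0 otherwise. Lanes outside the mask are ignored,
// so the check is about uniformity *within* the mask subset.
uint64_t matchAllModel(uint64_t laneMask,
                       const std::array<uint32_t, NumLanes> &values) {
  bool haveFirst = false;
  uint32_t first = 0;
  for (unsigned lane = 0; lane < NumLanes; ++lane) {
    if (!((laneMask >> lane) & 1u))
      continue; // lane is not part of the subset being checked
    if (!haveFirst) {
      first = values[lane];
      haveFirst = true;
    } else if (values[lane] != first) {
      return 0; // two selected lanes disagree
    }
  }
  return laneMask;
}
```

For example, if lanes 1 and 2 hold the same value but lane 3 differs, a mask selecting only lanes 1 and 2 matches, while a mask that also selects lane 3 does not.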
[libc] Rework slab cache data structure for GPU allocator
Summary:
This was previously a Treiber stack, which is a perfectly fine generic
and lock-free data structure. However, it used some expensive CAS
operations and had issues with ABA. Because the only user was
the slab cache mechanism, we can pretty safely specialize it. Instead,
we simply search a fixed-size buffer for sentinel values and CAS
into it.
For allocations that only ever hit the cache, this improves performance
from ~9000 cycles to ~6000 cycles, with similar improvements for
workloads in which a small number of threads hit the cache.
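The general idea of scanning a fixed-size buffer and claiming a slot with CAS can be sketched with `std::atomic`. This is not the libc GPU allocator's code: `SlotCache`, `tryPush`, and `tryPop` are hypothetical names, and using `nullptr` as the sentinel for an empty slot is an assumption.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical fixed-size cache: each slot holds either a cached pointer or
// the sentinel nullptr, which marks the slot as empty.
template <typename T, std::size_t N> class SlotCache {
  std::array<std::atomic<T *>, N> slots{};

public:
  // Scan for an empty slot and try to claim it with a single CAS.
  bool tryPush(T *ptr) {
    for (auto &slot : slots) {
      T *expected = nullptr;
      if (slot.compare_exchange_strong(expected, ptr,
                                       std::memory_order_release,
                                       std::memory_order_relaxed))
        return true;
    }
    return false; // every slot was occupied
  }

  // Scan for an occupied slot and try to swap the sentinel back in.
  T *tryPop() {
    for (auto &slot : slots) {
      T *current = slot.load(std::memory_order_relaxed);
      if (current != nullptr &&
          slot.compare_exchange_strong(current, nullptr,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed))
        return current;
    }
    return nullptr; // nothing cached
  }
};
```

Because no slot stores a link to another slot, a successful CAS never depends on a previously read `next` pointer, which is where a linked stack is exposed to ABA.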
Refactor and support multiple affinity register for a task
- Support multiple affinity register for a task
- Move iterator loop generation logic to OMPIRBuilder
- Extract iterator loop body conversion logic
- Refactor buildAffinityData by hoisting the creation of affinity_list
- IteratorsOp -> IteratorOp
- Add mlir to llvmir test
Implement lowering for omp.iterator in affinity
Create IteratorLoopNestScope for building the nested loops of an iterator.
Take advantage of RAII so that each level of the loop gets a correct
exit.
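The RAII idea can be sketched independently of MLIR: each scope object emits its loop entry on construction and its exit on destruction, so exits are produced innermost first. `LoopScope` and `emitNest` are hypothetical names, and the string log stands in for real IR emission.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard: records "enter" when a loop level is opened and
// "exit" when the guard is destroyed.
class LoopScope {
  std::vector<std::string> &log;
  std::string name;

public:
  LoopScope(std::vector<std::string> &log, std::string name)
      : log(log), name(std::move(name)) {
    log.push_back("enter " + this->name);
  }
  ~LoopScope() { log.push_back("exit " + name); }

  LoopScope(const LoopScope &) = delete;
  LoopScope &operator=(const LoopScope &) = delete;
};

// Builds a two-level nest. The destructors run in reverse order of
// construction, so the inner loop is closed before the outer one.
std::vector<std::string> emitNest() {
  std::vector<std::string> log;
  {
    LoopScope outer(log, "i");
    LoopScope inner(log, "j");
    log.push_back("body");
  }
  return log;
}
```

The appeal of this shape is that an early return or an added nesting level cannot leave a loop level without its exit.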
[mlir][llvmir][OpenMP] Translate affinity clause in task construct to llvmir
Translate affinity entries to LLVMIR by passing affinity information to
createTask (__kmpc_omp_reg_task_with_affinity is created inside PostOutlineCB).
[clang][deps] Simplify by-module-name scan API (#184376)
The by-module-name scanning APIs are fairly spread out. There's the main
`CompilerInstanceWithContext` class that provides a constructor,
`initialize()` and `computeDependencies()`. Then there's the
`DependencyScanningWorker` that optionally owns
`CompilerInstanceWithContext` and re-exposes two `initialize()` and one
`computeDependencies()` functions. Lastly, there's
`DependencyScanningTool` that again re-exposes two variants of
`initialize()` and one `computeDependencies()`.
The current setup makes it unnecessarily difficult to make changes to
these APIs (as observed in
https://github.com/swiftlang/llvm-project/pull/12453).
This PR makes `CompilerInstanceWithContext` standalone, and hides the
construct + initialize pattern behind a static factory function. This
makes it harder to misuse the API (forgetting to call `initialize()`,
calling it twice, etc.) and means changes now only need to touch a single
class instead of three classes spread over multiple files.
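The construct + initialize pattern hidden behind a static factory can be sketched as below. This is not the actual clang API: `ScannerContext`, `create`, and the string-based checks are hypothetical stand-ins chosen to keep the example self-contained.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical class: the constructor and initialize() are private, so the
// only way to obtain an instance is through create().
class ScannerContext {
public:
  // Constructs and initializes in one step; returns nullptr on failure.
  static std::unique_ptr<ScannerContext> create(std::string workingDir) {
    std::unique_ptr<ScannerContext> ctx(
        new ScannerContext(std::move(workingDir)));
    if (!ctx->initialize())
      return nullptr;
    return ctx;
  }

  // Placeholder for the real work: here it only rejects empty names.
  bool computeDependencies(const std::string &moduleName) const {
    return !moduleName.empty();
  }

private:
  explicit ScannerContext(std::string dir) : workingDir(std::move(dir)) {}

  // Placeholder initialization: fails when no working directory is given.
  bool initialize() { return !workingDir.empty(); }

  std::string workingDir;
};
```

A caller cannot forget `initialize()` or call it twice, because it is not reachable from outside the class.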