[libclc][NFC] Move convert builtins from Python generator to .cl sources (#172634)
Remove the Python dependency for generating convert builtins, aligning
with how other builtins are defined.
In addition, our downstream target relies on this PR to override convert
implementations.
llvm-diff shows no changes in any of the generated bitcode files:
amdgcn--amdhsa.bc, barts-r600--.bc, cayman-r600--.bc, cedar-r600--.bc,
clspv64--.bc, clspv--.bc, cypress-r600--.bc, nvptx64--.bc,
nvptx64--nvidiacl.bc, nvptx--.bc, nvptx--nvidiacl.bc, tahiti-amdgcn--.bc
and tahiti-amdgcn-mesa-mesa3d.bc.
[libc++] Add missing %{flags} substitution to clang-tidy (#171689)
Flags that should be used both for compiling and for linking are
provided through the %{flags} substitution. Our clang-tidy tests should
be using them, not only %{compile_flags}.
[CIR] Implement AggExprEmitter::VisitVAArgExpr (#172551)
This PR implements support for aggregate va_arg expressions in CIR
codegen.
## Changes
- **CIRGenBuiltin.cpp**: Modified `emitVAArg` to return a pointer type
for aggregate types. For aggregate types, `va_arg` returns a pointer to
the aggregate rather than the aggregate value itself.
- **CIRGenExprAggregate.cpp**: Implemented
`AggExprEmitter::VisitVAArgExpr` to handle aggregate va_arg expressions
by:
- Getting the va_arg pointer from `emitVAArg()`
- Creating an `Address` from the pointer with proper alignment
- Creating an `LValue` from the `Address`
- Copying the aggregate value to the destination using
`emitFinalDestCopy()`
[9 lines not shown]
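The aggregate path corresponds to source like the following minimal sketch (`Point` and `sumPoints` are illustrative names, not from the PR):

```cpp
#include <cassert>
#include <cstdarg>

// A trivially copyable aggregate passed through varargs and read back
// with va_arg -- the case the new AggExprEmitter::VisitVAArgExpr handles.
struct Point { int x, y; };

int sumPoints(int count, ...) {
  va_list ap;
  va_start(ap, count);
  int total = 0;
  for (int i = 0; i < count; ++i) {
    // For an aggregate type, the lowered va_arg yields a pointer to the
    // value, which is then copied into the destination slot.
    Point p = va_arg(ap, Point);
    total += p.x + p.y;
  }
  va_end(ap);
  return total;
}
```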
docs/ParallelMultiImageFortranRuntime: Update link to latest PRIF Specification (#172747)
The PRIF Committee is pleased to announce the publication of the
Parallel Runtime Interface for Fortran (PRIF) Specification, Revision
0.7. The latest iteration of this specification represents the efforts
of a collaborative design process involving multiple individuals across
several institutions.
The document is available here: <https://doi.org/10.25344/S46S3W>
The PRIF specification is governed by a formal PRIF Committee.
For more details, see: <https://go.lbl.gov/prif-governance>
The Committee vote to approve the technical content in this revision
began on 2025-11-24 and concluded on 2025-12-08 with unanimous approval.
The 7-day Committee comment period for cosmetic feedback began on
2025-12-08 and concluded on 2025-12-15 with no comments.
See the Change Log in Section 1 of the document for the list of changes
relative to the prior revision.
[CIR][X86] Implement lowering for `_AddressOfReturnAddress` builtin (#171974)
- Add new `CIR_AddrOfReturnAddrOp` and support lowering it to LLVMIR
- Add CIR CodeGen for `_AddressOfReturnAddress` X86 builtin
- Fix the incorrect return type of `FrameAddrOp`, and add a missing test
for `_ReturnAddress`
Part of https://github.com/llvm/llvm-project/issues/167765
[CIR] Add emitDeclInvariant for global with constant storage (#171915)
Implement emitDeclInvariant to emit llvm.invariant.start intrinsic for
global variables with constant storage. This enables optimizations by
marking when a global becomes read-only after initialization.
## Changes
- Add emitDeclInvariant and emitInvariantStart functions in
CIRGenCXX.cpp
- Add emitInvariantStart declaration in CIRGenFunction.h
- Update emitCXXGlobalVarDeclInit to call emitDeclInvariant for constant
storage globals after initialization
- Update getOrCreateCIRGlobal to set constant flag on globals with
constant storage
- Add comprehensive test covering positive and negative cases
## Implementation Details
The implementation handles address spaces correctly, dynamically
constructing the intrinsic name (e.g., invariant.start.p0,
[6 lines not shown]
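As a minimal sketch of the kind of global this targets (names are illustrative): a const-qualified global with a dynamic initializer has constant storage, but only becomes read-only once its initializer has run, which is what emitting llvm.invariant.start after initialization expresses.

```cpp
#include <cassert>

// Dynamic initializer: the value isn't known at compile time, so the
// global is initialized at program startup.
static int computeOnce() { return 6 * 7; }

// Constant storage: never written again after initialization, so the
// compiler may mark the memory invariant from that point on.
const int gAnswer = computeOnce();
```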
[CIR] Move CIR CXXABI lowering to a standalone pass (#172133)
This moves the code that handles CXXABI-specific lowering in
ConvertCIRToLLVMPass into a standalone CIR-to-CIR transform pass. The
handling of these operations was already performing a CIR-to-CIR
transformation, with the CIR operations being further lowered to the
LLVM dialect. This change makes that transformation a separate pass.
The LowerModule object in ConvertCIRToLLVMPass will be unused after this
change, but removal of that object is being deferred to a follow-up PR
to keep this change isolated to a single purpose.
---------
Co-authored-by: Sirui Mu <msrlancern at gmail.com>
[CIR] Make x86 i1 mask vectors signed (#172912)
A number of x86 builtins need to cast a mask value to a vector of i1
values. Strictly speaking, these i1 values should be signless. However,
we don't have signless types in CIR, so we have to choose whether to
represent them as signed or unsigned. It seemed natural to make them
unsigned. However, there are going to be multiple places where we want
to convert the vector of i1 to a vector of either all ones or all zeros,
and in those cases we'll need to sign-extend the vector values.
Rather than creating the vector as unsigned and casting it to signed in
the cases where we need to saturate the lane, I think it makes more
sense to just create it as signed. This change does that.
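The widening described above can be sketched like this (8-bit lanes and the names are illustrative): each i1 mask bit is sign-extended so that a set bit becomes an all-ones lane and a clear bit an all-zeros lane.

```cpp
#include <array>
#include <cstdint>

// Expand an 8-bit mask into per-lane all-ones/all-zeros bytes via
// sign extension, the operation the signed-i1 representation makes direct.
std::array<uint8_t, 8> expandMask(uint8_t mask) {
  std::array<uint8_t, 8> lanes{};
  for (int i = 0; i < 8; ++i) {
    int8_t bit = (mask >> i) & 1;           // the i1 lane value
    lanes[i] = static_cast<uint8_t>(-bit);  // sign-extend: 1 -> 0xFF, 0 -> 0x00
  }
  return lanes;
}
```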
[MLIR] Add DefaultValuedEnumAttr decorator
Introduce DefaultValuedEnumAttr, which similarly to DefaultValuedAttr
decorates an enum attribute to have a default value from a specific enum
case when not present. The default is constructed as the fully-qualified
enum case symbol.
In comparison to DefaultValuedAttr, this allows using a TableGen EnumCase
variable instead of a raw string.
[lldb/test] Fix failure caused by leading zero in TestScriptedFrameProvider.py
This should fix a test failure in TestScriptedFrameProvider.py:
https://lab.llvm.org/buildbot/#/builders/18/builds/23398/steps/6/logs/stdio
This is happening because, on 32-bit systems, addresses don't have
leading zeroes. This patch removes them from the expected output to
satisfy the checks.
Signed-off-by: Med Ismail Bennani <ismail at bennani.ma>
[RegisterCoalescer] Don't commute two-address instructions which only define a subregister (#169031)
Currently, the register coalescer may try to commute an instruction
like:
```
%0.sub_lo32:gpr64 = AND %0.sub_lo32:gpr64(tied-def 0), %1.sub_lo32:gpr64
USE %0:gpr64
```
resulting in:
```
%1.sub_lo32:gpr64 = AND %1.sub_lo32:gpr64(tied-def 0), %0.sub_lo32:gpr64
USE %1:gpr64
```
However, this is not correct if the instruction doesn't define the
entire register, as the value of the upper 32 bits of the register used
in `USE` will not be the same.
[lldb] Add priority support to synthetic frame providers (#172848)
This patch adds `get_priority()` support to synthetic frame providers to
enable priority-based selection when multiple providers match a thread.
This is the first step toward supporting frame provider chaining for
visualizing coroutines, Swift async tasks, and others.
Priority ordering follows Unix nice convention where lower numbers
indicate higher priority (0 = highest). Providers without explicit
priority return `std::nullopt`, which maps to UINT32_MAX (lowest
priority), ensuring backward compatibility with existing providers.
The implementation adds `GetPriority()` as a virtual method to
`SyntheticFrameProvider` base class, implements it through the scripting
interface hierarchy (`ScriptedFrameProviderInterface` and
`ScriptedFrameProviderPythonInterface`), and updates
`Thread::GetStackFrameList()` to sort applicable providers by priority
before attempting to load them.
[8 lines not shown]
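The selection rule can be sketched as follows (the types and names here are illustrative, not the actual lldb API): lower numbers win, and a provider without an explicit priority maps to UINT32_MAX, so it sorts last and existing providers keep their behavior.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct Provider {
  std::string name;
  std::optional<uint32_t> priority;  // std::nullopt => no explicit priority
};

// Sort providers so the highest-priority (lowest number) comes first;
// providers without a priority fall to the end via UINT32_MAX.
void sortByPriority(std::vector<Provider> &providers) {
  std::stable_sort(providers.begin(), providers.end(),
                   [](const Provider &a, const Provider &b) {
                     return a.priority.value_or(UINT32_MAX) <
                            b.priority.value_or(UINT32_MAX);
                   });
}
```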
[AArch64] Make IFUNC opt-in rather than opt-out. (#171648)
IFUNCs require loader support, so for arbitrary environments, the safe
assumption is to assume that they are not supported. In particular,
aarch64-linux-pauthtest may be used with musl, and was wrongly detected
as supporting IFUNCs.
With IFUNC support now being detected more reliably, this also removes
the check for PAuth support. If both are supported, either would work.
[DirectX] Avoid precalculating GEPs in DXILResourceAccess (#172720)
Instead of trying to precalculate GEP offsets ahead of time and then
process resource accesses based off of these offsets, traverse the GEP
chain inline for each access. This makes it easier to get the types
correct when translating GEPs for cbuffer and structured buffer
accesses, which in turn lets us access individual elements of those
structures directly.
Fixes #160208, #164517, and #169430
[LV] Add select cost test with negated condition. (NFC)
Add additional test coverage for select with negated condition.
Currently we overestimate the cost, because the negation can be folded
in the compare.
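The pattern in question looks like this minimal sketch (function name is illustrative): the negation folds into the compare itself, since `!(a < b)` is just `a >= b`, so costing the negation separately overestimates.

```cpp
// A select whose condition is a negated compare. The logical not
// disappears into the inverted predicate, so no extra instruction
// is needed beyond the compare and select.
int selectNegated(int a, int b, int x, int y) {
  return !(a < b) ? x : y;  // equivalent to (a >= b) ? x : y
}
```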
[mips][micromips] Add mayRaiseFPException to appropriate instructions, mark all instructions that read FCSR (FCR31) rounding bits as doing so (#170322)
[CIR] Combine effectively duplicate getMaskVecValue functions (#172896)
We had two functions, `getMaskVecValue` and `getBoolMaskVecValue` that
were both ported from the `GetMaskVecValue` in classic codegen.
`getBoolMaskVecValue` was bitcasting an X86 mask value to a vector of
`cir.bool` whereas `getMaskVecValue` was casting it to a vector of 1-bit
integers. While we do generally want to represent boolean values as
`cir.bool`, I don't think it makes sense to bitcast an X86 mask to a
vector of `cir.bool`. These just don't correspond.
Eliminating the boolean variant of this function also required updating
`emitX86Select` because that function was creating a `cir.select` op,
which requires a boolean argument and does not accept a vector of i1.
This probably should have been using `cir.vec.ternary` all along.