[VPlan] Expand VPExpandSCEVRecipes to VPInstructions before CSE. (#197643)
Add expandSCEVExpressions transform that converts VPExpandSCEVRecipes
to VPInstructions where possible, running before CSE so duplicates with
other SCEV expansions (e.g., from addMinimumIterationCheck) are
eliminated. This also reuses existing loop-invariant IR values via
ScalarEvolution::getSCEVValues to avoid redundant computation.
Currently limited to SCEVMulExpr (along with constants, unknowns, and
vscale). Support for SCEVAddExpr and SCEVUDivExpr will follow in
subsequent patches.
Depends on https://github.com/llvm/llvm-project/pull/189455
PR: https://github.com/llvm/llvm-project/pull/197643
[flang][docs] Documented IS_CONTIGUOUS() extension for constant arrays (#200451)
Flang considers constant objects and subobjects of constant objects
contiguous, even in cases where other compilers may consider them
non-contiguous. Documented the extension.
Fixes #199878
Add yamlobj roundtrip tests for SM 6.8 features (#198403)
This PR adds DXContainer tests for two Shader Model 6.8 features:
SampleCmpGradientOrBias and ExtendedCommandInfo.
Assisted by: GitHub Copilot
Fixes https://github.com/llvm/llvm-project/issues/83177
Process BACKUP in vrrp rapid-succession branch
When VrrpEventThread saw a second rapid event after waiting
rapid_event_settle_time, it dropped the latest queued event and
logged a warning. On boot-time keepalived flaps where the
MASTER->BACKUP gap floors below max_wait, that drop swallowed
the only BACKUP signal middleware was going to see, so
vrrp_backup never ran.
Fire the hook for BACKUP (skipping if vrrp_backup already ran
this process lifetime, tracked via a new LAST_EVENT_TYPE
attribute on FailoverEventsService); keep the drop+warn for
MASTER, since acting on an unsettled MASTER would kick off
fenced + zpool import.
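A minimal sketch of the branch described above, in Python. The names
FailoverState and handle_rapid_event and the returned action strings are
hypothetical, and the exact semantics of LAST_EVENT_TYPE are assumed; this
only illustrates the decision, not the middleware's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverState:
    # Hypothetical stand-in for the LAST_EVENT_TYPE attribute mentioned in
    # the commit message; None means no hook has run in this process yet.
    last_event_type: Optional[str] = None


def handle_rapid_event(state: FailoverState, event_type: str) -> str:
    """Decide what to do with a second event seen after the settle wait.

    Returns one of "fire", "skip", or "drop" (labels invented for this sketch).
    """
    if event_type == "BACKUP":
        if state.last_event_type == "BACKUP":
            # The BACKUP hook already ran in this process lifetime.
            return "skip"
        state.last_event_type = "BACKUP"
        return "fire"
    # MASTER (or anything else) keeps the old behaviour: drop and warn,
    # because acting on an unsettled MASTER is the expensive, risky path.
    return "drop"
```

The asymmetry is the point of the design: running the BACKUP hook once too
often is cheap, while a spurious MASTER would start fencing and pool import.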
Update clzdi2.c to pull in the following commit, needed because of a
change in clang brought in with the llvm 22 update:
commit 5d0e26e571c08dc4c0b2a25ed6c9f845f054fa76
Author: Koakuma <koachan at protonmail.com>
Date: Tue Apr 29 07:36:32 2025 +0700
[compiler-rt] Make sure __clzdi2 doesn't call itself recursively on sparc64 (#136737)
On 64-bit platforms, libgcc doesn't ship with __clzsi2, so __builtin_clz
gets lowered to __clzdi2. A check already exists for GCC, but as of
commit 8210ca019839fc5430b3a95d7caf5c829df3232a clang also lowers
__builtin_clz to __clzdi2 on sparc64.
Update the check so that building __clzdi2 with clang/sparc64 also
works.
ok tb@, deraadt@
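To illustrate why the clzdi2.c check above matters, here is a rough Python
model of a 64-bit count-leading-zeros built from a 32-bit helper. The
function names clz32 and clz64 are invented for this sketch and it is not
the compiler-rt source; it only shows that the 64-bit routine must bottom
out in a 32-bit primitive that never calls back into the 64-bit routine.

```python
def clz32(x: int) -> int:
    """Count leading zeros of a nonzero 32-bit value by binary search."""
    x &= 0xFFFFFFFF
    n = 0
    for shift in (16, 8, 4, 2, 1):
        # If the top `shift` bits are all zero, count them and shift them out.
        if x >> (32 - shift) == 0:
            n += shift
            x = (x << shift) & 0xFFFFFFFF
    return n


def clz64(x: int) -> int:
    """Count leading zeros of a nonzero 64-bit value using only clz32."""
    high = (x >> 32) & 0xFFFFFFFF
    low = x & 0xFFFFFFFF
    if high == 0:
        return 32 + clz32(low)
    return clz32(high)
```

If the compiler lowered the 32-bit primitive back to the 64-bit routine, as
the commit message describes for sparc64, the call inside clz64 would recurse
without end.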
import liquid-dsp
liquid-dsp is a free and open-source digital signal processing (DSP)
library designed specifically for software-defined radios on embedded
platforms. The aim is to provide a lightweight DSP library that does not
rely on a myriad of external dependencies or proprietary and otherwise
cumbersome frameworks. All signal processing elements are designed to be
flexible, scalable, and dynamic, including filters, filter design,
oscillators, modems, synchronizers, complex mathematical operations, and
much more.
ok benoit@
[ExpandIRInsts] Fix e.g. fptoui.sat.f32.i256's handling of inf (#200261)
When expanding fptoui.sat/fptosi.sat, we saturate when the biased
exponent is at least ExponentBias + BitWidth - IsSigned, the point where
the value no longer fits in the target integer.
We should *also* always saturate when the floating-point value is
+/-inf. Usually this doesn't require any special handling; for example
for a float32 -> int32 conversion, inf has a biased exponent of 255 >
ExponentBias + BitWidth - IsSigned = 127 + 32 - 1.
But for integer types which are large enough to contain all source
floating-point values, this doesn't work. For example, if you're
converting float32 to int256, you'd compute a threshold of 383, which is
greater than 255. Therefore float32(inf) would not correctly saturate
to INT256_MAX.
Fix this by clamping the threshold to the all-ones biased exponent.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
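The threshold arithmetic above can be checked with a small Python sketch.
The helper names saturation_threshold and saturates are made up for
illustration, and the model covers only the exponent comparison (NaN
handling and the actual IR expansion are outside its scope).

```python
def saturation_threshold(exponent_bits: int, bit_width: int,
                         is_signed: bool, clamp: bool = True) -> int:
    """Biased exponent at or above which the conversion saturates."""
    exponent_bias = (1 << (exponent_bits - 1)) - 1   # 127 for float32
    all_ones = (1 << exponent_bits) - 1              # 255 for float32 (inf/NaN)
    threshold = exponent_bias + bit_width - (1 if is_signed else 0)
    # The fix described above: never let the threshold exceed the
    # all-ones biased exponent.
    return min(threshold, all_ones) if clamp else threshold


def saturates(biased_exponent: int, threshold: int) -> bool:
    return biased_exponent >= threshold


INF_EXPONENT_F32 = 255

# float32 -> signed 32-bit: 127 + 32 - 1 = 158, so inf already saturates.
print(saturates(INF_EXPONENT_F32, saturation_threshold(8, 32, True)))
# float32 -> unsigned 256-bit without the clamp: 127 + 256 = 383 > 255.
print(saturates(INF_EXPONENT_F32, saturation_threshold(8, 256, False, clamp=False)))
# With the clamp the threshold becomes 255 and inf saturates again.
print(saturates(INF_EXPONENT_F32, saturation_threshold(8, 256, False)))
```

Note that 383 corresponds to IsSigned = 0; with IsSigned = 1 the unclamped
value would be 382, which is still above 255.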
[CIR] Implement cleanups of base classes for aggregates. (#200473)
This is a very simple implementation: we just make sure we add the base
class destructor to the cleanup scope.
Unbreak and update powerpc retguard for llvm 22
For RETGUARD_LOAD_COOKIE with -fno-pie (as in macppc kernels), change
a relocation from the unusual S_HIGHA "__retguard_3671@higha" to the
usual S_HA "__retguard_3671@ha". This prevents an error from lld 22,
ld: error: rasops15.o:(function rasops15_init: .text+0x2): unknown \
relocation (111) against symbol __retguard_3671
For RETGUARD_LOAD_PC in PIC code, change an instruction from 'bl .+4'
to 'bcl 20,31,.+4' to fix branch prediction. This follows the same
change in upstream llvm,
https://github.com/llvm/llvm-project/issues/128644
ok jca@ naddy@
[Flang][OpenMP]Handling restrictions of using Memory-Order-Clause with Atomic-Clause (#199636)
Adhere to the restrictions on using Memory-Order-Clause with
Atomic-Clause.
Added warnings to indicate the transformations that will be done
internally in flang.
Handling all the restrictions on using memory-order-clause also fixes
[#199490](https://github.com/llvm/llvm-project/issues/199490)
---------
Co-authored-by: Sunil Kuravinakop <kuravina at pe31.hpc.amslabs.hpecorp.net>
[scudo] Improve performance of pushBlocks sort. (#200297)
Ran this on an Android device using both algorithms; the new algorithm
is on average 10% faster, and up to 15% faster in some cases. This
is an example of the speed-ups:
Average Operation Time   Maximum Operation Time   Name
326.9(ns)                80770(ns)                PushBlocks New
365.9(ns)                108032(ns)               PushBlocks Old
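As a quick check of the sample figures above, the relative time saved can
be computed as below. How the 10% and 15% figures in the message were
derived is not stated; "fraction of the old time saved" is one plausible
reading, and the helper name time_reduction is invented for this sketch.

```python
def time_reduction(old_ns: float, new_ns: float) -> float:
    """Fraction of the old operation time saved by the new algorithm."""
    return (old_ns - new_ns) / old_ns


avg = time_reduction(365.9, 326.9)      # average operation time
worst = time_reduction(108032, 80770)   # maximum operation time

print(f"average: {avg:.1%}, maximum: {worst:.1%}")
# prints "average: 10.7%, maximum: 25.2%"
```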
[DirectX] Remove obsolete cbuffer layout test (#200307)
This test uses outdated `cbuffer` layout design. It has been replaced by
`cbuffer-metadata.ll` when we updated the frontend to use explicit
padding for `cbuffer` data types.
[IR] Introduce `mem.cache_hint` metadata for composable memory cache control hints (#181612)
Add target-agnostic infrastructure for the !mem.cache_hint metadata
kind, as proposed in the RFC:
https://discourse.llvm.org/t/rfc-composable-and-extensible-memory-cache-control-hints-in-llvm-ir/89443
This patch includes:
- Registration of mem.cache_hint in FixedMetadataKinds
- IR Verifier validation of structural constraints
- Metadata helper support in combineMetadata(), copyMetadataForLoad(),
and dropUBImplyingAttrsAndMetadata()
- LangRef documentation for the metadata format and semantics
- Verifier and transform pass test coverage (GVN, InstCombine,
SimplifyCFG)
Co-authored-by: Yonah Goldberg <ygoldberg at nvidia.com>
Assisted-by: Claude Code
[clang] Enable GNU __attribute__((init_priority(...))) on z/OS. (#199573)
Enable `init_priority` on z/OS
Motivation
The recent addition of `clang/test/Sema/type-dependent-attrs.cpp` in
https://github.com/llvm/llvm-project/pull/182208 started failing on
z/OS. That test uses `[[gnu::init_priority(2000)]]`, and the failure
exposed that init_priority support was still disabled for z/OS in
`Attr.td`.
What changed
- Enabled init_priority for z/OS in `clang/include/clang/Basic/Attr.td`
- Updated `clang/test/SemaCXX/init-priority-attr.cpp` so z/OS now
expects normal semantic handling for init_priority
This reverts commit 2c7e24c4b6893a93ddb2b2cca91eaf5bf7956965 and
preserves any changes made after that commit.
[CIR] Add Extend (signext/zeroext) handling to CallConvLowering (#195745)
Third PR in the series splitting
[#192119](https://github.com/llvm/llvm-project/pull/192119) /
[#192124](https://github.com/llvm/llvm-project/pull/192124).
[#195725](https://github.com/llvm/llvm-project/pull/195725) and
[#195737](https://github.com/llvm/llvm-project/pull/195737) have merged;
this PR is now a standalone diff on main.
Adds Extend (signext / zeroext) to `cir-call-conv-lowering`. The CIR
signature keeps the original narrow integer type; the rewriter attaches
`llvm.signext` / `llvm.zeroext` to `arg_attrs` and `res_attrs`. That
matches classic Clang's LLVM IR convention — `define void @f(i8 signext
%x)`, not `define void @f(i32 signext %x)` with an entry-block
truncation. The `coercedType` field on an Extend `ArgClassification` is
informational only; the rewriter doesn't use it to change the CIR
signature.
Three `.cir` tests cover narrow-signed-arg, narrow-unsigned-arg, and
[3 lines not shown]
[SimplifyCFG] Preserve atomicity when merging atomic conditional stores (#200327)
mergeConditionalStoreToAddress() merges two stores into one. It does
this for non-atomic and atomic-unordered stores, but when merging
unordered stores, it would downgrade them to non-atomic!
This bug isn't accessible from C because C doesn't expose unordered
atomics. But you can access it from e.g. Objective-C with something like
```
// repro.m — clang -fno-objc-arc -O2
__attribute__((objc_root_class))
@interface C { int _value; }
@property(atomic, direct) int value;
@end
@implementation C
@end
void f(C *obj, _Bool c1, _Bool c2, int v1, int v2) {
[11 lines not shown]