[DAG] computeKnownFPClass - Add handling for AssertNoFPClass (#190185)
Resolves #189478
Adds code to handle AssertNoFPClass in computeKnownFPClass and adds IR
test coverage for RISC-V.
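As a rough illustration of the idea (not the actual SelectionDAG code), an AssertNoFPClass node lets the analysis rule out the asserted classes on top of whatever is already known about the operand. The `FPClass` enum and `knownAfterAssertNoFPClass` below are hypothetical names invented for this sketch:
```c++
#include <cstdint>

// Hypothetical, simplified FP class bitmask. LLVM's real FPClassTest
// distinguishes more classes (signed zeros, subnormals, and so on).
enum FPClass : uint32_t {
  ClassNone = 0,
  ClassNaN = 1u << 0,
  ClassInf = 1u << 1,
  ClassNormal = 1u << 2,
  ClassZero = 1u << 3,
  ClassAll = ClassNaN | ClassInf | ClassNormal | ClassZero,
};

// PossibleSrc: classes the source operand may still belong to.
// AssertedNot: classes the assert node guarantees the value is not.
// Returns the classes the asserted value may belong to.
inline uint32_t knownAfterAssertNoFPClass(uint32_t PossibleSrc,
                                          uint32_t AssertedNot) {
  return PossibleSrc & ~AssertedNot;
}
```
For example, asserting "no NaN, no Inf" on a value about which nothing is known leaves only the normal and zero classes in this simplified model.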
[Clang] Do not create a NoSFINAETrap for variable specialization. (#191000)
Nothing in the standard says this should happen outside of the
immediate context.
Fixes #54439
[AMDGPU] Use wavefront scope for single-wave workgroup synchronization (#187673)
Workgroup-scoped fences and non-relaxed workgroup atomics were
previously legalized with synchronization strong enough for multi-wave
workgroups.
When the kernel's maximum flat work-group size does not exceed the
wavefront size, the workgroup contains only a single wavefront, so
workgroup-scoped synchronization is equivalent to wavefront scope and
the stronger legalization is unnecessary.
SIMemoryLegalizer now demotes workgroup scope to wavefront scope
in this case for workgroup-scoped fences and for non-relaxed atomic
load, store, atomicrmw, and cmpxchg operations.
This allows subsequent legalization to operate at wavefront scope.
The decision is based on AMDGPUSubtarget::isSingleWavefrontWorkgroup.
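A minimal sketch of the demotion rule described above, assuming a simplified scope enum. `SyncScope`, `demoteScope`, and the free-function form of `isSingleWavefrontWorkgroup` are hypothetical stand-ins, not the SIMemoryLegalizer code, and the sketch leaves out the restriction to fences and non-relaxed atomics:
```c++
// Hypothetical synchronization scopes, listed from narrowest to widest.
enum class SyncScope { SingleThread, Wavefront, Workgroup, Agent, System };

// Stand-in for AMDGPUSubtarget::isSingleWavefrontWorkgroup: the workgroup
// holds a single wavefront when its maximum flat size fits in one wavefront.
inline bool isSingleWavefrontWorkgroup(unsigned MaxFlatWorkGroupSize,
                                       unsigned WavefrontSize) {
  return MaxFlatWorkGroupSize <= WavefrontSize;
}

// Demote workgroup scope to wavefront scope for single-wave workgroups;
// every other scope is returned unchanged.
inline SyncScope demoteScope(SyncScope Scope, unsigned MaxFlatWorkGroupSize,
                             unsigned WavefrontSize) {
  if (Scope == SyncScope::Workgroup &&
      isSingleWavefrontWorkgroup(MaxFlatWorkGroupSize, WavefrontSize))
    return SyncScope::Wavefront;
  return Scope;
}
```
With a wavefront size of 64, a kernel limited to 64 work-items is demoted, while one that allows 256 work-items keeps workgroup scope.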
---------
Co-authored-by: Barbara Mitic <Barbara.Mitic at amd.com>
gptzfsboot: boot prompt should emit new line on input
If the user typed any input, emit a newline on the screen
so that subsequent error messages do not get mixed
with the user's input.
[OMPIRBuilder] Move debug records to correct blocks. (#157125)
Consider the following small OpenMP target region:
```
!$omp target map(tofrom: x)
x = x + 1
!$omp end target
```
Currently, when compiled with `flang`, it will generate an outlined
function like below (with irrelevant bits removed).
```
void @__omp_offloading_10303_14e8afc__QQmain_l13(ptr %0, ptr %1) { entry:
%2 = alloca ptr, align 8, addrspace(5)
%3 = addrspacecast ptr addrspace(5) %2 to ptr
...
br i1 %exec_user_code, label %user_code.entry, label %worker.exit
[36 lines not shown]
[analyzer] Fix crash in CStringChecker on zero-size element types (#191061)
Move the null check of Offset before its dereference in checkInit. When
the element type has zero size (e.g., an empty struct in C), the
division returns an empty optional, which was dereferenced
unconditionally.
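The pattern behind the fix can be sketched as follows. This is an illustrative stand-in, not the CStringChecker code: `elementCount` and `isIndexInBounds` are hypothetical helpers, and the sketch assumes C++17 for `std::optional`:
```c++
#include <cstdint>
#include <optional>

// Hypothetical helper: number of elements that fit in a buffer. Returns an
// empty optional when the element size is zero (e.g. an empty struct in C),
// because the division is undefined in that case.
inline std::optional<uint64_t> elementCount(uint64_t BufferBytes,
                                            uint64_t ElemSize) {
  if (ElemSize == 0)
    return std::nullopt;
  return BufferBytes / ElemSize;
}

// Check the optional before dereferencing it. Treating the empty case as
// "not in bounds" is a choice made for this sketch only.
inline bool isIndexInBounds(uint64_t BufferBytes, uint64_t ElemSize,
                            uint64_t Index) {
  std::optional<uint64_t> Count = elementCount(BufferBytes, ElemSize);
  if (!Count)
    return false;
  return Index < *Count;
}
```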
Fixes #190457
[clang][ssaf][test] Fix the extraction-works-alongside-compilation.cpp test (#191162)
I forgot that we need this `REQUIRES: asserts` for the test.
Fixes build bots not setting `LLVM_ENABLE_ASSERTIONS=ON`.
For example:
https://lab.llvm.org/buildbot/#/builders/11/builds/37623
This fixes up #191058
[LV] NFCI: Create VPExpressions in transformToPartialReductions.
With this change, all logic for generating partial reductions and
recognising them as VPExpressions is contained in
`transformToPartialReductions`, without the need for a second
transform pass.
This PR is intended to be a non-functional change.
[LV] Simplify costing partial reduction chain links (NFCI) (#190980)
Previously, `getPartialReductionLinkCost()` needed to figure out which
case `matchExtendedReductionOperand()` had matched to compute a cost. This
made adding new cases to `matchExtendedReductionOperand()` more complex
and added some redundancy.
This patch updates `ExtendedReductionOperand` so that it contains all
the information needed to compute the cost ready to pass to
`getPartialReductionCost()`. This means matching new operand forms only
needs to be done in `matchExtendedReductionOperand()`.
This is split off from #188043 (this change simplifies matching absolute
difference operands).
[mlir][Vector] Make createWriteOrMaskedWrite utility (#190967)
Analogous to https://github.com/llvm/llvm-project/pull/89119, make
`createWriteOrMaskedWrite` a vector utility, exposing it for re-use by
downstream users.
This PR is mostly just moving code and updating documentation but also
addresses a `TODO` for `isMaskTriviallyFoldable` to use that utility in
`createReadOrMaskedRead` as well.
No new tests were added, because the functionality is covered by existing tests.
---------
Signed-off-by: Lukas Sommer <lukas.sommer at amd.com>
[VPlan] Handle AnyOf Or reduction via ComputeReductionResult. (#191049)
Instead of having ComputeAnyOfResult handle the Or reduction of unrolled
parts inline, route it through ComputeReductionResult with
RecurKind::Or. ComputeAnyOfResult now takes a pre-reduced scalar and
only performs the freeze + select.
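The split of responsibilities can be pictured with plain scalars. This is only a sketch: `reduceOr` and `selectAnyOfResult` are hypothetical names, the unrolled parts are modeled as a `std::vector<bool>`, and the freeze step has no counterpart in ordinary C++, so it is omitted:
```c++
#include <vector>

// Or-reduce the per-part any-of conditions of the unrolled parts. This
// models the work that is now routed through the generic reduction path.
inline bool reduceOr(const std::vector<bool> &Parts) {
  bool Acc = false;
  for (bool P : Parts)
    Acc = Acc || P;
  return Acc;
}

// Models the remaining step: take the pre-reduced scalar and select
// between the new value and the start value.
inline int selectAnyOfResult(bool AnyOf, int NewVal, int StartVal) {
  return AnyOf ? NewVal : StartVal;
}
```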
This is a preparatory step towards removing ComputeAnyOfResult entirely
in https://github.com/llvm/llvm-project/pull/190039.
PR: https://github.com/llvm/llvm-project/pull/191049
[mlir][debug] Make DICompileUnitAttr recursive. (#190808)
This PR adds `DIRecursiveTypeAttrInterface` to `DICompileUnitAttr`. It
should fix the circular dependency problem we have had since the
`importedEntities` field was added.
[Clang] Improve concept performance 1/N (#188421)
The concept parameter mapping patch significantly impacted performance
in scenarios where concepts are heavily used, even with
atomic-expression-level caching.
After normalization, we often end up with large atomic expressions
containing numerous duplicate and complex template parameter mappings.
Previously, we were substituting and checking these repeatedly, which
was highly inefficient.
We now cache these substitution results within TemplateInstantiator.
This saves a lot of duplicate semantic checking and improves
performance, as in these regression cases:
usb_ids_gen.cpp:
clang-21: 1.41s
clang-22: 3.90s
This patch: 2.45s
[12 lines not shown]
[clang][test] Modernize 2004-02-13-Memset.c to use FileCheck (#191092)
Replace `grep | count` verification with `FileCheck` and update `CHECK`
directives with current codegen output.
[LV][NFC] Remove unneeded LLVM intrinsic declarations (#190993)
We no longer need to declare LLVM intrinsics in .ll files as the
intrinsics are populated automatically in the module. Remove the
declarations from tests to reduce test noise and size.
This came from a suggestion on PR #190786.
Revert "[SelectionDAG] Recurse through mask expression trees in WidenVSELECTMask (#188085)" (#191151)
This reverts commit 815edc3ff646392bfee2b381d37dd35e4b04f9c5.
security/wolfssl: Fix 32-bit builds.
Add upstream patch for the fix, until changes are merged and
a new release is made.
PR: 294287
Reported by: Robert Clausecker <fuz at FreeBSD.org>
Reviewed by: Robert Clausecker <fuz at FreeBSD.org>
Tested by: Robert Clausecker <fuz at FreeBSD.org>
(cherry picked from commit 8318a3cd1c5262d51c70240d97798cce3c1a3bd6)
[clang][ssaf] Preserve AST after codegen for SSAF extractors (#191058)
This is a use-after-free.
Codegen would drop the AST before starting the optimizations on the LLVM
IR level. This means that the ASTConsumers of the SSAF extractors only
had dangling TU Decls etc.
For now, let's override this option to force-keep the AST alive. Note
that PluginActions already did the same if their consumers were added
after the main frontend-action.
See:
https://github.com/llvm/llvm-project/blob/69e0367e8221b8002b5d438fb70ff3daf36257fc/clang/lib/Frontend/FrontendAction.cpp#L470
```c++
CI.getCodeGenOpts().ClearASTBeforeBackend = false;
```
Long term, we could think about the stability implications of running
the extractors before codegen to be able to drop the AST, thus save
[13 lines not shown]