[mlir][emitc] Lower multiple results as a struct (#200659)
Previously, func-to-emitc lowering rejected func.{func,call,return} with
more than one result/operand. Such ops are handled directly by the
translator, which emits a `std::tuple` packing their results, but that is
only relevant for C++ users. This patch lifts that restriction by packing
multiple return values into an automatically-generated struct, e.g. for
a function returning (i32, i32):
emitc.class struct @return_i32_i32 {
emitc.field @field0 : i32
emitc.field @field1 : i32
}
On return, the operands are packed into a local struct variable which is
then loaded and returned. On call sites, the struct is stored in a local
variable, and each field is extracted to recreate the individual SSA
values of the original results. As with single-result functions,
`emitc.array` return types are not supported.
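As a rough illustration of the shape described above, the generated code could resemble the following C++ sketch. Only the struct and field names come from the example in this message; the functions `two_results` and `caller` and their bodies are hypothetical, and the real translator output may differ.

```cpp
#include <cassert>
#include <cstdint>

// Struct named after the example above: one field per original result.
struct return_i32_i32 {
  int32_t field0;
  int32_t field1;
};

// Hypothetical callee: the results are packed into a local struct
// variable, which is then returned by value.
return_i32_i32 two_results(int32_t a, int32_t b) {
  return_i32_i32 result;
  result.field0 = a + b;
  result.field1 = a - b;
  return result;
}

// Hypothetical call site: the struct is stored in a local variable and
// each field is extracted to recreate the individual values.
int32_t caller(int32_t a, int32_t b) {
  return_i32_i32 packed = two_results(a, b);
  int32_t sum = packed.field0;
  int32_t diff = packed.field1;
  return sum * diff;
}
```

Unlike the `std::tuple` path, a plain struct like this needs no C++ library support, which is presumably why it suits C users.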
[9 lines not shown]
[libc++] Fill in Apple availability for LLVM 21 (#202347)
macOS 26.4 and aligned platforms have been released and they are roughly
synchronized to libc++ 21. As a drive-by, also add missing versions for
previous releases.
This also allows reverting #199682 which moved an XFAIL to UNSUPPORTED
to silence CI failures temporarily.
[compiler-rt][ARM] Optimized integer -> FP conversions (#179928)
This commit adds a total of 8 new functions, all converting an integer
to a floating-point number, varying in 3 independent choices:
* input integer size (32-bit or 64-bit)
* input integer type (signed or unsigned)
* output float format (32-bit or 64-bit)
The two conversions of 64-bit integer to 32-bit float live in the same
source file, to save code size, since that conversion is one of the more
complicated ones and the two functions can share most of their code,
with only a few instructions differing at the start to handle negative
numbers (or not).
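To make the code sharing concrete, here is a portable C++ sketch of the 64-bit integer to 32-bit float case. It is not the ARM implementation from this commit: the function names are hypothetical, and it assumes IEEE 754 binary32 with round-to-nearest-even. The signed entry point differs from the unsigned one only in an up-front sign step.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Unsigned 64-bit integer to binary32, rounding to nearest, ties to even.
float u64_to_f32_sketch(uint64_t x) {
  if (x == 0) return 0.0f;
  // Normalize so the leading 1 sits at bit 63.
  int shift = 0;
  while ((x >> 63) == 0) { x <<= 1; ++shift; }
  uint32_t exponent = 127u + 63u - static_cast<uint32_t>(shift);
  // The 23 bits below the leading 1 form the stored mantissa.
  uint32_t mantissa = static_cast<uint32_t>(x >> 40) & 0x7FFFFFu;
  // The low 40 bits are discarded and decide the rounding direction.
  uint64_t discarded = x & ((uint64_t{1} << 40) - 1);
  const uint64_t half = uint64_t{1} << 39;
  uint32_t bits = (exponent << 23) | mantissa;
  // A mantissa overflow here carries into the exponent, which is correct.
  if (discarded > half || (discarded == half && (mantissa & 1u))) ++bits;
  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}

// Signed variant: handle the sign first, then reuse the unsigned path.
float i64_to_f32_sketch(int64_t x) {
  if (x < 0) return -u64_to_f32_sketch(uint64_t{0} - static_cast<uint64_t>(x));
  return u64_to_f32_sketch(static_cast<uint64_t>(x));
}
```

The rounding step is the complicated part, since up to 40 bits are discarded, which is why sharing it between the signed and unsigned functions plausibly saves code size.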
[VPlan] Insert VPBlendRecipes in post order. NFC (#201782)
#201783 wants to optimize blend masks by peeking through the contents of
other phi nodes. Currently we eagerly convert phis to blends in reverse
post order, so switch it to post order so that phis at the bottom can
see the phis in their uses.
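The difference between the two orders can be sketched on a toy CFG. This is a generic depth-first traversal, not VPlan code; the `Graph` alias and function names are hypothetical. In post order a block is emitted only after its reachable successors, so blocks at the bottom come first.

```cpp
#include <cassert>
#include <vector>

// Hypothetical CFG: successors[b] lists the successor blocks of block b.
using Graph = std::vector<std::vector<int>>;

static void visit(const Graph &g, int block, std::vector<bool> &seen,
                  std::vector<int> &order) {
  seen[block] = true;
  for (int succ : g[block])
    if (!seen[succ])
      visit(g, succ, seen, order);
  order.push_back(block); // emitted after all reachable successors
}

// Post order: bottom-most blocks first, entry block last.
std::vector<int> post_order(const Graph &g, int entry) {
  std::vector<bool> seen(g.size(), false);
  std::vector<int> order;
  visit(g, entry, seen, order);
  return order;
}

// Reverse post order: entry block first.
std::vector<int> reverse_post_order(const Graph &g, int entry) {
  std::vector<int> order = post_order(g, entry);
  return std::vector<int>(order.rbegin(), order.rend());
}
```

For a diamond-shaped graph `{{1, 2}, {3}, {3}, {}}`, the join block 3 is first in post order and last in reverse post order. A phi in the join block is therefore processed while the phis it uses are still phis.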
[AMDGPU][InsertWaitCnts] Move HWEvent analysis code
Building on the previous RFC, if it is accepted:
Move the code that maps a MachineInstr to HWEventSet to a separate file.
This should be NFC.
[RFC][AMDGPU][InsertWaitCnt] Move WaitEventType into separate HWEvent header
I propose to move `WaitEventType` into its own header to start a new
component of the back-end targeted at analyzing and treating hardware events
fired by instructions. Right now this just moves code around and renames things
(NFCI) but over time, we should generalize the events so they can be reused
by other passes instead of being hyper-specialized for InsertWaitCnt.
[LICM][SimplifyCFG] Ignore frees for writable dereferenceability check (#202589)
Both of these places only explicitly check for dereferenceability
because this is required for the `writable` attribute. Actual
dereferenceability has already been established at this point, e.g.
based on a prior access. As such, we can ignore frees here. We only care
that the argument has an appropriately sized `dereferenceable`
attribute.
[libc++] Assume that __array_rank is provided by the compiler (#202511)
All compilers we support have `__array_rank`, so we can remove the
preprocessor branch for supporting compilers which don't provide it.
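For context, the user-visible trait this builtin presumably backs is `std::rank`. The sketch below uses only the standard trait; the helper `rank_of` is a hypothetical name added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper wrapping the standard trait.
template <class T>
constexpr std::size_t rank_of() {
  return std::rank<T>::value;
}

// std::rank counts array dimensions; non-array types have rank 0.
static_assert(std::rank<int>::value == 0, "scalar has rank 0");
static_assert(std::rank<int[4]>::value == 1, "one dimension");
static_assert(std::rank<int[2][3]>::value == 2, "two dimensions");
static_assert(std::rank<int[][5]>::value == 2, "unknown bound still counts");
```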
[clang-tidy] Avoid invalid fixes in `readability-delete-null-pointer` (#202488)
Only provide warnings (not fixits) when `IfStmt` has condition variable
or initializer.
Note that I didn't provide a fixit for the case where the condition
variable differs from the pointer variable being cast to bool,
because I think this is rare (see the third newly added test case).
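The cases at issue can be illustrated with the following C++17 sketch. The `Widget` type and function names are hypothetical and are not taken from the patch's tests. In the plain case the null check is redundant because `delete` on a null pointer is a no-op, so removing the `if` is safe. In the other two, removing it would also remove the declaration of `p`.

```cpp
#include <cassert>

// Hypothetical type that counts live instances so the behavior is observable.
struct Widget {
  static int live;
  Widget() { ++live; }
  ~Widget() { --live; }
};
int Widget::live = 0;

Widget *make_widget() { return new Widget; }

// Plain case: the `if` can simply be dropped.
void plain(Widget *p) {
  if (p)
    delete p;
}

// Init-statement: dropping the `if` would drop the declaration of `p` too,
// so an automatic fix would produce code that no longer compiles.
void with_initializer() {
  if (Widget *p = make_widget(); p)
    delete p;
}

// Condition variable: the same problem, since `p` is declared in the condition.
void with_condition_variable() {
  if (Widget *p = make_widget())
    delete p;
}
```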
Closes #202312.
---------
Co-authored-by: Zeyi Xu <zeyi2 at nekoarch.cc>
[Flang][OpenMP][Sema] Add OpenMP warning when mapping local descriptors to device on enter without a corresponding exit (#201060)
This PR aims to add a new warning to Flang that will emit when a user
tries to map a local/temporary descriptor to device on an enter
directive without also applying it to a corresponding exit directive.
This problem can cause some unusual and hard-to-track-down errors, as a
user can unintentionally lock into place a stack-allocated descriptor
that has fallen out of scope. That can result in a later clash with
another stack-allocated variable that is being mapped and happens to
reside in the old descriptor's address range.
So this PR attempts to warn about this problem to discourage users from
doing so. Note that we handle some of these cases in our
MapInfoFinalization pass, but I believe we should still include these
cases for portability reasons and in case we ever backtrack on our
decision to silently support some of them.
Made this warning as it was a suggestion from Michael Klemm and seemed
[3 lines not shown]
[clang][bytecode] Save a `Type*` in integral pointers instead of a descriptor (#202835)
This way we don't need to allocate a descriptor via the `Program`, which
is for global data.
[mlir][mem2reg] fix 197158 by moving visitReplacedValues call (#198552)
Fix #197158 and #200844 by moving the `visitReplacedValues` calls
between `promoteInRegion` and `removeBlockingUses`, as well as setting
the insertion point before the replaced store operation before calling
the `PromotableMemOpInterface::getStored` API (instead of setting the
insertion point after).
The ordering change is made at the top level. `promoteInRegion` runs
for all regions in post order, then `visitReplacedValues` runs for all
regions, and only then does `removeBlockingUses` run for all regions in
post order. This ensures that any load result that happens to be used
in a later store is not deleted by `removeBlockingUses` before it is
used by `visitReplacedValues`.
The insertion point change ensures that the stored values passed to
`visitReplacedValues` dominate the related store operations. Otherwise,
typical `visitReplacedValues` implementations, which set the insertion
point at the store operation and use the stored values, would generate
invalid IR when `getStored` creates new IR (like bitcasts for the LLVM
dialect implementation).
[clang][OpenMP] Improve loop structure for distributed loops (pt 1: reductions) (#201670)
This is a part of a series of patches that rework OpenMP cross-team
reductions.
This patch wires the existing
`kmp_sched_distr_static_chunk_sched_static_chunkone` to be used by
CodeGen (this patch is restricted to reduction loops).
Example of the intended change of this patch:
```
target teams distribute parallel for reduction(+:s)
for (i = 0; i < N; i++) s += a[i];
```
Before:
```
__kmpc_distribute_static_init(91)
for (team_lb = team*nthreads; team_lb < N; team_lb += nteams*nthreads) {
[68 lines not shown]
[clang][ssaf] CallGraph extractor should ignore objc callees for now (#202606)
Ignoring them is better than crashing/asserting on nullptr derefs.
Fixes: rdar://179104950
[clang-tidy] Fix false positive in bugprone-use-after-move with std::forward on derived classes (#199905)
The `bugprone-use-after-move` check correctly identified partial moves
when using `std::move` by matching the `ImplicitCastExpr`
(DerivedToBase) as the parent of the call. However, when using
`std::forward<Base>`, the cast occurs inside the argument, causing the
matcher to miss the cast and falsely report a use-after-move.
This patch uses `traverse(TK_AsIs, expr(hasParent(...)))` on the first
argument to navigate bottom-up, reliably capturing the hidden
`ImplicitCastExpr`. This ensures both partial moves and forwards are
consistently recognized, eliminating the false positive.
Assisted by AI to check code.
Fixes #63202
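A minimal sketch of the pattern involved is shown below. The types `Base` and `Derived` are hypothetical and not taken from the patch's tests. Here `std::forward<Base>(other)` moves only the `Base` subobject, so the later access to `other.extra` is a legitimate partial move and not a use-after-move.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Base {
  std::string name;
  explicit Base(std::string n) : name(std::move(n)) {}
  Base(Base &&) = default;
};

struct Derived : Base {
  std::string extra;
  Derived(std::string n, std::string e)
      : Base(std::move(n)), extra(std::move(e)) {}
  // The derived-to-base conversion happens inside the argument of
  // std::forward<Base>, which is the cast the matcher used to miss.
  Derived(Derived &&other)
      : Base(std::forward<Base>(other)),  // moves only the Base subobject
        extra(std::move(other.extra)) {}  // `extra` has not been moved yet
};
```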
pfdenied: fix checking root anchor
pfctl doesn't like empty anchors (-a ''), but we can specify the root
anchor as '/' too, so do that instead.
PR: 295324
Tested by: Paweł Krawczyk
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
(cherry picked from commit 3d9cd10b2857ee7a9ec1b04457d9ec44f614d32c)