[MLIR][XeVM] Update HandleVectorExtract pattern. (#191052)
Split loads only if pointer address space is private.
Splitting loads from non-private memory could hurt performance.
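The gating condition described above could be sketched roughly as follows. This is a toy model, not the actual pattern code: the `AddrSpace` enum and `shouldSplitLoad` are hypothetical names introduced only for illustration.

```cpp
// Hypothetical address-space tags. The real pattern inspects the address
// space of the LLVM pointer type; these enumerators are made up.
enum class AddrSpace { Global, Shared, Private };

// Returns true when a wide vector load may be split into smaller loads.
// Only private memory qualifies; other address spaces keep the wide load.
constexpr bool shouldSplitLoad(AddrSpace space) {
  return space == AddrSpace::Private;
}
```

With this predicate, a load from `AddrSpace::Global` is left untouched, which matches the stated goal of not hurting performance for non-private memory.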
[MLIR][XeVM] Update XeVM type converter (#189306)
Ideally, DLTI would be used to determine the Index type bitwidth, since
it is tied to the bitwidth of the pointer type, which DLTI can express.
But currently, many passes use a separate pass option for the Index
type bitwidth.
The GPU to XeVM lowering pipeline also uses passes with such options.
But the XeVM type converter provides no way to reflect the chosen
Index type bitwidth and uses a hardcoded value instead.
This PR updates XeVM type converter to use Index type bitwidth from pass
option. This is done by using LLVM type converter for converting element
type instead of the previous custom logic.
In addition to handling Index type properly, by using LLVM type
converter, low precision float types are correctly converted to LLVM
supported types.
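As a rough illustration of the Index handling, the toy converter below takes the bitwidth from an options struct instead of hardcoding it. `ConverterOptions`, `ElemKind`, and `convertElementType` are hypothetical stand-ins, not MLIR APIs, and the sketch does not model the low precision float conversion.

```cpp
#include <string>

// Hypothetical stand-in for the pass option that selects the Index bitwidth.
struct ConverterOptions {
  unsigned indexBitwidth = 64;
};

// Toy element kinds; real MLIR types are far richer.
enum class ElemKind { Index, I32, F32 };

// Maps a toy element kind to an LLVM-style type name. The Index case reads
// the bitwidth from the options rather than using a fixed value.
inline std::string convertElementType(ElemKind kind,
                                      const ConverterOptions &opts) {
  switch (kind) {
  case ElemKind::Index:
    return "i" + std::to_string(opts.indexBitwidth);
  case ElemKind::I32:
    return "i32";
  case ElemKind::F32:
    return "float";
  }
  return "";
}
```

Setting `indexBitwidth` to 32 makes the Index element type come out as `i32`, while the other element kinds are unaffected.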
[CIR][ABI] Add ABI metadata fields to RecordType (#188300)
Store AST-derived layout information on `cir::RecordType` so that ABI
lowering passes (which have no AST access) can make correct calling
convention decisions.
The five new fields on `RecordTypeStorage` are: `triviallyCopyable`
(from `canPassInRegisters`), `triviallyDestructible` (from
`hasTrivialDestructor`), `isEmpty` (from
`CXXRecordDecl::isEmpty`/`field_empty`), `dataSizeInBits` (from
`ASTRecordLayout::getDataSize`), and `recordAlignInBytes` (from
`ASTRecordLayout::getAlignment`). They're set during
`computeRecordLayout` and are not part of the printed/parsed CIR text.
The `complete()` signature uses defaults so existing callers don't need
changes.
Anonymous records (created by passes, not CIRGen) default to trivially
copyable/destructible since they represent synthetic aggregates like
member pointer lowering tuples.
[2 lines not shown]
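The shape of the change can be sketched with a simplified model. `RecordAbiInfo`, `ToyRecord`, and `mustPassIndirectly` are hypothetical names; only the five field names come from the description above. The defaults for `isEmpty`, `dataSizeInBits`, and `recordAlignInBytes` are assumptions, and the passing rule is deliberately simplistic.

```cpp
#include <cstdint>

// Hypothetical mirror of the five ABI fields listed above.
struct RecordAbiInfo {
  bool triviallyCopyable = true;      // default stated in the description
  bool triviallyDestructible = true;  // default stated in the description
  bool isEmpty = false;               // assumed default
  std::uint64_t dataSizeInBits = 0;   // assumed default
  std::uint64_t recordAlignInBytes = 1;  // assumed default
};

// Toy record type. complete() takes the ABI info as a defaulted argument,
// so callers that predate the new fields keep compiling unchanged.
struct ToyRecord {
  bool isComplete = false;
  unsigned numMembers = 0;
  RecordAbiInfo abi;

  void complete(unsigned members, RecordAbiInfo info = RecordAbiInfo()) {
    numMembers = members;
    abi = info;
    isComplete = true;
  }
};

// Simplified ABI decision that needs no AST access: a record that is not
// trivially copyable is passed indirectly in this toy model.
inline bool mustPassIndirectly(const ToyRecord &record) {
  return !record.abi.triviallyCopyable;
}
```

A pass-created record completed with `complete(2)` keeps the trivially copyable default, while a record completed with explicit info can opt out.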
[AMDGPU][SIInsertWaitcnts][NFC] Drop `using llvm::AMDGPU` (#180782)
This is a followup patch for PR
https://github.com/llvm/llvm-project/pull/178345 which introduced `using
llvm::AMDGPU` to keep the patch size small.
[AMDGPU] Refactor setreg handling in the VGPR MSB lowering
The lowering can skip inserting S_SET_VGPR_MSB if we set the mode via
piggybacking. We are now relying on the HW bug for correct
behavior. If/when the bug is fixed, lowering will be incorrect.
SETREG is not a piggybacking target anymore. Instead, piggybacking is
disabled if we have seen a SETREG since the last mode change.
[MC] Move addEncodingComment() into new base class MCAsmBaseStreamer
This is in preparation to use this functionality in the
SystemZHLASMAsmStreamer. No functional change.
[AMDGPU] Added a debug counter to Rewrite AGPR-Copy-MFMA pass (#189437)
The debug counter can be used to control the MFMA chains rewritten to
AGPR form.
[WebAssembly] Fix attributes of exception_grouping_2 test (#191466)
Function calls in the `exception_grouping_2` test had incorrect
attribute numbers, which made many of them incorrectly `noreturn` and
rendered many BBs after them unreachable. As a result, the function
became a trivial single-BB function and the test passed because it no
longer had any exceptions in it. I think this happened because I created
that test in another file and later pasted the function into these
files, which had different attribute numbers.
This also has a few drive-by comment typo fixes.
[OpenMP][OMPIRBuilder] Support complex types in atomic update/capture
Route struct-typed values through the libcall path in
`emitAtomicUpdate`.
Previously, the libcall path was gated on `RMWOp == BAD_BINOP`, so
atomic capture swap patterns (`v = x; x = expr`) for complex values
lowered as structs fell through to the cmpxchg path. That path called
`getScalarSizeInBits()` on a struct type, produced 0, and triggered an
assertion in `IntegerType::get()`.
Remove the `BAD_BINOP` restriction so struct types always use the
libcall path. This is safe because the libcall path does not use
`RMWOp` and already handles arbitrary type sizes correctly.
Also fix `LoadSize` in the libcall path to use `XElemTy` rather than
the pointer type, which previously gave the wrong size for larger
complex types such as `complex(8)`.
[3 lines not shown]
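The visible part of the description can be modeled with a small sketch. It is an interpretation of the text, not the OMPIRBuilder code: `RMWOp`, `ValueType`, and the helper functions are hypothetical, and the 8-byte pointer size assumes a 64-bit target.

```cpp
#include <cstddef>

// Hypothetical subset of read-modify-write operations.
enum class RMWOp { Add, Xchg, BadBinop };

// Toy description of the atomically updated value.
struct ValueType {
  bool isStruct;
  std::size_t sizeInBytes;
};

// Assumed pointer size for a 64-bit target.
constexpr std::size_t kPointerSizeInBytes = 8;

// Before: struct values reached the libcall path only for BadBinop, so a
// struct with Xchg (the capture swap pattern) fell through to cmpxchg.
constexpr bool usesLibcallBefore(const ValueType &ty, RMWOp op) {
  return ty.isStruct && op == RMWOp::BadBinop;
}

// After: struct values always use the libcall path, whatever the operation.
constexpr bool usesLibcallAfter(const ValueType &ty, RMWOp /*op*/) {
  return ty.isStruct;
}

// Before: the load size was derived from the pointer type.
constexpr std::size_t loadSizeBefore(const ValueType & /*ty*/) {
  return kPointerSizeInBytes;
}

// After: the load size is derived from the element type.
constexpr std::size_t loadSizeAfter(const ValueType &ty) {
  return ty.sizeInBytes;
}
```

For `complex(8)`, which holds two 8-byte reals, the element size is 16 bytes, while the pointer-derived size in this model stays at 8 bytes.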
[clang] fix crash on qualified friend function definitions (#186398)
This patch fixes a crash caused by qualified friend function
definitions. We now recover early by diagnosing the invalid qualifier
and clearing the scope.
Fixes #185341
Revert "[NFC][SSAF] Move EntityPointerLevel to a separate folder" (#191481)
Reverts llvm/llvm-project#191331
A set of bots are broken. For more examples check the reverted PR.
https://lab.llvm.org/buildbot/#/builders/225/builds/5596
Example:
```
30 | ssaf::getUnsafeBuffers(const UnsafeBufferUsageEntitySummary &S) {
| ^~~~
clang/include/clang/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.h:30:3: note: only here as a ‘friend’
30 | getUnsafeBuffers(const UnsafeBufferUsageEntitySummary &);
| ^~~~~~~~~~~~~~~~
FAILED: clang/lib/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel/EntityPointerLevel.cpp:61:5: error: qualified name does not name a class before ‘:’ token
61 | : ConstStmtVisitor<EntityPointerLevelTranslator,
| ^
```
[CIR] Handle globals with vptr init (#191291)
When a class contains virtual functions but no data members and has a
trivial constructor, global variables of that type are initialized with
a vptr. CIR was incorrectly creating the global variable with the type
of the vtable (an anonymous record) rather than the class type.
When replacing structors with aliases, we were calling a function to
update argument types at the call sites, but this was only necessary
because we initially generated the call using the same incorrect type
that we used for the global. The type correction wasn't implemented
because we hadn't encountered a case where it was needed. Having found
such a case led me to diagnose the problem as above, and I verified that
the same test case compiled without -mconstructor-aliases just failed in
the verifier because we never hit the replacement code. I'm now
convinced that this argument type fixup isn't necessary, so I replaced
the fixup function with an assert.
Assisted-by: Cursor / claude-4.6-opus-high
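For illustration, source code of the kind described above might look like the following. The names `Shape`, `globalShape`, and `callThroughGlobal` are made up, and this is not the test case from the patch.

```cpp
// A class with a virtual function, no data members, and only the implicit
// constructor.
struct Shape {
  virtual int sides() const { return 4; }
  virtual ~Shape() = default;
};

// A global of the class type. Per the description above, such a global is
// initialized with a vptr, and it should have the class type rather than
// the vtable's anonymous record type.
Shape globalShape;

// Calls a virtual function through the global object.
int callThroughGlobal() {
  const Shape &ref = globalShape;
  return ref.sides();
}
```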
[AMDGPU] Fix setreg handling in the VGPR MSB lowering
There are multiple issues with it:
1. It can skip inserting S_SET_VGPR_MSB if we set the mode via
piggybacking. We are now relying on the HW bug for correct
behavior. If/when the bug is fixed, lowering will be incorrect.
2. We should just unconditionally update MSBs if the immediate allows it.
We shall set the correct bits and keep the rest of the immediate
(that is done). There is no reasonable way for a user to change
MSBs, nor does it do any good to set them with SETREG and then
immediately overwrite them with S_SET_VGPR_MSB.
3. We can always update the immediate if Offset is zero.
4. Redundant mode changes are created, as seen in
hazard-setreg-vgpr-msb-gfx1250.mir.
With an unconditional immediate update most of the time and no reliance
on SETREG for setting MSBs, there is no good reason to complicate
handling by supporting SETREG as a piggybacking target. Moreover,
[10 lines not shown]
[flang][OpenMP] Rename GetRequiredCount to GetMinimumSequenceCount (#191465)
The new name better describes the calculated value.
Also adjust a diagnostic message to say that *at least* N loops are
expected in the sequence.
[clang-doc] Consolidate merging logic (#190051)
As we migrate things in the arena, this logic may get more complex.
Factoring it out now will give clear extension points that make this
easier to manage.
[NFC][SSAF] Move EntityPointerLevel to a separate folder (#191331)
EntityPointerLevel will later be shared with other summaries besides
UnsafeBufferUsage. This commit moves it to a separate file.
[scudo] Remove fill when realloc to smaller size. (#191321)
In the reallocate function, when reallocating to a size smaller than
the current size, the code would attempt to fill in the bytes after the
new size. This doesn't really add any extra security and is mostly a
waste of time, so skip it.
Remove the test that verifies this functionality.
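The behavioral difference can be shown with a toy model. `ToyBlock` and `shrinkInPlace` are hypothetical and unrelated to the scudo sources; the `fillTail` flag exists only to contrast the old and new behavior, and the zero fill pattern is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy allocation: backing storage plus the size the user asked for.
struct ToyBlock {
  std::vector<unsigned char> bytes;
  std::size_t size;
};

// Shrinks a block in place. With fillTail set, the bytes between the new
// and old size are overwritten (old behavior); without it they are left
// as they were (new behavior).
inline void shrinkInPlace(ToyBlock &block, std::size_t newSize, bool fillTail,
                          unsigned char pattern = 0) {
  if (newSize >= block.size)
    return;  // Not a shrink; nothing to do in this sketch.
  if (fillTail)
    std::fill(block.bytes.begin() + newSize, block.bytes.begin() + block.size,
              pattern);
  block.size = newSize;
}
```

In both modes the block ends up with the smaller size; only the contents of the bytes past the new size differ.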
[AMDGPU] Always update SETREG MSBs if offset is 0 (#191362)
We can always update the immediate if Offset is zero, because the bits
HW will write are always at the same position in that case.
In particular, this removes the redundant mode changes seen
in hazard-setreg-vgpr-msb-gfx1250.mir.
This still relies on the wrong behavior that SETREG updates
MSBs, so it will have to be changed later. For that reason, test
immediates in this patch may differ from the desired values.
[clang-doc] Use distinct APIs for fixed arena allocation sites
Typically, code either always emits data into the TransientArena or the
PersistentArena. Use more explicit APIs to convey the intent directly
instead of relying on parameters or defaults.
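The contrast between the two API styles might look like the sketch below. All names here (`ToyArena`, `Arenas`, `allocate`, `allocateTransient`, `allocatePersistent`) are hypothetical and are not taken from clang-doc.

```cpp
#include <cstddef>
#include <vector>

// Toy arena that hands out zero-initialized blocks and counts them.
class ToyArena {
public:
  void *allocate(std::size_t numBytes) {
    blocks.emplace_back(numBytes);
    return blocks.back().data();
  }
  std::size_t allocationCount() const { return blocks.size(); }

private:
  std::vector<std::vector<unsigned char>> blocks;
};

struct Arenas {
  ToyArena transient;
  ToyArena persistent;
};

// Parameter-driven style: the target arena hides behind a defaulted flag.
inline void *allocate(Arenas &arenas, std::size_t numBytes,
                      bool persistent = false) {
  return persistent ? arenas.persistent.allocate(numBytes)
                    : arenas.transient.allocate(numBytes);
}

// Explicit style: each call site names the arena it always uses.
inline void *allocateTransient(Arenas &arenas, std::size_t numBytes) {
  return arenas.transient.allocate(numBytes);
}

inline void *allocatePersistent(Arenas &arenas, std::size_t numBytes) {
  return arenas.persistent.allocate(numBytes);
}
```

A call such as `allocatePersistent(arenas, 8)` states its intent at the call site, whereas `allocate(arenas, 8, true)` requires the reader to know what the flag means.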