[AIX] Implement the ifunc attribute. (#153049)
Currently, the AIX linker and loader do not provide a mechanism to
implement ifuncs similar to GNU_ifunc on ELF Linux.
On AIX, we will lower `__attribute__((ifunc("resolver")))` to the llvm
`ifunc` as other platforms do. The llvm `ifunc` in turn will get lowered
at late stages of the optimization pipeline to an AIX-specific
implementation. No special linkage or relocations are needed when
generating assembly/object output.
On AIX, a function `foo` has two symbols associated with it: a function
descriptor (`foo`) residing in the `.data` section, and an entry point
(`.foo`) residing in the `.text` section. The first field of the
descriptor is the address of the entry point. Typically, the address
field in the descriptor is initialized once: statically, at load time
(?), or at runtime if runtime linking is enabled.
Here we would like to use the address field in the descriptor to
implement the `ifunc` semantics. Specifically, the ifunc function will
[29 lines not shown]
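As a rough illustration of the descriptor model described above (not the lowering from the elided part of this message), the sketch below treats a descriptor as a struct whose first field is the entry-point address. All names here (`Descriptor`, `resolveFoo`, `bindDescriptor`, `fooGeneric`, `fooFast`) are hypothetical, and real AIX descriptors carry additional fields that are omitted:

```cpp
#include <cassert>

// Hypothetical model of a function descriptor. Only the first field
// (the entry-point address) is modelled in this sketch.
using EntryPoint = int (*)(int);

struct Descriptor {
  EntryPoint Entry; // first field: address of the entry point (`.foo`)
};

// Two candidate implementations a resolver might choose between.
int fooGeneric(int X) { return X + 1; }
int fooFast(int X) { return X + 2; }

// Hypothetical resolver: picks an implementation from a feature flag.
EntryPoint resolveFoo(bool HasFastPath) {
  return HasFastPath ? fooFast : fooGeneric;
}

// A call to `foo` goes through the address field of the descriptor.
int callThroughDescriptor(const Descriptor &D, int Arg) {
  assert(D.Entry && "descriptor must be initialized before the call");
  return D.Entry(Arg);
}

// Rewriting the address field redirects every later call.
void bindDescriptor(Descriptor &D, bool HasFastPath) {
  D.Entry = resolveFoo(HasFastPath);
}
```

The point of the sketch is only that callers never name an implementation directly, so updating one field is enough to change what `foo` means.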
[mlir][emitc] Update and extend the TOSA -> EmitC test (#177339)
This patch updates and extends the TOSA-to-EmitC lowering test:
* Conversion/ConvertToEmitC/tosa.mlir
Summary of changes and rationale:
* Remove `buffer-alignment=0` from the lowering pipeline; it is not required
(the existing `CHECK` lines are not affected).
* Move the test from Conversion/ConvertToEmitC/tosa.mlir to
Dialect/EmitC/tosa/ops.mlir. Conversion tests are intended for single
conversion passes (e.g. `-convert-dialect1-to-dialect2`), whereas this test
exercises a more complex lowering pipeline with multiple explicit steps (e.g.
TOSA -> Linalg, bufferization, etc.).
* Add a Transform Dialect sequence to complement the existing lowering pipeline
definition. This introduces an additional `RUN` line that is compatible with
the original one. Using the Transform Dialect makes the pipeline easier to
document, maintain, and experiment with.
[NFC][TableGen] Adopt CodeGenHelpers in IntrinsicEmitter (#179310)
- Adopt IfDefEmitter in IntrinsicEmitter.
- Remove #undef for various flags in Intrinsics.cpp/Intrinsics.h as the
TableGen generated code does that now.
[AMDGPU] Allow hoisting of V_READFIRSTLANE_B32 for uniform operand
readfirstlane can be moved across control flow for uniform inputs.
The MachineInstr::NoConvergent attribute allows hoisting
which is otherwise prohibited for a convergent instruction.
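A toy model of why a uniform input makes the move safe: the result then no longer depends on which lanes are active. The wave size, the `readFirstLane` helper, and the mask representation are assumptions made for illustration, not the hardware or compiler definition:

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstdint>

constexpr unsigned WaveSize = 8; // toy wave size, chosen for readability

// Toy readfirstlane: returns the value held by the lowest active lane.
// The sketch assumes at least one lane is active.
uint32_t readFirstLane(const std::array<uint32_t, WaveSize> &V,
                       std::bitset<WaveSize> Exec) {
  assert(Exec.any() && "sketch assumes a non-empty exec mask");
  for (unsigned Lane = 0; Lane < WaveSize; ++Lane)
    if (Exec[Lane])
      return V[Lane];
  return V[0]; // not reached when the assertion holds
}
```

With a divergent input the answer changes with the exec mask, which is why a convergent instruction normally cannot cross control flow. With a uniform input every mask yields the same value.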
[LLVM][Intrinsics] Minor cleanup in getIntrinsicInfoTableEntries (#179317)
Change `IITValues` from a SmallVector to a plain array. Its maximum
size is bounded and relatively small, so a SmallVector is not
necessary.
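A minimal sketch of the general idea, assuming a hypothetical decoder that unpacks 4-bit values from a 32-bit word (so at most 8 values can ever be produced). `MaxValues`, `DecodedValues`, and `decodeNibbles` are illustrative names, not the code touched by this patch:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bound: a 32-bit word holds at most eight 4-bit values.
constexpr std::size_t MaxValues = 8;

// Fixed-capacity storage plus a count replaces a growable vector.
struct DecodedValues {
  std::array<unsigned char, MaxValues> Data{};
  std::size_t Size = 0;
};

// Unpack nibbles, lowest first, until the remaining word is zero.
DecodedValues decodeNibbles(uint32_t Packed) {
  DecodedValues Out;
  while (Packed != 0) {
    assert(Out.Size < MaxValues && "bound follows from the word width");
    Out.Data[Out.Size++] = static_cast<unsigned char>(Packed & 0xF);
    Packed >>= 4;
  }
  return Out;
}
```

Because the bound is known at compile time, the storage lives entirely on the stack and no growth logic is needed.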
[AMDGPU] Clear no convergence flag on operand folding. NFCI (#179438)
Clear the flag when folding operands. Verification fails if the flag
is left set, because only convergent operations may carry the
NoConvergent flag. NFCI because this situation does not currently occur.
Reapply "[InstCombine] Always fold alignment assumptions into operand bundles (#177597)" (#179497)
Truncating at 32 bits is now avoided by removing a cast to `unsigned`.
This would also break at 64 bits (with a pointer size > 64 bits), but I
don't think LLVM supports such a thing.
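To illustrate the class of bug, here is a sketch that assumes a 32-bit `unsigned` and uses hypothetical helpers (`alignFromLog2`, `isPowerOfTwoTruncated`, `isPowerOfTwoFull`). It is not the InstCombine code itself:

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(unsigned) == 4, "sketch assumes a 32-bit unsigned");

// Build an alignment value from its log2.
uint64_t alignFromLog2(unsigned Log2) {
  assert(Log2 < 64 && "shift amount must fit in 64 bits");
  return uint64_t(1) << Log2;
}

// Buggy variant: the cast silently drops the upper 32 bits.
bool isPowerOfTwoTruncated(uint64_t Align) {
  unsigned Narrow = static_cast<unsigned>(Align);
  return Narrow != 0 && (Narrow & (Narrow - 1)) == 0;
}

// Fixed variant: keep the full 64-bit value.
bool isPowerOfTwoFull(uint64_t Align) {
  return Align != 0 && (Align & (Align - 1)) == 0;
}
```

An alignment of 2^32 becomes 0 after the cast, so the truncated check rejects a value that the full-width check accepts.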
This reverts commit bc7315749d6d16d0f162f816b3ec0ef7169615f2.
ARM: Avoid using isTarget wrappers around Triple predicates (#179512)
These are module level properties, and querying them through
a function-level subtarget context is confusing. Plus we don't
need an aliased name.
Continue change started in 91439817e8d19613ac6e25ca9abd5e7534a9d33b
[LoopUnroll] Fix block frequencies for newly unconditional latches
As another step in issue #135812, this patch fixes block frequencies
when LoopUnroll converts a conditional latch in an unrolled loop
iteration to unconditional. It thus includes complete loop unrolling
(the conditional backedge becomes an unconditional loop exit), which
might be applied to the original loop or to its remainder loop.
As explained in detail in the header comments on the
fixProbContradiction function that this patch introduces, these
conversions mean LoopUnroll has proven that the original uniform latch
probability is incorrect for the original loop iterations associated
with the converted latches. However, LoopUnroll often is able to
perform these corrections for only some iterations, leaving other
iterations with the original latch probability, and thus corrupting
the aggregate effect on the total frequency of the original loop body.
This patch ensures that the total frequency of the original loop body,
summed across all its occurrences in the unrolled loop after the
[27 lines not shown]
[InstCombine] fold icmp ne (and X, 1), 0 --> trunc X to i1 (#178977)
Remove the vector check so this fold is always done.
proof: https://alive2.llvm.org/ce/z/oabD6J
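The equivalence behind the fold can be spot-checked outside the compiler. The sketch below models `trunc X to i1` with a one-bit unsigned bit-field (an assumption of this illustration) and compares it against the `and`/`icmp` form:

```cpp
#include <cassert>
#include <cstdint>

// icmp ne (and X, 1), 0
bool viaAndCompare(uint32_t X) { return (X & 1u) != 0; }

// trunc X to i1, modelled with a one-bit unsigned bit-field: assigning
// an unsigned value keeps only its lowest bit.
bool viaTrunc(uint32_t X) {
  struct I1 {
    unsigned Bit : 1;
  } T{};
  T.Bit = X;
  return T.Bit != 0;
}

// Check that both forms agree for one input.
void checkFold(uint32_t X) { assert(viaAndCompare(X) == viaTrunc(X)); }
```

This is only a sanity check on scalar values; the Alive2 link above is the actual proof.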
closes #172888
Implement `ByteAddressBuffer` Load/Store methods (#176058)
Closes #108058.
This PR:
- Adds the `uint` `Load` and `Store` methods (`Load/Store`,
`Load2/Store2`, `Load3/Store3`, `Load4/Store4`) to the existing
`ByteAddressBuffer` objects
- Adds the new templated `Load` and `Store` methods to
`ByteAddressBuffer` objects, which allow types other than `uint` (e.g.
aggregate types) to be used with them directly
- One exception to this is array types, which are rejected by the
methods (as array returns will be disallowed in 202x)
- Adds the relevant `AST`, `CodeGenHLSL`, and `SemaHLSL` tests for these
methods
*Note: the `HLSL Tests` check is failing because this implementation
makes the `ByteAddressBuffer` tests XPASS. Will remove the XFAILs from
these tests in a follow-up.*
[clang][ssaf] Add FormatInfo sub-registry and tests [3/3]
Add `FormatInfoEntry` template to support per-analysis-type serialization
within a `SerializationFormat`.
This allows different formats to be implemented for the different
analyses in a decoupled way.
For testing, this patch also implements the MockSerializationFormat
demonstrating the FormatInfo sub-registry pattern.
Assisted-by: claude
[MLIR] Enforce symbol visibility during symbol lookup (#179370)
Update symbol resolution to examine whether a nested symbol being
resolved is private, and fail in that case. This ensures that we
maintain invariants on symbol visibility that we depend on in
optimisations.
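A simplified model of the rule, using a hypothetical `Symbol` tree and `lookup` helper rather than the MLIR API. It assumes the first path component is resolved in the current table, while any later component is a nested reference that must not be private:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical symbol with a nested symbol table (requires C++17 for
// the recursive std::vector member).
struct Symbol {
  std::string Name;
  bool IsPrivate = false;
  std::vector<Symbol> Nested;
};

// Find a direct child by name, or nullptr if there is none.
const Symbol *findChild(const Symbol &Table, const std::string &Name) {
  for (const Symbol &S : Table.Nested)
    if (S.Name == Name)
      return &S;
  return nullptr;
}

// Resolve a nested reference such as @a::@b. Fails (returns nullptr)
// if a component is missing or a nested component is private.
const Symbol *lookup(const Symbol &Root, const std::vector<std::string> &Path) {
  const Symbol *Cur = &Root;
  for (std::size_t I = 0; I < Path.size(); ++I) {
    const Symbol *Next = findChild(*Cur, Path[I]);
    if (!Next)
      return nullptr;
    if (I > 0 && Next->IsPrivate)
      return nullptr; // private symbols are not reachable from outside
    Cur = Next;
  }
  return Cur;
}
```

Failing the lookup, instead of returning the private symbol, is what lets later passes assume that a private symbol has no users outside its own table.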
[clang][ssaf] Add SerializationFormatRegistry [2/3]
Add a registry infrastructure for SerializationFormat implementations,
enabling registration and instantiation of different serialization formats.
For example:
```c++
static SerializationFormatRegistry::Add<MyFormat>
RegisterFormat("MyFormat", "Description");
```
Formats can then be instantiated by name using `makeFormat()`.
The patch also updates the SerializationFormat base class to accept
FileSystem and OutputBackend parameters for virtualizing I/O
operations.
Assisted-by: claude
[llvm][Support] Add InMemoryOutputBackend [1/3]
Add InMemoryOutputBackend, an output backend that creates files in
memory backed by string buffers in a map.
This is useful for unittests, where we don't want to create files on the
file system, but still want to check the content of the created files.
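The idea can be sketched with a standalone toy class. `ToyInMemoryBackend`, `createFile`, and `read` are hypothetical names for illustration and do not reflect the interface added by this patch:

```cpp
#include <map>
#include <string>

// Toy in-memory "file system": each path maps to a string buffer.
class ToyInMemoryBackend {
  std::map<std::string, std::string> Files;

public:
  // Create (or truncate) a file and return its buffer for writing.
  // References into a std::map stay valid as other files are added.
  std::string &createFile(const std::string &Path) {
    std::string &Buffer = Files[Path];
    Buffer.clear();
    return Buffer;
  }

  // Return the file content, or nullptr if the file was never created.
  const std::string *read(const std::string &Path) const {
    auto It = Files.find(Path);
    return It == Files.end() ? nullptr : &It->second;
  }
};
```

A unit test can then write through the backend and inspect the resulting content without touching the disk.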
Assisted-by: claude
ARM: Avoid using isTarget wrappers around Triple predicates
These are module level properties, and querying them through
a function-level subtarget context is confusing. Plus we don't
need an aliased name.
Continue change started in 91439817e8d19613ac6e25ca9abd5e7534a9d33b
[X86] mayFoldIntoVector - recognise larger than legal logic ops may fold to vectors (#179503)
Inspired by the hack to #174761: move the custom operation handling
inside mayFoldIntoVector, where we can more accurately predict ops that
can be moved to the vector unit.