[VPlan] Consistently use MinOrMax* in VPlanConstruction transforms (NFC)
Make sure variables/functions consistently use MinOrMax*, as suggested
in https://github.com/llvm/llvm-project/pull/170223. Split off from the
PR.
[clang][RISCV] Add big-endian RISC-V target support (#165599)
This patch covers only the frontend/clang changes while the ABI for BE
is still being worked out. Once the ABI is final, we will proceed with
the codegen changes.
This patch addresses several things:
- Define riscv32be/riscv64be target triples
- Set correct data layout for BE targets
- Handle BE-specific ABI details
- Emit a warning for the BE case, since it is still experimental
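As an illustration of the data layout item above: in an LLVM data layout string, the `e` specifier denotes little-endian and `E` denotes big-endian. The Python sketch below is not the patch's code. The helper name `to_big_endian_layout` is hypothetical, and the layout string is only an example input, not necessarily what the patch emits.

```python
def to_big_endian_layout(layout: str) -> str:
    """Return `layout` with its endianness specifier set to big-endian ('E').

    Hypothetical helper for illustration only: it rewrites the endianness
    component and leaves every other specifier untouched.
    """
    parts = layout.split("-")
    # Drop any existing endianness specifier, then put 'E' first.
    rest = [p for p in parts if p not in ("e", "E")]
    return "-".join(["E"] + rest)


example_le = "e-m:e-p:32:32-i64:64-n32-S128"  # example input only
example_be = to_big_endian_layout(example_le)
```

Only the first component changes; pointer sizes and alignments are independent of endianness in this model.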
[SelDag] Use BoolVT size when expanding find-last-active, if larger. (#175971)
On some targets, BoolVT may have been widened earlier. In those cases,
choosing StepVT to be smaller can cause crashes when widening the
mis-matched select. Without the fix, the new test
@extract_last_active_v4i32_penryn crashes when trying to widen.
It also improves codegen for other cases.
PR: https://github.com/llvm/llvm-project/pull/175971
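A rough model of the type choice described above, in Python. The function name, the 8-bit floor, and the power-of-two rounding are assumptions made for illustration, not the actual SelectionDAG code. The idea sketched is that the step element width is picked to hold the lane indices, then raised to the mask element width when that is larger, so both sides of the select widen the same way.

```python
def choose_step_bits(num_elements: int, bool_elt_bits: int) -> int:
    """Pick an element width (in bits) for the step vector.

    Hypothetical model: start from the smallest power-of-two width (at
    least 8 bits) that can hold every lane index, then use the mask
    element width instead if it is larger.
    """
    needed = max(1, (num_elements - 1).bit_length())
    step_bits = 8
    while step_bits < needed:
        step_bits *= 2
    return max(step_bits, bool_elt_bits)
```

For example, with 4 lanes and a mask element already widened to 32 bits, this model picks 32 rather than 8.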
[MemCpyOpt] keep src/dest alloca ordering (#176012)
Rather than testing dominance for every use, just check which of src or
dest comes first, and use that as the insert location. This minimizes
unnecessary dominator queries while also helping to preserve the order
of allocas (for better code readability / diff).
Extracted from PR optimization improvement series at
https://github.com/llvm/llvm-project/pull/150792
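A minimal sketch of the "whichever comes first" rule, assuming both allocas live in the same block. The list of instruction names and the function `pick_insert_point` are hypothetical stand-ins, not MemCpyOpt's data structures.

```python
def pick_insert_point(block: list, src: str, dest: str) -> str:
    """Return whichever of `src` or `dest` appears first in `block`.

    `block` is a hypothetical ordered list of instruction names standing
    in for a basic block; one linear scan replaces per-use dominance checks.
    """
    for inst in block:
        if inst in (src, dest):
            return inst
    raise ValueError("neither alloca is in the block")
```

Because the earlier of the two is reused as the insert location, the relative order of the surviving allocas is unchanged in this model.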
[MemCpyOpt] allow memcpy-to-memcpy optimization with smaller dest than src (#176010)
Resize the alloca if needed to a common size, as long as the dest was
still fully initialized by the copy.
Extracted from PR optimization improvement series at
https://github.com/llvm/llvm-project/pull/150792 (including all test
additions from there as well)
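One possible reading of the size rule above, as a Python sketch. The function name, the parameters, and the choice of `max` as the "common size" are assumptions for illustration; the actual legality checks in MemCpyOpt may differ.

```python
from typing import Optional


def common_alloca_size(src_size: int, dest_size: int, copy_len: int) -> Optional[int]:
    """Return the size to resize to, or None if the transform is skipped.

    Hypothetical model: the transform is allowed only when the copy
    initialized every byte of dest (copy_len >= dest_size); the common
    size is then taken to be the larger of the two alloca sizes.
    """
    if copy_len < dest_size:
        return None  # dest only partially initialized by the copy
    return max(src_size, dest_size)
```

Under these assumptions, a 16-byte src copied into an 8-byte dest with an 8-byte memcpy yields a common size of 16.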
[Metadata][profcheck] Handle identical MDNodes in getMergedProfMetadata (#175701)
This fixes a bug where !prof metadata was dropped from SelectInsts when GVN simplified/merged them. Guarded by -profcheck-disable-metadata-fixes. Exposed by the tests in Transforms/SampleProfile.
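The identical-node case can be pictured with the Python sketch below. `merge_prof` is a hypothetical model, not the LLVM function: `None` stands for "no metadata", and every non-identical case is elided here by conservatively dropping the metadata.

```python
def merge_prof(a, b):
    """Hypothetical model of merging two !prof attachments."""
    if a is None or b is None:
        return None  # nothing to keep if either side has no metadata
    if a is b:
        return a  # identical node: keep it instead of dropping it
    return None  # other merge cases are outside this sketch


node = ("branch_weights", 1, 99)  # made-up weights, for illustration only
```

Merging `node` with itself returns `node` unchanged in this model, which is the behavior the fix is after.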
[NFC][SystemZ] Update insert() API of the AssociatedDataAreaTable class
This patch updates the insert() functions of the AssociatedDataAreaTable
class to return a pair of <const MCSymbol *, uint32_t> instead of just a
uint32_t. Including the MCSymbol in the return value is needed by
subsequent patches.
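The shape of the new return value can be sketched in Python as below. `DataAreaTable`, the 8-byte entry size, and the choice to return the inserted symbol itself are all assumptions for illustration; the real class is C++ and which MCSymbol it returns is not stated in this message.

```python
class DataAreaTable:
    """Hypothetical stand-in for a table mapping symbols to byte offsets."""

    def __init__(self, entry_size: int = 8):
        self._offsets = {}
        self._entry_size = entry_size
        self._next_offset = 0

    def insert(self, symbol: str):
        """Insert `symbol` if new and return (symbol, byte offset)."""
        if symbol not in self._offsets:
            self._offsets[symbol] = self._next_offset
            self._next_offset += self._entry_size
        return symbol, self._offsets[symbol]


table = DataAreaTable()
```

Callers that only need the offset can keep unpacking the second element, while later code can also use the first.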
[AMDGPU] Programmatically port old `.def` clang builtins to `.td` (#175873)
Summary:
This PR ports the old `.def` builtins to the new Tablegen interface.
This required a few changes in the handler, namely there is a real
meaning to `AS(0)` right now, not just in SPIR-V but when the type
parser expects it. The conversion here should be 1-to-1.
Some more work could be done to reduce the amount of repetition by
grouping all the instructions together, I'll leave that up to whether or
not anyone cares.
This was done with a hastily made Python script that likely will not
work for the other files, but will successfully update this PR. Putting
here in case someone wants to use it.
https://gist.github.com/jhuber6/d524c65c0da3adae5afd2ad160589537
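For readers unfamiliar with the input side of such a port, here is a small Python sketch that splits a `.def` line of the `BUILTIN(...)` / `TARGET_BUILTIN(...)` form into its fields. It is not the linked script, it does not emit any `.td` output, and the builtin name and feature string in the example are made up.

```python
import re

# Matches BUILTIN(name, "proto", "attrs") and
# TARGET_BUILTIN(name, "proto", "attrs", "feature").
_BUILTIN_RE = re.compile(
    r'^\s*(BUILTIN|TARGET_BUILTIN)\(\s*(\w+)\s*,\s*"([^"]*)"\s*,\s*"([^"]*)"'
    r'(?:\s*,\s*"([^"]*)")?\s*\)'
)


def parse_def_line(line: str):
    """Return the fields of one builtin entry, or None for other lines."""
    m = _BUILTIN_RE.match(line)
    if m is None:
        return None
    macro, name, proto, attrs, feature = m.groups()
    return {
        "macro": macro,
        "name": name,
        "prototype": proto,
        "attributes": attrs,
        "feature": feature,  # None when the macro has no feature field
    }


example = parse_def_line(
    'TARGET_BUILTIN(__builtin_example_op, "ii", "nc", "example-feature")'
)
```

Decoding the prototype string into types is the harder part of a real conversion and is deliberately left out here.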
[VPlan] Strip phi operand from compute-reduction-result comments (NFC).
After d5c11b9a24c84f1, compute-reduction-result does not have the
reduction phi recipe as operand. Update stale comments pointed out
independently in https://github.com/llvm/llvm-project/pull/175461.
InstCombine: Implement SimplifyDemandedFPClass for fma
This can't do much filtering on the sources, except for NaNs.
We can also attempt to introduce ninf/nnan.
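The NaN part can be sketched with a simplified bitmask model in Python. The class constants and `demanded_source_classes` are hypothetical (LLVM's real FP class mask is finer-grained and distinguishes signs), and only the NaN reasoning is modeled: since a NaN operand makes fma produce NaN, a user that never demands a NaN result lets us ignore NaN in the sources.

```python
# Simplified, hypothetical FP class bits (not LLVM's actual encoding).
NAN, INF, NORMAL, SUBNORMAL, ZERO = 1, 2, 4, 8, 16
ALL = NAN | INF | NORMAL | SUBNORMAL | ZERO


def demanded_source_classes(demanded_result: int) -> int:
    """Classes that still matter for each fma operand in this model."""
    if not (demanded_result & NAN):
        # A NaN operand would give a NaN result, which nobody demands.
        return ALL & ~NAN
    # Otherwise no filtering is attempted in this sketch.
    return ALL
```

Other classes are not filtered because, for example, finite operands can still overflow to infinity or cancel to zero.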