[MLIR][BufferResultsToOutParamsPass] Add Option to Modify Public Function's Signature (#167248)
Since https://github.com/llvm/llvm-project/pull/162441,
`buffer-results-to-out-params` transforms `private` functions only.
But, as mentioned in
https://github.com/llvm/llvm-project/pull/162441#issuecomment-3404195242,
this is a breaking change for pipelines handling C code. Our pipeline
at @EfficientComputer is also affected by this breaking change.
Therefore, this PR adds an opt-in flag to allow `public` functions to be
transformed by `BufferResultsToOutParamsPass`.
DAG: Add AssertNoFPClass from call return attributes
This defends against regressions in future patches. This excludes
the target intrinsic case for now; I'm worried introducing an intermediate
AssertNoFPClass is likely to break combines.
[clang-tidy][NFC] Enable `performance-unnecessary-value-param` in the codebase (#163686)
Closes #156156.
In a few cases, instead of just applying the fix-it and making
parameters const references to owning type, I refactored them to be
non-owning types.
[Hexagon] Implement isMaskAndCmp0FoldingBeneficial (#166891)
Sink `and` mask to `cmp` use block if it is masking a single bit since
this will fold the `and/cmp/br` into a single `tstbit` instruction.
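The condition "masking a single bit" can be illustrated with a minimal sketch. The helper names below (`isSingleBitMask`, `classify`) are hypothetical and are not part of the Hexagon backend; the code only shows the source-level shape of the `and/cmp/br` pattern and a power-of-two check of the mask.

```cpp
#include <cstdint>

// Hypothetical helper (not the actual target hook): true when the mask has
// exactly one bit set, i.e. the and/cmp pair tests a single bit.
constexpr bool isSingleBitMask(uint64_t Mask) {
  return Mask != 0 && (Mask & (Mask - 1)) == 0;
}

// Source-level shape of the pattern: mask one bit, compare against zero,
// then branch on the result.
inline int classify(uint32_t Flags) {
  if ((Flags & 0x10u) != 0) // single-bit mask: candidate for one bit test
    return 1;
  return 0;
}

static_assert(isSingleBitMask(0x10), "0x10 has exactly one bit set");
static_assert(!isSingleBitMask(0x30), "0x30 has two bits set");
```

A mask such as `0x30` would not qualify under this check, since it tests two bits at once.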
[Flang] Add parser support for prefetch directive (#139702)
Implementation details:
* Recognize prefetch directive in the parser as `!dir$ prefetch ...`
* Unparse the prefetch directive
* Add required tests
Details on the prefetch directive:
`!dir$ prefetch designator[, designator]...`, where the designator list
can be a variable or an array reference. This directive is used to
insert a hint to the code generator to prefetch instructions for
memory references.
[llvm][RISCV] Support Zvfbfa codegen for fneg, fabs and copysign (#166944)
This is the first patch for Zvfbfa codegen; I'm going to break the work down
into several patches to make it easier to review.
The codegen supports both scalable and fixed-length vectors for both
native operations and vp intrinsics.
[LoongArch] Initial implementation for `enableMemCmpExpansion` hook (#166526)
After overriding `TargetTransformInfo::enableMemCmpExpansion` in this
commit, `MergeICmps` and `ExpandMemCmp` passes will be enabled on
LoongArch.
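As a rough illustration of what memcmp expansion means at the source level, the sketch below compares a fixed 4-byte region once through the library call and once through a single 32-bit load and compare. The function names are hypothetical, and the actual expansion is done on LLVM IR with target-specific size limits that are not shown here.

```cpp
#include <cstdint>
#include <cstring>

// Library form: an equality-only memcmp over a small constant size.
inline bool equal4Library(const void *A, const void *B) {
  return std::memcmp(A, B, 4) == 0;
}

// Expanded form (illustrative): load both sides as 32-bit integers and
// compare them directly. memcpy is used for the loads to avoid alignment
// and aliasing problems.
inline bool equal4Expanded(const void *A, const void *B) {
  uint32_t X, Y;
  std::memcpy(&X, A, sizeof X);
  std::memcpy(&Y, B, sizeof Y);
  return X == Y;
}
```

Both functions return the same result for any pair of readable 4-byte regions; only the equality case is shown, since ordering comparisons additionally depend on byte order.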
[Github] Make Windows container use zstd (#167022)
This enables much faster image unpack times. We benchmarked 20-30%
improvements when testing this initially. Use skopeo to copy the image
as it just works over the docker-archive/OCI container formats and does
not need to unpack the image to upload it.
[Github] Update PR labeller to v6.0.1 (#167246)
This was reverted earlier due to me not realizing that the config format
also changed. This patch updates the config to match the new format and
bumps the version.
[mlir][tosa] Fix crash in `tosa.concat` verifier (#165966)
The `tosa.concat` verifier crashed when the output rank did not match
the input rank. This PR adds a proper check and error emission to
prevent the crash. Fixes #159742.
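The kind of check described can be sketched in plain C++ without MLIR. This is an assumption-laden illustration, not the actual `tosa.concat` verifier: `Shape` and `verifyConcatRanks` are hypothetical names, and the real verifier reports errors through the op's diagnostics rather than returning a string.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

using Shape = std::vector<int64_t>;

// Hypothetical stand-in for a verifier step: returns an error message when
// an input rank differs from the output rank, or std::nullopt when all
// ranks agree. Checking ranks first keeps later per-dimension code from
// indexing past the end of a shorter shape.
inline std::optional<std::string>
verifyConcatRanks(const std::vector<Shape> &Inputs, const Shape &Output) {
  for (std::size_t I = 0; I < Inputs.size(); ++I) {
    if (Inputs[I].size() != Output.size())
      return "input " + std::to_string(I) + " has rank " +
             std::to_string(Inputs[I].size()) + " but output has rank " +
             std::to_string(Output.size());
  }
  return std::nullopt;
}
```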
[LoongArch] Initial implementation for `enableMemCmpExpansion` hook
After overriding `TargetTransformInfo::enableMemCmpExpansion`
in this commit, `MergeICmps` and `ExpandMemCmp` passes will be
enabled on LoongArch.
[VPlan] Use VPInstructionWithType for casts in VPlan0. (NFC)
Use VPInstructionWithType for casts in VPlan0, to enable additional
analysis/transforms on VPlan0, and more accurate modeling in VPlan0.
[InstCombine] Don't sink if it would require dropping deref assumptions. (#166945)
Currently, sinking in InstCombine drops assumes that would otherwise
prevent sinking. Removing dereferenceable assumptions early on can
inhibit vectorization of early-exit loops in practice.
Special-case dereferenceable assumptions so that they block sinking. This
can be combined with a separate change to drop dereferenceable
assumptions after vectorization: https://clang.godbolt.org/z/jGqcx3sbs
PR: https://github.com/llvm/llvm-project/pull/166945
[BOLT] Support restartable sequences in tcmalloc (#167195)
Add `RSeqRewriter` to detect code references from the `__rseq_cs` section
and ignore functions referenced from that section. Code references are
detected via relocations (static or dynamic).
Note that the abort handler is preceded by a 4-byte signature, and we
cannot relocate the handler without that signature; otherwise the
application may crash. Thus we ignore the function, i.e. we make sure
it is not separated from its signature.