[LegalizeDAG] Remove unnecessary EVT->MVT->EVT conversion. NFC (#173707)
There doesn't appear to be any reason to use MVT here. All of the uses
expect an EVT.
[AMDGPU] add clamp immediate operand to WMMA iu8 intrinsic (#171069)
Fixes #166989
- Adds a clamp immediate operand to the AMDGPU WMMA iu8 intrinsic and
threads it through LLVM IR, MIR lowering, Clang builtins/tests, and MLIR
ROCDL dialect so all layers agree on the new operand
- Updates AMDGPUWmmaIntrinsicModsAB so the clamp attribute is emitted,
teaches VOP3P encoding to accept the immediate, and adjusts Clang
codegen/builtin headers plus MLIR op definitions and tests to match
- Documents what the WMMA clamp operand does
- Implements bitcode AutoUpgrade for source compatibility of the WMMA iu8
intrinsic
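As a rough illustration of the clamp operand, here is a minimal Python sketch. It assumes that clamp means saturating the 32-bit accumulator result to the signed i32 range instead of letting it wrap. That reading, and the helper names `wrap_i32` and `dot_accumulate`, are assumptions made for illustration and are not taken from this patch.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def wrap_i32(x):
    """Reduce an unbounded integer to two's-complement i32 (wrapping)."""
    return (x + 2**31) % 2**32 - 2**31


def dot_accumulate(a, b, acc, clamp):
    """One accumulator lane: acc + dot(a, b) over small integer inputs.

    Assumption: with clamp set the result saturates to the i32 range,
    otherwise it wraps. This models the idea only, not the hardware.
    """
    total = acc + sum(x * y for x, y in zip(a, b))
    if clamp:
        return max(INT32_MIN, min(INT32_MAX, total))
    return wrap_i32(total)


# An accumulator already at INT32_MAX overflows on the next update.
saturated = dot_accumulate([127] * 4, [127] * 4, INT32_MAX, clamp=True)
wrapped = dot_accumulate([127] * 4, [127] * 4, INT32_MAX, clamp=False)
```

Under these assumptions `saturated` stays at `INT32_MAX`, while `wrapped` becomes a large negative value.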
Possible future enhancements:
- infer clamping as an optimization fold based on the use context
---------
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
[VPlan] Skip phi recipes in tryToBuildVPlan (NFC).
No phi recipes are being transformed in the main loop any longer, so
skip phi recipes.
This also makes explicit which recipes need to be skipped: those that
have already been transformed.
Follow-up to post-commit comment in
https://github.com/llvm/llvm-project/pull/168291.
[clang-tidy] Add C support to `misc-use-internal-linkage` (#173196)
Right now, this check simply doesn't work in C, because we exclude
anything that `isExternC` from analysis (in C, everything `isExternC`).
Besides that, the docs and diagnostic message talk about anonymous
namespaces, which don't exist in C (this was noted in #97969, I'm just
summarizing).
The existing tests use abbreviated `// CHECK-MESSAGES` assertions (e.g.
`// CHECK-MESSAGES: :[[@LINE-1]]:16: warning: function 'cxf'`), but I've
expanded them out. Yes, it's verbose, but now that the diagnostic
message has an important difference between C and C++, I feel it's
important that we test it.
[SPIRV] Implement lowering for llvm.matrix.transpose and llvm.matrix.multiply (#172050)
This patch implements the lowering for the llvm.matrix.transpose and
llvm.matrix.multiply intrinsics in the SPIR-V backend.
- llvm.matrix.transpose is lowered to a G_SHUFFLE_VECTOR with a
mask calculated to transpose the elements.
- llvm.matrix.multiply is lowered by decomposing the operation into
dot products of rows and columns:
- Rows and columns are extracted using G_UNMERGE_VALUES or shuffles.
- Dot products are computed using OpDot for floating point vectors
or standard arithmetic for scalars/integers.
- The result is reconstructed using G_BUILD_VECTOR.
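The two lowerings above can be sketched in plain Python. This is a minimal model, assuming the flattened matrices use the column-major layout that the LLVM matrix intrinsics default to. The function names are hypothetical, and the exact mask and instruction sequence the backend emits are not shown in this message.

```python
def transpose_mask(rows, cols):
    """Shuffle mask that transposes a rows x cols column-major matrix.

    Result element (i, j) of the cols x rows output reads source
    element (j, i), which lives at flat index i * rows + j.
    """
    mask = []
    for j in range(rows):      # column j of the result
        for i in range(cols):  # row i of the result
            mask.append(i * rows + j)
    return mask


def matmul_by_dots(a, b, m, k, n):
    """Multiply an m x k matrix by a k x n matrix, both column-major.

    Mirrors the decomposition above: extract rows of A and columns
    of B, take dot products, then rebuild the flat result vector.
    """
    rows_of_a = [[a[i + kk * m] for kk in range(k)] for i in range(m)]
    cols_of_b = [b[j * k:(j + 1) * k] for j in range(n)]
    return [
        sum(x * y for x, y in zip(rows_of_a[i], cols_of_b[j]))
        for j in range(n)
        for i in range(m)
    ]


mask = transpose_mask(2, 3)
product = matmul_by_dots([1, 2, 3, 4], [5, 6, 7, 8], 2, 2, 2)
```

For a 2x3 input the mask is `[0, 2, 4, 1, 3, 5]`, and the 2x2 product above is `[23, 34, 31, 46]` in column-major order.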
This change also updates SPIRVPostLegalizer to improve type deduction
for G_UNMERGE_VALUES, enabling correct type assignment for the
intermediate virtual registers generated during lowering.
New tests are added to verify support for various matrix sizes and
element types (float and int).
[lit] Disable ulimit-nodarwin test on FreeBSD (#173155)
FreeBSD does not support using ulimit to raise the maximum number of
open files per process. Darwin inherited this behavior, so skip this
test on FreeBSD as well.
[mlir][dataflow] Fix DataFlowFramework crash by adding isBlockEnd logic to ProgramPoint::print (#173471)
Running -test-dead-code-analysis -debug on the IR below triggers an
assert in the data-flow analysis framework; see
https://github.com/llvm/llvm-project/blob/2d6b1b174194198498eb10ae811632b3dd945ecf/mlir/include/mlir/Analysis/DataFlowFramework.h#L110
Fix the crash by adding isBlockEnd handling to ProgramPoint::print.
```
func.func @trs(%idx1: index, %idx2: index, %s: f32) {
  scf.parallel (%i) = (%idx1) to (%idx2) step (%idx2) {
    %r = memref.alloca() : memref<10xf32>
    scf.forall (%e2) in (%idx2) {
      %a = memref.load %r[%idx2] : memref<10xf32>
    }
  }
  return
}
```
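The shape of the fix can be sketched with a toy model in Python. The `ProgramPoint` class below is a hypothetical stand-in, not MLIR's API: it assumes a program point is a position inside a block's operation list, so the position one past the last operation has no operation to print and needs its own branch.

```python
class ProgramPoint:
    """Toy stand-in: a position `index` within a block's operation list."""

    def __init__(self, block_name, ops, index):
        self.block_name = block_name
        self.ops = ops
        self.index = index

    def is_block_end(self):
        # One past the last operation: there is no op at this position.
        return self.index == len(self.ops)

    def describe(self):
        # Check the end-of-block case first; indexing self.ops here
        # would fail, which is the kind of crash the fix avoids.
        if self.is_block_end():
            return f"<after end of block {self.block_name}>"
        return f"<before operation {self.ops[self.index]}>"


ops = ["memref.alloca", "scf.forall"]
middle = ProgramPoint("bb0", ops, 1).describe()
end = ProgramPoint("bb0", ops, 2).describe()
```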
[Github][CI] Trigger `code-lint` for clang-tidy documentation (#173700)
Previously we added `doc8` to the `code-lint` workflow. However, PRs that
contain only documentation changes do not trigger this workflow.
An example: https://github.com/llvm/llvm-project/pull/173699/checks
didn't trigger `code-lint`.
This commit fixes the issue.
[LoopVectorize] Support vectorization of frexp intrinsic (#172957)
This patch enables the vectorization of the llvm.frexp intrinsic.
Following the suggestion in #112408, frexp is moved from
isTriviallyScalarizable to isTriviallyVectorizable.
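For readers unfamiliar with frexp, the sketch below models the lane-wise behaviour a widened frexp is expected to have: each lane is split independently into a fraction and an integer exponent. It uses Python's `math.frexp` as a stand-in for the intrinsic, and the helper name `frexp_lanes` is hypothetical.

```python
import math


def frexp_lanes(values):
    """Apply frexp to every lane and return (fractions, exponents).

    Each lane satisfies value == fraction * 2**exponent, with the
    magnitude of a non-zero fraction in [0.5, 1).
    """
    fractions, exponents = [], []
    for v in values:
        frac, exp = math.frexp(v)
        fractions.append(frac)
        exponents.append(exp)
    return fractions, exponents


fractions, exponents = frexp_lanes([8.0, 0.75, -3.0, 0.0])
# Rebuild the inputs to check the decomposition lane by lane.
rebuilt = [math.ldexp(f, e) for f, e in zip(fractions, exponents)]
```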
Fixes #112408
[VPlan] Support extends and truncs in getSCEVExprForVPValue. (NFCI)
Handle extends and truncates in getSCEVExprForVPValue. This enables
computing SCEVs in more cases in the VPlan-based cost model, but the
computed costs should still match in all cases.