[SPIRV] Implement lowering for llvm.matrix.transpose and llvm.matrix.multiply (#172050)
This patch implements the lowering for the llvm.matrix.transpose and
llvm.matrix.multiply intrinsics in the SPIR-V backend.
- llvm.matrix.transpose is lowered to a G_SHUFFLE_VECTOR with a
  mask calculated to transpose the elements.
- llvm.matrix.multiply is lowered by decomposing the operation into
  dot products of rows and columns:
  - Rows and columns are extracted using G_UNMERGE_VALUES or shuffles.
  - Dot products are computed using OpDot for floating point vectors
    or standard arithmetic for scalars/integers.
  - The result is reconstructed using G_BUILD_VECTOR.
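As an illustration of the transpose mask, here is a minimal standalone
sketch. It assumes the column-major layout that the LLVM matrix
intrinsics use by default. The helper names (buildTransposeMask,
applyMask) are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a shuffle mask that transposes a
// Rows x Cols matrix stored column-major in a flat vector.
// Output slot R * Cols + C (element (C, R) of the Cols x Rows result)
// reads input slot C * Rows + R (element (R, C) of the input).
std::vector<int> buildTransposeMask(unsigned Rows, unsigned Cols) {
  assert(Rows > 0 && Cols > 0 && "matrix dimensions must be non-zero");
  std::vector<int> Mask(static_cast<std::size_t>(Rows) * Cols);
  for (unsigned R = 0; R < Rows; ++R)
    for (unsigned C = 0; C < Cols; ++C)
      Mask[R * Cols + C] = static_cast<int>(C * Rows + R);
  return Mask;
}

// Applies a mask the way a single-input vector shuffle would:
// Out[I] = In[Mask[I]].
std::vector<float> applyMask(const std::vector<float> &In,
                             const std::vector<int> &Mask) {
  std::vector<float> Out;
  Out.reserve(Mask.size());
  for (int Index : Mask)
    Out.push_back(In[static_cast<std::size_t>(Index)]);
  return Out;
}
```

For a 2x3 input, the mask gathers row 0 of the input first and then
row 1, which is exactly the column order of the 3x2 result.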
This change also updates SPIRVPostLegalizer to improve type deduction
for G_UNMERGE_VALUES, enabling correct type assignment for the
intermediate virtual registers generated during lowering.
New tests are added to verify support for various matrix sizes and
element types (float and int).
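The multiply decomposition described in this commit can be sketched in
plain C++ as follows. This is only a model of the row/column dot-product
scheme on column-major data, not the GlobalISel code from the patch. All
function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// Row R of a Rows x Cols column-major matrix: a strided gather.
Vec extractRow(const Vec &M, std::size_t Rows, std::size_t Cols,
               std::size_t R) {
  Vec Row(Cols);
  for (std::size_t C = 0; C < Cols; ++C)
    Row[C] = M[C * Rows + R];
  return Row;
}

// Column C of a column-major matrix with Rows rows: a contiguous slice.
Vec extractColumn(const Vec &M, std::size_t Rows, std::size_t C) {
  Vec Col(Rows);
  for (std::size_t R = 0; R < Rows; ++R)
    Col[R] = M[C * Rows + R];
  return Col;
}

// Dot product of two equally sized vectors.
float dot(const Vec &A, const Vec &B) {
  assert(A.size() == B.size() && "operands must have the same length");
  float Sum = 0.0f;
  for (std::size_t I = 0; I < A.size(); ++I)
    Sum += A[I] * B[I];
  return Sum;
}

// A is M x K, B is K x N, the result is M x N; all column-major.
Vec multiplyColumnMajor(const Vec &A, const Vec &B, std::size_t M,
                        std::size_t K, std::size_t N) {
  assert(A.size() == M * K && B.size() == K * N && "shape mismatch");
  Vec Result(M * N);
  for (std::size_t C = 0; C < N; ++C) {
    Vec Col = extractColumn(B, K, C);
    for (std::size_t R = 0; R < M; ++R)
      Result[C * M + R] = dot(extractRow(A, M, K, R), Col);
  }
  return Result;
}
```

Columns are contiguous slices while rows need a strided gather, which is
one plausible reason the lowering mixes G_UNMERGE_VALUES and shuffles.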
[lit] Disable ulimit-nodarwin test on FreeBSD (#173155)
FreeBSD does not support using ulimit to raise the maximum number of
open files per process. Darwin inherited this behavior from FreeBSD, so
disable this test on FreeBSD as well.
[mlir][dataflow] Fix DataFlowFramework crash by adding isBlockEnd logic to ProgramPoint::print (#173471)
Running -test-dead-code-analysis -debug on the following IR triggers an
assertion in the data-flow analysis framework; see
https://github.com/llvm/llvm-project/blob/2d6b1b174194198498eb10ae811632b3dd945ecf/mlir/include/mlir/Analysis/DataFlowFramework.h#L110
Fix the crash by adding isBlockEnd handling to ProgramPoint::print.
```
func.func @trs(%idx1: index, %idx2: index, %s: f32) {
scf.parallel (%i) = (%idx1) to (%idx2) step (%idx2) {
%r = memref.alloca() : memref<10xf32>
scf.forall (%e2) in (%idx2) {
%a = memref.load %r[%idx2] : memref<10xf32>
}
}
return
}
```
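The shape of the fix can be modeled with a small standalone sketch. The
types below (ToyBlock, ToyProgramPoint) are hypothetical stand-ins, not
MLIR's classes. They only show why a printer has to check for the
end-of-block position before it names an operation.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a block: just a list of operation names.
struct ToyBlock {
  std::vector<std::string> Ops;
};

// Toy stand-in for a program point: a position inside a block.
// Index == Ops.size() means "end of block", where no operation exists.
struct ToyProgramPoint {
  const ToyBlock *Block;
  std::size_t Index;

  bool isBlockEnd() const { return Index == Block->Ops.size(); }

  void print(std::ostream &OS) const {
    // Without this check, the lookup below would read past the last op.
    if (isBlockEnd()) {
      OS << "<end of block>";
      return;
    }
    OS << "<before " << Block->Ops[Index] << ">";
  }
};

// Convenience wrapper that renders a point to a string.
std::string toString(const ToyProgramPoint &Point) {
  std::ostringstream OS;
  Point.print(OS);
  return OS.str();
}
```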
[Github][CI] Trigger `code-lint` for clang-tidy documentation (#173700)
Previously we added `doc8` to the `code-lint` workflow. However, PRs that
contain only documentation changes do not trigger this workflow.
For example, https://github.com/llvm/llvm-project/pull/173699/checks
did not trigger `code-lint`.
This commit fixes the issue.
[LoopVectorize] Support vectorization of frexp intrinsic (#172957)
This patch enables the vectorization of the llvm.frexp intrinsic.
Following the suggestion in #112408, frexp is moved from
isTriviallyScalarizable to isTriviallyVectorizable.
Fixes #112408
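For illustration, a loop of the following shape is the kind of source
that could benefit. Whether a given front end lowers std::frexp to
llvm.frexp, and whether the loop is then vectorized, depends on the
compiler, target, and flags, so treat this as an assumption. The
function name is hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Splits each input into a mantissa in [0.5, 1) and a power-of-two
// exponent such that In[I] == Mant[I] * 2^Exp[I] (zero maps to 0, 0).
void splitMantissaExponent(const float *In, float *Mant, int *Exp,
                           std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Mant[I] = std::frexp(In[I], &Exp[I]);
}
```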
[VPlan] Support extends and truncs in getSCEVExprForVPValue. (NFCI)
Handle extends and truncates in getSCEVExprForVPValue. This enables
computing SCEVs in more cases in the VPlan-based cost model, but the
computed costs should still match in all cases.
[MemProf] Fix reporting with -memprof-matching-cold-threshold (#173327)
With the -memprof-matching-cold-threshold option, we hint as cold
allocations where the fraction of cold bytes is at least the given
threshold. However, we were incorrectly reporting all of the
allocation's contexts and bytes as hinted cold.
Fix this to report the non-cold contexts as ignored. To do this,
refactor out some existing reporting, and also keep track of the
original allocation type for each context in the Trie along with its
ContextTotalSize information. Most of the diff is the resulting change
to this array's type and name.
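The intended reporting can be sketched as follows. This is a simplified
model, not the MemProf implementation. It assumes the threshold is a
percentage, and every type and function name here is hypothetical.

```cpp
#include <cstdint>
#include <vector>

enum class AllocType { NotCold, Cold };
enum class Report { HintedCold, Ignored, Unchanged };

// One profiled context of an allocation, with its original type kept
// alongside its size so the report can tell cold from non-cold bytes.
struct Context {
  AllocType Type;
  std::uint64_t TotalSize;
};

// If the cold fraction reaches the threshold, the allocation is hinted
// cold: cold contexts are reported as hinted cold and the remaining
// contexts as ignored. Otherwise nothing is reclassified.
std::vector<Report> classifyContexts(const std::vector<Context> &Contexts,
                                     unsigned ThresholdPercent) {
  std::uint64_t ColdBytes = 0, TotalBytes = 0;
  for (const Context &C : Contexts) {
    TotalBytes += C.TotalSize;
    if (C.Type == AllocType::Cold)
      ColdBytes += C.TotalSize;
  }
  bool HintCold =
      TotalBytes > 0 && ColdBytes * 100 >= TotalBytes * ThresholdPercent;

  std::vector<Report> Reports;
  Reports.reserve(Contexts.size());
  for (const Context &C : Contexts) {
    if (!HintCold)
      Reports.push_back(Report::Unchanged);
    else
      Reports.push_back(C.Type == AllocType::Cold ? Report::HintedCold
                                                  : Report::Ignored);
  }
  return Reports;
}
```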
[lldb-dap] Migrate stackTrace request to structured types (#173226)
This patch finishes migration to structured types and removes
`LegacyRequestHandler`.
[SLP] Recalculate dependencies for all cleared entries
Recalculate the dependencies for all cleared entries to avoid a crash
when an entry is used in other vector nodes.
Fixes #173469