[Reland][AMDGPU][GlobalISel] Add register bank legalization for buffer_load byte and short (#172065)
This patch adds register bank legalization support for buffer load byte
and short operations in the AMDGPU GlobalISel pipeline.
This is a re-land of #167798. I have fixed the failing test
/CodeGen/AMDGPU/GlobalISel/buffer-load-byte-short.ll
AMDGPU: Stop requiring afn for f32 rsq formation
We were checking for afn or !fpmath attached to the sqrt. We
are not trying to replace a correctly rounded rsqrt; we're replacing
two correctly rounded operations with the contracted operation.
The net result is better precision, so contract on both instructions
should be sufficient. Both the contracted and uncontracted sequences pass
the OpenCL conformance test, and the contracted one has a lower maximum error.
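The point that two individually rounded operations are not the same as a correctly rounded rsqrt can be illustrated with a small host-side sketch. This is only an illustration under stated assumptions: the helper names (`ulpDistance`, `rsqTwoSteps`, `rsqReference`, `maxUlpGap`) are hypothetical, the reference is computed in double and rounded once to float, and nothing here models the AMDGPU rsq instruction or the actual combine.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Distance in float ULPs between two finite, positive floats.
// For positive IEEE-754 floats the bit patterns are ordered like the values.
inline std::int64_t ulpDistance(float a, float b) {
  assert(a > 0.0f && b > 0.0f && std::isfinite(a) && std::isfinite(b));
  std::uint32_t ia, ib;
  std::memcpy(&ia, &a, sizeof ia);
  std::memcpy(&ib, &b, sizeof ib);
  return std::llabs(static_cast<std::int64_t>(ia) -
                    static_cast<std::int64_t>(ib));
}

// Uncontracted sequence: sqrt rounded to float, then the division rounded
// to float again (two roundings).
inline float rsqTwoSteps(float x) {
  float s = std::sqrt(x);
  return 1.0f / s;
}

// Reference (assumption): evaluate in double, then round once to float.
inline float rsqReference(float x) {
  return static_cast<float>(1.0 / std::sqrt(static_cast<double>(x)));
}

// Largest ULP gap between the two-step result and the reference over
// evenly spaced samples in [lo, hi]; lo must be positive.
inline std::int64_t maxUlpGap(float lo, float hi, int steps) {
  std::int64_t worst = 0;
  for (int i = 0; i <= steps; ++i) {
    float x = lo + (hi - lo) * static_cast<float>(i) / static_cast<float>(steps);
    std::int64_t d = ulpDistance(rsqTwoSteps(x), rsqReference(x));
    if (d > worst)
      worst = d;
  }
  return worst;
}
```

Calling `maxUlpGap(1.0f, 4.0f, 1000)` reports how far the two-step sequence drifts from the single-rounding reference on that range; the gap is small but need not be zero.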
[CIR] Support wide string literals in CIR codegen (#171541)
This PR migrates support for wide string literals from the incubator to
upstream.
## Changes
- Implement wide string literal support in
`getConstantArrayFromStringLiteral`
- Handle wchar_t, char16_t, and char32_t string literals
- Collect code units and create constant arrays with IntAttr elements
- Use ZeroAttr for null-filled strings
## Testing
- Copied `wide-string.cpp` test file from incubator
- Expanded test to include wchar_t test cases (incubator only had
char16_t and char32_t)
- All tests pass
[3 lines not shown]
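As a rough illustration of the "collect code units" step described above, the following standalone sketch gathers the code units of a wide literal into integers and detects the all-zero case. It does not use any CIR or Clang API: `collectCodeUnits` and `isAllZero` are hypothetical names, and the all-zero check merely stands in for the decision to use a ZeroAttr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Collect every code unit of a string literal (including the terminating
// null) as an unsigned integer, regardless of the character type.
template <typename CharT, std::size_t N>
std::vector<std::uint32_t> collectCodeUnits(const CharT (&literal)[N]) {
  using Unsigned = std::make_unsigned_t<CharT>;
  std::vector<std::uint32_t> units;
  units.reserve(N);
  for (std::size_t i = 0; i < N; ++i)
    units.push_back(
        static_cast<std::uint32_t>(static_cast<Unsigned>(literal[i])));
  return units;
}

// True when every code unit is zero; a null-filled literal like this is
// the case the commit message says is represented with a ZeroAttr.
inline bool isAllZero(const std::vector<std::uint32_t> &units) {
  for (std::uint32_t u : units)
    if (u != 0)
      return false;
  return true;
}

// Example use with a char16_t literal.
inline void codeUnitsExample() {
  std::vector<std::uint32_t> units = collectCodeUnits(u"ab");
  assert(units.size() == 3 && units[0] == 97 && units[2] == 0);
}
```

The same function template accepts `L"..."`, `u"..."`, and `U"..."` literals, since the character type is deduced from the array argument.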
[HLSL][Matrix] Add support for ICK_HLSL_Matrix_Splat to add splat cast of scalars (#170885)
fixes #168960
Adds `ICK_HLSL_Matrix_Splat` and hooks it up to
`PerformImplicitConversion` and `IsMatrixConversion`. Map these to
`CK_HLSLAggregateSplatCast`.
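For readers unfamiliar with the term, a splat cast replicates one scalar into every element of the target. The sketch below models that behavior in plain C++; the `Matrix` alias and `splat` function are hypothetical and are not part of Clang or HLSL.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size matrix type with R rows and C columns.
template <typename T, std::size_t R, std::size_t C>
using Matrix = std::array<std::array<T, C>, R>;

// Splat: copy one scalar into every element of an R x C matrix.
template <typename T, std::size_t R, std::size_t C>
Matrix<T, R, C> splat(T scalar) {
  Matrix<T, R, C> m{};
  for (auto &row : m)
    row.fill(scalar);
  return m;
}

// Example use: every element equals the scalar.
inline void splatExample() {
  auto m = splat<float, 2, 3>(1.5f);
  assert(m[0][0] == 1.5f && m[1][2] == 1.5f);
}
```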
[LoopFusion] Simplifying the legality checks (#171889)
Because the current loop fusion only supports adjacent loops,
we can simplify the checks in this pass. By removing the
`isControlFlowEquivalent` check, this patch fixes multiple issues
including #166560, #166535, #165031, #80301 and #168263.
Now only sequential/adjacent candidates are collected in the same
list. This patch implements approach 2 discussed in post
#171207.
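To make "adjacent loops" concrete, here is a source-level sketch of two back-to-back loops with the same trip count and the single loop they could be fused into. It only illustrates the shape of the transformation; the function names are hypothetical and the pass's actual legality checks are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two adjacent loops: nothing executes between them and both iterate
// over the same range.
inline std::vector<int> runUnfused(const std::vector<int> &in) {
  const std::size_t n = in.size();
  std::vector<int> a(n), b(n);
  for (std::size_t i = 0; i < n; ++i)
    a[i] = in[i] * 2;
  for (std::size_t i = 0; i < n; ++i)
    b[i] = a[i] + 1;
  return b;
}

// Fused form: iteration i of the second body only reads a[i], which
// iteration i of the first body has already written, so one loop suffices.
inline std::vector<int> runFused(const std::vector<int> &in) {
  const std::size_t n = in.size();
  std::vector<int> a(n), b(n);
  for (std::size_t i = 0; i < n; ++i) {
    a[i] = in[i] * 2;
    b[i] = a[i] + 1;
  }
  return b;
}

// Example use: both forms compute the same result.
inline void fusionExample() {
  assert(runUnfused({1, 2, 3}) == runFused({1, 2, 3}));
}
```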
Fix misprint in computeKnownFPClass in GISelValueTracking.cpp (#171566)
Fix a wrong value (from the Instruction enum) in a conditional and add a
test check.
Related to https://github.com/llvm/llvm-project/issues/169959
[OpenMP] Define remaining OpenMP 6.0 clauses, add flang skeleton
Add definitions of the remaining OpenMP 6.0 clauses to the OMP.td
file. Implement the bare-bones skeleton in flang to support the new
definitions.
Adding a clause to OMP.td automatically generates some flang code
which requires manual completion to even compile. This PR adds the
absolute minimum for all 6.0 clauses that were still missing. This
minimum does not implement any OpenMP functionality; it just allows
flang to compile and run. As a benefit, any future clause-related
clang work will not require any changes to flang.
[Mips] Add compact branch patterns for MipsR6 (#171131)
Added patterns for combining set and branch into one compact branch.
The patterns are disabled if -mips-compact-branches=never is set.
[MLIR][AMDGPU] Implement reifyDimOfResult for FatRawBufferCastOp (#171839)
Since `FatRawBufferCastOp` preserves the shape of its source operand,
the result dimensions can be reified by querying the source's
dimensions.
---------
Signed-off-by: Yu-Zhewen <zhewenyu at amd.com>
[NVPTX] Fixup and refactor brx.idx support (#171933)
Guard "brx.idx" generation on the appropriate PTX ISA and SM versions.
In addition, do some minor refactoring, moving the expansion into ISel, as
doing this during operation legalization is more complex and offers no
benefits.
fixes https://github.com/llvm/llvm-project/issues/171709
[clang-doc] Add class template to HTML (#171937)
Emit class template declaration info so that it appears above a record's name in the Mustache template.
[mlir][PDL] Add CallableOpInterface to pdl.pattern and inlining support to pdl
This commit enables inlining of calls within PDL patterns by:
1. Adding CallableOpInterface to PatternOp, and implementing the required
interface methods (getCallableRegion, getArgumentTypes, getResultTypes)
and the ArgAndResultAttrsOpInterface stubs to make pdl.pattern a
valid callable.
2. Adding the dialect inliner interface that marks all operations as legal
to inline.
This is particularly useful for nonmaterializable patterns that may
contain func.call operations to external functions defining pattern
matching or rewrite logic. After inlining, these patterns can be
transformed into standard materializable PDL patterns.
NOTE: The pattern op needs to be marked callable as the inliner doesn't
allow inlining if there's no callable ancestor.
[33 lines not shown]
[AArch64][GlobalISel] Fix vector lrint/llrint fallbacks (#170814)
Add .lower() to vector lrint/llrint to enable lowering instead of
falling back to SelectionDAG.
[clang-doc] Add JSON output to existing template tests (#171936)
clang-doc has some useful, preexisting tests for templates, so we'll
reuse them to cover more cases.
Fixes non-functional changes found by static analyzer (#171197)
As per @arsenm's instructions, I've separated the non-functional
changes from https://github.com/llvm/llvm-project/pull/169958.
Afterwards I'll tackle the functional ones one by one. I hope I did
everything right this time.
Full descriptions in the article:
https://pvs-studio.com/en/blog/posts/cpp/1318/
3. Array overrun is possible.
The PVS-Studio warning: V557 Array overrun is possible. The value of
'regIdx' index could reach 31. VEAsmParser.cpp 696
10. Excessive check.
The PVS-Studio warning: V547 Expression 'IsLeaf' is always false.
PPCInstrInfo.cpp 419
11. Doubling the same check.
The PVS-Studio warning: V581 The conditional expressions of the 'if'
statements situated alongside each other are identical. Check lines:
5820, 5823. PPCInstrInfo.cpp 5823
[11 lines not shown]