[mlir][docs] dialect interfaces and mlir reduce documentation fix (#189258)
Two modifications:
1. Reflect newly added dialect interface methods in the documentation
2. Fix a bug in the `MLIR Reduce` documentation
Revert "[VPlan] Extract reverse mask from reverse accesses" (#189637)
Reverts llvm/llvm-project#155579
The added assertion triggers on some buildbots:
clang:
/home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:3840:
virtual InstructionCost
llvm::VPWidenMemoryRecipe::computeCost(ElementCount, VPCostContext &)
const: Assertion `!IsReverse() && "Inconsecutive memory access should
not have reverse order"' failed.
PLEASE submit a bug report to
https://github.com/llvm/llvm-project/issues/ and include the crash
backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments:
/home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1.install/bin/clang
-DNDEBUG -mcpu=neoverse-v2 -mllvm -scalable-vectorization=preferred -O3
-std=gnu17 -fcommon -Wno-error=incompatible-pointer-types -MD -MT
[3 lines not shown]
[mlir][SPIR-V] Support spirv.loop_control attribute on scf.for and scf.while (#189392)
Propagate the `spirv.loop_control` attribute from `scf.for` and
`scf.while` operations to the generated `spirv.mlir.loop` during
SCFToSPIRV conversion
[CIR] Implement member-pointer members lowering/CXX ABI lowering (#187327)
Record types with a member pointer as a member require quite a bit of
work to get to function properly. First, we have to wire them through
the AST->CIR lowering to make sure we properly represent them, and
represent them when they're zero initializable. We also have to properly
initialize elements when we're NOT zero initializable.
More importantly, we have to implement the CXXABILowering of record
types. Before this patch, we just assumed that all RecordTypes were
legal, since we didn't have the above lowering. A vast majority of this
patch is around getting RecordTypes to lower properly. There isn't
really a good way to test this without the FE changes, so it wasn't
split off.
We accomplish this in 2 phases: First, we transform each individual
record type along the way, giving it a new cxx-abi specific name. We
have to ensure that recursive evaluation works correctly, so we pulled
the solution from the LLVM-IR dialect for that. Secondly, we rename all
[13 lines not shown]
[AArch64][llvm] Fix encoding for `stshh` instruction (#189588)
The encoding for `stshh` was incorrect, and has been fixed. This
has been checked against the Arm ARM.
[AMDGPU][NFCI] CustomOperand to have a default type (#189584)
Most of the time, we should not need to care about the type at all, so
having it as a mandatory parameter confuses people and invites using
i1/i8/i16 where not necessary.
[lldb-dap] Correct attaching by program basename. (#188886)
Fixes an issue where attaching by program would fail if the program name
was a partial name (e.g. "foobar" instead of "/path/to/foobar").
We failed to create the target, which caused the attach to fail. Now we
fall back to the dummy target and update to the real target after the
attach completes.
Here is an example launch configuration that previously failed:
```
{
  "type": "lldb-dap",
  "name": "Attach (wait)",
  "request": "attach",
  "program": "foobar",
  "waitFor": true
},
```
[SDAG][abd] Combine abd of small types (#181538)
It is beneficial to combine abd of illegal, small types (types that get promoted to wider scalar size).
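As background only, here is a minimal standalone sketch of why an absolute difference on a small type can be evaluated at the promoted width: sign-extending both inputs, subtracting, and truncating gives the same bits as the narrow operation. This is not the DAG combine from the patch; the function names (`abdsNarrow`, `abdsPromoted`, `agreeOnAllInputs`, `demo`) are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Signed absolute difference at 8 bits: max(a, b) - min(a, b), modulo 2^8.
// The subtraction is done on unsigned values to avoid signed overflow.
uint8_t abdsNarrow(int8_t A, int8_t B) {
  uint8_t Hi = static_cast<uint8_t>(std::max(A, B));
  uint8_t Lo = static_cast<uint8_t>(std::min(A, B));
  return static_cast<uint8_t>(Hi - Lo);
}

// The same operation after promotion: sign-extend to 32 bits, subtract
// (which cannot overflow at this width), then truncate back to 8 bits.
uint8_t abdsPromoted(int8_t A, int8_t B) {
  int32_t WideA = A;
  int32_t WideB = B;
  int32_t Diff = std::max(WideA, WideB) - std::min(WideA, WideB);
  return static_cast<uint8_t>(Diff);
}

// Exhaustively compare both forms over every pair of 8-bit inputs.
bool agreeOnAllInputs() {
  for (int A = -128; A <= 127; ++A)
    for (int B = -128; B <= 127; ++B)
      if (abdsNarrow(static_cast<int8_t>(A), static_cast<int8_t>(B)) !=
          abdsPromoted(static_cast<int8_t>(A), static_cast<int8_t>(B)))
        return false;
  return true;
}

// Small usage example.
void demo() {
  assert(agreeOnAllInputs());
  assert(abdsPromoted(127, -128) == 255);
}
```

The exhaustive loop is cheap at 8 bits (65536 pairs) and makes the equivalence easy to check by running `demo()`.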
OpenMP: Match all Triple recognized arch aliases (#189649)
This liberalizes `match(device = {arch(some_arch)})` to recognize
other names for some_arch.
Previously this compared against getArchTypeForLLVMName, which
only matches a subset of names (which seems to be the canonical
architecture names). There was a special case hack for "x86_64",
which is one of the "x86-64" aliases accepted by parseArch, but is
not the canonical architecture name.
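The difference between the two matching strategies can be sketched with a toy model. Everything below is an assumption made for illustration: the `Arch` enum, the helper names (`lookupCanonicalName`, `parseArchAlias`, `matchesOld`, `matchesNew`, `demo`), and the name tables are hypothetical stand-ins, not the real LLVM `Triple` tables or the Clang code touched by the patch.

```cpp
#include <cassert>
#include <string>

// Toy architecture enum; a stand-in, not llvm::Triple::ArchType.
enum class Arch { Unknown, X86_64, AArch64 };

// Hypothetical canonical-name-only lookup (the stricter comparison).
Arch lookupCanonicalName(const std::string &Name) {
  if (Name == "x86-64")
    return Arch::X86_64;
  if (Name == "aarch64")
    return Arch::AArch64;
  return Arch::Unknown;
}

// Hypothetical alias-aware parser (the more liberal comparison).
// The alias table is illustrative only.
Arch parseArchAlias(const std::string &Name) {
  if (Name == "x86_64" || Name == "amd64")
    return Arch::X86_64;
  if (Name == "arm64")
    return Arch::AArch64;
  return lookupCanonicalName(Name);
}

// Old behaviour: canonical lookup plus a one-off special case for "x86_64".
bool matchesOld(const std::string &Selector, Arch Target) {
  if (Selector == "x86_64")
    return Target == Arch::X86_64;
  Arch Parsed = lookupCanonicalName(Selector);
  return Parsed != Arch::Unknown && Parsed == Target;
}

// New behaviour: any name the alias-aware parser accepts can match.
bool matchesNew(const std::string &Selector, Arch Target) {
  Arch Parsed = parseArchAlias(Selector);
  return Parsed != Arch::Unknown && Parsed == Target;
}

// Small usage example.
void demo() {
  assert(matchesOld("x86_64", Arch::X86_64));  // covered by the special case
  assert(!matchesOld("amd64", Arch::X86_64));  // alias missed by the old path
  assert(matchesNew("amd64", Arch::X86_64));   // alias accepted by the new path
}
```

Routing every selector through one alias-aware parser removes the need for per-name special cases in the matcher.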
Triple: Expose parseArch as a public method (#189648)
Clang has some code which is doing a direct arch name
string compare which should really be recognizing anything
usable as a triple architecture. It makes more sense to
directly parse the architecture than to construct a temporary
triple just to see what the parsed arch is.
For some reason the existing public parsing method is
getArchTypeForLLVMName. I'm not fully sure what the difference
between the 2 is supposed to be. My current guess is
getArchTypeForLLVMName is only supposed to handle the
canonical architecture name.
[Passes] Remove some optsize checks (#189369)
LibCallsShrinkWrapPass and PGOMemOPSizeOpt already check for optsize
attributes internally, so there is no need to handle this in the pass
pipeline.
The context here is that I'd like to make the pass pipeline completely
independent of Os/Oz so that we know for sure that function-level
optsize/minsize attributes behave identically to the pipeline-level
option.
[AMDGPU][SIFoldOperands] Fix OR -1 fold
In SIFoldOperands, folding `or x, -1` to `v_mov_b32 -1` removed `Src1Idx`,
which is incorrect because `-1` is in `Src0Idx` (after canonicalization).