[mlir][spirv] Fix crash when spirv.struct member type is not a SPIR-V type (#183942)
When parsing a spirv.struct type, any MLIR type was accepted as a member
type without validation. This caused a crash in TypeExtensionVisitor and
TypeCapabilityVisitor which unconditionally used cast<SPIRVType> on
struct element types, asserting when a non-SPIR-V type (e.g.,
vector<2x2xi1>) was encountered.
Fix the parser to reject non-SPIR-V member types with a proper error
message.
Fixes #179675
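The idea of validating member types up front, rather than relying on an unconditional cast later, can be sketched outside MLIR as follows. This is a minimal illustration only: `Type`, `SPIRVType`, `SPIRVFloat`, `BuiltinVector2D` and `validateStructMembers` are hypothetical stand-ins, not the real MLIR classes or the actual parser code, and `dynamic_cast` plays the role of a checked cast.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for a type hierarchy; not the real MLIR API.
struct Type {
  virtual ~Type() = default;
  virtual std::string name() const = 0;
};
struct SPIRVType : Type {};
struct SPIRVFloat : SPIRVType {
  std::string name() const override { return "f32"; }
};
// Models a type that is valid in general but is not a SPIR-V type.
struct BuiltinVector2D : Type {
  std::string name() const override { return "vector<2x2xi1>"; }
};

// Returns an error message if any member is not a SPIR-V type,
// or std::nullopt if every member is acceptable.
std::optional<std::string>
validateStructMembers(const std::vector<std::shared_ptr<Type>> &members) {
  for (const auto &member : members) {
    assert(member && "member types must be non-null");
    // Checked cast: yields nullptr on a mismatch instead of asserting.
    if (dynamic_cast<const SPIRVType *>(member.get()) == nullptr)
      return "expected SPIR-V type for struct member, got " + member->name();
  }
  return std::nullopt;
}
```

With this shape, later visitors can assume every member is a SPIR-V type because invalid input never gets past parsing.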
[flang][OpenMP] Fix counting generated nests
The code in `CountGeneratedNests` returned std::nullopt if the LOOPRANGE
clause was not present on a FUSE construct. That is incorrect: the answer
should be 1, except in cases where the FUSE itself is invalid,
such as having no loops nested in it.
Returning std::nullopt will not cause any messages to be emitted. The case
of zero loops inside of FUSE will be diagnosed when analyzing the body of
the FUSE construct itself, not when checking a construct in which the FUSE
is nested.
This prevents error messages caused by the same problem from being emitted
for every enclosing loop construct.
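The counting rule described above can be sketched with `std::optional`. This is a simplified model, not flang's code: `FuseConstruct` and `countGeneratedNests` are hypothetical names, and the LOOPRANGE branch (fusing `count` loops into one and leaving the rest) is an assumption about the clause's semantics that the commit message itself does not spell out.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified model of a FUSE construct; not flang's parse tree.
struct FuseConstruct {
  int nestedLoops;                   // loops nested directly in the FUSE
  std::optional<int> loopRangeCount; // `count` from LOOPRANGE, if present
};

// Number of loop nests generated by the FUSE, or std::nullopt when the
// construct is invalid. An invalid construct is diagnosed elsewhere, so
// no message is attached to the nullopt result.
std::optional<int> countGeneratedNests(const FuseConstruct &fuse) {
  if (fuse.nestedLoops <= 0)
    return std::nullopt; // invalid FUSE: no loops nested in it
  if (!fuse.loopRangeCount)
    return 1; // no LOOPRANGE: all loops are fused into a single nest
  // Assumption: with LOOPRANGE, `count` loops collapse into one nest and
  // the remaining loops are left as they are.
  int count = *fuse.loopRangeCount;
  if (count < 1 || count > fuse.nestedLoops)
    return std::nullopt;
  return fuse.nestedLoops - count + 1;
}
```

Returning `std::nullopt` only for the invalid case lets enclosing constructs skip their own check silently, so one root problem yields one diagnostic.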
[lldb][Process/FreeBSDKernelCore] Add riscv64 support (#180670)
This is the LLDB version of
https://cgit.freebsd.org/ports/tree/devel/gdb/files/kgdb/riscv-fbsd-kern.c.
This enables selecting riscv64 and reading registers from the PCB structure
during core dump and live kernel debugging. Trapframe unwinding support
will be implemented in the future. Test files using a core dump from riscv64
will be added once other kernel debugging improvements are done.
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
[ThinLTO] Reduce the number of renaming due to promotions (#183793)
Currently, for thin-lto, imported static global values (functions,
variables, etc.) are promoted/renamed from, e.g., foo() to
foo.llvm.(). This renaming causes difficulties for live patching
because the function name changes ([1]).
Some global value names do have to be promoted to avoid
name collisions and linker failures. But in practice, the majority of name
promotions can be avoided.
In [2], the suggestion is that thin-lto pre-link decides whether
a particular global value needs name promotion or not. If yes, later on
in thinBackend() the name will be promoted.
I compiled a particular linux kernel version (latest bpf-next tree)
and found 1216 global values with suffix .llvm.. With this patch,
the number of promoted functions is 2, 98% reduction from the
original kernel build.
[21 lines not shown]
[SLP]Recalculate dependencies for the buildvector schedule node, if they have copyable node
Recalculate the dependencies for all buildvector nodes that have copyable
dependencies, to prevent a compiler crash during instruction scheduling.
[lldb][lldb-server] Fix zip file lookup ignoring last entry in the zip file (#173966)
Command qModuleInfo (GDB server protocol) can be used to request
metadata of shared libraries stored in a ZIP archive on the target. This
is typically used for retrieving SO files bundled in an APK file on
Android.
Requesting the last entry in the ZIP file often fails because of a bug
in the entry search mechanism. This PR fixes this.
NOTES:
* The bug appears only if the entry in the zip file has no extra field
or comment
* This is part of an effort to get lldb working for debugging Swift on
Android: https://github.com/swiftlang/llvm-project/issues/10831
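The message does not say what the search bug was. One common way for only the last entry to be lost, and only when nothing follows its name, is a bounds check that rejects a record ending exactly at the end of the buffer. The sketch below illustrates that class of bug under stated assumptions: the record layout (one length byte followed by the name) and the function `findEntry` are hypothetical and much simpler than real ZIP central directory records, and this is not the LLDB code.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record layout: [1-byte name length][name bytes], repeated.
// Returns the offset of the record whose name matches, if any.
std::optional<std::size_t> findEntry(const std::vector<unsigned char> &dir,
                                     const std::string &wanted) {
  std::size_t offset = 0;
  while (offset < dir.size()) {
    std::size_t nameLen = dir[offset];
    std::size_t end = offset + 1 + nameLen;
    // The final record may end exactly at dir.size(). Only `end > size`
    // means truncation; rejecting `end >= size` would drop a last record
    // that has no trailing bytes after its name.
    if (end > dir.size())
      return std::nullopt;
    std::string name(dir.begin() + offset + 1, dir.begin() + end);
    if (name == wanted)
      return offset;
    offset = end;
  }
  return std::nullopt;
}
```

In this model, trailing bytes after the last name (the analogue of an extra field or comment) would hide an off-by-one check, which matches the condition in the first note.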
[Clang][TableGen] Sort undocumented builtins after documented ones in generated docs
The builtin documentation emitter previously sorted all categories purely
alphabetically, which placed the "Undocumented" section before categories like
"WMMA" in the generated RST. This made the output confusing since stub entries
appeared before real documentation.
Push the "Undocumented" category to the end of the output so that all documented
categories appear first, regardless of their names.
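The ordering described above amounts to a comparator that treats one category name as always last and sorts everything else alphabetically. A minimal sketch, assuming categories are plain strings; `sortCategories` is a hypothetical helper, not the TableGen emitter's code.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Sorts category names alphabetically, except that "Undocumented" is
// always placed after every other category.
std::vector<std::string> sortCategories(std::vector<std::string> categories) {
  std::sort(categories.begin(), categories.end(),
            [](const std::string &a, const std::string &b) {
              bool aLast = (a == "Undocumented");
              bool bLast = (b == "Undocumented");
              if (aLast != bLast)
                return bLast; // a sorts first only when b is the trailing one
              return a < b;   // otherwise plain alphabetical order
            });
  return categories;
}
```

A purely alphabetical sort would place "Undocumented" before "WMMA" because 'U' precedes 'W'; the comparator's first test removes that dependency on the names.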
[clang-format] bugfix: Whitesmiths with IndentAccessModifiers (#182432)
Due to special handling of Whitesmiths when parsing, the additional
level(s) needed for the block, when used with IndentAccessModifiers,
were not being applied. Consequently, when calculating the access
modifier indent offset, the modifiers were being placed at the class
level.
This change ensures that the additional level(s) are not omitted for
Whitesmiths.
[AMDGPU] Make uniform-work-group-size a valueless attribute
The "uniform-work-group-size" function attribute previously took a
string value of "true" or "false". Since presence alone can convey
the "true" semantics and absence can convey "false", the value is
unnecessary.
This patch converts it to a valueless string attribute: presence
indicates true, absence indicates false. For backward compatibility,
auto-upgrade logic is added in both UpgradeAttributes (bitcode) and
UpgradeFunctionAttributes: if the old value is "true", the attribute
is kept without a value; if "false", the attribute is removed.
All setters (Clang CodeGen, OMPIRBuilder, AMDGPUAttributor, ROCDL
translation) and readers (AMDGPUAttributor, AMDGPULowerKernelAttributes,
AMDGPUHSAMetadataStreamer) are updated accordingly. The attribute is
also documented in the AMDGPU LLVM IR Attributes table where it was
previously missing.
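The upgrade rule can be sketched on a simple key/value model of function attributes. This is an illustration only: `StringAttrs`, `upgradeUniformWorkGroupSize` and `hasUniformWorkGroupSize` are hypothetical names, an empty string stands for "no value", and none of this is the actual LLVM auto-upgrade code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: string attributes as a name -> value map, where an
// empty value represents a valueless attribute.
using StringAttrs = std::map<std::string, std::string>;

// Applies the upgrade rule: "true" becomes valueless, "false" is removed.
void upgradeUniformWorkGroupSize(StringAttrs &attrs) {
  auto it = attrs.find("uniform-work-group-size");
  if (it == attrs.end())
    return; // absent already means false
  if (it->second == "true")
    it->second.clear(); // keep the attribute, drop the value
  else if (it->second == "false")
    attrs.erase(it); // absence now conveys false
}

// After the upgrade, readers only need a presence check.
bool hasUniformWorkGroupSize(const StringAttrs &attrs) {
  return attrs.count("uniform-work-group-size") != 0;
}
```

Readers that previously compared the value against "true" reduce to the presence check once old inputs have gone through the upgrade.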
[mlir][GPU] Fix crash in WarpExecuteOnLane0Op::verify with wrong terminator (#183930)
WarpExecuteOnLane0Op::verify() called getTerminator() which performed an
unconditional cast<gpu::YieldOp> on the block's last operation. When the
op body was written with a different terminator (e.g. affine.yield), the
cast asserted immediately instead of emitting a verifier diagnostic.
Fix by using dyn_cast in verify() before calling getTerminator(), and
emitting a proper error message when the terminator is not gpu.yield.
Add a regression test to invalid.mlir.
Fixes #181450
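The pattern of checking the terminator kind before anything assumes it can be sketched as below. The types `Operation`, `GpuYieldOp`, `AffineYieldOp` and `WarpRegion` are hypothetical stand-ins rather than MLIR classes, the diagnostic wording is invented for illustration, and `dynamic_cast` stands in for `dyn_cast`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for operations; not the real MLIR classes.
struct Operation {
  virtual ~Operation() = default;
  virtual std::string name() const = 0;
};
struct GpuYieldOp : Operation {
  std::string name() const override { return "gpu.yield"; }
};
struct AffineYieldOp : Operation {
  std::string name() const override { return "affine.yield"; }
};

struct WarpRegion {
  std::vector<std::unique_ptr<Operation>> body;

  // Returns an empty string on success, otherwise a diagnostic message.
  std::string verify() const {
    if (body.empty())
      return "expected a non-empty body";
    // Checked cast first: a wrong terminator becomes a diagnostic
    // instead of tripping an assertion in an unconditional cast.
    if (dynamic_cast<const GpuYieldOp *>(body.back().get()) == nullptr)
      return "expected body to terminate with gpu.yield, got " +
             body.back()->name();
    return "";
  }
};
```

Accessors that cast unconditionally stay safe as long as they are only called after the verifier has accepted the op.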
[lldb][Process/FreeBSDKernelCore] Add ppc64le support (#180669)
This is the LLDB version of
https://cgit.freebsd.org/ports/tree/devel/gdb/files/kgdb/ppcfbsd-kern.c.
This enables selecting ppc64le and reading registers from the PCB structure
during core dump and live kernel debugging. FPU registers aren't supported
yet due to a PCB structure issue, but this change still achieves feature
parity with KGDB. Trapframe unwinding support will be implemented in the
future. Test files using a core dump from ppc64le will be added once
other kernel debugging improvements are done.
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
[ARM] Lower strictfp vector fp16 rounding operations similar to default mode (#183700)
Previously the strictfp rounding nodes were lowered by unrolling to
scalar operations, which has a negative impact on performance. This
issue was partially fixed in #180480; this change continues that work and
implements optimized lowering for v4f16 and v8f16.
[AMDGPU] Enable shift64 hazard recognition for gfx9 (#183839)
Enable shift64 hazard recognition for gfx9 cores.
---------
Signed-off-by: John Lu <John.Lu at amd.com>
Lower strictfp vector rounding operations similar to default mode
Previously the strictfp rounding nodes were lowered by unrolling to
scalar operations, which has a negative impact on performance. This
issue was partially fixed in #180480; this change continues that work and
implements optimized lowering for v4f16 and v8f16.
[mlir][IR] Generalize `DenseElementsAttr` to custom element types (#179122)
`DenseElementsAttr` supports only a hard-coded list of element types:
`int`, `index`, `float`, `complex`. This commit generalizes the
`DenseElementsAttr` infrastructure: it now supports arbitrary element
types, as long as they implement the new `DenseElementTypeInterface`.
The `DenseElementTypeInterface` has the following helper functions:
- `getDenseElementBitSize`: Query the size of an element in bits. (When
storing an element in memory, each element is padded to a full byte.
This is an existing limitation of the `DenseElementsAttr`; with an
exception for `i1`.)
- `convertToAttribute`: Attribute factory / deserializer. Converts bytes
into an MLIR attribute. The attribute provides the assembly format /
printer for a single element.
- `convertFromAttribute`: Serializer. Converts an MLIR attribute into
bytes.
Note: `convertToAttribute` / `convertFromAttribute` are mainly for
[23 lines not shown]
Revert "[mlir][IR] Generalize `DenseElementsAttr` to custom element types" (#183917)
Reverts llvm/llvm-project#183891
Reverting a second time. The build bot failure seems to be
non-deterministic.
[CMake] Use keyword signature in two additional callsites (#183889)
Fix-forward for https://github.com/llvm/llvm-project/pull/183541.
Two callsites to target_link_libraries were not migrated to the
keyword signature.
Signed-off-by: Itay Bookstein <itay.bookstein at nextsilicon.com>
[mlir][VectorToLLVM] Fix crash in VectorInsertOpConversion with dynamic index (#183783)
VectorInsertOpConversion crashes with an assertion failure when
inserting a sub-vector at a dynamic position into a multi-dimensional
vector. The pattern calls getAsIntegers() on the position, which asserts
that all fold results are compile-time constant attributes.
The existing guard (checking llvm::IsaPred<Attribute>) only covered the
case where a scalar is inserted into the innermost dimension (the
extractvalue path). The guard was missing for the insertvalue path when
inserting a sub-vector at a dynamic position into a nested aggregate.
Fix: add the same guard before the llvm.insertvalue creation to return
failure() gracefully when any position index is dynamic, matching the
behavior of VectorExtractOpConversion.
Fixes #177829
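The guard described above boils down to "bail out if any position is not a compile-time constant, and only then convert to integers". A minimal sketch, assuming a position is either a constant index or a named runtime value; `Position` and `getConstantPositions` are hypothetical and do not correspond to the MLIR `OpFoldResult` API.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a fold result: a constant index, or the name
// of a value that is only known at run time.
using Position = std::variant<std::int64_t, std::string>;

// Returns the constant indices, or std::nullopt if any position is
// dynamic, so that the caller can report failure instead of asserting.
std::optional<std::vector<std::int64_t>>
getConstantPositions(const std::vector<Position> &positions) {
  std::vector<std::int64_t> result;
  for (const Position &p : positions) {
    const auto *constant = std::get_if<std::int64_t>(&p);
    if (constant == nullptr)
      return std::nullopt; // dynamic index: conversion is not possible
    result.push_back(*constant);
  }
  return result;
}
```

A conversion pattern built this way treats the empty result as "pattern does not apply" and leaves the op for another lowering.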