[DAG] isKnownNeverZero - add ISD::EXTRACT_VECTOR_ELT handling (#183961)
Initialize DemandedElts mask when the index is constant and inbounds, otherwise check all elements.
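The mask selection described above can be sketched as follows. This is a minimal Python model, not the SelectionDAG code: the function names and the use of `None` for a non-constant index are assumptions made for illustration.

```python
def demanded_elts_for_extract(num_elts, index):
    """Return a bitmask of the source-vector lanes an element extract may read.

    `index` is an int when the extract index is a known constant, or None
    when it is not known (a hypothetical modelling choice for this sketch).
    """
    all_lanes = (1 << num_elts) - 1
    if index is not None and 0 <= index < num_elts:
        # Constant, in-bounds index: only that lane is demanded.
        return 1 << index
    # Unknown or out-of-bounds index: conservatively demand every lane.
    return all_lanes


def extract_is_known_never_zero(lanes_known_nonzero, index):
    """The extract is known non-zero only if every demanded lane is."""
    num_elts = len(lanes_known_nonzero)
    mask = demanded_elts_for_extract(num_elts, index)
    return all(
        lanes_known_nonzero[i] for i in range(num_elts) if (mask >> i) & 1
    )
```

With a constant index only the selected lane has to be known non-zero; with an unknown index a single possibly-zero lane defeats the query.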
[lldb][lldb-dap] Correctly format lldb warnings in the debug console (#173852)
Trivial change to prevent all warnings from being printed on a single
line in the VS Code debug console.
[lldb] Add BytecodeSection class to formatter_bytecode.py (#183876)
Changes `formatter_bytecode.compile_file` to return a `BytecodeSection`
value. The `BytecodeSection` holds the data that needs to be emitted to
an `__lldbformatters` section.
The `BytecodeSection` currently provides `write_binary`, but will be
updated in a follow-up commit to include `write_source`, which will allow
the data to be emitted as C or Swift source code. This will make it
easier to integrate into build systems, as it's easier to get data into
a binary via source code than as a raw binary file.
[mlir][spirv] Fix crash when spirv.struct member type is not a SPIR-V type (#183942)
When parsing a spirv.struct type, any MLIR type was accepted as a member
type without validation. This caused a crash in TypeExtensionVisitor and
TypeCapabilityVisitor which unconditionally used cast<SPIRVType> on
struct element types, asserting when a non-SPIR-V type (e.g.,
vector<2x2xi1>) was encountered.
Fix the parser to reject non-SPIR-V member types with a proper error
message.
Fixes #179675
[flang][OpenMP] Fix counting generated nests
The code in `CountGeneratedNests` returned std::nullopt if the LOOPRANGE
clause was not present on a FUSE construct. That is incorrect: the answer
should be 1 instead, except in cases where the FUSE itself is invalid,
such as having no loops nested in it.
Returning std::nullopt will not cause any messages to be emitted. The case
of zero loops inside of FUSE will be diagnosed when analyzing the body of
the FUSE construct itself, not when checking a construct in which the FUSE
is nested.
This prevents error messages caused by the same problem from being emitted
for every enclosing loop construct.
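The counting rule can be sketched as below. This is a simplified Python model rather than the flang implementation: the function signature is hypothetical, and the LOOPRANGE branch (fusing `count` loops into one and leaving the rest untouched) is an assumption about the clause's semantics that the commit message does not state.

```python
from typing import Optional


def count_generated_nests(num_loops: int,
                          looprange_count: Optional[int] = None) -> Optional[int]:
    """Number of loop nests a FUSE construct produces, or None if unknown.

    num_loops:       loops nested directly in the FUSE (hypothetical input).
    looprange_count: the count from a LOOPRANGE clause, or None if absent.
    """
    if num_loops == 0:
        # Invalid FUSE: stay silent here, the body analysis diagnoses it.
        return None
    if looprange_count is None:
        # No LOOPRANGE: everything is fused into a single nest.
        return 1
    if not 1 <= looprange_count <= num_loops:
        # Malformed clause (assumed to be diagnosed elsewhere).
        return None
    # Assumption: `count` loops collapse into one, the others remain.
    return num_loops - looprange_count + 1
```

Returning `None` only for the invalid case keeps enclosing constructs from repeating a diagnostic that the FUSE body analysis already emits.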
[lldb][Process/FreeBSDKernelCore] Add riscv64 support (#180670)
This is the LLDB version of
https://cgit.freebsd.org/ports/tree/devel/gdb/files/kgdb/riscv-fbsd-kern.c.
This enables selecting riscv64 and reading registers from the PCB
structure during core dump and live kernel debugging. Trapframe unwinding
support will be implemented in the future. Test files using a core dump
from riscv64 will be added once other kernel debugging improvements are done.
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
[ThinLTO] Reduce the number of renaming due to promotions (#183793)
Currently, for ThinLTO, imported static global values (functions,
variables, etc.) are promoted/renamed from e.g. foo() to
foo.llvm.(). This renaming causes difficulties for live patching
since the function name changes ([1]).
Some global value names may have to be promoted to avoid name
collisions and linker failures. But in practice, the majority of name
promotions can be avoided.
In [2], the suggestion is that thin-lto pre-link decides whether
a particular global value needs name promotion or not. If yes, later on
in thinBackend() the name will be promoted.
I compiled a particular Linux kernel version (latest bpf-next tree)
and found 1216 global values with suffix .llvm.. With this patch,
the number of promoted functions is 2, a 98% reduction from the
original kernel build.
[21 lines not shown]
[SLP]Recalculate dependencies for the buildvector schedule node, if they have copyable node
Recalculate the dependencies for all buildvector nodes with copyable
dependencies to prevent a compiler crash during instruction scheduling.
[lldb][lldb-server] Fix zip file lookup ignoring last entry in the zip file (#173966)
The qModuleInfo command (GDB server protocol) can be used to request
metadata of shared libraries stored in a ZIP archive on the target. This
is typically used for retrieving SO files bundled in an APK file on
Android.
Requesting the last entry in the ZIP file often fails because of a bug
in the entry search mechanism. This PR fixes this.
NOTES:
* The bug appears only if the entry in the zip file has no extra field
or comment
* This is part of an effort to get lldb working for debugging Swift on
Android: https://github.com/swiftlang/llvm-project/issues/10831
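The commit message does not say what the faulty check was. As an illustration of the kind of boundary condition involved, here is a minimal Python sketch that walks a ZIP central directory with an inclusive end bound, so that a final entry with no extra field or comment (one that ends exactly at the end of the directory) is still found. The helper names are hypothetical, and the sketch assumes an archive without a trailing comment or ZIP64 records.

```python
import io
import struct
import zipfile

EOCD_SIG = 0x06054B50   # end-of-central-directory signature
CD_SIG = 0x02014B50     # central-directory file header signature
EOCD_SIZE = 22          # fixed EOCD size when the archive comment is empty
CD_FIXED_SIZE = 46      # fixed part of a central-directory header


def central_directory_names(data):
    """List entry names by walking the central directory by hand."""
    eocd = len(data) - EOCD_SIZE
    (sig,) = struct.unpack_from("<I", data, eocd)
    if sig != EOCD_SIG:
        raise ValueError("EOCD not at the expected position")
    _total, cd_size, cd_offset = struct.unpack_from("<HII", data, eocd + 10)

    names = []
    pos, end = cd_offset, cd_offset + cd_size
    while pos + CD_FIXED_SIZE <= end:
        (sig,) = struct.unpack_from("<I", data, pos)
        if sig != CD_SIG:
            break
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", data, pos + 28)
        entry_end = pos + CD_FIXED_SIZE + name_len + extra_len + comment_len
        # Inclusive bound: the last entry may end exactly at `end`.
        if entry_end > end:
            break
        start = pos + CD_FIXED_SIZE
        names.append(data[start:start + name_len].decode("utf-8"))
        pos = entry_end
    return names


def make_zip(names):
    """Build a small in-memory archive for the example."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"payload")
    return buf.getvalue()
```

A strict `entry_end >= end` check in place of the inclusive one would drop the last entry of such an archive, which matches the symptom described in the notes.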
[Clang][TableGen] Sort undocumented builtins after documented ones in generated docs
The builtin documentation emitter previously sorted all categories purely
alphabetically, which placed the "Undocumented" section before categories like
"WMMA" in the generated RST. This made the output confusing since stub entries
appeared before real documentation.
Push the "Undocumented" category to the end of the output so that all documented
categories appear first, regardless of their names.
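The ordering can be expressed as a two-part sort key. The sketch below is in Python rather than the TableGen backend's C++, and the function name is hypothetical.

```python
def order_categories(categories):
    """Sort alphabetically, but always place "Undocumented" last."""
    # False sorts before True, so documented categories come first.
    return sorted(categories, key=lambda name: (name == "Undocumented", name))
```

A plain alphabetical sort of `["WMMA", "Undocumented", "Atomics"]` would put "Undocumented" before "WMMA"; the key above moves it to the end.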
[clang-format] bugfix: Whitesmiths with IndentAccessModifiers (#182432)
Due to special handling of Whitesmiths when parsing, the additional
level(s) needed for the block, when used with IndentAccessModifiers,
were not being applied. Consequently, when calculating the access
modifier indent offset, the modifiers were being placed at the class
level.
This change ensures that the additional level(s) are not omitted for
Whitesmiths.
[AMDGPU] Make uniform-work-group-size a valueless attribute
The "uniform-work-group-size" function attribute previously took a
string value of "true" or "false". Since presence alone can convey
the "true" semantics and absence can convey "false", the value is
unnecessary.
This patch converts it to a valueless string attribute: presence
indicates true, absence indicates false. For backward compatibility,
auto-upgrade logic is added in both UpgradeAttributes (bitcode) and
UpgradeFunctionAttributes: if the old value is "true", the attribute
is kept without a value; if "false", the attribute is removed.
All setters (Clang CodeGen, OMPIRBuilder, AMDGPUAttributor, ROCDL
translation) and readers (AMDGPUAttributor, AMDGPULowerKernelAttributes,
AMDGPUHSAMetadataStreamer) are updated accordingly. The attribute is
also documented in the AMDGPU LLVM IR Attributes table where it was
previously missing.
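The auto-upgrade rule can be modelled with a small Python sketch. Representing a function's string attributes as a dict and a valueless attribute as `None` is an assumption made for illustration, not how the LLVM upgrade code stores attributes.

```python
ATTR = "uniform-work-group-size"


def upgrade_function_attrs(attrs):
    """Return a copy of `attrs` with the legacy attribute value upgraded.

    `attrs` maps attribute names to string values; None stands for a
    valueless attribute (hypothetical representation).
    """
    upgraded = dict(attrs)
    value = upgraded.get(ATTR)
    if value == "true":
        # Old "true" becomes the valueless form: presence means true.
        upgraded[ATTR] = None
    elif value == "false":
        # Old "false" is dropped: absence means false.
        del upgraded[ATTR]
    return upgraded
```

Attributes that are already valueless, and unrelated attributes, pass through unchanged.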
[mlir][GPU] Fix crash in WarpExecuteOnLane0Op::verify with wrong terminator (#183930)
WarpExecuteOnLane0Op::verify() called getTerminator() which performed an
unconditional cast<gpu::YieldOp> on the block's last operation. When the
op body was written with a different terminator (e.g. affine.yield), the
cast asserted immediately instead of emitting a verifier diagnostic.
Fix by using dyn_cast in verify() before calling getTerminator(), and
emitting a proper error message when the terminator is not gpu.yield.
Add a regression test to invalid.mlir.
Fixes #181450
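The difference between the asserting cast and the checked cast can be sketched in Python, with `isinstance` standing in for `dyn_cast`. The classes and the error text are hypothetical and only model the control flow.

```python
from typing import List, Optional


class Op:
    name = "op"


class GpuYieldOp(Op):
    name = "gpu.yield"


class AffineYieldOp(Op):
    name = "affine.yield"


def verify_warp_body(block_ops: List[Op]) -> Optional[str]:
    """Return an error message, or None if the terminator is acceptable."""
    if not block_ops:
        return "expected a non-empty body"
    last = block_ops[-1]
    # Checked cast: test the type first instead of asserting on it.
    if not isinstance(last, GpuYieldOp):
        return f"expected body to terminate with gpu.yield, found {last.name}"
    return None
```

The verifier reports a diagnostic for a wrong terminator, and only code that runs after successful verification may rely on the terminator's type.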
[lldb][Process/FreeBSDKernelCore] Add ppc64le support (#180669)
This is the LLDB version of
https://cgit.freebsd.org/ports/tree/devel/gdb/files/kgdb/ppcfbsd-kern.c.
This enables selecting ppc64le and reading registers from the PCB
structure during core dump and live kernel debugging. FPU registers aren't
supported yet due to a PCB structure issue, but this change still achieves
feature parity with KGDB. Trapframe unwinding support will be implemented
in the future. Test files using a core dump from ppc64le will be added once
other kernel debugging improvements are done.
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
[ARM] Lower strictfp vector fp16 rounding operations similar to default mode (#183700)
Previously, the strictfp rounding nodes were lowered by unrolling to
scalar operations, which has a negative impact on performance. This
issue was partially fixed in #180480; this change continues that work
and implements optimized lowering for v4f16 and v8f16.