[lldb][NFC] Move BreakpointSite::IsEnabled/SetEnabled into Process
The Process class is the one responsible for managing the state of a
BreakpointSite inside the process. As such, it should be the one
answering questions about the state of the site.
[lldbremote] Implement support for MultiBreakpoint packet
This is fairly straightforward, thanks to the helper functions created in
the previous commit.
https://github.com/llvm/llvm-project/pull/192910
[mlir] MlirOptMain: avoid double verification (#192661)
MlirOptMain would run verification twice at the end of the processing:
1. after the last pass in the pipeline;
2. prior to printing.
Since there is no logic that could mutate, and thus potentially
invalidate, the IR between the two, the second verification is
redundant. Skip it when possible.
[SelectionDAG] Return poison instead of undef for out-of-bounds EXTRACT_VECTOR_ELT (#192844)
Out-of-bounds EXTRACT_VECTOR_ELT on fixed-length vectors is undefined
behavior.
Return poison instead of undef to be consistent with LangRef semantics.
Prep work to help with https://github.com/llvm/llvm-project/pull/190307
[HLSL][Clang] Start emitting @llvm.structured.alloca (#190157)
Allow some patterns in the FE to use this new instruction to emit
logical pointers. Renamed the experimental-emit-sgep flag to reflect the
broader logic it gates.
This also updates the few frontend tests to reflect the newly emitted
alloca.
The next step is to handle Mem2Reg/Reg2Mem.
[AArch64] Lower masked.expandload intrinsic using SVE2p2/SME2p2 EXPAND (#190999)
The masked.expandload intrinsic can be lowered using the EXPAND instruction
when available, where the source vector is the result of a contiguous load
of the number of active elements in the predicate. EXPAND is available with
either feature in non-streaming mode. It is available in streaming mode
with SME2p2, or with SVE2p2 when SME_FA64 is also enabled.
Intrinsics which return a fixed-width result can also be lowered using SVE
instructions when preferred, otherwise they will be scalarised by falling
back on scalarizeMaskedExpandLoad.
Scalable vectors are not supported when EXPAND is not available. In this
case, the cost model will return an invalid cost for the intrinsic.
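As a rough illustration of the lowering described above, the sketch below models masked.expandload on plain Python lists. The function names are hypothetical, and the value the expand step writes to inactive lanes is a placeholder assumption (those lanes are overwritten by the passthru select anyway). It only demonstrates that "contiguous load of the active element count, then expand" matches the element-by-element semantics; it is not the actual AArch64 lowering code.

```python
def expandload_reference(memory, mask, passthru):
    """Element-by-element semantics: active lanes consume consecutive
    elements from memory, inactive lanes take the passthru value."""
    result = []
    next_index = 0
    for active, fallback in zip(mask, passthru):
        if active:
            result.append(memory[next_index])
            next_index += 1
        else:
            result.append(fallback)
    return result


def expandload_via_expand(memory, mask, passthru):
    """Two-step model: contiguous load, then an EXPAND-like spread."""
    # Step 1: contiguous load of as many elements as there are active lanes.
    active_count = sum(1 for active in mask if active)
    loaded = iter(memory[:active_count])

    # Step 2: spread the loaded elements over the active lanes.
    # Inactive lanes get a placeholder 0 here (an assumption of this sketch).
    expanded = [next(loaded) if active else 0 for active in mask]

    # Step 3: select the passthru value for inactive lanes.
    return [e if active else p for e, active, p in zip(expanded, mask, passthru)]
```

For example, with memory `[10, 20, 30, 40]`, mask `[True, False, True, True]` and passthru `[-1, -2, -3, -4]`, both functions produce `[10, -2, 20, 30]`.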
[libc++][test] Unblock cases for `ranges::sort` with proxy ranges (#188490)
libc++ switched to using `iter_move`/`iter_swap` a long time ago, so we
should unblock these cases.
[SPIR-V] Fix image query and sampler type (#190767)
- Use OpImageQuerySize instead of OpImageQuerySizeLod for multisampled
images: the SPIR-V spec requires MS=0 for OpImageQuerySizeLod
- Use `target("spirv.Sampler")` instead of i32 for non-constant sampler
kernel parameters so they produce OpTypeSampler as required by
OpSampledImage
Related to https://github.com/llvm/llvm-project/issues/190736
[LoopFusion] Validate loop structure before creating LoopCandidates (#192280)
This patch deletes the assert that required the loop to have a
preheader. A loop is not guaranteed to have a preheader when loops
are structured using `indirectbr`. Instead, we now rely on the header.
Fixes #156670.
[lldb] Add caching and _NT_SYMBOL_PATH parsing in SymbolLocatorSymStore (#191782)
The _NT_SYMBOL_PATH environment variable is the idiomatic way to set a
system-wide lookup order of symbol servers and a local cache for
SymStore. It holds a semicolon-separated list of entries in the
following notations:
* srv*[<cache>*]<source> sets a source and an optional explicit cache
* cache*<cache> sets an implicit cache for all subsequent entries
* all other entries are bare local directories
Since symbol paths are closely intertwined with the caching of symbol
files, this patch proposes support in LLDB for both features at once.
ParseEnvSymbolPaths() implements the parsing logic, which processes
entries of the symbol path string from left to right to create a series
of LookupEntry objects that each store a source and a cache location.
The source of a LookupEntry can be a local directory or an HTTP server
address. The cache is a local directory or empty. This representation
unifies the implicit vs. explicit caching options from the SymStore
protocol.
[22 lines not shown]
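The left-to-right parsing described above can be sketched roughly as follows. This is an illustrative Python model, not the LLDB implementation: `parse_symbol_path` and its fields are hypothetical names, the case-insensitive `srv`/`cache` matching is an assumption, and so is the choice to apply the implicit cache to every later entry that has no explicit cache of its own.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LookupEntry:
    source: str           # local directory or HTTP server address
    cache: Optional[str]  # local directory, or None when no cache applies


def parse_symbol_path(value: str) -> List[LookupEntry]:
    """Turn a semicolon-separated symbol path into lookup entries."""
    entries: List[LookupEntry] = []
    implicit_cache: Optional[str] = None
    for raw in value.split(";"):
        item = raw.strip()
        if not item:
            continue  # ignore empty entries such as a trailing ';'
        fields = item.split("*")
        kind = fields[0].lower()
        if kind == "cache" and len(fields) == 2:
            # cache*<cache>: affects all subsequent entries.
            implicit_cache = fields[1] or None
        elif kind == "srv" and len(fields) == 2:
            # srv*<source>: no explicit cache, fall back to the implicit one.
            entries.append(LookupEntry(fields[1], implicit_cache))
        elif kind == "srv" and len(fields) == 3:
            # srv*<cache>*<source>: the explicit cache wins.
            entries.append(LookupEntry(fields[2], fields[1] or implicit_cache))
        else:
            # Everything else is treated as a bare local directory.
            entries.append(LookupEntry(item, implicit_cache))
    return entries
```

Storing a resolved cache on every entry is what removes the implicit vs. explicit distinction: code that consumes the entries only ever looks at `entry.cache`.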
[AArch64][GlobalISel] Move KnownBitsVectorTest to mir. NFC (#192536)
This ports some of the older C++ GlobalISel known-bits tests to use
print<gisel-value-tracking> in a mir file. This is mostly autogenerated,
but attempts to keep the existing comments. Some tests have not been
ported as they are entirely in C++ or test isKnownToBeAPowerOfTwo,
which is not covered by the print output.
[CIR] add vsqrt and vsqrtq support (#192282)
Part of https://github.com/llvm/llvm-project/issues/185382
co-authored by: @Kouunnn <xerw1314 at gmail.com>
---------
Co-authored-by: Zile Xiong <xiongzile99 at gmail.com>
Co-authored-by: ZCkouun <1765074320 at qq.com>
[compiler-rt] Don't provide `__arm_sme_state` for baremetal targets (#191434)
Previously, we required baremetal runtimes to implement an undocumented
`__aarch64_sme_accessible` hook to check if SME is available (as
checking CPU features may vary across targets).
This allowed us to provide a generic `__arm_sme_state` implementation
but caused some friction for toolchains that depend on compiler-rt.
This patch instead removes the implementation of `__arm_sme_state` for
baremetal. This makes it the responsibility of the runtime (e.g. libc)
to provide this function for baremetal targets.
The requirements of this function are documented in the AAPCS64:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#811__arm_sme_state
All other SME ABI routines are still provided by compiler-rt.
[mlir] reduce excessive verification in transform (#192653)
`mergeSymbolsInto` called by the transform interpreter for named
sequence management was calling a full verifier after renaming symbols.
The renaming could have potentially broken symbol table-related
invariants, but not really anything else. Only verify the symbol
table-related invariants instead.
[AArch64] Fix lowering of non-power2 uitofp (#190921)
The code in DAGTypeLegalizer::SplitVecOp_TruncateHelper attempts to use
getFloatingPointVT(InElementSize/2), which is invalid for non-power2
type sizes. Fall back to the existing SplitVecOp_UnaryOp in this case.
[CodeGen] Parse frame-pointer attribute once when creating MachineFunction (#191974)
TargetOptions::DisableFramePointerElim is hot and showing up in
compile-time profiles via AArch64FrameLowering::hasFPImpl on
aarch64-O0-g builds. Repeatedly looking up the function attribute is
expensive. Parsing it once at MachineFunction initialisation and storing
it as FramePointerKind on MachineFrameInfo is a -0.21% geomean improvement
on CTMark stage1-aarch64-O0-g. Also helps debug builds on other targets.
https://llvm-compile-time-tracker.com/compare.php?from=215f35eb8f1c313ac135ad47db1cc0b99b3ae694&to=51f6617517177bea1cc49baeab3acaf62d5e9df9&stat=instructions%3Au
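The "parse once, query many times" idea can be modelled with a small Python sketch. All classes below are simplified stand-ins rather than the LLVM types, only three attribute values are handled, and the `has_calls` condition for "non-leaf" is an assumption of this sketch. The `lookups` counter exists only to show that the attribute is read a single time.

```python
from enum import Enum


class FramePointerKind(Enum):
    NONE = "none"
    NON_LEAF = "non-leaf"
    ALL = "all"


class Function:
    """Stand-in for an IR function carrying string attributes."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)
        self.lookups = 0  # counts attribute lookups, for illustration only

    def get_attribute(self, name):
        self.lookups += 1
        return self._attrs.get(name)


class MachineFrameInfo:
    def __init__(self, frame_pointer_kind):
        self.frame_pointer_kind = frame_pointer_kind


class MachineFunction:
    def __init__(self, fn):
        # Parse the attribute once, when the MachineFunction is created.
        value = fn.get_attribute("frame-pointer")
        kind = FramePointerKind(value) if value is not None else FramePointerKind.NONE
        self.frame_info = MachineFrameInfo(kind)


def disable_frame_pointer_elim(mf, has_calls):
    """Hot query: reads the cached enum, never the string attribute."""
    kind = mf.frame_info.frame_pointer_kind
    if kind is FramePointerKind.ALL:
        return True
    if kind is FramePointerKind.NON_LEAF:
        return has_calls
    return False
```

However many times `disable_frame_pointer_elim` is called, `fn.lookups` stays at 1 after the `MachineFunction` is built.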
[lldb][RISCV] Implement access to TLS variables on RISC-V (#191410)
On RISC-V Linux, LLDB computes TLS variable addresses incorrectly:
`GetThreadLocalData` returns a correct tls_block, but then
unconditionally adds tls_file_addr from `DW_OP_GNU_push_tls_address`,
which on RISC-V/glibc is a VMA inside PT_TLS, not a pure offset. This
results in an over-shifted address.
This patch:
* Adds a small helper that, for an ELF module, finds the PT_TLS program
header and reads its p_vaddr.
* In `DynamicLoaderPOSIXDYLD::GetThreadLocalData`, normalizes
tls_file_addr to an offset: if `PT_TLS` is found and tls_file_addr >=
p_vaddr, it uses tpoff = tls_file_addr - p_vaddr, otherwise keeps the
old value.
* Returns tls_block + tpoff instead of always tls_block + tls_file_addr.
[9 lines not shown]
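The address arithmetic from the list above can be written out as a short Python sketch. The function names are hypothetical, and `pt_tls_vaddr` is `None` when no PT_TLS program header was found. It mirrors only the normalization rule stated in the message, not the actual `DynamicLoaderPOSIXDYLD` code.

```python
from typing import Optional


def normalize_tls_offset(tls_file_addr: int, pt_tls_vaddr: Optional[int]) -> int:
    """Convert tls_file_addr to an offset within the TLS block when it
    looks like a VMA inside PT_TLS; otherwise keep the old value."""
    if pt_tls_vaddr is not None and tls_file_addr >= pt_tls_vaddr:
        return tls_file_addr - pt_tls_vaddr
    return tls_file_addr


def tls_variable_address(tls_block: int, tls_file_addr: int,
                         pt_tls_vaddr: Optional[int]) -> int:
    """tls_block + tpoff instead of always tls_block + tls_file_addr."""
    return tls_block + normalize_tls_offset(tls_file_addr, pt_tls_vaddr)
```

With made-up values `tls_block = 0x7000`, `tls_file_addr = 0x1e010` and `p_vaddr = 0x1e000`, the result is `0x7010`, whereas the old computation would have produced the over-shifted `0x25010`.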
[mlir][tensor] Preserve tensor encodings when materializing tensor.empty in some passes (#192411)
This PR fixes tensor encoding propagation bugs in some `tensor.empty`
materialization paths that could produce type-invalid IR (encoded result
expected, unencoded value produced).
Assisted-by: Cursor (Codex 5.3)