[LoopFusion] Validate loop structure before creating LoopCandidates (#192280)
This patch deletes the assert that required the loop to have a
preheader. A loop is not guaranteed to have a preheader when it is
structured using `indirectbr`. Instead, we now rely on the header.
Fixes #156670.
[lldb] Add caching and _NT_SYMBOL_PATH parsing in SymbolLocatorSymStore (#191782)
The _NT_SYMBOL_PATH environment variable is the idiomatic way to set a
system-wide lookup order of symbol servers and a local cache for
SymStore. It holds a semicolon-separated list of entries in the
following notations:
* srv*[<cache>*]<source> sets a source and an optional explicit cache
* cache*<cache> sets an implicit cache for all subsequent entries
* all other entries are bare local directories
Since symbol paths are closely intertwined with the caching of symbol
files, this patch proposes support in LLDB for both features at once.
ParseEnvSymbolPaths() implements the parsing logic, which processes
entries of the symbol path string from left to right to create a series
of LookupEntry objects that each store a source and a cache location.
The source of a LookupEntry can be a local directory or an HTTP server
address. The cache is a local directory or empty. This representation
unifies the implicit vs. explicit caching options from the SymStore
protocol.
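The left-to-right processing described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `parseSymbolPath`, the `split` helper, and the field names of this `LookupEntry` stand-in are hypothetical. It also assumes lowercase `srv`/`cache` keywords and silently skips malformed entries.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the LookupEntry described above: one source
// (local directory or HTTP server address) plus an optional cache directory.
struct LookupEntry {
  std::string source;
  std::string cache; // empty means "no cache"
};

// Splits `text` on `sep`, keeping empty pieces.
static std::vector<std::string> split(const std::string &text, char sep) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (true) {
    std::string::size_type end = text.find(sep, start);
    if (end == std::string::npos) {
      parts.push_back(text.substr(start));
      return parts;
    }
    parts.push_back(text.substr(start, end - start));
    start = end + 1;
  }
}

// Illustrative left-to-right parse of a symbol path string. This is an
// assumption-laden sketch, not LLDB's actual ParseEnvSymbolPaths().
std::vector<LookupEntry> parseSymbolPath(const std::string &path) {
  std::vector<LookupEntry> entries;
  std::string implicit_cache; // set by the most recent cache*<cache> entry
  for (const std::string &item : split(path, ';')) {
    if (item.empty())
      continue;
    std::vector<std::string> fields = split(item, '*');
    if (fields[0] == "cache" && fields.size() == 2) {
      implicit_cache = fields[1]; // applies to all subsequent entries
    } else if (fields[0] == "srv" && fields.size() == 3) {
      entries.push_back({fields[2], fields[1]}); // srv*<cache>*<source>
    } else if (fields[0] == "srv" && fields.size() == 2) {
      entries.push_back({fields[1], implicit_cache}); // srv*<source>
    } else if (fields.size() == 1) {
      entries.push_back({item, implicit_cache}); // bare local directory
    }
    // Anything else is treated as malformed and skipped in this sketch.
  }
  return entries;
}
```

Storing the cache on every entry, whether it came from `srv*<cache>*<source>` or from an earlier `cache*<cache>`, is what lets a single representation cover both the explicit and implicit caching forms.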
[22 lines not shown]
[AArch64][GlobalISel] Move KnownBitsVectorTest to mir. NFC (#192536)
This ports some of the older C++ GlobalISel known-bits tests to use
print<gisel-value-tracking> in a mir file. This is mostly autogenerated,
but attempts to keep the existing comments. Some tests have not been
ported as they are entirely in C++ or tested isKnownToBeAPowerOfTwo,
which is not tested in the print output.
[CIR] add vsqrt and vsqrtq support (#192282)
Part of https://github.com/llvm/llvm-project/issues/185382
co-authored by: @Kouunnn <xerw1314 at gmail.com>
---------
Co-authored-by: Zile Xiong <xiongzile99 at gmail.com>
Co-authored-by: ZCkouun <1765074320 at qq.com>
[compiler-rt] Don't provide `__arm_sme_state` for baremetal targets (#191434)
Previously, we required baremetal runtimes to implement an undocumented
`__aarch64_sme_accessible` hook to check if SME is available (as
checking CPU features may vary across targets).
This allowed us to provide a generic `__arm_sme_state` implementation
but caused some friction for toolchains that depend on compiler-rt.
This patch instead removes the implementation of `__arm_sme_state` for
baremetal. This makes it the responsibility of the runtime (e.g. libc)
to provide this function for baremetal targets.
The requirements of this function are documented in the AAPCS64:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#811__arm_sme_state
All other SME ABI routines are still provided by compiler-rt.
[mlir] reduce excessive verification in transform (#192653)
`mergeSymbolsInto` called by the transform interpreter for named
sequence management was calling a full verifier after renaming symbols.
The renaming could have potentially broken symbol table-related
invariants, but not really anything else. Only verify the symbol
table-related invariants instead.
[AArch64] Fix lowering of non-power2 uitofp (#190921)
The code in DAGTypeLegalizer::SplitVecOp_TruncateHelper attempts to use
getFloatingPointVT(InElementSize/2), which is invalid for non-power2
type sizes. Fall back to the existing SplitVecOp_UnaryOp in this case.
[CodeGen] Parse frame-pointer attribute once when creating MachineFunction (#191974)
TargetOptions::DisableFramePointerElim is hot and showing up in
compile-time profiles via AArch64FrameLowering::hasFPImpl on
aarch64-O0-g builds. Repeatedly looking up the function attribute is
expensive. Parsing it once at MachineFunction initialisation and storing
as FramePointerKind on MachineFrameInfo is a -0.21% geomean improvement
on CTMark stage1-aarch64-O0-g. Also helps debug builds on other targets.
https://llvm-compile-time-tracker.com/compare.php?from=215f35eb8f1c313ac135ad47db1cc0b99b3ae694&to=51f6617517177bea1cc49baeab3acaf62d5e9df9&stat=instructions%3Au
[lldb][RISCV] Implement access to TLS variables on RISC-V (#191410)
On RISC-V Linux, LLDB computes TLS variable addresses incorrectly:
`GetThreadLocalData` returns a correct tls_block, but then
unconditionally adds tls_file_addr from `DW_OP_GNU_push_tls_address`,
which on RISC-V/glibc is a VMA inside PT_TLS, not a pure offset. This
results in an over-shifted address.
This patch:
* Adds a small helper that, for an ELF module, finds the PT_TLS program
header and reads its p_vaddr.
* In `DynamicLoaderPOSIXDYLD::GetThreadLocalData`, normalizes
tls_file_addr to an offset: if `PT_TLS` is found and tls_file_addr >=
p_vaddr, it uses tpoff = tls_file_addr - p_vaddr, otherwise keeps the
old value.
* Returns tls_block + tpoff instead of always tls_block + tls_file_addr.
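The normalization in the list above amounts to the following arithmetic. This is a sketch only: `normalizeTlsOffset` and `computeTlsAddress` are hypothetical names, and the PT_TLS `p_vaddr` is passed in as an optional value rather than read from an ELF module as the patch does.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Illustrative only: mirrors the normalization described above, not the
// actual code in DynamicLoaderPOSIXDYLD::GetThreadLocalData.
// `pt_tls_vaddr` is the p_vaddr of the module's PT_TLS program header,
// or std::nullopt when no PT_TLS header was found.
std::uint64_t normalizeTlsOffset(std::uint64_t tls_file_addr,
                                 std::optional<std::uint64_t> pt_tls_vaddr) {
  // If the value looks like a VMA inside PT_TLS, turn it into an offset.
  if (pt_tls_vaddr && tls_file_addr >= *pt_tls_vaddr)
    return tls_file_addr - *pt_tls_vaddr;
  // Otherwise keep the old value, which is assumed to be an offset already.
  return tls_file_addr;
}

// Final address: tls_block + tpoff rather than tls_block + tls_file_addr.
std::uint64_t computeTlsAddress(std::uint64_t tls_block,
                                std::uint64_t tls_file_addr,
                                std::optional<std::uint64_t> pt_tls_vaddr) {
  return tls_block + normalizeTlsOffset(tls_file_addr, pt_tls_vaddr);
}
```

With made-up values `tls_block = 0x7000`, `tls_file_addr = 0x2010` and `p_vaddr = 0x2000`, the old computation would give `0x9010`, while the normalized one gives `0x7010`.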
[9 lines not shown]
[mlir][tensor] Preserve tensor encodings when materializing tensor.empty in some passes (#192411)
This PR fixes tensor encoding propagation bugs in some `tensor.empty`
materialization paths that could produce type-invalid IR (encoded result
expected, unencoded value produced).
Assisted-by: Cursor (Codex 5.3)
[AArch64][GlobalISel] FP Info implementation for AArch64. (#177158)
This work sits on top of #155107. The aim is to implement support for
extended types in the AArch64 backend.
Much of the implementation just builds upon #155107 but features changes to
the MatchTableExecutor to allow for matching multiple patterns to reduce
the need for duplicated patterns. This patch also features a new match
table opcode to match a pattern based on the shape of a type.
---------
Co-authored-by: David Green <david.green at arm.com>
[libc] Add "struct linger" (#192606)
Add a simple test to get/set the socket option. I didn't try to test the
actual lingering behavior. That sounds complicated and I'm not sure if
it's even doable on a loopback connection.
[LoongArch] Combine rounded vector shifts to VSRLR/VSRAR
Add DAG combines to recognize canonical rounded shift patterns and
lower them to target-specific vector rounded shift instructions.
The combines match vector arithmetic and logical right shifts with
rounding implemented as:
```
add (srl/sra X, shift),
(and (srl X, shift-1), 1)
```
and the shift-by-1 variant:
```
add (srl/sra X, 1),
(and X, 1)
```
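As a scalar, per-lane illustration of why these patterns are rounded shifts, the sketch below compares the logical-shift form against a widened reference that adds half of the divisor before shifting. It assumes the usual round-half-up definition; the function names are hypothetical, and this is not the DAG combine itself.

```cpp
#include <cassert>
#include <cstdint>

// Reference: rounded logical right shift, computed in 64 bits so that
// adding the rounding constant cannot overflow.
std::uint32_t roundedSrlReference(std::uint32_t x, unsigned shift) {
  assert(shift >= 1 && shift < 32);
  std::uint64_t wide = static_cast<std::uint64_t>(x);
  std::uint64_t half = std::uint64_t{1} << (shift - 1);
  return static_cast<std::uint32_t>((wide + half) >> shift);
}

// The matched pattern: add (srl X, shift), (and (srl X, shift-1), 1).
// Bit (shift-1) of X is set exactly when the discarded low bits are at
// least half of 2^shift, so adding it rounds the truncated result up.
std::uint32_t roundedSrlPattern(std::uint32_t x, unsigned shift) {
  assert(shift >= 1 && shift < 32);
  return (x >> shift) + ((x >> (shift - 1)) & 1u);
}
```

For `shift == 1` the second term reduces to `x & 1`, which corresponds to the shift-by-1 variant shown above.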
[14 lines not shown]
[lldbserver] Implement support for MultiBreakpoint packet
This is fairly straightforward, thanks to the helper functions created in
the previous commit.
https://github.com/llvm/llvm-project/pull/192910
[lldbremote][NFC] Factor out code handling breakpoint packets
This commit extracts the code handling breakpoint packets into a helper
function that can be used by a future implementation of the
MultiBreakpointPacket.
It is meant to be purely NFC.
There are two functions handling breakpoint packets (`handle_Z`
and `handle_z`) with a lot of repeated code. This commit did not attempt
to merge the two, as that would make the diff much larger due to subtle
differences in the error message produced by the two. The only
deduplication done is in the code processing a GDBStoppointType, where a
helper struct (`BreakpointKind`) and function (`std::optional<BreakpointKind> getBreakpointKind(GDBStoppointType stoppoint_type)`) were created.
https://github.com/llvm/llvm-project/pull/192910
[lldb][docs] Update standalone build instructions (#192613)
* LLVM requires CMake 3.20
(https://llvm.org/docs/GettingStarted.html#software) so we do not need
to mention 3.14 anymore.
* CMAKE_BUILD_TYPE was listed twice in one command.
* "ninja" only works when in the build directory or given `-C <dir>`, so
I have changed that to "cmake --build" which works with ninja and other
build tools.
[DWARFYAML] Add support for v5 debug_line file/dir entries (#192226)
This lets us specify all fields in the v5 header. Since v5 entries are
form-based, I've extracted the relevant parts of the debug_info DIE
writing code so it could be reused here as well.
The v5 file and directory entries are more expressive than <=v4 ones, so
one could in theory store everything in the v5 format, while still
reading (YAML) and writing (raw DWARF) in the old format. However, that
would create more corner cases (what if the data cannot be represented
in the older format), and it didn't seem like it was particularly
worthwhile to handle those.
[libcxx][ci] Set CMAKE_C_COMPILER_TARGET for all Arm/AArch64 builds (#192645)
As requested on #192493.
This is not strictly needed for native builds, but setting only
CMAKE_CXX_COMPILER_TARGET does look suspicious. Especially as we often
set both CXX_FLAGS and C_FLAGS in the same builds.
Set both C_COMPILER_TARGET and CXX_COMPILER_TARGET so no one has to
wonder if it's the cause of a problem.
(note that picolibc builds are already setting both)
[NFC][analyzer] Eliminate BranchNodeBuilder (#192744)
This commit removes the class `BranchNodeBuilder` because it didn't
provide enough advantages to justify its existence. (This class wasn't
as bad as `SwitchNodeBuilder` and `IndirectGotoNodeBuilder`, but its
single method was very simple so I still think that it is better to
eliminate it.)
To be able to unify the use of `LocationContext`s in `processBranch`,
this commit also replaces the overcomplicated helper function
`getInlinedLocationContext` with use of `LocationContext::inTopFrame()`.
The code of this helper function was written in March 2012, before the
introduction of `inTopFrame` (November 2012). It was originally in the
body of the method `ExprEngine::processCFGBlockEntrance` and when I
extracted it in 2025 by commit 9600a12f0de233324b559f60997b9c2db153fede
(because I also needed to call it from `processBranch`) I didn't notice
that I can achieve the same goal with simpler logic.
Additionally, a last stray reference to `friend class SwitchNodeBuilder`
is removed (it was overlooked when I removed that class in
c80443cd37b2e2788cba67ffa180a6331e5f0791).
[debugserver][NFCI] Factor out logic handling breakpoint packets
This creates a helper function that can be shared with the upcoming
MultiBreakpoint packet.
It is largely NFC with one logging exception: where we were previously
sending an "ILL_FORMED" error, we would identify the file/line in the
location where the error was emitted. This has been lost; however, the
error messages are still unique enough that the line can be recovered
from the error message, and this is a principle we've been trying to
follow.
There was also a comment about gdb and refcounting of breakpoints. That
comment was removed as it did not seem applicable to any line of code.
[flang] preserve representation of logical constants from TRANSFER (#192417)
This patch fixes the reproducer from
https://github.com/llvm/llvm-project/issues/192234.
Logical constants have a canonical representation in flang (0 and 1, which
is compiler specific), but other values can be obtained via TRANSFER
from integer (in which case they are true if and only if different from
zero, which is a compiler specific interpretation).
While using TRANSFER to obtain logical is not recommended because it is
not portable, the standard does guarantee that round trips preserve
the value.
This was not the case with constants because lowering was always
lowering logical constants to either 0 or 1 even when they are obtained
via TRANSFER. This patch fixes this by lowering non-canonical logical
constants to an integer constant + bitcast. A folder is added to fold
converts to i1 of such bitcast so that usage of such TRANSFER in
branches for instance can still be optimized at the MLIR level.