[libc++][test] Unblock cases for `ranges::sort` with proxy ranges (#188490)
libc++ switched to `iter_move`/`iter_swap` a long time ago, so we
should unblock these cases.
[SPIR-V] Fix image query and sampler type (#190767)
- Use OpImageQuerySize instead of OpImageQuerySizeLod for multisampled
images: the SPIR-V spec requires MS=0 for OpImageQuerySizeLod
- Use `target("spirv.Sampler")` instead of i32 for non-constant sampler
kernel parameters so they produce OpTypeSampler as required by
OpSampledImage
Related to https://github.com/llvm/llvm-project/issues/190736
[LoopFusion] Validate loop structure before creating LoopCandidates (#192280)
This patch deletes the assert that required the loop to have a
preheader. A preheader is not guaranteed to exist when loops
are structured using `indirectbr`. Instead, we now rely on the header.
Fixes #156670.
[lldb] Add caching and _NT_SYMBOL_PATH parsing in SymbolLocatorSymStore (#191782)
The _NT_SYMBOL_PATH environment variable is the idiomatic way to set a
system-wide lookup order of symbol servers and a local cache for
SymStore. It holds a semicolon-separated list of entries in the
following notations:
* srv*[<cache>*]<source> sets a source and an optional explicit cache
* cache*<cache> sets an implicit cache for all subsequent entries
* all other entries are bare local directories
Since symbol paths are closely intertwined with the caching of symbol
files, this patch proposes support in LLDB for both features at once.
ParseEnvSymbolPaths() implements the parsing logic, which processes
entries of the symbol path string from left to right to create a series
of LookupEntry objects that each store a source and a cache location.
The source of a LookupEntry can be a local directory or an HTTP server
address. The cache is a local directory or empty. This representation
unifies the implicit vs. explicit caching options from the SymStore
protocol.
[22 lines not shown]
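The left-to-right parsing rules described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the `LookupEntry` layout, the `ParseSymbolPath` and `Split` names, and the choice to apply the implicit cache to bare directories are all assumptions, and details such as case-insensitive keywords are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the patch's LookupEntry: a source plus an
// optional cache directory (an empty string means "no cache").
struct LookupEntry {
  std::string source;
  std::string cache;
};

// Split `text` on `sep`, keeping empty pieces. Always returns >= 1 element.
static std::vector<std::string_view> Split(std::string_view text, char sep) {
  std::vector<std::string_view> parts;
  std::size_t start = 0;
  while (true) {
    std::size_t end = text.find(sep, start);
    if (end == std::string_view::npos) {
      parts.push_back(text.substr(start));
      return parts;
    }
    parts.push_back(text.substr(start, end - start));
    start = end + 1;
  }
}

// Parse a semicolon-separated symbol path from left to right.
std::vector<LookupEntry> ParseSymbolPath(std::string_view path) {
  std::vector<LookupEntry> entries;
  std::string implicit_cache; // set by the most recent "cache*<dir>" entry
  for (std::string_view item : Split(path, ';')) {
    if (item.empty())
      continue;
    std::vector<std::string_view> fields = Split(item, '*');
    assert(!fields.empty());
    if (fields[0] == "cache" && fields.size() == 2) {
      // cache*<cache>: remember the cache for all subsequent entries.
      implicit_cache = std::string(fields[1]);
    } else if (fields[0] == "srv" && fields.size() == 3) {
      // srv*<cache>*<source>: the explicit cache wins over the implicit one.
      entries.push_back({std::string(fields[2]), std::string(fields[1])});
    } else if (fields[0] == "srv" && fields.size() == 2) {
      // srv*<source>: fall back to the implicit cache, if any.
      entries.push_back({std::string(fields[1]), implicit_cache});
    } else {
      // Anything else is treated as a bare local directory (assumption:
      // it also inherits the implicit cache).
      entries.push_back({std::string(item), implicit_cache});
    }
  }
  return entries;
}
```

Storing the resolved cache on every entry is what lets one representation cover both the implicit and explicit caching notations: once parsing is done, a consumer never needs to know which notation produced the entry.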
[AArch64][GlobalISel] Move KnownBitsVectorTest to mir. NFC (#192536)
This ports some of the older C++ GlobalISel known-bits tests to use
print<gisel-value-tracking> in a mir file. This is mostly autogenerated,
but attempts to keep the existing comments. Some tests have not been
ported as they are entirely in C++ or tested isKnownToBeAPowerOfTwo,
which is not tested in the print output.
[CIR] add vsqrt and vsqrtq support (#192282)
Part of https://github.com/llvm/llvm-project/issues/185382
co-authored by: @Kouunnn <xerw1314 at gmail.com>
---------
Co-authored-by: Zile Xiong <xiongzile99 at gmail.com>
Co-authored-by: ZCkouun <1765074320 at qq.com>
[compiler-rt] Don't provide `__arm_sme_state` for baremetal targets (#191434)
Previously, we required baremetal runtimes to implement an undocumented
`__aarch64_sme_accessible` hook to check if SME is available (as
checking CPU features may vary across targets).
This allowed us to provide a generic `__arm_sme_state` implementation
but caused some friction for toolchains that depend on compiler-rt.
This patch instead removes the implementation of `__arm_sme_state` for
baremetal. This makes it the responsibility of the runtime (e.g. libc)
to provide this function for baremetal targets.
The requirements of this function are documented in the AAPCS64:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#811__arm_sme_state
All other SME ABI routines are still provided by compiler-rt.
[mlir] reduce excessive verification in transform (#192653)
`mergeSymbolsInto` called by the transform interpreter for named
sequence management was calling a full verifier after renaming symbols.
The renaming could have potentially broken symbol table-related
invariants, but not really anything else. Only verify the symbol
table-related invariants instead.
[AArch64] Fix lowering of non-power2 uitofp (#190921)
The code in DAGTypeLegalizer::SplitVecOp_TruncateHelper attempts to use
getFloatingPointVT(InElementSize/2), which is invalid for non-power2
type sizes. Fall back to the existing SplitVecOp_UnaryOp in this case.
[CodeGen] Parse frame-pointer attribute once when creating MachineFunction (#191974)
TargetOptions::DisableFramePointerElim is hot and showing up in
compile-time profiles via AArch64FrameLowering::hasFPImpl on
aarch64-O0-g builds. Repeatedly looking up the function attribute is
expensive. Parsing it once at MachineFunction initialisation and storing
as FramePointerKind on MachineFrameInfo is a -0.21% geomean improvement
on CTMark stage1-aarch64-O0-g. Also helps debug builds on other targets.
https://llvm-compile-time-tracker.com/compare.php?from=215f35eb8f1c313ac135ad47db1cc0b99b3ae694&to=51f6617517177bea1cc49baeab3acaf62d5e9df9&stat=instructions%3Au
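The parse-once idea can be illustrated with a small standalone sketch. Everything here is hypothetical and simplified: `FrameInfoSketch` and `AttributeMap` stand in for the real LLVM classes, and only three attribute values are modelled. The point is only that the string lookup happens in the constructor, while the hot query reads a stored enum.

```cpp
#include <map>
#include <string>
#include <string_view>

// Simplified set of kinds; the real enum may have more members.
enum class FramePointerKind { None, NonLeaf, All };

// Map the attribute string to an enum. Unknown or missing values fall back
// to None in this sketch.
FramePointerKind ParseFramePointerKind(std::string_view value) {
  if (value == "all")
    return FramePointerKind::All;
  if (value == "non-leaf")
    return FramePointerKind::NonLeaf;
  return FramePointerKind::None;
}

// Hypothetical stand-in for a function's string attributes.
using AttributeMap = std::map<std::string, std::string>;

class FrameInfoSketch {
public:
  // The attribute is looked up and parsed exactly once, at construction.
  explicit FrameInfoSketch(const AttributeMap &attrs) {
    auto it = attrs.find("frame-pointer");
    if (it != attrs.end())
      kind_ = ParseFramePointerKind(it->second);
  }

  // Hot-path query: no string lookup or comparison, just a field read.
  FramePointerKind getFramePointerKind() const { return kind_; }

private:
  FramePointerKind kind_ = FramePointerKind::None;
};
```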
[lldb][RISCV] Implement access to TLS variables on RISC-V (#191410)
On RISC-V Linux, LLDB computes TLS variable addresses incorrectly:
`GetThreadLocalData` returns a correct tls_block, but then
unconditionally adds tls_file_addr from `DW_OP_GNU_push_tls_address`,
which on RISC-V/glibc is a VMA inside PT_TLS, not a pure offset. This
results in an over-shifted address.
This patch:
* Adds a small helper that, for an ELF module, finds the PT_TLS program
header and reads its p_vaddr.
* In `DynamicLoaderPOSIXDYLD::GetThreadLocalData`, normalizes
tls_file_addr to an offset: if `PT_TLS` is found and tls_file_addr >=
p_vaddr, it uses tpoff = tls_file_addr - p_vaddr, otherwise keeps the
old value.
* Returns tls_block + tpoff instead of always tls_block + tls_file_addr.
[9 lines not shown]
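The normalization described in the bullets above amounts to a small piece of arithmetic, sketched here in isolation. The function names and the use of `std::optional` for "PT_TLS not found" are assumptions for illustration; this is not the code from the patch.

```cpp
#include <cstdint>
#include <optional>

// Turn tls_file_addr into an offset within the TLS block. `pt_tls_vaddr` is
// the p_vaddr of the module's PT_TLS program header, or nullopt if the
// module has none.
std::uint64_t NormalizeTlsOffset(std::uint64_t tls_file_addr,
                                 std::optional<std::uint64_t> pt_tls_vaddr) {
  if (pt_tls_vaddr && tls_file_addr >= *pt_tls_vaddr)
    return tls_file_addr - *pt_tls_vaddr; // VMA inside PT_TLS -> offset
  return tls_file_addr;                   // already an offset: keep old value
}

// Final address: tls_block + tpoff rather than tls_block + tls_file_addr.
std::uint64_t ComputeTlsAddress(std::uint64_t tls_block,
                                std::uint64_t tls_file_addr,
                                std::optional<std::uint64_t> pt_tls_vaddr) {
  return tls_block + NormalizeTlsOffset(tls_file_addr, pt_tls_vaddr);
}
```

With a made-up `p_vaddr` of `0x1e00`, a `tls_file_addr` of `0x1e10` normalizes to the offset `0x10`, whereas a value of `0x10` is below `p_vaddr` and is passed through unchanged.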
[mlir][tensor] Preserve tensor encodings when materializing tensor.empty in some passes (#192411)
This PR fixes tensor encoding propagation bugs in some `tensor.empty`
materialization paths that could produce type-invalid IR (encoded result
expected, unencoded value produced).
Assisted-by: Cursor (Codex 5.3)
[AArch64][GlobalISel] FP Info implementation for AArch64. (#177158)
This work sits on top of #155107. The aim is to implement support for
extended types in the AArch64 backend.
Much of the implementation just builds upon #155107 but features changes to
the MatchTableExecutor to allow for matching multiple patterns to reduce
the need for duplicated patterns. This patch also features a new match
table opcode to match a pattern based on the shape of a type.
---------
Co-authored-by: David Green <david.green at arm.com>
[libc] Add "struct linger" (#192606)
Add a simple test to get/set the socket option. I didn't try to test the
actual lingering behavior. That sounds complicated and I'm not sure if
it's even doable on a loopback connection.
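A get/set round trip of this kind might look roughly like the sketch below. It is not the test added by the commit: it uses the host's POSIX socket API directly, and the `RoundTripLinger` helper name is made up for illustration.

```cpp
#include <optional>
#include <sys/socket.h>
#include <unistd.h>

// Set SO_LINGER on a fresh TCP socket, read it back, and return what the
// kernel reports. Returns nullopt if any system call fails.
std::optional<linger> RoundTripLinger(int onoff, int seconds) {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0)
    return std::nullopt;

  linger in{};
  in.l_onoff = onoff;    // nonzero enables lingering on close
  in.l_linger = seconds; // linger timeout in seconds

  linger out{};
  socklen_t len = sizeof(out);
  bool ok = setsockopt(fd, SOL_SOCKET, SO_LINGER, &in, sizeof(in)) == 0 &&
            getsockopt(fd, SOL_SOCKET, SO_LINGER, &out, &len) == 0;
  close(fd);
  if (!ok)
    return std::nullopt;
  return out;
}
```

This only checks that the option value survives a set followed by a get. Like the commit's test, it does not exercise the actual lingering behavior on close.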
[LoongArch] Combine rounded vector shifts to VSRLR/VSRAR
Add DAG combines to recognize canonical rounded shift patterns and
lower them to target-specific vector rounded shift instructions.
The combines match vector arithmetic and logical right shifts with
rounding implemented as:
```
add (srl/sra X, shift),
(and (srl X, shift-1), 1)
```
and the shift-by-1 variant:
```
add (srl/sra X, 1),
(and X, 1)
```
[14 lines not shown]
[lldbserver] Implement support for MultiBreakpoint packet
This is fairly straightforward, thanks to the helper functions created in
the previous commit.
https://github.com/llvm/llvm-project/pull/192910
[lldbremote][NFC] Factor out code handling breakpoint packets
This commit extracts the code handling breakpoint packets into a helper
function that can be used by a future implementation of the
MultiBreakpointPacket.
It is meant to be purely NFC.
There are two functions handling breakpoint packets (`handle_Z`
and `handle_z`) with a lot of repeated code. This commit does not attempt
to merge the two, as that would make the diff much larger due to subtle
differences in the error messages produced by the two. The only
deduplication done is in the code processing a GDBStoppointType, where a
helper struct (`BreakpointKind`) and function (`std::optional<BreakpointKind> getBreakpointKind(GDBStoppointType stoppoint_type)`) were created.
https://github.com/llvm/llvm-project/pull/192910
[lldb][docs] Update standalone build instructions (#192613)
* LLVM requires CMake 3.20
(https://llvm.org/docs/GettingStarted.html#software) so we do not need
to mention 3.14 anymore.
* CMAKE_BUILD_TYPE was listed twice in one command.
* "ninja" only works when in the build directory or given `-C <dir>`, so
I have changed that to "cmake --build" which works with ninja and other
build tools.
[DWARFYAML] Add support for v5 debug_line file/dir entries (#192226)
This lets us specify all fields in the v5 header. Since v5 entries are
form-based, I've extracted the relevant parts of the debug_info DIE
writing code so it could be reused here as well.
The v5 file and directory entries are more expressive than <=v4 ones, so
one could in theory store everything in the v5 format, while still
reading (YAML) and writing (raw DWARF) in the old format. However, that
would create more corner cases (what if the data cannot be represented
in the older format), and it didn't seem like it was particularly
worthwhile to handle those.
[libcxx][ci] Set CMAKE_C_COMPILER_TARGET for all Arm/AArch64 builds (#192645)
As requested on #192493.
This is not strictly needed for native builds, but setting only
CMAKE_CXX_COMPILER_TARGET does look suspicious, especially as we often
set both CXX_FLAGS and C_FLAGS in the same builds.
Set both C_COMPILER_TARGET and CXX_COMPILER_TARGET so no one has to
wonder if it's the cause of a problem.
(Note that picolibc builds already set both.)