Bump minimum required sphinx Python to 3.8 (#203963)
There seems to be de-facto use of at least 3.6 in docs, namely:
* Use of pathlib (3.4) in various places
* f-strings (3.6), used in clang/docs/ghlinks.py
I don't see a strong reason to maintain the divide in minimum version
between test/docs, especially considering the "FIXME" indicating
the 3.0 lower bound was just a guess to begin with.
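As a rough illustration of the two features named above, the sketch below combines pathlib and an f-string and adds a version check against the new floor. The helper names `describe_doc` and `meets_minimum` are hypothetical and do not come from the docs scripts.

```python
import sys
from pathlib import Path

# The new floor named in the commit title.
MIN_PYTHON = (3, 8)


def describe_doc(path_str):
    """Format a short label for a docs file (hypothetical helper)."""
    path = Path(path_str)  # pathlib: Python 3.4+
    # f-string: Python 3.6+
    return f"{path.stem} ({path.suffix or 'no extension'})"


def meets_minimum(version_info=sys.version_info):
    """Return True if the given interpreter version satisfies the floor."""
    return tuple(version_info[:2]) >= MIN_PYTHON
```

On a 3.5 interpreter the module above would fail at parse time because of the f-string, which is why the effective lower bound was already above 3.0.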
Change-Id: I11e00295ae0a13ec0f1c5cefbb2fdd2db272b152
[llvm-profgen] Enable all AArch64 instructions for disassembly (#204619)
llvm-profgen builds its MCSubtargetInfo from
`ObjectFile::getFeatures()`. For AArch64 ELF objects this often produces
an empty feature set, so the disassembler falls back to the baseline
Armv8.0-A ISA and rejects valid feature-gated instructions such as LSE
atomics and RCPC loads.
`llvm-objdump` already handles this by [adding +all for AArch64
disassembly](https://github.com/llvm/llvm-project/blob/1e2d1bbc12f6a5f5931c77d39894ee1b8679f5f8/llvm/tools/llvm-objdump/llvm-objdump.cpp#L2823-L2824)
when neither -mattr nor -mcpu is specified. Match that behavior in
`llvm-profgen` so valid AArch64 instructions are not reported as invalid
and their addresses are preserved in profgen's code and branch maps.
Add a regression test covering an AArch64 binary containing `ldaddal`
and `ldapr` without object-level feature metadata.
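The rule described above can be modelled in a few lines. This is only a sketch of the decision as the message states it for `llvm-objdump`, not the tool's code; the function name and parameters are hypothetical.

```python
def disassembler_features(arch, object_features, mattr=None, mcpu=None):
    """Model the feature string handed to the subtarget (hypothetical helper).

    object_features stands in for what ObjectFile::getFeatures() reports,
    which is often empty for AArch64 ELF objects.
    """
    features = list(object_features)
    if mattr:
        features.extend(mattr)
    # Only fall back to "+all" when the user gave neither -mattr nor -mcpu.
    if arch == "aarch64" and not mattr and not mcpu:
        features.append("+all")
    return ",".join(features)
```

With an empty object feature set and no user overrides, an AArch64 input yields `+all`, so feature-gated instructions such as `ldaddal` and `ldapr` decode instead of being rejected.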
---------
Co-authored-by: Kunal Pathak <kupathak at fb.com>
[SDAG][LegalizeType] Implement result vector widening for VECTOR_DEINTERLEAVE (#203105)
I accidentally found that we haven't implemented result vector widening
for `ISD::VECTOR_DEINTERLEAVE`. This patch implements such type
legalization.
---------
Co-authored-by: Simon Pilgrim <git at redking.me.uk>
Co-authored-by: Craig Topper <craig.topper at sifive.com>
[flang][cuda] Apply implicit managed attribute to pointer variables under -gpu=mem:managed (#204634)
When -gpu=mem:managed is active with CUDA Fortran enabled, only
allocatable variables were implicitly given the managed CUDA data
attribute. Pointer variables were left without it, causing their
allocations to use host memory instead of cudaMallocManaged.
This patch extends the implicit managed attribute in
FinishSpecificationPart to also cover pointer symbols. A
LanguageFeature::CUDA guard is added so the attribute is only applied
when CUDA Fortran semantics are active. The implicit pinned attribute
(-gpu=mem:pinned) remains allocatable-only.
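The resulting behavior can be summarized as a small decision function. This is a simplified model, not the code in FinishSpecificationPart: the function name and arguments are hypothetical, explicit user-specified attributes are ignored, and the CUDA guard is applied to both modes for simplicity even though the message only states it for the managed case.

```python
def implicit_cuda_attr(is_allocatable, is_pointer, cuda_enabled, mem_mode):
    """Return the implicit CUDA data attribute for a symbol, or None.

    mem_mode models the -gpu=mem:<mode> setting ("managed", "pinned", or None).
    """
    if not cuda_enabled:
        # Assumption: nothing is applied without CUDA Fortran semantics.
        return None
    if mem_mode == "managed" and (is_allocatable or is_pointer):
        return "managed"  # pointers are now covered as well
    if mem_mode == "pinned" and is_allocatable:
        return "pinned"   # still allocatable-only
    return None
```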
[RFC][NFCI][IR] Extract AMDGPU-specific verification logic into `VerifierAMDGPU.cpp` (#204284)
`Verifier.cpp` is large and already mixes generic IR verification with
target-specific checks. We also have a growing amount of AMDGPU verifier
logic downstream, which would all end up in the same file if we don't
address this, and that is not ideal.
This patch extracts AMDGPU-specific verification logic into a separate
`VerifierAMDGPU.cpp` file, with shared infrastructure
(`VerifierSupport`) moved into `VerifierInternal.h`.
This is purely a code organization change; it does not make the IR
verifier target-dependent. All checks remain compiled and linked into `LLVMCore`
regardless of the target triple. The extracted functions are called
unconditionally at well-defined extension points in `Verifier.cpp`, and
each function internally gates on target-specific conditions (for
example, triple checks or intrinsic IDs) as needed. The file is strictly
limited to AMDGPU-specific IR constructs (amdgcn intrinsics, AMDGPU
module flags, etc.), and does not contain generic IR rules that vary by
[11 lines not shown]
[mlir][Math][XeVM] Add Math to OCL conversion patterns (#198370)
This PR adds conversion patterns to convert supported math ops to SPIR-V
OpenCL builtin calls. These lowerings correspond to `OpExtInst` calls
into the OpenCL SPIR-V extended instruction set via mangled
`__spirv_ocl_` entry points for f32/f64 variants.
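To make the naming concrete, here is a sketch of how such a mangled entry point can be formed, assuming Itanium-style mangling of a free function taking only scalar float or double operands. The helper `spirv_ocl_name` is hypothetical, and the set of ops the patch actually supports is not listed here.

```python
# Itanium builtin type codes for the two element types mentioned above.
_TYPE_CODES = {"f32": "f", "f64": "d"}


def spirv_ocl_name(op, elem_type, num_operands=1):
    """Build a mangled __spirv_ocl_ name for scalar operands (hypothetical)."""
    base = f"__spirv_ocl_{op}"
    # _Z <length of name> <name> <one type code per parameter>
    return f"_Z{len(base)}{base}{_TYPE_CODES[elem_type] * num_operands}"
```

For example, a single-operand `sin` on f32 maps to `_Z15__spirv_ocl_sinf` under this scheme.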
[BPF] override getFrameIndexReference for frame object offsets (#204722)
### Summary
The BPF backend currently does not override `getFrameIndexReference()`.
Since BPF uses a fixed frame pointer (R10), frame object offsets are
already expressed relative to the frame pointer. The generic
`TargetFrameLowering::getFrameIndexReference()` implementation adjusts
offsets using the stack size, which is not appropriate for BPF.
This PR overrides `getFrameIndexReference()` to return the correct frame
object offsets for the BPF frame model, resulting in accurate debug
locations for stack variables. For example, the stack variable `local`
in the reproducer below previously received:
```
DW_AT_location (DW_OP_fbreg +0)
```
[67 lines not shown]
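The difference between the two offset computations can be sketched as below. The generic formula is a paraphrase of the default `TargetFrameLowering` behavior; the function names and the example values (an object at offset -8 in an 8-byte frame) are assumptions chosen for illustration and are not taken from the reproducer.

```python
def generic_frame_offset(object_offset, stack_size,
                         local_area_offset=0, offset_adjustment=0):
    """Roughly what the generic implementation computes."""
    return object_offset + stack_size - local_area_offset + offset_adjustment


def bpf_frame_offset(object_offset):
    """With a fixed frame pointer, the object offset is already FP-relative."""
    return object_offset
```

Under these assumed values the generic computation gives 0, consistent with a `DW_OP_fbreg +0` location, while the frame-pointer-relative computation gives -8.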
[CIR] Lower const arrays as a single llvm.mlir.constant (#203590)
When compiling the blender benchmark for SPEC CPU2017, we hit a case
where a very large array (more than 400k elements) is initialized with
constant values. However, because it contains trailing zeros, CIR
generates a constant record initializer (an array of elements, plus a
zero-initialized trailing array). We were lowering this to the LLVM
dialect using a global initializer function with a huge number of
insertvalue ops. The subsequent lowering to LLVM IR constant-folded these
back to a constant initializer, but it took about 40 minutes to compile.
The recent fix to avoid emitting insertvalue for the array
initialization didn't fix this case because it handled only arrays, not
records.
This change updates the lowering to the LLVM dialect so that, whenever
possible, constant array attributes are lowered to a single
llvm.mlir.constant value rather than a chain of insertvalue ops.
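The data shape involved can be illustrated with plain lists. This is only a model of the "explicit elements plus zero-initialized tail" record form and of flattening it back into one dense constant; it is not CIR or MLIR code, and both helper names are hypothetical.

```python
def split_trailing_zeros(values):
    """Split an initializer into (explicit elements, trailing zero count)."""
    n = len(values)
    while n > 0 and values[n - 1] == 0:
        n -= 1
    return values[:n], len(values) - n


def flatten_record(elements, trailing_zeros):
    """Rebuild the single dense constant from the record form."""
    return list(elements) + [0] * trailing_zeros
```

Emitting the flattened form as one constant avoids one insertion op per element, which is where the compile time went for the 400k-element array.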
[MC] emitCodeAlignment: take MCSubtargetInfo by reference. NFC (#205140)
The fragment member cannot be null, and the sibling streamer hooks
(emitInstruction, initSections, emitPrefAlign) already take it by
reference.
[RISCV] Add a Pass for adding %qc.access specifiers (#201938)
Qualcomm's ABI has Access Relocation Markers, which are used to enable
more linker relaxations. This change implements a pass which will
annotate loads and stores (accesses) which are the single user of a
`qc.e.li`-materialized address with these markers so they can be relaxed
in the linker.
This is a follow-up to #188671.
[Clang][Sema] Add -Wstringop-overread warning for source buffer overreads (#183004)
This PR adds a new `-Wstringop-overread` warning that diagnoses calls to
memory functions where the specified size exceeds the size of the source
buffer, increasing parity with GCC's `-Wstringop-overread`.
The warning is emitted when the read size is a compile-time constant
that is greater than the size of the source buffer (when known
statically).
This check applies to the following functions:
- `memcpy`, `memmove`, `mempcpy` (and `__builtin_` / `__builtin___*_chk`
variants)
- `memchr`
- `memcmp`, `bcmp`
Some of the existing code for `-Wfortify-source` was refactored into a
helper class to make its lambdas accessible to other functions.
[6 lines not shown]
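The core condition described above reduces to a comparison of two statically known sizes. The sketch below models only that predicate; the function name is hypothetical, `None` stands for "not known at compile time", and calls with two source buffers such as `memcmp` are simplified to a single source.

```python
# Function names taken from the list in the commit message.
CHECKED_FUNCTIONS = {"memcpy", "memmove", "mempcpy", "memchr", "memcmp", "bcmp"}


def overreads_source(callee, read_size, source_size):
    """Return True if a warning would be warranted under this simple model."""
    if callee not in CHECKED_FUNCTIONS:
        return False
    if read_size is None or source_size is None:
        return False  # nothing to diagnose without both sizes
    return read_size > source_size
```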
[libc++] std::abs support for _BitInt(N) and __int128 (#196532)
`std::abs` does not accept `__int128` or signed `_BitInt(N)`: the call
is ambiguous and fails to compile (#204212).
This adds an explicit `abs(__int128_t)` overload and an
`abs(_BitInt(N))` overload that deduces the width, so every signed
`_BitInt` gets a same-type result. `_BitInt` does not integer-promote,
so without this overload a narrow signed `_BitInt` would be an ambiguous
call against `abs(int/long/long long)` instead of promoting the way
`signed char` and `short` do. Standard narrow types are unchanged: they
still go through `abs(int)`.
Part of the [_BitInt(N) libc++
effort](https://discourse.llvm.org/t/bitint-n-support-in-libc-investigations-possible-improvements-looking-for-guidance/90063).
Fixes #204212
Assisted-by: Claude (Anthropic)
[3 lines not shown]
[libc++] Implement P4206R0 Revert string support in std::constant_wrapper (#203338)
Fixes https://github.com/llvm/llvm-project/issues/203336
---------
Signed-off-by: yronglin <yronglin777 at gmail.com>
Co-authored-by: A. Jiang <de34 at live.cn>
[libc++][lnt] Allow retaining build artifacts in run-benchbot (#205146)
Also, as a drive-by, introduce `--results-dir` to specify where to put
the JSON results instead of using `--build-dir` for that.
Assisted by Claude
[yaml2obj][MachO] Fix byte order of the indirect symbol table (#205044)
This is a follow-up of PR #203680 that added the test case
`linkedit-alignment.test`, which currently fails on big-endian buildbots
(see: https://lab.llvm.org/buildbot/#/builders/98/builds/3084 and
https://lab.llvm.org/buildbot/#/builders/114/builds/906).
The failure seems to be in `yaml2obj`, where `writeDynamicSymbolTable`
emits an indirect symbol table in host byte order rather than the
specified object's byte order (i.e. the `IsLittleEndian` field value).
This PR adds the missing swap and a regression test that round-trips all
endian-sensitive fields with both endianness values.
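The class of bug is easy to reproduce outside of LLVM. The sketch below encodes a list of 32-bit indirect symbol indices in the byte order requested for the object rather than the host's; the function name is hypothetical and this is not the `yaml2obj` code.

```python
import struct


def encode_indirect_symbols(indices, is_little_endian):
    """Pack 32-bit indices in the object's byte order, not the host's."""
    order = "<" if is_little_endian else ">"
    return struct.pack(f"{order}{len(indices)}I", *indices)
```

Using the native-order prefix (`=`) instead would give correct output only when the host and the object happen to share an endianness, which matches the symptom of a test that passes on little-endian hosts and fails on big-endian buildbots.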
[lldb][bazel] Add the Windows process plugin to the bazel build (#203146)
Add a cc_library for the native Windows process plugin
(ProcessWindowsCommon), gated to @platforms//os:windows, and register it
via the dedicated @LLDB_PROCESS_WINDOWS_PLUGIN@ slot in the generated
Plugins.def. This mirrors the CMake build, which special-cases
ProcessWindowsCommon into that slot so it is initialized after all other
process plugins but before ProcessGDBRemote.
With the help of Claude.
Tested internally at Meta by converting Bazel -> BUCK and confirming
that the result matches the working BUCK contents for the Windows lldb build.