[OpenMP] Fix set-but-unused-var warning in omptest (#196069)
This fixes a warning in omptest about a set-but-unused variable. The
variable was intended to control whether colored logging output is created.
That logic has been moved into the `Logger` itself.
[libc][math] Fix -Wshadow warnings in FMod.h (#196346)
The `using` declaration inside the lambda duplicates the identical one
four lines above it.
No behavior change.
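A rough illustration of the kind of redundancy described above. This is a hypothetical sketch, not the code in FMod.h: `TraitsLike` and `lowBit` are invented names, and the only point is that an alias declared in the enclosing function is already visible inside the lambda.

```cpp
#include <cstdint>

// Hypothetical stand-in for a traits class; not the real libc type.
template <typename T> struct TraitsLike {
  using StorageType = std::uint64_t;
};

template <typename T> std::uint64_t lowBit(std::uint64_t X) {
  using StorageType = typename TraitsLike<T>::StorageType;
  auto Mask = [](StorageType V) {
    // Repeating `using StorageType = ...;` here would redeclare the alias
    // from the enclosing scope. The outer alias is already visible, so the
    // second declaration adds nothing.
    return static_cast<StorageType>(V & 1u);
  };
  return Mask(X);
}
```

Calling `lowBit<double>(3)` uses the outer alias for both the lambda parameter and the cast.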
AMDGPU: Reland: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3
For V_DOT2_F32_F16 and V_DOT2_F32_BF16 add their VOPDName and mark
them with usesCustomInserter which will be used to add pre-RA register
allocation hints to preferably assign dst and src2 to the same physical
register. When the hint is satisfied, canMapVOP3PToVOPD recognises the
instruction as eligible for VOPD pairing by checking if it is VOP2-like:
dst==src2, no source modifiers, no clamp, and src1 is a register.
Mark both instructions as commutable to allow a literal in src1 to be
moved to src0, since VOPD only permits a literal in src0.
The original patch had a bug: it did not check whether physical source
registers match the register class of the corresponding operand in fullVOPD
instructions. That check is now done via isValidVOPDSrc.
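The VOP2-like conditions listed above can be sketched as a simple predicate. This is a minimal illustration, not the in-tree canMapVOP3PToVOPD: the `Dot2Operands` struct and `looksLikeVOP2` are hypothetical names, and registers are modeled as plain integers.

```cpp
// Hypothetical, simplified view of a dot2 instruction after register
// allocation. None of these names come from the AMDGPU backend.
struct Dot2Operands {
  unsigned Dst;            // physical register assigned to the destination
  unsigned Src2;           // physical register assigned to the accumulator
  bool Src1IsRegister;     // false when src1 is a literal or constant
  bool HasSourceModifiers; // any neg/abs style modifier on a source
  bool HasClamp;           // clamp bit set
};

// True when the instruction has the shape described in the commit message:
// dst == src2, no source modifiers, no clamp, and src1 is a register.
constexpr bool looksLikeVOP2(const Dot2Operands &Op) {
  return Op.Dst == Op.Src2 && !Op.HasSourceModifiers && !Op.HasClamp &&
         Op.Src1IsRegister;
}
```

In this model, an unsatisfied allocation hint shows up as `Dst != Src2`, which makes the predicate fail and leaves the instruction unpaired.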
AMDGPU: Validate VOPD/VOPD3 physical source registers against operand RC
Replace isVGPR checks with isValidVOPDSrc that validates physical source
registers against the actual combined VOPD/VOPD3 instruction's operand
register classes. Now we also validate operands for VOPD instructions.
[LLVM][LICM] Skip unrelated accesses when looking for hoist/sink conflicting instructions. (#195132)
Essentially uses ModRef analysis in place of getClobberingMemoryAccess()
because the former has more accurate information about how in-loop
accesses and the hoist/sink target relate.
[AMDGPU] Increment VA_VDST twice for each VOP3PX2 instruction (#196353)
In expert scheduling mode, change the VA_VDST counts to match the
hardware implementation. The inserted waits were conservatively correct
before. This just makes them more precise in some cases.
[libc++] Introduce implicit and explicit ABI annotations (#193045)
This patch introduces `_LIBCPP_{BEGIN,END}_EXPLICIT_ABI_ANNOTATIONS` and
marks everything within an
`_LIBCPP_{BEGIN,END}_UNVERSIONED_NAMESPACE_STD` (and any derivatives
like `_LIBCPP_{BEGIN,END}_NAMESPACE_STD`) implicitly by default. This
allows us to drop `_LIBCPP_HIDE_FROM_ABI` in most of the code base,
except for functions which shouldn't be `_LIBCPP_HIDE_FROM_ABI`.
This patch doesn't remove any `_LIBCPP_HIDE_FROM_ABI`s, since we have
over 13k of them in the code base. Actually dropping them will happen
over some time to avoid too many merge conflicts.
Revert "[lldb] Real-time console pane for output in lldb tui" (#196507)
Reverts llvm/llvm-project#177160
The new test is timing out on the AArch64 Linux buildbot
(https://lab.llvm.org/buildbot/#/builders/59/builds/34166) and on my own
machine.
I suspect it has something to do with the requested terminal size. If what
we get is smaller than requested, the test could time out waiting for
expected program output.
[llvm][OpenMP][SPIRV] Fix assertion for GPU reductions (#194879)
Currently, compiling a `target reduction` results in the following assertion
failure for the spirv64-intel target:
> Assertion `New->getType() == getType() && "replaceUses of value with
new value of different type!"' failed.
This patch fixes it by adding an address space cast where necessary to make
the types of the expressions match.
Assisted-by: claude-sonnet-4-5
clang: Add BoundArch argument to addClangTargetOptions
addClangTargetOptions already has an OffloadKind argument,
but it makes little sense for a function to know the
OffloadKind without the associated BoundArch.
The current process is convoluted. TranslateArgs
synthesizes a -mcpu argument from BoundArch, and later
addClangTargetOptions re-parses that -mcpu argument each
time it wants the architecture. Add this argument so this
can be cleaned up in a future change.
Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
[AArch64] Use EXT for byte shuffles with leading zeros (#193466)
Fixes: https://github.com/llvm/llvm-project/issues/191735
Teach AArch64 LowerVECTOR_SHUFFLE to recognize byte shuffles that are a
zero-fill right shift and lower them to EXT with a zero vector. Also adds a
regression test.
Change-Id: Iffe97ff7e35cfaff790f537b4f1f5ba9aded4f92
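To see why EXT with a zero vector can produce leading zero bytes, here is a small software model of the 128-bit EXT byte selection. It is a sketch under assumptions: `ext` and `shiftInZeros` are hypothetical helper names, "leading" is taken to mean the low-numbered lanes, and the shuffle masks the patch actually matches may differ.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Software model of a 16-byte EXT: result lane I is lane (I + Imm) of the
// 32-byte concatenation in which Lo supplies lanes 0..15 and Hi supplies
// lanes 16..31.
Bytes16 ext(const Bytes16 &Lo, const Bytes16 &Hi, unsigned Imm) {
  assert(Imm < 16 && "EXT immediate must be in [0, 15]");
  Bytes16 R{};
  for (unsigned I = 0; I < 16; ++I) {
    unsigned Idx = I + Imm;
    R[I] = Idx < 16 ? Lo[Idx] : Hi[Idx - 16];
  }
  return R;
}

// Hypothetical helper: K zero lanes followed by the first 16 - K lanes of V,
// expressed as a single EXT whose first operand is an all-zero vector.
Bytes16 shiftInZeros(const Bytes16 &V, unsigned K) {
  assert(K >= 1 && K <= 15 && "shift amount must leave some source lanes");
  Bytes16 Zero{};
  return ext(Zero, V, 16 - K);
}
```

With `K = 3`, lanes 0 to 2 of the result come from the zero vector and lane 3 onward holds `V[0]`, `V[1]`, and so on.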
[lldb][test] Move DAP processes to own group to avoid random SIGHUPs (#195816)
On macOS, LLDB's test suite randomly receives SIGHUP signals that stop
the test suite early. The source of these SIGHUPs seems to be a bug in
the kernel (most likely job control).
The exact steps for reproducing this are not clear, but I have a set of
three tests of which two need to run concurrently for this to trigger:
* TestDAP_runInTerminal
* TestDAP_launch_io_integratedTerminal
* TestDAP_launch_stdio_redirection_and_console
I was also running UBSan on this build which may or may not be necessary
to make this random failure more persistent.
When these tests run, macOS job control will send SIGHUP to the process
group of the spawned subprocesses in that test. As LIT is in the same
process group, it also receives the SIGHUP and shuts down.
[8 lines not shown]
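The underlying POSIX mechanism can be sketched as follows. This is not the change made to the test suite; `runsInOwnGroup` is a hypothetical name, and the sketch only shows that a child calling `setpgid(0, 0)` leaves its parent's process group, so a signal delivered to the child's group no longer reaches the parent.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child, moves it into a fresh process group, and reports whether
// the child ended up as the leader of a group different from the parent's.
bool runsInOwnGroup() {
  const pid_t ParentGroup = getpgrp();
  const pid_t Child = fork();
  if (Child < 0)
    return false; // fork failed
  if (Child == 0) {
    // In the child: create a new process group whose ID equals our PID.
    const bool Ok = setpgid(0, 0) == 0 && getpgrp() == getpid() &&
                    getpgrp() != ParentGroup;
    _exit(Ok ? 0 : 1);
  }
  // In the parent: wait for the child and read its verdict.
  int Status = 0;
  if (waitpid(Child, &Status, 0) != Child)
    return false;
  return WIFEXITED(Status) && WEXITSTATUS(Status) == 0;
}
```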
[llvm][tools] Extend llvm-objdump to support nested OffloadBinaries (#185425)
Extends llvm-objdump to print the information of images contained in
nested OffloadBinaries. For example, for a binary compiled with #185413
it shows
```
$llvm-objdump --offloading ./a.out
./a.out: file format elf64-x86-64
OFFLOADING IMAGE [0]:
kind elf
arch
triple spirv64-intel
producer openmp
image size 43104 bytes
[Nested OffloadBinary format detected]
Number of inner images: 1
kind spir-v
[13 lines not shown]
[X86][GlobalISel] Support globals in pic mode (#170038)
Introduce G_WRAPPER_RIP, which is the same node as in the DAG. It is
required to make legalization possible when a load from a stub is needed to
obtain a pointer to a global value. It also avoids manual selection in
X86InstructionSelector.
Also added a missing check on X86SelectAddress failure.
[DAG] canCreateUndefOrPoison - poison generating flags / out of range shift amounts only generate poison (#196489)
Matches ValueTracking / GISel implementations - although testing options are limited until DAG has actual uses of UndefPoisonKind::UndefOnly
[lldb-dap] Fix build when using precompiled header and Xcode generator. (#196366)
When building with precompiled headers and Xcode as the generator, it adds
`obj.lldbDAP.dir/${BUILD_TYPE}/cmake_pch.xxx` but does not generate the
file, causing the build to fail.
This might have to do with `add_llvm_library` adding a source file
`Dummy.c` to any object it creates when Xcode is the generator, and with
the `lldbDAP` object not declaring its LINK_LIBS and LINK_COMPONENTS.
[libclc] Use spirv[64]-mesa-mesa3d triple in README.md (#196483)
Now that #194607 has landed, we use a normalized triple in the README for
the SPIRV targets. Previously `spirv-mesa3d-` and `spirv64-mesa3d-` were
used, and those are normalized to `spirv-unknown-mesa3d` and
`spirv64-unknown-mesa3d` by the following command in
`runtimes/CMakeLists.txt`:
```console
$ clang --target=spirv-mesa3d- -print-target-triple
spirv-unknown-mesa3d
```
This is because in `llvm/lib/TargetParser/Triple.cpp` the term `mesa3d`
is recognized as an OS and placed in third position. The install path
for `libclc.spv` therefore ends up as `spirv-unknown-mesa3d/libclc.spv`.
With this change we suggest using triples that "survive" the
normalization:
[7 lines not shown]
[Dexter][NFC] Add split step-data collection methods for DAP (#196350)
This patch adds some extra state collection methods to DebuggerBase and
implements them for DAP only. These methods are used to fetch a
stacktrace without variable information, and to populate variable
information into a StepIR containing only a stacktrace. These methods
are currently unused, making this patch NFC, but this is a necessary
precursor to the new script model, where we examine the stacktrace to
determine what variable info we will collect.
As part of the stacktrace-collection function, we also fetch the
instruction address for each stack frame, if it is made available by the
debugger; to enable this, this patch adds a new value with default
`None` to `FrameIR`.