Add new coro test to profcheck-xfail (#191436)
Coro hasn't yet been fixed up for profcheck, so new tests are likely to
fail.
mtune.ll exercises loop vectorizer (not fixed)
Mark llvm/test/ExecutionEngine and llvm/test/Examples tests as UNSUPPORTED for zOS (#190835)
Tests in `llvm/test/Examples` and `llvm/test/ExecutionEngine` use JIT,
which is unsupported on zOS, so the tests fail there.
---------
Co-authored-by: Bahareh Farhadi <bahareh.farhadi at ibm.com>
[PatternMatchHelpers] Improve compile time of m_Combine(And|Or) (#191413)
Squelch the stage-2 compile time regression introduced by the variadic
m_Combine(And|Or) matchers, by replacing the std::apply on a std::tuple
with a recursive inheritance.
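The recursive-inheritance technique mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `CombineAnd`, `combineAnd`, and the three leaf matchers are hypothetical names, and the real matchers operate on LLVM values rather than `int`.

```cpp
// Hypothetical leaf matchers, for illustration only.
struct IsPositive { bool match(int V) const { return V > 0; } };
struct IsEven     { bool match(int V) const { return V % 2 == 0; } };
struct IsSmall    { bool match(int V) const { return V < 100; } };

// Primary template is only declared; the specializations below peel off
// one matcher at a time instead of storing a std::tuple.
template <typename... Ms> struct CombineAnd;

// Base case: a single matcher.
template <typename M> struct CombineAnd<M> {
  M Head;
  explicit CombineAnd(const M &H) : Head(H) {}
  template <typename T> bool match(const T &V) const { return Head.match(V); }
};

// Recursive case: store the first matcher and inherit from the combiner
// that holds the remaining ones.
template <typename M, typename... Rest>
struct CombineAnd<M, Rest...> : CombineAnd<Rest...> {
  M Head;
  CombineAnd(const M &H, const Rest &...R)
      : CombineAnd<Rest...>(R...), Head(H) {}
  template <typename T> bool match(const T &V) const {
    // Short-circuits: the tail is only consulted if the head matched.
    return Head.match(V) && CombineAnd<Rest...>::match(V);
  }
};

// Convenience factory (hypothetical); requires at least one matcher.
template <typename... Ms> CombineAnd<Ms...> combineAnd(const Ms &...M) {
  return CombineAnd<Ms...>(M...);
}
```

With this shape, `combineAnd(IsPositive{}, IsEven{}, IsSmall{}).match(42)` walks the inheritance chain directly, without going through `std::apply` and a fold over tuple elements.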
[InstCombine] Generalize `(A + 1) + ~B` fold to any constant (#188271)
Example:
int foo(int a, int b) { return a - 1 + ~b; }
Before, on AArch64:
mvn w8, w1
add w8, w0, w8
sub w0, w8, #1
After (matches gcc):
sub w0, w0, w1
sub w0, w0, #2
Proof: https://alive2.llvm.org/ce/z/g_bV01
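The arithmetic behind the example can be checked with a short sketch. It assumes the generalized fold has the shape `(A + C) + ~B` -> `(A - B) + (C - 1)`, which is inferred from the title and the example above; the exact pattern matched by the patch may differ. The function names are hypothetical.

```cpp
#include <cstdint>

// Original form: (A + C) + ~B, in wrapping 32-bit arithmetic.
constexpr uint32_t beforeFold(uint32_t A, uint32_t B, uint32_t C) {
  return (A + C) + ~B;
}

// Folded form: in two's complement ~B == -B - 1, so the sum above
// equals (A - B) + (C - 1).
constexpr uint32_t afterFold(uint32_t A, uint32_t B, uint32_t C) {
  return (A - B) + (C - 1);
}

// The example from the commit: a - 1 + ~b, i.e. C == -1,
// which becomes a - b - 2.
static_assert(beforeFold(10, 3, uint32_t(-1)) == afterFold(10, 3, uint32_t(-1)),
              "fold must preserve the value");
static_assert(afterFold(10, 3, uint32_t(-1)) == 10u - 3u - 2u,
              "C == -1 gives a - b - 2");
```

Unsigned types are used so that wraparound is well defined; the Alive2 proof linked above remains the authoritative check.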
[BOLT] Fix iterator bugs (#190978)
Fix iterator misuse in five BOLT passes, caught by _GLIBCXX_DEBUG
(enabled via LLVM_ENABLE_EXPENSIVE_CHECKS=ON).
* AllocCombiner: combineAdjustments() erases instructions while
iterating in reverse via llvm::reverse(BB), invalidating the reverse
iterator. Defer erasures to after the loop using a SmallVector.
* ShrinkWrapping: processDeletions() uses
std::prev(BB.eraseInstruction(II)) which is undefined when II ==
begin(). Restructure to standard forward iteration with erase.
* DataflowAnalysis: run() unconditionally dereferences BB->rbegin(),
which crashes on empty basic blocks (possible after the ShrinkWrapping
fix). Guard with an emptiness check.
* IndirectCallPromotion: rewriteCall() dereferences the end iterator via
&(*IndCallBlock.end()). Replace with &IndCallBlock.back().
* TailDuplication: constantAndCopyPropagate() uses
std::prev(OriginalBB.eraseInstruction(Itr)) which is undefined when Itr
== begin(). Restructure to standard forward iteration with erase.
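The three repair patterns listed above can be sketched on a plain container. This is an illustration only: `Block` is a hypothetical stand-in (a `std::list<int>`) for a BOLT basic block, and `std::vector` stands in for the `SmallVector` mentioned in the commit.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for a basic block holding instructions.
using Block = std::list<int>;

// Forward iteration with erase: erase() returns the iterator to the next
// element, so std::prev() on the result (undefined at begin()) is not needed.
void eraseNegativesForward(Block &BB) {
  for (auto It = BB.begin(); It != BB.end();) {
    if (*It < 0)
      It = BB.erase(It);
    else
      ++It;
  }
}

// Deferred erasure: collect victims while iterating in reverse, then erase
// after the loop so the reverse iterator is never invalidated mid-walk.
void eraseNegativesDeferred(Block &BB) {
  std::vector<Block::iterator> ToErase;
  for (auto RIt = BB.rbegin(); RIt != BB.rend(); ++RIt)
    if (*RIt < 0)
      ToErase.push_back(std::prev(RIt.base())); // iterator to *RIt
  for (Block::iterator It : ToErase)
    BB.erase(It); // list erase leaves the other stored iterators valid
}

// Last element with an emptiness guard: never dereference end() or
// rbegin() of an empty block.
const int *lastOrNull(const Block &BB) {
  return BB.empty() ? nullptr : &BB.back();
}
```

The deferred variant relies on `std::list` keeping other iterators valid across `erase`; a container without that guarantee would need indices or a different collection strategy.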
[libc++] Speed up stable_sort.pass.cpp by reducing redundant test coverage (#187368)
We don't need to run the full exhaustive test for all floating points,
as long as we're testing the radix sort code path (which we are, since
radix sort triggers at 1024 elements).
This reduces the test execution time on my machine from 20s to 12s.
Fixes #187329
[flang][OpenMP] Use common utility functions to get affected nest depth
Remove the existing code that calculates the number of affected loops in
an OpenMP construct. A single function already does that, and it
handles all directives and clauses.
Issue: https://github.com/llvm/llvm-project/issues/191249
[AMDGPU] Change *-DAG to *-SDAG in check prefixes (#191411)
In some cases the use of *-DAG seemed to confuse the update scripts
because of the clash with FileCheck's built-in -DAG suffix.
[compiler-rt] Expose shared DSO helpers for compiler-rt runtimes (#191098)
The motivation of this PR is to refactor and expose DSO helper functions so
they can be used by all compiler-rt libraries, including the profile library,
without duplicating dlopen/dlsym (non-Windows) or LoadLibrary/GetProcAddress
(Windows) logic in each runtime.
Implement the helpers in namespace __interception in interception_linux.cpp for
non-Windows targets and interception_win.cpp for Windows, and use them from the
existing Linux interception path for RTLD_NEXT/RTLD_DEFAULT/dlvsym lookups.
This is NFC for existing libraries that already use interception's public APIs;
sanitizer and interception lit behavior is unchanged.
[flang][OpenMP] Add optional SemanticsContext parameter to loop utilities (#191231)
Some of the utilities may be used in symbol resolution, which runs before
expression analysis is done. In such situations, the typedExprs
normally stored in parser::Expr may not be available. To obtain the
numeric values of expressions, it may be necessary to use the analyzer
directly, which requires a SemanticsContext to be provided.
[lldb][Process/FreeBSDKernelCore] Fix thread ordering (#187976)
In #178306, I made an incorrect assumption that traversing `allproc` in
reverse direction would give increasing pid order, based on the fact
that new processes are added at the head of allproc. However, this
assumption is false under certain circumstances, such as pid reuse,
so threads were not sorted correctly. Instead of relying on any such
assumption, explicitly sort threads by the pid retrieved from memory.
Fixes: 5349c664fabd49f88c87e31bb3774f40bf938691 (#178306)
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
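The explicit sort described in the thread-ordering fix above could look roughly like the sketch below. `ThreadInfo` and `sortThreadsByPid` are hypothetical names, not lldb types, and the use of a stable sort (to keep threads of the same process in their discovery order) is an assumption rather than something the commit states.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical record for a thread read out of a kernel core.
struct ThreadInfo {
  int32_t Pid; // owning process id, as read from memory
  int32_t Tid; // thread id
};

// Order threads by pid explicitly instead of trusting the traversal
// order of the process list. Threads with equal pids keep their
// relative order because the sort is stable.
void sortThreadsByPid(std::vector<ThreadInfo> &Threads) {
  std::stable_sort(Threads.begin(), Threads.end(),
                   [](const ThreadInfo &L, const ThreadInfo &R) {
                     return L.Pid < R.Pid;
                   });
}
```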
[libc][docs][NFC] Rework GPU building documentation (#191381)
Reworked libc/docs/gpu/building.rst to match the style of
getting_started.rst:
* Removed mkdir and cd commands.
* Used -S and -B flags for CMake.
* Used -C flag for Ninja.
* Split commands into smaller blocks with brief explanations.
Use the same terminology as elsewhere in the LLVM libc docs and move
away from the deprecated runtime terms.
* Standard runtimes build -> Bootstrap Build
* Runtimes cross build -> Two-stage Cross-compiler Build
[llvm-dwarfdump][LineCov 2/3] Add coverage baseline comparison and line table coverage in isolation (#183790)
Patch 2 of 3 to add to llvm-dwarfdump the ability to measure DWARF
coverage of local variables in terms of source lines, as discussed in
[this
RFC](https://discourse.llvm.org/t/rfc-debug-info-coverage-tool-v2/83266).
This patch adds the ability to compare a variable’s coverage against a
baseline, e.g. an unoptimised compilation of the same code. This is
provided using the optional `--coverage-baseline` argument.
When a baseline is provided, the output also includes a per-variable
measure of the line table’s coverage (`LT`, `LTRatio`), distinct from
the variable’s coverage proper. See section 2.2 of the RFC for details
on this metric.
[libc++] Fix incorrect links and broken formatting in CSV status files (#191289)
Also, update the conformance script to look for closed issues when
searching for unlinked issues.
[Flang][Docs][NFC] Move OpenMP API extensions to separate document (#186981)
This PR follows the example of the Extensions.md document and provides the
same kind of file for OpenMP API extensions. These have previously been stored in
OpenMPSupport.md. Having a more centralized view and place for these
extensions seems useful.
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot at users.noreply.github.com>
[lldb][Process/FreeBSDKernelCore] Switch to LLDBLog::Process (#191408)
Failure to read all required fields for msgbuf isn't ObjectFile's fault
but FreeBSD-Kernel-Core plugin specific. Thus this should be logged
through `LLDBLog::Process` rather than `LLDBLog::Object`.
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
[LV][NFC] Remove llvm.ident, tbaa and other attributes from tests (#191375)
While in this area I also removed unnecessary annotations for wchar_size
and also cleaned up some more function attributes.
[OpenMP][MLIR] Modify OpenMP Dialect lowering to support attach mapping
This PR adjusts the LLVM-IR lowering to support the new attach map type that the runtime
uses to link data and pointer together. This replaces the older
OMP_MAP_PTR_AND_OBJ map type in most cases and allows slightly more complicated ref_ptr/ptee
and attach semantics.
[libc] Add generate-libc-headers custom target (#191160)
Added the generate-libc-headers custom target depending on libc-headers.
This allows troubleshooting headers without needing to install them
first.
[flang][OpenMP] Move check for threadprivate iteration variable (#191208)
This moves the test of whether the iteration variable of an affected DO
loop is marked as threadprivate. This makes the `ordCollapseLevel`
member unnecessary.
Issue: https://github.com/llvm/llvm-project/issues/191249
[Flang][OpenMP][Offload] Modify MapInfoFinalization to handle attach mapping and 6.1's ref_* and attach map keywords
This PR is one of four required to implement the attach mapping semantics in Flang, alongside the
ref_ptr/ref_ptee/ref_ptr_ptee map modifiers and the attach(always/never/auto) modifiers.
This PR contains the MapInfoFinalization changes required to support these features. It mainly deals with
applying the correct attach map type and manipulating the descriptor types maps for base address
and descriptor, so that when we specify ref_ptr/ref_ptee we emit one of the two maps and when we
specify ref_ptr_ptee we emit our usual default maps. In all cases we add the "glue" of a new
attach map, except in cases where a user has provided attach never. In cases where we are
provided an always, we apply the always map type to our attach maps.
It's important to note the runtime has a toggle for the auto map behaviour, which will flip the
attach behaviour to the newer semantics or the older semantics for backwards compatibility (outside
the purview of this PR but good to mention).
[AMDGPU] Do not emit function prologue on naked functions (#191398)
Summary:
Naked functions are intended to allow the user to write the entirety of
the function block, so we shouldn't include the `waitcnt` instructions
for them.
[SystemZTTI][CostModel] Improve SystemZ cost model for scalar Read-Modify-Write Sequence, Fix #189183 (#190350)
This patch improves the SystemZ cost model to identify Read-Modify-Write
sequences that can be folded into a single instruction (e.g., ASI, NI, OI).
If a load, a scalar arithmetic operation (ADD, SUB, AND, OR, XOR) with an
immediate, and a store all target the same memory location and have no
external uses, the cost of the arithmetic operation and the store should be 0.
This implementation does not cover the TTI::TCK_RecipThroughput CostKind, as
it causes a regression in non-power-2-subvector-extract.ll.
Fixes #189183 (see the issue for an example).
---------
Co-authored-by: anoopkg6 <anoopkg6 at github.com>
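For readers unfamiliar with the Read-Modify-Write shape that the SystemZ cost-model change above targets, the source-level pattern looks roughly like this. The function names are hypothetical, and the instruction named in each comment is only the candidate suggested by the commit's list (ASI, NI, OI); which instruction is actually selected depends on the backend and the operand sizes.

```cpp
#include <cstdint>

// Each function loads from *P, applies one operation with a small
// immediate, and stores the result back to the same location, with no
// other use of the loaded or computed value.

// Add with immediate on a 32-bit memory operand (candidate for ASI).
void addImm(int32_t *P) { *P += 1; }

// AND with immediate on a byte in memory (candidate for NI).
void andImm(uint8_t *P) { *P &= 0x0F; }

// OR with immediate on a byte in memory (candidate for OI).
void orImm(uint8_t *P) { *P |= 0x80; }
```

If the loaded value had an additional use outside the sequence, the "no external uses" condition from the commit would not hold and the zero-cost treatment would not apply.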