[ASan] Fix interface_symbols_darwin.cpp on internal shell
This test turned out to not actually be that interesting. There was just a
subshell usage that needed replacing with readfile, and then the test just
works.
Reviewers: fmayer, DanBlackwell, ndrewh
Reviewed By: ndrewh
Pull Request: https://github.com/llvm/llvm-project/pull/168654
[flang][NFC] Strip trailing whitespace from tests (7 of N)
Only some Fortran source files in flang/test/Lower have been modified.
The other files in the directory will be cleaned up in subsequent
commits.
[MLIR][Python] make sure stubs get installed with LLVM_DISTRIBUTION_COMPONENTS (#168407)
Fixes https://github.com/llvm/llvm-project/issues/168393. Also adds
a top-level `MLIR_PYTHON_STUBGEN_ENABLED` CMake option.
[AMDGPU] Remove leftover implicit operands from SI_SPILL/SI_RESTORE. (#168546)
Remove leftover implicit operands from SI_SPILL/SI_RESTORE.
---------
Signed-off-by: John Lu <John.Lu at amd.com>
[clang][analyzer] Add defer_lock_t modelling to BlockInCriticalSectionChecker (#168338)
Fixes #166573
---------
Co-authored-by: Donát Nagy <donat.nagy at ericsson.com>
Co-authored-by: Alan Li <me at alanli.org>
[mlir][vector] Missing indices on vectorization of 1-d reduction to 1-ranked memref (#166959)
Vectorization of a 1-d reduction where the output variable is a 1-ranked
memref can generate an invalid `vector.transfer_write` with no indices
for the memref, e.g.:
"vector.transfer_write"(%vec, %buff) <{...}> : (vector<f32>,
memref<1xf32>) -> ()
This patch solves the problem by providing the expected number of
indices (i.e. matching the rank of the memref).
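The invariant being restored can be pictured with a minimal, self-contained C++ sketch. This is not the patch's code: `zeroIndicesForRank` and `hasMatchingIndexCount` are hypothetical helpers, and using index 0 for every dimension is an assumption that fits the `memref<1xf32>` case above. The actual change works on MLIR values rather than plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: one index per memref dimension, all zero.
// Assumes the reduction result is written at the origin of the buffer.
std::vector<std::int64_t> zeroIndicesForRank(std::int64_t rank) {
  assert(rank >= 0 && "memref rank cannot be negative");
  return std::vector<std::int64_t>(static_cast<std::size_t>(rank), 0);
}

// Hypothetical check for the invariant: a transfer write needs exactly as
// many indices as the destination memref has dimensions.
bool hasMatchingIndexCount(std::int64_t rank,
                           const std::vector<std::int64_t>& indices) {
  return rank >= 0 && indices.size() == static_cast<std::size_t>(rank);
}
```

With these helpers, an empty index list fails the check for a rank-1 destination, while `zeroIndicesForRank(1)` passes it.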
[lldb] Skip TestLibcxxInternalsRecognizer on asan + MacOS
Unfortunately, in this configuration, the bots are forced to use the
system libcxx, which is too old for what this test is verifying.
In the future, we should re-enable building libcxx with asan on MacOS.
[clang-tidy][docs][NFC] Enforce 80 characters limit (4/4) (#168049)
Fix documentation in `mpi`, `objc`, `openmp`, `performance`,
`portability`, `readability` and `zircon`.
This is part of the codebase cleanup described in
https://github.com/llvm/llvm-project/issues/167098
[OpenACC][CIR] Implement 'atomic capture' lowering (#168422)
The 'atomic capture' variant of the `atomic` construct accepts either a
single statement, or a compound statement containing two statements.
Each of the statements it accepts meets a form of the previous
read/write/update forms, or is a combination of two.
The IR node for atomic capture takes two separate other acc.atomics,
plus a terminator.
This patch implements all of the lowering for these.
Note: This gets the postfix-increment/decrement wrong, but fixing that
takes enough effort that I believe it belongs in a followup, so
I'll be doing so in the next patch.
[HLSL] Add initial support for output semantics (#168095)
This commit adds the first part of the output semantics. It only
considers return values (and sret), but does not handle `inout` or `out`
parameters yet.
Those missing bits will reuse the same code, but will require additional
testing & some fixups, so planning on adding them separately.
[LV] Consolidate shouldOptimizeForSize and remove unused BFI/PSI. NFC (#168697)
#158690 plans on passing BFI as a lazy lambda to avoid computing
BlockFrequencyInfo when not needed.
In preparation for that, this PR removes BFI and PSI from some
constructors that aren't used. It also consolidates the two calls to
llvm::shouldOptimizeForSize so that the result is computed once and
passed where needed.
This also renames OptForSize in LoopVectorizationLegality to clarify
that it's to prevent runtime SCEV checks, see
https://reviews.llvm.org/D68082
[flang] "Almost NFC" changes to fir::runtime::genCharCompare() (#168563)
As part of investigating a related issue, I made the following changes
to fir::runtime::genCharCompare():
- Renamed a variable
- Added an error check for the same kind of input args
- Updated another error check to use the same error found elsewhere in
this source file
[Clang] Gut the libc wrapper headers and simplify (#168438)
Summary:
These were originally intended to represent the functions that are
present on the GPU, as provided by the LLVM libc implementation.
The original plan was that LLVM libc would report which functions were
supported and then the offload interface would mark those as supported.
The problem is that these wrapper headers are very difficult to make
work given the various libc extensions everyone does, so they were
extremely fragile.
OpenMP already declares all functions used inside of a target region as
implicitly host / device, while these headers weren't even used for CUDA
/ HIP yet anyway. The only things we need to define right now are the
stdio FILE types. If we want to make this work for CUDA we'd need to
define these manually, but we're a ways off and that's way easier
because they do proper overloading.
[LoopInterchange] Don't consider loops with BTC=0 (#167113)
Do not consider loops with a zero backedge-taken count as candidates for
interchange. This seems sensible because a zero count suggests the loop
does not iterate more than once, so there is no point in interchanging. As a
bonus, this seems to avoid triggering an assert about phis and their uses from
source code, so this is a partial fix for #163954, but more work is needed to
fix that properly.
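The filter described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions: `LoopSummary` and `isInterchangeCandidate` are hypothetical names, and the count is modeled as a plain optional integer rather than the analysis results the pass actually consults.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical summary of a loop; nullopt means the count is not computable.
struct LoopSummary {
  std::optional<std::uint64_t> backedgeTakenCount;
};

// Hypothetical predicate: reject only loops whose backedge-taken count is
// known to be exactly zero; unknown counts stay eligible.
bool isInterchangeCandidate(const LoopSummary& loop) {
  return !(loop.backedgeTakenCount.has_value() &&
           *loop.backedgeTakenCount == 0);
}
```

Treating an unknown count as eligible is a design choice of this sketch, not something the commit message states.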
[DA] Replace delinearization for fixed size array (#161822)
This patch replaces the delinearization function used in DA, switching
from one that depends on type information in GEPs to one that does not.
There are three types of changes in regression tests: improvements,
degradations, and degradations in features that will be removed.
Since very few cases fall into the second category, I believe the
impact of this change should be practically insignificant.
[AArch64] Update zero latency instructions in Neoverse scheduling tables (#165690)
NeoverseZeroMove was introduced for Neoverse-V2 and was added to V3 and
V3AE.
Use NeoverseZeroMove for Neoverse-V1, N2, N3 in the same way, including
these instructions:
MOV Xd|Wd, #0|XZR|WZR
For all the above Neoverse targets, the following instructions are also
decoded as not utilizing the scheduling and execution resources of the
machine:
MOV Wd,Wn
MOV Xd,Xn
For Neoverse-N3 only, these instructions also have zero latency:
FMOV Dd, Dn
FMOV Sd, Sn
MOV Vd, Vn (vector)
MOV Zd.D, Zn.D
PTRUE
PFALSE
[MLIR][ODS] Fully qualify namespace for mlir::Attribute in ODS generated code (#168536)
ODS-generated code can be included and used outside of the `mlir`
namespace and so references to symbols in the mlir namespace
must be fully qualified.
[Clang][Codegen] Move floating point math intrinsic check to separate function [NFC] (#168198)
This PR moves the code that checks whether an LLVM intrinsic should be
generated instead of a call to floating point math functions to a
separate function. This simplifies `EmitBuiltinExpr` in `CGBuiltin.cpp`
and will allow us to reuse the logic in ClangIR.
[AArch64] match TRN starting from undef elements (#167955)
When the first element of a trn mask is undef, the `isTRNMask` function
assumes `WhichResult = 1`. That has a 50% chance of being wrong, so we
fail to match some valid trn1/trn2.
This patch introduces a more precise test to determine the correct value
of `WhichResult`, based on corresponding code in the `isZIPMask` and
`isUZPMask` functions.
- This change is based on #89578. I'd like to follow it up with a
further change along the lines of #167235.
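To illustrate why guessing `WhichResult` from an undef first element loses matches, here is a minimal standalone sketch. It is not the `isTRNMask` implementation: `matchesTRN` and `matchTRNMask` are hypothetical names, the mask is a plain `std::vector<int>` where negative entries stand for undef, and the sketch simply tries both values and checks every defined lane.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// True if every defined lane of `mask` agrees with the trn1 (which == 0) or
// trn2 (which == 1) pattern. Negative entries are undef and match anything.
bool matchesTRN(const std::vector<int>& mask, unsigned which) {
  assert(which <= 1 && "which selects trn1 (0) or trn2 (1)");
  const std::size_t n = mask.size();
  if (n == 0 || n % 2 != 0)
    return false;
  for (std::size_t i = 0; i < n; i += 2) {
    const int first = static_cast<int>(i + which);       // lane from operand 1
    const int second = static_cast<int>(i + n + which);  // lane from operand 2
    if (mask[i] >= 0 && mask[i] != first)
      return false;
    if (mask[i + 1] >= 0 && mask[i + 1] != second)
      return false;
  }
  return true;
}

// Hypothetical wrapper: returns 0 for trn1, 1 for trn2, nullopt if neither.
// An all-undef mask matches both; this sketch arbitrarily prefers trn1.
std::optional<unsigned> matchTRNMask(const std::vector<int>& mask) {
  for (unsigned which = 0; which < 2; ++which)
    if (matchesTRN(mask, which))
      return which;
  return std::nullopt;
}
```

For a four-element mask `{-1, 5, 3, 7}`, the first lane is undef but the remaining lanes only fit the trn2 pattern, so the sketch reports trn2 where a fixed guess based on the first lane could pick the wrong variant.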
[Runtimes] Default build must use its own output dirs (#168266)
Post-commit fix of #164794 reported at
https://github.com/llvm/llvm-project/pull/164794#issuecomment-3536253493
`LLVM_LIBRARY_OUTPUT_INTDIR` and `LLVM_RUNTIME_OUTPUT_INTDIR` are used by
`AddLLVM.cmake` as output directories. Unless we are in a
bootstrapping-build, they must not point to directories found by
`find_package(LLVM)`, which may be read-only directories. MLIR for
instance sets these variables to its own build output
directory, so should the runtimes.
[VPlan] Reduce duplication in VPHeaderPHIRecipe::classof. (NFCI)
Implement VPHeaderPHIRecipe::classof(const VPValue *V) in terms of the
variant taking VPRecipeBase.
Reduces some duplication, split off from
https://github.com/llvm/llvm-project/pull/141431.
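The delegation pattern can be shown with a small standalone sketch. None of these types are the VPlan classes: `Recipe`, `Value`, `HeaderPHI`, `RecipeKind` and `definingRecipe` are hypothetical stand-ins, and the assumption is that a value either has a defining recipe or has none.

```cpp
#include <cassert>

enum class RecipeKind { Add, WidenPHI, ReductionPHI };

// Hypothetical stand-in for a recipe base class.
struct Recipe {
  RecipeKind kind;
};

// Hypothetical stand-in for a value; `def` is null when no recipe defines it.
struct Value {
  const Recipe* def = nullptr;
  const Recipe* definingRecipe() const { return def; }
};

struct HeaderPHI {
  // The single place that knows which recipe kinds count as header phis.
  static bool classof(const Recipe* r) {
    assert(r && "classof expects a non-null recipe");
    return r->kind == RecipeKind::WidenPHI ||
           r->kind == RecipeKind::ReductionPHI;
  }

  // The value overload delegates instead of repeating the kind list.
  static bool classof(const Value* v) {
    const Recipe* r = v->definingRecipe();
    return r != nullptr && classof(r);
  }
};
```

Keeping the kind list in one overload means a newly added header phi kind only needs to be listed once.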
[AArch64][GlobalISel] Check unmergeSrc is a vector in matchCombineBuildUnmerge (#168692)
This aims to fix the crash in #168495. My combine rule was
missing a check that the source was in fact a vector. This then
caused the legality check to fail in this example, as the concat was
trying to concatenate a non-vector.
I have also gated the bitcast of the concat to only work on non-scalable
vectors as the mutation calls `getNumElements` which crashes when called
on a scalable vector.
Fixes #168495