[flang-rt][device] Use snprintf result for length (#172239)
The buffer might not be null terminated on the device and result in 1
byte invalid read when trying to get the length.
[BOLT] Introduce getOutputBinaryFunctions(). NFCI (#172174)
To gain better control over the functions that go into the output file
and their order, introduce `BinaryContext::getOutputBinaryFunctions()`.
The new API returns a modifiable list of functions in output order.
This list is filled by a new `PopulateOutputFunctions` pass and includes
emittable functions from the input file, plus functions added by BOLT
(injected functions).
The new functionality allows to freely intermix input functions with
injected ones in the output, which will be used in new PRs.
The new function replaces `BinaryContext::getSortedFunctions()`, but
unlike its predecessor, it includes injected functions in the returned
list.
[orc-rt] Ensure EH/RTTI=On overrides LLVM opts, applies to unit tests. (#172155)
When -DORC_RT_ENABLE_EXCEPTIONS=On and -DORC_RT_ENABLE_RTTI=On are
passed we need to ensure that the resulting compiler flags (e.g.
-fexceptions, -frtti for clang/GCC) are appended so that we override any
inherited options (e.g. -fno-exceptions, -fno-rtti) from LLVM.
Updates unit tests to ensure that these compiler options are applied to
them too.
[clang-tidy] Suggest `std::views::reverse` instead of `std::ranges::reverse_view` in `modernize-use-ranges` (#172199)
`std::views::FOO` should in almost all cases be preferred over
`std::ranges::FOO_view`. For a detailed explanation of why that is, see
https://brevzin.github.io/c++/2023/03/14/prefer-views-meow/. The TLDR is
that it's shorter to spell (which is obvious) and can in certain cases
be more efficient (which is less obvious; see the article if curious).
Partially revert "[NFCI][lldb][test][asm] Enable AT&T syntax explicitly (#166770)" (#172233)
Flag changes reverted as those require the X86 target to be enabled.
Don't have time to test fixes as I need to go to sleep so will revert for now.
Reverts: 423919d31f4b55f22b09cd5066534f7c91e71d4b
[MLIR][Transform][Python] transform.foreach wrapper and .owner OpViews (#172228)
Friendlier wrapper for transform.foreach.
To facilitate that friendliness, makes it so that OpResult.owner returns
the relevant OpView instead of Operation. For good measure, also changes
Value.owner to return OpView instead of Operation, thereby ensuring
consistency. That is, makes it is so that all op-returning .owner
accessors return OpView (and thereby give access to all goodies
available on registered OpViews.)
Reland of #171544 due to fixup for integration test.
[NFCI][lldb][test][asm] Enable AT&T syntax explicitly (#166770)
Implementation files using the Intel syntax typically explicitly specify it.
Do the same for the few files using AT&T syntax.
This enables building LLVM with `-mllvm -x86-asm-syntax=intel` in one's Clang config files
(i.e. a global preference for Intel syntax).
[VPlan] Pass backedge value directly to FOR and reduction phis (NFC).
Pass backedge values directly to VPFirstOrderRecurrencePHIRecipe and
VPReductionPHIRecipe directly, as they must be provided and availbale.
Split off from https://github.com/llvm/llvm-project/pull/168291.
[MLIR][Transform][Python] transform.foreach wrapper and .owner OpViews (#171544)
Friendlier wrapper for `transform.foreach`.
To facilitate that friendliness, makes it so that `OpResult.owner`
returns the relevant `OpView` instead of `Operation`. For good measure,
also changes `Value.owner` to return `OpView` instead of `Operation`,
thereby ensuring consistency. That is, makes it is so that all
op-returning `.owner` accessors return `OpView` (and thereby give access
to all goodies available on registered `OpView`s.)
[VPlan] Simplify live-ins early using SCEV. (#155304)
Use SCEV to simplify all live-ins during VPlan0 construction. This
enables us to remove special SCEV queries when constructing
VPWidenRecipes and improves results in some cases.
This leads to simplifications in a number of cases in real-world
applications (~250 files changed across LLVM, SPEC, ffmpeg)
PR: https://github.com/llvm/llvm-project/pull/155304
AMDGPU: Stop requiring afn for f32 rsq formation
We were checking for afn or !fpmath attached to the sqrt. We
are not trying to replace a correctly rounded rsqrt; we're replacing
the two correctly rounded operations with the contracted operation.
It's net a better precision, so contract on both instructions should
be sufficient. Both the contracted and uncontracted sequences pass
the OpenCL conformance test, with a lower maximum error contracted.
AMDGPU: Introduce f64 rsq pattern in AMDGPUCodeGenPrepare
Handle this here instead of DAGCombine, mostly because the f32
case is handled here due to the dependency on !fpmath. Also we can
take advantage of computeKnownFPClass.
[tsan] Export __cxa_guard_ interceptors from TSan runtime. (#171921)
These functions from C++ ABI are defined in
compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp and are supposed to
replace implementations from libstdc++/libc++abi.
We need to export them similar to why we need to export other
interceptors and TSan runtime functions - e.g. if a dlopen-ed shared
library depends on `__cxa_guard_acquire`, it needs to pick up the
exported definition from the TSan runtime that was linked into the main
executable calling the dlopen()
However, because the `__cxa_guard_` functions don't use traditional
interceptor machinery, they are omitted from the auto-generated
`libclang_rt.tsan.a.syms` files. Fix this by adding them to
tsan.syms.extra file explicitly.
Co-authored-by: Vitaly Buka <vitalybuka at google.com>
[MLIR][MemRef] Emit error on atomic generic result op defined outside the region (#172190)
While figuring out how to perform an atomic exchange on a memref, I
tried the generic atomic rmw with the yielded value captured from the
enclosing scope (instead of a plain atomic_rmw with
`arith::AtomicRMWKind::assign`). Instead of segfaulting, this PR changes
the pass to produce an error when the result is not found in the
region's IR map.
It might be more useful to give a suggestion to the user, but giving an
error message instead of a crash is at least an imrovement, I think.
See: #172184