[AMDGPU] Do not always add latency between LDSDMA -> S_WAIT_LDSDMA (#201942)
In loop bodies we typically see LDSDMA instructions prefetched an
iteration or more ahead. Thus, we may have an LDSDMA followed by an
S_WAIT_LDSDMA that waits on an LDSDMA from a prior iteration. Currently,
the scheduler thinks there will be a long stall between this LDSDMA and
the S_WAIT_LDSDMA.
This adds some basic checking for LDSDMA and S_WAIT_LDSDMA in the same
region to avoid adding latency in cases where we are certain the
S_WAIT_LDSDMA does not correspond with the LDSDMA.
[lldb][NFC] Remove redundant TypeSystemClang.h includes (#202439)
TypeSystemClang.h includes a lot of other unique headers, and should not
be included unless needed.
[VPlan] Fix vplan printing for VPExpressionRecipe w/conditional reduction. (#198954)
This patch contains two parts.
- Add a new vplan-printing test, duplicated from
vplan-printing-reductions.ll, that forces tail folding.
- Fix the printing of VPExpressionRecipe for conditional reductions.
Since the mask operand cannot be accessed directly through the reduction
recipe once folded, it needs to be fetched from the expression recipe's
operands.
[test][DynamicLibrary] Add visibility attribute for GCC/Clang in PipSqueak.h (#202445)
By default, CFI builds with hidden visibility, which breaks the test's
expectations.
[analyzer] Fix misleading 'initialized here' note for uninitialized declarations (#198345)
When a variable is declared without an initializer, the
BugReporterVisitor would emit 'initialized here' as a note, which is
confusing because the variable was never initialized.
Change the note to 'declared without an initial value' for declarations
that have no initializer. Global-storage variables are also taken into
consideration.
Removed the SI.Value.isUndef() case, as it is unreachable in practice
because core.uninitialized.Assign (a core checker, always enabled)
reports the assignment before this note can surface.
[LLDB] Expose enumerator for separate-debug-info in SBModule (#144119)
Today we can run `target modules dump separate-debug-info --json` to get
a json blob of all the separate debug info, but it has a few
shortcomings when developing some scripting against it. Namely, the
caller has to know the structure of the JSON per architecture that will
be returned.
I've been working on a Minidump packing utility where we enumerate
symbols and source and put them in a package so we can debug with
symbols portably, and it's been difficult to maintain multiple
architectures due to the above shortcomings. To address this for myself,
I've exposed a simple iterator for the SBModule to get all the
separate-debug-info as a list of filespecs with no need for the caller to
have context on what kind of data it is.
I also extended the SWIG interfaces to make writing my test easier and as
a nice-to-have.
[Clang][CodeGen] Avoid emitting poison immargs for __builtin_prefetch (#201623)
Fixes #201448
This PR fixes invalid clang CodeGen when lowering `__builtin_prefetch`
that is called with a constant expression that produces a poison value.
The code currently assumes that the immediate operands are
`ConstantInt`s. This is not always true as poison values may come from
UB expressions (e.g., `__builtin_prefetch(0, 2 >> 32)`) due to the use
of `EmitScalarExpr`. This would cause the subsequent downcast
`cast<ConstantInt>` in `SelectionDAGBuilder` to fail.
This PR replaces `EmitScalarExpr` with `EmitScalarOrConstFoldImmArg` to
obtain an integer constant instead of emitting a poison value for the
corresponding arguments of `llvm.prefetch`.
A regression test is also added to cover the poison immediate argument
case.
[4 lines not shown]
Reapply "[flang] Enumeration Type: (PR 1/5) Foundation types + Parser" (#202440)
FortranEvaluate referenced DerivedTypeSpec::GetScope(), defined
out-of-line in FortranSemantics, producing an undefined reference in
libFortranEvaluate.so under BUILD_SHARED_LIBS=ON. Made GetScope() inline
in symbol.h so no cross-library symbol is needed.
This is the fix missing from the original PR (#192651), which was
reverted in #202408.
---------
Co-authored-by: Kevin Wyatt <kwyatt at hpe.com>
[Clang] add support for C23 'H', 'D', and 'DD' length modifiers (#201098)
This patch adds `-Wformat` support for the C23 `H`, `D`, and `DD` length
modifiers in `printf`/`scanf` format strings. #116962
[ObjectYAML] Avoid comparison of compressed data (#202413)
The result of zlib compression isn't consistent across versions.
Downstream this test was failing due to our version giving slightly
different results. This version passes both upstream and downstream.
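As an illustration of the general idea (not necessarily what the updated test does), a check can compare the decompressed payload instead of the compressed bytes, since only the former is guaranteed to be stable. The helper name `roundtrip_matches` and the sample payload are hypothetical:

```python
import zlib

# Hypothetical payload standing in for a section's contents.
payload = b"section contents " * 64


def roundtrip_matches(data: bytes, level: int) -> bool:
    """Compress at the given level and compare only the decompressed bytes.

    The compressed byte stream may differ between zlib versions or
    settings, so it is deliberately never compared here.
    """
    compressed = zlib.compress(data, level)
    return zlib.decompress(compressed) == data
```

The check holds regardless of which compressed byte stream a given zlib build produces.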
Assisted-by: Automated tooling, human reviewed.
[flang] Inline DerivedTypeSpec::GetScope to fix shared-lib link
FortranEvaluate referenced DerivedTypeSpec::GetScope(), defined out-of-line
in FortranSemantics, producing an undefined reference in libFortranEvaluate.so
under BUILD_SHARED_LIBS=ON. Make GetScope() inline in symbol.h so no
cross-library symbol is needed.
This is the fix missing from the original PR (#192651), which was reverted
in #202408.
[ubsan] Add [undefined] section to ignorelist (#202380)
This file is passed as `-fsanitize-blacklist`, which applies to all
sanitizers.
So if UBSan is combined with ASan, these suppressions currently also
apply to ASan, which was clearly not the intention.
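For illustration, a sectioned ignorelist might look like the fragment below. The `src:` entry is a hypothetical placeholder, not an entry from the actual file; only the `[undefined]` section header comes from this change.

```text
# Entries under this header are intended to apply to UBSan checks only.
[undefined]
# Hypothetical entry for illustration.
src:*/hypothetical/path/*
```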
Update clang/unittests/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowTest.cpp
Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
[InstCombine] Fix incorrect is_zero_poison when folding select+ctlz to cttz (#202388)
foldSelectCtlzToCttz folds

    %lz = call i32 @llvm.ctlz.i32(i32 (x & -x), i1 is_zero_poison)
    %r = select (icmp eq x, 0), i32 32, i32 (xor %lz, 31)

into

    %r = call i32 @llvm.cttz.i32(i32 x, i1 is_zero_poison)

The original select's result is defined when x is zero, even if
is_zero_poison is true. Therefore the new cttz call must pass false for
the second parameter; it cannot reuse is_zero_poison.
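The reasoning can be checked with a small Python model of the i32 intrinsics. This is a sketch, not LLVM code: `POISON` is a stand-in sentinel for a poison value, and the function names `original` and `folded` are hypothetical.

```python
POISON = object()  # stand-in for an LLVM poison value (assumption of this model)
BITS = 32
MASK = (1 << BITS) - 1


def ctlz(x: int, is_zero_poison: bool):
    """Count leading zeros of a 32-bit value."""
    x &= MASK
    if x == 0:
        return POISON if is_zero_poison else BITS
    return BITS - x.bit_length()


def cttz(x: int, is_zero_poison: bool):
    """Count trailing zeros of a 32-bit value."""
    x &= MASK
    if x == 0:
        return POISON if is_zero_poison else BITS
    return (x & -x).bit_length() - 1


def original(x: int, is_zero_poison: bool):
    """select (icmp eq x, 0), 32, (xor (ctlz (x & -x)), 31)"""
    x &= MASK
    if x == 0:
        # The select picks the constant arm, so the other arm is ignored.
        return BITS
    return ctlz(x & -x, is_zero_poison) ^ 31


def folded(x: int, is_zero_poison: bool):
    """The replacement cttz call, with the flag passed through as given."""
    return cttz(x, is_zero_poison)
```

In this model, `original(0, True)` is 32 while `folded(0, True)` is poison, so reusing the flag makes the result less defined. `folded(x, False)` agrees with the original for every input.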
[InstCombine] Fix invalid IR when folding frexp(frexp(x)) with mismatched exponent types (#202419)
InstCombine folds the idempotent frexp pattern

    %inner = call { double, i64 } @llvm.frexp.f64.i64(double %x)
    %f = extractvalue { double, i64 } %inner, 0
    %outer = call { double, i32 } @llvm.frexp.f64.i32(double %f)

to `{ %f, 0 }`, because the fraction returned by the first frexp call is
already normalized, so the exponent of the second call is known to be 0.
It did this by reusing the inner frexp's result struct and overwriting
field 1 with zero.
But this example shows that reusing the inner frexp's result struct is
invalid, because that call returns { double, i64 }, whereas the second
call returns { double, i32 }.
Fix this by building the new struct instead of modifying the old one.
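The value side of this fold can be sketched with Python's `math.frexp`. The helper name `frexp_twice` is hypothetical, and the exponent-type mismatch itself is an LLVM struct-type issue that this sketch does not model:

```python
import math


def frexp_twice(x: float):
    """Apply frexp to the fraction produced by a first frexp call."""
    frac, _exp = math.frexp(x)
    # frac is already normalized, so the second call returns it
    # unchanged together with an exponent of 0.
    return math.frexp(frac)
```

For finite inputs such as `8.0`, `-3.0`, or `0.0`, the second call returns the same fraction with exponent 0, which is the `{ %f, 0 }` result the fold produces.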
[ARM] Reject invalid BF encoding when target is next instruction (#201533)
When the BF instruction targets the immediately following label, the
encoded branch offset becomes zero, causing LLVM to emit invalid machine
code.
Add validation in the fixup_bf_branch path to reject this case and emit
an error instead.
Add MC regression test to cover new validation.
Assisted by ChatGPT. Human-verified, debugged, tested, and validated by
the author.
[flang][acc] Fix separate compilation for module !$acc declare create on allocatables. (#202409)
With separate compilation, when a module defines `!$acc declare create`
on an allocatable and a separate file uses the module and allocates it,
the using translation unit (TU) did not get declare-action lowering:
`ACCDeclareActionConversion` could not resolve the post-alloc recipe
(defined only in the module .o), so no `fir.call` was emitted.
Add `acc.declare_action` for allocatable/pointer symbols under !$acc
declare.
* In the defining TU: Export module-global post-alloc/post-dealloc
recipes as linkable definitions and mark them with acc.declare_action at
creation.
* In the using TU: When declaring a USE-associated module global, emit
private external recipe stubs so the declare-action conversion pass can
insert fir.calls that link to the module definition.