[flang] Assign sizes & offsets before instantiating some component types (#178927)
Semantics is instantiating derived types too soon in some cases, leading
to incorrect sizes and component offsets in cases of valid forward
references to derived types -- these appear in the declarations of
allocatable and pointer components. The incorrect size led to a runtime
crash in the linked bug report after an insufficient allocation.
Since those components are indirect, their sizes in the derived type
instantiation can be known without having to recursively instantiate the
components' types. After laying out the derived type instantiation, the
compiler can then ensure that the components' types are instantiated.
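As a loose analogy in C++ (this is not flang code, and all type names are
hypothetical), a struct that holds only a pointer to a still-incomplete type
can be laid out before that type is defined, because the size of the pointer
does not depend on the pointee:

```cpp
#include <cassert>
#include <cstddef>

// Forward reference: Node is still incomplete at this point.
struct Node;

// Holder has only an indirect (pointer) component referring to Node, so its
// size and component offsets are known without completing Node first.
struct Holder {
  Node *next;
  int tag;
};

// The layout of Holder is already fixed here.
constexpr std::size_t holderSize = sizeof(Holder);
constexpr std::size_t tagOffset = offsetof(Holder, tag);

// Node is completed only afterwards, loosely mirroring "lay out first, then
// make sure the components' types are instantiated".
struct Node {
  double payload[4];
  Holder link;
};

static_assert(sizeof(Holder) == holderSize, "layout did not change");
```

The ordering is the point of the sketch: the sizes are computed from the
indirect components alone, and the pointee type is completed later.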
Fixes https://github.com/llvm/llvm-project/issues/178786.
[flang] Fix proc ptr default initializers in structure constructors (#178897)
The default initializers for procedure pointer components are not being
used for unspecified components in structure constructors.
Fixes https://github.com/llvm/llvm-project/issues/178813.
[flang][CUDA] Allow constant to match device actual in specific procedure (#178658)
When scanning the specific procedures of a generic interface for a match
for a set of actual arguments, accept a constant actual argument as a
match for a dummy argument with the DEVICE attribute.
[flang] Fix exposed "free" instances of ac-implied-do indices (#178516)
Tweak the implementations of IsConstantExpr, IsInitialDataTarget, and
related utilities so that "free" instances of array constructor implied
DO indices are not treated as constant expressions when the surrounding
context (if any) doesn't contain their bounds. This fixes a current bug
in which a "free" implied DO index in a structure constructor got
wrapped up in a Constant<SomeDerived>, which led to a crash in lowering.
[ELF] Fix IRELATIVE addend if the resolver address is updated by linker relaxation (#179063)
For a non-preemptible ifunc, `handleNonPreemptibleIfunc` creates a cloned
symbol (`directSym`) to compute the addend of the IRELATIVE dynamic
relocation.
This cloned symbol wasn't tracked by `initSymbolAnchors`, so its value
wasn't adjusted during RISC-V/LoongArch linker relaxation.
This caused IRELATIVE addends to point to pre-relaxation addresses.
Fix this by:
- Tracking cloned IRELATIVE symbols in `ctx.irelativeSyms`
- Adding these symbols to `relaxAux->anchors` in `initSymbolAnchors`
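The effect of a missing anchor can be illustrated with a toy model. This is a
sketch under simplifying assumptions, not lld code: `Sym` and `relax` are
hypothetical stand-ins, and relaxation is reduced to deleting a byte range and
shifting only the symbols that were registered as anchors.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a linker symbol: just an address in a section.
struct Sym {
  uint64_t value;
};

// Toy relaxation step: `removed` bytes are deleted starting at offset `at`.
// Only symbols registered in `anchors` that lie at or past the end of the
// deleted range are shifted down; anything not in the list keeps its stale
// value.
inline void relax(const std::vector<Sym *> &anchors, uint64_t at,
                  uint64_t removed) {
  for (Sym *s : anchors)
    if (s->value >= at + removed)
      s->value -= removed;
}
```

In this model, a clone of the resolver symbol that is left out of `anchors`
keeps pointing at the pre-relaxation address, while a clone that is added to
the list stays in step with the original symbol.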
[VPlan] Split up attachCheckBlock into distinct helpers for re-use (NFC).
Split up attachCheckBlock into its distinct operations:
* inserting the check block in the CFG + updating phis, and
* adding the branch VPInstruction.
Those helpers can be re-used in follow-up changes.
[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)
As a first step, move the existing partial reduction detection logic to
VPlan, trying to preserve the existing code structure & behavior as
closely as possible.
With this, partial reductions are detected and created together in a
single step.
This allows forming partial reductions and bundling them up if
profitable together in a follow-up.
PR: https://github.com/llvm/llvm-project/pull/167851
[clang-tidy] Add new check readability-trailing-comma (#173669)
clang-format has a couple of similar options:
https://clang.llvm.org/docs/ClangFormatStyleOptions.html#enumtrailingcomma
- add trailing commas for enum
https://clang.llvm.org/docs/ClangFormatStyleOptions.html#inserttrailingcommas
- add trailing commas for C++
but they are generally marked with a warning like this:
> Warning
>
> Setting this option to any value other than Leave could lead to
incorrect code formatting due to clang-format’s lack of complete
semantic information. As such, extra care should be taken to review code
changes made by this option.
clang-tidy on the other hand has all semantic information, thus can
[5 lines not shown]
[clang-tidy][NFC] Convert Lexer utils to use std::optional<Token> (#174809)
This brings a more unified API and avoids caveats like "return
``tok::unknown`` if not found.", which make it easier to forget error
checking.
---------
Co-authored-by: mitchell <zeyi2 at nekoarch.cc>
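The difference between the two API styles can be sketched with a miniature
token type. Everything below is hypothetical (`Tok`, `Kind`, and both finder
functions are illustration only and are not the clang-tidy Lexer utilities):

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical miniature token type; not clang's Token class.
enum class Kind { Unknown, Identifier, Comma };
struct Tok {
  Kind kind;
  std::string text;
};

// Sentinel style: "not found" is encoded as Kind::Unknown, and the caller
// has to remember to compare against it.
inline Tok findCommaSentinel(const std::vector<Tok> &toks) {
  for (const Tok &t : toks)
    if (t.kind == Kind::Comma)
      return t;
  return Tok{Kind::Unknown, ""};
}

// Optional style: "not found" is a distinct state that the return type
// makes visible at every call site.
inline std::optional<Tok> findComma(const std::vector<Tok> &toks) {
  for (const Tok &t : toks)
    if (t.kind == Kind::Comma)
      return t;
  return std::nullopt;
}
```

With the optional-returning version, a caller has to unwrap the result
before using it, so the missing-token case is harder to overlook.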
[AMDGPU] Introduce custom MIR formatting for s_wait_alu (#176316)
This patch implements a custom printer/parser for the immediate operand
of s_wait_alu that prints/parses the decoded counter values.
Format:
```
.<counter1>_<value1>_<counter2>_<value2>
```
Example:
`s_wait_alu .VaVdst_1_VmVsrc_1`
; Which is equivalent to this:
`s_wait_alu 8167`
Features:
- If a counter is at its maximum value it won't get printed.
- The parser will error out if a counter is greater than or equal to its
max value.
[5 lines not shown]
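The relation between the textual form and the raw immediate in the example
above can be sketched as follows. The bit positions and widths used here are
an assumption for illustration (the commit text does not state them), only
the two counters named in the example are modelled, and the helper names are
hypothetical. Under that assumed layout, starting from an immediate with
every counter at its maximum and setting both counters to 1 reproduces the
value 8167 from the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// ASSUMED field layout (name, shift, width), for illustration only.
struct Field {
  const char *name;
  unsigned shift;
  unsigned width;
};
constexpr Field VaVdst{"VaVdst", 12, 4};
constexpr Field VmVsrc{"VmVsrc", 2, 3};
constexpr Field kFields[] = {VaVdst, VmVsrc};

// Largest value a counter of this width can hold.
constexpr uint16_t maxOf(Field f) { return (1u << f.width) - 1; }

// Overwrite one counter inside the 16-bit immediate.
constexpr uint16_t setField(uint16_t imm, Field f, uint16_t v) {
  return (imm & ~(maxOf(f) << f.shift)) | ((v & maxOf(f)) << f.shift);
}

// Read one counter back out of the immediate.
constexpr uint16_t getField(uint16_t imm, Field f) {
  return (imm >> f.shift) & maxOf(f);
}

// Build ".<counter>_<value>_..." and skip counters that are at their maximum.
inline std::string format(uint16_t imm) {
  std::string out;
  for (const Field &f : kFields) {
    uint16_t v = getField(imm, f);
    if (v == maxOf(f))
      continue;
    out += out.empty() ? "." : "_";
    out += f.name;
    out += "_" + std::to_string(v);
  }
  return out;
}
```

A matching parser would run the same table in the other direction: start from
the all-maximum immediate and call `setField` once per counter named in the
text.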
[NFCI][ELF][AArch64][PAC] Teach addRelativeReloc to emit R_AARCH64_AUTH_RELATIVE
This allows R_AARCH64_AUTH_ABS64 to follow R_AARCH64_ABS64's flow rather
than being implemented on the side in the place that is normally for
symbolic relocations.
Note that this has one implementation change: the RelExpr passed to
relaDyn is now RE_AARCH64_AUTH rather than R_ABS, but the two are
handled identically by InputSectionBase::getRelocTargetVA, and it was
inconsistent with relrAuthDyn which was passed RE_AARCH64_AUTH.
Reviewers: kovdan01
Pull Request: https://github.com/llvm/llvm-project/pull/171180
[ELF][AArch64][PAC][MTE] Handle Memtag globals for R_AARCH64_AUTH_ABS64
Currently, R_AARCH64_AUTH_ABS64 against a tagged global just ignores the
tagging and so, if out of the symbol's bounds, does not write the
negated original addend for the loader to determine which granule's tag
to use for it. Handle the composition of the two.
Note that R_AARCH64_AUTH_ABS64/RELATIVE encode the signing schema in the
upper 32 bits of the value at the relocation target, and so only the
lower 32 bits are available for use as an addend, including for Memtag's
disambiguation, and so if a wildly out-of-bounds PAuth relocation
against a tagged global is used we have no choice but to error out with
the current ABI.
Reviewers: MaskRay, kovdan01, smithp35, asl
Reviewed By: smithp35
Pull Request: https://github.com/llvm/llvm-project/pull/173291
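The size constraint described above can be sketched as a packing function.
This is a simplified model, not lld code: `packAuthValue` is a hypothetical
name, and treating the addend as a signed 32-bit quantity is an assumption
made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Pack a signing schema (upper 32 bits) and an addend (lower 32 bits) into
// the 64-bit value stored at the relocation target. Returns std::nullopt
// when the addend does not fit, which is the case where the caller has no
// choice but to report an error.
inline std::optional<uint64_t> packAuthValue(uint32_t schema, int64_t addend) {
  if (addend < INT32_MIN || addend > INT32_MAX)
    return std::nullopt;
  uint64_t low = static_cast<uint32_t>(static_cast<int32_t>(addend));
  return (static_cast<uint64_t>(schema) << 32) | low;
}
```

A small negative addend fits into the lower half without disturbing the
schema bits, while a wildly out-of-range one is rejected.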
[M68k] Prevent folding of loads + stores when it would introduce new chain dependencies (#175457)
This bug seems to have been exposed by the combined m->m load/store
instructions available on M68k (these instructions are not available on
i386, which the M68k backend is based on). This meant that token factors
were inserted which could lead to distinct call sequence chains,
increasing the nesting level and preventing the matching callseq_start
from being identified during scheduling.
The patch addresses this by not allowing combined loads/stores when the
folded operation would result in a new chain dependency on a different
call sequence.
closes #146213 and #175472
[clang-tidy] Speed up `llvm-prefer-isa-or-dyn-cast-in-conditionals` (#178997)
Same approach as described in #178829.
```txt
                  ---User Time---  --System Time--  --User+System--  ---Wall Time---  --- Name ---
Status quo:       0.2031 (100.0%)  0.0469 (100.0%)  0.2500 (100.0%)  0.2635 (100.0%)  llvm-prefer-isa-or-dyn-cast-in-conditionals
With this change: 0.0312 (100.0%)                   0.0312 (100.0%)  0.0190 (100.0%)  llvm-prefer-isa-or-dyn-cast-in-conditionals
```
(I think `--enable-check-profile` doesn't report any system time after
this change because it's too small).
Thread Safety Analysis: Add more complex cleanup attribute test (#179049)
Test that cleanup attribute is handled correctly in the presence of a
unary operator before scope end.
NFC.
[clang][dataflow] Fix assignment of unknown values. (#178943)
Just because the right-hand side of the assignment doesn't have a known
value, doesn't mean the left-hand side gets to keep its old value.
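A toy model of the rule, with a hypothetical environment type that is far
simpler than the real dataflow framework: when the right-hand side has no
known value, the old binding of the left-hand side has to be dropped rather
than kept.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical toy environment: variable name -> known integer value.
// A variable that is absent from the map has an unknown value.
using Env = std::map<std::string, int>;

// Transfer function for `lhs = rhs`. An unknown right-hand side must
// invalidate the previous binding of lhs instead of leaving it in place.
inline void assign(Env &env, const std::string &lhs, std::optional<int> rhs) {
  if (rhs)
    env[lhs] = *rhs;
  else
    env.erase(lhs);
}
```

Keeping the stale entry in the `else` branch would make the analysis claim a
value for `lhs` that the program no longer guarantees.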
[clang][bytecode] Fix crash on __builtin_align_up with one-past-end pointers (#178652)
## Summary
Fix assertion failure when evaluating
`__builtin_align_up`/`__builtin_align_down`/`__builtin_is_aligned` with
one-past-end pointers like `&array[size]`.
## Root Cause
`getIndex()` calls `getOffset()` which asserts when `Offset ==
PastEndMark`. This happens for one-past-end element pointers.
## Fix
Check `isElementPastEnd()` before calling `getIndex()`. For past-end
pointers, use `getNumElems()` instead which gives the correct index
value.
## Test
Added test cases in `builtin-align-cxx.cpp` for one-past-end pointer
alignment.
Fixes #178647
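The guard described under "Fix" can be sketched with a toy pointer model. The
member names echo the commit text, but `ToyPtr` and `indexForAlignment` are
hypothetical and much simpler than the interpreter's own pointer class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy pointer into an array of numElems elements.
struct ToyPtr {
  static constexpr uint64_t PastEndMark = ~uint64_t(0);
  uint64_t offset;   // element index, or PastEndMark for one-past-end
  uint64_t numElems; // number of elements in the underlying array

  bool isElementPastEnd() const { return offset == PastEndMark; }
  uint64_t getNumElems() const { return numElems; }

  // Asserts on past-end pointers, like the failure described above.
  uint64_t getIndex() const {
    assert(offset != PastEndMark && "no valid offset for a past-end pointer");
    return offset;
  }
};

// Check for the past-end case first and use the element count as the index.
inline uint64_t indexForAlignment(const ToyPtr &p) {
  return p.isElementPastEnd() ? p.getNumElems() : p.getIndex();
}
```

For a pointer such as `&array[size]`, the helper returns `size` without ever
reaching the asserting accessor.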
InstCombine: Fold bitcast of vector with constant to scalar
Fold bitcast (select cond, val, const) ->
select cond, (bitcast val), (bitcast const)
ROCm device libs has an unfortunate amount of code that does bithacking
on the sign bit of double values by casting to <2 x i32> and operating
on the high element. This breaks value tracking optimizations on the
fp value.
The existing transform would only do this if the input to the select was
also a bitcast with a single use, and if it didn't convert between vector
and scalar.
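The equivalence behind the fold can be checked with a small C++ model. This
is an illustration of the semantics only, not InstCombine code: the bitcast
is modelled with `memcpy`, `<2 x i32>` with a two-element array, and the
constant operand (`-0.0`) is an arbitrary choice for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using V2 = std::array<uint32_t, 2>;
static_assert(sizeof(double) == sizeof(V2), "bitcast needs equal sizes");

// Model of `bitcast double to <2 x i32>`.
inline V2 bitcastToV2(double d) {
  V2 v;
  std::memcpy(v.data(), &d, sizeof d);
  return v;
}

// Before the fold: bitcast (select cond, val, const)
inline V2 before(bool cond, double val) {
  return bitcastToV2(cond ? val : -0.0);
}

// After the fold: select cond, (bitcast val), (bitcast const)
inline V2 after(bool cond, double val) {
  return cond ? bitcastToV2(val) : bitcastToV2(-0.0);
}
```

Both forms produce the same bits for either value of `cond`. In the second
form the bitcast of the constant can be evaluated at compile time, and the
select no longer sits between the fp value and its bitcast.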