[ARM] Fix inlining issue in ARM (#169337)
There is an issue on ARM where a function won't be inlined due to
mismatching target features between caller and callee.
The caller has `HasV8Ops` and `FeatureDotProd` and the callee does not,
but AFAIK this should not be a problem.
https://godbolt.org/z/f19h3zT66 is an example showing how the call is
not inlined on armv7.
The expected asm output would be something like:
```asm
.fnstart
vsdot.s8 q0, q1, d4[0]
bx lr
.Lfunc_end0:
```
Thanks to @Amichaxx we managed to narrow it down and now can resolve
this problem by adding `ARM::FeatureDotProd, ARM::HasV8Ops` to
InlineFeaturesAllowed in llvm/lib/Target/ARM/ARMTargetTransformInfo.h,
[7 lines not shown]
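A minimal sketch of why the allow list matters, assuming the compatibility check requires features outside the allow list to match exactly and treats allow-listed features as subset-compatible (callee features must be a subset of the caller's). The string feature names, the `are_inline_compatible` helper, and the presence of `FeatureNEON` on the list are illustrative only; the real code works on feature bitsets.

```python
# Simplified model of a subset-style inline compatibility check.
# Feature names are plain strings here (an assumption for illustration);
# the real implementation operates on subtarget feature bitsets.

def are_inline_compatible(caller, callee, allowed):
    """Return True if `callee` may be inlined into `caller`.

    Features outside `allowed` must match exactly. For features inside
    `allowed`, the callee's set must be a subset of the caller's.
    """
    exact_match = (caller - allowed) == (callee - allowed)
    subset_match = (callee & allowed) <= (caller & allowed)
    return exact_match and subset_match


caller = frozenset({"HasV8Ops", "FeatureDotProd", "FeatureNEON"})
callee = frozenset({"FeatureNEON"})

# Before: the two features are not allow-listed, so they must match exactly.
allowed_before = frozenset({"FeatureNEON"})
# After: both are allow-listed, so the caller may have them when the callee does not.
allowed_after = allowed_before | {"HasV8Ops", "FeatureDotProd"}

print(are_inline_compatible(caller, callee, allowed_before))
print(are_inline_compatible(caller, callee, allowed_after))
```

Under this model the first call rejects inlining and the second accepts it, which matches the behaviour change described above.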
[AMDGPU] Fix hoist location for s_set_vgpr_msb past SALU program state instructions (#176206)
If we exit the loop at a non SALU state instruction we have to return
the next instruction because we will insert before the instruction we
return. The check before the loop already did this for cases where we
start on a non SALU state instruction by returning `I`. This is now done
afterwards.
[SCEV] Add initial support for ptrtoaddr. (#158032)
Add initial support for PtrToAddr to SCEV, including a new
SCEVPtrToAddrExpr and SCEV expansion support for it.
PR: https://github.com/llvm/llvm-project/pull/158032
[lld][COFF][NFC] Fix warnings on 32-bit asserts builds (#176178)
Fixes #130934 (Wsign-compare warnings reported for Wasm Emscripten
builds). I ran into this when building for 32-bit RISC-V.
ipv6: account for jumbo payload option
If a jumbo payload option is added, the length of the mbuf chain is
increased by 8 but the actual hop-by-hop extension header with the
jumbo payload option is only inserted in the packet if there are
other options. Therefore, adjust optlen to reflect the actual size
of IPv6 extension headers including the hop-by-hop extension header
containing the jumbo payload option.
Reported by: syzbot+73fe316271df473230eb at syzkaller.appspotmail.com
Reviewed by: markj, Timo Voelker
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D54394
[llvm-dwarfdump][LineCov 1/3] Add variable coverage metrics (#169646)
Patch 1 of 3 to add to llvm-dwarfdump the ability to measure DWARF
coverage of local variables in terms of source lines, as discussed in
this RFC:
https://discourse.llvm.org/t/rfc-debug-info-coverage-tool-v2/83266
This patch adds the basic variable coverage implementation. By default,
inlined instances are shown separately (displaying the full inlining
chain). Alternatively, a combined view that averages across all inlined
instances can be returned using `--combine-instances`.
In this patch, we simply print a count of source lines over which each
variable is covered. Later patches in the series will add the comparison
against a baseline.
[CGP] Refactor tail call eligibility checks in `dupRetToEnableTailCallOpts` (NFC)
Tail call eligibility and profitability checks have been combined
into a single helper to reduce code duplication.
[lldb] Change bitfield range character from '-' to ':' in DIL (#173410)
Change the bitfield extraction range character from '-' to a more common
':'. Add a deprecation error when '-' is used.
[VPlan] Add matchers for reduction result VPInstructions (NFC).
Add dedicated matchers for reduction result VPInstructions, to be
re-used in follow-up patches, including
https://github.com/llvm/llvm-project/pull/167851.
SystemZ: Remove override of insertSSPDeclarations
Remove __stack_chk_guard from the SystemZ system library.
Previously the availability was assumed to match
__stack_chk_fail, but these appear to be different for SystemZ.
I'm assuming this isn't available for systemz based on the
existing behavior.
Once the runtime library does not add a SYSTEM_CHECK_GUARD
implementation the default will be a no-op if the symbol
isn't added to the system.
Also extend the test to make sure the declaration is not emitted.
[CIR] Add __sync_<OP>_and_fetch builtins (#168347)
Adds support for several `__sync_<OP>_and_fetch` builtins, and several helper methods for emitting atomic fetch + arithmetic operations.
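As a rough illustration of the semantics involved: the `__sync_<OP>_and_fetch` builtins return the new value, whereas an atomic fetch-op returns the old one. One common way to get the former is to do the atomic fetch-op and then reapply the operation to the returned old value. Whether the helpers in this patch are structured exactly this way is an assumption; `AtomicInt` and its methods below are hypothetical names, and a lock stands in for hardware atomics.

```python
import operator
import threading


class AtomicInt:
    """Toy 32-bit atomic integer (hypothetical helper for illustration)."""

    MASK = 0xFFFFFFFF

    def __init__(self, value=0):
        self._value = value & self.MASK
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def fetch_op(self, op, operand):
        """Atomically store op(old, operand) and return the OLD value."""
        with self._lock:
            old = self._value
            self._value = op(old, operand) & self.MASK
            return old

    def op_and_fetch(self, op, operand):
        """Return the NEW value by reapplying op to the fetched old value."""
        old = self.fetch_op(op, operand)
        return op(old, operand) & self.MASK


def nand(a, b):
    # nand is the one operation with no direct operator.* equivalent.
    return ~(a & b)


x = AtomicInt(5)
new_value = x.op_and_fetch(operator.add, 3)   # new value, like add_and_fetch
old_value = x.fetch_op(operator.add, 1)       # old value, like fetch_and_add
```

The recomputation happens outside the atomic region, which is safe because it depends only on the old value that the atomic step returned.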
---------
Co-authored-by: Andy Kaylor <akaylor at nvidia.com>
[ci][ids] Fix pattern prefix check (#176334)
The prefix check did not include the trailing /, so llvm-c headers were
treated like llvm headers, resulting in incorrect suggestions to use
LLVM_ABI where LLVM_C_ABI was already present.
See
https://github.com/llvm/llvm-project/pull/176309#issuecomment-3757987748
for an example.
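The failure mode can be reproduced in a few lines. This is only a sketch: the `expected_macro` helper and the relative header paths are hypothetical, and the actual CI script may be organised differently.

```python
def expected_macro(path, llvm_prefix):
    """Pick the export macro for a header, testing the llvm prefix first."""
    if path.startswith(llvm_prefix):
        return "LLVM_ABI"
    if path.startswith("llvm-c/"):
        return "LLVM_C_ABI"
    return None


# Without the trailing slash, "llvm" is also a prefix of "llvm-c/...".
buggy = expected_macro("llvm-c/Core.h", "llvm")
# With the trailing slash, only headers under llvm/ match.
fixed = expected_macro("llvm-c/Core.h", "llvm/")

print(buggy, fixed)
```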
dwc: improve IPv4 transmit checksum offloading
This patch provides two improvements for TCP/IPv4 and UDP/IPv4
transmit checksum offloading:
(1) Use *CIC_SEG instead of *CIC_FULL, since FreeBSD always provides
a pseudo header checksum.
(2) Don't make transmit IPv4 header checksum offloading a prerequisite
for TCP/IPv4 or UDP/IPv4 transmit checksum offloading.
This is the root cause of PR 291696, since right now the epair
interface does not support transmit IPv4 header checksum offloading,
but does support TCP/IPv4 and UDP/IPv4 transmit checksum offloading.
PR: 291696
Reviewed by: Timo Voelker
Tested by: Marek Benc
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D54395
[DA] Use ScalarEvolution::isKnownPredicate (#170919)
DA uses `DependenceInfo::isKnownPredicate` instead of
`ScalarEvolution::isKnownPredicate` in several places. The former is
intended to be a "wrapper" for the latter. Specifically, it performs the
following rewrites:
- Replace `zext(X) cmp zext(Y)` with `X cmp Y`.
- Replace `X >=s Y` with `X - Y >=s 0`
- Replace `X <=s Y` with `X - Y <=s 0`
- Replace `X >s Y` with `X - Y >s 0`
- Replace `X <s Y` with `X - Y <s 0`
The first one can return an incorrect result when the most significant
bits of `X` and `Y` differ. Everything other than the first one
can be incorrect when `X - Y` overflows. Actually, when a `SCEVUnknown`
is involved (e.g., `%n <s %n + 1` will be `0 <s 1`), this function often
returns a result that ignores the possibility of overflow.
[4 lines not shown]
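Both failure modes are easy to reproduce with small integers. The following sketch models 8-bit two's-complement arithmetic; the helper names are made up for illustration and do not correspond to anything in DA or SCEV.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement value."""
    v &= MASK
    return v - (1 << BITS) if v >= (1 << (BITS - 1)) else v


def slt(x, y):
    """Reference semantics of `X <s Y` on BITS-wide values."""
    return to_signed(x) < to_signed(y)


def slt_via_difference(x, y):
    """The rewrite `X - Y <s 0`, with the subtraction wrapping at BITS bits."""
    return to_signed(x - y) < 0


def zext_slt(x, y):
    """`zext(X) <s zext(Y)`: both operands are non-negative in the wider type."""
    return (x & MASK) < (y & MASK)


# Overflow case: 127 - (-1) wraps to -128, so the rewrite flips the answer.
print(slt(127, -1), slt_via_difference(127, -1))

# zext case: the most significant bits of 0x80 and 0x01 differ.
print(zext_slt(0x80, 0x01), slt(0x80, 0x01))

# `%n <s %n + 1` is not always true: for n = 127 the increment wraps.
print(slt(127, 127 + 1))
```

In each pair the two results disagree, and the last line shows that `%n <s %n + 1` does not hold for every `%n` even though simplifying it to `0 <s 1` would say it does.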