[BOLT] Lookup top-level inline tree node in YAMLProfileWriter (#165491)
Top-level (binary) functions don't have a unique GUID mapping, for
several reasons, notably coroutine fragments sharing the same parent
source function GUID.
Replace the top-level inline tree node GUID lookup with a probe lookup
coupled with a walk up the inline tree.
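The lookup described above can be illustrated with a toy model. This is
only a sketch in Python, not BOLT's C++ code: `InlineTreeNode`,
`probe_to_node`, and `top_level_node` are hypothetical names, and the
probe keys are arbitrary integers. It shows why a GUID alone is
ambiguous when two fragments share one, while a probe plus a walk to the
root is not.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class InlineTreeNode:
    """Hypothetical inline tree node: a function GUID plus a parent link."""
    guid: int
    parent: Optional["InlineTreeNode"] = None


def top_level_node(
    probe_to_node: Dict[int, InlineTreeNode], probe: int
) -> Optional[InlineTreeNode]:
    """Find the node owning the probe, then walk up to the tree root."""
    node = probe_to_node.get(probe)
    if node is None:
        return None
    while node.parent is not None:
        node = node.parent
    return node


# Two coroutine fragments share the parent source function GUID (0xABC),
# so a GUID-keyed lookup could not tell their top-level nodes apart.
frag_a = InlineTreeNode(guid=0xABC)
frag_b = InlineTreeNode(guid=0xABC)
inlined_in_b = InlineTreeNode(guid=0xDEF, parent=frag_b)

probe_to_node = {1: frag_a, 2: inlined_in_b}
```

Here `top_level_node(probe_to_node, 2)` resolves to `frag_b` even though
`frag_a` carries the same GUID.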
Test Plan: added test-coro-probes.yaml
Reapply "[VPlan] Use predicate from VPValue VPWidenSelectR::computeCost." (#173170)
This reverts commit f42af14073228 and re-applies
https://github.com/llvm/llvm-project/pull/172915.
It adds a check for whether the condition is a live-in,
which makes sure we preserve the original behavior in that case.
This should fix the crash that caused the revert.
Original commit message:
Look up the predicate from the VPValue condition instead of the
underlying IR.
This improves cost modeling in some cases, e.g. when we can fold
operations like negations in compares. On AArch64, this leads to
additional vectorization in a few cases in practice.
[2 lines not shown]
[AMDGPU] Add test for v_fmamk_f16/v_fmaak_f16 in real-true16. NFC
This exposes a bug in real-true16 mode: we do not have an allocatable
16-bit VGPR class for these instructions, and they do not have VOP3
forms that would let the allocatable VGPR_16 be used. To use these
instructions, 'VGPR_16_Lo128' must be allocatable.
[MemProf] Propagate size info used for hint reporting to duplicates (#172535)
When we duplicate contexts (due to clones e.g. matching different
inlined instances), we were propagating the allocation type but not the
ContextSizeInfo, which is used for -memprof-report-hinted-sizes.
This meant that we never reported hinting for any of the duplicated
contexts, which can result in conservative results as in some cases only
the duplicated contexts are able to be cloned and hinted. Note that this
change could result in overly optimistic reporting in some cases.
[llvm][utils] Make git-llvm-push not convert remote URLs (#173303)
Previously, git-llvm-push would convert all remote URLs to HTTPS,
including SSH remotes, for reasons not explained in the original PR.
This would cause issues in setups where the HTTPS remote is read-only.
This patch makes git-llvm-push stop converting SSH remotes to HTTPS
remotes, preserving what the user originally intended.
Fixes #172828.
[CIR] Canonicalization: leverage MLIR traits and folding
Replace custom rewrite patterns with dedicated fold implementations for ScopeOp, or rely on DCE in cases of effect-less SwitchOp.
[libclang/python] Add release notes for `reparse` throwing (#173301)
``TranslationUnit.reparse`` will now throw an exception when an error
occurs. Previously, errors were silently ignored.
[lldb-dap] Avoid unnecessary allocations when creating variables. (#172661)
This reduces unnecessary string allocations and copies when handling the
variables request.
[libc] Split out src/__support/alloc-checker.h (#173104)
This moves the libc-internal AllocChecker API out of
src/__support/CPP/new.h and updates CPP/README.md to state the
intent to keep src/__support/CPP and the LIBC_NAMESPACE::cpp
namespace a "pure" subset of standard C++ API polyfills.
[VectorCombine] foldShuffleOfIntrinsics - support multiple uses of shuffled ops (#173183)
Fixes #173037
Remove the `m_OneUse` restriction in `foldShuffleOfIntrinsics` and
update the cost model to account for additional uses of the original intrinsics.
[lldb-dap][test] Add Python 3.8 compatibility for test suite (#173264)
Python 3.8 does not support subscripting built-in types (dict[int],
list[str], etc.) in annotations without importing annotations from
__future__.
This change adds `annotations` imports and handles missing API
functions.
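As a rough illustration of the annotations part of this change (not
taken from the lldb-dap test suite; `count_names` is a made-up helper),
the following sketch shows the pattern. With the `__future__` import,
annotations are stored as strings and never evaluated, so `list[str]`
in a signature no longer raises `TypeError` on Python 3.8.

```python
# The __future__ import must come before any other statement in the module.
from __future__ import annotations


def count_names(names: list[str]) -> dict[str, int]:
    """Count occurrences of each name; the annotations stay unevaluated."""
    counts: dict[str, int] = {}
    for name in names:
        counts[name] = counts.get(name, 0) + 1
    return counts
```

The import only covers annotations. An expression evaluated at runtime,
such as a type alias `Names = list[str]`, would still fail on 3.8 and
needs `typing.List` instead.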
[DirectX] Resources and simple GEP traversal in DXILMemIntrinsics (#173054)
Walk through GEPs and recognize resource target extension types when
trying to infer the underlying types of memory intrinsics.