[Metadata][profcheck] Handle identical MDNodes in getMergedProfMetadata
This fixes a bug where !prof metadata was dropped from SelectInsts when GVN simplified/merged them.
Guarded by -profcheck-disable-metadata-fixes. Exposed by the tests in
Transforms/SampleProfile.
[libc++][istream] Removed `[[nodiscard]]` from `peek()` (#175591)
Calling `peek()` after constructing a stream is something one can use to
make the stream ignore empty inputs:
```
#include <cstdio>
#include <sstream>

int main() {
  std::istringstream s;
  s.peek(); // Sets eofbit on an empty stream, so the loop below is skipped.
  while (s && !s.eof()) {
    char c;
    s >> c;
    std::printf("not eof; read '%c' (%d)\n", c, c);
  }
}
```
[2 lines not shown]
[llvm][utils] Make git-llvm-push set the skip-precommit-approval label (#174833)
The skip-precommit-approval label is intended for simple PRs that don't
require approval. To reduce the volume of notifications, label all PRs
created using the git-llvm-push script with the skip-precommit-approval
label.
Fixes #174825
[CodeGen][InlineSpiller] Add SubReg argument to loadRegFromStackSlot for subreg-reload (#175581)
This preparatory patch introduces an additional argument to the target hook
loadRegFromStackSlot. This is essential for targets to handle subregister-specific
reloads in the future. See PR #175002 for how the AMDGPU target uses this.
[UniformityAnalysis] Jump over reducible cycles when locating join blocks (#174938)
When locating the join blocks of a divergent block, the algorithm relies
on pseudo-edges from the header of a reducible cycle to the cycle exits.
This was missed in the actual traversal, producing unnecessary joins
inside the reducible cycle. This caused an assert in the included test,
which expected that if a join existed in a reducible cycle for a
divergent branch outside the cycle, then it must be the header.
This fixes the reverted commit from #174117.
[flang][CUDA] Apply implicit managed attribute when `-gpu=mem:managed` is used. (#175648)
When `-gpu=mem:managed` is used, allocatable arrays without explicit
CUDA data attributes are implicitly treated as managed. The
`-gpu=mem:managed` flag to enable this feature is currently only
supported in `bbc`.
[RISCV] Add the missing SEW search table field to vector FMA instructions (#175646)
We split vector floating point FMA (pseudo) instructions' opcodes by SEW
since c6b7944be4dfbb1fb35301c670812726845acaa7, but forgot to populate
their `SEW` field, which is used by various search tables. This results
in incorrect pseudo instruction opcode lookups -- and, to a larger
extent, incorrect scheduling class lookups -- in llvm-mca. This patch
fixes the issue.
[cmake] Make CMAKE_BUILD_TYPE=Release the default (#174520)
Currently, we report a fatal error if the user leaves CMAKE_BUILD_TYPE
blank. This was implemented in https://reviews.llvm.org/D124153 /
350bdf9227ceb , based on this RFC:
https://discourse.llvm.org/t/rfc-select-a-better-linker-by-default-or-warn-about-using-bfd/61899/1
Tom Stellard mentioned that he'd like to revisit this on Discord, and
Aiden, I, and apparently most people on the original RFC agree, so
I'm proposing we do it. However, on the review, several folks objected
and insisted that Debug was a better default. I want to reopen the
question.
I think we've made the wrong tradeoff. I wish Debug builds worked out of
the box on most systems, but they don't, and LLVM has only gotten bigger
over the last four years, making the build scalability problems of Debug
builds worse. I think we should optimize our build configuration for new
developers, not experienced longtime contributors who are invested
[9 lines not shown]
[profcheck] Fix encoding of 0 loopEstimatedTrip count (#174896)
We currently encode an estimated trip count of 0 as the latch having branch probabilities 0-0. That's an invalid pair of weights: the probability of a branch is computed as the ratio of its corresponding weight to the sum of the weights. In fact, `BranchProbabilityInfo::calcMetadataWeights` will convert this to 1-1, meaning 50%-50%, which isn't quite what we want. To indicate that the loop is never taken, we just need to initialize the exit probability to a non-zero value (hence, 1).
Related: https://reviews.llvm.org/D67905
Issue #147390
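The arithmetic behind this can be illustrated with a small standalone sketch. It does not use LLVM's API: `branchProbability` is a hypothetical helper that models only the weight / sum-of-weights rule described above, and treating the two weights as (backedge, exit) is an assumption made for illustration.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper, not part of LLVM: probability of one successor,
// given its weight and the weight of the other successor.
// Returns std::nullopt when both weights are zero, because the ratio
// weight / (sum of weights) would then be 0/0 and has no meaning.
std::optional<double> branchProbability(std::uint32_t weight,
                                        std::uint32_t otherWeight) {
  const std::uint64_t sum = static_cast<std::uint64_t>(weight) + otherWeight;
  if (sum == 0)
    return std::nullopt;
  return static_cast<double>(weight) / static_cast<double>(sum);
}
```

Under these assumptions, the pair 0-0 yields no probability at all, the pair 1-1 yields 0.5 for each successor (the 50%-50% case mentioned above), and a backedge weight of 0 with an exit weight of 1 yields a backedge probability of 0 and an exit probability of 1.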
[RISCV] Adjust base cost for Xqcilo loads/stores in RISCVMakeCompressible (#175572)
We only need two uses in Xqcilo load/store instructions for the base
adjustment to be profitable as compared to three uses in the base
load/store instructions.
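The thresholds described here can be written as a minimal sketch. This is not the actual RISCVMakeCompressible code: `LoadStoreKind`, `minUsesForBaseAdjustment`, and `isBaseAdjustmentProfitable` are hypothetical names that model only the "two uses versus three uses" rule stated above.

```cpp
// Hypothetical model, not the actual RISCVMakeCompressible code.
enum class LoadStoreKind { Base, Xqcilo };

// Minimum number of uses for which the base adjustment pays off,
// per the description above: 3 for base load/stores, 2 for Xqcilo.
constexpr unsigned minUsesForBaseAdjustment(LoadStoreKind kind) {
  return kind == LoadStoreKind::Xqcilo ? 2u : 3u;
}

// True when the number of uses reaches the threshold for this kind.
constexpr bool isBaseAdjustmentProfitable(LoadStoreKind kind,
                                          unsigned numUses) {
  return numUses >= minUsesForBaseAdjustment(kind);
}
```

In this model, two uses are enough for an Xqcilo load/store but not for a base load/store, which needs a third use before the adjustment is considered worthwhile.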
[CIR][X86] Add support for `intersect` builtins (#172554)
Adds support for the
`__builtin_ia32_vp2intersect_d`/`__builtin_ia32_vp2intersect_q` x86
builtins.
Part of #167765
---------
Signed-off-by: vishruth-thimmaiah <vishruththimmaiah at gmail.com>