[SelectionDAG] Move HwMode expansion from tablegen to SelectionISel. (#174471)
The way HwMode is currently implemented, tablegen duplicates each
pattern that is dependent on hardware mode. The HwMode predicate is
added as a pattern predicate on the duplicated pattern.
RISC-V uses HwMode on the GPR register class which means almost every
isel pattern is affected by HwMode. This results in the isel table
being nearly twice the size it would be if we only had a single GPR
size.
This patch proposes to do the expansion at instruction selection time
instead. To accomplish this, new opcodes like OPC_CheckTypeByHwMode
are added to the isel table. The unique combinations of types and HwMode
are converted to an index that is the payload for the new opcodes.
TableGen emits a new virtual function getValueTypeByHwMode that uses
this index and the current HwMode to look up the type.
This reduces the size of the isel table on RISC-V from ~2.38 million
[13 lines not shown]
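The index-plus-HwMode lookup described above can be pictured roughly as
follows. This is only a sketch: `SimpleVT`, `VTByHwModeTable` and
`lookupValueTypeByHwMode` are hypothetical stand-ins, and the table contents
are made up; the function TableGen actually emits may look quite different.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a machine value type.
enum class SimpleVT { i32, i64 };

// Assumption: two hardware modes, e.g. mode 0 = 32-bit GPR, mode 1 = 64-bit GPR.
constexpr std::size_t NumHwModes = 2;
constexpr std::size_t NumCombos = 2;

// One row per unique combination of types across modes. The row index is
// what an opcode such as OPC_CheckTypeByHwMode would carry as its payload.
constexpr SimpleVT VTByHwModeTable[NumCombos][NumHwModes] = {
    {SimpleVT::i32, SimpleVT::i64}, // index 0: type depends on the mode
    {SimpleVT::i32, SimpleVT::i32}, // index 1: same type in every mode
};

// Resolve the concrete type at selection time from the payload index and
// the current hardware mode.
inline SimpleVT lookupValueTypeByHwMode(std::size_t Index, std::size_t HwMode) {
  assert(Index < NumCombos && HwMode < NumHwModes && "out-of-range lookup");
  return VTByHwModeTable[Index][HwMode];
}
```

With a table like this, a single pattern entry can serve every mode, since
the type is resolved when the matcher runs rather than when the table is
generated.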
[X86] SimplifyDemandedVectorEltsForTargetNode - reduce instruction size if upper half of X86ISD::PCLMULQDQ isn't demanded (#176199)
If the upper subvector half of a 256/512-bit X86ISD::PCLMULQDQ node
isn't demanded, then split the operands and perform the operation using
a smaller instruction.
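The "upper half isn't demanded" condition can be illustrated with a plain
bitmask, one bit per vector element. This is a simplified sketch, not the
SelectionDAG code: `upperHalfNotDemanded` is a hypothetical helper, and a
`uint64_t` stands in for the demanded-elements mask.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when none of the upper NumElts/2 elements are demanded.
// Bit i of DemandedElts is set when element i of the result is used.
inline bool upperHalfNotDemanded(std::uint64_t DemandedElts, unsigned NumElts) {
  assert(NumElts >= 2 && NumElts <= 64 && NumElts % 2 == 0 &&
         "sketch supports even element counts up to 64");
  const unsigned Half = NumElts / 2; // Half <= 32, so the shift below is safe
  const std::uint64_t LowMask = (std::uint64_t{1} << Half) - 1;
  const std::uint64_t AllMask =
      NumElts == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << NumElts) - 1;
  const std::uint64_t UpperMask = AllMask & ~LowMask;
  return (DemandedElts & UpperMask) == 0;
}
```

For a 4-element vector, a mask of `0x3` leaves the upper half free, so the
operation could run at half width, while `0x4` touches the upper half and
rules the narrowing out.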
[flang] Fix crash with coarray teams #171048 (#172259)
This PR updates the `CHANGE TEAM` construct to fix the bug mentioned in
the issue #171048.
When a construct such as `IfConstruct` was present in the `CHANGE TEAM`
region, several basic blocks were created outside the region.
[CFIInstrInserter][NFC] Move `class CSRSavedLocation` definition. (#176053)
This is needed to minimize the diff of a future commit where we plan to
use `CSRSavedLocation` in `struct MBBCFAInfo`.
[MemProf] Handle weak alias and aliasee prevailing in different modules (#176083)
For ThinLTO we only have the cloning information in the FunctionSummary,
so for aliases we create as many clones as there are aliasee clones in
the LTO backend. However, that information is only in the prevailing
symbol's summary, as we don't keep the memprof summary information for
other copies (to reduce memory and compile time).
In the case of weak aliases, it is possible that the prevailing copy
of the alias may be in a different module than the prevailing copy of
the aliasee (e.g. when a module with a weak_odr aliasee definition does
not have a def of the weak_odr alias and is listed first on the link
line). In that case, we were not creating the expected clones of the
alias.
Rather than a more complex solution that adds additional summary
information, detect this case and simply don't add the callsites in the
aliasee function to the callsite context graph. This will result in
conservativeness (because we can't clone through that function), but
this should be a corner case.
InstCombine: Fold known-qnan results to a literal nan
Previously we only considered fcNan to fold to qnan for canonicalizing
results, ignoring the simpler case where we know the nan is already
quiet.
[NVPTX] Update various intrinsic attributes, nfc cleanup (#175660)
This patch migrates the intrinsic properties back to "PureIntrinsic"
from "NVVMPureIntrinsic" (after PR #166450).
While we are there:
* Refactor a few mbarrier intrinsics definitions (NFC)
* Update mbarrier.pending_count properties. (trivial)
* Formatting changes over a few fence intrinsics (NFC)
[CI] Make premerge jobs support GHA postcommit (#176180)
This was causing failures in the release branch as the premerge jobs
there are also run postcommit through GHA. We were expecting a PR number
to always be present when it was not.
[MLIR][XeGPU] Clean up helpers in XeGPUPropagateLayout (#175857)
In XeGPUPropagateLayout.cpp, the helper getDefaultSIMTLayoutInfo is
implemented via multiple overloads that differ significantly in
semantics, not just parameter types.
Reusing the same function name for these semantically different
behaviors makes call sites harder to read and reason about and increases
the maintenance burden. This PR improves readability and maintainability
of layout propagation logic.
[profcheck] Reorder the FileCheck substitution. (#176098)
In the profcheck build, FileCheck commands are substituted with
`cat > /dev/null` to disable output verification. In
test/Transforms/SampleProfile/remarks-hotness.ll we have both "FileCheck"
and "not FileCheck" statements. Replacing the positive one first results
in "not cat". Run the "not" substitution first to fix this.
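The ordering problem can be reproduced with ordinary string replacement.
The sketch below is not the lit configuration: `replaceAll` and
`applySubstitutions` are hypothetical helpers, and the replacement text
used for "not FileCheck" is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of From in Text with To.
inline std::string replaceAll(std::string Text, const std::string &From,
                              const std::string &To) {
  assert(!From.empty() && "pattern must not be empty");
  for (std::size_t Pos = Text.find(From); Pos != std::string::npos;
       Pos = Text.find(From, Pos + To.size()))
    Text.replace(Pos, From.size(), To);
  return Text;
}

using Substitution = std::pair<std::string, std::string>;

// Apply the substitutions strictly in the order given.
inline std::string applySubstitutions(std::string Cmd,
                                      const std::vector<Substitution> &Subs) {
  for (const Substitution &S : Subs)
    Cmd = replaceAll(std::move(Cmd), S.first, S.second);
  return Cmd;
}

// Assumed replacement text for both forms (illustrative only).
inline const std::string Sink = "cat > /dev/null";

// Positive form first: "not FileCheck" has already become "not cat ..." by
// the time the second rule runs, so that rule never matches.
inline const std::vector<Substitution> BadOrder = {
    {"FileCheck", Sink}, {"not FileCheck", Sink}};

// Longer, more specific pattern first: both forms are rewritten cleanly.
inline const std::vector<Substitution> GoodOrder = {
    {"not FileCheck", Sink}, {"FileCheck", Sink}};
```

Applying `BadOrder` to `not FileCheck %s` leaves a stray `not` in front of
`cat`, while `GoodOrder` rewrites the whole command.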
[LLVM][CodeGen] Rename `gc-empty-basic-blocks` to `enable-gc-empty-basic-blocks` (#176018)
Rename the `gc-empty-basic-blocks` command line option to
`enable-gc-empty-basic-blocks` in preparation for adding a call that
initializes the pass in `initializeCodeGen`, and to make the flag more
consistent with other existing flags that enable or disable passes.
Keep `gc-empty-basic-blocks` as an alias to allow all users to migrate
to the new option.
[Support] Suppress old MSVC warning for [[msvc::no_unique_address]] (#176130)
MSVC versions prior to 19.43 (Visual Studio 2022 version 17.13) emit a
warning when using the [[msvc::no_unique_address]] attribute prior to
C++20.
This is now considered a bug and has been fixed in later releases of MSVC.
Suppress the warning for older MSVC versions by disabling the warning
around the attribute usage. This allows for warning-free builds when
targeting older MSVC versions.
More details and discussion about the warning can be found here:
https://developercommunity.visualstudio.com/t/msvc::no_unique_address-Should-Not-W/10118435
Revert "[NFC][MI] Tidy Up RegState enum use (1/2)" (#176190)
Reverts llvm/llvm-project#176091
Reverting because some compilers were erroring on the call to
`Reg.isReg()` (which is not `constexpr`) in a `constexpr` function.
[NFC][MI] Tidy Up RegState enum use (1/2) (#176091)
This change prepares for making RegState an enum class. It:
- Updates documentation to match the order in the code.
- Brings the `get<>RegState` functions together and makes them
`constexpr`.
- Adopts the `get<>RegState` where RegStates were being chosen with
ternary operators in backend code.
- Introduces `hasRegState` to make querying RegState easier once it is
an enum class.
- Adopts `hasRegState` where the equivalent was done with bitwise
arithmetic.
- Introduces `RegState::NoFlags`, which will be used for the lack of
flags.
- Documents that `0x1` is a reserved flag value used to detect if
someone is passing `true` instead of flags (due to implicit bool to
unsigned conversions).
- Updates two calls to `MachineInstrBuilder::addReg` which were passing
`false` to the flags operand, to no longer pass a value.
- Documents that `getRegState` seems to have forgotten a call to
`getEarlyClobberRegState`.
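The helpers listed above might take roughly the following shape once the
flags become an enum class. Everything here is an illustrative assumption:
`RegStateSketch`, its flag values other than the reserved `0x1`, and the
exact semantics of `hasRegState` are not taken from the LLVM sources.

```cpp
#include <cassert>

// Hypothetical flag set; the values are illustrative, not LLVM's encoding.
enum class RegStateSketch : unsigned {
  NoFlags = 0x0,
  Reserved = 0x1, // kept unused: an accidental `true` converts to 1
  Define = 0x2,
  Kill = 0x4,
};

constexpr RegStateSketch operator|(RegStateSketch A, RegStateSketch B) {
  return static_cast<RegStateSketch>(static_cast<unsigned>(A) |
                                     static_cast<unsigned>(B));
}

// Assumed semantics: true when any bit of Test is set in Value.
constexpr bool hasRegState(RegStateSketch Value, RegStateSketch Test) {
  return (static_cast<unsigned>(Value) & static_cast<unsigned>(Test)) != 0;
}

// Replaces ternaries such as `IsKill ? Kill : 0` at call sites.
constexpr RegStateSketch getKillRegState(bool IsKill) {
  return IsKill ? RegStateSketch::Kill : RegStateSketch::NoFlags;
}

// Detects a caller that passed `true` where flags were expected.
constexpr bool looksLikeBool(unsigned RawFlags) {
  return (RawFlags & static_cast<unsigned>(RegStateSketch::Reserved)) != 0;
}
```

Because every helper only touches its arguments, each can be `constexpr`;
the revert above was caused by a call to a non-`constexpr` member function
inside such a helper.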
[LifetimeSafety] Test lifetime safety on stmt-local analysis test suite (#175906)
Add CFG-based lifetime analysis tests for dangling pointer detection
alongside the existing AST-based analysis.
This change helps validate that the new CFG-based lifetime analysis
correctly detects the same dangling pointer issues as the existing
AST-based analysis. It also documents current limitations of the
CFG-based approach with FIXME comments, providing a roadmap for future
improvements. The test ensures that both analysis methods can work
side-by-side, with the CFG-based analysis eventually intended to replace
the AST-based approach.