[CIR][X86] Add support for `aes` and `aeswide` builtins (#175892)
- Support CIR codegen for the following builtins: `aesenc`, `aesdec`,
`aesencwide` and `aesdecwide`.
- Part of https://github.com/llvm/llvm-project/issues/167752
[acc] `acc declare` + `present` clause for COMMON blocks (#175588)
Fix: `!$acc declare present(/COMMON/)` no longer adds
`acc.declare(dataClause=acc_present)` attribute to the fir.global
common.
Lowering change: COMMON+present is lowered through the structured
declare path (fir.address_of + acc.present operand) to preserve scope.
[AArch64] Use a load instead of a store for inline stack probes (#170855)
Frequently, when big buffers are put on the stack, we end up with
multiple virtual pages Copy-On-Write mapped to a single physical zero
page. Stack probes that store to such pages would unnecessarily trigger
a Copy-On-Write. Avoid this by probing with loads into XZR instead.
[LoopFusion] Removing dead code leftover after PR #171889 (NFC) (#176020)
Removed unused functions in order to fix 'unused function' warnings, as
mentioned in PR 171889. This involved the two original functions
`ControlConditions::isEquivalent(const ControlConditions &Other) const`
and `ControlConditions::collectControlConditions(const
llvm::BasicBlock&, const llvm::BasicBlock&, const llvm::DominatorTree&,
const llvm::PostDominatorTree&, unsigned int)`, plus all the functions
that became unused as a result of deleting the two original ones.
Co-authored-by: Szymon Sobieszek <szymon.sobieszek1 at huawei.com>
[clang][-Wunsafe-buffer-usage] Ignore consteval functions (#171503)
We don't need to visit or warn on consteval functions as they can't have
UB.
Co-authored-by: mxms <mxms at google.com>
[SelectionDAG] Move HwMode expansion from tablegen to SelectionISel. (#174471)
The way HwMode is currently implemented, tablegen duplicates each
pattern that is dependent on hardware mode. The HwMode predicate is
added as a pattern predicate on the duplicated pattern.
RISC-V uses HwMode on the GPR register class which means almost every
isel pattern is affected by HwMode. This results in the isel table
being nearly twice the size it would be if we only had a single GPR
size.
This patch proposes to do the expansion at instruction selection time
instead. To accomplish this, new opcodes like OPC_CheckTypeByHwMode
are added to the isel table. The unique combinations of types and HwMode
are converted to an index that is the payload for the new opcodes.
TableGen emits a new virtual function getValueTypeByHwMode that uses
this index and the current HwMode to look up the type.
This reduces the size of the isel table on RISC-V from ~2.38 million
[13 lines not shown]
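The index-based lookup described above can be sketched roughly as follows. This is only an illustration of the idea, not the code TableGen emits: `SketchVT`, `SketchHwMode`, `TypeByHwModeTable`, `lookupTypeByHwMode` and `checkTypeByHwMode` are hypothetical names, and the table contents are made up.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for value types; this is not LLVM's MVT.
enum class SketchVT { i32, i64 };

// Hypothetical hardware modes, e.g. a 32-bit and a 64-bit GPR configuration.
enum class SketchHwMode : std::size_t { Mode32 = 0, Mode64 = 1 };
const std::size_t NumModes = 2;

// Each row is one unique (HwMode -> type) combination. The row index is the
// payload stored in the matcher table instead of a concrete type.
const std::array<std::array<SketchVT, NumModes>, 2> TypeByHwModeTable = {{
    {SketchVT::i32, SketchVT::i64}, // index 0: type follows the GPR width
    {SketchVT::i32, SketchVT::i32}, // index 1: i32 in every mode
}};

// Resolve the payload index to a concrete type for the current mode.
SketchVT lookupTypeByHwMode(std::size_t Index, SketchHwMode Mode) {
  assert(Index < TypeByHwModeTable.size() && "payload index out of range");
  return TypeByHwModeTable[Index][static_cast<std::size_t>(Mode)];
}

// A type check in the spirit of OPC_CheckTypeByHwMode: the comparison is
// made at selection time, so one table entry serves all modes.
bool checkTypeByHwMode(SketchVT Actual, std::size_t Index, SketchHwMode Mode) {
  return Actual == lookupTypeByHwMode(Index, Mode);
}
```

With this shape a mode-dependent pattern is stored once, and the mode is consulted only when the check runs.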
[X86] SimplifyDemandedVectorEltsForTargetNode - reduce instruction size if upper half of X86ISD::PCLMULQDQ isn't demanded (#176199)
If the upper subvector half of a 256/512-bit X86ISD::PCLMULQDQ node
isn't demanded, then split the operands and perform the operation using
a narrower instruction.
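The demanded-elements check behind this can be sketched as below. It is a simplified illustration that uses a plain 64-bit lane mask rather than the SelectionDAG and APInt machinery the real code works with; `upperHalfNotDemanded` and `narrowedWidthInBits` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when none of the upper NumElts/2 lanes are set in
// DemandedElts (bit i set means lane i of the result is used).
bool upperHalfNotDemanded(std::uint64_t DemandedElts, unsigned NumElts) {
  assert(NumElts >= 2 && NumElts <= 64 && NumElts % 2 == 0);
  const unsigned Half = NumElts / 2;
  const std::uint64_t LowMask = (std::uint64_t{1} << Half) - 1;
  const std::uint64_t AllMask =
      NumElts == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << NumElts) - 1;
  const std::uint64_t UpperMask = AllMask & ~LowMask;
  return (DemandedElts & UpperMask) == 0;
}

// Width the operation could be performed at: a 256/512-bit operation is
// halved when its upper half is unused; anything else is left alone.
unsigned narrowedWidthInBits(unsigned WidthInBits, std::uint64_t DemandedElts,
                             unsigned NumElts) {
  if (WidthInBits >= 256 && upperHalfNotDemanded(DemandedElts, NumElts))
    return WidthInBits / 2;
  return WidthInBits;
}
```

For a 256-bit node viewed as four 64-bit lanes, a demanded mask of 0x3 uses only the low two lanes, so the sketch reports that a 128-bit operation suffices.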
[flang] Fix crash with coarray teams #171048 (#172259)
This PR updates the `CHANGE TEAM` construct to fix the bug reported in
issue #171048.
When a construct such as `IfConstruct` was present in the `CHANGE TEAM`
region, several basic blocks were created outside the region.
[CFIInstrInserter][NFC] Move `class CSRSavedLocation` definition. (#176053)
This is needed to minimize the diff for a future commit where we plan to
use `CSRSavedLocation` in `struct MBBCFAInfo`.
[MemProf] Handle weak alias and aliasee prevailing in different modules (#176083)
For ThinLTO we only have the cloning information in the FunctionSummary,
so for aliases we create as many clones as there are aliasee clones in
the LTO backend. However, that information is only in the prevailing
symbol's summary, as we don't keep the memprof summary information for
other copies (to reduce memory and compile time).
In the case of weak aliases, it is possible that the prevailing copy
of the alias may be in a different module than the prevailing copy of
the aliasee (e.g. when a module with a weak_odr aliasee definition does
not have a def of the weak_odr alias and is listed first on the link
line). In that case, we were not creating the expected clones of the
alias.
Rather than a more complex solution that adds additional summary
information, detect this case and simply don't add the callsites in the
aliasee function to the callsite context graph. This will result in
conservativeness (because we can't clone through that function), but
this should be a corner case.
InstCombine: Fold known-qnan results to a literal nan
Previously we only considered fcNan to fold to qnan for canonicalizing
results, ignoring the simpler case where we know the nan is already
quiet.
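One way to read the distinction is sketched below, with a simplified class mask standing in for LLVM's FPClassTest. The enumerators, their values, and `canFoldToLiteralQNan` are hypothetical, and the `ResultIsCanonicalized` flag is an assumption about how the older condition was gated.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a floating-point class mask; names and values
// are illustrative only.
enum SketchFPClass : std::uint32_t {
  SketchSNan = 1u << 0,
  SketchQNan = 1u << 1,
  SketchNan = SketchSNan | SketchQNan,
  SketchOther = 1u << 2, // every non-NaN class, lumped together
};

// True when the value is known to be some kind of NaN and nothing else.
bool isKnownAlwaysNan(std::uint32_t Known) {
  return Known != 0 && (Known & ~std::uint32_t{SketchNan}) == 0;
}

// Decide whether a result can be replaced by a literal quiet NaN.
bool canFoldToLiteralQNan(std::uint32_t Known, bool ResultIsCanonicalized) {
  // Simpler case: the value is already known to be a quiet NaN.
  if (Known == SketchQNan)
    return true;
  // Older case: any NaN, provided the operation quiets its result.
  return ResultIsCanonicalized && isKnownAlwaysNan(Known);
}
```

The first branch does not depend on whether the operation canonicalizes, which is what makes it the simpler case.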
[NVPTX] Update various intrinsic attributes, nfc cleanup (#175660)
This patch migrates the intrinsic properties back to "PureIntrinsic"
from "NVVMPureIntrinsic" (after PR #166450).
While we are there:
* Refactor a few mbarrier intrinsics definitions (NFC)
* Update mbarrier.pending_count properties. (trivial)
* Formatting changes over a few fence intrinsics (NFC)
[CI] Make premerge jobs support GHA postcommit (#176180)
This was causing failures in the release branch as the premerge jobs
there are also run postcommit through GHA. We were expecting a PR number
to always be present when it was not.
[MLIR][XeGPU] Clean up helpers in XeGPUPropagateLayout (#175857)
In XeGPUPropagateLayout.cpp, the helper getDefaultSIMTLayoutInfo is
implemented via multiple overloads that differ significantly in
semantics, not just parameter types.
Reusing the same function name for these semantically different
behaviors makes call sites harder to read and reason about and increases
the maintenance burden. This PR improves readability and maintainability
of layout propagation logic.
[profcheck] Reorder the FileCheck substitution. (#176098)
In the profcheck build, FileCheck commands are substituted with
`cat > /dev/null` to disable output verification. In
test/Transforms/SampleProfile/remarks-hotness.ll we have both "FileCheck"
and "not FileCheck" statements. Replacing the positive one first turns
"not FileCheck" into "not cat", which inverts the exit status of a
command that succeeds.
Run the "not FileCheck" substitution first to fix this.
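The ordering problem can be reproduced with a small stand-alone sketch of ordered textual substitution. This is not lit's implementation: `replaceAll` and `applySubstitutions` are hypothetical helpers, and it assumes both substitutions map to `cat > /dev/null`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of From in Text with To, scanning left to right.
std::string replaceAll(std::string Text, const std::string &From,
                       const std::string &To) {
  assert(!From.empty() && "pattern must be non-empty");
  std::size_t Pos = 0;
  while ((Pos = Text.find(From, Pos)) != std::string::npos) {
    Text.replace(Pos, From.size(), To);
    Pos += To.size(); // continue after the inserted text
  }
  return Text;
}

// Apply substitutions in the order given; earlier ones see the text first.
std::string applySubstitutions(
    std::string Line,
    const std::vector<std::pair<std::string, std::string>> &Subs) {
  for (const auto &S : Subs)
    Line = replaceAll(std::move(Line), S.first, S.second);
  return Line;
}
```

Applying `FileCheck` before `not FileCheck` to the line `not FileCheck %s` leaves `not cat > /dev/null %s`, because the longer pattern no longer matches. With the order reversed the line becomes `cat > /dev/null %s`.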
[LLVM][CodeGen] Rename `gc-empty-basic-blocks` to `enable-gc-empty-basic-blocks` (#176018)
Rename the `gc-empty-basic-blocks` command line option to
`enable-gc-empty-basic-blocks` in preparation for adding a call that
initializes the pass in `initializeCodeGen`, and to make the flag more
consistent with other existing flags that enable or disable passes.
Keep `gc-empty-basic-blocks` as an alias to allow all users to migrate
to the new option.
[Support] Suppress old MSVC warning for [[msvc::no_unique_address]] (#176130)
MSVC versions prior to 19.43 (Visual Studio 2022 version 17.13) emit a
warning when using the [[msvc::no_unique_address]] attribute prior to
C++20.
This is now considered a bug and is fixed in later releases of MSVC.
Suppress the warning for older MSVC versions by disabling the warning
around the attribute usage. This allows for warning-free builds when
targeting older MSVC versions.
More details and discussion about the warning can be found here:
https://developercommunity.visualstudio.com/t/msvc::no_unique_address-Should-Not-W/10118435
Revert "[NFC][MI] Tidy Up RegState enum use (1/2)" (#176190)
Reverts llvm/llvm-project#176091
Reverting because some compilers were erroring on the call to
`Reg.isReg()` (which is not `constexpr`) in a `constexpr` function.