[Support] Consolidate the two implementations of Recycler::clear (NFC) (#165081)
This patch consolidates the two implementations of Recycler::clear
with "if constexpr" for simplicity.
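For illustration only, here is a minimal sketch of the general technique: two overloads that differ by allocator type are folded into one function template that branches with `if constexpr`. The names `CountingAlloc`, `ArenaAlloc`, and `FreeListSketch` are hypothetical and are not taken from the Recycler source; the actual patch may choose its condition differently.

```cpp
#include <cassert>
#include <cstdlib>
#include <type_traits>
#include <vector>

// Hypothetical allocator that frees each block individually.
struct CountingAlloc {
  int Frees = 0;
  void deallocate(void *P) {
    std::free(P);
    ++Frees;
  }
};

// Hypothetical arena: memory is released in bulk elsewhere, so it has no
// deallocate() member at all.
struct ArenaAlloc {};

// Hypothetical free list; a stand-in, not llvm::Recycler.
struct FreeListSketch {
  std::vector<void *> Free;

  // A single clear() replaces two overloads. The per-block branch is
  // discarded at compile time for ArenaAlloc, so the missing deallocate()
  // is never instantiated.
  template <typename AllocT> void clear([[maybe_unused]] AllocT &A) {
    if constexpr (!std::is_same_v<AllocT, ArenaAlloc>) {
      for (void *P : Free)
        A.deallocate(P);
    }
    Free.clear();
  }
};
```

Because the condition depends on the template parameter, the discarded branch does not need to compile for the arena case, which is what lets one body serve both allocators.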
[ADT] Skip DenseMapBase::destroyAll on trivially destructible types (#165080)
DenseMap::destroyAll currently iterates through the entire bucket
array to call destructors on keys and values. We don't need to do that
if we know that both key and value types are trivially destructible,
meaning that the destructors are no-ops.
This patch introduces "constexpr if" at the beginning of destroyAll to
skip the rest of the function if both key and value types are
trivially destructible.
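As a rough sketch of the idea, assuming a flat array of key/value pairs rather than DenseMap's real bucket layout (which also has to skip empty and tombstone keys), the early exit could look like the following. `destroyAllSketch` is a hypothetical name, and the returned count exists only to make the skip observable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Destroys N pairs in place and returns how many buckets were visited.
// When both types are trivially destructible, the loop is never compiled.
template <typename KeyT, typename ValueT>
std::size_t destroyAllSketch([[maybe_unused]] std::pair<KeyT, ValueT> *Buckets,
                             [[maybe_unused]] std::size_t N) {
  if constexpr (std::is_trivially_destructible_v<KeyT> &&
                std::is_trivially_destructible_v<ValueT>) {
    return 0; // Destructors are no-ops, so there is nothing to walk.
  } else {
    std::size_t Visited = 0;
    for (std::size_t I = 0; I != N; ++I) {
      std::destroy_at(&Buckets[I]);
      ++Visited;
    }
    return Visited;
  }
}
```

With `int` keys and values the function returns immediately; with a `std::string` value it still visits every bucket.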
[ADT] Make internal methods of DenseMap/SmallDenseMap private (NFC) (#165079)
This patch moves the init, copyFrom, and grow methods in DenseMap and
SmallDenseMap from public to private to hide implementation details.
The only problem is that PhysicalRegisterUsageInfo calls
DenseMap::grow instead of DenseMap::reserve, which I don't think is
intended. This patch updates the call to reserve.
[ISel] Use CallBase instead of CallInst (#164769)
This is to follow the discussion in
https://github.com/llvm/llvm-project/pull/164565
CallBase covers more call-like instructions that carry a calling
convention flag.
Co-authored-by: Yuanke Luo <ykluo at birentech.com>
[ORC] Fix race when checking isComplete (#165063)
After #164340 there is a tsan race on `OutstandingSymbolsCount` when
decrementing it in `notifySymbolMetRequiredState` vs reading it in
`isComplete()`. Fix this by having `IL_emit` filter out non-completed
queries when it has the lock to do so, and that way we avoid needing to
call `isComplete()` later.
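The pattern described above, deciding which queries are complete while the lock is still held so that nothing needs to read the counter afterwards, can be sketched as follows. This is not the ORC code: `QuerySketch`, `SessionSketch`, and `emit` are hypothetical stand-ins, and the counter is a plain `int` guarded by the mutex.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical query with an outstanding-symbol counter.
struct QuerySketch {
  int Outstanding; // Guarded by SessionSketch::M.
};

struct SessionSketch {
  std::mutex M;

  // Decrements each query's counter and, under the same lock, keeps only
  // the queries that reached zero. Callers never re-read the counter.
  std::vector<std::shared_ptr<QuerySketch>>
  emit(const std::vector<std::shared_ptr<QuerySketch>> &Queries) {
    std::vector<std::shared_ptr<QuerySketch>> Completed;
    std::lock_guard<std::mutex> Lock(M);
    for (const auto &Q : Queries)
      if (--Q->Outstanding == 0)
        Completed.push_back(Q);
    return Completed;
  }
};
```

The caller can then run completion handlers on the returned list outside the lock without touching `Outstanding` again.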
[CIR][NFC] Upstream EHPersonality for function (#164883)
Upstream the EHPersonality class for a function as a prerequisite for
working with the handlers.
Issue #154992
[mlir][amdgpu] Update mfma assembly format with intrinsic shape (#165037)
Use the same format as introduced for wmma by
https://github.com/llvm/llvm-project/pull/164920.
Also make `blocks` default to 1.
[CodeGenPrepare] Don't simplify incomplete expression tree in AddrModeCombine (#164628)
Since new select/phi instructions may construct loops, the expression
tree to be simplified may still be incomplete (i.e., it may contain
select with dummy values or phi without incoming values). This patch
removes the call to simplifyInstruction for now, as it doesn't break
existing tests.
Original PR: https://reviews.llvm.org/D36073
Fix the crash reported in
https://github.com/llvm/llvm-project/pull/163453#issuecomment-3429922732.
[X86] Move x86 specific create*Pass Functions to X86.h
There are no other target-specific passes in Passes.h, and these really
belong inside X86.h to be consistent with other targets.
Reviewers: arsenm, phoebewang, RKSimon, topperc
Reviewed By: arsenm
Pull Request: https://github.com/llvm/llvm-project/pull/165075
[AArch64] Widen GPR32 zero cycle zeroing (#164244)
Given a GPR32 zeroing instruction, if the target supports zero cycle
zeroing for GPR64 but not for GPR32, widen the instruction to the 64-bit
form `$xn = MOVZXi 0, 0` instead of writing to `$wn`, to exploit zero
cycle zeroing.
It also aligns naming in the generic zeroing test.
[SpecialCaseList] Filtering Globs with matching prefix (#164531)
This commit optimizes `SpecialCaseList` by using a `RadixTree` to filter
glob patterns based on their prefixes. When matching a query, the
`RadixTree` quickly identifies all glob patterns whose prefixes match
the query's prefix. This significantly reduces the number of glob
patterns that need to be fully evaluated, leading to performance
improvements, especially when dealing with a large number of patterns.
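To make the filtering idea concrete, here is a small sketch in which a sorted `std::map` keyed by each glob's literal prefix stands in for the `RadixTree`. All names (`globMatch`, `GlobIndexSketch`) are hypothetical, the matcher handles only `*` and `?`, and none of this is taken from the `SpecialCaseList` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Minimal glob matcher supporting only '*' and '?', with backtracking to
// the most recent '*'.
inline bool globMatch(std::string_view P, std::string_view S) {
  std::size_t PI = 0, SI = 0, Star = std::string_view::npos, Mark = 0;
  while (SI < S.size()) {
    if (PI < P.size() && P[PI] == '*') {
      Star = PI++;
      Mark = SI;
    } else if (PI < P.size() && (P[PI] == '?' || P[PI] == S[SI])) {
      ++PI;
      ++SI;
    } else if (Star != std::string_view::npos) {
      PI = Star + 1;
      SI = ++Mark;
    } else {
      return false;
    }
  }
  while (PI < P.size() && P[PI] == '*')
    ++PI;
  return PI == P.size();
}

class GlobIndexSketch {
  // Key: the literal text before the first wildcard of each glob.
  std::map<std::string, std::vector<std::string>, std::less<>> ByPrefix;

public:
  void add(const std::string &Glob) {
    ByPrefix[Glob.substr(0, Glob.find_first_of("*?"))].push_back(Glob);
  }

  // Runs the full matcher only on globs whose literal prefix is a prefix
  // of Query. Evaluated reports how many globs were fully evaluated.
  bool match(std::string_view Query, std::size_t &Evaluated) const {
    Evaluated = 0;
    for (std::size_t Len = 0; Len <= Query.size(); ++Len) {
      auto It = ByPrefix.find(Query.substr(0, Len));
      if (It == ByPrefix.end())
        continue;
      for (const std::string &G : It->second) {
        ++Evaluated;
        if (globMatch(G, Query))
          return true;
      }
    }
    return false;
  }
};
```

A radix tree finds the same candidate set in a single walk over the query instead of one map lookup per prefix length, but the filtering effect is the same: globs with a non-matching literal prefix are never passed to the matcher.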
According to SpecialCaseListBM:
Lookup benchmarks (significant improvements):
```
OVERALL_GEOMEAN -0.8177
```
Lookup like `prefix*` benchmarks (huge improvements):
```
OVERALL_GEOMEAN -0.9819
[6 lines not shown]