[clang][DebugInfo] Add virtuality call-site target information in DWARF. (#167666)
Given the test case:
  struct CBase {
    virtual void foo();
  };
  void bar(CBase *Base) {
    Base->foo();
  }
and using '-emit-call-site-info' with llc, the following DWARF
is produced for the indirect call 'Base->foo()':
1$: DW_TAG_structure_type "CBase"
...
2$: DW_TAG_subprogram "foo"
...
[18 lines not shown]
[mlir][tosa] Refactor convolution infer return type (#178869)
Lots of logic was repeated for Conv2D, Conv3D and Conv2DBlockScaled ops.
This commit factors out common logic to reduce code duplication.
In doing so, a bug in calculating the bias shape was also fixed. Since
DepthwiseConv2D and TransposeConv2D were fixed independently, this
commit fixes #175765.
[X86] knownbits-vpmadd52.ll - replace extended unicode character with regular ascii (#182278)
Stops update_llc_test_checks.py from complaining about, or unnecessarily changing, the file.
[flang][OpenMP] Initial support for DEPTH clause
The semantic checks do not check any conditions on the associated loop
nest (such as actual depth or whether it is a perfect nest).
Lowering will emit a not-implemented-yet message.
[libclc] Completely remove ENABLE_RUNTIME_SUBNORMAL option (#182125)
Summary:
This option isn't really used, and removing it simplifies the code. I
could go deeper and remove this content entirely, as they all return
`false`, but I figured this was an easier change to do first.
---------
Co-authored-by: Wenju He <wenju.he at intel.com>
[mlir][spirv] (De)serialize Offset, XfbBuffer and XfbStride decorations (#181835)
Process decorations number 35, 36 and 37 in SPIR-V deserializer and
serializer; add a simple test case.
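As a rough illustration of what these decorations look like in the binary, here is a minimal sketch that encodes an OpDecorate instruction carrying one literal operand. It is not the MLIR serializer code: `encodeDecorate` is a hypothetical helper, and it does not model which instruction (OpDecorate or OpMemberDecorate) a given decoration is valid on.

```cpp
#include <cstdint>
#include <vector>

// Decoration numbers handled by this commit (values from the SPIR-V spec).
enum class Decoration : std::uint32_t {
  Offset = 35,
  XfbBuffer = 36,
  XfbStride = 37,
};

// SPIR-V opcode for OpDecorate.
constexpr std::uint32_t OpDecorate = 71;

// Hypothetical helper: encodes "OpDecorate <TargetId> <D> <Literal>".
// The first word packs the word count (high 16 bits) and the opcode
// (low 16 bits); each of the three decorations above takes one literal.
std::vector<std::uint32_t> encodeDecorate(std::uint32_t TargetId,
                                          Decoration D,
                                          std::uint32_t Literal) {
  constexpr std::uint32_t WordCount = 4;
  return {(WordCount << 16) | OpDecorate, TargetId,
          static_cast<std::uint32_t>(D), Literal};
}
```

A deserializer would do the reverse: read the decoration number from the third word and, for 35, 36 and 37, read one more literal word.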
[X86] combineSETCC - drop unnecessary shift amount bounds check for larger-than-legal ICMP_ZERO(AND(X,SHL(1,IDX))) folds (#182021)
For i128 etc. bittest patterns, we split the pattern into an i32
extraction + i32 bittest.
But we were unnecessarily limiting this to in-bounds shift amounts. I
wrote this fold at the same time as narrowBitOpRMW, where we needed the
bounds check for safe memory access; that check isn't necessary in
combineSETCC.
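The split described above can be sketched in plain C++ as follows. This is only an illustration of the "i32 extraction + i32 bittest" idea, not the DAG combine itself: `testBit128` is a hypothetical helper, the 128-bit value is assumed to be stored as four little-endian 32-bit words, and the index is assumed to be below 128.

```cpp
#include <cstdint>

// Hypothetical helper: tests bit Idx of a 128-bit value held as four
// little-endian 32-bit words. Assumes Idx < 128; the "& 3" only keeps
// the array access defined in this sketch.
bool testBit128(const std::uint32_t Words[4], unsigned Idx) {
  // Step 1: extract the 32-bit word that contains the requested bit.
  std::uint32_t Word = Words[(Idx / 32) & 3];
  // Step 2: ordinary 32-bit bit test using the low 5 bits of the index.
  return (Word & (std::uint32_t{1} << (Idx % 32))) != 0;
}
```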
Fix 2 of 2 for #147216
[RDF] Fix DenseMap reference invalidation in computePhiInfo (#182144)
In Liveness::computePhiInfo, the reference `RefMap &RUM =
RealUseMap[PA.Id]` can be invalidated when the inner loop inserts into
RealUseMap via `RealUseMap[P.first][SS.Id]`. If `P.first` is a new key,
the DenseMap may rehash, invalidating the RUM reference and any
iterators into it.
Fix by making a copy of the map value instead of holding a reference.
This is detected by _GLIBCXX_DEBUG (enabled via EXPENSIVE_CHECKS) which
tracks iterator validity on std::unordered_map (RefMap).
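A minimal sketch of the copy-instead-of-reference pattern follows. It is not the RDF code: `propagate`, `OuterMap` and the value types are hypothetical, and std::unordered_map stands in for the outer DenseMap only to keep the example self-contained (std::unordered_map keeps references to its elements stable across rehashing, whereas DenseMap may move its values).

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

using NodeId = std::uint32_t;
using RefMap = std::unordered_map<NodeId, int>;
// Stand-in for the outer map. With llvm::DenseMap, inserting a new key
// can rehash and move every stored RefMap.
using OuterMap = std::unordered_map<NodeId, RefMap>;

// Hypothetical helper: adds every entry of Outer[Src] into Outer[Dst]
// for each Dst in Targets.
void propagate(OuterMap &Outer, NodeId Src,
               const std::vector<NodeId> &Targets) {
  // Copy the value rather than writing `RefMap &RUM = Outer[Src];`.
  // The loop below may insert new keys into Outer, which is exactly the
  // operation that would invalidate such a reference in a DenseMap.
  RefMap RUM = Outer[Src];
  for (NodeId Dst : Targets)
    for (const auto &Entry : RUM)
      Outer[Dst][Entry.first] += Entry.second; // may insert key Dst
}
```

The copy costs one extra map construction per call, which is the trade-off this style of fix accepts in exchange for not depending on the outer container's reference stability.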
[LV] NFCI: Add RecurKind to VPPartialReductionChain (#181705)
This avoids having to pass around the RecurKind or re-figure it out from
the VPReductionPHI node.
This is useful in a follow-up PR, where we need to distinguish between a
`Sub` and `AddWithSub` recurrence, which can't be deduced from the
`ReductionBinOp` field.
[mlir][tosa] Fix dense_resource data alignment in tosa-narrow-* tests (#182253)
The alignment of int64 and float64 dense resources should be 8, not 4.
[RegisterCoalescer] Prefer copy over rematerialization when smaller
When the source register has multiple uses, compare instruction sizes
before rematerializing. If the copy is smaller than the rematerialized
instruction, prefer keeping the copy to reduce code size.
Additionally, register-to-register copies are often eliminated by
register renaming on modern out-of-order CPUs, making them effectively
free at runtime.
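The decision described above can be sketched as a small predicate. This is an assumption-laden illustration, not the RegisterCoalescer implementation: `RematCandidate` and `shouldRematerialize` are hypothetical names, and the sizes are assumed to be encoded instruction sizes in bytes.

```cpp
// Hypothetical inputs to the heuristic; field names are illustrative.
struct RematCandidate {
  unsigned CopySizeBytes;   // encoded size of the register copy
  unsigned RematSizeBytes;  // encoded size of the rematerialized instruction
  bool SrcHasMultipleUses;  // whether the source register has several uses
};

// Returns false when the copy should be kept: the source register has
// multiple uses and the copy is strictly smaller than the instruction
// that rematerialization would emit. Otherwise rematerialization is
// still allowed in this sketch.
bool shouldRematerialize(const RematCandidate &C) {
  if (C.SrcHasMultipleUses && C.CopySizeBytes < C.RematSizeBytes)
    return false;
  return true;
}
```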