[DA] Fix the Weak Zero SIV tests when the coeff may be zero (#183736)
In the Weak Zero SIV tests, given two subscripts `{c0,+,a}` and `c1`,
when `c0 == c1`, the tests conclude that a dependency exists from the
former subscript at the first iteration to the latter subscript at every
iteration. However, this conclusion is correct only when `a` is not
zero, which was not being checked.
This patch adds non-zero checks for `a` in the Weak Zero SIV tests.
This also fixes the test cases added in #183735.
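To see why the non-zero check on `a` matters, the brute-force sketch below enumerates which iterations of the `{c0,+,a}` subscript touch the same element as `c1`. The function name, the explicit trip count, and the use of plain integers are illustrative assumptions; this is not code from DependenceAnalysis.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns every iteration i in [0, tripCount) for which
// the subscript {c0,+,a} (that is, c0 + a*i) equals the invariant subscript c1.
std::vector<int64_t> conflictingIterations(int64_t c0, int64_t a, int64_t c1,
                                           int64_t tripCount) {
  assert(tripCount >= 0 && "trip count must be non-negative");
  std::vector<int64_t> iters;
  for (int64_t i = 0; i < tripCount; ++i)
    if (c0 + a * i == c1)
      iters.push_back(i);
  return iters;
}
```

With `c0 == c1` and `a == 2`, only iteration 0 is returned. With `a == 0`, every iteration is returned, so restricting the dependence to the first iteration would miss the others.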
[MemProf] Enhance thin link optimization remarks (#184829)
Don't require -memprof-report-hinted-sizes for emitting opt remarks
during the thin link step. Invoke the handling also when opt remarks are
enabled for MemProf per OptimizationRemarkEmitter::allowExtraAnalysis.
Also add a fallback message for when the context size information is
unavailable, and add tests for the new messages.
I also realized we don't currently emit these messages for MemProf with
regular LTO, and added a TODO.
[llvm-ir2vec] Adding getFuncNames API to ir2vec python bindings (#180473)
This is mostly a user convenience. Currently, the user has to fetch the
entire embeddings dict just to see which functions a module contains.
[LV] Remove branch on false in blend-costs.ll test. NFC (#184816)
I have a patch I want to post that improves blend masks, but it ends up
with a weird diff in this test stemming from the branch on false.
This replaces it with an external boolean. This should still test
scalarizing a blend, which I believe is the original intent.
[ELF] Remove unused handleTlsRelocation (#184951)
Now that all targets use target-specific relocation scanning for TLS
(#181332 RISC-V being the last), handleTlsRelocation is unused.
[RISCV][P-ext] Select plui.h/w and improve usage of pli.b/h/w. (#184937)
This patch adds custom instruction selection of splat_vector of
constants. Rather than using the element size from the VT, find
the smallest splat size in the constant. This allows us to use
pli.b for i16 or i32 elements that contain a byte splat.
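The idea of finding the smallest splat size can be sketched as repeatedly halving the width while the two halves match. The helper name and its integer interface are assumptions made for illustration; the actual selection code in the patch is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the smallest width in bits (down to 8) such
// that the low `width` bits of `value` are one chunk of that width repeated.
unsigned smallestSplatBits(uint64_t value, unsigned width) {
  assert((width == 8 || width == 16 || width == 32 || width == 64) &&
         "sketch only handles power-of-two widths from 8 to 64");
  // Halve the width for as long as the lower and upper halves are identical.
  while (width > 8) {
    unsigned half = width / 2;
    uint64_t mask = (uint64_t{1} << half) - 1;
    if ((value & mask) != ((value >> half) & mask))
      break;
    width = half;
  }
  return width;
}
```

For example, the 32-bit constant `0x2A2A2A2A` reports 8 bits, which matches the case described above where a byte splat can cover wider elements, while `0x12341234` reports 16 bits.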
[mlir] Improve dialect conversion failure diagnostics (#182729)
This PR improves MLIR dialect conversion failure diagnostics when
legalization fails.
Previously, the diagnostic mostly included the operation name (and in
partial conversion, whether it was explicitly marked illegal). This
change keeps that prefix and appends the printed failing operation. This
provides immediate operand/result/type context directly in the same
error line.
### Example
Before:
```
failed to legalize operation 'test.type_consumer' that was explicitly marked illegal
```
After:
[6 lines not shown]
[libc++][string] Replace ASAN volatile wrapper with memory barrier (#184693)
The previous `_LIBCPP_ASAN_VOLATILE_WRAPPER` approach was used to
prevent speculative loads of string data before the short/long state
was determined. This patch replaces that mechanism with a more explicit
`__annotate_memory_barrier()` using an empty volatile assembly block.
This PR is inspired by #183457 and by a downstream false positive on
`__get_long_size`. It fails the same way `__get_long_pointer` did
before we had `_LIBCPP_ASAN_VOLATILE_WRAPPER`. The barrier approach
avoids expanding `_LIBCPP_ASAN_VOLATILE_WRAPPER` for size_t and, in
general, looks more readable.
I failed to create a reasonable reproducer for a test. I suspect it
requires a precise set of compiler flags and a libc++ site_config, which
would be hard to maintain in a test.
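As a rough illustration of the barrier idea, the sketch below places an empty volatile assembly statement between the short/long check and the load of the long size. Everything here is a toy stand-in: `compilerBarrier`, `ToyString`, and `toySize` are hypothetical names, the `"memory"` clobber is an assumption of this sketch, and the asm syntax is the GCC/Clang extension. It does not reproduce the actual `__annotate_memory_barrier()` definition.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical compiler-level barrier: an empty volatile asm statement.
// The "memory" clobber (an assumption here) tells the compiler not to move
// memory accesses across this point.
inline void compilerBarrier() { __asm__ __volatile__("" ::: "memory"); }

// Toy string header; the real libc++ layout is different.
struct ToyString {
  bool isLong;
  std::size_t longSize;
  std::size_t shortSize;
};

std::size_t toySize(const ToyString &s) {
  if (s.isLong) {
    // Keep the load of longSize after the isLong check.
    compilerBarrier();
    return s.longSize;
  }
  return s.shortSize;
}
```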
[SandboxVec][DAG] Handle unscheduled successors when user is external (#183861)
Whenever an IR use-def edge gets updated, the DAG gets notified about
the change by having its `notifySetUse()` callback called. The
callback's job is to update the DAG node's `UnscheduledSuccs` counter
which is the number of successor nodes that are yet to be scheduled.
This update makes sense only if both ends of the use-def edge are in the
DAG. Up until now we would still update the counter even if the user was
outside the DAG. This patch fixes this, so from now on we skip updating
`UnscheduledSuccs` if the user is outside the DAG.
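The behaviour described above can be modelled with a small toy DAG. All types and names below (`Instr`, `Node`, `ToyDAG`, `newDefCountAfterRewire`) are hypothetical stand-ins, and the callback signature is simplified; this is a sketch of the counting rule, not the SandboxVec implementation.

```cpp
#include <cassert>
#include <unordered_map>

struct Instr {};  // toy instruction; identity is its address

struct Node {
  unsigned unscheduledSuccs = 0;  // successors not yet scheduled
};

struct ToyDAG {
  std::unordered_map<const Instr *, Node> nodes;

  Node *getNodeOrNull(const Instr *I) {
    auto it = nodes.find(I);
    return it == nodes.end() ? nullptr : &it->second;
  }

  // Called when `user` stops using `oldDef` and starts using `newDef`.
  void notifySetUse(const Instr *user, const Instr *oldDef,
                    const Instr *newDef) {
    // A user outside the DAG is not a successor node, so skip the update.
    if (!getNodeOrNull(user))
      return;
    if (Node *oldN = getNodeOrNull(oldDef)) {
      assert(oldN->unscheduledSuccs > 0 && "counter underflow");
      --oldN->unscheduledSuccs;
    }
    if (Node *newN = getNodeOrNull(newDef))
      ++newN->unscheduledSuccs;
  }
};

// Usage example: rewire `user` from `oldDef` to `newDef` and report the
// counter of `newDef` afterwards.
unsigned newDefCountAfterRewire(bool userInDag) {
  Instr oldDef, newDef, user;
  ToyDAG dag;
  dag.nodes[&oldDef].unscheduledSuccs = userInDag ? 1 : 0;
  dag.nodes[&newDef];
  if (userInDag)
    dag.nodes[&user];
  dag.notifySetUse(&user, &oldDef, &newDef);
  return dag.nodes[&newDef].unscheduledSuccs;
}
```

When the user is in the DAG the new definition gains one unscheduled successor; when the user is external the counters are left untouched.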
[RISCV] Support 'f' Inline Assembly Constraint for bfloat16 (#184566)
This patch adds the 'f' and 'cf' inline assembly constraints for the `bfloat16` type, so that such values are passed in floating-point registers.
[ELF] Add target-specific relocation scanning for RISC-V (#181332)
Implement RISCV::scanSectionImpl, following the pattern established
for x86 (#178846) and AArch64 (#181099). This merges the getRelExpr
and TLS handling for SHF_ALLOC sections into the target-specific
scanner, enabling devirtualization and eliminating abstraction
overhead.
- Inline relocation classification into scanSectionImpl with a switch
on relocation type, replacing the generic rs.scan() path.
- Use processR_PC/processR_PLT_PC for common PC-relative and PLT
relocations.
- Handle TLS IE and GD directly (RISC-V does not optimize GD/LD/IE).
- Replace TLS-optimization-specific expressions for TLSDESC, following
the x86 pattern: R_RELAX_TLS_GD_TO_IE -> R_GOT_PC,
R_RELAX_TLS_GD_TO_LE -> R_TPREL. Update relocateAlloc and relax()
to dispatch on relocation type instead of RelExpr for TLSDESC.
- Simplify getRelExpr to only handle relocations needed by
relocateNonAlloc and preprocessRelocs.
[4 lines not shown]
[RISCV] Add register overlap checks to the assembler for vector indexed segment load (#184569)
The destination vector register group cannot overlap the source vector
register group for vector indexed segment loads. This patch adds the
corresponding register overlap checks to the assembler.
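An overlap check of this kind boils down to an interval intersection over register numbers. The sketch below assumes the caller already knows how many registers each group spans; how the assembler derives those counts is not shown in this message, and the helper name and parameters are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: true if the register groups [vd, vd + dstRegs) and
// [vs2, vs2 + idxRegs) share at least one vector register.
bool groupsOverlap(unsigned vd, unsigned dstRegs, unsigned vs2,
                   unsigned idxRegs) {
  assert(dstRegs > 0 && idxRegs > 0 && "register groups must be non-empty");
  return vd < vs2 + idxRegs && vs2 < vd + dstRegs;
}
```

For instance, a destination group covering v8 through v11 overlaps an index register v10, but not an index group covering v12 and v13.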
[clang][deps] Store `IgnoreCWD` on `ModuleDeps` (#184921)
This aligns us with downstream, where we need to be able to query
whether a module depends on CWD or not.
[RISCV] Remove outdated TODO in isExtractSubvectorCheap (#184938)
Index 0 is already handled by an early return, so the TODO comment about
extracting index 0 from a mask vector is no longer needed.