[MLIR][NVVM] Preserve PTX special registers in inline_ptx lowering (#203251)
`PtxBuilder::build()` converted operand placeholders (written as %0, %1,
and the predicate as @%N, since TableGen string attributes cannot
contain '$') to the inline-asm operand form with a blanket `replace(ptx,
'%', '$')`. That also rewrote literal PTX special-register names such as
%tid.x, %laneid and %dynamic_smem_size into $tid.x etc., producing
invalid PTX for any `nvvm.inline_ptx` whose body reads a special
register.
Convert only a '%' that is immediately followed by a digit (operand
placeholders and the @%N predicate); leave %<name> special registers
intact. PTX special registers always begin with a letter after '%', so
the digit test unambiguously distinguishes them from operand
placeholders.
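
The digit test can be sketched in plain C++ as follows. This is a minimal
illustration, not the actual `PtxBuilder` code: the function name
`convertOperandPlaceholders` is hypothetical, and it works on `std::string`
rather than on LLVM string types.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not the upstream implementation): rewrite '%' to '$'
// only when the next character is a decimal digit, so that %0, %1 and @%N
// become $0, $1 and @$N while %tid.x, %laneid, etc. are left untouched.
std::string convertOperandPlaceholders(std::string ptx) {
  for (std::size_t i = 0; i + 1 < ptx.size(); ++i) {
    // The cast avoids undefined behavior in std::isdigit for negative chars.
    if (ptx[i] == '%' &&
        std::isdigit(static_cast<unsigned char>(ptx[i + 1])))
      ptx[i] = '$';
  }
  return ptx;
}
```

Under these assumptions, `mov.u32 %0, %laneid;` becomes
`mov.u32 $0, %laneid;`, and a trailing lone '%' is left as-is.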
Add an NVVMToLLVM regression test that reads %laneid through
nvvm.inline_ptx.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply at anthropic.com>
[BOLT] Buffer DataAggregator diagnostics
To avoid interleaved error messages in the multi-perf case, provide a
diagnostics buffer and stream for each aggregator job.
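
The idea of per-job buffering can be sketched as below. This is only an
illustration under stated assumptions: `Job` and `runWithPrivateDiagnostics`
are hypothetical names, standard streams stand in for LLVM's, and the jobs
run in a simple loop here, whereas the real aggregator jobs may run
concurrently.

```cpp
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical job type: a job writes its diagnostics only to the stream it
// is handed, never to a shared global stream.
using Job = std::function<void(std::ostream &)>;

// Give every job its own buffer and return the buffered text per job, in job
// order. The caller can then print each report as one uninterrupted block.
std::vector<std::string>
runWithPrivateDiagnostics(const std::vector<Job> &jobs) {
  std::vector<std::string> reports;
  reports.reserve(jobs.size());
  for (const Job &job : jobs) {
    std::ostringstream diag; // private to this job
    job(diag);
    reports.push_back(diag.str());
  }
  return reports;
}
```

Because each job owns its buffer, messages from different inputs cannot be
mixed within a line or a block, whichever order the jobs finish in.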
Test Plan: updated pre-aggregated-perf.test
Reviewers: yavtuk, maksfb, rafaelauler, ayermolo, paschalis-mpeis, yozhu
Reviewed By: yavtuk
Pull Request: https://github.com/llvm/llvm-project/pull/203464
[BOLT] Propagate DataAggregator parse errors
Propagate perf/preaggregated input parsing errors through DataAggregator
instead of terminating from per-input aggregation jobs.
This lets multi-input aggregation report failed inputs as warnings when
at least one input succeeds while returning errors when all inputs fail.
It also converts pre-aggregated parsing diagnostics to returned Error
values and removes worker-path exits from perf setup and parsing.
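
The reporting policy described above can be sketched as follows. The types
`JobResult` and `AggregateOutcome` and the function `combineResults` are
hypothetical and use plain strings instead of `llvm::Error`; treating an
empty input list as "not failed" is also an assumption of this sketch.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical result of one per-input aggregation job.
struct JobResult {
  bool Ok;
  std::string Message; // diagnostic text, meaningful only when !Ok
};

struct AggregateOutcome {
  bool Failed = false;               // true only when every input failed
  std::vector<std::string> Warnings; // failures tolerated as warnings
  std::vector<std::string> Errors;   // failures returned as errors
};

// Failed inputs become warnings if at least one input succeeded; if all
// inputs failed, their messages are returned as errors instead.
AggregateOutcome combineResults(const std::vector<JobResult> &results) {
  std::vector<std::string> failures;
  for (const JobResult &r : results)
    if (!r.Ok)
      failures.push_back(r.Message);

  AggregateOutcome out;
  if (failures.size() < results.size()) {
    out.Warnings = std::move(failures); // at least one success
  } else if (!results.empty()) {
    out.Failed = true;                  // every input failed
    out.Errors = std::move(failures);
  }
  return out;
}
```

The key point is that a worker only returns its result; the decision to warn
or fail is made once, after all jobs have reported.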
Test Plan: updated pre-aggregated-perf.test
Reviewers: maksfb, rafaelauler, ayermolo, yozhu, yavtuk
Reviewed By: rafaelauler
Pull Request: https://github.com/llvm/llvm-project/pull/200476
[LoopFusion][NFC] Avoid copying fusion candidates per pair (#203461)
`fuseCandidates()` copied both candidates (each holding two
`SmallVector<Instruction *, 16>`) for every adjacent pair examined, even
pairs rejected by an early `continue`. Bind them by const reference; they
are only read before being erased from the list, and `performFusion` runs
before the erases.
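
The cost difference can be illustrated with a small standalone sketch. The
`Candidate` struct, the `CopyCount` counter and both functions are
hypothetical and much simpler than the real fusion candidates; they only
show that binding by const reference avoids the per-pair copies, including
for pairs that are rejected early.

```cpp
#include <iterator>
#include <list>
#include <vector>

static int CopyCount = 0; // counts Candidate copies, for illustration only

// Hypothetical stand-in for a fusion candidate holding two vectors.
struct Candidate {
  std::vector<int> Reads, Writes;
  Candidate() = default;
  Candidate(const Candidate &O) : Reads(O.Reads), Writes(O.Writes) {
    ++CopyCount;
  }
};

// Copies both candidates of every adjacent pair, even rejected ones.
int visitPairsByValue(const std::list<Candidate> &L) {
  int Accepted = 0;
  if (L.empty())
    return Accepted;
  for (auto It = L.begin(), Next = std::next(It); Next != L.end();
       ++It, ++Next) {
    Candidate A = *It, B = *Next; // two copies per pair
    if (A.Reads.empty() && B.Writes.empty())
      continue; // early reject: the copies were wasted
    ++Accepted;
  }
  return Accepted;
}

// Same traversal, but the candidates are only read, so references suffice.
int visitPairsByRef(const std::list<Candidate> &L) {
  int Accepted = 0;
  if (L.empty())
    return Accepted;
  for (auto It = L.begin(), Next = std::next(It); Next != L.end();
       ++It, ++Next) {
    const Candidate &A = *It;
    const Candidate &B = *Next;
    if (A.Reads.empty() && B.Writes.empty())
      continue;
    ++Accepted;
  }
  return Accepted;
}
```

Both functions accept the same pairs; only the by-value version increments
`CopyCount`. The references stay valid as long as nothing is erased from the
list while they are in use, which matches the ordering described above.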
[Flang][OpenMP] Prevent TODO abort on nothing directive (#202679)
Since `nothing` is a no-op directive (OpenMP 5.2, 8.4), handle it during
lowering instead of falling through to the generic unimplemented
utility-directive path and triggering a TODO abort.