[mlir][xegpu] Add vector layout conflict handling in XeGPU layout propagation pass. (#182402)
This PR adds support for layout conflict handling for vector operands. A
conflict for a vector operand occurs when a value consumed at a given
operand is not in the expected layout in the context of the consumer
(for example, the source of a `vector.multi_reduction` op requires a specific
layout inferred from its current result layout). To resolve this
conflict, we insert an `xegpu.convert_layout` right after the producer
(essentially duplicating the producer with expected layout) and use the
new value in the consumer.
[TableGen] Complete the support for artificial registers
Artificial registers were added in eb0c510ecde667cd911682cc1e855f73f341d134
as a means of giving super-registers heavier weights than those
of their subregisters, even when they only contain a single
physical subregister.
Artificial registers thus do exist in code and participate in
register unit weight calculations, but are not supposed to be
available for register allocation.
This patch completes the support for artificial registers to:
- Ignore artificial registers when joining register unit uber
sets. Artificial registers may be members of classes that
together include registers and their sub-registers, making it
impossible to compute normalised weights for uber sets they
belong to.
[28 lines not shown]
Removing databases/puppetdb, sysutils/ruby-facter, sysutils/puppetserver,
sysutils/ruby-puppet, sysutils/ruby-puppetserver-ca.
openvox equivalents will take over.
OK kn@
AMDGPU: Implement expansion for f64 exp (#182539)
I asked AI to port the device libs reference implementation.
It mostly worked, though it got the compares wrong and also
missed a fold that happened in the compiler. With that fixed I get
identical DAG output, and almost the same GlobalISel output (differing
by an inverted compare and select). Also adjusted some stylistic
choices.
Revert [Clang] eliminate -Winvalid-noreturn false positive after throw + unreachable try/catch blocks (#183365)
Reverts https://github.com/llvm/llvm-project/pull/175443
---
Reverting for now because the CFG `try` connectivity change caused
additional analysis regressions (`-Wthread-safety-analysis` etc.) beyond
the original fix.
Add @pkgpath and @conflict to openvoxdb, openvox-server,
ruby-openvoxserver-ca, ruby-openvox, ruby-openfact and package renamings
from puppet -> openvox equivalents to provide a working upgrade path.
OK kn@
[LoopFusion] clear FusionCandidates more often (#183353)
A LoopVector contains all the loops with the same parent loop (or all
loops with no parent). Once loop fusion is done with the transformation
for candidates extracted from one LoopVector we can safely clear
FusionCandidates. This avoids unnecessary work and results in more
meaningful statistics.
[Hexagon] Fix truncation to boolean vector that needs widening (#182528)
When truncating a sub-HVX-width vector to a boolean vector (e.g., v64i8
-> v64i1 in 128-byte HVX mode), the operation would crash with
"Unhandled HVX operation" UNREACHABLE. This happened because the
condition in LowerHvxOperationWrapper/ReplaceHvxNodeResults did not
handle the case where the input vector needs widening and the result is
a boolean vector.
The fix adds WidenHvxTruncateToBool, which widens the input to HVX
register width (e.g., v64i8 -> v128i8), performs the truncate to the
widened bool type (v128i8 -> v128i1), and extracts the result subvector
(v128i1 -> v64i1).
This allows the widened truncate to match the existing V6_vandvrt
pattern in HexagonPatternsHVX.td.
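The widen / truncate / extract sequence described above can be illustrated with a small lane-level model. This is only a sketch of the data flow, not the actual lowering code: the helper names, the use of Python lists for vectors, and the choice of zero for the padding lanes are all assumptions made for illustration.

```python
HVX_LANES = 128  # assumed: number of i8 lanes in one register in 128-byte HVX mode


def widen(vec, lanes=HVX_LANES):
    """Pad a sub-register-width vector to full width (e.g. v64i8 -> v128i8).

    The padding lanes are don't-care values; zero is used here for simplicity.
    """
    return vec + [0] * (lanes - len(vec))


def truncate_to_bool(vec):
    """Truncate each i8 lane to i1 by keeping its lowest bit (v128i8 -> v128i1)."""
    return [lane & 1 for lane in vec]


def extract_subvector(vec, count):
    """Keep the first `count` lanes (v128i1 -> v64i1)."""
    return vec[:count]


def widen_truncate_to_bool(vec):
    """Hypothetical model of the three steps: widen, truncate, extract."""
    wide = widen(vec)
    wide_bools = truncate_to_bool(wide)
    return extract_subvector(wide_bools, len(vec))
```

In this model the padding lanes never reach the result, since the final extract keeps only the original lane count.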
ValueTracking: Special case fmul by llvm.amdgcn.trig.preop
This is another instance of the logic from #183159. If we know
one source is not-infinity, and the other source is less than or
equal to 1, this cannot overflow. Special case llvm.amdgcn.trig.preop,
as a substitute for proper range tracking. This almost enables pruning
edge case handling in trig function implementations, if not for the
recursion depth limit (but that's a problem for another day).
sysutils/py-salt: Update to 3006.23
PR: 287582
Reported by: Nick Hilliard <nick__at__foobar__dot__org>, T.S. <net__at__arrishq__dot__net>, James TD Smith <ahktenzero+freebsd__at__mohorovi__dot__cc>
[WebKit Checkers] Handle CXXRewrittenBinaryOperator in trivial analysis. (#183278)
Visit the semantic form when encountering CXXRewrittenBinaryOperator in
the trivial function analysis / no-delete analysis.
[lldb][Process/FreeBSDKernelCore] Implement DoWriteMemory() (#183237)
Implement `ProcessFreeBSDKernelCore::DoWriteMemory()` to write data to a
kernel dump or `/dev/mem`. Due to security concerns (e.g. writing a wrong
value to `/dev/mem` can trigger a kernel panic), this feature is only
enabled when `plugin.process.freebsd-kernel-core.read-only` is set to
false (true by default).
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
ValueTracking: Teach computeKnownFPClass that multiply by <=1 cannot overflow
If one operand is known not-inf, that can be propagated if the other operand is
known to have a magnitude <= 1.
This enables elimination of some inf checks inside the implementation of trig
functions when the input is known not-inf.
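The rule can be sketched as a small predicate plus a numeric sanity check. This is not the computeKnownFPClass implementation: the function names and the boolean "known fact" parameters are hypothetical, and Python floats stand in for IEEE doubles.

```python
import math
import sys


def fmul_known_not_inf(lhs_not_inf, lhs_abs_le_one, rhs_not_inf, rhs_abs_le_one):
    """Hypothetical rule: the product cannot be infinite if one operand is
    known not-inf and the other is known to have magnitude <= 1."""
    return (lhs_not_inf and rhs_abs_le_one) or (rhs_not_inf and lhs_abs_le_one)


def check_samples():
    """Spot-check the rule on boundary values, including the largest finite double."""
    big = sys.float_info.max
    not_inf_values = [big, -big, 1e308, 0.0, -0.0, math.nan]
    small_values = [1.0, -1.0, 0.5, 0.0, -0.0, 5e-324]
    for x in not_inf_values:
        for y in small_values:
            # Multiplying by a magnitude <= 1 never increases the magnitude,
            # so a non-infinite input cannot overflow to infinity.
            if math.isinf(x * y):
                return False
    return True
```

The check includes NaN on purpose: a NaN input yields NaN, which is still not an infinity, so the not-inf fact carries over.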
Revert "[flang][openmp] Add support for ordered regions in SIMD directives (#181012)"
This reverts commit 31dacdc1f5d486da6ef6d8b2f7e3b6126d92c9ff.
See the PR for test failure details.