[MLIR][XeGPU] Port tests from the XeGPUSubgroupDistribute to XeGPUSgToWiDistributeExperimental (#189747)
This PR ports tests from subgroup-distribute.mlir (old pass) to
sg-to-wi-experimental.mlir (new pass).
<bsd.dep.mk>: Several tweaks and style cleanups
* Remove duplicate ':N*.cpp' from ${_ALL_DEPENDS}.
* Simplify '!empty(${_FG:M_})' to be '${_FG} == "_"'.
* Replace `cmd` with $(cmd), which is clearer in expressing nested
command substitution.
* Adjust indentations and add comments to help read the complex flow.
<bsd.subdir.mk>: Fix SUBDIR ordering for non-parallel mode
As documented in the make(1) man page and confirmed by my tests, the '.ORDER'
directive only applies in parallel mode (even with -j1), so
${SUBDIR_ORDERED} is actually ignored in non-parallel mode. As a
result, the subdirectories are built in the order they appear in
${SUBDIR}, which may differ from ${SUBDIR_ORDERED}, and this can
lead to build failures. For example, gnu/lib/gcc120/libstd++fs failed
to build because it was built before libstdcxx/headers, which it depends on.
Discussed-with: swildner
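The failure mode can be illustrated outside of make(1). The following Python
sketch is not how bsd.subdir.mk handles ordering; it assumes a hypothetical
two-entry ${SUBDIR} list and a single ordering constraint, and only shows that
plain list order and a constraint-respecting order can differ:

```python
from graphlib import TopologicalSorter

# Hypothetical values for illustration; not taken from the real Makefiles.
subdir = ["libstd++fs", "libstdcxx/headers"]
# Maps each directory to the directories that must be built before it.
must_follow = {"libstd++fs": {"libstdcxx/headers"}}

def ordered_build(dirs, deps):
    """Return dirs in an order where every prerequisite comes first."""
    sorter = TopologicalSorter({d: deps.get(d, set()) for d in dirs})
    return list(sorter.static_order())

naive_order = list(subdir)                        # order as listed in SUBDIR
safe_order = ordered_build(subdir, must_follow)   # order honouring constraints
```

Here naive_order starts with libstd++fs, while safe_order places
libstdcxx/headers first, which mirrors the difference between ${SUBDIR} and
${SUBDIR_ORDERED} described in this commit.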
gcc120: Fix "make depend" failure in libgcc_eh
libgcc_eh pulled the 'FLAGS_GROUPS=sse2' from libgcc/Makefile.src and
thus "make depend" would call mkdep(1) on the 'sse2' group, but it would
fail because the 'sfp-machine.h' header was not generated.
Fix the problem by moving the 'FLAGS_GROUPS=sse2' and related variables
from libgcc/Makefile.src to {libgcc,libgcc_pic}/Makefile, where they're
actually used.
Discussed-with: swildner
<bsd.dep.mk>: Apply .NOPATH to .depend_${group} files as well
Each group defined in ${FLAGS_GROUPS} will have its own depend file
named '.depend_${group}'. Apply the '.NOPATH' attribute to them as well
as the main '.depend'.
Meanwhile, tweak the '.NOPATH' syntax as source/attribute to align
better with the make(1) man page.
<bsd.dep.mk>: Fix issues in generating depend files
* Remove the '> ${.TARGET}' command so that a repeated 'make depend' does
not falsely succeed.
Before this change, an empty '.depend' file would be created even if
mkdep(1) failed, and then another 'make depend' (e.g., from
'make quickworld') would skip creating the depend files and thus
falsely succeed.
* Remove the '-' prefix from the 'rm -f ${.TARGET}' command. This fixes
mkdep(1) failures being ignored in jobs mode (i.e., make -jN).
In jobs mode, all the commands of a target are executed by a
single shell instance. When the shell does not have ErrCtl enabled
(which is the default), the '-' prefix affects the entire job rather
than only the commands prefixed with '-'. See make(1) for more
details.
[4 lines not shown]
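The "falsely succeed" case from the first bullet can be modelled with a toy
Python sketch. It is only an approximation under stated assumptions: the
function names are hypothetical, and an existing target is simply treated as
up to date, whereas make(1) also compares timestamps against prerequisites.

```python
import os
import tempfile

def make_depend(target, mkdep_succeeds, truncate_first):
    """Toy model of one 'make depend' run; returns True if it reports success."""
    if os.path.exists(target):
        return True  # assumption: an existing target is skipped as up to date
    if truncate_first:
        open(target, "w").close()  # models the removed '> ${.TARGET}' step
    if not mkdep_succeeds:
        return False  # mkdep(1) failed, so the run fails
    with open(target, "w") as f:
        f.write("foo.o: foo.c\n")  # placeholder dependency line
    return True

def two_runs(truncate_first):
    """Run the model twice with a failing mkdep and return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, ".depend")
        first = make_depend(target, False, truncate_first)
        second = make_depend(target, False, truncate_first)
        return first, second

old_behaviour = two_runs(truncate_first=True)
new_behaviour = two_runs(truncate_first=False)
```

With the truncation step, the first run fails but leaves an empty file behind,
so the second run reports success; without it, both runs fail as they should.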
[MLIR][XeGPU] Remove verifyLayouts from sg to wi pass (#190360)
The verifyLayouts function walked the IR before distribution and failed
the pass if any XeGPU anchor op or vector-typed result was missing a
layout attribute. This was added as a temporary guard while the pass was
being developed.
Instead, we now add a target check for each op.
Add cross-mode iSCSI compatibility test suite
test_264 exercises common iSCSI behaviours (extents, targets, sessions, CHAP,
XCOPY, etc.) across both SCST and LIO to catch regressions on mode switch.
test_265 covers portal binding.
Add LIO as an alternative iSCSI target stack
The LIO path uses a configfs reconciler (utils/lio/config.py) that writes
desired state directly to /sys/kernel/config/target/. Service, ALUA, and
iSER handling all gate on the active stack. Pre-switch validation on mode
change rejects configurations incompatible with LIO.
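As a rough illustration of what a desired-state reconciler does, the Python
sketch below compares a desired mapping with the current one and plans the
changes. The function name, keys, and values are all made up for this example;
it is not the API of utils/lio/config.py, it does not reflect the real configfs
layout, and it never touches /sys/kernel/config/target/.

```python
def plan_reconcile(desired, current):
    """Compare two {path: value} mappings and return the changes to apply.

    Hypothetical helper for illustration only; it performs no I/O.
    """
    to_create = sorted(p for p in desired if p not in current)
    to_update = sorted(
        p for p in desired if p in current and desired[p] != current[p]
    )
    to_remove = sorted(p for p in current if p not in desired)
    return {"create": to_create, "update": to_update, "remove": to_remove}

# Made-up relative paths and values standing in for configfs entries.
current = {"target-a/enable": "1", "target-old/enable": "1"}
desired = {"target-a/enable": "0", "target-b/enable": "1"}
plan = plan_reconcile(desired, current)
```

In this sketch, target-b would be created, target-a updated, and target-old
removed. A real reconciler would also have to order these operations, which is
out of scope here.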
[clang-doc] Use distinct APIs for fixed arena allocation sites
Typically, code either always emits data into the TransientArena or the
PersistentArena. Use more explicit APIs to convey the intent directly
instead of relying on parameters or defaults.
[clang-doc] Update type aliases
Many of the type aliases we introduced to simplify migration to arena
allocation are no longer relevant after completing the migration. We
can use more relevant names and remove dead aliases.
[clang-doc] Removed OwnedPtr alias
The alias served a purpose during migration, but now conveys the wrong
semantics, as the memory of these pointers is generally interned inside
a local arena.
[clang-doc] Support deep copy between arenas for merging
Upcoming changes to the merge step will necessitate that we clear the
transient arenas and merge new items into the persistent arena. However,
there are some challenges with that, as the existing types typically
don't want to be copied. We introduce some new APIs to simplify that
task and ensure we don't accidentally leak memory.
On the performance front, we reclaim about 2% of the overhead, bringing
the cumulative overhead from the series of patches down to about 7.7% over
the baseline.
| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 1014.5s | 991.5s | +7.7% | -2.3% |
| Memory | 86.0G | 39.9G | 40.0G | -53.4% | +0.3% |
| Benchmark | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
[28 lines not shown]
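The percentage columns can be checked with a few lines of Python. This sketch
assumes that Culm% compares This against Baseline and Seq% compares This
against Prev; those definitions are inferred from the Time row of the first
table, not stated in the commit.

```python
def pct_change(new, old):
    """Percentage change from old to new, rounded to one decimal place."""
    return round((new - old) / old * 100, 1)

# Values copied from the Time and Memory rows of the first table.
rows = {
    "Time": {"baseline": 920.5, "prev": 1014.5, "this": 991.5},
    "Memory": {"baseline": 86.0, "prev": 39.9, "this": 40.0},
}

# Assumed definitions: Culm% = This vs Baseline, Seq% = This vs Prev.
summary = {
    name: {
        "culm": pct_change(r["this"], r["baseline"]),
        "seq": pct_change(r["this"], r["prev"]),
    }
    for name, r in rows.items()
}
```

Under these assumptions the Time row reproduces +7.7% and -2.3%. The Memory row
yields -53.5% and +0.3% from the rounded inputs; the table's -53.4% was
presumably computed from unrounded measurements.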
[clang-doc] Move Info types into arenas
Info types used to own significant chunks of data. As we move these into
local arenas, these types must be trivially destructible, to avoid
leaking resources when the arena is reset. Unfortunately, there isn't a
good way to transition all the data types one at a time, since most of
them are tied together in some way. Further, as they're now allocated in
the arenas, they often cannot be treated the same way, and even the
aliases and interfaces put in place to simplify the transition cannot
cover the full range of changes required.
We also use some SFINAE tricks to avoid adding boilerplate for helper
APIs we'd otherwise have to support.
Though it introduces some additional churn, we also try to keep tests
from using arena allocation as much as possible, since this is not
required to test the implementation of the library. As much of the test
code needed to be rewritten anyway, we take the opportunity to
transition now.
[41 lines not shown]