[AMDGPU] Prioritize allocation of low 256 VGPR classes
If we have 1024 VGPRs available we need to give priority to the
allocation of these registers where operands can only use low 256.
That is noteably scale operands of V_WMMA_SCALE instructions.
Otherwise large tuples will be allocated first and take all low
registers, so we would have to spill to get a room for these
scale registers.
Allocation priority itself does not eliminate spilling completely
in large kernels, although helps to some degree. Increasing spill
weight of a restricted class on top of it helps.
[AMDGPU] Add baseline test to show spilling of wmma scale. NFC (#168163)
This is to show the spilling of WMMA scale values which are limited
to low 256 VGPRs. We have free registers, just RA allocates low 256
first.
[Analysis] Move TargetLibraryInfo data to TableGen
The collection of library function names in TargetLibraryInfo faces
similar challenges as RuntimeLibCalls in the IR component. The number
of function names is large, there are numerous customizations based
on the triple (including alternate names), and there is a lot of
replicated data in the signature table.
The ultimate goal would be to capture all lbrary function related
information in a .td file. This PR brings the current .def file to
TableGen, almost as a 1:1 replacement. However, there are some
improvements which are not possible in the current implementation:
- the function names are now stored as a long string together with
an offset table.
- the table of signatures is now deduplicated, using an offset table
for access.
The size of the object file decreases about 34kB with these changes.
[6 lines not shown]
[TableGen] Eliminate the dependency on SDNode definition order
Fix CodeGenDAGPatterns::ParseDefaultOperands() not to depend on the
first SDNode defined being usable as an output pattern operator.
Presently, each `OperandWithDefaultOps` record is processed by
constructing an instance of TreePattern from its `DefaultOps` argument
that has the form `(ops ...)`. Nevertheless the result of processing
the root operator of that DAG is not inspected by `ParseDefaultOperands()`
function itself, that operator has to be parseable by the underlying
`TreePattern::ParseTreePattern()` function. For that reason, a temporary
DAG is created by replacing the root operator of `DefaultOps` argument
from `ops` to the first SDNode defined, which is usually `def imm : ...`
defined in `TargetSelectionDAG.td` file.
This results in misleading errors being reported when defining new
`SDNode` types, if the new definition happens to be added before the
`def imm : ...` line. The error is reported by several test cases
executed by `check-llvm` target, as well as if one of the targets
[8 lines not shown]
[Github] Make metrics container build use common actions (#168667)
This patch makes the metrics container build/push job use the common
container build/push actions to simplify the workflow by quite a bit.
[Github] Bump Runner Version in CI Containers
To ensure we stay ahead of the ~6 month time horizon. This new version
seems to be mostly small version bumps and minor fixes that probably are
not too relevant to us.
[RISCV] Update X60 ReleaseAtCycles for Vector Integer Arithmetic Instructions (#152557)
This PR updates the ReleaseAtCycles for all instructions described in
Section 11 of the RVV Spec: Vector Integer Arithmetic Instructions. The
data used comes from camel-cdr.
[NFC][lldb] move DiagnosticsRendering to Host (#168696)
NFC patch which moves `DiagnosticsRendering` from `Utility` to `Host`.
This refactoring is needed for
https://github.com/llvm/llvm-project/pull/168603. It adds a method to
check whether the current terminal supports Unicode or not. This will be
OS dependent and a better fit for `Host`. Since `Utility` cannot depend
on `Host`, `DiagnosticsRendering` must live in `Host` instead.
[libc++] Remove is_signed<T> use from <limits> (#168334)
`numeric_limits` already has an `is_signed` member. We can use that
instead of using `std::is_signed`.
[Fuzzer] make big-file-copy.test work with the internal shell (#168658)
This patch uses several shell features not supported by the internal
shell, such as $? to get the exit code of a command, and exit. This
patch adjusts the test to work with the internal shell by using bash to
run the actual command with a zero exit code to ensure the file is
deleted, and python to propagate the exit code up to lit.
[ASan] Make duplicate_os_log_reports.cpp work with the internal shell
This test used a for loop to implement retries and also did some trickery with PIDs.
For this test, just invoke bash for actually running the test given we need the PID,
and move the for loop into a separate shell script file that we can then invoke from
within the test. Normally it would make sense to rewrite such a script in Python, but
given this test does not have portability concerns only running on Darwin, it is fine
to use a shell script here given there is no other convenient alternative.
Reviewers: ndrewh, DanBlackwell, fmayer
Reviewed By: ndrewh
Pull Request: https://github.com/llvm/llvm-project/pull/168656
[cross-project-tests][DebugInfo] Make simplified-template-names test runnable on Darwin (#168725)
The test was failing on Darwin for two reasons:
1. `-fdebug-type-sections` is not a recognized flag on Darwin
2. We fail to reconstitute a name if the template parameter has a type
that has a preferred_name. With LLDB tuning the type of such a parameter
is a typedef, i.e., the preferred name. Without tuning it would be the
canonical type that the typedef (possibly through a chain of typedefs)
points to.
This patch addresses (1) by splitting the `-fdebug-type-sections` tests
into a separate file (and only mark that one `UNSUPPORTED`). Which means
we can at least XFAIL the non-type-sections tests on Darwin.
To fix (2) we might need to make the `DWARFTypePrinter` aware of
non-canonical `DW_AT_type`s of template parameters.
[HLSL] replace std::unordered_map with DenseMap (#168739)
Broke some builds because of a missing include. Changing to a DenseMap
and adding the missing include.
[ASan] Make dyld_insert_libraries_reexec work with internal shell
This test was doing some feature checks within the test itself. This patch
rewrites the feature checks to be done in a fashion more idiomatic to lit,
as the internal shell does not support the features needed for the previous
feature checks.
Reviewers: ndrewh, DanBlackwell, fmayer
Reviewed By: ndrewh
Pull Request: https://github.com/llvm/llvm-project/pull/168655