[RISC-V] Add --implicit-check-not="error:" to a few tests
Ensures that the test checks for every error emitted by llvm-mc. To do this
we have to move the CHECK lines to the next line rather than the same line
since otherwise we get a false-positive match.
This adds a few missing CHECK lines in the xqcibm-invalid test and is needed
to minimize the diff in one of my subsequent commits.
Pull Request: https://github.com/llvm/llvm-project/pull/203091
[MLIR][XeGPU] Support partial subgroup lane distribution (#201667)
for convert_layout
Add lowering support in XeGPUSgToLaneDistribute for values that are
distributed across only a fraction of the subgroup.
- SgToLaneConvertLayout now lowers a rank-2 xegpu.convert_layout that
shrinks the lane layout along the outer (distributed) dimension while
keeping lane_data unchanged (e.g. [16, 1] -> [8, 1]). The partial-subgroup
case is detected directly in the pattern: equal order, rank 2, unit inner
lane layout, and a genuinely distributed outer lane layout (> 1, which also
rules out the degenerate [1, 1] layout). Because the data is no longer
replicated in every lane, it is gathered across lanes and the distributed
outer dimension is doubled when the lane count is halved.
- The cross-lane gather is factored into a dedicated helper,
shuffleDataAsLaneLayoutChange(): it bitcasts the source to i32, issues
gpu.shuffle up to fetch the values from the dropped lanes, and concatenates
[9 lines not shown]
[2/2][AMDGPU] Insert IMPLICIT_DEF to provide a reaching def for unspilled reloads
Depends on https://github.com/llvm/llvm-project/pull/198472
PR #198472 skips unspilling a slot if a spill reload is reachable from
entry along a path that does not contain a spill store. This patch builds
on it by finding a basic block where an IMPLICIT_DEF can be inserted to
provide a reaching definition on all paths to such reloads, allowing the
unspill to proceed. This new def may extend the rewritten vreg's live
range, so extra interference checks are performed over the extended region
to pick an appropriate physical register.
For the joint-dominance tests, an IMPLICIT_DEF insertion block is found,
but no physical register is interference-free over the extended range,
so the unspill is conservatively skipped.
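The reachability condition that triggers this patch can be pictured with a small sketch. This is not the pass's code: the CFG is modelled as a hypothetical dictionary from block names to successor lists, `store_blocks` is a hypothetical set of blocks containing a spill store, and the check is done at block granularity (a store is assumed to precede any reload in the same block).

```python
from collections import deque


def reload_reachable_without_store(cfg, entry, store_blocks, reload_block):
    """Return True if reload_block is reachable from entry along a path on
    which no block contains a spill store (block-level approximation)."""
    if entry in store_blocks:
        return False
    seen = {entry}
    work = deque([entry])
    while work:
        block = work.popleft()
        if block == reload_block:
            return True
        for succ in cfg.get(block, ()):
            # A block with a spill store defines the slot, so paths through
            # it are not store-free and are not explored further.
            if succ not in seen and succ not in store_blocks:
                seen.add(succ)
                work.append(succ)
    return False


# Hypothetical diamond CFG: entry -> {a, b} -> join
cfg = {"entry": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}
print(reload_reachable_without_store(cfg, "entry", {"a"}, "join"))
print(reload_reachable_without_store(cfg, "entry", {"a", "b"}, "join"))
```

When the function returns True, the reload has no reaching definition on some path, which is the situation where an IMPLICIT_DEF would need to be inserted.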
Assisted-by: Cursor/Claude Opus
Revert "[lldb][test] Increase polling in TestInterruptThreadNames.py (#201554)" (#203126)
This reverts commit fdfd1c1344187d64b63504ea8e3662ae4936503a.
The Intel mac CI bot is timing out often with these new timeouts and
we're getting failing runs. Raphael will adjust and re-land.
[MLIR][XeGPU] Extend 8-bit load_nd support in XeVM lowering (#201645)
A 2D block load on an 8-bit element type with shape 32x16 is supported by
the OpenCL API:
```
void intel_sub_group_2d_block_read_transform_8b_32r16x1c( // reads eight uints
global void* base_address,
int width, int height, int pitch, int2 coord, private uint* destination);
```
This API is for loads with a transform/VNNI request.
OpenCL does not provide a load API for the same vector type without a
transform request, but the value returned is identical for this special
vector type, <32x16x"8b">.
This PR adds support for this vector type with no transform request.
[MLIR][XeGPU] Enable peephole optimization for the CRI target (#201655)
Enable the XeGPU transpose peephole and array-length optimizations for
the Crescent Island (cri) target alongside pvc and bmg. Skip sub-byte (<
8-bit) element types in array-length optimizations, which are not yet
supported.
Add tests in peephole-optimize.mlir covering the cri target and the
array-length optimization rejecting sub-byte element types.
[Passes] Enhance `--print-pipeline-passes` (#202892)
Allow users to specify the output format and make pipeline output more
palatable to FileCheck. Currently, it only supports the `text` and `tree`
formats.
Fixes #200926.
[SpecialCaseList] Add backward compatible dot-slash handling
This PR is preparation for:
* https://github.com/llvm/llvm-project/pull/167283
The new behavior is controlled by the `Version` field in the special
case list file.
- Versions 1 and 2: Paths are matched as-is, regardless of the presence of "./".
- Versions 3, 5, and higher: Paths with a leading dot-slash are canonicalized
to paths without dot-slash before matching. This means that a rule
like `src=./foo` will never match, and `src=foo` will match both
`foo` and `./foo`. (Version 3 never became default but has this behavior).
- Version 4: Transitionary version. Paths are matched both ways
(canonicalized and non-canonicalized) to maintain backward compatibility.
If a match only works with the old behavior (non-canonicalized), a warning
is emitted.
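The version rules above can be sketched roughly as follows. This is an illustration only, not the SpecialCaseList implementation: `match_path` and `strip_dot_slash` are hypothetical names, exact string comparison stands in for glob matching, and stripping only a single leading "./" is an assumption.

```python
def strip_dot_slash(path):
    # Assumption: only a single leading "./" is removed.
    return path[2:] if path.startswith("./") else path


def match_path(version, rule, path):
    """Return (matched, warn) for one rule/path pair.

    Exact comparison stands in for the real glob matching.
    """
    raw = rule == path                         # old behavior: match as-is
    canonical = rule == strip_dot_slash(path)  # new behavior: canonicalize first
    if version in (1, 2):
        return raw, False
    if version == 4:
        # Transitional: accept either, warn if only the old behavior matches.
        return raw or canonical, raw and not canonical
    return canonical, False  # version 3, 5 and higher


print(match_path(5, "./foo", "./foo"))
print(match_path(5, "foo", "./foo"))
print(match_path(4, "./foo", "./foo"))
```

Under these assumptions, a rule such as `./foo` matches only in versions 1, 2, and 4, and version 4 flags it with a warning.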
This change allows for a gradual transition to the new behavior, while
[6 lines not shown]
[lldb][Mach-O] Bound export-trie symbol name length (#202947)
`ParseTrieEntries` assembles a symbol name by appending every edge label
along a trie path into a `std::string`. A corrupt export trie can encode
an edge label whose terminator is far away in the trie data, making a
single label many megabytes long. Appending it requests an unbounded
allocation, which can crash lldb while parsing the symbol table.
Reject a trie whose assembled name exceeds a sane bound (1 MiB) as
corrupt data, the same way an unterminated edge label is already
handled. Add a unit test covering an oversized edge label.
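The idea of bounding the assembled name can be sketched as below. This is a simplified model rather than lldb's `ParseTrieEntries`: the trie is a hypothetical nested dictionary instead of the Mach-O byte encoding, `collect_names` and `CorruptTrie` are made-up names, and the sketch raises an exception where the real parser would report the trie as corrupt.

```python
MAX_NAME_LEN = 1 << 20  # 1 MiB, the bound named in the commit message


class CorruptTrie(ValueError):
    """Raised when the trie looks corrupt (hypothetical error type)."""


def collect_names(node, prefix="", limit=MAX_NAME_LEN):
    """Assemble symbol names by appending edge labels along each trie path."""
    names = []
    if node.get("terminal"):
        names.append(prefix)
    for label, child in node.get("edges", []):
        # Check the length before building the string, so an oversized
        # label never triggers a huge allocation.
        if len(prefix) + len(label) > limit:
            raise CorruptTrie("assembled symbol name exceeds bound")
        names.extend(collect_names(child, prefix + label, limit))
    return names


# Hypothetical trie encoding the names "_main" and "_malloc".
trie = {"edges": [("_ma", {"edges": [("in", {"terminal": True}),
                                     ("lloc", {"terminal": True})]})]}
print(collect_names(trie))
```

Checking the length before concatenating is the point of the sketch: the oversized name is rejected without ever being materialized.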
Assisted-by: Claude
[lldb][docs] Generate the Python API enums page from headers (#202780)
The "Python API enumerators and constants" page was added by
3cae8b33297b as an explicit stop-gap: its contents were grepped out of
the headers by hand, with the few available doc strings copied over
manually. That commit noted the real fix would be a tool that parses the
enum/constant headers and emits the page automatically.
Being hand-maintained, the page drifted badly out of sync. By now it was
missing 19 enums and 60+ enumerators, still documented three values that
no longer exist, and carried stale descriptions.
Add gen-python-api-enums.py, which parses lldb-enumerations.h and
lldb-defines.h and emits the page at build time. It is pulled into
python_api_enums.md via the {build-include} directive, the same
mechanism already used for the generated settings page, so the page can
stay in sync with the source.
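As a rough illustration of the approach (not the actual gen-python-api-enums.py, whose details are not shown here), a header-to-page generator could look like the sketch below. The function names, the regular expressions, and the Markdown layout are all assumptions, and only simple C-style enums are handled.

```python
import re

# Matches "enum Name { ... }" for simple, non-nested enums (assumption).
ENUM_RE = re.compile(r"enum\s+(\w+)\s*\{(.*?)\}", re.S)


def parse_enums(header_text):
    """Return {enum_name: [enumerator, ...]} for simple C-style enums."""
    # Drop // and /* */ comments so they are not mistaken for enumerators.
    text = re.sub(r"//[^\n]*|/\*.*?\*/", "", header_text, flags=re.S)
    enums = {}
    for name, body in ENUM_RE.findall(text):
        items = [item.split("=")[0].strip() for item in body.split(",")]
        enums[name] = [item for item in items if item]
    return enums


def emit_markdown(enums):
    """Emit one heading per enum followed by a bullet per enumerator."""
    lines = []
    for name, items in enums.items():
        lines.append(f"## {name}")
        lines.extend(f"- `{item}`" for item in items)
        lines.append("")
    return "\n".join(lines)


# Shortened sample input for illustration only.
sample = """
enum StateType {
  eStateInvalid = 0, // comment
  eStateUnloaded,
};
"""
print(emit_markdown(parse_enums(sample)))
```

A real generator would also need to carry over doc comments and constants from the defines header, which this sketch deliberately ignores.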
Enums from the separately generated SBLanguages.h (eLanguageName*) are
[2 lines not shown]
[libc++] Strengthen optional value constructor noexcept
The standard does not require `optional<T>(U&&)` to be potentially throwing; it simply does not specify noexcept for the primary `optional<T>` converting constructor. Standard library implementations are permitted by [[res.on.exception.handling]/5](https://eel.is/c++draft/res.on.exception.handling#5) to strengthen exception specifications for non-virtual library functions, as long as the strengthened specification is correct.
GNU libstdc++ already does this:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/optional#L911-L913
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/optional#L962-L974
The existing libc++ code only added noexcept for the C++26 `optional<T&>` case, guarded by `is_lvalue_reference_v<_Tp>`. It is safe to remove that gate and use the general condition instead:
```
noexcept(is_nothrow_constructible_v<_Tp, _Up>)
```
For `optional<T&>`, this still becomes the intended reference-construction check. For ordinary `optional<T>`, it correctly reflects whether constructing `T` from `U` can throw. The constructor only forwards into the contained object construction and updates optional bookkeeping, so if `T` is nothrow-constructible from `U`, the optional construction is also nothrow.
[flang][OpenMP] Implicit declarations of procedures in DECLARE_TARGET (#201935)
This replaces commit 8f5df8891840b, since it was rejecting the following
case:
```
function baz(a)
!$omp declare target to(baz)
real, intent(in) :: a
baz = a
end
program main
!$omp declare target(baz)
integer, save :: baz ! error: 'baz' is already declared
end
```
Instead of flagging an error, the 'baz' in the directive should be
resolved to the explicitly declared variable.
[26 lines not shown]
[lldb] Expose SBProcess::IsLiveDebugSession() (#203111)
Expose the existing `Process::IsLiveDebugSession()` through the SB API
as `SBProcess::IsLiveDebugSession()`, letting clients distinguish a live
debuggee from a post-mortem session such as a core file or minidump. It
returns `false` when there is no underlying process, consistent with
other `SBProcess` query methods.
AMDGPU/GlobalISel: RegBankLegalize rules for gfx90a/gfx942 MFMAs (#194076)
Add rules for gfx90a/gfx942 MFMA/SMFMAC intrinsics.
I see some regressions with imm splat tests and stores that could have
taken agprs. I will try to address those in a follow-up patch.
[mlir][ROCM] Disable integration tests on shared library builds (#203114)
Recent ROCm builds cause conflicts when loading the HIP library into
mlir-rocm-runner when LLVM is built as a shared library (this manifests
as duplicate command-line options).
Fixing this properly would require dlopen()-ing the HIP libraries or
some other such workaround, which can be done later.
For now, disable these tests on such builds.
[LLDB] RISCV feature attribute support and allows overriding additional(default) feature (#147990)
Parse ELF attributes to automatically set disassembler features.
llvm-objdump calls ELFObjectFile::getFeatures, then turns that into a
cstr to pass to createMCSubtargetInfo.
The lldb disassembler builds features for various architectures manually
and adds in the value from the command line.
If this is empty, it uses the default. Then it turns that into a cstr
and passes it to createMCSubtargetInfo.
For Hexagon and RISC-V, parse the attributes, set up features, add
anything else needed.
If this is empty, pick the default.
Then turn into a cstr and pass to createMCSubtargetInfo (via
MCDisasmInstance::Create).
This patch adds RISC-V feature attribute support and allows overriding
the additional (default) features.
[3 lines not shown]
[GlobalISel] Add `or_and_xor_to_or` pattern from SelectionDAG (#201108)
This PR adds the `fold (or (and X, (xor Y, -1)), Y) -> (or X, Y)`
pattern from SelectionDAG to GlobalISel.
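The identity behind this fold can be checked exhaustively on a small bit width. The sketch below models 8-bit integers (an arbitrary choice for illustration) and is unrelated to the actual combiner code; the function names are hypothetical.

```python
MASK = 0xFF  # model 8-bit integers


def before(x, y):
    # (or (and X, (xor Y, -1)), Y); xor with -1 is bitwise NOT
    return ((x & (y ^ MASK)) | y) & MASK


def after(x, y):
    # (or X, Y)
    return (x | y) & MASK


def fold_is_sound():
    """Compare both forms for every pair of 8-bit values."""
    return all(before(x, y) == after(x, y)
               for x in range(256) for y in range(256))


print(fold_is_sound())
```

The check succeeds because (X & ~Y) | Y distributes to (X | Y) & (~Y | Y), and the second factor is all ones.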
---------
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
[NFC][SpecialCaseList] Introduce Matcher::matchInternal helper (#203097)
Extract the core matching logic (std::visit on the variant) from
Matcher::match into a private Matcher::matchInternal helper method. This
is a preparation step for implementing warning logic that will need to
call the matching logic multiple times.
Also make Matcher's M and Options member variables private as part of
this refactoring.
Assisted-by: Gemini
[NFC][SpecialCaseList] Introduce QueryOptions struct (#203098)
Refactor the RemoveDotSlash boolean parameter into a QueryOptions struct.
This struct will hold all matching options and simplifies the Matcher
constructor signature. This is a preparation step for adding more options
in subsequent patches.
Assisted-by: Gemini
[flang][acc] Add tests for implicit `acc declare` of type descriptors (#203100)
Adds 4 tests to cover different cases that require implicit `acc
declare` for type descriptors.