[VPlan] Remove unused InductionDescriptor VPDerivedIVRecipe constructor (#206583)
Both callers use the 5-argument (Kind, FPBinOp, ...) constructor; the
delegating InductionDescriptor overload has no users.
[flang][OpenMP] Add explicit return type to visitor lambdas (#206588)
This should silence MSVC (14.51.36231) error:
error C2338: static assertion failed: 'visit() requires the result of
all potential invocations to have the same type and value category
(N4950 [variant.visit]/5).'
e.g. https://lab.llvm.org/buildbot/#/builders/166/builds/9664
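A minimal sketch of the kind of mismatch that triggers this assertion, not the actual flang code. The `Value` alias and `textLength` function are hypothetical. Without the trailing return type, the `int` instantiation of the generic lambda would deduce `std::nullopt_t` and the `std::string` instantiation would deduce `std::size_t`, so `std::visit` would see two different result types.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical variant used only for illustration.
using Value = std::variant<int, std::string>;

// The explicit "-> std::optional<std::size_t>" makes every instantiation of
// the visitor return the same type, which is what std::visit requires.
std::optional<std::size_t> textLength(const Value &v) {
  return std::visit(
      [](const auto &alt) -> std::optional<std::size_t> {
        using T = std::decay_t<decltype(alt)>;
        if constexpr (std::is_same_v<T, std::string>) {
          return alt.size(); // would deduce std::size_t on its own
        } else {
          return std::nullopt; // would deduce std::nullopt_t on its own
        }
      },
      v);
}
```

Calling `textLength(Value{std::string("abc")})` yields an engaged optional, while `textLength(Value{42})` yields an empty one.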
[lldb] Add a BugReporter plugin type and "diagnostics report" (#206578)
Introduce a BugReporter plugin kind that files an assembled
Diagnostics::Report through a pluggable destination, plus a "diagnostics
report" command (aliased "bugreport") that collects the bundle and files
it through the first registered reporter.
CreateBugReporterInstance() returns the first registered reporter, so a
reporter registered earlier wins and a downstream tree can take over by
registering ahead of the built-ins. BugReporterNone is the
always-registered, last-in-order fallback. Its File() returns an error
pointing at LLDB_BUG_REPORT_URL, so the command surfaces "no tracker
configured" through the normal error path instead of special-casing it.
"diagnostics report" writes the bundle, prints a review warning, and
files it unless --no-open is given. The upcoming GitHub reporter, gated
by a CMake option, is the first real destination.
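The "first registered reporter wins, fallback returns an error" ordering described above can be sketched roughly as follows. This is not LLDB's plugin API: `Reporter`, `FileResult`, `ReporterRegistry`, `NoneReporter` and `EchoReporter` are hypothetical stand-ins, and the error text is a placeholder.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: ok == false carries an error message.
struct FileResult {
  bool ok;
  std::string message;
};

// Hypothetical reporter interface.
class Reporter {
public:
  virtual ~Reporter() = default;
  virtual FileResult File(const std::string &bundleDir) = 0;
};

// Fallback: always fails, so the caller reports the problem through its
// normal error path instead of special-casing "no reporter".
class NoneReporter : public Reporter {
public:
  FileResult File(const std::string &) override {
    return {false, "no tracker configured"};
  }
};

// Stand-in for a real destination (assumption, for illustration only).
class EchoReporter : public Reporter {
public:
  FileResult File(const std::string &bundleDir) override {
    return {true, "filed " + bundleDir};
  }
};

class ReporterRegistry {
public:
  using Factory = std::function<std::unique_ptr<Reporter>()>;

  // Registration order is priority order: earlier entries win.
  void Register(Factory factory) { factories_.push_back(std::move(factory)); }

  // Returns an instance from the first registered factory, or nullptr if
  // nothing has been registered.
  std::unique_ptr<Reporter> CreateFirst() const {
    if (factories_.empty())
      return nullptr;
    return factories_.front()();
  }

private:
  std::vector<Factory> factories_;
};
```

Registering `NoneReporter` last means it is only selected when nothing else is registered ahead of it.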
[VPlan] Pass CostCtx to makeMemOpWideningDecisions (NFC). (#206580)
makeMemOpWideningDecisions already uses 2 members (PSE, L) and will need
more in the future. Directly pass CostCtx.
AMDGPU: Migrate unittests to subarch triples
Specify the triple subarch instead of a processor name.
The register-limit helpers in AMDGPUUnitTests.cpp that enumerate every
valid CPU via fillValidArchListAMDGCN still pass the CPU explicitly, as
does the MC Disassembler smoke test (its C disassembler API derives the
subtarget from the CPU, not the triple subarch).
Co-authored-by: Claude (Opus 4.8) <noreply at anthropic.com>
clang: Start using new amdgpu subarch triples
Fix up invocations using --target=amdgcn + -mcpu to introduce
the subarch in the triple.
For offload toolchains, a single toolchain is constructed for the
top level amdgpu architecture, and the effective triple is used for
target specific tool invocations.
The specifics of the resource directory layout are TBD. This does
try to find resources in the subarch-named directory. The paths
are searched at toolchain creation time, so that does not work
when there are multiple subarches.
Fixes #154925
clang/AMDGPU: Stop passing redundant -target-cpu to cc1
Now that the exact target is encoded in the triple's subarch field,
-target-cpu is redundant. This avoids polluting the resultant IR with
unwanted "target-cpu" attributes. The net result is the desired codegen
when compiling libraries for a major subarch and linking them into a
program compiled for a specific arch. For example, compiling for
"gfx9-generic" would pollute the IR with "target-cpu"="gfx9-generic",
so codegen would ultimately be performed for the generic target even
after linking into the concrete gfx9 cpu. The specialization will now be
achieved by merging the triples, without the linker or optimization
passes needing to fix up function attributes.
clang/AMDGPU: Validate -target-cpu in cc1 is valid for the subarch
Restrict the reported list of valid target-cpus based on the triple's
subarch. This is more consistent with how other targets validate the
target CPU name. Currently, validation of the target name for the triple
is split between the driver and here. The driver-based diagnostic
seems to be an amdgpu-ism in 2 different places (though there is one arm
validation emitting the same diagnostic). In the future we could probably
drop those.
AMDGPU: Introduce amdgpu triple arch
Move towards using the triple for representing incompatible
ISA changes. Use the subarch field to represent the various
incompatible cases. Previously we pretended a single triple arch
was universally compatible, and only distinguished by function
level subtargets. Move towards using distinct triples to enable
more sophisticated toolchain handling in the future, like proper
runtime library linking.
Introduce a new subarch per unique ISA, but also introduce
"major subarches" which are compatible by a set of covered
minor ISA versions. These map to the existing generic targets.
There are a few placeholder subarch entries, which currently
have missing backing generic arches for codegen.
This should be the preferred triple arch name going forward,
but is treated as an alias of amdgcn. This does not yet change
clang to emit the new triples.
[2 lines not shown]
[mlir][acc] Lower sequential acc.loop to scf.for in ACCComputeLowering (#206165)
Sequential loops already have fixed parallelism, so represent them with
`scf.for` rather than `scf.parallel`. To prevent further analysis and
parallelization, `parDimAttr` is set to seq.
[HLSL] Implement codegen for copying cbuffer structs with resources (#204232)
Global-scope structs are in `hlsl_constant` address space and use
cbuffer layout. When those structs contain resources, the resources are
not stored inline in the constant buffer. Instead, they are represented
as separate globals or, in the case of resource arrays, initialized on
demand.
This change implements the HLSL codegen for cases where a cbuffer-backed
struct with embedded resources is copied into a local variable or passed
as a function argument. CodeGen materializes a temporary in the default
address space, copies the constant-data fields using the cbuffer struct
layout, and reconstructs the resource members in the local copy.
Fixes #182990
[PGO] Fix malformed raw profile test (#206574)
PR #190708 added a uniform counter pointer to the raw profile data
record, but a hand-written raw profile test gained one extra zero word
in the record.
Remove the extra word so the name section starts at the offset expected
by the reader. This fixes the regression while keeping the test focused
on the malformed counter pointer.
Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/24/builds/21581
Test:
```
~/git/scripts_shared/scripts/llvm/llvm-dev.sh llvm test llvm/test/tools/llvm-profdata/malformed-ptr-to-counter-array.test
```
[libc] Add regex AST and ExprPool (#198728)
Implemented the core AST nodes and the ExprPool arena-based allocator.
Utilised AllocChecker for memory safety and enforced hardening at node
initialisation.
Assisted-by: Automated tooling, human reviewed.
[lldb][Windows] also run tests with LLDB_TEST_USE_LLDB_SERVER=1 (#206511)
`LLDB_TEST_USE_LLDB_SERVER` defaults to 0, meaning that the green dragon
lldb config on Windows only tests lldb with the in-process plugin.
This patch runs the test suite both with and without lldb-server.
This only affects
https://ci-external.swift.org/job/lldb-windows/job/main/.
[BOLT] Fix use-old-text-zero-padding on FreeBSD
BSD od accepts only a decimal value for the -N parameter. To fix the test
failure, use a decimal value instead of a hex value in this test case.
[Clang] Switch to Default PIE on FreeBSD
We have started to compile the binaries in our base as PIE by default. It
makes sense for the toolchain to produce PIE by default as well, as it
now does on Linux. Also extend the test cases to cover default PIE and the
no-pie parameter in freebsd.c and hip-fpie-option.hip.
[mlir][acc] Improve verifier for workgroup memory operation (#206187)
Adds validity checks that the scaling and offset attributes are
non-negative.
[llvm][cas] Fix a couple of includes NFC (#206573)
Remove two unused includes, and add an include that was relying on
transitive includes. Noticed these in the diff between downstream and
upstream sources.
[AMDGPU] Add additional coverage tests for llvm.amdgcn.tanh, NFC (#202864)
Mainly for source modifiers: neg, abs and neg(abs)
---------
Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
[Clang] Refactor and consolidate color diagnostic handling (#202441)
Summary:
This PR tries to consolidate the color output handling in Clang. The
motivation was noticing that `-Xclang -ast-dump` would not behave like
`-fcolor-diagnostics` and would output ANSI codes to a file when I tried
to pipe it.
This PR primarily turns the handling into a tri-state enum keyed off of
`-f[no]-color-diagnostics`. The default/auto case will be if the target
stream supports colors. Getting this to work required a lot of seemingly
unrelated plumbing.
Co-authored-by: Cursor <cursoragent at cursor.com>
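The tri-state handling described in the commit above can be sketched as follows. This is an illustration only, not Clang's implementation: `ColorMode`, `parseColorMode` and `shouldUseColor` are hypothetical names, and the "last flag wins" rule is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical tri-state: Auto defers to what the output stream supports.
enum class ColorMode { Auto, On, Off };

// Assumption: the last -f[no-]color-diagnostics flag on the command line
// wins, and the absence of either flag means Auto.
ColorMode parseColorMode(const std::vector<std::string> &args) {
  ColorMode mode = ColorMode::Auto;
  for (const std::string &arg : args) {
    if (arg == "-fcolor-diagnostics")
      mode = ColorMode::On;
    else if (arg == "-fno-color-diagnostics")
      mode = ColorMode::Off;
  }
  return mode;
}

// Resolve the tri-state against the capability of the target stream.
bool shouldUseColor(ColorMode mode, bool streamSupportsColor) {
  switch (mode) {
  case ColorMode::On:
    return true;
  case ColorMode::Off:
    return false;
  case ColorMode::Auto:
    return streamSupportsColor;
  }
  return false; // not reached for valid enum values
}
```

Under this sketch, output piped to a file (a stream without color support) stays free of ANSI codes unless color is explicitly requested.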
[CodeGen] Migrate report_fatal_error from CodeGen headers (#203656)
Replace deprecated report_fatal_error references in
llvm/include/llvm/CodeGen with reportFatalInternalError or
reportFatalUsageError based on the failure category.
MachineFunctionProperties verification failures indicate an internal
codegen pipeline invariant failure. Default target hooks for unsupported
functionality use reportFatalUsageError. Also update stale pseudocode in
IRTranslator.h.
Part of #138914.
[lldb] Collect a diagnostics bundle on the Diagnostics class (#206189)
Add Diagnostics::Collect, which gathers the state a triager needs into a
directory, best-effort (one failed section never sinks the rest): the
always-on log plus the debugger's file logs, statistics.json from
DebuggerStats, and a snapshot of the commands run first when triaging
(target list, image list, thread list, backtraces, image lookup, frame
variable).
It returns a Diagnostics::Report with the LLDB version, host, and how
LLDB was invoked, plus an Attachments holding the bundle directory and
the files written into it. Each file is recorded as it is written, so a
file that could not be created is simply absent from the list. The
report is expected to grow more fields over time.
`diagnostics dump` now calls Collect and prints the report as JSON to
the terminal instead of only reporting where the directory was written.
Here's what this all looks like:
[18 lines not shown]
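The best-effort collection described in this commit (a failed section never sinks the rest, and only files actually written are recorded) can be sketched as below. This is not the `Diagnostics::Collect` implementation: `Section`, `Bundle` and `collectBestEffort` are hypothetical, and an in-memory map stands in for the bundle directory.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical: one section of the bundle. produce() returns the file
// contents, or std::nullopt if that section could not be gathered.
struct Section {
  std::string fileName;
  std::function<std::optional<std::string>()> produce;
};

// Hypothetical in-memory stand-in for the bundle directory.
using Bundle = std::map<std::string, std::string>;

// Runs every section in order. A failing section is skipped rather than
// aborting the loop, and a file name is recorded only after its contents
// have been stored, so failed sections are simply absent from the result.
std::vector<std::string> collectBestEffort(const std::vector<Section> &sections,
                                           Bundle &bundle) {
  std::vector<std::string> written;
  for (const Section &section : sections) {
    std::optional<std::string> contents = section.produce();
    if (!contents)
      continue; // best-effort: keep going with the remaining sections
    bundle[section.fileName] = *contents;
    written.push_back(section.fileName);
  }
  return written;
}
```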