[LFI][Doc] Update documentation for planned features (#192128)
This PR updates the LFI documentation to be a little less
AArch64-specific in anticipation of x86-64 support. I've also updated
the planned names for the `no-lfi-stores`/`no-lfi-loads` features, and
updated the planned rewrite sequence for `x30` modifications to make it
more PAC-compatible for when we include support for that.
[RISCV] Support emitting plui.h for i32 constants on RV64. (#192534)
If the constant was originally i32, it will be sign or zero
extended to i64 during type legalization. If we can prove the
upper bits aren't used we can duplicate the lower bits to allow
RISCVMatInt to select plui.h.
[lit] Fix tests when run via symlinks (#192530)
If the path to Inputs/filter-failed was a symlink, it would copy the
symlinks, and then edit the same files, leading to flaky failures if the
tests ran in parallel.
[ThinLTO] Drop !inline_history metadata when importing functions (#192564)
In #190876 we now have functions in ValueAsMetadata (!inline_history
metadata). This has caused undefined symbol linker errors in some
ThinLTO builds. The following is what's going on:
@f in module A is getting imported from module A to module B, and it has
a call with !inline_history pointing to @g in module A, so a declaration
for @g is also imported into module B. But @g gets internalized in
module A, causing the undefined symbol error at link time due to
memprof's ICP in module B creating a call to @g since we can ICP a call
to any declaration.
To avoid pulling in a function declaration that may be wrong, simply
drop !inline_history metadata when importing functions. They aren't
necessary for correctness, they only prevent inlining explosion in some
recursive edge cases. Worst case is we do another round of inlining
through mutually recursive functions and then stop again due to newly
added !inline_history metadata, which should be fine; the inlining
explosion typically happens because we keep inlining through mutually
recursive functions.
[SSAF][WPA] Add no-op PointerFlow and UnsafeBufferUsage analysis
We need no-op PointerFlow and UnsafeBufferUsage analyses for the
analysis that depends on their summary data.
Refactored PointerFlow and UnsafeBufferUsage serialization for code
sharing.
rdar://174874942
[lldb] Assert lack of trailing period or newlines in diagnostics (#191447)
This PR adds an assert to `CommandReturnObject::{AppendNote,
AppendWarning}` to ensure the diagnostics don't end with a newline,
which is added by the function, or a period, which goes against the
coding standards.
I added a little helper that asserts in assert-enabled builds and trim
the diagnostic otherwise. I know that goes against the notion that
"asserts are preconditions" and therefore you shouldn't handle the case
where they don't hold (something I generally advocate for) but I think
we should prioritize a consistent user experience over purity.
We should do the same thing for `AppendError`, but currently there are
still too many violations that need to be cleaned up and if the compiler
emits non-compliant diagnostics, we may not be able to do this at all.
[NewPM] Adds a port for AArch64PostLegalizerLowering (#190718)
Standard porting (extraction into a helper function shared across legacy
and new PM passes).
Dropped unused include `TargetPassConfig.h`
[llvm-lit] Error on malformed `.lit_test_times` entries instead of stack trace (#191305)
When running `llvm-lit`, I sometimes hit a traceback, because a
`.lit_test_times.txt` file has got corrupted (not sure how).
However, it's non-obvious what the issue is (you just get a traceback),
so I've fixed this as follows:
`read_test_times()` currently assumes every line in
`.lit_test_times.txt`
contains a floating-point time followed by a test path. If the file
contains a blank line, a line without a path, or a non-numeric time,
`llvm-lit` will now error with: `fatal: found malformed timing data in`
+ filename.
Add test coverage for this too.
[flang][acc] Accept component reference in non-global `acc declare` (#192563)
The current TODO was being issued for all cases of `acc declare`
including ones which are treated as a subroutine-scope lifetime. Since
the latter use normal data mapping clauses without the need for
ctors/dtors, accept them. However, still emit TODO for the cases where
component references are `acc declare`d in global context.
[lldb] Rally around triple rather than arch in the API tests (#191416)
This PR removes as much uses of arch as possible, in favor of using
triple directly. Most of the changes are in the builder, which no longer
passes `ARCH` to Make, and of course in Makefile.rules.
This significantly simplifies the remote Darwin test suite, as it
previously had to try and piece together the triple from the platform
and the arch. As an added benefit, we now go through the same code path
for host and remote test runs.
I have tested this on Darwin and Linux and made the changes with the
remote test suites in mind, but it's possible I missed something not
caught by my local testing.
[AMDGPU] Mark ASYNCMARK as meta instruction to fix hazard cycle miscounting (#189981)
ASYNCMARK emits no hardware code it is used for tracking purpose but was
not marked as meta, causing getNumWaitStates to return 1 and
GCNHazardRecognizer to incorrectly count it as a pipeline cycle.
This patch marks ASYNCMARK as meta-Instruction so it correctly reports 0
wait states.
Fixes: #186878
[flang][OpenACC] Support acc routine info on ProcEntityDetails for separate compilation (#192367)
When !$acc routine(name) vector is used in a caller for an external
subroutine, the symbol has ProcEntityDetails (not SubprogramDetails).
The routine info (vector/worker/gang/seq) was silently lost because
AddRoutineInfoToSymbol only handled SubprogramDetails, and CallInterface
only checked SubprogramDetails for openACCRoutineInfos.
Add openACCRoutineInfos storage to ProcEntityDetails and handle it in
both AddRoutineInfoToSymbol and CallInterface so the parallelism level
is properly lowered to acc.routine with the correct keyword.
[flang][cuda][openacc] use the ultimate symbol to set the implicit device attribute (#192553)
The attribute was not applied when the symbol had a UseDetails. Use the
ultimate symbol so we get the proper ObjectEntityDetails to apply the
implicit attribute.
[flang][acc] Fix crash on collapse(force:N) with non-tightly nested loops (#191310)
When collapse(force:N) is applied to non-tightly nested loops, the
compiler could crash or generate redundant inner loops.
Crashes occurred because getNestedEvaluations() was called without
checking hasNestedEvaluations() first. Add guards in hasEarlyReturn(),
createRegionOp(), and the collapse-force sinking logic in Bridge.cpp.
Redundant inner loops were generated because processDoLoopBounds
absorbed N levels of do-loops into the outer acc.loop, but the PFT
walker still generated separate acc.loop ops for those same loops.
Fix by tracking absorbed DoConstruct* pointers in visitLoopControl
and skipping them in genFIR(DoConstruct).
[lldb] Convert TestPtrAuth.py from an inline to a regular test (NFC) (#192705)
This PR changes TestPtrAuth.py from an inline to a "regular" API test.
The motivation for this is #191416 and the need to specify parameters to
the build.
[Github][CI] Add note about AI tools in good-first-issue text (#173109)
After https://github.com/llvm/llvm-project/pull/172515, we have a new
paragraph in LLVM policy about AI:
> The one exception we reserve is for GitHub issues labelled with the
“good first issue” label. These issues are selected by LLVM contributors
to help newcomers get familiar with the code base. Thus, it makes no
sense to fix them using AI tools. Using AI tools to fix issues labelled
as “good first issues” is forbidden.
We should add disclosure about it in the introduction note for
developers to see clearly.
---------
Co-authored-by: Reid Kleckner <rkleckner at nvidia.com>
[AMDGPU] Stop coercing structs with FP and int fields to integer arrays (#185083)
Fixes #184150
This PR fixes the ABI lowering code for small aggregates (≤64 bits) on
AMDGPU targets to selectively coerce based on element types:
- Structs containing only sub-32-bit integers (char, short): Continue to
coerce to i16/i32/[2 x i32] for efficient register packing
- Structs containing floats or full-sized integers (i32, i64, float,
double): Preserve original types using ABIArgInfo::getDirect() without
coercion
Previously, ALL small aggregates were unconditionally coerced to integer
types. A struct like { float, int } would be lowered to [2 x i32],
losing the floating-point type information. This prevented attaching
FP-specific attributes like nofpclass to the float
component.
[35 lines not shown]
[OpenACC] Require a complete type for vars-with-restrictions (#192680)
The bug report shows a case where an incomplete type was passed to a
var-list in a clause that has a restriction. Only the 'private',
'firstprivate', and 'reduction' clauses have such restrictions on what
they can reference, so only those will cause problems.
This patch adds a 'completeness' requirement for all 3 of those to make
sure we can properly enforce our restrictions.
Fixes: #192664
[lldb] Don't adopt in the ExecutionContext from auto-continue events (#191433)
When a breakpoint auto-continues, the event handler receives a "stopped
but restarted" event. During the transition where we step over the
breakpoint (before continuing), the public state hasn't yet been set to
running. This caused the `DefaultEventHandler` to call
`ExecutionContextRef` with `adopt_selected=true`, which would fetch
stale thread/frame state and needlessly (and incorrectly) interrupt the
target to compute the execution context (used by the statusline). This
PR fixes that by not doing that.
Fixes #190956
Co-authored-by: Jim Ingham <jingham at apple.com>
[PowerPC] Add builtins for Post Quantum Cryptography Acceleration (#184717)
This patch implements Post Quantum Cryptography (PQC) Acceleration
builtins for PowerPC's future ISA by ensuring that vector operations
(vec_add, vec_sub, vec_mul, vec_mulh) correctly map to VSX instructions
(xvadduwm, xvadduhm, xvsubuwm, xvsubuhm, xvmuluwm, xvmuluhm, xvmulhsw,
xvmulhsh, xvmulhuw, xvmulhuh) when targeting mcpu=future.
Implement new builtin for vec_mulh:
* vector short vec_mulh(vector signed short, vector signed short)
* vector unsigned short vec_mulh(vector unsigned short, vector unsigned
short)
Assisted by AI.