[AArch64] Fix wrong AArch64Subtarget construction. (#172942)
The AArch64Subtarget construction was wrong for two reasons: firstly,
createMCSubtargetInfo() does not create an AArch64Subtarget object, and
secondly, the target CPU and features were left blank. This has been
benign so far since no methods were called that depended on this, but it
is undefined behavior for the first reason, and for the second reason it
creates the subtarget info in a state that the user did not request.
This commit fixes both issues.
[clang-tidy][NFC][Docs] Fix typo in bugprone-macro-parentheses (#173101)
The link title `CERT C Coding Standard rule PRE20-C` should read
`PRE02-C` to match the link target.
[flang] Correctly buffer warnings in Semantics/check-call.cpp (#172738)
There are calls to semantics::SemanticsContext::Warn() in check-call.cpp
that are not properly directing their output to the local message
buffer, so they can appear unconditionally in the output of the
compiler. This is a problem for generic interface resolution, which
checks procedure actual arguments against specific procedures using this
code, buffering the messages that might appear, and discarding the
messages for failed matches. Worse, the bogus warnings that escape the
buffering can be associated with completely unrelated locations.
Fix by passing the local message buffer to these Warn() calls.
(I couldn't come up with a good reduced test case, and am not sure that
the original code can be copied for use as one.)
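The buffering pattern described above can be sketched roughly as follows. This is a minimal illustration, not flang's code: `MessageBuffer`, `check_call`, `resolve_generic`, and the "untyped" warning rule are all hypothetical names invented for the sketch. The point it shows is that each candidate check writes into a local buffer, and only the buffer of the successful match is published.

```python
class MessageBuffer:
    """Collects warnings instead of emitting them immediately."""

    def __init__(self):
        self.messages = []

    def warn(self, text):
        self.messages.append(text)


def check_call(actuals, specific, buffer):
    """Check actual argument types against one specific procedure.

    Every warning goes to `buffer`, never straight to the final output.
    """
    name, dummies = specific
    if len(actuals) != len(dummies):
        return False
    for position, (actual, dummy) in enumerate(zip(actuals, dummies), start=1):
        if actual == "untyped":
            # Invented rule for this sketch: an untyped actual matches any
            # dummy argument but is worth a warning.
            buffer.warn(f"{name}: argument {position} is untyped")
        elif actual != dummy:
            return False
    return True


def resolve_generic(actuals, specifics, output):
    """Return the name of the first matching specific, or None."""
    for specific in specifics:
        local = MessageBuffer()
        if check_call(actuals, specific, local):
            # Only the successful match publishes its warnings.
            output.messages.extend(local.messages)
            return specific[0]
        # Failed match: `local` is dropped together with its warnings.
    return None
```

If `check_call` wrote to `output` directly, a warning raised while testing a specific that later fails to match would still reach the user, which is the kind of leak the commit message describes.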
[flang] Improve scan for dummy argument type declarations (#172706)
We can handle a forward reference to an explicitly typed integer dummy
argument when its name appears in a specification expression, rather
than applying the active implicit typing rules, so long as the explicit
type declaration statement has a literal constant kind number. Extend
this to also accept INTEGER(int_ptr_kind()) or another function
reference with no actual arguments.
[flang] Extension: Allow POINTER,INTENT(IN) passed objects (#172175)
ISO Fortran now accepts a non-pointer actual argument to associate with
a dummy argument with the POINTER attribute if it is also INTENT(IN), so
long as the actual argument is a valid target for the pointer. But
passed-object dummy arguments still have a blanket prohibition against
being pointers in the ISO standard. Relax that constraint in the case of
INTENT(IN) so that passed objects can also benefit from the feature.
Fixes https://github.com/llvm/llvm-project/issues/172157.
[libc][docs] Update website to reflect new strategy (#168637)
The LLVM-libc goals are updated to better reflect the strategy shared
at the LLVM dev meeting 2025.
[llvm-profdata][StaticDataLayout] Print summary of data access profiles in llvm-profdata (#173087)
This gives some aggregated information about the data access profiles.
The summaries are computed on the fly to save a profile version update.
Implementation-wise:
* `MemProfSummary::printSummaryYaml` is updated to print data access
summaries for v4, the profile version that started to support data
access profiles.
* MemProfSummary.cpp has a FIXME comment to serialize the summary into
profile data, ideally batched with a more substantial profile format
change.
* MemProfSummaryBuilder is not updated for now. This class is used to
serialize memprof summaries for v4 and above by memprof writer, and to
construct memprof summaries for v3 and prior versions by llvm-profdata.
[LV] Check Addr in getAddressAccessSCEV in terms of SCEV expressions. (#171204)
getAddressAccessSCEV previously had some restrictive checks that limited
pointer SCEV expressions passed to TTI to GEPs with operands that must
either be invariant or marked as inductions.
As a consequence, the check rejected things like `GEP %base, (%iv + 1)`,
while the SCEV for the GEP should be as easily analyzable as for `GEP
%base, %v`, with the only difference being that the AddRec start is
adjusted by 1.
This patch changes the code to use a SCEV-based check, limiting the
address SCEV to be loop invariant, an affine AddRec (i.e. induction),
or an add expression of such operands or a sign-extended AddRec.
This catches all existing cases getAddressAccessSCEV caught, plus
additional ones like the cases mentioned above.
This means we pass address SCEVs in more cases, giving the backends a
[16 lines not shown]
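The shape check described above can be modelled on a toy expression tree. This is only a sketch under stated assumptions: the classes `Invariant`, `AddRec`, `SExt`, `Add`, and `Unknown` are hypothetical stand-ins, not LLVM's SCEV classes, and the sketch takes one reading of the description, namely that a sign-extended AddRec is accepted only as an operand of an add and must itself be affine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invariant:
    name: str


@dataclass(frozen=True)
class AddRec:
    start: object
    step: object
    affine: bool = True


@dataclass(frozen=True)
class SExt:
    operand: object


@dataclass(frozen=True)
class Add:
    operands: tuple


@dataclass(frozen=True)
class Unknown:
    name: str


def is_invariant_or_affine_addrec(expr):
    """Loop-invariant values and affine AddRecs are always acceptable."""
    if isinstance(expr, Invariant):
        return True
    return isinstance(expr, AddRec) and expr.affine


def is_sext_of_affine_addrec(expr):
    # Assumption of this sketch: the extended AddRec must be affine.
    return (
        isinstance(expr, SExt)
        and isinstance(expr.operand, AddRec)
        and expr.operand.affine
    )


def is_supported_address(expr):
    """Accept invariants, affine AddRecs, or adds built from such operands."""
    if isinstance(expr, Add):
        return all(
            is_invariant_or_affine_addrec(op) or is_sext_of_affine_addrec(op)
            for op in expr.operands
        )
    return is_invariant_or_affine_addrec(expr)
```

In this model, an address like `GEP %base, (%iv + 1)` is just an `AddRec` whose start differs by one, so it passes the same check as the unadjusted form, whereas an operand-by-operand test on the GEP would have needed `%iv + 1` itself to be a recognized induction.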
[msan][NFCI] Remove element-size override for VNNI intrinsics (#172762)
MSan's handleVectorPmaddIntrinsic had an EltSizeInBits parameter to
override the incorrect element size for VNNI intrinsics. Now that the
element size has been corrected
(https://github.com/llvm/llvm-project/issues/97271), it is no longer
necessary to override the element size.
This patch also updates the comments.
[clang] Apply cfi_unchecked_callee rules to -fsanitize=function (#170725)
Apply the normal rules that prevent instrumentation of indirect calls
to `cfi_unchecked_callee` function types and `cfi_unchecked_callee`
functions when using `-fsanitize=function`. While it's technically
separate from `-fsanitize=cfi`, this particular UBSan mode checks for
similar control flow bugs, so it makes sense to also prevent those
control flow checks from being added onto `cfi_unchecked_callee`
functions.
[SystemZ] Update CodeGen/SystemZ/tdc-05.ll test file (#172437)
This PR updates `llvm/test/CodeGen/SystemZ/tdc-05.ll` using
`llvm/utils/update_llc_test_checks.py` to refresh the expected output.
The updated checks reflect the current output of llc and reduce noise in
future diffs.
[AMDGPU] Limit allocation of lo128 registers for occupancy
The parent change allows allocation of lo128 VGPRs from all 4 banks.
That may result in an undesired allocation that leaves a hole of up
to 128 registers, for example if v0-v127 are allocated and v128-v255
are free.
Limit the available allocation order to the occupancy. Both hard
occupancy limits and occupancy achieved during scheduling are
considered. It is better to spill a register than to drop occupancy
in this case.
[AMDGPU] Allow allocation of lo128 registers from all banks
We can encode 16-bit operands in a short form for VGPRs [0..127].
When we have 1K registers available we can in fact allocate 4
times more from all 4 banks. That, however, requires an allocatable
class for these operands. While for most of the instructions it will
result in the longer VOP3 form, for V_FMAAMK/FMADAK_F16 it will
simply prohibit the encoding because these do not have VOP3 forms.
A straightforward solution would be to create a register class
with all registers having bit 8 of the encoding zero, i.e. to
create a register class with holes punched in it: [0-127, 256-383,
512-639, 768-895]. LLVM, however, does not like register classes
with punched holes when they also have subregisters. The
cross-product of all classes explodes and some combinations of a
'class having a common subreg with another' become impossible. Just
doing so explodes our register info to 4+ GB, which is uncompilable
too.
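The membership of the register class with holes can be reproduced with a few lines. This is an illustrative sketch only: the function names are hypothetical, and the bank size of 256 is inferred from the ranges quoted above rather than taken from the target description.

```python
def lo128_with_holes(total_vgprs=1024, bank_size=256, low_per_bank=128):
    """Return VGPR indices that fall in the low 128 registers of each bank."""
    return [reg for reg in range(total_vgprs) if reg % bank_size < low_per_bank]


def as_ranges(registers):
    """Collapse a sorted list of indices into inclusive (first, last) ranges."""
    ranges = []
    for reg in registers:
        if ranges and reg == ranges[-1][1] + 1:
            # Extend the current run.
            ranges[-1] = (ranges[-1][0], reg)
        else:
            # Start a new run after a hole.
            ranges.append((reg, reg))
    return ranges
```

Calling `as_ranges(lo128_with_holes())` yields the four ranges listed above, 512 registers in total, with a 128-register hole after each run.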
The solution proposed is to define _lo128 RC with contiguous 896
[17 lines not shown]
[StaticDataLayout] Sort records before printing them in text format (#172592)
Sort records before printing them to make the output more readable
and easier to compare.
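The effect of sorting before printing can be shown with a small sketch. The record shape here (a symbol name mapped to an access count) and the function name `format_records` are assumptions made for illustration, not the actual profile record layout.

```python
def format_records(records):
    """Render records as text, ordered by symbol name.

    `records` is a hypothetical mapping of symbol name to access count.
    """
    lines = []
    for symbol, count in sorted(records.items()):
        lines.append(f"{symbol}: {count}")
    return "\n".join(lines)
```

Because the order no longer depends on how the records were inserted, two dumps of the same data produce identical text and can be diffed directly.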