Forward declare TextEncodingConverter in TextEncoding.h, move config.h into TextEncoding.cpp (#207382)
This patch forward declares TextEncodingConverter in
clang/include/clang/Lex/TextEncoding.h, and moves the config.h include
out of the header and into llvm/lib/Support/TextEncoding.cpp.
[Clang] Fix crash on subscripting a complete matrix subscript expression (#207317)
Subscripting a complete MatrixSubscriptExpr (which has scalar type)
caused an assertion failure in ActOnArraySubscriptExpr because the code
unconditionally asserted isIncomplete() on any MatrixSubscriptExpr base.
Fix by guarding the matrix subscript path with an isIncomplete() check,
allowing complete matrix subscript expressions to fall through to the
standard subscript handling, which emits an appropriate diagnostic.
Fixes #203163
[AArch64] Fix ReconstructShuffle for known vscale>1 (#205099)
The code in AArch64TargetLowering::ReconstructShuffle expects
NEON-compatible types. However, with e.g. vscale_range = {2}, we can get
legal fixed-length vectors that are wider than 128 bits.
[Clang] Remove unused TokenKey::KEYNOZOS (#207132)
KEYNOZOS was defined as a TokenKey flag to mark keywords not supported
on z/OS, but no keyword in TokenKinds.def actually uses it. This patch
removes the unused enum value and its associated handling code.
Build: `ninja clang` succeeded (2923/2923 targets).
Tests: `ninja check-clang` passed — 51180 passed, 0 failed.
AI assistance was used for code review analysis and CI failure
debugging.
Fixes #206877
Co-authored-by: Chenguang Ding <dingchenguang at kylinos.cn>
[analyzer][docs] Fix invalid MyST toctree 'numbered' option after Markdown migration (#207217)
The RST-to-Markdown migration (#206181) converted the RST flag
`:numbered:` into `:numbered: true`.
MyST parses the toctree `numbered` option as `int_or_nothing`, so the
string `true` fails with:
```
'toctree': Invalid option value for 'numbered': true:
invalid literal for int() with base 10: 'true'
```
This breaks the `-W` (warnings-as-errors) `docs-clang-html` build.
Make `numbered` a valueless flag, which MyST accepts (equivalent to the
original RST behavior of numbering all levels).
Assisted-By: claude
[Clang][SVE ACLE] Remove +bf16 requirement from neon-sve bridge builtins. (#205332)
These builtins only care about the size of the element type and do not
require bfloat specific instructions.
[AMDGPU] Accept sext addresses when folding image ops to a16 (#203189)
canSafelyConvertTo16Bit() only accepts a zext when narrowing image
address coordinates to 16 bits. Add an opt-in AllowI16SExt flag so a
sext from i16 is accepted too, and enable it for sampler-less image
instructions.
Coordinates of sampler-less loads/stores are unsigned, so sext and zext
only disagree for a negative i16 (>= 0x8000), which is already out of
bounds since the maximum image dimension is <= 0x8000. Accepting the
sext therefore lets such coordinates fold to the a16 form, reducing VGPR
pressure.
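The sext/zext argument above can be checked with a small standalone
sketch. The helper names (`zext16`, `sext16`, `countDisagreements`) are
made up for illustration and are not part of the patch; the code only
models the bit-level claim, not the AMDGPU folding itself.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend a 16-bit bit pattern to 32 bits.
constexpr std::uint32_t zext16(std::uint16_t V) { return V; }

// Sign-extend a 16-bit bit pattern to 32 bits: replicate bit 15 into the
// upper half instead of relying on a narrowing signed conversion.
constexpr std::uint32_t sext16(std::uint16_t V) {
  return (V & 0x8000u) ? (0xFFFF0000u | V) : std::uint32_t{V};
}

// Count the 16-bit patterns on which the two extensions disagree.
inline unsigned countDisagreements() {
  unsigned N = 0;
  for (std::uint32_t V = 0; V <= 0xFFFFu; ++V) {
    const auto P = static_cast<std::uint16_t>(V);
    if (zext16(P) != sext16(P))
      ++N;
  }
  return N;
}

// The extensions agree right up to 0x7FFF and first differ at 0x8000.
static_assert(zext16(0x7FFF) == sext16(0x7FFF), "agree below 0x8000");
static_assert(zext16(0x8000) != sext16(0x8000), "differ from 0x8000 on");
```

Every disagreeing pattern has bit 15 set, so it is at least 0x8000 when
read as an unsigned coordinate.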
Co-authored-by: Barbara Mitic <Barbara.Mitic at amd.com>
[VPlan] Optimize pre-increment IV latch users with tail folding (#206499)
This was noticed after #204089 caused IndVarsSimplify to convert some
live-out IV users to use the pre-incremented IV rather than the
post-incremented one.
Tail folded live-outs don't have the `(extract-last-lane
(extract-last-part foo))` form, but instead have the form `(extract-lane
(last-active-lane header-mask), foo)`.
For post-incremented IVs in tail folding, these are converted to
VPInstruction::ExitingIVValue which are handled separately. But
ExitingIVValue can't be used for the pre-incremented IV. So this teaches
optimizeLatchExitInductionUser to detect the last-active-lane of the
header mask form.
[ADT][NFC] Remove unused includes in DenseMap/DenseSet headers (#207282)
Remove unused includes in DenseMap/DenseSet headers.
`llvm/Support/AlignOf.h` was transitively included in
`llvm/Support/JSON.h`.
[mlir][OpenMP] Change device declare target functions to hidden visibility (#207234)
During OpenMP lowering, globally visible device functions are emitted.
These functions might not be kernels themselves, but are designed to
only be called in a kernel context. However, if they are unused, not
inlined, and reference LDS, the AMDGPU ISel emits many misleading
warnings of the form "local memory global used by non-kernel function".
Fix by changing visibility from external+default to external+hidden,
which allows DCE to just remove the functions.
Claude assisted with this patch.
[M68k] Fix build after removal of RegisterClasses pointer array (#207364)
Commit 4d8ec1968023 ("[CodeGen][NFC] Remove RegisterClasses pointer
array (#207204)") removed regclass_begin()/regclass_end() from
TargetRegisterInfo, so those names now resolve to the MCRegisterInfo
versions whose iterator dereferences to a MCRegisterClass rather than a
const TargetRegisterClass *, breaking getMaximalPhysRegClass():
error: cannot convert 'const llvm::MCRegisterClass' to
'const llvm::TargetRegisterClass*' in initialization
M68k was not updated in that commit. Switch to the range-based
regclasses() idiom used elsewhere in the same change.
Regressor: 4d8ec1968023 ("[CodeGen][NFC] Remove RegisterClasses pointer
array") (#207204)
[AArch64] Minor simplification in aarch64-ldst-opt with an early return (#207182)
Remove the local `MBBIWithRenameReg` by moving an early return to an
even earlier point.
When `MBBIWithRenameReg` is set we always return early. By moving the
early return to the point where `MBBIWithRenameReg` is updated, we get
rid of a local variable whose scope spans 200+ lines. This also fixes a
misleading debug print between the `MBBIWithRenameReg` update and the
early return:
```
LLVM_DEBUG(dbgs() << "Unable to combine these instructions due to "
<< "interference in between, keep looking.\n");
```
This line shouldn't be printed when we set `MBBIWithRenameReg`, which is
fixed with this change.
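A simplified model of the control-flow change, for illustration only.
`Candidate`, `scanBefore`, `scanAfter` and the `Log` vector are
hypothetical stand-ins and do not mirror the real aarch64-ldst-opt code;
the sketch only shows why returning at the update point drops both the
local and the stray message.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction visited during the scan.
struct Candidate {
  int Id;
  bool HasInterference; // cannot be combined directly
  bool Renamable;       // could be combined after renaming a register
};

// Before: record the renamable candidate in a local, fall through the
// debug message, and only then return it.
inline std::optional<int> scanBefore(const std::vector<Candidate> &Cs,
                                     std::vector<std::string> &Log) {
  std::optional<int> WithRenameReg; // long-lived local
  for (const Candidate &C : Cs) {
    if (!C.HasInterference)
      return C.Id;
    if (C.Renamable)
      WithRenameReg = C.Id;
    Log.push_back("keep looking"); // misleading if the local was just set
    if (WithRenameReg)
      return WithRenameReg;
  }
  return std::nullopt;
}

// After: return at the point where the candidate is found, so the local
// disappears and the message is only logged when the scan continues.
inline std::optional<int> scanAfter(const std::vector<Candidate> &Cs,
                                    std::vector<std::string> &Log) {
  for (const Candidate &C : Cs) {
    if (!C.HasInterference)
      return C.Id;
    if (C.Renamable)
      return C.Id;
    Log.push_back("keep looking");
  }
  return std::nullopt;
}
```

Both versions return the same candidate; only the logging differs.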
[X86] haddsub-undef.ll - sync more testnames with their phaseordering equivalents (#207370)
Ensure we have equivalent hadd/sub middle-end test coverage with similar names for lookup.
[flang][Driver] Add option for real sum reassociation
Add a compiler driver option for #207371: -freal-sum-reassociation. It
is listed only in the hidden help for now and is disabled by default.
Assisted-by: Codex
[libc++][ranges] Enable CPO compile tests (#207123)
`adjacent_transform_view` and `stride_view` were implemented but the
test cases were omitted.
Co-authored-by: Hristo Hristov <zingam at outlook.com>
[ADT][NFC] Remove unused ValueInfoT from DenseSetImpl (#207277)
The `ValueInfoT` template type was unused and is removed in this patch.
Co-authored-by: Nikita Popov <github at npopov.com>
[flang][Lower] Add alternative real expression lowering
This is opt-in via an engineering option and disabled by default.
In section 10.1.5.2.4 of the 2023 Fortran standard "Evaluation of
numerical intrinsic operations", the standard explicitly allows
alternate mathematically equivalent lowerings. For example, the source
expression X + Y + Z could be evaluated as (X + Y) + Z, X + (Y + Z), or
even (X + Z) + Y, etc.
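These orderings are equal over the reals but need not be bit-identical
in IEEE 754 arithmetic. A minimal sketch in C++ (not Fortran, and not
code from this patch; `Sums` and `evaluateOrders` are illustrative
names), assuming default floating-point semantics with no fast-math
flags:

```cpp
#include <cassert>

// Results of the three evaluation orders named above.
struct Sums {
  double LeftFirst;  // (X + Y) + Z
  double RightFirst; // X + (Y + Z)
  double OuterFirst; // (X + Z) + Y
};

// Evaluate X + Y + Z in each order, with explicit parentheses.
inline Sums evaluateOrders(double X, double Y, double Z) {
  return {(X + Y) + Z, X + (Y + Z), (X + Z) + Y};
}
```

With X = 1e20, Y = -1e20 and Z = 1.0, the first order cancels the large
terms before adding Z and yields 1.0, while the other two absorb Z into
a large term first and yield 0.0.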
The open source benchmark SNBone shows significantly better results with
classic flang because classic flang emits real arithmetic expressions in
a different order. In the case of this benchmark it reduces dependency
depth for instructions issued to the vector unit, allowing for more of
the arithmetic to be parallelised over multiple vector execution units
in the ALU.
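To make "dependency depth" concrete, the sketch below compares a strict
left-to-right chain with a pairwise grouping of the same operands. This
is only an illustration: the text above does not say which ordering
classic flang emits, and `chainDepth` and `pairwiseDepth` are
hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>

// Longest chain of dependent additions when N operands are summed
// strictly left to right: ((a0 + a1) + a2) + ...
constexpr std::size_t chainDepth(std::size_t N) {
  return N < 2 ? 0 : N - 1;
}

// Longest chain when operands are paired up level by level:
// (a0 + a1) + (a2 + a3), and so on. Each level halves the operand
// count, rounding up for an unpaired operand.
constexpr std::size_t pairwiseDepth(std::size_t N) {
  std::size_t Depth = 0;
  while (N > 1) {
    N = (N + 1) / 2;
    ++Depth;
  }
  return Depth;
}

static_assert(chainDepth(8) == 7, "eight operands, seven dependent adds");
static_assert(pairwiseDepth(8) == 3, "eight operands, three levels");
```

Additions on the same level of the pairwise form do not depend on each
other, so they can be issued to separate execution units.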
The lowering added by this patch tries to mimic the way classic flang
orders instructions for these expressions. I did not read any classic
[36 lines not shown]