LLVM/project 54a7896 llvm/lib/ExecutionEngine/JITLink COFF_x86_64.cpp, llvm/test/ExecutionEngine/JITLink/x86-64 COFF_dllimport_iat.s

[JITLink][COFF] Synthesize __imp_ IAT entries (#203906)

Adds a default COFF/x86_64 JITLink pass that synthesizes `__imp_` Import
Address Table (IAT) entries for dllimport references. This allows COFF
objects using dllimport to be JIT-linked without a hand-built import library or
a special generator.

On COFF, `__declspec(dllimport)` codegen emits indirect accesses through a named
`__imp_X` symbol (`callq *__imp_bar(%rip)`; `movq __imp_g(%rip)` for data),
with `__imp_X` left undefined. JITLink had no handling for this. The new pass
(the COFF counterpart of the ELF/Mach-O GOT builder) defines each undefined
external `__imp_X` over an 8-byte slot holding the address of `X`, and leaves `X`
as an ordinary external to be resolved normally (import library, dynamic-library
search generator, etc.). Both the call and data-access forms then resolve
indirectly through the slot.
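
As a rough illustration of the slot mechanism described above, the following standalone C++ sketch models it outside of LLVM. `ToyLinker`, `synthesizeImp`, and `callThroughImp` are hypothetical names, not JITLink APIs. Unlike the real pass, the toy assumes `X` is already resolved when the slot is filled.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy model only: none of these names are JITLink APIs.
using Address = std::uint64_t;
static_assert(sizeof(Address) == 8, "the slot is meant to be 8 bytes wide");

struct ToyLinker {
  std::map<std::string, Address> Defined;       // symbols resolved so far
  std::vector<std::unique_ptr<Address>> Slots;  // 8-byte IAT-style slots

  // For an undefined "__imp_X", create a slot holding X's address and
  // define "__imp_X" as the address of that slot. Assumption of this toy:
  // X has already been resolved and is present in Defined.
  bool synthesizeImp(const std::string &ImpName) {
    const std::string Prefix = "__imp_";
    if (ImpName.compare(0, Prefix.size(), Prefix) != 0)
      return false;                             // not an __imp_ symbol
    if (Defined.count(ImpName))
      return true;                              // keep an existing definition
    auto Target = Defined.find(ImpName.substr(Prefix.size()));
    if (Target == Defined.end())
      return false;                             // X itself is unresolved
    Slots.push_back(std::make_unique<Address>(Target->second));
    Defined[ImpName] = reinterpret_cast<Address>(Slots.back().get());
    return true;
  }
};

int bar() { return 42; }

// Mimics `callq *__imp_bar(%rip)`: load the target from the slot, call it.
int callThroughImp(const ToyLinker &L, const std::string &ImpName) {
  auto *Slot = reinterpret_cast<const Address *>(L.Defined.at(ImpName));
  auto *Fn = reinterpret_cast<int (*)()>(*Slot);
  return Fn();
}
```

With `bar` registered in `Defined`, `synthesizeImp("__imp_bar")` defines the named slot symbol, and `callThroughImp(L, "__imp_bar")` reaches `bar` indirectly through it.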

Rather than the `GOTTableManager` pattern (anonymous entry + edge redirection),
the pass defines the *named* `__imp_X` symbol over the slot. ELF GOT references
are nameless edge kinds, so that builder must create an anonymous entry and

    [14 lines not shown]
Delta File
+78-0 llvm/lib/ExecutionEngine/JITLink/COFF_x86_64.cpp
+55-0 llvm/test/ExecutionEngine/JITLink/x86-64/COFF_dllimport_iat.s
+133-0 2 files

LLVM/project d3057e9 clang/test/OpenMP target_teams_generic_loop_codegen.cpp

fix test after merge
Delta File
+53-109 clang/test/OpenMP/target_teams_generic_loop_codegen.cpp
+53-109 1 file

LLVM/project eb7ce80 llvm/include/llvm/Passes CodeGenPassBuilder.h, llvm/include/llvm/Target CGPassBuilderOption.h

CodeGenPassBuilder: Use cl::boolOrDefault directly in CGPassBuilderOption (#204196)

The current implementation, which uses std::optional<bool>, captures cl::BOU_FALSE
(for example, -global-isel=0) as true. Explicitly setting an option to 0 should be
false, forced option not set.
This could be fixed, but I find it cleaner to use boolOrDefault directly and
use the same logic as in TargetPassConfig.
Options EnableIPRA and EnableGlobalISelAbort are left as optional, since for
them it is explicitly checked whether they are set using getNumOccurrences.
boolOrDefault has encoded unset option.
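
The commit message does not spell out how the misread happens. One plausible mechanism, sketched below as an assumption, is an implicit enum-to-bool conversion: `BOU_FALSE` is a nonzero enumerator, so it converts to `true`. The `BoolOrDefault` enum is a local stand-in that mirrors the enumerator order of `llvm::cl::boolOrDefault`. `toOptionalLossy` and `isEnabled` are hypothetical helpers, not LLVM functions.

```cpp
#include <optional>

// Local stand-in mirroring the enumerator order of llvm::cl::boolOrDefault.
enum BoolOrDefault { BOU_UNSET, BOU_TRUE, BOU_FALSE };

// Lossy (assumed failure mode): an unscoped enum converts implicitly to
// bool, and BOU_FALSE has the nonzero value 2, so it becomes true.
inline std::optional<bool> toOptionalLossy(BoolOrDefault V) {
  if (V == BOU_UNSET)
    return std::nullopt;
  bool B = V; // implicit conversion: any nonzero enumerator yields true
  return B;
}

// Tri-state handling: compare against the enumerators instead of
// converting, and fall back to a default only when the option is unset.
inline bool isEnabled(BoolOrDefault V, bool Default) {
  if (V == BOU_UNSET)
    return Default;
  return V == BOU_TRUE;
}
```

Under this model, `toOptionalLossy(BOU_FALSE)` holds `true`, whereas `isEnabled(BOU_FALSE, /*Default=*/true)` returns `false`.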
Delta File
+26-26 llvm/lib/CodeGen/TargetPassConfig.cpp
+10-10 llvm/include/llvm/Passes/CodeGenPassBuilder.h
+6-6 llvm/include/llvm/Target/CGPassBuilderOption.h
+2-2 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+2-2 llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+2-1 llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+48-47 1 file not shown
+49-48 7 files

LLVM/project debd018 clang/include/clang/Basic LangOptions.h, clang/include/clang/Lex TextEncoding.h

use LiteralEncoding internally, address other comments
Delta File
+8-8 clang/lib/Lex/TextEncoding.cpp
+6-6 clang/lib/Frontend/InitPreprocessor.cpp
+10-0 clang/test/CodeGen/systemz-charset.c
+4-4 clang/include/clang/Lex/TextEncoding.h
+2-2 clang/include/clang/Options/Options.td
+2-2 clang/include/clang/Basic/LangOptions.h
+32-22 5 files not shown
+37-27 11 files

LLVM/project 91ed1e4 llvm/lib/Target/AArch64 AArch64PerfectShuffle.cpp AArch64PerfectShuffle.h, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll

Rebase, address comments

Created using spr 1.3.7
Delta File
+6,940-6,782 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+7,323-0 llvm/test/CodeGen/X86/fptosi-sat-vector-512.ll
+6,583-0 llvm/lib/Target/AArch64/AArch64PerfectShuffle.cpp
+3-6,571 llvm/lib/Target/AArch64/AArch64PerfectShuffle.h
+6,132-0 llvm/test/CodeGen/X86/fptoui-sat-vector-512.ll
+5,788-1 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll
+32,769-13,354 2,460 files not shown
+143,475-54,560 2,466 files

LLVM/project bede000 clang/include/clang/Basic OffloadArch.h, clang/include/clang/Driver BoundArch.h Job.h

Merge into OffloadArch header
Delta File
+0-49 clang/include/clang/Driver/BoundArch.h
+31-1 clang/include/clang/Basic/OffloadArch.h
+4-6 clang/include/clang/Driver/Job.h
+1-1 clang/include/clang/Driver/SanitizerArgs.h
+1-1 clang/include/clang/Driver/ToolChain.h
+1-1 clang/include/clang/Driver/Compilation.h
+38-59 3 files not shown
+41-62 9 files

LLVM/project 039d0e2 clang/include/clang/Options Options.td, clang/lib/Driver/ToolChains Clang.cpp

[Driver][DirectX] Add /Qstrip_debug flag
Delta File
+13-0 llvm/test/CodeGen/DirectX/ContainerData/ContainerFlags.ll
+7-3 llvm/lib/Target/DirectX/DXILWriter/DXILWriterPass.cpp
+4-1 llvm/lib/MC/MCDXContainerWriter.cpp
+4-0 clang/lib/Driver/ToolChains/Clang.cpp
+3-0 clang/include/clang/Options/Options.td
+2-0 clang/test/Driver/dxc_debug.hlsl
+33-4 6 files

LLVM/project a7ffab3 clang/include/clang/Basic LangOptions.h, clang/include/clang/Lex TextEncoding.h

use LiteralEncoding internally, address other comments
Delta File
+8-8 clang/lib/Lex/TextEncoding.cpp
+6-6 clang/lib/Frontend/InitPreprocessor.cpp
+4-4 clang/include/clang/Lex/TextEncoding.h
+2-2 clang/include/clang/Basic/LangOptions.h
+2-2 clang/include/clang/Options/Options.td
+1-1 clang/test/CodeGen/systemz-charset-diag.cpp
+23-23 4 files not shown
+27-27 10 files

LLVM/project b0e2a24 flang/lib/Semantics check-omp-loop.cpp

[flang][OpenMP] Simplify check for DISTRIBUTE/LINEAR restriction, NFC

Use `CollectAffectedDoLoops` instead of traversing the loop nest by hand.
Delta File
+7-31 flang/lib/Semantics/check-omp-loop.cpp
+7-31 1 file

LLVM/project a6fe3c7 libcxx/test/std/language.support/support.limits/limits/numeric.limits.members min.pass.cpp max.pass.cpp, libcxx/test/std/numerics/bit byteswap.verify.cpp byteswap.pass.cpp

[libc++][test] Migrate _BitInt probe to __BITINT_MAXWIDTH__ and fix latent test bugs (#203876)

`libcxx` tests gate `_BitInt` blocks on `TEST_HAS_EXTENSION(bit_int)`,
which is not a recognized Clang extension and returns 0 in every
language mode. The blocks have been compiling as dead code, hiding
latent bugs across 23 files.

Migrate to a `TEST_HAS_BITINT` helper backed by the standard
`__BITINT_MAXWIDTH__`. The latent bugs the activation surfaces are fixed
in the same commit:
- overflow-safe `min`;
- post-P4052R0 saturating-arithmetic renames plus a
`clang-21`/`apple-clang-21` skip for `saturating.bitint.pass.cpp` (Clang
21 asserts in constexpr eval on non-byte-aligned `_BitInt`);
- an `intcmp` syntax fix;
- `byteswap.verify` directive tightening;
- a missing `<climits>` include in `byteswap.pass` (only visible under
`-fmodules`);
- C++03-compatible `static_assert` form in `digits10`; gating

    [13 lines not shown]
Delta File
+75-93 libcxx/test/std/numerics/numeric.ops/numeric.ops.sat/saturating.bitint.pass.cpp
+36-35 libcxx/test/std/numerics/bit/byteswap.verify.cpp
+0-57 libcxx/test/std/utilities/format/format.arguments/format.arg.store/make_format_args.bitint.verify.cpp
+18-7 libcxx/test/std/language.support/support.limits/limits/numeric.limits.members/min.pass.cpp
+9-4 libcxx/test/std/numerics/bit/byteswap.pass.cpp
+10-1 libcxx/test/std/language.support/support.limits/limits/numeric.limits.members/max.pass.cpp
+148-197 18 files not shown
+197-233 24 files

LLVM/project 80c80e6 clang/lib/AST/ByteCode Interp.cpp, clang/test/AST/ByteCode cxx20.cpp

[clang][bytecode] Check const writes more thoroughly (#204529)

We used to only have a list of blocks under construction, but now we
have a list of pointers, which gives us more information.

Use this new list to diagnose a case we couldn't previously diagnose.
The test case is from `constant-expression-cxx14.cpp` and shows that a
write to a const member is invalid, even if the parent object is being
constructed right now.
Delta File
+40-5 clang/lib/AST/ByteCode/Interp.cpp
+34-0 clang/test/AST/ByteCode/cxx20.cpp
+74-5 2 files

LLVM/project 91c6934 llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

Merge remote-tracking branch 'origin/main' into xteam-red-runtime
Delta File
+31,001-87,165 llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+15,519-26,130 llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+12,134-24,576 llvm/test/CodeGen/RISCV/clmul.ll
+8,309-12,701 llvm/test/CodeGen/RISCV/clmulr.ll
+7,968-12,512 llvm/test/CodeGen/RISCV/clmulh.ll
+8,361-8,920 llvm/test/CodeGen/RISCV/rvv/expandload.ll
+83,292-172,004 10,343 files not shown
+845,566-574,709 10,349 files

LLVM/project 8781298 llvm/docs AMDGPUExecutionSynchronization.rst

[AMDGPU][doc] Refactor Barrier Execution Model

Remove everything that has to do with named barriers and put it in a series of model extensions specific to /sbarrier/named-barriers.

I had to change a few things to make it fit, in summary:

Base Model:

* Stylistic changes that make it easier to refer to specific rules. Each rule is in a rubric instead of a bullet point.
* (-) No longer defines `barrier-mutually-exclusive`
* (-) No longer defines barrier `join` and any associated rule.

New named barrier extensions:

* Define "named barrier" as a sub-type of barrier objects. This makes barrier-mutually-exclusive redundant.
* Define barrier join as an op that can exclusively be done on `named barrier objects`.
* Define rules relating to join and its ordering with other barrier operations

Following these changes, the target tables changed a bit as well.


    [2 lines not shown]
Delta File
+200-154 llvm/docs/AMDGPUExecutionSynchronization.rst
+200-154 1 file

LLVM/project 45d3dbd llvm/docs AMDGPUExecutionSynchronization.rst

Comments
Delta File
+10-10 llvm/docs/AMDGPUExecutionSynchronization.rst
+10-10 1 file

LLVM/project 47f8379 llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

Merge branch 'main' into users/kparzysz/single-check
Delta File
+25,784-36,416 llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+12,227-23,140 llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+4,004-11,142 llvm/test/CodeGen/RISCV/clmul.ll
+3,502-9,174 llvm/test/CodeGen/X86/clmul-vector.ll
+3,985-7,989 llvm/test/CodeGen/Thumb2/mve-clmul.ll
+4,144-6,437 llvm/test/CodeGen/RISCV/clmulr.ll
+53,646-94,298 455 files not shown
+76,678-126,828 461 files

LLVM/project a5dc61c clang/lib/Driver/ToolChains Clang.cpp

Rename local Zi to HasDebug
Delta File
+4-4 clang/lib/Driver/ToolChains/Clang.cpp
+4-4 1 file

LLVM/project 8179e4a clang/include/clang/Options Options.td

Adjust Options.td
Delta File
+13-18 clang/include/clang/Options/Options.td
+13-18 1 file

LLVM/project 60a2d43 llvm/lib/Target/AArch64 SVEShuffleOpts.cpp AArch64TargetMachine.cpp, llvm/test/CodeGen/AArch64 sve-tbl-folding-opts.ll sve-tbl-folding-new-pm.ll

[AArch64] Add SVE shuffle optimization pass (#193951)

Add a pass to perform VLA shuffle optimizations for SVE.

First up is using tbl to replace deinterleave4+uunpk+zext/uitofp
by generating shuffle masks with index, exploiting the fact that
out-of-range indices in the mask produce zeroes in the result
vector. That way, we can easily zero-extend smaller elements
by using the destination type when generating the mask, and
having one index in range with several out-of-range for each
destination element.
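
To make the zero-extension trick above concrete, here is a scalar C++ model of a byte-wise table lookup in which out-of-range indices produce zero. It is a sketch, not the pass's implementation: `tblBytes` and `zextDeinterleave4Mask` are hypothetical names, the out-of-range sentinel is an arbitrary choice, and little-endian byte order within each 32-bit element is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy scalar model, not the SVE instruction itself: each output byte is
// Table[Index] when Index is in range and 0 otherwise.
std::vector<std::uint8_t> tblBytes(const std::vector<std::uint8_t> &Table,
                                   const std::vector<std::size_t> &Indices) {
  std::vector<std::uint8_t> Out;
  Out.reserve(Indices.size());
  for (std::size_t Index : Indices)
    Out.push_back(Index < Table.size() ? Table[Index] : std::uint8_t{0});
  return Out;
}

// Hypothetical mask builder: selects field `Field` (0..3) of a 4-way
// interleaved byte stream and zero-extends each selected byte to 32 bits.
// Per 32-bit destination element there is one in-range index followed by
// three out-of-range indices.
std::vector<std::size_t> zextDeinterleave4Mask(std::size_t NumBytes,
                                               std::size_t Field) {
  const std::size_t OutOfRange = NumBytes; // any index >= NumBytes works here
  std::vector<std::size_t> Mask;
  for (std::size_t Elt = 0; Elt + 4 <= NumBytes; Elt += 4) {
    Mask.push_back(Elt + Field); // low byte: the in-range source byte
    Mask.push_back(OutOfRange);  // upper three bytes read as zero
    Mask.push_back(OutOfRange);
    Mask.push_back(OutOfRange);
  }
  return Mask;
}
```

For the 8-byte table `{10, 11, 12, 13, 20, 21, 22, 23}` and `Field = 1`, the mask is `{1, 8, 8, 8, 5, 8, 8, 8}` and the lookup yields the bytes `{11, 0, 0, 0, 21, 0, 0, 0}`, i.e. the 32-bit values 11 and 21.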
Delta File
+642-0 llvm/test/CodeGen/AArch64/sve-tbl-folding-opts.ll
+293-0 llvm/lib/Target/AArch64/SVEShuffleOpts.cpp
+210-0 llvm/test/CodeGen/AArch64/sve-tbl-folding-new-pm.ll
+14-0 llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+14-0 llvm/lib/Target/AArch64/AArch64.h
+6-0 llvm/lib/Target/AArch64/AArch64PassRegistry.def
+1,179-0 2 files not shown
+1,183-0 8 files

LLVM/project 3cc9463 llvm/lib/Analysis Delinearization.cpp, llvm/test/Analysis/Delinearization inconsistent-types.ll

[Delinearization] Narrow the scope of the term collection (#204145)

Parametric delinearization collects subexpressions whose SCEV
type is `SCEVUnknown` and uses them as candidates for the array
dimensions. When traversing these subexpressions, it may follow any kind
of expression. For example, if it follows a `sext` expression, this can
lead to type inconsistencies among the collected terms.
This patch fixes this issue by preventing traversal into subexpressions
other than `SCEVAddExpr` or `SCEVAddRecExpr`.
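
The effect of the narrowed traversal can be sketched on a toy expression tree. Nothing below is ScalarEvolution code: `Expr`, `Kind`, `unknown`, `node`, and `collect` are hypothetical stand-ins, and `Bits` plays the role of the SCEV type.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression tree; not the ScalarEvolution classes.
enum class Kind { Unknown, Add, AddRec, Mul, SExt };

struct Expr {
  Kind K;
  unsigned Bits;    // stand-in for the expression's integer type
  std::string Name; // only meaningful for Kind::Unknown
  std::vector<std::shared_ptr<Expr>> Ops;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr unknown(std::string Name, unsigned Bits) {
  return std::make_shared<Expr>(Expr{Kind::Unknown, Bits, std::move(Name), {}});
}

ExprPtr node(Kind K, unsigned Bits, std::vector<ExprPtr> Ops) {
  return std::make_shared<Expr>(Expr{K, Bits, "", std::move(Ops)});
}

// Collect Unknown leaves. With OnlyAddLike set, the walk descends only
// through Add and AddRec nodes, so it never looks underneath an SExt.
void collect(const ExprPtr &E, bool OnlyAddLike,
             std::vector<const Expr *> &Terms) {
  if (E->K == Kind::Unknown) {
    Terms.push_back(E.get());
    return;
  }
  const bool AddLike = E->K == Kind::Add || E->K == Kind::AddRec;
  if (OnlyAddLike && !AddLike)
    return;
  for (const ExprPtr &Op : E->Ops)
    collect(Op, OnlyAddLike, Terms);
}
```

For a 64-bit `AddRec` over `n + sext(m)` with a 32-bit `m`, the unrestricted walk collects both `n` (64 bits) and `m` (32 bits), while the restricted walk collects only `n`, so all collected terms share one width.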

Note: I tried to minimize the test case, but this seems to be as far as
it can go.

Fix #204066.
Delta File
+44-0 llvm/test/Analysis/Delinearization/inconsistent-types.ll
+5-11 llvm/lib/Analysis/Delinearization.cpp
+49-11 2 files

LLVM/project bc70d29 clang/docs LanguageExtensions.rst, clang/lib/CodeGen CodeGenModule.cpp CodeGenModule.h

[Clang][AIX] Add -mloadtime-comment-vars support to preserve variables in the final object file.
Delta File
+119-0 clang/lib/CodeGen/CodeGenModule.cpp
+66-0 clang/docs/LanguageExtensions.rst
+61-0 clang/test/CodeGen/loadtime-comment-vars.c
+13-8 llvm/test/Transforms/LowerCommentString/lower-comment-string.ll
+18-0 clang/lib/CodeGen/CodeGenModule.h
+12-0 clang/test/CodeGen/PowerPC/loadtime-comment-mixed.c
+289-8 4 files not shown
+319-8 10 files

LLVM/project f6fd6ea mlir/lib/ExecutionEngine CMakeLists.txt

[mlir][ExecutionEngine] Fix dead -Wno-c++98-compat-extra-semi guard (#204524)

`check_cxx_compiler_flag` stores its result in
`CXX_SUPPORTS_NO_CXX98_COMPAT_EXTRA_SEMI_FLAG`, but the guarding `if()`
checked `CXX_SUPPORTS_CXX98_COMPAT_EXTRA_SEMI_FLAG` (without `_NO_`),
which is never set. The condition was therefore always false and the
`-Wno-c++98-compat-extra-semi` suppression for `mlir_rocm_runtime` was
never applied.

The sibling flag checks in the same block (`-Wno-return-type-c-linkage`,
`-Wno-nested-anon-types`, `-Wno-gnu-anonymous-struct`) already use
matching variable names, so this aligns the typo'd guard with the
established pattern.

No test is included; this is a build-system-only (CMake) change to a
warning-suppression guard and is not unit-testable.

Signed-off-by: bogdan-petkovic <bpetkovi at amd.com>
Delta File
+1-1 mlir/lib/ExecutionEngine/CMakeLists.txt
+1-1 1 file

LLVM/project b90ec9c llvm/lib/CodeGen StackColoring.cpp

[StackColoring] Remove unused BB numbering state (#204414)
Delta File
+8-17 llvm/lib/CodeGen/StackColoring.cpp
+8-17 1 file

LLVM/project 500d1f8 llvm/lib/Target/SPIRV SPIRVUtils.cpp SPIRVPrepareFunctions.cpp, llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers fun-ptr-void-call-aggregate-arg.ll

[SPIR-V] Fix crash on void indirect call with aggregate argument (#204388)

removeAggregateTypesFromCalls named the call to key the type-restoration
metadata, which asserts for void-returning calls. Key the metadata via
instruction metadata on the call instead, which works for void results.
Delta File
+42-0 llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fun-ptr-void-call-aggregate-arg.ll
+20-4 llvm/lib/Target/SPIRV/SPIRVUtils.cpp
+9-6 llvm/lib/Target/SPIRV/SPIRVPrepareFunctions.cpp
+71-10 3 files

LLVM/project e6daa68 compiler-rt/test/builtins/Unit lit.cfg.py

Revert "Revert "[Compiler-rt][test] Fix circular link dependency between builtins and libc"" (#204728)

Reverts llvm/llvm-project#203152
Delta File
+3-1 compiler-rt/test/builtins/Unit/lit.cfg.py
+3-1 1 file

LLVM/project fdf3d44 llvm/test/Transforms/InstCombine pext.ll pdep.ll

[InstCombine] Add tests showing failure to fold pdep(0,x) and pext(0,x) to 0 (#204783)

As noted on #204144
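
For context on why these folds are valid, here is a scalar reference model of 32-bit bit deposit and bit extract. `pdep32` and `pext32` are illustrative helper names, not LLVM or intrinsic APIs. The model assumes the usual (source, mask) operand order.

```cpp
#include <cstdint>

// Deposit the low bits of Src into the positions of the set bits of Mask.
std::uint32_t pdep32(std::uint32_t Src, std::uint32_t Mask) {
  std::uint32_t Result = 0;
  for (std::uint32_t Bit = 1; Mask != 0; Bit <<= 1) {
    const std::uint32_t Lowest = Mask & (~Mask + 1); // lowest set mask bit
    if (Src & Bit)
      Result |= Lowest;
    Mask &= Mask - 1; // clear the mask bit just handled
  }
  return Result;
}

// Extract the bits of Src selected by Mask and pack them into the low bits.
std::uint32_t pext32(std::uint32_t Src, std::uint32_t Mask) {
  std::uint32_t Result = 0;
  for (std::uint32_t Bit = 1; Mask != 0; Bit <<= 1) {
    const std::uint32_t Lowest = Mask & (~Mask + 1); // lowest set mask bit
    if (Src & Lowest)
      Result |= Bit;
    Mask &= Mask - 1; // clear the mask bit just handled
  }
  return Result;
}
```

In both loops a result bit is set only when some bit of `Src` is set, so a zero source yields zero for every mask, which is the fold the new tests show is currently missed.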
Delta File
+18-0 llvm/test/Transforms/InstCombine/pext.ll
+18-0 llvm/test/Transforms/InstCombine/pdep.ll
+36-0 2 files

LLVM/project 4549680 clang/test/SemaCXX enable_if.cpp, llvm/examples/OrcV2Examples/LLJITWithSymbolAliases LLJITWithSymbolAliases.cpp

Merge branch 'main' into users/kasuga-fj/delin-fix-param-types
Delta File
+97-114 llvm/include/llvm/Support/LSP/Protocol.h
+61-27 clang/test/SemaCXX/enable_if.cpp
+85-0 llvm/examples/OrcV2Examples/LLJITWithSymbolAliases/LLJITWithSymbolAliases.cpp
+73-0 llvm/test/CodeGen/SPIRV/instructions/phi-large-vector-shader.ll
+48-21 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+65-0 llvm/test/CodeGen/AArch64/sve-masked-gather-64b-unscaled.ll
+429-162 96 files not shown
+1,430-596 102 files

LLVM/project a5e83b9 clang/include/clang/Basic arm_neon.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

[Clang][NEON ACLE] Remove +bf16 requirement from opaque bfloat builtins. (#204201)

Builtins that only care about the size of the element type but not its
format (e.g. loads, stores and shuffles) do not require any special
instructions to code generate beyond those already available to +neon.

Fixes https://github.com/llvm/llvm-project/issues/203159
Delta File
+0-56 clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+18-16 clang/include/clang/Basic/arm_neon.td
+2-2 clang/test/Sema/aarch64-neon-without-target-feature.cpp
+2-2 clang/test/CodeGen/AArch64/neon-luti.c
+2-2 clang/test/CodeGen/AArch64/bf16-lane-intrinsics.c
+2-2 clang/test/CodeGen/AArch64/bf16-ldst-intrinsics.c
+26-80 6 files not shown
+30-90 12 files

LLVM/project 39a8be5 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 sve-masked-gather-64b-unscaled.ll sve-masked-gather.ll

[AArch64] Combine undef UZP and NVCAST away.

These are used to lower insert_subvec nodes quite early in SDAG. After
DAG combines run, it's possible that the inputs to these AArch64 nodes
become UNDEF.
Delta File
+17-5 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+3-6 llvm/test/CodeGen/AArch64/sve-masked-gather-64b-unscaled.ll
+3-6 llvm/test/CodeGen/AArch64/sve-masked-gather.ll
+1-2 llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll
+24-19 4 files

LLVM/project 40cbc98 llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp, llvm/test/CodeGen/AArch64 sve-masked-gather-64b-unscaled.ll sve-masked-scatter-64b-unscaled.ll

[AArch64][SDAG] Legalise nxv1 gather/scatter nodes (#204620)

This updates WidenVecRes_MGATHER and WidenVecOp_MSCATTER to support
scalable vector types.
Delta File
+65-0 llvm/test/CodeGen/AArch64/sve-masked-gather-64b-unscaled.ll
+62-0 llvm/test/CodeGen/AArch64/sve-masked-scatter-64b-unscaled.ll
+61-0 llvm/test/CodeGen/AArch64/sve-masked-gather.ll
+58-0 llvm/test/CodeGen/AArch64/sve-masked-scatter.ll
+18-14 llvm/test/CodeGen/AArch64/sve-masked-gather-legalize.ll
+9-11 llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+273-25 2 files not shown
+301-25 8 files

LLVM/project 47b29c2 llvm/lib/Target/SPIRV SPIRVLegalizerInfo.cpp, llvm/test/CodeGen/SPIRV/instructions phi-large-vector-shader.ll phi-large-vector.ll

[SPIR-V] Legalize G_PHI of oversized vectors via fewer-elements (#203993)

`G_PHI` on vectors wider than the SPIR-V max vector size previously
failed legalization. This PR adds a `fewerElementsIf` rule that splits
them down to `MaxVectorSize`, matching how other vector ops are handled
in `SPIRVLegalizerInfo.cpp`.


Added the following test
`llvm/test/CodeGen/SPIRV/instructions/phi-large-vector.ll` covering
spirv32 and spirv64.
Delta File
+73-0 llvm/test/CodeGen/SPIRV/instructions/phi-large-vector-shader.ll
+44-0 llvm/test/CodeGen/SPIRV/instructions/phi-large-vector.ll
+5-1 llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
+122-1 3 files