LLVM/project a714b73offload/plugins-nextgen/level_zero/dynamic_l0/level_zero ze_api.h

[OFFLOAD][L0] Fix incorrect values in the Level Zero cached header (#196587)

The current ZE_STRUCTURE_TYPE_DEVICE_IP_VERSION_EXT and
ZE_STRUCTURE_TYPE_RELAXED_ALLOCATION_LIMITS_EXP_DESC values are
incorrect as seen here:
* https://github.com/oneapi-src/level-zero/blob/0f246f6edf90d56604f00f83b41d783dc6a9394e/include/ze_api.h#L318
* https://github.com/oneapi-src/level-zero/blob/0f246f6edf90d56604f00f83b41d783dc6a9394e/include/ze_api.h#L324
DeltaFile
+2-2offload/plugins-nextgen/level_zero/dynamic_l0/level_zero/ze_api.h
+2-21 files

LLVM/project edd7810llvm/test/Transforms/SLPVectorizer/X86 arith-mul-smulo.ll arith-add-uaddo.ll

Revert "[SLP] Vectorize struct-returning intrinsics"

This reverts commit b0c6df7b95b3c70d78c65a39598007f722794d38 to fix the
buildbots: https://lab.llvm.org/buildbot/#/builders/52/builds/17118

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/196591
DeltaFile
+615-549llvm/test/Transforms/SLPVectorizer/X86/arith-mul-smulo.ll
+615-449llvm/test/Transforms/SLPVectorizer/X86/arith-add-uaddo.ll
+615-449llvm/test/Transforms/SLPVectorizer/X86/arith-sub-usubo.ll
+615-449llvm/test/Transforms/SLPVectorizer/X86/arith-add-saddo.ll
+615-449llvm/test/Transforms/SLPVectorizer/X86/arith-sub-ssubo.ll
+615-429llvm/test/Transforms/SLPVectorizer/X86/arith-mul-umulo.ll
+3,690-2,7744 files not shown
+3,912-3,26510 files

LLVM/project e6efa1allvm/unittests/Target/AMDGPU GCNRegPressureTest.cpp CMakeLists.txt

[AMDGPU] Pre-commit unit test for RP tracking `reset`/`advance` inconsistencies fix (#196098)

This adds a new AMDGPU unit test file for testing the behavior of
`GCNRPTracker` and its related classes. The two tests showcase confusing
return value and behavioral semantics for variants of the advance and
reset functions, which will be clarified in a follow-up commit.
DeltaFile
+156-0llvm/unittests/Target/AMDGPU/GCNRegPressureTest.cpp
+1-0llvm/unittests/Target/AMDGPU/CMakeLists.txt
+157-02 files

LLVM/project 17a0494llvm/lib/Target/PowerPC PPCISelLowering.cpp

[PowerPC][NFC]Refactor EmitInstrWithCustomInserter (#196114)

Currently PPCTargetLowering::EmitInstrWithCustomInserter() uses a large
if/else-if structure. Update to use switch and
move ATOMIC_CMP_SWAP and SELECT code to helper functions for better
readability and maintenance.
DeltaFile
+555-435llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+555-4351 files

LLVM/project 89264ffclang/lib/Driver/ToolChains Flang.cpp Flang.h

clang/AMDGPU: Pass BoundArch through device libs handling

Pre-work to consolidate target identification for future target
option bug fixes. Also requires updating flang to match recent
clang changes.

Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
DeltaFile
+14-10clang/lib/Driver/ToolChains/Flang.cpp
+13-3clang/lib/Driver/ToolChains/Flang.h
+10-5clang/lib/Driver/ToolChains/HIPAMD.cpp
+5-6clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+8-3clang/lib/Driver/ToolChains/AMDGPU.cpp
+2-2clang/lib/Driver/ToolChains/HIPSPV.cpp
+52-295 files not shown
+57-3411 files

LLVM/project b0c6df7llvm/test/Transforms/SLPVectorizer/X86 arith-mul-smulo.ll arith-sub-usubo.ll

[SLP] Vectorize struct-returning intrinsics

Allow SLP to combine across lanes calls that return a literal struct
(llvm.sincos, llvm.*.with.overflow, llvm.frexp, ...) into a single
call returning a struct of vectors, by widening {T, T, ...} to
{<VF x T>, ...} via VectorTypeUtils and emitting extractvalue +
extractelement for external uses.
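
A scalar model of the widening described above, as a rough sketch only: it mimics llvm.uadd.with.overflow in plain C++ and is not the SLP implementation. The names `ScalarResult`, `VectorResult`, `uaddo` and `uaddoWide` are hypothetical and chosen for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Models the literal struct {T, i1} returned by one scalar call.
struct ScalarResult {
  uint32_t Value;
  bool Overflow;
};

// Models the widened struct {<VF x T>, <VF x i1>} returned by one vector call.
template <std::size_t VF> struct VectorResult {
  std::array<uint32_t, VF> Value;
  std::array<bool, VF> Overflow;
};

// Scalar unsigned add with overflow flag (unsigned arithmetic wraps).
ScalarResult uaddo(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B;
  return {Sum, Sum < A};
}

// One "call" covering VF lanes; each lane matches the scalar result.
template <std::size_t VF>
VectorResult<VF> uaddoWide(const std::array<uint32_t, VF> &A,
                           const std::array<uint32_t, VF> &B) {
  VectorResult<VF> R{};
  for (std::size_t Lane = 0; Lane < VF; ++Lane) {
    ScalarResult S = uaddo(A[Lane], B[Lane]);
    R.Value[Lane] = S.Value;       // extractvalue 0, then lane
    R.Overflow[Lane] = S.Overflow; // extractvalue 1, then lane
  }
  return R;
}
```

An external user of lane 1 of the sum would read `R.Value[1]`, which corresponds to the extractvalue + extractelement pair mentioned in the message.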

Reviewers: hiraditya, bababuck, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/195521
DeltaFile
+549-615llvm/test/Transforms/SLPVectorizer/X86/arith-mul-smulo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-sub-usubo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-add-saddo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-add-uaddo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-sub-ssubo.ll
+429-615llvm/test/Transforms/SLPVectorizer/X86/arith-mul-umulo.ll
+2,774-3,6904 files not shown
+3,265-3,91210 files

LLVM/project 62fd4ffllvm/unittests/Target/AMDGPU CMakeLists.txt

[AMDGPU] Add missing CMake link component (#196579)

The issue was triggered by #196547.
DeltaFile
+1-0llvm/unittests/Target/AMDGPU/CMakeLists.txt
+1-01 files

LLVM/project a7591efllvm/test/Transforms/SLPVectorizer struct-return-revec.ll

[SLP][NFC]Add a test with the revectorization of the struct-returning intrinsics



Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/196581
DeltaFile
+65-0llvm/test/Transforms/SLPVectorizer/struct-return-revec.ll
+65-01 files

LLVM/project ebf4b14llvm/lib/Transforms/InstCombine InstructionCombining.cpp, llvm/test/Transforms/InstCombine fold-multi-use-select-packed-constants.ll pr80597.ll

[InstCombine] Fold binop into multi-use select when one select arm and the other operand are constant
DeltaFile
+15-15llvm/test/Transforms/InstCombine/fold-multi-use-select-packed-constants.ll
+10-2llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+2-7llvm/test/Transforms/InstCombine/pr80597.ll
+2-7llvm/test/Transforms/InstCombine/pr72433.ll
+3-3llvm/test/Transforms/InstCombine/shift.ll
+1-4llvm/test/Transforms/InstCombine/extractelement.ll
+33-382 files not shown
+35-408 files

LLVM/project 2b97000llvm/lib/Target/AArch64/GISel AArch64LegalizerInfo.cpp AArch64RegisterBankInfo.cpp, llvm/test/CodeGen/AArch64 bf16-instructions.ll bf16-v8-instructions.ll

[AArch64][GlobalISel] Legalize F64 to BF16 fptruncates (#196077)

The two-step expansion of an f64-to-bf16 fptrunc must be careful to
avoid double-rounding errors. On AArch64 we can apparently convert with
fcvtxn, which performs round-to-odd, and follow it with a standard fp
truncate to bf16 so that the final rounding is done correctly. This
reuses the existing lowering added for vector operations.
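
The double-rounding hazard can be reproduced in software. The sketch below is an illustration, not the GlobalISel lowering: it emulates round-to-odd by hand, handles only finite, normal, in-range inputs, and assumes the default round-to-nearest-even floating-point mode. All function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// f32 -> bf16 with round-to-nearest-even (finite inputs only).
uint16_t f32ToBF16(float F) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  Bits += 0x7FFFu + ((Bits >> 16) & 1u); // ties go to the even bf16 value
  return static_cast<uint16_t>(Bits >> 16);
}

// f64 -> f32 with round-to-odd, emulated: truncate toward zero, then set
// the lowest mantissa bit if the conversion was inexact.
float f64ToF32RoundOdd(double D) {
  float F = static_cast<float>(D); // nearest-even in the default mode
  if (static_cast<double>(F) == D)
    return F; // exact, nothing to do
  if (std::fabs(static_cast<double>(F)) > std::fabs(D))
    F = std::nextafter(F, 0.0f); // undo a round-away so F is truncated
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  Bits |= 1u; // sticky bit: an inexact result always ends in an odd mantissa
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Two nearest-even steps: can round twice in the same direction.
uint16_t truncNaive(double D) { return f32ToBF16(static_cast<float>(D)); }

// Round-to-odd first, nearest-even second.
uint16_t truncViaRoundOdd(double D) { return f32ToBF16(f64ToF32RoundOdd(D)); }

// Just above the midpoint between bf16 values 1.0 and 1.0078125.
double trickyInput() {
  return 1.0 + std::ldexp(1.0, -8) + std::ldexp(1.0, -30);
}
```

For `trickyInput()`, `truncNaive` yields 0x3F80 (1.0) because the first step lands exactly on the bf16 tie, while `truncViaRoundOdd` yields 0x3F81, the correctly rounded result.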
DeltaFile
+78-37llvm/test/CodeGen/AArch64/bf16-instructions.ll
+54-26llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+33-17llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+16-3llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+6-5llvm/test/CodeGen/AArch64/arm64-vcvt_f.ll
+2-0llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+189-886 files

LLVM/project 6c5f5c1clang/lib/Serialization ASTReader.cpp

[Clang][Modules] Fix -Wunused-variable (#196577)

Mark some variables [[maybe_unused]] and inline others that do not have
side effects to avoid -Wunused-variable in non-assert builds.
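
A minimal illustration of the two techniques named above, assuming nothing about the actual ASTReader.cpp code; `sumChecked` and its variables are hypothetical.

```cpp
#include <cassert>
#include <vector>

int sumChecked(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;

  // Technique 1: the variable is only read inside assert(), which expands
  // to nothing under NDEBUG, so mark it to silence -Wunused-variable.
  [[maybe_unused]] bool NonNegative = Sum >= 0;
  assert(NonNegative && "expected a non-negative sum");

  // Technique 2: a side-effect-free expression can be written directly
  // inside the assert instead of being stored in a named variable.
  assert(Values.size() < 1000 && "unexpectedly large input");

  return Sum;
}
```

Inlining is only safe when the expression has no side effects, since the whole assert disappears in release builds.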
DeltaFile
+4-4clang/lib/Serialization/ASTReader.cpp
+4-41 files

LLVM/project 6c083a6llvm/lib/Target/AMDGPU VOP3PInstructions.td, llvm/test/MC/AMDGPU gfx13_asm_vop3p.s gfx13_asm_vop3p_features.s

[AMDGPU] Add VOP3P encoding to gfx13 (#196252)

Co-authored-by: Ivan Kosarev <ivan.kosarev at amd.com>
DeltaFile
+1,608-0llvm/test/MC/AMDGPU/gfx13_asm_vop3p.s
+125-0llvm/test/MC/AMDGPU/gfx13_asm_vop3p_features.s
+60-42llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+34-0llvm/test/MC/AMDGPU/gfx13_asm_vop3p_dpp8.s
+18-0llvm/test/MC/AMDGPU/gfx13_asm_vop3p_dpp16.s
+1-0llvm/test/MC/AMDGPU/gfx12_asm_vop3p_aliases.s
+1,846-426 files

LLVM/project 64f9bb5llvm/lib/Object WasmObjectFile.cpp, llvm/test/tools/llvm-readobj/wasm invalid-data-segment-name-index.test

[Object][Wasm] Fix off-by-one in data segment name index validation (#196338)

The check `Index > DataSegments.size()` in `parseNameSection()` allows
`Index == DataSegments.size()`, which is an out-of-bounds access.

In an assertions-disabled ASan build, a malformed wasm object with one
data segment and a data segment name entry using index 1 triggers a
heap-buffer-overflow READ in `WasmObjectFile::parseNameSection()`.

Fix by checking `Index >= DataSegments.size()` instead.

Also add a regression test that verifies the malformed input is rejected
with "invalid data segment name entry".
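
The off-by-one can be shown on a plain container. This is a sketch of the bounds check only, not the `parseNameSection()` code; `Segment` and `setSegmentName` are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed data segment.
struct Segment {
  std::string Name;
};

// Returns false for a malformed (out-of-range) index instead of writing.
bool setSegmentName(std::vector<Segment> &Segments, std::size_t Index,
                    const std::string &Name) {
  // Valid indices are 0 .. size() - 1. A check of `Index > size()` would
  // let Index == size() through and touch one element past the end.
  if (Index >= Segments.size())
    return false;
  Segments[Index].Name = Name;
  return true;
}
```

With a single segment, index 1 is rejected and index 0 is accepted, which mirrors the malformed input described above.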
DeltaFile
+25-0llvm/test/tools/llvm-readobj/wasm/invalid-data-segment-name-index.test
+2-2llvm/lib/Object/WasmObjectFile.cpp
+27-22 files

LLVM/project 003846blibc/test/src/string/memory_utils op_tests.cpp

[libc] Fix op_tests Memcmp guard to require SSE4.1 (#196572)

The is_vector<__m128i> specialisation in op_x86.h is gated on
__SSE4_1__, but op_tests.cpp included generic::Memcmp<__m128i> under the
weaker __SSE2__ guard. On baseline x86-64 (where __SSE2__ is always
defined but __SSE4_1__ may not be), this caused a static_assert failure
in is_element_type_v.

Changed the guard from __SSE2__ to __SSE4_1__ to match the
specialisation requirement, consistent with how BcmpImplementations
already guards its __m128i entry.
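
The mismatch can be sketched as follows. The trait and type names here (`IsVector`, `VectorOp`, `widestOpSize`) are hypothetical and only imitate the shape of the problem; they are not the libc definitions.

```cpp
#include <type_traits>
#if defined(__SSE4_1__)
#include <smmintrin.h>
#endif

// Trait: no type counts as a vector unless a specialisation says so.
template <typename T> struct IsVector : std::false_type {};

#if defined(__SSE4_1__)
// The specialisation exists only when SSE4.1 is enabled.
template <> struct IsVector<__m128i> : std::true_type {};
#endif

// An operation that refuses to instantiate for unsupported types.
template <typename T> struct VectorOp {
  static_assert(IsVector<T>::value, "T must be a supported vector type");
  static constexpr int Size = static_cast<int>(sizeof(T));
};

// The use site must be guarded by the same macro as the specialisation.
// Guarding it with the weaker __SSE2__ would hit the static_assert on
// targets that have SSE2 but not SSE4.1.
constexpr int widestOpSize() {
#if defined(__SSE4_1__)
  return VectorOp<__m128i>::Size;
#else
  return 0; // no vector op available in this configuration
#endif
}
```

On a baseline x86-64 build `widestOpSize()` returns 0; with `-msse4.1` it returns 16.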

Assisted-by: Automated tooling, human reviewed.
DeltaFile
+1-1libc/test/src/string/memory_utils/op_tests.cpp
+1-11 files

LLVM/project 8d0e5e8llvm/lib/Transforms/InstCombine InstructionCombining.cpp, llvm/test/Transforms/InstCombine fold-multi-use-select-packed-constants.ll pr80597.ll

[InstCombine] Fold binop into multi-use select when one select arm and the other operand are constant
DeltaFile
+15-15llvm/test/Transforms/InstCombine/fold-multi-use-select-packed-constants.ll
+10-2llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+2-7llvm/test/Transforms/InstCombine/pr80597.ll
+2-7llvm/test/Transforms/InstCombine/pr72433.ll
+3-3llvm/test/Transforms/InstCombine/shift.ll
+1-4llvm/test/Transforms/InstCombine/extractelement.ll
+33-382 files not shown
+35-408 files

LLVM/project 4771770llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp DAGCombiner.cpp, llvm/test/CodeGen/X86 freeze-unary.ll

[DAG] canCreateUndefOrPoison - ISD::FCEIL/FFLOOR/FTRUNC/FRINT/FNEARBYINT/FROUND/FROUNDEVEN can never create poison/undef (#196543)

Also add missing fold support for ftrunc(fround(x)) -> fround(x)
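
The fold relies on fround already producing an integral value, so a following ftrunc changes nothing. A quick check of that identity using the C++ standard library equivalents (`std::round` rounds halfway cases away from zero, like ISD::FROUND); `foldHolds` is a hypothetical helper, not DAGCombiner code.

```cpp
#include <cmath>

// True when trunc(round(X)) and round(X) are the same value, treating
// NaN as equal to NaN and distinguishing +0.0 from -0.0.
bool foldHolds(double X) {
  double Rounded = std::round(X);
  double Both = std::trunc(Rounded);
  if (std::isnan(Rounded) || std::isnan(Both))
    return std::isnan(Rounded) && std::isnan(Both);
  return Rounded == Both && std::signbit(Rounded) == std::signbit(Both);
}
```

The signed-zero case matters: `std::round(-0.4)` is -0.0, and `std::trunc` preserves the sign.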
DeltaFile
+14-49llvm/test/CodeGen/X86/freeze-unary.ll
+7-0llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+1-0llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+22-493 files

LLVM/project 5681c52clang/lib/Driver/ToolChains Darwin.cpp AMDGPU.cpp

clang: Add BoundArch argument to addClangTargetOptions

addClangTargetOptions already has an OffloadKind argument,
but it kind of doesn't make sense for any function to know the
OffloadKind, but not the associated BoundArch.

The current process is kind of convoluted. TranslateArgs
synthesizes a -mcpu argument from BoundArch, and later
addClangTargetOptions re-parses that -mcpu argument each
time it wants the architecture. Add this argument so this
can be cleaned up in a future change.

Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
DeltaFile
+9-5clang/lib/Driver/ToolChains/Darwin.cpp
+7-5clang/lib/Driver/ToolChains/AMDGPU.cpp
+7-3clang/lib/Driver/ToolChains/Darwin.h
+6-3clang/lib/Driver/ToolChains/AMDGPU.h
+5-3clang/lib/Driver/ToolChains/Cuda.cpp
+5-3clang/lib/Driver/ToolChains/XCore.h
+39-2248 files not shown
+115-5654 files

LLVM/project a80491blldb/unittests/Expression DWARFExpressionTest.cpp, llvm/test/CodeGen/AMDGPU amdgpu-simplify-libcall-pow.ll pseudo-scalar-transcendental.ll

Merge upstream/main into users/mariusz-sikora-at-amd/add-feature-min-max-mad
DeltaFile
+4,634-367llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+0-775llvm/utils/Reviewing/find_interesting_reviews.py
+666-0llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-costs.ll
+329-329llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-pow.ll
+507-0lldb/unittests/Expression/DWARFExpressionTest.cpp
+440-61llvm/test/CodeGen/AMDGPU/pseudo-scalar-transcendental.ll
+6,576-1,5321,418 files not shown
+30,851-14,5551,424 files

LLVM/project 935b7eaclang/include/clang/Driver Action.h, clang/lib/Driver/ToolChains Clang.cpp

clang: Consolidate -aux-triple handling

All of the offload languages were essentially doing the
same thing, with overcomplicated conditions that depended on
the language.
DeltaFile
+41-51clang/lib/Driver/ToolChains/Clang.cpp
+3-0clang/include/clang/Driver/Action.h
+1-1clang/test/Driver/sycl-offload-jit-xarch.cpp
+45-523 files

LLVM/project 65ba09fllvm/lib/Target/AArch64 AArch64InstrInfo.cpp, llvm/unittests/Target/AArch64 InstSizes.cpp

[AArch64] Report accurate sizes for MOVaddr and MOVimm pseudos
DeltaFile
+89-0llvm/unittests/Target/AArch64/InstSizes.cpp
+25-0llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+114-02 files

LLVM/project d4712bellvm/unittests/Target/AMDGPU GCNRegPressureTest.cpp CMakeLists.txt

[AMDGPU] Pre-commit unit test for RP tracking reset/advance behavior

This adds a new AMDGPU unit test file for testing the behavior of
`GCNRPTracker` and its related classes. The two tests showcase confusing
return value and behavioral semantics for variants of the advance and
reset functions, which will be clarified in a follow-up commit.

This also moves some common test helpers from other AMDGPU unit tests to
the `AMDGPUUnitTests` TU to avoid repetition between unit tests.
DeltaFile
+156-0llvm/unittests/Target/AMDGPU/GCNRegPressureTest.cpp
+1-0llvm/unittests/Target/AMDGPU/CMakeLists.txt
+157-02 files

LLVM/project 1430e83llvm/lib/Target/AArch64 AArch64ExpandPseudoInsts.cpp AArch64ExpandImm.cpp

[NFC][AArch64] Extract MOVaddr* expansion model into common header

This makes the expansion logic reusable by getInstSizeInBytes in a
follow-up patch.
DeltaFile
+72-53llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+46-26llvm/lib/Target/AArch64/AArch64ExpandImm.cpp
+8-1llvm/lib/Target/AArch64/AArch64ExpandImm.h
+1-0llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+127-804 files

LLVM/project 803074fclang/docs LanguageExtensions.rst, clang/include/clang/Options Options.td

Enable driver changes for fexec-charset
DeltaFile
+14-6clang/lib/Driver/ToolChains/Clang.cpp
+14-4clang/include/clang/Options/Options.td
+11-3clang/test/Driver/clang_f_opts.c
+10-0llvm/lib/Support/TextEncoding.cpp
+4-3clang/test/Driver/cl-options.c
+3-3clang/docs/LanguageExtensions.rst
+56-193 files not shown
+60-199 files

LLVM/project d1b40cdclang/include/clang/Basic TargetInfo.h, clang/lib/AST ASTContext.cpp

convert to exec-charset inside getPredefinedStringLiteralFromCache, test __builtin_FILE()
DeltaFile
+28-0clang/test/CodeGen/systemz-charset.cpp
+10-0clang/lib/AST/ASTContext.cpp
+5-4clang/lib/Lex/TextEncodingConfig.cpp
+3-0clang/lib/Basic/TargetInfo.cpp
+2-0clang/include/clang/Basic/TargetInfo.h
+48-45 files

LLVM/project 4db2f4fclang/lib/AST PrintfFormatString.cpp FormatString.cpp, clang/lib/Sema SemaChecking.cpp

Add format string handling
DeltaFile
+58-31clang/lib/AST/PrintfFormatString.cpp
+46-40clang/lib/AST/FormatString.cpp
+33-21clang/lib/Sema/SemaChecking.cpp
+25-11clang/lib/AST/FormatStringParsing.h
+15-8clang/lib/AST/ScanfFormatString.cpp
+19-0llvm/lib/Support/TextEncoding.cpp
+196-11112 files not shown
+258-12118 files

LLVM/project 7c7f5bellvm/unittests/CodeGen CodeGenTestBase.h RematerializerTest.cpp, llvm/unittests/Target/AMDGPU LiveRegUnits.cpp AMDGPUUnitTests.cpp

[CodeGen][AMDGPU] Move boilerplate unit test code to base class (NFC) (#196547)

This adds the `CodeGenTestBase` class to handle boilerplate code for
codegen unit tests and makes use of it wherever possible, in particular
in AMDGPU unit tests.

Furthermore, this makes all AMDGPU unit tests rely on GoogleTest's API
for "run once per test-suite" code, instead of re-implementing that
behavior using a `std::once_flag`. As a consequence, all TEST(...) become
TEST_F(...).
DeltaFile
+91-0llvm/unittests/CodeGen/CodeGenTestBase.h
+7-71llvm/unittests/CodeGen/RematerializerTest.cpp
+5-72llvm/unittests/CodeGen/MachineDomTreeUpdaterTest.cpp
+9-43llvm/unittests/Target/AMDGPU/LiveRegUnits.cpp
+19-21llvm/unittests/Target/AMDGPU/AMDGPUUnitTests.cpp
+18-5llvm/unittests/Target/AMDGPU/AMDGPUUnitTests.h
+149-2126 files not shown
+159-25212 files

LLVM/project 3e059e2clang/include/clang/Sema Sema.h

Remove old include
DeltaFile
+0-1clang/include/clang/Sema/Sema.h
+0-11 files

LLVM/project f29a959clang/docs LanguageExtensions.rst, clang/include/clang/Options Options.td

Enable driver changes for fexec-charset
DeltaFile
+14-6clang/lib/Driver/ToolChains/Clang.cpp
+14-4clang/include/clang/Options/Options.td
+11-3clang/test/Driver/clang_f_opts.c
+10-0llvm/lib/Support/TextEncoding.cpp
+4-3clang/test/Driver/cl-options.c
+3-3clang/docs/LanguageExtensions.rst
+56-193 files not shown
+60-199 files

LLVM/project 5d6451cclang/include/clang/Basic TargetInfo.h, clang/lib/AST ASTContext.cpp

convert to exec-charset inside getPredefinedStringLiteralFromCache, test __builtin_FILE()
DeltaFile
+28-0clang/test/CodeGen/systemz-charset.cpp
+10-0clang/lib/AST/ASTContext.cpp
+5-4clang/lib/Lex/TextEncodingConfig.cpp
+3-0clang/lib/Basic/TargetInfo.cpp
+2-0clang/include/clang/Basic/TargetInfo.h
+48-45 files

LLVM/project 5f9b389clang/lib/AST PrintfFormatString.cpp FormatString.cpp, clang/lib/Sema SemaChecking.cpp

Add format string handling
DeltaFile
+58-31clang/lib/AST/PrintfFormatString.cpp
+46-40clang/lib/AST/FormatString.cpp
+33-21clang/lib/Sema/SemaChecking.cpp
+25-11clang/lib/AST/FormatStringParsing.h
+15-8clang/lib/AST/ScanfFormatString.cpp
+19-0llvm/lib/Support/TextEncoding.cpp
+196-11112 files not shown
+258-12218 files