LLVM/project e6e6947 mlir/lib/Dialect/LLVMIR/IR LLVMTypeSyntax.cpp

clang-format
Delta File
+3 -2 mlir/lib/Dialect/LLVMIR/IR/LLVMTypeSyntax.cpp
+3 -2, 1 files

LLVM/project 49266fa mlir/include/mlir/Dialect/LLVMIR LLVMTypes.td, mlir/lib/Dialect/LLVMIR/IR LLVMTypes.cpp LLVMDialect.cpp

[mlir][LLVM] Add the `byte` type to the LLVM dialect

This PR ports the newly added `byte` type from LLVM IR to MLIR's LLVM dialect.
The simplest motivation for the byte type is being able to implement `memcpy` in LLVM IR. This was previously not possible: due to rules around conversions between integers and pointers (which e.g. implicitly happen during loads), partial poison and pointer provenance were not preserved.
No alternative to integer types existed that one could use to have poison- and provenance-preserving SSA values. The byte type solves exactly this issue.
Frontends are encouraged to use it when needed for better optimization capabilities.

Currently, the only operation that has changed semantics around `byte` is `bitcast`. It now allows casting between `byte` and `ptr` (unlike integers and pointers).

Corresponding LLVM commit: https://github.com/llvm/llvm-project/commit/80f2ef70f592

Assisted by Claude & Gemini
Delta File
+57 -0 mlir/test/Dialect/LLVMIR/layout.mlir
+25 -0 mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.td
+23 -1 mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp
+14 -0 mlir/test/Dialect/LLVMIR/roundtrip.mlir
+9 -4 mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+9 -3 mlir/lib/Target/LLVMIR/TypeToLLVM.cpp
+137 -8, 5 files not shown
+172 -11, 11 files

LLVM/project cc61305 llvm/lib/Target/X86 X86ISelLowering.cpp X86ISelLowering.h, llvm/test/Transforms/AtomicExpand/X86 expand-atomic-non-integer.ll

[X86] Remove shouldCastAtomicLoadInIR; use DAG combine instead (#199520)

Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic loads
to integer at the IR level. Instead, handle this in a pre-legalize DAG
combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads to
integer atomic loads plus a bitcast.
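
As a rough illustration of why "integer load plus bitcast" preserves the loaded value, the following Python sketch models the rewrite on a 32-bit float. It is not the DAG combine itself: Python has no atomics, so the integer read merely stands in for the integer atomic load, and the helper names are hypothetical.

```python
import struct


def bitcast_i32_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit integer pattern as an IEEE-754 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bitcast_f32_to_i32(value: float) -> int:
    """Reinterpret a 32-bit float as its integer bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def load_f32_via_integer(memory: bytearray, offset: int) -> float:
    """Model of the rewritten load: read 4 bytes as one integer, then bitcast.

    The integer read is a stand-in for the integer atomic load (assumption:
    this model ignores atomicity and only shows the value is unchanged).
    """
    bits = int.from_bytes(memory[offset:offset + 4], "little")
    return bitcast_i32_to_f32(bits)


memory = bytearray(struct.pack("<f", 1.5))
loaded = load_f32_via_integer(memory, 0)
```

Because the bitcast only reinterprets bits, the float observed after the rewrite is the one that was stored.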

This and #199310, which adds the necessary cmpxchg support for
non-integer atomic loads in AtomicExpand, are a response to
https://github.com/llvm/llvm-project/pull/148899 for `atomic_vec4_float`
of `atomic-load-store.ll`.

Stacked above #201303.
Delta File
+25 -7 llvm/lib/Target/X86/X86ISelLowering.cpp
+2 -4 llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
+0 -2 llvm/lib/Target/X86/X86ISelLowering.h
+27 -13, 3 files

LLVM/project 71f69c1 llvm/lib/Transforms/Vectorize LoopVectorize.cpp

[LV] Use ResumeForEpilogue for header phi resume in epilogue plan (NFC) (#203786)

Pass the ResumeForEpilogue VPInstructions created by
preparePlanForMainVectorLoop into preparePlanForEpilogueVectorLoop and
get the resume IR from ResumeForEpilogue::getUnderlyingValue()
Delta File
+15 -9 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+15 -9, 1 files

LLVM/project ad71198 llvm/test/Transforms/LoopVectorize outer_loop_contiguous.ll

[LV] Add outer-loop tests for continuous access analysis. (NFC) (#203789)

Add outer loop tests with different strided accesses.
Delta File
+288 -0 llvm/test/Transforms/LoopVectorize/outer_loop_contiguous.ll
+288 -0, 1 files

LLVM/project f7e53fc offload/liboffload/src OffloadImpl.cpp

[offload] Fix olMemcpy error message typo (#197273)
Delta File
+1 -1 offload/liboffload/src/OffloadImpl.cpp
+1 -1, 1 files

LLVM/project 80b1ddd llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize predicatedinst-loop-invariant.ll

[LV] Drop the mask of a predicated store masked by the header mask. (#201676)

Drop the mask of a predicated store masked by the header mask (which is
guaranteed to be true at least for the first lane) and both the stored
value and the address are uniform across VF and UF.
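
The reasoning can be checked with a small Python model (a sketch, not the VPlan transform): if lane 0 of the header mask is always active and every lane stores the same value to the same address, the masked and unmasked stores leave memory in the same state. The function names and the dictionary-as-memory are assumptions made for illustration.

```python
def header_mask(first_index: int, trip_count: int, vf: int) -> list[bool]:
    """Lane i is active while first_index + i is still inside the loop."""
    return [first_index + lane < trip_count for lane in range(vf)]


def masked_store(memory: dict, addresses: list, values: list, mask: list[bool]) -> None:
    """Store values[i] to addresses[i] for every active lane, in lane order."""
    for addr, val, active in zip(addresses, values, mask):
        if active:
            memory[addr] = val


def stores_match(vf: int, trip_count: int, value: int = 42) -> bool:
    """Compare masked and unmasked uniform stores for every vector iteration."""
    for first_index in range(0, trip_count, vf):
        mask = header_mask(first_index, trip_count, vf)
        # The header mask is true for at least the first lane.
        if not mask[0]:
            return False
        masked, unmasked = {"p": 0}, {"p": 0}
        masked_store(masked, ["p"] * vf, [value] * vf, mask)
        masked_store(unmasked, ["p"] * vf, [value] * vf, [True] * vf)
        if masked != unmasked:
            return False
    return True


same = stores_match(vf=4, trip_count=6)
```

In the last vector iteration of this example only two of four lanes are active, yet the single uniform store still happens, so dropping the mask does not change the result.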

A similar version for loads was included in
https://github.com/llvm/llvm-project/pull/196630, but restricting it to
uniform-across-VFs-and-UFs did not have an impact in practice.

For stores, this results in some improvements after
https://github.com/llvm/llvm-project/pull/196632.

PR: https://github.com/llvm/llvm-project/pull/201676
Delta File
+59 -49 llvm/test/Transforms/LoopVectorize/X86/vectorize-interleaved-accesses-gap.ll
+49 -39 llvm/test/Transforms/LoopVectorize/SystemZ/pr47665.ll
+17 -0 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+0 -16 llvm/test/Transforms/LoopVectorize/predicatedinst-loop-invariant.ll
+125 -104, 4 files

LLVM/project 3df63e9 bolt/include/bolt/Rewrite RewriteInstance.h, bolt/lib/Rewrite RewriteInstance.cpp

[BOLT] Zero alignment padding when reusing old text section (#202375)

With --use-old-text, the output starts as a byte-for-byte copy of the
input. Alignment padding between sections could retain stale data from
the original binary. Zero the padding so the result matches writing
sections to new file offsets.
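
A minimal Python sketch of the idea, assuming a hypothetical list of `(file_offset, size)` section extents; it does not reflect BOLT's actual data structures and only zeroes the gaps between consecutive sections.

```python
def zero_alignment_padding(image: bytes, sections: list[tuple[int, int]]) -> bytes:
    """Return a copy of `image` with the gaps between sections zeroed.

    `sections` holds hypothetical (file_offset, size) pairs. Bytes covered by
    a section are kept; bytes between one section's end and the next
    section's start are treated as alignment padding and cleared.
    """
    out = bytearray(image)
    ordered = sorted(sections)
    for (start, size), (next_start, _) in zip(ordered, ordered[1:]):
        gap_begin = start + size
        if gap_begin < next_start:
            out[gap_begin:next_start] = bytes(next_start - gap_begin)
    return bytes(out)


# The "old" file is full of stale 0xAA bytes; two sections cover [0, 5) and [8, 12).
old_image = bytes([0xAA]) * 16
new_image = zero_alignment_padding(old_image, [(0, 5), (8, 4)])
```

Here bytes 5 through 7 are padding and become zero, while section contents and anything past the last section are left untouched.
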
Delta File
+46 -0 bolt/test/AArch64/use-old-text-zero-padding.c
+44 -0 bolt/lib/Rewrite/RewriteInstance.cpp
+5 -0 bolt/include/bolt/Rewrite/RewriteInstance.h
+95 -0, 3 files

LLVM/project 3e11b3b compiler-rt/test/fuzzer fork_corpus_groups.test fork.test

[Fuzzer] Bump max RSS in fork tests (#203688)

These became flaky on at least one buildbot after enabling the internal
shell by default:
1. https://lab.llvm.org/buildbot/#/builders/174/builds/36874
2. https://lab.llvm.org/buildbot/#/builders/174/builds/36876

Try bumping the max RSS to see if that helps.
Delta File
+1 -1 compiler-rt/test/fuzzer/fork_corpus_groups.test
+1 -1 compiler-rt/test/fuzzer/fork.test
+2 -2, 2 files

LLVM/project a53a62d .github/workflows/containers/github-action-ci-tooling Dockerfile

[Github] Remove unnecessary packages from github-automation container (#203358)

This cuts the container size from 654 MB to 229 MB. This is mainly due
to removing the python3-pip package which was pulling in some big
dependencies like gcc.

A smaller container will be faster to download, which will speed up the
workflow runs. Having fewer packages also means a smaller attack
surface for the container.
Delta File
+7 -5 .github/workflows/containers/github-action-ci-tooling/Dockerfile
+7 -5, 1 files

LLVM/project 6bda63c llvm/test/tools/llvm-cov gap-region-line-coverage.test, llvm/test/tools/llvm-cov/Inputs/gap-region-quirk gap-quirk.yaml gap-quirk.proftext

[llvm-cov] Replace binary test blobs with text formats

Replace .covmapping and .profdata binary blobs with .yaml (obj2yaml)
and .proftext respectively. The test now uses yaml2obj and
llvm-profdata merge to produce inputs at test time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
Delta File
+57 -0 llvm/test/tools/llvm-cov/Inputs/gap-region-quirk/gap-quirk.yaml
+7 -18 llvm/test/tools/llvm-cov/gap-region-line-coverage.test
+16 -0 llvm/test/tools/llvm-cov/Inputs/gap-region-quirk/gap-quirk.proftext
+0 -0 llvm/test/tools/llvm-cov/Inputs/gap-region-quirk/gap-quirk.profdata
+0 -0 llvm/test/tools/llvm-cov/Inputs/gap-region-quirk/gap-quirk.covmapping
+80 -18, 5 files

LLVM/project eda662f flang/docs Intrinsics.md, flang/lib/Evaluate intrinsics.cpp

[flang] Add support for the IARGC and GETARG legacy intrinsics (#196425)

Adds semantic checking and lowering, along with semantic and lowering
tests for the legacy GNU intrinsics 'IARGC()' and 'GETARG(POS, VALUE)'.

Although these could just be added as aliases to the standard
COMMAND_ARGUMENT_COUNT and GET_COMMAND_ARGUMENT intrinsics, they were
implemented as separate intrinsics because of some semantic differences
between them:

* IARGC always returns INTEGER(4), whereas COMMAND_ARGUMENT_COUNT
returns a default INTEGER, which could have a different kind.
* GETARG has only two arguments, both of which are required.
* GETARG's POS argument accepts any integer type of width less than or
equal to the default integer kind, while GET_COMMAND_ARGUMENT only
accepts default integers.

Fixes #158438
Delta File
+55 -0 flang/docs/Intrinsics.md
+54 -0 flang/test/Semantics/test-iargc-getarg.f90
+32 -0 flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+23 -0 flang/test/Lower/Intrinsics/getarg.f90
+11 -0 flang/test/Lower/Intrinsics/iargc.f90
+8 -0 flang/lib/Evaluate/intrinsics.cpp
+183 -0, 1 files not shown
+185 -0, 7 files

LLVM/project f683de4 utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] Add an LLVM ABI-breaking checks build setting (#203739)
Delta File
+23 -5 utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+23 -5, 1 files

LLVM/project f58495a llvm/lib/Target/RISCV RISCVFrameLowering.cpp, llvm/test/CodeGen/RISCV shadowcallstack.ll

[RISCV] Remove manual compression of SSPUSH in RISCVFrameLowering.cpp. NFC (#203635)

We used to emit a Zcmop instruction here, which required manual
compression. Since we now emit a Zicfiss instruction, we can rely on
CompressPat to do the right thing.
Delta File
+814 -394 llvm/test/CodeGen/RISCV/shadowcallstack.ll
+3 -10 llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+817 -404, 2 files

LLVM/project a44231f llvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVInstrInfoP.td

[RISCV] Lower the paadd/pasub intrinsics to existing ISD nodes. NFC (#203646)

This avoids having 2 different isel patterns for the same operation.
Delta File
+40 -2 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+0 -27 llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+40 -29, 2 files

LLVM/project e19d1f5 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 gfni-xor-fold.ll gfni-xor-fold-avx512.ll

[X86] Fold XOR of two VGF2P8AFFINEQB instructions with same matrix (#199146)

Adds an optimization to fold a XOR between two `vgf2p8affineqb`
instructions that share the same matrix by XORing their sources
beforehand. This patch:

- Can eliminate one `vgf2p8affineqb` instruction.
- Doesn't occur if either affine is multi use, preventing an increase in code size.
- Includes test coverage for both positive and negative cases.

Fixes #196879
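
The fold relies on the affine transform being linear over GF(2). The Python sketch below models one byte lane of `vgf2p8affineqb` (result bit i is the parity of matrix row `7 - i` ANDed with the source byte, XORed with bit i of the immediate) and checks the identity on sampled inputs. How the patch itself treats the two immediates is not stated above; XORing them is an assumption that follows from this model, and the helper names are hypothetical.

```python
import random


def parity(value: int) -> int:
    """Return 1 if `value` has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


def affine_byte(x: int, matrix: list[int], imm: int) -> int:
    """Model of one byte lane: out.bit[i] = parity(matrix[7 - i] & x) ^ imm.bit[i]."""
    out = 0
    for i in range(8):
        bit = parity(matrix[7 - i] & x) ^ ((imm >> i) & 1)
        out |= bit << i
    return out


rng = random.Random(0)
matrix = [rng.randrange(256) for _ in range(8)]  # shared 8x8 bit matrix
imm_a, imm_b = 0x3C, 0xA5

# affine(x) ^ affine(y) with the same matrix equals affine(x ^ y),
# with the two immediates XORed together.
fold_holds = all(
    affine_byte(x, matrix, imm_a) ^ affine_byte(y, matrix, imm_b)
    == affine_byte(x ^ y, matrix, imm_a ^ imm_b)
    for x in range(256)
    for y in range(0, 256, 3)
)
```

One transform on the XORed sources therefore replaces two transforms followed by an XOR, which is where the saved instruction comes from.
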
Delta File
+93 -15 llvm/test/CodeGen/X86/gfni-xor-fold.ll
+48 -8 llvm/test/CodeGen/X86/gfni-xor-fold-avx512.ll
+30 -13 llvm/lib/Target/X86/X86ISelLowering.cpp
+171 -36, 3 files

LLVM/project 6d25dce llvm/test/tools/llubi ptrtoaddr.ll ptrtoaddr_after_ptrtoint.ll, llvm/tools/llubi/lib Interpreter.cpp

[llubi] Implement ptrtoaddr (#203771)

This PR implements the `ptrtoaddr` instruction.
Delta File
+25 -0 llvm/test/tools/llubi/ptrtoaddr.ll
+21 -0 llvm/test/tools/llubi/ptrtoaddr_after_ptrtoint.ll
+20 -0 llvm/test/tools/llubi/ptrtoaddr_no_expose.ll
+9 -2 llvm/tools/llubi/lib/Interpreter.cpp
+75 -2, 4 files

LLVM/project 7a225ce clang/test/Sema warn-lifetime-safety.cpp, clang/test/Sema/LifetimeSafety safety.cpp

Rebase

Created using spr 1.3.7
Delta File
+3,204 -3,450 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+1,905 -2,037 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+3,716 -0 clang/test/Sema/LifetimeSafety/safety.cpp
+0 -3,653 clang/test/Sema/warn-lifetime-safety.cpp
+2,760 -227 llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,813 -654 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+13,398 -10,021, 995 files not shown
+71,343 -29,336, 1,001 files

LLVM/project da2e8c4 clang/test/CodeGen ubsan-aggregate-null-align-bounds.c ubsan-aggregate-null-align.c, llvm/test/Transforms/SLPVectorizer/X86 reduction-ordered-fadd.ll

Rebase

Created using spr 1.3.7
Delta File
+448 -3 llvm/tools/llubi/lib/Context.cpp
+34 -204 llvm/tools/llubi/lib/Interpreter.cpp
+170 -0 clang/test/CodeGen/ubsan-aggregate-null-align-bounds.c
+0 -170 clang/test/CodeGen/ubsan-aggregate-null-align.c
+165 -0 llvm/test/Transforms/SLPVectorizer/X86/reduction-ordered-fadd.ll
+165 -0 mlir/test/Dialect/SparseTensor/sparse_loop_ordering.mlir
+982 -377, 44 files not shown
+1,854 -677, 50 files

LLVM/project 86543b7 llvm/test/Transforms/LoopVectorize/AArch64 transform-narrow-interleave-to-widen-memory-constant-ops.ll

[LV] Add additional tests for narrowing IGs with invariant ops (NFC). (#203726)

Add test where operands are different live-ins.
Delta File
+99 -28 llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-constant-ops.ll
+99 -28, 1 files

LLVM/project beadaec llvm/test/tools/llubi global_constexpr_initializer.ll unsupported_constant.ll, llvm/tools/llubi/lib Context.cpp Interpreter.cpp

[llubi] Add support for constant expressions (#203746)

This patch adds support for most kinds of constant expressions, except
for ptrtoint/inttoptr. Casting between pointers and integers is
stateful, so they cannot be cached. I plan to implement them in
subsequent patches. ptrtoaddr is also supported in this patch to block
constant folding.

The logic in `evaluateConstantExpression` duplicates the interpreter's
code in `visit*` methods. But I think it is acceptable. Only the GEP
computation is reused.
Delta File
+346 -1 llvm/tools/llubi/lib/Context.cpp
+4 -200 llvm/tools/llubi/lib/Interpreter.cpp
+106 -2 llvm/test/tools/llubi/global_constexpr_initializer.ll
+36 -0 llvm/tools/llubi/lib/Value.h
+13 -0 llvm/tools/llubi/lib/Context.h
+12 -0 llvm/test/tools/llubi/unsupported_constant.ll
+517 -203, 6 files

LLVM/project 96d3ade flang/lib/Lower/OpenMP OpenMP.cpp, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

[flang][OpenMP] Lower target in_reduction for host fallback

Enable host-fallback lowering for target in_reduction in Flang and MLIR OpenMP translation.

Model target in_reduction through the matching map entry, force address-preserving implicit mapping for Flang in_reduction list items, and emit the host-side task-reduction lookup with __kmpc_task_reduction_get_th_data. Unsupported device/offload-entry and richer reduction forms remain diagnosed.

Add Flang lowering, MLIR verifier/translation, and LLVM IR tests for the supported host-fallback path and the remaining unsupported cases.
Delta File
+128 -14 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+77 -36 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+72 -19 flang/lib/Lower/OpenMP/OpenMP.cpp
+83 -3 mlir/test/Target/LLVMIR/openmp-todo.mlir
+75 -0 mlir/test/Target/LLVMIR/openmp-target-in-reduction-multi.mlir
+60 -0 mlir/test/Dialect/OpenMP/invalid.mlir
+495 -72, 6 files not shown
+615 -89, 12 files

LLVM/project 97bd294 clang/test/Sema warn-lifetime-safety.cpp, clang/test/Sema/LifetimeSafety safety.cpp

Merge branch 'main' into users/arsenm/runtimes/only-forward-macos-targets-darwin
Delta File
+3,204 -3,450 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+1,905 -2,037 llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+3,716 -0 clang/test/Sema/LifetimeSafety/safety.cpp
+0 -3,653 clang/test/Sema/warn-lifetime-safety.cpp
+2,760 -227 llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,813 -654 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+13,398 -10,021, 719 files not shown
+53,421 -25,327, 725 files

LLVM/project 6015c47 libcxx/test/std/containers/container.adaptors/container.adaptors.format format.functions.format.pass.cpp format.functions.vformat.pass.cpp, libcxx/test/std/utilities/format/format.range/format.range.fmtmap format.functions.format.pass.cpp format.functions.vformat.pass.cpp

Remove LLVM-LIBC-FIXME XFAILs from tests that unexpectedly passed
Delta File
+0 -2 libcxx/test/std/containers/container.adaptors/container.adaptors.format/format.functions.format.pass.cpp
+0 -2 libcxx/test/std/containers/container.adaptors/container.adaptors.format/format.functions.vformat.pass.cpp
+0 -2 libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.format.pass.cpp
+0 -2 libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.vformat.pass.cpp
+0 -2 libcxx/test/std/utilities/format/format.range/format.range.fmtset/format.functions.format.pass.cpp
+0 -2 libcxx/test/std/utilities/format/format.range/format.range.fmtset/format.functions.vformat.pass.cpp
+0 -12, 6 files not shown
+0 -24, 12 files

LLVM/project 416a7a8 clang/include/clang/Options Options.td, clang/lib/Driver/ToolChains AMDGPU.cpp AMDGPU.h

clang/AMDGPU: Split out target ID flags in TranslateArgs.

Change how xnack and sramecc are processed. Introduce
-mxnack/-mno-xnack and -msramecc/-mno-sramecc flags.
When the target is first parsed in TranslateArgs, synthesize
the appropriate flag for the toolchain. This avoids
special case feature string fixups in getAMDGPUTargetFeatures,
and also avoids an extra parse of the target ID.

In the future this will also simplify tracking these ABI
modifiers in a module flag.

As a side effect, these flags can be used to override the
no-specifier case. They do not fully replace
the target ID syntax, as there's no way to represent compiling
both modes for the same subtarget.
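
To make the mapping concrete, here is a small Python sketch that turns a target ID such as `gfx90a:xnack+:sramecc-` into the flags named above. It only illustrates the described translation; the function name and error handling are hypothetical and are not taken from the clang driver.

```python
def flags_from_target_id(target_id: str) -> tuple[str, list[str]]:
    """Split a target ID into its processor and synthesized -m flags.

    Each ":feature+" becomes "-mfeature" and each ":feature-" becomes
    "-mno-feature". A feature without a specifier produces no flag.
    """
    processor, *features = target_id.split(":")
    flags = []
    for feature in features:
        name, sign = feature[:-1], feature[-1:]
        if name not in ("xnack", "sramecc") or sign not in ("+", "-"):
            raise ValueError(f"unsupported target ID feature: {feature!r}")
        flags.append(f"-m{name}" if sign == "+" else f"-mno-{name}")
    return processor, flags


processor, flags = flags_from_target_id("gfx90a:xnack+:sramecc-")
plain_processor, plain_flags = flags_from_target_id("gfx90a")
```

With no specifier in the target ID, no flag is synthesized, which is the case an explicit flag on the command line can then override.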

I didn't bother trying to forward these flags on the main command
line without being specified to the offload device, but I suppose

    [3 lines not shown]
Delta File
+149 -0 clang/test/Driver/amdgpu-xnack-sramecc-flags.c
+24 -27 clang/lib/Driver/ToolChains/AMDGPU.cpp
+9 -4 clang/test/Driver/hip-target-id.hip
+6 -4 clang/lib/Driver/ToolChains/AMDGPU.h
+3 -2 clang/lib/Driver/ToolChains/HIPAMD.cpp
+4 -0 clang/include/clang/Options/Options.td
+195 -37, 5 files not shown
+203 -44, 11 files

LLVM/project db210c5 llvm/include/llvm/Analysis ProfileSummaryInfo.h

[NFC] Prefer compile-time branching in function template (#203698)
Delta File
+2 -2 llvm/include/llvm/Analysis/ProfileSummaryInfo.h
+2 -2, 1 files

LLVM/project c8d16e2 mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-taskloop-reduction.mlir openmp-todo.mlir

[mlir][OpenMP] Translate reductions on taskloop

Add LLVM IR translation for reduction and in_reduction clauses on omp.taskloop.context.

For taskloop reduction, emit the implicit taskgroup reduction setup and map each generated task to runtime-provided private reduction storage through __kmpc_task_reduction_get_th_data. For in_reduction, use the same runtime lookup path with a null descriptor to join an enclosing task reduction context.

Unsupported byref, cleanup, and two-argument initializer forms remain diagnosed.

Add MLIR translation tests for the supported taskloop reduction and in_reduction cases.
Delta File
+266 -0 mlir/test/Target/LLVMIR/openmp-taskloop-reduction.mlir
+220 -27 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+92 -10 mlir/test/Target/LLVMIR/openmp-todo.mlir
+578 -37, 3 files

LLVM/project 0a2565d clang/lib/AST/ByteCode InterpBuiltin.cpp

[clang][bytecode] Use `isSignedType()` in `pushInteger` (#203670)

We need to classify here anyway, so use the information from the
primtype.
Delta File
+1 -1 clang/lib/AST/ByteCode/InterpBuiltin.cpp
+1 -1, 1 files

LLVM/project 4331c07 libcxx/include optional

[libc++][NFC] Simplify `optional<T>` and `optional<T&>` a bit (#203665)

- Make `optional<T&>`'s iterator base derive directly from the storage base
instead of inheriting the empty bases, allowing us to remove the
`is_lvalue_reference_v` conditions in the empty bases
- Move the `__is_constructible_for_optional_{meow}` variables closer to
`make_optional` since that's the only place they're really useful for
now
- Change the SFINAE for the iterator availability to use concepts
instead

The above should make it easier to split up in an upcoming patch.
Delta File
+62 -60 libcxx/include/optional
+62 -60, 1 files

LLVM/project 0b17f4f clang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h FactsGenerator.h, clang/lib/Analysis/LifetimeSafety Origins.cpp FactsGenerator.cpp

[LifetimeSafety] Track per-field origins for record types
Delta File
+348 -5 clang/test/Sema/warn-lifetime-safety.cpp
+106 -7 clang/lib/Analysis/LifetimeSafety/Origins.cpp
+69 -37 clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+30 -0 clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+4 -6 clang/test/Sema/warn-lifetime-safety-dangling-field.cpp
+0 -2 clang/include/clang/Analysis/Analyses/LifetimeSafety/FactsGenerator.h
+557 -57, 6 files