LLVM/project 3c33c36libcxx/src atomic.cpp

[libc++] Use public os_sync API instead of private __ulock on newer Apple platforms (#202519)

The atomic wait and wake implementation on Apple platforms currently
relies on `__ulock_wait` and `__ulock_wake`, which are private kernel
APIs. This is a problem for anyone shipping apps through the App Store
since Apple flags private symbol usage during review.

Starting with macOS 14.4 and iOS 17.4, Apple ships public replacements
through `os_sync_wait_on_address` and `os_sync_wake_by_address_any/all`
in `<os/os_sync_wait_on_address.h>`. These cover the same functionality
and are documented, stable, and safe for App Store submissions.

This patch makes use of the public APIs instead of the private ones
whenever the underlying OS permits it.
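
As a rough illustration of "whenever the underlying OS permits it", the sketch below picks a backend from an OS version, using only the minimum versions quoted above (macOS 14.4, iOS 17.4). The enum and function names are hypothetical and do not come from libc++; the patch itself may well use a compile-time or availability-based check instead.

```cpp
// Hypothetical names for illustration; not identifiers from libc++.
enum class ApplePlatform { MacOS, IOS };
enum class WaitBackend { PrivateUlock, PublicOsSync };

// Picks the wait/wake backend from the OS version, using the minimum
// versions quoted in the commit message (macOS 14.4, iOS 17.4).
inline WaitBackend pickBackend(ApplePlatform platform, int major, int minor) {
  const int minMajor = platform == ApplePlatform::MacOS ? 14 : 17;
  const int minMinor = 4;
  const bool newEnough =
      major > minMajor || (major == minMajor && minor >= minMinor);
  return newEnough ? WaitBackend::PublicOsSync : WaitBackend::PrivateUlock;
}
```

Under these assumptions, `pickBackend(ApplePlatform::MacOS, 14, 3)` stays on the private path, while 14.4 and later switch to the public API.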

This takes over #182947.
Fixes #182908
Fixes #146142

Co-authored-by: Bbn08 <atrancendentbeing at gmail.com>
DeltaFile
+87-15libcxx/src/atomic.cpp
+87-151 files

LLVM/project e8a022alibc/include/llvm-libc-macros/linux sys-ioctl-macros.h

[libc] Include linux headers to get ioctl macros (#204555)

Linux has many existing ioctls and keeps adding them, so a
hand-maintained list would always be out of date. Additionally, some
ioctls have architecture specific numbers (some in a very subtle way --
by having the number depend on the size of a structure).

asm/ioctls.h and linux/sockios.h are pretty clean, and are already
included by glibc, so we can just do the same to get the latest
definitions.
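
To see why a number can depend on a structure size, here is a minimal sketch of the asm-generic ioctl encoding, where the argument size is packed into the request number. The helper and struct names are hypothetical, and some architectures use different field widths and direction values, so this is not a substitute for the kernel headers.

```cpp
#include <cstdint>

// Sketch of the asm-generic ioctl number layout:
//   bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
constexpr std::uint32_t kDirRead = 2; // asm-generic value of _IOC_READ

constexpr std::uint32_t encodeIoctl(std::uint32_t dir, std::uint32_t type,
                                    std::uint32_t nr, std::uint32_t size) {
  return (dir << 30) | (size << 16) | (type << 8) | nr;
}

// Hypothetical argument structs whose sizes differ, the way a struct
// holding a long differs between 32-bit and 64-bit targets.
struct Arg32 { std::uint32_t value; };
struct Arg64 { std::uint64_t value; };

// Rough equivalent of _IOR(type, nr, T) under the generic layout.
template <typename T>
constexpr std::uint32_t readIoctl(char type, std::uint32_t nr) {
  return encodeIoctl(kDirRead,
                     static_cast<std::uint32_t>(static_cast<unsigned char>(type)),
                     nr, static_cast<std::uint32_t>(sizeof(T)));
}
```

With this layout, `readIoctl<Arg32>('X', 1)` and `readIoctl<Arg64>('X', 1)` produce different request numbers even though the type and nr fields match.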
DeltaFile
+2-8libc/include/llvm-libc-macros/linux/sys-ioctl-macros.h
+2-81 files

LLVM/project 23abfa0mlir/include/mlir/Dialect/Tosa/IR TosaTypesBase.td, mlir/test/Conversion/TosaToArith tosa-to-arith.mlir

[mlir][tosa] Allow rank-0 vector operands in tosa.apply_scale (#199924)

I was facing a bug that can be reproduced this way:

```mlir
 // RUN:  mlir-opt --transform-interpreter tosa_apply_scale_rank0_repro.mlir
  #map        = affine_map<(d0) -> (d0)>
  #map_scalar = affine_map<(d0) -> ()>

  func.func @repro(%input: tensor<64xi32>, %scalar_t: tensor<i32>,
                   %out_init: tensor<64xi8>) -> tensor<64xi8> {
    %c31_i8     = arith.constant 31 : i8
    %cScale_i32 = arith.constant -1010580540 : i32

    %tile_out = linalg.generic
      { indexing_maps = [#map, #map_scalar, #map],
        iterator_types = ["parallel"] }
      ins(%input, %scalar_t : tensor<64xi32>, tensor<i32>)
      outs(%out_init : tensor<64xi8>) {

```

    [52 lines not shown]
DeltaFile
+10-0mlir/test/Conversion/TosaToArith/tosa-to-arith.mlir
+1-1mlir/include/mlir/Dialect/Tosa/IR/TosaTypesBase.td
+11-12 files

LLVM/project d77f3bfllvm/utils/gn/secondary/clang/tools/clang-ssaf-format BUILD.gn

[gn] port 53dabae40fb3a8514 more (#204578)
DeltaFile
+1-0llvm/utils/gn/secondary/clang/tools/clang-ssaf-format/BUILD.gn
+1-01 files

LLVM/project 1343b64compiler-rt/cmake builtin-config-ix.cmake

[compiler-rt] Fix default builtins target _Float16 detection on x86_64/i386 (#204474)
DeltaFile
+6-0compiler-rt/cmake/builtin-config-ix.cmake
+6-01 files

LLVM/project 7fda520llvm/utils/gn/secondary/llvm/unittests/Transforms/Vectorize BUILD.gn

[gn build] Port 4d812375c174 (#204575)
DeltaFile
+0-1llvm/utils/gn/secondary/llvm/unittests/Transforms/Vectorize/BUILD.gn
+0-11 files

LLVM/project 8b0462fllvm/utils/gn/secondary/clang/lib/Driver BUILD.gn, llvm/utils/gn/secondary/clang/lib/FrontendTool BUILD.gn

[gn] port 53dabae40fb3a8514 (ssaf/SourceTransformation) (#204574)
DeltaFile
+12-0llvm/utils/gn/secondary/clang/lib/ScalableStaticAnalysisFramework/SourceTransformation/BUILD.gn
+4-0llvm/utils/gn/secondary/clang/unittests/ScalableStaticAnalysisFramework/BUILD.gn
+1-0llvm/utils/gn/secondary/clang/tools/clang-ssaf-analyzer/BUILD.gn
+1-0llvm/utils/gn/secondary/clang/lib/Driver/BUILD.gn
+1-0llvm/utils/gn/secondary/clang/lib/FrontendTool/BUILD.gn
+1-0llvm/utils/gn/secondary/clang/tools/clang-ssaf-linker/BUILD.gn
+20-06 files

LLVM/project 5fe9132llvm/utils/gn/secondary/clang/lib/ScalableStaticAnalysisFramework/Core BUILD.gn

[gn build] Port 6e21a04a5a96 (#204576)
DeltaFile
+1-0llvm/utils/gn/secondary/clang/lib/ScalableStaticAnalysisFramework/Core/BUILD.gn
+1-01 files

LLVM/project df06afbllvm/lib/Target/AMDGPU SIWholeQuadMode.cpp, llvm/test/CodeGen/AMDGPU wqm.mir licm-wwm.mir

[AMDGPU] Mark all instructions in WWM region as convergent

Mark instructions between ENTER_STRICT_WWM and EXIT_STRICT_WWM as
convergent, so they don't get moved out of the whole wave mode region
(see the licm-wwm.mir test). This doesn't automagically fix all our
woes, since things can still be moved out of the region before we even
run si-wqm, but there are rumours about moving WWM formation earlier
anyway.

This is not a substitute for proper WWM support - in particular, this
would inhibit most optimizations inside WWM regions with complex control
flow. Right now most WWM is relatively limited in size and complexity,
so I think this is acceptable until we get a more principled solution.

I haven't thought too much about whether or not we need this for WQM as
well.
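
A toy model of the marking step, assuming a flat list of instructions with enter/exit markers. The types below are stand-ins, not LLVM's MachineInstr, and whether the markers themselves get marked is left out here as an unknown.

```cpp
#include <vector>

// Toy stand-ins for machine instructions; not LLVM's MachineInstr.
enum class Op { EnterStrictWWM, ExitStrictWWM, Other };
struct Instr {
  Op op;
  bool convergent;
};

// Marks every instruction strictly between an enter marker and the
// following exit marker as convergent, leaving the rest untouched.
inline void markWWMRegions(std::vector<Instr> &block) {
  bool inRegion = false;
  for (Instr &instr : block) {
    if (instr.op == Op::EnterStrictWWM) {
      inRegion = true;
      continue;
    }
    if (instr.op == Op::ExitStrictWWM) {
      inRegion = false;
      continue;
    }
    if (inRegion)
      instr.convergent = true;
  }
}
```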

Assisted by: Claude Sonnet

commit-id:9204c7e2
DeltaFile
+17-17llvm/test/CodeGen/AMDGPU/wqm.mir
+24-1llvm/test/CodeGen/AMDGPU/licm-wwm.mir
+5-5llvm/test/CodeGen/AMDGPU/wqm-debug-instr.mir
+8-0llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
+2-2llvm/test/CodeGen/AMDGPU/si-init-whole-wave.mir
+56-255 files

LLVM/project 28ac70ellvm/docs AMDGPUExecutionSynchronization.rst

[AMDGPU][doc] Refactor Barrier Execution Model

Remove everything that has to do with named barriers and put it in a series of model extensions specific to /sbarrier/named-barriers.

I had to change a few things to make it fit, in summary:

Base Model:

* Stylistic changes that make it easier to refer to specific rules. Each rule is in a rubric instead of a bullet point.
* (-) No longer defines `barrier-mutually-exclusive`
* (-) No longer defines barrier `join` and any associated rule.

New named barrier extensions
* Define "named barrier" as a sub-type of barrier objects. This makes barrier-mutually-exclusive redundant.
* Define barrier join as an op that can exclusively be done on `named barrier objects`.
* Define rules relating to join and its ordering with other barrier operations

Following these changes, the target tables changed a bit as well.


    [2 lines not shown]
DeltaFile
+200-154llvm/docs/AMDGPUExecutionSynchronization.rst
+200-1541 files

LLVM/project 60416cfopenmp/runtime/src kmp_traits.cpp kmp_traits.h, openmp/runtime/src/i18n en_US.txt

[libomp] Parse OMP_DEFAULT_DEVICE with new device trait parser

... but do not yet expose the new functionality to the user. This is a
backward-compatible update, to be followed by the move to the OpenMP 6.0
semantics as defined in 4.3.8.
DeltaFile
+105-0openmp/runtime/unittests/Traits/TestOMPTraitParser.cpp
+24-0openmp/runtime/src/kmp_traits.cpp
+8-0openmp/runtime/src/kmp_traits.h
+3-2openmp/runtime/src/kmp_settings.cpp
+3-0openmp/runtime/src/i18n/en_US.txt
+143-25 files

LLVM/project 514190copenmp/runtime/src kmp_traits.cpp kmp_traits.h, openmp/runtime/src/i18n en_US.txt

move parse_single_device to other PR
DeltaFile
+0-80openmp/runtime/unittests/Traits/TestOMPTraitParser.cpp
+0-19openmp/runtime/src/kmp_traits.cpp
+0-7openmp/runtime/src/kmp_traits.h
+0-1openmp/runtime/src/i18n/en_US.txt
+0-1074 files

LLVM/project f39ddcbclang/lib/Driver/ToolChains AMDGPU.cpp, llvm/include/llvm/TargetParser AMDGPUTargetParser.def

AMDGPU: Add subtarget feature for controllable xnack modes

This replaces the previously removed xnack-any-only feature,
with the inversion xnack-on-off-modes. All pre-gfx12.5 xnack
targets support the controllable mode. Ignore explicitly
set xnack settings the same way as is done for xnack requests
on other unsupported targets.
DeltaFile
+22-22llvm/include/llvm/TargetParser/AMDGPUTargetParser.def
+13-11llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+22-0llvm/test/CodeGen/AMDGPU/target-id-xnack-always-on.ll
+14-6llvm/lib/Target/AMDGPU/AMDGPU.td
+2-8llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
+4-3clang/lib/Driver/ToolChains/AMDGPU.cpp
+77-504 files not shown
+82-5710 files

LLVM/project 46f3a3ellvm/lib/Target/AMDGPU SIWholeQuadMode.cpp, llvm/test/CodeGen/AMDGPU wqm.mir licm-wwm.mir

[AMDGPU] Mark all instructions in WWM region as convergent

Mark instructions between ENTER_STRICT_WWM and EXIT_STRICT_WWM as
convergent, so they don't get moved out of the whole wave mode region
(see the licm-wwm.mir test). This doesn't automagically fix all our
woes, since things can still be moved out of the region before we even
run si-wqm, but there are rumours about moving WWM formation earlier
anyway.

This is not a substitute for proper WWM support - in particular, this
would inhibit most optimizations inside WWM regions with complex control
flow. Right now most WWM is relatively limited in size and complexity,
so I think this is acceptable until we get a more principled solution.

I haven't thought too much about whether or not we need this for WQM as
well.

Assisted by: Claude Sonnet

commit-id:9204c7e2
DeltaFile
+17-17llvm/test/CodeGen/AMDGPU/wqm.mir
+26-1llvm/test/CodeGen/AMDGPU/licm-wwm.mir
+5-5llvm/test/CodeGen/AMDGPU/wqm-debug-instr.mir
+8-0llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
+2-2llvm/test/CodeGen/AMDGPU/si-init-whole-wave.mir
+58-255 files

LLVM/project 904ce28llvm/include/llvm/CodeGen MachineInstr.h, llvm/test/CodeGen/AMDGPU isel-amdgpu-cs-chain-cc.ll

Repurpose MIFlag::NoConvergent

The NoConvergent MIFlag allows us to mark specific instances of
convergent (as indicated by their MCID) MachineInstrs as not convergent.
Sometimes it's useful to do the opposite as well - mark certain
instances of instructions that are not normally convergent as
convergent (for instance inside WWM regions on AMDGPU).
This patch renames the NoConvergent flag to OverrideConvergence. This
can be set to communicate that if the opcode is usually convergent, then
this particular instance of it isn't, and the other way around. When
changing the opcode of an instruction, we first check if the new opcode
has the same "convergence" as the old one - if it does, then we preserve
the flag, otherwise we clear it since we can get the correct convergence
from the opcode now.
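
The rule above can be sketched with a toy model. The struct and member names are illustrative assumptions, not the actual MachineInstr interface.

```cpp
// Toy model; names are illustrative, not LLVM's MachineInstr API.
struct ToyOpcode {
  bool convergentByDefault;
};

struct ToyInstr {
  ToyOpcode opcode;
  bool overrideConvergence;

  // The flag inverts whatever the opcode says by default.
  bool isConvergent() const {
    return opcode.convergentByDefault != overrideConvergence;
  }

  // Keep the flag only when the new opcode has the same default
  // convergence; otherwise clear it and defer to the new opcode.
  void setOpcode(ToyOpcode newOpcode) {
    if (newOpcode.convergentByDefault != opcode.convergentByDefault)
      overrideConvergence = false;
    opcode = newOpcode;
  }
};
```

In this model a non-convergent opcode with the flag set reads as convergent; switching it to a convergent opcode clears the flag, and the instruction still reads as convergent because the opcode now says so.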

Assisted by: Claude Sonnet

commit-id:93c99000
DeltaFile
+92-92llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call.ll
+50-50llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-return-values.ll
+43-44llvm/include/llvm/CodeGen/MachineInstr.h
+20-20llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-implicit-args.ll
+16-16llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll
+10-10llvm/test/CodeGen/AMDGPU/GlobalISel/dereferenceable-declaration.ll
+231-23224 files not shown
+293-30830 files

LLVM/project 2062fc6llvm/test/CodeGen/AMDGPU/GlobalISel dropped_debug_info_assert.ll

[AMDGPU] Run update script on test. NFC

There's some bogus whitespace in the generated CHECKs that changes when
touching the test.

commit-id:150473b0
DeltaFile
+37-37llvm/test/CodeGen/AMDGPU/GlobalISel/dropped_debug_info_assert.ll
+37-371 files

LLVM/project 6236b9allvm/lib/Target/X86 X86CompressEVEX.cpp, llvm/test/CodeGen/X86 compress-evex-vpmov-kill.mir

[X86] Fix stale kill flag when folding VPMOV*2M + KMOV to VMOVMSK (#204342)

tryCompressVPMOVPattern folds VPMOV*2M + KMOV into a single VMOVMSK in
place: it changes the KMOV's opcode and repoints its source operand from
the mask k-register to the XMM source via MachineOperand::setReg().

setReg() only changes the register number and keeps the operand's other
flags, so the kill flag computed for the mask ("killed $k0") is reused
for the XMM source. When the source is still live this marks it killed,
which the machine verifier reports as a use of an undefined register.

We should instead use the kill flag from the VPMOV's source operand. The
forward scan already guarantees the source is not redefined between the
VPMOV and the KMOV, so the VPMOV's flag is correct at the relocated
read.
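
A toy reproduction of the stale flag, assuming a simplified operand with just a register number and a kill bit. The struct and function names are hypothetical; LLVM's MachineOperand carries more state than this.

```cpp
// Toy operand; LLVM's MachineOperand carries more state than this.
struct ToyOperand {
  unsigned reg;
  bool isKill;

  // Mirrors the behaviour described above: only the register changes.
  void setReg(unsigned newReg) { reg = newReg; }
};

// Buggy fold: the kill flag computed for the mask register is reused.
inline void foldKeepingStaleFlag(ToyOperand &kmovSrc,
                                 const ToyOperand &vpmovSrc) {
  kmovSrc.setReg(vpmovSrc.reg);
}

// Fixed fold: the kill flag is taken from the VPMOV's source operand.
inline void foldWithSourceFlag(ToyOperand &kmovSrc,
                               const ToyOperand &vpmovSrc) {
  kmovSrc.setReg(vpmovSrc.reg);
  kmovSrc.isKill = vpmovSrc.isKill;
}
```

When the VPMOV source is still live (kill bit clear) and the mask operand was marked killed, the first variant leaves the relocated read wrongly marked as a kill, while the second does not.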

Found via @jlebar's X86 LLVM bug-hunt / FuzzX effort:

https://github.com/SemiAnalysisAI/FuzzX/tree/master/x86/bugs/043-compress-evex-vpmov-srcvec-clobber-kmov

cc @jlebar
DeltaFile
+37-0llvm/test/CodeGen/X86/compress-evex-vpmov-kill.mir
+5-1llvm/lib/Target/X86/X86CompressEVEX.cpp
+42-12 files

LLVM/project f6589a1llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp

Adjust comment
DeltaFile
+2-2llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+2-21 files

LLVM/project 61f4618llvm/lib/Target/X86/GISel X86LegalizerInfo.cpp, llvm/test/CodeGen/X86 isel-invoke.ll

[X86][GlobalISel] Explicitly legalize G_INVOKE_REGION_START (#203503)

Removing dependency on the legacy ruleset similarly to #197374

The missing testing coverage was found during LegacyLegalizerInfo
removal in #197308
DeltaFile
+106-0llvm/test/CodeGen/X86/isel-invoke.ll
+1-0llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
+107-02 files

LLVM/project 4397de2llvm/docs LangRef.rst, llvm/include/llvm/IR Instructions.h

[IR] Add elementwise modifier to atomic loads
DeltaFile
+33-0llvm/test/Assembler/invalid-load-store-atomic-elementwise.ll
+18-4llvm/lib/IR/Verifier.cpp
+15-6llvm/docs/LangRef.rst
+14-3llvm/include/llvm/IR/Instructions.h
+14-2llvm/lib/AsmParser/LLParser.cpp
+16-0llvm/test/Bitcode/atomic-load-store-elementwise.ll
+110-156 files not shown
+140-2112 files

LLVM/project d65d7a1clang/lib/Analysis/LifetimeSafety LoanPropagation.cpp, clang/test/Sema/LifetimeSafety dangling-global.cpp dangling-field.cpp

[LifetimeSafety] Count escape facts when classifying persistent origins (#204485)

computePersistentOrigins marks an origin persistent (kept across CFG
blocks) only if it appears in more than one block, but it omitted
OriginEscapesFact. A global is not seeded at function entry, so a global
assigned a stack address on a conditional or loop path had its origin
appear only in the storing block. Misclassified as block-local, its loan
was dropped at the join before the escape check, so the dangling global
(stack-use-after-return) was silently missed. Count the escaped origin
as a cross-block appearance.
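
The kind of code affected might look like the sketch below. This is a hypothetical reproducer shape with made-up names, not a copy of the tests added in dangling-global.cpp.

```cpp
// Hypothetical reproducer shape; the names are made up for illustration.
int *g_ptr = nullptr;

void maybeStore(bool cond) {
  int local = 42;
  if (cond)
    g_ptr = &local; // stack address escapes through a global on one path only
} // if cond was true, g_ptr dangles once the function returns
```

The store happens in only one block, which matches the case the message describes as misclassified as block-local.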

Assisted-by: Claude Opus 4.8

Co-authored-by: Gabor Horvath <gaborh at apple.com>
DeltaFile
+23-1clang/test/Sema/LifetimeSafety/dangling-global.cpp
+19-0clang/test/Sema/LifetimeSafety/dangling-field.cpp
+5-1clang/lib/Analysis/LifetimeSafety/LoanPropagation.cpp
+47-23 files

LLVM/project 12aefe2clang/include/clang/Options Options.td, clang/lib/Driver/ToolChains Flang.cpp

[Flang][Driver] Support for -fsplit-lto-unit option in flang driver (#202858)

When mixing Fortran objects from Flang with C/C++ objects compiled by
Clang during a combined LTO build, it is necessary to ensure that all
files use the same setting for split-lto-unit. This requires the support
for -fsplit-lto-unit option in the flang driver. This support is added
as part of this commit.

Co-authored-by: Shivarama Rao <shivarama.rao at amd.com>
DeltaFile
+23-0flang/test/Driver/split-lto-unit.f90
+23-0flang/test/Integration/split-lto-unit-2.f90
+11-4flang/lib/Frontend/FrontendActions.cpp
+5-0clang/lib/Driver/ToolChains/Flang.cpp
+2-2clang/include/clang/Options/Options.td
+4-0flang/lib/Frontend/CompilerInvocation.cpp
+68-61 files not shown
+69-67 files

LLVM/project f310904clang/lib/Driver/ToolChains AMDGPU.cpp, llvm/include/llvm/TargetParser AMDGPUTargetParser.def

AMDGPU: Add subtarget feature for controllable xnack modes

This replaces the previously removed xnack-any-only feature,
with the inversion xnack-on-off-modes. All pre-gfx12.5 xnack
targets support the controllable mode. Ignore explicitly
set xnack settings the same way as is done for xnack requests
on other unsupported targets.
DeltaFile
+22-22llvm/include/llvm/TargetParser/AMDGPUTargetParser.def
+13-11llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+22-0llvm/test/CodeGen/AMDGPU/target-id-xnack-always-on.ll
+14-6llvm/lib/Target/AMDGPU/AMDGPU.td
+2-8llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.h
+4-3clang/lib/Driver/ToolChains/AMDGPU.cpp
+77-503 files not shown
+81-579 files

LLVM/project 08b557bclang/test/Driver hip-sanitize-options.hip, clang/test/Driver/Inputs/rocm/amdgcn/bitcode oclc_isa_version_12-5-generic.bc

AMDGPU: Remove xnack-any-only subtarget feature and handling (#204514)

This reverts commit f4caa0a172d96597c375e6b6b2192c289723a6b9.

This feature was added to gfx12-5-generic only, which does not make
sense given that both gfx1250 and gfx1251 have the same unconditional
xnack handling. It also does not make sense to diagnose trying to use
a specific xnack mode on the generic target only, and only from the
backend.

The current feature management is a confusing mess, given that we have
2 parallel feature systems. AMDGPUTargetParser has a table containing
a bitmask of features, which already contained FEATURE_XNACK_ALWAYS
for gfx1250/gfx1251, but not gfx12-5-generic. Add this handling there
so the sanitizer detection is consistent on the generic target.

These 2 feature tables probably should be unified in some way. We also
probably should have a subtarget feature for the xnack handling, but it
should be inverted. xnack-any-only is an antifeature, in that it removes
functionality from the base target. It would be better to invert this,
so all of the older targets support configurable xnack modes.
DeltaFile
+0-9llvm/test/CodeGen/AMDGPU/gfx12-5-generic-no-xnack.ll
+5-0clang/test/Driver/hip-sanitize-options.hip
+0-5llvm/lib/Target/AMDGPU/AMDGPU.td
+0-4llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
+1-1llvm/include/llvm/TargetParser/AMDGPUTargetParser.def
+0-0clang/test/Driver/Inputs/rocm/amdgcn/bitcode/oclc_isa_version_12-5-generic.bc
+6-196 files

LLVM/project 567eeecllvm/lib/Target/AMDGPU AMDGPUMCResourceInfo.cpp, llvm/test/CodeGen/AMDGPU indirect-call-agpr-cap.ll indirect-call-vgpr-cap.ll

[AMDGPU] Capping max number of registers to function's occupancy budget for indirect calls (#199765)

Depends on https://github.com/llvm/llvm-project/pull/199746

Changed the SetMaxReg lambda to cap at ST.getMaxNumVectorRegs(F) for
VGPRs and AGPRs and at ST.getMaxNumSGPRs(F) for SGPRs. Both return the
budget for the given function F, which is mainly determined by the
function's occupancy.

This cap is needed because the module's max can overestimate register
usage: it includes functions that were determined earlier in the
compiler to be inaccessible from that indirect call.
ST.getMaxNumVectorRegs(F) and ST.getMaxNumSGPRs(F) are calculated
according to a more accurate call graph.

This fixes an inflated VGPR/AGPR count for kernels that have indirect
function calls. The inflation caused some kernels with indirect calls to
exceed the VGPR/AGPR limit and crash.
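
In the simplest reading, the cap amounts to taking a minimum. The helper below is a hypothetical stand-in for the SetMaxReg lambda, with plain integers in place of the subtarget queries.

```cpp
#include <algorithm>

// Hypothetical helper; the real lambda lives in AMDGPUMCResourceInfo.cpp.
// moduleMax: largest register count over all functions in the module.
// functionBudget: per-function limit, e.g. what
// ST.getMaxNumVectorRegs(F) would report for the calling function.
inline unsigned capIndirectCallRegs(unsigned moduleMax,
                                    unsigned functionBudget) {
  return std::min(moduleMax, functionBudget);
}
```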

    [5 lines not shown]
DeltaFile
+57-0llvm/test/CodeGen/AMDGPU/indirect-call-agpr-cap.ll
+54-0llvm/test/CodeGen/AMDGPU/indirect-call-vgpr-cap.ll
+22-22llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-vgpr-limit.ll
+21-21llvm/test/CodeGen/AMDGPU/function-resource-usage.ll
+35-0llvm/test/CodeGen/AMDGPU/indirect-call-sgpr-cap.ll
+29-4llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
+218-478 files not shown
+270-9314 files

LLVM/project b3c6fbcllvm/include/llvm/Passes CodeGenPassBuilder.h, llvm/include/llvm/Target CGPassBuilderOption.h

[CommandLine] Make cl::boolOrDefault a scoped enum

Prevents implicit conversion to bool/int, where BOU_FALSE wrongly
evaluated as true. All uses qualified as cl::boolOrDefault::BOU_*.
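
A small sketch of the pitfall, using two stand-in enums in separate namespaces. The enumerator order is an assumption; the only property that matters here is that BOU_FALSE has a nonzero value.

```cpp
#include <type_traits>

// Stand-ins for the old and new declarations; enumerator order is assumed.
namespace legacy {
enum boolOrDefault { BOU_UNSET, BOU_TRUE, BOU_FALSE };
}
namespace scoped {
enum class boolOrDefault { BOU_UNSET, BOU_TRUE, BOU_FALSE };
}

// Unscoped: the value converts implicitly, so BOU_FALSE (nonzero here)
// takes the true branch.
inline bool legacyTruthiness(legacy::boolOrDefault v) {
  if (v)
    return true;
  return false;
}

// Scoped: `if (v)` would not compile, so the comparison must be explicit.
inline bool isTrue(scoped::boolOrDefault v) {
  return v == scoped::boolOrDefault::BOU_TRUE;
}

static_assert(!std::is_convertible<scoped::boolOrDefault, bool>::value,
              "scoped enums do not convert to bool implicitly");
```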

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply at anthropic.com>
DeltaFile
+15-13llvm/lib/CodeGen/TargetPassConfig.cpp
+10-8llvm/include/llvm/Passes/CodeGenPassBuilder.h
+10-5llvm/lib/CodeGen/BranchFolding.cpp
+7-7llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+6-6llvm/include/llvm/Target/CGPassBuilderOption.h
+10-1llvm/lib/Support/CommandLine.cpp
+58-4024 files not shown
+129-10130 files

LLVM/project c70f3aellvm/docs LangRef.rst, llvm/include/llvm/IR Instructions.h

[IR] Add elementwise modifier to atomic loads
DeltaFile
+33-0llvm/test/Assembler/invalid-load-store-atomic-elementwise.ll
+18-4llvm/lib/IR/Verifier.cpp
+15-6llvm/docs/LangRef.rst
+14-2llvm/lib/AsmParser/LLParser.cpp
+16-0llvm/test/Bitcode/atomic-load-store-elementwise.ll
+12-1llvm/include/llvm/IR/Instructions.h
+108-136 files not shown
+137-1912 files

LLVM/project b636f43llvm/lib/Target/AMDGPU SIInstrInfo.h SIDefines.h, llvm/lib/Target/AMDGPU/MCTargetDesc AMDGPUInstPrinter.cpp

[NFC][AMDGPU] Introduce SIInstrFlags predicates for TSFlags access (#201512)

Add inline predicate templates to SIDefines.h for each TSFlags bit
(isSALU, isVOP3, isFLAT, isMaybeAtomic, hasFPClamp, ...) plus compound
predicates (isAtomic, isSegmentSpecificFLAT, isImage, isVMEM).

Route all SIInstrInfo.h thin wrapper bodies through the new predicates
instead of reading TSFlags bits directly. 
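
A minimal sketch of the pattern, assuming predicates templated over anything with a TSFlags member. The namespace, bit positions, and wrapper name are hypothetical; the real definitions are in SIDefines.h and SIInstrInfo.h.

```cpp
#include <cstdint>

namespace ToyInstrFlags {
// Hypothetical bit positions; the real assignments live in SIDefines.h.
enum : std::uint64_t {
  SALU = std::uint64_t{1} << 0,
  VOP3 = std::uint64_t{1} << 1,
  FLAT = std::uint64_t{1} << 2,
};

// Each predicate accepts any descriptor type exposing a TSFlags member.
template <typename DescT> constexpr bool isSALU(const DescT &desc) {
  return (desc.TSFlags & SALU) != 0;
}
template <typename DescT> constexpr bool isVOP3(const DescT &desc) {
  return (desc.TSFlags & VOP3) != 0;
}
template <typename DescT> constexpr bool isFLAT(const DescT &desc) {
  return (desc.TSFlags & FLAT) != 0;
}
} // namespace ToyInstrFlags

// Toy descriptor standing in for an instruction description.
struct ToyDesc {
  std::uint64_t TSFlags;
};

// Thin wrapper that routes through the predicate instead of testing
// the bit itself.
inline bool wrapperIsSALU(const ToyDesc &desc) {
  return ToyInstrFlags::isSALU(desc);
}
```

Keeping the bit test in one place means a wrapper never has to know which bit a flag occupies.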
---------
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>
DeltaFile
+122-144llvm/lib/Target/AMDGPU/SIInstrInfo.h
+210-0llvm/lib/Target/AMDGPU/SIDefines.h
+2-3llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+2-2llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+336-1494 files

LLVM/project 2563af1clang/test/Driver aarch64-mcpu.c, clang/test/Driver/print-enabled-extensions aarch64-neoverse-v3ae.c

[AArch64] Add armagicpu CPU (#202557)
DeltaFile
+2-1llvm/unittests/TargetParser/TargetParserTest.cpp
+1-1clang/test/Misc/target-invalid-cpu-note/aarch64.c
+2-0clang/test/Driver/aarch64-mcpu.c
+1-0clang/test/Driver/print-enabled-extensions/aarch64-neoverse-v3ae.c
+1-0llvm/lib/Target/AArch64/AArch64Processors.td
+7-25 files

LLVM/project 4d81237llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanUtils.cpp, llvm/unittests/Transforms/Vectorize VPlanUncountableExitTest.cpp VPlanTestBase.h

[LV] Follow up to uncountable exit with side effects vectorization (#201589)

Addressing post-commit comments on #178454.
DeltaFile
+193-55llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+0-166llvm/unittests/Transforms/Vectorize/VPlanUncountableExitTest.cpp
+0-109llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+0-12llvm/lib/Transforms/Vectorize/VPlanUtils.h
+1-10llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
+0-11llvm/unittests/Transforms/Vectorize/VPlanTestBase.h
+194-3638 files not shown
+212-38214 files