LLVM/project 4aee501llvm/utils/gn/secondary/llvm/include/llvm/Analysis BUILD.gn, llvm/utils/gn/secondary/llvm/lib/Analysis BUILD.gn

[gn] port c9f573463ebd (TargetLibraryInfo.inc)
DeltaFile
+5-0llvm/utils/gn/secondary/llvm/include/llvm/Analysis/BUILD.gn
+3-0llvm/utils/gn/secondary/llvm/lib/Analysis/BUILD.gn
+1-0llvm/utils/gn/secondary/llvm/utils/TableGen/Basic/BUILD.gn
+9-03 files

LLVM/project 34253e8llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp, llvm/test/CodeGen/AMDGPU invariant-load-no-alias-store.ll

AMDGPU: Handle invariant loads when considering if a load can be scalar

Doesn't touch the globalisel version because the handling
there looks a bit broken.
DeltaFile
+14-1llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
+2-1llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+16-22 files

LLVM/project 0e1cb2dllvm/test/CodeGen/AMDGPU load-select-ptr.ll select-vectors.ll, llvm/test/CodeGen/NVPTX i1-select.ll fast-math.ll

Reapply "DAG: Allow select ptr combine for non-0 address spaces" (#168292) (#168786)

This reverts commit 6d5f87fc4284c4c22512778afaf7f2ba9326ba7b.

Previously this failed due to treating the unknown MachineMemOperand
value as known uniform.
DeltaFile
+71-76llvm/test/CodeGen/AMDGPU/load-select-ptr.ll
+43-34llvm/test/CodeGen/NVPTX/i1-select.ll
+36-28llvm/test/CodeGen/NVPTX/fast-math.ll
+34-29llvm/test/CodeGen/AMDGPU/select-vectors.ll
+19-38llvm/test/CodeGen/NVPTX/lower-byval-args.ll
+15-12llvm/test/CodeGen/AMDGPU/select-load-to-load-select-ptr-combine.ll
+218-2176 files not shown
+260-24912 files

LLVM/project 4d92961flang/lib/Semantics check-omp-structure.cpp check-omp-structure.h

Remove invalidState_
DeltaFile
+0-4flang/lib/Semantics/check-omp-structure.cpp
+0-1flang/lib/Semantics/check-omp-structure.h
+0-52 files

LLVM/project 1c46492openmp/runtime/test/api omp_device_uid.f90

push forgotten test
DeltaFile
+65-0openmp/runtime/test/api/omp_device_uid.f90
+65-01 files

LLVM/project 7ca737dflang/test/Lower select-case-statement.f90

[flang] Switch select-case-statement.f90 to new lowering (#168754)

test/Lower/select-case-statement.f90 was still using the old lowering.
Modified the test with FIR generated using the new lowering. Changed the
test to use flang_fc1 instead of bbc and added testing for -O0 and -O1,
since character comparison lowering is done differently at -O0 (uses
runtime function) and -O1 (inlines some cases). Use different FileCheck
prefixes for different optimization levels (CHECK-O0 for -O0, CHECK-O1
for -O1, CHECK for both).
DeltaFile
+152-188flang/test/Lower/select-case-statement.f90
+152-1881 files

LLVM/project 602fa0cllvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp LegalizeIntegerTypes.cpp

[SDAG] Fix whitespace errors (NFC) (#168897)

To make life easier for future contributors. Note that formatting
changes are due to git clang-format on the touched whitespace-error
lines.
DeltaFile
+16-16llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+1-1llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+17-172 files

LLVM/project 6ce4794llvm/test/CodeGen/AMDGPU fp_to_sint.ll fp_to_uint.ll

[AMDGPU] Precommit tests for V_CVT_PK_[IU]16_F32 (#168893)

DeltaFile
+565-0llvm/test/CodeGen/AMDGPU/fp_to_sint.ll
+460-0llvm/test/CodeGen/AMDGPU/fp_to_uint.ll
+22-4llvm/test/CodeGen/AMDGPU/scalar-float-sop1.ll
+1,047-43 files

LLVM/project 1a7ca4cllvm/include/llvm/Analysis TargetTransformInfo.h, llvm/lib/Analysis UniformityAnalysis.cpp TargetTransformInfo.cpp

add target hook getInstructionUniformity
DeltaFile
+12-5llvm/lib/Analysis/UniformityAnalysis.cpp
+8-5llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+11-0llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+10-0llvm/include/llvm/Analysis/TargetTransformInfo.h
+8-0llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
+5-0llvm/lib/Analysis/TargetTransformInfo.cpp
+54-103 files not shown
+62-109 files

LLVM/project 6c79cc7llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 ldexp-avx512.ll fold-int-pow2-with-fmul-or-fdiv.ll

[X86] Lower mathlib call ldexp into scalef when avx512 is enabled (#166839)

Resolves #165694
DeltaFile
+200-1,637llvm/test/CodeGen/X86/ldexp-avx512.ll
+53-105llvm/test/CodeGen/X86/fold-int-pow2-with-fmul-or-fdiv.ll
+73-2llvm/lib/Target/X86/X86ISelLowering.cpp
+326-1,7443 files

LLVM/project b9d9811llvm/lib/Transforms/Utils ProfileVerify.cpp, llvm/test/Transforms/PGOProfile profcheck-exclusions.ll

[profcheck] Exclude `naked`, asm-only functions from profcheck (#168447)

We can't do anything meaningful to such functions: they aren't optimizable, and even if inlined, they would bring no code open to optimization.
DeltaFile
+19-0llvm/lib/Transforms/Utils/ProfileVerify.cpp
+10-0llvm/test/Transforms/PGOProfile/profcheck-exclusions.ll
+29-02 files

LLVM/project 5b8656cclang/lib/CIR/CodeGen CIRGenExpr.cpp, clang/test/CIR/CodeGen vector-ext-element.cpp

[CIR] ExtVectorElementExpr with rvalue base (#168260)

Upstream ExtVectorElementExpr with rvalue base
DeltaFile
+144-0clang/test/CIR/CodeGen/vector-ext-element.cpp
+14-3clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+158-32 files

LLVM/project cd38396llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp, llvm/test/CodeGen/AMDGPU invariant-load-no-alias-store.ll

AMDGPU: Handle invariant loads when considering if a load can be scalar

Doesn't touch the globalisel version because the handling
there looks a bit broken.
DeltaFile
+14-1llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
+2-1llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+16-22 files

LLVM/project e2c7ac8llvm/test/CodeGen/AMDGPU load-select-ptr.ll select-vectors.ll, llvm/test/CodeGen/NVPTX i1-select.ll fast-math.ll

Reapply "DAG: Allow select ptr combine for non-0 address spaces" (#168292)

This reverts commit 6d5f87fc4284c4c22512778afaf7f2ba9326ba7b.

Previously this failed due to treating the unknown MachineMemOperand
value as known uniform.
DeltaFile
+71-76llvm/test/CodeGen/AMDGPU/load-select-ptr.ll
+43-34llvm/test/CodeGen/NVPTX/i1-select.ll
+36-28llvm/test/CodeGen/NVPTX/fast-math.ll
+34-29llvm/test/CodeGen/AMDGPU/select-vectors.ll
+19-38llvm/test/CodeGen/NVPTX/lower-byval-args.ll
+15-12llvm/test/CodeGen/AMDGPU/select-load-to-load-select-ptr-combine.ll
+218-2176 files not shown
+260-24912 files

LLVM/project d3c3c6bllvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp AMDGPUInstrInfo.cpp, llvm/test/CodeGen/AMDGPU load-select-ptr.ll

AMDGPU: Fix treating divergent loads as uniform (#168785)

Avoids regression which caused the revert 6d5f87fc42.

This is a hack on a hack. We currently have isUniformMMO,
which improperly treats unknown source value as known uniform.
This is a hack from before we had divergence information in the
DAG, and should be removed. This is the minimum change to avoid
the regression; removing the aggressive handling of the unknown
case (or dropping isUniformMMO entirely) are more involved fixes.
DeltaFile
+84-0llvm/test/CodeGen/AMDGPU/load-select-ptr.ll
+11-3llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+1-0llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp
+96-33 files

LLVM/project 53b2697llvm/test/CodeGen/RISCV cfi-multiple-locations.mir

[RISCV] Do not write .s file in a test (#168865)

DeltaFile
+2-1llvm/test/CodeGen/RISCV/cfi-multiple-locations.mir
+2-11 files

LLVM/project 67e35bbllvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize/AArch64 partial-reduce-incomplete-chains.ll

[LV] Check full partial reduction chains in order. (#168036)

https://github.com/llvm/llvm-project/pull/162822 added another
validation step to check if entries in a partial reduction chain have
the same scale factor. But the validation was still dependent on the
order of entries in PartialReductionChains, and would fail to reject
some cases (e.g. if the first link matched the scale of the second
link, but the second link is invalidated later).

To fix that, group chains by their starting phi nodes, then perform the
validation for each chain, and if it fails, invalidate the whole chain
for the phi.

Fixes https://github.com/llvm/llvm-project/issues/167243.
Fixes https://github.com/llvm/llvm-project/issues/167867.

PR: https://github.com/llvm/llvm-project/pull/168036
DeltaFile
+113-0llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-incomplete-chains.ll
+36-26llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+149-262 files
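
The grouping step described in the [LV] commit above can be sketched as follows. This is a minimal standalone model, not the LoopVectorize code: `ChainLink`, its fields, and `validChainPhis` are hypothetical names, and the per-link validity flag stands in for whatever checks the real pass performs. The point is only that validation runs per starting phi, so one bad link rejects the whole chain regardless of the order in which links were collected.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one link of a partial reduction chain.
struct ChainLink {
  std::string Phi; // name of the starting phi node this link belongs to
  unsigned Scale;  // scale factor of the link
  bool Valid;      // outcome of per-link checks performed elsewhere
};

// Groups links by starting phi, then validates each group as a whole.
// Returns the phis whose chains survive, in sorted order.
std::vector<std::string> validChainPhis(const std::vector<ChainLink> &Links) {
  std::map<std::string, std::vector<ChainLink>> ByPhi;
  for (const ChainLink &L : Links)
    ByPhi[L.Phi].push_back(L);

  std::vector<std::string> Result;
  for (const auto &Entry : ByPhi) {
    const std::vector<ChainLink> &Chain = Entry.second;
    bool OK = true;
    for (const ChainLink &L : Chain) {
      // A single invalid link or a scale mismatch rejects the whole chain.
      if (!L.Valid || L.Scale != Chain.front().Scale) {
        OK = false;
        break;
      }
    }
    if (OK)
      Result.push_back(Entry.first);
  }
  return Result;
}
```

With links for two phis interleaved in any order, a chain is kept only if every one of its links is valid and all of them share one scale factor.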

LLVM/project b725bdbcompiler-rt/test lit.common.cfg.py

Reapply "[compiler-rt] Default to Lit's Internal Shell (#168232)" (#168760)

This reverts commit eb20b5392599996ce94e4c0392095cacaa33687c.

This relands the compiler-rt internal shell after XRay and Darwin tests
that were failing under the internal shell have been fixed.
DeltaFile
+4-7compiler-rt/test/lit.common.cfg.py
+4-71 files

LLVM/project 21fedcbllvm/include/llvm/Support BranchProbability.h, llvm/lib/Support BranchProbability.cpp

[LoopPeel] Fix BFI when peeling last iteration without guard (#168250)

LoopPeel sometimes proves that, when reached, the original loop always
executes at least two iterations. LoopPeel then unconditionally executes
both the remaining loop's initial iteration and the peeled final
iteration. But that increases the latter's frequency above its frequency
in the original loop. To maintain the total frequency, this patch
compensates by decreasing the remaining loop's latch probability.

This is another step in issue #135812 and was discussed at
<https://github.com/llvm/llvm-project/pull/166858#discussion_r2528968542>.
DeltaFile
+87-0llvm/test/Transforms/LoopUnroll/branch-weights-freq/peel-last-iteration-no-guard.ll
+28-2llvm/lib/Transforms/Utils/LoopPeel.cpp
+3-6llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
+5-0llvm/lib/Support/BranchProbability.cpp
+3-0llvm/include/llvm/Support/BranchProbability.h
+126-85 files
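
One way to see why the [LoopPeel] commit above has to lower the latch probability is a simple geometric model. The sketch below is an illustrative assumption, not the formula from the patch (which works with LLVM's `BranchProbability` type): if the latch branches back with probability `P`, the body runs `1 / (1 - P)` times per loop entry on average, and the remaining loop's probability `Q` is chosen so that its iterations plus the one unconditional peeled iteration add up to the original count. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Expected body executions per loop entry when the latch branches back with
// probability P (geometric model, requires P < 1).
double expectedIterations(double P) { return 1.0 / (1.0 - P); }

// Latch probability Q for the remaining loop such that
//   expectedIterations(Q) + 1 == expectedIterations(P),
// where the "+ 1" is the peeled final iteration that now always executes.
// Solving 1 / (1 - Q) = P / (1 - P) gives Q = 2 - 1 / P.
// Returns 0 when the original estimate is below two iterations (P <= 0.5),
// since no non-negative Q can compensate in that case.
double compensatedLatchProbability(double P) {
  if (P <= 0.5)
    return 0.0;
  return 2.0 - 1.0 / P;
}
```

For example, with `P = 0.9` the original loop body runs about 10 times per entry; under this model the remaining loop gets `Q` of roughly 0.889, which yields about 9 iterations, plus the peeled one.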

LLVM/project 0c085c4llvm/include module.modulemap

Fix build breakage when using modules (#168883)

Commit c9f573463ebd7b4e46da4877802f2364f700e54a removed the file
TargetLibraryInfo.def but did not remove it from the module map.
DeltaFile
+0-1llvm/include/module.modulemap
+0-11 files

LLVM/project b5812c0llvm/lib/Target/LoongArch LoongArchISelLowering.h LoongArchISelLowering.cpp

[LoongArch] TableGen-erate SDNode descriptions (#168129)

This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.

I had to split `VSHUF4I` into two variants (`VSHUF4I` and `VSHUF4I_D`)
since `loongarch_vshuf4i` and `loongarch_vshuf4i_d` have different
numbers of operands, and this prevented the node from being imported.

There is just one node that currently fails validation, see
`LoongArchSelectionDAGInfo::verifyTargetNode()`.

Part of #119709.

Pull Request: https://github.com/llvm/llvm-project/pull/168129
DeltaFile
+0-176llvm/lib/Target/LoongArch/LoongArchISelLowering.h
+4-120llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+37-0llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
+31-0llvm/lib/Target/LoongArch/LoongArchSelectionDAGInfo.h
+29-0llvm/lib/Target/LoongArch/LoongArchSelectionDAGInfo.cpp
+12-1llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+113-2977 files not shown
+141-30313 files

LLVM/project bb0a95dllvm/cmake/modules HandleLLVMOptions.cmake

[CMake] handle the AIX form of the lto cache dir option (#168868)

This handles the AIX form of the thinLTO cache dir option, which gets
turned on when thinLTO is enabled.
DeltaFile
+3-0llvm/cmake/modules/HandleLLVMOptions.cmake
+3-01 files

LLVM/project 891b3cfmlir/docs/Dialects SPIR-V.md, mlir/include/mlir/Dialect/SPIRV/IR SPIRVControlFlowOps.td

[mlir][spirv] Add support for SwitchOp (#168713)

The dialect implementation mostly copies the one of `cf.switch`, but
aligns naming to the SPIR-V spec.
DeltaFile
+215-0mlir/test/Dialect/SPIRV/IR/control-flow-ops.mlir
+106-0mlir/include/mlir/Dialect/SPIRV/IR/SPIRVControlFlowOps.td
+83-0mlir/lib/Dialect/SPIRV/IR/ControlFlowOps.cpp
+77-0mlir/lib/Dialect/SPIRV/IR/SPIRVOpDefinition.cpp
+68-0mlir/test/Target/SPIRV/selection.mlir
+57-1mlir/docs/Dialects/SPIR-V.md
+606-17 files not shown
+669-213 files

LLVM/project 50ec7a3flang/lib/Semantics check-omp-loop.cpp

format
DeltaFile
+2-1flang/lib/Semantics/check-omp-loop.cpp
+2-11 files

LLVM/project 0e54667llvm/include/llvm/CodeGen MachineBasicBlock.h

[CodeGen] Use MCRegister in MachineBasicBlock::liveout_iterator. NFC (#168834)

MachineBasicBlock::liveout_begin() calls this constructor with
MCRegisters so this removes an implicit cast.
DeltaFile
+3-3llvm/include/llvm/CodeGen/MachineBasicBlock.h
+3-31 files

LLVM/project 683de3bllvm/include module.modulemap

Fix build breakage when using modules

Commit c9f573463ebd7b4e46da4877802f2364f700e54a removed the file TargetLibraryInfo.def but did not remove it from the module map.
DeltaFile
+0-1llvm/include/module.modulemap
+0-11 files

LLVM/project 0b6a74cllvm/test/Transforms/VectorCombine/AMDGPU extract-insert-chain-to-shuffles.ll extract-insert-i8.ll

VectorCombine/AMDGPU: Cleanup a test and add a new one (#168817)

The existing, recently added test contains a whole lot of noise in the
form of dead instructions. Also, prefer named values.

The new test isolates a separate issue with concatenating i8 vectors.
DeltaFile
+47-531llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-chain-to-shuffles.ll
+186-0llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
+233-5312 files

LLVM/project 07e8932llvm/test/Analysis/CostModel/AMDGPU shufflevector.ll

AMDGPU: Expand cost model shufflevector test (#168816)

Add a few corner cases of the "simplified" shuffle kinds.
DeltaFile
+50-0llvm/test/Analysis/CostModel/AMDGPU/shufflevector.ll
+50-01 files

LLVM/project 72c046fflang/include/flang/Parser parse-tree.h, flang/lib/Parser openmp-parsers.cpp unparse.cpp

[flang][OpenMP] Better diagnostics for invalid or misplaced directives

Add two more AST nodes, one for a misplaced end-directive, and one for
an invalid string following the OpenMP sentinel (e.g. "!$OMP XYZ").

Emit error messages when either node is encountered in semantic analysis.
DeltaFile
+32-19flang/lib/Parser/openmp-parsers.cpp
+23-0flang/lib/Semantics/check-omp-structure.cpp
+19-1flang/include/flang/Parser/parse-tree.h
+14-0flang/test/Semantics/OpenMP/loop-association.f90
+10-0flang/lib/Parser/unparse.cpp
+6-0flang/lib/Semantics/check-omp-structure.h
+104-207 files not shown
+119-2413 files

LLVM/project aaea8e6flang/lib/Parser openmp-parsers.cpp parse-tree.cpp, flang/lib/Semantics canonicalize-omp.cpp check-omp-loop.cpp

[flang][OpenMP] Implement loop nest parser

Previously, loop constructs were parsed in a piece-wise manner: the
begin directive, the body, and the end directive were parsed separately.
Later on in canonicalization they were all coalesced into a loop
construct. To facilitate that, end-loop directives were given special
treatment, namely they were parsed as OpenMP constructs. As a result,
syntax errors caused by misplaced end-loop directives were handled
differently from those caused by misplaced non-loop end directives.

The new loop nest parser constructs the complete loop construct,
removing the need for the canonicalization step. Additionally, it is
the basis for parsing loop-sequence-associated constructs in the future.

It also removes the need for the special treatment of end-loop
directives. While this patch temporarily degrades the error messaging
for misplaced end-loop directives, it enables uniform handling of any
misplaced end-directives in the future.
DeltaFile
+0-163flang/lib/Semantics/canonicalize-omp.cpp
+135-8flang/lib/Parser/openmp-parsers.cpp
+74-0flang/lib/Semantics/check-omp-loop.cpp
+22-18flang/test/Semantics/OpenMP/loop-transformation-construct01.f90
+27-8flang/lib/Parser/parse-tree.cpp
+13-20flang/test/Semantics/OpenMP/loop-association.f90
+271-21710 files not shown
+293-24316 files
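
The difference between piece-wise parsing and the loop nest parser described in the commit above can be illustrated with a toy model. This is a sketch under heavy simplification and does not reflect flang's parser combinators or parse tree: `Tok`, `LoopConstruct`, `parseLoopConstruct`, and `parseAll` are hypothetical names. The begin directive, the loop, and the optional end directive are consumed in one step, so no later coalescing pass is needed, and an end directive that does not close a construct is simply left over for generic handling.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical token kinds; the real input is Fortran source, not tokens.
enum class Tok { BeginLoopDir, DoLoop, EndLoopDir, Other };

struct LoopConstruct {
  std::size_t Begin;              // index of the begin directive
  std::size_t Loop;               // index of the associated DO loop
  std::optional<std::size_t> End; // index of the optional end directive
};

// Tries to parse one complete loop construct starting at Pos.
// On success, advances Pos past the construct; otherwise leaves Pos alone.
std::optional<LoopConstruct>
parseLoopConstruct(const std::vector<Tok> &Toks, std::size_t &Pos) {
  if (Pos + 1 >= Toks.size() || Toks[Pos] != Tok::BeginLoopDir ||
      Toks[Pos + 1] != Tok::DoLoop)
    return std::nullopt;
  LoopConstruct C{Pos, Pos + 1, std::nullopt};
  Pos += 2;
  if (Pos < Toks.size() && Toks[Pos] == Tok::EndLoopDir)
    C.End = Pos++;
  return C;
}

struct ParseResult {
  std::vector<LoopConstruct> Constructs;
  std::vector<std::size_t> StrayEnds; // end directives closing nothing
};

// Parses the whole token stream, recording complete constructs and any
// end directive that was not consumed as part of one.
ParseResult parseAll(const std::vector<Tok> &Toks) {
  ParseResult R;
  std::size_t Pos = 0;
  while (Pos < Toks.size()) {
    if (std::optional<LoopConstruct> C = parseLoopConstruct(Toks, Pos)) {
      R.Constructs.push_back(*C);
      continue;
    }
    if (Toks[Pos] == Tok::EndLoopDir)
      R.StrayEnds.push_back(Pos);
    ++Pos;
  }
  return R;
}
```

Because the end directive is consumed as part of the construct, the caller never sees it as a separate construct; only unmatched ones surface in `StrayEnds`.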