LLVM/project 739e997 libc/src/__support/GPU allocator.cpp

[libc] Remove ballot on slab find (#176606)

Summary:
This negatively impacts performance, while the other changes in the
initial PR slightly improved it. This was originally done to make Volta
independent thread scheduling work, but that doesn't seem to work
correctly all the time either, so we should make this faster.
Delta  File
+9 -8  libc/src/__support/GPU/allocator.cpp
+9 -8  1 file

LLVM/project 49cd842 mlir/utils/vscode package-lock.json

[mlir][vscode] Remove resolved from lock file (#176611)

Delta         File
+457 -4,789   mlir/utils/vscode/package-lock.json
+457 -4,789   1 file

LLVM/project 032eb06 clang/test/CodeGenOpenCL amdgpu-features-illegal.cl, clang/test/SemaOpenCL builtins-amdgcn-error-wave32.cl builtins-amdgcn-wave32-func-attr.cl

[Clang][AMDGPU] Handle `wavefrontsize32` and `wavefrontsize64` features more robustly

We should also not allow `-wavefrontsize32` and `-wavefrontsize64` to be specified at the same time.
Delta    File
+30 -9   llvm/lib/TargetParser/TargetParser.cpp
+14 -7   clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
+6 -2    flang/test/Driver/target-cpu-features-invalid.f90
+2 -2    clang/test/SemaOpenCL/builtins-amdgcn-error-wave32.cl
+2 -2    clang/test/SemaOpenCL/builtins-amdgcn-wave32-func-attr.cl
+54 -22  5 files
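
The mutual-exclusion rule in this commit message can be illustrated with a small sketch. This is a toy model in Python, not the logic in TargetParser.cpp: the function name `check_wavefront_features`, the `+name`/`-name` string encoding, the last-entry-wins handling, and the error strings are all assumptions made for illustration. It assumes that enabling both features is rejected as well, which the word "also" in the message suggests.

```python
def check_wavefront_features(features):
    """Return an error string for a conflicting combination, else None.

    `features` is a list of strings such as "+wavefrontsize32" or
    "-wavefrontsize64" (hypothetical encoding; later entries win).
    """
    state = {}
    for feat in features:
        sign, name = feat[:1], feat[1:]
        if sign not in ("+", "-"):
            raise ValueError(f"feature must start with '+' or '-': {feat!r}")
        state[name] = (sign == "+")

    w32 = state.get("wavefrontsize32")  # True, False, or None if unspecified
    w64 = state.get("wavefrontsize64")
    if w32 is True and w64 is True:
        return "both wavefront sizes enabled"
    if w32 is False and w64 is False:
        return "both wavefront sizes disabled"
    return None


print(check_wavefront_features(["-wavefrontsize32", "-wavefrontsize64"]))
print(check_wavefront_features(["+wavefrontsize32", "-wavefrontsize64"]))
```

Leaving one feature unspecified is treated as valid here, since the message only rules out specifying both at the same time.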

LLVM/project d3dad64 llvm/utils/gn/secondary/clang/lib/Analysis BUILD.gn

[gn build] Port 17ff9b3c67ab
Delta  File
+0 -1  llvm/utils/gn/secondary/clang/lib/Analysis/BUILD.gn
+0 -1  1 file

LLVM/project b8e3276 mlir/utils/vscode package-lock.json

[mlir][vscode] Update lock file to match (#176608)

Delta         File
+5,759 -343   mlir/utils/vscode/package-lock.json
+5,759 -343   1 file

LLVM/project 676b292 mlir/utils/vscode package.json

[mlir][vscode] Update engine (#176605)

Delta  File
+1 -1  mlir/utils/vscode/package.json
+1 -1  1 file

LLVM/project 462673a mlir/utils/vscode package.json

[mlir][vscode] Update dev dependencies (#176604)

Update to match vscode-clangd.
Delta  File
+4 -4  mlir/utils/vscode/package.json
+4 -4  1 file

LLVM/project 17ff9b3 clang-tools-extra/clang-tidy/bugprone UnsafeFunctionsCheck.cpp, clang/include/clang/Analysis AnnexKDetection.h

Revert "[clang][analyzer] Add ReportInC99AndEarlier option to DeprecatedOrUnsafeBuf…" (#176603)

Reverts llvm/llvm-project#168704
Checking what causes the clang-bolt buildbot failure.
Delta     File
+12 -76   clang/lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp
+0 -43    clang/lib/Analysis/AnnexKDetection.cpp
+0 -40    clang/include/clang/Analysis/AnnexKDetection.h
+0 -40    clang/test/Analysis/security-deprecated-buffer-handling-report-modes.c
+6 -19    clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+20 -2    clang-tools-extra/clang-tidy/bugprone/UnsafeFunctionsCheck.cpp
+38 -220  4 files not shown
+38 -236  10 files

LLVM/project d5e14af mlir/utils/vscode language-configuration.json package.json

[mlir][vscode] Add angle bracket support to MLIR language configuration (#176602)

Add angle brackets (<>) to brackets, autoClosingPairs, and
surroundingPairs for better editing of types like tensor<3xf32>. Also
add colorizedBracketPairs for visual distinction between nested bracket
types.
Delta   File
+10 -1  mlir/utils/vscode/language-configuration.json
+1 -1   mlir/utils/vscode/package.json
+11 -2  2 files

LLVM/project f64b69b mlir/utils/vscode pdll-grammar.json

[mlir][vscode] Fix PDLL grammar character class regex (#176601)

The character class [aA-zZ_0-9] incorrectly matches characters between
ASCII 90-97 (Z-a range), which includes: [ \ ] ^ _ `. This should be
[a-zA-Z_0-9] for proper identifier matching.
Delta  File
+8 -8  mlir/utils/vscode/pdll-grammar.json
+8 -8  1 file
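
The character-class problem described in this commit can be reproduced outside the grammar file. The sketch below uses Python's `re` module as a stand-in for the regex engine that actually evaluates the grammar (an assumption; ASCII ranges inside a character class behave the same way here). It checks the six ASCII characters that sit between `Z` (90) and `a` (97).

```python
import re

# The two character classes quoted in the commit message.
BUGGY = re.compile(r"[aA-zZ_0-9]")   # contains the range A-z (ASCII 65-122)
FIXED = re.compile(r"[a-zA-Z_0-9]")  # letters, digits, and underscore only

# The six ASCII characters strictly between 'Z' (90) and 'a' (97).
between = [chr(code) for code in range(91, 97)]

buggy_hits = [ch for ch in between if BUGGY.fullmatch(ch)]
fixed_hits = [ch for ch in between if FIXED.fullmatch(ch)]

print("buggy class matches:", buggy_hits)
print("fixed class matches:", fixed_hits)
```

The buggy class accepts all six characters because `A-z` is parsed as one range, while the fixed class accepts only the underscore, which is listed explicitly.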

LLVM/project 727f8f9 clang/test/CodeGenOpenCL amdgpu-features-illegal.cl, clang/test/SemaOpenCL builtins-amdgcn-error-wave32.cl

[Clang][AMDGPU] Handle `wavefrontsize32` and `wavefrontsize64` features more robustly

We should also not allow `-wavefrontsize32` and `-wavefrontsize64` to be specified at the same time.
Delta    File
+18 -7   llvm/lib/TargetParser/TargetParser.cpp
+8 -5    clang/test/CodeGenOpenCL/amdgpu-features-illegal.cl
+1 -1    clang/test/SemaOpenCL/builtins-amdgcn-error-wave32.cl
+27 -13  3 files

LLVM/project a93734b llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel regbankselect-amdgpu-smed3.mir regbankselect-amdgpu-cvt-pk-i16-i32.mir

[AMDGPU][GlobalISel] Add RegBankLegalize rules for SMED3 and CVT_PK_I16_I32

These opcodes are created together for the i64->i16 signed clamp pattern.
Delta   File
+46 -0  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-smed3.mir
+41 -0  llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgpu-cvt-pk-i16-i32.mir
+8 -0   llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+4 -4   llvm/test/CodeGen/AMDGPU/GlobalISel/combine-short-clamp.ll
+99 -4  4 files

LLVM/project f316a39 llvm/utils/gn/secondary/clang/lib/Analysis BUILD.gn

[gn build] Port 694c4d6539cc
Delta  File
+1 -0  llvm/utils/gn/secondary/clang/lib/Analysis/BUILD.gn
+1 -0  1 file

LLVM/project 123acb2 llvm/test/Transforms/LoopVectorize/AArch64 vector-reverse.ll select-costs.ll, llvm/test/Transforms/LoopVectorize/RISCV predicated-reverse-store.ll

[LV] Add missing coverage for LV cost model code paths.

Add a set of tests that expose crashes with some upcoming and pending
patches.
Delta     File
+210 -24  llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse.ll
+66 -0    llvm/test/Transforms/LoopVectorize/AArch64/select-costs.ll
+60 -0    llvm/test/Transforms/LoopVectorize/RISCV/predicated-reverse-store.ll
+336 -24  3 files

LLVM/project 811fb22 llvm/lib/Target/WebAssembly WebAssemblyTargetTransformInfo.cpp WebAssemblyTargetTransformInfo.h

[WebAssembly] Mark extract.last.active as having invalid cost.

Currently the WebAssembly backend crashes when trying to lower some
extract.last.active intrinsic calls. Mark their cost as invalid
temporarily, to avoid them being introduced by the loop
vectorizer after 2abd6d6d7ac (#158088).
Delta   File
+13 -0  llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.cpp
+5 -0   llvm/lib/Target/WebAssembly/WebAssemblyTargetTransformInfo.h
+18 -0  2 files

LLVM/project 1c6d603 mlir/utils/vscode package.json

[mlir][vscode] Add actions for publishing to openvsx. (#176593)

Delta  File
+2 -0  mlir/utils/vscode/package.json
+2 -0  1 file

LLVM/project ec080ab .ci monolithic-windows.sh monolithic-linux.sh

[CI] Disable precompiled headers in pre-commit CI (#176563)

Spliced out from #176420 to make sure that CI is fine without PCH, which
is currently used by Flang.
Delta  File
+1 -0  .ci/monolithic-windows.sh
+1 -0  .ci/monolithic-linux.sh
+2 -0  2 files

LLVM/project f853880 .ci/buildbot worker.py, offload/ci openmp-offload-amdgpu-clang-flang.py openmp-offload-amdgpu-runtime.py

[Offload][CI] Convert openmp-offload-amdgpu staging bots to ScriptedBuilder (#174991)

Convert the first AMDGPU buildbots to use the ScriptedBuilder introduced
in llvm-zorg. For the motivation, see
https://github.com/llvm/llvm-zorg/pull/648.

Since the production buildbot still needs to be restarted for
ScriptedBuilder to work, only convert the builders that are currently in
staging for now. These are:

 * openmp-offload-amdgpu-runtime
 * openmp-offload-amdgpu-clang-flang

Both of them happen to be OpenMPBuilder.getOpenMPCMakeBuildFactory-based
builders before this change. They also set an environment variable that
the previous ScriptedBuilder did not, so we are adding support.

The corresponding llvm-zorg change is
https://github.com/llvm/llvm-zorg/pull/697.
Delta    File
+71 -0   offload/ci/openmp-offload-amdgpu-clang-flang.py
+60 -0   offload/ci/openmp-offload-amdgpu-runtime.py
+11 -3   .ci/buildbot/worker.py
+1 -0    offload/ci/.gitignore
+143 -3  4 files

LLVM/project 2070646 llvm/test/CodeGen/X86 llc-pipeline-npm.ll

[X86][NewPM] Use -NEXT FileCheck suffix for NPM test (#176592)

Otherwise when someone adds a pass to the pipeline the test still
passes.
Delta      File
+330 -324  llvm/test/CodeGen/X86/llc-pipeline-npm.ll
+330 -324  1 file
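
Why the `-NEXT` suffix matters can be shown with a toy model of the two directive kinds. This is a deliberately simplified sketch, not FileCheck itself: `check_passes` is a hypothetical helper, patterns are plain substrings, each directive consumes a whole line, and the output lines are made up for illustration.

```python
def check_passes(output_lines, directives):
    """Toy model of CHECK / CHECK-NEXT matching (substring match only)."""
    pos = -1  # index of the most recently matched output line
    for kind, pattern in directives:
        if kind == "CHECK":
            # A plain CHECK may match on any later line.
            for i in range(pos + 1, len(output_lines)):
                if pattern in output_lines[i]:
                    pos = i
                    break
            else:
                return False
        elif kind == "CHECK-NEXT":
            # CHECK-NEXT must match the line directly after the previous match.
            nxt = pos + 1
            if nxt >= len(output_lines) or pattern not in output_lines[nxt]:
                return False
            pos = nxt
        else:
            raise ValueError(f"unknown directive: {kind}")
    return True


old_output = ["Running pass: a", "Running pass: b"]
new_output = ["Running pass: a", "Running pass: inserted", "Running pass: b"]

loose = [("CHECK", "pass: a"), ("CHECK", "pass: b")]
strict = [("CHECK", "pass: a"), ("CHECK-NEXT", "pass: b")]

print(check_passes(new_output, loose))   # extra pass goes unnoticed
print(check_passes(new_output, strict))  # extra pass is detected
```

With plain `CHECK` lines the test keeps passing after a pass is added to the pipeline, which is the situation the commit message describes; the `-NEXT` form fails as soon as an unexpected line appears between two expected ones.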

LLVM/project 694c4d6 clang-tools-extra/clang-tidy/bugprone UnsafeFunctionsCheck.cpp, clang/include/clang/Analysis AnnexKDetection.h

[clang][analyzer] Add ReportInC99AndEarlier option to DeprecatedOrUnsafeBuf… (#168704)

…ferHandling checker

The checker may report warnings for deprecated buffer handling functions
(memcpy, memset, memmove, etc.) even when not compiling with C11
standard if the new option "ReportInC99AndEarlier" is set to true.

These functions are deprecated in C11, but may still be problematic in
earlier C standards.
Delta     File
+76 -12   clang/lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp
+43 -0    clang/lib/Analysis/AnnexKDetection.cpp
+40 -0    clang/test/Analysis/security-deprecated-buffer-handling-report-modes.c
+40 -0    clang/include/clang/Analysis/AnnexKDetection.h
+19 -6    clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+2 -20    clang-tools-extra/clang-tidy/bugprone/UnsafeFunctionsCheck.cpp
+220 -38  4 files not shown
+236 -38  10 files

LLVM/project ea3a965 llvm/lib/Target/X86 X86LoadValueInjectionRetHardening.cpp X86ReturnThunks.cpp

[X86][NewPM] Cleanup some minor issues in recently ported passes

* Ensure passes implemented as single functions are marked as static to
  enforce internal linkage.
* Avoid the use of temporary variables to hold pass output status that
  only have one user/do not change any ordering guarantees.
Delta    File
+5 -5    llvm/lib/Target/X86/X86LoadValueInjectionRetHardening.cpp
+4 -5    llvm/lib/Target/X86/X86ReturnThunks.cpp
+2 -1    llvm/lib/Target/X86/X86SpeculativeExecutionSideEffectSuppression.cpp
+11 -11  3 files

LLVM/project 1c4e03a llvm/lib/Target/X86 X86ReturnThunks.cpp X86.h, llvm/test/CodeGen/X86 attr-function-return.mir

[NewPM] port x86-return-thunks to new pass manager (#176226)

Test: https://gist.github.com/nigham/f430f76b06478449e0d228c88f9db8ff
Delta    File
+25 -9   llvm/lib/Target/X86/X86ReturnThunks.cpp
+8 -2    llvm/lib/Target/X86/X86.h
+2 -2    llvm/lib/Target/X86/X86TargetMachine.cpp
+1 -2    llvm/lib/Target/X86/X86CodeGenPassBuilder.cpp
+2 -0    llvm/test/CodeGen/X86/attr-function-return.mir
+1 -1    llvm/lib/Target/X86/X86PassRegistry.def
+39 -16  6 files

LLVM/project 284ef1b llvm/include/llvm/Support GenericDomTree.h, llvm/lib/CodeGen MachineLICM.cpp

[Support][NFCI] Store DomTree children as linked list (#176409)

Reduce the size of a DomTreeNodeBase from 80 to 56 bytes by not storing
the children in a SmallVector. Instead, store children as a forward-linked
list. This also avoids extra allocations for nodes with many children.
Additionally, DomTreeNodeBase is now trivially destructible.

A lot of code depends on the order of nodes in the dominator tree, so
make sure that the order is the same when inserting nodes. (Not having
to do this would save 8 bytes per node.)

NewGVN uses the order of nodes in the dominator tree in a way that is
not entirely clear to me (https://reviews.llvm.org/D28129). I kept the
semantics as is, but now this is the only external user of
addChild/removeChild, which actually should be private.

https://llvm-compile-time-tracker.com/compare.php?from=263802c56b4db3fc9b6ed9fd313499cb03ca44da&to=43e0c0c5b663b3a4067252fc0addbaccefd0014d&stat=instructions:u
Delta     File
+60 -34   llvm/include/llvm/Support/GenericDomTree.h
+14 -13   llvm/lib/CodeGen/MachineLICM.cpp
+14 -4    llvm/lib/Transforms/Scalar/NewGVN.cpp
+3 -3     llvm/test/Transforms/JumpThreading/domtree-updates.ll
+2 -4     llvm/lib/Transforms/Utils/LoopSimplify.cpp
+2 -2     llvm/lib/Target/SystemZ/SystemZLDCleanup.cpp
+95 -60   4 files not shown
+102 -67  10 files
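
The data-structure change in this commit can be pictured with a minimal sketch of a tree node that links its children through sibling pointers instead of holding them in a container. This is an illustration in Python, not the layout in GenericDomTree.h: the class `TreeNode` and its field names are hypothetical, and using a tail pointer to keep insertion order is only one plausible reading of the "8 bytes per node" remark.

```python
class TreeNode:
    """Toy tree node storing its children as a singly linked sibling list."""

    def __init__(self, name):
        self.name = name
        self.first_child = None   # head of this node's child list
        self.last_child = None    # tail pointer: ordered O(1) append (assumption)
        self.next_sibling = None  # forward link to the parent's next child

    def add_child(self, child):
        """Append `child` so that iteration order equals insertion order."""
        if self.last_child is None:
            self.first_child = child
        else:
            self.last_child.next_sibling = child
        self.last_child = child
        return child

    def children(self):
        """Yield children by walking the sibling links."""
        node = self.first_child
        while node is not None:
            yield node
            node = node.next_sibling


root = TreeNode("entry")
for label in ("a", "b", "c"):
    root.add_child(TreeNode(label))

child_names = [child.name for child in root.children()]
print(child_names)
```

Each node carries a fixed number of links regardless of how many children it has, so a node with many children needs no separately allocated child array.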

LLVM/project ae425ab mlir/lib/Dialect/Utils VerificationUtils.cpp

[mlir][nfc] Fix function definition names post #175880 (#176586)

Ensure that the input argument names for `verifyRanksMatch` in the
function definition match those in the declaration.
Delta  File
+7 -7  mlir/lib/Dialect/Utils/VerificationUtils.cpp
+7 -7  1 file

LLVM/project 585efb4 mlir/include/mlir/Dialect/Utils VerificationUtils.h, mlir/lib/Dialect/Bufferization/IR BufferizationOps.cpp

[mlir][Utils] Add verifyRanksMatch helper (NFC) (#175880)

This change builds on https://github.com/llvm/llvm-project/pull/174336,
which introduced shared VerificationUtils with an initial
verifyDynamicDimensionCount() method.

This patch adds a new verifyRanksMatch() verification utility that
checks if two shaped types have matching ranks and emits consistent
error messages. The utility is applied to several ops across multiple
MLIR dialects.

---------

Co-authored-by: Andrzej Warzyński <andrzej.warzynski at gmail.com>
Delta    File
+30 -32  mlir/lib/Dialect/Tosa/IR/TosaOps.cpp
+16 -0   mlir/lib/Dialect/Utils/VerificationUtils.cpp
+4 -3    mlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
+3 -3    mlir/test/Dialect/Tosa/invalid.mlir
+3 -2    mlir/lib/Dialect/Bufferization/IR/BufferizationOps.cpp
+5 -0    mlir/include/mlir/Dialect/Utils/VerificationUtils.h
+61 -40  2 files not shown
+63 -42  8 files

LLVM/project 6f26b1c utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] Fix build failure caused by tools/llvm-ir2vec/*.h not found (#176573)

Closes https://github.com/llvm/llvm-project/issues/176572.

cc @nishant-sachdeva
Delta  File
+1 -1  utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+1 -1  1 file

LLVM/project 5995fe9 llvm/lib/Transforms/Vectorize VPlanConstruction.cpp, llvm/test/Transforms/LoopVectorize find-last.ll iv-select-cmp-trunc.ll

[VPlan] Normalize selects to always select the data op when cond is true.

Fix a miscompile in the FindLast handling by normalizing selects
with the phi node as the first op to ones that select the data value
when the condition is true, by swapping operands and inverting the
condition.

This should ensure correct codegen for both cases.

Select normalization:
https://alive2.llvm.org/ce/z/yFdivK

Fixes a miscompile reported for 2abd6d6d7ac (#158088).
Delta    File
+8 -8    llvm/test/Transforms/LoopVectorize/find-last.ll
+6 -6    llvm/test/Transforms/LoopVectorize/iv-select-cmp-trunc.ll
+9 -0    llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+4 -4    llvm/test/Transforms/LoopVectorize/select-cmp.ll
+27 -18  4 files
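
The select normalization described in this commit rests on a simple identity: selecting `phi` when the condition is true is the same as selecting the data value when the inverted condition is true. The sketch below checks that identity exhaustively on a scalar model. It is an illustration only; `select` and `normalize` are hypothetical helpers, not code from VPlanConstruction.cpp.

```python
def select(cond, true_val, false_val):
    """Scalar model of an IR select: pick true_val when cond holds."""
    return true_val if cond else false_val


def normalize(cond, true_val, false_val, phi):
    """If the phi is the true operand, swap the operands and invert cond."""
    if true_val is phi:
        return (not cond, false_val, true_val)
    return (cond, true_val, false_val)


PHI, DATA = object(), object()  # opaque stand-ins for the two operands

for cond in (False, True):
    for operands in ((PHI, DATA), (DATA, PHI)):
        n_cond, n_true, n_false = normalize(cond, *operands, phi=PHI)
        # After normalization the data value is always the true operand ...
        assert n_true is DATA
        # ... and the selected value is unchanged.
        assert select(cond, *operands) is select(n_cond, n_true, n_false)

print("normalization preserves the selected value in all cases")
```

Once every select has the data value in the true position, later handling only has to cover one operand order.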

LLVM/project dae679d llvm/lib/CodeGen TwoAddressInstructionPass.cpp

[TwoAddressInstruction][NPM] Conditionally preserve SlotIndexes in NPM (#173536)

In the New PM, `SlotIndexesAnalysis` should only be preserved when
`LiveIntervals` was cached and available, as `SlotIndexes` are only
maintained when `LiveIntervals` analysis is available.

This fixes potential stale `SlotIndexes` issues when running with NPM
where `LiveIntervals` analysis wasn't requested by prior passes.
Delta   File
+15 -7  llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+15 -7  1 file
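
The rule in this commit, preserve `SlotIndexes` only when `LiveIntervals` was cached, can be written as a small predicate. The sketch below is a toy model in Python rather than the pass manager API: `preserved_analyses` is a hypothetical function, analyses are represented by plain name strings, and other analyses the real pass may preserve are left out.

```python
def preserved_analyses(cached):
    """Return the analyses a toy pass reports as preserved.

    `cached` is the set of analysis names that were already computed
    and available before the pass ran.
    """
    preserved = set()
    if "LiveIntervals" in cached:
        # SlotIndexes was maintained alongside LiveIntervals, so it is valid.
        preserved.add("SlotIndexes")
    # Otherwise SlotIndexes may be stale and must not be reported as preserved.
    return preserved


print(preserved_analyses({"LiveIntervals", "SlotIndexes"}))
print(preserved_analyses({"SlotIndexes"}))
```

In the second call the result is empty, so a later consumer would recompute the indexes instead of reusing stale ones.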

LLVM/project 1a20818 libc/hdr CMakeLists.txt

[libc][CMake] Add dependency on ELF headers for elf_proxy target (#176557)

Fixes parallel build problem for check-libc target where headers are
generated after they are needed. I think this was likely caused by
https://github.com/llvm/llvm-project/pull/172766.
Delta   File
+42 -0  libc/hdr/CMakeLists.txt
+42 -0  1 file

LLVM/project 6822517 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-maximum.ll simplify-demanded-fpclass-maximumnum.ll

InstCombine: Stop using nsz in multi-use min/max fold

In SimplifyDemandedFPClass, stop using nsz when there's a
mismatch in the sign of 0 for the various min and maxes.

Alive2 doesn't like it: https://alive2.llvm.org/ce/z/ZyhSGA,
presumably because of the possible mismatch between the stored
value and the propagated one. Maybe it would be OK if nsz is on all
the uses.
Delta    File
+4 -3    llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+2 -2    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximum.ll
+2 -2    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maximumnum.ll
+2 -2    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimum.ll
+2 -2    llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-minimumnum.ll
+12 -11  5 files