LLVM/project dd907edllvm/test/CodeGen/X86 sjlj-do-not-merge-stack-slots.ll

[StackColoring] Introduce test for PR196542 (NFC) (#196951)
DeltaFile
+49-0llvm/test/CodeGen/X86/sjlj-do-not-merge-stack-slots.ll
+49-01 files

LLVM/project fd30f5bclang/include/clang/CIR/Dialect/IR CIRAttrs.td, clang/lib/CIR/CodeGen CIRGenDeclCXX.cpp

[CIR] Implement Namespace/global TLS CIR CodeGen (#196332)

Unlike local TLS, global TLS functions need to be initialized upon their
first use in a thread.

First, all attempts to 'get' said TLS global are replaced with calls to
a 'wrapper' function, which calls an 'init' alias function, then returns
the global. While classic codegen sometimes manages to omit this in
simple cases, this CIR implementation doesn't attempt such constant
folding/inlining. The call to the 'init' function is omitted if there is
no ctor/dtor setup required, so sometimes the wrapper is just a 'no-op'
(intentionally!).

There are also two types of 'global' TLS functions: unordered, and
ordered. Unordered are typically variable templates, and their 'init'
function initializes JUST them. The rest are ordered, which requires all
ordered initializations to happen as soon as any happen.
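
As a rough illustration of the wrapper/init dispatch described above, here
is a toy Python model. It is not the CIR implementation: the class
`ThreadLocalGlobals`, its methods, and the three kind constants are all
hypothetical names invented for this sketch, and Python's `threading.local`
merely stands in for real thread-local storage.

```python
import threading

# Hypothetical labels for this sketch only.
ORDERED, UNORDERED, TRIVIAL = "ordered", "unordered", "trivial"


class ThreadLocalGlobals:
    """Toy model of wrapper/init dispatch for namespace-scope TLS globals."""

    def __init__(self):
        self._decls = {}                 # name -> (kind, initializer or constant)
        self._state = threading.local()  # per-thread values and guard

    def declare(self, name, kind, init):
        # Dicts keep insertion order, which models declaration order.
        self._decls[name] = (kind, init)

    def _thread(self):
        s = self._state
        if not hasattr(s, "values"):
            s.values = {}
            s.ordered_done = False
        return s

    def _init(self, name, s):
        kind, init = self._decls[name]
        if kind == UNORDERED:
            # An unordered variable's init initializes JUST that variable.
            if name not in s.values:
                s.values[name] = init()
        elif kind == ORDERED and not s.ordered_done:
            # Touching any ordered variable runs all ordered initializers.
            s.ordered_done = True
            for other, (k, f) in self._decls.items():
                if k == ORDERED:
                    s.values[other] = f()

    def get(self, name):
        """The 'wrapper': call init if one is needed, then return the value."""
        kind, init = self._decls[name]
        s = self._thread()
        if kind == TRIVIAL:
            # No ctor/dtor setup required: the wrapper does no init call.
            return s.values.setdefault(name, init)
        self._init(name, s)
        return s.values[name]
```

In this model, reading one ordered variable on a fresh thread runs every
ordered initializer once for that thread, while unordered and trivial
variables are handled independently.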

The Wrapper:

    [25 lines not shown]
DeltaFile
+65-0clang/test/CIR/CodeGen/global-tls-simple-init.cpp
+59-0clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+50-0clang/test/CIR/CodeGen/global-tls-dyn-init.cpp
+39-0clang/lib/CIR/CodeGen/CIRGenDeclCXX.cpp
+34-0clang/test/CIR/CodeGen/global-tls-templates.cpp
+18-0clang/test/CIR/IR/invalid-tls.cir
+265-06 files not shown
+294-1612 files

LLVM/project b150adallvm/test/CodeGen/AMDGPU/GlobalISel legalize-load-private.mir legalize-llvm.amdgcn.image.sample.a16.ll

AMDGPU/GlobalISel: Switch to extended LLTs

Switch is required to be able to translate bfloat.

After the switch, most of the codegen patterns require an explicit
type on the register to match, instead of LLT::scalar.
So we can still use LLT::scalar for type checks, but new instructions
created during lowerings/combines need to use the proper extended LLT.

inst-select test sources are fully switched to i32/f32 so patterns can
match; legalizer and regbanklegalize tests are left as is (they should
probably be switched as well)

New functionality worth noting is f16 and bitcast lowering to i32
f16 = g_bitcast i16
->
i32 = g_anyext i16
f16 = g_trunc i32

f16 = trunc i32 is legal
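
The bit-level effect of this lowering can be checked with a small Python
sketch. It only models the semantics (the low 16 bits survive an any-extend
followed by a truncate); it says nothing about how the legalizer implements
it. All function names here are made up for the illustration, and the
"garbage" upper bits stand in for the unspecified high half that an
any-extend produces.

```python
import struct


def bitcast_i16_to_f16(bits):
    """Reference semantics: reinterpret 16 bits as an IEEE half value."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def anyext_i16_to_i32(bits, high_garbage=0xABCD):
    """Model of 'i32 = g_anyext i16': upper 16 bits are unspecified."""
    return ((high_garbage & 0xFFFF) << 16) | (bits & 0xFFFF)


def trunc_i32_to_f16(word):
    """Model of 'f16 = g_trunc i32': keep the low 16 bits, typed as f16."""
    return bitcast_i16_to_f16(word & 0xFFFF)


def lowered_bitcast(bits, high_garbage=0xABCD):
    """The two-step sequence that replaces the direct bitcast."""
    return trunc_i32_to_f16(anyext_i16_to_i32(bits, high_garbage))
```

Whatever the upper half holds after the any-extend, the truncate discards
it, so the result matches the direct bitcast for every 16-bit pattern.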
DeltaFile
+6,753-6,685llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
+5,732-5,732llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
+5,570-5,519llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
+5,045-5,045llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store-global.mir
+5,017-4,999llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
+3,948-3,900llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
+32,065-31,880581 files not shown
+107,437-105,033587 files

LLVM/project fe69360llvm/lib/Transforms/IPO AlwaysInliner.cpp, llvm/test/Transforms/Inline flatten.ll

Revert "[LLVM] Fix use-after-free in AlwaysInliner flatten worklist (#194485)"

This reverts commit b40c1d511b2e84842707939a1332b90ebb1a50a0.
DeltaFile
+42-39llvm/lib/Transforms/IPO/AlwaysInliner.cpp
+0-38llvm/test/Transforms/Inline/flatten.ll
+42-772 files

LLVM/project b9613dcclang/include/clang/Basic TargetInfo.h, clang/lib/AST ASTContext.cpp

convert to exec-charset inside getPredefinedStringLiteralFromCache, test __builtin_FILE()
DeltaFile
+28-0clang/test/CodeGen/systemz-charset.cpp
+10-0clang/lib/AST/ASTContext.cpp
+5-4clang/lib/Lex/TextEncodingConfig.cpp
+3-0clang/lib/Basic/TargetInfo.cpp
+2-0clang/include/clang/Basic/TargetInfo.h
+48-45 files

LLVM/project c687b82clang/docs LanguageExtensions.rst, clang/include/clang/Options Options.td

Enable driver changes for fexec-charset
DeltaFile
+14-6clang/lib/Driver/ToolChains/Clang.cpp
+14-4clang/include/clang/Options/Options.td
+11-3clang/test/Driver/clang_f_opts.c
+10-0llvm/lib/Support/TextEncoding.cpp
+4-3clang/test/Driver/cl-options.c
+3-3clang/docs/LanguageExtensions.rst
+56-193 files not shown
+60-199 files

LLVM/project 390277bclang/lib/AST PrintfFormatString.cpp FormatString.cpp, clang/lib/Sema SemaChecking.cpp

Add format string handling
DeltaFile
+58-31clang/lib/AST/PrintfFormatString.cpp
+46-40clang/lib/AST/FormatString.cpp
+33-21clang/lib/Sema/SemaChecking.cpp
+25-11clang/lib/AST/FormatStringParsing.h
+15-8clang/lib/AST/ScanfFormatString.cpp
+19-0llvm/lib/Support/TextEncoding.cpp
+196-11111 files not shown
+257-12117 files

LLVM/project ad90e98clang/include/clang/Sema Sema.h

Fix build failure
DeltaFile
+1-0clang/include/clang/Sema/Sema.h
+1-01 files

LLVM/project 9830c43clang/lib/AST/ByteCode Descriptor.cpp, clang/test/AST/ByteCode literals.cpp

[clang][bytecode] Fix a crash in Descriptor::getElemDataSize() (#196929)

`FIXED_SIZE_INT_TYPE_SWITCH` does not handle `PT_Bool`, so handle it
explicitly beforehand.
DeltaFile
+4-0clang/test/AST/ByteCode/literals.cpp
+2-0clang/lib/AST/ByteCode/Descriptor.cpp
+6-02 files

LLVM/project 9a3193bclang/lib/CodeGen CGOpenMPRuntime.cpp CGExpr.cpp, clang/lib/Sema SemaOpenMP.cpp

[clang][OpenMP 6.0][CodeGen] Codegen for declare_target 'local' clause (#196431)

Implement code generation for the OpenMP 6.0 declare_target 'local'
clause, which creates device-only variables with per-device static
storage.

A 'local' variable exists in the device image with its static
initializer and is always accessed directly by device code. This is the
same as 'to'/'enter' without unified shared memory, except that no
offload entry is registered.

Using 'device_type(nohost)' with 'local' is not yet supported. Sema
generates a warning and converts it to 'device_type(any)'.

Testing:
- Updated tests:
     clang/test/OpenMP/declare_target_messages.cpp
     clang/test/OpenMP/declare_target_ast_print.cpp
- New tests:

    [2 lines not shown]
DeltaFile
+430-0clang/test/OpenMP/declare_target_local_codegen.cpp
+52-0clang/test/OpenMP/declare_target_local_usm_codegen.cpp
+40-0offload/test/offloading/declare_target_local.cpp
+16-10clang/lib/CodeGen/CGOpenMPRuntime.cpp
+9-10clang/lib/CodeGen/CGExpr.cpp
+10-5clang/lib/Sema/SemaOpenMP.cpp
+557-256 files not shown
+581-4712 files

LLVM/project 0aa4619llvm/utils/gn/secondary/clang/lib/CodeGen BUILD.gn, llvm/utils/gn/secondary/llvm/lib/ABI BUILD.gn

[gn] port 07b5dfe9473c6 + deps (LLVMABI dep in clang) (#196944)

Also adds build files for llvm/lib/ABI, which was dead code before
07b5dfe9473c6 (at least in the GN build).
DeltaFile
+15-0llvm/utils/gn/secondary/llvm/lib/ABI/BUILD.gn
+1-0llvm/utils/gn/secondary/clang/lib/CodeGen/BUILD.gn
+16-02 files

LLVM/project 2243f63llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp

[AMDGPU] Replace vdst_in opcode exclusion list with position check

Use getNamedOperandIdx to detect if vdst_in has already been added
by a prior converter, instead of maintaining a hardcoded opcode list.
DeltaFile
+6-42llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+6-421 files

LLVM/project a9257f7clang/docs ReleaseNotes.rst, clang/test/CodeGenCXX typeid-most-derived.cpp

address review comments
DeltaFile
+14-14clang/test/CodeGenCXX/typeid-most-derived.cpp
+3-0clang/docs/ReleaseNotes.rst
+17-142 files

LLVM/project 75813d0llvm/lib/Transforms/Vectorize LoopVectorizationPlanner.cpp, llvm/test/Transforms/LoopVectorize if-conversion-scalable.ll

[LV] Add test showing lack of gather/scatter can prevent if-convert

This introduces a new force-target-supports-gather-scatter-ops CLI
option for testing, as well as a new isLegalMaskedLoadOrStore() helper.
DeltaFile
+120-0llvm/test/Transforms/LoopVectorize/if-conversion-scalable.ll
+7-1llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.cpp
+127-12 files

LLVM/project bc2dedbclang/lib/AST ExprCXX.cpp, clang/test/CodeGenCXX typeid-most-derived.cpp

[clang][AST] Teach `CXXTypeidExpr::isMostDerived` to use `isEffectivelyFinal`
DeltaFile
+57-0clang/test/CodeGenCXX/typeid-most-derived.cpp
+5-0clang/lib/AST/ExprCXX.cpp
+62-02 files

LLVM/project b4e8f59mlir/lib/Conversion/MathToSPIRV MathToSPIRV.cpp, mlir/test/Conversion/MathToSPIRV math-to-opencl-spirv.mlir

[mlir][SPIR-V] Lower math.{exp2,log2,log10} operations (#196723)
DeltaFile
+8-12mlir/test/Conversion/MathToSPIRV/math-to-opencl-spirv.mlir
+3-2mlir/lib/Conversion/MathToSPIRV/MathToSPIRV.cpp
+11-142 files

LLVM/project f325d13llvm/lib/Transforms/Vectorize LoopVectorize.cpp LoopVectorizationPlanner.cpp

[LV] Use isLegalMaskedLoadOrStore for interleaved accesses too (NFC) (#195243)

isLegalMaskedLoadOrStore is now the central place for querying target
capabilities for masked accesses. Access pattern legality checks are
hoisted outside of it.
DeltaFile
+4-6llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+0-4llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.cpp
+2-1llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+6-113 files

LLVM/project 743ee9blibcxx/docs/Status Cxx2cIssues.csvgb

remove empty csvgb file
DeltaFile
+0-0libcxx/docs/Status/Cxx2cIssues.csvgb
+0-01 files

LLVM/project de2d725llvm/include/llvm/ADT GenericUniformityInfo.h GenericUniformityImpl.h, llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp AMDGPUAtomicOptimizer.cpp

review: address suggestion
DeltaFile
+54-54llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+10-8llvm/include/llvm/ADT/GenericUniformityInfo.h
+4-4llvm/unittests/Target/AMDGPU/UniformityAnalysisTest.cpp
+4-4llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
+3-3llvm/include/llvm/ADT/GenericUniformityImpl.h
+2-2llvm/lib/Target/AMDGPU/AMDGPUGlobalISelDivergenceLowering.cpp
+77-7512 files not shown
+92-9018 files

LLVM/project 0982089utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[Bazel] Fixes 34502b0 (#196930)

This fixes 34502b0c7e076e658bd176030223029cd4402941.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+2-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+2-01 files

LLVM/project 9ff5e12flang/lib/Semantics tools.cpp, flang/test/Semantics pure-function-result-pointer.f90 pure-host-associated-result.f90

[Flang][Semantics] Treat host/use-associated objects as externally visible. (#192892)

This patch fixes a false semantic error in Flang where function result
variables were incorrectly treated as externally visible in
pure-definability checks.

As a result, valid code assigning a pointer component of a function
result (as in flang/test/Semantics/pure-function-result-pointer.f90) was
rejected with “not definable in a pure subprogram.”

The fix updates _FindExternallyVisibleObject_ to treat function result
symbols as local, which matches Fortran semantics for function result
variables.
DeltaFile
+46-0flang/test/Semantics/pure-function-result-pointer.f90
+8-7flang/lib/Semantics/tools.cpp
+14-0flang/test/Semantics/pure-host-associated-result.f90
+68-73 files

LLVM/project fb69fcdflang/lib/Lower Bridge.cpp, flang/test/Lower/OpenMP copyin-derived-allocatable-comp.f90

[Flang][OpenMP] Fix COPYIN of derived types with allocatable components at -O3 (#196063)

COPYIN of threadprivate derived types with allocatable components
segfaults at -O3 because the OpenMP runtime zero-fills per-thread
storage, leaving allocatable component descriptors with invalid
metadata. This patch skips the copy on the master thread (where source
and destination alias) and uses temporary_lhs assignment on worker
threads so the runtime initializes descriptors before the deep copy.

Assisted-by: Claude Opus 4.6

Fixes:
[https://github.com/llvm/llvm-project/issues/196134](https://github.com/llvm/llvm-project/issues/196134)
Minimal reproducing test case:
```
program repro_o3_segv
  use omp_lib
  implicit none


    [64 lines not shown]
DeltaFile
+88-0flang/test/Lower/OpenMP/copyin-derived-allocatable-comp.f90
+36-0flang/lib/Lower/Bridge.cpp
+124-02 files

LLVM/project c3628c7llvm/lib/Analysis AliasAnalysis.cpp, llvm/test/Analysis/BasicAA atomics.ll

Reapply [AA] No synchronization effects for never-escaping identified local (#196923)

Relative to the previous attempt, this makes sure that the location does
not alias with the pointer operand first. If it aliases, then we need to
consider the direct ModRef effects of the instruction, not just the
synchronization effects.

-----

Fences and other synchronizing operations (such as atomic accesses
stronger than monotonic) are modelled as reading and writing all memory,
in order to enforce their implied ordering constraints.

Currently, this happens even for identified function locals that do not
escape. This patch excludes those objects.

Notably, we cannot reason based on captures-before here, because the
synchronizing operation still has an effect even if the object only
escapes later.
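
The decision order described in this message can be summarized as a small
Python sketch. This is a reading of the prose, not the code in
AliasAnalysis.cpp: the `Effect` enum, the function name, and the three
boolean parameters are hypothetical. Note that `ever_escapes` means
"escapes anywhere in the function", not "escapes before the operation".

```python
from enum import Enum


class Effect(Enum):
    NONE = "no synchronization effect on this location"
    SYNC_MODREF = "modelled as reading and writing the location"
    DIRECT = "direct ModRef effects of the access must be considered"


def effect_on_location(may_alias_pointer_operand,
                       is_identified_function_local,
                       ever_escapes):
    """Sketch of how a synchronizing operation relates to one location."""
    # Checked first: if the location may alias the pointer operand, the
    # direct effects apply, not just the synchronization effects.
    if may_alias_pointer_operand:
        return Effect.DIRECT
    # Identified function locals that never escape are excluded.
    if is_identified_function_local and not ever_escapes:
        return Effect.NONE
    # Everything else keeps the conservative model.
    return Effect.SYNC_MODREF
```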

    [2 lines not shown]
DeltaFile
+55-27llvm/lib/Analysis/AliasAnalysis.cpp
+50-22llvm/test/Analysis/BasicAA/atomics.ll
+0-8llvm/test/Transforms/DeadStoreElimination/fence.ll
+3-3llvm/test/Transforms/LICM/atomics.ll
+2-2llvm/test/Transforms/GVN/fence.ll
+1-1llvm/test/Analysis/MemorySSA/atomic-clobber.ll
+111-631 files not shown
+111-657 files

LLVM/project 6f08482llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-v4-instructions.ll, llvm/test/CodeGen/RISCV/rvv fixed-vectors-reduction-fp.ll

Merge remote-tracking branch 'origin/main' into gbossu.isLegalMaskedLoadOrStore
DeltaFile
+7,584-740llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+6,873-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+4,634-367llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+4,174-657llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+2,969-1,160llvm/test/CodeGen/X86/vector-reduce-mul.ll
+3,979-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-writeback.s
+30,213-2,9243,847 files not shown
+149,503-45,7503,853 files

LLVM/project 64212c8llvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp

[NFC][SPIR-V] Use createVirtualRegister helper in selectSUCmp (#196905)

Resolve the existing TODO that asks us to do that
DeltaFile
+4-8llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+4-81 files

LLVM/project a5547d3cross-project-tests lit.cfg.py, cross-project-tests/debuginfo-tests/dexter/dex/test_script Script.py Nodes.py

[Dexter] Add basic structured script parsing (#193710)

See PSA:
https://discourse.llvm.org/t/psa-planned-changes-to-dexter/90402

This patch begins adding support for "structured scripts" to Dexter,
starting with some of the core classes and the ability to parse script
files. This patch does not add the ability to actually run scripts, or
any of the underlying functionality required to do so.

NB: This patch adds a dependency on PyYAML, which is specified in a new
requirements.txt file.
DeltaFile
+238-0cross-project-tests/debuginfo-tests/dexter/dex/test_script/Script.py
+204-0cross-project-tests/debuginfo-tests/dexter/dex/test_script/Nodes.py
+55-7cross-project-tests/lit.cfg.py
+31-3cross-project-tests/debuginfo-tests/dexter/dex/tools/test/Tool.py
+24-0cross-project-tests/debuginfo-tests/dexter/feature_tests/scripts/parser/invalid-script-nodes.test
+23-0cross-project-tests/debuginfo-tests/dexter/feature_tests/scripts/parser/error-locations.test
+575-107 files not shown
+622-1413 files

LLVM/project 1e84219mlir/include/mlir/Analysis/DataFlow IntegerRangeAnalysis.h, mlir/lib/Analysis/DataFlow IntegerRangeAnalysis.cpp

[mlir][dataflow] IntRange: Replace yield-based widening with per-state lattice budget (#196616)

IntegerRangeAnalysis can hang on `scf.while` loops with dynamic bounds: a
loop-carried range ratchets [0,0]->[0,1]->[0,2]->... by one per worklist
visit, requiring up to 2^31 iterations on i32. The new
`int-range-analysis-convergence.mlir` test reproduces this.

The ratchet lives at framework merge sites (region successors, callable
args) where the solver joins lattices via virtual
`Lattice::join(const AbstractSparseLattice &)`. The pre-existing
`isYieldedResult`/`isYieldedValue` heuristic in
`IntegerRangeAnalysis::visitOperation` doesn't help: it runs in the
transfer-function callback for inferrable-op results used by a
terminator, not on the merge path. It is also harmful where it fires:
it slams to maxRange on the *second* visit (after, say, [1,1]->[1,2]),
so naturally bounded accumulators (e.g. `arith.minsi`-clamped iter args)
widen to [INT_MIN, INT_MAX].
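
The ratchet, and the general idea of bounding it with a per-state budget,
can be reproduced in a few lines of Python. This is only a sketch under
assumptions: the `solve` function, its `budget` parameter, and the rule
"widen to the full range once a state has changed more than `budget` times"
are invented here to illustrate the title's "per-state lattice budget"; the
actual mechanism is described in the part of the message not shown.

```python
I32_MIN, I32_MAX = -(2**31), 2**31 - 1


def join(a, b):
    """Join two inclusive integer ranges (lo, hi)."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def solve(init, step, budget=None, max_visits=10_000):
    """Iterate state = join(state, step(state)) until nothing changes.

    Returns (range, visits); visits is None if max_visits was exhausted.
    With a budget (hypothetical rule), a state that has changed more than
    `budget` times is widened to the full i32 range.
    """
    state, changes = init, 0
    for visits in range(1, max_visits + 1):
        new = join(state, step(state))
        if new == state:
            return state, visits
        changes += 1
        if budget is not None and changes > budget:
            new = (I32_MIN, I32_MAX)
        state = new
    return state, None


def increment(r):
    """Loop-carried i + 1, saturating so the model stays within i32."""
    return (min(r[0] + 1, I32_MAX), min(r[1] + 1, I32_MAX))


def clamped_increment(r):
    """Loop-carried min(i + 1, 3), in the spirit of an arith.minsi clamp."""
    return (min(r[0] + 1, 3), min(r[1] + 1, 3))
```

Without a budget, `solve((0, 0), increment)` grows the upper bound by one
per visit and gives up at `max_visits`. With a small budget it widens and
stops after a handful of visits, while `clamped_increment` converges to a
bounded range before the budget is spent.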

    [8 lines not shown]
DeltaFile
+91-0mlir/test/Dialect/Arith/int-range-analysis-convergence.mlir
+63-0mlir/test/Dialect/Arith/int-range-loop-iter-args.mlir
+25-34mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
+28-0mlir/include/mlir/Analysis/DataFlow/IntegerRangeAnalysis.h
+207-344 files

LLVM/project 34502b0mlir/include/mlir/Dialect/GPU/Pipelines Passes.h, mlir/lib RegisterAllPasses.cpp

[MLIR][GPU] Add gpu-lower-to-rocdl-pipeline meta-pass (#196751)

Add `gpu-lower-to-rocdl-pipeline` meta-pass which lowers common MLIR
dialects (gpu/arith/scf/vector) to binary, similar to the existing
XeVM/NVVM pipelines.
DeltaFile
+136-0mlir/lib/Dialect/GPU/Pipelines/GPUToROCDLPipeline.cpp
+69-0mlir/test/Integration/GPU/ROCM/gpu-lower-to-rocdl-pipeline.mlir
+60-0mlir/include/mlir/Dialect/GPU/Pipelines/Passes.h
+5-0mlir/lib/Dialect/GPU/Pipelines/CMakeLists.txt
+1-0mlir/lib/RegisterAllPasses.cpp
+271-05 files

LLVM/project 02177f3llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp SelectionDAGBuilder.h

[SelectionDAG] Emit `AssertZext` for function argument range attributes
DeltaFile
+6-1llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+2-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
+8-12 files

LLVM/project 67d7ee6llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp, llvm/test/CodeGen/X86 argument-range-attr.ll

address review comment
DeltaFile
+11-0llvm/test/CodeGen/X86/argument-range-attr.ll
+1-1llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+12-12 files