LLVM/project 5293760mlir/lib/Dialect/LLVMIR/Transforms DIScopeForLLVMFuncOp.cpp, mlir/test/Dialect/LLVMIR add-debuginfo-func-scope.mlir

[mlir][llvmir] Fix crash when a CallSiteLoc has a UnknownLoc callee (#186860)

Avoids reading a null StringAttr when no file name is present by
manufacturing a default instead.
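
The pattern described above can be sketched without MLIR as follows. This is a minimal illustration, not the pass's code: `Loc`, `fileNameOrDefault`, and the `"<unknown>"` placeholder are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a location that may or may not carry a file
// name (an unknown callee location carries none).
struct Loc {
  std::optional<std::string> fileName;
};

// Returns the file name if present, otherwise a manufactured default, so
// callers never dereference a missing value. "<unknown>" is illustrative
// only; the log does not say which default the pass uses.
inline std::string fileNameOrDefault(const Loc &loc) {
  return loc.fileName.value_or("<unknown>");
}
```

Calling `fileNameOrDefault(Loc{})` yields the placeholder instead of reading an absent name.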
DeltaFile
+14-2mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope.mlir
+9-4mlir/lib/Dialect/LLVMIR/Transforms/DIScopeForLLVMFuncOp.cpp
+23-62 files

LLVM/project a74605bllvm/lib/CodeGen/SelectionDAG FastISel.cpp, llvm/test/CodeGen/X86 fake-use-fastisel.ll

[FastISel] generate FAKE_USE for llvm.fake.use
DeltaFile
+20-0llvm/test/CodeGen/X86/fake-use-fastisel.ll
+7-2llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
+27-22 files

LLVM/project 673002futils/bazel/llvm-project-overlay/libc BUILD.bazel

[libc][math] Fix bazel build for fmaf16 (#187111)
DeltaFile
+1-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+1-11 files

LLVM/project 74c7fb9clang/include/clang/CIR/Dialect/IR CIROps.td

[CIR] Fix missing RegionBranchTerminatorOpInterface declarations

After https://github.com/llvm/llvm-project/pull/186832, operations with RegionBranchTerminatorOpInterface need to declare `getMutableSuccessorOperands`.
DeltaFile
+5-2clang/include/clang/CIR/Dialect/IR/CIROps.td
+5-21 files

LLVM/project 6b2e347libc/include wctype.yaml, libc/src/wctype iswpunct.h iswpunct.cpp

[libc]: implement 'iswpunct' entrypoint (#186968)

Added entrypoints:
- baremetal/arm
- baremetal/aarch64
- baremetal/riscv
- darwin/aarch64
- linux/aarch64
- linux/arm
- linux/riscv
- linux/x86_64
- windows

Also added the unit test for iswpunct.

Part of the issue: #185136
DeltaFile
+63-0libc/test/src/wctype/iswpunct_test.cpp
+21-0libc/src/wctype/iswpunct.h
+19-0libc/src/wctype/iswpunct.cpp
+12-0libc/src/wctype/CMakeLists.txt
+10-0libc/test/src/wctype/CMakeLists.txt
+6-0libc/include/wctype.yaml
+131-09 files not shown
+142-015 files

LLVM/project 96d873cflang/include/flang/Semantics openmp-utils.h, flang/lib/Semantics openmp-utils.cpp check-omp-loop.cpp

[flang][OpenMP] Use OmpDirectiveSpecification for range/depth queries, NFC

That makes them usable for a potential future implementation of APPLY.
DeltaFile
+18-20flang/lib/Semantics/openmp-utils.cpp
+2-2flang/include/flang/Semantics/openmp-utils.h
+2-2flang/lib/Semantics/check-omp-loop.cpp
+22-243 files

LLVM/project d4afb1bflang/include/flang/Semantics openmp-utils.h

[flang][OpenMP] Remove unused function declaration, NFC (#187101)

The function `GetNumGeneratedNestsFrom` has been removed, but repeated
local rebases stubbornly inserted the declaration back in.
DeltaFile
+0-4flang/include/flang/Semantics/openmp-utils.h
+0-41 files

LLVM/project f0e699autils/bazel/llvm-project-overlay/libc BUILD.bazel

[libc][math] Fix fma bazel build (#187107)
DeltaFile
+1-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+1-11 files

LLVM/project 2ef41ccclang/lib/Format FormatTokenLexer.cpp FormatTokenLexer.h, clang/unittests/Format FormatTest.cpp

[clang-format] Fix Macros configuration not working with try/catch expansions (#184891)

This is a superseding followup to my previous PR,
https://github.com/llvm/llvm-project/pull/183352.

In my previous PR, I proposed adding TryMacros and CatchMacros
configuration options, similar in spirit to IfMacros and ForEachMacros.
I did so because I noticed that configuration like
`Macros=["TRY_MACRO=try", "CATCH_MACRO(e)=catch(e)"]` did not format
configured macro(s) as try/catch blocks. @owenca confirmed in my
previous PR that this observed behavior is undesired, and we should
prefer to fix it rather than introduce new features.

This PR proposes a fix, described in detail in the commit message below
the break. In general terms, it deletes a heuristic from the lexing
phase, where it interacted poorly with the Macros option, and moves its
functionality to the parsing phase instead.

I describe a possibly cleaner fix in [a comment

    [34 lines not shown]
DeltaFile
+0-22clang/lib/Format/FormatTokenLexer.cpp
+6-4clang/unittests/Format/FormatTest.cpp
+0-1clang/lib/Format/FormatTokenLexer.h
+6-273 files

LLVM/project 4a8b61fllvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

[AMDGPU] Add structural stall heuristic to scheduling strategies

Implements a structural stall heuristic that considers both resource
hazards and latency constraints when selecting instructions. In coexec,
this changes the pending queue from a binary “not ready to issue”
distinction into part of a unified candidate comparison. Pending
instructions still identify structural stalls in the current cycle, but
they are now evaluated directly against available instructions by stall
cost, making the heuristics both more intuitive and more expressive.

- Add getStructuralStallCycles() to GCNSchedStrategy that computes the
number of cycles an instruction must wait due to:
  - Resource conflicts on unbuffered resources (from the SchedModel)
  - Sequence-dependent hazards (from GCNHazardRecognizer)

- Add getHazardWaitStates() to GCNHazardRecognizer that returns the number
of wait states until all hazards for an instruction are resolved,
providing cycle-accurate hazard information for scheduling heuristics.
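
A rough sketch of the comparison described above, under stated assumptions: the `Candidate` struct and both function names are hypothetical, and combining the two waits with `max` is a guess (it assumes the waits overlap, so the longer one dominates). The log does not say how the actual implementation combines them.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-instruction stall inputs.
struct Candidate {
  unsigned resourceStall; // cycles until the needed unbuffered resource frees up
  unsigned hazardWait;    // wait states until all hazards are resolved
};

// Assumption: both waits elapse concurrently, so the larger one is the cost.
constexpr unsigned structuralStallCycles(const Candidate &c) {
  return std::max(c.resourceStall, c.hazardWait);
}

// Pending and available candidates are compared on the same scale:
// the one with the lower stall cost is preferred.
constexpr bool preferByStall(const Candidate &a, const Candidate &b) {
  return structuralStallCycles(a) < structuralStallCycles(b);
}
```

With this shape, a pending instruction that stalls for one cycle can beat an available instruction only if other heuristics rank it higher, since the available one has zero stall cost.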
DeltaFile
+38-3llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+35-0llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+7-2llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+6-0llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+2-4llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+4-0llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+92-91 files not shown
+94-97 files

LLVM/project c064b5allvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp GCNSchedStrategy.h, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

Address comments.
DeltaFile
+26-23llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+2-3llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+1-0llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+29-263 files

LLVM/project 202b7c6llvm/lib/CodeGen/SelectionDAG FastISel.cpp, llvm/test/CodeGen/X86 fake-use-fastisel.ll

[FastISel] generate FAKE_USE for llvm.fake.use
DeltaFile
+20-0llvm/test/CodeGen/X86/fake-use-fastisel.ll
+8-2llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
+28-22 files

LLVM/project af67e30llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[SLP][NFC] Refactor BinOpSameOpcodeHelper BIT enum (#187067)

Uses a more readable syntax and increases the type width to avoid silent
errors if the enum reaches 17 members.
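
To illustrate the failure mode, here is a small sketch. It assumes one bit per enum member and a 16-bit mask type before the change; the "17 members" remark suggests this, but the log does not state it. `bitFor` is a hypothetical helper, not a name from SLPVectorizer.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Builds the mask for the member at a given zero-based index, stored in
// MaskT. The cast truncates silently when the bit does not fit.
template <typename MaskT>
constexpr MaskT bitFor(unsigned index) {
  return static_cast<MaskT>(std::uint64_t{1} << index);
}

// The 17th member (index 16) vanishes in a 16-bit mask...
static_assert(bitFor<std::uint16_t>(16) == 0, "bit 16 is truncated away");
// ...but survives in a wider one.
static_assert(bitFor<std::uint32_t>(16) == 65536u, "bit 16 fits in 32 bits");
```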
DeltaFile
+10-10llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+10-101 files

LLVM/project bed5e7dlibc/shared/math fmaf16.h, libc/src/__support/math fmaf16.h CMakeLists.txt

[libc][math] Refactor fmaf16 implementation to header-only in src/__support/math folder. (#163977)
DeltaFile
+33-0libc/src/__support/math/fmaf16.h
+29-0libc/shared/math/fmaf16.h
+11-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+10-0libc/src/__support/math/CMakeLists.txt
+2-4libc/src/math/generic/fmaf16.cpp
+1-2libc/src/math/generic/CMakeLists.txt
+86-73 files not shown
+90-79 files

LLVM/project e6f0ec8libc/shared/math fmaf.h, libc/src/__support/math fmaf.h CMakeLists.txt

[libc][math] Refactor fmaf implementation to header-only in src/__support/math folder. (#163970)

Part of #147386

in preparation for:
https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
DeltaFile
+27-0libc/src/__support/math/fmaf.h
+23-0libc/shared/math/fmaf.h
+11-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+9-0libc/src/__support/math/CMakeLists.txt
+2-5libc/src/math/generic/fmaf.cpp
+1-1libc/src/math/generic/CMakeLists.txt
+73-73 files not shown
+76-79 files

LLVM/project f5d83fbmlir/lib/Dialect/GPU/Transforms SubgroupIdRewriter.cpp, mlir/test/Dialect/GPU subgroupId-rewrite.mlir

[mlir][GPU] Set nsw/nuw when expanding out subgroup ID (#187099)

There's no world where the subgroup ID (or the intermediate values
needed to compute it) will be negative or will have signed overflow.
This commit adds flags accordingly, which is helpful as this is a rather
low-level rewrite that might run after the analyses that would
ordinarily add these flags.
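
For intuition, the arithmetic can be sketched as below. This assumes the usual linearization of the thread ID followed by a division by the subgroup size; `Dim3` and `subgroupId` are names made up for the example, and the exact sequence of operations the rewrite emits is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

struct Dim3 {
  std::uint32_t x, y, z;
};

// Every intermediate value is a product or sum of non-negative IDs and
// sizes. For realistic launch bounds none of them wraps, which is the
// property the nuw/nsw flags record.
constexpr std::uint32_t subgroupId(Dim3 tid, Dim3 blockDim,
                                   std::uint32_t subgroupSize) {
  std::uint32_t linear = tid.x + blockDim.x * (tid.y + blockDim.y * tid.z);
  return linear / subgroupSize;
}
```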
DeltaFile
+8-8mlir/test/Dialect/GPU/subgroupId-rewrite.mlir
+10-5mlir/lib/Dialect/GPU/Transforms/SubgroupIdRewriter.cpp
+18-132 files

LLVM/project 8859866flang/include/flang/Semantics openmp-utils.h

[flang][OpenMP] Remove unused function declaration, NFC

The function `GetNumGeneratedNestsFrom` has been removed, but repeated
local rebases stubbornly inserted the declaration back in.
DeltaFile
+0-4flang/include/flang/Semantics/openmp-utils.h
+0-41 files

LLVM/project d0d1f0blibc/shared/math fma.h, libc/src/__support/math fma.h CMakeLists.txt

[libc][math] Refactor fma implementation to header-only in src/__support/math folder. (#163968)

Part of #147386

in preparation for:
https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
DeltaFile
+27-0libc/src/__support/math/fma.h
+23-0libc/shared/math/fma.h
+9-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+9-0libc/src/__support/math/CMakeLists.txt
+2-5libc/src/math/generic/fma.cpp
+1-1libc/src/math/generic/CMakeLists.txt
+71-73 files not shown
+74-79 files

LLVM/project c10f5d5llvm/test/CodeGen/AMDGPU/GlobalISel atomicrmw-fmin-fmax.ll, llvm/test/CodeGen/X86/apx sub.ll add.ll

Merge branch 'main' into users/ziqingluo/reapply-PR169191570
DeltaFile
+1,809-0llvm/test/Instrumentation/NumericalStabilitySanitizer/intrinsics.ll
+1,432-0llvm/test/Instrumentation/NumericalStabilitySanitizer/libfuncs.ll
+486-145llvm/test/CodeGen/X86/apx/sub.ll
+476-140llvm/test/CodeGen/X86/apx/add.ll
+596-0llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw-fmin-fmax.ll
+450-132llvm/test/CodeGen/X86/apx/or.ll
+5,249-417650 files not shown
+23,875-8,526656 files

LLVM/project 3d0e7e0llvm/include/llvm/Object Archive.h, llvm/lib/BinaryFormat Magic.cpp

[z/OS] Recognize EBCDIC archive magic (#186854)

`z/OS` archives use the same structural layout as traditional Unix
archives but encode all text fields in EBCDIC. The magic string is the
EBCDIC representation of `"!<arch>\n"` (hex: `5A 4C 81 99 83 88 6E 15`).
This patch adds recognition of the `z/OS` archive magic to
`identify_magic()` and defines the `ZOSArchiveMagic` constant. This is
the first in a series of patches adding `z/OS` archive support to LLVM.
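
A standalone sketch of the check, using the eight bytes quoted above. `kZOSMagic` and `looksLikeZOSArchive` are hypothetical names for this example; the patch itself adds a `ZOSArchiveMagic` constant and extends `identify_magic()`, whose code is not reproduced here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// EBCDIC encoding of "!<arch>\n", as given in the commit message.
constexpr std::array<unsigned char, 8> kZOSMagic = {0x5A, 0x4C, 0x81, 0x99,
                                                    0x83, 0x88, 0x6E, 0x15};

// True when the buffer is long enough and starts with the EBCDIC magic.
inline bool looksLikeZOSArchive(const unsigned char *data, std::size_t size) {
  return size >= kZOSMagic.size() &&
         std::memcmp(data, kZOSMagic.data(), kZOSMagic.size()) == 0;
}
```

An ASCII `!<arch>\n` header does not match, so traditional Unix archives remain distinguishable from z/OS ones.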
DeltaFile
+5-0llvm/lib/BinaryFormat/Magic.cpp
+2-0llvm/unittests/BinaryFormat/TestFileMagic.cpp
+2-0llvm/include/llvm/Object/Archive.h
+9-03 files

LLVM/project f01f168clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsageTest.cpp

address change meaning warnings
DeltaFile
+2-2clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+2-21 files

LLVM/project 294995blibc/src/__support/math CMakeLists.txt

dep
DeltaFile
+1-0libc/src/__support/math/CMakeLists.txt
+1-01 files

LLVM/project 996b622utils/bazel/llvm-project-overlay/libc BUILD.bazel, utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[bazel] NFC: reformat mlir & libc bazel files (#187094)
DeltaFile
+1-1utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+0-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+1-22 files

LLVM/project aef7e57llvm/lib/Target/DirectX/DirectXIRPasses PointerTypeAnalysis.cpp, llvm/test/CodeGen/DirectX empty-global-ctors.ll

[DirectX] Fix assertion in PointerTypeAnalysis with empty global_ctors (#179034)

When `llvm.global_ctors` has no elements (e.g., when all resources are
unused in a shader library), its initializer is a `zeroinitializer`
(`ConstantAggregateZero`) rather than a `ConstantArray`. The previous
code used `cast<ConstantArray>` which asserts on incompatible types:

> "cast<Ty>() argument of incompatible type!"

This patch uses `dyn_cast` and returns early if the initializer is not a
`ConstantArray`, handling the edge case gracefully.
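
The difference between an asserting cast and a checked cast can be sketched in plain C++. The types below are simplified stand-ins that borrow LLVM's class names for readability, and `dynamic_cast` plays the role of `dyn_cast`; `countCtors` is a hypothetical function, not the one in PointerTypeAnalysis.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for the LLVM constant hierarchy.
struct Constant {
  virtual ~Constant() = default;
};
struct ConstantAggregateZero : Constant {};
struct ConstantArray : Constant {
  std::vector<int> elems;
};

// Checked cast: yields null for a zeroinitializer, so we return early
// instead of asserting on an incompatible type.
inline std::size_t countCtors(const Constant *init) {
  const auto *arr = dynamic_cast<const ConstantArray *>(init);
  if (!arr)
    return 0; // empty global_ctors: nothing to analyze
  return arr->elems.size();
}
```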

Fixes #178993.

Co-authored-by: Kaitlin Peng <kaitlinpeng at microsoft.com>
DeltaFile
+16-0llvm/test/CodeGen/DirectX/empty-global-ctors.ll
+7-1llvm/lib/Target/DirectX/DirectXIRPasses/PointerTypeAnalysis.cpp
+23-12 files

LLVM/project 385aeb2llvm/include/llvm/Transforms/Utils UnrollLoop.h, llvm/lib/Target/AMDGPU AMDGPUTargetTransformInfo.cpp

Revert "[LoopUnroll] Remove `computeUnrollCount()`'s return value " (#187035)

Reverts llvm/llvm-project#184529 due to
https://github.com/llvm/llvm-project/pull/184529#issuecomment-4074393657
DeltaFile
+54-75llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
+31-0llvm/lib/Transforms/Utils/LoopUnroll.cpp
+14-10llvm/include/llvm/Transforms/Utils/UnrollLoop.h
+12-3llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp
+3-2llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+114-905 files

LLVM/project 1977fb9libc/shared/math fmaf.h, libc/src/__support/math fmaf.h CMakeLists.txt

[libc][math] Refactor fmaf implementation to header-only in src/__support/math folder.
DeltaFile
+27-0libc/src/__support/math/fmaf.h
+23-0libc/shared/math/fmaf.h
+9-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+8-0libc/src/__support/math/CMakeLists.txt
+2-5libc/src/math/generic/fmaf.cpp
+1-1libc/src/math/generic/CMakeLists.txt
+70-73 files not shown
+73-79 files

LLVM/project 27172d7libc/src/__support/math CMakeLists.txt

dep
DeltaFile
+1-0libc/src/__support/math/CMakeLists.txt
+1-01 files

LLVM/project b1141aelibc/shared/math fmaf16.h, libc/src/__support/math fmaf16.h CMakeLists.txt

[libc][math] Refactor fmaf16 implementation to header-only in src/__support/math folder.
DeltaFile
+33-0libc/src/__support/math/fmaf16.h
+29-0libc/shared/math/fmaf16.h
+9-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+8-0libc/src/__support/math/CMakeLists.txt
+2-4libc/src/math/generic/fmaf16.cpp
+1-2libc/src/math/generic/CMakeLists.txt
+82-73 files not shown
+86-79 files

LLVM/project 803828fmlir/include/mlir/Dialect/GPU/IR GPUBase.td, mlir/lib/Conversion/GPUCommon IndexIntrinsicsOpLowering.cpp

[mlir][GPU] Refactor, improve constant size information handling (#186907)

1. There was duplicate code between the integer range analysis's
handling of static dimension size information (e.g. gpu.known_block_dim
attributes) and the handling during the lowering of those operations.
The code from integer range analysis was given a dialect-wide entry
point (and had its types fixed to be more accurate), which the lowering
templates now call.
2. The templated lowering for block/grid/cluster_dim now produces
precise ranges (indicating the constant value) where one is known, and
the lowerings in rocdl (including those for subgroup_id) have been fixed
appropriately.
3. While I was here, the gpu.dimension enum has been moved to GPUBase so
it lives next to the other enums.
4. The pattern that expands subgroup_id operations now adds any thread
dimension bounds it finds in context.

(Claude was used for an initial round of review, I did the main coding
myself.)

    [3 lines not shown]
DeltaFile
+62-57mlir/lib/Dialect/GPU/IR/InferIntRangeInterfaceImpls.cpp
+11-47mlir/lib/Conversion/GPUCommon/IndexIntrinsicsOpLowering.cpp
+24-17mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
+34-5mlir/lib/Dialect/GPU/Transforms/SubgroupIdRewriter.cpp
+26-0mlir/test/Dialect/GPU/subgroupId-rewrite.mlir
+21-0mlir/include/mlir/Dialect/GPU/IR/GPUBase.td
+178-1266 files not shown
+203-15412 files

LLVM/project da86e03utils/bazel/llvm-project-overlay/libc BUILD.bazel

[Bazel] Fixes ebb3309 (#187090)

This fixes ebb3309975c8e49096d8295a368c93c684bf10f1.
DeltaFile
+86-5utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+86-51 files