LLVM/project 9a17aa4llvm/tools/llvm-objdump llvm-objdump.cpp

llvm-objdump: Avoid contraction in error message (#189272)
DeltaFile
+1-1llvm/tools/llvm-objdump/llvm-objdump.cpp
+1-11 files

LLVM/project 2c41a8dllvm/test/CodeGen/AMDGPU llvm.amdgcn.ds.bvh.stack.push.pop.rtn.ll llvm.amdgcn.dual_intersect_ray.ll

AMDGPU: Fix using -march in a couple tests (#189271)
DeltaFile
+2-2llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ds.bvh.stack.push.pop.rtn.ll
+2-2llvm/test/CodeGen/AMDGPU/llvm.amdgcn.dual_intersect_ray.ll
+4-42 files

LLVM/project e911910llvm/lib/Analysis ValueTracking.cpp, llvm/lib/Target/ARM/AsmParser ARMAsmParser.cpp

[LLVM] remove redundant uses of dyn_cast (NFC) (#189105)

This removes dyn_cast invocations where the argument is already of the
target type (including through subtyping). This was created by adding a
static assert in dyn_cast and letting an LLM iterate until the code base
compiled. I then went through each example and cleaned it up. This does
not commit the static assert in dyn_cast, because it would prevent a lot
of uses in templated code. To prevent backsliding we should instead add
an LLVM aware version of
https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html
(or expand the existing one).
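
The static-assert idea described above can be sketched outside LLVM. This is
a minimal illustration that uses standard `dynamic_cast` instead of LLVM's
`dyn_cast`; `checked_dyn_cast`, `is_redundant_cast_v`, `Base`, and `Derived`
are hypothetical names chosen for the example, not LLVM APIs.

```cpp
#include <type_traits>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// Hypothetical trait: true when casting From* to To* can never fail,
// because From is already To or a subclass of To.
template <typename To, typename From>
constexpr bool is_redundant_cast_v = std::is_base_of_v<To, From>;

// Hypothetical stand-in for dyn_cast that rejects redundant uses at
// compile time.
template <typename To, typename From>
To *checked_dyn_cast(From *V) {
  static_assert(!is_redundant_cast_v<To, From>,
                "argument is already of the target type");
  return dynamic_cast<To *>(V);
}
```

With this sketch, `checked_dyn_cast<Derived>(BasePtr)` compiles, while
`checked_dyn_cast<Base>(DerivedPtr)` trips the assertion. In templated code
the same call may be redundant for one instantiation and necessary for
another, which is consistent with the commit's reason for not landing the
assert.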
DeltaFile
+46-49llvm/tools/llvm-size/llvm-size.cpp
+7-20llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp
+10-10llvm/lib/Transforms/IPO/GlobalOpt.cpp
+6-8llvm/lib/Transforms/IPO/GlobalDCE.cpp
+6-8llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+3-5llvm/lib/Analysis/ValueTracking.cpp
+78-10017 files not shown
+105-14423 files

LLVM/project c8c3694llvm/tools/llvm-objdump llvm-objdump.cpp

llvm-objdump: Avoid contraction in error message
DeltaFile
+1-1llvm/tools/llvm-objdump/llvm-objdump.cpp
+1-11 files

LLVM/project 479a826llvm/lib/Support Parallel.cpp

[Support] Use namespace qualifiers in Parallel.cpp. NFC (#189268)

Replace `namespace llvm { namespace parallel { ... } }` blocks with
`using namespace` and qualified definitions per

https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-define-previously-declared-symbols

Also reformat the TaskGroup constructor to avoid clang-format issues
with #if/#endif split across the initializer list.
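
The namespace style referred to above can be illustrated with a small
stand-alone sketch. The `demo` namespace and `scaleTaskCount` are
hypothetical and are not taken from Parallel.cpp.

```cpp
// Header part: the symbol is declared inside its namespaces.
namespace demo {
namespace parallel {
int scaleTaskCount(int Tasks);
} // namespace parallel
} // namespace demo

// Source part: instead of reopening the namespaces, add a using-directive
// and define the previously declared symbol with a qualified name.
using namespace demo;

int demo::parallel::scaleTaskCount(int Tasks) { return Tasks * 2; }
```

A qualified definition must match an existing declaration, so a mismatched
signature becomes a compile error rather than silently introducing a new
function.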
DeltaFile
+18-21llvm/lib/Support/Parallel.cpp
+18-211 files

LLVM/project ff13b76llvm/test/CodeGen/AMDGPU llvm.amdgcn.ds.bvh.stack.push.pop.rtn.ll llvm.amdgcn.dual_intersect_ray.ll

AMDGPU: Fix using -march in a couple tests
DeltaFile
+2-2llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ds.bvh.stack.push.pop.rtn.ll
+2-2llvm/test/CodeGen/AMDGPU/llvm.amdgcn.dual_intersect_ray.ll
+4-42 files

LLVM/project 3fcbba3clang/lib/CIR/Dialect/Transforms LoweringPrepare.cpp

unreachable on RDC compilation
DeltaFile
+2-1clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+2-11 files

LLVM/project e17c219llvm/lib/Transforms/Instrumentation MemorySanitizer.cpp, llvm/test/Instrumentation/MemorySanitizer/Hexagon vararg-hexagon.ll hexagon.ll

[msan] Add MSan instrumentation support for Hexagon (#189122)

Add MemorySanitizer instrumentation pass support for Hexagon Linux. This
is the codegen/instrumentation side; the compiler-rt runtime changes are
in a separate patch.

The shadow memory layout uses XOR-based mapping with XorMask=0x20000000
and OriginBase=0x50000000, designed to fit within the 32-bit address
space.

VarArg handling uses VarArgGenericHelper with VAListTagSize=12, matching
the Hexagon ABI where va_list is a three-pointer struct {
current_reg_area, reg_area_end, overflow_area }.
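
The arithmetic behind VAListTagSize=12 can be checked with a small model.
This is a sketch only: `HexagonVaListModel` is a hypothetical name, and the
pointers are modeled as 32-bit integers so the size is the same on any host.

```cpp
#include <cstdint>

// Hypothetical model of the three-pointer va_list described above.
struct HexagonVaListModel {
  std::uint32_t current_reg_area; // stands in for a 32-bit pointer
  std::uint32_t reg_area_end;     // stands in for a 32-bit pointer
  std::uint32_t overflow_area;    // stands in for a 32-bit pointer
};

// Three 4-byte fields with no padding give 12 bytes.
static_assert(sizeof(HexagonVaListModel) == 12, "expected 3 * 4 bytes");
```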
DeltaFile
+90-0llvm/test/Instrumentation/MemorySanitizer/Hexagon/vararg-hexagon.ll
+86-0llvm/test/Instrumentation/MemorySanitizer/Hexagon/hexagon.ll
+20-0llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+196-03 files

LLVM/project a8cdc5acompiler-rt/lib/msan msan_interceptors.cpp, compiler-rt/lib/sanitizer_common sanitizer_platform_interceptors.h

[compiler-rt][msan] Guard shmat interceptor w SANITIZER_INTERCEPT_SHMCTL (#189198)

The shmat interceptor calls REAL(shmctl), but shmctl is not intercepted
on all targets (e.g. 32-bit Linux with musl). Guard shmat behind
SANITIZER_INTERCEPT_SHMCTL and use a MSAN_MAYBE_INTERCEPT pattern
consistent with other conditional interceptors.
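
The conditional-interceptor pattern can be sketched as follows. All
`SKETCH_*` macros and the registry functions are hypothetical stand-ins;
the real runtime uses its own macros and interception machinery.

```cpp
#include <string>
#include <vector>

// Hypothetical platform switch, standing in for SANITIZER_INTERCEPT_SHMCTL.
#define SKETCH_INTERCEPT_SHMCTL 1

// Hypothetical registry standing in for the runtime's interceptor setup.
inline std::vector<std::string> &installedInterceptors() {
  static std::vector<std::string> Names;
  return Names;
}
#define SKETCH_INTERCEPT_FUNCTION(name) installedInterceptors().push_back(#name)

#if SKETCH_INTERCEPT_SHMCTL
#define SKETCH_MAYBE_INTERCEPT_SHMCTL SKETCH_INTERCEPT_FUNCTION(shmctl)
// shmat relies on the real shmctl, so it is installed only alongside it.
#define SKETCH_MAYBE_INTERCEPT_SHMAT SKETCH_INTERCEPT_FUNCTION(shmat)
#else
// Both expand to nothing on targets without a shmctl interceptor.
#define SKETCH_MAYBE_INTERCEPT_SHMCTL
#define SKETCH_MAYBE_INTERCEPT_SHMAT
#endif

inline void initInterceptors() {
  SKETCH_MAYBE_INTERCEPT_SHMCTL;
  SKETCH_MAYBE_INTERCEPT_SHMAT;
}
```

Setting `SKETCH_INTERCEPT_SHMCTL` to 0 makes both macros expand to nothing,
so `initInterceptors` installs neither interceptor.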
DeltaFile
+6-1compiler-rt/lib/msan/msan_interceptors.cpp
+2-0compiler-rt/lib/sanitizer_common/sanitizer_platform_interceptors.h
+8-12 files

LLVM/project 44f1fa9compiler-rt/cmake/Modules AllSupportedArchDefs.cmake, compiler-rt/lib/msan msan.h msan_allocator.cpp

[compiler-rt][msan] Add MSan support for Hexagon (Linux) (#189124)

Add the runtime infrastructure for MemorySanitizer on Hexagon Linux.
Hexagon is 32-bit, so the shadow memory layout uses a compact XOR-based
mapping that fits within the lower 3GB of address space:

    0x00000000 - 0x10000000  APP-1     (256MB, program text/data/heap)
    0x10000000 - 0x20000000  ALLOCATOR (256MB)
    0x20000000 - 0x40000000  SHADOW-1  (512MB, covers APP-1 + ALLOCATOR)
    0x40000000 - 0x50000000  APP-2     (256MB, shared libs + stack)
    0x60000000 - 0x70000000  SHADOW-2  (256MB, covers APP-2)
    0x70000000 - 0x90000000  ORIGIN-1  (512MB)
    0xB0000000 - 0xC0000000  ORIGIN-2  (256MB)

MEM_TO_SHADOW uses XOR 0x20000000, and SHADOW_TO_ORIGIN adds 0x50000000.
The dual-APP layout accommodates QEMU user-mode, which places shared
libraries and the stack at 0x40000000.

The allocator uses SizeClassAllocator32 with a 256MB region at
0x10000000, and kMaxAllowedMallocSize is set to 1GB consistent with
other 32-bit targets.
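
The two mappings can be checked against the layout table with a short
sketch. The constants come from the message above; the helper names
`memToShadow` and `shadowToOrigin` are hypothetical and only mirror
MEM_TO_SHADOW and SHADOW_TO_ORIGIN.

```cpp
#include <cstdint>

constexpr std::uint32_t kXorMask = 0x20000000;      // from the message above
constexpr std::uint32_t kOriginOffset = 0x50000000; // from the message above

// Application address -> shadow address.
constexpr std::uint32_t memToShadow(std::uint32_t Mem) {
  return Mem ^ kXorMask;
}

// Shadow address -> origin address.
constexpr std::uint32_t shadowToOrigin(std::uint32_t Shadow) {
  return Shadow + kOriginOffset;
}

// Region starts from the table map onto each other as listed.
static_assert(memToShadow(0x00000000) == 0x20000000, "APP-1 -> SHADOW-1");
static_assert(memToShadow(0x10000000) == 0x30000000, "ALLOCATOR -> SHADOW-1");
static_assert(memToShadow(0x40000000) == 0x60000000, "APP-2 -> SHADOW-2");
static_assert(shadowToOrigin(0x20000000) == 0x70000000, "SHADOW-1 -> ORIGIN-1");
static_assert(shadowToOrigin(0x60000000) == 0xB0000000, "SHADOW-2 -> ORIGIN-2");
```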
DeltaFile
+24-0compiler-rt/lib/msan/msan.h
+14-0compiler-rt/lib/msan/msan_allocator.cpp
+1-1compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+39-13 files

LLVM/project b8d0411clang/lib/Driver Driver.cpp

clang: Avoid intermediate DenseSet of triples (#189263)

This was computing a DenseSet<StringRef> of triples, but the
only use was to insert all the entries into a multiset. Just
use the multiset in the first place.
DeltaFile
+8-10clang/lib/Driver/Driver.cpp
+8-101 files

LLVM/project 3fe1fd1clang/include/clang/Driver ToolChain.h, clang/lib/Driver Driver.cpp

clang: Store Triple in multiset

Previously this was storing StringRefs, which just happen
to be constant allocated strings. Change this into an owning
reference in the form that will actually be used. This will allow
changing the triples to something computed without maintaining
a table of every possible permutation.
DeltaFile
+20-16clang/lib/Driver/Driver.cpp
+7-0llvm/include/llvm/TargetParser/Triple.h
+4-0clang/include/clang/Driver/ToolChain.h
+31-163 files

LLVM/project 23ddaccclang/include/clang/Driver ToolChain.h, clang/lib/Driver Driver.cpp ToolChain.cpp

clang: Simplify OpenMP triple adjustment

Previously this would find a list of offloading triples,
then later fill in the unknown components specifically for
OpenMP after the fact. Start normalizing the triples upfront,
before inserting into the set. Also stop special casing OpenMP
since there's no apparent reason to treat it differently from
other offload languages.

Also operate on the Triple rather than the string, and handle
the unset OS and environment separately.
DeltaFile
+19-13clang/include/clang/Driver/ToolChain.h
+9-10clang/lib/Driver/Driver.cpp
+1-1clang/lib/Driver/ToolChains/Clang.cpp
+1-1clang/lib/Driver/ToolChain.cpp
+30-254 files

LLVM/project a58579eclang/lib/Driver Driver.cpp

clang: Avoid intermediate DenseSet of triples

This was computing a DenseSet<StringRef> of triples, but the
only use was to insert all the entries into a multiset. Just
use the multiset in the first place.
DeltaFile
+8-10clang/lib/Driver/Driver.cpp
+8-101 files

LLVM/project ad1e30bclang/include/clang/Basic OffloadArch.h, clang/lib/Basic OffloadArch.cpp

clang: Move Triple computing logic to separate function (#189262)
DeltaFile
+7-12clang/lib/Driver/Driver.cpp
+16-0clang/lib/Basic/OffloadArch.cpp
+4-0clang/include/clang/Basic/OffloadArch.h
+27-123 files

LLVM/project c467d38llvm/lib/Transforms/Vectorize VPlanTransforms.cpp LoopVectorize.cpp

[LV] Fix offset handling for epilogue resume values. (NFCI) (#189259)

Instead of replacing all uses of the canonical IV with an add of the
resume value and then relying on the fold to simplify, directly create
offset versions of both the canonical IV and its increment.

The original offset computations were incorrect, but did not result in
miscompiles thanks to the corresponding fold.

Split off from approved
https://github.com/llvm/llvm-project/pull/156262.
DeltaFile
+28-11llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+15-8llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+43-192 files

LLVM/project 8374475clang/lib/Driver Driver.cpp

clang: Use isAMDGPU triple helper (#189261)

Also remove redundant SPIRV check.
DeltaFile
+1-1clang/lib/Driver/Driver.cpp
+1-11 files

LLVM/project cd9a653clang/include/clang/Driver ToolChain.h, clang/lib/Driver Driver.cpp ToolChain.cpp

clang: Simplify OpenMP triple adjustment

Previously this would find a list of offloading triples,
then later fill in the unknown components specifically for
OpenMP after the fact. Start normalizing the triples upfront,
before inserting into the set. Also stop special casing OpenMP
since there's no apparent reason to treat it differently from
other offload languages.

Also operate on the Triple rather than the string, and handle
the unset OS and environment separately.
DeltaFile
+19-13clang/include/clang/Driver/ToolChain.h
+9-10clang/lib/Driver/Driver.cpp
+1-1clang/lib/Driver/ToolChain.cpp
+1-1clang/lib/Driver/ToolChains/Clang.cpp
+30-254 files

LLVM/project f789d2cclang/lib/Driver Driver.cpp

clang: Avoid intermediate DenseSet of triples

This was computing a DenseSet<StringRef> of triples, but the
only use was to insert all the entries into a multiset. Just
use the multiset in the first place.
DeltaFile
+7-8clang/lib/Driver/Driver.cpp
+7-81 files

LLVM/project eb0588cclang/include/clang/Driver ToolChain.h, clang/lib/Driver Driver.cpp

clang: Store Triple in multiset

Previously this was storing StringRefs, which just happen
to be constant allocated strings. Change this into an owning
reference in the form that will actually be used. This will allow
changing the triples to something computed without maintaining
a table of every possible permutation.
DeltaFile
+20-16clang/lib/Driver/Driver.cpp
+7-0llvm/include/llvm/TargetParser/Triple.h
+4-0clang/include/clang/Driver/ToolChain.h
+31-163 files

LLVM/project 6883276clang/include/clang/Basic OffloadArch.h, clang/lib/Basic OffloadArch.cpp

clang: Move Triple computing logic to separate function
DeltaFile
+7-12clang/lib/Driver/Driver.cpp
+16-0clang/lib/Basic/OffloadArch.cpp
+4-0clang/include/clang/Basic/OffloadArch.h
+27-123 files

LLVM/project b661af6clang/lib/Driver Driver.cpp

clang: Use isAMDGPU triple helper

Also remove redundant SPIRV check.
DeltaFile
+1-1clang/lib/Driver/Driver.cpp
+1-11 files

LLVM/project 73cddefllvm/test/CodeGen/AArch64 is_fpclass.ll is_fpclass-bfloat.ll, llvm/test/CodeGen/AMDGPU r600.llvm.is.fpclass.ll llvm.is.fpclass.bf16.ll

optimize `is_finite` assembly (#169402)

Fixes https://github.com/llvm/llvm-project/issues/169270

Changes the implementation of `is_finite` to emit fewer instructions,
e.g.

X86_64

```asm
old: # 18 bytes
        movd    %xmm0, %eax
        andl    $2147483647, %eax
        cmpl    $2139095040, %eax
        setl    %al
        retq
new: # 15 bytes
        movd    %xmm0, %eax
        addl    %eax, %eax

    [23 lines not shown]
DeltaFile
+160-191llvm/test/CodeGen/AArch64/is_fpclass.ll
+158-170llvm/test/CodeGen/AMDGPU/r600.llvm.is.fpclass.ll
+117-115llvm/test/CodeGen/X86/is_fpclass.ll
+44-52llvm/test/CodeGen/AArch64/is_fpclass-bfloat.ll
+48-45llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.bf16.ll
+22-22llvm/test/CodeGen/X86/isel-fpclass.ll
+549-5956 files not shown
+617-65412 files

LLVM/project 1bb0302libc/shared/math asinpi.h, libc/src/__support/math asinpi.h asin_utils.h

[libc][math][c23] implement double-precision asinpi (#188158)

Implement the double-precision version of the asinpi C23 math function.
DeltaFile
+291-0libc/src/__support/math/asinpi.h
+92-0libc/test/src/math/asinpi_test.cpp
+70-0libc/src/__support/math/asin_utils.h
+52-0libc/test/src/math/smoke/asinpi_test.cpp
+26-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+23-0libc/shared/math/asinpi.h
+554-023 files not shown
+668-129 files

LLVM/project b6bbf2allvm/lib/Analysis DependenceAnalysis.cpp

Refactor

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
DeltaFile
+14-26llvm/lib/Analysis/DependenceAnalysis.cpp
+14-261 files

LLVM/project 2e6e36bllvm/test/CodeGen/AArch64 is_fpclass.ll is_fpclass-bfloat.ll

[NFC][AArch64] update tests for `is_fpclass` (#187336)

Hopefully this is better.

One wrinkle is that `@llvm.is.fpclass.bf16` is not currently implemented
for GI. That might be easy to add but I've not been able to figure out
where the issue is exactly so far.

I'm also not totally sure `-mattr=-fp-armv8` is equivalent to softfloat,
but some tests do suggest that they are equivalent (and looking at the
assembly, that seems right).
DeltaFile
+559-213llvm/test/CodeGen/AArch64/is_fpclass.ll
+112-0llvm/test/CodeGen/AArch64/is_fpclass-bfloat.ll
+671-2132 files

LLVM/project 37c8bafllvm/lib/Analysis DependenceAnalysis.cpp

Remove redundant logic

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
DeltaFile
+0-6llvm/lib/Analysis/DependenceAnalysis.cpp
+0-61 files

LLVM/project a0217f5llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Fix overflow of calculation in weakCrossingSIVtest

This patch fixes a correctness issue where integer overflow in the
upper bound calculation of weakCrossingSIVtest caused the pass to
incorrectly prove independence.

The previous logic used `SCEV::getMulExpr` to calculate
`2 * ConstCoeff * UpperBound` and compared it to `Delta` using
`isKnownPredicate`. In the presence of overflow, this could yield
unsafe results.

This change replaces the SCEV arithmetic with `ConstantRange` and
its operation (`smul_fast`). If the calculation overflows,
`intersectWith(MLRange).isEmptySet()` would be false, which ensures we
conservatively assume a dependence if the bounds cannot be proven
safe.
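
The hazard can be shown with a simplified model. This sketch does not use
SCEV or `ConstantRange`; it assumes 8-bit values and reduces the
independence condition to `Delta > 2 * Coeff * UpperBound`, and every
function name in it is hypothetical.

```cpp
#include <cstdint>
#include <optional>

// Multiply two signed 8-bit values; return nullopt when the exact product
// does not fit in 8 bits.
constexpr std::optional<std::int8_t> checkedMul(std::int8_t A, std::int8_t B) {
  int Wide = int(A) * int(B);
  if (Wide < INT8_MIN || Wide > INT8_MAX)
    return std::nullopt;
  return static_cast<std::int8_t>(Wide);
}

// Unsafe variant: the bound is truncated to 8 bits, so it can wrap.
constexpr bool provesIndependenceWrapping(std::int8_t Coeff, std::int8_t UB,
                                          std::int8_t Delta) {
  auto Bound =
      static_cast<std::int8_t>(static_cast<std::uint8_t>(2 * Coeff * UB));
  return Delta > Bound;
}

// Conservative variant: any overflow means independence is not proven.
constexpr bool provesIndependenceChecked(std::int8_t Coeff, std::int8_t UB,
                                         std::int8_t Delta) {
  auto TwoCoeff = checkedMul(2, Coeff);
  if (!TwoCoeff)
    return false;
  auto Bound = checkedMul(*TwoCoeff, UB);
  if (!Bound)
    return false;
  return Delta > *Bound;
}

// 2 * 4 * 20 = 160 does not fit in 8 bits and wraps to -96, so the
// wrapping variant wrongly reports independence for Delta = 50.
static_assert(provesIndependenceWrapping(4, 20, 50));
static_assert(!provesIndependenceChecked(4, 20, 50));
```

When nothing overflows, for example `Coeff = 1`, `UB = 10`, `Delta = 50`,
both variants agree.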

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
DeltaFile
+17-5llvm/lib/Analysis/DependenceAnalysis.cpp
+17-51 files

LLVM/project b5331b4llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-crossing-siv-overflow.ll

update

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
DeltaFile
+59-0llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-overflow.ll
+16-15llvm/lib/Analysis/DependenceAnalysis.cpp
+75-152 files

LLVM/project 4151f5dmlir/lib/Dialect/LLVMIR/IR LLVMDialect.cpp, mlir/lib/Target/LLVMIR/Dialect/LLVMIR LLVMToLLVMIRTranslation.cpp

[MLIR][LLVMIR] Allow llvm.call and llvm.invoke to use llvm.mlir.alias as callee (#189154)

Previously, the verifier for `llvm.call` and `llvm.invoke` would reject
calls where the callee was an `llvm.mlir.alias`, reporting that the
symbol does not reference a valid LLVM function or IFunc. Similarly, the
MLIR-to-LLVM-IR translation had no handling for aliases as callees.

This patch extends both the verifier and the translation to accept
`llvm.mlir.alias` as a valid callee for `llvm.call` and `llvm.invoke`,
mirroring the existing support for `llvm.mlir.ifunc`. The function type
for alias calls is derived from the call operands and result types, and
the translation emits a call through the alias global value.

Fixes #147057

Assisted-by: Claude Code
DeltaFile
+52-0mlir/test/Target/LLVMIR/alias.mlir
+38-12mlir/lib/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.cpp
+21-0mlir/test/Dialect/LLVMIR/alias.mlir
+6-1mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+117-134 files