LLVM/project 56eef98libc/test/integration/src/stdlib getenv_test.cpp

[libc][stdlib] Simplify getenv_test by using strcmp instead of custom helper (#163055)

[libc][stdlib] Simplify getenv_test by using inline_strcmp instead of custom helper

Replace the custom `my_streq` helper function with LLVM libc's
`inline_strcmp` utility from `src/string/memory_utils/inline_strcmp.h`.

Changes:
- Remove 18-line custom `my_streq` implementation
- Use `inline_strcmp` with a simple comparator lambda for string comparisons
- Replace `my_streq(..., nullptr)` checks with direct `== nullptr` comparisons
- Maintain identical test coverage while reducing code duplication

Benefits:
- Uses existing, well-tested LLVM libc infrastructure
- Clearer test assertions with standard comparison functions
- More concise code (reduced from ~48 to ~33 lines)
- Consistent with LLVM libc coding practices
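
The comparator-lambda pattern described above might look roughly like the following sketch. `strcmp_with` and `streq` are hypothetical stand-ins written for illustration; the real `inline_strcmp` signature is not shown in this log and may differ.

```cpp
// Hypothetical stand-in for a comparator-driven strcmp (NOT the actual
// LLVM libc inline_strcmp): walks both strings, then lets the caller's
// comparator decide the result for the first differing characters.
template <typename Comp>
constexpr int strcmp_with(const char *left, const char *right, Comp &&comp) {
  while (*left != '\0' && *left == *right) {
    ++left;
    ++right;
  }
  return comp(*left, *right);
}

// Equality check of the kind a test could use in place of a custom helper.
constexpr bool streq(const char *a, const char *b) {
  auto diff = [](char l, char r) -> int {
    return static_cast<unsigned char>(l) - static_cast<unsigned char>(r);
  };
  return strcmp_with(a, b, diff) == 0;
}
```

A null result from `getenv` would then be checked directly with `== nullptr` rather than routed through the string helper.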


    [6 lines not shown]
DeltaFile
+17-32libc/test/integration/src/stdlib/getenv_test.cpp
+17-321 files

LLVM/project f969c86llvm/test/CodeGen/X86 bfloat.ll

[X86] bfloat.ll - cleaned up check prefixes to stop update script conflict warnings (#167877)

DeltaFile
+466-275llvm/test/CodeGen/X86/bfloat.ll
+466-2751 files

LLVM/project e5baf07llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64InstrInfo.cpp, llvm/test/CodeGen/AArch64 alias_mask.ll csel-subs-dag-combine.ll

[AArch64] Generalize CSEL a, b, cc, SUBS(SUB(x,y), 0) -> CSEL a, b, cc, SUBS(x,y) transform to peephole (#167527)

This transform should never have been done in ISel in the first place.
It should have been done in peephole, but a few cases were missing.
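
As a rough illustration of why the rewrite can be sound, the sketch below models the NZCV flags of a 32-bit SUBS. The `Flags` struct and `subs` function are hypothetical. That the transform is limited to condition codes reading only N and Z (such as eq/ne) is an assumption here, not something this log states.

```cpp
#include <cstdint>

// Hypothetical model of the flags produced by a 32-bit SUBS x, y.
struct Flags {
  bool n, z, c, v;
};

constexpr Flags subs(uint32_t x, uint32_t y) {
  uint32_t r = x - y;                 // result wraps modulo 2^32
  bool n = (r >> 31) != 0;            // negative: sign bit of the result
  bool z = r == 0;                    // zero
  bool c = x >= y;                    // carry set means no borrow
  bool v = (((x ^ y) & (x ^ r)) >> 31) != 0; // signed overflow
  return {n, z, c, v};
}
```

In this model, `subs(x - y, 0)` and `subs(x, y)` always agree on N and Z, but C and V can differ (for example with x = 1, y = 2).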
DeltaFile
+104-110llvm/test/CodeGen/AArch64/alias_mask.ll
+0-112llvm/test/CodeGen/AArch64/csel-subs-dag-combine.ll
+0-23llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+12-0llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+116-2454 files

LLVM/project 4cd8361llvm/lib/Target/AMDGPU SIInstrInfo.cpp SIInstrInfo.h, llvm/test/CodeGen/AMDGPU absdiff.ll move-to-valu-absdiff.mir

[AMDGPU] Lower S_ABSDIFF_I32 to VALU instructions (#167691)

Added support for lowering the scalar S_ABSDIFF_I32 instruction to
equivalent VALU operations.
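
A minimal sketch of the arithmetic involved, assuming the result of S_ABSDIFF_I32 is |a - b| with 32-bit wraparound. The subtract, negate, select sequence below is one plausible decomposition. Which VALU instructions the patch actually emits is not stated here, and `absdiff_i32` is a hypothetical name.

```cpp
#include <cstdint>

// Model of an absolute-difference result using only subtract, negate,
// and a signed "pick the larger" step.
constexpr uint32_t absdiff_i32(int32_t a, int32_t b) {
  uint32_t diff = static_cast<uint32_t>(a) - static_cast<uint32_t>(b);
  uint32_t neg = 0u - diff;           // negated difference
  // Signed max(diff, -diff): take the negation when the sign bit is set.
  return (diff >> 31) != 0 ? neg : diff;
}
```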
DeltaFile
+38-0llvm/test/CodeGen/AMDGPU/absdiff.ll
+36-0llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+31-0llvm/test/CodeGen/AMDGPU/move-to-valu-absdiff.mir
+2-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+107-04 files

LLVM/project 7b7a422mlir/include/mlir/Dialect/LLVMIR LLVMIntrinsicOps.td, mlir/test/Target/LLVMIR llvmir-intrinsics.mlir

[MLIR][LLVMIR] Add {s,u}cmp intrinsics to LLVM dialect (#167870)
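
For reference, the underlying `llvm.scmp` and `llvm.ucmp` intrinsics are three-way comparisons returning -1, 0, or 1. The scalar sketch below models only that semantics; the function names are illustrative, and it says nothing about how the new dialect ops are spelled.

```cpp
#include <cstdint>

// Signed three-way compare: -1 if a < b, 0 if equal, 1 if a > b.
constexpr int scmp(int32_t a, int32_t b) { return (a > b) - (a < b); }

// Unsigned three-way compare with the same result convention.
constexpr int ucmp(uint32_t a, uint32_t b) { return (a > b) - (a < b); }
```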

DeltaFile
+32-0mlir/test/Target/LLVMIR/Import/intrinsic.ll
+32-0mlir/test/Target/LLVMIR/llvmir-intrinsics.mlir
+12-0mlir/include/mlir/Dialect/LLVMIR/LLVMIntrinsicOps.td
+76-03 files

LLVM/project 31536e6mlir/lib/Bindings/Python IRCore.cpp, mlir/test/mlir-tblgen op-python-bindings.td

[MLIR] [Python] `ir.Value` is now generic in the type of the value it holds (#166148)

This makes it similar to `mlir::TypedValue` in the MLIR C++ API and
allows users to be more specific about the values they produce or
accept.

Co-authored-by: Maksim Levental <maksim.levental at gmail.com>
DeltaFile
+30-1mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
+12-3mlir/lib/Bindings/Python/IRCore.cpp
+7-7mlir/test/mlir-tblgen/op-python-bindings.td
+8-1mlir/test/python/dialects/python_test.py
+57-124 files

LLVM/project f73bcdbmlir/include/mlir/Dialect/LLVMIR ROCDLOps.td, mlir/test/Dialect/LLVMIR rocdl.mlir

[ROCDL] Added missing s.get.named.barrier.state op (gfx1250) (#167876)

This patch adds the missing `s.get.named.barrier.state` op to the ROCDL
dialect.
DeltaFile
+9-0mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+7-0mlir/test/Dialect/LLVMIR/rocdl.mlir
+7-0mlir/test/Target/LLVMIR/rocdl.mlir
+23-03 files

LLVM/project c0f7d51llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanRecipes.cpp, llvm/test/Transforms/LoopVectorize/RISCV low-trip-count.ll vector-loop-backedge-elimination-with-evl.ll

[VPlan] Simplify ExplicitVectorLength(%AVL) -> %AVL when AVL <= VF (#167647)

[`llvm.experimental.get.vector.length`](https://llvm.org/docs/LangRef.html#id2399)
has the property that if the AVL (%cnt) is less than or equal to VF
(%max_lanes) then the return value is just AVL.

This patch uses SCEV to simplify this in optimizeForVFAndUF, and adds
`ExplicitVectorLength` to
`VPInstruction::opcodeMayReadOrWriteFromMemory` so it gets removed once
dead.
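
The property being exploited can be sketched with a small model. Here `get_vector_length` returns min(AVL, VF), which is only one result the intrinsic is allowed to produce when AVL exceeds VF. Both function names and the `avl_le_vf_known` flag (standing in for a fact proven via SCEV) are assumptions for illustration.

```cpp
#include <cstdint>

// Model only: min(avl, vf) is one permitted result of the intrinsic.
constexpr uint64_t get_vector_length(uint64_t avl, uint64_t vf) {
  return avl < vf ? avl : vf;
}

// When AVL <= VF is known up front, the call can be replaced by AVL itself.
constexpr uint64_t simplified_evl(uint64_t avl, uint64_t vf,
                                  bool avl_le_vf_known) {
  return avl_le_vf_known ? avl : get_vector_length(avl, vf);
}
```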
DeltaFile
+30-0llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+12-16llvm/test/Transforms/LoopVectorize/RISCV/low-trip-count.ll
+1-2llvm/test/Transforms/LoopVectorize/RISCV/vector-loop-backedge-elimination-with-evl.ll
+1-0llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+44-184 files

LLVM/project 78554d9clang/include/clang/AST Attr.h, clang/include/clang/Basic AttrDocs.td Attr.td

Reapply "[HLSL] Rework semantic handling as attributes"  (#167862)

The last PR had ASan failures due to bad use of a Twine instead of a
std::string.
DeltaFile
+80-83clang/lib/Sema/SemaHLSL.cpp
+47-45clang/lib/CodeGen/CGHLSLRuntime.cpp
+0-61clang/include/clang/Basic/AttrDocs.td
+26-25clang/test/SemaHLSL/Semantics/semantics-valid.hlsl
+19-31clang/include/clang/Basic/Attr.td
+7-32clang/include/clang/AST/Attr.h
+179-27714 files not shown
+278-36620 files

LLVM/project 031d213llvm/test/CodeGen/AArch64/GlobalISel irtranslator-inline-asm.ll

Fix test
DeltaFile
+15-7llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
+15-71 files

LLVM/project be2f875mlir/include/mlir/IR CommonAttrConstraints.td, mlir/test/IR locations.mlir

[MLIR Attr] Allow LocationAttr to be used as an operation attribute (#167690)

Enables locations to be used as operation attributes.

In contrast to the implicit source location every operation carries
(`Operation::getLoc()`)—which may be fused or modified during
transformations—a `LocationAttr` used as an operation attribute has
explicit semantics defined by the operation itself.

For example, in our Zig-like language frontend (where types are
first-class values), we use a location attribute on struct type
operations to store the declaration location, which is part of the
type's semantic identity. Using an explicit attribute instead of
`Operation::getLoc()` ensures this semantic information is preserved
during transformations.
DeltaFile
+6-1mlir/include/mlir/IR/CommonAttrConstraints.td
+7-0mlir/test/IR/locations.mlir
+6-0mlir/test/lib/Dialect/Test/TestOps.td
+19-13 files

LLVM/project edd8b29llvm/lib/Transforms/Scalar Float2Int.cpp, llvm/test/Transforms/Float2Int pr167627.ll

[Float2Int] Make sure the CFP can be represented in the integer type (#167699)

When `convertToInteger` fails, the integer result is undefined. In this
case, we cannot use it in the subsequent steps.
Closes https://github.com/llvm/llvm-project/issues/167627.
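
The general pattern, checking that a conversion succeeded before trusting its result, can be sketched in plain C++. `exact_int32` is a hypothetical helper standing in for the APFloat conversion and its status check; it is not the code from the patch.

```cpp
#include <cstdint>
#include <optional>

// Returns the int32_t value only if `value` is exactly representable;
// otherwise returns nullopt so callers cannot use a meaningless result.
constexpr std::optional<int32_t> exact_int32(double value) {
  // Rejects NaN (all comparisons false) and out-of-range magnitudes.
  if (!(value >= -2147483648.0 && value <= 2147483647.0))
    return std::nullopt;
  int32_t truncated = static_cast<int32_t>(value);
  if (static_cast<double>(truncated) != value)
    return std::nullopt;              // a fractional part would be lost
  return truncated;
}
```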
DeltaFile
+18-0llvm/test/Transforms/Float2Int/pr167627.ll
+8-4llvm/lib/Transforms/Scalar/Float2Int.cpp
+26-42 files

LLVM/project 0430063llvm/lib/CodeGen/GlobalISel InlineAsmLowering.cpp

clang-format
DeltaFile
+1-1llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp
+1-11 files

LLVM/project 7fd6100llvm/lib/CodeGen/GlobalISel InlineAsmLowering.cpp

Comments
DeltaFile
+14-4llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp
+14-41 files

LLVM/project d66a6a5llvm/lib/CodeGen/GlobalISel InlineAsmLowering.cpp, llvm/test/CodeGen/AArch64/GlobalISel irtranslator-inline-asm.ll arm64-fallback.ll

[GlobalISel] Add support for value/constants as inline asm memory operand
DeltaFile
+94-0llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
+85-0llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
+30-13llvm/lib/CodeGen/GlobalISel/InlineAsmLowering.cpp
+0-9llvm/test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
+209-224 files

LLVM/project 3cfe6aamlir/tools/mlir-opt mlir-opt.cpp

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in mlir-opt.cpp (NFC)
DeltaFile
+1-1mlir/tools/mlir-opt/mlir-opt.cpp
+1-11 files

LLVM/project 971e124mlir/lib/Dialect/GPU/Transforms AsyncRegionRewriter.cpp

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in AsyncRegionRewriter.cpp (NFC)
DeltaFile
+2-2mlir/lib/Dialect/GPU/Transforms/AsyncRegionRewriter.cpp
+2-21 files

LLVM/project 1a8e6f7mlir/include/mlir/Dialect/LLVMIR NVVMOps.td

[MLIR] Replace LLVM_Type in bar.warp.sync and cp.async ops with I32 (#167826)

This patch replaces generic `LLVM_Type` with specific `I32` type in NVVM
operations.

`NVVM_SyncWarpOp`: Change mask parameter from `LLVM_Type` to `I32`.
`NVVM_CpAsyncOp`: Change cpSize parameter from `Optional<LLVM_Type>` to
`Optional<I32>`.

Signed-off-by: Dharuni R Acharya <dharunira at nvidia.com>
DeltaFile
+2-2mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+2-21 files

LLVM/project 2a53949llvm/test/CodeGen/RISCV/rvv zvqdotq-sdnode.ll

[RISCV] Add test for partial reduce with select. NFC

RISC-V test coverage for #167857
DeltaFile
+25-0llvm/test/CodeGen/RISCV/rvv/zvqdotq-sdnode.ll
+25-01 files

LLVM/project f84ad45llvm/lib/Transforms/InstCombine InstCombineAndOrXor.cpp, llvm/test/Transforms/InstCombine not.ll

[LLVM][InstCombine] not (bitcast (cmp A, B)) --> bitcast (!cmp A, B) (#167693)
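
The identity behind this fold can be checked on a small model: packing a lane-wise comparison into a bitmask and inverting it gives the same bits as packing the inverted comparison. The 8-lane width, the lane-to-bit order, and the function names below are assumptions for illustration.

```cpp
#include <array>
#include <cstdint>

using Vec8 = std::array<int32_t, 8>;

// Packs lane-wise (a[i] < b[i]) into bit i, modeling a bitcast of
// an <8 x i1> compare result to i8.
constexpr uint8_t mask_slt(const Vec8 &a, const Vec8 &b) {
  uint8_t m = 0;
  for (int i = 0; i < 8; ++i)
    if (a[i] < b[i])
      m = static_cast<uint8_t>(m | (1u << i));
  return m;
}

// Same packing for the inverse predicate (a[i] >= b[i]).
constexpr uint8_t mask_sge(const Vec8 &a, const Vec8 &b) {
  uint8_t m = 0;
  for (int i = 0; i < 8; ++i)
    if (a[i] >= b[i])
      m = static_cast<uint8_t>(m | (1u << i));
  return m;
}
```

For any inputs, `~mask_slt(a, b)` truncated to 8 bits equals `mask_sge(a, b)`, so the `not` can be absorbed into the comparison predicate.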

DeltaFile
+78-0llvm/test/Transforms/InstCombine/not.ll
+9-1llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+87-12 files

LLVM/project ed1fdbcllvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! [AArch64][llvm] Improve writeback reg handling for FEAT_MOPS

Fix silly typos
DeltaFile
+1-1llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+1-11 files

LLVM/project 876114fllvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 build-vector-128.ll setcc-wide-types.ll

[X86] Add widenBuildVector to create a wider build vector if the scalars are mergeable (#167667)

See if each pair of scalar operands of a build vector can be freely
merged together - typically if they've been split for some reason by
legalization.

If so, we can create a new build vector node with double the scalar size
but half the element count, reducing codegen complexity and potentially
allowing further optimization.

I did look at performing this generically in DAGCombine, but we don't
have as much control over when a legal build vector can be folded -
another generic fold would be to handle this on insert_vector_elt pairs,
but again legality checks could be limiting.

Fixes #167498
DeltaFile
+73-311llvm/test/CodeGen/X86/build-vector-128.ll
+52-0llvm/lib/Target/X86/X86ISelLowering.cpp
+16-28llvm/test/CodeGen/X86/setcc-wide-types.ll
+141-3393 files

LLVM/project 59c01ccclang/include/clang/Basic BuiltinsX86.td, clang/lib/CodeGen/TargetBuiltins X86.cpp

[Headers][X86] Update FMA3/FMA4 scalar intrinsics to use __builtin_elementwise_fma and support constexpr (#154731)

Now that #152455 is done, we can make all the scalar fma intrinsics
wrap __builtin_elementwise_fma, which also allows constexpr.

The main difference is that FMA4 intrinsics guarantee that the upper
elements are zero, while FMA3 passes through the destination register
elements like older scalar instructions.
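
The upper-element difference can be modeled on a four-float array. This is a sketch only: `mul_add` is an unfused stand-in for the real fused operation, the function names are hypothetical, and the pass-through source for FMA3 is taken to be the first operand, as in `_mm_fmadd_ss`.

```cpp
#include <array>

using V4 = std::array<float, 4>;

// Unfused stand-in for a fused multiply-add; rounding may differ from
// a true FMA, which does not matter for the lane behavior shown here.
constexpr float mul_add(float a, float b, float c) { return a * b + c; }

// FMA3-style scalar op: lane 0 is computed, upper lanes pass through from a.
constexpr V4 fmadd_ss_fma3(const V4 &a, const V4 &b, const V4 &c) {
  return {mul_add(a[0], b[0], c[0]), a[1], a[2], a[3]};
}

// FMA4-style scalar op: lane 0 is computed, upper lanes are zeroed.
constexpr V4 macc_ss_fma4(const V4 &a, const V4 &b, const V4 &c) {
  return {mul_add(a[0], b[0], c[0]), 0.0f, 0.0f, 0.0f};
}
```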

Fixes #154555
DeltaFile
+70-46clang/test/CodeGen/X86/fma4-builtins.c
+54-46clang/test/CodeGen/X86/fma-builtins.c
+32-32clang/lib/Headers/fmaintrin.h
+24-32clang/lib/Headers/fma4intrin.h
+0-10clang/include/clang/Basic/BuiltinsX86.td
+0-6clang/lib/CodeGen/TargetBuiltins/X86.cpp
+180-1726 files

LLVM/project 0ee2facclang/lib/Driver/ToolChains AMDGPU.h AMDGPU.cpp, clang/test/Driver hip-sanitize-options.hip amdgpu-openmp-sanitize-options.c

refactor code; support errors for explicit -Xarch_device; improve testing
DeltaFile
+59-16clang/lib/Driver/ToolChains/AMDGPU.h
+45-1clang/test/Driver/hip-sanitize-options.hip
+15-29clang/lib/Driver/ToolChains/AMDGPU.cpp
+27-1clang/test/Driver/amdgpu-openmp-sanitize-options.c
+4-16clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+4-16clang/lib/Driver/ToolChains/HIPAMD.cpp
+154-791 files not shown
+161-797 files

LLVM/project 20034ballvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 sqrt-fastmath-mir.ll llc-fp-contract-warning.ll

[X86] Don't rely on global contraction flag (#167252)

As in title. See here for more context:
https://discourse.llvm.org/t/allowfpopfusion-vs-sdnodeflags-hasallowcontract/80909

Also add a warning in llc when the global contract flag is encountered on x86.
Remove the global contract flag from the last x86 test.
DeltaFile
+21-21llvm/test/CodeGen/X86/sqrt-fastmath-mir.ll
+3-11llvm/lib/Target/X86/X86ISelLowering.cpp
+12-0llvm/test/CodeGen/X86/llc-fp-contract-warning.ll
+6-0llvm/tools/llc/llc.cpp
+42-324 files

LLVM/project 5fa3ccbllvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 sve2p1-fixed-length-fdot.ll

[AArch64] Use SVE fdot for partial.reduce.fadd for NEON types. (#167856)

We only seem to use the SVE fdot for fixed-length vector types when they
are larger than 128 bits, whereas we can also use them for 128-bit
vectors if SVE2p1/SME2 is available.
DeltaFile
+57-11llvm/test/CodeGen/AArch64/sve2p1-fixed-length-fdot.ll
+3-0llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+60-112 files

LLVM/project a5342d5llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 umin-sub-to-usubo-select-combine.ll

Revert "[DAG] Fold (umin (sub a b) a) -> (usubo a b); (select usubo.1 a usubo.0)" (#167854)

Reverts llvm/llvm-project#161651 due to downstream bad codegen reports
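
For context, the reverted fold rests on an arithmetic identity that can be modeled on 32-bit unsigned values. The sketch below covers only that identity, not the code quality concerns behind the revert, and both function names are hypothetical.

```cpp
#include <cstdint>

// Left-hand side of the fold: umin(a - b, a) with wrapping subtraction.
constexpr uint32_t umin_sub(uint32_t a, uint32_t b) {
  uint32_t d = a - b;
  return d < a ? d : a;
}

// Right-hand side: select on the overflow bit of an unsigned subtract.
constexpr uint32_t usubo_select(uint32_t a, uint32_t b) {
  uint32_t d = a - b;                 // usubo.0: wrapped difference
  bool overflow = a < b;              // usubo.1: borrow occurred
  return overflow ? a : d;
}
```

When `a < b` the wrapped difference exceeds `a`, so both forms yield `a`; otherwise both yield `a - b`.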
DeltaFile
+0-156llvm/test/CodeGen/X86/umin-sub-to-usubo-select-combine.ll
+0-151llvm/test/CodeGen/AArch64/umin-sub-to-usubo-select-combine.ll
+0-19llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+0-3263 files

LLVM/project 4340159llvm/lib/Target/AArch64 AArch64SchedNeoverseN3.td, llvm/test/tools/llvm-mca/AArch64/Neoverse N3-sve-instructions.s

[AArch64] Fix SVE FADDP latency on Neoverse-N3 (#167676)

This patch fixes the latency of the SVE FADDP instruction in the
Neoverse-N3 scheduling model. The latency of SVE FADDP (floating-point
arithmetic, min/max pairwise) should be 3, as per the N3 SWOG.
DeltaFile
+3-3llvm/test/tools/llvm-mca/AArch64/Neoverse/N3-sve-instructions.s
+2-2llvm/lib/Target/AArch64/AArch64SchedNeoverseN3.td
+5-52 files

LLVM/project ef9a02cllvm/include/llvm/CodeGen RegisterPressure.h MachineRegisterInfo.h, llvm/lib/CodeGen RegisterPressure.cpp MachinePipeliner.cpp

[CodeGen] Use VirtRegOrUnit where appropriate (NFCI) (#167730)

Use it in `printVRegOrUnit()`, `getPressureSets()`/`PSetIterator`,
and in functions/classes dealing with register pressure.

Static type checking revealed several bugs, mainly in MachinePipeliner.
I'm not very familiar with this pass, so I left a bunch of FIXMEs.

There is one bug in `findUseBetween()` in RegisterPressure.cpp, also
annotated with a FIXME.
DeltaFile
+140-136llvm/lib/CodeGen/RegisterPressure.cpp
+48-18llvm/lib/CodeGen/MachinePipeliner.cpp
+34-29llvm/lib/Target/AMDGPU/SIMachineScheduler.cpp
+28-23llvm/include/llvm/CodeGen/RegisterPressure.h
+13-10llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
+11-11llvm/include/llvm/CodeGen/MachineRegisterInfo.h
+274-2277 files not shown
+313-26113 files

LLVM/project 1cb05fdllvm/docs ReleaseNotes.md, llvm/lib/Target/AArch64 AArch64Features.td

fixup! [AArch64][llvm] Add support for Permission Overlays Extension 2 (FEAT_S1POE2)

Add `alle3` testcase and other small fixes
DeltaFile
+0-4llvm/unittests/TargetParser/TargetParserTest.cpp
+4-0llvm/test/MC/AArch64/arm-poe2-tlbid-diagnostics.s
+2-1llvm/docs/ReleaseNotes.md
+1-1llvm/lib/Target/AArch64/AArch64Features.td
+0-1llvm/test/MC/AArch64/arm-btie.s
+7-75 files