LLVM/project 83bc933mlir/include/mlir/Dialect/Utils IndexingUtils.h

NFC: MLIR Indexing Utils comment fix (#183438)

The comment for delinearize was incorrect: it swapped modulus and
division. Update the comment to match the code.
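
For context, delinearizing an index against a list of strides is commonly written as below. This is a Python sketch of the general technique, not the code in IndexingUtils.h; the function name and parameter names are assumptions made for illustration.

```python
def delinearize(linear_index, strides):
    """Split a linear index into per-dimension offsets, given the strides."""
    offsets = []
    for stride in strides:
        # Division gives the offset along the current dimension.
        offsets.append(linear_index // stride)
        # The modulus is the remainder carried into the next dimension.
        linear_index %= stride
    return offsets


# A row-major shape (2, 3, 4) has strides (12, 4, 1).
print(delinearize(17, [12, 4, 1]))  # [1, 1, 1]
```

Swapping the two operations in the description (modulus for the offset, division for the carry) would describe a different computation, which is the kind of mismatch the commit corrects.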
DeltaFile
+2-2mlir/include/mlir/Dialect/Utils/IndexingUtils.h
+2-21 files

LLVM/project 008dc8bflang-rt/lib/runtime execute.cpp, flang-rt/unittests/Runtime CommandTest.cpp

[flang-rt] Fix EXECUTE_COMMAND_LINE() on Windows (#184875)

Detect the special cmd.exe status code 9009, which indicates a "command
not found" condition. Crash the process if "command not found" is
detected when CMDSTAT was not specified.
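
The decision described above can be sketched as follows. This is a hypothetical Python model, not the flang-rt implementation: only the value 9009 comes from the commit message, while the function name, the returned labels, and the use of an exception to stand in for a crash are assumptions.

```python
CMD_NOT_FOUND = 9009  # cmd.exe status for "command not found" (from the commit message)


def handle_status(status, cmdstat_given):
    """Classify a cmd.exe exit status; raise when the error cannot be reported."""
    if status == CMD_NOT_FOUND:
        if not cmdstat_given:
            # Stand-in for the runtime terminating the program.
            raise RuntimeError("command not found")
        # With CMDSTAT present, the caller can be told about the failure.
        return "command-not-found"
    return "ok" if status == 0 else "nonzero-exit"


print(handle_status(9009, cmdstat_given=True))  # command-not-found
```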
DeltaFile
+44-10flang-rt/lib/runtime/execute.cpp
+8-8flang-rt/unittests/Runtime/CommandTest.cpp
+52-182 files

LLVM/project c9555f6lldb/source/Plugins/DynamicLoader/Darwin-Kernel DynamicLoaderDarwinKernel.cpp

[lldb][Darwin] Don't try to insert breakpoint on corefiles (#184749)

lldb is printing an error that the kext-loaded notification breakpoint
can't be set when debugging a kernel corefile. The breakpoint only needs
to be inserted in live debug sessions.

rdar://170813438
DeltaFile
+2-1lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
+2-11 files

LLVM/project 559951clibclc/clc/include/clc/subgroup sub_group_broadcast.h, libclc/clc/lib/amdgcn SOURCES

libclc: Add sub_group_broadcast
DeltaFile
+55-0libclc/clc/lib/amdgcn/subgroup/sub_group_broadcast.cl
+43-0libclc/clc/include/clc/subgroup/sub_group_broadcast.h
+32-0libclc/opencl/lib/generic/subgroup/sub_group_broadcast.cl
+1-0libclc/clc/lib/amdgcn/SOURCES
+1-0libclc/opencl/lib/generic/SOURCES
+132-05 files

LLVM/project 1813143libclc/opencl/lib/amdgcn SOURCES, libclc/opencl/lib/amdgcn/subgroup subgroup.cl

libclc: Add amdgpu subgroup functions
DeltaFile
+74-0libclc/opencl/lib/amdgcn/subgroup/subgroup.cl
+1-0libclc/opencl/lib/amdgcn/SOURCES
+75-02 files

LLVM/project 0b5d5eflibclc/opencl/lib/generic/atomic atomic_work_item_fence.cl

__opencl_get_clang_memory_scope
DeltaFile
+2-1libclc/opencl/lib/generic/atomic/atomic_work_item_fence.cl
+2-11 files

LLVM/project e2d14bflibclc/clc/lib/amdgcn/mem_fence clc_mem_fence.cl, libclc/opencl/lib/generic SOURCES

libclc: Add atomic_work_item_fence
DeltaFile
+17-0libclc/opencl/lib/generic/atomic/atomic_work_item_fence.cl
+2-0libclc/clc/lib/amdgcn/mem_fence/clc_mem_fence.cl
+1-0libclc/opencl/lib/generic/SOURCES
+20-03 files

LLVM/project 34259b7mlir/include/mlir/Dialect/XeGPU/IR XeGPUAttrs.td, mlir/lib/Dialect/XeGPU/IR XeGPUDialect.cpp

[MLIR][XeGPU] Refactoring Transpose OP Layout Propagation (#184702)

This PR refactors Transpose Op Layout Propagation: 
1. Add inferTransposeSourceLayout() to layout utility, enhance layout
propagation and conflict handling to use this function
2. Add Layout utility: TransposeDims()
3. Refactor IsTransposeOf() and fix minor bugs
4. Fix minor issue in dropSgLayoutAndData()
DeltaFile
+143-14mlir/lib/Dialect/XeGPU/IR/XeGPUDialect.cpp
+27-47mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
+18-0mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+5-2mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+3-3mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
+3-3mlir/test/Dialect/XeGPU/propagate-layout.mlir
+199-693 files not shown
+209-729 files

LLVM/project c6e2ff8libclc/opencl/lib/generic/synchronization work_group_barrier.cl

__opencl_get_clang_memory_scope
DeltaFile
+4-3libclc/opencl/lib/generic/synchronization/work_group_barrier.cl
+4-31 files

LLVM/project 169f561llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll

Rebase, address comment

Created using spr 1.3.7
DeltaFile
+84,419-78,498llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+66,293-29,491llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+25,751-24,782llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,663-20,281llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,867-18,577llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,112-16,445llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+241,105-188,0742,687 files not shown
+467,404-311,3682,693 files

LLVM/project 11727c1libclc/clc/include/clc/workitem clc_get_enqueued_local_size.h, libclc/clc/lib/amdgcn SOURCES

libclc: Implement get_enqueued_local_size (#184842)
DeltaFile
+17-0libclc/clc/include/clc/workitem/clc_get_enqueued_local_size.h
+14-0libclc/clc/lib/amdgcn/workitem/clc_get_enqueued_local_size.cl
+14-0libclc/opencl/lib/generic/workitem/get_enqueued_local_size.cl
+1-0libclc/clc/lib/amdgcn/SOURCES
+1-0libclc/opencl/lib/generic/SOURCES
+47-05 files

LLVM/project d624466utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[mlir][acc] Add Dialect Utils to OpenACCDialect deps (#184895)
DeltaFile
+1-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+1-01 files

LLVM/project b53adf4llvm/lib/Transforms/Vectorize VPlanConstruction.cpp, llvm/test/Transforms/LoopVectorize multiple-argmin-argmax.ll

[VPlan] Always process all argmin/argmax reductions in plan.

Follow-up to https://github.com/llvm/llvm-project/pull/170223.
Instead of exiting early, continue processing remaining reductions in
the loop. This ensures all multi-use reductions are properly converted
or the plan is rejected if there are unconvertible patterns.

Fixes https://github.com/llvm/llvm-project/issues/184729.
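
The control-flow change can be illustrated with a small Python sketch. It is an assumption-laden model of the description above, not the VPlanConstruction.cpp code: the dictionary layout, `convert_all`, and `try_convert` are hypothetical names.

```python
def convert_all(reductions, try_convert):
    """Convert every multi-use reduction; return None if the plan is rejected."""
    converted = []
    for reduction in reductions:
        if not reduction["multi_use"]:
            # Nothing to convert here; keep scanning instead of returning early.
            continue
        if not try_convert(reduction):
            # One unconvertible pattern is enough to reject the whole plan.
            return None
        converted.append(reduction["name"])
    return converted


reductions = [
    {"name": "argmin", "multi_use": True, "ok": True},
    {"name": "sum", "multi_use": False, "ok": True},
    {"name": "argmax", "multi_use": True, "ok": True},
]
print(convert_all(reductions, lambda r: r["ok"]))  # ['argmin', 'argmax']
```

The point of the sketch is that the loop visits every reduction, so a later multi-use reduction is never left unconverted because an earlier one ended the scan.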
DeltaFile
+62-81llvm/test/Transforms/LoopVectorize/multiple-argmin-argmax.ll
+6-4llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+68-852 files

LLVM/project e87d342lldb/test/Shell/ScriptInterpreter/Python bytecode.test, lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode formatter.py

[lldb] Fix bytecode.test (#184903)

Follow up to #184714.

There are some other latent bugs here inside the formatter, but for now
this puts the test in a working state.
DeltaFile
+4-2lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode/formatter.py
+2-2lldb/test/Shell/ScriptInterpreter/Python/bytecode.test
+6-42 files

LLVM/project d414e8cflang/docs Extensions.md, flang/include/flang/Support Fortran-features.h

[flang] Reject PARAMETER constants in NAMELIST groups (#178960)

The Fortran standard does not allow a `PARAMETER` within a
`namelist-group-object`; only variables are allowed. An error should
be emitted when a `PARAMETER` is found within a `namelist-group-object`.

Fixes: #178955
DeltaFile
+29-0flang/test/Semantics/namelist02.f90
+12-0flang/docs/Extensions.md
+8-0flang/lib/Semantics/check-namelist.cpp
+1-1flang/include/flang/Support/Fortran-features.h
+50-14 files

LLVM/project 94a8ca1libc/src/__support/math inv_trigf_utils.h asinpif.h

[libc][math] Optimize `asinpif` and `acospif` using Estrin's scheme (#184286)

Optimize `asinpif` and `acospif` using [Estrin's
scheme](https://en.wikipedia.org/wiki/Estrin%27s_scheme).
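
As a reminder of the technique, here is a minimal Python sketch of Estrin's scheme next to Horner's rule. It is a generic illustration and not the libc code; the function names and the lowest-degree-first coefficient order are assumptions.

```python
def horner(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) with one long dependency chain."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def estrin(coeffs, x):
    """Evaluate the same polynomial by combining adjacent pairs level by level."""
    terms = list(coeffs)
    if not terms:
        return 0
    power = x
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0)  # pad so every term has a partner
        # Each pair (a, b) becomes a + b * power; the pairs are independent.
        terms = [terms[i] + terms[i + 1] * power for i in range(0, len(terms), 2)]
        power = power * power  # x, x^2, x^4, ...
    return terms[0]


print(estrin([1, 2, 3, 4], 2))  # 49
```

Both functions compute the same value in exact arithmetic. Estrin's scheme exposes independent multiply-adds at each level, which can shorten the dependency chain on hardware that issues several of them in parallel; floating-point results may differ slightly from Horner's because the operations are grouped differently.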

## Benchmarking in **debug mode**
### **before**
**asinpif**

```
Ntrial = 40 ; Min = 0.000 + 350.920 clc/call; Median-Min = 450.477 clc/call; Max = 468.882 clc/call;
```

**acospif**
```
Ntrial = 40 ; Min = 0.000 + 309.248 clc/call; Median-Min = 384.386 clc/call; Max = 420.073 clc/call;
```

### **after**

    [7 lines not shown]
DeltaFile
+36-8libc/src/__support/math/inv_trigf_utils.h
+0-22libc/src/__support/math/asinpif.h
+36-302 files

LLVM/project 08a8c2bclang/test/SemaHLSL/Types/BuiltinMatrix MatrixFloatPrecisionWarnings.hlsl

Fix one more test.
DeltaFile
+3-3clang/test/SemaHLSL/Types/BuiltinMatrix/MatrixFloatPrecisionWarnings.hlsl
+3-31 files

LLVM/project bf5d5dfclang/lib/Analysis UnsafeBufferUsage.cpp, clang/unittests/Analysis/Scalable/Analyses/UnsafeBufferUsage UnsafeBufferUsageTest.cpp

clean up
DeltaFile
+1-1clang/lib/Analysis/UnsafeBufferUsage.cpp
+1-1clang/unittests/Analysis/Scalable/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+2-22 files

LLVM/project b5be659clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/lib/CodeGen/TargetBuiltins ARM.cpp

[CIR][AArch64] Add missing lowerings for vceqz_* Neon builtins (#184893)

Implement the remaining CIR lowerings for the AdvSIMD (Neon)
`vceqz{|q|d|s}_*` intrinsic group (bitwise equal to zero).
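
The lane-wise semantics of "equal to zero" can be modeled as below. This Python sketch only illustrates what the intrinsics compute (an all-ones mask for lanes equal to zero, zero otherwise); it is not the CIR lowering, and the helper name and `lane_bits` parameter are assumptions.

```python
def vceqz(lanes, lane_bits):
    """Return an all-ones mask for each lane equal to zero, else zero."""
    all_ones = (1 << lane_bits) - 1
    return [all_ones if lane == 0 else 0 for lane in lanes]


# Four 8-bit lanes: zero lanes map to 0xFF, the rest to 0.
print([hex(m) for m in vceqz([0, 5, 0, -1], 8)])  # ['0xff', '0x0', '0xff', '0x0']
```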

The `vceqzd_s64` variant was already supported; this patch completes
the rest of the group [1].

Tests for these intrinsics are moved from:
  * test/CodeGen/AArch64/neon-misc.c
  
to:
  * test/CodeGen/AArch64/neon/intrinsics.c

The implementation largely mirrors the existing lowering in
CodeGen/TargetBuiltins/ARM.cpp.

`emitCommonNeonBuiltinExpr` is introduced to support these lowerings.
`getNeonType` is moved without functional changes.

    [2 lines not shown]
DeltaFile
+725-67clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+370-8clang/test/CodeGen/AArch64/neon/intrinsics.c
+1-306clang/test/CodeGen/AArch64/neon-misc.c
+6-0clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1,102-3814 files

LLVM/project ccd5fd7libclc/opencl/lib/generic SOURCES, libclc/opencl/lib/generic/synchronization work_group_barrier.cl

Use __opencl_get_clang_memory_scope
DeltaFile
+2-1libclc/opencl/lib/generic/synchronization/work_group_barrier.cl
+1-1libclc/opencl/lib/generic/SOURCES
+3-22 files

LLVM/project 0baa2d4libclc/opencl/lib/generic/synchronization work_group_barrier.cl barrier.cl

Rename file
DeltaFile
+25-0libclc/opencl/lib/generic/synchronization/work_group_barrier.cl
+0-25libclc/opencl/lib/generic/synchronization/barrier.cl
+25-252 files

LLVM/project 8aeee11libclc/opencl/lib/amdgcn/synchronization barrier.cl, libclc/opencl/lib/generic SOURCES

libclc: Define work_group_barrier

Previously only the old barrier name was implemented. Define this
as an indirection around the new name, and move it to common code.
The target implementations are already provided by __clc_work_group_barrier,
so targets were unnecessarily duplicating these.

This also fixes the default scope, which should be
memory_work_group_scope. Previously the code guessed that flags
including global memory implied device scope, which is not the case.
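
The scope change can be sketched in Python as follows. This is a hypothetical model rather than the libclc source: the flag values are illustrative, the scopes are plain strings, and both function names are assumptions.

```python
# Illustrative flag bits; treat the exact values as assumptions.
CLK_LOCAL_MEM_FENCE = 0x1
CLK_GLOBAL_MEM_FENCE = 0x2


def old_default_scope(flags):
    """Previous guess: a global-memory fence widened the scope to the device."""
    return "device" if flags & CLK_GLOBAL_MEM_FENCE else "work_group"


def new_default_scope(flags):
    """Behavior described in the commit: the default scope is the work-group."""
    return "work_group"


flags = CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE
print(old_default_scope(flags), new_default_scope(flags))  # device work_group
```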
DeltaFile
+25-0libclc/opencl/lib/generic/synchronization/barrier.cl
+0-17libclc/opencl/lib/ptx-nvidiacl/synchronization/barrier.cl
+0-17libclc/opencl/lib/amdgcn/synchronization/barrier.cl
+1-1libclc/opencl/lib/generic/async/wait_group_events.cl
+0-1libclc/opencl/lib/ptx-nvidiacl/SOURCES
+1-0libclc/opencl/lib/generic/SOURCES
+27-361 files not shown
+27-377 files

LLVM/project 9143f21llvm/lib/Transforms/Scalar LoopFuse.cpp

[LoopFusion] Correction in the comments (NFC) (#184689)

The comments in the code should have been updated following the change
in https://github.com/llvm/llvm-project/pull/183353. This PR addresses
that issue.
DeltaFile
+4-6llvm/lib/Transforms/Scalar/LoopFuse.cpp
+4-61 files

LLVM/project 4ad8104llvm/lib/Transforms/Instrumentation AddressSanitizer.cpp

Fixing codegen when using link.exe on arm64
DeltaFile
+1-1llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+1-11 files

LLVM/project 6695f1dllvm/docs AIToolPolicy.md

[docs] Add exception to AI tool policy for Bazel build fixer (#183408)

The Bazel RFC concluded earlier this month:
https://discourse.llvm.org/t/rfc-ai-assisted-bazel-fixer-bot/89178/93

I felt the best way to document this decision was to incorporate it into
this policy document.
DeltaFile
+12-0llvm/docs/AIToolPolicy.md
+12-01 files

LLVM/project dc62e28flang/include/flang/Parser openmp-utils.h, flang/lib/Lower/OpenMP Utils.cpp

[flang][OpenMP] Implement utility to locate OmpClause in ODS, NFC (#184866)

Simplify looking for a specific clause in OmpDirectiveSpecification.
This is an alternative to DirectiveStructureChecker::FindClause for when
the internal checker structures have not yet been updated in the AST
traversal.
DeltaFile
+31-40flang/lib/Semantics/check-omp-loop.cpp
+4-14flang/lib/Semantics/check-omp-structure.cpp
+3-7flang/lib/Lower/OpenMP/Utils.cpp
+10-0flang/lib/Parser/openmp-utils.cpp
+2-6flang/lib/Parser/parse-tree.cpp
+3-0flang/include/flang/Parser/openmp-utils.h
+53-676 files

LLVM/project dce8586clang/include/clang/Analysis/Analyses UnsafeBufferUsage.h, clang/include/clang/Analysis/Scalable/Analyses/UnsafeBufferUsage UnsafeBufferUsage.h

[ssaf][UnsafeBufferUsage] Add support for extracting unsafe pointers from all kinds of contributors

- Generalize the -Wunsafe-buffer-usage API for finding unsafe pointers in all kinds of Decls
- Add support in SSAF-based UnsafeBufferUsage analysis for extracting from various contributors
- Mock implementation of HandleTranslationUnit

rdar://171735836
DeltaFile
+113-25clang/lib/Analysis/Scalable/Analyses/UnsafeBufferUsage/UnsafeBufferUsageExtractor.cpp
+45-18clang/lib/Analysis/UnsafeBufferUsage.cpp
+58-4clang/unittests/Analysis/Scalable/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+8-1clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
+2-0clang/include/clang/Analysis/Scalable/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.h
+226-485 files

LLVM/project 275a215flang/lib/Semantics check-acc-structure.cpp, flang/test/Lower/OpenACC acc-cache.f90

[flang][openacc] Relax semantic check on cache directive (#184887)

The specification doesn't really forbid using the colon notation to
specify the full array. The reference compiler accepts this, and our
lowering can already handle it.
DeltaFile
+16-0flang/test/Lower/OpenACC/acc-cache.f90
+0-6flang/lib/Semantics/check-acc-structure.cpp
+2-4flang/test/Semantics/OpenACC/acc-cache-validity.f90
+18-103 files

LLVM/project a82e3a1llvm/lib/Target/AMDGPU SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.fptrunc.round.ll

[AMDGPU] add back the true16 pattern for cvt_pk_rtz (#184857)

I found that the `SupportedRoundMode` pattern for true16 mode was removed
in https://github.com/llvm/llvm-project/pull/177069 by mistake. This
patch adds it back and adds gfx11, which runs true16 mode, to the test.
DeltaFile
+364-111llvm/test/CodeGen/AMDGPU/llvm.fptrunc.round.ll
+9-1llvm/lib/Target/AMDGPU/SIInstructions.td
+373-1122 files

LLVM/project 9548db8libc/shared/math ffmaf128.h, libc/src/__support/math ffmaf128.h CMakeLists.txt

[libc][math] Refactor ffmaf128 into a header only. (#184751)

closes #175325 
part of #147386
DeltaFile
+34-0libc/src/__support/math/ffmaf128.h
+29-0libc/shared/math/ffmaf128.h
+12-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+11-0libc/src/__support/math/CMakeLists.txt
+2-4libc/src/math/generic/ffmaf128.cpp
+1-2libc/src/math/generic/CMakeLists.txt
+89-73 files not shown
+93-79 files