LLVM/project a75e6a5mlir/lib/Dialect/XeGPU/Transforms XeGPUUnroll.cpp, mlir/test/Dialect/XeGPU xegpu-wg-to-sg.mlir xegpu-wg-to-sg-unify-ops.mlir

[MLIR][XeGPU] Remove offsets from create_nd_tdesc & remove update_nd_offset, move offsets to load/store/prefetch ops (#193330)

This PR removes the optional offsets/const_offsets operands on
xegpu.create_nd_tdesc and instead mandates offsets directly on the
consuming load, store, and prefetch ops. It also deprecates the
update_nd_offset op.
DeltaFile
+980-230mlir/test/Dialect/XeGPU/xegpu-wg-to-sg.mlir
+0-987mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-unify-ops.mlir
+245-107mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-rr.mlir
+164-174mlir/test/Dialect/XeGPU/propagate-layout.mlir
+44-282mlir/lib/Dialect/XeGPU/Transforms/XeGPUUnroll.cpp
+106-147mlir/test/Dialect/XeGPU/xegpu-blocking.mlir
+1,539-1,92721 files not shown
+1,946-3,33627 files

LLVM/project e7164d4libclc CMakeLists.txt

[libclc] Only check the triple architecture for libclc (#194149)

Summary:
Previously, `nvptx64--` would reject `nvptx64-unknown-unknown`. There are
two options: either normalize all the triples in CMake, or just check the
architecture. I went with the latter because it makes it easier for
people to pass different values.
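
The actual change lives in `libclc/CMakeLists.txt`; as a rough illustration of the "compare only the architecture" rule, here is a minimal C++ sketch. The helper names `tripleArch` and `sameArch` are hypothetical and do not come from the patch.

```cpp
#include <string_view>

// Returns the architecture component of a target triple, i.e. everything
// before the first '-'. A string without '-' is treated as all architecture.
constexpr std::string_view tripleArch(std::string_view Triple) {
  return Triple.substr(0, Triple.find('-'));
}

// Hypothetical helper: two triples match when their architectures agree,
// regardless of how vendor/OS/environment are spelled.
constexpr bool sameArch(std::string_view A, std::string_view B) {
  return tripleArch(A) == tripleArch(B);
}

// The pair from the commit message is accepted under this rule.
static_assert(sameArch("nvptx64--", "nvptx64-unknown-unknown"));
static_assert(!sameArch("nvptx64--", "amdgcn-amd-amdhsa"));
```

Under this rule any vendor/OS spelling is accepted as long as the first component agrees, which is the flexibility the message argues for.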
DeltaFile
+9-14libclc/CMakeLists.txt
+9-141 files

LLVM/project 0ccb181compiler-rt/lib/sanitizer_common sanitizer_redefine_builtins.h

[compiler-rt] Use asm .set only for Hexagon (#194160)

Two incompatible assembler syntaxes exist for symbol assignment:
```
  sym = val      -- accepted by most GNU assembler targets; rejected by
                    Hexagon, which interprets it as a mnemonic
  .set sym, val  -- accepted by Hexagon; rejected by Alpha, which
                    reserves .set for assembler mode flags
```
Switch all targets to `sym = val`, and opt Hexagon out to `.set sym, val`.
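
A minimal sketch of how such a per-target choice can be expressed with the preprocessor. The macro `ASM_SYM_ASSIGN` and the function `memcpyRedirect` are hypothetical names for illustration, not the contents of `sanitizer_redefine_builtins.h`, and the symbol names are placeholders.

```cpp
#include <string>

// Hypothetical macro: expands to the assembler text assigning `val` to `sym`.
#if defined(__hexagon__)
// Hexagon would parse `sym = val` as a mnemonic, so use the directive form.
#define ASM_SYM_ASSIGN(sym, val) ".set " sym ", " val "\n"
#else
// All other targets, including Alpha, where .set is reserved for mode flags.
#define ASM_SYM_ASSIGN(sym, val) sym " = " val "\n"
#endif

// Builds the assignment text for one redirected symbol (placeholder names).
inline std::string memcpyRedirect() {
  return ASM_SYM_ASSIGN("memcpy", "__sanitizer_internal_memcpy");
}
```

Adjacent string literals are concatenated at compile time, so the result can be passed directly to a file-scope `asm(...)` statement.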

Fixes: dbb03f8f606e ("[compiler-rt] Replace assignment w/.set directive
(#107667)")

---------

Co-authored-by: Vitaly Buka <vitalybuka at google.com>
DeltaFile
+17-5compiler-rt/lib/sanitizer_common/sanitizer_redefine_builtins.h
+17-51 files

LLVM/project b614c15llvm/include/llvm/MC TargetRegistry.h, llvm/include/llvm/MC/MCParser MCTargetAsmParser.h

[MC] Drop MCTargetOptions parameter from MCTargetAsmParser (#194120)

Since #180464, MCAsmInfo holds the canonical MCTargetOptions.
The MCTargetAsmParser::MCOptions member is a redundant by-value copy,
which may have inconsistent values (llvm-exegesis passes a temporary
MCTargetOptions(), but this probably doesn't matter in practice; other
in-tree uses are correct).

Remove the field in favor of getParser().getContext().getTargetOptions,
and remove the MCTargetOptions parameter from the base ctor, all
subclass ctors, Target::createMCAsmParser, MCAsmParserCtorTy, and
RegisterMCAsmParser.
DeltaFile
+7-9llvm/include/llvm/MC/TargetRegistry.h
+7-6llvm/include/llvm/MC/MCParser/MCTargetAsmParser.h
+5-4llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp
+4-4llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp
+3-4llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+3-3llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+29-3028 files not shown
+76-8334 files

LLVM/project 8174442clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfofp8min/policy/non-overloaded vfncvtbf16.c, clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfofp8min/policy/overloaded vfncvtbf16.c

Add extra check for invariants

Created using spr 1.3.7
DeltaFile
+3,230-456llvm/test/CodeGen/WebAssembly/strided-int-mac.ll
+704-882llvm/test/CodeGen/RISCV/rvv/setcc-int-vp.ll
+472-472clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfofp8min/policy/overloaded/vfncvtbf16.c
+345-558llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-int-vp.ll
+280-280clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfofp8min/policy/non-overloaded/vfncvtbf16.c
+236-281llvm/test/CodeGen/RISCV/rvv/vsrl-vp.ll
+5,267-2,929148 files not shown
+10,292-6,461154 files

LLVM/project 486e97aclang/include/clang/Sema Initialization.h

[clang][NFC] Fix typo in HLSL initialization comment (#194124)
DeltaFile
+1-1clang/include/clang/Sema/Initialization.h
+1-11 files

LLVM/project b5471ccllvm/lib/MC MCObjectStreamer.cpp, llvm/test/MC/AsmParser directive_fill.s

[MC] Always lower .fill to MCFillFragment (#194164)

Constant-count, constant-pattern .fill expands inline into the current
fragment via emitIntValue per byte, wasting both memory and time (a
redundant copy at MCAssembler.cpp). #50974 reports a 4s compile dropping
to 0.6s when the loop is removed.

Drop the inline path so .fill always becomes MCFillFragment.
This cannot be done before commit 507efbcce03d (2023) allowed
label differences to be separated by a MCFillFragment.

In directive_fill.s, the parse time warning is now diagnosed by
MCAssembler.
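
To illustrate why the inline path is wasteful, here is a simplified model, not the MC implementation. `FillRecord`, `emitFillInline`, and `fillSizeInBytes` are hypothetical names, and little-endian byte order is an assumption of the sketch.

```cpp
#include <cstdint>
#include <vector>

// Deferred form: the fill is stored as three numbers, whatever the count.
struct FillRecord {
  uint64_t Value;
  unsigned Size;  // bytes per repetition, assumed <= 8
  uint64_t Count; // number of repetitions
};

// Eager form: materialize Count * Size bytes immediately, one byte at a
// time (little-endian is an assumption of this sketch).
inline void emitFillInline(std::vector<uint8_t> &Data, const FillRecord &F) {
  for (uint64_t I = 0; I < F.Count; ++I)
    for (unsigned B = 0; B < F.Size; ++B)
      Data.push_back(static_cast<uint8_t>(F.Value >> (8 * B)));
}

// Size the deferred fill will occupy once a writer finally expands it.
inline uint64_t fillSizeInBytes(const FillRecord &F) {
  return static_cast<uint64_t>(F.Size) * F.Count;
}
```

In this model the eager form costs time and memory proportional to `Count * Size` at emission, while the deferred form stays constant-size until the final write.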
DeltaFile
+5-16llvm/lib/MC/MCObjectStreamer.cpp
+1-1llvm/test/MC/AsmParser/directive_fill.s
+6-172 files

LLVM/project 4c7dc9clibc/src/__support/FPUtil BasicOperations.h, libc/src/__support/math CMakeLists.txt fmaximum_mag_numbf16.h

Reland "[libc][math] Refactor fmaximum_mag_num family to header-only" (#194194)

Reland #182169

---------

Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+47-2utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+30-0libc/src/__support/math/CMakeLists.txt
+20-7libc/src/__support/FPUtil/BasicOperations.h
+26-0libc/src/__support/math/fmaximum_mag_numbf16.h
+25-0libc/src/__support/math/fmaximum_mag_num.h
+25-0libc/src/__support/math/fmaximum_mag_numf.h
+173-911 files not shown
+272-3017 files

LLVM/project de7c63ellvm/tools/llvm-profgen PerfReader.cpp ProfileGenerator.cpp

[llvm-profgen] Add --time-profgen (#191930)

Add `NamedRegionTimer`s to main profgen phases:
- Parse and aggregate trace (`parseAndAggregateTrace`)
- Unwind samples (`unwindSamples`)
- Generate profile (`ProfileGenerator::generateProfile`)
- Generate CS profile (`CSProfileGenerator::generateProfile`)

Test Plan:
```
$ llvm-profgen --time-profgen ...

===-------------------------------------------------------------------------===
                                  llvm-profgen
===-------------------------------------------------------------------------===
  Total Execution Time: 2826.6549 seconds (2873.3410 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  1059.4929 ( 38.1%)   8.5146 ( 17.3%)  1068.0075 ( 37.8%)  1090.6604 ( 38.0%)  Generate CS profile

```

    [3 lines not shown]
DeltaFile
+11-0llvm/tools/llvm-profgen/PerfReader.cpp
+5-0llvm/tools/llvm-profgen/ProfileGenerator.cpp
+1-0llvm/tools/llvm-profgen/Options.h
+17-03 files

LLVM/project 75b450fbolt/test/X86 pre-aggregated-records.s, bolt/test/X86/Inputs pre-aggregated-bad-hex.txt pre-aggregated-bad-type.txt

[BOLT] Add tests for pre-aggregated parsing (#193843)

Extends e2e coverage of pre-aggregated profile parsing to match the
unit-test coverage added in #192390:

- R (Return) records, including the branch=0 fallback path that
  rewrites to the FT_EXTERNAL_RETURN sentinel.
- r (FT_EXTERNAL_RETURN) records.
- B and T records using the negative -1 hex form (#192391),
  which is parsed as the BR_ONLY/FT_ONLY sentinel.
- Error paths: invalid record type letter and malformed hex address
  (perf2bolt is expected to exit non-zero with a parser error).

The two error-path inputs are tiny raw files under Inputs/ since they
contain intentionally malformed records that link_fdata doesn't process.

Test Plan:
added bolt/test/X86/pre-aggregated-records.s
DeltaFile
+60-0bolt/test/X86/pre-aggregated-records.s
+1-0bolt/test/X86/Inputs/pre-aggregated-bad-hex.txt
+1-0bolt/test/X86/Inputs/pre-aggregated-bad-type.txt
+62-03 files

LLVM/project 71816eflibc/src/__support/FPUtil/generic add_sub.h, libc/src/__support/math fdimf.h fdimf16.h

[libc][math] Qualify fdim functions to constexpr (#194137)

Signed-off-by: udaykiriti <udaykiriti624 at gmail.com>
Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+8-0libc/test/shared/shared_math_constexpr_test.cpp
+5-1libc/src/__support/FPUtil/generic/add_sub.h
+6-0libc/test/shared/CMakeLists.txt
+3-1libc/src/__support/math/fdimf.h
+3-1libc/src/__support/math/fdimf16.h
+3-1libc/src/__support/math/fdim.h
+28-44 files not shown
+32-810 files

LLVM/project 24f4629lldb/test/API/commands/thread/backtrace TestThreadBacktraceRepeat.py

[lldb][test] Use assertIn in TestThreadBacktraceRepeat.py (NFC) (#194193)

I broke this test locally, and fixed the asserts to produce more useful
output upon failure.
DeltaFile
+7-8lldb/test/API/commands/thread/backtrace/TestThreadBacktraceRepeat.py
+7-81 files

LLVM/project 13e7958llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

rebase

Created using spr 1.3.4
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,383 files not shown
+1,085,016-125,6375,389 files

LLVM/project e55f02fllvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.4

[skip ci]
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,383 files not shown
+1,085,016-125,6375,389 files

LLVM/project 46154febolt/docs profiles.md, bolt/lib/Profile DataReader.cpp DataAggregator.cpp

[BOLT] Support negative hex in pre-aggregated profile (#192391)

Handle signed values in parseHexField by falling back to int64_t parsing
when uint64_t fails. This allows pre-aggregated profile tools to use -1
for BR_ONLY, -2 for FT_EXTERNAL_ORIGIN, -3 for FT_EXTERNAL_RETURN.

Guard the external address reset loop in parseAggregatedLBREntry to
preserve sentinel values (offsets >= FT_EXTERNAL_RETURN).

Add tests for -1/-2/-3 in parseHexField and T entries with -1,
ffffffffffffffff, and buildid:-1 as BR_ONLY.
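
The fallback described above can be sketched with the standard library as follows. This is not BOLT's `parseHexField`; `parseHexAs` and `parseHexFieldSketch` are hypothetical names, and the sentinel definitions are assumed from the values given in the message.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Sentinels from the commit message, as two's-complement values (assumed).
constexpr uint64_t BR_ONLY = static_cast<uint64_t>(-1);
constexpr uint64_t FT_EXTERNAL_ORIGIN = static_cast<uint64_t>(-2);
constexpr uint64_t FT_EXTERNAL_RETURN = static_cast<uint64_t>(-3);

// Parses all of S as a base-16 integer of type T; no "0x" prefix accepted.
template <typename T> std::optional<T> parseHexAs(std::string_view S) {
  T Value{};
  const char *End = S.data() + S.size();
  auto [Ptr, Ec] = std::from_chars(S.data(), End, Value, 16);
  if (Ec != std::errc() || Ptr != End)
    return std::nullopt;
  return Value;
}

// Try unsigned first; if that fails (e.g. a leading '-'), fall back to
// signed parsing and reinterpret the result as an unsigned value.
std::optional<uint64_t> parseHexFieldSketch(std::string_view S) {
  if (auto U = parseHexAs<uint64_t>(S))
    return U;
  if (auto I = parseHexAs<int64_t>(S))
    return static_cast<uint64_t>(*I);
  return std::nullopt;
}
```

With this shape, `-1` and `ffffffffffffffff` both yield the same 64-bit value, which is why the two spellings can be treated as the same sentinel.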
DeltaFile
+44-6bolt/docs/profiles.md
+40-0bolt/unittests/Profile/DataAggregator.cpp
+8-3bolt/lib/Profile/DataReader.cpp
+4-2bolt/lib/Profile/DataAggregator.cpp
+96-114 files

LLVM/project 53f7610llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

rebase

Created using spr 1.3.4
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,382 files not shown
+1,084,972-125,6315,388 files

LLVM/project 2954251llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.4

[skip ci]
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,382 files not shown
+1,084,972-125,6315,388 files

LLVM/project cd2cf73bolt/include/bolt/Profile DataAggregator.h, bolt/unittests/Profile DataAggregator.cpp

[BOLT] Add unit tests for pre-aggregated profile parsing (#192390)

Add PreAggregatedTestHelper fixture with friend access to DataAggregator
internals. Add tests for parseHexField and all pre-aggregated entry
types (B, F, f, r, T, R).
DeltaFile
+198-2bolt/unittests/Profile/DataAggregator.cpp
+1-0bolt/include/bolt/Profile/DataAggregator.h
+199-22 files

LLVM/project cb9b66cllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer revec-shufflevector.ll

[SLP]Initial support for non-power-of-2 vectorization

Enables non-power-of-2 vectorization within the SLP tree. The root nodes
are still required to be power-of-2, which will be addressed in follow-up
patches.

Recommit after revert in e19f36ff8189f1bd6d3b214d2c30ab8ef0639678

Original Pull Request: https://github.com/llvm/llvm-project/pull/151530

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194189
DeltaFile
+442-220llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+64-154llvm/test/Transforms/SLPVectorizer/RISCV/reordered-buildvector-scalars.ll
+156-0llvm/test/Transforms/SLPVectorizer/AArch64/long-non-power-of-2.ll
+26-44llvm/test/Transforms/SLPVectorizer/revec-shufflevector.ll
+50-14llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll
+29-33llvm/test/Transforms/SLPVectorizer/X86/parent-node-schedulable-with-multi-copyables.ll
+767-46563 files not shown
+1,118-91969 files

LLVM/project fc99b67libc/src/__support/FPUtil BasicOperations.h, libc/src/__support/math CMakeLists.txt fmaximum_mag_numbf16.h

Revert "[libc][math] Refactor fmaximum_mag_num family to header-only" (#194183)

Reverts llvm/llvm-project#182169
DeltaFile
+2-47utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+0-30libc/src/__support/math/CMakeLists.txt
+11-16libc/src/__support/FPUtil/BasicOperations.h
+0-26libc/src/__support/math/fmaximum_mag_numbf16.h
+0-25libc/src/__support/math/fmaximum_mag_numf.h
+0-25libc/src/__support/math/fmaximum_mag_num.h
+13-16911 files not shown
+34-26817 files

LLVM/project 23cc957libc/src/__support/FPUtil BasicOperations.h, libc/src/__support/math CMakeLists.txt fmaximum_mag_numbf16.h

[libc][math] Refactor fmaximum_mag_num family to header-only (#182169)

Refactors the fmaximum_mag_num math family to be header-only.

Closes https://github.com/llvm/llvm-project/issues/182168

Target Functions:
  - fmaximum_mag_num
  - fmaximum_mag_numbf16
  - fmaximum_mag_numf

---------

Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+47-2utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+30-0libc/src/__support/math/CMakeLists.txt
+16-11libc/src/__support/FPUtil/BasicOperations.h
+26-0libc/src/__support/math/fmaximum_mag_numbf16.h
+25-0libc/src/__support/math/fmaximum_mag_num.h
+25-0libc/src/__support/math/fmaximum_mag_numf.h
+169-1311 files not shown
+268-3417 files

LLVM/project bbd4d67libc/shared/math fdivf128.h, libc/src/__support/math fdivf128.h CMakeLists.txt

[libc][math] Refactor fdiv family to header-only (#182192)

Refactors the fdiv math family to be header-only.

Closes https://github.com/llvm/llvm-project/issues/182191

Target Functions:
  - fdiv
  - fdivf128
  - fdivl

---------

Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+43-2utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+31-0libc/src/__support/math/fdivf128.h
+29-0libc/shared/math/fdivf128.h
+28-0libc/src/__support/math/CMakeLists.txt
+25-0libc/src/__support/math/fdiv.h
+25-0libc/src/__support/math/fdivl.h
+181-210 files not shown
+253-1816 files

LLVM/project b5ac484llvm/include/llvm/ADT DenseMap.h

[DenseMap] Resolves asan + msvc build syntax errors (#193695)

The problem was introduced by #183457 as an asan workaround for clang
builds to silence false positives, so the fix here just enables the
workaround for clang builds.

Fixes #189323

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
DeltaFile
+4-2llvm/include/llvm/ADT/DenseMap.h
+4-21 files

LLVM/project 8eef507llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanUtils.cpp

[VPlan] Fix assert in finding WideCanIV (NFC) (#193269)

addActiveLaneMask asserts that the return value of a find_if is
contextually convertible to true when finding a WideCanonicalIV recipe;
what it should really be checking is that the iterator is not the end
iterator. Fix this assert by introducing and using a variant of
vputils::findUserOf.
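
The pattern can be shown in plain standard C++ as a sketch. `Recipe`, `IsWideCanIV`, and `findWideCanIV` are hypothetical stand-ins, not the VPlan types or the new `vputils` helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for a recipe; `IsWideCanIV` is a hypothetical flag.
struct Recipe {
  bool IsWideCanIV;
};

// Returns the first matching recipe, or nullptr when there is none, so
// callers can test a pointer instead of an iterator.
inline const Recipe *findWideCanIV(const std::vector<Recipe> &Users) {
  auto It = std::find_if(Users.begin(), Users.end(),
                         [](const Recipe &R) { return R.IsWideCanIV; });
  // "Found" means It != Users.end(); converting the iterator itself to
  // bool (where that compiles at all) does not answer that question.
  return It == Users.end() ? nullptr : &*It;
}

// Caller-side check that actually fails when nothing matched.
inline const Recipe &getWideCanIV(const std::vector<Recipe> &Users) {
  const Recipe *R = findWideCanIV(Users);
  assert(R && "expected a wide canonical IV user");
  return *R;
}
```

Returning a pointer from the lookup keeps the end-iterator comparison in one place, so later asserts cannot pass vacuously.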
DeltaFile
+4-11llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+3-7llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+6-0llvm/lib/Transforms/Vectorize/VPlanUtils.h
+13-183 files

LLVM/project 28de393compiler-rt/test/profile instrprof-tmpdir.c

[Profile] Reenable instrprof-tmpdir.c (#194181)

env -u is supported by the internal shell which is now the default
everywhere.
DeltaFile
+0-5compiler-rt/test/profile/instrprof-tmpdir.c
+0-51 files

LLVM/project d62067a.github/workflows/containers/libc Dockerfile

[Github] Drop LLVM 21 installation from libc Dockerfile (#194178)

The compiler version was bumped in
8abce0a63c10124aa26a070ead80a68f705c95f9, so we no longer need to
include this. We should probably just hash pin the version in future
workflows for future toolchain upgrades.
DeltaFile
+0-3.github/workflows/containers/libc/Dockerfile
+0-31 files

LLVM/project d36e524llvm/include/llvm/CodeGen KCFI.h, llvm/lib/CodeGen KCFI.cpp

[NewPM] Adds a port for KCFI (#194163)

Standard porting w/ refactored pass logic to support old and new PMs.

Wired in to X86 pass builder.
DeltaFile
+29-11llvm/lib/CodeGen/KCFI.cpp
+31-0llvm/include/llvm/CodeGen/KCFI.h
+4-0llvm/test/CodeGen/X86/llc-pipeline-npm.ll
+2-1llvm/lib/Target/X86/X86CodeGenPassBuilder.cpp
+1-1llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1-1llvm/lib/Target/ARM/ARMTargetMachine.cpp
+68-147 files not shown
+75-1813 files

LLVM/project e19f36fllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer revec-shufflevector.ll shuffle-mask-resized.ll

Revert "[SLP]Initial support for non-power-of-2 vectorization"

This reverts commit 1348766d1d686b8825bdaa2f6638c1783d76a4a7 to fix
a crash, reported in https://github.com/llvm/llvm-project/pull/151530#pullrequestreview-4176091133

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194177
DeltaFile
+220-439llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+154-64llvm/test/Transforms/SLPVectorizer/RISCV/reordered-buildvector-scalars.ll
+44-26llvm/test/Transforms/SLPVectorizer/revec-shufflevector.ll
+14-50llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll
+33-29llvm/test/Transforms/SLPVectorizer/X86/parent-node-schedulable-with-multi-copyables.ll
+16-31llvm/test/Transforms/SLPVectorizer/shuffle-mask-resized.ll
+481-63962 files not shown
+919-95968 files

LLVM/project bb1f822.github/workflows/containers/github-action-ci Dockerfile, .github/workflows/containers/github-action-ci-windows Dockerfile

[Github] Bump CI containers to 22.1.4 (#194175)
DeltaFile
+1-1.github/workflows/containers/github-action-ci/Dockerfile
+1-1.github/workflows/containers/github-action-ci-windows/Dockerfile
+2-22 files

LLVM/project 93317d1llvm/lib/Transforms/IPO Inliner.cpp

Fix typo in comment in Inliner.cpp (#194172)
DeltaFile
+1-1llvm/lib/Transforms/IPO/Inliner.cpp
+1-11 files