LLVM/project de7c63ellvm/tools/llvm-profgen PerfReader.cpp ProfileGenerator.cpp

[llvm-profgen] Add --time-profgen (#191930)

Add `NamedRegionTimer`s to main profgen phases:
- Parse and aggregate trace (`parseAndAggregateTrace`)
- Unwind samples (`unwindSamples`)
- Generate profile (`ProfileGenerator::generateProfile`)
- Generate CS profile (`CSProfileGenerator::generateProfile`)
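
As a rough illustration of the region-timer pattern listed above, here is a minimal standalone sketch using only the C++ standard library. It is not LLVM's `NamedRegionTimer`; `PhaseTimes`, `ScopedPhaseTimer`, the `TimeProfgen` parameter, and the body of `unwindSamples` are hypothetical stand-ins chosen for this example.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical sink that accumulates wall-clock seconds per phase name.
class PhaseTimes {
public:
  void add(const std::string &Name, double Seconds) { Totals[Name] += Seconds; }
  bool has(const std::string &Name) const { return Totals.count(Name) != 0; }
  double get(const std::string &Name) const {
    auto It = Totals.find(Name);
    return It == Totals.end() ? 0.0 : It->second;
  }

private:
  std::map<std::string, double> Totals;
};

// RAII timer: starts on construction and, if enabled, records the elapsed
// time into the sink when the enclosing scope exits.
class ScopedPhaseTimer {
public:
  ScopedPhaseTimer(PhaseTimes &Sink, std::string Name, bool Enabled)
      : Sink(Sink), Name(std::move(Name)), Enabled(Enabled),
        Start(std::chrono::steady_clock::now()) {}
  ~ScopedPhaseTimer() {
    if (!Enabled)
      return;
    std::chrono::duration<double> Elapsed =
        std::chrono::steady_clock::now() - Start;
    Sink.add(Name, Elapsed.count());
  }
  ScopedPhaseTimer(const ScopedPhaseTimer &) = delete;
  ScopedPhaseTimer &operator=(const ScopedPhaseTimer &) = delete;

private:
  PhaseTimes &Sink;
  std::string Name;
  bool Enabled;
  std::chrono::steady_clock::time_point Start;
};

// Hypothetical phase: the timer covers the whole function body.
long unwindSamples(PhaseTimes &Sink, bool TimeProfgen) {
  ScopedPhaseTimer Timer(Sink, "Unwind samples", TimeProfgen);
  long Sum = 0;
  for (long I = 0; I < 1000; ++I)
    Sum += I; // placeholder work standing in for the real phase
  return Sum;
}
```

Placing the timer object at the top of each phase keeps the instrumentation to one line per phase, and the enable flag means no time is recorded unless timing was requested.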

Test Plan:
```
$ llvm-profgen --time-profgen ...

===-------------------------------------------------------------------------===
                                  llvm-profgen
===-------------------------------------------------------------------------===
  Total Execution Time: 2826.6549 seconds (2873.3410 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  1059.4929 ( 38.1%)   8.5146 ( 17.3%)  1068.0075 ( 37.8%)  1090.6604 ( 38.0%)  Generate CS profile

    [3 lines not shown]
DeltaFile
+11-0llvm/tools/llvm-profgen/PerfReader.cpp
+5-0llvm/tools/llvm-profgen/ProfileGenerator.cpp
+1-0llvm/tools/llvm-profgen/Options.h
+17-03 files

LLVM/project 75b450fbolt/test/X86 pre-aggregated-records.s, bolt/test/X86/Inputs pre-aggregated-bad-hex.txt pre-aggregated-bad-type.txt

[BOLT] Add tests for pre-aggregated parsing (#193843)

Extends e2e coverage of pre-aggregated profile parsing to match the
unit-test coverage added in #192390:

- R (Return) records, including the branch=0 fallback path that
  rewrites to the FT_EXTERNAL_RETURN sentinel.
- r (FT_EXTERNAL_RETURN) records.
- B and T records using the negative -1 hex form (#192391),
  which is parsed as the BR_ONLY/FT_ONLY sentinel.
- Error paths: invalid record type letter and malformed hex address
  (perf2bolt is expected to exit non-zero with a parser error).

The two error-path inputs are tiny raw files under Inputs/ since they
contain intentionally malformed records that link_fdata doesn't process.

Test Plan:
added bolt/test/X86/pre-aggregated-records.s
DeltaFile
+60-0bolt/test/X86/pre-aggregated-records.s
+1-0bolt/test/X86/Inputs/pre-aggregated-bad-hex.txt
+1-0bolt/test/X86/Inputs/pre-aggregated-bad-type.txt
+62-03 files

LLVM/project 71816eflibc/src/__support/FPUtil/generic add_sub.h, libc/src/__support/math fdimf.h fdimf16.h

[libc][math] Qualify fdim functions to constexpr (#194137)

Signed-off-by: udaykiriti <udaykiriti624 at gmail.com>
Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+8-0libc/test/shared/shared_math_constexpr_test.cpp
+5-1libc/src/__support/FPUtil/generic/add_sub.h
+6-0libc/test/shared/CMakeLists.txt
+3-1libc/src/__support/math/fdimf.h
+3-1libc/src/__support/math/fdimf16.h
+3-1libc/src/__support/math/fdim.h
+28-44 files not shown
+32-810 files

LLVM/project 24f4629lldb/test/API/commands/thread/backtrace TestThreadBacktraceRepeat.py

[lldb][test] Use assertIn in TestThreadBacktraceRepeat.py (NFC) (#194193)

I broke this test locally, and fixed the asserts to produce more useful
output upon failure.
DeltaFile
+7-8lldb/test/API/commands/thread/backtrace/TestThreadBacktraceRepeat.py
+7-81 files

LLVM/project 13e7958llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

rebase

Created using spr 1.3.4
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,383 files not shown
+1,085,016-125,6375,389 files

LLVM/project e55f02fllvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.4

[skip ci]
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,383 files not shown
+1,085,016-125,6375,389 files

LLVM/project 46154febolt/docs profiles.md, bolt/lib/Profile DataReader.cpp DataAggregator.cpp

[BOLT] Support negative hex in pre-aggregated profile (#192391)

Handle signed values in parseHexField by falling back to int64_t parsing
when uint64_t fails. This allows pre-aggregated profile tools to use -1
for BR_ONLY, -2 for FT_EXTERNAL_ORIGIN, -3 for FT_EXTERNAL_RETURN.

Guard the external address reset loop in parseAggregatedLBREntry to
preserve sentinel values (offsets >= FT_EXTERNAL_RETURN).

Add tests for -1/-2/-3 in parseHexField and T entries with -1,
ffffffffffffffff, and buildid:-1 as BR_ONLY.
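
The unsigned-then-signed fallback described above can be sketched with `std::from_chars`. This is only an illustration under assumptions: `parseHexFieldSketch` is a hypothetical name, and the real `parseHexField` in BOLT may differ in signature and error handling.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical sketch: parse a hex field as uint64_t; if that fails, retry
// as int64_t so that "-1", "-2", "-3" map to the top of the unsigned range.
std::optional<std::uint64_t> parseHexFieldSketch(std::string_view Field) {
  const char *Begin = Field.data();
  const char *End = Begin + Field.size();

  // First attempt: plain unsigned hex (a leading '-' is rejected here).
  std::uint64_t Unsigned = 0;
  auto [UPtr, UErr] = std::from_chars(Begin, End, Unsigned, 16);
  if (UErr == std::errc() && UPtr == End)
    return Unsigned;

  // Fallback: signed hex, reinterpreted as unsigned (two's complement).
  std::int64_t Signed = 0;
  auto [SPtr, SErr] = std::from_chars(Begin, End, Signed, 16);
  if (SErr == std::errc() && SPtr == End)
    return static_cast<std::uint64_t>(Signed);

  return std::nullopt; // neither form parsed the whole field
}
```

With this scheme `-1` and `ffffffffffffffff` yield the same 64-bit value, which is consistent with the tests above treating both spellings as the same sentinel.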
DeltaFile
+44-6bolt/docs/profiles.md
+40-0bolt/unittests/Profile/DataAggregator.cpp
+8-3bolt/lib/Profile/DataReader.cpp
+4-2bolt/lib/Profile/DataAggregator.cpp
+96-114 files

LLVM/project 53f7610llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

rebase

Created using spr 1.3.4
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,382 files not shown
+1,084,972-125,6315,388 files

LLVM/project 2954251llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.4

[skip ci]
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-05,382 files not shown
+1,084,972-125,6315,388 files

LLVM/project cd2cf73bolt/include/bolt/Profile DataAggregator.h, bolt/unittests/Profile DataAggregator.cpp

[BOLT] Add unit tests for pre-aggregated profile parsing (#192390)

Add PreAggregatedTestHelper fixture with friend access to DataAggregator
internals. Add tests for parseHexField and all pre-aggregated entry
types (B, F, f, r, T, R).
DeltaFile
+198-2bolt/unittests/Profile/DataAggregator.cpp
+1-0bolt/include/bolt/Profile/DataAggregator.h
+199-22 files

LLVM/project cb9b66cllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer revec-shufflevector.ll

[SLP]Initial support for non-power-of-2 vectorization

Enables non-power-of-2 vectorization within the SLP tree. The root nodes
are still required to be power-of-2; this will be addressed in follow-up
patches.

Recommit after revert in e19f36ff8189f1bd6d3b214d2c30ab8ef0639678

Original Pull Request: https://github.com/llvm/llvm-project/pull/151530

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194189
DeltaFile
+442-220llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+64-154llvm/test/Transforms/SLPVectorizer/RISCV/reordered-buildvector-scalars.ll
+156-0llvm/test/Transforms/SLPVectorizer/AArch64/long-non-power-of-2.ll
+26-44llvm/test/Transforms/SLPVectorizer/revec-shufflevector.ll
+50-14llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll
+29-33llvm/test/Transforms/SLPVectorizer/X86/parent-node-schedulable-with-multi-copyables.ll
+767-46563 files not shown
+1,118-91969 files

LLVM/project fc99b67libc/src/__support/FPUtil BasicOperations.h, libc/src/__support/math CMakeLists.txt fmaximum_mag_numbf16.h

Revert "[libc][math] Refactor fmaximum_mag_num family to header-only" (#194183)

Reverts llvm/llvm-project#182169
DeltaFile
+2-47utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+0-30libc/src/__support/math/CMakeLists.txt
+11-16libc/src/__support/FPUtil/BasicOperations.h
+0-26libc/src/__support/math/fmaximum_mag_numbf16.h
+0-25libc/src/__support/math/fmaximum_mag_numf.h
+0-25libc/src/__support/math/fmaximum_mag_num.h
+13-16911 files not shown
+34-26817 files

LLVM/project 23cc957libc/src/__support/FPUtil BasicOperations.h, libc/src/__support/math CMakeLists.txt fmaximum_mag_numbf16.h

[libc][math] Refactor fmaximum_mag_num family to header-only (#182169)

Refactors the fmaximum_mag_num math family to be header-only.

Closes https://github.com/llvm/llvm-project/issues/182168

Target Functions:
  - fmaximum_mag_num
  - fmaximum_mag_numbf16
  - fmaximum_mag_numf

---------

Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+47-2utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+30-0libc/src/__support/math/CMakeLists.txt
+16-11libc/src/__support/FPUtil/BasicOperations.h
+26-0libc/src/__support/math/fmaximum_mag_numbf16.h
+25-0libc/src/__support/math/fmaximum_mag_num.h
+25-0libc/src/__support/math/fmaximum_mag_numf.h
+169-1311 files not shown
+268-3417 files

LLVM/project bbd4d67libc/shared/math fdivf128.h, libc/src/__support/math fdivf128.h CMakeLists.txt

[libc][math] Refactor fdiv family to header-only (#182192)

Refactors the fdiv math family to be header-only.

Closes https://github.com/llvm/llvm-project/issues/182191

Target Functions:
  - fdiv
  - fdivf128
  - fdivl

---------

Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+43-2utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+31-0libc/src/__support/math/fdivf128.h
+29-0libc/shared/math/fdivf128.h
+28-0libc/src/__support/math/CMakeLists.txt
+25-0libc/src/__support/math/fdiv.h
+25-0libc/src/__support/math/fdivl.h
+181-210 files not shown
+253-1816 files

LLVM/project b5ac484llvm/include/llvm/ADT DenseMap.h

[DenseMap] Resolves asan + msvc build syntax errors (#193695)

The problem was introduced by #183457 as an asan workaround for clang
builds to silence false positives, so the fix here enables the
workaround only for clang builds.

Fixes #189323

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
DeltaFile
+4-2llvm/include/llvm/ADT/DenseMap.h
+4-21 files

LLVM/project 8eef507llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanUtils.cpp

[VPlan] Fix assert in finding WideCanIV (NFC) (#193269)

addActiveLaneMask asserts that the return value of a find_if is
contextually convertible to true when finding a WideCanonicalIV recipe.
What it should really be checking is that the iterator is not the end
iterator. Fix this assert by introducing and using a variant of
vputils::findUserOf.
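
To illustrate the kind of check involved, here is a minimal sketch with standard containers. `findOrNull` is a hypothetical helper written for this example, not the `vputils` API: it compares the `find_if` result against the end iterator and returns a pointer that is null on a miss, so asserting on the result is meaningful.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns a pointer to the first element satisfying P,
// or nullptr if none does. The end-iterator comparison is the actual
// "was it found?" test; the iterator itself carries no such information.
template <typename T, typename Pred>
const T *findOrNull(const std::vector<T> &Range, Pred P) {
  auto It = std::find_if(Range.begin(), Range.end(), P);
  return It == Range.end() ? nullptr : &*It;
}
```

A caller can then write `assert(findOrNull(Values, Pred) && "expected a match")`, which fails exactly when nothing was found.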
DeltaFile
+4-11llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+3-7llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+6-0llvm/lib/Transforms/Vectorize/VPlanUtils.h
+13-183 files

LLVM/project 28de393compiler-rt/test/profile instrprof-tmpdir.c

[Profile] Reenable instrprof-tmpdir.c (#194181)

env -u is supported by the internal shell which is now the default
everywhere.
DeltaFile
+0-5compiler-rt/test/profile/instrprof-tmpdir.c
+0-51 files

LLVM/project d62067a.github/workflows/containers/libc Dockerfile

[Github] Drop LLVM 21 installation from libc Dockerfile (#194178)

The compiler version was bumped in
8abce0a63c10124aa26a070ead80a68f705c95f9, so we no longer need to
include this. We should probably just hash pin the version in future
workflows for future toolchain upgrades.
DeltaFile
+0-3.github/workflows/containers/libc/Dockerfile
+0-31 files

LLVM/project d36e524llvm/include/llvm/CodeGen KCFI.h, llvm/lib/CodeGen KCFI.cpp

[NewPM] Adds a port for KCFI (#194163)

Standard porting w/ refactored pass logic to support old and new PMs.

Wired in to X86 pass builder.
DeltaFile
+29-11llvm/lib/CodeGen/KCFI.cpp
+31-0llvm/include/llvm/CodeGen/KCFI.h
+4-0llvm/test/CodeGen/X86/llc-pipeline-npm.ll
+2-1llvm/lib/Target/X86/X86CodeGenPassBuilder.cpp
+1-1llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1-1llvm/lib/Target/ARM/ARMTargetMachine.cpp
+68-147 files not shown
+75-1813 files

LLVM/project e19f36fllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer revec-shufflevector.ll shuffle-mask-resized.ll

Revert "[SLP]Initial support for non-power-of-2 vectorization"

This reverts commit 1348766d1d686b8825bdaa2f6638c1783d76a4a7 to fix
a crash, reported in https://github.com/llvm/llvm-project/pull/151530#pullrequestreview-4176091133

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194177
DeltaFile
+220-439llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+154-64llvm/test/Transforms/SLPVectorizer/RISCV/reordered-buildvector-scalars.ll
+44-26llvm/test/Transforms/SLPVectorizer/revec-shufflevector.ll
+14-50llvm/test/Transforms/SLPVectorizer/RISCV/reductions.ll
+33-29llvm/test/Transforms/SLPVectorizer/X86/parent-node-schedulable-with-multi-copyables.ll
+16-31llvm/test/Transforms/SLPVectorizer/shuffle-mask-resized.ll
+481-63962 files not shown
+919-95968 files

LLVM/project bb1f822.github/workflows/containers/github-action-ci Dockerfile, .github/workflows/containers/github-action-ci-windows Dockerfile

[Github] Bump CI containers to 22.1.4 (#194175)
DeltaFile
+1-1.github/workflows/containers/github-action-ci/Dockerfile
+1-1.github/workflows/containers/github-action-ci-windows/Dockerfile
+2-22 files

LLVM/project 93317d1llvm/lib/Transforms/IPO Inliner.cpp

Fix typo in comment in Inliner.cpp (#194172)
DeltaFile
+1-1llvm/lib/Transforms/IPO/Inliner.cpp
+1-11 files

LLVM/project 87673e4llvm/utils/gn/secondary/llvm/lib/CodeGen BUILD.gn

[gn build] Port c59c19bf5921 (#194167)
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/lib/CodeGen/BUILD.gn
+1-01 files

LLVM/project c34d43dllvm/utils/gn/secondary/llvm/lib/Target/RISCV BUILD.gn

[gn build] Port a693efcc40b1 (#194166)
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/lib/Target/RISCV/BUILD.gn
+1-01 files

LLVM/project 5482074llvm/utils/gn/secondary/libcxx/include BUILD.gn

[gn build] Port 2f28e1db535b (#194165)
DeltaFile
+1-0llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+1-01 files

LLVM/project 8b1d624clang/test/Driver serenity.cpp, clang/test/Driver/Inputs/empty_tree .keep

[clang] Make serenity.cpp more independent of the host (#193981)

Tests matching crt files previously relied on the host system not using
the same file paths as Serenity. This breaks on AIX, as both systems use
`/usr/lib/crt0.o`.

Redirect most tests to an empty sysroot so they match only on the
filename and remain independent of the host system. Also add a test that
verifies crt files can be found in a normal sysroot.
DeltaFile
+35-15clang/test/Driver/serenity.cpp
+0-0clang/test/Driver/Inputs/serenity_tree/usr/lib/crtendS.o
+0-0clang/test/Driver/Inputs/serenity_tree/usr/lib/crtbeginS.o
+0-0clang/test/Driver/Inputs/empty_tree/.keep
+35-154 files

LLVM/project 6de092dclang/lib/AST/ByteCode Interp.h Pointer.h, clang/test/AST/ByteCode c.c

[clang][bytecode] Fix crash involving labels and null sub (#194115)

For null pointers, getDeclDesc() may return null, so we can't call
asExpr() on it.
DeltaFile
+8-5clang/lib/AST/ByteCode/Interp.h
+6-0clang/lib/AST/ByteCode/Pointer.h
+4-0clang/test/AST/ByteCode/c.c
+18-53 files

LLVM/project 8e3fa95flang/lib/Semantics check-omp-structure.cpp check-omp-structure.h

[flang][OpenMP] Replace llvmOmpClause with llvm::omp::Clause (#194162)

Both types, llvmOmpClause (alias of const llvm::omp::Clause) and
llvm::omp::Clause are in use, let's just stick with one.
DeltaFile
+7-7flang/lib/Semantics/check-omp-structure.cpp
+3-5flang/lib/Semantics/check-omp-structure.h
+10-122 files

LLVM/project 764e10cllvm/lib/Target/AArch64 AArch64SIMDInstrOpt.cpp AArch64.h

[NewPM] Adds a port for AArch64SIMDInstrOpt (#188177)

Adds a port for AArch64SIMDInstrOpt

- Refactored to extract base logic as Impl.
- **Note**: Moved the Instruction Replacement Table and cross-function
cached maps to be members of the Impl class.
- **Note**: Updated `InstReplInfo::RC` to be a pointer rather than a
stack object, because we're putting it into MRI
[here](https://github.com/llvm/llvm-project/blob/704c60fe9110256d2698d8e56b8c44ec5d1e733f/llvm/lib/Target/AArch64/AArch64SIMDInstrOpt.cpp#L532).
- Renamed existing pass with "Legacy" suffix and updated references
- Added NewPM pass AArch64SIMDInstrOptPass
- Updated pass type to `aarch64-simd-instr-opt` (prev:
`aarch64-simdinstr-opt`)

No existing `.mir` tests to update.
DeltaFile
+138-97llvm/lib/Target/AArch64/AArch64SIMDInstrOpt.cpp
+12-1llvm/lib/Target/AArch64/AArch64.h
+1-1llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1-0llvm/lib/Target/AArch64/AArch64PassRegistry.def
+152-994 files

LLVM/project 0eae5cfllvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv setcc-int-vp.ll fixed-vectors-setcc-int-vp.ll

[RISCV] Remove codegen for vp.icmp (#193606)

Part of the work to remove trivial VP intrinsics from the RISC-V
backend, see
https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999

This splits off vp.icmp from #179622
DeltaFile
+704-882llvm/test/CodeGen/RISCV/rvv/setcc-int-vp.ll
+345-558llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-int-vp.ll
+70-70llvm/test/CodeGen/RISCV/rvv/setcc-int-vp-mask.ll
+3-120llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+40-40llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-int-vp-mask.ll
+7-11llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll
+1,169-1,6813 files not shown
+1,177-1,6899 files