LLVM/project 2da01fcllvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-exp.ll

ValueTracking: Handle amdgcn_exp2 in computeKnownFPClass

The base exp handling looks pretty incomplete.
DeltaFile
+24-24llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-exp.ll
+6-1llvm/lib/Analysis/ValueTracking.cpp
+30-252 files

LLVM/project 47281a6llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-exp.ll

ValueTracking: Teach computeKnownFPClass about exp(-inf)

If the source cannot be -inf, the result cannot be +0.
DeltaFile
+66-66llvm/test/Transforms/Attributor/nofpclass-exp.ll
+24-24llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-exp.ll
+3-0llvm/lib/Analysis/ValueTracking.cpp
+93-903 files

LLVM/project 51361e6llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-exp.ll

ValueTracking: Add baseline test for amdgcn_exp2 handling
DeltaFile
+130-0llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-exp.ll
+130-01 files

LLVM/project 2895e79llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-log.ll

ValueTracking: Handle amdgcn_log in computeKnownFPClass
DeltaFile
+48-48llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-log.ll
+7-4llvm/lib/Analysis/ValueTracking.cpp
+55-522 files

LLVM/project a01da49llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-log.ll

ValueTracking: Add baseline test for nofpclass handling of amdgcn_log
DeltaFile
+277-0llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-log.ll
+277-01 files

LLVM/project 8d0fabeflang/lib/Evaluate intrinsics.cpp, flang/test/Semantics bessel_jn.f90 num_images01.f90

[flang] Improve intrinsic error messages when multiple signatures exist (#172099)

When an intrinsic has multiple table entries (e.g., BESSEL_JN has both
elemental and transformational forms), the error message selection logic
now prefers specific argument errors over generic "too many actual
arguments" messages.

For example, given:

    real, dimension(10) :: xarray
    integer :: n1, n2
    print *, bessel_jn(n1, n2, xarray)

The transformational form of BESSEL_JN(n1, n2, x) requires x to be
scalar. Previously, flang reported "too many actual arguments" from the
elemental form (which only accepts 2 args), even though the
transformational form matched the argument count but failed on rank.
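
A minimal sketch of the selection idea described above, not the actual code in flang/lib/Evaluate/intrinsics.cpp: the `MatchAttempt` struct and the `pickDiagnostic` function are hypothetical names introduced only for illustration. Each table entry is tried in turn; if none matches, a specific argument error is preferred over a generic arity error.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record of trying one table entry of an intrinsic.
struct MatchAttempt {
  bool matched;         // the call fits this signature
  bool genericError;    // failure is generic, e.g. "too many actual arguments"
  std::string message;  // diagnostic text when the attempt failed
};

// Returns the diagnostic to report, or an empty string when some entry matched.
// Assumption: a specific (non-generic) error is always the more useful one.
std::string pickDiagnostic(const std::vector<MatchAttempt> &attempts) {
  const MatchAttempt *specific = nullptr;
  const MatchAttempt *generic = nullptr;
  for (const MatchAttempt &a : attempts) {
    if (a.matched)
      return "";  // a matching signature means there is nothing to report
    if (a.genericError) {
      if (!generic)
        generic = &a;
    } else if (!specific) {
      specific = &a;
    }
  }
  if (specific)
    return specific->message;
  return generic ? generic->message : "";
}

// Self-check mirroring the BESSEL_JN case from the commit message.
inline void demoPickDiagnostic() {
  std::vector<MatchAttempt> attempts = {
      {false, true, "too many actual arguments"},             // elemental form
      {false, false, "x= argument has unacceptable rank 1"}}; // transformational form
  assert(pickDiagnostic(attempts) == "x= argument has unacceptable rank 1");
}
```

The sketch remembers the first error of each kind, so the order of table entries only matters among errors of the same kind.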

Now flang correctly reports "x= argument has unacceptable rank 1", which

    [4 lines not shown]
DeltaFile
+25-0flang/test/Semantics/bessel_jn.f90
+16-2flang/lib/Evaluate/intrinsics.cpp
+1-2flang/test/Semantics/num_images01.f90
+1-1flang/test/Semantics/num_images02.f90
+43-54 files

LLVM/project 329426bllvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-rcp.ll

AMDGPU: Handle amdgcn_rcp in computeKnownFPClass
DeltaFile
+78-78llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rcp.ll
+33-0llvm/lib/Analysis/ValueTracking.cpp
+111-782 files

LLVM/project 1963a8ellvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-rcp.ll

ValueTracking: Add baseline test for amdgcn_rcp handling
DeltaFile
+430-0llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rcp.ll
+430-01 files

LLVM/project 5925210clang/include/clang/AST OpenMPClause.h, clang/lib/CIR/CodeGen CIRGenOpenMPClause.cpp CIRGenStmtOpenMP.cpp

[OpenMP][CIR] Implement basic 'parallel' lowering + some clause infra (#172308)

This patch adds basic lowering for the OMP 'parallel' construct: it
emits an omp.parallel operation and tries to insert into its region,
which ends with an omp.terminator operation. However, this patch doesn't
implement CapturedStmt (and I don't intend to do so at all), so there is
an NYI error when emitting a parallel region (plus it shows up in IR
as 'empty').

This patch also adds some infrastructure to 'lower' clauses; however, no
clauses are emitted, and this simply adds a 'not yet implemented'
warning any time a clause is attempted. The OMP clause visitor seems to
have had a bug with how it 'degraded' when a clause wasn't handled (it
would result in an infinite recursion if it wasn't supplied), so this
fixes that as well.

A followup patch or two may use this infrastructure to demonstrate how
to use it.
DeltaFile
+63-0clang/lib/CIR/CodeGen/CIRGenOpenMPClause.cpp
+50-0clang/test/CIR/CodeGenOpenMP/parallel.c
+29-2clang/lib/CIR/CodeGen/CIRGenStmtOpenMP.cpp
+4-2clang/include/clang/AST/OpenMPClause.h
+4-1clang/test/CIR/CodeGenOpenMP/not-yet-implemented.c
+4-0clang/lib/CIR/CodeGen/CIRGenFunction.h
+154-51 files not shown
+155-57 files

LLVM/project 49f6979clang/lib/CIR/CodeGen CIRGenStmtOpenMP.cpp, clang/test/CIR/CodeGenOpenMP barrier.c

[OpenMP][CIR] Implement 'barrier' lowering (#172305)

As my next patch showing how OMP lowering should work, here is a simple
construct implementation. Best I can tell, the 'barrier' construct just
results in an omp.barrier being emitted into the IR. This is our first
use of the omp dialect, though the dialect was already added in my last
patch.
DeltaFile
+14-0clang/test/CIR/CodeGenOpenMP/barrier.c
+3-2clang/lib/CIR/CodeGen/CIRGenStmtOpenMP.cpp
+17-22 files

LLVM/project 516dd2bllvm/docs ReleaseNotes.md, llvm/test/tools/llvm-objdump mattr-mcpu-help.test mattr-mcpu-triple-help.test

[llvm-objdump] Support --mcpu=help/--mattr=help without -d (#165661)

Currently `--mcpu=help` and `--mattr=help` only produce help output when
disassembling. This patch specialises these cases to always print the
requested help.

If `--triple` is specified, the help text will be derived from the
specified target. Otherwise, it will be derived from the target of the
first input file.

Fixes: #150567

---------

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Co-authored-by: James Henderson <James.Henderson at sony.com>
DeltaFile
+97-4llvm/test/tools/llvm-objdump/mattr-mcpu-help.test
+72-10llvm/tools/llvm-objdump/llvm-objdump.cpp
+21-0llvm/test/tools/llvm-objdump/mattr-mcpu-triple-help.test
+3-0llvm/docs/ReleaseNotes.md
+193-144 files

LLVM/project 18e9b48libcxx/include tuple, libcxx/test/libcxx/utilities/tuple nodiscard.verify.cpp

[libc++][tuple] Applied `[[nodiscard]]` (#172008)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/tuple
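
As a small illustration of the attribute itself (not of the libc++ change), the sketch below marks a hypothetical helper, `div_mod`, as `[[nodiscard]]`. The helper is an assumption made up for this example; only `std::tuple` and `std::get` come from the standard library.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper, not part of libc++: returns (quotient, remainder).
// Ignoring the result of a pure function like this is almost certainly a bug,
// which is the situation [[nodiscard]] is meant to flag.
[[nodiscard]] inline std::tuple<int, int> div_mod(int a, int b) {
  assert(b != 0 && "division by zero");
  return std::make_tuple(a / b, a % b);
}

// Using the result is fine and produces no diagnostic.
inline int remainder_of(int a, int b) { return std::get<1>(div_mod(a, b)); }

// Writing the call as a bare statement, e.g.
//   div_mod(7, 2);
// discards the result; compilers typically warn about that for a
// [[nodiscard]] function.
```

With the attribute applied to the `<tuple>` functions, a discarded call such as a bare `std::get<0>(t);` statement would be expected to draw the same kind of warning.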
DeltaFile
+66-0libcxx/test/libcxx/utilities/tuple/nodiscard.verify.cpp
+19-14libcxx/include/tuple
+4-2libcxx/test/std/utilities/tuple/tuple.tuple/tuple.cnstr/recursion_depth.pass.cpp
+89-163 files

LLVM/project 3ae16fellvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU rsq.f32-safe.ll amdgpu-codegenprepare-fdiv.ll

AMDGPU: Stop requiring afn for f32 rsq formation

We were checking for afn or !fpmath attached to the sqrt. We
are not trying to replace a correctly rounded rsqrt; we're replacing
the two correctly rounded operations with the contracted operation.
The net result is better precision, so contract on both instructions
should be sufficient. Both the contracted and uncontracted sequences pass
the OpenCL conformance test, with a lower maximum error for the
contracted sequence.
DeltaFile
+504-1,529llvm/test/CodeGen/AMDGPU/rsq.f32-safe.ll
+52-45llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fdiv.ll
+6-25llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+562-1,5993 files

LLVM/project 23f967aclang/test/Driver/print-enabled-extensions aarch64-c1-pro.c aarch64-c1-ultra.c, llvm/lib/Target/AArch64 AArch64Processors.td

[AArch64] Add support for C1 CPUs (#171124)

This patch adds initial support for the Arm v9.3 C1 processors:
* C1-Nano
* C1-Pro
* C1-Premium
* C1-Ultra

For more information on each, see:
https://developer.arm.com/Processors/C1-Nano
https://developer.arm.com/Processors/C1-Pro
https://developer.arm.com/Processors/C1-Premium
https://developer.arm.com/Processors/C1-Ultra

Technical Reference Manual for C1-Nano:
https://developer.arm.com/documentation/107753/latest/

Technical Reference Manual for C1-Pro:
https://developer.arm.com/documentation/107771/latest/

    [5 lines not shown]
DeltaFile
+111-0llvm/lib/Target/AArch64/AArch64Processors.td
+71-0clang/test/Driver/print-enabled-extensions/aarch64-c1-pro.c
+71-0clang/test/Driver/print-enabled-extensions/aarch64-c1-ultra.c
+71-0clang/test/Driver/print-enabled-extensions/aarch64-c1-premium.c
+69-0clang/test/Driver/print-enabled-extensions/aarch64-c1-nano.c
+12-0llvm/unittests/TargetParser/Host.cpp
+405-07 files not shown
+437-113 files

LLVM/project ff0362ellvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU rsq.f64.ll amdgpu-codegenprepare-fdiv.f64.ll

X not Y0
DeltaFile
+144-144llvm/test/CodeGen/AMDGPU/rsq.f64.ll
+9-9llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fdiv.f64.ll
+1-1llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+154-1543 files

LLVM/project 2f9ed9dllvm/lib/Target/AArch64 AArch64SVEInstrInfo.td, llvm/test/CodeGen/AArch64 sve-nontemporal-ldst.ll nontemporal-load.ll

[AArch64][SVE] Select non-temporal instructions for unpredicated loads/stores with the nontemporal flag (#171261)

Add patterns to select SVE non-temporal load/store instructions for
unpredicated vector loads/stores with the `nontemporal` flag.
Previously, regular instructions were used for these cases.

Fixes #169034
DeltaFile
+256-0llvm/test/CodeGen/AArch64/sve-nontemporal-ldst.ll
+30-8llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+11-10llvm/test/CodeGen/AArch64/nontemporal-load.ll
+20-0llvm/test/CodeGen/AArch64/sve-nontemporal-masked-ldst.ll
+317-184 files

LLVM/project d4e9355lld/MachO LTO.cpp Driver.cpp, lld/test/MachO lto-emit-llvm.ll

[lld][MachO] Add --lto-emit-llvm command line option

This option will cause the linker to emit LLVM bitcode instead of an
object file. The implementation is similar to that of the corresponding
option in the ELF backend. This only works with LLD and will not work
with the gold plugin.
DeltaFile
+18-0lld/test/MachO/lto-emit-llvm.ll
+10-0lld/MachO/LTO.cpp
+5-4lld/MachO/Driver.cpp
+2-0lld/MachO/Options.td
+1-0lld/MachO/Config.h
+36-45 files

LLVM/project 3adab49offload/plugins-nextgen/common/src PluginInterface.cpp

Address comments
DeltaFile
+4-4offload/plugins-nextgen/common/src/PluginInterface.cpp
+4-41 files

LLVM/project de0cb77libcxx/docs Contributing.rst, libcxx/utils/ci run-buildbot-container

[libc++] Store the premerge runner images in the monorepo (#171443)

We need one canonical place to store the image used by the various sets
of libc++ CI runners. This is needed so that our run-buildbot-container
script can stay up-to-date, and for the pre-merge infrastructure to stay
up-to-date.

Previously, the images used by the premerge infrastructure were stored
in llvm-zorg, which made them less discoverable and more complicated to
update and keep synchronized.
DeltaFile
+10-30libcxx/docs/Contributing.rst
+3-2libcxx/utils/ci/run-buildbot-container
+1-0libcxx/utils/ci/images/libcxx_runners.txt
+1-0libcxx/utils/ci/images/libcxx_next_runners.txt
+1-0libcxx/utils/ci/images/libcxx_release_runners.txt
+16-325 files

LLVM/project ecbb444mlir/include/mlir/Dialect/AMDGPU/IR AMDGPU.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir][amdgpu] Add tensor load store operations (#170918)

* removes unused code.
* lowers tensor load and store operations.
DeltaFile
+50-7mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+31-0mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
+18-0mlir/test/Conversion/AMDGPUToROCDL/gfx1250.mlir
+99-73 files

LLVM/project 116a4ecllvm/test/CodeGen/AArch64 sme2-intrinsics-bfscale.ll

fixup! [AArch64][llvm] Add intrinsics for SVE BFSCALE

Also test llvm.aarch64.sve.fscale.nxv8bf16() intrinsic
in streaming mode as well as non-streaming
DeltaFile
+12-0llvm/test/CodeGen/AArch64/sme2-intrinsics-bfscale.ll
+12-01 files

LLVM/project 2b55e4cclang/include/clang/Lex TextEncodingConfig.h LiteralConverter.h, clang/lib/Lex LiteralConverter.cpp TextEncodingConfig.cpp

rename LiteralConverter to TextEncodingConfig
DeltaFile
+0-60clang/lib/Lex/LiteralConverter.cpp
+59-0clang/lib/Lex/TextEncodingConfig.cpp
+40-0clang/include/clang/Lex/TextEncodingConfig.h
+0-40clang/include/clang/Lex/LiteralConverter.h
+5-5clang/lib/Lex/LiteralSupport.cpp
+5-5clang/include/clang/Lex/LiteralSupport.h
+109-1103 files not shown
+116-1179 files

LLVM/project bbbba9cclang/docs ReleaseNotes.rst, clang/lib/Sema SemaExprCXX.cpp

[CLANG] Fixes the crash on the use of nested requirements in require expressions (#169876)

Fixes #165386
Nested requirements in requires-expressions must be constant
expressions. Previously, using an invented parameter in a nested
requirement caused a crash. Now emit a clear diagnostic and recover.
DeltaFile
+13-0clang/test/SemaCXX/requires-nested-non-constant.cpp
+2-0clang/lib/Sema/SemaExprCXX.cpp
+1-0clang/docs/ReleaseNotes.rst
+16-03 files

LLVM/project e9575f0clang/test/CodeGen/AArch64/sve-intrinsics acle_sve_bfscale.c, llvm/lib/Target/AArch64 AArch64InstrInfo.td

fixup! [AArch64][llvm] Add intrinsics for SVE BFSCALE

Change `HasSVEBFSCALE` to be correct. This now requires
+sve-bfscale in sve-intrinsics-fp-arith.ll, and ensures
it is lowered to `bfscale` correctly.

Also run in both streaming and non-streaming mode in acle_sve_bfscale.c
DeltaFile
+7-7llvm/test/CodeGen/AArch64/sve-intrinsics-fp-arith.ll
+1-1llvm/lib/Target/AArch64/AArch64InstrInfo.td
+2-0clang/test/CodeGen/AArch64/sve-intrinsics/acle_sve_bfscale.c
+10-83 files

LLVM/project e89734eclang/lib/Headers avx512fp16intrin.h avx512fintrin.h, clang/test/CodeGen/X86 avx512f-builtins.c avx512fp16-builtins.c

[Headers][X86] Allow vector bitcast intrinsics to be used in constexpr (#167180)

Fixes #156348
DeltaFile
+18-18clang/lib/Headers/avx512fp16intrin.h
+25-2clang/test/CodeGen/X86/avx512f-builtins.c
+10-10clang/lib/Headers/avx512fintrin.h
+18-0clang/test/CodeGen/X86/avx512fp16-builtins.c
+71-304 files

LLVM/project dc69c41clang/lib/CodeGen CGObjCGNU.cpp

[CGObjCGNU] Set isSigned for negative value
DeltaFile
+6-4clang/lib/CodeGen/CGObjCGNU.cpp
+6-41 files

LLVM/project 9e11754clang/lib/CodeGen CGHLSLRuntime.cpp

[CGHLSLRuntime] Use getSigned() for total array size

This may be -1 for incomplete array types.
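
The reason signedness matters for a value like -1 can be sketched in plain C++, without any LLVM API. This is an analogy only: `widen_unsigned` and `widen_signed` are hypothetical helpers, and the sketch makes no claim about how `getSigned()` is implemented.

```cpp
#include <cassert>
#include <cstdint>

// Treat the 32-bit pattern as unsigned and zero-extend it to 64 bits.
constexpr std::uint64_t widen_unsigned(std::int32_t v) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(v));
}

// Treat the value as signed and sign-extend it to 64 bits.
constexpr std::int64_t widen_signed(std::int32_t v) {
  return static_cast<std::int64_t>(v);
}

// Self-check: the same bit pattern yields two very different wide values.
inline void demoWiden() {
  assert(widen_unsigned(-1) == 4294967295ULL);  // 0x00000000FFFFFFFF
  assert(widen_signed(-1) == -1);               // 0xFFFFFFFFFFFFFFFF
}
```

A constant that can legitimately be negative therefore needs to be created through the signed path so that it keeps its meaning at the destination width.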
DeltaFile
+2-2clang/lib/CodeGen/CGHLSLRuntime.cpp
+2-21 files

LLVM/project 58ee3ecclang/lib/CodeGen ItaniumCXXABI.cpp

[ItaniumCXXABI] Use getSigned() for signed offset
DeltaFile
+1-1clang/lib/CodeGen/ItaniumCXXABI.cpp
+1-11 files

LLVM/project f3025d1clang/lib/CodeGen CGBuilder.h

[CGBuilder] Use getSigned() for CharUnits

CharUnits holds a signed quantity.
DeltaFile
+1-1clang/lib/CodeGen/CGBuilder.h
+1-11 files

LLVM/project aa7a95cflang/lib/Semantics check-omp-loop.cpp

[flang][OpenMP] Make function name more accurate, NFC (#172328)

Change `CountGeneratedLoops` to `CountGeneratedNests`, since it's really
the nests that are counted.
DeltaFile
+8-8flang/lib/Semantics/check-omp-loop.cpp
+8-81 files