InstCombine: Introduce nsz flag on minimum/maximum in SimplifyDemandedFPClass (#173898)
Alive isn't particularly happy with this in the case where
one of the inputs could be zero, but I think
it's wrong: https://alive2.llvm.org/ce/z/dF7V6k
nsz shouldn't permit introducing a -0 result where
there wasn't one in the input here.
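
A minimal sketch of the signed-zero behavior at issue, assuming the documented llvm.minimum semantics (NaN-propagating, with -0.0 ordered before +0.0). The helper name ieee_minimum is hypothetical and this is not the InstCombine code; it only illustrates that a -0 result arises solely when one of the inputs is already -0.

```python
import math

def ieee_minimum(a, b):
    """Toy model of llvm.minimum: propagates NaN and treats -0.0 < +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b == 0.0:
        # Both operands are zeros; prefer the negative one if there is one.
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b

def is_negative_zero(x):
    return x == 0.0 and math.copysign(1.0, x) < 0

# -0 appears in the result only when an input was -0.
print(is_negative_zero(ieee_minimum(0.0, -0.0)))  # an input is -0
print(is_negative_zero(ieee_minimum(0.0, 0.0)))   # no -0 input
```

Under this model, a transform that yields -0 from two +0 inputs would be introducing a value that was not present in the inputs, which is the concern the message raises.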
workflows/release-tasks: Remove the release-lit workflow (#174644)
This hasn't been working for a while, and I think we should wait until
lit is part of the llvm organization on pypi before we start trying to
automate its release again.
(cherry picked from commit a331728c7a68a08c621070b9cab5cf1f72b425e2)
[Serialization] Complete only needed partial specializations
It is unclear (to me) why this needs to be done "for safety", but
this change significantly improves the effectiveness of lazy loading.
Reviewed as part of https://github.com/llvm/llvm-project/pull/133057
Reapply "[libc++] Optimize std::find_if" (#175903) (#175921)
#175913 removed the implication that `__builtin_assume_dereferenceable(ptr, 0)`
means `ptr != nullptr`, which should allow us to use the builtin with LLVM 23.
This reverts commit 776c09c212e945fdceeae240b42c38df3dd34727.
[FMV][AArch64] Release notes for clang/llvm 22
Clang (support level upgraded to Release in ACLE)
- Resolver functions can use the PAC and BTI hardening settings.
- Users can override function version priority.
- Unreachable function versions are diagnosed and ignored.
LLVM (bug fix and improvements in IPO/GlobalOpt)
- Fixed static resolution of indirect calls to versioned functions,
by separating unrelated caller versions which were mixed together.
- Improved the accuracy of the algorithm for low version counts.
[clang-repl] Use more precise search to find the orc runtime. (#175805)
The new mechanism relies on the path in the toolchain which should be
the authoritative answer. This patch tweaks the discovery of the orc
runtime from unittests where the resource directory is hard to deduce.
Should address the issue raised in #175435 and #175322
(cherry picked from commit 84c19e7cf303a0525fd6c7bf5d03053714402c91)
InstCombine: Introduce nsz flag on minimum/maximum in SimplifyDemandedFPClass
Alive isn't particularly happy with this in the case where
one of the inputs could be zero, but I think
it's wrong: https://alive2.llvm.org/ce/z/dF7V6k
nsz shouldn't permit introducing a -0 result where
there wasn't one in the input here.
[flang][openacc] support array section privatization in lowering (#175184)
Add support for array sections in private, firstprivate, and reduction clauses.
Key changes:
- Change the related data operation result type to return the same type
as the array base (same type as the acc variable input in the
operation), while it was the type of the section before. This allows
remapping the base to the result value (to use the data operation result
as the base when generating addressing inside the compute region).
- The generatePrivateInit implementation of FIROpenACCTypeInterfaces is
modified to allocate storage only for the section, and to return the
mock base address (that is the address of the allocation minus the
offset/lower bound of the privatized section).
- The code generating the copy and combiner regions is moved from
OpenACC.cpp to FIROpenACCTypeInterfaces.cpp via the addition of new
generateCopy and generateCombiner interfaces in the
MappableTypeInterface. This allows sharing all the addressing helpers
with generatePrivateInit, and will allow late generation of all recipes
[7 lines not shown]
[Linalg] Update Conv Decomposition patterns to work with generic convolution ops as well (#174196)
-- This commit updates Conv Decomposition patterns to work with both
named as well as generic convolution ops.
-- Since a "generic" LinalgOp is now used as the root op in the patterns
above, the `assert` of the op implementing a ConvolutionOpInterface has
been replaced with an early exit `if`.
Signed-off-by: Abhishek Varma <abhvarma at amd.com>
[RISCV] Add tests for rv32 gather/scatter costs. NFC
There's a divergence with the rv32 costs that I plan on fixing in
another patch, so this precommits the tests for them.
The zve32f RUN lines were split off into another file so the check prefixes
are easier to reason about.
The -riscv-v-vector-bits-max RUN lines were also removed to simplify the
check prefixes since I'm not sure if they were intentionally testing any
specific logic.
[Serialization] Hash inner template arguments
The code is applied from ODRHash::AddDecl with the reasoning given
in the comment, to reduce collisions. This was particularly visible
with STL types templated on std::pair where its template arguments
were not taken into account.
Reviewed as part of https://github.com/llvm/llvm-project/pull/133057
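
A toy illustration of why hashing inner template arguments reduces collisions. This is a sketch under stated assumptions, not the Clang serialization code: types are modeled as hypothetical (name, args) tuples, and shallow_key, deep_key, and digest are names invented for this example.

```python
import hashlib

def shallow_key(ty):
    """Key built from the outer template and the names of its arguments only."""
    name, args = ty
    return name + "<" + ",".join(arg[0] for arg in args) + ">"

def deep_key(ty):
    """Key built by recursing into every nested template argument."""
    name, args = ty
    if not args:
        return name
    return name + "<" + ",".join(deep_key(arg) for arg in args) + ">"

def digest(key):
    return hashlib.sha256(key.encode()).hexdigest()[:16]

INT, FLOAT = ("int", ()), ("float", ())
vec_a = ("std::vector", (("std::pair", (INT, INT)),))
vec_b = ("std::vector", (("std::pair", (INT, FLOAT)),))

# The shallow keys are identical, so the two specializations collide.
print(shallow_key(vec_a), shallow_key(vec_b))
# The deep keys differ because the std::pair arguments are included.
print(deep_key(vec_a), deep_key(vec_b))
```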
[ORC][MachO] Clean up incompatible-arch-in-object error message. (#176092)
Add missing spaces to error messages, use Triple::getArchName (gives
canonical arch name on Darwin, e.g. "arm64" rather than "aarch64").
No testcase for this one: the change is cosmetic, and the error message
format is not relied upon anywhere.
[profcheck] Reorder the FileCheck substitution.
In the profcheck build, FileCheck commands are substituted with cat > /dev/null to disable output verification. In test/Transforms/SampleProfile/remarks-hotness.ll we have both "FileCheck" and "not FileCheck" statements. Replacing the positive one first results in "not cat".
https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/SampleProfile/remarks-hotness.ll#L18
Run the "not FileCheck" substitution first to fix this.
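
The ordering problem can be reproduced with a small sketch. It assumes substitutions are applied in list order as plain string replacements, which simplifies how the test runner handles them; the substitute helper and the sample RUN line are hypothetical.

```python
def substitute(line, rules):
    """Apply (pattern, replacement) pairs to a RUN line in list order."""
    for pattern, replacement in rules:
        line = line.replace(pattern, replacement)
    return line

SINK = "cat > /dev/null"
positive_first = [("FileCheck", SINK), ("not FileCheck", SINK)]
negative_first = [("not FileCheck", SINK), ("FileCheck", SINK)]

run_line = "opt -S %s 2>&1 | not FileCheck %s"

# The positive rule consumes "FileCheck" and leaves a stray "not" behind.
print(substitute(run_line, positive_first))
# Handling "not FileCheck" first replaces the whole command.
print(substitute(run_line, negative_first))
```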
[SelectionDAG] Fix zext assertion check for scalable vectors (#176064)
Use element type comparisons in getZeroExtendInReg to avoid comparing
scalable and fixed types.
Fixes #176037
[MC][NFC] Use appendLEB128 helper in MCDwarf.cpp (#175962)
This is a very minor simplification of the logic. We made a similar
change to RISC-V in #173198.
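
For readers unfamiliar with the encoding, here is a minimal sketch of unsigned LEB128 as used in DWARF. It is a standalone illustration in Python, not the LLVM helper named above, and the function name encode_uleb128 is an assumption made for this example.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F      # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)         # last byte: high bit clear
            return bytes(out)

print(encode_uleb128(624485).hex())
```

Each output byte carries 7 payload bits, least significant group first, with the high bit marking that more bytes follow.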
[CIR] Upstream support for coroutine co_yield expression (#173162)
This PR upstreams support for the co_yield expression by emitting a
cir.await op with the yield kind.
[DebugInfo] Only generate template parameters in the skeleton CU for a template function/type with simplified name (3/3) (#175879)
Currently, when generating debug info for skeleton units, all template
parameters are emitted unconditionally. To reduce debug info size,
emission should be conditional: parameters are provided only for template
types/functions whose names have actually been simplified, as described
in [this
RFC](https://discourse.llvm.org/t/rfc-debuginfo-selectively-generate-template-parameters-in-the-skeleton-cu/89395).
Previous patches: #175130, #175708