[libc++] Fix std::for_each(associative-container) not using std::invoke and projections (#171984)
#164405 added specializations of `for_each` that didn't do the ranges
call shenanigans, but instead just did what the classic algorithms have
to do. This updates the calls to work for the ranges overloads as well.
[clang][NVPTX] Add support for mixed-precision FP arithmetic (#168359)
This change adds support for mixed precision floating point
arithmetic for `f16` and `bf16` where the following patterns:
```
%fh = fpext half %h to float
%resfh = fp-operation(%fh, ...)
...
%fb = fpext bfloat %b to float
%resfb = fp-operation(%fb, ...)
where the fp-operation can be any of:
- fadd
- fsub
- llvm.fma.f32
- llvm.nvvm.add(/fma).*
```
are lowered to the corresponding mixed precision instructions which
combine the conversion and operation into one instruction from
[18 lines not shown]
[VPlan] Directly unroll VectorPointerRecipe (#168886)
In an effort to get rid of VPUnrollPartAccessor and directly unroll
recipes, start by directly unrolling VectorPointerRecipe, allowing for
VPlan-based simplifications and simplification of the corresponding
execute.
[libc++] Remove unused __parent_pointer alias from __tree and map (#172185)
The `__parent_pointer` type alias was marked to be removed in
d163ab3323495560eb0255ac807da2bf24d3c629.
At that time, <map> still had uses of `__parent_pointer` as a local
variable type in operator[] and at().
Those uses were removed in 4a2dd31f16d60b65a46696a909efad5c11b18c19,
which refactored `__find_equal` to return a pair instead of using an out
parameter.
However, the typedef in <map> and the alias in __tree were left behind.
This patch removes the unused typedef from <map> and the
`__parent_pointer` alias from __tree.
Signed-off-by: Krechals <topala.andrei at gmail.com>
AMDGPU/GlobalISel: Regbanklegalize for G_CONCAT_VECTORS (#171471)
RegBankLegalize G_CONCAT_VECTORS using the trivial mapping helper, which
assigns the same reg bank (vgpr or sgpr) to all operands.
Uncovers multiple codegen and regbank combiner regressions related to
looking through sgpr to vgpr copies.
Skip regbankselect-concat-vector.mir since agprs are not yet supported.
[mlir:bazel] Fix missing dependency introduced in #171727. (#172267)
That PR added an include to `LLVMOps.td` without adding a target
providing that file. Curiously, this does not break the official builds
but it *does* break my bazel build.
Signed-off-by: Ingo Müller <ingomueller at google.com>
llvm: Export IndexedCodeGenDataLazyLoading (#169563)
This is needed so that the llvm-cgdata tool builds properly with
`LLVM_BUILD_LLVM_DYLIB`, which allows LLVM to be built as a DLL on Windows.
This effort is tracked in #109483.
Add .gitignore file in .cache/clangd/index (#170003)
This solves a common issue where users have to manually add the
`.cache/clangd/index/` folder to their `.gitignore`. I got this idea
from [ruff](https://github.com/astral-sh/ruff), which creates
`.ruff_cache/.gitignore`. It would greatly improve the user
experience for everyone without requiring per-computer configuration
and without any significant cost.
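A minimal sketch of the idea, assuming a self-ignoring cache directory is produced by writing a `.gitignore` containing `*` into it. The helper name `ensure_ignored_cache_dir` and the exact file contents are assumptions made for illustration; this is not the code clangd uses.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates `dir` (and its parents) if needed and drops a `.gitignore` that
// ignores everything inside it. Returns true if the file is in place.
inline bool ensure_ignored_cache_dir(const fs::path &dir) {
  std::error_code ec;
  fs::create_directories(dir, ec);
  if (ec)
    return false;
  const fs::path ignore = dir / ".gitignore";
  if (fs::exists(ignore, ec))
    return true; // never overwrite a file the user may have edited
  const std::string contents = "# Created automatically (sketch).\n*\n";
  std::ofstream out(ignore);
  out << contents;
  out.flush();
  return out.good();
}
```

Because the pattern lives inside the cache directory itself, no change to the repository's top-level `.gitignore` is needed.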
[Clang] Recompute the value category when rebuilding SubstNonTypeTemplateParmExpr (#172251)
In concept checking, we need to transform SubstNTTPExpr when evaluating
constraints.
The value category is initially computed during parameter mapping,
possibly with a dependent expression. However, during instantiation, it
wasn't recomputed, and the stale category was propagated into parent
expressions. So we may end up with an 'out-of-thin-air' reference type,
which breaks the evaluation.
We now call BuildSubstNonTypeTemplateParmExpr in TreeTransform, in which
the value category is recomputed.
The issue was introduced by both 078e99e and the concept normalization
patch, which are not released yet, so no release note.
Fixes https://github.com/llvm/llvm-project/issues/170856
[AArch64] Support lowering smaller than legal LOOP_DEP_MASKs to whilewr/rw (#171982)
This adds support for lowering smaller-than-legal masks such as:
```
<vscale x 8 x i1> @llvm.loop.dependence.war.mask.nxv8i1(ptr %a, ptr %b, i64 1)
```
to a whilewr + unpack. It also slightly simplifies the lowering.
[InstSimplify] Support ptrtoaddr in simplifyICmpInst() (#171985)
This is basically the same change as #162653, but for InstSimplify
instead of ConstantFolding.
It folds `icmp (ptrtoaddr x, ptrtoaddr y)` to `icmp (x, y)` and `icmp
(ptrtoaddr x, C)` to `icmp (x, inttoptr C)`.
The fold is restricted to the case where the result type is the address
type, as icmp only compares the address bits. As in the other PR, I think
in practice all the folds are also going to work if the ptrtoint result
type is larger than the address size, but it's unclear how to justify
this in general.
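As a source-level analogy only (this is C++, not LLVM IR, and not code from the patch), the first fold says that comparing two addresses gives the same answer as comparing the pointers themselves. The sketch assumes a flat address space where converting a pointer to `std::uintptr_t` yields its address; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// "Before" form: compare the integer addresses of the two pointers.
inline bool eq_via_address(const int *x, const int *y) {
  return reinterpret_cast<std::uintptr_t>(x) ==
         reinterpret_cast<std::uintptr_t>(y);
}

// "After" form: compare the pointers directly.
inline bool eq_direct(const int *x, const int *y) { return x == y; }
```

On such a target both functions agree for every pair of pointers, which is why the conversion can be dropped from the comparison.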
[clang][bytecode][NFC] Add Block::getBlockDesc<T>() (#172218)
It returns the block-level descriptor, so we don't have to do
the reinterpret_cast dance everywhere.
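A rough sketch of the pattern, using a hypothetical stand-in for `Block` with raw inline storage. The `GlobalMeta` type, the `emplaceBlockDesc` helper, and the storage layout are assumptions for illustration and do not mirror the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical block-level descriptor; the real descriptor types differ.
struct GlobalMeta {
  int id;
  bool initialized;
};

// Hypothetical stand-in for the interpreter's Block class.
class Block {
public:
  // Constructs a block-level descriptor in the raw storage.
  template <typename T, typename... Args> T *emplaceBlockDesc(Args &&...args) {
    static_assert(sizeof(T) <= sizeof(Storage), "descriptor too large");
    static_assert(alignof(T) <= alignof(std::max_align_t), "over-aligned");
    return ::new (static_cast<void *>(Storage)) T{std::forward<Args>(args)...};
  }

  // The single place where the cast lives; callers just name the type.
  template <typename T> T *getBlockDesc() {
    return std::launder(reinterpret_cast<T *>(Storage));
  }

private:
  alignas(std::max_align_t) std::byte Storage[32];
};
```

Call sites then read `B.getBlockDesc<GlobalMeta>()` instead of repeating the cast at every use.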
Reapply "[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST (#170323)" (#171838)
A buildbot failed for the original patch.
https://github.com/llvm/llvm-project/pull/171835 addresses the issue
raised by the buildbot.
After the fix is merged, the original patch is reapplied without any
change.
[AArch64] Add a performBICiCombine function.
This moves the code out of PerformDAGCombine into its own function, changing
the return value to SDValue(N, 0) to match other uses of SimplifyDemandedBits.
[RISCV] Custom legalize i32 saddo/ssubo on RV64 to return a sign extended value for the data result. (#172112)
This is consistent with how we handle regular ADD/SUB and helps with
computeNumSignBits optimizations.
Fixes #172089
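To illustrate what a sign-extended data result means here, a small sketch that models a 32-bit signed add-with-overflow on a 64-bit machine. The struct and function names are hypothetical, and this is one way to compute the result, not necessarily the instruction sequence the backend emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the 32-bit sum held sign-extended in a 64-bit
// register, plus the overflow flag.
struct SAddOResult {
  std::int64_t value;
  bool overflow;
};

inline SAddOResult saddo_i32_on_rv64(std::int32_t a, std::int32_t b) {
  // A 64-bit add of sign-extended 32-bit operands cannot overflow.
  const std::int64_t wide =
      static_cast<std::int64_t>(a) + static_cast<std::int64_t>(b);
  // Keep the low 32 bits and sign-extend them back to 64 bits.
  const std::int64_t narrow = static_cast<std::int64_t>(
      static_cast<std::int32_t>(static_cast<std::uint32_t>(wide)));
  // Overflow happened exactly when the two values disagree.
  return {narrow, wide != narrow};
}
```

Since `value` is always a sign-extended 32-bit quantity, later analyses can assume its upper 33 bits are copies of the sign bit.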
[orc-rt] Prevent RTTIExtends from being used for errors. (#172250)
Custom error types (ErrorInfoBase subclasses) should use ErrorExtends as
of 8f51da369e6. Adding a static_assert allows us to enforce that at
compile-time.
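A simplified sketch of how such a compile-time check can look. The class bodies below are stand-ins written for illustration (the real orc-rt classes carry type ids and more), and the exact condition and message used by the patch are assumptions.

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-ins for the RTTI root and the error base class.
struct RTTIRoot {
  virtual ~RTTIRoot() = default;
};
struct ErrorInfoBase : RTTIRoot {};

// General-purpose RTTI helper: refuses parents that are error types.
template <typename ThisT, typename ParentT> struct RTTIExtends : ParentT {
  static_assert(!std::is_base_of_v<ErrorInfoBase, ParentT>,
                "error types must use ErrorExtends, not RTTIExtends");
};

// Error-specific helper: the only accepted way to derive an error type.
template <typename ThisT, typename ParentT> struct ErrorExtends : ParentT {
  static_assert(std::is_base_of_v<ErrorInfoBase, ParentT>,
                "ErrorExtends is only for ErrorInfoBase subclasses");
};

struct Plain : RTTIExtends<Plain, RTTIRoot> {};
struct MyError : ErrorExtends<MyError, ErrorInfoBase> {};
// struct BadError : RTTIExtends<BadError, ErrorInfoBase> {}; // rejected

template <typename T>
inline constexpr bool is_error_type_v = std::is_base_of_v<ErrorInfoBase, T>;
```

Uncommenting `BadError` fails at compile time with the message from the first `static_assert`.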
[CIR] Rename allEnumCasesCovered to all_enum_cases_covered (#172153)
Use the conventional snake_case for MLIR assembly and align with the
operation documentation, which already mentions the snake_case attribute.