[InstCombine][VectorCombine] Move bitcast vp.load fold into VectorCombine (#200321)
Fixes https://github.com/llvm/llvm-project/issues/199896
In #192173 we started folding bitcasts of vp.loads with an all-ones mask
into a vp.load of the cast type. However, on RISC-V a vp.load of an
i1 vector is illegal (since there's no masked variant of `vlm.v`), and
we have no way of checking this in InstCombine.
This moves the fold into VectorCombine so we can query TTI to check
whether the fold is legal (and profitable).
As a side note, it may be possible to lower a vp.load of an i1 vector on
RISC-V to `vlm.v` **only** if the mask is all ones. But this means the
lowering would only be valid for certain values, which is difficult to
cost. And I'm not sure if it would be profitable anyway.
[flang][mlir] Add flang to mlir lowering for dyn_groupprivate (#180938)
This PR implements the Flang frontend lowering for the
`dyn_groupprivate` clause.
Changes:
- Add ClauseProcessor handling for DynGroupprivate clause
- Generate appropriate MLIR representation for dyn_groupprivate
- Add/update test cases for dyn_groupprivate lowering
- Remove TODO marker for dyn_groupprivate clause
[AsmParser] Apply deferred debug locations before intrinsic upgrade. (#200779)
Intrinsic upgrades may delete instructions, leaving dangling pointers
that may be accessed when applying deferred debug locations after
91b77dc (#200649).
Fix by applying deferred debug locations before intrinsic upgrade.
PR: https://github.com/llvm/llvm-project/pull/200779
[mlir][bufferization] Implement e2e IR transformation for static memory planner
This adds the complete transformation pass that converts multiple
memref.alloc/dealloc pairs into a single arena with subviews.
The offset assignment is intentionally simple (just sequential); this
establishes the e2e pipeline so we can add smarter bin-packing later.
Tests verify arena sizing, sequential offsets, and that dynamic shapes
or missing deallocations are correctly skipped.
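As a rough illustration of sequential offset assignment, the sketch below
packs statically sized buffers one after another into a single arena.
`BufferRequest`, `ArenaPlan`, `alignTo` and `planSequential` are
hypothetical names used only for this example, and the alignment
handling is an assumption; the commit message only says the offsets are
sequential.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one statically sized buffer.
struct BufferRequest {
  std::uint64_t sizeInBytes;
  std::uint64_t alignment; // assumed to be a non-zero power of two
};

// Hypothetical result: one offset per request plus the arena size.
struct ArenaPlan {
  std::vector<std::uint64_t> offsets;
  std::uint64_t totalSize = 0;
};

// Round `value` up to the next multiple of `alignment`.
inline std::uint64_t alignTo(std::uint64_t value, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Place each buffer directly after the previous one (no reuse, no
// bin-packing), padding only as needed to satisfy alignment.
inline ArenaPlan planSequential(const std::vector<BufferRequest> &requests) {
  ArenaPlan plan;
  for (const BufferRequest &r : requests) {
    std::uint64_t offset = alignTo(plan.totalSize, r.alignment);
    plan.offsets.push_back(offset);
    plan.totalSize = offset + r.sizeInBytes;
  }
  return plan;
}
```

Each offset would then become the start of one subview into the arena.
A smarter planner could reuse space for buffers whose lifetimes do not
overlap, which this sketch deliberately does not attempt.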
[DA] Add test for the Exact test misses dependency due to overflow (NFC) (#200780)
This patch adds a test case that demonstrates that the Exact test misses
the dependency due to mishandling of overflow. The test case is taken
from #200766.
[AArch64] Lower scalable i64 CLMUL with SVE2/SME (#198999)
When AES or SSVE-AES are not available, but SVE2 or SME are,
clmul.nxv2i64 can benefit from a cross-byte CLMUL of .S precision. This
re-uses the functionality added for nxv8i16.
[Support] Take ArrayRef in convertWideToUTF8 (#200687)
`convertWideToUTF8` took a `std::wstring`, but it never modified its
data. Either an `ArrayRef` or a `std::wstring_view` is sufficient here.
I chose `ArrayRef<wchar_t>` over `std::wstring_view`, because it can be
implicitly constructed from any range that provides `data()` and
`size()`. A second overload taking a `const wchar_t *` is provided to
convert null-terminated wide C-strings.
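To illustrate why a view constructible from any `data()`/`size()` range
is convenient, here is a minimal sketch. `WideView` and `countNonAscii`
are hypothetical stand-ins written for this example; they are not
`llvm::ArrayRef` or `convertWideToUTF8`, and the real signatures may
differ.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical non-owning view over wide characters.
class WideView {
public:
  WideView(const wchar_t *Ptr, std::size_t Len) : Data(Ptr), Size(Len) {}

  // Implicitly accept any range exposing data() and size().
  template <typename Range>
  WideView(const Range &R) : Data(R.data()), Size(R.size()) {}

  const wchar_t *data() const { return Data; }
  std::size_t size() const { return Size; }

private:
  const wchar_t *Data;
  std::size_t Size;
};

// Read-only consumer: counts code units outside the ASCII range.
inline std::size_t countNonAscii(WideView V) {
  std::size_t Count = 0;
  for (std::size_t I = 0; I < V.size(); ++I)
    if (static_cast<unsigned long>(V.data()[I]) > 0x7F)
      ++Count;
  return Count;
}

// Second overload for null-terminated wide C-strings.
inline std::size_t countNonAscii(const wchar_t *S) {
  return countNonAscii(WideView(S, std::wcslen(S)));
}
```

With this shape, callers holding a `std::wstring`, a
`std::vector<wchar_t>`, or a wide string literal can all call the same
function without copying into a `std::wstring` first.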
[Liveness][analyzer] Fix handling of [[assume]] attributes (#198618)
Before this commit, if the analyzer encountered code like
```
int f(int a, int b) {
  [[assume(a == 2), assume(b == 3)]];
  return a + b;
}
```
it performed the following steps:
1. It visited the expression `a == 2` with `ExprEngine::Visit` (after
visiting its sub-expressions, within the regular visitation that visits
each statement of the `CFGBlock`). This triggered the `EagerlyAssume`
logic and separated two execution paths.
2. It discarded the result bound to `a == 2` from the `Environment`
because `a == 2` is not a direct child of the `AttributedStmt`.
3. Analogously, it visited and evaluated `b == 3`.
4. Analogously, it discarded the result bound to `b == 3`.
5. On each execution path `VisitAttributedStmt` was reached, it ran the
[32 lines not shown]
[libsycl] Add single_task (#192499)
Depends on liboffload PR:
https://github.com/llvm/llvm-project/pull/194333.
The approach with void sycl_kernel_launch(pack of arguments) implies
that we can use or copy arguments only during that call. Since it passes
only kernel arguments as parameters and returns void, we have to split
the setting of extra kernel data (such as event dependencies and range)
and the retrieval of the result event from argument handling and direct
kernel submission, where that is possible. Key stages: 1) passing
dependency events and range (for parallel_for) to the queue (or handler
in future) and saving them in the queue (copy/move). 2) wrapping kernel
arguments into typeless wrappers (pointer based, initially
[39 lines not shown]
[LifetimeSafety] Add support for lifetime capture_by (#196884)
This PR implements support for the `[[clang::lifetime_capture_by(X)]]`
attribute within the lifetime-safety analysis.
The PR introduces a new helper in `FactGenerator.cpp` called
`handleLifetimeCaptureBy` which detects
`[[clang::lifetime_capture_by(X)]]` on parameters. If detected, the
analyzer now generates an `OriginFlowFact` ensuring that captured
dependencies are added to the capturer's state. The PR supports
capture_by params and `this` and currently doesn't implement attributes
on function declarations.
Example:
Integrate `[[clang::lifetimebound]]`: This existing Clang annotation is
crucial for specifying that the lifetime of a function's output is tied
to one of its inputs.
```cpp
[60 lines not shown]
[IR] Handle nofree noalias in canBeFreed() (#200194)
Based on the argument nofree semantics specified in
https://github.com/llvm/llvm-project/pull/195658, we can conclude that
an argument with both nofree and noalias cannot be freed.
This also handles the case of readonly + noalias, to be consistent with
the logic for functions (and because we had a FIXME for it...)
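The rule described above can be modelled as a small predicate. This is a
toy sketch only: `ArgAttrs` and `argumentCanBeFreed` are hypothetical
names, and the real `canBeFreed()` takes more conditions into account
than the two attribute combinations mentioned in this commit message.

```cpp
// Hypothetical summary of the attributes on one pointer argument.
struct ArgAttrs {
  bool NoFree = false;
  bool NoAlias = false;
  bool ReadOnly = false;
};

// Returns false when the attribute combinations named in the commit
// message rule out a free of the underlying object; otherwise it stays
// conservative and returns true.
inline bool argumentCanBeFreed(const ArgAttrs &A) {
  // nofree + noalias, or readonly + noalias: cannot be freed.
  if (A.NoAlias && (A.NoFree || A.ReadOnly))
    return false;
  return true;
}
```

Note that in this model neither `nofree` nor `readonly` is sufficient
without `noalias`.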
[lldb] Reduce size of Mangled class (#200181)
The Mangled class is used in several places in LLDB, most notably as a
direct member of Symbol. This makes this class one of the most
frequent long-lived allocations in LLDB.
In commit a2672250be871bdac18c1a955265a98704434218, this class got a
(large) cache that stores information about demangled data. This cache
is stored in a std::optional member, which means the memory for the
class is allocated within our Mangled object. It should be noted that
this cache is only used when we actually demangle the name, which
doesn't happen for every mangled name we encounter.
The additional cache member caused the size of Mangled to grow from
16B to 152B by default (that is, even if the Mangled name was never
demangled).
This patch replaces the std::optional with a unique_ptr which stores the
cache on first use in a separate heap allocation. This change decreases
the amount of allocated memory when debugging a relatively small
Objective-C project from 1.57GiB to 1.18GiB (-400MiB).
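The size trade-off can be illustrated with a minimal sketch.
`DemangledCache`, `InlineCached` and `LazyCached` are hypothetical
stand-ins rather than LLDB's actual types, and their sizes are not meant
to match the 16B and 152B figures quoted above.

```cpp
#include <array>
#include <memory>
#include <optional>

// Hypothetical stand-in for a large demangling cache.
struct DemangledCache {
  std::array<char, 128> payload{};
};

// std::optional stores the cache inline, so every object pays for it
// even when the cache is never populated.
struct InlineCached {
  const char *mangled = nullptr;
  std::optional<DemangledCache> cache;
};

// std::unique_ptr costs one pointer per object; the cache itself is
// heap-allocated on first use only.
struct LazyCached {
  const char *mangled = nullptr;
  std::unique_ptr<DemangledCache> cache;

  DemangledCache &getCache() {
    if (!cache)
      cache = std::make_unique<DemangledCache>();
    return *cache;
  }
};

static_assert(sizeof(InlineCached) >= sizeof(DemangledCache),
              "optional embeds the cache in the object");
static_assert(sizeof(LazyCached) < sizeof(InlineCached),
              "pointer-based variant is smaller");
```

The cost of the pointer-based variant is an extra heap allocation and
indirection for names that do get demangled, which pays off when most
objects never populate the cache.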
[clang-tools-extra][docs] Convert maintainers file to Markdown (#200365)
Following the way clang does it.
* Moved files to .md (done in #200769).
* Reformatted into Markdown.
* Changed the stub file docs/Maintainers.rst into docs/Maintainers.md
and used a myst directive for the include.
* In the config file, added myst parser and ".md" as a recognised file
extension.
After this change, all maintainers files in llvm-project will be in
Markdown format.
[clang-tools-extra] Move maintainer files to .md files (#200769)
No formatting changes are included. This will break the docs build, but
a follow-up (#200365) will fix the formatting and so on.