[AMDGPU] Add `.amdgpu.info` section for per-function metadata
AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.
This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format. Each entry is encoded as:
```
[kind: u8] [len: u8] [payload: <len> bytes]
```
A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
[4 lines not shown]
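As a rough illustration of the entry layout described above, the following Python sketch walks a byte string of `[kind: u8] [len: u8] [payload]` entries. It is not the LLVM implementation: the function name `iter_entries` and the numeric kind values are hypothetical, and the payload is left uninterpreted because the message does not give the encoding of each kind.

```python
from typing import Iterator, Tuple


def iter_entries(section: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (kind, payload) pairs from a [kind: u8][len: u8][payload] stream."""
    offset = 0
    while offset < len(section):
        # Every entry starts with a two-byte header: kind, then payload length.
        if offset + 2 > len(section):
            raise ValueError(f"truncated entry header at offset {offset}")
        kind = section[offset]
        length = section[offset + 1]
        start = offset + 2
        end = start + length
        if end > len(section):
            raise ValueError(f"truncated payload at offset {offset}")
        yield kind, section[start:end]
        offset = end


# Hypothetical kind values, chosen only for this illustration.
KIND_A = 1
KIND_B = 2
sample = bytes([KIND_A, 4, 0x10, 0x00, 0x00, 0x00, KIND_B, 1, 0x20])
entries = list(iter_entries(sample))
```

Because the length is a single byte, a reader of this layout can skip entry kinds it does not understand without knowing their payload format.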
[flang][OpenMP] Get final label from nested constructs (#192517)
Non-block DO loops can share termination statements. When parsing a
non-block DO loop, account for labels on terminating statements from
recursively parsed ExecutionPartConstructs.
Fixes https://github.com/llvm/llvm-project/issues/188892
[SSAF][WPA] Add no-op PointerFlow and UnsafeBufferUsage analysis
We need no-op PointerFlow and UnsafeBufferUsage analyses for the
analysis that depends on their summary data.
Refactored PointerFlow and UnsafeBufferUsage serialization for code
sharing.
rdar://174874942
[flang][OpenMP] Move ALLOCATE + privatize check to semantic checks (#192792)
Move the check from symbol resolution to semantic checks.
The check now seems to be more accurate, catching some cases that were
not detected before.
[clang] implement CWG2064: ignore value dependence for decltype
The 'decltype' of a value-dependent (but non-type-dependent) expression should be known,
so this patch makes such decltypes non-opaque instead.
This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.
This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.
Fixes #8740
Fixes #61818
Fixes #190388
[SLP][REVEC] Honor slot type when computing NumberOfParts
The getNumberOfParts() helper split VecTy without considering that a
REVEC slot is a FixedVectorType, so NumParts could fall on a non-slot
boundary.
Add an explicit ScalarTy argument, require (Sz / NumParts) to be a
multiple of getNumElements(ScalarTy), and use ScalarTy for the
hasFullVectorsOrPowerOf2 check. For non-REVEC callers ScalarSz == 1 and
behavior is unchanged.
Fixes #192963.
Reviewers:
Pull Request: https://github.com/llvm/llvm-project/pull/193085
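The slot-boundary requirement can be sketched in a few lines of Python. This models only the divisibility condition stated above, not the actual SLP vectorizer code; the function and parameter names are hypothetical, with `slot_elements` standing in for getNumElements(ScalarTy).

```python
def parts_respect_slots(num_elements: int, num_parts: int, slot_elements: int) -> bool:
    """Return True if splitting num_elements into num_parts keeps whole slots together.

    slot_elements is 1 for an ordinary scalar and the vector width for a
    REVEC slot (an assumption made for this illustration).
    """
    if num_parts <= 0 or num_elements % num_parts != 0:
        return False
    per_part = num_elements // num_parts
    # Each part must hold a whole number of slots.
    return per_part % slot_elements == 0


# Two 4-element slots (8 elements total):
two_parts_ok = parts_respect_slots(8, 2, 4)   # each part is exactly one slot
four_parts_ok = parts_respect_slots(8, 4, 4)  # each part would be half a slot
scalar_ok = parts_respect_slots(8, 4, 1)      # non-REVEC case: slot size 1
```

With a slot size of 1 the extra condition is always satisfied, which matches the statement that non-REVEC callers see no change.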
geom manuals: Clarify units
The gpart manual says that sizes are specified in blocks, unless an SI
unit suffix is provided. This confuses new operators because GEOM uses
binary bytes, a large difference at modern storage pool sizes. Rewrite
suffixes in all GEOM manuals to consistently clarify this, matching what
we and the rest of the industry have been doing in other documentation.
While here, use non-breaking spaces between numbers and units, unless
they are already written with a hyphen.
MFC after: 3 days
Reviewed by: fuz
Reported by: bbaovanc <bbaovanc at bbaovanc.com>
Differential Revision: https://reviews.freebsd.org/D56534
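To show how far the two readings of a suffix drift apart, here is a small Python sketch comparing decimal (SI) multipliers with binary ones. The suffix tables and the helper name are assumptions made for this illustration; they are not taken from the gpart sources or the manual text.

```python
# Illustrative multiplier tables; not gpart's actual parser.
DECIMAL = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def excess_bytes(count: int, suffix: str) -> int:
    """Bytes by which the binary reading exceeds the decimal reading."""
    return count * (BINARY[suffix] - DECIMAL[suffix])


def excess_percent(suffix: str) -> float:
    """Relative size of the gap, as a percentage of the decimal reading."""
    return (BINARY[suffix] / DECIMAL[suffix] - 1) * 100


gap_1t = excess_bytes(1, "T")
for suffix in DECIMAL:
    print(f"1{suffix}: binary reading is {excess_percent(suffix):.2f}% larger")
```

The gap grows with each suffix step, which is why the ambiguity matters more for large pools than it did for small disks.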
fusefs: better handling for low-memory conditions
Under conditions of low memory, getblk can fail. fusefs was not
handling those failures very systematically. It was always using
PCATCH, which appears to have been originally copy/pasted from the NFS
client code, but isn't always appropriate:
* During fuse_vnode_setsize_immediate, which can be called from many
different VOPs and from the vn_delayed_setsize mechanism, remove
PCATCH. Some of these callers cannot tolerate allocation failure.
* In fuse_inval_buf_range, don't assume that getblk will always succeed.
* When calling fuse_inval_buf_range from VOP_ALLOCATE,
VOP_COPY_FILE_RANGE, or VOP_WRITE (with IO_DIRECT), return EINTR if
the allocation fails.
* When calling fuse_inval_buf_range from VOP_DEALLOCATE, remove PCATCH.
This VOP must not fail with EINTR.
[7 lines not shown]
include/stdbit.h: declare size_t, (u)int*_t, and (u)int_least*_t
These are required by ISO/IEC 9899:2024 § 7.18.1 ¶ 1 but were forgotten
in my initial work.
The current approach leaks intptr_t, uintptr_t, intmax_t, and uintmax_t
through <sys/_stdint.h>. This could be avoided using a more complicated
approach if desired.
PR: 294131
Fixes: 6296500a85c8474e3ff3fe2f8e4a9d56dd0acd64
Reported by: Collin Funk <collin.funk1 at gmail.com>
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D56515
[Extractor] Use function return for the one and only output (#191824)
Currently the code extractor passes outputs through parameters.
Alloca/store/load instructions are used to get the output value in the
parent function.
When the extracted code has only one output (one of
the most common cases), returning that output as the function's return
value can facilitate other transformations (e.g., tail call optimization).
This change makes the extracted function return the
output value when the extracted region has exactly one output.
[CodeGen] Inline trivial TargetLoweringBase::getCmpLibcallReturnType method. NFC (#192483)
This trivial method makes more sense in the header (indeed, both the
existing overrides for it are already in headers).
I noticed this while hunting for the default value while reviewing
#192425
[DebugInfo] Remove unused argument of DataExtractor constructor (NFC) (#191968)
`AddressSize` parameter is not used by `DataExtractor` and will be
removed in the future. See #190519 for more context.
As a drive-by change, use the constructor accepting ArrayRef where it
allows removing extra casts.
[lldb] Remove unused argument of DataExtractor constructor (NFC) (#191876)
`AddressSize` parameter is not used by `DataExtractor` and will be
removed in the future. See #190519 for more context.
This also removes two of the four related methods:
```
DataExtractor::GetAsLLVM()
DataExtractor::GetAsLLVMDWARF() - removed as unused
DWARFDataExtractor::GetAsLLVM() - removed as redundant, it hid the equivalent method of DataExtractor
DWARFDataExtractor::GetAsLLVMDWARF()
```
That is, now we have:
```
DataExtractor::GetAsLLVM()
DWARFDataExtractor::GetAsLLVMDWARF()
```
flux2: Update to 2.8.5
Changes:
Flux v2.8 comes with Helm v4 support, bringing server-side apply and
enhanced health checking to Helm releases. Big thanks to the Helm
maintainers for their work on Helm v4 and for collaborating with us to
ensure a smooth integration with Flux!
In this release, we have also introduced several new features to the
Flux controllers:
- Reduced the mean time to recovery of Flux-managed applications
- Readiness evaluation of Helm-managed objects with CEL expressions
- ArtifactGenerator support for extracting and modifying Helm charts
- Support for commenting on Pull Requests directly from Flux notifications
- Custom SSA apply stages for ordering resource application in
kustomize-controller
- Automatic GitHub App installation ID lookup from the repository owner
- Support for Cosign v3 for verifying OCI artifacts and container images
[7 lines not shown]
[SPIRV] Add StorageImageExtendedFormats to available list for Vulkan (#192512)
Some image formats used in OpTypeImage require the
StorageImageExtendedFormats capability. When it is not in the available
list, it is not emitted, causing invalid SPIR-V.
The solution is to add it to the list. It is available for all versions
of Vulkan.
Fixes #192486
[SelectOptimize] Update Missing PSI Error Message, Add Test (#193034)
Update the error message to omit the trailing period and start with a
lowercase letter, per the coding standards. Also add a test as suggested
in post-commit feedback on #192871.
---------
Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
[clang][SSAF] Add missing `explicit` to single-argument constructors (#193052)
This PR adds `explicit` to `TUSummary`,
`UnsafeBufferUsageTUSummaryExtractor`, and
`UnsafeBufferUsageEntitySummary` constructors. This ensures uniform use
of `explicit` for all SSAF single-argument constructors.