AMDGPU: Perform libcall recognition to replace fast OpenCL pow
If a float-typed call site is marked with afn, replace the 4
flavors of pow with a faster variant.
This transforms pow, powr, pown, and rootn to __pow_fast,
__powr_fast, __pown_fast, and __rootn_fast if available. Also
attempts to handle all of the same basic folds on the new fast
variants that were already performed with the base forms. This
preserves OpenCL optimizations once the device-libs unsafe math
control library is deleted. It keeps the status quo of how libcalls
work and handles only 4 new entry points. It only helps with
eliminating the control library, not with general libcall emission
problems.
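The renaming rule described above can be sketched as a small lookup. This is an illustrative Python model, not the pass's actual C++ code: `fast_variant`, its parameters, and the `available` set are hypothetical names, and only the four base/fast name pairs are taken from this commit message.

```python
# Illustrative model only; the real transform lives in the AMDGPU libcall pass.
# The four name pairs come from the commit message, everything else is assumed.
FAST_VARIANTS = {
    "pow": "__pow_fast",
    "powr": "__powr_fast",
    "pown": "__pown_fast",
    "rootn": "__rootn_fast",
}


def fast_variant(name, is_float, has_afn, available):
    """Return the replacement callee name, or None to leave the call alone."""
    # Only float-typed call sites marked afn are candidates.
    if not (is_float and has_afn):
        return None
    fast = FAST_VARIANTS.get(name)
    # Replace only if the fast entry point is actually available.
    if fast is not None and fast in available:
        return fast
    return None
```

For example, `fast_variant("pow", True, True, {"__pow_fast"})` yields `"__pow_fast"`, while the same call without afn leaves the call site untouched.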
This makes no practical difference for HIP, which is the status
quo for libcall optimizations. AMDGPULibCalls recognizes the OpenCL
mangled names. e.g., OpenCL float "pow" is really _Z3powff but the
HIP provided function "powf" is really named _ZL4powfff, and std::pow
[6 lines not shown]
Add tiering API
This commit modifies the TrueNAS API to wrap around the tiering design
in the following ways:
A new namespace zfs.tier. will be added. This contains global
configuration for systemwide tiering settings. Parameters include
- enabled: whether to enable tiering. This feature requires changes
to global ZFS behavior and we will have various internal checks
that check this value in datastore extend context methods.
- max_concurrent_jobs: the maximum number of concurrent rewrite
jobs (tier migrations for existing data).
- min_available_space: available-space threshold for a dataset at
which tier migrations will error out.
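The three global parameters above could be modeled roughly as follows. This is a sketch under stated assumptions, not the actual TrueNAS schema: the class name `TierConfig`, the field types, the defaults, the byte unit, and the direction of the space check are all hypothetical; only the three parameter names come from this commit message.

```python
from dataclasses import dataclass


@dataclass
class TierConfig:
    # Field names follow the commit message; types and defaults are assumed.
    enabled: bool = False
    max_concurrent_jobs: int = 1
    min_available_space: int = 0  # assumed to be a byte count

    def validate(self):
        """Reject obviously inconsistent settings (assumed constraints)."""
        if self.max_concurrent_jobs < 1:
            raise ValueError("max_concurrent_jobs must be at least 1")
        if self.min_available_space < 0:
            raise ValueError("min_available_space must not be negative")
        return self


def may_migrate(config, dataset_available):
    """Assumed check: refuse migration when tiering is disabled or the
    dataset's available space has dropped under the configured minimum."""
    return config.enabled and dataset_available >= config.min_available_space
```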
The namespace will also support APIs for managing and querying
[9 lines not shown]
math/wide-integer: New port: Generic C++ template for extended width unsigned/signed integral types
Wide-integer implements a generic C++ template for extended width
unsigned and signed integral types.
This C++ template header-only library implements drop-in big integer
types such as uint128_t, uint256_t, uint384_t, uint512_t, uint1024_t,
uint1536_t, etc.
These can be used essentially like regular built-in integers.
Corresponding signed integer types such as int128_t, int256_t, and the
like can also be used.
Required for net-p2p/transmission 4.1.0.
PR: 292846
Co-authored-by: Vladimir Druzenko <vvd at FreeBSD.org>
devel/small: New port: C++ small containers
C++ standard template library optimized small containers.
Required by net-p2p/transmission 4.1.0.
PR: 292846
Co-authored-by: Vladimir Druzenko <vvd at FreeBSD.org>
lang/gcc12: Fix patch for Darwin aarch64
The 4 previously added patches have been made on top of gcc-12.5.0.diff.
For other architectures, these patches are broken. This commit fixes
this issue by merging the 4 patches to gcc-12.5.0.diff.
[emacs] Rework tablegen mode
This commit reworks tablegen-mode to be derived from prog-mode and
removes a lot of the manual work that define-derived-mode does for you
these days, along with fixing other lints (such as an over-long
summary).
This is a major version bump because td-decorators-face has been
renamed to tablegen-decorators-face in order to not pollute other
namespaces.
[NFC][emacs] Fix emacs lints in the LLVM and MLIR modes
This mainly involved explicitly declaring minimum emacs versions for
setq-local and adding a lexical-binding annotation.
The commit also removes some workarounds from the MLIR mode for Emacs
23 (!).
textproc/krep: [NEW PORT] High-performance string search utility
krep is an optimized string search utility designed for maximum throughput and
efficiency when processing large files and directories. It is built with
performance in mind, offering multiple search algorithms and SIMD acceleration
when available.
Note: Krep is not intended to be a full replacement or direct competitor to
feature-rich tools like grep or ripgrep. Instead, it aims to be a minimal,
efficient, and pragmatic tool focused on speed and simplicity.
Krep provides the essential features needed for fast searching, without the
extensive options and complexity of more comprehensive search utilities. Its
design philosophy is to deliver the fastest possible search for the most common
use cases, with a clean and minimal interface.
WWW: https://github.com/davidesantangelo/krep/
Approved by: db@, yuri@ (Mentors, implicit)
Differential Revision: https://reviews.freebsd.org/D55357
[flang][OpenMP] Fix versioning for implicit linear clause (#181791)
The versioning of the implicit linear clause was set at OpenMP 4.5.
However, versions v5.0 and v5.2 also allow implicit linearisation, which
was missed earlier. This PR fixes this.
OpenMP v5.0 (2.19.1.1) : "_The loop iteration variable in the associated
do-loop of a simd construct with just one associated do-loop is linear
with a linear-step that is the increment of the associated do-loop_."
OpenMP v5.2 (5.1.1) : "_The loop iteration variable in the associated
loop of a simd construct with just one associated loop is linear with a
linear-step that is the increment of the associated loop_"
OpenMP v6.0 (7.1.1) : "_The loop-iteration variable in any affected loop
of a loop or simd construct is lastprivate_."
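The version rule implied by the quotes above can be summarized in a short sketch. This is an illustrative Python model, not flang's implementation: the function name is hypothetical, the integer encoding of versions (45, 50, 52, 60) is an assumption, and versions not named in this message (such as 5.1) are deliberately left undecided.

```python
# Versions for which this commit message says the simd loop iteration
# variable is implicitly linear. The integer encoding is an assumption.
IMPLICIT_LINEAR_VERSIONS = {45, 50, 52}


def implicit_simd_loop_var_attr(omp_version):
    """Return the implicit data-sharing attribute for the loop iteration
    variable of a simd construct with one associated loop, or None when
    the commit message does not say."""
    if omp_version in IMPLICIT_LINEAR_VERSIONS:
        return "linear"
    if omp_version == 60:
        # Per the quoted 6.0 text, the variable is lastprivate instead.
        return "lastprivate"
    return None
```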
Fixes: https://github.com/llvm/llvm-project/issues/179345
AMDGPU: Libcall expand fast pow/powr/pown/rootn for float case (#180553)
This is to eliminate the special case global unsafe math options
in these functions from the library. The core operation only
uses about 4 instructions, and then there's an additional prolog
and/or epilog to fixup special cases.
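The commit message does not spell out the core operation. A common way to build a fast pow is the identity x^y = 2^(y * log2(x)), and the sketch below assumes that form. The function name is hypothetical, Python computes in double rather than single precision, and none of the prolog/epilog special-case handling is modeled.

```python
import math


def pow_fast_core(x, y):
    """Assumed core of a fast pow: exp2(y * log2(x)).

    Only meaningful for finite x > 0. Zero, negative, infinite, and NaN
    inputs are the special cases a prolog/epilog would have to fix up.
    """
    return 2.0 ** (y * math.log2(x))
```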
I have an alternative patch which implements this by using separate
entrypoints in the library, and having the pass replace the calls
instead of this full handling. However, given the unfortunate state
of library development, it requires a full year to make cross project
changes. This is the most expedient path to deleting the control
library. In the future, we can do libcall emission once the compiler
has a real ability to properly emit new calls.
This is mostly a direct port of these functions:
https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/ocml/src/powF_base.h
[23 lines not shown]
mvc: BaseListField: shared implementation of $internalStaticOptionList, proof of concept for https://github.com/opnsense/core/issues/9816
Wrap static access in protected functions, which ensures the content is static per inherited class:
hasStaticOptions()
getStaticOptions()
setStaticOptions(array)
resetStaticOptions()
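The idea of one shared store that stays separate per inherited class can be sketched as below. This is a Python analogy, not the OPNsense PHP code: the snake_case method names mirror the four functions listed above, the leading underscore stands in for "protected", and the two subclasses are hypothetical.

```python
class BaseListField:
    # One shared store keyed by the concrete class, so each subclass
    # only ever sees its own cached option list.
    _internal_static_option_list = {}

    @classmethod
    def _has_static_options(cls):
        return cls in BaseListField._internal_static_option_list

    @classmethod
    def _get_static_options(cls):
        return BaseListField._internal_static_option_list.get(cls, {})

    @classmethod
    def _set_static_options(cls, options):
        BaseListField._internal_static_option_list[cls] = options

    @classmethod
    def _reset_static_options(cls):
        BaseListField._internal_static_option_list.pop(cls, None)


# Hypothetical subclasses, used only to show the isolation.
class CountryField(BaseListField):
    pass


class ProtocolField(BaseListField):
    pass
```

Setting options on `CountryField` leaves `ProtocolField` empty, which is the per-class behavior the wrappers are meant to guarantee.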