[mlir][acc] Rewrite acc routine bind calls inside gpu.func (#204220)
Run `acc-bind-routine` on `FunctionOpInterface` and rewrite calls to
bound symbols in offload regions and `gpu.func`. For string bind names,
declare private functions in the enclosing `gpu.module` symbol table
when the call is inside device code.
Reapply "[Dexter] Add ability to rewrite scripts to fill-in unknown values" (#206034)
Reverts llvm/llvm-project#205657
The original commit was causing pre-merge CI to fail for AArch64, as one
of the tests expects stepping behaviour that is not seen on
AArch64 targets; the test suite containing the failing test is meant to
be configured to not run for AArch64, but the unsupported label was not
being applied, due to an error in the unsupported check. This patch
fixes the unsupported check in scripts/lit.local.cfg, which should
prevent further errors.
AMDGPU: Use -mtriple= instead of space-separated -mtriple in llc RUN lines (#206067)
-mtriple=amdgcn is by far the dominant form over space separation.
Convert these to simplify future bulk test updates.
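
A minimal sketch of this kind of conversion, assuming a line-by-line rewrite restricted to lit RUN lines. The helper name `normalize_run_line` is hypothetical and this is not the script used for the commit:

```python
import re

# Matches "-mtriple" followed by whitespace and then a value.
# The "-mtriple=" form has no whitespace after the flag, so it is left alone.
_SPACED_MTRIPLE = re.compile(r"-mtriple[ \t]+(?=\S)")


def normalize_run_line(line: str) -> str:
    """Rewrite '-mtriple <value>' as '-mtriple=<value>' on RUN lines only."""
    if "RUN:" not in line:
        return line
    return _SPACED_MTRIPLE.sub("-mtriple=", line)
```

For example, `normalize_run_line("; RUN: llc -mtriple amdgcn < %s")` yields `; RUN: llc -mtriple=amdgcn < %s`, while CHECK lines and lines already in the `=` form pass through unchanged.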
[CIR] Honor Direct coercion offset in callconv
A Direct classification with a coerced type assumed the coerced value
started at byte 0 of the original aggregate. On x86-64 SysV a 16-byte
record whose low eightbyte is NO_CLASS carries its live value in the high
eightbyte and is classified as getDirect(coerceType, offset=8); the
coercion path read and wrote the wrong eightbyte for that shape.
Add a directOffset to ArgClassification (with a getDirect(coerced, offset)
overload). emitCoercionToMemory now applies the offset to the coerced
(scalar) side of the slot via a u8 ptr_stride before the typed view, so the
aggregate side stays at offset 0 while the scalar is read from / written to
the right bytes. The offset is threaded through both emitCoercion overloads,
insertReturnCoercion, and the call-site and entry-block Direct arms. Offset
0 takes the original plain-bitcast path and is byte-identical to before.
The Test target parser gains an optional direct_offset key so cir-opt can
inject this classification; coerce-direct-offset.cir covers the offset-8
return and argument plus an offset-0 negative case.
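
The effect of the offset can be illustrated outside of CIR. The following is only a byte-level sketch in Python, not the lowering code: a 16-byte slot stands in for the aggregate, and the helper names `read_coerced` and `write_coerced` are hypothetical.

```python
import struct


def write_coerced(slot: bytearray, fmt: str, value, direct_offset: int = 0) -> None:
    """Store the coerced scalar at direct_offset bytes into the aggregate slot."""
    struct.pack_into(fmt, slot, direct_offset, value)


def read_coerced(slot: bytes, fmt: str, direct_offset: int = 0):
    """Load the coerced scalar from direct_offset bytes into the aggregate slot."""
    return struct.unpack_from(fmt, slot, direct_offset)[0]


# A 16-byte record whose live value sits in the high eightbyte.
slot = bytearray(16)
write_coerced(slot, "<d", 1.5, direct_offset=8)
```

Reading the slot back with `direct_offset=8` returns the stored value, whereas reading at offset 0 sees only the untouched low eightbyte. That mismatch is the wrong-eightbyte behaviour described above.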
ports-mgmt/pkg-devel: update to 2.7.99.3 (will become 2.8.0)
Changes:
- db: switch the local database to WAL journal mode with synchronous=NORMAL for better read concurrency and faster writes
- db: huge optimisation of the local DB: drop 15 redundant single-column indexes, add a flavors VIEW and shlib_id indexes
- db: open read-only databases with SQLITE_OPEN_READONLY, falling back to immutable=1 when WAL sidecar files are inaccessible
- binary repo: use 16K pages and synchronous=OFF during bulk catalog updates to speed up pkg update
- pkgdb: convert implicit SQL-89 JOINs to explicit JOIN syntax and optimize several queries
- osvf: update to the official OSVf JSON schema 1.7.5, parse CVE names, add osv_type for VuXML version-checking compatibility
- install: add -X as an alias for --register-only
- plist: support @for ... @end loops and # comments
- compression: respect DECOMPRESSION_THREADS when decompressing
- repo: support storing arbitrary data in the repository meta file
- multiple bug fixes
- some refactoring for consistency of the code
- refactor: add SPDX license identifier tags to source files
- plug additional memory leaks and fix resource leaks
- update sqlite to 3.53.2
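
For readers unfamiliar with the SQLite settings named in the db entries above, here is a rough sketch using Python's standard `sqlite3` module. pkg itself is written in C, so this is not its code; the function names and the probe query are assumptions made for illustration.

```python
import sqlite3


def open_local_db(path: str):
    """Open a read-write database in WAL journal mode with synchronous=NORMAL."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn, mode


def open_read_only(path: str):
    """Open read-only; retry with immutable=1 if the first read fails."""
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        # Probe read: reading a WAL database needs access to its sidecar files.
        conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
        return conn
    except sqlite3.OperationalError:
        conn.close()
        return sqlite3.connect(f"file:{path}?mode=ro&immutable=1", uri=True)
```

The `immutable=1` URI parameter tells SQLite the file cannot change, so it skips locking and change detection. That is only safe when nothing else is writing to the database.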
[clang][docs] Document ThreadSanitizer run-time flags and suppressions (#205761)
This patch updates the ThreadSanitizer documentation in clang/docs by
documenting the run-time flags and suppressions, which was requested in
google/sanitizers#446.
Specifically:
- Adds a "Run-time Flags" section detailing common options that can be
passed in TSAN_OPTIONS (e.g. exitcode, log_path, history_size,
halt_on_error, report_atomic_races, etc.).
- Explains how to print the full list of options using help=1.
- Adds a "Suppressions" section documenting the syntax, wildcard rules,
and types of runtime suppressions (race, thread, called_from_lib) with a
practical example suppressions file.
- Adds compile-time ignorelist code examples.
- Documents limitations with C++ Exceptions, non-instrumented code, and
GDB/ASLR issues.
- Removes outdated references to the archived sanitizers wiki.
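
As a rough illustration of the formats involved (not text from the new documentation), the sketch below builds a suppressions file body and a `TSAN_OPTIONS` value in Python. The helper names and the example patterns are hypothetical. The sketch accepts only the three suppression types listed above, which is a choice made for this example.

```python
# Suppression types named in this commit message; chosen for this sketch only.
_KINDS = {"race", "thread", "called_from_lib"}


def format_suppressions(rules):
    """Render (kind, pattern) pairs as 'kind:pattern' lines, one per rule."""
    lines = []
    for kind, pattern in rules:
        if kind not in _KINDS:
            raise ValueError(f"unsupported suppression type: {kind}")
        lines.append(f"{kind}:{pattern}")
    return "\n".join(lines) + "\n"


def format_tsan_options(**flags):
    """Join run-time flags as space-separated key=value pairs."""
    return " ".join(f"{name}={value}" for name, value in flags.items())


# Hypothetical patterns, for illustration only.
text = format_suppressions([("race", "MyClass::RacyMethod"),
                            ("called_from_lib", "libexample.so")])
options = format_tsan_options(suppressions="tsan.supp", halt_on_error=1)
```

The resulting `options` string would be exported as the `TSAN_OPTIONS` environment variable before running the instrumented binary.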
[runtimes] Add explicit offload arch tool dependencies
Needed for the offload unittests which detect the target arch at
configure time if not forced by OFFLOAD_TESTS_FORCE_AMDGPU_ARCH. Bug had
been masked by the dependency on flang, which we recently removed in
https://github.com/llvm/llvm-project/pull/198205.
Claude assisted with this patch.
[X86] Use valign instead of vperm for float domain shuffles (#201624)
The X86 backend lowers float-domain shuffles through lowerV16F32Shuffle /
lowerV8F64Shuffle, which fall through to lowerShuffleWithPERMV (VPERMPS
/ VPERMPD). lowerShuffleAsVALIGN is asserted on i32 / i64 element types
only and is never called from the float-domain paths, even when the mask
is a clean concatenate-and-shift that VALIGN expresses exactly.
On znver5, VALIGN and VPERMPS / VPERMPD have identical latency (5 cycles
for zmm), throughput (2), and macro-op count (1). The real cost of
VPERMPS / VPERMPD is the extra zmm register required to hold the
permutation index vector.
The intrinsic path for _mm512_alignr_epi32 also gets a vperm. It's a win
in the generic path as well, because vpermps zmm1, zmm0, zmm3 requires a
dedicated zmm register to hold the permutation index vector, whereas
valignd zmm1, zmm3, zmm3, 1 encodes the rotation count as an immediate
(imm8 = 1) and uses no extra registers.
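
The equivalence between the two forms can be checked with a small element-level model. This is a Python sketch of the instruction semantics as I understand them, not the backend code, and the function names are hypothetical. It assumes `valign` concatenates its first source above its second, shifts right by `imm` elements, and keeps the low half.

```python
def valign(src1, src2, imm):
    """Model of VALIGN: shift (src1:src2) right by imm elements, keep the low half."""
    n = len(src1)
    concat = list(src2) + list(src1)  # src2 occupies the low elements
    shift = imm % n
    return concat[shift:shift + n]


def permute(src, idx):
    """Model of a single-source VPERM: dest[i] = src[idx[i]]."""
    n = len(src)
    return [src[i % n] for i in idx]


x = list(range(16))
rotate_idx = [(i + 1) % 16 for i in range(16)]  # the index vector VPERM needs
```

With both sources set to `x` and `imm = 1`, `valign(x, x, 1)` equals `permute(x, rotate_idx)`. The rotation amount is a plain integer in the first call, while the second needs the 16-entry `rotate_idx` vector.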
Co-authored-by: Shivanshu
Convert SNMP plugin to the typesafe pattern
## Context
Migrate the `snmp` plugin to the typesafe pattern: a lean `SystemServiceService[SNMPEntry]` delegating to an `SNMPServicePart`, with Pydantic API models, `check_annotations=True`, and `config`/`do_update` returning typed models instead of dicts.
## Solution
- Split the single `snmp.py` into a `snmp/` package: a lean `__init__.py` (service class + port delegate) and `config.py` (the service part holding the SQLAlchemy model, the model-based `do_update`, the v3 user lifecycle, and the defaults helper). `get_snmp_users` stays a `@private` method because the integration tests invoke it over the wire; the unused `_is_snmp_running` was dropped.
- Decouple the legacy `@single_argument_args` model into `SNMPEntry` / `SNMPUpdate` / `SNMPUpdateArgs` / `SNMPUpdateResult` in `api/v27_0_0`. The `v3_password` / `v3_privpassphrase` secrets are read via `get_secret_value()` and persisted with the `expose_secrets` dump context.
- `snmp.config` now returns a model in-process, so the `snmpd.conf.mako` renderer is switched from dict subscripting to attribute access.
- Register the service in `main.py`'s `ServiceContainer`, add the plugin to `mypy.yml`, and fully type-annotate the `utils_snmp_user` helpers so the now-checked plugin passes mypy.
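
The dict-to-model switch in the renderer can be sketched with standard-library stand-ins. This is illustrative only: `Secret` and `SNMPEntryExample` are hypothetical substitutes for the Pydantic types, and `community` is an assumed field name; only `v3_password` and `get_secret_value()` come from the description above.

```python
from dataclasses import dataclass


class Secret:
    """Stand-in for a secret string type that hides its value in repr()."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "Secret('**********')"


@dataclass(frozen=True)
class SNMPEntryExample:
    community: str  # assumed field name
    v3_password: Secret


def render_from_dict(cfg: dict) -> str:
    # Old style: the config arrives as a dict.
    return f"rocommunity {cfg['community']}"


def render_from_model(cfg: SNMPEntryExample) -> str:
    # New style: the config arrives as a typed model.
    return f"rocommunity {cfg.community}"
```

Both renderers produce the same line for the same data. With the model, a misspelled field name fails type checking instead of raising `KeyError` at render time.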
lang/gnat15: New GCC-15 Ada Port
GCC-15 Ada Port:
* Add GNAT-15 to the Ports Tree
* Add xz:threads option to ${TAR} to make use of multi-threaded
compression
* Add GNU OpenMP libraries. This is required for GNATColl Bindings
* Add GNAT and GNAT_SO_VERSION to ${PLIST_SUB} to remove hard-coded
version-dependent information. The aim here is to reduce the work of
updating the Port, or of using the pkg-plist to bootstrap the next major
release
* Complete ${LICENSE} block
* Expand pkg-message and make UCL compliant
* Modernise and update ${COMMENT}, ${WWW} and pkg-descr
https://gcc.gnu.org/gcc-15/changes.html#ada
PR: 292708
Co-authored-by: Alastair Hogge <agh at riseup.net>
Co-authored-by: Marcin Cieślak <saper at saper.info>