[CIR][AArch64] Add missing lowerings for vceqz_* Neon builtins (#184893)
Implement the remaining CIR lowerings for the AdvSIMD (Neon)
`vceqz{|q|d|s}_*` intrinsic group (bitwise equal to zero).
The `vceqzd_s64` variant was already supported; this patch completes
the rest of the group [1].
Tests for these intrinsics are moved from:
* test/CodeGen/AArch64/neon-misc.c
to:
* test/CodeGen/AArch64/neon/intrinsics.c
The implementation largely mirrors the existing lowering in
CodeGen/TargetBuiltins/ARM.cpp.
`emitCommonNeonBuiltinExpr` is introduced to support these lowerings.
`getNeonType` is moved without functional changes.
[2 lines not shown]
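For readers unfamiliar with the intrinsic group, the sketch below models the
"equal to zero" semantics in portable C++: each result lane is all ones when
the input lane is zero, and zero otherwise. The `model_*` names are
hypothetical stand-ins written for this illustration; they are not the CIR
lowering and do not use `arm_neon.h`.

```cpp
#include <array>
#include <cassert>  // for the checks mentioned in the usage note
#include <cstddef>
#include <cstdint>

// Scalar model of vceqzd_s64: all-ones mask if the input is zero, else 0.
std::uint64_t model_vceqzd_s64(std::int64_t a) {
    return a == 0 ? ~std::uint64_t{0} : std::uint64_t{0};
}

// Lane-wise model of vceqz_s32 (two signed 32-bit lanes in, two unsigned
// 32-bit mask lanes out). std::array stands in for the Neon vector types.
std::array<std::uint32_t, 2> model_vceqz_s32(const std::array<std::int32_t, 2> &v) {
    std::array<std::uint32_t, 2> mask{};
    for (std::size_t i = 0; i < v.size(); ++i)
        mask[i] = (v[i] == 0) ? ~std::uint32_t{0} : std::uint32_t{0};
    return mask;
}
```

For instance, `model_vceqzd_s64(0)` yields `UINT64_MAX`, and
`model_vceqz_s32({0, 7})` yields `{0xFFFFFFFF, 0}`.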
iwx: Fix 32-bit compilation
- Avoid shifts wider than integer types, by wrapping the corresponding
checks into '#if __SIZEOF_SIZE_T__ > 32' blocks. 'bus_addr_t'
currently has the same width as 'size_t' on all architectures (and
this is not going to change for 32-bit architectures).
- Use appropriate printf(3) format for 'wk_keytsc'.
Reviewed by: adrian
MFC after: 1 minute
MFC to: stable/15
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D55667
(cherry picked from commit 35da55c28dbb56dd7056b7863efc5b547950d885)
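A minimal sketch of the two kinds of fix described above, not the driver's
actual code. The function names and the 36-bit threshold are hypothetical,
the guard uses the standard `SIZE_MAX` comparison as a stand-in for the
commit's `__SIZEOF_SIZE_T__` check, and the counter is assumed to be a 64-bit
unsigned value.

```cpp
#include <cassert>   // for the checks mentioned in the usage note
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical check: does the address need more than 36 bits?
// Shifting a 32-bit value by 36 is undefined, so the shift is only compiled
// where size_t is wider than 32 bits; elsewhere the answer is trivially no.
bool addr_exceeds_36_bits(std::size_t addr) {
#if SIZE_MAX > 0xffffffffu
    return (addr >> 36) != 0;
#else
    (void)addr;
    return false;
#endif
}

// Format a 64-bit counter with a width-correct conversion specifier
// instead of assuming it matches 'long' or 'long long'.
std::string format_tsc(std::uint64_t tsc) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%" PRIu64, tsc);
    return std::string(buf);
}
```

With this shape the same source builds on 32-bit and 64-bit targets, and
`format_tsc(UINT64_MAX)` produces "18446744073709551615" on both.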
[LoopFusion] Correction in the comments (NFC) (#184689)
The comments in the code should have been updated following the change
in https://github.com/llvm/llvm-project/pull/183353. This PR addresses
that issue.
[docs] Add exception to AI tool policy for Bazel build fixer (#183408)
The Bazel RFC concluded earlier this month:
https://discourse.llvm.org/t/rfc-ai-assisted-bazel-fixer-bot/89178/93
I felt the best way to document this decision was to incorporate it into
this policy document.
[flang][OpenMP] Implement utility to locate OmpClause in ODS, NFC (#184866)
Simplify looking for a specific clause in OmpDirectiveSpecification.
This is an alternative to DirectiveStructureChecker::FindClause for when
the internal checker structures have not yet been updated in the AST
traversal.
[ssaf][UnsafeBufferUsage] Add support for extracting unsafe pointers from all kinds of contributors
- Generalize the -Wunsafe-buffer-usage API for finding unsafe pointers in all kinds of Decls
- Add support in SSAF-based UnsafeBufferUsage analysis for extracting from various contributors
- Mock implementation of HandleTranslationUnit
rdar://171735836
[flang][openacc] Relax semantic check on cache directive (#184887)
The specification does not actually forbid using the colon notation to
specify the full array. The reference compiler accepts this, and our
lowering can already handle it.
[AMDGPU] add back the true16 pattern for cvt_pk_rtz (#184857)
The `SupportedRoundMode` pattern for true16 mode was removed by mistake
in https://github.com/llvm/llvm-project/pull/177069. This patch adds it
back and adds gfx11 to the test which runs true16 mode.
[CIR][AArch64] Add missing lowerings for vceqz_* NEON builtins
Implement the remaining CIR lowerings for the AdvSIMD (NEON)
`vceqz{|q|d|s}_*` intrinsic group (bitwise equal to zero).
The `vceqzd_s64` variant was already supported; this patch completes
the rest of the group.
Tests for these intrinsics are moved from:
test/CodeGen/AArch64/neon-misc.c
to:
test/CodeGen/AArch64/neon/intrinsics.c
The implementation largely mirrors the existing lowering in
CodeGen/TargetBuiltins/ARM.cpp.
`emitCommonNeonBuiltinExpr` is introduced to support these lowerings.
`getNeonType` is moved without functional changes.
[2 lines not shown]
[flang][acc] Handle ViewLike ops with OutlineRematerializationOpInterface in OffloadLiveInValueCanonicalization (#184218)
`fir::ConvertOp` implements both `ViewLikeOpInterface` and
`OutlineRematerializationOpInterface`. `fir.convert` is also used for
ptr-to-int conversions like `(!fir.ref<i32>) -> i64`. That is not really
a "view" — it converts a pointer to an integer — but
`ViewLikeOpInterface` is still attached, so `getOriginalValue` traces
through it to the underlying value.
When the underlying value is not a rematerialization candidate (e.g.,
`fir.alloca`, a block argument, or a `fir.call` result),
`isRematerializationCandidate` returns false and the `fir.convert` is
left as a live-in. This prevents `ACCImplicitData` from tracing back to
the original pointer to create the data mapping.
This PR:
1. Registers `fir::ConvertOp` with
`OutlineRematerializationOpInterface`.
2. Adds a fallback in `isRematerializationCandidate`: when the traced
[16 lines not shown]
[Clang][CIR][AArch64] NFC: Cleanups in AArch64 builtins lowering (#184404)
This patch performs small cleanups and fixes in the AArch64 builtins
lowering code, with the goal of aligning the CIR path more closely
with the existing Clang CodeGen implementation.
Changes include:
* Make sure that `noundef` is consistently matched using `{{.*}}`.
* Rename `AArch64BuiltinInfo` to `armVectorIntrinsicInfo` for better
consistency with the original CodeGen implementation.
* Simplify `emitAArch64CompareBuiltinExpr`, fix an incorrect
assert condition (missing `!`) and make sure to use the input `kind`
condition instead of hard-coding `cir::CmpOpKind::eq`.
* Improve and clarify comments.
No functional changes intended (NFC).
protect some more against TOCTOU in fs plugins/utils
This commit adds a few more usages of RESOLVE_NO_SYMLINKS and
fixes a TOCTOU concern in attrs utils.
[mlir][acc] Add acc.compute_region and acc.par_width operations (#184864)
Introduce two new codegen operations to the acc dialect that model GPU
compute region execution and parallel launch configuration:
- acc.par_width: specifies a parallel dimension.
- acc.compute_region: wraps a region of code for GPU execution, capturing
launch configuration (from acc.par_width results) and input values as
block arguments.
These operations bridge the gap between high-level OpenACC compute
constructs (acc.parallel, acc.kernels, acc.serial) and gpu.launch. The
passes that do these transformations will soon follow.
---------
Co-authored-by: Scott Manley <rscottmanley at gmail.com>
[CIR] Add support for delete cleanup after new operators (#184707)
This adds support for calling operator delete when an exception is
thrown during initialization following an operator new call.
This does not yet handle the case where a temporary object is
materialized during the object initialization. That case is marked by
the "setupCleanupBlockActivation" diagnostic in deactivateCleanupBlock
and will be implemented in a future change.
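The language behavior being lowered can be observed in plain C++: when a
constructor throws inside a new-expression, the matching operator delete is
called before the exception propagates. The sketch below illustrates that
source-level behavior only; `Tracked` and `construct_and_report` are
hypothetical names, and nothing here reflects how CIR emits the cleanup.

```cpp
#include <cassert>  // for the checks mentioned in the usage note
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Class-specific allocation functions that count how often they are called.
struct Tracked {
    static int news;
    static int deletes;

    explicit Tracked(bool fail) {
        if (fail)
            throw std::runtime_error("init failed");
    }

    static void *operator new(std::size_t size) {
        ++news;
        if (void *p = std::malloc(size))
            return p;
        throw std::bad_alloc();
    }

    static void operator delete(void *p) noexcept {
        ++deletes;
        std::free(p);
    }
};

int Tracked::news = 0;
int Tracked::deletes = 0;

// Returns true if construction succeeded. In the failing case the storage
// has already been released by the implicit cleanup when the catch runs.
bool construct_and_report(bool fail) {
    try {
        Tracked *t = new Tracked(fail);
        delete t;
        return true;
    } catch (const std::runtime_error &) {
        return false;
    }
}
```

After `construct_and_report(true)`, both counters read 1 even though no
explicit `delete` ran: the delete call came from the cleanup described above.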
[HLSL] Fix interleaved vector and matrix return types in AST dump
HLSL vector and matrix types were previously printed with their closing
syntax (', N>') in 'printAfter', causing them to interleave with function
parameters when used as return types (e.g., 'vector<float (args), 4>').
This change moves the HLSL vector and matrix closing syntax into
'printBefore' when 'UseHLSLTypes' is enabled, ensuring the type is
printed completely before the parameter list.
Note that address space qualifiers are now printed after the type
(e.g., 'vector<float, 4>hlsl_device'). This is because
'canPrefixQualifiers' in 'TypePrinter.cpp' returns false for these types.
We cannot easily change this to check 'UseHLSLTypes' because
'canPrefixQualifiers' is a static method and does not have access to the
PrintingPolicy at that point.
Fixes interleaved output in HLSL AST tests.
Assisted-by: Gemini
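The interleaving can be reproduced with a toy model of a two-phase type
printer, where a type contributes text before and after the declarator. The
sketch is purely illustrative: `TypeParts`, `vector_parts_old`,
`vector_parts_new` and `print_function_type` are hypothetical and much
simpler than the real 'TypePrinter.cpp'.

```cpp
#include <cassert>  // for the checks mentioned in the usage note
#include <string>

// Text a type emits before and after whatever it wraps (here, a parameter list).
struct TypeParts {
    std::string before;
    std::string after;
};

// Old behavior (modelled): the closing ', N>' is deferred to the "after" phase.
TypeParts vector_parts_old(const std::string &elem, int n) {
    return {"vector<" + elem, ", " + std::to_string(n) + ">"};
}

// New behavior (modelled): the whole spelling is emitted in the "before" phase.
TypeParts vector_parts_new(const std::string &elem, int n) {
    return {"vector<" + elem + ", " + std::to_string(n) + ">", ""};
}

// A function type prints as: return-type "before", parameter list, "after".
std::string print_function_type(const TypeParts &ret, const std::string &params) {
    return ret.before + " (" + params + ")" + ret.after;
}
```

With the old split the model prints 'vector<float (args), 4>'; with the new
one it prints 'vector<float, 4> (args)'.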
Mk/Uses/go.mk: Minor documentation improvements
- Add links to PHB sections on porting Go apps
- Reword the descriptions of most args to put the verb first (my
theory is that this makes it easier to comprehend at a glance?)
- Use "set up" for verb form and "setup" for noun form (my own pet
peeve)
- Consistently use tabs for leading whitespace
Revert "[mlir][arith] Add `exact` to `index_cast{,ui}` (#183395)" (#184876)
This reverts commit 7ad2c6db54a0e77249f2edb3c589ccf4c930d455.
PR #183395 introduced the `exact` flag to `index_cast` and
`index_castui` and updated some canonicalization patterns.
These canonicalization patterns were found to be unsound. For example:
* `index_cast(index_cast(x)) -> x`
* where one first truncates and then widens x
the rewrite is unsound because the first cast **may** truncate the value
of x, losing information. The `exact` flag was introduced to make this
transformation sound: when the flag is present, the operand to
index_cast is assumed to lose no information (i.e., it fits perfectly in
the destination type).
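The information loss is easy to reproduce outside MLIR. The sketch below uses
fixed-width unsigned integers as a stand-in for the unsigned cast pair (64
bits narrowed to 32, then zero-extended back); the widths and function names
are assumptions chosen for illustration and are not taken from the reverted
patch.

```cpp
#include <cassert>  // for the checks mentioned in the usage note
#include <cstdint>

// Narrow to 32 bits, then widen back to 64 bits (zero-extension).
std::uint64_t trunc_then_widen(std::uint64_t x) {
    auto narrow = static_cast<std::uint32_t>(x);  // drops the upper 32 bits
    return static_cast<std::uint64_t>(narrow);
}

// The fold "cast(cast(x)) -> x" is only valid when this returns true,
// i.e. when x already fits in the narrower type.
bool round_trip_is_identity(std::uint64_t x) {
    return trunc_then_widen(x) == x;
}
```

For example, 42 survives the round trip, while 0x100000001 comes back as 1,
so replacing the cast pair with x would change the result.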
In PR #183395, the canonicalization rule was rewritten such that would
[25 lines not shown]