[LoopInterchange] Fix test phi-ordering.ll (NFC) (#181989)
I found that the test phi-ordering.ll is fragile and can fail because of
unrelated changes. Also, this test is not consistent with the
following comment, which is at the top of the file:
```
;; Checks the order of the inner phi nodes does not cause havoc.
;; The inner loop has a reduction into c. The IV is not the first phi.
```
After examining the change history, I found that the original intent of
this test was effectively lost in
https://github.com/llvm/llvm-project/commit/c8bd6ea35e459169cbd401372e81168ed8482536.
A workaround was introduced later in
https://github.com/llvm/llvm-project/commit/eac34875109898ac01985f4afa937eec30c1c387
to preserve the test output, but this seems to have made the test more
complicated.
[5 lines not shown]
[SimplifyLibCalls] Avoid simplifying pow(x, 2.0) -> x * x with math-errno. (#183099)
It came up in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123826 that
GCC was simplifying pow(x, 2.0) -> x * x, even when doing so caused
-fmath-errno to be ignored. This patch fixes a similar bug in LLVM.
For the related ConstantFolding issue, where powf expressions that may
raise exceptions are folded, see #183102.
FunctionAttrs: Propagate nofpclass from callsite arguments
Follow along with the nonnull handling. This is essentially the same,
except it can union with an existing attribute.
I'm wondering if getParamNoFPClass should have an AllowUndefOrPoison
argument to check noundef like nonnull. None of the uses of hasNonNullAttr
use this with true though, so maybe both should just check noundef.
[CIR][AArch64] Add lowering + tests for predicated SVE svdup_lane builtins
This PR adds CIR lowering + tests for SVE `svdup_lane` builtins on
AArch64. The corresponding ACLE intrinsics are documented at:
https://developer.arm.com/architectures/instruction-sets/intrinsics
[AArch64] Report accurate sizes for MOVaddr and MOVimm pseudos
getInstSizeInBytes returned the default 4 bytes for MOVaddr*,
MOVi32imm and MOVi64imm pseudos, which doesn't reflect their
expanded size. Compute the real sizes: 8 or 12 bytes for MOVaddr*
(depending on MO_TAGGED), and the actual expansion count for
MOVi32imm/MOVi64imm using AArch64_IMM::expandMOVImm.
py-bandit: updated to 1.9.4
1.9.4
* Fix B106 reporting wrong line number on multiline function calls
* Lower version guard in check_ast_node to Python 3.12
* Fix B615 false positive when revision is set via variable
* Include filename in nosec 'no failed test' warning
* Fix B613 crash when reading from stdin
* Bump docker/build-push-action from 6.18.0 to 6.19.2
* Bump docker/login-action from 3.6.0 to 3.7.0
* chore: fixed some typos in comments
[flang][OpenMP] Fix crash in declare reduction with intrinsic operators (#182978)
genOMP for OpenMPDeclareReductionConstruct unconditionally extracts
ProcedureDesignator from OmpReductionIdentifier, but when the reduction
identifier is an intrinsic operator like `+`, the parser produces a
DefinedOperator instead. This causes a std::get crash.
Visit both variants of OmpReductionIdentifier to extract the reduction
name string, handling DefinedOperator (with IntrinsicOperator and
DefinedOpName sub-variants) alongside the existing ProcedureDesignator
path.
This fixes the ICE; the underlying lack of derived-type reduction
support (TODO in ReductionProcessor::getReductionInitValue) remains
a separate issue.
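
The difference between extracting one alternative unconditionally and visiting every alternative can be sketched with plain `std::variant`. All types and the `reductionName` helper below are hypothetical stand-ins, not the flang parse-tree classes; only the shape of the nesting follows the description above.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for the parse-tree nodes named in the message.
struct ProcedureDesignator { std::string name; };
struct IntrinsicOperator { char symbol; };
struct DefinedOpName { std::string name; };
struct DefinedOperator { std::variant<IntrinsicOperator, DefinedOpName> u; };
struct ReductionIdentifier { std::variant<ProcedureDesignator, DefinedOperator> u; };

// Standard C++17 helper for building a visitor out of lambdas.
template <class... Ts> struct Overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Visit both alternatives of the identifier (and both sub-alternatives of
// DefinedOperator) instead of assuming ProcedureDesignator.
std::string reductionName(const ReductionIdentifier &id) {
  return std::visit(
      Overloaded{
          [](const ProcedureDesignator &p) { return p.name; },
          [](const DefinedOperator &op) {
            return std::visit(
                Overloaded{
                    [](const IntrinsicOperator &i) { return std::string(1, i.symbol); },
                    [](const DefinedOpName &n) { return n.name; },
                },
                op.u);
          },
      },
      id.u);
}
```

In this sketch, calling `std::get<ProcedureDesignator>` on an identifier that holds a `DefinedOperator` throws `std::bad_variant_access`, which is the standard-library analogue of the failure described above; `reductionName` handles the same value without error.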
Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski at hpe.com>
[Clang] Add `__builtin_reduce_[in_order|assoc]_fadd` for floating-point reductions (#176160)
This adds `__builtin_reduce_[in_order|assoc]_fadd` to expose the
`llvm.vector.reduce.fadd.*` intrinsic directly in Clang, for the full
range of supported FP types.
Given a floating-point vector `vec` and a scalar floating-point value
`acc`:
- `__builtin_reduce_assoc_fadd(vec)` corresponds to a fast/associative
reduction
* i.e., the fadds can occur in any order
- `__builtin_reduce_in_order_fadd(vec, acc)` corresponds to an ordered
reduction
* i.e., the result is as if an accumulator was initialized with `acc`
and each lane was added to it in order, starting from lane 0
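
Why the two forms are distinct can be seen in a scalar model. The sketch below does not call the new builtins; `reduceInOrder` and `reducePairwise` are hypothetical helpers, and the pairwise tree is only one of the orders an associative reduction is allowed to pick.

```cpp
#include <array>

// Hypothetical scalar model of the ordered form: start from acc and add
// lanes 0, 1, 2, 3 in sequence.
double reduceInOrder(const std::array<double, 4> &vec, double acc) {
  for (double lane : vec)
    acc += lane;
  return acc;
}

// Hypothetical model of ONE order an associative reduction may choose:
// a pairwise tree, (v0 + v1) + (v2 + v3).
double reducePairwise(const std::array<double, 4> &vec) {
  return (vec[0] + vec[1]) + (vec[2] + vec[3]);
}
```

For an input such as `{1e17, 1.0, -1e17, 1.0}`, the small lanes are absorbed by the large ones at different points, so under IEEE double arithmetic the two helpers return different sums. That order sensitivity is the reason an ordered variant is exposed alongside the associative one.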
[mlir][gpu] Support arith.truncf in subgroup MMA elementwise ops (#182499)
This commit adds support for arith.truncf in the supported list of
elementwise ops for subgroup MMA ops, and enables lowering to SPIR-V.
[DAG] visitOR - attempt to fold (or buildvector(), buildvector()) -> buildvector() (#183032)
See if we can fold all elements of an OR of buildvectors: OR(-1,X) ->
-1, OR(0,X) -> X, etc.
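
A minimal sketch of the per-element idea, assuming 32-bit lanes. This is not the DAGCombiner code: `Lane`, `foldOrLane` and `foldOrBuildVectors` are hypothetical names, and an opaque (non-constant) element is modelled as a string.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// A lane is either a known 32-bit constant or an opaque named value.
using Lane = std::variant<uint32_t, std::string>;

static bool isConst(const Lane &L, uint32_t V) {
  const uint32_t *C = std::get_if<uint32_t>(&L);
  return C && *C == V;
}

// Fold OR of two lanes, or return nullopt if a real OR would still be needed.
std::optional<Lane> foldOrLane(const Lane &A, const Lane &B) {
  constexpr uint32_t AllOnes = 0xFFFFFFFFu;
  if (isConst(A, AllOnes) || isConst(B, AllOnes))
    return Lane(AllOnes);                 // OR(-1, X) -> -1
  if (isConst(A, 0))
    return B;                             // OR(0, X) -> X
  if (isConst(B, 0))
    return A;                             // OR(X, 0) -> X
  const uint32_t *CA = std::get_if<uint32_t>(&A);
  const uint32_t *CB = std::get_if<uint32_t>(&B);
  if (CA && CB)
    return Lane(static_cast<uint32_t>(*CA | *CB)); // constant | constant
  return std::nullopt;
}

// The fold only succeeds if EVERY element folds; otherwise keep the OR.
std::optional<std::vector<Lane>>
foldOrBuildVectors(const std::vector<Lane> &L, const std::vector<Lane> &R) {
  if (L.size() != R.size())
    return std::nullopt;
  std::vector<Lane> Out;
  for (std::size_t I = 0; I < L.size(); ++I) {
    std::optional<Lane> F = foldOrLane(L[I], R[I]);
    if (!F)
      return std::nullopt;
    Out.push_back(*F);
  }
  return Out;
}
```

The all-or-nothing return mirrors the "fold all elements" wording above: a single lane that needs a real OR leaves the original node in place.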
[clang] Define __PTRAUTH_INTRINSICS__ for arm64e-apple-* targets (#172944)
The macro is set by Xcode clang for the arm64e-apple-* targets, and
ifdefed in the macOS and iPhoneOS SDKs.
[AMDGPU] Fix compute num sign bits unsigned underflow (#182723)
Fixes #182677
The `BFE_I32` case in `ComputeNumSignBitsForTargetNode` was not masking
the width operand with `& 0x1f`, unlike other BFE operations in the same
file. Since the hardware instruction only uses the low 5 bits of the
width field, values >= 32 passed via `@llvm.amdgcn.sbfe.i32` caused
unsigned integer underflow in the calculation:
unsigned SignBits = 32 - Width->getZExtValue() + 1;
When width > 33, this underflows, producing incorrect SignBits values.
When width == 33, SignBits becomes 0, violating the expected return
range of [1, BitWidth]. This led to assertion failures and
miscompilation where subsequent BFE narrowing operations were
incorrectly eliminated.
This patch:
[2 lines not shown]
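
The arithmetic described above can be reproduced in isolation. This is a sketch, not the patch itself: `signBitsUnmasked` and `signBitsMasked` are hypothetical helpers, and the clamp for a masked width of 0 is an assumption of this example, since the elided lines of the message are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Pre-fix shape: the width is used unmasked in 64-bit unsigned arithmetic,
// then narrowed to unsigned, as in the quoted line.
unsigned signBitsUnmasked(uint64_t Width) {
  return static_cast<unsigned>(32 - Width + 1);
}

// Masked shape: only the low 5 bits of the width are used, matching what
// the message says the hardware does.
unsigned signBitsMasked(uint64_t Width) {
  uint64_t W = Width & 0x1f;
  unsigned SignBits = static_cast<unsigned>(32 - W + 1);
  if (SignBits > 32) // assumption of this sketch: W == 0 would give 33
    SignBits = 32;
  assert(SignBits >= 1 && SignBits <= 32);
  return SignBits;
}
```

With a width of 33, the unmasked form yields 0, and with 34 it wraps to a huge value; the masked form treats those widths as 1 and 2 and stays within [1, 32].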
update to got-0.123
- make gotsys-write-conf configure clone-urls for all accessible repositories
- ensure visitors see the repository index page after logging into gotwebd
- make 'gotadmin cleanup' run even if HEAD points at a non-existent branch
- gotsys.conf.5 and got.1 wording and markup fixes
- replace obsolete tmppath pledge in got-notify-http with wpath+cpath & unveil
- avoid a malloc/free dance per parsed tree entry in got-read-pack
- stop using the pack delta-cache in got-read-pack, cache-less is faster here
- fix double-free in error path of the 'gotadmin pack' commit coloring phase
- store first-level object_idset hash table entries inline to avoid malloc/free
- avoid doing asprintf/free per tree entry in got_pack_load_tree_entries()
- avoid a per tree-entry memcpy() in got-read-pack enumerate_tree()
- avoid deltifying packed delta-base objects to speed up pack file generation
- cache fewer but larger deltas in delta-cache to speed up got-index-pack
[llvm][release] Link to .jsonl signatures for Windows x86_64 and ARM64 (#183053)
Previously we linked to .sig files, which were created by the person who
built the release.
These releases are now built on GitHub, so they have .jsonl signature
files instead.
Add a temporary patch to remove tmppath from pledge in favour of
unveil(_PATH_TMP)+pledge("rpath wpath cpath").
This patch is to bridge the time until a new release of dkimsign can be
made.
OK op@ kirill@