[AArch64][CodeGen] match (or x (not y)) to generate mov+orn (#191145)
Fixes: #100045
Adds a TableGen pattern that matches (or x (not y)) and generates
mov+orn instead of the original mvn+orr.
The number of instructions stays the same, but mov+orn can be
considered better than mvn+orr for two reasons:
1. Symmetry: For the same input with an 'and' instead of 'or', mov+bic
is generated.
2. Optimization through register renaming: If the mov takes an immediate
(for example, 'mov x1, #0x4'), it can be retired early by the register
renamer and never issued for execution.
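The equivalence the pattern relies on can be checked with a small model.
The sketch below treats registers as 64-bit unsigned values and models
the instruction semantics as plain bitwise operations; the helper names
are illustrative only and are not LLVM APIs.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit X register


def mvn(y):
    """Bitwise NOT, truncated to 64 bits."""
    return ~y & MASK64


def orn(x, y):
    """ORN semantics: x OR (NOT y)."""
    return (x | ~y) & MASK64


def bic(x, y):
    """BIC semantics: x AND (NOT y), the 'and' counterpart of ORN."""
    return x & ~y & MASK64


def or_not_via_mvn_orr(x, y):
    """Models the two-step form: invert y first, then OR with x."""
    return (x | mvn(y)) & MASK64
```

Both forms compute the same value for any inputs, so the choice between
them is purely about which instruction sequence is emitted.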
The original patch was reverted so that I could change the email address
associated with it.
Original patch: #190769
[2 lines not shown]
[LV] NFCI: Create VPExpressions in transformToPartialReductions.
With this change, all logic to generate partial reductions and
recognising them as VPExpressions is contained in
`transformToPartialReductions`, without the need for a second
transform pass.
This PR is intended to be a non-functional change.
[CIR][Aarch64] upstream scalar & vector intrinsics (FP16) (#190310)
This PR upstreams the following fp16 intrinsics as part of #185382:
- vaddh_f16
- vsubh_f16
- vmulh_f16
- vdivh_f16
This is my first PR to LLVM, so any feedback is greatly appreciated!
[clang][CIR] Add lowering for vcvt_n_ and vcvtq_n_ conversion intrinsics
This PR adds lowering for the conversion intrinsics with an immediate
argument (identified by `_n_` in the intrinsic name), excluding FP16
variants.
It also moves the corresponding tests from:
* clang/test/CodeGen/AArch64/neon_intrinsics.c
to:
* clang/test/CodeGen/AArch64/neon/intrinsics.c
The lowering follows the existing implementation in
CodeGen/TargetBuiltins/ARM.cpp and adds the `getFloatNeonType` helper
to support it. The remaining changes are code motion and refactoring.
Reference:
[1] https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#conversions
[lldb] Handle simulator printout in TestSimulatorPlatform (#189571)
This test invokes a binary in a simulator and then reads the first line
of stderr to parse the PID of the invoked binary.
This approach fails when the simulator itself prints a warning/error on
startup. In this case, we try to parse the error as the PID and fail.
This patch just removes the line limit. The limit doesn't seem to add any
value, since we need to search until we find the PID line anyway, and if
there is no PID line we cannot do anything but time out eventually.
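A minimal sketch of the scanning approach described above. It assumes a
hypothetical `PID: <number>` line format (the format the real test
parses is not shown here) and reads every line until one matches,
instead of inspecting only the first line.

```python
import io
import re

# Hypothetical format; the real test's PID line may look different.
PID_RE = re.compile(r"^PID: (\d+)$")


def find_pid(stderr_lines):
    """Return the first PID found in an iterable of lines, or None."""
    for line in stderr_lines:
        match = PID_RE.match(line.strip())
        if match:
            return int(match.group(1))
    return None


# A simulator warning precedes the PID line.
output = io.StringIO("warning: simulator is slow to boot\nPID: 4242\n")
pid = find_pid(output)
```

With a first-line-only limit, the warning would have been handed to the
PID parser; scanning all lines skips it.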
See also rdar://169799464
[GlobalISel] Prevent hoisting of CheckIsSameOperand from creating invalid match tables (#190963)
Fixes #188513
This patch adds logic to ask PredicateMatchers whether they'd like to be
hoisted out of a specific Matcher or not.
SameOperandMatcher can use it to check if it's being hoisted out of the
RuleMatcher that defines the operand it relies on.
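The idea can be illustrated with a toy model. Everything below (class
names, the `can_hoist_out_of` method, the operand names) is hypothetical
and does not mirror the actual TableGen backend classes; it only shows a
predicate vetoing hoisting out of the rule that defines the operand it
depends on.

```python
class Rule:
    """A rule that binds (defines) a set of named operands."""

    def __init__(self, defined_operands, predicates):
        self.defined_operands = set(defined_operands)
        self.predicates = list(predicates)


class Predicate:
    """By default a predicate does not object to being hoisted."""

    def can_hoist_out_of(self, rule):
        return True


class SameOperand(Predicate):
    """Checks that an operand equals another, previously bound operand."""

    def __init__(self, other_operand):
        self.other_operand = other_operand

    def can_hoist_out_of(self, rule):
        # Hoisting above the rule that binds the operand would run the
        # check before the operand exists, so refuse in that case.
        return self.other_operand not in rule.defined_operands


def hoistable_predicates(rule):
    """Keep only the predicates that agree to leave this rule."""
    return [p for p in rule.predicates if p.can_hoist_out_of(rule)]


opcode_check = Predicate()
same_as_dst = SameOperand("dst")
rule = Rule(defined_operands=["dst"], predicates=[opcode_check, same_as_dst])
hoisted = hoistable_predicates(rule)
```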
Assisted-By: Claude Opus 4.6
Context of Use: Claude was only used to add LLVM-style RTTI to the
matcher class (repetitive work). I then reviewed and cleaned up the code
it generated.
[SeparateConstOffsetFromGEP] Fix incorrect inbounds flag in case of non-negative index but negative offset (#190192)
Fixes #190187
Currently, SeparateConstOffsetFromGEP preserves the inbounds attribute if
the new sequence of GEPs from the same base pointer has non-negative
offsets in each GEP (this was mentioned in
https://github.com/llvm/llvm-project/pull/159515). This statement seems
correct to me (if the sequence consists of 2 GEPs), but the current
implementation has a flaw: it checks that the constant byte offset and
the GEP indices are non-negative. However, in some corner cases the
index is non-negative while its offset (in bytes) is negative, so the
inbounds attribute can't be preserved. In the example, the GEP index
after the transformation can have the values
`0x7ffffffffffffffd`/`0x7ffffffffffffffe`/`0x7fffffffffffffff`; they are
all non-negative (the sign bit is zero), but after multiplication by
sizeof(i64) they become negative, and inbounds can no longer be
preserved.
The proposed fix is to check that Idx * ElementStride is non-negative
(instead of checking Idx only).
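The arithmetic can be worked through directly. This sketch assumes a
64-bit index width with two's-complement wraparound and an element
stride of 8 bytes for i64; the helper names are illustrative.

```python
def to_signed64(value):
    """Interpret the low 64 bits of value as a two's-complement integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >> 63 else value


ELEMENT_STRIDE = 8  # sizeof(i64) in bytes


def byte_offset(idx):
    """Byte offset of a GEP index, with 64-bit wraparound."""
    return to_signed64(idx * ELEMENT_STRIDE)


indices = [0x7FFFFFFFFFFFFFFD, 0x7FFFFFFFFFFFFFFE, 0x7FFFFFFFFFFFFFFF]

# Each index is non-negative when viewed as a signed 64-bit value...
index_signs_ok = all(to_signed64(i) >= 0 for i in indices)

# ...but the scaled byte offsets wrap around and become negative.
offsets = [byte_offset(i) for i in indices]
```

Here `offsets` evaluates to `[-24, -16, -8]`, so a check on the index
alone passes while the actual byte offset is negative.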
[Polly] Remove pipeline-level Oz handling for LoopRotate (#191137)
This handling was moved fully into the pass in
1662c200a5b151ad15b7efc82837076d8967dc11. However, that change
missed the usage of LoopRotate in Polly.
[Flang][OpenMP] Data-sharing restrictions on assumed-size arrays (#189324)
Per `OpenMP 5.0 2.19.1 Data-Sharing Attribute Rules`, assumed-size
arrays are predetermined shared and may not appear in any data-sharing
clause other than `shared`.
This patch adds a semantic check for assumed-size arrays appearing in
clauses where they aren't allowed.
[UniformityAnalysis] Skip CycleAnalysis on targets without branch divergence (#189948)
UniformityAnalysis unconditionally computes CycleAnalysis even on
targets that don't care about divergence, causing measurable
compile-time overhead (see [#99878
(comment)](https://github.com/llvm/llvm-project/pull/175167#issuecomment-4156230947)).
---------
Co-authored-by: padivedi <padivedi at amd.com>
[orc-rt] Remove Session::waitForShutdown. (#191124)
The existing implementation triggered Session shutdown and then blocked
on a std::future that would be unblocked by an on-shutdown callback that
waitForShutdown had installed. Since there is no guarantee that this
callback would be the last one run, the result was that waitForShutdown
only guaranteed that it would not return until the shutdown sequence had
started (rather than completed).
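The ordering problem can be modeled in a few lines. This is a
single-threaded sketch with hypothetical callback names, not the orc-rt
implementation: it records which callbacks had run at the moment the
future was fulfilled, which is when a blocked waiter would wake up.

```python
from concurrent.futures import Future


def run_shutdown(callbacks, log):
    """Run on-shutdown callbacks in order, logging each one afterwards."""
    for name, callback in callbacks:
        callback()
        log.append(name)


log = []
done = Future()
state_at_unblock = []


def signal_waiter():
    # The point at which a blocked waiter would be released: snapshot
    # the callbacks that have completed so far.
    done.set_result(None)
    state_at_unblock.extend(log)


callbacks = [
    ("flush", lambda: None),
    ("signal_waiter", signal_waiter),
    ("release_resources", lambda: None),  # still pending at unblock time
]
run_shutdown(callbacks, log)
```

Because nothing forces `signal_waiter` to be last, the waiter can resume
while `release_resources` has not yet run.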
This could have been fixed, but the Session destructor is already
supposed to block until the Session can be safely destroyed, so a
"working" waitForShutdown would be effectively redundant. Since it was
also a potential footgun (calling it from an on-detach or on-shutdown
callback could deadlock) it was safer to just remove it entirely.
Some Session unit tests do rely on testing properties of the Session
after the shutdown sequence has started, so a new utility has been added
to SessionTests.cpp to support this.