[X86] computeKnownBitsForTargetNode - add X86ISD::FANDN coverage and X86SelectionDAGTest infrastructure (#181994)
Set up X86SelectionDAGTest unit tests (matching the AArch64/ARM/RISCV pattern) to make X86ISD DAG nodes easier to test.
The first test covers X86ISD::FANDN nodes, which are very tricky to exercise in regular tests because they so often fold beforehand.
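As a rough illustration of what known-bits coverage for an and-not node involves, here is a minimal standalone sketch. It assumes FANDN(X, Y) computes (~X & Y); `KnownBits8`, `knownBitsAndN` and `checkAgainstAllValues` are hypothetical names for this example, not the LLVM implementation or its KnownBits class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit stand-in for a known-bits record (not llvm::KnownBits).
struct KnownBits8 {
  uint8_t Zero; // bits known to be 0
  uint8_t One;  // bits known to be 1
};

// Known bits of (~X & Y), the operation assumed here for FANDN(X, Y).
inline KnownBits8 knownBitsAndN(KnownBits8 X, KnownBits8 Y) {
  // A result bit is 0 if X's bit is known 1 (so ~X is 0) or Y's bit is known 0.
  // A result bit is 1 only if X's bit is known 0 and Y's bit is known 1.
  return {static_cast<uint8_t>(X.One | Y.Zero),
          static_cast<uint8_t>(X.Zero & Y.One)};
}

// True if the concrete value V agrees with everything K claims to know.
inline bool isConsistent(unsigned V, KnownBits8 K) {
  const unsigned Zero = K.Zero, One = K.One;
  return (V & Zero) == 0 && (V & One) == One;
}

// Brute-force soundness check: every concrete (x, y) allowed by the inputs
// must produce a result allowed by the computed known bits.
inline void checkAgainstAllValues(KnownBits8 X, KnownBits8 Y) {
  const KnownBits8 R = knownBitsAndN(X, Y);
  for (unsigned x = 0; x < 256; ++x) {
    if (!isConsistent(x, X))
      continue;
    for (unsigned y = 0; y < 256; ++y) {
      if (!isConsistent(y, Y))
        continue;
      const unsigned r = ~x & y & 0xFFu;
      assert(isConsistent(r, R));
    }
  }
}
```

For example, with X fully known as 0xF0 and Y known to have bits 2..5 set, the sketch reports the top four result bits as known zero and bits 2..3 as known one.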
AMDGPU/GlobalISel: Regbanklegalize rules for INTRIN_IMAGE
Regbanklegalize rules for INTRIN_IMAGE loads and stores.
Because of the very large number of different type signatures, the rule specifies
only the lowering function (waterfall lowering of the RsrcIdx operand if needed),
and this function also applies the register banks.
[mlir][scf] Refactor and improve ParallelLoopFusion (#179284)
Refactor and extend the scf::ParallelLoopFusion pass:
- Refactor code, rename functions and add comments to improve
readability
- Make the dependency analysis safer by checking for read-after-write
dependencies also with vector.load/store & vector.transfer_read/write
ops, in addition to memref.load/store, and bail out when other
unsupported ops with memory effects are found.
- Extend the cases when the fusion is applied: allow fusing also when
one of the two loops reads/writes to memory through a full view/alias of
the buffer (read/written by the dual operation in the other loop) that
can be trivially resolved, including rank-reducing full subviews.
[CodeGen] Refactor register operand parsing in MI parser (#181748)
- Refactor register operand parsing to eliminate duplicated LLT parsing
code.
- Additionally, fix the register operand syntax in MI LangRef to match
what the parser supports.
[X86] LowerAndToBT - fold ICMP_ZERO(AND(X,AND(Y,SHL(1,Z)))) -> BT(AND(X,Y),Z) patterns (#182007)
Use the m_ReassociatableAnd matcher to handle any permutation of a 3-op AND chain that involves a bit test.
Fix 1 of 2 for #147216
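The identity behind this fold can be checked with a small standalone sketch. It models BT as "return bit Z of the value" and assumes Z is smaller than the bit width; `andChainIsZero`, `bitTest` and `checkFoldEquivalence` are hypothetical helper names for this example, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side: ICMP_ZERO(AND(X, AND(Y, SHL(1, Z)))), assuming Z < 32.
inline bool andChainIsZero(uint32_t X, uint32_t Y, unsigned Z) {
  return (X & (Y & (uint32_t{1} << Z))) == 0;
}

// Simplified model of BT(Value, Z): returns the selected bit, assuming Z < 32.
inline bool bitTest(uint32_t Value, unsigned Z) {
  return ((Value >> Z) & 1u) != 0;
}

// The AND chain compares equal to zero exactly when bit Z of (X & Y) is
// clear, whichever way the three AND operands are associated.
inline void checkFoldEquivalence() {
  const uint32_t Samples[] = {0u,          1u,          0x80000000u,
                              0xDEADBEEFu, 0x12345678u, 0xFFFFFFFFu};
  for (uint32_t X : Samples)
    for (uint32_t Y : Samples)
      for (unsigned Z = 0; Z < 32; ++Z) {
        const uint32_t Bit = uint32_t{1} << Z;
        const bool Expected = !bitTest(X & Y, Z);
        assert(andChainIsZero(X, Y, Z) == Expected);
        assert((((X & Bit) & Y) == 0) == Expected); // another permutation
        assert((((X & Y) & Bit) == 0) == Expected); // and another
      }
}
```

Calling `checkFoldEquivalence()` exercises three of the possible operand orders over a handful of sample values; it is a sanity check of the algebra, not a proof.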
[ASan/sanitizers] Make stack unwinding better on Windows. (#180205)
I created an issue about this in #179976.
Clang's Address Sanitizer installs its own SEH filter which handles some
types of uncaught exceptions. Along with register values and some other
information, it also generates a stack trace. However, the current logic is
incomplete. It relies on DbgHelp's SymFunctionTableAccess64 and
SymGetModuleBase64 which won't work with machine code that has its
RUNTIME_FUNCTION entry registered with Rtl* (e.g. RtlAddFunctionTable)
system calls. Most likely, this is because DbgHelp either relies on
information in PDB files or considers PDATA and XDATA only from loaded
EXE and DLL modules. Either way, consider the following example:
```
#include <windows.h>
#include <iostream>
#include <vector>
```
[150 lines not shown]
libclc: Stop using r600 asm intrinsic declarations for amdgcn (#181975)
Really the workitem functions should all be moved to generic code
and use gpuintrin.h. These implementations were copied from there.
[lldb][PlatformDarwin][NFCI] Factor out dSYM script auto-loading into helper function (#182002)
Depends on:
* https://github.com/llvm/llvm-project/pull/182001
(only second commit is relevant for this review)
This patch factors out the logic to load dSYM scripting resources into a
helper function. In the process we eliminate some redundant copying of
`FileSpec` and pass it to the helper by `const-ref` instead
(specifically the `symfile_spec`).
[SelectionDAG] Fix bug related to demanded bits/elts for BITCAST
When we have a BITCAST and the source type is a vector with smaller
elements compared to the destination type, then we need to demand
all the source elements that make up the demanded elts for the
result when doing recursive calls to SimplifyDemandedBits,
SimplifyDemandedVectorElts and SimplifyMultipleUseDemandedBits.
The problem is that those simplifications are allowed to turn non-demanded
elements of a vector into POISON, so unless we demand all source
elements that make up the result there is a risk that the result
would be more poisonous (even for demanded elts) after the
simplification.
The patch fixes some bugs in SimplifyMultipleUseDemandedBits and
SimplifyDemandedBits for situations when we did not consider the
problem described above. Now we make sure that we also demand vector
elements that "must not be turned into poison" even if those elements
correspond to bits that do not need to be defined according to
the DemandedBits mask.
[2 lines not shown]
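The element mapping described above can be sketched as follows. This is a simplified standalone model using plain 64-bit masks, assuming each destination element is built from `Scale` consecutive source elements; `demandedSrcElts` is a hypothetical helper for illustration, not the code changed by the patch.

```cpp
#include <cassert>
#include <cstdint>

// Given a mask of demanded destination elements (bit I set means element I
// is demanded), return the mask of source elements that must also be
// demanded so that no part of a demanded result element can become poison.
inline uint64_t demandedSrcElts(uint64_t DemandedDstElts, unsigned NumDstElts,
                                unsigned Scale) {
  assert(Scale > 0 && NumDstElts * Scale <= 64 && "mask must fit in 64 bits");
  uint64_t Src = 0;
  for (unsigned I = 0; I < NumDstElts; ++I) {
    if (((DemandedDstElts >> I) & 1u) == 0)
      continue;
    // Demand every source element that contributes bits to element I.
    for (unsigned J = 0; J < Scale; ++J)
      Src |= uint64_t{1} << (I * Scale + J);
  }
  return Src;
}
```

For a v8i16 to v4i32 bitcast (Scale = 2), demanding result elements 0 and 2 requires source elements 0, 1, 4 and 5, so `demandedSrcElts(0x5, 4, 2)` yields 0x33.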
[lldb][PlatformDarwin][NFC] Use early-return style in LocateExecutableScriptingResources (#182001)
I'm planning on adding more to this function. It'll be easier to
review/reason about if we un-nest the if-blocks (as the LLVM style
guide recommends).
[modules] Fix warning: missing submodule 'LLVM_IR.FunctionProperties' (#181888)
When compiling LLVM with LLVM_ENABLE_MODULES=ON, I get the warning
```
warning: missing submodule 'LLVM_IR.FunctionProperties' [-Wincomplete-umbrella]
```
The fix is to add the file `FunctionProperties.def` to the module map.
[X86] combineSETCC - merge inner isScalarInteger() condition. NFC. (#182004)
All folds in the outer if() require this, including combineVectorSizedSetCCEquality.
[flang][OpenMP] Include check for fully unrolled loops into nest check, NFC (#181729)
It's naturally a part of the verification of constructs nested in loop
constructs, so perform that check there instead of having it in a
separate function.