[ObjC] Support emission of selector stub calls instead of objc_msgSend. (#183922)
This optimizes objc_msgSend calls by emitting "selector stubs" instead.
Usually, the linker redirects calls to external symbols to a symbol stub
it generates, which loads the target function's address from the GOT and
branches to it:
<symbol stub for _func:>
adrp x16, _func at GOTPAGE
ldr x16, [x16, _func at GOTPAGEOFF]
br x16
With msgSend selector stubs, we extend that to compute the selector as
well:
<selector stub for "foo":>
adrp x1, <selector ref for "foo">@PAGE
ldr x1, [x1, <selector ref for "foo">@PAGEOFF]
[35 lines not shown]
Make omp.iterator verify more robust and add tests
- Make sure
- step in omp.iterator is not zero
- when step > 0, lo < hi
- when step < 0, lo > hi
- Add negative tests for the above checks
- Add an iterator lowering test to make sure a negative step works
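
The checks listed above can be sketched roughly as follows. This is only an illustration in Python, not the verifier code from this change; the function name `iterator_range_error` and the error strings are hypothetical.

```python
from typing import Optional


def iterator_range_error(lo: int, hi: int, step: int) -> Optional[str]:
    """Return an error message if (lo, hi, step) fails the checks, else None.

    Hypothetical helper: it mirrors the three rules in the list above and
    nothing else (it does not model the real omp.iterator verifier).
    """
    if step == 0:
        return "step must not be zero"
    if step > 0 and not lo < hi:
        return "positive step requires lo < hi"
    if step < 0 and not lo > hi:
        return "negative step requires lo > hi"
    return None
```

For example, `iterator_range_error(10, 0, -2)` passes the checks (a negative step with lo > hi), while `iterator_range_error(0, 10, 0)` is rejected because of the zero step.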
```
// OpenMP 5.2.6
The iterator value set of the iterator are the set of values i_1,...,i_N where:
i_1 = begin
i_j = i_{j-1} + step, for j >= 2
If step > 0:
i_1 <= end
i_N <= end
i_N + step > end
[6 lines not shown]
Use findAllocaInsertPoint when possible and move the affinity packing logic to OpenMPToLLVMIRTranslation
- Move the omp.affinity_list packing logic from OMPIRBuilder to
OpenMPToLLVMIRTranslation so that we have all the omp.affinity_list
allocating logic inside the lambda defined in buildAffinityData
- All the allocation logic for the affinity list now uses
findAllocaInsertPoint when possible (static count)
- `task_affinity_iterator_dynamic_tripcount` in
openmp-iterator.mlir is a regression test added previously for
dynamic tripcount
Fix affinity type, handle unexpected iterator loop body and accumulate affinity entry for one register call
- Generate kmpTaskAffinityInfoTy based on platform and create a helper
in OMPIRBuilder so that we can use it in OpenMPToLLVMIRTranslation and
OMPIRBuilder
- Handle invalid iterator loop body and add unit test
- Accumulate affinity info and emit only one register call per task
construct
- Remove `this->` in member functions
Refactor createIteratorLoop to use OMPIRBuilder utility functions and make end-of-block insertion robust.
- Replace manual splitBasicBlock/branch with splitBB
and redirectTo()
- When insertion point is at BB.end() and the block is terminated, split
before the terminator so the original successor path is preserved
through omp.it.cont
- Add test for unterminated blocks
[mlir][llvmir][OpenMP] Translate affinity clause in task construct to llvmir
Translate affinity entries to LLVMIR by passing affinity information to
createTask (__kmpc_omp_reg_task_with_affinity is created inside PostOutlineCB).
Refactor and support multiple affinity register for a task
- Support multiple affinity register for a task
- Move iterator loop generation logic to OMPIRBuilder
- Extract iterator loop body conversion logic
- Refactor buildAffinityData by hoisting the creation of affinity_list
- IteratorsOp -> IteratorOp
- Add mlir to llvmir test
Implement lowering for omp.iterator in affinity
Create IteratorLoopNestScope for building the nested loops of an iterator.
Take advantage of RAII so that we can have correct exit for each
level of the loop.
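
As a rough analogy for the RAII idea (not the C++ class itself), the sketch below uses Python context managers, which unwind in reverse order of entry, so every loop level gets its exit emitted exactly once. The names `loop_level` and `build_loop_nest` and the recorded event strings are hypothetical.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def loop_level(events, name):
    """Record entry to one loop level and guarantee the matching exit."""
    events.append(f"enter {name}")
    try:
        yield
    finally:
        # Runs when the scope is left, like a destructor in RAII.
        events.append(f"exit {name}")


def build_loop_nest(names):
    """Open one scope per loop level, emit the body, then unwind inner-first."""
    events = []
    with ExitStack() as stack:
        for name in names:
            stack.enter_context(loop_level(events, name))
        events.append("body")
    return events
```

`build_loop_nest(["i", "j"])` records the levels as entered outer-first and exited inner-first, which is the ordering a loop nest builder needs.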
[PowerPC] Refactor immediate operand part 2 (#180289)
Continue with immediate operand refactoring:
* consolidate printU##Imm into a template function resulting in simpler
class def
* separate imm and relocation classes to clearly reflect what they are
[lldb] Add darwin-mte-launcher (#185921)
Add a new tool called `darwin-mte-launcher`. In order to launch a process
under MTE, we need to set a posix_spawn flag. We already support this in
LLDB when launching with the `--memory-tagging` flag (see #162944).
Python's built-in allocator doesn't play nice with MTE and requires
setting PYTHONMALLOC=malloc to use the system allocator. The launcher
takes care of this as well.
[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort
It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.
rdar://170862047
[flang][OpenMP] Loop IVs inside TEAMS are predetermined private in 5.2+
Mark the induction variables of loops in a TEAMS construct as predetermined
private when OpenMP version is 5.2 or later.
[NFC][SPIRV] New test for untested SPIRV backend case (#185686)
[This
line](https://github.com/ambergorzynski/llvm-project/blob/main/llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp#L2362)
in the SPIRV backend is uncovered by the existing test suite (checked
using coverage, and by asserting that no tests in the existing test
suite fail if we insert an `abort()` at this line).
We propose a test that covers this line. We demonstrate the test by
inserting an `abort()` at that line in commit
[#e37e88e](https://github.com/llvm/llvm-project/commit/e37e88e2500bd695d9635322af487c0c40ba3b8a).
Running all tests shows that only our proposed test fails in the
presence of the `abort`. We've removed the `abort` ahead of merging.
This is the only test that fails in the presence of the abort (which
includes our new test): `LLVM.CodeGen/SPIRV/instructions/icmp.ll`
[LLVM] [SeparateConstOffsetFromGEP] Fix sep-const-offset-from-gep invalid assumption (#183402)
`SeparateConstOffsetFromGEP` assumed the index of a GEP was non-negative
(and therefore previous sext/add could be reordered safely) if the GEP
was marked `inbounds`. This can only be assumed if the GEP is working
off of the base address for the object (counter example:
https://alive2.llvm.org/ce/z/FjGgWp).
This fix removes the general assumption of inbounds GEPs and replaces it
with new checks. The transform is valid when:
1. Value tracking shows the index is known non-negative.
2. The GEP is inbounds and the offset from the base ptr is 0.
3. The GEP is inbounds and the offset is within the threshold `(2^(N-1)
- C + 1) * stride`, where N is the bit width of the index, C is a
positive constant in the add, and stride is the type size of the GEP.
4. The GEP is inbounds and the object size is within the threshold
`(2^(N-1) - C + 1) * stride` for positive C or `(2^(N-1) + C) * stride`
for negative C.
[2 lines not shown]
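
To make the arithmetic concrete, here is a small Python sketch (not code from the pass). It shows that sign-extending after an add is not the same as adding after the sign extension once the narrow add wraps, and it evaluates the threshold formulas from condition 4. The helper names `sext` and `object_size_threshold` are hypothetical, and rejecting C == 0 is an assumption made only for this illustration.

```python
def sext(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a signed two's-complement int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def object_size_threshold(n_bits: int, c: int, stride: int) -> int:
    """Evaluate the threshold formulas quoted in condition 4 above."""
    if c > 0:
        return (2 ** (n_bits - 1) - c + 1) * stride
    if c < 0:
        return (2 ** (n_bits - 1) + c) * stride
    raise ValueError("C == 0 is not covered by the formulas (assumption)")


# i8 example: a = 127, C = 1. The narrow add wraps, so the two orders differ.
narrow_then_extend = sext(127 + 1, 8)   # add in i8, then sext
extend_then_add = sext(127, 8) + 1      # sext first, then add in a wider type
```

With these inputs `narrow_then_extend` is -128 and `extend_then_add` is 128, which is why the reordering needs one of the conditions listed above. For a 32-bit index with C = 1 and a stride of 4, the threshold evaluates to 2^33.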
[SandboxVec][DAG][NFC] Remove argument from setScheduled() (#185787)
DGNode::setScheduled() is only used to mark a node as scheduled, not the
reverse. The reverse should only happen with a call to
DGNode::resetScheduleState().
[HLSL] Implement Texture2D::mips[][]
We implement the Texture2D::mips[][] method. We follow the design in DXC.
There is a new member called `mips` with type mips_type. The member will
contain a copy of the handle for the texture.
The type `mips_type` will have a member function `operator[]` that takes
a level, and returns a `mips_slice_type`. The slice will contain the
handle and the level. It also has an operator[] member function that
takes a coordinate. It will do a load from the handle with the level and
coordinate, and return that value.
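
The two-level indexing described above can be modelled in a few lines of Python. This is only a sketch of the shape of the design, not HLSL or the Clang implementation: `FakeHandle`, its `load` method, and the dictionary-backed storage are stand-ins invented for the example.

```python
class FakeHandle:
    """Stand-in for a texture handle: one dict of coord -> value per mip level."""

    def __init__(self, levels):
        self.levels = levels

    def load(self, coord, level):
        return self.levels[level][coord]


class MipsSlice:
    """Holds the handle and a level; indexing by coordinate performs the load."""

    def __init__(self, handle, level):
        self.handle = handle
        self.level = level

    def __getitem__(self, coord):
        return self.handle.load(coord, self.level)


class Mips:
    """Holds a copy of the handle; indexing by level returns a slice."""

    def __init__(self, handle):
        self.handle = handle

    def __getitem__(self, level):
        return MipsSlice(self.handle, level)


class Texture2D:
    def __init__(self, handle):
        self.handle = handle
        self.mips = Mips(handle)


tex = Texture2D(FakeHandle([{(0, 0): 1.0, (1, 0): 2.0}, {(0, 0): 1.5}]))
value = tex.mips[1][0, 0]  # level 1, coordinate (0, 0)
```

The first `[]` only captures the level, and the load happens in the second `[]`, which matches the description of `mips_type` and `mips_slice_type`.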
Assisted-by: Gemini
Revert "[clang][ssaf] Add UnsafeBufferUsage summary extractor for functions (#182941)"
This reverts commit b7512418d2c1f0ba9ae3016024cb503ded7835d1.
There are bots broken by this commit.
[bazel] Add -Wno-vla-cxx-extension to macOS lldb srcs (#185945)
```
lldb/tools/debugserver/source/MacOSX/MachVMMemory.cpp:88:20: warning: variable length arrays in C++ are a Clang extension [-Wvla-cxx-extension]
88 | int dispositions[dispositions_size];
| ^~~~~~~~~~~~~~~~~
```
This code contains Objective-C++, which can only be compiled with Clang anyway.
[X86] Use shift+add/sub for vXi8 splat multiplies (#174110)
Fixes #164200
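
For readers unfamiliar with the idea, the sketch below models it on scalar 8-bit values in Python: a multiply by a constant of the form 2^a + 2^b or 2^a - 2^b can be computed with two shifts and one add or sub. It is not the X86 lowering from this change, and the helpers `decompose` and `mul_by_shifts` are hypothetical.

```python
def decompose(c: int):
    """Return (a, b, sign) with c == (1 << a) + sign * (1 << b), or None."""
    for a in range(8):
        for b in range(8):
            if (1 << a) + (1 << b) == c:
                return a, b, 1
            if (1 << a) - (1 << b) == c:
                return a, b, -1
    return None


def mul_by_shifts(x: int, c: int) -> int:
    """Multiply an 8-bit value by c, using shifts and add/sub when possible."""
    parts = decompose(c)
    if parts is None:
        return (x * c) & 0xFF  # fall back to a plain multiply
    a, b, sign = parts
    return ((x << a) + sign * (x << b)) & 0xFF  # wrap to 8 bits
```

For instance, 10 decomposes as 8 + 2 and 7 as 8 - 1, while 11 has no such two-shift form and takes the fallback path.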
~~I will create a separate PR to the `llvm-test-suite` repo for the
microbenchmark for this change.~~ The benchmark is in
https://github.com/llvm/llvm-test-suite/pull/316
In my experiments on an EC2 `c6i.4xl`, the change gives a small
improvement for the `x86-64`, `x86-64-v2`, and `x86-64-v3` targets. It
regresses performance on `x86-64-v4` (in particular, when the constant
decomposes into two shifts). The performance summary follows:
```
$ ../MicroBenchmarks/libs/benchmark/tools/compare.py benchmarks results-baseline-generic-v1.json results-opt-generic-v1.json |tail -n1
OVERALL_GEOMEAN -0.2846 -0.2846 0 0 0 0
$ ../MicroBenchmarks/libs/benchmark/tools/compare.py benchmarks results-baseline-generic-v2.json results-opt-generic-v2.json |tail -n1
OVERALL_GEOMEAN -0.0907 -0.0907 0 0 0 0
$ ../MicroBenchmarks/libs/benchmark/tools/compare.py benchmarks results-baseline-generic-v3.json results-opt-generic-v3.json |tail -n1
[3 lines not shown]