[AArch64] Use dup (lane mov) over ext for high-half extract (#195010)
This changes the instruction we use to extract the high half of a vector
register from an `ext v0, v1, v1, 8` to a `dup d0, v1.d[1]`. This is
apparently slightly quicker on certain CPUs and is generally a simpler
instruction. It also matches the instruction that GlobalISel produces.
Some of the old patterns for `extract_subvector` with an index of 1 seem
incorrect, but they were never used because we do not reach selection with
such instructions. They have been repurposed to emit the new DUPi64
instructions.
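As a rough illustration of why the two forms are interchangeable here, the following is a portable byte-level model, not the actual backend patterns. The names `Vec128`, `ext_model`, `dup_d_model`, and `low_half` are hypothetical helpers invented for this sketch, and it assumes a little-endian lane layout.

```c++
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical 128-bit register model: 16 bytes, byte 0 is the lowest.
using Vec128 = std::array<std::uint8_t, 16>;

// Models `ext vd, vn, vm, #idx`: take 16 consecutive bytes from the
// concatenation of vn (low) and vm (high), starting at byte idx.
Vec128 ext_model(const Vec128& vn, const Vec128& vm, unsigned idx) {
    Vec128 out{};
    for (unsigned i = 0; i < 16; ++i) {
        unsigned src = idx + i;
        out[i] = src < 16 ? vn[src] : vm[src - 16];
    }
    return out;
}

// Models the scalar `dup dd, vn.d[lane]`: read one 64-bit lane.
std::uint64_t dup_d_model(const Vec128& vn, unsigned lane) {
    std::uint64_t value;
    std::memcpy(&value, vn.data() + 8 * lane, sizeof(value));
    return value;
}

// Reads the low 64 bits of a modelled register (what a d-register use sees).
std::uint64_t low_half(const Vec128& v) {
    std::uint64_t value;
    std::memcpy(&value, v.data(), sizeof(value));
    return value;
}
```

In this model, `low_half(ext_model(v, v, 8))` and `dup_d_model(v, 1)` read the same eight bytes, which is the property the instruction swap relies on.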
[clang][bytecode] Visit `tryEvaluateObjectSize` expr as lvalue (#196010)
Just like we do with the first parameter of a regular
`__builtin_object_size` call.
This still doesn't fix the larger `__builtin_object_size` test cases. For example,
```c++
int NoViableOverloadObjectSize3(void *const p PS(3))
    __attribute__((overloadable)) {
  return __builtin_object_size(p, 3);
}
void test4(struct Foo *t) {
  gi = NoViableOverloadObjectSize3(&t[1].t[1]);
}
```
is still broken because we don't have special handling for the
`&t[1].t[1]` case here and we can't usually access a one-past-end
pointer.
[lldb] Fix TestDelayedBreakpoint on ARM Thumb (#196888)
The original address used for the "fake breakpoint" is not valid in
Thumb mode. To be safe, change it to have zeros in the LSBs.
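A minimal sketch of what "zeros in the LSBs" means for an address, assuming only generic bit masking. The helper names `clear_low_bits` and `is_aligned` and the sample values are hypothetical; they are not taken from the test or from lldb.

```c++
#include <cstdint>

// Clears the lowest `bits` bits of an address (assumes bits < 64).
constexpr std::uint64_t clear_low_bits(std::uint64_t addr, unsigned bits) {
    return addr & ~((std::uint64_t{1} << bits) - 1);
}

// True when addr is a multiple of `alignment` (assumes a power of two).
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t alignment) {
    return (addr & (alignment - 1)) == 0;
}
```

An address whose low two bits are clear is aligned to both 2 and 4 bytes, so it is a plausible instruction address whether the target treats it as ARM or as Thumb code.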
[CIR][AMDGPU] Add lowering for amdgcn ds swizzle builtin. (#196011)
Upstreaming clangIR PR: https://github.com/llvm/clangir/pull/2052
This PR adds support for lowering the `__builtin_amdgcn_ds_swizzle` AMDGPU
builtin to ClangIR.