[CodeGen] Fix emission of function pointer casts with non-zero program AS (#186210)
Imagine we have the following code:
```c++
void foo() {}
void bar() {
  void *ptr = reinterpret_cast<void*>(foo);
}
```
Usually clang would treat this as a simple `bitcast`, but in the case
that the target has a non-default program address space, this needs to
be an `addrspacecast`.
Today, if we try to codegen this, we get an assert due to the two types
not being valid for a `bitcast`.
[15 lines not shown]
[lldb][Module][NFC] Use raw string literal and formatv-style format in LoadScriptingResourceInTarget (#186411)
Makes it obvious what the warning will look like (with the indentation
etc.). Also adds a test since we had no coverage for the warning before
(as far as I'm aware).
[lldb][PlatformDarwin] Disallow '+' in auto-loadable Python script names (#186346)
The `ScriptInterpreterPython` will refuse to load script names that
contain `+`. This patch makes `SanitizedScriptingModuleName` handle this
by replacing it with `x`. That might seem a bit arbitrary but the way
the current dSYM script loading (and the future "auto-load") mechanism
works is that it will look for scripts called `<lldb-module-name>.py`.
So for something like `libc++.1.dylib`, we would look for `libc++.1.py`.
Replacing `+` with `_` like we do for other special characters would
look strange in my opinion. The simplest way of working around this is
to recommend renaming the script to `libcxx_1.py`.
An alternative to the whole "replace special characters" logic is to
have a MANIFEST file which advertises the script name that LLDB should
load. During reading that script we could bail if we saw special
characters. But I haven't thought that through fully. And since the
`llvm::replace` approach is the path of least resistance, I went with it
for now.
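As a rough illustration of the renaming rule described above, here is a minimal sketch. `SanitizeScriptNameSketch` is a hypothetical stand-in, not LLDB's `SanitizedScriptingModuleName`, and the set of characters that get mapped to `_` is an assumption made for the example.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for the sanitizing step; LLDB's real implementation
// may treat a different set of characters as special.
std::string SanitizeScriptNameSketch(std::string name) {
  for (char &c : name) {
    if (c == '+')
      c = 'x'; // the special case described in this commit
    else if (!std::isalnum(static_cast<unsigned char>(c)) && c != '_')
      c = '_'; // assumed rule for the other special characters
  }
  return name;
}
```

Under these assumptions the module name `libc++.1` maps to `libcxx_1`, which matches the script name recommended above.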
[BOLT] Gadget scanner: implement finer-grained --scanners=... argument (#176135)
Add separate options to enable each of the available gadget detectors.
Furthermore, add two meta-options enabling all PtrAuth scanners and all
available scanners of any type (which is only PtrAuth for now, though).
This commit renames the `pacret` option to `ptrauth-pac-ret` and `pauth`
to `ptrauth-all`.
[VPlan] Prevent uses of materialized VPSymbolicValues. (NFC) (#182318)
After VPSymbolicValues (like VF and VFxUF) are materialized via
replaceAllUsesWith, they should not be accessed again. This patch:
1. Tracks materialization state in VPSymbolicValue.
2. Asserts if the materialized VPValue is used again. Currently it
   adds asserts to various member functions, preventing calling them
   on materialized symbolic values.
Note that this still allows some uses (e.g. comparing VPSymbolicValue
references or pointers), but this should be relatively harmless given
that it is impossible to (re-)add any users. If we want to further
tighten the checks, we could add asserts to the accessors or override
operator&, but that would require more changes and would not add many
extra guards, I think.
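The idea of tracking materialization and asserting on later use can be sketched with a small model. `SymbolicValueSketch` and its members are hypothetical and heavily simplified; they are not the real VPlan classes, and the real patch may guard a different set of member functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a symbolic value with a user list.
class SymbolicValueSketch {
  std::vector<int> Users;    // stand-in for the list of users
  bool Materialized = false; // set once the value has been replaced

public:
  void addUser(int U) {
    assert(!Materialized && "adding a user to a materialized value");
    Users.push_back(U);
  }

  std::size_t getNumUsers() const {
    assert(!Materialized && "querying a materialized value");
    return Users.size();
  }

  // Moves all users over to New and marks this value as materialized.
  void replaceAllUsesWith(SymbolicValueSketch &New) {
    assert(!Materialized && "value was already materialized");
    assert(&New != this && "cannot replace a value with itself");
    for (int U : Users)
      New.addUser(U);
    Users.clear();
    Materialized = true;
  }

  // Deliberately unguarded so callers can still check the state.
  bool isMaterialized() const { return Materialized; }
};
```

In this model, any guarded member function called after `replaceAllUsesWith` trips an assertion in builds with assertions enabled, while plain reference or pointer comparisons remain possible.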
Depends on https://github.com/llvm/llvm-project/pull/182146 to fix a
[2 lines not shown]
[lldb][Module][NFC] Use early-return style in LoadScriptingResourceInTarget (#186392)
Planning on adding more to this function/loop soon. Making it
early-return style (as suggested by the LLVM style guide) makes those
changes easier to reason about.
Drive-by:
* Reduced the indentation of the loop by doing an early-continue if the
`FileSpec` is invalid or doesn't exist
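For readers unfamiliar with the style, the early-continue shape looks roughly like this. `FileSpecSketch` and `CountLoadableSketch` are hypothetical names used only for illustration; they are not lldb's `FileSpec` or the function touched by this commit.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a file spec.
struct FileSpecSketch {
  std::string path;
  bool exists;
  bool isValid() const { return !path.empty(); }
};

// Counts the specs that would be processed, skipping unusable ones up front.
int CountLoadableSketch(const std::vector<FileSpecSketch> &specs) {
  int loaded = 0;
  for (const FileSpecSketch &spec : specs) {
    if (!spec.isValid() || !spec.exists)
      continue; // early-continue keeps the main path at one indentation level
    ++loaded;   // the real per-file work would go here
  }
  return loaded;
}
```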
[AArch64] Improve pow(x,y) cost model for some constant values of y (#185607)
Some optimisations of pow(x, y) calls only occur during codegen,
e.g. pow(x, 0.25) -> sqrt(sqrt(x)) and at the IR level we don't
currently reflect this in the cost of calls to the llvm.pow
intrinsic. This patch attempts to fix that in cases where we know
the intrinsic can in general be legally lowered to libcalls. For
scalable vector variants of llvm.pow we need to be cautious, since
without a math library this cannot be scalarised and there is
always a small risk that the optimisation will not happen during
codegen.
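The pow(x, 0.25) rewrite mentioned above relies on the identity x^(1/4) = sqrt(sqrt(x)). A minimal numeric sketch, assuming non-negative finite inputs and ignoring the edge cases (negative zero, infinities) that a real lowering has to respect; `PowQuarterViaSqrt` is a hypothetical name:

```cpp
#include <cmath>

// pow(x, 0.25) expressed as two square roots; intended for x >= 0 only.
double PowQuarterViaSqrt(double x) { return std::sqrt(std::sqrt(x)); }
```

Two square roots are typically much cheaper than a general `pow` libcall, which is the kind of difference the cost model is meant to reflect.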
[IVDescriptors] Remove single-use constraint from FindLast comparisons (#186096)
Just relaxing some minor constraints for FindLast recurrence detection.
[AArch64][SVE2] Allow commuting two-input NBSL/BSL2N idioms. (#184847)
Specifically, EON, NAND and NOR are commutable operations that lack
dedicated SVE2 instructions, but we support them via NBSL/BSL2N.
However, as NBSL/BSL2N have tied operands, sometimes we generate a COPY
even if one of the operands could be clobbered.
This patch defines custom expansion for these operations to allow using
their commuted forms or, if still necessary, using MOVPRFX for the COPY.
Should help with
https://github.com/llvm/llvm-project/pull/176194#discussion_r2889564685.
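To see why these operations can be expressed through bitwise select and why commuting is safe, here is a scalar model. It assumes the usual reading of the instructions (NBSL is an inverted select, BSL2N is a select with the second input inverted); the helper names and the specific operand assignments are illustrative and not necessarily the ones the backend emits.

```cpp
#include <cstdint>

// Scalar models of the SVE2 bitwise-select variants (assumed semantics:
// each bit of k picks between the first and the second input).
uint64_t Nbsl(uint64_t a, uint64_t b, uint64_t k) { return ~((a & k) | (b & ~k)); }
uint64_t Bsl2n(uint64_t a, uint64_t b, uint64_t k) { return (a & k) | (~b & ~k); }

// Two-input idioms built on top of the models above.
uint64_t NandViaNbsl(uint64_t x, uint64_t y) { return Nbsl(x, y, y); }
uint64_t NorViaNbsl(uint64_t x, uint64_t y) { return Nbsl(x, y, x); }
uint64_t EonViaBsl2n(uint64_t x, uint64_t y) { return Bsl2n(x, x, y); }
```

Because each idiom is symmetric in `x` and `y`, either operand can take the tied (destructive) slot, which is what makes the commuted forms usable.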
[gn] port b80248a0ea35df more (clang-doc md templates) (#186401)
The previous version misspelled the name of comments-partial.mustache,
and it put the md files in the wrong output directory.
[libc] Add support for chown on platforms that don't define SYS_chown (#186167)
Some platforms don't define SYS_chown (like RISC-V), so this PR adds a
fallback to calling SYS_fchownat.
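The fallback can be sketched as follows on Linux. `ChownSketch` is a hypothetical wrapper written for this example; it is not the LLVM libc implementation and goes through the generic `syscall` function rather than libc-internal helpers.

```cpp
#include <fcntl.h>       // AT_FDCWD
#include <sys/syscall.h> // SYS_* syscall numbers
#include <sys/types.h>   // uid_t, gid_t
#include <unistd.h>      // syscall

// Hypothetical wrapper: use SYS_chown when available, else SYS_fchownat.
int ChownSketch(const char *path, uid_t owner, gid_t group) {
#ifdef SYS_chown
  return static_cast<int>(syscall(SYS_chown, path, owner, group));
#else
  // AT_FDCWD resolves relative paths against the current directory, and
  // flags == 0 follows symlinks, which matches chown's behaviour.
  return static_cast<int>(
      syscall(SYS_fchownat, AT_FDCWD, path, owner, group, 0));
#endif
}
```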
[Offload][L0] clear completed events from a wait list (#186379)
The queue's WaitEvent collection wasn't being cleared after synchronization
and resetting of the events. This led to hangs on subsequent host
synchronizations if not preceded by any other operation.
[MIR] Support symbolic inline asm operands (#185893)
Support parsing and printing inline assembly operands in MIR using the
symbolic form instead of numeric register class IDs, thus removing the
need to update tests when the numbers change.
The numeric form remains supported.
---------
Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>
[libc++] Make the associative container query benchmarks more representative (#183036)
Currently the query benchmarks are training the branch predictor
incredibly well, which isn't representative of the real world. This
change causes the branch misses to go from <1% to ~50% with the current
implementation of `__tree::__find_end`.
This patch also removes the `non-existent` benchmarks, since it'd be
non-trivial to write a representative benchmark for that case, and the
benchmark would be relatively low value. We're already searching to leaf
nodes ~50% of the time (since half the nodes are leaves) with the
current benchmark. So we'd only additionally cover a relatively trivial
failure branch that is only taken once per function call. The loop is
already covered through benchmarking with keys existing in the
container.
[mlir][tosa] Allow integer gather/scatter ops in fp profile (#183342)
This commit updates profile compliance to allow integer gather and
scatter operations to be used with the floating point profile. This
update aligns with the specification change:
https://github.com/arm/tosa-specification/pull/35.
[CIR] Implement zero-init-bases lowering (#186230)
This showed up in a test suite. A zero-initializer for a whole struct
seems completely sensible, as long as the type is zero-initializable.
This patch doesn't change the non-zero-init behavior (I am working on a
patch to do so, but it is a massive scope), so this is limited to JUST
classes with bases.
[VectorCombine] Fix crash in foldShuffleOfSelects for single-element shuffle result (#185713)
In foldShuffleOfSelects, if the shuffle result has a single element, the
resulting type may be scalar rather than a vector. The later code in
foldShuffleOfSelects assumes the result is a vector and performs
`cast<FixedVectorType>`, which triggers an assertion.
Fixes #183625
[AMDGPU] Pass MF into the SIInsertWaitcnts constructor. NFC. (#186369)
Pass MF into the SIInsertWaitcnts constructor instead of the run method.
This is more natural now that SIInsertWaitcnts is constructed once per
MachineFunction and enables future cleanup by initializing more fields
in the constructor that depend on MF.