[RFC][Clang][AMDGPU] Emit only delta target-features to reduce IR bloat
Currently, AMDGPU functions carry a `target-features` attribute populated with every default feature for the target GPU. This is redundant because the backend can derive these defaults from the `target-cpu` attribute via `AMDGPUTargetMachine::getFeatureString()`.
In this PR, for AMDGPU targets only:
- Functions without explicit target attributes no longer emit `target-features`
- Functions with `__attribute__((target(...)))` or `-target-feature` emit only features that differ from the target's defaults (delta)
The backend already handles missing `target-features` correctly by falling back to the TargetMachine's defaults.
A new cc1 flag `-famdgpu-emit-full-target-features` is added to emit full features when needed.
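The delta idea can be sketched as a set difference over feature strings. This is only an illustration, not the code in the PR: `splitFeatures`, `computeFeatureDelta`, and the `+example-feature` entry are hypothetical, and the real implementation may compare features differently.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split a comma-separated feature string such as "+a,-b" into its entries.
std::vector<std::string> splitFeatures(const std::string &Features) {
  std::vector<std::string> Out;
  std::stringstream SS(Features);
  std::string Item;
  while (std::getline(SS, Item, ','))
    if (!Item.empty())
      Out.push_back(Item);
  return Out;
}

// Hypothetical helper: keep only the entries of Requested that are not
// already present in Defaults, preserving their original order.
std::string computeFeatureDelta(const std::string &Defaults,
                                const std::string &Requested) {
  const std::vector<std::string> DefaultList = splitFeatures(Defaults);
  const std::set<std::string> DefaultSet(DefaultList.begin(),
                                         DefaultList.end());
  std::string Delta;
  for (const std::string &F : splitFeatures(Requested)) {
    if (DefaultSet.count(F))
      continue; // Same as the default, so it carries no information.
    if (!Delta.empty())
      Delta += ',';
    Delta += F;
  }
  return Delta;
}
```

With defaults of `+ci-insts,+dl-insts`, a function requesting exactly those features yields an empty delta, which corresponds to emitting no `target-features` attribute at all.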
Example:
Before:
```llvm
attributes #0 = { "target-cpu"="gfx90a" "target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,..." }
```
[13 lines not shown]
[CIR] Upstream support co_return of values from co_await (#173174)
This PR adds support for returning the result of a `co_await` via
`co_return`. A new variable, `__coawait_resume_rval`, is introduced to
store the returned value.
[NFC][win] Use an enum for the cfguard module flag (#176461)
Currently the `cfguard` module flag can be set to 1 (emit tables only,
no checks) or 2 (emit tables and checks).
This change formalizes that definition by moving these values into an
enum, instead of just having them documented in comments.
Split out from #176276
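A minimal sketch of what such an enum could look like. The names `CFGuardMode`, `emitsTables`, and `emitsChecks` are hypothetical; only the values 1 and 2 and their meanings come from the description above.

```cpp
#include <cassert>

// Hypothetical names; the numeric values mirror the module flag values
// described in the commit message.
enum class CFGuardMode : unsigned {
  TablesOnly = 1,      // Emit tables only, no checks.
  TablesAndChecks = 2, // Emit tables and checks.
};

// True when the raw module flag value requests table emission.
constexpr bool emitsTables(unsigned Flag) {
  return Flag == static_cast<unsigned>(CFGuardMode::TablesOnly) ||
         Flag == static_cast<unsigned>(CFGuardMode::TablesAndChecks);
}

// True when the raw module flag value also requests checks.
constexpr bool emitsChecks(unsigned Flag) {
  return Flag == static_cast<unsigned>(CFGuardMode::TablesAndChecks);
}
```

Naming the values lets callers compare against `CFGuardMode::TablesAndChecks` rather than a bare `2`.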
[CIR] Upstream support for calling through method pointers (#176063)
This adds support to CIR for calling functions through pointers to
methods with the Itanium ABI for x86_64 targets. The ARM-specific
handling of method pointers is not yet implemented.
Create a poor-developer's msan for libc wide read functions. (#170586)
Most libcs optimize functions like strlen by reading in chunks larger
than a single character. As part of "the implementation", they can
legally do this as long as they are careful not to read invalid memory.
However, such tricks prevent those functions from being tested under
the various sanitizers.
This PR creates a test framework that can report when one of these
functions reads or writes in an invalid way without using the sanitizers.
[llvm][Support] Move llvm::createStringErrorV to a new ErrorExtras.h header (#176491)
Introducing `llvm::createStringErrorV` caused a `0.5%` compile-time
regression because it's an inline function in a core header. This moves
the API to a new header to prevent including this function in files that
don't need it.
Also includes the header in the source files that have been using
`createStringErrorV` (which currently is just LLDB).
[flang][cuda] Emit better error when subprogram attribute is absent or bad (#176501)
This patch updates the parser for the CUDA Fortran subprogram attribute
to emit a more precise error.
Instead of an error like:
```
error: expected 'END'
attributes(managed) integer function fooj()
^
```
The parser will emit:
```
expected DEVICE, GLOBAL, GRID_GLOBAL, or HOST attribute
attributes(managed) integer function fooj()
^
```
[X86] Separate sibcall checks from guaranteed TCO (#176479)
Rename IsEligibleForTailCallOptimization to isEligibleForSiblingCallOpt.
LLVM supports two other ways to bypass this logic: musttail and
ShouldGuaranteeTCO. The result of this function doesn't really control
tail call eligibility, and returning false from it is not sufficient to
block tail call emission. Rename it to clarify the code.
Move the calling convention match check, which is the only thing that
matters in the guaranteed TCO case, out of this sibcall eligibility
check.
Move the GOT early binding check into the sibcall eligibility check,
since it is bypassed in either guaranteed TCO case. When that [diff
landed](https://reviews.llvm.org/D9799), it did not have exceptions for
`musttail`, but later in 9ff2eb1ea596a the two guaranteed tail call
cases were made to override this check, forcing lazy binding, which I
agree is the right tradeoff.
[3 lines not shown]
[AArch64] Fix Windows prologue handling to pair more registers. (#170214)
Currently, there's code to suppress pairing, but we don't actually need
to suppress that; we just need to suppress the formation of
pre-decrement/post-increment instructions.
Pairing saves an instruction in some cases, and enables packed unwind in
some cases.
[lldb] Support both RISCV-32 and RISCV-64 in GetRegisterInfo (#176472)
`GetRegisterInfo` hardcodes `RegisterInfoPOSIX_riscv64` instead of
checking the triple to determine whether to use
`RegisterInfoPOSIX_riscv64` or `RegisterInfoPOSIX_riscv32`.
Someone put up a [PR](https://github.com/llvm/llvm-project/pull/175262)
for this, but seems to have removed their account and the associated PR
with it.
Fixes #175092
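The triple-based selection might look roughly like the sketch below. It is not the LLDB code: `RegisterInfoKind` and `selectRegisterInfo` are hypothetical stand-ins, and the triple is parsed as a plain string rather than through LLVM's triple class.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the choice between the two register info
// classes named in the commit message.
enum class RegisterInfoKind { RISCV32, RISCV64, Unsupported };

// Pick the register info flavour from the architecture component of a
// target triple such as "riscv32-unknown-linux-gnu".
RegisterInfoKind selectRegisterInfo(const std::string &Triple) {
  // The architecture is everything before the first '-'. If there is no
  // '-', find() returns npos and substr() yields the whole string.
  const std::string Arch = Triple.substr(0, Triple.find('-'));
  if (Arch == "riscv32")
    return RegisterInfoKind::RISCV32;
  if (Arch == "riscv64")
    return RegisterInfoKind::RISCV64;
  return RegisterInfoKind::Unsupported;
}
```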
CMake cache file for building Pico SDK toolchain (#113267)
This cache file demonstrates how to build a complete baremetal
Clang/LLVM toolchain that can be used to build the Pico SDK.
[lldb] Fix llvm_unreachable for invalid Wasm address (#176464)
We had an llvm_unreachable following a switch on the WasmAddress's type.
However, the type is encoded in a larger 64-bit address, and therefore
it's possible to create an invalid value that doesn't map back to one of
the enum values.
We could try to diagnose that in the wrapper, or treat all invalid types
the same. I took the latter approach because it makes it easier to show
the invalid type after the fact in an error message.
rdar://168314695
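A rough sketch of the "treat all invalid types the same" approach. The bit layout (type tag in the top two bits) and the names `AddrType` and `describeAddressType` are assumptions for illustration, not taken from the LLDB sources.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumed for illustration: the type tag lives in the top 2 bits of the
// 64-bit address, and only two tag values are valid.
enum class AddrType : uint8_t { Memory = 0, Object = 1 };

// Decode the tag and fall through to a shared "invalid" path instead of
// asserting that every tag maps to an enumerator.
std::string describeAddressType(uint64_t Addr) {
  const uint8_t Tag = static_cast<uint8_t>(Addr >> 62);
  switch (Tag) {
  case static_cast<uint8_t>(AddrType::Memory):
    return "memory";
  case static_cast<uint8_t>(AddrType::Object):
    return "object";
  default:
    // Keep the raw tag so it can be shown in an error message later.
    return "invalid type " + std::to_string(static_cast<unsigned>(Tag));
  }
}
```

Because the default case keeps the raw tag, a caller can still report which invalid type it saw.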
[clang][Wunsafe-buffer-usage] Add -Wunsafe-buffer-usage-in-static-sized-array (#176466)
This PR adds support for toggling warnings about static-sized arrays on
and off. This addresses
https://github.com/llvm/llvm-project/issues/87284, for those who use
-fsanitize=array-bounds, which already inserts checks for fixed-size
arrays.