[Clang][AMDGPU] Add a Sema check for the imm argument of `__builtin_amdgcn_s_setreg`
Our backend cannot select the corresponding intrinsic if the imm argument is not an `int16_t` or `uint16_t` value, and the resulting selection failure is not really helpful.
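The range involved can be sketched as below. This is a minimal standalone illustration, not the actual Sema code: the helper name `fitsInImm16` is hypothetical, and it assumes the accepted values are the union of the `int16_t` and `uint16_t` ranges.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Clang): returns true if Value is
// representable as either an int16_t or a uint16_t.
bool fitsInImm16(std::int64_t Value) {
  return Value >= std::numeric_limits<std::int16_t>::min() &&
         Value <= std::numeric_limits<std::uint16_t>::max();
}
```

Under that assumption, any constant in [-32768, 65535] passes and anything outside it would be diagnosed in the frontend rather than failing during instruction selection.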
GlobalISel: Use LibcallLowering to get libcall calling conventions
0e304e6d9f306ead81fc5177b8a497af0d416a73 converted the name queries,
but missed some of the calling conventions.
CodeGen: Use LibcallLoweringInfo for stack protector insertion
Thread LibcallLoweringInfo into the TargetLowering hooks used
by the stack protector passes.
[Clang][AMDGPU] Handle `wavefrontsize32` and `wavefrontsize64` features more robustly
We should also not allow `-wavefrontsize32` and `-wavefrontsize64` to be specified at the same time.
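A rough sketch of the kind of consistency check this implies is shown below. It is not Clang's implementation: the helper name `hasWavefrontSizeConflict` is hypothetical, the rejection of enabling both sizes at once is an assumption added for symmetry, and the sketch ignores any ordering or last-one-wins semantics of the feature list.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not Clang's code): reports whether the feature list
// enables both wavefront sizes or disables both of them.
bool hasWavefrontSizeConflict(const std::vector<std::string> &Features) {
  bool Plus32 = false, Plus64 = false, Minus32 = false, Minus64 = false;
  for (const std::string &F : Features) {
    if (F == "+wavefrontsize32")
      Plus32 = true;
    else if (F == "+wavefrontsize64")
      Plus64 = true;
    else if (F == "-wavefrontsize32")
      Minus32 = true;
    else if (F == "-wavefrontsize64")
      Minus64 = true;
  }
  // Both enabled (assumed to be invalid) or both disabled (per the text above).
  return (Plus32 && Plus64) || (Minus32 && Minus64);
}
```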
[RFC][Clang][AMDGPU] Emit only delta target-features to reduce IR bloat
Currently, AMDGPU functions have the `target-features` attribute populated with all default features for the target GPU. This is redundant because the backend can derive these defaults from the `target-cpu` attribute via `AMDGPUTargetMachine::getFeatureString()`.
In this PR, for AMDGPU targets only:
- Functions without explicit target attributes no longer emit `target-features`
- Functions with `__attribute__((target(...)))` or `-target-feature` emit only features that differ from the target's defaults (delta)
The backend already handles missing `target-features` correctly by falling back to the TargetMachine's defaults.
A new cc1 flag `-famdgpu-emit-full-target-features` is added to emit full features when needed.
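The delta computation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `computeDeltaFeatures` is a hypothetical name, and features are modeled as plain `+name`/`-name` strings compared against a set of target defaults.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not the PR's code): keep only the features that do
// not already appear in the target's default feature set.
std::vector<std::string>
computeDeltaFeatures(const std::vector<std::string> &FunctionFeatures,
                     const std::set<std::string> &DefaultFeatures) {
  std::vector<std::string> Delta;
  for (const std::string &F : FunctionFeatures)
    if (DefaultFeatures.count(F) == 0)
      Delta.push_back(F);
  return Delta;
}
```

With this model, a function whose features all match the defaults yields an empty delta, which corresponds to emitting no `target-features` attribute at all.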
Example:
Before:
```llvm
attributes #0 = { "target-cpu"="gfx90a" "target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot1-insts,+dot2-insts,..." }
```
[13 lines not shown]
pfctl(8): change default limiter action from no-match to block
pf(4) users who use limiters in current should update the rules
accordingly to reflect the change in default behavior. The existing
rule which reads as follows:
    pass in from any to any state limiter test
needs to be changed to:
    pass in from any to any state limiter test (no-match)
OK dlg@
Obtained from: OpenBSD, sashan <sashan at openbsd.org>, c600931321
Sponsored by: Rubicon Communications, LLC ("Netgate")
[AMDGPU][NFC] Pre-commit tests for f16<->i32 conversions (#176630)
This patch adds tests showing current AMDGPU codegen for f16 to i32 and
i32 to f16 conversions.
A follow-up PR that modifies SOPInstructions.td will change the generated
code, and these tests will be updated there.
[libc++abi] Remove old workaround for detecting libunwind (#176009)
This workaround should no longer be relevant since we don't support
building against libunwind headers from LLVM 14.
[MemCpyOpt] support offset slices for performStackMoveOptzn and processMemCpy (#176436)
In particular, this supports an offset into src only. Supporting an offset
into dest will be a follow-up change, once the copy is allowed to cover
less than the full size of dest.
Extracted from https://github.com/llvm/llvm-project/pull/150792
GlobalISel: Use LibcallLoweringInfo in IRTranslator for real
f24eafa6556b8899b6ec6ccde95d97d4620d3fd1 changed this to use
the newer APIs using LibcallImpl, but didn't query from the
analysis and used the old copy from TargetLowering.