[SelectionDAG] Scalarize <1 x T> vector types for atomic store
`store atomic <1 x T>` is not valid. This change legalizes
vector types of atomic store via scalarization in SelectionDAG
so that it can, for example, translate from `v1i32` to `i32`.
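The type mapping described above can be pictured with a toy model. This is only a sketch: `SimpleVT`, `toString`, and `scalarizeIfSingleElement` are hypothetical names, not SelectionDAG APIs, and only integer element types are modeled.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a value type; NumElts == 0 denotes a scalar.
struct SimpleVT {
  unsigned NumElts;
  unsigned EltBits;
};

// Render the type in the v<N>i<bits> / i<bits> spelling used in the message.
std::string toString(SimpleVT VT) {
  std::string Elt = "i" + std::to_string(VT.EltBits);
  return VT.NumElts == 0 ? Elt : "v" + std::to_string(VT.NumElts) + Elt;
}

// A single-element vector is replaced by its element type; every other
// type is returned unchanged in this simplified model.
SimpleVT scalarizeIfSingleElement(SimpleVT VT) {
  if (VT.NumElts == 1)
    return SimpleVT{0, VT.EltBits};
  return VT;
}
```

For example, `toString(scalarizeIfSingleElement(SimpleVT{1, 32}))` yields `i32`, mirroring the `v1i32` to `i32` case.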
[X86] Add atomic vector store tests for unaligned >1 sizes.
Unaligned atomic vector stores with size >1 are lowered to calls.
Add their tests separately here.
[RISCV] Prefer SP over FP for frame index access when offset fits within compressed immediate range. (#193962)
Before this change, we would use fp/s0/x8 for most stack accesses when
frame pointers were present. This is an over-approximation when a
stack slot is reachable from both SP and FP with no scalable offset.
This patch replaces the unconditional getFrameRegister() call in
getFrameIndexReference with an explicit register selection decision
tree.
When both SP and FP are available (no stack realignment, no RVV objects,
no variable-sized objects), prefer SP if the SP-relative offset fits in
the compressed instruction immediate range (<=252 for RV32, <=504 for RV64).
This lets SP-relative loads and stores be compressed to c.swsp/c.lwsp
(RV32) and c.sdsp/c.ldsp (RV64), reducing code size.
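The selection rule described above can be sketched as a small standalone function. This is not the code in getFrameIndexReference: `FrameInfo`, `BaseReg`, and `selectBaseReg` are hypothetical names, and both the non-negative offset check and the "no frame pointer means SP" fallback are simplifying assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the frame properties named in the commit message.
struct FrameInfo {
  bool HasFP;
  bool NeedsStackRealignment;
  bool HasRVVObjects;
  bool HasVarSizedObjects;
};

enum class BaseReg { SP, FP };

// Upper bounds quoted in the commit message: 252 for RV32, 504 for RV64.
constexpr int64_t maxCompressedSPOffset(bool IsRV64) {
  return IsRV64 ? 504 : 252;
}

BaseReg selectBaseReg(const FrameInfo &FI, int64_t SPOffset, bool IsRV64) {
  // Assumption: without a frame pointer, SP is the only candidate here.
  if (!FI.HasFP)
    return BaseReg::SP;
  // Both registers can reach the slot only in the simple-frame case.
  bool BothUsable = !FI.NeedsStackRealignment && !FI.HasRVVObjects &&
                    !FI.HasVarSizedObjects;
  // Prefer SP when the offset fits the compressed immediate range.
  if (BothUsable && SPOffset >= 0 &&
      SPOffset <= maxCompressedSPOffset(IsRV64))
    return BaseReg::SP;
  // Otherwise keep the previous behavior and use the frame pointer.
  return BaseReg::FP;
}
```

With a simple frame, an SP offset of 252 selects SP on RV32, while 256 falls back to FP; any frame with RVV objects stays on FP regardless of the offset.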
[mlir][complex] Emit complex.mul multiplications before add/sub lowering (#196231)
This changes the lowering of complex.mul in both ComplexToStandard and
ComplexToLLVM to emit the four independent multiplications before
creating the final add/sub operations.
The lowered computation is unchanged:
real = lhs.real * rhs.real - lhs.imag * rhs.imag
imag = lhs.imag * rhs.real + lhs.real * rhs.imag
but the generated operation order changes from:
mul, mul, sub, mul, mul, add
to:
[6 lines not shown]
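As a rough scalar illustration of "four independent multiplications before the final add/sub", the same formula can be written with the products computed first. This is a plain C++ sketch, not the MLIR lowering itself; `Complex` and `mulProductsFirst` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical minimal complex value; not an MLIR type.
struct Complex {
  double Re;
  double Im;
};

Complex mulProductsFirst(Complex Lhs, Complex Rhs) {
  // The four products are independent of each other, so they come first.
  double RealReal = Lhs.Re * Rhs.Re;
  double ImagImag = Lhs.Im * Rhs.Im;
  double ImagReal = Lhs.Im * Rhs.Re;
  double RealImag = Lhs.Re * Rhs.Im;
  // Only then are they combined:
  //   real = lhs.real * rhs.real - lhs.imag * rhs.imag
  //   imag = lhs.imag * rhs.real + lhs.real * rhs.imag
  return Complex{RealReal - ImagImag, ImagReal + RealImag};
}
```

For (1 + 2i) * (3 + 4i) this gives a real part of -5 and an imaginary part of 10, the same result as the interleaved ordering.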
[clang] make concept normalization a SFINAE context
It is incorrect to allow the substitution failures in concept normalization
to make the program ill-formed.
These can happen when comparing the equivalence of constraints for
redeclaration checking, and a failure here only means these declarations
are not equivalent.
For now, these diagnostics are simply discarded. It would be nice
if some day, as part of diagnostics for non-matching out-of-line definitions,
we would explain why matching failed for each candidate, and then add these
as notes explaining why the constraints were not equivalent.
[CIR][AMDGPU] Add lowering for amdgcn readlane readfirstlane builtins (#197153)
Upstreaming clangIR PR: https://github.com/llvm/clangir/pull/2053
This PR adds support for lowering the "__builtin_amdgcn_readlane" and
"__builtin_amdgcn_readfirstlane" AMDGPU builtins in ClangIR.
[LifetimeSafety] Diagnose invalidated-global (#197220)
Teach lifetime safety invalidation diagnostics to handle origins that
escape through global or static storage before the referenced object is
invalidated. Previously they were skipped.
Follow up of #196680
Closes https://github.com/llvm/llvm-project/issues/195706
[RISCV] Make SFMM configuration instruction emit like VSETVLI (#196487)
Reuse the PseudoVSETVLI condition instead of its own condition.
---------
Co-authored-by: Luke Lau <luke_lau at icloud.com>
Revert "[DirectX][ObjectYAML] Add ILDN part support" (#197348)
Reverts llvm/llvm-project#194508 due to
1. Compilation error on older cl.exe versions due to having a field
"DebugName" as a member of class "DebugName".
2. Layout violation between MC and Object
(see https://github.com/llvm/llvm-project/pull/197343).
[clang] Align x86 CR/DR intrinsic declarations with MSVC (#196886)
Align CR/DR and related MSR intrinsic declarations in intrin.h with
MSVC's x86/x64 signatures.
Fixes #185457
[libclc] Apply hidden visibility to amdgpu / nvptx builds (#197235)
Summary:
This is not currently used because we force `--internalize` for
llvm-link, but linking the library normally would require hidden
visibility. SPIR-V does not currently handle hidden visibility, as the
relevant extension is still under discussion, so it is omitted for now.
[libclc] Create a static `.a` file in addition to the `.bc` file (#197247)
Summary:
This changes the libraries to be object libraries instead of static
libraries. We can then just link these into final static libraries that
contain everything we need.
The motivation is that we would need static libraries if we wanted to move
away from `-mlink-builtin-bitcode` approaches.
Effectively we'll have `libclc.a` next to `libclc.bc` and the idea is
that we could alternatively link it and let the target linker handle it.
[MIR] Save internal VirtRegMap state in MIR
Adds two optional fields to the per-vreg YAML record so MIR tests can
express VirtRegMap state that previously had no representation:
  registers:
    - { id: 1, class: vgpr_32, split-from: '%0', assigned-phys: '$vgpr5' }
Testing passes that consume sibling-register information (e.g.
InlineSpiller) requires constructing a VirtRegMap with split
relationships from a MIR test, which at minimum means triggering
live-range splitting and makes reproducers unnecessarily complicated.
So this change introduces a mechanism to serialize/deserialize the state
of the VirtRegMap pass.
Mechanism:
- For serialization:
- MIRPrinter emits the new fields only when the new -mir-emit-vrm
[16 lines not shown]
[AMDGPU][test] Use mir test for regalloc issue
Use the newly introduced split-from field to produce a more robust test case
for the hoistSpillInsideBB live-range update issue.
NFC
[MIR] Serialize/Deserialize MachineInstr::LRSplit attribute
The LRSplit MachineInstr flag is set by SplitKit on copies inserted for
live-range splitting.
Until now the flag had no MIR-text representation.
This patch adds one, making it easier to reproduce and capture issues
that involve SplitKit.
Round-trip coverage in
llvm/test/CodeGen/MIR/AMDGPU/lr-split-flag.mir.