[Clang] Remove 't' from __builtin_amdgcn_flat_atomic_fadd_f32/f64 (#173381)
Allows for type checking depending on the built-in signature.
This introduces some subtle changes in code generation: before, since
the signature was meaningless, we would accept any pointer type without
casting. After this change, the pointer operand of the `atomicrmw`
matches the flat address space.
[ELF] Include sharded relocations in RelocationBaseSection::getSize
Although mergeRels is called prior to using this size for final layout,
Writer::setReservedSymbolSections uses this in order to set the value of
__rel[a]_iplt_end and, downstream in Morello LLVM, __rel[a]_dyn_end.
Currently none of the relocations that can exist when static linking (as
is the case when these symbols are defined) are sharded, but a future
commit will change this for R_AARCH64_AUTH_RELATIVE, and similarly
R_MORELLO_RELATIVE is sharded downstream in Morello LLVM. Make sure we
compute the right size when called prior to mergeRels, and add a
regression test to demonstrate that R_AARCH64_AUTH_RELATIVE still gets
the right __rel[a]_iplt_end in future even when sharding is adopted.
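The size computation described above can be sketched roughly as follows.
This is a minimal illustration, not LLD's actual code: `RelocSketch`,
`RelocSectionSketch`, `shards` and the fixed `entsize` are hypothetical
stand-ins, and only the idea of counting unmerged shards is taken from
the description.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dynamic relocation; not LLD's real type.
struct RelocSketch {
  std::uint64_t offset;
};

// Hypothetical model of a relocation section that collects some
// relocations directly and others in per-thread shards.
struct RelocSectionSketch {
  std::vector<RelocSketch> relocs;              // already-merged entries
  std::vector<std::vector<RelocSketch>> shards; // entries awaiting mergeRels
  std::size_t entsize = 24;                     // assumed: sizeof(Elf64_Rela)

  // Count merged entries plus anything still sitting in a shard, so the
  // answer is the same before and after mergeRels().
  std::size_t getSize() const {
    std::size_t n = relocs.size();
    for (const auto &shard : shards)
      n += shard.size();
    return n * entsize;
  }

  // Move every sharded entry into the merged vector.
  void mergeRels() {
    for (auto &shard : shards)
      relocs.insert(relocs.end(), shard.begin(), shard.end());
    shards.clear();
  }
};
```

With this shape, a caller that queries the size before merging (as
setReservedSymbolSections does for the end symbols) sees the same value
it would see after merging.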
Reviewers: MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/173285
[NFC][ELF] Move mergeRels/partitionRels into finalizeContents
Other than the ordering requirements that remain between sections, this
abstracts the details of how these sections are implemented.
Note that isNeeded already checks relocsVec for both section types, so
finalizeSynthetic can call it before mergeRels just fine.
Reviewers: MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/171203
[NFC][ELF][AArch64][MTE] Don't duplicate addRelativeReloc call for MTE globals
This call to addRelativeReloc is the same as the one at the end of the
function, so skip the relrDyn code for this case and add the special
out-of-bounds handling code to the end of the function. This makes it
obvious where MTE globals differ in behaviour rather than having to
compare the two different implementations.
This also adds a comment documenting why relrDyn isn't used, and in it
highlights that it's probably safe to use relrDyn so long as the offset
is within the symbol's bounds.
Reviewers: pcc, kovdan01, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/171181
[NFC][ELF] Abstract RelrBaseSection more like RelocationBaseSection
This makes addRelativeReloc a bit more readable and uniform, as well as
the relrAuthDyn call in RelocScan::process.
Reviewers: MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/171178
[NFC][ELF] Don't reimplement addReloc in GotSection::addConstant
This is just a copy of InputSectionBase::addReloc, so we can just
forward to that rather than poking into the internals. Whilst here, move
the implementation to the header so it can be inlined.
This is helpful downstream for CHERI, as static relocations to emit an
entire capability (whether for a relative relocation or for an undefined
weak symbol) need to be split in two, one per word, as getRelocTargetVA
only returns a uint64_t. Having a single function that pushes to
InputSectionBase's static relocations array centralises that so the
outside world can pretend it's a singular relocation, and internally it
gets mapped to the pair.
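As a rough illustration of why a single entry point helps, the sketch
below funnels every static relocation through one addReloc and splits a
capability-sized relocation into two word-sized entries there. All names
and fields are hypothetical (the 8-byte word size and the `highWord` flag
are assumptions); this is neither LLD's nor CHERI LLVM's actual code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical static relocation record; one entry covers one 64-bit word.
struct StaticRelocSketch {
  std::uint64_t offset;
  std::int64_t addend;
  bool highWord; // assumed flag: second word of a capability
};

struct SectionSketch {
  std::vector<StaticRelocSketch> relocs;

  // The single place that pushes to the array. Callers ask for one
  // relocation; a capability is recorded internally as a pair.
  void addReloc(std::uint64_t offset, std::int64_t addend,
                bool isCapability) {
    relocs.push_back({offset, addend, false});
    if (isCapability)
      relocs.push_back({offset + 8, addend, true});
  }
};

// A GOT-like section forwards to addReloc instead of touching `relocs`.
struct GotSketch : SectionSketch {
  void addConstant(std::uint64_t offset, std::int64_t addend,
                   bool isCapability) {
    addReloc(offset, addend, isCapability);
  }
};
```

Because addConstant only forwards, any change to how the pair is
recorded stays inside addReloc.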
Reviewers: MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/171177
[NFC][ELF] Use InputSectionBase::addReloc in addRelativeReloc
There's no need to poke into the internals; we can just use the more
abstract member function like everywhere else in LLD.
Reviewers: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/171176
[mlir][Transforms][NFC] Improve debug output of `-remove-dead-values` (#173468)
Print the index of the block arguments, op results etc. that are being
removed.
InstCombine: Handle exp/exp2/exp10 in SimplifyDemandedFPClass
I'm working on optimizing out the tail sequences in the
implementations of the 4 different flavors of pow. These
include chains of selects on the various edge cases.
Related to #64870
ValueTracking: Add baseline tests for computeKnownFPClass exp
exp is already handled, but some opportunities are missed. Add test
cases where the input is known positive or negative.
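To show why the sign of the input matters, here is a small sketch of the
facts a sign-aware analysis could use. It is an illustration only, not
what computeKnownFPClass does: `InputSign`, `ExpFacts` and `expFacts` are
hypothetical names, and NaN inputs are ignored.

```cpp
#include <cmath>

// Hypothetical summary of what is known about the sign of x.
enum class InputSign { Unknown, NonNegative, NonPositive };

// Hypothetical summary of which results exp(x) may produce.
struct ExpFacts {
  bool mayBeZeroOrSubnormal; // only possible when x can be very negative
  bool mayBeInf;             // only possible when x can be very positive
};

// exp(x) >= 1 for x >= 0, so underflow is impossible there;
// exp(x) <= 1 for x <= 0, so overflow is impossible there.
ExpFacts expFacts(InputSign sign) {
  return {sign != InputSign::NonNegative, sign != InputSign::NonPositive};
}
```

In both cases the result is never negative, which holds regardless of
the input's sign.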
InstCombine: Handle canonicalize in SimplifyDemandedFPClass
This doesn't try to handle the PositiveZero flushing mode, but I
don't believe the result is incorrect in that mode.