[clang][cmake] Move perf-training out of CLANG_INCLUDE_TESTS (#192163)
perf-training defines the generate-profdata target used by the PGO
bootstrap build.
However, it is currently enabled only when CLANG_INCLUDE_TESTS=ON.
For distribution builds such as Yocto/OE, tests are usually disabled by
setting this to OFF.
But perf-training is a PGO utility, not a test target, yet it is
currently gated by that block.
As a result, generate-profdata is unavailable to the PGO bootstrap build
when CLANG_INCLUDE_TESTS=OFF.
Move perf-training out of the CLANG_INCLUDE_TESTS block.
This is safe because utils/perf-training/CMakeLists.txt adds targets only
when LLVM_BUILD_INSTRUMENTED or CLANG_BOLT is enabled, so moving it out does
[9 lines not shown]
[AMDGPU][SIMemoryLegalizer] Consider scratch operations as NV=1 if GAS is disabled
- Clarify that `thread-private` MMO flag is still useful.
- If GAS is not enabled (which is the default as of the previous patch), treat an op as `NV=1` if it's a `scratch_` opcode, or if the MMO is in the private AS.
- Add tests for the new cases.
- Update AMDGPUUsage GFX12.5 memory model
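The rule in the second bullet can be pictured with a small standalone sketch. It does not use LLVM types: `MemOpInfo`, `scratchRuleImpliesNV`, and the mnemonics in the usage note are hypothetical names chosen for illustration, and other inputs to the real decision (such as the `thread-private` MMO flag) are deliberately left out.

```cpp
#include <string_view>

// Hypothetical, simplified stand-in for a memory operation.
// This is NOT an LLVM type; it only carries what the rule needs.
struct MemOpInfo {
  std::string_view Mnemonic; // e.g. "scratch_load_b32" (illustrative)
  bool InPrivateAS;          // true if the MMO is in the private address space
};

// Returns true if the rule described above would treat the op as NV=1.
// Assumption: when GAS is enabled this rule simply does not apply, so the
// function returns false. Other sources of NV=1 are not modeled here.
inline bool scratchRuleImpliesNV(const MemOpInfo &Op, bool GASEnabled) {
  if (GASEnabled)
    return false; // private AS can no longer be assumed thread-local

  // "scratch_" is 8 characters long; substr clamps on shorter mnemonics.
  const bool IsScratchOpcode = Op.Mnemonic.substr(0, 8) == "scratch_";
  return IsScratchOpcode || Op.InPrivateAS;
}
```

With GAS disabled, `scratchRuleImpliesNV({"scratch_load_b32", false}, false)` and `scratchRuleImpliesNV({"flat_load_b32", true}, false)` both return true, while a flat access outside the private AS returns false.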
[AMDGPU] Make globally-addressable-scratch opt-in
This feature is meant to be opt-in for more advanced users, not default-enabled.
It may reduce performance otherwise as we can't assume private AS is thread-local
when it is enabled.
- Add `HasGloballyAddressableScratchSupport` feature to check if a target's scratch
addressing is changed due to support for globally addressable scratch.
- Use `EnableGloballyAddressableScratch` to check whether the user opted into
globally addressable scratch. This affects whether to lower scratch atomics as flat,
and in the future will affect whether NV=1 can be set on scratch accesses.
ObsoleteFiles: Add some ancient locale symlinks
These were dropped in 2021 but were never listed in ObsoleteFiles.inc,
so systems that have been upgraded from source since before that date
(or from 13.x) may still have them.
PR: 295668
MFC after: 1 week
Fixes: 0a36787e4c1f ("locales: separate unicode from other locales")
Reviewed by: bapt
Differential Revision: https://reviews.freebsd.org/D57331
Reimplement aspath_merge() in a more cynical fashion
Merging AS4_PATH into ASPATH can be done more simply by using the fact
that AS4_PATH must be a subset of ASPATH. The resulting path has the same
size and layout as the ASPATH. bgpd inflates the 2-byte ASPATH to 4-byte
representation early on so this simplifies the merge.
When merging the paths, be strict: any difference between the two paths triggers
a treat-as-withdraw error. Something is off, so refuse to work with this path.
This is harsher than RFC 6793, but the concerns from back then no longer matter.
Use ibuf for all the buffers to have memory safety during this merge operation.
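As a rough illustration of the strict merge described above, here is a minimal sketch. It is not bgpd's code: it uses flat vectors instead of ibufs and ignores path segment headers, and `mergeAs4Path` is a hypothetical name. It also assumes that "difference" means an ASPATH entry that is neither equal to the corresponding AS4_PATH entry nor the RFC 6793 placeholder AS_TRANS (23456), and that an AS4_PATH longer than the ASPATH is an error as well.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// RFC 6793 placeholder AS number used by 2-byte-only speakers.
constexpr std::uint32_t AS_TRANS = 23456;

// Sketch only: paths are flat AS sequences, already inflated to 4 bytes.
// Returns the merged path, or std::nullopt to signal treat-as-withdraw.
std::optional<std::vector<std::uint32_t>>
mergeAs4Path(const std::vector<std::uint32_t> &AsPath,
             const std::vector<std::uint32_t> &As4Path) {
  // Assumption: an AS4_PATH longer than ASPATH cannot be a subset of it.
  if (As4Path.size() > AsPath.size())
    return std::nullopt;

  // The result has the same size and layout as ASPATH.
  std::vector<std::uint32_t> Merged = AsPath;
  const std::size_t Offset = AsPath.size() - As4Path.size();

  for (std::size_t I = 0; I < As4Path.size(); ++I) {
    const std::uint32_t Old = AsPath[Offset + I];
    const std::uint32_t New = As4Path[I];
    // Strict: anything other than an exact match or an AS_TRANS
    // placeholder counts as a difference between the two paths.
    if (Old != New && Old != AS_TRANS)
      return std::nullopt;
    Merged[Offset + I] = New;
  }
  return Merged;
}
```

For example, merging ASPATH `{64500, 23456, 64501}` with AS4_PATH `{4200000000, 64501}` yields `{64500, 4200000000, 64501}`, while ASPATH `{64500, 64501}` with AS4_PATH `{64502}` is rejected.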
OK tb@
[GVN] MemorySSA for GVN: eliminate redundant loads via MemorySSA (#152859)
Introduce the main algorithm performing redundant load elimination via
MemorySSA in GVN. The entry point is `findReachingValuesForLoad`, which,
given a possibly redundant load `L`, attempts to produce a set of
reaching memory values (`ReachingMemVal`), i.e., the values (defs or
equivalent reads) that can reach `L` along at least one path where the
memory location is not modified in the meantime (if non-local, PRE
will establish whether the load may be eliminated).
Specifically, a reaching value may be one of the following descriptor kinds
(`DepKind`):
* Def: found a new instruction that produces exactly the bits the load
would read. For example, a must-alias store (which defines the load
memory location), or a must-alias read (exactly reads the same memory
location, found, e.g., after a phi-translation fixup);
* Clobber: found a write that clobbers a superset of the bits the load
would read. For example, a memset call over a memory region, whose value
read overlaps such a region (and may be forwarded to the load), or a
[20 lines not shown]
[AMDGPU][NFC] Use generated hasMinMaxI64Insts subtarget feature query
Replace the custom GCNSubtarget::hasIntMinMax64 helper with
the generated hasMinMaxI64Insts from AMDGPUSubtargetFeature.
[LoopInterchange] Prevent interchange when memory-accessing calls exist (#200828)
Previously, loop-interchange could be applied even though the loop had call
instructions that may access memory. The root cause of this problem
is that the implementation didn't match the comment, as shown below:
```cpp
// readnone functions do not prevent interchanging.
if (CI->onlyWritesMemory() || isa<PseudoProbeInst>(CI))
  continue;
```
However, I think checking for `readnone` is insufficient in the first place,
because the LLVM Language Reference describes `readnone` as follows:
```
This attribute indicates that the function does not dereference that pointer argument, even though it may read or write the memory that the pointer points to if accessed through other pointers.
```
[6 lines not shown]