[clang-tidy][NFC] Construct map at compile time (#158166)
The important part of this PR is the changes to
`getDurationInverseForScale`. I changed the other `get*ForScale`
functions so that they all follow the same pattern, but those aren't as
important.
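A minimal sketch of the general technique, not the code from this PR: a
lookup table indexed by an enum and built at compile time. The `Scale`
enum, the `InverseNames` table, and `getInverseForScale` are
hypothetical names chosen for illustration, and C++17 is assumed.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a duration scale enum (assumption).
enum class Scale : std::size_t { Hours, Minutes, Seconds, Count };

// The table is a constant expression, so no map is built at run time.
// Entries are illustrative strings, not the names used by the check.
inline constexpr std::array<std::string_view,
                            static_cast<std::size_t>(Scale::Count)>
    InverseNames = {"ToHours", "ToMinutes", "ToSeconds"};

// Lookup is a plain array index on the enum's underlying value.
constexpr std::string_view getInverseForScale(Scale S) {
  return InverseNames[static_cast<std::size_t>(S)];
}

// Verified by the compiler rather than at run time.
static_assert(getInverseForScale(Scale::Minutes) == "ToMinutes");
```

Because the enum values are dense and start at zero, an array indexed by
the enum can stand in for a map without any hashing or allocation.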
benchmarks: Skip runtime libcalls benchmark for llvm-driver build
Apparently if you enable LLVM_TOOL_LLVM_DRIVER_BUILD, many individual
tool binaries are not built. Instead they become object targets that
are linked into a single llvm-driver tool, which you need to use instead.
In principle we could reconstruct this command with llvm-driver, but
I can't get a build to complete when I turn this on as a standalone
option.
[C++20] [Modules] Fix issues with non-exported in-class friend declarations
Close https://github.com/llvm/llvm-project/issues/159424
Close https://github.com/llvm/llvm-project/issues/133720
For in-class friend declarations, it is hard for the serializer to decide
whether they are visible to other modules. Luckily, Sema can handle this
well enough. So it is fine to mark all in-class friend declarations as
generally visible in ASTWriter and let Sema make the final call. This is
safe as long as the corresponding class's visibility is correct.
[NewPM] Don't preserve BlockFrequencyInfo in FunctionToLoopPassAdaptor (#157888)
Function analyses in LoopStandardAnalysisResults are marked as preserved
by the loop pass adaptor, because LoopAnalysisManagerFunctionProxy
manually invalidates most of them.
However the proxy doesn't invalidate BFI, since it is only preserved on
a "lossy" basis: see https://reviews.llvm.org/D86156 and
https://reviews.llvm.org/D110438.
So any changes to the CFG will result in BFI giving incorrect results,
which is fine for loop passes which deal with the lossiness.
But the loop pass adaptor still marks it as preserved, which causes the
lossy result to leak out into function passes.
This causes incorrect results when viewed from e.g. LoopVectorizer,
where an innermost loop header may be reported to have a smaller
frequency than its successors.
[7 lines not shown]
[libc][bazel] Add BUILD rules for fma and fmaf functions. (#159502)
This change adds the capability to build fma/fmaf with Bazel (fmal,
fmaf128 variants are not implemented yet), and run smoke tests.
BUILD rules for regular MPFR-based tests will be added later, since they
require support for building rand/srand as well, which is missing in
Bazel for now.
[llvm][LoongArch] Introduce LASX and LSX conversion intrinsics
This patch introduces the LASX and LSX conversion intrinsics:
- <8 x float> @llvm.loongarch.lasx.cast.128.s(<4 x float>)
- <4 x double> @llvm.loongarch.lasx.cast.128.d(<2 x double>)
- <4 x i64> @llvm.loongarch.lasx.cast.128(<2 x i64>)
- <8 x float> @llvm.loongarch.lasx.concat.128.s(<4 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.concat.128.d(<2 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.concat.128(<2 x i64>, <2 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.lo.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.lo.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.lo(<4 x i64>)
- <4 x float> @llvm.loongarch.lasx.extract.128.hi.s(<8 x float>)
- <2 x double> @llvm.loongarch.lasx.extract.128.hi.d(<4 x double>)
- <2 x i64> @llvm.loongarch.lasx.extract.128.hi(<4 x i64>)
- <8 x float> @llvm.loongarch.lasx.insert.128.lo.s(<8 x float>, <4 x float>)
- <4 x double> @llvm.loongarch.lasx.insert.128.lo.d(<4 x double>, <2 x double>)
- <4 x i64> @llvm.loongarch.lasx.insert.128.lo(<4 x i64>, <2 x i64>)
[3 lines not shown]
[RISCV][CodeGen] Add CodeGen support of Zibi experimental extension (#146858)
This adds CodeGen support for the experimental Zibi v0.1 extension, which
depends on #127463.
Reapply "RuntimeLibcalls: Use get_host_tool_path for executables used … (#159488) (#159489)
This reverts commit 44b7abcc75b005ab87e11e2beac155bf0b155992.
Add additional if TARGET checks
AMDGPU: Select VGPR MFMAs by default
AGPRs are undesirable since they are only usable by a handful of
instructions like loads, stores and mfmas, and everything else
requires copies to/from VGPRs. Using the AGPR form should be
a measure of last resort if we must use more than 256 VGPRs.
[flang] Add a warning for CDEFINED declarations that have initializers (#159456)
CDEFINED declarations are similar to "extern" declarations in C. If they
have initializers, this could lead to linker errors. clang warns about
"extern" declarations with initializers. Add a similar warning to flang:
```
$ flang -c cdefined.f90 -pedantic
./cdefined.f90:3:57: warning: CDEFINED variable should not have an initializer [-Wcdefined-init]
integer(c_int), bind(C, name='c_global', CDEFINED) :: c = 4
^
```
AMDGPU: Remove unnecessary AGPR legalize logic
The manual legalizeOperands code only needs to consider cases that
require full instruction context to know if the operand is legal.
It does not need to handle basic operand register class constraints.
[clang-doc] concatenate SymbolIDs to truncated mangled names
Previously, if mangled names were too long to be used as filenames, the
object's SymbolID was used as a filename. This worked for length
restrictions, but made URLs/filenames inconsistent. This patch truncates
the mangled name and appends the SymbolID. Thus, we can keep some
context in the URL/filename while preserving uniqueness.
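A rough sketch of the naming scheme described above, under stated
assumptions: the length limit, the `_` separator, and the function name
`makeFileStem` are all hypothetical and are not taken from clang-doc.
The SymbolID is passed as an already-formatted hex string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical cap on how much of the mangled name is kept (assumption).
constexpr std::size_t MaxMangledPrefix = 32;

// Short names are used unchanged. Long names keep a readable prefix and
// gain the SymbolID as a suffix, so two long names that share a prefix
// still map to distinct file stems.
std::string makeFileStem(const std::string &MangledName,
                         const std::string &SymbolIDHex) {
  assert(!SymbolIDHex.empty() && "SymbolID is needed for uniqueness");
  if (MangledName.size() <= MaxMangledPrefix)
    return MangledName;
  return MangledName.substr(0, MaxMangledPrefix) + "_" + SymbolIDHex;
}
```

Note that in this sketch the cap applies only to the mangled prefix; a
real limit would also have to budget for the suffix length.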
RuntimeLibcalls: Add bitset for available libcalls
This is a step towards separating the set of available libcalls
from the lowering decision of which call to use. Libcall recognition
now directly checks availability instead of indirectly checking through
the lowering table.
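A minimal sketch of the idea, not the RuntimeLibcalls implementation:
availability is tracked in a bitset that can be queried on its own,
separately from any table that records which call is chosen for
lowering. The `LibcallImpl` enumerators and the `AvailableLibcalls`
class are hypothetical names for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical set of libcall implementations (assumption).
enum class LibcallImpl : std::size_t { Memcpy, Sinf, Sincosf, NumImpls };

class AvailableLibcalls {
  // One bit per implementation; all bits start cleared (unavailable).
  std::bitset<static_cast<std::size_t>(LibcallImpl::NumImpls)> Bits;

public:
  void setAvailable(LibcallImpl Impl, bool Value = true) {
    assert(Impl != LibcallImpl::NumImpls && "not a real implementation");
    Bits.set(static_cast<std::size_t>(Impl), Value);
  }

  // Direct availability query; no lowering table is consulted.
  bool isAvailable(LibcallImpl Impl) const {
    assert(Impl != LibcallImpl::NumImpls && "not a real implementation");
    return Bits.test(static_cast<std::size_t>(Impl));
  }
};
```

With this split, a recognizer can ask whether a function exists on the
target even when lowering would pick a different implementation.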
Reapply "RuntimeLibcalls: Use get_host_tool_path for executables used … (#159488)
This reverts commit 44b7abcc75b005ab87e11e2beac155bf0b155992.
Add additional if TARGET checks
Revert "RuntimeLibcalls: Use get_host_tool_path for executables used … (#159488)
…in benchmark (#153955)"
This reverts commit f3c9c6c0c51880109b39411be4e6d742c16210d1.
Fails the Fuchsia bot.
RuntimeLibcalls: Use get_host_tool_path for executables used in benchmark (#153955)
Copied from what the llvm-shlib build is doing.
This reverts commit 0b1b567d9f84e67124c58d69b5aa375357d68c9e.
AMDGPU: Remove unnecessary operand legalization for WMMAs (#159370)
The operand constraints already express this requirement, and
InstrEmitter will respect them.