[llvm] Proofread *.rst (#168185)
This patch is limited to single-word replacements to fix spelling
and/or grammar to ease the review process. Punctuation and markdown
fixes are specifically excluded.
[llvm-pdbutil] Create DBI section headers in yaml2pdb (#166566)
The section headers present in the DBI stream got lost when using
`pdb2yaml` and `yaml2pdb`.
They are a list of COFF section headers. The
`llvm::object::coff_section` didn't have a YAML mapping, so I added one
in llvm-pdbutil. The mapping for COFF sections in ObjectYAML includes
the section data itself, so we can't use it here.
Creation of the section map and headers in yaml2pdb is done like in LLD:
https://github.com/llvm/llvm-project/blob/438a18c1e105ca04e624239644195e48b28b5099/lld/COFF/PDB.cpp#L1695-L1703
[AMDGPU] When shrinking and/or to bitset*, remove implicit scc def (#168128)
When shrinking and/or to bitset*, remove the leftover implicit scc def.
bitset* instructions do not set scc.
Signed-off-by: John Lu <John.Lu at amd.com>
[X86] Remove vector length (256 vs 512) distinction of AVX10 (#167736)
As in title. AVX10.x doesn't distinguish between available vector
lengths.
-mattr=avx10.x-512 and the macros defined with _512 are kept for compatibility.
Bit positions of the avx10.1/2 features in compiler-rt and X86TargetParser
are synced to match those in GCC.
[ValueTracking] Only check up to CtxIter in willNotFreeBetween.
Only check up to CtxI (CtxIter) when checking for calls that may free
in CtxI's block.
Missed update in https://github.com/llvm/llvm-project/pull/167965.
This should be NFC, as all current callers pass a terminator that is
guaranteed to not free as CtxI.
HIP non-RDC: enable new offload driver on Windows via linker wrapper (#167918)
Use clang linker wrapper to device-link and embed HIP fat binary
directly. Match CUDA non-RDC flow in new driver by producing .hipfb like
.fatbin.
Previously, llvm offload binary was used to package the device IRs and
embed them in the host object file; clang linker wrapper was then used
with each host object file to extract the device IRs, perform device
linking, bundle code objects into a fat binary, wrap it in a host object
file, and merge it with the original host object using the host linker
with the '-r' option. However, the host linker in the MSVC toolchain does
not support the '-r' option.
The new approach still packages the device IRs with llvm offload binary,
but instead of embedding the package in a host object, it is passed to clang
linker wrapper directly, where the device IRs are extracted and linked, the
fat binary is generated, then embedded in the host object directly. Compared with
the old offload driver, this approach can parallelize the device linking
[3 lines not shown]
[VPlan] Support VPWidenIntOrFpInduction in getSCEVExprForVPValue. (NFCI)
Construct SCEVs for VPWidenIntOrFpInductionRecipe analogous to
VPCanonicalInductionPHIRecipe: create an AddRec with start + step from
the recipe.
Currently the only impact should be computing more costs of replicating
stores directly in VPlan.
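As a rough illustration of what an AddRec built from "start + step" describes, here is a minimal toy model. `ToyAddRec` and `valueAt` are hypothetical names made up for this sketch; they are not LLVM's SCEV or VPlan classes, and the sketch ignores wrapping flags and loop nesting.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model of an add recurrence {start,+,step}.
// This is NOT LLVM's SCEVAddRecExpr; it only shows the arithmetic.
struct ToyAddRec {
    std::int64_t start; // value on iteration 0
    std::int64_t step;  // amount added on every iteration

    // Value of the recurrence on the given (zero-based) iteration.
    std::int64_t valueAt(std::int64_t iteration) const {
        return start + step * iteration;
    }
};
```

For example, `ToyAddRec{3, 2}.valueAt(4)` evaluates `3 + 2 * 4`.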
[mlir][emitc] Fix ineffective tests (#168197)
These tests were only checking the specialized prefix, leaving common
code unchecked (and incorrect). Checked code was also not using patterns
for SSA values.
[ValueTracking] Bail out on non-immediate constant expressions (#168084)
In https://github.com/llvm/llvm-project/pull/165748 constant expressions
were allowed in `collectPossibleValues` because we are still using
insertelement + shufflevector idioms to represent a scalable vector
splat. However, it also accepts some unresolved constants like ptrtoint
of globals or pointer difference between two globals. Absolutely we can
ask the user to check this case with the constant folding API. However,
since we don't observe the real-world usefulness of handling constant
expressions, I decide to be more conservative and only handle immediate
constants in the helper function. With this patch, we don't need to
touch the SimplifyCFG part, as the values can only be either ConstantInt
or undef/poison values (NB: switch on undef condition is UB).
Fix the miscompilation reported by
https://github.com/llvm/llvm-project/pull/165748#issuecomment-3532245218
[ValueTracking] Check across single predecessors in willNotFreeBetween. (#167965)
Extend willNotFreeBetween to perform simple checking across blocks to
support the case where CtxI is in a successor of the block that contains
the assume, but the assume's parent is the single predecessor of CtxI's
block.
This enables using __builtin_assume_dereferenceable to vectorize
std::find_if and co in practice.
End-to-end reproducer: https://godbolt.org/z/6jbsd4EjT
PR: https://github.com/llvm/llvm-project/pull/167965
[LV] Use variables in CHECK lines for unnamed VPValues in test.
Update test to capture unnamed VPValues in variables, making it easier
to update with future VPlan changes.
[analyzer] Fix crash in Z3 SMTConv when negating a boolean expression (#165779) (#168034)
Refer to #158276 for previous hotfix.
In Z3, boolean expressions are incompatible with bitvec operators.
However, C expressions like `-(5 && a)` will generate such symbolic
expressions, which will be further used as an integer. To be compatible
with such usages, this fix converts such expressions to integer using
the existing `fromCast`.
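To make the problematic shape concrete, here is a small sketch of the kind of source expression described above. `negate_logical_and` is a hypothetical helper written for this note; it only shows how the language treats the expression, not how the analyzer or Z3 models it.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper mirroring the expression quoted in the commit message.
// `5 && a` is a logical (boolean) result; unary minus then promotes it to
// int, so the whole expression is used as an integer: 0 or -1.
int negate_logical_and(int a) {
    return -(5 && a);
}

// The negated logical expression has integer type, which is why a pure
// boolean encoding on the solver side is not sufficient.
static_assert(std::is_same<decltype(-(5 && 1)), int>::value,
              "negating a logical expression yields int");
```

Any nonzero `a` makes the helper return -1, and zero makes it return 0.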
[libc++] Make `std::numeric_limits<NonPromoted>::traps` `false` (#166724)
Per [LWG554](https://cplusplus.github.io/LWG/issue554), the rationale is
that even if `true / false` traps, the values causing the trap are the
converted `int` values produced by the usual arithmetic conversions, not
the original `bool` values.
This is also true for all other non-promoted integer types. As a result,
`std::numeric_limits<I>::traps` should be `false` if `I` is a non-promoted
integer type.
Fixes #166053.
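The promotion argument can be checked directly. The sketch below uses a hypothetical helper, `divides_as_int`, to show that dividing two values of a small integer type is actually performed on `int`. It assumes a typical platform where `int` is wider than `short`, and it deliberately does not query `traps` itself, since that value depends on the library version.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper: true if `T / T` is computed in type int, i.e. the
// operands are promoted before the division takes place.
template <class T>
constexpr bool divides_as_int() {
    return std::is_same<decltype(std::declval<T>() / std::declval<T>()),
                        int>::value;
}

// bool operands are converted to int before dividing.
static_assert(divides_as_int<bool>(), "bool / bool is computed as int");
```

Under the stated assumption, `divides_as_int<short>()` and `divides_as_int<signed char>()` are also true, while `divides_as_int<long>()` is false because `long` operands are not converted to `int`.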
AMDGPU: Select vector reg class for divergent build_vector (#168169)
The main improvement is to the mfma tests. There are some
mild regressions scattered around, and a few major ones.
The worst regressions are in some of the bitcast tests;
these are cases where the SGPR argument list runs out
and uses VGPRs, and the copies-from-VGPR are misidentified
as divergent. Most of the shufflevector tests are also
regressions. These end up with cleaner MIR, but then get poor
regalloc decisions.
CodeGen: Remove PointerLikeRegClass handling from codegen
All uses have been migrated to RegClassByHwMode. This is now
an implementation detail of InstrInfoEmitter for pseudoinstructions.
CodeGen: Make target overrides of PointerLikeRegClass mandatory
Most targets should now use the convenience multiclass to fix up
the operand definitions of pointer-using pseudoinstructions:
defm : RemapAllTargetPseudoPointerOperands<target_ptr_regclass>;