[ADT] Make DenseMapBase::moveFrom safer (NFC) (#168180)
Without this patch, DenseMapBase::moveFrom() moves buckets and leaves
the moved-from object in a zombie state. This patch teaches
moveFrom() to call kill() so that the moved-from object is in a known
good state. This brings moveFrom()'s behavior in line with standard
C++ move semantics.
kill() is implemented so that the destructor of the killed object takes
the fast path in both destroyAll() and deallocateBuckets().
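As a rough illustration of the moved-from guarantee described above, here is a minimal sketch using a toy container. This is not LLVM's DenseMap: `ToyTable` and its `moveFrom()` and `kill()` members are hypothetical stand-ins. The source is reset to an empty, bucket-free state after its storage is transferred, so its destructor has nothing left to destroy or deallocate.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy container for illustration; this is NOT llvm::DenseMap.
class ToyTable {
  int *Buckets = nullptr;
  std::size_t NumBuckets = 0;
  std::size_t NumEntries = 0;

  // Put the object into a known-good empty state without freeing anything.
  // The caller must already have taken ownership of the old storage.
  void kill() {
    Buckets = nullptr;
    NumBuckets = 0;
    NumEntries = 0;
  }

public:
  ToyTable() = default;
  explicit ToyTable(std::size_t N) : Buckets(new int[N]()), NumBuckets(N) {}
  ToyTable(const ToyTable &) = delete;
  ToyTable &operator=(const ToyTable &) = delete;

  // With Buckets == nullptr this is a no-op, i.e. the "fast path".
  ~ToyTable() { delete[] Buckets; }

  // Steal Other's storage, then leave Other empty rather than dangling.
  void moveFrom(ToyTable &Other) {
    assert(this != &Other && "self-move is not supported in this sketch");
    delete[] Buckets;
    Buckets = Other.Buckets;
    NumBuckets = Other.NumBuckets;
    NumEntries = Other.NumEntries;
    Other.kill();
  }

  void insert(int V) {
    assert(NumEntries < NumBuckets && "toy table is full");
    Buckets[NumEntries++] = V;
  }

  std::size_t size() const { return NumEntries; }
  std::size_t capacity() const { return NumBuckets; }
};
```

After `B.moveFrom(A)`, `A` can still be queried or destroyed safely, which is the property that standard C++ move semantics expect of a moved-from object.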
[MLIR][Transform][Python] Expose applying named_sequences as a method (#168223)
Makes it so that a NamedSequenceOp can be directly applied to a Module,
via a method `apply(...)`.
[VPlan] Always set trip count when creating plan for unit tests (NFC).
Simplifies some tests which no longer need to pass TC; future changes
will require a trip count to always be available.
[llvm] Proofread *.rst (#168185)
This patch is limited to single-word replacements to fix spelling
and/or grammar to ease the review process. Punctuation and markdown
fixes are specifically excluded.
[llvm-pdbutil] Create DBI section headers in yaml2pdb (#166566)
The section headers present in the DBI stream got lost when using
`pdb2yaml` and `yaml2pdb`.
They are a list of COFF section headers. The
`llvm::object::coff_section` didn't have a YAML mapping, so I added one
in llvm-pdbutil. The mapping for COFF sections in ObjectYAML includes
the section data itself, so we can't use it here.
Creation of the section map and headers in yaml2pdb is done like in LLD:
https://github.com/llvm/llvm-project/blob/438a18c1e105ca04e624239644195e48b28b5099/lld/COFF/PDB.cpp#L1695-L1703
[AMDGPU] When shrinking and/or to bitset*, remove implicit scc def (#168128)
When shrinking and/or to bitset*, remove the leftover implicit scc def.
bitset* instructions do not set scc.
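The bit-level equivalence behind this kind of shrink can be sketched as follows. This assumes the shrink applies when an `or` has a constant operand with exactly one bit set, or an `and` has a constant operand with exactly one bit clear. The helper names are hypothetical and none of this is the AMDGPU backend's code; scc is not modelled at all.

```cpp
#include <cassert>
#include <cstdint>

// True if exactly one bit of V is set.
bool isSingleBit(std::uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Index of the single set bit; V must satisfy isSingleBit(V).
unsigned bitIndex(std::uint32_t V) {
  assert(isSingleBit(V) && "expected exactly one set bit");
  unsigned Idx = 0;
  while ((V >> Idx) != 1u)
    ++Idx;
  return Idx;
}

// Hypothetical model of a "set one bit" operation: same result as
// or-ing X with a mask that has only bit Idx set.
std::uint32_t setBit(std::uint32_t X, unsigned Idx) {
  return X | (std::uint32_t{1} << Idx);
}

// Hypothetical model of a "clear one bit" operation: same result as
// and-ing X with a mask that has only bit Idx clear.
std::uint32_t clearBit(std::uint32_t X, unsigned Idx) {
  return X & ~(std::uint32_t{1} << Idx);
}
```

For example, `X | 0x1` matches `setBit(X, 0)` and `X & ~0x100u` matches `clearBit(X, 8)`, so only the bit index has to be encoded instead of the full 32-bit mask.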
Signed-off-by: John Lu <John.Lu at amd.com>
[X86] Remove vector length (256 vs 512) distinction of AVX10 (#167736)
As in title. AVX10.x doesn't distinguish between available vector
lengths.
-mattr=avx10.x-512 and the _512 macro definitions are kept for compatibility.
Bit positions of the avx10.1/2 features in compiler-rt and X86TargetParser
are synced to match those in GCC.
[ValueTracking] Only check up to CtxIter in willNotFreeBetween.
Only check up to CtxI (CtxIter) when checking for calls that may free
in CtxI's block.
Missed update in https://github.com/llvm/llvm-project/pull/167965.
This should be NFC, as all current callers pass as CtxI a terminator
that is guaranteed not to free.
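The idea of bounding the scan at the context instruction can be sketched with a toy model. `ToyInst` and `noFreeBefore` are hypothetical names, not the ValueTracking API, and treating the bound as exclusive is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction; it only records whether the
// instruction may free memory.
struct ToyInst {
  bool MayFree;
};

// Returns true if no instruction in [Begin, CtxIdx) may free memory.
// Block[CtxIdx] and everything after it are deliberately not inspected.
bool noFreeBefore(const std::vector<ToyInst> &Block, std::size_t Begin,
                  std::size_t CtxIdx) {
  assert(Begin <= CtxIdx && CtxIdx <= Block.size());
  for (std::size_t I = Begin; I < CtxIdx; ++I)
    if (Block[I].MayFree)
      return false;
  return true;
}
```

A freeing call that appears after the context position no longer affects the answer, whereas scanning the whole block would be needlessly conservative.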
HIP non-RDC: enable new offload driver on Windows via linker wrapper (#167918)
Use clang linker wrapper to device-link and embed HIP fat binary
directly. Match CUDA non-RDC flow in new driver by producing .hipfb like
.fatbin.
Previously, llvm offload binary was used to package the device IRs and
embed them in the host object file. Clang linker wrapper was then run on
each host object file to extract the device IRs, perform device linking,
bundle the code objects into a fat binary, and wrap it in a host object
file, which the host linker merged with the original host object using
the '-r' option. However, the host linker in the MSVC toolchain does not
support the '-r' option.
The new approach still packages the device IRs with llvm offload binary,
but instead of embedding the package in a host object, it passes it to
clang linker wrapper directly, where the device IRs are extracted and
linked and a fat binary is generated and then embedded in the host object
directly. Compared with
the old offload driver, this approach can parallelize the device linking
[3 lines not shown]
[VPlan] Support VPWidenIntOrFpInduction in getSCEVExprForVPValue. (NFCI)
Construct SCEVs for VPWidenIntOrFpInductionRecipe analogous to
VPCanonicalInductionPHIRecipe: create an AddRec with start + step from
the recipe.
Currently the only impact should be computing more costs of replicating
stores directly in VPlan.
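For readers unfamiliar with add recurrences, the following minimal sketch models an affine recurrence {Start,+,Step} built from a start and a step value. `ToyAddRec` and `makeAddRec` are hypothetical names for illustration and do not use the ScalarEvolution or VPlan APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an affine add recurrence {Start,+,Step}.
struct ToyAddRec {
  std::int64_t Start;
  std::int64_t Step;

  // Value of the recurrence at 0-based iteration Iter: Start + Iter * Step.
  std::int64_t evaluateAt(std::int64_t Iter) const {
    return Start + Iter * Step;
  }
};

// Build the recurrence from a start and a step, mirroring the
// "start + step from the recipe" description above.
ToyAddRec makeAddRec(std::int64_t Start, std::int64_t Step) {
  return ToyAddRec{Start, Step};
}
```

With a start of 5 and a step of 3, the recurrence takes the values 5, 8, 11, ... on successive iterations.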
[mlir][emitc] Fix ineffective tests (#168197)
These tests were only checking the specialized prefix, leaving common
code unchecked (and incorrect). Checked code was also not using patterns
for SSA values.