[DAG] Use TLO.LegalTypes() instead of AfterLegalizeTypes (#201840)
Fix typo from #178617 - AfterLegalizeTypes is an enum constant, not an actual check for legalised types
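A minimal sketch of how this kind of typo can go unnoticed, assuming a simplified stand-in for the combine-level enum and a hypothetical helper `canCreateWideType` (neither is the real LLVM declaration). An unscoped enumerator converts implicitly to `bool`, so passing the constant compiles cleanly but always yields `true`:

```cpp
// Simplified stand-in for the combine levels; not the real LLVM declaration.
enum CombineLevel {
  BeforeLegalizeTypes,
  AfterLegalizeTypes,
  AfterLegalizeVectorOps,
  AfterLegalizeDAG
};

// Hypothetical helper that expects a runtime "are types legal?" flag.
// New wide types may only be created while types are not yet legalized.
bool canCreateWideType(bool LegalTypes) { return !LegalTypes; }

// Buggy call: the enumerator (value 1) converts to `true` regardless of
// which phase is actually running.
bool buggy(CombineLevel Level) {
  (void)Level; // the current phase is ignored, which is the bug
  return canCreateWideType(AfterLegalizeTypes);
}

// Fixed call: derive the flag from the actual phase.
bool fixed(CombineLevel Level) {
  return canCreateWideType(Level >= AfterLegalizeTypes);
}
```

With these definitions, `buggy(BeforeLegalizeTypes)` returns `false` even though types are not yet legal, while `fixed(BeforeLegalizeTypes)` returns `true`.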
[SLP] Recompute copyable operand deps for nodes sharing an instruction
When an instruction is vectorized in several nodes and one models an operand as
copyable while another (built later) uses it directly, the operand's dependency
count missed the direct def-use edge and the scheduler decremented it more times
than its count, tripping the unscheduled-deps assertion. Defer such operand dep
recomputation unconditionally via RecalcCopyableOperandDeps instead of the narrow
IsDuplicateCopyableNode gate.
Fixes #201855
Pull Request: https://github.com/llvm/llvm-project/pull/202032
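The counting invariant described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `OperandDeps`, `addEdge` and `scheduleUser` are hypothetical names and do not reflect the SLP scheduler's data structures. Each def-use edge contributes one unscheduled dependency, and each scheduled user removes one; if an edge was never counted, the last decrement has nothing left to remove.

```cpp
// Toy model of a per-operand dependency counter; not SLP's implementation.
struct OperandDeps {
  int Unscheduled = 0;

  // Record one def-use edge from a user to this operand.
  void addEdge() { ++Unscheduled; }

  // Called when a user of this operand is scheduled. Returns false if the
  // counter would underflow, which is where a real scheduler would assert.
  bool scheduleUser() {
    if (Unscheduled == 0)
      return false;
    --Unscheduled;
    return true;
  }
};
```

If two users exist but only one edge was recorded, the second `scheduleUser()` call returns `false`; recording both edges makes both calls succeed.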
[AArch64] Change postinc index types to uint64_t (#202024)
The uint32_t could overflow. Use a uint64_t so that we do not throw away
the high bits.
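For illustration, a small sketch of the truncation in question. The helper names `narrowIndex` and `wideIndex` are hypothetical and not taken from the AArch64 backend:

```cpp
#include <cstdint>

// Narrowing to 32 bits discards bits 32..63 of the index.
uint32_t narrowIndex(uint64_t Index) { return static_cast<uint32_t>(Index); }

// Keeping the index in 64 bits preserves the full value.
uint64_t wideIndex(uint64_t Index) { return Index; }
```

For an index of `0x100000001`, `narrowIndex` yields `1` while `wideIndex` keeps `0x100000001`.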
[openmp][omptest] Include cstdlib for malloc() (#202021)
This fixes the following error, which appears when building this code
with more recent compilers:
```
Use of undeclared identifier 'malloc'
```
Such inclusion has already been added to the OmptTester.cpp file.
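A minimal sketch of the pattern, using a hypothetical helper `allocateInts` that is not part of omptest. Including `<cstdlib>` directly avoids relying on another header pulling in the declaration transitively:

```cpp
#include <cstdlib> // declares std::malloc, std::free and std::size_t

// Hypothetical helper: allocate an uninitialized array of Count ints.
// The caller owns the memory and must release it with std::free.
int *allocateInts(std::size_t Count) {
  return static_cast<int *>(std::malloc(Count * sizeof(int)));
}
```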
[llubi] explain Byte.TagValue encoding trick (#201863)
This took me a while to understand in #185977 so let's make it more
explicit why `TagValue` can be so small.
[X86] Remove shouldCastAtomicLoadInIR; use DAG combine instead
Remove X86's shouldCastAtomicLoadInIR override that cast FP atomic
loads to integer at the IR level. Instead, handle this in a pre-legalize
DAG combine (combineAtomicLoad) that rewrites FP/FP-vector atomic loads
to integer atomic loads plus a bitcast.
This depends on #199310 which adds the necessary cmpxchg support for
non-integer atomic loads in AtomicExpand.
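As a source-level analogy for "integer atomic load plus a bitcast", here is a hedged sketch in standard C++. It does not show the DAG combine itself; `loadFloatViaInt` and `floatBits` are hypothetical helpers, and the sketch assumes `float` is 32 bits wide:

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// Reinterpret a float as its 32-bit pattern without changing any bits.
uint32_t floatBits(float Value) {
  uint32_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Load a float that is stored in an integer atomic slot: first an integer
// atomic load, then a bit-preserving cast back to float.
float loadFloatViaInt(const std::atomic<uint32_t> &Slot) {
  uint32_t Bits = Slot.load(std::memory_order_seq_cst);
  float Result;
  std::memcpy(&Result, &Bits, sizeof(Result));
  return Result;
}
```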
[LoopFusion] Remove unused DataLayout parameter (NFC) (#202009)
The LoopFuser constructor took a DataLayout reference that was never
stored or used, and run() computed it solely to pass it in. Drop both.
[X86][GlobalISel] Remove dependency on legal ruleset (#197374)
This fills in the always-legal rules to remove the dependency on the
legacy ruleset. I'm not sure about the truncate rule, but all tests pass.
This is not guaranteed to cover every rule, just the ones that appear in
tests.
[Clang] support C23 printf width length modifiers (#199991)
This patch adds `-Wformat` support for the C23 `wN` and `wfN` length
modifiers in `printf`/`scanf` format strings. #116962
Revert "[clang][driver][darwin] Hold onto full triples in Darwin SDKPlatformInfo (#200896)" (#202010)
This doesn't work for 32-bit ARM because that triple usually gets
converted to thumb-apple-os, which doesn't match arm-apple-os from
SDKSettings.json.
This reverts commit b89bb06afd069aa1b5e9f05ab692b3e6b41318c0.
[clang-format] Keep C++20 module/import decls on a single line (#199459)
This patch fixes #193676.
- Added `UnwrappedLineParser::parseModuleDecl()` to parse C++20 module
declarations.
- Adapted `parseCppModuleImport()` from #193834 and renamed it to
`parseImportDecl()`.
- Used the test cases from the same PR.
- Removed the invalid test cases and fixed an incorrect one in
`FormatTest.cpp`.
---------
Co-authored-by: Björn Schäpers <github at hazardy.de>
[CIR] Initialization of atomic aggregates with padding (#200668)
This patch adds support for the initialization of atomic aggregates with
padding. The changes include:
- During CIRGen, the type `_Atomic(T)` is represented by a CIR struct
`{T, sint8[padding_size]}` if the size of `_Atomic(T)` does not match
the size of `T`. `padding_size` is the difference between the size of
`_Atomic(T)` and `T`.
- CIRGen for the initialization process is updated to handle the
initialization of such CIR struct values.
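The `padding_size` computation can be sketched in C++ as follows. This is an analogy only: it uses `std::atomic<T>` as a stand-in for C's `_Atomic(T)`, whose layout may differ, and `atomicPaddingSize` and `ThreeBytes` are hypothetical names that do not appear in the patch:

```cpp
#include <atomic>
#include <cstddef>

// A 3-byte aggregate; many ABIs round its atomic counterpart up to 4 bytes,
// but the exact size is implementation-defined.
struct ThreeBytes {
  char A, B, C;
};

// Difference between the size of the atomic wrapper and the wrapped type,
// mirroring padding_size = sizeof(_Atomic(T)) - sizeof(T).
template <typename T> constexpr std::size_t atomicPaddingSize() {
  return sizeof(std::atomic<T>) - sizeof(T);
}
```

When `atomicPaddingSize<T>()` is zero, no padding member would be needed in such a representation; otherwise the value is the length of the padding array.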
[libc++] Assume that <atomic> is available (#199674)
We always define either `_LIBCPP_HAS_C_ATOMIC_IMP` or
`_LIBCPP_HAS_GCC_ATOMIC_IMP`, so we can remove any special handling of
not having an `<atomic>` header.
[TableGen] Recompute only the affected UberSet when inheriting reg units (#200962)
CodeGenRegBank::computeRegUnitWeights() runs a fixpoint over all registers;
normalizeWeight() calls the global computeUberWeights() -- which rescans
every UberRegSet, every register, and all of their register units -- each time
a register inherits register units from its subregisters.
Most of the time, we do better by just recomputing one register's
UberSet.
On AMDGPU (21266 registers) with this change, the "Compute reg unit
weights" phase drops from 3.19s to 0.70s (4.5x speedup) and
-gen-register-info improves overall from ~16.4s to ~14.0s.
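The general idea, recomputing a cached aggregate only for the group that changed rather than rescanning every group, can be sketched as below. This is a toy model under stated assumptions: `Group`, `recomputeOne`, `recomputeAll` and `setWeight` are hypothetical and do not mirror the CodeGenRegBank data structures or the exact quantity that computeUberWeights() maintains.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model: each group caches the maximum weight of its members.
struct Group {
  std::vector<unsigned> MemberWeights;
  unsigned CachedMax = 0;
};

// Refresh the cache of a single group; cost is proportional to its size.
void recomputeOne(Group &G) {
  G.CachedMax = G.MemberWeights.empty()
                    ? 0
                    : *std::max_element(G.MemberWeights.begin(),
                                        G.MemberWeights.end());
}

// Global rescan; cost is proportional to the total number of members.
void recomputeAll(std::vector<Group> &Groups) {
  for (Group &G : Groups)
    recomputeOne(G);
}

// Change one member's weight and refresh only the group that contains it.
void setWeight(std::vector<Group> &Groups, std::size_t GroupIdx,
               std::size_t MemberIdx, unsigned Weight) {
  Groups[GroupIdx].MemberWeights[MemberIdx] = Weight;
  recomputeOne(Groups[GroupIdx]);
}
```

Inside a fixpoint loop that updates members one at a time, calling `setWeight` instead of `recomputeAll` after each update avoids touching unaffected groups.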
Revert "[clang-cl] Add new option `/pathmap:<from>=<to>` to replace the path prefix <from> with <to>." (#201981)
Reverts llvm/llvm-project#198664
Causes test failures on
[llvm-clang-aarch64-darwin](https://lab.llvm.org/buildbot/#/builders/190)
bot.