[libc] Remove ballot on slab find (#176606)
Summary:
The ballot negatively impacts performance, while the other changes in the
initial PR slightly improved it. It was originally added to make Volta
independent thread scheduling work, but that doesn't seem to work
correctly all the time either, so we should make this faster.
[Clang][AMDGPU] Handle `wavefrontsize32` and `wavefrontsize64` features more robustly
We should also not allow `-wavefrontsize32` and `-wavefrontsize64` to be specified at the same time.
[mlir][vscode] Add angle bracket support to MLIR language configuration (#176602)
Add angle brackets (<>) to brackets, autoClosingPairs, and
surroundingPairs for better editing of types like tensor<3xf32>. Also
add colorizedBracketPairs for visual distinction between nested bracket
types.
[mlir][vscode] Fix PDLL grammar character class regex (#176601)
The character class [aA-zZ_0-9] incorrectly matches characters between
ASCII 90-97 (Z-a range), which includes: [ \ ] ^ _ `. This should be
[a-zA-Z_0-9] for proper identifier matching.
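The difference between the two character classes can be checked with a short script. This is a minimal sketch using Python's `re` module rather than the TextMate grammar engine that VS Code uses, so it only illustrates the range semantics; the names `BUGGY`, `FIXED`, and `between` are made up for the example.

```python
import re

# The character class from the old grammar and the corrected one.
# In the old class, "A-z" is parsed as a single range from 'A' (65) to 'z' (122).
BUGGY = re.compile(r"[aA-zZ_0-9]")
FIXED = re.compile(r"[a-zA-Z_0-9]")

# The six ASCII characters that sit strictly between 'Z' (90) and 'a' (97).
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]

# Which of those characters does each class accept?
buggy_hits = [ch for ch in between if BUGGY.fullmatch(ch)]
fixed_hits = [ch for ch in between if FIXED.fullmatch(ch)]

print("old class accepts:", buggy_hits)
print("new class accepts:", fixed_hits)
```

The corrected class still accepts `_`, because the underscore is listed explicitly; only the five punctuation characters are rejected.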
[WebAssembly] Mark extract.last.active as having invalid cost.
Currently the WebAssembly backend crashes when trying to lower some
extract.last.active intrinsic calls. Mark their cost as invalid
temporarily, to avoid them being introduced by the loop
vectorizer after 2abd6d6d7ac (#158088).
[CI] Disable precompiled headers in pre-commit CI (#176563)
Spliced out from #176420 to make sure that CI is fine without PCH, which
are currently used by Flang.
[Offload][CI] Convert openmp-offload-amdgpu staging bots to ScriptedBuilder (#174991)
Convert the first AMDGPU buildbots to use the ScriptedBuilder introduced
in llvm-zorg. For the motivation, see
https://github.com/llvm/llvm-zorg/pull/648.
Since the production buildbot still needs to be restarted for
ScriptedBuilder to work, only convert the builders that are currently in
staging for now. These are:
* openmp-offload-amdgpu-runtime
* openmp-offload-amdgpu-clang-flang
Both of them were OpenMPBuilder.getOpenMPCMakeBuildFactory-based
builders before this change. They also set an environment variable that
the previous ScriptedBuilder did not, so we are adding support.
The corresponding llvm-zorg change is
https://github.com/llvm/llvm-zorg/pull/697.
[clang][analyzer] Add ReportInC99AndEarlier option to DeprecatedOrUnsafeBufferHandling checker (#168704)
The checker can now report warnings for deprecated buffer handling functions
(memcpy, memset, memmove, etc.) even when not compiling with the C11
standard, if the new option "ReportInC99AndEarlier" is set to true.
These functions are deprecated in C11, but may still be problematic in
earlier C standards.
[X86][NewPM] Cleanup some minor issues in recently ported passes
* Ensure passes implemented as single functions are marked as static to
enforce internal linkage.
* Avoid the use of temporary variables to hold pass output status that
only have one user/do not change any ordering guarantees.
[Support][NFCI] Store DomTree children as linked list (#176409)
Reduce the size of a DomTreeNodeBase from 80 to 56 bytes by not storing
the children in a SmallVector. Instead, store children as a forward-linked
list. This also avoids extra allocations for nodes with many children.
Additionally, DomTreeNodeBase is now trivially destructible.
A lot of code depends on the order of nodes in the dominator tree, so
make sure that the order is the same when inserting nodes. (Not having
to do this would save 8 bytes per node.)
NewGVN uses the order of nodes in the dominator tree in a way that is
not entirely clear to me (https://reviews.llvm.org/D28129). I kept the
semantics as is, but now this is the only external user of
addChild/removeChild, which actually should be private.
https://llvm-compile-time-tracker.com/compare.php?from=263802c56b4db3fc9b6ed9fd313499cb03ca44da&to=43e0c0c5b663b3a4067252fc0addbaccefd0014d&stat=instructions:u
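The order-preserving child list described above can be sketched as follows. This is not the LLVM implementation: the `Node` class and its field names are hypothetical, and the tail pointer (`last_child`) is only an assumption about what the extra per-node field might be. It shows one way to keep insertion order in a forward-linked list without walking the list on every append.

```python
class Node:
    """Hypothetical tree node whose children form a singly linked list."""

    def __init__(self, name):
        self.name = name
        self.first_child = None   # head of the child list
        self.last_child = None    # assumed tail pointer, used to append in order
        self.next_sibling = None  # forward link to the next child of the parent

    def add_child(self, child):
        # Append at the tail so iteration order equals insertion order.
        if self.last_child is None:
            self.first_child = child
        else:
            self.last_child.next_sibling = child
        self.last_child = child
        return child

    def children(self):
        # Walk the forward links starting from the first child.
        node = self.first_child
        while node is not None:
            yield node
            node = node.next_sibling


root = Node("entry")
for label in ("a", "b", "c"):
    root.add_child(Node(label))

order = [child.name for child in root.children()]
print(order)
```

Prepending at the head would need no tail pointer, but it would reverse the iteration order relative to insertion, which is the trade-off the commit message alludes to.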
[mlir][nfc] Fix function definition names post #175880 (#176586)
Ensure that the input argument names for `verifyRanksMatch` in the
function definition match those in the declaration.
[mlir][Utils] Add verifyRanksMatch helper (NFC) (#175880)
This change builds on https://github.com/llvm/llvm-project/pull/174336,
which introduced shared VerificationUtils with an initial
verifyDynamicDimensionCount() method.
This patch adds a new verifyRanksMatch() verification utility that
checks if two shaped types have matching ranks and emits consistent
error messages. The utility is applied to several ops across multiple
MLIR dialects.
---------
Co-authored-by: Andrzej Warzyński <andrzej.warzynski at gmail.com>
[VPlan] Normalize selects to always select the data op when cond is true.
Fix a miscompile in the FindLast handling by normalizing selects
with the phi node as the first op to ones that select the data value
when the condition is true, by swapping operands and inverting the
condition.
This should ensure correct codegen for both cases.
Select normalization:
https://alive2.llvm.org/ce/z/yFdivK
Fixes a miscompile reported for 2abd6d6d7ac (#158088).
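The normalization relies on the identity select(c, phi, data) == select(not c, data, phi). A minimal sketch that checks this exhaustively, using plain Python values instead of VPlan recipes (the helper names `select` and `normalized_select` are made up for the example):

```python
def select(cond, on_true, on_false):
    """Model of a select: pick on_true when cond holds, else on_false."""
    return on_true if cond else on_false


def normalized_select(cond, phi, data):
    """Rewrite select(cond, phi, data) so that data is chosen when the
    (inverted) condition is true: swap the operands and negate cond."""
    return select(not cond, data, phi)


def forms_agree():
    # Compare the original and normalized forms for both condition values.
    return all(
        select(cond, "phi", "data") == normalized_select(cond, "phi", "data")
        for cond in (False, True)
    )


print(forms_agree())
```

The Alive2 link above is the actual proof for the IR-level transform; this script only illustrates the operand swap.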
[TwoAddressInstruction][NPM] Conditionally preserve SlotIndexes in NPM (#173536)
In the New PM, `SlotIndexesAnalysis` should only be preserved when
`LiveIntervals` was cached and available, as `SlotIndexes` are only
maintained when `LiveIntervals` analysis is available.
This fixes potential stale `SlotIndexes` issues when running with NPM
where `LiveIntervals` analysis wasn't requested by prior passes.
[libc][CMake] Add dependency on ELF headers for elf_proxy target (#176557)
Fixes a parallel build problem for the check-libc target where headers are
generated after they are needed. I think this was likely caused by
https://github.com/llvm/llvm-project/pull/172766.
InstCombine: Stop using nsz in multi-use min/max fold
In SimplifyDemandedFPClass, stop using nsz when there's a
mismatch in the sign of 0 for the various min and maxes.
Alive2 doesn't like it: https://alive2.llvm.org/ce/z/ZyhSGA,
presumably because of the possible mismatch between the stored
value and the propagated value. Maybe it would be OK if nsz is on all
the uses.