[InstCombine] Avoid propagating invalid metadata in FoldOpIntoSelect (#199155)
Fixes #186471
FoldOpIntoSelect may create a select with a different result type from
the original instruction. The existing implementation blindly copied all
metadata from the original select, which could propagate invalid
type-specific metadata to the transformed instruction.
In particular, folding an fcmp over a floating-point select could copy
!fpmath metadata onto a non-FP select, producing invalid IR and causing
verifier failures.
This change preserves only metadata that remains valid for the
transformed select and propagates !fpmath only for FP-typed selects.
Debug locations are also preserved explicitly.
[AArch64][GlobalISel] Add select to and combines (#200131)
This adds combines for
// select c, x, 0 -> and c, x
// select c, 0, x -> and (not c), x
// select (not c), x, y -> select c, y, x
We need to freeze the value in the first two. The second is only
profitable if hasAndNot, so it is excluded from all_combines.
https://alive2.llvm.org/ce/z/eG-aHT
This helps alleviate regressions when G_SELECT is made legal for vector
operations under AArch64. The AMD tests I am not sure about - let me
know if they look worse. The third combine is mostly useful
post-legalize.
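As a sanity check on the three rewrites listed above, here is a minimal
sketch that compares them exhaustively on 1-bit values. It models select as
a plain ternary on bool and does not model poison or undef, which is the
case the freeze is there to handle. The helper names sel and
selectRewritesHold are hypothetical, and none of this is GlobalISel code.

```cpp
#include <cassert>

// Toy model of a select on 1-bit values; poison and undef are not modeled.
constexpr bool sel(bool c, bool x, bool y) { return c ? x : y; }

// Returns true if all three rewrites agree with the original select
// for every combination of 1-bit inputs.
constexpr bool selectRewritesHold() {
  for (int c = 0; c < 2; ++c)
    for (int x = 0; x < 2; ++x)
      for (int y = 0; y < 2; ++y) {
        bool C = c != 0, X = x != 0, Y = y != 0;
        // select c, x, 0 -> and c, x
        if (sel(C, X, false) != (C && X))
          return false;
        // select c, 0, x -> and (not c), x
        if (sel(C, false, X) != (!C && X))
          return false;
        // select (not c), x, y -> select c, y, x
        if (sel(!C, X, Y) != sel(C, Y, X))
          return false;
      }
  return true;
}

// Checked at compile time (needs C++14 or later for the constexpr loops).
static_assert(selectRewritesHold(), "select rewrites must hold on 1-bit values");
```

The Alive2 link in the commit message remains the authoritative proof; this
only covers the poison-free boolean case.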
[RDA] Slightly optimize enterBasicBlock() (NFC) (#201608)
Instead of initializing LiveRegs and doing an elementwise std::max with
the first incoming predecessor, directly copy the data for the first
predecessor.
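The idea can be sketched outside of LLVM as follows. This is a toy model
under stated assumptions, not the RDA code: RegState, Default,
mergeInitThenMax, and mergeCopyFirst are hypothetical names, every
predecessor is assumed to carry one entry per register unit, and the
default value is assumed to be the minimum so that taking max with it is a
no-op.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using RegState = std::vector<int>; // one entry per register unit (toy model)

// Placeholder "nothing seen yet" value; assumed to be the minimum so that
// max(Default, v) == v for every v.
constexpr int Default = std::numeric_limits<int>::min();

// Before: fill with the default, then fold every predecessor in with max.
RegState mergeInitThenMax(const std::vector<RegState> &Preds,
                          std::size_t NumUnits) {
  RegState Live(NumUnits, Default);
  for (const RegState &P : Preds)
    for (std::size_t U = 0; U < NumUnits; ++U)
      Live[U] = std::max(Live[U], P[U]);
  return Live;
}

// After: copy the first predecessor directly, then fold in only the rest.
RegState mergeCopyFirst(const std::vector<RegState> &Preds,
                        std::size_t NumUnits) {
  if (Preds.empty())
    return RegState(NumUnits, Default);
  RegState Live = Preds.front();
  for (std::size_t I = 1; I < Preds.size(); ++I)
    for (std::size_t U = 0; U < NumUnits; ++U)
      Live[U] = std::max(Live[U], Preds[I][U]);
  return Live;
}
```

Both functions return the same result; the second skips one
initialization pass and one elementwise max pass over the register units.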
[flang-rt] Extension: accept '!' as value separator in NAMELIST input (#200441)
Treat '!' as a self-delimiting value separator when reading NAMELIST
input, so that "name=value!comment" is accepted without an intervening
blank, comma, slash, or end of record. This matches gfortran, ifx, and
classic nvfortran behavior on real-world namelist input files.
F2023 13.11.3.6 p.1 requires a value separator before a '!' comment
introducer in namelist input, so this is a documented extension. The
change does not affect '!' characters inside character literal
constants, which continue to be taken literally.
The extension was documented in flang/docs/Extensions.md.
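To illustrate the accepted input shape, here is a toy scanner for a single
"name=value" item. It is a sketch only and is not the flang-rt
implementation: the function name namelistValue is hypothetical, and it
handles just one item on one line.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy scanner: returns the value text following '=' in a single
// "name=value" item. Outside of quotes, a blank, comma, slash, or '!'
// ends the value; inside a quoted character literal, '!' is ordinary text.
std::string namelistValue(const std::string &Line) {
  std::size_t Pos = Line.find('=');
  if (Pos == std::string::npos)
    return "";
  std::size_t I = Pos + 1;
  while (I < Line.size() && Line[I] == ' ') // skip blanks after '='
    ++I;
  std::string Value;
  char Quote = '\0'; // active quote character, if any
  for (; I < Line.size(); ++I) {
    char C = Line[I];
    if (Quote != '\0') { // inside a character literal
      Value += C;
      if (C == Quote)
        Quote = '\0';
      continue;
    }
    if (C == '\'' || C == '"') {
      Quote = C;
      Value += C;
      continue;
    }
    if (C == ' ' || C == ',' || C == '/' || C == '!')
      break; // with the extension, '!' ends the value as well
    Value += C;
  }
  return Value;
}
```

With this model, "n=42!comment" yields the value 42, while the '!' inside
'a!b' is kept as part of the character literal.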
Assisted-by: AI
[mlir][x86] Fix - replace `vector.load` with `vector.transfer_read` (#201503)
This patch replaces `vector.load` with `vector.transfer_read` to load
elements as `vector<2x32xi8>` instead of `vector<64xi8>`. This fixes the
online shuffling for `int8` and `f8` types.
[MCP] Early exit if no copies (NFC) (#201602)
These two functions do expensive per-regunit work, but are no-ops if
there are no Copies, so short-circuit this case.
Include AArch64 SME builtins to compiler-rt for Bazel. (#196607)
Include the AArch64 SME (Scalable Matrix Extension) source files in the
compiler-rt builtins library when targeting aarch64. A selection based on
the OS platform chooses between the Apple and non-Apple sources.
[Clang] Correct diagnostic notes for C++11 range-based for statements with invalid iterator types (#201461)
Previously, diagnostic notes issued for errors caused by invalid
iterator types in C++11 range-based for statements reported the range type
as the iterator type instead of the invalid iterator type itself. The notes
now report the invalid iterator type.
[AArch64][GlobalISel] Add pattern to prevent scalar uqxtn fallback (#201546)
Previously, attempting to select the intrinsic
@llvm.aarch64.neon.scalar.uqxtn would cause GlobalISel to fall back to
SDAG.
This was both due to:
1. RegBankSelect placing the operands on gpr banks.
2. No instruction selection patterns for the intrinsic.
Add a pattern and fix RegBankSelect to place operands on the correct
banks.
[RISCV][P-ext] Support packed bswap/bitreverse. (#200448)
We can implement these using combinations of rev, rev8, and ppairoe.*.
Rename REV16->REV16_RV64. A hypothetical REV16 on RV32 would have a
different encoding like REV and REV8.
Long term we should probably custom lower these instead of having
complex isel patterns. That would allow additional optimizations. But I
think the isel patterns are fine as a starting point.
[Clang][Docs] Documented sentinel attribute (#196088)
The documentation of the sentinel attribute was missing. This PR
documents the behavior of the sentinel attribute.
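For readers unfamiliar with the attribute, a minimal usage sketch follows.
It uses the GNU-style spelling accepted by Clang and GCC; the function
totalLength is a hypothetical example and is not taken from the new
documentation.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstring>

// Sums the lengths of a null-terminated list of C strings. The attribute
// asks the compiler to warn at call sites whose last argument is not a
// null pointer.
__attribute__((sentinel))
std::size_t totalLength(const char *First, ...) {
  std::size_t Total = 0;
  va_list Args;
  va_start(Args, First);
  // Walk the arguments until the null sentinel is reached.
  for (const char *S = First; S != nullptr; S = va_arg(Args, const char *))
    Total += std::strlen(S);
  va_end(Args);
  return Total;
}
```

A call such as totalLength("ab", "cde", nullptr) is accepted silently,
whereas dropping the trailing nullptr should produce a missing-sentinel
warning and would read past the argument list at run time.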
[Bazel]: Pull from Bazel-Central-Registry for third party deps. (#197316)
The majority of these dependencies are available in the
[Bazel-Central-Registry](https://github.com/bazelbuild/bazel-central-registry)
(BCR) and to improve build performance for bzlmod users, llvm-project
should pull from the BCR to consolidate targets.
[X86] Fix MachineBlockInfo hash for machine-block-hash.mir (#201039)
I looked at the BlendedBlockHash function in
llvm/include/llvm/CodeGen/MachineBlockHashInfo.h and rewrote the failing
test.
---------
Co-authored-by: mattarde <mattarde at intel.com>
[clang][driver] Rename ClangExecutable and getClangProgramPath (NFC) (#200814)
This patch renames ClangExecutable to DriverExecutable and
getClangProgramPath to getDriverProgramPath. This makes the
names more neutral and less confusing when used in flang.
[MachineScheduler] Rework dag-maps-huge-region (#200945)
For compile time/memory reasons, dag-maps-huge-region is the number of
memory instructions at which we create a barrier and reset maps.
Previously we'd get to dag-maps-huge-region number of instructions, then
add a barrier in the middle of the current set of instructions, and
continue processing the second half of remaining instructions.
With this change, now we simply add a barrier every time we reach
dag-maps-huge-region number of memory instructions, and blow away all
previous instructions.
So now instead of waiting until we get to 1000 memory operations before
creating a barrier for 500 of them, we do it at 500 and do it for all
500.
With this change, -dag-maps-huge-region=500 still has
addChainDependencies() taking up over half of the codegen pipeline in
some cases I looked at, but it's much better than the previous 90%.
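The new policy can be sketched with a toy counter. This is an
illustration under stated assumptions, not the scheduler code: the function
barrierPoints is hypothetical, it only counts memory operations in visit
order, and it assumes the threshold is greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: walks NumMemOps memory operations and returns the indices at
// which a barrier is created. Every time the tracked set reaches HugeRegion
// entries, a barrier is emitted and the whole set is dropped.
std::vector<std::size_t> barrierPoints(std::size_t NumMemOps,
                                       std::size_t HugeRegion) {
  std::vector<std::size_t> Barriers;
  std::size_t Tracked = 0; // memory ops currently held in the maps
  for (std::size_t I = 0; I < NumMemOps; ++I) {
    ++Tracked;
    if (Tracked == HugeRegion) {
      Barriers.push_back(I); // the barrier covers all tracked ops
      Tracked = 0;           // reset the maps completely
    }
  }
  return Barriers;
}
```

In this model, 1200 memory operations with a threshold of 500 produce
barriers after the 500th and 1000th operations, and the maps never hold
more than 500 entries.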
[Dexter][NFC] Mark script tests unsupported for non-lldb debuggers (#201596)
The recently-added structured script feature currently relies on
DAP-based debuggers, of which the only one currently supported by Dexter
is LLDB. In order to prevent the tests that depend on this feature from
running for other debuggers, we require LLDB for the script test
directory.
[ThinLTO][AIX] Teach ModuleSummaryAnalysis to include globals referenced via !implicit.ref metadata
Globals referenced via !implicit.ref metadata are now included as explicit
reference edges in the ThinLTO module summary, via a new helper
findImplicitRefEdges. Imported implicit ref strings (available_externally
GVs) are added to llvm.compiler.used when ThinLTO interacts with pragma
comment copyright.
[mlir][Func][EmitC] Bail-out to avoid errors from MemRef array conversions (#198583)
Update FuncToEmitC to bail out before creating invalid EmitC ops for
unsupported cases.
FuncToEmitC now rejects functions, calls, and returns whose converted
result type is `emitc.array`, instead of relying on later `emitc.func`,
`emitc.call`, or `emitc.return` verifier failures.
This does not add support for returning memrefs from functions. It only
makes the existing limitation explicit at the conversion boundary.
## Tests
Added negative tests for the standalone conversion pass. This pass marks
their source ops illegal, so when a pattern bails out the pass reports a
legalization failure. This is the expected behavior and documents the
unsupported cases directly.
[5 lines not shown]