[libc++] Allows any types of size 4 and 8 to use native platform ulock_wait (#161086)
This is to address #146145
The issue before was that, for `std::atomic::wait/notify`, we only
support `uint64_t` to go through the native `ulock_wait` directly. Any
other types will go through the global contention table's `atomic`,
increasing the chances of spurious wakeup. This PR tries to allow any
types that are of size 4 or 8 to directly go to the `ulock_wait`.
This PR is just proof of concept. If we like this idea, I can go further
to update the Linux/FreeBSD branch and add ABI macros so the existing
behaviours are reserved under the stable ABI
Here are some benchmark results
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------
[48 lines not shown]
[mlir][emitc] Fix bug in dereference translation (#171028)
The op was not added to `hasDeferredEmission()` when introduced by
f17abc280c70, causing incorrect translation.
[mlir][emitc] Simplify inlining logic (NFCI) (#169978)
This change makes inlining logic in the translator simpler and more
consistent by
(a) Extending the inlining concept to include CExpression ops, which by
definition are inlined if and only if they reside within an
ExpressionOp.
(b) Concentraing all inlining decisions in `shouldBeInlined()` to make
sure that ops get the same decision when queried as operations and
as operands.
[mlir] Remove deprecated GEN_PASS_CLASSES. (#166904)
This was marked as deprecated in 2022, but as comment. Switch to error
to make visible and stop generating. Will remove the error message in
follow up, just felt this was easier for folks to understand compilation
errors. The change required to new form is rather minimal.
[LLVM][X86] Add f80 support for ct.select
Add special handling for x86_fp80 types in CTSELECT lowering by splitting
them into three 32-bit chunks, performing constant-time selection on each
chunk, and reassembling the result. This fixes crashes when compiling
tests with f80 types.
Also updated ctselect.ll to match current generic fallback implementation.
[LLVM][X86] Add native ct.select support for X86 and i386
Add native X86 implementation with CMOV instructions and comprehensive tests:
- X86 ISelLowering with CMOV for x86_64 and i386
- Fallback bitwise operations for i386 targets without CMOV
- Post-RA expansion for pseudo-instructions
- Comprehensive test coverage:
- Edge cases (zero conditions, large integers)
- i386-specific tests (FP, MMX, non-CMOV fallback)
- Vector operations
- Optimization patterns
The basic test demonstrating fallback is in the core infrastructure PR.
[LLVM][AArch64] Add native ct.select support for ARM64
This patch implements architecture-specific lowering for ct.select on AArch64
using CSEL (conditional select) instructions for constant-time selection.
Implementation details:
- Uses CSEL family of instructions for scalar integer types
- Uses FCSEL for floating-point types (F16, BF16, F32, F64)
- Post-RA MC lowering to convert pseudo-instructions to real CSEL/FCSEL
- Handles vector types appropriately
- Comprehensive test coverage for AArch64
The implementation includes:
- ISelLowering: Custom lowering to CTSELECT pseudo-instructions
- InstrInfo: Pseudo-instruction definitions and patterns
- MCInstLower: Post-RA lowering of pseudo-instructions to actual CSEL/FCSEL
- Proper handling of condition codes for constant-time guarantees
[LLVM][ARM] Add native ct.select support for ARM32 and Thumb
This patch implements architecture-specific lowering for ct.select on ARM
(both ARM32 and Thumb modes) using conditional move instructions and
bitwise operations for constant-time selection.
Implementation details:
- Uses pseudo-instructions that are expanded Post-RA to bitwise operations
- Post-RA expansion in ARMBaseInstrInfo for BUNDLE pseudo-instructions
- Handles scalar integer types, floating-point, and half-precision types
- Handles vector types with NEON when available
- Support for both ARM and Thumb instruction sets (Thumb1 and Thumb2)
- Special handling for Thumb1 which lacks conditional execution
- Comprehensive test coverage including half-precision and vectors
The implementation includes:
- ISelLowering: Custom lowering to CTSELECT pseudo-instructions
- ISelDAGToDAG: Selection of appropriate pseudo-instructions
- BaseInstrInfo: Post-RA expansion of BUNDLE to bitwise instruction sequences
[3 lines not shown]
[ConstantTime][RISCV] Add comprehensive tests for ct.select
Add comprehensive test suite for RISC-V fallback implementation:
- Edge cases (zero conditions, large integers, sign extension)
- Pattern matching (nested selects, chains)
- Vector support with RVV extensions
- Side effects and memory operations
The basic fallback test is in the core infrastructure PR.
[LLVM][AArch64] Add native ct.select support for ARM64
This patch implements architecture-specific lowering for ct.select on AArch64
using CSEL (conditional select) instructions for constant-time selection.
Implementation details:
- Uses CSEL family of instructions for scalar integer types
- Uses FCSEL for floating-point types (F16, BF16, F32, F64)
- Post-RA MC lowering to convert pseudo-instructions to real CSEL/FCSEL
- Handles vector types appropriately
- Comprehensive test coverage for AArch64
The implementation includes:
- ISelLowering: Custom lowering to CTSELECT pseudo-instructions
- InstrInfo: Pseudo-instruction definitions and patterns
- MCInstLower: Post-RA lowering of pseudo-instructions to actual CSEL/FCSEL
- Proper handling of condition codes for constant-time guarantees
[LLVM][ARM] Add native ct.select support for ARM32 and Thumb
This patch implements architecture-specific lowering for ct.select on ARM
(both ARM32 and Thumb modes) using conditional move instructions and
bitwise operations for constant-time selection.
Implementation details:
- Uses pseudo-instructions that are expanded Post-RA to bitwise operations
- Post-RA expansion in ARMBaseInstrInfo for BUNDLE pseudo-instructions
- Handles scalar integer types, floating-point, and half-precision types
- Handles vector types with NEON when available
- Support for both ARM and Thumb instruction sets (Thumb1 and Thumb2)
- Special handling for Thumb1 which lacks conditional execution
- Comprehensive test coverage including half-precision and vectors
The implementation includes:
- ISelLowering: Custom lowering to CTSELECT pseudo-instructions
- ISelDAGToDAG: Selection of appropriate pseudo-instructions
- BaseInstrInfo: Post-RA expansion of BUNDLE to bitwise instruction sequences
[3 lines not shown]
[LLVM][AArch64] Add native ct.select support for ARM64
This patch implements architecture-specific lowering for ct.select on AArch64
using CSEL (conditional select) instructions for constant-time selection.
Implementation details:
- Uses CSEL family of instructions for scalar integer types
- Uses FCSEL for floating-point types (F16, BF16, F32, F64)
- Post-RA MC lowering to convert pseudo-instructions to real CSEL/FCSEL
- Handles vector types appropriately
- Comprehensive test coverage for AArch64
The implementation includes:
- ISelLowering: Custom lowering to CTSELECT pseudo-instructions
- InstrInfo: Pseudo-instruction definitions and patterns
- MCInstLower: Post-RA lowering of pseudo-instructions to actual CSEL/FCSEL
- Proper handling of condition codes for constant-time guarantees
[LLVM][X86] Add f80 support for ct.select
Add special handling for x86_fp80 types in CTSELECT lowering by splitting
them into three 32-bit chunks, performing constant-time selection on each
chunk, and reassembling the result. This fixes crashes when compiling
tests with f80 types.
Also updated ctselect.ll to match current generic fallback implementation.
[LLVM][X86] Add native ct.select support for X86 and i386
Add native X86 implementation with CMOV instructions and comprehensive tests:
- X86 ISelLowering with CMOV for x86_64 and i386
- Fallback bitwise operations for i386 targets without CMOV
- Post-RA expansion for pseudo-instructions
- Comprehensive test coverage:
- Edge cases (zero conditions, large integers)
- i386-specific tests (FP, MMX, non-CMOV fallback)
- Vector operations
- Optimization patterns
The basic test demonstrating fallback is in the core infrastructure PR.
[ConstantTime][RISCV] Add comprehensive tests for ct.select
Add comprehensive test suite for RISC-V fallback implementation:
- Edge cases (zero conditions, large integers, sign extension)
- Pattern matching (nested selects, chains)
- Vector support with RVV extensions
- Side effects and memory operations
The basic fallback test is in the core infrastructure PR.