[clang-repl] Enable extending `launchExecutor` (#152562)
This patch introduces the ability to customize the fork process with an external lambda function. This is useful for downstream clients where they want to do stream redirection.
[NFC][Clang][Docs] Update Pointer Authentication documentation
This updates the pointer authentication documentation to include
a complete description of the existing functionaliy and behaviour,
details of the more complex aspects of the semantics and security
properties, and the Apple arm64e ABI design.
Co-authored-By: Ahmed Bougacha <ahmed at bougacha.org>
Co-authored-By: Akira Hatanaka <ahatanak at gmail.com>
Co-authored-By: John Mccall <rjmccall at apple.com>
[NFC][Clang][Docs] Update Pointer Authentication documentation
This updates the pointer authentication documentation to include
a complete description of the existing functionaliy and behaviour,
details of the more complex aspects of the semantics and security
properties, and the Apple arm64e ABI design.
Co-authored-By: Ahmed Bougacha <ahmed at bougacha.org>
Co-authored-By: Akira Hatanaka <ahatanak at gmail.com>
Co-authored-By: John Mccall <rjmccall at apple.com>
[AMDGPU] Generate some WQM/WWM tests (NFC) (#152635)
Update llvm.amdgcn.kill.ll and wqm.mir to be generated.
This preparatory work for refactoring of WQM/WWM pass.
Reland "[clang-tidy] fix bugprone-narrowing-conversions false positive for conditional expression" (#151874)
This is another attempt to merge previously
[reverted](https://github.com/llvm/llvm-project/pull/139474#issuecomment-3148339124)
PR #139474. The added tests
`narrowing-conversions-conditional-expressions.c[pp]` failed on
[different (non x86_64)
platforms](https://github.com/llvm/llvm-project/pull/139474#issuecomment-3148334280)
because the expected warning is implementation-defined. That's why the
test must explicitly specify target (the line `// RUN: -- -target
x86_64-unknown-linux`).
[libc] Migrate FEnvSafeTest and FPTest to ErrnoCheckingTest. (#152633)
This would ensure that errno value is cleared out before test execution
and tests pass even when LIBC_ERRNO_MODE_SYSTEM_INLINE is specified (and
errno may be clobbered before test execution).
A lot of the tests would fail, however, since errno would end up getting
set to EDOM or ERANGE during test execution and never validated before
the end of the test. This should be fixed - and errno should be
explicitly checked or ignored in all of those cases, but for now add a
TODO to address it later (see open issue #135320) and clear out errno in
test fixture to avoid test failures.
[mlir][linalg] Add mixed precision folding pattern in vectorize_children_and_apply_patterns TD Op (#148684)
In case of mixed precision inputs, the inputs are generally casted to
match output type thereby introduces arith.extFOp/extIOp instructions.
Folding such pattern into vector.contract is desirable for HW having
mixed precision ISA support.
This patch adds folding of mixed precision pattern into vector.contract
optionaly which can be enabled using attribute
`fold_type_extensions_into_contract`.
[lldb] Fix incorrect print of UUID and load address (#152560)
The current display is missing a space, for example:
```
no target │ Locating binary: 24906A83-0182-361B-8B4A-90A249B04FD7at 0x0000000c0d108000
```
Co-authored-by: Dominic Chen <daming_chen at apple.com>
[flang] Skip processing reductions for unstructured `do concurrent` loops (#150188)
Fixes #149563
When emitting unstructured `do concurrent` loops, reduction processing
should be skipped since we are not emitting `fir.do_concurrent` loop in
the first place.
[libc][math][c++23] Add {ceil,floor,round,roundeven,trunc}bf16 math functions (#152352)
This PR implements the following basic math functions for BFloat16 type
along with the tests:
- ceilbf16
- floorbf16
- roundbf16
- roundevenbf16
- truncbf16
---------
Signed-off-by: Krishna Pandey <kpandey81930 at gmail.com>
[mlir][gpu] Update attribute definitions in `gpu::LaunchOp` (#152106)
`gpu::LaunchOp` is updated the following way:
- Change the attribute type of kernel function and module from
`SymbolRefAttr` to `FlatSymbolRefAttr` to avoid nested symbol
references.
- Rename variables from camel case (kernelFunc, kernelModule) to lower
case (function, module) and update the syntax.
- `LaunchOp::build` support passing `module` and `function` attributes.
[lld][ELF] filter out section symbols when use BP reorder (#151685)
When using Temporal Profiling with the BP algorithm, we encounter an
issue with the internal function reorder. In cases where the symbol
table contains entries like:
```
Symbol table '.symtab' contains 45 entries:
Num: Value Size Type Bind Vis Ndx Name
10: 0000000000000000 0 SECTION LOCAL DEFAULT 18 .text.L1
11: 0000000000000000 24 FUNC LOCAL DEFAULT 18 L1
````
The zero-sized section symbol .text.L1 gets stored in the secToSym map
first. However, when the function lookup searches for L1 (as seen in
[BPSectionOrdererBase.inc:191](https://github.com/llvm/llvm-project/blob/main/lld/include/lld/Common/BPSectionOrdererBase.inc#L191)),
it fails to find the correct entry in rootSymbolToSectionIdxs because
the section symbol has already claimed that slot.
This patch fixes the issue by skipping zero-sized symbols during the
addSections process, ensuring that function symbols are properly
registered for lookup.
MC: Refine ALIGN relocation conditions
Each section now tracks the index of the first linker-relaxable
fragment, enabling two changes:
* Delete redundant ALIGN relocations before the first linker-relaxable
instruction in a section. The primary example is the offset 0
R_RISCV_ALIGN relocation for a text section aligned by 4.
* For alignments larger than the NOP size after the first
linker-relaxable instruction, ALIGN relocations are now generated, even in
norelax regions. This fixes the issue #150159.
The new test llvm/test/MC/RISCV/Relocations/align-after-relax.s
verifies the required ALIGN in a norelax region following
linker-relaxable instructions.
By using a fragment index within the subsection (which is less than or
equal to the section's index), the implementation may generate redundant
ALIGN relocations in lower-numbered subsections before the first
linker-relaxable instruction.
[30 lines not shown]
[DA] Fix the check between Subscript and Size after delinearization (#151326)
Delinearization provides two values: the size of the array, and the
subscript of the access. DA checks their validity (`0 <= subscript <
size`), with some special handling. In particular, to ensure `subscript
< size`, calculate the maximum value of `subscript - size` and check if
it is negative. There was an issue in its process: when `subscript -
size` is expressed as an affine format like `init + step * i`, the value
in the last iteration (`start + step * (num_iterations - 1)`) was
assumed to be the maximum value. This assumption is incorrect in the
following cases:
- When `step` is negative
- When the AddRec wraps
This patch introduces extra checks to ensure the sign of `step` and
verify the existence of nsw/nuw flags.
Also, `isKnownNonNegative(S - smax(1, Size))` was used as a regular
[5 lines not shown]
[libc] Fix typo and amend restrict qualifier (#152410)
This removes an extraneous ',' in the generated dlfcn header.
This also adds `__restrict` to `dladdr`'s declaration per POSIX. Another
fix is made: the C standard `restrict` keyword is removed from
dlinfo.cpp/dlinfo.h (but note that dlfcn.yaml still annotates
`__restrict` for dlinfo's decl).
[Offload][Conformance] Add support for CUDA Math and HIP Math providers (#152362)
This patch extends the conformance testing infrastructure to support two
new providers of math function implementations for GPUs: CUDA Math
(`cuda-math`) and HIP Math (`hip-math`).