[DataLayout] Add null pointer value infrastructure
Add support for specifying the null pointer bit representation per address space
in DataLayout via new pointer spec flags:
- 'z': null pointer is all-zeros
- 'o': null pointer is all-ones
When neither flag is present, the address space inherits the default set by the
new 'N<null-value>' top-level specifier ('Nz' or 'No'). If that is also absent,
the null pointer value is zero.
No target DataLayout strings are updated in this change. This is pure
infrastructure for a future ConstantPointerNull semantic change to support
targets with non-zero null pointers (e.g. AMDGPU).
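
The lookup order described above (per-address-space flag, then the 'N' default, then zero) can be sketched as a small standalone model. This is only an illustration: `LayoutModel`, `NullKind`, `nullKind` and `nullBits` are hypothetical names and are not the DataLayout API added by this change.

```cpp
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical model of the fallback rule; not the LLVM DataLayout API.
enum class NullKind { AllZeros, AllOnes };

struct LayoutModel {
  // Default from a top-level 'Nz' / 'No' specifier, if one was given.
  std::optional<NullKind> DefaultNull;
  // Per-address-space choice from a 'z' or 'o' pointer spec flag.
  std::map<unsigned, NullKind> PerAddrSpace;

  // Per-address-space flag first, then the 'N' default, then all-zeros.
  NullKind nullKind(unsigned AddrSpace) const {
    auto It = PerAddrSpace.find(AddrSpace);
    if (It != PerAddrSpace.end())
      return It->second;
    return DefaultNull.value_or(NullKind::AllZeros);
  }

  // Null bit pattern for a pointer of the given width (assumed 1..64 bits).
  uint64_t nullBits(unsigned AddrSpace, unsigned WidthInBits) const {
    if (nullKind(AddrSpace) == NullKind::AllZeros)
      return 0;
    return WidthInBits >= 64 ? ~0ULL : ((1ULL << WidthInBits) - 1);
  }
};
```

With no flags and no 'N' specifier, every address space in this model resolves to zero, matching the fallback stated above.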
[flang][OpenMP] Remove deferredNonVariables_ from OmpStructureChecker, NFC (#195100)
It was created to defer error messages about invalid argument types
until the end of the analysis of the construct. That is not necessary
since diagnostic messages are emitted in the order corresponding to
their location in the source, not the order they were generated.
[llubi] Vector manipulation intrinsics cleanup (#195004)
This PR fixes llvm.vector.insert and llvm.vector.extract by adding a
missing UB case and handling scalable vectors correctly.
See also #194345.
[DAG] expandVecReduce - widen sub-legal vectors to not prematurely scalarize later reduction levels (#194672)
When repeatedly splitting the pow2 vector source, we currently begin to
scalarize as soon as the split ops drop below the legal vector op type.
This patch attempts to widen the source vectors back to legal op types
to avoid excess scalarization / additional vector element extractions.
Fixes #194655
[libc][docs] Add nl_types.h POSIX header documentation (#122006) (#194373)
Add nl_types.h implementation-status docs to llvm-libc.
Depends on PR #194367. That change fixes docgen lookup for underscored
headers; without it, the implementation status of the nl_types.h macros
is not reported accurately.
[ProfileData] Use FORCE_ON for LLVM_ENABLE_OPENCSD (#194973)
Use FORCE_ON instead of ON to only report the error but proceed when the
dependency is not found.
[flang][openmp] Fix incorrect reduction for array section in OpenMP DO SIMD (#192394)
For "!omp do parallel simd reduction", ensure that the reduction for an
array section is done properly:
1) per-SIMD-lane reduction results are combined into the wsloop's
thread-local copies.
2) wsloop thread-local copies are combined across threads by the wsloop
reduction.
Issue is in [192077](https://github.com/llvm/llvm-project/issues/192077)
---------
Co-authored-by: Sunil Kuravinakop <kuravina at pe31.hpc.amslabs.hpecorp.net>
[libc] Stop passing `--version` to compiler when detecting target (#176680)
This reverts c267501c155f9, and also adds a `-c` flag.
Both gcc and clang print the `Target:` line that we're trying to find
just fine with just `-v`.
When passing `--version`, gcc passes `--version` to the system linker,
and when using gcc on macOS, the system linker does not understand
`--version`. Since `--version` does not seem to be necessary, drop it.
Also, passing `-c` lets gcc not print linker details, so add that too,
as a belt-and-suspenders fix.
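
For illustration only, finding the `Target:` line in captured `-v` output could look like the sketch below. The real detection lives in the libc CMake logic, not in C++; `findTargetTriple` is a hypothetical helper, and the sample output in the usage note is an assumption about what a compiler prints.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: returns the text after "Target:" on the first line
// that starts with that prefix, or nullopt if there is no such line.
std::optional<std::string> findTargetTriple(const std::string &Output) {
  const std::string Prefix = "Target:";
  std::istringstream In(Output);
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.compare(0, Prefix.size(), Prefix) != 0)
      continue;
    std::string Rest = Line.substr(Prefix.size());
    std::size_t Begin = Rest.find_first_not_of(" \t\r");
    if (Begin == std::string::npos)
      continue; // "Target:" with nothing after it; keep looking
    std::size_t End = Rest.find_last_not_of(" \t\r");
    return Rest.substr(Begin, End - Begin + 1);
  }
  return std::nullopt;
}
```

Given output containing a line such as `Target: x86_64-apple-darwin21`, the helper would return `x86_64-apple-darwin21`.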
---
Makes `cmake` succeed for me on my mac with
`/Applications/CMake.app/Contents/bin/cmake ../llvm-project/llvm -G
Ninja -DLLVM_ENABLE_PROJECTS="libc" -DCMAKE_BUILD_TYPE=Release
-DCMAKE_C_COMPILER=gcc-12 -DCMAKE_CXX_COMPILER=g++-12` (with gcc-12 from
homebrew).
[RISCV] Fix crash when tryReduceVL tries to sink to the end of the basic block. (#194706)
tryReduceVL may need to move an instruction to make the VL dominate. If
there is no instruction after the VL instruction, getNextNode will
return a nullptr.
Rewrite the code to use iterators so we will get an end iterator
instead. Replace the call to MachineInstr::moveBefore with the
equivalent MachineBasicBlock::splice, which works on iterators.
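
The iterator-based idea can be sketched with a plain std::list standing in for the basic block. This is an analogy under stated assumptions, not the RISC-V code: `Block` and `sinkAfter` are hypothetical names, and std::list::splice plays the role of the block-level move.

```cpp
#include <iterator>
#include <list>
#include <string>

// Stand-in for a basic block: an ordered list of instruction names.
using Block = std::list<std::string>;

// Hypothetical helper: move the instruction at Inst so that it sits
// directly after VLDef.
void sinkAfter(Block &BB, Block::iterator Inst, Block::iterator VLDef) {
  // When VLDef is the last element, std::next(VLDef) is BB.end(), which is
  // still a valid insertion point. A "next node" pointer would be null here.
  Block::iterator InsertPt = std::next(VLDef);
  // Splicing within one list relinks the node; it is a no-op if Inst is
  // already at the insertion point.
  BB.splice(InsertPt, BB, Inst);
}
```

Because the insertion point is an iterator, the "sink to the end of the block" case needs no special handling.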
[OpenMP][offload] Cross-team reductions with variable number of teams
This is the first patch in an upcoming series of patches that rework
OpenMP cross-team reductions.
This patch tries to be as minimal as possible and includes the following
changes:
1) Don't work through larger number of teams in chunks. Allocate a
suitable-sized global buffer for the team values and launch them all
at once. The last team that finishes uses a strided loop to reduce the
team values from the global buffer.
2) Inline the new functions to reduce register usage, get rid of spills,
and get rid of long switch-tables that codegen produced for the
indirect callbacks that are passed to the parallel/xteam reduction.*
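
A host-side sketch of the scheme in 1) follows, with teams simulated by sequential calls rather than GPU teams. `XTeamState` and `teamFinish` are hypothetical names, and using an atomic counter to detect the last team is an assumption made for this sketch, not a description of the offload runtime.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

struct XTeamState {
  std::vector<long> TeamValues;      // global buffer, one slot per team
  std::atomic<unsigned> Finished{0}; // number of teams that have stored
  explicit XTeamState(unsigned NumTeams) : TeamValues(NumTeams) {}
};

// Called once per team with that team's partial value. Returns true only in
// the team that finishes last and stores the final sum in Result.
// ThreadsPerTeam is assumed to be greater than zero.
bool teamFinish(XTeamState &S, unsigned TeamId, long Partial,
                unsigned ThreadsPerTeam, long &Result) {
  S.TeamValues[TeamId] = Partial;
  const std::size_t NumTeams = S.TeamValues.size();
  // acq_rel ordering makes earlier teams' stores visible to the last team.
  if (S.Finished.fetch_add(1, std::memory_order_acq_rel) + 1 != NumTeams)
    return false;
  // Last team: "thread" T reduces slots T, T + ThreadsPerTeam, ... (strided).
  std::vector<long> PerThread(ThreadsPerTeam, 0);
  for (unsigned T = 0; T < ThreadsPerTeam; ++T)
    for (std::size_t I = T; I < NumTeams; I += ThreadsPerTeam)
      PerThread[T] += S.TeamValues[I];
  // Combine the per-thread sums (an intra-team reduction on a real device).
  Result = 0;
  for (long V : PerThread)
    Result += V;
  return true;
}
```

The buffer is sized to the number of teams up front, so no chunking over teams is needed in this model.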
The performance benefits in comparison to the previous state are often
up to 5x-10x. I did not observe any performance regressions. Can be
reproduced using my benchmark suite https://github.com/ro-i/xteam-test
(6854b7abc8848702b5a2d9ce2ea02849b5dc590b). Set compiler paths in
[14 lines not shown]
Fix memcpy-operator= generation with restrict parameters. (#194906)
The below issue (and #63884) both report that we reject (and also
assert, because the memcpy failed) the memcpy we're generating for a
restrict field of a type with an implicit copy constructor.
First, we shouldn't be rejecting it this late. IF we wanted to reject it
(I contend we do not), we should do it at the same time we reject
const members and make this a deleted operator. Second, of course, we
shouldn't fail.
This patch NOW works by just having us skip the premature 'memcpy'
optimization here. In the end, the memcpy is generally skipped by
`CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr` in the example
(as this is a trivial type), but this reverts it to using a 'for' loop
for restrict, as it does for const, and volatile qualified values.
We might consider doing this for address-spaces/ptr-auth as well, but
at the moment, this fixes the restrict version.
Fixes: #37979
[AArch64][GlobalISel] Use generic matchinfo. NFC (#195094)
This removes some of the simple AArch64 matchinfo's, using the generic
alternatives instead.
[GlobalISel] Remove spaces at the ends of lines in Combine.td. NFC (#195086)
Some editors do this automatically. Clean up the file so that it doesn't
come up again and again.
[Offload] Remove use of raw Linux fd for error reporting (#195073)
Summary:
This is a blocker for building on Windows. We should be able to share the
common err stream that LLVM provides.
[libc] add proc number parser and sysconf wrapper (#194159)
Add best-effort detection of the number of processors.
Needed by STL to detect parallelism.
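
As a rough illustration of what a processor-count parser might do, the sketch below counts CPUs in a Linux-style CPU list such as "0-3,8-11". Both the input format and the name `countCpus` are assumptions made for this example; the commit message does not say what the libc parser actually reads.

```cpp
#include <cstddef>
#include <string>

// Hypothetical parser: counts CPUs in a list such as "0-3,8-11".
// Returns -1 on malformed input. Numeric overflow is not handled here.
long countCpus(const std::string &List) {
  long Count = 0;
  std::size_t Pos = 0;
  // Reads a run of decimal digits at Pos; returns false if there is none.
  auto readNum = [&](long &Out) {
    std::size_t Start = Pos;
    Out = 0;
    while (Pos < List.size() && List[Pos] >= '0' && List[Pos] <= '9')
      Out = Out * 10 + (List[Pos++] - '0');
    return Pos > Start;
  };
  while (Pos < List.size() && List[Pos] != '\n') {
    long Lo = 0, Hi = 0;
    if (!readNum(Lo))
      return -1;
    Hi = Lo;
    if (Pos < List.size() && List[Pos] == '-') {
      ++Pos;
      if (!readNum(Hi) || Hi < Lo)
        return -1;
    }
    Count += Hi - Lo + 1; // ranges are inclusive
    if (Pos < List.size() && List[Pos] == ',')
      ++Pos;
    else if (Pos < List.size() && List[Pos] != '\n')
      return -1; // unexpected character after a range
  }
  return Count;
}
```

A caller would presumably fall back to another source, or to 1, when the parser reports failure, in keeping with the best-effort intent.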
Assisted-by: Codex with gpt-5.4 high fast