[SPIRV]Implementing PopCount for 16 and 64 bits (#191283)
`OpBitCount` only supports 32-bit types, so this patch modifies the
codegen to follow a pattern similar to `firstbithigh` and `firstbitlow`.
8- and 16-bit parameters are zero-extended to 32 bits. 64-bit
parameters are bitcast into 2xi32 types. The logic is adapted to larger
component counts as well.
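A scalar C++ sketch of the widening and splitting idea described above. This is not the SPIR-V backend code: `popcount32` is a hypothetical stand-in for a 32-bit-only bit count, and summing the two halves in the 64-bit case is an assumption about how the 2xi32 result is combined.

```cpp
#include <cstdint>

// Hypothetical stand-in for a bit count that only accepts 32-bit inputs.
inline uint32_t popcount32(uint32_t v) {
  uint32_t n = 0;
  while (v) {
    v &= v - 1; // clear the lowest set bit
    ++n;
  }
  return n;
}

// 16-bit input: zero-extend to 32 bits, then count.
inline uint32_t popcount16(uint16_t v) {
  return popcount32(static_cast<uint32_t>(v));
}

// 64-bit input: view the value as two 32-bit halves (the scalar analogue
// of a bitcast to 2 x i32), count each half, and add the counts.
// Assumption: the halves are combined by addition.
inline uint32_t popcount64(uint64_t v) {
  uint32_t lo = static_cast<uint32_t>(v);
  uint32_t hi = static_cast<uint32_t>(v >> 32);
  return popcount32(lo) + popcount32(hi);
}
```

Zero-extension matters for the narrow types: sign-extending a value with the top bit set would add set bits that are not part of the original operand.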
Fix: https://github.com/llvm/llvm-project/issues/142677
---------
Co-authored-by: Joao Saffran <jderezende at microsoft.com>
Make GSYM 64 bit safe and add a new version 2 of the GSYM files (#190353)
# Motivation
GSYM files are approaching the point where they need 64 bit offsets.
We also want to add more global data to GSYM files. Right now the
GSYM file format is:
```
Header
AddressOffsets
AddressInfoOffsets
FileTable
StringTable
FunctionInfos
```
The `AddressOffsets`, `AddressInfoOffsets` and `FileTable` always
immediately follow the Header. The
`StringTable` is pointed to by the header and the header uses 32 bit
integers for the string table file offset and file size. The
[74 lines not shown]
[LSR][IndVarSimplify] Update assertion message (#192168)
rewriteLoopExitValues is called by both LSR and IndVarSimplify. Update
the assertion message to match this reality rather than only mentioning
IndVarSimplify.
[runtimes] Aggregate per-target runtime checks in top-level check-${runtime_name} (#191743)
When a per-target runtime build exports a
check-${runtime_name}-${target} proxy, make the top-level
check-${runtime_name} target depend on it, creating
check-${runtime_name} on demand (it may not exist).
This applies regardless of whether the runtime comes from the default
LLVM_ENABLE_RUNTIMES set or from a target-specific
RUNTIMES_<target>_LLVM_ENABLE_RUNTIMES override.
This allows a single `check-${runtime_name}` command to trigger all
per-target tests for that runtime.
[lldb] Fix deadlock when scripted frame providers load on private state thread (#191913)
Frame providers are an overlay on top of the parent reality (the
unwinder stack). The private state thread (PST) manages the stop of that
parent reality, so the correct view for PST logic IS the parent --
providers should only be applied once the process has settled and
clients query the stopped state.
When a scripted breakpoint's `was_hit` callback calls
`EvaluateExpression` on the PST, `RunThreadPlan` spawns an override PST
(Thread B) and reassigns `m_current_private_state_thread_sp` to it. Two
threads then need to see parent frames:
- Thread B (override PST): processes stop events via
`HandlePrivateEvent` -> `ShouldStop` -> `GetStackFrameList`. If it loads
a provider, the provider's Python code can acquire locks held by Thread
A, causing a deadlock.
- Thread A (original PST): processes events inline via
[25 lines not shown]
[BOLT] Fix DW_FORM_implicit_const values lost during DWARF5 rewriting
Summary:
Fix two bugs in DIEBuilder that caused DW_FORM_implicit_const values to
be zeroed out when rewriting DWARF5 debug sections (--update-debug-sections).
1. In constructDIEFast(), DWARFFormValue was constructed with just the
form code, leaving the value at 0. For DW_FORM_implicit_const,
extractValue() is a no-op since the value is expected to be pre-set.
Fix: use AttrSpec.getFormValue() which initializes the value from the
abbreviation table.
2. In assignAbbrev(), AddAttribute(Attr.getAttribute(), Attr.getForm())
used the two-argument overload which discards the implicit_const
value. Fix: use AddAttribute(Attr) to copy the full DIEAbbrevData.
Fixes https://github.com/llvm/llvm-project/issues/192084
[AMDGPU] Add object linking support for LDS and named barrier lowering in the middle end (#191645)
This is the first patch in a series introducing object linking support
for AMDGPU.
This PR adds the `-amdgpu-enable-object-linking` flag to enable object
linking in the backend. It also updates the `AMDGPULowerModuleLDSPass`
and `AMDGPULowerExecSync` passes to support lowering LDS and named
barrier globals when object linking is enabled.
[RISCV][GISel] Use a single FEQ for fcmp ord/uno x, x (#192022)
When both operands of an ORD/UNO compare are the same register,
the double-FEQ + AND sequence is redundant: a single FEQ x, x
gives the same result. Addresses the FIXME in selectFCmp.
[LV][VPlan] Print VPlan after construction and initial optimizations. NFC (#187443)
This patch adds a helper pass `printOptimizedVPlan` to print the plan at
the end of the VPlan construction and optimization pipeline.
This makes it possible to further clamp and attach the VF range after
`VPlanTransforms::optimize` without changing the test printing
(#172799).
[CMake] Pass ZLIB_LIBRARY_* to runtimes bootstrap (#191555)
Runtimes external project (compiler-rt / combined runtimes) reconfigures
with an initial cache that did not propagate `ZLIB_LIBRARY_RELEASE`.
CMake 4.x `FindZLIB` may leave `ZLIB_LIBRARY` unset while finding
headers, leading to:
```
-- Could NOT find ZLIB (missing: ZLIB_LIBRARY) (found version "...")
```
and later when loading LLVM exports from the main build:
```
The link interface of target "LLVMSupport" contains: ZLIB::ZLIB
but the target was not found.
```
This was found by building the Windows installer with:
```
llvm\utils\release\build_llvm_release.bat --x64 --version 23.0.0 --skip-checkout --local-python
```
[flang][CodeGen] Fix address space mismatch for CUF globals in AddrOfOpConversion (#190408)
AddrOfOpConversion in CodeGen.cpp only handled `LLVM::GlobalOp` when
determining the address space for `llvm.mlir.addressof`. When the global
was still a `fir::GlobalOp` (not yet converted), it fell back to address
space 0, breaking CUF constant globals (addr_space 4) and AMDGPU targets
(global addr_space 1).
This extends the upstream fix (#192111, which only covered Constant) to
also handle Shared and Managed CUF data attributes, and returns
`std::nullopt` instead of 0 for non-CUF globals so the target's default
address space is preserved.
[lldb][debugserver] Fix lldb testsuite routine parsing logs (#192157)
I changed how lldb and debugserver fetch binaries when attaching to a
process (only fetching the addresses of binaries, not the detailed
information) but a utility function was parsing the log file and
expected the detailed information in the initial response. Updated it to
expect detailed information in the initial response, or in the
subsequent query when the first response is addresses-only.
[CIR] Implement array-to-incomplete-array cast (#192138)
This is a noop cast that is allowed in some situations in C++20, and is
validated with one of the test suites. This patch adds a very defensive
NYI diagnostic to replace the other one, plus implements the array decay
case.
[lld][llvm-objcopy] Enable Xbox subsystem for PE images. (#191779)
This patch enables selecting the Xbox subsystem (IMAGE_SUBSYSTEM_XBOX)
for PE images. Certain existing tools used in the Xbox homebrew scene
expect images to use the Xbox subsystem, so it's nice to be able to set
this within the LLVM toolchain instead of invoking yet another tool or
manually patching the binaries.
[flang] Recognize generic allocations in Flang LICM. (#191923)
Instead of matching particular operations like `fir.alloca`
we can use `MemoryEffectOpInterface` to figure out if a location
is a new allocation.
[analyzer] Fix alignment of entries in -analyzer-help (#190570)
Fix a formatting bug in `AnalyzerOptions::printFormattedEntry` (used by
`clang -cc1 -print-analyzer-options`), which led to misalignment of a
checker description.
This commit ensures that `printFormattedEntry` inserts a newline in the
corner case when the length of the name of a checker is exactly equal to
`EntryWidth`. (In this situation the old code inserted a space between
the name and the description, so this description was not aligned with
the other descriptions.)
This commit also fixes the corner case where the pad before
the checker name (specified by `InitialPad`) is 0. Before the fix, due
to `llvm::raw_formatted_ostream::PadToColumn` logic, `InitialPad = 0`
still added one space character as padding before the checker name.
Fortunately `InitialPad = 0` was never used in the program, so this bug
was not visible to the user.
These changes are both tested by the freshly added unit tests.
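A minimal sketch of the two corner cases, using a plain `std::string` instead of `llvm::raw_formatted_ostream`. `formatEntry` is a hypothetical helper, not the real `printFormattedEntry`, and placing the description column at `initialPad + entryWidth` is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper that models the fixed behavior described above.
std::string formatEntry(const std::string &name, const std::string &desc,
                        std::size_t initialPad, std::size_t entryWidth) {
  // Emit exactly initialPad spaces, so a pad of 0 adds nothing.
  std::string out(initialPad, ' ');
  out += name;

  // Assumed column at which every description starts.
  const std::size_t descColumn = initialPad + entryWidth;

  if (name.size() >= entryWidth) {
    // The name fills or overflows its field (including the exact-fit
    // case): break the line and indent to the description column.
    out += '\n';
    out.append(descColumn, ' ');
  } else {
    // The name fits with room to spare: pad up to the description column.
    out.append(entryWidth - name.size(), ' ');
  }
  out += desc;
  return out;
}
```

With `entryWidth = 4`, a four-character name moves its description to the next line, and `initialPad = 0` produces no leading space.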
[RISCV][P-ext] Support plui.h/w in generateInstSeqImpl. (#192137)
There's some overlap in the pli/plui encodings. I've modified the code
to prefer pli.b over pli.h and to prefer pli.h over plui.h. This matches
what we do in the splat_vector path in RISCVISelDAGToDAG.
[sanitizer] Generalize FD closing in StartSubprocess (#192114)
Use internal_close_range with a fallback to the sysconf(_SC_OPEN_MAX)
loop. This removes the platform-specific #if and lets all platforms
benefit from close_range when supported.
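The try-then-fall-back shape can be sketched as follows. The callables are hypothetical stand-ins so the example stays portable: `closeRange` plays the role of `internal_close_range` (whose exact signature is not shown here), `closeOne` plays the role of a single-descriptor close, and `maxOpen` stands for the value of `sysconf(_SC_OPEN_MAX)`.

```cpp
#include <functional>

// Hypothetical stand-ins for the real primitives.
// closeRange returns 0 on success and non-zero when unsupported or failing.
using CloseRangeFn = std::function<int(unsigned first, unsigned last)>;
using CloseFn = std::function<void(int fd)>;

// Close every descriptor >= firstFd. Prefer a single range call and fall
// back to closing descriptors one by one up to maxOpen.
void closeFdsFrom(int firstFd, long maxOpen, const CloseRangeFn &closeRange,
                  const CloseFn &closeOne) {
  if (closeRange(static_cast<unsigned>(firstFd), ~0u) == 0)
    return; // the range call handled everything

  for (int fd = firstFd; fd < maxOpen; ++fd)
    closeOne(fd);
}
```

Because the fallback is selected at run time from the return value, no platform-specific `#if` is needed at the call site.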
Follow-up to #191971.
Make Passes Required - func-properties-stats and instcount (#192130)
These passes count different types of instructions, and we want to see
their output even when optnone is enabled.