[Clang][HLSL] Fix invalid flag passed by the driver (#170300)
The tests were using the DXC driver in Clang, which adds the
`--spirv-ext=` option. Turns out some buildbots are built without this
flag support, meaning any test using this driver would fail with an
'unknown command line argument' error.
[MLIR][Presburger] Fix Gaussian elimination (#164437)
In the Presburger library, there are two minor bugs in Gaussian
elimination.
In Barvinok.cpp, the `if (equations(i, i) != 0) continue;` is intended
to skip only the row swap, but it in fact skips the whole loop
body, including the elimination steps.
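The control-flow problem can be illustrated with a minimal sketch. This is not the Barvinok.cpp code: `Matrix` and `eliminateBelowDiagonal` are hypothetical names, and fraction-free integer row operations are assumed for simplicity.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Hypothetical helper (an assumption, not the library's API): reduce a square
// integer matrix to upper-triangular form with fraction-free row operations.
// Returns false if some column has no usable pivot.
bool eliminateBelowDiagonal(Matrix &m) {
  const std::size_t n = m.size();
  for (std::size_t i = 0; i < n; ++i) {
    // Only the row swap is conditional on a zero pivot. Writing
    // `if (m[i][i] != 0) continue;` here instead would also skip the
    // elimination below whenever the pivot is already nonzero.
    if (m[i][i] == 0) {
      std::size_t r = i + 1;
      while (r < n && m[r][i] == 0)
        ++r;
      if (r == n)
        return false;
      std::swap(m[i], m[r]);
    }
    // Elimination runs for every column, swapped or not.
    for (std::size_t j = i + 1; j < n; ++j) {
      const long long factor = m[j][i];
      if (factor == 0)
        continue;
      const long long pivot = m[i][i];
      for (std::size_t k = 0; k < n; ++k)
        m[j][k] = m[j][k] * pivot - m[i][k] * factor;
    }
  }
  return true;
}
```

With the buggy `continue`, a matrix whose diagonal is already nonzero would come back unchanged; with the guard above, its below-diagonal entries are cleared.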
In IntegerRelation.cpp, the Gaussian elimination forgets to advance
`firstVar` (the number of finished columns) when it finishes a column.
Moreover, when it searches for the pivot row of each column, it does not
ignore the rows it has already considered.
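A minimal sketch of a column sweep that respects both points is given here. It is an illustration under stated assumptions, not the IntegerRelation.cpp implementation: `Rows`, `reduceEqualities`, and the `done` counter are hypothetical, and the last entry of each row is assumed to be the constant term.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Rows = std::vector<std::vector<long long>>;

// Hypothetical sketch (names are assumptions). Reduces the equality rows in
// place and returns the number of pivot rows found.
std::size_t reduceEqualities(Rows &rows, std::size_t numVars) {
  std::size_t done = 0; // rows [0, done) already serve as pivots
  for (std::size_t col = 0; col < numVars && done < rows.size(); ++col) {
    // Search for a pivot only among rows not yet used as pivots.
    std::size_t pivot = done;
    while (pivot < rows.size() && rows[pivot][col] == 0)
      ++pivot;
    if (pivot == rows.size())
      continue; // no pivot in this column; move on to the next one
    std::swap(rows[done], rows[pivot]);
    // Eliminate this column from every other row.
    for (std::size_t r = 0; r < rows.size(); ++r) {
      if (r == done || rows[r][col] == 0)
        continue;
      const long long f = rows[r][col];
      const long long p = rows[done][col];
      for (std::size_t k = 0; k < rows[r].size(); ++k)
        rows[r][k] = rows[r][k] * p - rows[done][k] * f;
    }
    ++done; // the column is finished; the loop header advances `col`
  }
  return done;
}
```

Because the column index always advances and the pivot search starts at `done`, a row that already acts as a pivot is never picked again for a later column.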
As an example, suppose the constraints are
```
1 0 0 1 2 = 0
0 1 0 0 3 = 0
0 0 0 1 4 = 0
[8 lines not shown]
Revert "Revert "[LLDB] Update Shell lit config to handle c8031c3dd743"" (#170312)
Reverts llvm/llvm-project#170288
Turns out this was not the cause of the failure
[CUDA][HIP] Fix CTAD for host/device constructors (#168711)
Clang currently does not allow using CTAD in CUDA/HIP device functions
since deduction guides are treated as host-only. This patch fixes that
by treating deduction guides as host+device. The rationale is that
deduction guides do not actually generate code in IR, and there is an
existing check for device/host correctness for constructors.
The patch also suppresses duplicate implicit deduction guides from
host/device constructors with identical signatures and constraints
to prevent ambiguity.
For CUDA/HIP, deduction guides are now always implicitly enabled for
both host and device, which matches nvcc's effective behavior. Unlike
nvcc, which silently ignores explicit CUDA/HIP target attributes on
deduction guides, Clang diagnoses such attributes as errors to keep
the syntax clean and avoid confusion.
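As a rough illustration of the kind of code this is meant to enable, here is a small sketch. `Box` and `unwrapOnDevice` are hypothetical names, and the `#ifndef` guard assumes the CUDA/HIP headers provide `__host__` and `__device__` as macros, so that the same file also builds as plain C++17.

```cpp
#include <type_traits>

// Assumption: under CUDA/HIP these are macros from the runtime headers.
// Elsewhere, define them away so the sketch compiles as ordinary C++.
#ifndef __device__
#define __host__
#define __device__
#endif

template <typename T>
struct Box {
  T value;
  __host__ __device__ Box(T v) : value(v) {}
};

// CTAD inside a device function: the implicit deduction guide derived from
// the constructor is what deduces Box<int> here.
__device__ inline int unwrapOnDevice() {
  Box b(42);
  static_assert(std::is_same_v<decltype(b), Box<int>>,
                "CTAD should deduce Box<int>");
  return b.value;
}
```

The deduction guide itself carries no target attribute in this sketch, which is consistent with the description above that explicit CUDA/HIP attributes on deduction guides are diagnosed.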
This ensures CTAD works correctly in CUDA/HIP for constructors with
[21 lines not shown]
ipfilter: Load optionlist prior to ippool invocation
As a safety precaution df381bec2d2b limits ippool hash table size to 1K.
This causes any legitimately large hash table to fail to load. The
htable_size_max ipf tunable adjusts this but the adjustment is made
in the ipfilter rc script, invoked after the ippool script (because it
depends on ippool). Let's load the ipfilter_optionlist in ippool as well.
ipfilter_optionlist load will also occur in the ipfilter rc script in case
the user uses ipfilter without ippool.
Fixes: df381bec2d2b
(cherry picked from commit d5d005e9bf4933d5680dd0bb5d42bdf440122aa4)
[WIP][IR][Constants] Change the semantic of `ConstantPointerNull` to represent an actual `nullptr` instead of a zero-value pointer
The value of a `nullptr` is not always `0`. For example, on AMDGPU, the `nullptr` in address spaces 3 and 5 is `0xffffffff`. Currently, there is no target-independent way to get this information, making it difficult and error-prone to handle null pointers in target-agnostic code.
We do have `ConstantPointerNull`, but it might be a little confusing and misleading. It represents a pointer with an all-zero value rather than necessarily a real `nullptr`. Therefore, to represent a real `nullptr` in address space `N`, we need to use `addrspacecast ptr null to ptr addrspace(N)` and it can't be folded.
In this PR, we change the semantics of `ConstantPointerNull` to represent an actual `nullptr` instead of a zero-value pointer. Here are the detailed changes:
* `ptr addrspace(N) null` will represent the actual `nullptr` in address space `N`.
* `ptr addrspace(N) zeroinitializer` will represent a zero-value pointer in address space `N`.
* `Constant::getNullValue` will return a _null_ value. It is the same as the current semantics except for `PointerType`, for which it will return a real `nullptr`.
* `Constant::getZeroValue` will return a zero-value constant. It is exactly the same as the current semantics. To represent a zero-value pointer, a `ConstantExpr` will be used (effectively `inttoptr i8 0 to ptr addrspace(N)`).
* Correspondingly, there will be both `Constant::isNullValue` and `Constant::isZeroValue`.
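The distinction between a null value and a zero value can be pictured with a toy model. None of the functions below are LLVM APIs; they are hypothetical, and treating every address space other than 3 and 5 as having an all-zero null pointer is a simplifying assumption made only for this sketch.

```cpp
#include <cstdint>

// Toy model only (hypothetical, not an LLVM API). Bit pattern of the real
// null pointer in an AMDGPU address space, using the values quoted above.
// Assumption: every other address space is treated as having a zero null.
constexpr std::uint64_t nullPointerBits(unsigned addrSpace) {
  return (addrSpace == 3 || addrSpace == 5) ? 0xffffffffu : 0u;
}

// A zero-value pointer is all-zero bits in every address space.
constexpr std::uint64_t zeroPointerBits(unsigned /*addrSpace*/) { return 0; }

// The two notions coincide only where the null pointer happens to be zero.
constexpr bool nullIsZero(unsigned addrSpace) {
  return nullPointerBits(addrSpace) == zeroPointerBits(addrSpace);
}
```

In this model `nullIsZero(3)` and `nullIsZero(5)` are false, which is the case where `ptr addrspace(N) null` and `ptr addrspace(N) zeroinitializer` would denote different constants.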
The RFC is https://discourse.llvm.org/t/rfc-introduce-sentinel-pointer-value-to-datalayout/85265. It is a little old and the title might look different, but everything eventually converges to this change. An early attempt can be found in https://github.com/llvm/llvm-project/pull/131557, which contains much valuable discussion as well.
This PR is still WIP but any early feedback is welcome. I'll include as many of the necessary code changes as possible in this PR, but eventually this needs to be carefully split into multiple PRs, and I'll do that after the changes look good to everyone.
Add unregister
The goal of unregister is to remove the record from the database that
a package is installed, without touching the files or running the scripts
(we want to keep the generated data, for example).
This will allow people to migrate from a pkgbase to a non-pkgbase install.
Sponsored by: Beckhoff Automation GmbH & Co. KG
[SPIRV] Add support for CodeSectionINTEL storage class in legalizer (#167961)
The
[SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc)
extension defines a new storage class `CodeSectionINTEL` that is
represented in LLVM IR as `addrspace(9)`.
Per the spec, it is basically not allowed to be cast to, or interact
with, pointers in other storage classes.
Add `addrspace(9)` as a known pointer type to the legalizer, and then
add some error cases for IR that is impossible to legalize.
Right now, if you try to run the backend on input with SPIR-V, basically
everything errors saying that it is unable to legalize because `ptr
addrspace(9)` is not considered a pointer type.
Ideally the FE should not generate the illegal IR or error out earlier,
but we should catch it before generating invalid SPIR-V.
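The restriction described above can be written as a small predicate. This is only a sketch of the stated rule: `CodeSectionAddrSpace` and `crossesCodeSection` are hypothetical names and are not part of the SPIR-V backend.

```cpp
// Hypothetical helper, not backend code. `addrspace(9)` is the LLVM IR
// representation of CodeSectionINTEL, per the description above.
constexpr unsigned CodeSectionAddrSpace = 9;

// True when a cast would move a pointer into or out of CodeSectionINTEL,
// i.e. exactly one side of the cast is addrspace(9).
constexpr bool crossesCodeSection(unsigned srcAddrSpace,
                                  unsigned dstAddrSpace) {
  return (srcAddrSpace == CodeSectionAddrSpace) !=
         (dstAddrSpace == CodeSectionAddrSpace);
}
```

A cast for which this returns true is the kind of IR the message calls impossible to legalize; which exact cases the patch rejects is not spelled out here.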
[3 lines not shown]
AMDGPU: Fix treating unknown mem operands as uniform (#168980)
The test changes are mostly GlobalISel specific regressions.
GlobalISel is still relying on isUniformMMO, but it doesn't really
have an excuse for doing so. These should be avoidable with new
regbankselect.
There is an additional regression for addrspacecast for cov4. We
probably ought to be using a separate PseudoSourceValue for the
access of the queue pointer.
[Clang][CodeGen] Remove explicit insertion of AllocToken pass (#169360)
Remove explicit insertion of the AllocTokenPass, which is now handled by
the PassBuilder. Emit AllocToken configuration as LLVM module flags to
persist into the backend.
Specifically, this also means it will now be handled by LTO backend
phases; this avoids interference with other optimizations (e.g. PGHO)
and enables late heap-allocation optimizations with LTO enabled.
clang/AMDGPU: Enable opencl 2.0 features for unknown target
Assume amdhsa triples support flat addressing, which matches
the backend logic for the default target. This fixes the
rocm device-libs build.
clang/AMDGPU: Add missing opencl feature macros
This is a partial fix for the rocm device-libs build. This
was most likely broken by 423bdb2bf257e19271d62e60b6339d84b8ce05aa