AMDGPU: Strip sign bit operations on llvm.amdgcn.trig.preop uses (#179712)
The instruction ignores the sign bit, so we can find the magnitude
source. The real library use has a fabs input which this avoids.
stripSignOnlyFPOps should probably go directly into PatternMatch in some
form.
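As a rough illustration of the idea only (this is not the LLVM code), the sketch below peels sign-only operations off a toy expression tree to find the magnitude source. `Expr`, `Kind`, and `stripSignOnlyOps` are hypothetical stand-ins for LLVM `Value`s and the real helper, and the set of operations treated as sign-only (fneg, fabs, copysign) is an assumption.

```cpp
#include <cassert>

// Toy expression node; a hypothetical stand-in for an LLVM Value.
enum class Kind { Leaf, FNeg, FAbs, CopySign };

struct Expr {
  Kind kind;
  const Expr *mag;  // operand that supplies the magnitude
  const Expr *sign; // sign operand (only meaningful for CopySign)
  Expr(Kind k, const Expr *m = nullptr, const Expr *s = nullptr)
      : kind(k), mag(m), sign(s) {}
};

// Walk through operations that can only change the sign bit and return the
// expression that supplies the magnitude. A user that ignores the sign bit
// can consume this result directly.
const Expr *stripSignOnlyOps(const Expr *e) {
  assert(e && "expected a non-null expression");
  while (e->kind == Kind::FNeg || e->kind == Kind::FAbs ||
         e->kind == Kind::CopySign)
    e = e->mag;
  return e;
}
```

With this model, an input of the form fneg(fabs(x)) is reduced to x before it reaches the sign-insensitive user.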
[NFC][emacs] Fix emacs lints in the LLVM and MLIR modes (#182074)
This mainly involved explicitly declaring minimum emacs versions for
setq-local and adding a lexical-binding annotation.
The commit also removes some workarounds from the MLIR mode for Emacs 23
(!).
docs: Delete incorrect code generation section of HowToSubmitABug (#182315)
I've never used this. Based on the description here, I'm assuming
it relied on the C backend, which was removed in 2012.
tools: Remove untested PluginLoader includes (#117644)
As far as I can tell there are 2 parallel plugin mechanisms.
opt -load=plugin does not work, and is ignored. opt -load-pass-plugin
does work. PluginLoader.h forces a static definition of the "load"
cl::opt into included TUs. Delete the cases with no tests.
[mlir][acc] Add attributes for parallelism dimensions (#182209)
As OpenACC is lowered toward an eventual GPU mapping (via the GPU
dialect), we need to track parallelism assignment. That information
guides how variables get privatized, how barriers and synchronizations
are inserted to ensure the appropriate OpenACC execution model, and how
loop work-sharing is done. This adds GPUParallelDimAttr and
GPUParallelDimsAttr for this.
[WebAssembly] Fix SELECT_CC lowering for reference types (#181622)
SELECT_CC nodes with externref or funcref return types were not being
expanded, causing "Cannot select" errors during instruction selection.
This adds SELECT_CC to the list of operations that should be expanded
for reference types, similar to how it's already handled for scalar
types (i32, i64, f32, f64). This allows the SELECT_CC to be lowered to a
SELECT node, which already has instruction patterns defined in
WebAssemblyInstrRef.td.
[X86] combineCMov - fold CMOV(LOAD(PTR0),LOAD(PTR1)) -> LOAD(CMOV(PTR0,PTR1)) (#182084)
As discussed on #182021 - if we have equivalent simple loads (chain,
addressspace etc.), just with different pointers then we can select
between the pointers directly and perform just a single load, which in
most cases will avoid branching.
A future patch might be able to further simplify some (mainly stack?)
address math with CMOV(X,ADD(X,C1)) -> ADD(X,CMOV(0,C1)) /
CMOV(ADD(X,C0),ADD(X,C1)) -> ADD(X,CMOV(C0,C1))
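The equivalences above can be modelled at the C++ level. This is only a sketch of the semantics, not of the DAG combine itself; the function names are hypothetical, and it assumes both pointers are valid and the loads are simple (non-volatile, non-atomic).

```cpp
#include <cassert>
#include <cstdint>

// Before the fold: both loads are performed, then one value is selected.
int32_t selectOfLoads(bool cond, const int32_t *p0, const int32_t *p1) {
  assert(p0 && p1 && "both pointers must be dereferenceable");
  int32_t v0 = *p0;
  int32_t v1 = *p1;
  return cond ? v0 : v1;
}

// After the fold: select between the pointers, then perform a single load.
int32_t loadOfSelect(bool cond, const int32_t *p0, const int32_t *p1) {
  assert(p0 && p1 && "both pointers must be dereferenceable");
  const int32_t *p = cond ? p0 : p1;
  return *p;
}

// Possible follow-up: CMOV(X, ADD(X, C1)) -> ADD(X, CMOV(0, C1)).
uint64_t selectOfAdd(bool cond, uint64_t x, uint64_t c1) {
  return cond ? x : x + c1;
}
uint64_t addOfSelect(bool cond, uint64_t x, uint64_t c1) {
  return x + (cond ? 0 : c1);
}
```

Both forms return the same value for either condition; the folded form needs only one load.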
llvm: Delete bugpoint
For crash reduction, I don't think it does anything that llvm-reduce
can't. Pass pipeline reduction also has a separate reduction script.
The main thing without a replacement tool is the miscompilation
reducer, but I'm not sure that has actually functioned for years.
There are still some references to bugpoint in various comments
and pieces of documentation that don't all necessarily make sense
to replace or remove. In particular there are a few passes documented
as "only for bugpoint", but I've left those alone in case they are
useful for manual reductions.
[AArch64][llvm] Remove `+xs` gating for `tlbip *nxs` instructions
A recent specification update has removed FEAT_XS gating for `tlbip *nxs`
instructions. It remains gated on FEAT_XS for `tlbi *nxs` instructions.
[MLIR] Fix a crash in CollapseLinalgDimensions (#181715)
This patch fixes #181610.
Added a check in areDimSequencesPreserved() to verify that each map is a
projected permutation before calling isDimSequencePreserved(). If a map
is not a projected permutation, the function returns false (dimension
sequences cannot be preserved in non-projected-permutation maps).
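The guard can be sketched on a toy model of affine maps. Everything here is hypothetical and simplified: `ToyMap` is not MLIR's `AffineMap`, and the three functions only mirror the names used in the message. Each map result is either a dimension index or `kNotADim` for any other expression.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

constexpr int kNotADim = -1; // any result that is not a plain dimension

struct ToyMap {
  int numDims;
  std::vector<int> results;
};

// True if every result is a distinct, in-range dimension.
bool isProjectedPermutation(const ToyMap &m) {
  assert(m.numDims >= 0);
  std::vector<bool> seen(m.numDims, false);
  for (int r : m.results) {
    if (r < 0 || r >= m.numDims || seen[r])
      return false;
    seen[r] = true;
  }
  return true;
}

// True if the dims in seq appear contiguously and in order, or not at all.
bool isDimSequencePreserved(const ToyMap &m, const std::vector<int> &seq) {
  if (seq.empty())
    return true;
  auto it = std::find(m.results.begin(), m.results.end(), seq.front());
  if (it == m.results.end()) {
    for (int d : seq)
      if (std::find(m.results.begin(), m.results.end(), d) != m.results.end())
        return false;
    return true;
  }
  for (int d : seq) {
    if (it == m.results.end() || *it != d)
      return false;
    ++it;
  }
  return true;
}

bool areDimSequencesPreserved(const std::vector<ToyMap> &maps,
                              const std::vector<std::vector<int>> &seqs) {
  for (const ToyMap &m : maps) {
    // The guard: bail out before the per-sequence check, which only
    // makes sense for projected permutations.
    if (!isProjectedPermutation(m))
      return false;
    for (const auto &seq : seqs)
      if (!isDimSequencePreserved(m, seq))
        return false;
  }
  return true;
}
```

In this toy model, a map with a non-dimension result is rejected up front instead of being handed to the per-sequence check.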
[mlir][Interface] Allow scalar operands and require ranked shaped operands in IndexingMapOpInterface (#179072)
This change adjusts `IndexingMapOpInterface::verifyImpl`: scalars are
allowed as operands (treated as rank-0), vectors remain allowed, and
unranked tensors/memrefs are rejected with explicit diagnostics.
Fixes https://github.com/llvm/llvm-project/issues/179043
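A minimal sketch of the verification rule described above, on a toy type model. `TypeKind`, `ToyOperand`, `verifyOperand`, and the diagnostic strings are all hypothetical; the rank-versus-map-results comparison is an assumption about what the rank is checked against, and the real verifier works on MLIR types.

```cpp
#include <cassert>
#include <string>

enum class TypeKind { Scalar, Vector, RankedShaped, UnrankedShaped };

struct ToyOperand {
  TypeKind kind;
  int rank; // ignored for Scalar and UnrankedShaped
};

// Returns an empty string on success, otherwise a diagnostic message.
std::string verifyOperand(const ToyOperand &operand, int numMapResults) {
  assert(numMapResults >= 0);
  // Unranked tensors/memrefs are rejected explicitly.
  if (operand.kind == TypeKind::UnrankedShaped)
    return "operand must be ranked";
  // Scalars are treated as rank-0; vectors and ranked shaped types
  // carry their own rank.
  int rank = operand.kind == TypeKind::Scalar ? 0 : operand.rank;
  if (rank != numMapResults)
    return "indexing map result count does not match operand rank";
  return "";
}
```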
[flang] Lowering an ArrayCoorOp to arithmetic computations when a fir memref is a block argument (#182139)
Remove the special-case that handled `fir.array_coor` with a
block-argument base by converting the element ref result (!fir.ref<i32>
-> memref<i32>) and leaving fir.array_coor alive.
Instead, we now always convert the base (!fir.ref<!fir.array<...>> ->
memref<...>) and compute the memref indices from the fir.array_coor
operands, so loads/stores become memref.load/store base[indices] and
fir.array_coor can be erased when it’s only used by memory ops.
[AMDGPUEmitPrintf] Fix operand order
Fix a typo in 87eee80dad79417e079c369b9ff5578873019b78, the
CreatePtrDiff operands were supposed to be the other way around.
[RISCV] Add ComplexPatterns for matching xor/vmnot_vl+vmset_vl. NFC (#182071)
Xor is commutable and we don't guarantee which operand will be the
vmset_vl. Tablegen will generate all possible permutations when creating
RISCVGenDAGISel.inc. These xor/vmnot_vl are used by other commutable
nodes leading to quite a few patterns being generated by tablegen.
By using a ComplexPattern we can handle both cases with one piece of C++
code. This reduces the isel table by 2-3k.
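To illustrate why one piece of C++ can replace several generated patterns, here is a toy matcher that accepts the all-ones operand on either side of an xor. `Node`, `Op`, and `selectInvertedMask` are hypothetical; the real code is a ComplexPattern selector over SelectionDAG nodes, and `AllOnes` stands in for vmset_vl.

```cpp
#include <cassert>

enum class Op { Leaf, AllOnes, Xor };

struct Node {
  Op op;
  const Node *lhs;
  const Node *rhs;
  Node(Op o, const Node *l = nullptr, const Node *r = nullptr)
      : op(o), lhs(l), rhs(r) {}
};

// Match xor(x, all-ones) in either operand order. On success, store x in
// `inverted` and return true. Xor nodes are assumed to have two operands.
bool selectInvertedMask(const Node *n, const Node *&inverted) {
  assert(n && "expected a non-null node");
  if (n->op != Op::Xor)
    return false;
  if (n->rhs->op == Op::AllOnes) {
    inverted = n->lhs;
    return true;
  }
  if (n->lhs->op == Op::AllOnes) {
    inverted = n->rhs;
    return true;
  }
  return false;
}
```

A single function covers both operand orders, so users of the matcher do not need one pattern per permutation.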
[AMDGPU] BackOffBarrier feature added to gfx1250; Removed incorrect "DS Store drain" check. (#179818)
Add the missing BackOffBarrier feature to gfx1250.
Seeing only an S_BARRIER does not imply that prior DS stores have
drained, so the incorrect check is removed.
docs: Update HowToSubmitABug to use llvm-reduce instead of bugpoint (#182310)
Convert the crash section to recommend llvm-reduce, and stop mentioning
bugpoint. The miscompilation section still uses bugpoint.