LLVM/project 9625cf6utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[BAZEL] Add missing dependency on /llvm:Support from XeGPUTransformOps (#167332)

Fixes 1553f90f93d30b41457097cf274c3791b182f316
DeltaFile
+1-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+1-01 files

LLVM/project d7dc554bolt/lib/Passes PointerAuthCFIAnalyzer.cpp, bolt/test/AArch64 negate-ra-state-incorrect.s

[BOLT] Use opts::Verbosity in PointerAuthCFIAnalyzer
DeltaFile
+17-10bolt/lib/Passes/PointerAuthCFIAnalyzer.cpp
+1-1bolt/test/AArch64/negate-ra-state-incorrect.s
+18-112 files

LLVM/project 7e17eeabolt/lib/Passes PointerAuthCFIAnalyzer.cpp, bolt/test/runtime/AArch64 pacret-synchronous-unwind.cpp

[BOLT][PAC] Warn about synchronous unwind tables

BOLT currently ignores functions with synchronous PAuth DWARF info.
When more than 10% of functions get ignored for inconsistencies, we
should emit a warning to only use asynchronous unwind tables.

See also: #165215
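
The 10% threshold described above can be sketched as a small predicate. This is only an illustration: the function name and the use of integer arithmetic are assumptions, not BOLT's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of BOLT): returns true when more than 10%
// of the analyzed functions were ignored, i.e. when a warning is warranted.
// Integer arithmetic avoids floating-point rounding at the boundary.
bool shouldWarnAboutSyncUnwind(std::size_t Ignored, std::size_t Total) {
  assert(Ignored <= Total && "cannot ignore more functions than exist");
  if (Total == 0)
    return false;
  return Ignored * 10 > Total;
}
```

With this sketch, exactly 10 ignored functions out of 100 do not trigger the warning, while 11 out of 100 do.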
DeltaFile
+33-0bolt/test/runtime/AArch64/pacret-synchronous-unwind.cpp
+8-1bolt/lib/Passes/PointerAuthCFIAnalyzer.cpp
+41-12 files

LLVM/project 451be48bolt/docs PacRetDesign.md

Update PacRetDesign.md
DeltaFile
+1-1bolt/docs/PacRetDesign.md
+1-11 files

LLVM/project 12d91b0bolt/docs PacRetDesign.md

Update bolt/docs/PacRetDesign.md

Co-authored-by: Paschalis Mpeis <paschalis.mpeis at arm.com>
DeltaFile
+1-1bolt/docs/PacRetDesign.md
+1-11 files

LLVM/project 85a832fbolt/lib/Passes InsertNegateRAStatePass.cpp PointerAuthCFIFixup.cpp, bolt/unittests/Passes InsertNegateRAState.cpp PointerAuthCFIFixup.cpp

[BOLT][NFC] Rename Pointer Auth DWARF rewriter passes

Original names were "working titles". After initial patches are merged,
I'd like to rename these passes to names that reflect their intent
better and show their relationship to each other:

InsertNegateRAStatePass renamed to PointerAuthCFIFixup,
MarkRAStates renamed to PointerAuthCFIAnalyzer.
DeltaFile
+0-350bolt/lib/Passes/InsertNegateRAStatePass.cpp
+350-0bolt/lib/Passes/PointerAuthCFIFixup.cpp
+0-288bolt/unittests/Passes/InsertNegateRAState.cpp
+288-0bolt/unittests/Passes/PointerAuthCFIFixup.cpp
+145-0bolt/lib/Passes/PointerAuthCFIAnalyzer.cpp
+0-145bolt/lib/Passes/MarkRAStates.cpp
+783-78313 files not shown
+929-92819 files

LLVM/project 56251e7bolt/unittests/Passes InsertNegateRAState.cpp

[BOLT] Test fillUnknownStubs
DeltaFile
+61-0bolt/unittests/Passes/InsertNegateRAState.cpp
+61-01 files

LLVM/project 99fa15ebolt/unittests/Passes InsertNegateRAState.cpp

[BOLT] Test fillUnknownRAStateInBB
DeltaFile
+63-0bolt/unittests/Passes/InsertNegateRAState.cpp
+63-01 files

LLVM/project da7f5d1bolt/docs PacRetDesign.md, bolt/include/bolt/Passes InsertNegateRAStatePass.h

[BOLT] Improve InsertNegateRAStatePass::inferUnknownStates

The previous implementation used a simple heuristic. This can be improved in
several ways:
- If a BasicBlock has instructions with both known and unknown RAState,
  use the known states to work out the unknown ones.
- If a BasicBlock only consists of instructions with unknown RAState,
  use the last known RAState from its predecessors, or the first known
  from its successors to set the RAStates in the BasicBlock. This includes
  error checking: all predecessors/successors should have the same RAState.
- Some BasicBlocks may only contain instructions with unknown RAState,
  and have no CFG neighbors. These already have incorrect unwind info.
  For these, we copy the last known RAState based on the layout order.

Updated bolt/docs/PacRetDesign.md to reflect changes.
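
The first rule in the list above (using known states inside a block to fill in the unknown ones) might look roughly like the sketch below. It is a simplified model under stated assumptions: `RAState` and `fillUnknownStatesInBlock` are hypothetical names, states are modeled as `std::optional<bool>`, and instructions that themselves change the RA state are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// true = RA signed, false = RA unsigned, std::nullopt = unknown.
using RAState = std::optional<bool>;

// Fills the unknown entries of one block from the known ones.
// Returns false if the block has no known state at all.
bool fillUnknownStatesInBlock(std::vector<RAState> &States) {
  std::size_t FirstKnown = 0;
  while (FirstKnown < States.size() && !States[FirstKnown])
    ++FirstKnown;
  if (FirstKnown == States.size())
    return false; // Needs the predecessor/successor or layout-order rules.

  // Leading unknowns take the first known state.
  for (std::size_t I = 0; I < FirstKnown; ++I)
    States[I] = States[FirstKnown];

  // Remaining unknowns take the most recent known state.
  RAState Last = States[FirstKnown];
  for (std::size_t I = FirstKnown + 1; I < States.size(); ++I) {
    if (States[I])
      Last = States[I];
    else
      States[I] = Last;
  }
  assert(States.back().has_value() && "all states should now be known");
  return true;
}
```

A block for which this returns false corresponds to the second and third rules, where the state has to come from CFG neighbors or from the layout order.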
DeltaFile
+203-20bolt/lib/Passes/InsertNegateRAStatePass.cpp
+32-2bolt/include/bolt/Passes/InsertNegateRAStatePass.h
+18-5bolt/docs/PacRetDesign.md
+253-273 files

LLVM/project aa2d90bbolt/unittests CMakeLists.txt, bolt/unittests/Passes InsertNegateRAState.cpp CMakeLists.txt

[BOLT] Single-pass unittest for InsertNegateRAState

This commit creates a new directory: bolt/unittests/Passes, to be used
by unittests that need to register and run passes with the
BinaryFunctionPassManager.

An example test is created for the InsertNegateRAState pass. Actual tests
will be added in follow-up commits.
DeltaFile
+164-0bolt/unittests/Passes/InsertNegateRAState.cpp
+30-0bolt/unittests/Passes/CMakeLists.txt
+1-0bolt/unittests/CMakeLists.txt
+195-03 files

LLVM/project 51815b1clang/lib/AST ASTImporter.cpp, clang/unittests/AST ASTImporterTest.cpp

[Clang][ASTImporter] Implement AST import for CXXParenListInitExpr, SubstNonTypeTemplateParmPackExpr, PseudoObjectExpr (#160904)

Add new visit functions to ASTImporter for CXXParenListInitExpr,
SubstNonTypeTemplateParmPackExpr and PseudoObjectExpr.
During CTU analysis there are a lot of "cannot import unsupported AST node"
diagnostics for CXXParenListInitExpr, SubstNonTypeTemplateParmPackExpr and
PseudoObjectExpr. The problem appeared after full support for Concepts was
added to the importer.
DeltaFile
+66-0clang/unittests/AST/ASTImporterTest.cpp
+48-0clang/lib/AST/ASTImporter.cpp
+114-02 files

LLVM/project cd68056bolt/include/bolt/Core MCPlusBuilder.h, bolt/lib/Core MCPlusBuilder.cpp

[BOLT] Simplify RAState helpers (NFCI) (#162820)

- unify isRAStateSigned and isRAStateUnsigned to a common getRAState,
- unify setRASigned and setRAUnsigned into setRAState(MCInst, bool),
- update users of these to match the new implementations.
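
A minimal sketch of the unified helpers, assuming a tri-state result modeled as `std::optional<bool>`. `ToyInst` is a stand-in for `MCInst` annotations, and the signatures here (including the hypothetical `knownRAState` convenience function) are illustrative rather than the real MCPlusBuilder API.

```cpp
#include <cassert>
#include <optional>

// Stand-in for an annotated instruction; not the real llvm::MCInst.
struct ToyInst {
  std::optional<bool> RASigned; // std::nullopt = no RAState annotation yet.
};

// One setter takes the place of setRASigned / setRAUnsigned.
void setRAState(ToyInst &Inst, bool Signed) { Inst.RASigned = Signed; }

// One getter takes the place of isRAStateSigned / isRAStateUnsigned.
std::optional<bool> getRAState(const ToyInst &Inst) { return Inst.RASigned; }

// Hypothetical convenience for callers that require a known state.
bool knownRAState(const ToyInst &Inst) {
  std::optional<bool> State = getRAState(Inst);
  assert(State && "RAState must be set before it is queried");
  return *State;
}
```

Returning an optional lets a single query distinguish "signed", "unsigned", and "not annotated", which two separate boolean predicates can only express in combination.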
DeltaFile
+39-14bolt/lib/Passes/InsertNegateRAStatePass.cpp
+11-16bolt/lib/Core/MCPlusBuilder.cpp
+6-13bolt/include/bolt/Core/MCPlusBuilder.h
+2-10bolt/lib/Passes/MarkRAStates.cpp
+58-534 files

LLVM/project 2d381bfmlir/lib/Bindings/Python IRCore.cpp

[MLIR][Python] add/fix docstrings in IRCore (#167063)

This PR adds all the missing doc strings in IRCore.cpp. It also

1. Normalizes all doc strings to have proper punctuation;
2. Inlines non-duplicated docstrings which are currently at the top of
the source file (and thereby possibly out of sync).

Follow-up PRs will do the same for the rest of the modules/source files.

---------

Co-authored-by: Copilot <175728472+Copilot at users.noreply.github.com>
DeltaFile
+1,116-696mlir/lib/Bindings/Python/IRCore.cpp
+1,116-6961 files

LLVM/project 4fd7521bolt/include/bolt/Core MCPlusBuilder.h, bolt/lib/Target/AArch64 AArch64MCPlusBuilder.cpp

[BOLT][BTI] Add MCPlusBuilder::addBTItoBBStart

This function contains most of the logic for BTI:
- it takes the BasicBlock and the instruction used to jump to it.
- then it checks if the first non-pseudo instruction is a sufficient
landing pad for the used call.
- if not, it generates the correct BTI instruction.

Also introduce the isBTIVariantCoveringCall helper to simplify the logic.
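
The coverage check can be illustrated with a small model of the architectural rule: `BTI c` accepts `BLR` and `BR` through x16/x17, `BTI j` accepts `BR`, and `BTI jc` accepts all of them. The enum and function names below are hypothetical, and `btiToInsert` deliberately ignores what BOLT does with an existing, insufficient BTI.

```cpp
#include <cassert>
#include <optional>

enum class BranchKind { IndirectCall, IndirectJumpViaX16X17, IndirectJumpOther };
enum class BTIVariant { C, J, JC };

// True if a landing pad of variant V accepts a branch of kind K.
bool isVariantCovering(BTIVariant V, BranchKind K) {
  switch (V) {
  case BTIVariant::C:
    return K != BranchKind::IndirectJumpOther;
  case BTIVariant::J:
    return K != BranchKind::IndirectCall;
  case BTIVariant::JC:
    return true;
  }
  assert(false && "unknown BTI variant");
  return false;
}

// Returns the BTI variant to add for a block whose first real instruction is
// First (std::nullopt if it is not a BTI), or std::nullopt if nothing is needed.
std::optional<BTIVariant> btiToInsert(std::optional<BTIVariant> First,
                                      BranchKind K) {
  if (First && isVariantCovering(*First, K))
    return std::nullopt;
  return K == BranchKind::IndirectCall ? BTIVariant::C : BTIVariant::J;
}
```

For example, a block that already starts with `BTI jc` needs no change for any branch kind, while a block without a BTI that is reached by an indirect call gets `BTI c`.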
DeltaFile
+105-0bolt/unittests/Core/MCPlusBuilder.cpp
+75-0bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+13-0bolt/include/bolt/Core/MCPlusBuilder.h
+193-03 files

LLVM/project 2681497llvm/lib/Target/SPIRV SPIRVModuleAnalysis.cpp, llvm/test/CodeGen/SPIRV spirv_param_decorations_quals.ll

[SPIRV] Allow multiple FuncParamAttr decoration on the same id. (#166782)

According to SPIR-V spec:

> It is invalid to decorate any given id or structure member more than
one time with the same
[decoration](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Decoration),
unless explicitly allowed below for a specific decoration.

`FuncParamAttr` explicitly allows multiple uses of the decoration on the
same id, so this patch honors it.
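
The rule quoted above amounts to a duplicate check with an exemption. The sketch below is a toy model, not the SPIRVModuleAnalysis code: `DecorationTracker` and `allowsRepeats` are hypothetical names, and only three decorations are modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

enum class Decoration { Restrict, Aliased, FuncParamAttr };

// Decorations that may appear more than once on the same id.
// Only FuncParamAttr is modeled here.
bool allowsRepeats(Decoration D) { return D == Decoration::FuncParamAttr; }

class DecorationTracker {
  std::set<std::pair<std::uint32_t, Decoration>> Seen;

public:
  // Returns true if the decoration may be emitted for Id, false if it would
  // be an invalid repeat of a decoration already recorded for that id.
  bool record(std::uint32_t Id, Decoration D) {
    assert(Id != 0 && "SPIR-V ids are non-zero");
    if (allowsRepeats(D))
      return true;
    return Seen.emplace(Id, D).second;
  }
};
```

In this model a second `Restrict` on the same id is rejected, while any number of `FuncParamAttr` decorations on that id are accepted.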
DeltaFile
+9-8llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.cpp
+3-1llvm/test/CodeGen/SPIRV/spirv_param_decorations_quals.ll
+12-92 files

LLVM/project 99a5e7bmlir/include/mlir/Tools/mlir-opt MlirOptMain.h, mlir/lib/Tools/mlir-opt MlirOptMain.cpp

[mlir][NFC] Split registerAndParseCLIOptions() in mlir-opt (#166538)

mlir-opt's registerAndParseCLIOptions() forces users to both register
default MLIR options and parse the command line string. Custom mlir-opt
implementations, however, may need to provide their own options or their
own parsing. Separating the two functions should make the necessary
customizations easier to achieve.

For example, one can register "default" options, then register custom
options (not available in standard mlir-opt), then parse all of them.
Other cases include two-stage parsing where some additional options
become available based on parsed information (e.g. compilation target
can allow additional options to be present).
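
The register-then-parse flow described above can be sketched with a toy option registry. Nothing here is the MLIR API: `ToyOptionRegistry`, `registerDefaultOptions`, `registerCustomOptions`, and the option names are all hypothetical and only illustrate why splitting registration from parsing helps.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class ToyOptionRegistry {
  std::map<std::string, std::string> Values; // Option name -> current value.

public:
  void registerOption(const std::string &Name, const std::string &Default) {
    Values[Name] = Default;
  }

  // Parses "--name=value" arguments. Returns false on a malformed argument
  // or on an option that was never registered.
  bool parse(const std::vector<std::string> &Args) {
    for (const std::string &Arg : Args) {
      std::size_t Eq = Arg.find('=');
      if (Arg.rfind("--", 0) != 0 || Eq == std::string::npos)
        return false;
      auto It = Values.find(Arg.substr(2, Eq - 2));
      if (It == Values.end())
        return false;
      It->second = Arg.substr(Eq + 1);
    }
    return true;
  }

  const std::string &get(const std::string &Name) const {
    auto It = Values.find(Name);
    assert(It != Values.end() && "option was never registered");
    return It->second;
  }
};

// Step 1: options every tool gets. Step 2: options of a custom tool.
void registerDefaultOptions(ToyOptionRegistry &R) {
  R.registerOption("verify", "true");
}
void registerCustomOptions(ToyOptionRegistry &R) {
  R.registerOption("my-target", "none");
}
```

A custom tool would call `registerDefaultOptions`, then `registerCustomOptions`, and only then `parse`. With a combined register-and-parse step, `--my-target=...` would be rejected as unknown.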
DeltaFile
+21-12mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
+14-0mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
+35-122 files

LLVM/project 46b0e9cllvm/lib/Transforms/Vectorize VPlanRecipes.cpp

Remove uint64_t MulConst
DeltaFile
+1-2llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+1-21 files

LLVM/project 92342e0llvm/include/llvm/Analysis TargetTransformInfo.h, llvm/lib/Transforms/Vectorize VPlan.h LoopVectorize.cpp

[VPlan] Implement compressed widening of memory instructions
DeltaFile
+20-12llvm/lib/Transforms/Vectorize/VPlan.h
+17-7llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+17-6llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+6-5llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+1-0llvm/include/llvm/Analysis/TargetTransformInfo.h
+61-305 files

LLVM/project c17a839clang/lib/CodeGen CGOpenMPRuntime.cpp, clang/test/OpenMP spirv_target_codegen_basic.cpp force-usm.c

[OMPIRBuilder] Fix addrspace of internal critical section lock (#166459)

First, internal variables are always global, so use the global AS by
default unless specified otherwise. We can't really use `0` as a default
like we do now because that has an actual meaning on some targets, so we
need to distinguish specified from unspecified. I used `std::optional`,
which is already used in many places in OMPIRBuilder.

Second, for the critical lock variable, add an addrspace cast if needed.
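
A minimal sketch of the "specified vs unspecified" idea, assuming a toy target description. `ToyTarget`, `resolveInternalVarAddrSpace`, and `needsAddrSpaceCast` are hypothetical names, not OMPIRBuilder functions.

```cpp
#include <cassert>
#include <optional>

// Hypothetical target description; only the global address space is modeled.
struct ToyTarget {
  unsigned GlobalAddrSpace;
};

// Address space 0 is meaningful on some targets, so "unspecified" is
// represented by std::nullopt rather than by 0.
unsigned resolveInternalVarAddrSpace(
    const ToyTarget &Target,
    std::optional<unsigned> Requested = std::nullopt) {
  return Requested.value_or(Target.GlobalAddrSpace);
}

// A cast is only needed when the variable's address space differs from the
// one its user expects.
bool needsAddrSpaceCast(unsigned VarAddrSpace, unsigned ExpectedAddrSpace) {
  return VarAddrSpace != ExpectedAddrSpace;
}

void exampleUsage() {
  ToyTarget Target{1};
  assert(resolveInternalVarAddrSpace(Target) == 1);     // Unspecified: global AS.
  assert(resolveInternalVarAddrSpace(Target, 0u) == 0); // Explicit 0 is honored.
  assert(needsAddrSpaceCast(1, 0));
}
```

With a plain `unsigned` parameter defaulting to `0`, the second call could not be told apart from the first.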

Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
DeltaFile
+16-9clang/lib/CodeGen/CGOpenMPRuntime.cpp
+7-6llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+6-0clang/test/OpenMP/spirv_target_codegen_basic.cpp
+1-1clang/test/OpenMP/force-usm.c
+1-1llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+31-175 files

LLVM/project 69c8756llvm/test/CodeGen/PowerPC milicode64.ll milicode32.ll

[NFC][PowerPC] Pre-commit adding test case: use millicode for memmove (#166961)

Add a test case to check that library calls are used for the memmove millicode.
DeltaFile
+79-0llvm/test/CodeGen/PowerPC/milicode64.ll
+56-0llvm/test/CodeGen/PowerPC/milicode32.ll
+135-02 files

LLVM/project 258e4cfllvm/lib/Target/AArch64 AArch64InstrInfo.cpp, llvm/test/CodeGen/AArch64 licm-regclass-copy.mir

[AArch64] Consider COPY between disjoint register classes as expensive

The motivation is to allow passes such as MachineLICM to hoist trivial
FMOV instructions out of loops, where previously it didn't do so even
when the RHS is a constant.
On most architectures, these expensive move instructions have a latency
of 2-6 cycles, and are certainly not as cheap as a 0-1 cycle move.
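
The reasoning above can be illustrated with a greatly simplified model, assuming that a LICM-style pass leaves "cheap" instructions in the loop and only considers the others for hoisting. The names are hypothetical, register classes are reduced to two banks, and the real profitability logic in MachineLICM is more involved.

```cpp
#include <cassert>

enum class RegBank { GPR, FPR };

// Toy COPY: only the register banks of destination and source are modeled.
struct ToyCopy {
  RegBank Dst;
  RegBank Src;
};

// A copy within one bank is treated as cheap; a cross-bank copy (for
// example GPR to FPR, which becomes an FMOV) is treated as expensive.
bool isCheapCopy(const ToyCopy &C) { return C.Dst == C.Src; }

// Simplified filter: only loop-invariant, non-cheap copies are candidates.
bool isHoistCandidate(const ToyCopy &C, bool LoopInvariant) {
  return LoopInvariant && !isCheapCopy(C);
}

void exampleUsage() {
  ToyCopy CrossBank{RegBank::FPR, RegBank::GPR};
  ToyCopy SameBank{RegBank::GPR, RegBank::GPR};
  assert(isHoistCandidate(CrossBank, /*LoopInvariant=*/true));
  assert(!isHoistCandidate(SameBank, /*LoopInvariant=*/true));
}
```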
DeltaFile
+55-0llvm/test/CodeGen/AArch64/licm-regclass-copy.mir
+24-0llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+79-02 files

LLVM/project 94a7006mlir/include/mlir/Dialect/XeGPU/TransformOps XeGPUTransformOps.td, mlir/lib/Dialect/XeGPU/TransformOps XeGPUTransformOps.cpp

[MLIR][XeGPU][TransformOps] Add set_op_layout_attr op (#166854)

Adds `transform.xegpu.set_op_layout_attr` transform op that attaches
`xegpu.layout` attribute to the target op.
DeltaFile
+134-0mlir/test/Dialect/XeGPU/transform-ops.mlir
+102-19mlir/lib/Dialect/XeGPU/TransformOps/XeGPUTransformOps.cpp
+65-0mlir/include/mlir/Dialect/XeGPU/TransformOps/XeGPUTransformOps.td
+58-0mlir/test/Dialect/XeGPU/transform-ops-invalid.mlir
+49-0mlir/test/python/dialects/transform_xegpu_ext.py
+47-0mlir/python/mlir/dialects/transform/xegpu.py
+455-196 files

LLVM/project 71cdd40clang/lib/Headers avx512vlbwintrin.h avx512bwintrin.h, clang/test/CodeGen/X86 avx512vlbw-builtins.c avx512bw-builtins.c

Allow avx512 bw masked intrinsics to be used in constexpr (#162871)

Added CONSTEXPR macro and test for the following intrinsics:

-- _mm_mask_adds_epi16 _mm_maskz_adds_epi16
-- _mm_mask_adds_epi8 _mm_maskz_adds_epi8
-- _mm_mask_adds_epu16 _mm_maskz_adds_epu16
-- _mm_mask_adds_epu8 _mm_maskz_adds_epu8
-- _mm_mask_broadcastb_epi8 _mm_maskz_broadcastb_epi8
-- _mm_mask_broadcastw_epi16 _mm_maskz_broadcastw_epi16
-- _mm_mask_cvtepi8_epi16 _mm_maskz_cvtepi8_epi16
-- _mm_mask_cvtepu8_epi16 _mm_maskz_cvtepu8_epi16
-- _mm_mask_packs_epi16 _mm_maskz_packs_epi16
-- _mm_mask_packs_epi32 _mm_maskz_packs_epi32
-- _mm_mask_packus_epi16 _mm_maskz_packus_epi16
-- _mm_mask_packus_epi32 _mm_maskz_packus_epi32
-- _mm_mask_set1_epi16 _mm_maskz_set1_epi16
-- _mm_mask_set1_epi8 _mm_maskz_set1_epi8
-- _mm_mask_slli_epi16 _mm_maskz_slli_epi16

    [56 lines not shown]
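
To illustrate what constexpr evaluation of a masked intrinsic means, here is a scalar model of the `_mm_mask_adds_epi16` semantics (saturating 16-bit add, with unselected lanes taken from the source operand). It is not the header implementation: `Vec8x16`, `saturatingAdd`, and `maskAdds` are hypothetical stand-ins that need no AVX-512 hardware.

```cpp
#include <cstdint>

// Eight signed 16-bit lanes, standing in for a 128-bit vector.
struct Vec8x16 {
  std::int16_t Lane[8];
};

constexpr std::int16_t saturatingAdd(std::int16_t A, std::int16_t B) {
  int Sum = int(A) + int(B);
  if (Sum > INT16_MAX)
    return INT16_MAX;
  if (Sum < INT16_MIN)
    return INT16_MIN;
  return static_cast<std::int16_t>(Sum);
}

// Lane I is the saturated sum when bit I of Mask is set, else Src's lane.
constexpr Vec8x16 maskAdds(Vec8x16 Src, std::uint8_t Mask, Vec8x16 A,
                           Vec8x16 B) {
  Vec8x16 R = Src;
  for (int I = 0; I < 8; ++I)
    if (Mask & (1u << I))
      R.Lane[I] = saturatingAdd(A.Lane[I], B.Lane[I]);
  return R;
}

constexpr Vec8x16 Zero{};
constexpr Vec8x16 Ones{{1, 1, 1, 1, 1, 1, 1, 1}};
constexpr Vec8x16 Big{{32767, 32767, 32767, 32767, 32767, 32767, 32767, 32767}};

// Evaluated entirely at compile time.
static_assert(maskAdds(Zero, 0x01, Big, Ones).Lane[0] == 32767,
              "selected lane saturates");
static_assert(maskAdds(Zero, 0x01, Big, Ones).Lane[1] == 0,
              "unselected lane keeps the source value");
```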
DeltaFile
+91-91clang/lib/Headers/avx512vlbwintrin.h
+135-34clang/test/CodeGen/X86/avx512vlbw-builtins.c
+54-64clang/lib/Headers/avx512bwintrin.h
+76-4clang/test/CodeGen/X86/avx512bw-builtins.c
+356-1934 files

LLVM/project 1553f90mlir/include/mlir/Dialect/XeGPU/TransformOps XeGPUTransformOps.td, mlir/lib/Dialect/XeGPU/TransformOps XeGPUTransformOps.cpp

[MLIR][XeGPU][TransformOps] Add get_desc_op (#166801)

Add `transform.xegpu.get_desc_op` transform op that finds a
`xegpu.create_nd_tdesc` producer op of a `Value`.
DeltaFile
+65-0mlir/lib/Dialect/XeGPU/TransformOps/XeGPUTransformOps.cpp
+62-0mlir/test/Dialect/XeGPU/transform-ops.mlir
+23-5mlir/include/mlir/Dialect/XeGPU/TransformOps/XeGPUTransformOps.td
+21-0mlir/python/mlir/dialects/transform/xegpu.py
+16-1mlir/test/python/dialects/transform_xegpu_ext.py
+187-65 files

LLVM/project 263577fllvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

[AArch64][llvm] GICv5 instruction `GIC CDEOI` takes no operand

There was a minor oversight in commit 6836261ee; the AArch64 GICv5
instruction `GIC CDEOI` takes no operands, since the text of the
specification says:
```
The Rt field should be set to 0b11111. If the Rt field is not
set to 0b11111, it is CONSTRAINED UNPREDICTABLE whether:
* The instruction is UNDEFINED.
* The instruction behaves as if the Rt field is set to 0b11111.
```
DeltaFile
+4-4llvm/test/MC/AArch64/armv9.7a-gcie.s
+4-4llvm/lib/Target/AArch64/AArch64SystemOperands.td
+4-0llvm/test/MC/AArch64/armv9.7a-gcie-diagnostics.s
+1-1llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+1-1llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+14-105 files

LLVM/project a5d4ba7clang/include/clang/Basic AttrDocs.td Attr.td

[clang][doc] Document and prefer __asm as mangled name attribute (#167226)

It's supported together with the other spellings and results in the same attribute.
Document it and prefer it in the documentation as the `asm()` spelling is C++ and GNU-only.

See: https://github.com/llvm/llvm-project/pull/167221#issuecomment-3508464545
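
A short illustration of the `__asm` spelling on a declaration, assuming a GCC- or Clang-compatible compiler. The function and symbol names are made up for the example.

```cpp
#include <cassert>

// The declaration gives the function an explicit symbol name using the
// __asm spelling; the definition that follows is emitted under that name.
int getAnswer() __asm("renamed_get_answer");

int getAnswer() { return 42; }

void exampleUsage() {
  // Callers keep using the source-level name.
  assert(getAnswer() == 42);
}
```

Only the emitted symbol name changes; the source-level name used by callers stays the same.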
DeltaFile
+4-4clang/include/clang/Basic/AttrDocs.td
+1-1clang/include/clang/Basic/Attr.td
+5-52 files

LLVM/project 5b20453llvm/lib/CodeGen CodeGenPrepare.cpp, llvm/test/CodeGen/RISCV overflow-intrinsics.ll

[CodeGenPrepare] sinkCmpExpression - don't sink larger than legal integer comparisons (#166778)

A generic alternative to #166564: assume that expanding integer
comparisons will be expensive if they are larger than the largest legal
type, so avoid sinking them if they are also used in the current BB + any phis.

Fixes #166534
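
The heuristic can be sketched as a single predicate. The name and parameters below are assumptions for illustration. The actual check in CodeGenPrepare works on IR types and uses, not on plain integers.

```cpp
#include <cassert>

// Hypothetical predicate: a compare is not sunk into its user blocks when it
// is wider than the largest legal integer type and is still used in the
// block that defines it (or by phis), because sinking would then duplicate
// an expensive, expanded comparison.
bool shouldSinkCmp(unsigned CmpBits, unsigned LargestLegalIntBits,
                   bool UsedInDefiningBlock) {
  assert(CmpBits > 0 && LargestLegalIntBits > 0 && "bit widths are positive");
  bool WiderThanLegal = CmpBits > LargestLegalIntBits;
  return !(WiderThanLegal && UsedInDefiningBlock);
}
```

For instance, a 128-bit compare on a target whose largest legal integer is 64 bits stays in place if the defining block also uses it, and remains a sinking candidate otherwise.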
DeltaFile
+16-52llvm/test/CodeGen/X86/pr166534.ll
+22-26llvm/test/CodeGen/RISCV/overflow-intrinsics.ll
+15-2llvm/lib/CodeGen/CodeGenPrepare.cpp
+53-803 files

LLVM/project 60a2e2bclang/include/clang/Driver Options.td, clang/include/clang/Lex LiteralConverter.h

address comments
DeltaFile
+30-23clang/lib/Lex/LiteralConverter.cpp
+31-13clang/lib/Lex/LiteralSupport.cpp
+15-11clang/lib/Driver/ToolChains/Clang.cpp
+11-6clang/include/clang/Lex/LiteralConverter.h
+9-7clang/test/Driver/clang_f_opts.c
+9-4clang/include/clang/Driver/Options.td
+105-6411 files not shown
+130-7617 files

LLVM/project 6988ba2clang/lib/Frontend FrontendAction.cpp, clang/lib/Lex PPDirectives.cpp

do not translate line/digit directives, do not translate filename
DeltaFile
+4-2clang/lib/Lex/PPDirectives.cpp
+3-1clang/lib/Frontend/FrontendAction.cpp
+7-32 files

LLVM/project f39e2a0llvm/lib/Transforms/Vectorize LoopVectorize.cpp

[LoopVectorize][NFC] Refactor widening decision logic
DeltaFile
+24-28llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+24-281 files