[clang-format][NFC] Don't always rebuild clang-format-check-format (#203828)
Instead, check the format of clang-format source only if the built
clang-format binary or one of the source files is newer.
[llvm] Replace unordered_set<std::string> with StringSet (#204048)
std::unordered_set<std::string> without a pointer-stability requirement
can use StringSet: it avoids per-TU hashtable instantiations and the
std::string temporary at StringRef lookup sites (~3-4 KiB smaller .text
for llc/opt).
[ELF] Support multiple PT_GNU_RELRO when SECTIONS is used without PHDRS (#203675)
When a SECTIONS command interleaves relro and non-relro sections, the
relro region is split into discontiguous runs. Since
https://reviews.llvm.org/D40359, lld has emitted an error in this case:
error: section: <name> is not contiguous with other relro sections
This is overly strict: while glibc only honors the first PT_GNU_RELRO,
other loaders (e.g. Bionic and FreeBSD rtld-elf) protect every
PT_GNU_RELRO segment.
Emit one PT_GNU_RELRO segment for each contiguous run of relro sections.
Track the boundary section so that `createPhdrs` starts a fresh PT_LOAD
at each relro->non-relro transition, as before.
Consumers that don't expect multiple PT_GNU_RELRO should check the
output themselves.
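The grouping described above can be sketched as follows. This is a minimal illustration, not lld's implementation: `Section` and `relroRuns` are hypothetical names, and sections are assumed to be already in output order. Each returned range stands for one PT_GNU_RELRO segment.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an output section; not lld's OutputSection.
struct Section {
  const char *name;
  bool isRelro;
};

// Returns half-open index ranges [begin, end), one per contiguous run of
// relro sections. Each range would map to one PT_GNU_RELRO segment.
std::vector<std::pair<std::size_t, std::size_t>>
relroRuns(const std::vector<Section> &sections) {
  std::vector<std::pair<std::size_t, std::size_t>> runs;
  std::size_t i = 0;
  while (i < sections.size()) {
    if (!sections[i].isRelro) {
      ++i;
      continue;
    }
    std::size_t begin = i;
    // Extend the run until the next relro -> non-relro transition.
    while (i < sections.size() && sections[i].isRelro)
      ++i;
    assert(begin < i && "a run always contains at least one section");
    runs.emplace_back(begin, i);
  }
  return runs;
}
```

With an interleaved layout such as relro, non-relro, relro, relro, non-relro, the sketch yields two ranges, where the old behavior would have reported the contiguity error.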
[llvm] Replace unordered_set<T *> with SmallPtrSet<T *, 0> (#204051)
std::unordered_set is slow. For pointer sets without a pointer-stability
or iterator-stability requirement, use SmallPtrSet<T *, 0> for a smaller
code size.
[TTI] Add missing no-cost coroutine intrinsics (#203816)
These intrinsics are lowered in the CoroCleanup pass and don't represent
actual code. This patch adds them to the no-cost list so they do not
contribute to the cost of inlining and optimization.
[flang][mlir] Add flang to mlir lowering for groupprivate (#180934)
This PR implements the Flang frontend lowering for the OpenMP
`groupprivate` directive.
Changes:
- Update genOMP handler for OpenMPGroupprivate in OpenMP.cpp to generate
`omp.groupprivate` MLIR operation.
- Add clause processing for groupprivate variables
- Add test cases for groupprivate lowering
Co-Authored-By: Claude <noreply at anthropic.com>
[RISCV] Consider known leading zeros in narrowIndex for gather/scatter. (#203970)
If there are enough leading zeros for the shift amount, then
we can do the shift in the narrow type.
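The arithmetic behind this condition can be sketched with plain integers. This is an illustration only: the helper names are hypothetical, a 64-bit wide type and a 32-bit narrow type are assumed, and concrete values stand in for the known-bits analysis the backend would use. The shift is safe in the narrow type when the active bits of the index plus the shift amount fit in the narrow width.

```cpp
#include <cassert>
#include <cstdint>

// Number of leading zero bits in a 64-bit value (64 for zero).
unsigned leadingZeros64(std::uint64_t v) {
  unsigned n = 0;
  for (std::uint64_t mask = 1ULL << 63; mask != 0 && (v & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// True if a 64-bit value with `leadingZeros` known leading zeros can be
// shifted left by `shAmt` without losing bits in a `narrowBits`-wide type.
bool shiftFitsInNarrowType(unsigned leadingZeros, unsigned shAmt,
                           unsigned narrowBits) {
  assert(leadingZeros <= 64 && narrowBits <= 64);
  unsigned activeBits = 64 - leadingZeros;
  return activeBits + shAmt <= narrowBits;
}

// Performs the shift in 32 bits and zero-extends the result to 64 bits.
std::uint64_t shiftIn32Bits(std::uint64_t index, unsigned shAmt) {
  assert(shAmt < 32);
  std::uint32_t narrow = static_cast<std::uint32_t>(index);
  return static_cast<std::uint64_t>(
      static_cast<std::uint32_t>(narrow << shAmt));
}
```

When `shiftFitsInNarrowType` holds, `shiftIn32Bits(index, shAmt)` equals `index << shAmt`; when it does not, the narrow shift drops high bits and the two results differ.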
[AMDGPU] Track VALU instructions separately for WMMA coexecution hazards
WMMA coexecution hazards can only be resolved by VALU instructions, not
S_NOPs. Track VALU/WMMA instructions separately so the scheduler can
accurately determine stall cycles.
[AMDGPU] Set WMMA source-operand reuse bits in SIPreEmitPeephole
gfx1250 WMMA instructions can set matrix_a_reuse / matrix_b_reuse bits
that keep the A or B source operand in a high-temporality state in the
VALU source-operand cache, so a later WMMA reusing the same registers
hits in the cache instead of re-reading the register file.
Add a late, post-RA peephole in the existing pre-emit peephole pass that
scans each basic block and, for every WMMA, sets the A/B reuse bit when
one of the next few WMMAs reuses the same physical registers as its A or B
operand and those registers are not redefined in between.
Stale sticky entries in the cache are cleared when a register is used in
an instruction without a reuse bit being set. Therefore, the final WMMA
use of the same source should not set the bit.
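A simplified model of the scan is sketched below. It is not the SIPreEmitPeephole code: `Inst`, `reusedSoon`, and `setReuseBits` are hypothetical names, each operand is modeled as a single register id instead of a register tuple, the lookahead `window` is a free parameter, and reuse is assumed to mean the same operand position (A with A, B with B).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction model (one register id per operand).
struct Inst {
  bool isWMMA;
  int srcA;    // register holding matrix A, -1 if not a WMMA
  int srcB;    // register holding matrix B, -1 if not a WMMA
  int def;     // register written by the instruction, -1 if none
  bool reuseA;
  bool reuseB;
};

Inst wmma(int a, int b, int def) { return Inst{true, a, b, def, false, false}; }
Inst other(int def) { return Inst{false, -1, -1, def, false, false}; }

// True if one of the next `window` WMMAs after block[i] reads the same
// register in the same operand slot before that register is redefined.
static bool reusedSoon(const std::vector<Inst> &block, std::size_t i,
                       int Inst::*src, unsigned window) {
  const int reg = block[i].*src;
  if (reg < 0 || block[i].def == reg)
    return false;
  unsigned seen = 0;
  for (std::size_t j = i + 1; j < block.size() && seen < window; ++j) {
    const Inst &next = block[j];
    if (next.isWMMA) {
      if (next.*src == reg)
        return true; // the read happens before any write by `next`
      ++seen;
    }
    if (next.def == reg)
      return false; // redefined in between: the cached value is stale
  }
  return false;
}

// Sets the reuse bits for every WMMA in one basic block.
void setReuseBits(std::vector<Inst> &block, unsigned window) {
  assert(window > 0);
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (!block[i].isWMMA)
      continue;
    block[i].reuseA = reusedSoon(block, i, &Inst::srcA, window);
    block[i].reuseB = reusedSoon(block, i, &Inst::srcB, window);
  }
}
```

Because a bit is set only when a later reuse is found, the last WMMA reading a given source leaves its bit clear, which matches the requirement that the final use should not set it.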
Merge tag 'hardening-v7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- lkdtm:
- Add case to provoke a crash in EFI runtime services (Ard Biesheuvel)
- add PPC_RADIX_TLBIEL test and missed isync (Sayali Patil)
- stddef: Document designated initializer semantics for
__TRAILING_OVERLAP() (Gustavo A. R. Silva)
- strarray: drop redundant allocation, add __counted_by_ptr (Thorsten
Blum)
* tag 'hardening-v7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lkdtm/powerpc: add PPC_RADIX_TLBIEL test for radix MCE validation
lkdtm/powerpc: add isync after slbmte to enforce SLB update ordering
lkdtm: Add case to provoke a crash in EFI runtime services
lib/string_helpers: annotate struct strarray with __counted_by_ptr
[3 lines not shown]
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
- Drop the last architecture-specific implementation of MD5
- Mark clmul32() as noinline_for_stack to improve codegen in some cases
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: gf128hash: mark clmul32() as noinline_for_stack
lib/crypto: powerpc/md5: Drop powerpc optimized MD5 code
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
"Accelerate CRC64-NVME for 32-bit ARM by refactoring the arm64 NEON
intrinsics implementation to be shared by 32-bit and 64-bit.
Also apply a similar cleanup to the 32-bit ARM NEON implementation of
xor_gen(), where it now reuses code from the 64-bit implementation"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
crypto: aegis128 - Use neon-intrinsics.h on ARM too
lib/crc: arm: Enable arm64's NEON intrinsics implementation of crc64
lib/crc: Turn NEON intrinsics crc64 implementation into common code
xor/arm64: Use shared NEON intrinsics implementation from 32-bit ARM
xor/arm: Replace vectorized implementation with arm64's intrinsics
ARM: Add a neon-intrinsics.h header like on arm64
Merge tag 'v7.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Drop support for off-CPU cryptography in af_alg
- Document that af_alg is *always* slower
- Document the deprecation of af_alg
- Remove zero-copy support from skcipher and aead in af_alg
- Cap AEAD AD length to 0x80000000 in af_alg
- Free default RNG on module exit
Algorithms:
- Fix vli multiplication carry overflow in ecc
- Drop unused cipher_null crypto_alg
- Remove unused variants of drbg
- Use lib/crypto in drbg
- Use memcpy_from/to_sglist in authencesn
- Allow authenc(hmac(sha{256,384}),cts(cbc(aes))) in FIPS mode
- Disallow RSA PKCS#1 SHA-1 sig algs in FIPS mode
[41 lines not shown]
Merge tag 'slab-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- Support for "allocation tokens" (currently available in Clang 22+)
for smarter partitioning of kmalloc caches based on the allocated
object type, which can be enabled instead of the "random"
per-caller-address-hash partitioning.
It should be able to deterministically separate types containing a
pointer from those that do not (Marco Elver)
- Improvements and simplification of the kmem_cache_alloc_bulk() and
mempool_alloc_bulk() API. This includes adaptation of callers
(Christoph Hellwig)
- Performance improvements and cleanups related mostly to sheaves
refill (Hao Li, Shengming Hu, Vlastimil Babka)
[21 lines not shown]
Merge tag 'docs-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux
Pull documentation updates from Jonathan Corbet:
"Things have calmed down a bit on the docs front, with no earthshaking
changes this time around:
- Ongoing work on the Japanese and Portuguese translations
- Better integration of the MAINTAINERS file into the rendered
documents, including a search interface
- A seemingly infinite supply of fixes for typos, minor grammatical
issues, and related problems that LLMs find with abandon"
* tag 'docs-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux: (93 commits)
docs: pt_BR: Translate 3.Early-stage.rst into Portuguese
docs: pt_BR: update "Purpose of Defconfigs" section in maintainer-soc.rst
Documentation: bug-hunting.rst: fix grammar
docs/ja_JP: translate submitting-patches.rst (interleaved-replies)
[17 lines not shown]
Merge tag 'fbdev-for-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev updates from Helge Deller:
"Besides the removal of the Hercules monochrome ISA graphics driver and
the corresponding text console driver, there is just the typical
maintenance with smaller driver fixes and cleanups:
Removal of drivers:
- Hercules monochrome ISA graphics adapter driver (Ethan Nelson-Moore)
- Hercules mdacon console driver (Ethan Nelson-Moore)
Changes affecting many drivers at once:
- possible memory leak fixes in various drivers (Abdun Nihaal)
- many conversions to use strscpy() (David Laight)
- Use named initializers in drivers (Uwe Kleine-König)
Code fixes:
- fbcon: don't suspend/resume when vc is graphics mode (Lu Yao)
- modedb: fix a possible UAF in fb_find_mode() (Tuo Li)
[44 lines not shown]
[AMDGPU] Update f8f6f4-wmma hazard tests regarding matrix format, NFC (#204037)
The MIR tests need to map the matrix format suffix to the correct
register size. For example, the 'F4' format needs the v8i32 register class.
Merge tag 'mmc-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Validate host's max_segs to fail gracefully
MMC host:
- davinci:
- Avoid potential NULL dereference in the IRQ handler
- Call mmc_add_host() in the correct order during probe
- dw_mmc-exynos:
- Increase DMA threshold for exynos7870
- renesas_sdhi:
- Add support for RZ/G2E, RZ/G2N and R-Car M3Le variants
- sdhci-msm:
- Add support for Hawi, Eliza and Shikra variants
- sdhci-of-k1:
- Add support for SD UHS-I modes
- Add support for tuning for eMMC HS200 and SD UHS-I"
[23 lines not shown]
Merge tag 'hwmon-for-v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon updates from Guenter Roeck:
"New drivers for the following chips:
- Analog Devices LTC4283 Swap Controller
- Analog Devices MAX20830
- Analog Devices MAX20860A
- ARCTIC Fan Controller
- Delta E50SN12051
- Luxshare LX1308
- Microchip EMC1812/13/14/15/33
- Monolithic MP2985
- Murata D1U74T PSU
New chip support added to existing drivers:
- asus-ec-sensors: Support for ROG MAXIMUS Z790 EXTREME, ROG STRIX
B850-E GAMING WIFI, and ROG STRIX B650E-E GAMING WIFI
- dell-smm: Add Dell Latitude 7530 to fan control whitelist
- nct6683: Support for ASRock Z890 Pro-A
[71 lines not shown]