py-tokenizers: Import py-tokenizers-0.22.1 as wip/py-tokenizers
Provides an implementation of today's most used tokenizers, with a
focus on performance and versatility.
Bindings over the Rust implementation. If you are interested in the
high-level design, you can go check it there.
Main features:
* Train new vocabularies and tokenize using 4 pre-made tokenizers
(Bert WordPiece and the 3 most common BPE versions).
* Extremely fast (both training and tokenization), thanks to the Rust
implementation. Takes less than 20 seconds to tokenize a GB of text
on a server's CPU.
* Easy to use, but also extremely versatile.
* Designed for research and production.
* Normalization comes with alignments tracking. It's always possible
to get the part of the original sentence that corresponds to a given
[3 lines not shown]
py-transformers: Import py-transformers-4.57.1 as wip/py-transformers
Transformers acts as the model-definition framework for
state-of-the-art machine learning with text, computer vision, audio,
video, and multimodal models, for both inference and training.
It centralizes the model definition so that this definition is agreed
upon across the ecosystem. transformers is the pivot across
frameworks: if a model definition is supported, it will be compatible
with the majority of training frameworks (Axolotl, Unsloth, DeepSpeed,
FSDP, PyTorch-Lightning, ...), inference engines (vLLM, SGLang, TGI,
...), and adjacent modeling libraries (llama.cpp, mlx, ...) which
leverage the model definition from transformers.
We pledge to help support new state-of-the-art models and democratize
their usage by having their model definition be simple, customizable,
and efficient.
There are over 1M Transformers model checkpoints on the Hugging Face
[4 lines not shown]
py-safetensors: Import py-safetensors-0.7.0 as wip/py-safetensors
Simple new format for storing tensors safely (as opposed to pickle)
that is still fast (zero-copy).
In the Lbrdmatch loop, fix the number of bytes the pointer is advanced
when a match is not found (code was using 18, correct number is 10).
Explainer:
The board type entries consist of 4 shorts + 1 long for a total of 12
bytes. The board ID compare auto-increments the pointer past the 2-byte
ID, meaning that there are 10 bytes left to skip in the case of a
non-match.
As for why the old code was using 18 (0x12) bytes, a comment that says
"Each entry is 20-2 bytes long" provides a clue: if the CPU type, MMU
type, and FPU type entries were longs, each entry would in fact be 18
bytes -- and maybe this was the case in the distant past. However, it
still would be wrong to advance 18 bytes because of the auto-increment
used during the compare.
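The arithmetic above can be sketched as a small Python model of the table
walk. The 4-shorts-plus-1-long layout comes from the explainer; the field
order, the board ID values, and the names ENTRY and find_board are
assumptions made for illustration, not the actual locore code.

```python
import struct

# Assumed layout: board ID, CPU type, MMU type, FPU type as 16-bit shorts,
# followed by one 32-bit long. Big-endian, as on m68k. Total: 12 bytes.
ENTRY = struct.Struct(">HHHHL")


def find_board(table, board_id, skip_after_compare):
    """Return the byte offset of the matching entry, or None.

    Models the loop: the compare reads the 2-byte ID and advances the
    pointer (auto-increment); on a non-match the rest of the entry is
    skipped by adding skip_after_compare.
    """
    ptr = 0
    while ptr + ENTRY.size <= len(table):
        (entry_id,) = struct.unpack_from(">H", table, ptr)
        ptr += 2  # auto-increment performed by the compare
        if entry_id == board_id:
            return ptr - 2
        ptr += skip_after_compare
    return None


# Hypothetical table with three entries; the ID values are placeholders.
table = b"".join(
    ENTRY.pack(bid, 1, 2, 3, 0xDEADBEEF) for bid in (0x147, 0x162, 0x167)
)

print(ENTRY.size)                      # entry size in bytes
print(find_board(table, 0x162, 10))    # corrected skip
print(find_board(table, 0x162, 18))    # old skip
print(find_board(table, 0x147, 18))    # first entry, old skip
```

In this model a skip of 10 reaches every entry, while a skip of 18 still
matches the first entry but lands inside later entries and misses them.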
scw@ is a smart guy and I'm certain this worked at some point, but I'm
not going to go and do the archeology needed to figure out exactly when
this broke. If you happened to have an MVME147 (first entry in the board
[5 lines not shown]
mail/mutt: Update to version 2.2.16
This is a bug-fix release, fixing a resource leak when compiled with
OpenSSL/LibreSSL, which could eventually result in new connections failing.
png: update to 1.6.51.
Version 1.6.51 [November 21, 2025]
Fixed CVE-2025-64505 (moderate severity):
Heap buffer overflow in `png_do_quantize` via malformed palette index.
(Reported by Samsung; analyzed by Fabio Gritti.)
Fixed CVE-2025-64506 (moderate severity):
Heap buffer over-read in `png_write_image_8bit` with 8-bit input and
`convert_to_8bit` enabled.
(Reported by Samsung and <weijinjinnihao at users.noreply.github.com>;
analyzed by Fabio Gritti.)
Fixed CVE-2025-64720 (high severity):
Buffer overflow in `png_image_read_composite` via incorrect palette
premultiplication.
(Reported by Samsung; analyzed by John Bowler.)
Fixed CVE-2025-65018 (high severity):
Heap buffer overflow in `png_combine_row` triggered via
`png_image_finish_read`.
(Reported by <yosiimich at users.noreply.github.com>.)
[8 lines not shown]
Handle PMAP_NOCACHE case in pmap_kenter_pa() in Hibler's pmap_motorola.c.
This is also necessary to replace deprecated physaccess() with
pmap_kenter_pa(9), and was missed in the previous hp300 physaccess()
elimination. Noticed and patched by thorpej@.
Eliminate physaccess() and physunaccess() calls in hp300.
They are marked as 'should go away' and the new 68k pmap won't
provide them. Patch from thorpej@.
Tested on 362 and 382.