mathstodon.xyz is one of the many independent Mastodon servers you can use to participate in the fediverse.
A Mastodon instance for maths people. We have LaTeX rendering in the web interface!

#bzip2

@kaasbaas #wikipedia (English language) is only ~22GB of #bzip2 compressed #xml (uncompressed size is ~86GB).
Is it possible to access it without decompressing the whole file? I guess #random #access to .xml.bz2 should be a solved problem, right?
We routinely use gzip with random access in #bioinformatics, i.e. via #samtools or #tabix.

EDIT:
The Wikipedia xml.bz2 dump does support random access in its multistream version. Does @kiwix or any other wiki reader support it? I couldn't find any info on their website...
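
For anyone wondering why a multistream file allows random access: it is a series of independent bzip2 streams written back to back, so a reader that knows where a stream starts can seek there and decompress only that stream. Here is a minimal sketch using Python's standard `bz2` module. The toy chunks, the in-memory offset list, and the helper names `build_multistream` and `read_stream_at` are made up for illustration; this is not the real Wikipedia index format.

```python
import bz2
import io


def build_multistream(chunks):
    """Compress each chunk as its own bzip2 stream and concatenate them.

    Returns the combined bytes plus the byte offset where each stream starts.
    """
    blob = bytearray()
    offsets = []
    for chunk in chunks:
        offsets.append(len(blob))  # start of this stream in the output
        blob += bz2.compress(chunk)
    return bytes(blob), offsets


def read_stream_at(fileobj, offset, chunk_size=64 * 1024):
    """Seek to `offset` and decompress exactly one bzip2 stream."""
    fileobj.seek(offset)
    decomp = bz2.BZ2Decompressor()
    out = []
    while not decomp.eof:  # eof becomes True at the end of this stream
        data = fileobj.read(chunk_size)
        if not data:
            raise EOFError("truncated bzip2 stream")
        out.append(decomp.decompress(data))
    return b"".join(out)


# Toy data standing in for groups of wiki pages (hypothetical content).
chunks = [
    b"<page>A</page>" * 100,
    b"<page>B</page>" * 100,
    b"<page>C</page>" * 100,
]
blob, offsets = build_multistream(chunks)

# Fetch only the second group without touching the first or third stream.
second = read_stream_at(io.BytesIO(blob), offsets[1])
```

The same file still decompresses as a whole with an ordinary bzip2 reader, so the offsets are an optional extra rather than a different format.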

BZip3

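Saw the BZip3 link on Hacker News: "Bzip3: A spiritual successor to BZip2 (github.com/kspalaiologos)".

Although the name suggests a connection to bzip2, it appears to be the work of different people, though some classic algorithms have been kept, such as the Burrows-Wheeler transform.

Also worth mentioning: bzip2 came out in 1996 (although 1.0 was released around 2000), while BZip3's first release was in 2022, and quite a few interesting algorithms have accumulated in the meantime.

For lossless compression, if you want a relatively good compression ratio, the most commonly used choice today is probably the LZMA family of algorithms (which appeared around 2001), and the tool used is usually X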

blog.gslin.org/archives/2025/0

Gea-Suan Lin's BLOG · BZip3

StarFive has donated four VisionFive-2 RISC-V boards, each with 8GB and a 4-core JH7110 supporting the RV64GC ISA, for the CI running on builder.sourceware.org for Sourceware-hosted projects.

They are running Ubuntu Server 23.10 wiki.ubuntu.com/RISC-V/StarFiv

Various projects are already running on RISC-V: #annobin, #binutils, #bzip2, #debugedit, #dwz, #elfutils, #gnupoke and #libabigail.

builder.sourceware.org/buildbo

Please contact the builder project if you want to help add other Sourceware-hosted projects.

builder.sourceware.org · sourceware buildbot · Sourceware GNU Toolchain buildbot

@ASTAFATHERSATAN Granted, I'm not gonna implement a custom compression and decompression algo in the Linux kernel, as I'm not only unable to do so with my skills but literally doubt I could beat the #XZ numbers without sinking in decades of R&D, and even then that may be a dubious investment of time and resources...

I only used #bzip2 for said backups as it's convenient, fast and didn't require me to set anything up to work.


@chakuari OFC.

In theory, offering more than just one is "trivial" in the sense that for OS/1337 the idea is to have static binaries (and maybe the few necessary configs) as "#packages", so it's just a matter of downloading an archive (#bzip2 because it's available in #toybox), unpacking it and placing the files in the system.

OFC absolute hardcore folks will literally do #eMail just with #curl, #cat, #sed & #awk I guess, but ideally offering a convenient alternative like #neomutt is better.

It's always sad when the dedicated #apt binding cannot handle #apt's own files (Packages, Sources, etc.), but only when they're compressed in a particular way.

With gzip on the way out, not being able to grok bz2 (~ unimportant) and xz (~ vital) isn't so good…

See bugs.debian.org/932491#20 for #xz test cases, and bugs.debian.org/932491#27 for #bzip2 test cases (which seem to confirm the problem isn't somewhere in the built-in #lzma module).

H/T @bremner who spotted this 4+ years ago…

bugs.debian.org · #932491 - python3-apt: segfault reading from lzma stream - Debian Bug report logs
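
As a quick sanity check of the point that the built-in modules themselves cope with these formats, here is a small sketch that round-trips a made-up Packages-style stanza through Python's standard `lzma` and `bz2` modules. It does not use python3-apt and does not reproduce the bug; the sample stanza and the helper name `read_lines` are invented for illustration.

```python
import bz2
import io
import lzma

# Made-up stanza in the style of an apt Packages file (not real data).
SAMPLE = b"Package: hello\nVersion: 2.10-3\nDescription: example stanza\n\n"


def read_lines(opener, compressed):
    """Open compressed bytes with the given stdlib opener and read all lines."""
    with opener(io.BytesIO(compressed), "rb") as fh:
        return fh.readlines()


# Both stdlib modules accept a file object in place of a filename.
xz_lines = read_lines(lzma.open, lzma.compress(SAMPLE))
bz2_lines = read_lines(bz2.open, bz2.compress(SAMPLE))
```

If both reads return the same lines, the standard library handled the data, which is consistent with the problem sitting in the binding rather than in the compression modules.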

#dar 3/ Dar's homepage is dar.linux.free.fr/. As a simple file archiver tool, dar is great. Random access is preserved, you can compress with #gzip #bzip2 #lzo #xz #zstd or #lz4. You can encrypt with #GnuPG or symmetric #AES. You can stream if you want. You can split ("slice") across multiple media, and dar will prompt you for the slice(s) you need and seek you right to them.

That's cool, but we're just getting started.

dar.linux.free.fr · DAR - Disk ARchive