mathstodon.xyz is one of the many independent Mastodon servers you can use to participate in the fediverse.
A Mastodon instance for maths people. We have LaTeX rendering in the web interface!

#datahoarding

Apparently this morning's project is wget’ing a bunch of retired web pages from an old WNYC show (Studio 360), pulling out the links to mp3s, and downloading them. All because the podcast feed only goes back three years instead of covering all 20+ years.

Why? Because there are interesting conversations.

Looks like The Fishko Files are after that.
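
For anyone attempting something similar, the link-extraction step could look roughly like the sketch below. It assumes the pages are already saved locally (for example by wget) and that the mp3s appear as plain `href` or `src` attributes; the class name, function name, and URLs are hypothetical, not taken from the post.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class Mp3LinkParser(HTMLParser):
    """Collects URLs ending in .mp3 from href/src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            # Ignore any query string when checking the extension.
            if value.lower().split("?")[0].endswith(".mp3"):
                url = urljoin(self.base_url, value)  # resolve relative links
                if url not in self.links:
                    self.links.append(url)


def extract_mp3_links(html, base_url):
    """Return the unique mp3 URLs found in html, in document order."""
    parser = Mp3LinkParser(base_url)
    parser.feed(html)
    return parser.links
```

The resulting list can be written to a text file and handed to wget with its `-i` option.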

:blobcatnomdisk: Tidying up my file archives took somewhat less time thanks to backdown, a utility that searches the locations you give it for duplicates: github.com/Canop/backdown

If you have ended up with loosely organized dumps of files from several systems, piled onto another drive without sorting, this is an excellent prefilter before you deliberately sort whatever remains.

It takes a long time, but it's automatic.
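
To illustrate the general idea of duplicate detection, here is a minimal sketch that groups files by size and then by SHA-256 digest. It is an assumption about how such a tool might work, not a description of backdown's actual implementation; the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    # Cheap first pass: only files of equal size can be duplicates.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    # Expensive second pass: hash only the size collisions.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for p in paths:
            by_hash[file_digest(p)].append(p)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

The sketch only reports duplicates; deciding which copy to keep is left to the user.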

does someone have a file server hosted at home (or privately enough that it is YOURS) and wants to archive all Linus Tech Tips Floatplane Exclusives??? (in a way I could still download/access them if I want to)

I really need some storage space :'D and I want to archive this forever if possible

it's 154GB of 1080p30fps videos, all the Floatplane exclusives from when they started being a thing, until 20th February (I really could only afford one month so some recent vids are already missing)

(please boost for maximum reach 💛)

mastodon.ml/@mintbug/114086408
I wrote a script that takes an https link to a post, fetches the attachment links using #toot, converts the https link into a fedi link, and saves the attachments to a directory under names of the form `<fedi-link> <attachment number>.ext` (see illustration). Slashes in the fedi link are replaced with the `∫` sign, because ~~why not~~ I haven't yet gone crazy enough to use slashes in file names, so they have to be replaced with something.

I bound running `fedi-save (wl-paste)` to a hotkey in #sway, and bound opening the link contained in a file name (`fedi-save --open <path>`) to a hotkey in #yazi.
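
The naming scheme can be sketched as follows. This is not the author's script: the function names and the example link are hypothetical, and the exact shape of a "fedi link" is assumed here to be a slash-separated string that contains no `∫` of its own.

```python
from pathlib import PurePosixPath

SLASH_SUBSTITUTE = "∫"  # stands in for "/" inside file names


def attachment_filename(fedi_link, index, ext):
    """Build '<fedi-link> <attachment number>.ext' with slashes replaced."""
    safe = fedi_link.replace("/", SLASH_SUBSTITUTE)
    return f"{safe} {index}.{ext.lstrip('.')}"


def link_from_filename(filename):
    """Recover the fedi link from a name produced by attachment_filename."""
    stem = PurePosixPath(filename).stem           # drop the extension
    link_part, _, _index = stem.rpartition(" ")   # drop the attachment number
    return link_part.replace(SLASH_SUBSTITUTE, "/")


# Hypothetical example link, for illustration only.
name = attachment_filename("example.social/@alice/123", 1, "png")
print(name)
print(link_from_filename(name))
```

Because the substitution is reversible, the file name alone is enough to get back to the original post, which is what an `--open` style command relies on.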

Just discovered ArchiveBox — FOSS, self-hosted internet archiving.

The way the web is going, with the US government redacting and outright erasing historic content, publishers segmenting content by region (and also sometimes redacting/censoring it), and Cloudflare shitting all over everything, I think it's time for me to start my #archiving and #DataHoarding journey.

#SelfHosting #SelfHosted #DataHoarder

github.com/ArchiveBox/ArchiveB

- I've set up a VNET thick jail on my FreeBSD NAS.
- The jail has its own IP address on my LAN.
- I declared a devfs ruleset to unhide /dev/tun* for the VNET jail.
- I installed WireGuard in the jail.
- I enabled WireGuard with a ProtonVPN configuration.
- I installed qbittorrent-nox and configured it to use the WireGuard interface.

I now have a home ISP-proof qBittorrent setup with which to torrent Anna's Archive.

I probably spent half the morning reinventing the wheel, but I couldn't find a nice, tidy dataset containing the #J6 defendants, so I spent some time cleaning and tidying the list on npr.org.

Since I did this, maybe it helps someone else. It's just a spreadsheet with 1500 rows or so. It could be parsed further (especially the parts about the charges and follow-up).

drive.google.com/drive/folders

Data hoarding question from an ally: What things on the internet are in most need of saving?

I have backup tapes and hard drives and the capability of providing more. I want my resources to be available for the greater good.

I’m aware of people backing up YouTube channels that are at risk as well as archive.org and government data.

What else is at risk?

I’m also open to doing tape backups if someone wants to collab. I have an LTO-4 drive.

Our PV Spec:

24.14kWDC tied to twin 10k SolarEdge inverters
4.8kWDC is DC coupled to the XW systems
91x Solar Modules, average 330W each
eGauge Circuit-Level Monitoring System
3x Schneider XW6848 Inverters - AC coupled together for power sharing

1x 20kWh DIY LFP battery
1x 5kWh Fortress LFP battery
1x 40kWh AGM Unigy II Battery
1x 60kWh battery in EV with 12V-48V DC converter for charging house batteries

Total 28.94kWDC PV Capacity
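
The totals above can be checked with a few lines of arithmetic. The variable names are made up for this sketch; the figures are the ones listed in the spec, and the per-module wattage is a stated average, so the module-based estimate is only approximate and need not match the listed total exactly.

```python
from decimal import Decimal

# PV arrays (kW DC), as listed in the spec
ac_coupled_pv = Decimal("24.14")  # tied to the SolarEdge inverters
dc_coupled_pv = Decimal("4.8")    # DC coupled to the XW systems
total_pv_kw = ac_coupled_pv + dc_coupled_pv
print("Total PV capacity (kW DC):", total_pv_kw)

# Rough cross-check from module count and average module wattage
modules = 91
avg_module_w = 330
estimate_kw = Decimal(modules * avg_module_w) / 1000
print("Estimate from module average (kW DC):", estimate_kw)

# Battery storage (kWh): DIY LFP, Fortress LFP, AGM, EV
batteries_kwh = [20, 5, 40, 60]
print("Total battery capacity (kWh):", sum(batteries_kwh))
```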

Ground mount installed 2017, roof mount 2019. System paid off Oct 2023. The utility pays me!

We bank approximately 5MWh each summer, and draw on it in winter via 1:1 net metering in Maryland.

Additional efficiencies include HP water heater, inverter mini-split, dual-stage central HVAC, and blown insulation.

We use 25MWh annually, which is a ton. But #datahoarding storage drives burn watts!!