John Carlos Baez @johncarlosbaez@mathstodon.xyz

These days reporters are interviewing me again about the Azimuth Climate Data Backup Project - because we're again facing the possibility that a Trump administration could get rid of the US government's climate data.

From 2016 to 2018, our team backed up 30 terabytes of US government databases on climate change and the environment, saving them from the threat of a government run by climate change deniers. 627 people contributed a total of $20,427 to our project on Kickstarter to pay for storage space and a server.

That project is done now, with the data stored in a secret permanent location. But that data is old, and there's plenty more by now.

As before, I'm hoping that the people at NOAA, NASA, etc. have quietly taken their own precautions. They're in a much better position to do it!

I got interviewed for this New York Times article about the current situation:

• Austyn Gaffney, How Trump's return could affect climate and weather data, New York Times, November 14, 2024. https://archive.is/y5Qb9

For what we actually did, read this:

https://math.ucr.edu/home/baez/azimuth_backup_project/

Nov 27, 2024, 01:28 AM

114 boosts · 105 favorites

**smxi** @smxi@fosstodon.org · Nov 27, 2024 *

@johncarlosbaez makes sense to me. Get rid of climate change by getting rid of the data. Kind of like making up data in 80 CE or so then saying that's true. So the inverse logically applies as well.

Isn't this data already shared with responsible global researchers? The US is the only major player plagued by this pseudo-biblical antiscience nonsense globally so I'd think the data should be distributed.

**John Carlos Baez** @johncarlosbaez · Nov 27, 2024

@smxi wrote: "Isn't this data already shared with responsible global researchers?" That's the $64,000 question. I haven't seen any sign that it's true. But they're not dumb, so let's hope they've done it.

**Syulang** @Syulang@aus.social · Nov 27, 2024

@smxi @johncarlosbaez I agree. Having one central, high-security, vault-like backup seems like a shaky proposition. Safety in numbers seems more relevant here. 30 TB sounds like a lot, and it is, but even a modest regional university likely has a few server racks with this sort of space, especially in offline archive storage. Distribute as widely as possible, rather than seeking perfection.

**Ray Lee** @4raylee · Nov 27, 2024

@Syulang I think I have 30TB lying around my house. As storage has gotten much cheaper over the past eight years, distributing copies now seems practical.

@smxi @johncarlosbaez

**Peter** @PeterLG@theblower.au · Nov 27, 2024

@4raylee

We have 28 TB here ... and we live in a motorhome! So no, 30 TB isn't much these days.

@Syulang @smxi @johncarlosbaez

**smxi** @smxi@fosstodon.org · Nov 27, 2024

@PeterLG @4raylee @Syulang @johncarlosbaez

30 TiB is still a lot, because data that isn't backed up doesn't really exist. So you'd also need 2 copies, which boosts it to 60 TiB. Which is a lot. Also, I've never trusted this generation of ultra-high-capacity spinning disks in terms of durability. Too fragile.

Another saying is RAID isn't backup. Though a RAID mirror is to some degree. But easier to back up static blocks of data. Then 2 copies is fine. Per site backing it up. Backup is hard.

**Ray Lee** @4raylee · Nov 27, 2024 *

@smxi You're right that RAID isn't backup, but the good news is that this is a solved problem. The idea would be to have many duplicate fragments covering the dataset, each with its own cryptographic signature to verify integrity (cf. Merkle trees). Then the datasets can be distributed on a chunk-by-chunk basis while still verifying the integrity of each chunk. Like BitTorrent, but for data sets.

@PeterLG @Syulang @johncarlosbaez
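
A minimal sketch of the chunk-verification idea in the post above, not anyone's actual implementation. It assumes SHA-256 hashes stand in for the "cryptographic signature" (a real system would also sign the root), a tiny hypothetical `CHUNK_SIZE`, and a simple Merkle root that duplicates the last hash on odd levels. The names `split_chunks`, `build_manifest`, and `verify_chunk` are illustrative only.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; hypothetical and tiny, real datasets would use MiB-sized chunks


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the dataset into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Fold the per-chunk hashes pairwise into a single root hash."""
    if not leaf_hashes:
        return sha256(b"")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def build_manifest(data: bytes) -> dict:
    """Publish this small manifest; the chunks themselves can come from anyone."""
    hashes = [sha256(chunk) for chunk in split_chunks(data)]
    return {"chunk_hashes": hashes, "root": merkle_root(hashes)}


def verify_chunk(manifest: dict, index: int, chunk: bytes) -> bool:
    """Check one downloaded chunk against the manifest, independent of the others."""
    return sha256(chunk) == manifest["chunk_hashes"][index]


if __name__ == "__main__":
    dataset = b"climate data backup"
    manifest = build_manifest(dataset)
    for i, chunk in enumerate(split_chunks(dataset)):
        print(i, verify_chunk(manifest, i, chunk))
```

The point of the design is that a downloader only needs to trust the single root hash: the list of chunk hashes can be checked against it, and each chunk can then be fetched from any mirror and verified on its own.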

**Peter** @PeterLG@theblower.au · Nov 28, 2024

@smxi

Backup was my biggest headache, after security, back in the day. Working with multiple in-hospital health systems, which are by their nature dynamic (up-to-the-minute dynamic), made keeping data safe in case of a hiccough a constant source of ulcers.

Dog! I'm glad I don't do that anymore.

@4raylee @Syulang @johncarlosbaez

**smxi** @smxi@fosstodon.org · Nov 29, 2024

@PeterLG @4raylee @Syulang @johncarlosbaez I did backup for years for a client who believed their data was as important as the data you backed up actually was. Main mirror. Hourly syncs of main mirror in case mirror failed. 2 external off site stored backups lol. And no their data wasn't that important lol.

**John Carlos Baez** @johncarlosbaez · Nov 29, 2024 *

@smxi - when we paid for commercial servers to back up the US climate data, we were paying for the type of services you delivered. Now we've transferred that data to a location where they'll do that for free, because they are already doing it for lots of other data. We didn't think it was sufficient to buy a 36-terabyte hard drive and stick the data in there, but a lot of people kept suggesting we could just do that.

@PeterLG @4raylee @Syulang

**Syulang** @Syulang@aus.social · Nov 29, 2024

@smxi @PeterLG @4raylee @johncarlosbaez I worked for a private company once (briefly) in managed services, and we had a client with mind-numbingly trivial data, while a university I worked at around the same time had literally no backup for students' work made using the only Mac computer suite. (I mean, kinda figures, but still a bit lame.)

**MossyRua** @Mossyrua@c.im · Nov 27, 2024

@johncarlosbaez You could share it with the European Union’s climate community or any non-US university?

**John Carlos Baez** @johncarlosbaez · Nov 27, 2024

@Mossyrua - we did all the sharing we needed to back in 2018. What the world could use now is someone making new backups. But I'm hoping the people at NOAA, NASA, etc. do this on a regular basis.

**ma𝕏pool** @maxpool · Nov 27, 2024 *

@johncarlosbaez
Scientists Scramble to Save Climate Data from Trump—Again
https://www.scientificamerican.com/article/scientists-scramble-to-save-climate-data-from-trump-again/

Scientific American · Nov 22, 2024 · By Chelsea Harvey

#climate #climateScience #USpol

**John Carlos Baez** @johncarlosbaez · Nov 28, 2024

@maxpool - thanks!!!

**65dBnoise** @65dBnoise@mastodon.social · Nov 27, 2024

@johncarlosbaez
Reminds me of museums in Europe, ca. 1940, burying archaeological treasures to save them from the invading Nazis.

What a shame for the US!

**NickServ** @nickserv@mastodon.social · Nov 28, 2024

@65dBnoise @johncarlosbaez How is it a shame for the US? Your assumed loss equals your assumed derision. Did you already lose access to the data or just decide to bash the US?

**65dBnoise** @65dBnoise@mastodon.social · Nov 28, 2024

@nickserv @johncarlosbaez
It's not a zero-memory system. Trump has given plenty of reason (chloroquine injections, sharpied hurricane paths, rollback of EPA rules, etc.) for scientists to worry that the worst may happen.

It's a shame for the US, which pioneered open public scientific data and has a vibrant scientific community, that American scientists in American universities fear loss or manipulation of such data because of manifested climate denialism and bigotry, like in the dark ages.

**3Jane Tessier Ashpool** @3janeTA@beige.party · Nov 27, 2024

@johncarlosbaez do you have another funding effort I can contribute to and boost?

**John Carlos Baez** @johncarlosbaez · Nov 27, 2024 *

@3janeTA - thanks so much for your offer, but I'm too burnt out to do that project again. It took quite a bit of work for about a year. I'm sorry! If I discover such a project going on, or some similar useful project, I'll post about it.

**3Jane Tessier Ashpool** @3janeTA@beige.party · Nov 27, 2024

@johncarlosbaez no problem! Seemed worthwhile to help if I could. Important work I think

**Blake C. Stacey** @bstacey@icosahedron.website · Nov 28, 2024 *

@johncarlosbaez I'm on our home connection that has been janky these past few days, and I have no special archival tools, but I flipped through the list and downloaded the Transportation Energy Data Book.

**John Carlos Baez** @johncarlosbaez · Nov 28, 2024

@bstacey - good.

Downloading really large datasets turned out to be quite slow, even for people who supposedly had very good internet connections.
