Is the arXiv safe from the current US government attacks on education, research, universities and science?
Perhaps they haven't yet realized that it exists.
1/
Also, could the US government, via Musk, order my GitHub account to be deleted because we have a statement of inclusion in TypeTopology? Or could Microsoft just decide to do this to shield itself from government retaliation?
2/
Also, I foresee that UK and European universities that have relatively recently replaced their own IT systems with Microsoft's may soon deeply regret this decision.
3/
And now it is time for you to think twice about whether you should be investing your social energy in Bluesky.
4/
@MartinEscardo Shhhh...
@MartinEscardo I read somewhere that while a few years ago, they had mirror servers elsewhere, now everything is in the cloud. (Amazon?)
So it seems that there is some danger for arXiv.
@mrdk @MartinEscardo It looks like arXiv permits bulk access/downloading from their S3 buckets (https://info.arxiv.org/help/bulk_data_s3.html).
It looks like this would cost ~$140 USD to download the data (assuming ~6 TB). Seems very doable, especially for departments? Though this would be a one-time snapshot that would have to be updated regularly.
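For anyone budgeting this, here is a minimal sketch of the arithmetic behind the estimate above. It uses only the two figures quoted in the post (~$140 for ~6 TB); the per-GB rate is derived from them, not taken from any price list, and the decimal-terabyte conversion and function names are assumptions made for illustration.

```python
# Figures quoted in the thread; nothing here comes from an official price list.
QUOTED_COST_USD = 140.0
QUOTED_SIZE_TB = 6.0
GB_PER_TB = 1000  # assumption: decimal terabytes


def implied_rate_per_gb(cost_usd: float, size_tb: float) -> float:
    """Per-GB rate implied by a total cost and a total size."""
    return cost_usd / (size_tb * GB_PER_TB)


def estimated_cost(size_tb: float, rate_per_gb: float) -> float:
    """Cost of transferring size_tb at the given per-GB rate."""
    return size_tb * GB_PER_TB * rate_per_gb


rate = implied_rate_per_gb(QUOTED_COST_USD, QUOTED_SIZE_TB)
print(f"implied rate: ${rate:.4f} per GB")
# Hypothetical follow-up: topping up a snapshot with 1 TB of newer material.
print(f"1 TB refresh at that rate: ${estimated_cost(1.0, rate):.2f}")
```

The same two functions can be reused to estimate periodic refreshes once the size of the new material is known.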
@ptcrews @mrdk @MartinEscardo My wife just did this over the last couple of weeks. You can save money by first downloading what is already distributed for free on, say, the Internet Archive. They have a copy of the bulk download, but it is only updated infrequently (I think it's currently a year or so out of date?). Then use the Amazon S3 buckets for the most recent stuff.
I believe our cost was less than $30, and (stripping out figures and PDFs) the results fit on a pair of archival DVDs. Took several nights though.
@ptcrews @mrdk @MartinEscardo She has shared her code for pulling the arXiv dataset, with some sparse documentation. It uses a whitelist of file extensions to keep (deleting the rest automatically); you can tweak that file to get more or less complete downloads.
However, she did warn me the cost was higher than expected; the Internet Archive is further behind than I thought. Closer to the $140 price tag.
For those interested, here is the repo: https://github.com/Phylliida/ExtractArxivText
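To illustrate the whitelist idea described above, here is a minimal sketch of an extension filter. It is not the code from the linked repo: the extension list, the function name `prune_to_whitelist`, and the `dry_run` flag are all hypothetical, chosen only to show the technique of keeping whitelisted file types and deleting everything else.

```python
from pathlib import Path

# Hypothetical whitelist; the real list lives in the linked repo and may differ.
KEEP_EXTENSIONS = frozenset({".tex", ".bib", ".bbl", ".txt"})


def prune_to_whitelist(root, keep=KEEP_EXTENSIONS, dry_run=True):
    """Find files under root whose extension is not in keep.

    Returns the list of non-whitelisted files. They are only deleted
    when dry_run is False, so the default call is safe to try out.
    """
    removed = []
    for path in sorted(Path(root).rglob("*")):
        # Compare case-insensitively so "FIG.PDF" is treated like "fig.pdf".
        if path.is_file() and path.suffix.lower() not in keep:
            removed.append(path)
            if not dry_run:
                path.unlink()
    return removed
```

Running it first with `dry_run=True` shows what would be deleted; widening or narrowing `KEEP_EXTENSIONS` gives a more or less complete archive.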
@MartinEscardo They have the mirrors all set up, but they turned them off when they started using "the cloud". Perhaps it's time to reactivate them. https://info.arxiv.org/help/mirrors.html
@andrejbauer @MartinEscardo well, do we know anyone who ran the EU mirror?
They should be pinged...
@MartinEscardo I'm thinking of moving my GitHub stuff to Codeberg for exactly this reason. Now arXiv, I had not thought about that until now.....
@julesh @MartinEscardo
you might come to regret going full-Apple, too
@MartinEscardo @julesh the recent appearance of Apple Intelligence and Microsoft Copilot, which can be used as keyloggers, data stealers, etc., raises a suspicion that this is how they are preparing for eventual US govt orders...
@dimpase @MartinEscardo In practice I'm more worried right now about running Android on my phone than OSX on my laptop, but yeah. Gonna need to replace both of them at some point
@julesh @dimpase @MartinEscardo I have definitely been paying more attention to @murena , @fdroidorg and other open source phone projects.
@soaproot @julesh @dimpase @MartinEscardo @murena @fdroidorg If you happen to have a recentish Pixel, I've had good results with @GrapheneOS. It is de-googled by default but will also allow you to run Google apps with additional sandboxing should you need to.
@MartinEscardo Less BlueSky, more MayFly (the short-lived insect, I mean)?
It's going to be chaos like we've never seen before. The US crap is built into pretty much any service or product, so the CrowdStrike disaster gives you good data to extrapolate from if they shut things down (even for short times).
Good luck to all of us.
@MartinEscardo no, they are idiots.
They will not learn. Many forgot to fear depending on others.
@MartinEscardo I would hope they have all contracted with Microsoft UK, which is a separate company. When the US came after MS UK for a server (then located in the EU) they put up a very robust response indeed. So hopefully it's not an immediate crisis. We do need to keep a very close eye out though. EU adequacy and other related matters are clearly at risk here.
@MartinEscardo
While I don't think privately owned companies fall directly under DOGE/Musk, you never know.
Please make sure to have good copies of what you have on GitHub, preferably on an EU resource, such as https://sr.ht/ or
@dimpase @MartinEscardo - DOGE was created at the whim of Musk and Trump, so its powers are still being invented, but so far it is trying to exert power over the executive branch of the US federal government, and thus *indirectly* over anyone who gets money from the government or is subject to federal regulations (which are different from laws). Also Trump directly influences funding and regulations. This covers a lot of ground.
Regarding the arXiv, we should look into who funds them. They write
"Together, Cornell University, the Simons Foundation, members, affiliates, sponsors, foundations, and individual donors contribute to arXiv's operating budget."
https://info.arxiv.org/about/funding.html
The federal government could say it will withhold funding to Cornell University unless it drops all connection to entities that support DEI in any way. In practice such an action would be aimed at DEI programs within universities, and I can easily see Trump trying to do this to the University of California. Old arXiv papers would be a fairly low priority, but might be a nearly accidental side-casualty (as is Emily Riehl's grant for working on infinity-categories).
@johncarlosbaez @MartinEscardo I lived through one coup playing out on the streets near my apartment block in Moscow, and I am worried about the inability to enforce injunctions against DOGE which are already in place. Will states start to refuse paying federal taxes? This seems to be the most serious means they have, short of sending the National Guard to DC...
@dimpase @MartinEscardo - It's not states that pay federal taxes, it's individuals in those states (like you and me). The states don't have much to do with the collection of federal taxes.
@johncarlosbaez @MartinEscardo corporations too?
Can one delay/refuse paying federal taxes?
And how is it enforced? Can one ask the employer to not deduct taxes automatically?
I don't see how a party which refuses to obey courts' orders can enforce much without direct violence.
@dimpase @johncarlosbaez @MartinEscardo
You can delay paying federal taxes, but eventually the IRS will start taking it (principal *and* interest) out of paychecks and bank accounts. If it's egregiously willful (consult a lawyer for the legal terminology and conditions) then it can also mean jail time -- this happens all the time.
> I don't see how a party which refuses to obey courts' orders can enforce much without direct violence.
The modern question of what a court can do to a president is different than the question of what a court or the IRS can do to a random individual.
@dimpase wrote:
"corporations too?"
Yes, though the corporate tax rate can be quite low, e.g.:
"AT&T reported that it will pay no federal income taxes in 2021, despite $29.6 billion in earnings. The company reported a tax refund—or an income tax benefit—of $1.2 billion.
Charter Communications, with brands like Spectrum, reported that it will pay no federal income taxes in 2021, despite $6 billion in earnings. The company reported a tax refund of $12 million.
AIG reported that it will pay no federal income tax for 2021, despite $9.8 billion in earnings. The company reported a tax refund of $216 million."
"Can one delay/refuse paying federal taxes?"
For many of us, like most academics, our employers automatically deduct taxes from our salary. But if you are self-employed, you can simply not pay taxes, and then the IRS will either catch you... or they won't. If they catch you, you can go to jail. But the Republicans have been working for years to make it easier for (rich) people to avoid getting caught, by cutting funds to the IRS.
From 2023:
"Wealthy tax cheats owe $65.7 billion
IRS data from 2017 to 2020 reveal that 1.4 million wealthy individuals failed to file federal tax returns. These unfiled returns are tied to an estimated $65.7 billion in unpaid taxes. In his letter, Wyden reminds the IRS that willful non-filing is a federal crime, adding that millionaires “know better,” given the tax professionals and advisors the wealthy have access to."
https://www.kiplinger.com/taxes/lawmakers-want-irs-to-crack-down-on-wealthy-tax-cheats
@johncarlosbaez @MartinEscardo are US foundations subject to federal regulations? Then the central government can start selectively revoking their tax privileges/charity status.
@dimpase @MartinEscardo - more or less all US entities are subject to federal regulations. And yes, the executive branch could suddenly claim to revoke the tax privileges of nonprofit organizations. This might conflict with laws passed by the legislative branch, and people affected might sue, throwing the matter to the judicial branch.
Here is how the Trump administration is already causing trouble for nonprofit organizations:
@MartinEscardo I mean, they can't, but it doesn't mean that they won't, or that GitHub won't preemptively do that to appease the bullies. I'm migrating everything to Codeberg this week to try to avoid such an issue.
@joey @MartinEscardo Migration over HTTPS at least seems to work straightforwardly. (Enjoying some brie as I do this, BTW.)
https://docs.codeberg.org/advanced/migrating-repos/
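For those who prefer to keep an independent copy from the command line, here is a minimal sketch using plain git mirroring. This is a generic git approach, not the Codeberg migration tool described at the link above. It assumes git is installed and that an empty target repository already exists; the URLs and directory name in the usage lines are placeholders, and the function names are hypothetical. A mirror push carries branches and tags only, not issues or pull requests.

```python
import subprocess


def mirror_commands(source_url, target_url, workdir):
    """Build the two git invocations for a full mirror copy."""
    # --mirror clones every ref (branches, tags, notes) into a bare repo.
    clone = ["git", "clone", "--mirror", source_url, workdir]
    # Push all refs from the bare clone to the new remote.
    push = ["git", "-C", workdir, "push", "--mirror", target_url]
    return [clone, push]


def run_mirror(source_url, target_url, workdir, dry_run=True):
    """Print the commands, and execute them only when dry_run is False."""
    commands = mirror_commands(source_url, target_url, workdir)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


# Placeholder URLs: replace with your own source and target repositories.
run_mirror(
    "https://example.org/me/project.git",
    "https://example.net/me/project.git",
    "project.git",
)
```

With the default `dry_run=True` nothing is executed, so the commands can be inspected before any network access happens.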
@MartinEscardo This is apparently a bit more long-term than I thought at first, but I hope we are moving towards a @forgefed future (including account/project migration from server to server at least as good as Mastodon's and ideally better).
@MartinEscardo in France everyone has to upload their research on https://hal.science. I wasn't so sure in the beginning but I can definitely see the point now.
@Scriddie Can people from the UK upload things to hal.science, after the contentious Brexit?
@MartinEscardo haha absolutely. It's open to everyone subject to review to make sure it's scientific work.
@Scriddie Good to know!
@MartinEscardo @Scriddie don't you have to submit the version of record to your library? I always had to do it.
I suppose your concern is that in the UK this infrastructure is private (the one provided by Elsevier is really common I think) and you don't know what will happen in the future?
@MartinEscardo arXiv has a five million dollar grant from NSF (https://www.nsf.gov/awardsearch/showAward?AWD_ID=2311521&HistoricalAwards=false), and I assume that it has the usual NSF-demanded DEI section in it.
@MartinEscardo it appears there are multiple options to download the full database: https://info.arxiv.org/help/bulk_data/index.html
The data won't be lost, at least.