mathstodon.xyz is one of the many independent Mastodon servers you can use to participate in the fediverse.
A Mastodon instance for maths people. We have LaTeX rendering in the web interface!

John Carlos Baez

𝗤𝗨𝗜𝗖𝗞! 𝗙𝗘𝗘𝗗 𝗧𝗛𝗘 𝗕𝗘𝗔𝗦𝗧!

Academic publisher Taylor & Francis recently sold many of its authors’ works to Microsoft for $10 million, without asking or paying the authors — to train Microsoft’s large language models!

Taylor & Francis asked their journal "Learning, Media and Technology" to cut peer review time to 15 days — absurdly little time — to crank out more content.

And Taylor & Francis's subsidiary Routledge told staff that it was “extra important” to meet publishing targets for 2024. It moved some book deadlines from 2025 to 2024. Why? To meet its deadline with Microsoft.

Another academic publisher, Wiley, made a $44 million deal to feed academic books to LLMs — with no way for authors to opt out. They say “it is in the public interest for these emerging technologies to be trained on high-quality, reliable information.”

When you publish with one of the big academic publishers, they try to make you sign a contract saying they can do whatever they want with your work. That means anything.

Hat-tip to @bstacey for pointing this out.

These articles have links to original sources:

pivot-to-ai.com/2024/08/04/mor

pivot-to-ai.com/2024/09/28/rou

@johncarlosbaez @bstacey

In a different universe, academics around the world had enough brain power to invent an alternative system to the broken parasitic one we have now.

In a different universe, academics could get out of their moderately comfy office armchairs and organise collectively.

The academic publishers would crumble. Without content they are nothing.

@rzeta0 @johncarlosbaez @bstacey I feel ya. I've become incredibly cynical about academia in my seven years of higher ed - even outside of publishing, the way unis are basically hedge funds and landlords that sometimes give degrees, the sheer sinisterness of the grant system and research funding, the funneling of students through whatever terrible circumstances just to keep graduation rates up, the postdoc system generally, the bureaucratic bloat enriching administrators at the expense of professors and students, the insular departmental cultures and elitist attitude against "defecting" from the grind as though we're somehow above the fray of human affairs rather than mired in them... There *has* to be a better way, but frankly I'm tired and have gone through enough and would rather find ways to stay mathematically engaged outside of the ivory towers.

@djspacewhale - The way I see it, the key problem is that academics have preferred to focus on their research and let administrators and publishers do the boring work. It's not a lack of brain power, it's laziness. But this means that gradually academics have become trapped in a system that doesn't work for them.

I see this clearly in the University of California system. This has a lot of self-governance baked into its rules - that is, academics are supposed to be making decisions about how to run things. But we've let administrators take over, not to mention publishers.

My personal solution was to retire early. This is not a solution to the overall problem!

My grad student Brendan Fong has set up his own institute, the Topos Institute. That's a better solution. Take charge!

@rzeta0 @bstacey

topos.institute

I do not trust global databases. Unless the tabulated data is moved to immutable storage (InterPlanetary File System), the issue with credibility is only going to increase. Going forward, immutable storage for tabulated data shall be the Gold Standard. This is not only "academia", this is a Civilization issue where Academia has the best chance and best interest to champion it.

@johncarlosbaez - I think saying it is laziness is a little harsh. Division of work with appropriately skilled people is not being lazy. The system has become parasitic to the point that it is killing its host; and if removing parasites were just a matter of not being lazy, there’d be very few parasites anywhere.

I think our real failing (as a society) was allowing these companies to financially profit to the point where they could impose and protect their parasitism on us (those that need to publish) with impunity, rather than directly taking over.

I continue to hope that modern tech can help by reducing the cost and inconvenience of the "boring work," as you call it. Only a few years ago, copyediting for anything I wanted to publish was a costly and time-consuming business as I’m profoundly dyslexic. Now I use an LLM to reduce this cost to effectively zero, allowing me to start self-publishing articles. These companies know they’re doomed in their current form, and this seems like a last misguided grasp to gouge more money from the system before they evaporate.

I (and some others) also started our own institute, though alas, it did not end too well for us!

@rzeta0 @johncarlosbaez @bstacey

As much as I share your hope, in this universe you get sidelined at a *diversity* panel (sic!) meeting for suggesting a motion for the institution to leave the shitty birdplace and to encourage alternatives for professional communications of faculty and staff.
Harshest reactions come from younger students.
I don't see the light at the end of this tunnel yet.

Socialism does not work for EVERYONE, there are no property rights in Socialism. In Socialism the more productive you are, the more you are FORCED to contribute.

Sorry, I do not have opinions about Socialism, I have Life Experiences in Socialism.
Go live in Socialism and find out. Again, life experience beats academia every time, at every turn.

@FourOh-LLC @djspacewhale @johncarlosbaez
Sure, but nobody is going to go live in Socialism (nor should they have to just to understand what you're hinting at), so either share your experiences, or your words are completely pointless.

John Carlos Baez left the USA, and moved to the UK because he will not tolerate another possible Trump administration.

This is the proper response to all politically motivated arguments.

I was born in Soviet-occupied Hungary in 1968, and by 1984 I had a promising future in the Hungarian Communist party. I believed in Communism.

This, despite the facts that are now on display in https://en.wikipedia.org/wiki/House_of_Terror

All the grandparents of my peers survived WW-II and the 1956 revolution against Communism. In 1996 after graduating as a railway engine repair tech I worked with people who were interrogated.

You cannot propose me a version of Socialism where people are not tortured and murdered by the thousands, by the millions for "wrong-think".

I do not have opinions, I have life experiences and the facts.
Sorry, I got pissed and confused my timeline.

I left Hungary in 1987 for Austria, and I waited for my chance to enter the USA LEGALLY until August 1989.

I am now a naturalized US citizen since 2017, living on a Green Card for 30 years thinking I escaped Socialism.

Communism is the inevitable end to the Human Experience. Facts do not need to be "just".

@FourOh-LLC @djspacewhale @johncarlosbaez
Thanks for sharing that.

Anger at Socialism and Socialists is understandable, but there's no need to be angry at anyone on Mastodon, at least in this thread; no one here is proposing Socialism, although that's what publishers and others are doing.

I'm aware that the USSR sucked; aside from news, I've met and have been friends with many Russian emigrants from the ex-USSR and one in particular from Hungary (which of course continues to have problems, alas).

Congratulations on escaping.

That is my point. Anyone can have their own opinion, their own preferred lifestyle and most importantly - have it without forcing others to adopt it. The US is a Christianity-based Western Civilization. Come here when you are compatible, and stay where you are when you are not. You cannot simply come to Chicago, and claim that adopting your "culture" is in my own best interest.

Using arguments about what happened decades, centuries, tens of thousands of years before belongs to Academia. Civic institutions are not equipped to deal with that.

The best Socialists in the US can do is move themselves to a receptive place, and John showed the way. There are tens of millions of us who will resist Socialism, Communism, Anarchy, Islam, Scientology, the Sun God, etc. Move yourself to a compatible place.

That's all.

Regarding Academia, the Sciences and Engineering - I really love to read you folks. It's interesting to me, where I work on EU and US environmental databases with billions of substances not definitely entering regulatory status. What I wrote about the differences between academia and engineering is also abstract, so I was dishonest to some degree.
The current fallout with Academia is the fact that they are unable to step out of the abstract. In Academia doing "research", getting to the wrong conclusion 99.999% of the time then finding a solution is 100% success rate. After all, mission accomplished and credit is deserved.

For engineers, on the other hand, a 99.999% success rate and one failure is a FAIL.

It is evident that Academia has difficulties in this area.
Then take that success (generated by a 99.999% failure rate) and it turns out it has zero utility. It's still a scientific achievement, of course, but currently it has no utility.

See what happens to an engineering company pursuing projects with no utility.

China's Evergrande Group illustrates this in all the Glory of Central Planning, Socialism, and the disconnect between the abstract and utility.

@johncarlosbaez @bstacey hey @jonny ! I was wondering what your thoughts are on this. Are we doomed to all published work being sent into the machine?

@science_is_hard @johncarlosbaez @bstacey @jonny as of today, yes. We have to push change ourselves if we want to see a different outcome.

@johncarlosbaez @bstacey on the other side, „researchers“ are cranking out content with the aid of LLMs. How long until all of this falls apart?

@flq @bstacey - the business of scholarship may become further enshittified, while people who actually care about ideas and writing will have to work around the damage.

It's been falling apart for a while. ChatGPT refuses to accept the fact that "battery technology" in EVs is plateauing. I invite you to compose and share a question where ChatGPT admits that Ohm's Law is going to limit the "evolution" of the cell. In railway engines 1" diameter copper conductors are acceptable, where the generator is on the same frame and electricity is consumed as soon as it is generated (ideal). EVs "storing" electricity are going to have a problem with copper conductors 1" in diameter.

ChatGPT is learning from "scientists" apparently unaware of the fact that electricity has zero shelf-life.

@johncarlosbaez @bstacey Crap. I have stuff I've authored through Routledge. How am I just now hearing about this?

@johncarlosbaez @bstacey

I don't see the problem on the source side, since it's just a more automated version of what we already had. #Science is not a closed endeavor.

Now, since they are selling this for commercial use, we have a problem of another kind. How will this info be used? Who will have (free) access to it?

The simple fact of the matter is, that we need something like this, and it needs to be done collectively such as we see in the Olympics, Space stations, or Global Monetary infrastructure. Or even-- wait for it.. the Interfriggenet.

The costs & benefits need to be adequately distributed to all of humanity, through a downward cascading set of densely connected nodes. (Review, critique, and citation by reputable sources.)

We do not need, planet survival notwithstanding, to have a circle-jerk of multinational corporate pirates competing to serve up supercomputer power to every conceivable trivial search made by the masses, made possible by the textbook 'freemium' scam that treats energy as freemium too.

@MalthusJohn wrote: "I don't see the problem on the source side, since it's just a more automated version of what we already had. #Science is not a closed endeavor."

By the way, I'm talking about scholarship, which includes many subjects besides just "science". But that's not the main point.

In scholarship researchers put work into figuring things out. They write papers and books. When people use the results in those papers and books, they cite that work so we can follow the citation, go back, and find out what's the evidence for those claims. And when authors are cited, they get rewarded in various ways.

When you take people's papers, put them into a blender, and produce a LLM that spits out claims based on an undifferentiated mix of stuff, it breaks the system in at least two ways. Users of the LLM can't easily find where these claims are coming from. And the scholars who did the research aren't getting rewarded by citations.

To fix this, we need LLMs that cite what they're basing their claims on.

And we need to fix the publishing model so that universities, whose scholars do the research, don't have to pay monopoly prices to access the results. It would be even better if ordinary citizens could freely access the research their taxes are paying for!

The main problem is that the publishers' goals are not aligned with those of the people who read and write the publications. This leads to outrages like Routledge telling people to hurry up and finish writing a book so they can feed it to a large language model.
@bstacey

> It would be even better if ordinary citizens could freely access the research

I am an ordinary citizen, and I would love to hear what else you have to say along those lines.

@johncarlosbaez @MalthusJohn
@bstacey
I have good hopes that so-called "diamond open access" journals will take over in less than a decade or so. See already initiatives like tektonika.online and seismica.library.mcgill.ca but there are many others in many disciplines. By the researchers, for everyone.
This of course does not prevent LLMs from digesting these papers. But at least no "editor" makes *more* money from this behind our backs...

@johncarlosbaez

We agree on the need for a better system, where goals, risks, and benefits are aligned.

It needs to produce a couple orders of magnitude more quantity as well.

I've also said in other chats that to use LLM well, you need to know the citations where the info comes from. There are no original works possible without human input.

The reason there's no problem on the publishing side is that whatever is being 'scraped' is already copyrighted, peer reviewed, etc. Using these aids does not eliminate the scientific methodology, norms and such. When anyone, using AI or not, submits a paper, it has to pass the same standards.

If we leave bad actors aside, if an idea has precedence elsewhere in the literature, the author's ignorance is no excuse. Most of the trash output of AI should be caught in early filters, and the tests for that must include having citations of previous legit work. A paper that claims groundbreaking new work and has no citations is not going to get traction, for example. Certainly not without the accompanying logic, the necessary input of a human who lays out the trail for others to follow, and confirm.

@MalthusJohn @johncarlosbaez ai is unnecessary. It's the tool of the greedy. Not the tool of knowledge.

@f4grx @johncarlosbaez

Yes, I would much rather see an army of people employed to work on updating the body of science literature than these LLM centers.

We were talking about something that already has happened & will continue to do so for the usual financier cult driven reasons.

My point was simply that the AI output is never going to be published as it is produced because it has either already been published (an exact copy) or because it's gibberish (a statistical copy). They are pretty much programmed to avoid the former for copyright reasons, leaving only the crap output to worry about filtering out.

@MalthusJohn

An even simpler example, I once wrote a bit of open-source code to compute a 'modular inverse', with comments referencing a paper and a website to read up on limitations of my implementation and faster options.

Out of curiosity, I had GitHub's Co-Pilot write the same function for me. It wrote the exact same code, minus all the comments to help users go learn more.

Citations matter. But that's sometimes counter to the financial incentives of the people who rent out time on LLMs, as they may have to flow some of that money (and attention) back to the people who created the training data.
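
For readers who have not met the term, here is a minimal Python sketch of a modular inverse with the kind of "go learn more" comments described above. It is not the code from this post, which isn't shown in the thread: the function name `modular_inverse`, the choice of the extended Euclidean algorithm, and the notes in the comments are all assumptions made for illustration.

```python
def modular_inverse(a: int, m: int) -> int:
    """Return x in [0, m) such that (a * x) % m == 1.

    Illustrative sketch using the extended Euclidean algorithm.

    Notes for readers (the sort of pointers a comment-stripping tool drops):
    - An inverse exists only when gcd(a, m) == 1.
    - The running time depends on the inputs, so this is not constant-time
      and should not be used on secret values in cryptographic code.
    - Python 3.8+ provides the same result via the built-in pow(a, -1, m).
    """
    if m <= 1:
        raise ValueError("modulus must be greater than 1")

    # Invariant: old_r == old_s * a (mod m) and r == s * a (mod m).
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s

    # old_r now holds gcd(a, m); anything other than 1 means no inverse.
    if old_r != 1:
        raise ValueError(f"{a} has no inverse modulo {m}")
    return old_s % m
```

Stripping the docstring leaves code that behaves identically, which is the point being made here: the arithmetic survives, while the caveats and the trail back to further reading do not.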

@johncarlosbaez @bstacey

@johncarlosbaez @MalthusJohn @bstacey there is no need to fix anything. The system worked well without AI crap. To fix this we have to remove LLMs, which are always unnecessary, and always motivated by financial reasons to milk out all authors and artists.

@johncarlosbaez This kind of shit doesn't just enrage me, it also makes me deeply sad. I want to publish textbooks one day, and I want them to be actual Books rather than just lecture notes available free on my website, but who will I be able to entrust my work to?

@johncarlosbaez @bstacey This story enrages me. But the fact that the artwork that accompanies it was made from also pilfering artists' work to feed an AI is beyond ironic.

@LingLass - that was intentional; read the image information.

@johncarlosbaez truths for the lie machine! money for the money throne!

@johncarlosbaez That's just modern development. After a couple of years Microsoft won't need human customers at all.

@Santtu_61 @johncarlosbaez That would appear to be an AI illustration accompanying this post.

If I’m wrong please credit the artist. If I’m right, please consider that irony may be dead…

@megmuttonhead @Santtu_61 @johncarlosbaez Definitely hit me too. Let's complain about stealing from authors while stealing from artists.

@ocdtrekkie @Santtu_61 @johncarlosbaez apparently, I was mistaken. (My allergy to AI images is possibly being generalized to all images… Gotta watch out for that.)

@megmuttonhead @Santtu_61 @johncarlosbaez What Cat C-B said ☝️: I’m getting mixed messages from the AI garbage pic.

@Wlm @megmuttonhead @Santtu_61 - irony is not dead, you just didn't read the image description.

@johncarlosbaez @Wlm @Santtu_61 Thank you for the correction! I appreciate it.

@johncarlosbaez using an AI-generated image on a post critical of LLMs is definitely A Look.

@johncarlosbaez @bstacey

a) you did sign away your copyright

b) is it not better to train LLM on proper academic papers rather than random rants on Twitter and Reddit?

@richardtol wrote:

"you did sign away your copyright"

No, I didn't - I avoid publishing with crap publishers who force bad contracts on their authors. But a bunch of suckers did. So I'm warning others.

"is it not better to train LLM on proper academic papers rather than random rants on Twitter and Reddit?"

Better for whom? You're not addressing the key problem here, which is that crap publishers are now trying to make people finish their books fast, and skip proper refereeing. That's not better for anyone except the publishers.

In scholarship researchers put work into figuring things out. They write papers and books. When people use the results in those papers and books, they cite that work so we can follow the citation, go back, and find out what's the evidence for those claims. And when authors are cited, they get rewarded in various ways.

When you take people's papers, put them into a blender, and produce a LLM that spits out claims based on an undifferentiated mix of stuff, it breaks the system in at least two ways. Users of the LLM can't easily find where these claims are coming from. And the scholars who did the research aren't getting rewarded by citations.

To fix this, we need LLMs that cite what they're basing their claims on.

And we need to fix the publishing model so that universities, whose scholars do the research, don't have to pay monopoly prices to access the results. It would be even better if ordinary citizens could freely access the research their taxes are paying for!

The main problem is that the publishers' goals are not aligned with those of the people who read and write the publications.

@bstacey

@johncarlosbaez @bstacey it's a bit rich to post this with an "ai"-generated image attached

@johncarlosbaez @bstacey Seems like a mixed message to illustrate a post deploring the harvesting of people’s original work without their consent for AI with an image created by the exact same thing.

@bstacey @johncarlosbaez @Adzebill That thought crossed my mind too. One of the problems with this wild frontier of AI products is having no way of knowing which were built on user-consented content and which were not.

@johncarlosbaez @bstacey I’m not a fan of AI but there’s some dissonance going on with the articles then using an AI image to advertise them

@PictoPirate - good, you noticed. Did you read the image description?

@johncarlosbaez @bstacey As a scientist that aims to do research for the betterment of society, I do not have a problem with any of my works being used to create AI databases. I guess what does rub me the wrong way a little is that publishers are once again making money off of the free labor of scientists. Will they reduce publication costs because of this windfall? Doubtful, they will probably get a great bonus this year.