#LLMs Pass the #TuringTest: Interrogators mistook GPT-4.5 for a human 73% of the time—far more than they did the actual human participant https://arxiv.org/abs/2503.23674

Large-Scale Text Analysis & Cultural Change
In their talk at the workshop “Large Language Models for the HPSS” @tuberlin, Pierluigi Cassotti and Nina Tahmasebi presented a multi-method approach to studying cultural and societal change through large-scale text analysis.
By combining close reading with computational techniques, including but not limited to #LLMs, they demonstrate how diverse tools can be integrated to uncover shifts in language. #DigitalHumanities
The insatiable hunger to feed #LLMs and #AI is parasitically draining the commons and public internet. Bandwidth costs are spiking as crawlers take data for training and information. For Wikipedia, the lack of attribution means no visitors, no donors, just cost. The #ethics of AI are failing here.
I saw Tim Karr on Bluesky suggest that AIs should pay fees or a tax (should that be tariffs?) into a fund that supports public content. Services like Cloudflare and Fastly that defend against bots are evolving to handle crawlers. In #identity, the implications for #AgenticAI, #AI, and #NHI are vast.
https://diff.wikimedia.org/2025/04/01/how-crawlers-impact-the-operations-of-the-wikimedia-projects/
Proof or bluff? Evaluating LLMs on 2025 USA math olympiad. ~ Ivo Petrov et al. https://arxiv.org/abs/2503.21934 #LLMs #Math
Readings shared April 1, 2025. https://jaalonso.github.io/vestigium/posts/2025/04/01-readings_shared_04-01-25 #AI #Haskell #ITP #IsabelleHOL #LLMs #LeanProver #Logic #LogicProgramming #Math #Prolog #SMT #Z3
"Prompt Engineering" for AI is today's version of "Don't hold it that way" for the iPhone 4.
Users are wrongly blamed for fundamental flaws in the technology and are instructed to adopt behavioural workarounds. These improvised habits lack the causal power to fix underlying problems in the tech, but they serve to reinforce the notion that this new tech is superior to the tech it's trying to replace or "disrupt". Furthermore, users are taught, "Just keep trying and you'll get it right," without questioning whether the new tech is the problem, or asking whether it has the potential to ever deliver on its promises.
A crucial difference between early smartphones and the wish that LLMs are a route to "Thinking Machines" is this: later models of phones successfully matured the engineering of antennas and improved mobile reception, but LLMs are a dead end that can never lead to real Artificial Intelligence.
This can be summarised by the AM/FM Principle: Actual Machines in contrast to Fucking Magic.
#Prompt -> #Script -> Recently used
A field report on #Vibe-Coding, trust in #LLMs, and how to create real added value with a single #Prompt.
On Windows, the taskbar's context menu gives quick access to recently used files, a feature I use often. Unfortunately, GNOME has nothing like it. So I simply built the feature myself, with #Bash and a little help from #ChatGPT.
You may have heard of #APL, a powerful but very dense array #programing #language. Current #LLMs have trouble with it.
After many months of training, announcing my #AI powered auto-coder, generating solutions indistinguishable from ones written by a human: https://mousetail.github.io/golf-ai/. Currently, it supports APL as well as two other popular array languages: Vyxal, and Uiua. More will follow.
Bill Gates: "A.I. chatbots will teach kids to read within 18 months." (April 2023)
HEADLINE: 'No Longer Think You Should Learn To Code,' Says CEO of AI Coding Startup
CORRECTED: CEO Of Company Whose Flimsy Business Model Is Only Hypothetically Viable If You Can't Code For Yourself Says You Should Stop Learning To Code.
"In a new joint study, researchers with OpenAI and the MIT Media Lab found that this small subset of ChatGPT users engaged in more "problematic use," defined in the paper as "indicators of addiction... including preoccupation, withdrawal symptoms, loss of control, and mood modification."
To get there, the MIT and OpenAI team surveyed thousands of ChatGPT users to glean not only how they felt about the chatbot, but also to study what kinds of "affective cues," which was defined in a joint summary of the research as "aspects of interactions that indicate empathy, affection, or support," they used when chatting with it.
Though the vast majority of people surveyed didn't engage emotionally with ChatGPT, those who used the chatbot for longer periods of time seemed to start considering it to be a "friend." The survey participants who chatted with ChatGPT the longest tended to be lonelier and get more stressed out over subtle changes in the model's behavior, too."
**Are chatbots reliable text annotators? Sometimes**
“_Given the unreliable performance of ChatGPT and the significant challenges it poses to Open Science, we advise caution when using ChatGPT for substantive text annotation tasks._”
Ross Deans Kristensen-McLachlan, Miceal Canavan, Marton Kárdos, Mia Jacobsen, Lene Aarøe, Are chatbots reliable text annotators? Sometimes, PNAS Nexus, Volume 4, Issue 4, April 2025, pgaf069, https://doi.org/10.1093/pnasnexus/pgaf069.
#OpenAccess #OA #Article #AI #ArtificialIntelligence #LargeLanguageModels #LLMS #Chatbots #Technology #Tech #Data #Annotation #Academia #Academics @ai
BREAKING: @TheAcornAI just dropped the first test-time learning pretrained model!
It learns on the fly, interacts, adapts to you, and outsmarts anything before it.
Oh, and it's OPEN.
The future just got smarter.
#AI #MachineLearning #LLMs
Happy birthday to Cognitive Design for Artificial Minds (https://lnkd.in/gZtzwDn3) that was released 4 years ago!
Since then, its ideas have been presented and discussed widely in AI, cognitive science, and robotics research, and today both the possibilities and the limitations of #LLMs, #GenerativeAI and #ReinforcementLearning (already envisioned and discussed in the book) have become a common research topic in the AI community and beyond.
Similarly, the evaluation of current AI systems in human-like and human-level terms has become a critical theme, tied to the problem of anthropomorphic interpretation of AI output (see e.g. https://lnkd.in/dVi9Qf_k ).
Book reviews have been published in ACM Computing Reviews (2021) https://lnkd.in/dWQpJdkV and in Argumenta (2023): https://lnkd.in/derH3VKN
I have been invited to present the content of the book at over 20 official scientific events, including international conferences and Ph.D. schools, in the US, China, Japan, Finland, Germany, Sweden, France, Brazil, Poland, Austria and, of course, Italy.
One piece of news I am happy to share: Routledge/Taylor & Francis contacted me a few weeks ago about a second edition! Stay tuned!
The #book is available in many webstores:
- Routledge: https://lnkd.in/dPrC26p
- Taylor & Francis: https://lnkd.in/dprVF2w
- Amazon: https://lnkd.in/dC8rEzPi
@academicchatter @cognition
#AI #minimalcognitivegrid #CognitiveAI #cognitivescience #cognitivesystems
#TxGemma is an open collection of #AI models built to revolutionize drug discovery and clinical trial predictions.
By streamlining drug development, TxGemma aims to accelerate the discovery of life-saving treatments.
Learn more: https://bit.ly/3RyygTa
STP: Self-play LLM theorem provers with iterative conjecturing and proving. ~ Kefan Dong, Tengyu Ma. https://arxiv.org/abs/2502.00212 #AI #LLMs #ITP #LeanProver
The cultural divide between mathematics and AI (A reflection on cultural differences observed at the 2025 Joint Mathematics Meeting). ~ Ralph Furman. https://sugaku.net/content/understanding-the-cultural-divide-between-mathematics-and-ai/ #AI #LLMs #Math
The disconnect between AI benchmarks and math research (Evaluating AI systems on their ability to be a mathematical copilot). ~ Ralph Furman. https://sugaku.net/content/ai-benchmarks-vs-real-math-research/ #AI #LLMs #Math