**Winbuzzer** @winbuzzer@mastodon.social · 4d

OpenAI New o3/o4-mini Models Hallucinate More Than Previous Models

#AI #OpenAI #GenAI #LLM #AISafety #AIEthics #Hallucinations #AIModels #o3 #o4mini #ChatGPT #ReasoningModels

https://winbuzzer.com/2025/04/19/openai-new-o3-o4-mini-models-hallucinate-more-than-previous-models-xcxwbn/

**LavX News** @lavxnews@mastodon.cloud · 5d

OpenAI's O3 and O4-Mini Models: A Leap Forward or a Hallucination Nightmare?

OpenAI's latest AI models, O3 and O4-Mini, promise advancements in reasoning but come with a troubling increase in hallucinations. As these models generate more inaccuracies, the implications for soft...

https://news.lavx.hu/article/openai-s-o3-and-o4-mini-models-a-leap-forward-or-a-hallucination-nightmare

#news #tech #AIHallucinations

**Global Threads** @globalthreads@mastodon.social · 6d

AI
OpenAI Unveils o3 & o4-mini Reasoning Models

o3 outperforms all models in math, coding & visual tasks; o4-mini balances price & power.
First OpenAI models to "think with images" — can analyze blurry PDFs or sketches.
Both run Python, browse the web, and will be accessible via APIs & ChatGPT.

#OpenAI #AI #o3

**Winbuzzer** @winbuzzer@mastodon.social · 6d

Microsoft Adds OpenAI o3, o4-mini to Azure & GitHub

#AI #OpenAI #Microsoft #Azure #GitHub #o3 #o4mini #LLM #ReasoningModels #CloudComputing

https://winbuzzer.com/2025/04/17/microsoft-adds-openai-o3-o4-mini-to-azure-github-xcxwbn/

**Alex Jimenez** @AlexJimenez@mas.to · Apr 6

Anthropic’s Evaluation of Chain-of-Thought Faithfulness: Investigating Hidden Reasoning, Reward Hacks, and the Limitations of Verbal #AI Transparency in #ReasoningModels

https://www.marktechpost.com/2025/04/05/anthropics-evaluation-of-chain-of-thought-faithfulness-investigating-hidden-reasoning-reward-hacks-and-the-limitations-of-verbal-ai-transparency-in-reasoning-models/

MarkTechPost · Apr 6

Anthropic’s Evaluation of Chain-of-Thought Faithfulness: Investigating Hidden Reasoning, Reward Hacks, and the Limitations of Verbal AI Transparency in Reasoning Models

A key advancement in AI capabilities is the development and use of chain-of-thought (CoT) reasoning, where models explain their steps before reaching an answer. This structured intermediate reasoning is not just a performance tool; it’s also expected to enhance interpretability. If models explain their reasoning in natural language, developers can trace the logic and detect faulty assumptions or unintended behaviors. While the transparency potential of CoT reasoning has been well-recognized, the actual faithfulness of these explanations to the model’s internal logic remains underexplored. As reasoning models become more influential in decision-making processes, it becomes critical to ensure the coherence between...

#CoTs #LLMs

**Hacker News** @h4ckernews@mastodon.social · Apr 3

Reasoning models don't always say what they think

https://www.anthropic.com/research/reasoning-models-dont-say-think

#HackerNews #ReasoningModels #AIResearch

**LavX News** @lavxnews@mastodon.cloud · Feb 14

ChatGPT's Energy Consumption: A Closer Look at AI Efficiency

A recent analysis challenges the conventional wisdom surrounding ChatGPT's energy consumption, revealing that its power usage may be significantly lower than previously estimated. As AI models evolve,...

https://news.lavx.hu/article/chatgpt-s-energy-consumption-a-closer-look-at-ai-efficiency

#news #tech #ReasoningModels

**WetHat** @WetHat@fosstodon.org · Feb 11

Apparently AI reasoning models like Deepseek-R1 and OpenAI o1 suffer from "underthinking", where they abandon promising solutions too quickly, leading to inefficient resource use. To address this, a "thought switching penalty" (TIP) was developed, which improved accuracy across math and science problems.

https://the-decoder.com/reasoning-models-like-deepseek-r1-and-openai-o1-suffer-from-underthinking-study-finds/

THE DECODER · Feb 2

Reasoning models like Deepseek-R1 and OpenAI o1 suffer from 'underthinking', study finds

Chinese researchers have discovered why AI models often struggle with complex reasoning tasks: They tend to drop promising solutions too quickly, leading to wasted computing power and lower accuracy.

#AI #ReasoningModels #DeepSeekR1

**Matthew Turland** @elazar@phpc.social · Feb 11

#ReasoningModels are just #LLMs - <antirez>
https://antirez.com/news/146?utm_source=tldrnewsletter

antirez.com

Reasoning models are just LLMs - <antirez>

#AI #ReasoningAI #ReasoningLLM

**PUPUWEB Blog** @pupuweb@mastodon.social · Feb 1

O3-mini is now available to all ChatGPT users, giving free users their first chance to try OpenAI's reasoning models! #ChatGPT #OpenAI #AI #ReasoningModels #TechNews #ArtificialIntelligence #MachineLearning #AICommunity #FreeAccess

**Johannes Kuhn (kopfzeiler)** @johakuhn@mastodon.social · Jan 30

In the #Newsletter I wrote down a few thoughts and... theses? observations? on #DeepSeek. https://internetobservatorium.substack.com/p/aus-dem-internet-observatorium-123 #AI #KI #KünstlicheIntelligenz #ReasoningModels #ChinaTech

Aus dem Internet-Observatorium · Jan 29

Aus dem Internet-Observatorium #123

By Johannes Kuhn

**Lorenzo** @enzoesco@poliverso.org · Jan 29

The Chinese firm said training the model cost just $5.6 million. Alibaba Cloud followed with a new generative AI model, while Microsoft alleges DeepSeek ‘distilled’ OpenAI’s work. #artificialintelligence #chatgpt #deepseek #deepseekr1 #deepseek-v3 #generativeai #Microsoft #nvidia #openai #reasoningmodels
DeepSeek Chatbot Beats OpenAI on App Store Leaderboard

TechRepublic · Jan 29

DeepSeek Chatbot Beats OpenAI on App Store Leaderboard

The Chinese firm said training the model cost just $5.6 million. Microsoft alleges DeepSeek ‘distilled’ OpenAI’s work.

#deepseekv3

**LavX News** @lavxnews@mastodon.cloud · Jan 13

Revolutionizing AI Reasoning: Sky-T1-32B-Preview Model Unveiled for Under $450

In a groundbreaking move, the NovaSky team at UC Berkeley has unveiled the Sky-T1-32B-Preview model, achieving top-tier reasoning capabilities at an astonishingly low cost. This fully open-source mode...

https://news.lavx.hu/article/revolutionizing-ai-reasoning-sky-t1-32b-preview-model-unveiled-for-under-450

#news #tech #OpenSourceAI

**eicker.news ᳇ tech news** @technews@eicker.news · Dec 24, 2024

»#OpenAI trained #o1 and #o3 to 'think' about its #safetypolicy: outlining the company’s latest way to ensure #AI #reasoningmodels stay aligned with the #values of their #humandevelopers.« https://techcrunch.com/2024/12/22/openai-trained-o1-and-o3-to-think-about-its-safety-policy/?eicker.news #tech #media

TechCrunch · Dec 22, 2024

OpenAI trained o1 and o3 to 'think' about its safety policy | TechCrunch

OpenAI announced a new family of AI reasoning models on Friday, o3, which the startup claims to be more advanced than o1 or anything else it has released.

**PUPUWEB Blog** @pupuweb@mastodon.social · Dec 8, 2024

OpenAI's o1 marks a major shift in the AI industry, moving away from prediction-based LLMs to reasoning models that aim to overcome their limitations. #OpenAI #AI #MachineLearning #ReasoningModels #ArtificialIntelligence #TechInnovation #AIShift

#reasoningmodels