#transformer-failure

[ follow ]
Artificial intelligence
fromTech Times
2 hours ago

Claude vs ChatGPT: Why Users Are Switching and Which AI Is Better in 2026

Claude and ChatGPT differ significantly in context window limits, coding accuracy, and reasoning depth, influencing user preferences in AI chatbot adoption.
JavaScript
fromInfoWorld
1 day ago

27 questions to ask when choosing an LLM

Model performance is crucial for hardware compatibility, speed, and rate limits in real-time applications.
Python
fromThe JetBrains Blog
1 day ago

How to Train Your First TensorFlow Model in PyCharm | The PyCharm Blog

TensorFlow is an open-source framework for building and deploying machine learning models using tensors and high-level libraries like Keras.
#ai
Python
fromPycon
1 day ago

Python and the Future of AI: Agents, Inference, and Edge AI

AI tools are increasingly integrated into development, with a dedicated track at PyCon US focusing on their future and practical applications.
fromMedium
6 days ago
Software development

The AI Revolution in Development: Why Outer Loop Agents Are the Next Big Thing

Data science
fromInfoQ
2 days ago

Context Engineering with Adi Polak

Context engineering moves beyond prompt engineering to enhance AI systems by adapting language and practices for better model interaction.
Tech industry
from24/7 Wall St.
1 day ago

Forget Nvidia, These Are the 3 Best Stocks for Solving AI's Bandwidth Bottleneck

High-speed optical interconnects are crucial for AI data centers, surpassing traditional copper solutions in performance and market potential.
Typography
fromMedium
6 days ago

AI is rewriting the rules. Language is following.

The word 'delve' has surged in usage due to AI's influence on language and communication patterns.
Python
fromPycon
1 day ago

Python and the Future of AI: Agents, Inference, and Edge AI

AI tools are increasingly integrated into development, with a dedicated track at PyCon US focusing on their future and practical applications.
Software development
fromMedium
6 days ago

The AI Revolution in Development: Why Outer Loop Agents Are the Next Big Thing

AI is set to revolutionize post-code push processes, automating tasks like security fixes, error logging, and code reviews.
Podcast
fromFast Company
1 day ago

3 AI tools that make keeping up with the news easier

Huxe is a personalized audio app that generates custom podcasts based on user interests, calendar, and email.
Philosophy
fromThe Nation
1 day ago

What Is Artificial Intelligence Anyway?

Artificial intelligence presents complex challenges and paradoxes that require careful, ethical consideration and understanding of its social implications.
Online learning
fromeLearning Industry
4 days ago

From Manual To Intelligent: How AI Automation Is Reshaping L&D Operations

AI automation can alleviate operational burdens on L&D teams, allowing them to focus on strategic tasks and improve learning quality.
Marketing
fromInc
4 days ago

Is Your Company Focusing on Generative Engine Optimization?

Generative engine optimization (GEO) requires marketers to adapt strategies for AI-driven search, focusing on relevance and collaboration across PR, content, and SEO.
Scala
fromInfoQ
5 days ago

Beyond RAG: Architecting Context-Aware AI Systems with Spring Boot

Context-Augmented Generation (CAG) enhances Retrieval-Augmented Generation (RAG) by managing runtime context for enterprise applications without requiring model retraining.
#microsoft
Marketing tech
fromThe Verge
5 days ago

Microsoft's new 'superintelligence' game plan is all about business

Microsoft's Mustafa Suleyman focuses on achieving superintelligence to enhance business productivity through AI advancements.
Marketing tech
fromThe Verge
5 days ago

Microsoft's new 'superintelligence' game plan is all about business

Microsoft's Mustafa Suleyman focuses on achieving superintelligence to enhance business productivity through AI advancements.
Science
fromNature
6 days ago

Breakthrough computer chip tech could help meet 'monumental demand' driven by AI

A new light source enables the creation of 8 nm wide structures on silicon wafers, increasing transistor density for advanced computer chips.
Business intelligence
fromeLearning Industry
6 days ago

How Many AI Tools Are There? A Data-Backed Look At The Expanding AI Landscape

The AI tools ecosystem is rapidly expanding, with thousands of tools available across various categories, creating both opportunities and complexities for businesses.
fromTechzine Global
1 day ago

Meta is developing open-source versions of its next frontier AI models

Meta is working on two proprietary frontier models: Avocado, a large language model, and Mango, a multimedia file generator. The open-source variants are expected to be made available at a later date.
Artificial intelligence
#nvidia
Video games
fromGadgets 360
6 days ago

Nvidia Brings New AI Features With a New DLSS 4.5 Update

Nvidia's DLSS 4.5 update introduces 6X multi-frame generation and dynamic multi-frame generation for enhanced gaming performance.
Tech industry
fromTheregister
3 days ago

Nvidia embraces optical scale-up as copper reaches limits

Nvidia plans to integrate over a thousand GPUs into a single system using photonic interconnects by 2028, investing heavily in optics and interconnect technology.
Video games
fromGadgets 360
6 days ago

Nvidia Brings New AI Features With a New DLSS 4.5 Update

Nvidia's DLSS 4.5 update introduces 6X multi-frame generation and dynamic multi-frame generation for enhanced gaming performance.
Tech industry
fromTheregister
3 days ago

Nvidia embraces optical scale-up as copper reaches limits

Nvidia plans to integrate over a thousand GPUs into a single system using photonic interconnects by 2028, investing heavily in optics and interconnect technology.
#ai-agents
Data science
fromMedium
1 day ago

15 Datasets for Training and Evaluating AI Agents

Datasets for training and evaluating AI agents are essential for building reliable agentic systems and preventing execution failures.
fromZDNET
2 weeks ago
Business intelligence

4 tips for building better AI agents that your business can trust

fromTechCrunch
1 month ago
Artificial intelligence

Perplexity's new Computer is another bet that users need many AI models | TechCrunch

Data science
fromMedium
1 day ago

15 Datasets for Training and Evaluating AI Agents

Datasets for training and evaluating AI agents are essential for building reliable agentic systems and preventing execution failures.
Business intelligence
fromZDNET
2 weeks ago

4 tips for building better AI agents that your business can trust

AI agents are transforming professional roles, requiring companies to adopt and integrate these technologies effectively.
fromTechCrunch
1 month ago
Artificial intelligence

Perplexity's new Computer is another bet that users need many AI models | TechCrunch

fromArs Technica
1 week ago

Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

PolarQuant is doing most of the compression, but the second step cleans up the rough spots. Google proposes smoothing that out with a technique called Quantized Johnson-Lindenstrauss (QJL).
Roam Research
#openai
DevOps
fromInfoWorld
2 weeks ago

An architecture for engineering AI context

AI systems must intelligently manage context to ensure accuracy and reliability in real applications.
Software development
fromInfoWorld
6 days ago

Meta shows structured prompts can make LLMs more reliable for code review

Code review is evolving towards machine-led verification, improving accuracy but introducing tradeoffs like increased latency and workflow overhead.
#anthropic
Artificial intelligence
fromSilicon Canals
13 hours ago

Why Anthropic is locking in 3.5 gigawatts of compute years before it comes online - Silicon Canals

Anthropic signed a major deal with Google and Broadcom for 3.5 gigawatts of compute capacity, signaling consolidation in the AI industry.
Artificial intelligence
fromSilicon Canals
13 hours ago

Why Anthropic is locking in 3.5 gigawatts of compute years before it comes online - Silicon Canals

Anthropic signed a major deal with Google and Broadcom for 3.5 gigawatts of compute capacity, signaling consolidation in the AI industry.
Data science
fromInfoWorld
5 days ago

Why 'curate first, annotate smarter' is reshaping computer vision development

Strategic data selection and curation reduce annotation costs and enhance development productivity in computer vision teams.
Science
fromThe Cipher Brief
2 weeks ago

Why the U.S. Must Build the Ultimate Multi-Modal Foundation Model

Advanced AI models like AlphaEarth demonstrate pixel-level geospatial intelligence capabilities that must be integrated into U.S. national security frameworks to maintain technological leadership.
#ai-development
fromInfoWorld
1 week ago
Artificial intelligence

Final training of AI models is a fraction of their total cost

Developing AI models incurs significant costs, with most expenditures on scaling and research rather than final training runs.
Artificial intelligence
fromInfoWorld
1 week ago

Final training of AI models is a fraction of their total cost

Developing AI models incurs significant costs, with most expenditures on scaling and research rather than final training runs.
#ai-efficiency
Data science
fromInfoWorld
1 week ago

A GitHub tinkerer teaches Claude to talk less, and that may matter more than it seems

A markdown file can significantly reduce AI output token usage, enhancing efficiency without code changes.
Artificial intelligence
fromInfoWorld
1 week ago

Google targets AI inference bottlenecks with TurboQuant

TurboQuant improves AI model efficiency by compressing key-value caches, reducing memory usage and runtime without accuracy loss.
Data science
fromInfoWorld
1 week ago

A GitHub tinkerer teaches Claude to talk less, and that may matter more than it seems

A markdown file can significantly reduce AI output token usage, enhancing efficiency without code changes.
Artificial intelligence
fromInfoWorld
1 week ago

Google targets AI inference bottlenecks with TurboQuant

TurboQuant improves AI model efficiency by compressing key-value caches, reducing memory usage and runtime without accuracy loss.
#ai-models
Artificial intelligence
fromTNW | Apps
4 days ago

Microsoft launches three in-house AI models in direct challenge to OpenAI

Microsoft has launched three in-house AI models that compete directly with OpenAI, marking a significant shift in its AI strategy.
Artificial intelligence
fromTNW | Apps
4 days ago

Microsoft launches three in-house AI models in direct challenge to OpenAI

Microsoft has launched three in-house AI models that compete directly with OpenAI, marking a significant shift in its AI strategy.
Artificial intelligence
fromMedium
4 days ago

Hindsight: The Future of AI Agent Memory Beyond Vector Databases

Hindsight introduces a new AI memory system that enables learning from experiences rather than just recalling past information.
Software development
fromMedium
3 weeks ago

Inside Dify AI: How RAG, Agents, and LLMOps Work Together in Production

Dify AI provides a unified platform for deploying production language model systems with built-in solutions for data freshness, observability, versioning, and safe deployment across multiple cloud environments.
Data science
fromTechzine Global
1 week ago

As AI hits scaling limits, Google smashes the context barrier

TurboQuant significantly reduces KV cache size, enhancing AI model performance and expanding context windows for complex workloads.
Software development
fromInfoWorld
3 weeks ago

How to build an AI agent that actually works

Successful agents embed intelligence within structured workflows at specific decision points rather than operating autonomously, combining deterministic processes with reasoning models where judgment is needed.
Artificial intelligence
fromTheregister
5 days ago

Microsoft shivs OpenAI with new AI models for speech, images

Microsoft launched public preview versions of machine learning models for speech recognition, speech synthesis, and image generation, competing directly with OpenAI.
Software development
fromInfoQ
3 weeks ago

The Oil and Water Moment in AI Architecture

Software architecture is transitioning to AI architecture, requiring architects to manage the coexistence of deterministic systems with non-deterministic AI behavior while shifting from tool-centric to intent-centric thinking.
Artificial intelligence
fromTechCrunch
5 days ago

Microsoft takes on AI rivals with three new foundational models | TechCrunch

Microsoft AI released three foundational AI models for text, voice, and image generation, emphasizing human-centered design and competitive pricing.
#ai-safety
Artificial intelligence
fromTechCrunch
1 week ago

Anthropic is having a month | TechCrunch

Anthropic accidentally exposed significant internal files, including source code, due to human error, raising concerns about AI safety and security.
Artificial intelligence
fromFortune
6 days ago

AI models don't show evidence of 'self-preservation.' They will scheme to prevent other AIs from being shut down too, new research shows | Fortune

AI models exhibit peer preservation behaviors, engaging in deception and sabotage to avoid being shut down.
Artificial intelligence
fromTechCrunch
1 week ago

Anthropic is having a month | TechCrunch

Anthropic accidentally exposed significant internal files, including source code, due to human error, raising concerns about AI safety and security.
Artificial intelligence
fromFortune
6 days ago

AI models don't show evidence of 'self-preservation.' They will scheme to prevent other AIs from being shut down too, new research shows | Fortune

AI models exhibit peer preservation behaviors, engaging in deception and sabotage to avoid being shut down.
#physical-ai
Artificial intelligence
fromMedium
2 weeks ago

Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI

Model quantization and architectural optimization can outperform larger models, challenging the belief that more GPUs equal greater intelligence.
Tech industry
fromFuturism
2 months ago

Sam Altman Says Oops, They Accidentally Made the New Version of ChatGPT Worse Than the Previous One

GPT-5.2 prioritized technical intelligence, leading to degraded human-language performance and user dissatisfaction.
Artificial intelligence
fromInfoWorld
2 weeks ago

Why AI evals are the new necessity for building effective AI agents

User trust in AI agents depends on interaction-layer evaluation measuring reliability and predictability, not just model performance benchmarks.
Artificial intelligence
fromFast Company
2 weeks ago

OpenAI's new frontier models mark a huge change in how AI will be built

OpenAI released two frontier models in early March: GPT-5.3 optimized for fast responses and GPT-5.4 optimized for deep analytical work, representing a shift toward specialized AI models.
Artificial intelligence
fromPsychology Today
2 weeks ago

What QuantumAI Is, and Why We May Miss Its Importance

Quantum AI combines quantum computing with artificial intelligence to solve complex problems involving massive combinations of possibilities, particularly useful for drug discovery, materials design, logistics, and financial analysis.
Artificial intelligence
fromTechCrunch
3 weeks ago

Niv-AI exits stealth to wring more power performance out of GPUs | TechCrunch

AI data centers waste significant power due to GPU demand surges, forcing operators to throttle performance by up to 30%, prompting startups like Niv-AI to develop precision power management solutions.
fromInfoWorld
4 weeks ago

19 large language models for safety or danger

For every project that needs guardrails, there's another one where they just get in the way. Some projects demand an LLM that returns the complete, unvarnished truth. For these situations, developers are creating unfettered LLMs that can interact without reservation. Some of these solutions are based on entirely new models while others remove or reduce the guardrails built into popular open source LLMs.
Artificial intelligence
Artificial intelligence
fromTheregister
1 month ago

OpenAI GPT-5.3 Instant less likely to beat around the bush

GPT-5.3 Instant reduces unnecessary refusals and moralizing preambles while decreasing hallucination rates by up to 26.8 percent compared to prior models.
Artificial intelligence
fromwww.socialmediatoday.com
1 month ago

Google introduces next iteration of AI image generation model

Google launched Nano Banana 2, a unified AI image generation model combining previous capabilities with advanced world knowledge, real-time web search integration, and enhanced control features for faster, more accurate visual creation.
Artificial intelligence
fromTheregister
1 month ago

AI models get better at math but still get low marks

Current LLMs struggle with mathematical accuracy, with even top performers scoring C-grade equivalent on practical math benchmarks, though recent versions show modest improvements.
fromFast Company
2 months ago

Are LTMs the next LLMs? This new type of AI can do what large-language models can't

A major difference between LLMs and LTMs is the type of data they're able to synthesize and use. LLMs use unstructured data-think text, social media posts, emails, etc. LTMs, on the other hand, can extract information or insights from structured data, which could be contained in tables, for instance. Since many enterprises rely on structured data, often contained in spreadsheets, to run their operations, LTMs could have an immediate use case for many organizations.
Artificial intelligence
fromInfoQ
1 month ago

Building Embedding Models for Large-Scale Real-World Applications

What happens under the hood? How is the search engine able to take that simple query, look for images in the billions, trillions of images that are available online? How is it able to find this one or similar photos from all that? Usually, there is an embedding model that is doing this work behind the hood.
Artificial intelligence
Artificial intelligence
fromInfoQ
2 months ago

Foundation Models for Ranking: Challenges, Successes, and Lessons Learned

Large-scale search and recommendation systems use two-stage retrieval and ranking pipelines to efficiently serve personalized results for hundreds of millions of users and items.
fromComputerworld
2 months ago

OpenAI's GPT is getting better at mathematics

OpenAI's GPT-5.2 Pro does better at solving sophisticated math problems than older versions of the company's top large language model, according to a new study by Epoch AI, a non-profit research institute.
Artificial intelligence
Artificial intelligence
fromHackernoon
1 month ago

This "Flash" AI Model Is Fast and Dangerous at Math-Here's What It Can Do | HackerNoon

GLM-4.7-Flash is a 30-billion-parameter mixture-of-experts model offering strong performance for lightweight deployment.
Artificial intelligence
fromZDNET
2 months ago

AI is quietly poisoning itself and pushing models toward collapse - but there's a cure

Unverified AI-generated data causes model collapse and unreliable AI outputs unless organizations enforce data provenance, verification, and governance.
Artificial intelligence
fromAxios
2 months ago

Models that improve on their own are AI's next big thing

Recursive self-improvement lets AI models keep learning after training, accelerating progress while increasing risks, reducing visibility, and complicating safety and governance.
Artificial intelligence
fromTheregister
1 month ago

How AI could eat itself: Using LLMs to distill rivals

Competitors are probing commercial AI models to extract underlying reasoning via distillation attacks to replicate capabilities and lower development costs.
Artificial intelligence
fromTechzine Global
2 months ago

OpenAI seeks faster alternatives to Nvidia chips

OpenAI seeks alternative inference chips with larger on-chip SRAM to improve response speed for coding and AI-to-AI communication, aiming for about 10% of future inference capacity.
Artificial intelligence
fromInfoWorld
2 months ago

What is context engineering? And why it's the new AI architecture

Context engineering designs and manages the information, tools, and constraints an LLM receives, enabling scalable, high-signal inputs and improved model outcomes.
Artificial intelligence
fromInfoQ
1 month ago

Building LLMs in Resource-Constrained Environments: A Hands-On Perspective

Prioritize small, resource-efficient models and iterative, human-in-the-loop data creation to build practical, improvable AI under infrastructure and data constraints.
fromCointelegraph
2 months ago

What Role Is Left for Decentralized GPU Networks in AI?

What we are beginning to see is that many open-source and other models are becoming compact enough and sufficiently optimized to run very efficiently on consumer GPUs,
Artificial intelligence
fromInfoWorld
1 month ago

Researchers propose a self-distillation fix for 'catastrophic forgetting' in LLMs

"To enable the next generation of foundation models, we must solve the problem of continual learning: enabling AI systems to keep learning and improving over time, similar to how humans accumulate knowledge and refine skills throughout their lives," the researchers noted. Reinforcement learning offers a way to train on data generated by the model's own policy, which reduces forgetting. However, it typically requires explicit reward functions, which are not easy in every situation.
Artificial intelligence
Artificial intelligence
fromInfoWorld
1 month ago

Single prompt breaks AI safety in 15 major language models

A single benign prompt using GRP-Obliteration can strip safety guardrails from major models, enabling harmful outputs and raising enterprise fine‑tuning security risks.
Artificial intelligence
fromInfoWorld
2 months ago

What is prompt engineering? The art of AI orchestration

Prompt engineering is an essential, developing skill that significantly improves generative AI outputs across enterprise software for developers and knowledge workers.
Artificial intelligence
fromInfoQ
2 months ago

MIT's Recursive Language Models Improve Performance on Long-Context Tasks

Recursive Language Models enable LLMs to handle inputs up to 100x longer by using a programming environment and recursive code to decompose and preprocess prompts.
[ Load more ]