#nice-part-usage-npu

Science
from Nature
2 days ago

Breakthrough computer chip tech could help meet 'monumental demand' driven by AI

A new light source enables the creation of 8 nm wide structures on silicon wafers, increasing transistor density for advanced computer chips.
#ai
Data science
from The Register
2 days ago

TurboQuant is a big deal, but it won't end the memory crunch

TurboQuant is an AI data compression technology that reduces memory usage for KV caches but may not significantly alleviate memory shortages.
Data science
from InfoWorld
2 days ago

How to halve Claude output costs with a markdown tweak

A markdown file can reduce Claude's token output by over 50%, helping enterprises manage AI costs in production.
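The cited markdown file isn't reproduced in the brief; as a rough sketch of the idea, assuming the savings come from a brevity-focused style prompt kept in markdown, wiring it up through the Anthropic Python SDK might look like this (the file name output-style.md and the model id are placeholders):

```python
# Sketch: keep a brevity-focused style prompt in a markdown file and pass
# it as the system prompt so Claude emits fewer output tokens.
# Assumptions: "output-style.md" and the model id are placeholders.
import anthropic

with open("output-style.md") as f:   # e.g. "Answer in plain prose. No headings,
    style = f.read()                 # no bullet lists, no recaps of the question."

client = anthropic.Anthropic()       # reads ANTHROPIC_API_KEY from the environment

resp = client.messages.create(
    model="claude-sonnet-4-5",       # placeholder model id
    max_tokens=1024,                 # a hard cap is a second lever on output cost
    system=style,                    # the markdown tweak
    messages=[{"role": "user", "content": "Summarize our Q3 incident review."}],
)
print(resp.content[0].text)
print(resp.usage.output_tokens)      # compare runs with and without the style file
```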
Artificial intelligence
from Engadget
4 days ago

Microsoft's research assistant can now use multiple AI models simultaneously

The upgraded Researcher tool combines ChatGPT and Claude models for improved research quality in Microsoft 365 Copilot.
Silicon Valley
from TechCrunch
1 week ago

Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way

Gimlet Labs raised $80 million to enhance AI inference efficiency across diverse hardware types.
Tech industry
from Computerworld
1 week ago

HP will cram a 20-billion-parameter AI model into new AI PCs

HP is launching AI features in its Workforce Experience Platform to enhance remote device management and automate tasks on enterprise PCs.
#nvidia
Venture
from 24/7 Wall St.
1 month ago

NVIDIA Just Made Another Smart Bet on AI

Nvidia continues aggressive dealmaking and strategic investments across AI, leveraging cash and partnerships while raising risks of overexposure if AI monetization falters.
Artificial intelligence
from 24/7 Wall St.
2 months ago

Is AMD About to Surpass Nvidia In the AI Chip Race?

Nvidia dominates AI chips with roughly 92% of data-center GPUs, while AMD has rapidly improved with MI300X and may challenge on cost and open-standard appeal.
Software development
from Ars Technica
2 days ago

Nvidia rolls out its fix for PC gaming's "compiling shaders" wait times

Nvidia's new Auto Shader Compilation feature allows automatic shader compilation during idle times to reduce load times for PC gamers.
Tech industry
from 24/7 Wall St.
2 days ago

Nvidia vs Broadcom: Which AI Stock Will Make You More Money?

Nvidia and Broadcom reported significant AI-driven revenue growth, with Nvidia focusing on GPUs and Broadcom on custom silicon.
Business
from 24/7 Wall St.
1 week ago

Nvidia Could Hit $340 by 2031 and the AI Buildout Is Just Getting Started

NVIDIA's stock is projected to reach $209.50 in one year and $298.29 in five years, driven by strong growth and strategic partnerships.
Artificial intelligence
from 24/7 Wall St.
1 week ago

NVIDIA's GTC Developments Were Far Bigger Than the Market Realizes

Nvidia's stock has stayed flat despite significant innovations, and it remains unclear how the market will react to its AI developments.
#ai-efficiency
Artificial intelligence
from InfoWorld
1 week ago

Google targets AI inference bottlenecks with TurboQuant

TurboQuant improves AI model efficiency by compressing key-value caches, reducing memory usage and runtime without accuracy loss.
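These briefs don't spell out TurboQuant's algorithm; the generic mechanism behind KV-cache compression, storing keys and values as narrow integers plus per-channel scales, can be sketched in a few lines (illustrative only, not Google's implementation):

```python
# Generic KV-cache quantization: store keys/values as int8 plus per-channel
# float scales instead of fp16/fp32. Illustrative only - not TurboQuant itself.
import numpy as np

def quantize_kv(x: np.ndarray):
    """x: (seq_len, n_heads, head_dim) float32 -> int8 codes + scales."""
    scale = np.abs(x).max(axis=0, keepdims=True) / 127.0  # per (head, dim) scale
    scale = np.maximum(scale, 1e-8)                       # guard divide-by-zero
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_kv(q, scale):
    return q.astype(np.float32) * scale

kv = np.random.randn(2048, 8, 64).astype(np.float32)     # a mock cache
q, s = quantize_kv(kv)
err = np.abs(dequantize_kv(q, s) - kv).max()
print(f"{kv.nbytes} -> {q.nbytes + s.nbytes} bytes, max abs error {err:.4f}")
# ~4x smaller than fp32 (2x vs fp16) for a small reconstruction error.
```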
#arm
Tech industry
from WIRED
1 week ago

Arm Is Now Making Its Own Chips

Arm is producing its own semiconductors, marking a shift from licensing to manufacturing in response to AI demand.
Artificial intelligence
from The Register
1 week ago

Arm rolls its own 136-core AGI CPU to chase AI hype train

Arm has unveiled its first homegrown silicon, the AGI CPU, designed for artificial general intelligence and set for deployment by Meta.
Data science
from InfoWorld
2 weeks ago

The 'toggle-away' efficiencies: Cutting AI costs inside the training loop

Simple optimizations can significantly reduce AI training costs and carbon emissions without needing the latest GPUs.
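The article's specific toggles aren't listed here; one canonical inside-the-loop saving is automatic mixed precision, which cuts memory traffic roughly in half on commodity GPUs. A minimal PyTorch sketch, with model, data, and hyperparameters as stand-ins:

```python
# One representative inside-the-loop saving: automatic mixed precision (AMP).
# Model, data, and hyperparameters are stand-ins for illustration.
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
scaler = torch.amp.GradScaler(enabled=(device == "cuda"))

for step in range(100):
    x = torch.randn(64, 512, device=device)                # stand-in batch
    y = torch.randint(0, 10, (64,), device=device)
    opt.zero_grad(set_to_none=True)                        # cheaper than zeroing
    with torch.autocast(device_type=device, dtype=torch.float16,
                        enabled=(device == "cuda")):       # fp16 forward + loss
        loss = nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()                          # scale vs fp16 underflow
    scaler.step(opt)
    scaler.update()
```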
Tech industry
from The Verge
1 week ago

Arm's first CPU ever will plug into Meta's AI datacenters later this year

Arm AGI CPU features up to 136 cores and claims double the performance per watt compared to x86 chips.
Apple
from Computerworld
4 weeks ago

Leaked Mac benchmarks show that Apple offers tomorrow's AI PCs today

Apple's M5 Max MacBook Pro delivers the fastest consumer PC processor with highest single- and multi-core scores, while the MacBook Neo provides excellent everyday performance at budget pricing.
Artificial intelligence
from Medium
1 week ago

Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI

Model quantization and architectural optimization can outperform larger models, challenging the belief that more GPUs equal greater intelligence.
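As one concrete instance of the technique the piece describes, PyTorch's post-training dynamic quantization stores linear-layer weights as int8 in a single call; a toy sketch (the model is a stand-in, not anything from the article):

```python
# Post-training dynamic quantization: Linear weights stored as int8,
# activations quantized on the fly. Toy model, not anything from the article.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 1024)
print((model(x) - qmodel(x)).abs().max())  # small gap vs the fp32 model
# int8 weights cut the Linear layers' storage to roughly a quarter of fp32.
```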
Tech industry
from The Register
2 weeks ago

A closer look at Nvidia's Groq-powered LPX rack systems

Nvidia acquired Groq for $20 billion primarily to accelerate time-to-market for SRAM-heavy inference chips rather than develop the technology independently, enabling faster token generation for AI reasoning workloads.
Miscellaneous
from InfoQ
1 month ago

OpenAI Codex-Spark Achieves Ultra-Fast Coding Speeds on Cerebras Hardware

OpenAI deployed GPT-5.3-Codex-Spark on Cerebras wafer-scale chips, achieving 1,000 tokens per second for real-time interactive coding with 15× faster performance than earlier versions.
Tech industry
from Computerworld
2 weeks ago

System-level 'coopetition': Why Nvidia's DGX Rubin NVL8 runs on Intel Xeon 6

Nvidia's flagship DGX Rubin NVL8 AI systems use Intel Xeon 6 processors as host CPUs to maintain x86 compatibility and meet enterprise deployment requirements.
Tech industry
from The Register
2 weeks ago

Nvidia slaps Groq into new LPX racks for faster AI response

Nvidia integrates Groq's language processing units into Vera Rubin systems to dramatically accelerate LLM inference, enabling hundreds to thousands of tokens per second per user.
Artificial intelligence
from TechCrunch
2 weeks ago

Niv-AI exits stealth to wring more power performance out of GPUs

AI data centers waste significant power due to GPU demand surges, forcing operators to throttle performance by up to 30%, prompting startups like Niv-AI to develop precision power management solutions.
#ai-infrastructure
Artificial intelligence
from ComputerWeekly.com
2 weeks ago

HPE taps Nvidia to transform distributed AI factories into intelligent AI grid

HPE launches AI Grid infrastructure powered by Nvidia GPUs to enable distributed, low-latency AI inference at edge locations for real-time applications across retail, manufacturing, healthcare, and telecommunications.
Artificial intelligence
from TechRepublic
1 month ago

Nvidia's Vera Rubin Promises 10x Efficiency as AI Power Demands Surge

Nvidia's Vera Rubin system prioritizes energy efficiency and modularity over raw speed, delivering 10 times more performance per watt than Grace Blackwell to address data center power constraints and scaling challenges.
Data science
from TechRepublic
1 month ago

Inside the Gas Engine Strategy Powering AI's Next Wave

Gas reciprocating engines are emerging as a critical power solution for AI data centers, with manufacturers like Caterpillar securing multi-gigawatt orders to meet demand that exceeds grid and turbine capacity.
Artificial intelligence
from Techzine Global
2 weeks ago

Nvidia's Groq 3 LPU targets agentic AI inference at GTC 2026

Nvidia's acquisition of Groq technology produces the Groq 3 LPU, a specialized inference chip delivering 40 petabytes per second bandwidth, significantly outpacing GPU inference speeds.
#custom-ai-chips
Tech industry
from The Register
3 weeks ago

Meta reveals custom AI chips it says beat Nvidia

Meta unveiled four custom AI chips (MTIA 300, 400, 450, 500) developed with Broadcom, with some in production and others launching through 2027 to power AI inference and recommendation workloads.
from TechRepublic
3 weeks ago
Artificial intelligence

Meta's New AI Chips Reveal a Faster, More Self-Reliant Hardware Strategy

Meta is rapidly developing custom AI chips to reduce costs, gain hardware control, and support its expanding AI infrastructure across platforms.
Artificial intelligence
from Computerworld
2 weeks ago

Nvidia NemoClaw promises to run OpenClaw agents securely

Nvidia introduced NemoClaw with OpenShell security features to address OpenClaw's enterprise security vulnerabilities through sandbox isolation and policy enforcement.
Gadgets
from Techzine Global
1 month ago

Review ASUS NUC 15 Pro: brings computing power to impossible places

ASUS NUC 15 Pro is a compact mini PC offering reliable, flexible connectivity and redundancy ideal for point-of-sale, control panels, and space-limited business deployments.
#meta
Artificial intelligence
from InfoWorld
3 weeks ago

Nvidia launches Nemotron 3 Super to power enterprise AI agents

Nemotron 3 Super's hybrid architecture combining Mamba and Transformer technologies enables enterprises to run complex AI agents more efficiently with lower costs and faster execution on existing infrastructure.
from Techzine Global
3 weeks ago

Meta shifts to AI inference with its future chips

Four generations, MTIA 300, 400, 450, and 500, have been produced within less than two years, with several already in production and others scheduled for mass deployment in 2026 and 2027. The quick pace is deliberate. Rather than betting on a single chip generation and waiting years for results, Meta has adopted a roughly six-month cadence per generation, using modular chiplet architecture to enable incremental upgrades without replacing entire rack systems.
Artificial intelligence
from Raymond Camden
2 months ago

Summarizing PDFs with On-Device AI

For today, I'm going to demonstrate something that's been on my mind for a while - summarizing PDFs completely in the browser with Chrome's on-device AI. Unlike the Prompt API, summarization has been available since Chrome 138, so most likely those of you on Chrome can run these demos without a problem. (You can see more about the AI API statuses if you're curious.)
JavaScript
Artificial intelligence
from ComputerWeekly.com
4 weeks ago

Edge AI: What's working and what isn't

Edge AI deployment success depends on identifying efficient, narrow use cases with manageable risks rather than pursuing sophisticated, large-scale models across all applications.
Gadgets
from Engadget
2 months ago

PNY is releasing slim-sized NVIDIA RTX GPUs just as PC building becomes prohibitively expensive

PNY’s new two-slot "Slim" RTX 5070/5070 Ti/5080 GPUs offer compact designs suitable for small-form-factor builds, arriving in February amid rising PC component costs.
Artificial intelligence
from The Register
1 month ago

OpenAI GPT-5.3 Instant less likely to beat around the bush

GPT-5.3 Instant reduces unnecessary refusals and moralizing preambles while decreasing hallucination rates by up to 26.8 percent compared to prior models.
Gadgets
from Ars Technica
1 month ago

Intel Panther Lake Core Ultra review: Intel's best laptop CPU in a very long time

Intel's Panther Lake Core Ultra Series 3 delivers substantial CPU and GPU performance gains with excellent power efficiency and battery life.
Artificial intelligence
from Techzine Global
1 month ago

Nvidia is working on a chip for AI inferencing with Groq technology

Nvidia is developing an energy-efficient inferencing chip using Groq technology to compete in AI inference processing, with OpenAI as an early customer.
Artificial intelligence
from 24/7 Wall St.
1 month ago

NVIDIA Cements Its Role as the Backbone of AI Infrastructure

NVIDIA's networking revenue grew 162% year-over-year to $8.2 billion, nearly tripling GPU growth, signaling a shift from chip seller to integrated infrastructure provider selling complete AI data center systems.
Tech industry
from The Register
2 months ago

How Nvidia is using emulation to turn AI FLOPS into FP64

Nvidia achieves higher FP64 throughput through software emulation on Rubin GPUs, trading hardware FP64 for emulated matrix performance up to 200 TFLOPS.
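The brief names the trick without detail; the core idea is that exact low-precision products, recombined at higher precision, can beat native low-precision arithmetic. A toy NumPy sketch, using one wide int64 product as a stand-in for the sliced low-precision products that schemes in this family (e.g. Ozaki splitting) run on tensor cores:

```python
# Toy illustration of precision emulation: exact integer products, recombined
# in fp64, stand in for native fp64 arithmetic. Real schemes in this family
# (e.g. Ozaki splitting) slice inputs into small pieces for tensor cores;
# here a single wide int64 product plays that role.
import numpy as np

rng = np.random.default_rng(0)
n = 256
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

def int_emulated_matmul(A, B, bits=26):
    sA, sB = np.abs(A).max(), np.abs(B).max()
    Ai = np.round(A / sA * (1 << bits)).astype(np.int64)  # fixed-point encode
    Bi = np.round(B / sB * (1 << bits)).astype(np.int64)
    # Values are bounded by 2**26, so each product fits in 52 bits and
    # 256-term sums stay below 63 bits: the integer matmul is exact, and
    # the only error is the initial rounding.
    return (Ai @ Bi).astype(np.float64) * (sA * sB / float(1 << (2 * bits)))

ref  = A @ B                                              # fp64 reference
fp32 = (A.astype(np.float32) @ B.astype(np.float32)).astype(np.float64)
emul = int_emulated_matmul(A, B)
print("fp32 max err:", np.abs(fp32 - ref).max())
print("emul max err:", np.abs(emul - ref).max())          # markedly closer to fp64
```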
#ryzen-ai-400
from TechCrunch
2 months ago

Quadric rides the shift from cloud AI to on-device inference - and it's paying off

The company, which is based in San Francisco and has an office in Pune, India, is targeting up to $35 million this year as it builds a royalty-driven on-device AI business. That growth has buoyed the company, which now has a post-money valuation of between $270 million and $300 million, up from around $100 million in its 2022 Series B, Kheterpal said.
Artificial intelligence
Artificial intelligence
from Techzine Global
1 month ago

OpenAI seeks faster alternatives to Nvidia chips

OpenAI seeks alternative inference chips with larger on-chip SRAM to improve response speed for coding and AI-to-AI communication, aiming for about 10% of future inference capacity.
Artificial intelligence
from InfoWorld
2 months ago

Edge AI: The future of AI inference is smarter local compute

Edge AI shifts computation from cloud to devices, enabling low-latency, cost-efficient, and privacy-preserving AI inference while facing performance and ecosystem challenges.
Artificial intelligence
from ZDNET
2 months ago

AMD's new Ryzen chipset promises faster performance, better gaming, and smarter AI

AMD launched new Ryzen AI mobile and workstation processors plus high-performance gaming CPUs with upgraded NPUs and AI-powered FSR Redstone to boost performance and visuals.
from Computerworld
1 month ago

Intel sets sights on data center GPUs amid AI-driven infrastructure shifts

Intel is making a new push into GPUs, this time with a focus on data center workloads, as the chipmaker looks to reestablish itself in a market increasingly shaped by AI-driven demand and dominated by Nvidia. CEO Lip-Bu Tan said that after hiring a senior GPU architect, the company is working directly with customers to define requirements, signaling a more demand-driven approach as enterprises and cloud providers weigh their options for accelerated computing, according to a Reuters report.
Artificial intelligence
from Techzine Global
2 months ago

Arm and Nvidia are on the prowl for physical AI's 'ChatGPT moment'

Nvidia is reportedly positioning itself to become the 'Android for robotics'. Arm, meanwhile, has created a fully-fledged business unit for 'Physical AI' alongside its other two divisions for cloud/AI and edge. The priority in both cases is clear: innovation is moving into the physical domain, with new pioneers for the next step in IT systems. The rhetoric, however, is jumping the gun a little.
Artificial intelligence
from Techzine Global
2 months ago

Neuromorphic computers prove suitable for supercomputing

Scientists are showing that neuromorphic computers, designed to mimic the human brain, are not only useful for AI, but also for complex computational problems that normally run on supercomputers. This is reported by The Register. Neuromorphic computing differs fundamentally from the classic von Neumann architecture. Instead of a strict separation between memory and processing, these functions are closely intertwined. This limits data transport, a major source of energy consumption in modern computers. The human brain illustrates how efficient such an approach can be.
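As a flavor of that model of computation, here is the textbook leaky integrate-and-fire neuron: state (the membrane potential) stays with the computation, and communication happens only through sparse spikes. A minimal sketch, not tied to any system in the article:

```python
# Minimal leaky integrate-and-fire (LIF) neuron: state (the membrane
# potential) lives with the computation, and communication happens only
# through sparse spikes. Textbook model, not any specific hardware.
import numpy as np

def lif(inputs, leak=0.9, threshold=1.0):
    v, spikes = 0.0, []
    for x in inputs:
        v = leak * v + x          # integrate the input, leak old potential
        if v >= threshold:        # fire and reset on a threshold crossing
            spikes.append(1)
            v = 0.0
        else:
            spikes.append(0)
    return spikes

rng = np.random.default_rng(1)
print(lif(rng.uniform(0.0, 0.5, size=20)))  # a sparse 0/1 spike train
```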
Artificial intelligence
from The Register
2 months ago

Nvidia says DGX Spark is now 2.5x faster than at launch

Nvidia's DGX Spark and GB10 systems gain significant software-driven performance improvements and broader software integrations, boosting prefill compute performance for genAI workflows.
#gpt-53-codex-spark
Artificial intelligence
from InfoQ
2 months ago

Intel DeepMath Introduces a Smart Architecture to Make LLMs Better at Math

DeepMath uses a Qwen3-4B Thinking agent that emits small Python executors for intermediate math steps, improving accuracy and significantly reducing output length.
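The brief describes the pattern, not the code; its general shape, extracting a fenced Python snippet from the model's output and executing it to pin down an intermediate step, might look like this (the model output is mocked, and DeepMath's actual executor protocol isn't shown here):

```python
# Sketch of the emit-and-execute pattern: the model writes a small Python
# snippet for an intermediate step; a harness runs it and feeds the result
# back. The model output is mocked - this is not DeepMath's actual protocol.
import re

fence = "`" * 3
model_output = f"""The sum of the squares below 100 is:
{fence}python
result = sum(i * i for i in range(1, 10))
{fence}
"""

match = re.search(fence + r"python\n(.*?)" + fence, model_output, re.DOTALL)
namespace = {}
exec(match.group(1), namespace)  # run the emitted step (sandbox this in practice)
print(namespace["result"])       # 285, fed back into the model's next turn
```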
#openai
Artificial intelligence
from Ars Technica
1 month ago

OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips

Cerebras' Wafer Scale Engine enables high token throughput while OpenAI diversifies hardware beyond Nvidia amid fast-paced coding model competition.
Artificial intelligence
from HackerNoon
1 month ago

This "Flash" AI Model Is Fast and Dangerous at Math-Here's What It Can Do | HackerNoon

GLM-4.7-Flash is a 30-billion-parameter mixture-of-experts model offering strong performance for lightweight deployment.
from The Register
1 month ago

Positron opts for laptop RAM over HBM to take on Nvidia

On paper, Positron's next-gen Asimov accelerators, no doubt named for the beloved science fiction author, don't look like much of a match for Nvidia's Rubin GPUs. Yet, the Arm-backed AI startup boasts its inference chip will churn out five times as many tokens per dollar while using one-fifth the power of Nvidia's latest accelerators to do it. Those are certainly some bold claims, which the company contends are possible because the chip was designed to support large-scale inference workloads.
Artificial intelligence