#distributed-memory

Science
from Nature
2 days ago

Breakthrough computer chip tech could help meet 'monumental demand' driven by AI

A new light source enables the creation of 8 nm wide structures on silicon wafers, increasing transistor density for advanced computer chips.
Tech industry
from ComputerWeekly.com
2 days ago

Marvell scales up networking to extend Nvidia AI ecosystem | Computer Weekly

Marvell Technology joins the Nvidia AI ecosystem to enhance infrastructure development with a $2bn investment.
JavaScript
from PythonSpeed
3 days ago

Timesliced reservoir sampling: a new(?) algorithm for profilers

Random sampling from an unknown-length event stream can effectively identify relevant information without storing all data.
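As a rough illustration of the underlying idea (not the post's timesliced variant), classic reservoir sampling keeps a fixed-size uniform sample of a stream without knowing its length; the function name and stream below are purely illustrative.

```python
import random

def reservoir_sample(stream, k):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Classic "Algorithm R": the i-th item (0-based) replaces a random slot
    with probability k / (i + 1), so every item seen so far stays in the
    sample with equal probability.
    """
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)          # fill the reservoir first
        else:
            j = random.randrange(i + 1)     # uniform index in [0, i]
            if j < k:
                reservoir[j] = item         # replace with probability k/(i+1)
    return reservoir

# Example: sample 5 events from a stream never held in memory at once.
events = (f"event-{n}" for n in range(1_000_000))
print(reservoir_sample(events, k=5))
```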
#ai-infrastructure
Venture
from TechCrunch
3 weeks ago

Thinking Machines Lab inks massive compute deal with Nvidia | TechCrunch

Mira Murati's Thinking Machines Lab signed a multi-year strategic partnership with Nvidia involving at least one gigawatt of Vera Rubin systems deployment starting in 2027, with Nvidia also making a strategic investment in the $12 billion-valued AI research company.
Artificial intelligence
from ComputerWeekly.com
2 weeks ago

HPE taps Nvidia to transform distributed AI factories into intelligent AI grid | Computer Weekly

HPE launches AI Grid infrastructure powered by Nvidia GPUs to enable distributed, low-latency AI inference at edge locations for real-time applications across retail, manufacturing, healthcare, and telecommunications.
#apache-spark
Java
from Medium
2 weeks ago

Spark Internals: Understanding Tungsten (Part 1)

Apache Spark revolutionized big data processing but faces challenges due to JVM memory management and garbage collection issues.
Java
from Medium
2 weeks ago

Spark Internals: Understanding Tungsten (Part 2)

Catalyst Optimizer and Tungsten work together in Apache Spark to optimize data execution and manage raw binary data.
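Not from either article, but a quick, hedged way to see Catalyst and Tungsten at work in your own jobs is to print the physical plan: operators fused into WholeStageCodegen stages run Tungsten-generated code over its compact binary row format. The dataset below is illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("tungsten-demo").getOrCreate()

# Illustrative dataset; any DataFrame works.
df = spark.range(10_000_000).withColumn("bucket", F.col("id") % 16)

# Catalyst plans the query; Tungsten executes the generated code stages.
agg = df.groupBy("bucket").agg(F.count("*").alias("rows"))

# "formatted" mode shows which operators were fused into WholeStageCodegen.
agg.explain(mode="formatted")
```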
Tech industry
from Techzine Global
1 week ago

Arm Launches 136-Core AGI CPU for Data Centers

Arm introduces the Arm AGI CPU, designed for AI data centers, promising significant performance improvements to meet their capacity requirements.
Artificial intelligence
from Theregister
1 week ago

Arm rolls its own 136-core AGI CPU to chase AI hype train

Arm has unveiled its first homegrown silicon, the AGI CPU, designed for artificial general intelligence and set for deployment by Meta.
DevOps
from InfoWorld
1 week ago

An architecture for engineering AI context

AI systems must intelligently manage context to ensure accuracy and reliability in real applications.
Artificial intelligence
from 24/7 Wall St.
1 week ago

NVIDIA's GTC Developments Were Far Bigger Than the Market Realizes

Nvidia's stock remains stagnant despite significant innovations, with uncertainty about future reactions to developments in the AI sector.
Node JS
from InfoWorld
2 weeks ago

Edge.js launched to run Node.js for AI

Edge.js is a WebAssembly-based JavaScript runtime that safely executes Node.js applications with faster startup times by sandboxing workloads through WASIX.
Gadgets
from Theregister
3 weeks ago

Ayar Labs, Wiwynn to cram 1,024 GPUs into photonic system

Ayar Labs and Wiwynn are developing a rack-scale platform using silicon photonics to connect over 1,024 GPUs with significantly lower power consumption than copper-based systems.
Tech industry
from Techzine Global
2 weeks ago

Cisco Silicon One combines uniform chip design with specific deployments

Cisco's Silicon One G300 is a 102.4 terabit networking chip designed for advanced AI data center infrastructure.
Artificial intelligence
from Medium
1 week ago

Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI

Model quantization and architectural optimization can outperform larger models, challenging the belief that more GPUs equal greater intelligence.
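As a toy sketch of the trade-off the article describes (not its method), symmetric int8 quantization stores weights in a quarter of the memory at a small precision cost; the tensor and sizes below are illustrative.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q, with q in [-127, 127]."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)   # a toy weight matrix
q, scale = quantize_int8(w)

print("fp32 bytes:", w.nbytes, "int8 bytes:", q.nbytes)        # 4x smaller
print("max abs error:", np.abs(w - dequantize(q, scale)).max())
```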
Tech industry
from Theregister
2 weeks ago

A closer look at Nvidia's Groq-powered LPX rack systems

Nvidia acquired Groq for $20 billion primarily to accelerate time-to-market for SRAM-heavy inference chips rather than develop the technology independently, enabling faster token generation for AI reasoning workloads.
from InfoQ
4 weeks ago

Read-Copy-Update (RCU): The Secret to Lock-Free Performance

With pthread's rwlock (reader-writer lock) implementation, I got 23.4 million reads in five seconds. With read-copy-update (RCU), I had 49.2 million reads, a one hundred ten percent improvement with zero changes to the workload.
Software development
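Real RCU lives in C (the Linux kernel, liburcu) and also defers memory reclamation until readers finish; as a loose, assumption-laden analogue of the read side only, here is a Python copy-on-write sketch in which readers never lock and a writer publishes a new version with an atomic reference swap. It illustrates the pattern, not the benchmarked implementation.

```python
import threading

class RcuLikeRegistry:
    """Copy-on-write lookup table: readers never lock, writers copy then swap.

    Sketch of the read-copy-update idea only; real RCU additionally defers
    reclamation of the old version until all readers are done with it.
    """
    def __init__(self):
        self._data = {}                        # readers only ever see whole versions
        self._write_lock = threading.Lock()    # serializes writers, never readers

    def read(self, key):
        snapshot = self._data                  # single reference read; no lock taken
        return snapshot.get(key)

    def update(self, key, value):
        with self._write_lock:
            new_version = dict(self._data)     # copy
            new_version[key] = value           # update the copy
            self._data = new_version           # publish: atomic reference swap

registry = RcuLikeRegistry()
registry.update("mode", "fast")
print(registry.read("mode"))
```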
DevOps
from Nextgov.com
3 weeks ago

IBM unveils new hybrid quantum computing architecture

IBM introduces a hybrid quantum-classical computing architecture combining quantum processors with classical CPUs and GPUs to solve complex scientific problems currently beyond reach.
Data science
from TechRepublic
1 month ago

Inside the Gas Engine Strategy Powering AI's Next Wave

Gas reciprocating engines are emerging as a critical power solution for AI data centers, with manufacturers like Caterpillar securing multi-gigawatt orders to meet demand that exceeds grid and turbine capacity.
Tech industry
from Techzine Global
2 weeks ago

Samsung and AMD strengthen collaboration on HBM4 for AI chips

Samsung and AMD expand collaboration to supply HBM4 memory for MI455X accelerators, DDR5 for EPYC processors, and explore foundry partnership for next-generation products.
DevOps
from InfoQ
3 weeks ago

Running Ray at Scale on AKS

Microsoft and Anyscale provide guidance for running a managed Ray service on Azure Kubernetes Service, addressing GPU capacity limits, ML storage challenges, and credential expiry issues through multi-cluster, multi-region deployment strategies.
Artificial intelligence
from Computerworld
2 weeks ago

Nvidia NemoClaw promises to run OpenClaw agents securely

Nvidia introduced NemoClaw with OpenShell security features to address OpenClaw's enterprise security vulnerabilities through sandbox isolation and policy enforcement.
Tech industry
from Computerworld
2 weeks ago

System-level 'coopetition': Why Nvidia's DGX Rubin NVL8 runs on Intel Xeon 6

Nvidia's flagship DGX Rubin NVL8 AI systems use Intel Xeon 6 processors as host CPUs to maintain x86 compatibility and meet enterprise deployment requirements.
Miscellaneous
from DevOps.com
1 month ago

I Learned Traffic Optimization Before I Learned Cloud Computing. It Turns Out the Lessons Were the Same. - DevOps.com

Cloud infrastructure requires understanding system behavior and costs to operate effectively at speed, similar to how skilled drivers anticipate conditions rather than simply driving fast.
DevOps
from InfoQ
3 weeks ago

From Minutes to Seconds: Uber Boosts MySQL Cluster Uptime with Consensus Architecture

Uber redesigned MySQL infrastructure using Group Replication to reduce failover time from minutes to seconds while maintaining strong consistency across thousands of clusters.
DevOps
from Techzine Global
3 weeks ago

Riverlane aims to speed up quantum development by years

Riverlane's quantum error correction roadmap projects fault-tolerant quantum systems arriving in the early 2030s through three generations of 1000x performance increases measured in QuOps.
Tech industry
from Theregister
2 weeks ago

Nvidia slaps Groq into new LPX racks for faster AI response

Nvidia integrates Groq's language processing units into Vera Rubin systems to dramatically accelerate LLM inference, enabling hundreds to thousands of tokens per second per user.
Science
from WIRED
1 month ago

Why Sierra the Supercomputer Had to Die

Sierra, a supercomputer that ran nuclear simulations for seven years at Lawrence Livermore National Laboratory, was decommissioned after becoming obsolete despite once ranking as the world's second-fastest machine.
Artificial intelligence
from InfoWorld
3 weeks ago

Nvidia launches Nemotron 3 Super to power enterprise AI agents

Nemotron 3 Super's hybrid architecture combining Mamba and Transformer technologies enables enterprises to run complex AI agents more efficiently with lower costs and faster execution on existing infrastructure.
DevOps
from InfoWorld
3 weeks ago

5 requirements for using MCP servers to connect AI agents

Organizations deploying MCP servers for agent-to-agent communication must establish upfront strategy, nonfunctional requirements, and security protocols to ensure safer and more trustworthy deployments.
#meta
Artificial intelligence
from ComputerWeekly.com
4 weeks ago

Edge AI: What's working and what isn't | Computer Weekly

Edge AI deployment success depends on identifying efficient, narrow use cases with manageable risks rather than pursuing sophisticated, large-scale models across all applications.
Artificial intelligence
from InfoWorld
1 month ago

Why AI requires rethinking the storage-compute divide

AI workloads require continuous processing of unstructured multimodal data, causing redundant data movement and transformation that wastes infrastructure costs and data scientist time.
DevOps
from TechRepublic
1 month ago

High-Temperature Superconductors Could Redefine Data Center Power Density

High-temperature superconductors can reduce electricity transmission losses and improve grid efficiency to support growing AI data center power demands.
from Techzine Global
2 months ago

DAWN supercomputer gets upgrade and swaps Intel for AMD

The British government is investing heavily in its national computing infrastructure. With an additional investment of approximately $49 million, the DAWN supercomputer at the University of Cambridge is being expanded, according to Neowin. The expansion will increase the system's total computing power by a factor of six, with the aim of enabling researchers and technology companies to compete more effectively with players from the United States and China.
UK politics
Artificial intelligence
from Engadget
1 month ago

AI data centers could reduce power draw on demand, study says

AI data centers can dynamically reduce energy consumption by up to 40% without disrupting critical workloads, enabling grid stability and reducing infrastructure strain.
Data science
from Medium
2 months ago

The Complete Guide to Optimizing Apache Spark Jobs: From Basics to Production-Ready Performance

Optimize Spark jobs by using lazy evaluation awareness, early filter and column pruning, partition pruning, and appropriate join strategies to minimize shuffles and I/O.
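A minimal PySpark sketch of the advice in that summary, assuming illustrative paths and column names: filter and prune early, lean on partition pruning, and broadcast the small side of a join to avoid a shuffle.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-tuning-sketch").getOrCreate()

# Illustrative paths and columns -- substitute your own datasets.
events = spark.read.parquet("/data/events")     # assumed partitioned by event_date
users = spark.read.parquet("/data/users")

optimized = (
    events
    .where(F.col("event_date") == "2025-01-01")             # partition pruning
    .where(F.col("status") == "ok")                         # filter before the join
    .select("user_id", "amount")                            # column pruning
    .join(F.broadcast(users.select("user_id", "country")),  # broadcast join, no shuffle
          on="user_id")
    .groupBy("country")
    .agg(F.sum("amount").alias("total"))
)

optimized.explain()   # verify the pruned, broadcast plan before running
```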
Gadgets
from Theregister
1 month ago

Open Compute taps IOWN to design distributed datacenter

OCP and IOWN will create specifications and an optical communications roadmap to enable a low-latency, high-bandwidth distributed datacenter continuum from centralized to edge.
from Armin Ronacher's Thoughts and Writings
1 month ago

The Final Bottleneck

At that point, backpressure and load shedding are the only things that keep a system in a state where it can still operate. If you have ever been in a Starbucks overwhelmed by mobile orders, you know the feeling. The in-store experience breaks down. You no longer know how many orders are ahead of you. There is no clear line, no reliable wait estimate, and often no real cancellation path unless you escalate and make noise.
Software development
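As a hedged, minimal sketch of the backpressure-then-shed idea (not the author's code), a bounded queue makes producers wait briefly at capacity and rejects work with a clear signal once the wait budget is exhausted.

```python
import queue

# A bounded queue is the simplest backpressure primitive: producers block
# briefly when the system is at capacity, and work is shed instead of letting
# the backlog (and wait times) grow without bound.
orders = queue.Queue(maxsize=100)

def submit_order(order, timeout_s=0.5):
    try:
        orders.put(order, timeout=timeout_s)        # backpressure: wait up to the budget
        return "accepted"
    except queue.Full:
        return "rejected: at capacity, try later"   # load shedding with a clear signal

print(submit_order({"drink": "latte"}))
```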
Artificial intelligence
from Theregister
1 month ago

Fujitsu's 144-core Monaka CPU to use Broadcom's 3D chip tech

Fujitsu's 144-core Monaka CPU uses Broadcom's 3D chip-stacking technology to stack SRAM chiplets on compute dies for enhanced datacenter performance.
Python
from PythonSpeed
1 month ago

Speeding up NumPy with parallelism

Combine CPU-core parallelism and algorithmic optimization (e.g., Numba) to substantially speed up NumPy computations and reduce memory usage.
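A small, hedged example in the same spirit as the article (not its code): a Numba-compiled reduction parallelized across cores with prange, compared against a plain NumPy baseline that materializes a temporary array.

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True, fastmath=True)
def mean_of_squares(x):
    total = 0.0
    for i in prange(x.size):      # prange splits the loop across CPU cores
        total += x[i] * x[i]
    return total / x.size

x = np.random.rand(50_000_000)
print(mean_of_squares(x))         # compiled, parallel, no temporaries
print(np.mean(x * x))             # NumPy baseline allocates x*x first
```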
#spark
from Medium
2 months ago
Software development

How I Fixed a Critical Spark Production Performance Issue (and Cut Runtime by 70%)

from Theregister
1 month ago

AMD's edgiest Epycs get a Zen 5 boost with 84-core Sorano

Sorano will be available with up to 84 Zen 5 cores - up from 64 on Siena - in a power envelope of just 225 watts. AMD isn't ready to spill all the beans on its latest Epyc just yet, but based on core count alone, we surmise the chip will either feature six density-optimized Zen 5c chiplets with 14 of 16 cores enabled or 12 of the frequency-optimized Zen 5 variety with one of the eight cores fused off.
Artificial intelligence
#neuromorphic-computing
Gadgets
from IT Pro
1 month ago

HPE ProLiant Compute DL340 Gen12 review: An appealing alternative to dual-socket Xeon 6 rack servers

HPE's Compute DL340 Gen12 is a 2U single‑socket Intel Xeon ProLiant server offering high core counts, up to 4TB DDR5, and flexible storage options.
Software development
from Medium
1 month ago

The Complete Database Scaling Playbook: From 1 to 10,000 Queries Per Second

Database scaling to 10,000 QPS requires staged architectural strategies timed to traffic thresholds to avoid outages or unnecessary cost.
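One early stage such playbooks typically describe is read/write splitting; here is a hedged sketch with placeholder DSNs that routes writes to the primary and reads round-robin across replicas (connection handling elided).

```python
import itertools

class ReadWriteRouter:
    """Route writes to the primary and reads round-robin across replicas --
    a typical early stage in scaling read-heavy traffic. The DSNs used below
    are placeholders and connection handling is intentionally omitted."""

    def __init__(self, primary_dsn, replica_dsns):
        self.primary = primary_dsn
        self._replicas = itertools.cycle(replica_dsns)

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._replicas) if is_read else self.primary

router = ReadWriteRouter(
    "postgres://primary:5432/app",
    ["postgres://replica-1:5432/app", "postgres://replica-2:5432/app"],
)
print(router.route("SELECT * FROM orders WHERE id = 42"))
print(router.route("UPDATE orders SET status = 'shipped' WHERE id = 42"))
```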
Artificial intelligence
from 24/7 Wall St.
1 month ago

NVIDIA Cements Its Role as the Backbone of AI Infrastructure

NVIDIA's networking revenue grew 162% year-over-year to $8.2 billion, nearly three times the growth rate of its GPU business, signaling a shift from chip seller to integrated infrastructure provider selling complete AI data center systems.
Gadgets
from Theregister
2 months ago

Luggable datacenter: startup straps handles to 4 H200 GPUs

Omnia is a 35 kg portable server with AMD EPYC CPUs, up to four Nvidia H200 GPUs and 6 TB memory for on-site AI.
Software development
from InfoQ
2 months ago

Engineering Speed at Scale - Architectural Lessons from Sub-100-ms APIs

Treat latency as a first-class product concern with enforceable latency budgets, fast-path architecture, and broad ownership through measurement and accountability.
Tech industry
from Theregister
2 months ago

How Nvidia is using emulation to turn AI FLOPS into FP64

Nvidia achieves higher FP64 throughput through software emulation on Rubin GPUs, trading hardware FP64 for emulated matrix performance up to 200 TFLOPS.
from Theregister
1 month ago

Intel greets memory apocalypse with Xeon workstation CPUs

The Xeon 600 lineup spans the gamut between 12 and 86 performance cores (no cut-down efficiency cores here), with support for between four and eight channels of DDR5 and 80 to 128 lanes of PCIe 5.0 connectivity. Compared to its aging W-3500-series chips, Intel is claiming a 9 percent uplift in single threaded workloads and up to 61 percent higher performance in multithreaded jobs, thanks in no small part to an additional 22 processor cores this generation.
Tech industry
from InfoWorld
1 month ago

The 'Super Bowl' standard: Architecting distributed systems for massive concurrency

When I manage infrastructure for major events (whether it is the Olympics, a Premier League match or a season finale) I am dealing with a "thundering herd" problem that few systems ever face. Millions of users log in, browse and hit "play" within the same three-minute window. But this challenge isn't unique to media. It is the same nightmare that keeps e-commerce CTOs awake before Black Friday or financial systems architects up during a market crash. The fundamental problem is always the same: How do you survive when demand exceeds capacity by an order of magnitude?
DevOps
from Medium
2 months ago

How Fiber Networks Support Edge Computing

Edge computing is a type of IT infrastructure in which data is collected, stored, and processed near the "edge" or on the device itself instead of being transmitted to a centralized processor. Edge computing systems usually involve a network of devices, sensors, or machinery capable of data processing and interconnection. A main benefit of edge computing is its low latency. Since each endpoint processes information near the source, it can be easier to process data, respond to requests, and produce detailed analytics.
Tech industry
Tech industry
from Theregister
1 month ago

Microsoft touts immature HTS tech for datacenter efficiency

High-temperature superconducting (HTS) power delivery can reduce datacenter power losses, increase electrical density, and save space compared with copper or aluminum wiring.
Tech industry
from InfoQ
2 months ago

Uber Moves from Static Limits to Priority-Aware Load Control for Distributed Storage

Priority-aware, colocated load management with CoDel and per-tenant Scorecard protects stateful multi-tenant databases by prioritizing critical traffic and adapting dynamically to prevent overloads.
from Cointelegraph
2 months ago

What Role Is Left for Decentralized GPU Networks in AI?

What we are beginning to see is that many open-source and other models are becoming compact enough and sufficiently optimized to run very efficiently on consumer GPUs.
Artificial intelligence
Tech industry
from Engadget
2 months ago

AMD's Ryzen AI 400 chips are a big boost for laptops and desktops alike

AMD's Ryzen AI 400 desktop processors deliver modest CPU and NPU performance gains and introduce Copilot+ for desktops, but AI PC features remain underwhelming.
from InfoQ
2 months ago

NVIDIA Dynamo Planner Brings SLO-Driven Automation to Multi-Node LLM Inference

The new capabilities center on two integrated components: the Dynamo Planner Profiler and the SLO-based Dynamo Planner. Together they address the "rate matching" challenge in disaggregated serving, where inference workloads are split so that prefill operations, which process the input context, run on a different GPU pool from the decode operations that generate output tokens. Without the right tools, teams spend a lot of time determining the optimal GPU allocation for these phases.
Artificial intelligence
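A rough, back-of-envelope sketch of the rate-matching idea, with invented placeholder throughputs rather than Nvidia's numbers: size the prefill and decode pools so both sustain roughly the same request rate.

```python
def split_gpus(total_gpus, prefill_tok_s_per_gpu, decode_tok_s_per_gpu,
               avg_prompt_tokens, avg_output_tokens):
    """Rough rate matching: choose the prefill/decode split so both pools
    sustain about the same request rate. All throughputs are placeholders."""
    # Requests/sec a single GPU can sustain in each role.
    prefill_rps = prefill_tok_s_per_gpu / avg_prompt_tokens
    decode_rps = decode_tok_s_per_gpu / avg_output_tokens
    # Allocate GPUs inversely to per-GPU request throughput so rates balance.
    prefill_share = decode_rps / (prefill_rps + decode_rps)
    prefill_gpus = max(1, round(total_gpus * prefill_share))
    return prefill_gpus, total_gpus - prefill_gpus

# Invented example numbers purely to show the arithmetic.
print(split_gpus(total_gpus=16,
                 prefill_tok_s_per_gpu=40_000, decode_tok_s_per_gpu=3_000,
                 avg_prompt_tokens=2_000, avg_output_tokens=300))
```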
Tech industry
from Theregister
1 month ago

Server CPUs join memory crunch, with prices set to rise

Datacenter servers face CPU supply constraints atop severe memory shortages, raising system costs while shipments still grow at double-digit rates.
Artificial intelligence
from InfoWorld
2 months ago

Edge AI: The future of AI inference is smarter local compute

Edge AI shifts computation from cloud to devices, enabling low-latency, cost-efficient, and privacy-preserving AI inference while facing performance and ecosystem challenges.
Tech industry
from Theregister
1 month ago

Oxide plans new rack attack with Zen 5 CPUs, DDR5

Oxide Computer raised $200M Series C to upgrade rack-scale servers with AMD Turin (Zen 5) blades, DDR5 6400 MT/s memory, and higher networking capacity.
Artificial intelligence
from InfoQ
2 months ago

Autonomous Big Data Optimization: Multi-Agent Reinforcement Learning to Achieve Self-Tuning Apache Spark

A Q-learning agent autonomously learns and generalizes optimal Spark configurations by discretizing dataset features and combining with Adaptive Query Execution for superior performance.
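A heavily simplified toy of the general approach, not the paper's agent: tabular Q-learning over discretized dataset features choosing a shuffle-partition setting, with a stand-in reward in place of measured job runtime. Everything below is illustrative.

```python
import random
from collections import defaultdict

ACTIONS = [64, 200, 512, 1024]        # candidate spark.sql.shuffle.partitions values
ALPHA, EPSILON = 0.3, 0.2             # learning rate and exploration probability

Q = defaultdict(float)                # Q[(state, action)] -> estimated value

def discretize(rows, avg_row_bytes):
    """State = coarse bucket of input size (the 'discretized dataset features')."""
    size_gb = rows * avg_row_bytes / 1e9
    return "small" if size_gb < 1 else "medium" if size_gb < 50 else "large"

def choose(state):
    if random.random() < EPSILON:                        # explore
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])     # exploit

def run_job(state, partitions):
    """Placeholder for launching the Spark job and measuring its runtime."""
    base = {"small": 60, "medium": 300, "large": 1200}[state]
    return base * (1 + abs(partitions - 512) / 1024) * random.uniform(0.9, 1.1)

for _ in range(200):                                     # simulated tuning episodes
    state = discretize(rows=random.choice([1e6, 1e8, 1e10]), avg_row_bytes=200)
    action = choose(state)
    reward = -run_job(state, action)                     # faster job => higher reward
    Q[(state, action)] += ALPHA * (reward - Q[(state, action)])

print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in ("small", "medium", "large")})
```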
from Theregister
2 months ago

Unpacking AMD's latest datacenter CPU and GPU announcements

AMD clarified those estimates are based on a comparison between an eight-GPU MI300X node and an MI500 rack system with an unspecified number of GPUs. The math works out to eight MI300Xs that are 1000x less powerful than X-number of MI500Xs. And since we know essentially nothing about the chip besides that it'll ship in 2027, pair TSMC's 2nm process tech with AMD's CDNA 6 compute architecture, and use HBM4e memory, we can't even begin to estimate what that 1000x claim actually means.
Artificial intelligence
Tech industry
from Techzine Global
2 months ago

Samsung nears Nvidia approval for HBM4 memory

Samsung is nearing Nvidia approval and February mass production for HBM4 AI memory, narrowing the gap with SK Hynix amid an AI-driven memory shortage.
from Computerworld
1 month ago

Intel sets sights on data center GPUs amid AI-driven infrastructure shifts

Intel is making a new push into GPUs, this time with a focus on data center workloads, as the chipmaker looks to reestablish itself in a market increasingly shaped by AI-driven demand and dominated by Nvidia. CEO Lip-Bu Tan said that after hiring a senior GPU architect, the company is working directly with customers to define requirements, signaling a more demand-driven approach as enterprises and cloud providers weigh their options for accelerated computing, according to a Reuters report.
Artificial intelligence
from InfoWorld
1 month ago

Databricks adds MemAlign to MLflow to cut cost and latency of LLM evaluation

By replacing repeated fine‑tuning with a dual‑memory system, MemAlign reduces the cost and instability of training LLM judges, offering faster adaptation to new domains and changing business policies. Databricks' Mosaic AI Research team has added a new framework, MemAlign, to MLflow, its managed machine learning and generative AI lifecycle development service. MemAlign is designed to help enterprises lower the cost and latency of training LLM-based judges, in turn making AI evaluation scalable and trustworthy enough for production deployments.
Artificial intelligence
Artificial intelligence
from InfoQ
2 months ago

Google's Eight Essential Multi-Agent Design Patterns

Multi-agent system design relies on decentralization and specialization using eight core patterns to build modular, scalable, and reliable agentic applications.
from Theregister
1 month ago

Positron opts for laptop RAM over HBM to take on Nvidia

On paper, Positron's next-gen Asimov accelerators, no doubt named for the beloved science fiction author, don't look like much of a match for Nvidia's Rubin GPUs. Yet, the Arm-backed AI startup boasts its inference chip will churn out five times as many tokens per dollar while using one-fifth the power of Nvidia's latest accelerators to do it. Those are certainly some bold claims, which the company contends are possible because the chip was designed to support large-scale inference workloads.
Artificial intelligence
Artificial intelligence
from InfoQ
2 months ago

LangGrant Unveils LEDGE MCP Server to Enable Agentic AI on Enterprise Databases

LEDGE MCP Server enables LLMs to generate multi-step analytics across enterprise databases securely without exposing raw data, reducing token costs and preserving governance.
Artificial intelligence
from Theregister
2 months ago

Nvidia says DGX Spark is now 2.5x faster than at launch

Nvidia's DGX Spark and GB10 systems gain significant software-driven performance improvements and broader software integrations, boosting prefill compute performance for genAI workflows.
Artificial intelligence
from InfoQ
2 months ago

Intel DeepMath Introduces a Smart Architecture to Make LLMs Better at Math

DeepMath uses a Qwen3-4B Thinking agent that emits small Python executors for intermediate math steps, improving accuracy and significantly reducing output length.