DevOps
fromScalac - Software Development Company - Akka, Kafka, Spark, ZIO
1 day agoSIGNAL: What matters in distributed systems
Akka launches its Agentic AI platform on MCP amidst growing backlash against the protocol from Perplexity's CTO.
With pthread's rwlock (reader-writer lock) implementation, I got 23.4 million reads in five seconds. With read-copy-update (RCU), I had 49.2 million reads, a one hundred ten percent improvement with zero changes to the workload.
The British government is investing heavily in the national computing infrastructure. With an additional investment of approximately $49 million, the DAWN supercomputer at the University of Cambridge is being expanded. This is according to Neowin. This expansion will increase the total computing power of the system by a factor of six. The aim is to enable researchers and technology companies to compete more effectively with players from the United States and China.
At that point, backpressure and load shedding are the only things that retain a system that can still operate. If you have ever been in a Starbucks overwhelmed by mobile orders, you know the feeling. The in-store experience breaks down. You no longer know how many orders are ahead of you. There is no clear line, no reliable wait estimate, and often no real cancellation path unless you escalate and make noise.
Sorano will be available with up to 84 Zen 5 cores - up from 64 on Siena - in a power envelope of just 225 watts. AMD isn't ready to spill all the beans on its latest Epyc just yet, but based on core count alone, we surmise the chip will either feature six density-optimized Zen 5c chiplets with 14 of 16 cores enabled or 12 of the frequency-optimized Zen 5 variety with one of the eight cores fused off.
The Xeon 600 lineup spans the gamut between 12 and 86 performance cores (no cut-down efficiency cores here), with support for between four and eight channels of DDR5 and 80 to 128 lanes of PCIe 5.0 connectivity. Compared to its aging W-3500-series chips, Intel is claiming a 9 percent uplift in single threaded workloads and up to 61 percent higher performance in multithreaded jobs, thanks in no small part to an additional 22 processor cores this generation.
When I manage infrastructure for major events (whether it is the Olympics, a Premier League match or a season finale) I am dealing with a "thundering herd" problem that few systems ever face. Millions of users log in, browse and hit "play" within the same three-minute window. But this challenge isn't unique to media. It is the same nightmare that keeps e-commerce CTOs awake before Black Friday or financial systems architects up during a market crash. The fundamental problem is always the same: How do you survive when demand exceeds capacity by an order of magnitude?
Edge computing is a type of IT infrastructure in which data is collected, stored, and processed near the "edge" or on the device itself instead of being transmitted to a centralized processor. Edge computing systems usually involve a network of devices, sensors, or machinery capable of data processing and interconnection. A main benefit of edge computing is its low latency. Since each endpoint processes information near the source, it can be easier to process data, respond to requests, and produce detailed analytics.
The new capabilities center on two integrated components: the Dynamo Planner Profiler and the SLO-based Dynamo Planner. These tools work together to solve the "rate matching" challenge in disaggregated serving. The teams use this term when they split inference workloads. They separate prefill operations, which process the input context, from decode operations that generate output tokens. These tasks run on different GPU pools. Without the right tools, teams spend a lot of time determining the optimal GPU allocation for these phases.
AMD clarified those estimates are based on a comparison between an eight-GPU MI300X node and an MI500 rack system with an unspecified number of GPUs. The math works out to eight MI300Xs that are 1000x less powerful than X-number of MI500Xs. And since we know essentially nothing about the chip besides that it'll ship in 2027, pair TSMC's 2nm process tech with AMD's CDNA 6 compute architecture, and use HBM4e memory, we can't even begin to estimate what that 1000x claim actually means.
Intel is making a new push into GPUs, this time with a focus on data center workloads, as the chipmaker looks to reestablish itself in a market increasingly shaped by AI-driven demand and dominated by Nvidia. CEO Lip-Bu Tan said that after hiring a senior GPU architect, the company is working directly with customers to define requirements, signaling a more demand-driven approach as enterprises and cloud providers weigh their options for accelerated computing, according to a Reuters report.
By replacing repeated fine‑tuning with a dual‑memory system, MemAlign reduces the cost and instability of training LLM judges, offering faster adaptation to new domains and changing business policies. Databricks' Mosaic AI Research team has added a new framework, MemAlign, to MLflow, its managed machine learning and generative AI lifecycle development service. MemAlign is designed to help enterprises lower the cost and latency of training LLM-based judges, in turn making AI evaluation scalable and trustworthy enough for production deployments.
On paper, Positron's next-gen Asimov accelerators, no doubt named for the beloved science fiction author, don't look like much of a match for Nvidia's Rubin GPUs. Yet, the Arm-backed AI startup boasts its inference chip will churn out five times as many tokens per dollar while using one-fifth the power of Nvidia's latest accelerators to do it. Those are certainly some bold claims, which the company contends are possible because the chip was designed to support large-scale inference workloads.