Data science
from Computerworld
2 days ago
AI project 'failure' has little to do with AI
The reliability of genAI is compromised by various factors, necessitating independent verification of its outputs.
Lydia noticed the machine's battery was running low and told two other team members. The more senior went to fetch the backup battery, while the junior team member suggested a quicker method that Lydia firmly rejected.
Capacity Planning is the process of right-sizing the 'Total Project Demand' with the forecasted Team Capacity. Most UX teams have no idea what their capacity is. Fewer still have a process for calculating it and using it during quarterly planning activities with their counterparts in Product Management & Engineering to ensure teams don't commit to more work than they can handle.
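As a rough illustration of what "right-sizing" demand against capacity could look like in practice, here is a minimal sketch. The `Designer` fields, the focus factor, the project names, and every number are hypothetical and not taken from the article; the greedy, priority-ordered commit rule is one simple choice among many.

```python
from dataclasses import dataclass


@dataclass
class Designer:
    name: str
    days_available: float  # working days in the quarter after leave and holidays (assumed)
    focus_factor: float    # share of time left for project work (assumed)


def team_capacity(team):
    """Forecast capacity in person-days for the planning period."""
    return sum(d.days_available * d.focus_factor for d in team)


def plan(projects, capacity):
    """Commit projects in priority order while they still fit within capacity."""
    committed, deferred, remaining = [], [], capacity
    for name, days in projects:
        if days <= remaining:
            committed.append(name)
            remaining -= days
        else:
            deferred.append(name)
    return committed, deferred, remaining


# Hypothetical team and demand, listed in priority order.
team = [Designer("A", 60, 0.7), Designer("B", 50, 0.6)]
projects = [
    ("Checkout redesign", 40),
    ("Onboarding research", 25),
    ("Design system audit", 20),
]

capacity = team_capacity(team)
committed, deferred, remaining = plan(projects, capacity)
print(f"capacity={capacity:.0f} committed={committed} deferred={deferred}")
```

With these made-up figures, total demand (85 person-days) exceeds forecast capacity (about 72), so the lowest-priority project is deferred rather than committed.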
Most of these companies start the journey from a functional standpoint, avoiding extra layers that may "divert users' attention", such as refined flows, potential edge cases, and, sometimes, proper visual design foundations and user experience. Here, the goal is to ship the product first to validate its value, then address other considerations.
Her payment form wasn't connecting to the payment processor, and every attempt ended in an error message that made no sense. I understood her frustration. As a founder myself, I was acutely aware of the pain of trying to run a business and feeling like nothing was going your way. When I dug into her form, I found the problem a few minutes later: a mismatch between test mode and live credentials.
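A mismatch like this can often be caught before any request is sent. The sketch below assumes a processor whose keys embed a `_test_` or `_live_` marker, a convention some payment providers use; the article does not name the processor, and the key values and function names here are hypothetical.

```python
def key_mode(key):
    """Guess the mode of a key from an assumed '_test_' / '_live_' marker."""
    if "_test_" in key:
        return "test"
    if "_live_" in key:
        return "live"
    return "unknown"


def check_credentials(publishable_key, secret_key, expected_mode):
    """Return a list of human-readable problems; an empty list means the keys agree."""
    problems = []
    for label, key in (("publishable", publishable_key), ("secret", secret_key)):
        mode = key_mode(key)
        if mode != expected_mode:
            problems.append(f"{label} key is in {mode} mode, expected {expected_mode}")
    return problems


# Hypothetical placeholder keys: a live form wired to a test-mode secret key.
issues = check_credentials("pk_live_EXAMPLE", "sk_test_EXAMPLE", "live")
for issue in issues:
    print(issue)
```

Running a check of this kind at startup turns an opaque runtime error into a message that names the misconfigured key.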
"I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last ~year and a failure to claim the boost feels decidedly like skill issue."
During my eight years working in agile product development, I have watched sprints move quickly while real understanding of user problems lagged. Backlogs fill with paraphrased feedback. Interview notes sit in shared folders collecting dust. Teams make decisions based on partial memories of what users actually said. Even when the code is clean, those habits slow delivery and make it harder to build software that genuinely helps people.
For decades, the to-do list has been a catalog of debt, a deceptively thin list of items to do, with icebergs of work hidden beneath the surface. AI transforms tasks to work that has already been done. Vibe Kanban, Gastown, & Conductor are the first instantiations of this for software developers. They have jargon-laden descriptions like "multi-agent orchestrator" or "visualizer," but they are, at heart, simple & beautiful Kanban boards of done & dusted work.
Scrum has a bad reputation in some organizations. In many cases, this is because teams did something they called Scrum, it didn't work, and Scrum took the blame. To counter this, when working with organizations, we like to define a small set of rules a team must follow if they want to say they're doing Scrum. Enforcing this policy helps prevent Scrum from being blamed for Scrum-like failures.
Your AI pilot showed 94% accuracy improvements. The LLM is yielding solid results. You're getting defunded anyway. The reason? You solved a problem AI can solve. Your budget-holder needed you to solve theirs. Companies launch AI pilots that produce results, then stall at scale. The team's diagnosis: "They don't get it." What's really going on: These projects never earned budget-holder buy-in.
Much of the conversation about how to work effectively with generative AI has focused on prompt engineering or, more recently, context engineering: the semi-technical skill of crafting inputs so that large language models produce useful outputs. These skills are helpful, but they are only part of the story.
To find the typical example, just observe an average stand-up meeting. The ones who talk more get all the attention. In her article, software engineer Priyanka Jain tells the story of two colleagues assigned the same task. One posted updates, asked questions, and collaborated loudly. The other stayed silent and shipped clean code. Both delivered. Yet only one was praised as a "great team player."
This extends to the software development community, where AI coding assistants are now nearly ubiquitous as teams face pressure to generate more output in less time. While the sharp gains in efficiency help them, these teams too often fail to build adequate safety controls and practices into their AI deployments. The resulting risks leave their organizations exposed, and developers will struggle to trace where, and how, a security gap occurred.
A secure software development life cycle means baking security into plan, design, build, test, and maintenance, rather than sprinkling it on at the end, Sara Martinez said in her talk Ensuring Software Security at Online TestConf. Testers aren't bug finders but early defenders, building security and quality in from the first sprint. Culture first, automation second, continuous testing and monitoring all the way; that's how you make security a habit instead of a fire drill, she argued.
Hast mentioned that they trust their unit tests and integration tests individually, and all of them together as a whole. They have no end-to-end tests: "We achieved this by using good separation of concerns, modularity, abstraction, low coupling, and high cohesion. These mechanisms go hand in hand with TDD and pair programming. The result is a better domain-driven design with high code quality." Previously, they had more HTTP application integration tests that tested the whole app, but they have moved away from this (or just have some happy cases) to more focused tests that have shorter feedback loops, Hast mentioned.
Only the engineers who work on a large software system can meaningfully participate in the design process. That's because you cannot do good software design without an intimate understanding of the concrete details of the system.
Generic software design
What is generic software design? It's "designing to the problem": the kind of advice you give when you have a reasonable understanding of the domain, but very little knowledge of the existing codebase.