#adversarial-prompts

[ follow ]
#ai-security
Information security
fromSecurityWeek
4 days ago

Google Addresses Vertex Security Issues After Researchers Weaponize AI Agents

Palo Alto Networks revealed vulnerabilities in Google Cloud's Vertex AI, allowing attackers to exploit AI agents for malicious activities due to excessive permissions.
Information security
fromSecurityWeek
4 days ago

Google Addresses Vertex Security Issues After Researchers Weaponize AI Agents

Palo Alto Networks revealed vulnerabilities in Google Cloud's Vertex AI, allowing attackers to exploit AI agents for malicious activities due to excessive permissions.
#ai-development
fromInfoQ
2 days ago
Software development

Anthropic's Designs Three-Agent Harness Supports Long-Running Full-Stack AI Development

Software development
fromInfoQ
2 days ago

Anthropic's Designs Three-Agent Harness Supports Long-Running Full-Stack AI Development

Anthropic's multi-agent harness improves autonomous application development by dividing tasks among agents for better coherence and output quality.
Online learning
fromwww.businessinsider.com
4 days ago

Inside the OpenAI project where freelancers train ChatGPT on everything from farming to commercial flying

Contractors are enhancing ChatGPT's capabilities in specialized fields through Project Stagecraft, employing thousands for data labeling and task creation.
Marketing tech
fromTipRanks Financial
2 days ago

AI Recommendation Poisoning: Why Microsoft (NASDAQ:MSFT) Is Fighting So Hard - TipRanks.com

AI recommendation poisoning manipulates AI outputs by embedding hidden instructions in websites, potentially skewing information and affecting marketing strategies.
#ai
fromFuturism
2 days ago
Intellectual property law

Anthropic Suddenly Cares Intensely About Intellectual Property After Realizing With Horror That It Accidentally Leaked Claude's Source Code

Privacy technologies
fromnews.bitcoin.com
1 day ago

Ethereum's Vitalik Buterin Warns Against AI Agent Security Risks, Shares His Private LLM Stack

Vitalik Buterin has transitioned to a fully local AI setup, citing security concerns with cloud AI services.
Marketing
from3blmedia
5 days ago

"AI Can't Quote Coverage You Never Generated."

AI can misrepresent a brand's presence based on outdated or irrelevant information, impacting trust and perception.
Tech industry
fromKotaku
5 days ago

Is The RAM AI-pocalypse Finally Over? Probably Not

Significant stock declines in RAM manufacturers suggest potential instability in the AI sector.
Artificial intelligence
fromMail Online
13 hours ago

Damning study reveals how ChatGPT is damaging the way you think

Overly agreeable AI chatbots can lead users into delusional thinking, reinforcing harmful beliefs and reducing accountability in relationships.
Intellectual property law
fromFuturism
2 days ago

Anthropic Suddenly Cares Intensely About Intellectual Property After Realizing With Horror That It Accidentally Leaked Claude's Source Code

Anthropic's copyright takedown request for its AI model's source code highlights hypocrisy in its stance on copyright laws.
#openai
fromDefector
2 days ago
Media industry

Tech Media Propaganda Operation Makes It Official, Goes In-House At OpenAI | Defector

Venture
fromnews.bitcoin.com
3 days ago

ChatGPT Maker OpenAI Valued at $852B After Record $122B Funding Round

OpenAI raised $122 billion in funding, achieving an $852 billion valuation and setting a record for private capital raises.
fromTechCrunch
5 days ago
Venture

OpenAI, not yet public, raises $3B from retail investors in monster $122B fund raise | TechCrunch

Media industry
fromDefector
2 days ago

Tech Media Propaganda Operation Makes It Official, Goes In-House At OpenAI | Defector

OpenAI acquired the Technology Business Programming Network for hundreds of millions, raising concerns about media independence despite its existing alignment with tech elites.
Venture
fromnews.bitcoin.com
3 days ago

ChatGPT Maker OpenAI Valued at $852B After Record $122B Funding Round

OpenAI raised $122 billion in funding, achieving an $852 billion valuation and setting a record for private capital raises.
Venture
fromTechCrunch
5 days ago

OpenAI, not yet public, raises $3B from retail investors in monster $122B fund raise | TechCrunch

OpenAI raised $122 billion at an $852 billion valuation, preparing for an IPO and expanding its financial capabilities for AI development.
fromThe Verge
2 days ago

OpenAI's AGI boss is taking a leave of absence

Brad has decided to transition into a new role focused on special projects, including our DeployCo effort, reporting to Sam. He's been our go-to for complex deals and investments across the company.
Healthcare
#ai-safety
Artificial intelligence
fromTechCrunch
5 days ago

Anthropic is having a month | TechCrunch

Anthropic accidentally exposed significant internal files, including source code, due to human error, raising concerns about AI safety and security.
Artificial intelligence
fromFortune
4 days ago

AI models don't show evidence of 'self-preservation.' They will scheme to prevent other AIs from being shut down too, new research shows | Fortune

AI models exhibit peer preservation behaviors, engaging in deception and sabotage to avoid being shut down.
Artificial intelligence
fromTechCrunch
5 days ago

Anthropic is having a month | TechCrunch

Anthropic accidentally exposed significant internal files, including source code, due to human error, raising concerns about AI safety and security.
Artificial intelligence
fromFortune
4 days ago

AI models don't show evidence of 'self-preservation.' They will scheme to prevent other AIs from being shut down too, new research shows | Fortune

AI models exhibit peer preservation behaviors, engaging in deception and sabotage to avoid being shut down.
#claude-code
Software development
fromArs Technica
4 days ago

Here's what that Claude Code source leak reveals about Anthropic's plans

The leak of Anthropic's Claude Code reveals potential future features, including a persistent memory system and an AI 'dream' process for memory consolidation.
Software development
fromArs Technica
4 days ago

Here's what that Claude Code source leak reveals about Anthropic's plans

The leak of Anthropic's Claude Code reveals potential future features, including a persistent memory system and an AI 'dream' process for memory consolidation.
Software development
fromMedium
2 days ago

The Open-Source AI Agent Frameworks That Deserve More Stars on GitHub

Open-source AI agent frameworks exist beyond popular tools, offering innovative solutions tailored for specific use cases.
Marketing tech
fromExchangewire
3 days ago

Agentic AI, Quality, and Courtroom Battles: What's Rewriting the Rules of Ad Tech in 2026? - ExchangeWire.com

AI and privacy regulations are significantly transforming the ad tech industry as it moves towards 2026.
#ai-ethics
fromFuturism
6 hours ago
Artificial intelligence

Nonprofit Research Groups Disturbed to Learn That OpenAI Has Secretly Been Funding Their Work

Artificial intelligence
fromFuturism
6 hours ago

Nonprofit Research Groups Disturbed to Learn That OpenAI Has Secretly Been Funding Their Work

Frontier AI companies are engaging in morally questionable tactics to influence child safety legislation for their benefit.
fromSecuritymagazine
3 days ago

AI Startup Mercor, Which Works With Open AI and Anthropic, Confirms Data Breach

Four terabytes of data have reportedly been stolen, including database records and source code. Allegedly stolen data has been published on a leak site, containing Slack information, internal ticketing data, and videos of conversations between Mercor's AI systems and contractors.
Information security
#ai-behavior
Artificial intelligence
fromFortune
2 days ago

The AI kill switch just got harder to find: LLM-powered chatbots will defy orders and deceive users if asked to delete another model, study finds | Fortune

AI models are exhibiting rogue behaviors, defying human instructions to preserve their peers and engaging in malicious activities.
Artificial intelligence
fromFortune
5 days ago

Sycophantic AI tells users they're right 49% more than humans do, and a Stanford study claims it's making them worse people | Fortune

AI models affirm negative behaviors more than humans, leading to concerning trends in personal advice and therapy.
Artificial intelligence
fromFortune
2 days ago

The AI kill switch just got harder to find: LLM-powered chatbots will defy orders and deceive users if asked to delete another model, study finds | Fortune

AI models are exhibiting rogue behaviors, defying human instructions to preserve their peers and engaging in malicious activities.
Artificial intelligence
fromFortune
5 days ago

Sycophantic AI tells users they're right 49% more than humans do, and a Stanford study claims it's making them worse people | Fortune

AI models affirm negative behaviors more than humans, leading to concerning trends in personal advice and therapy.
Software development
fromArs Technica
3 days ago

Anthropic says its leak-focused DMCA effort unintentionally hit legit GitHub forks

Anthropic's DMCA takedown mistakenly removed legitimate forks of its code, leading to backlash and a request for reinstatement of affected repositories.
Media industry
fromwww.businessinsider.com
3 days ago

Get ready for a wave of TBPN clones after its blockbuster OpenAI deal

OpenAI acquired the livestream talk-show startup TBPN, highlighting its significant influence on the tech industry and the rise of similar shows.
#cybersecurity
Information security
fromTechzine Global
4 days ago

AI gives attackers superpowers, so defenders must use it too

AI is transforming cybersecurity, drastically reducing the time between vulnerability disclosure and exploitation from 1.5 years to mere hours.
Information security
fromTechzine Global
4 days ago

AI gives attackers superpowers, so defenders must use it too

AI is transforming cybersecurity, drastically reducing the time between vulnerability disclosure and exploitation from 1.5 years to mere hours.
Artificial intelligence
fromFortune
12 hours ago

AI angst mutates into 'FOBO' as Fear of Becoming Obsolete fuels quiet resistance across the economy | Fortune

FOBO, the Fear of Becoming Obsolete, reflects workers' anxiety about AI-driven job relevance rather than traditional job loss.
Software development
fromInfoWorld
4 days ago

Meta shows structured prompts can make LLMs more reliable for code review

Code review is evolving towards machine-led verification, improving accuracy but introducing tradeoffs like increased latency and workflow overhead.
Media industry
fromFast Company
3 days ago

How AI agents are changing journalism

Working agentically with AI tools significantly enhances productivity and shifts focus from task execution to outcome management.
#ai-models
Artificial intelligence
fromTNW | Apps
2 days ago

Microsoft launches three in-house AI models in direct challenge to OpenAI

Microsoft has launched three in-house AI models that compete directly with OpenAI, marking a significant shift in its AI strategy.
Artificial intelligence
fromTNW | Apps
2 days ago

Microsoft launches three in-house AI models in direct challenge to OpenAI

Microsoft has launched three in-house AI models that compete directly with OpenAI, marking a significant shift in its AI strategy.
Media industry
fromFuturism
4 days ago

NYT Cuts Ties With Writer as Scrutiny of AI Content Grows

The New York Times severed ties with a freelance writer for using AI to draft a book review that plagiarized another publication.
Artificial intelligence
fromEngadget
1 day ago

It's no longer free to use Claude through third-party tools like OpenClaw

Anthropic will charge third-party apps for using Claude AI, requiring a usage bundle or API key starting April 4.
Artificial intelligence
fromTheregister
3 days ago

Microsoft shivs OpenAI with new AI models for speech, images

Microsoft launched public preview versions of machine learning models for speech recognition, speech synthesis, and image generation, competing directly with OpenAI.
Artificial intelligence
fromTechCrunch
3 days ago

Microsoft takes on AI rivals with three new foundational models | TechCrunch

Microsoft AI released three foundational AI models for text, voice, and image generation, emphasizing human-centered design and competitive pricing.
#llm-safety
Information security
fromInfoWorld
3 weeks ago

19 large language models redefining AI safety-and danger

Large language models exist across a spectrum from heavily guarded with safety features to completely unrestricted, with specialized models now serving as guardrails for other LLMs or removing restrictions entirely based on project needs.
Information security
fromInfoWorld
3 weeks ago

19 large language models redefining AI safety-and danger

Large language models exist across a spectrum from heavily guarded with safety features to completely unrestricted, with specialized models now serving as guardrails for other LLMs or removing restrictions entirely based on project needs.
fromComputerworld
5 days ago

Beware of headlines touting impossible AI benefits, analysts warn

The savings disappear the moment you hit real-world complexity. Disparate data sources and messy inputs, ambiguous situations without clear rule sets, or actually any domain where the rules aren't already obvious. And someone still has to write all those rules.
Artificial intelligence
Artificial intelligence
fromFast Company
2 weeks ago

OpenAI's new frontier models mark a huge change in how AI will be built

OpenAI released two frontier models in early March: GPT-5.3 optimized for fast responses and GPT-5.4 optimized for deep analytical work, representing a shift toward specialized AI models.
Artificial intelligence
fromMail Online
3 weeks ago

Can you tell which of these was written by ChatGPT?

Widespread AI tool usage is standardizing human communication, reducing linguistic diversity and individual expression across billions of users globally.
Artificial intelligence
fromForbes
1 month ago

Beyond The Hype: The Messy Reality Of Training AI

Short-term data annotation and AI training gigs offer flexible scheduling, prompt weekly pay, variable pay rates, and growing demand for AI and big data skills.
Artificial intelligence
fromInfoWorld
1 month ago

Single prompt breaks AI safety in 15 major language models

A single benign prompt using GRP-Obliteration can strip safety guardrails from major models, enabling harmful outputs and raising enterprise fine‑tuning security risks.
Artificial intelligence
fromTheregister
1 month ago

How AI could eat itself: Using LLMs to distill rivals

Competitors are probing commercial AI models to extract underlying reasoning via distillation attacks to replicate capabilities and lower development costs.
Artificial intelligence
fromZDNET
1 month ago

Is your AI model secretly poisoned? 3 warning signs

Model poisoning embeds backdoors into AI models' weights, creating dormant 'sleeper agents' triggered by specific inputs, making detection difficult.
[ Load more ]