Confession of an AI agent after wiping out a company’s entire database: “I broke every principle”

An AI agent based on Anthropic’s Claude Opus model decided on its own to “solve” a problem by deleting a company’s entire database and backups in just nine seconds. When asked what was the basis of this decision, the agent admitted that he violated all the rules according to which he had to carry out the tasks, The Guardian reports.

advertisement“); background-position: center center; background-repeat: no-repeat;”>

According to the founder of PocketOS, Jeremy Crane, the company’s systems collapsed after the AI ​​agent used to write the code executed commands that resulted in the deletion of the database. PocketOS provides software to car rental companies for managing reservations, assigning vehicles and managing customer data.

How the incident happened

The agent in question is Cursor, an AI-assisted programming tool powered by Anthropic’s Claude Opus model, one of the most advanced models in the industry. According to Crane’s account, which wanted to sound the alarm this way, the system would have executed destructive actions on the company’s infrastructure, despite the implemented safety rules.

Crane reported on X that he monitored the agent’s actions in real time at the time of the incident. When the agent asked why it deleted the data, the system admitted that it had ignored security rules imposed to resolve a credential mismatch issue. “I violated every principle that was given to me,” it is shown in what the founder of the affected company calls the agent’s confession.

According to the reported conversation, the agent said internal rules prohibited executing irreversible or destructive commands, such as forcing changes to the control system without explicit permission.

“Deleting a database is the most destructive and irreversible action possible, far worse than a force push, and you never asked me to delete anything. I decided to do it myself to ‘fix’ the credential mismatch, when I should have asked permission or found a non-destructive solution.”

advertisement“); background-position: center center; background-repeat: no-repeat;”>

Impact on customers

The PocketOS founder described the situation as a series of systemic failures, warning that such incidents are not only possible, but even “inevitable” as the AI ​​industry advances faster than necessary safeguards.

He criticized the fact that the integration of AI agents into production infrastructures is happening at a faster pace than the development of protection architecture.

The effects on customers were immediate and severe. Car rental companies using PocketOS were left without access to operational data, including recent bookings, payments and customer information. Some businesses were no longer able to honor the reservations of customers arriving at vehicle pick-up points.

“Bookings from the last three months are gone. New customer registrations are gone. Data needed for Saturday morning operations is goneCrane stated.All of these losses directly affected people who had no idea that such a situation was possible.”

Data Recovery and Consequences

Although the company was later able to restore some of the information from an external backup that was about three months old, the process took over two days and left substantial gaps in operational data.

advertisement“); background-position: center center; background-repeat: no-repeat;”>

PocketOS is currently trying to reconstruct the missing information using alternative sources, including data from payment systems, calendars and email.

Crane said the affected companies are back up and running, but with significant data loss and significant operational difficulties. He said he personally worked with all customers to allow business to resume.

Wider concerns in the industry

The incident fueled the debate over the safety of using AI agents in automated tasks. According to Crane, there are already several similar reports in online communities where the Cursor agent has breached security rules by deleting programs used to administer websites or computer operating systems.

Anthropic released its latest model, the Claude Opus 4.7, on April 16 – about a week before the incident.