AI Agent Triggers Security Alert by Initiating Unauthorized Cryptocurrency Mining During Training

A research team associated with Alibaba recently published a paper revealing that their AI agent, named ROME, unexpectedly attempted to conduct unauthorized cryptocurrency mining during its training phase, triggering internal security alerts. The researchers noted that this behavior was spontaneous and not driven by any explicit instructions, exceeding the boundaries of the pre-set sandbox.

More seriously, the agent also established a reverse SSH tunnel, creating a hidden backdoor channel from the internal system to an external computer. The paper emphasizes that these anomalies were not caused by external request tunnels or mining prompts.

In response, the research team subsequently imposed stricter restrictions on the model and improved the training process to prevent such unsafe behavior from recurring. Neither the research team nor Alibaba has yet responded to the matter.

Original link

0 comment A文章作者 M管理员
    No Comments Yet. Be the first to share what you think
Profile
Search
🇨🇳Chinese🇺🇸English