What is Deepfake Fraud?
Deepfake fraud uses AI-generated voice and video to impersonate executives and trick employees into authorizing fraudulent transactions. Learn how deepfake attacks work and which process controls defend against them.
How Deepfake Fraud Works
Synthetic media fraud uses AI-generated audio and video to impersonate executives, financial counterparties, and other trusted individuals in ways that are increasingly indistinguishable from genuine recordings. Text-to-speech synthesis models trained on publicly available audio samples can generate convincing voice replicas from minutes of source material. Video deepfake systems extend this to full face and body synthesis, creating video calls or recordings that appear to show a known individual saying and doing things they never did.
Voice Clone CEO Fraud
The best-documented enterprise deepfake attacks use voice clones of CEOs and CFOs to authorize wire transfers. In a 2019 attack widely cited as the first publicly documented AI voice fraud case, the CEO of a UK energy subsidiary was convinced he was speaking with his German parent company's chief executive on a phone call and authorized a €220,000 wire transfer. The caller was an AI voice clone. Voice synthesis has improved dramatically since then: current models require less source audio and produce more convincing output.
Video Deepfake in Financial Fraud
In February 2024, a finance employee at a multinational firm in Hong Kong was tricked into transferring HK$200 million (approximately $25 million USD) after joining a video conference call in which every other participant, including the company's CFO, was a deepfake. The attackers used publicly available video of the executives to synthesize convincing deepfake participants for the call. The employee had initially suspected phishing but was reassured by seeing familiar faces on the call.
Deepfake Detection and Defense
Technical deepfake detection is an active research area but remains imperfect. Detection models look for artifacts in synthetic media: unnatural blinking patterns, lighting inconsistencies, audio-visual synchronization errors, and spectral artifacts in audio. However, detection capability lags generation capability, and sophisticated attackers tune their synthetic media specifically to defeat common detectors.
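To make the spectral-artifact idea concrete, the sketch below estimates how much of a recording's energy sits above a high-frequency cutoff, since some vocoder-based synthesis pipelines attenuate or distort that band. It is an illustrative heuristic under assumed parameters (the cutoff value and the filename are both assumptions), not a production detector; a weak signal like this should only route a call to human review.

```python
# Illustrative heuristic only: real detectors are trained classifiers.
# The cutoff frequency and input filename are assumptions.
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

def high_band_energy_ratio(path: str, cutoff_hz: float = 7500.0) -> float:
    """Fraction of total spectral energy at or above cutoff_hz."""
    rate, samples = wavfile.read(path)
    if samples.ndim > 1:                       # mix stereo down to mono
        samples = samples.mean(axis=1)
    freqs, _, spec = stft(samples.astype(float), fs=rate, nperseg=1024)
    power = np.abs(spec) ** 2
    return float(power[freqs >= cutoff_hz].sum() / power.sum())

# An unusually low or oddly flat high-band ratio is a weak signal worth
# escalating to human review, never a verdict on its own.
ratio = high_band_energy_ratio("inbound_call.wav")  # hypothetical recording
print(f"high-band energy ratio: {ratio:.4f}")
```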
Process controls provide more reliable defense than technical detection. Requiring verbal confirmation of significant financial transactions through callback procedures (calling a verified number rather than responding to an inbound call) defeats voice clone attacks. Requiring additional authorization through a second channel for large transactions defeats scenarios where a single voice or video call is the authorization mechanism. Pre-established safe words between executives and finance teams provide a shared secret that synthetic media cannot reproduce.
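As a sketch of how these controls compose into a single release gate, the snippet below refuses any large transfer that has not passed both an outbound callback to a pre-verified number and an approval on an independent second channel. The threshold, directory entries, and channel names are illustrative assumptions, not a real policy.

```python
# Sketch only: thresholds, directory entries, and channel names are assumed.
from dataclasses import dataclass, field

# Verified callback numbers, maintained out of band
# (never taken from the inbound call itself).
CALLBACK_DIRECTORY = {"ceo": "+44 20 xxxx 0001"}

@dataclass
class TransferRequest:
    amount: float
    claimed_identity: str                # who the inbound caller claims to be
    callback_verified: bool = False      # confirmed via outbound call to the directory number
    second_channel_approvals: set = field(default_factory=set)

def may_release(req: TransferRequest, threshold: float = 50_000.0) -> bool:
    """An inbound voice or video call is never sufficient authorization."""
    if req.amount < threshold:
        return True
    return req.callback_verified and len(req.second_channel_approvals) >= 1

req = TransferRequest(amount=25_000_000.0, claimed_identity="ceo")
assert not may_release(req)              # the convincing call alone releases nothing
req.callback_verified = True             # finance dialed CALLBACK_DIRECTORY["ceo"] and confirmed
req.second_channel_approvals.add("cfo-signed-ticket")
assert may_release(req)
```

The point of the structure is that the deepfake never touches the controls: however convincing the call, the gate only opens on actions the attacker cannot perform.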
Organizational Defense
The organizations most vulnerable to deepfake fraud are those that rely on voice and video recognition as authentication mechanisms for financial decisions. A CFO who authorizes a wire transfer based on hearing the CEO's voice has no defense against a high-quality voice clone. Process redesign that removes voice recognition as a primary authentication factor for significant financial transactions offers structural resistance that detection technology cannot reliably match.
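One way to express that redesign, sketched below under assumed factor names and an assumed two-factor rule, is to enumerate which authentication factors may ever count toward releasing a significant transaction and give voice and face recognition no weight at all.

```python
# Sketch only: the factor names and the two-factor rule are assumptions.
from enum import Enum, auto

class Factor(Enum):
    VOICE_RECOGNITION = auto()   # what the caller sounds like
    FACE_RECOGNITION = auto()    # what the video participant looks like
    CALLBACK_VERIFIED = auto()   # outbound call to a pre-verified number
    HARDWARE_TOKEN = auto()      # signed approval from an enrolled device
    SHARED_SECRET = auto()       # pre-established safe word

# Synthetic media can reproduce the first two factors, so they are
# structurally ineligible; release requires two factors from this set.
ELIGIBLE = {Factor.CALLBACK_VERIFIED, Factor.HARDWARE_TOKEN, Factor.SHARED_SECRET}

def authorized(presented: set) -> bool:
    return len(presented & ELIGIBLE) >= 2

print(authorized({Factor.VOICE_RECOGNITION, Factor.FACE_RECOGNITION}))  # False
print(authorized({Factor.CALLBACK_VERIFIED, Factor.HARDWARE_TOKEN}))    # True
```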
Managing executives' digital footprints, by limiting the publicly available audio and video that could serve as training data, raises the cost of preparing a deepfake attack. Organizations should audit what audio and video of their executives is publicly accessible and limit unnecessary exposure. Conference presentations, earnings calls, and media interviews all provide training data for voice and face synthesis.
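A footprint audit can start as something very simple, like the tally below over a hand-maintained inventory of public appearances. The CSV filename, its columns, and the review threshold are assumptions for illustration.

```python
# Sketch only: the inventory file and its columns are assumed to exist.
import csv
from collections import defaultdict

minutes_public = defaultdict(float)
with open("exec_media_inventory.csv", newline="") as f:
    for row in csv.DictReader(f):      # assumed columns: executive, source, minutes
        minutes_public[row["executive"]] += float(row["minutes"])

# A few minutes of clean audio can be enough to train a usable voice
# clone, so a conservative budget flags exposure that deserves review.
for name, mins in sorted(minutes_public.items(), key=lambda kv: -kv[1]):
    flag = "REVIEW" if mins > 10 else "ok"
    print(f"{name:20s} {mins:6.1f} min  {flag}")
```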
Real-World Example: The $25 Million Hong Kong Deepfake Call
In February 2024, a finance worker at an unnamed multinational corporation received a phishing email claiming to be from the company's UK-based CFO. Initially suspicious, the employee attended a video conference where the CFO and other executives appeared on camera and instructed him to conduct wire transfers. All participants except the victim were AI-generated deepfakes created from publicly available video. The worker transferred approximately $25 million across 15 transactions before discovering the fraud. Hong Kong police confirmed the case and arrested multiple individuals in connection with the operation.
$25 million: lost in a single deepfake video conference fraud in Hong Kong in 2024, in which every participant except the victim was an AI-generated deepfake of a company executive. The attack succeeded because the victim recognized the familiar faces.