Explainer · April 19, 2026 · 9 min read

Deepfake, Face Swap, AI-Generated Person: What the Difference Actually Means


Someone sends you a video. The person in it looks real. The setting looks real. Something feels off, but you cannot name it.

Before you can figure out what you are looking at, you need to know which of three completely different things produced it. A deepfake, a face swap, and a fully AI-generated person are not the same thing. They are made differently, they fail differently, and they are used for different purposes.

Getting the label right is not pedantry.

The visual tells are different. The legal exposure is different. The detection approach is different. Using the wrong label means looking in the wrong place.


Face Swaps Leave Seams, and the Seams Are Where You Look

A face swap takes a real video of a real person and replaces their face with someone else's. The body, the voice, the background, and the movement are all from the original footage. Only the face has been substituted.

The source material is genuine video. A face-swap model is trained on images of the target face, the person whose face will appear in the output, and learns to map that face's geometry onto the source footage frame by frame. Quality depends on how many training images were available and how closely the lighting and angle match between the source and the target.

What it is used for: Entertainment parody, satire, non-consensual intimate imagery, political manipulation where the body language and setting of a real video are preserved but the face is substituted.

The specific tells:

The seam where the replacement face meets the original neck, ears, and hairline is where face-swap artifacts concentrate. Look at the boundary between the face and the collar. Even in a close match, there is often a slight halo, a color mismatch, or a texture difference at the edge. The ears are particularly reliable: a face-swap model trained on frontal images often struggles with the ear-to-jaw junction when the head turns.

Earrings are a useful test. In a real video, an earring moves with the earlobe. In a face-swap, the earring sometimes lags slightly behind head movement or floats at a slightly wrong position relative to the ear.

What makes it hard to detect: A well-executed face swap using footage with similar lighting, angle, and skin tone can be extremely convincing. The giveaway is almost always at the boundary and in motion, not in a static frame.

The implication for you: pausing on a static frame is the wrong move. Watch the edges. Watch the ears. Watch anything the face-swap model had to approximate rather than copy.
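To make the boundary check concrete, here is a toy sketch in Python: sample pixel colors just inside the swapped face region and just outside it, then compare. How you obtain the samples is left out entirely, and the `boundary_mismatch` helper and the RGB values below are hypothetical illustrations, not part of any real detector.

```python
from statistics import mean

def boundary_mismatch(inside, outside):
    """Mean per-channel color difference between pixel samples taken
    just inside the face boundary and just outside it (neck/ear side).
    Well-matched real footage tends to show a small difference; a halo
    or tone shift at the seam shows up as a larger one."""
    diffs = [
        sum(abs(a - b) for a, b in zip(pin, pout)) / 3
        for pin, pout in zip(inside, outside)
    ]
    return mean(diffs)

# Hypothetical RGB samples along the jawline of one frame.
face_side = [(182, 140, 120), (180, 138, 119), (185, 141, 122)]
neck_side = [(170, 128, 108), (168, 126, 107), (173, 129, 110)]

print(round(boundary_mismatch(face_side, neck_side), 1))  # → 12.0
```

A visible tone step of this kind at the jawline, consistent across frames, is exactly the "slight halo" described above.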


Deepfakes Manipulate What a Real Person Says, Not Who They Are

A deepfake, in the precise technical sense, starts with real footage of a real person and manipulates it to show them saying or doing something they did not say or do. The most common application is lip-sync manipulation: a new audio track is generated, often by cloning the subject's voice, and the mouth movements in the original footage are regenerated to match the new audio.

The face is real. The body is real. The words are not.

What it is used for: Political manipulation (making a real politician appear to say something they did not say), non-consensual intimate audio, revenge content against private individuals, and increasingly fraud that trades on a known figure's credibility, since their face and voice are already available in existing footage.

The specific tells:

Lip sync is the primary failure point. Watch the transition between words, particularly on phonemes that require the lips to fully close: B, P, and M sounds. In a deepfake, the mouth movements are generated to approximate the new audio, but the approximation often lags at sentence boundaries and produces unnatural mouth shapes on consonant clusters.

The skin immediately around the mouth sometimes has a slightly different texture or motion than the rest of the face, because only the mouth region was regenerated. Watch the philtrum, the vertical groove between the nose and upper lip, during fast speech. On a real face it deforms naturally. On a deepfake it often stays rigid while the mouth moves around it.

What makes it hard to detect: Because the source material is real footage of a real person, the overall realism is very high. The manipulation is localized, which means a quick watch at normal speed misses it. Frame-by-frame review at the mouth during speech is the most reliable approach.
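The frame-by-frame idea can be sketched as a tiny script. It assumes you have already noted, by ear and by eye, the timestamps where the audio contains a B, P, or M sound and the frames where the lips actually close; the `mean_closure_lag` helper, the event times, and the idea of averaging the offsets are all illustrative, not a real tool.

```python
def mean_closure_lag(audio_closures, video_closures):
    """For each bilabial consonant (B/P/M) in the audio, find the
    nearest lip-closure event in the video and average the offsets.
    A generated mouth track often lags the audio it was fitted to."""
    lags = []
    for t in audio_closures:
        nearest = min(video_closures, key=lambda v: abs(v - t))
        lags.append(nearest - t)
    return sum(lags) / len(lags)

# Hypothetical event times (seconds) noted from a short clip.
audio = [1.20, 2.05, 3.40]   # B/P/M sounds in the audio track
video = [1.26, 2.13, 3.47]   # frames where the lips fully close

lag = mean_closure_lag(audio, video)
print(f"{lag:.3f}s")  # positive = the mouth closes after the sound
```

A consistent positive lag of a few hundredths of a second across many consonants is the kind of pattern a casual watch at normal speed misses.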

If you are trying to verify whether a public figure actually said something, this is the type you are most likely dealing with. The original footage exists. The question is whether the words were added afterward.


Fully Synthetic People Have No Source Material, Which Makes Them Harder to Disprove

A fully AI-generated person has no original footage. No real individual was involved. The face, voice, body, background, and script were all generated from scratch by a generative AI model. The person does not exist.

This is what the fake TikTok doctor recommending supplements was. This is what most AI influencer fraud accounts use. There is no original footage to compare against, no real person whose face can be traced back to a source, and no seam between a generated face and a real body because the entire frame was generated.

Models like Sora, Kling, Runway, and Seedance produce this type of content. The output quality has improved dramatically since 2023, but the generation process still leaves consistent artifacts.

What it is used for: Fake influencer accounts, fake doctor and health authority content, investment and crypto scam endorsements, AI companion applications, disinformation at scale where no real footage of a real person is needed.

The specific tells:

Because every element was generated simultaneously, there is no seam between a face and a body. The tells are distributed across the whole frame rather than concentrated at a boundary. None of these signals alone is definitive. Look for a pattern across several.

  • Hands: AI generators still produce hands with inconsistent finger counts, merged fingers, and knuckle geometry that falls apart in motion. If a synthetic person uses their hands, watch closely.
  • Background coherence: The background of a synthetic video sometimes warps subtly near the edges of the frame when the subject moves. Straight lines in the background (door frames, shelves, tiles) may bend or shift.
  • Skin texture in motion: Static frames from synthetic video can look convincing. In motion, the skin texture either smears during head movement or flickers slightly between frames.
  • Eye moisture: Real eyes have catchlight reflections from the light source. Generated eyes often have catchlights that do not correspond to any light source visible in the background, or catchlights that stay fixed as the head moves rather than shifting with the environment.
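The "pattern across several" idea can be sketched as a simple weighted checklist: each observed tell adds weight, and only several co-occurring tells push the total past a threshold. The signal names and weights below are made up for illustration; real detection systems are far more involved.

```python
# Illustrative weights for the tells described above (out of 100).
WEIGHTS = {
    "hand_geometry": 35,
    "background_warp": 25,
    "skin_smearing": 25,
    "catchlight_mismatch": 15,
}

def synthetic_score(observed):
    """Sum the weights of the tells observed in a clip (0 to 100)."""
    return sum(WEIGHTS[s] for s in observed)

# One warped shelf alone is weak evidence...
print(synthetic_score({"background_warp"}))            # → 25
# ...but three co-occurring tells are hard to explain away.
print(synthetic_score({"hand_geometry", "skin_smearing",
                       "catchlight_mismatch"}))        # → 75
```

The point of the sketch is the shape of the reasoning, not the numbers: no single signal is definitive, so the decision rests on accumulation.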

What makes it hard to detect: The absence of any original footage means there is no source material to compare against. Platform metadata labeling is the only automatic signal, and as covered in how platforms label AI video, that metadata is trivially stripped before upload.


The Three Types Fail in Different Places, So Look in Different Places

                           | Face swap                      | Deepfake                       | Synthetic person
Source material            | Real video + real face         | Real video of real person      | None
Primary manipulation       | Face boundary                  | Mouth / lip sync               | Everything
Where to look first        | Ear-jaw boundary, hairline     | Lip sync, philtrum             | Hands, background, eyes
Platform label reliability | Low (metadata easily stripped) | Low                            | Low
Traceability               | High (original footage exists) | High (original footage exists) | None

Community reporting covers all three types, but the confidence level differs. A face-swapped video of a real person can be verified against the original footage if someone finds it. A deepfake can be traced to the source. A fully synthetic person has no ground truth to compare against, which is why account-level patterns matter more than individual video analysis. An account that posts only synthetic content will show a pattern of consistent artifacts across multiple videos even if any single video passes a quick check.
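The account-level idea can be sketched the same way: score each upload, then look at the fraction of uploads that clear a per-video threshold. The scores and the threshold below are hypothetical, and `flagged_fraction` is a toy helper, not how any community reporting system actually works.

```python
def flagged_fraction(scores, threshold=60):
    """Fraction of an account's videos whose artifact score meets the
    per-video threshold. One borderline video proves little; a pattern
    across uploads is the signal."""
    hits = sum(1 for s in scores if s >= threshold)
    return hits / len(scores)

# Hypothetical artifact scores (0-100) for two accounts' uploads.
mixed_account = [15, 70, 10, 20, 25]      # one odd video, rest clean
synthetic_account = [65, 80, 75, 60, 90]  # consistent artifacts

print(flagged_fraction(mixed_account))      # → 0.2
print(flagged_fraction(synthetic_account))  # → 1.0
```

This is why an account that posts only synthetic content is easier to call than any one of its videos: even if each clip individually passes a quick check, the rate at which artifacts recur across the feed does not.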


Everyone Calls Everything a Deepfake, and That Causes Real Problems

All three types are called "deepfake" in casual usage, in most media coverage, and in most platform policy documents. TikTok's community guidelines use "deepfake" to cover all three. The EU AI Act uses "deep fake" to mean synthetic or manipulated media generally.

This matters for two reasons.

First, when you report content to a platform using the "deepfake" report category, you may be reporting a face swap or a synthetic person. The platform's enforcement team needs to know which type to investigate correctly. Calling everything a deepfake is like calling everything a car accident. The label tells you something happened, but not what to do next.

Second, when a news article says "a deepfake of [person]," it is worth asking whether the original person was involved at all or whether the article is describing a synthetic person who was designed to look like them. Those are very different claims with very different legal and ethical implications.

Understanding what a deepfake actually is, and which of the three categories it falls into, is the first step in knowing where to look, what to report, and how to explain what you found to someone else.



Ledger App

Train your eye. Verify what you find.

Swipe real and AI-generated video clips to sharpen your detection instinct. Then paste any suspicious URL and see what the community has already flagged.
