The Method
The AI does not decide the quality of your digital twin. Your footage does.
Most people blame the platform when their avatar looks robotic. The real problem was decided before the AI ever saw the video. This page explains what actually determines digital twin quality — and why most filming approaches get it wrong.
The platform is not the problem
Digital twin platforms are remarkable tools. They can produce realistic AI video of a person from a few minutes of source footage. But they process input and produce output. The quality of the output is almost entirely a function of the quality of the input.
Feed the system footage shot in harsh lighting, with shifting white balance, in the wrong codec, with a subject who was not directed to maintain eye contact — and the avatar will look harsh, inconsistent, and lifeless.
The platforms publish documentation. It is generally incomplete. What it does not cover is responsible for most of the failed uploads, poor avatar quality, and client disappointment in this space.
Six variables. Each one decided before you press record.
Lighting
Even, soft light on the subject. Underexposed background. No bounce from studio walls.
Digital twin systems analyse tiny differences between frames — around the mouth, eyes, and skin. Inconsistent or harsh lighting makes this harder. Bounce light from white studio walls can cause background removal to blur edges, particularly around male subjects' ears, where skin tone and background can become similar enough to confuse the AI.
The right approach: soft, even light on the subject, with the background deliberately underexposed. This gives the AI clean facial data and produces background removal without halos or edge artefacts.
Codec and File Format
Wrong codec = silent upload failure. No error message. The upload just does not work.
Most platforms only accept 8-bit 4:2:0 chroma subsampling. If the footage was recorded in a different format — even if it looks identical on screen — the upload fails silently. This is one of the most common and most preventable causes of failure in the entire workflow.
Transcoding to H.264 at 8-bit with 4:2:0 chroma subsampling before uploading eliminates this category of problem entirely.
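Footage recorded in another format can usually be converted rather than reshot. A minimal sketch in Python, assuming the ffmpeg binary is installed and on the PATH (the file names here are placeholders):

```python
import subprocess

# Re-encode to H.264 with the yuv420p pixel format, which is
# 8-bit 4:2:0 chroma subsampling, the format most platforms expect.
subprocess.run(
    ["ffmpeg", "-i", "source_footage.mov",
     "-c:v", "libx264",       # H.264 encoder
     "-pix_fmt", "yuv420p",   # forces 8-bit 4:2:0 output
     "-c:a", "aac",           # widely accepted audio codec
     "upload_ready.mp4"],
    check=True,
)
```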
White Balance
Automatic white balance shifts skin tones between frames. Set it manually once and leave it.
If the camera adjusts colour temperature during filming, skin tones change between frames. The AI tries to identify consistent facial data across the recording. Shifting skin tones produce inconsistency in the avatar. Manual white balance set at the start of every session resolves this completely.
Body Language and Performance
The AI learns what it sees. Non-generic movements appear randomly in generated videos.
If the subject's eyes drift sideways, the avatar sometimes drifts. If the subject performs a pointed-finger gesture during training footage, that gesture may appear randomly in generated content — because the system does not understand context, only patterns.
Natural, small, generic movements. Eye contact maintained. No wide hand gestures outside the body area, which blur during background removal. These directing decisions have to be made before recording starts.
Frame Rate and Shutter Speed
25 frames per second at 1/100 shutter. Sharper facial detail. Better lip-sync precision.
25 frames per second keeps motion natural and gives the AI stable timing to analyse lip movement and facial expressions. A 1/100 shutter speed, twice as fast as the 1/50 that the conventional 180-degree rule would suggest at this frame rate, reduces motion blur on small facial movements, capturing the precision around the mouth that lip-sync accuracy depends on.
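For those who think in shutter angles, the same numbers as a quick calculation, using the standard formula (angle = shutter speed × frame rate × 360):

```python
# Shutter angle for a given shutter speed and frame rate.
def shutter_angle(shutter_speed: float, fps: float) -> float:
    return shutter_speed * fps * 360

# The conventional 180-degree rule pairs 25 fps with a 1/50 shutter.
print(shutter_angle(1 / 50, 25))   # 180.0 degrees: conventional motion blur
print(shutter_angle(1 / 100, 25))  # 90.0 degrees: half the blur per frame
```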
Background
Plain. High contrast against the subject. Ten feet from the backdrop. No patterns.
Plain background with high contrast against the subject's hair and clothing. Approximately ten feet between subject and backdrop so light falls off and the platform has clear subject-to-background separation. No patterns, no bright whites, no equipment visible in frame. The background is a data source for the AI. Clean separation produces clean results.
Three failed attempts. Account-wide ban.
Three failed avatar creation attempts in HeyGen trigger an account-wide ban. Only support can lift it, and support response times are slow. Most people find this out after it has happened to their client.
The cause of most failed creations is footage that does not meet platform requirements. Wrong chroma subsampling. Wrong codec. Footage with cuts the platform cannot process. A consent video filmed differently from the training footage.
Every one of these is preventable.
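One way to test footage before risking an upload attempt is to inspect it with ffprobe, which ships with FFmpeg. A minimal sketch, assuming ffprobe is installed; it checks only the codec and chroma subsampling discussed above, not cuts or the consent video:

```python
import json
import subprocess

def preflight(path: str) -> list[str]:
    """Return a list of problems; an empty list means the basics check out."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name,pix_fmt",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    stream = json.loads(result.stdout)["streams"][0]
    problems = []
    if stream.get("codec_name") != "h264":
        problems.append(f"codec is {stream.get('codec_name')}, expected h264")
    # yuv420p is 8-bit 4:2:0; 10-bit or 4:2:2 footage reports something else.
    if stream.get("pix_fmt") != "yuv420p":
        problems.append(f"pixel format is {stream.get('pix_fmt')}, expected yuv420p")
    return problems

print(preflight("training_footage.mp4") or "Ready to upload")
```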
Want to avoid them? The Masterclass covers every variable — including how to test before you upload.
Explore the Masterclass
The gap is before you press record
The difference between a digital twin a CEO would show to their board and one they would not is not primarily the AI. It is whether the person filming it understood these variables before they pressed record.
That is what a professional photography background makes possible. Understanding light — not just the technical settings, but what it does to how a subject reads on camera. Understanding direction — the difference between a subject who looks stiff and one who looks natural.
This is the methodology behind the Masterclass, the consulting days, and the digital twins produced by Digital Twin Imaging.
Start with the right foundation
The AI is capable of producing something extraordinary. But only if the footage gives it something to work with.
Executives & founders
Have a professional digital twin produced at this level.
Enquire About Your Twin