AI Product Photography Without a Studio

A commercial shoot runs $500 to $2,000 a day, plus a stylist, a retoucher, and a booking lead time in weeks. For a small brand, that math stops most shoots before they start.

The short version: shoot the product once, drop it into an influencer ad without it changing, scale it to every social aspect ratio, and finish with a designed ad complete with bilingual copy, all from your agent. Here is a full campaign for one product.

The product shot

Start with a clean hero. The realistic preset is tuned for photographic light and material, so describe the product, the light, and what is not in frame.

A premium creme brulee bubble tea in a glass that is wide and rounded at the bottom and
tapers to the top, a straw inside: dark tapioca pearls at the base, iced milk tea, a pale
custard layer and a torched caramel crust on top, the layers swirling together.
Studio product photo, cool greyish tones, soft neutral daylight, smooth grey concrete
backdrop, shallow depth of field. No text, no labels.
preset: realistic · quality: medium · 2:3 portrait

Put it in an ad, with the product locked

A hero shot sells the drink. An influencer holding it sells the brand. The trick is keeping the product identical when it moves into a new scene, and the tool for that is the reference image (reference_image_paths): pass the hero shot back in, and the model anchors to it instead of inventing a new drink.

A premium social ad in cool greyish tones: a stylish young woman in a beige trench coat
holding the exact drink from the reference image, soft smile, smooth grey concrete
minimalist setting, soft daylight. Keep the drink identical to the reference.
preset: realistic · quality: medium · 2:3 portrait · reference: the product shot
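
For a rough picture of what the agent sends at this step, here is a minimal sketch of the arguments. Only reference_image_paths is named in this article; the other keys (prompt, preset, quality, aspect_ratio), the helper build_ad_request, and the file path are hypothetical placeholders, so check the tool schema your MCP client actually reports.

```python
import json


def build_ad_request(prompt: str, hero_path: str) -> dict:
    """Assemble hypothetical generation arguments that lock the product via a reference."""
    return {
        "prompt": prompt,                      # assumed key name
        "preset": "realistic",                 # assumed key name
        "quality": "medium",                   # assumed key name
        "aspect_ratio": "2:3",                 # assumed key name
        "reference_image_paths": [hero_path],  # the one parameter the article names
    }


ad_request = build_ad_request(
    "A premium social ad in cool greyish tones: a stylish young woman in a beige "
    "trench coat holding the exact drink from the reference image. "
    "Keep the drink identical to the reference.",
    "outputs/hero.png",  # placeholder path to the saved product shot
)
print(json.dumps(ad_request, indent=2))
```

The point of the sketch is the last key: the hero shot travels with the request, so the prompt only has to describe the new scene.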

The pearls, the caramelized top, the glass: all carried over, because the generation is anchored to the actual product shot. Save this ad as a reference too, and the next step holds the woman steady as well.

One shoot, every aspect ratio

Social wants the same creative in different shapes: a vertical story, a wide banner for X and LinkedIn. Re-run the ad at each size, passing the first ad as the reference so the woman and the drink do not drift.

Re-run the ad at each size, passing the first ad as the reference so the woman and drink
stay identical. 9:16 for stories, 16:9 for a wide banner with open space for a headline.
preset: realistic · quality: medium · reference: the ad above
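
The re-run is the same request with one field changed, which might be sketched like this. As before, every key except reference_image_paths is an assumed name, and the path and the helper requests_by_format are placeholders for illustration.

```python
# Hypothetical base arguments; only reference_image_paths is named in the article.
BASE_REQUEST = {
    "prompt": (
        "Re-run the ad at this size, keeping the woman and the drink "
        "identical to the reference. Leave open space for a headline."
    ),
    "preset": "realistic",
    "quality": "medium",
    "reference_image_paths": ["outputs/ad_2x3.png"],  # placeholder path to the first ad
}

# The two shapes the article asks for.
FORMATS = {"story": "9:16", "banner": "16:9"}


def requests_by_format(base: dict, formats: dict) -> dict:
    """Return one request per format, differing only in the assumed aspect_ratio key."""
    return {name: {**base, "aspect_ratio": ratio} for name, ratio in formats.items()}


ratio_requests = requests_by_format(BASE_REQUEST, FORMATS)
for name, req in ratio_requests.items():
    print(name, req["aspect_ratio"], req["reference_image_paths"])
```

Keeping the prompt and the reference fixed and varying only the ratio is what stops the woman and the drink from drifting between formats.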

Add the copy: a finished ad with text

To ship it as an actual ad, add the headline. Two things matter here. First, switch to the custom preset: the photographic presets render a clean photo and ignore overlaid typography, while custom will lay in text. Second, and this is the one that trips people up: a reference image is a very strong influence, so keep the prompt short. Name the reference, state the pose and the text you want, and stop. A long, detailed prompt fights the reference and starts dropping elements (the pose, half the text). Less prompt, more control.

A premium bubble-tea ad poster, cool greyish tones, with clean text. The woman from the
reference image holds up and shows the drink, smiling. Headline: Crème Brûlée Bubble Tea,
with 焦糖珍珠奶茶 below it.
preset: custom · quality: high · reference: the approved ad
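
One way to keep yourself honest about prompt length is to build the poster prompt from a fixed template and fail if it grows. This is only a sketch: the 50-word budget is an arbitrary assumption, not a documented limit, and every key except reference_image_paths is a hypothetical name.

```python
MAX_PROMPT_WORDS = 50  # assumed budget for illustration, not a documented limit


def build_poster_request(headline: str, subline: str, reference_path: str) -> dict:
    """Build a short text-poster request: name the reference, the pose, the text, then stop."""
    prompt = (
        "A premium bubble-tea ad poster, cool greyish tones, with clean text. "
        "The woman from the reference image holds up and shows the drink, smiling. "
        f"Headline: {headline}, with {subline} below it."
    )
    word_count = len(prompt.split())
    if word_count > MAX_PROMPT_WORDS:
        raise ValueError(f"prompt is {word_count} words; trim it so the reference can lead")
    return {
        "prompt": prompt,
        "preset": "custom",  # the article's preset for overlaid text
        "quality": "high",
        "reference_image_paths": [reference_path],
    }


poster_request = build_poster_request(
    "Crème Brûlée Bubble Tea",
    "焦糖珍珠奶茶",
    "outputs/ad_2x3.png",  # placeholder path to the approved ad
)
print(poster_request["prompt"])
```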

That short prompt held the woman, the drink, and the cool palette from the reference, posed her holding up the drink, and rendered both the English and the Chinese headline cleanly. gpt-image-2 is reliable on short display text like this; text rendering falls apart on dense paragraphs.

One hero, one model, four ready-to-post assets. The photo formats at medium are about 15 tokens; the text poster is best at high (20), which Pro and Power unlock. Call it 35 tokens for the campaign, under a dollar on Pro, versus an $800-plus studio day.
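
The arithmetic behind "under a dollar" can be checked in a few lines. The token counts are the estimates above; the Pro price and allowance ($14.99 for 600 tokens) are the figures quoted in the FAQ, and the per-token rate is simply derived from them.

```python
PRO_PRICE_USD = 14.99   # Pro plan price per month, from the FAQ
PRO_TOKENS = 600        # tokens included in Pro, from the FAQ

photo_tokens = 15       # the article's estimate for the photo formats at medium
poster_tokens = 20      # the text poster at high quality

campaign_tokens = photo_tokens + poster_tokens
usd_per_token = PRO_PRICE_USD / PRO_TOKENS      # effective rate, derived
campaign_usd = campaign_tokens * usd_per_token

print(f"{campaign_tokens} tokens, about ${campaign_usd:.2f} on Pro")
```

At roughly two and a half cents per token, 35 tokens comes to about $0.87.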

Where this breaks down

You will hit these, so plan for them.

Glass and reflective finishes. Glass, chrome, and polished steel are the hard case. The drink above is glass, which is exactly why the reference image matters: it holds the shape and the layers steady across every scene. Matte, ceramic, fabric, and wood read more reliably.

Fine label text. The model invents convincing label copy that will not match your real packaging. Fine for hero and lifestyle shots; for compliance or detail shots, use a camera.

Exact product fidelity. For warranty, documentation, or anything that must depict the physical item precisely, use a photograph. This is for marketing, scene variety, and fast iteration, where it removes a real bottleneck.

For transparent cut-outs (a product on a clean alpha background for marketplace listings), generate on white, then remove the background locally at zero tokens. Full method in the transparent background PNG guide.

FAQ

Can an AI product photo generator replace professional photography? For marketing, social, and marketplace testing, it covers most of a studio session. For legal, compliance, or precision documentation, use a camera. The two are complementary.

How do I generate product photos with AI and keep the product consistent? Pass your hero shot as a reference image (reference_image_paths) in every later generation. The model anchors to it and holds color, shape, and finish across scenes, far more reliably than re-describing the product each time. Keep the prompt short when you do, since the reference carries most of the weight. More in the guide to generating images from reference images.

How do I get readable text on the ad? Use the custom preset (the photo presets skip overlaid text), keep the copy short (a headline, a line of subtext), and put the exact words in the prompt. Short display text, including non-Latin scripts, renders well; dense paragraphs do not.

How much does AI product photography cost per image? A high-quality 1024x1024 image is 20 tokens; medium is 5; low is 1. Background removal is 0. The Pro plan ($14.99/month) includes 600 tokens, plus unlimited background removals. For pay-as-you-go beyond a plan, Power adds $0.04/token overage.
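
Turning those token prices into per-image numbers is a short calculation. The sketch below uses only the figures in the answer above; the effective Pro rate is derived by dividing the plan price by its included tokens, which is an illustration rather than an official per-token price.

```python
TOKENS_PER_IMAGE = {"high": 20, "medium": 5, "low": 1}  # 1024x1024, from the answer above
PRO_PRICE_USD = 14.99
PRO_TOKENS = 600
OVERAGE_USD_PER_TOKEN = 0.04  # Power overage rate

pro_usd_per_token = PRO_PRICE_USD / PRO_TOKENS  # derived effective rate

# How many images of each quality fit in the Pro allowance.
images_per_plan = {q: PRO_TOKENS // t for q, t in TOKENS_PER_IMAGE.items()}

# Cost of one image inside the Pro allowance versus at the overage rate.
pro_cost = {q: round(t * pro_usd_per_token, 2) for q, t in TOKENS_PER_IMAGE.items()}
overage_cost = {q: round(t * OVERAGE_USD_PER_TOKEN, 2) for q, t in TOKENS_PER_IMAGE.items()}

for quality in TOKENS_PER_IMAGE:
    print(quality, images_per_plan[quality], pro_cost[quality], overage_cost[quality])
```

So a Pro allowance covers 30 high-quality images, 120 at medium, or 600 at low, and a high-quality image costs about $0.50 inside the plan versus $0.80 at the overage rate.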


Ready to shoot your product? Connect AgentBrush to your agent, generate the hero, then reference it into a full campaign. AgentBrush is an MCP server, so Claude Code, Cursor, Codex CLI, or any MCP client runs every step in one conversation.