YouTube Thumbnail Generation with AI JSON Prompts
If you make videos, you know this truth: Creating content is fun, but making thumbnails is not.
You've got better things to do than spend an hour trying to:
- Pose 5 different ways with the "right" face
- Crop yourself out of a noisy background
- Pick the "perfect" font combo that doesn't scream 2011
- Add shadows, emojis, and icons - then adjust everything 12 times
- Wonder why it still looks off
So I built a tool that handles all of it for you - using a single JSON prompt and AI thumbnail generation.
How it Works
You drop in:
- A title (main text)
- A subhead (sub text, optional)
- A photo
And that's it. The JSON prompt handles:
✅ Pose and expression (matched to your title's tone)
✅ Visual theme (based on your keywords: money, warning, drama, etc.)
✅ Auto background, lighting, icons, and accents
✅ Balanced layout for YouTube-sized real estate
✅ Bold text, max contrast, clean framing
No Photoshop, no digging through pre-made templates, and no wasting 45 minutes lining up your eyes with the "rule of thirds."
Real Example
- Main Text: EXTREME LEADERSHIP
- Sub Text: POWER DOESN'T ASK PERMISSION
Here's what happens behind the scenes:
{
  "profile_name": "smart_youtube_thumbnail_autogen",
  "thumbnail_goal": "Auto-generate a YouTube thumbnail that adapts pose, styling, icons, and colors based on the sentiment and keywords of the input text.",
  "autodetect": {
    "enabled": true,
    "theme_engine": {
      "mode": "analyze_text",
      "fallback_theme": "default",
      "themes_supported": [
        "money",
        "crypto",
        "freelance",
        "weddings",
        "tutorial",
        "warning",
        "success",
        "drama",
        "tech",
        "review",
        "funny",
        "motivational",
        "fitness",
        "parenting",
        "education",
        "gaming",
        "fashion",
        "food",
        "travel",
        "celebrity",
        "default"
      ],
      "match_logic": {
        "money": [
          "$10K",
          "profit",
          "income",
          "sales",
          "revenue",
          "made money",
          "passive income"
        ],
        "crypto": [
          "bitcoin",
          "ethereum",
          "NFT",
          "crypto",
          "blockchain",
          "token"
        ],
        "freelance": [
          "freelance",
          "client",
          "upwork",
          "fiverr",
          "side hustle",
          "contract"
        ],
        "weddings": [
          "wedding",
          "married",
          "proposal",
          "bride",
          "groom",
          "honeymoon",
          "ring"
        ],
        "tutorial": [
          "how to",
          "step-by-step",
          "guide",
          "walkthrough",
          "tutorial",
          "learn to"
        ],
        "warning": [
          "warning",
          "urgent",
          "don't do",
          "scam",
          "mistake",
          "exposed"
        ],
        "success": [
          "milestone",
          "celebrate",
          "success story",
          "win",
          "hit goal",
          "achievement",
          "growth"
        ],
        "drama": [
          "shocking",
          "cancelled",
          "drama",
          "beef",
          "exposed",
          "fight",
          "clapback"
        ],
        "tech": [
          "AI",
          "GPT",
          "app",
          "dev",
          "startup",
          "code",
          "API",
          "programming"
        ],
        "review": [
          "review",
          "vs",
          "comparison",
          "rating",
          "unboxing",
          "first look"
        ],
        "funny": [
          "funny",
          "lol",
          "cringe",
          "fail",
          "wtf",
          "🤣",
          "😂"
        ],
        "motivational": [
          "mindset",
          "discipline",
          "level up",
          "grind",
          "life advice",
          "success mindset"
        ],
        "fitness": [
          "gym",
          "workout",
          "before and after",
          "transformation",
          "6 pack",
          "fat loss",
          "muscle"
        ],
        "parenting": [
          "baby",
          "toddler",
          "mom life",
          "dad life",
          "parenting",
          "child",
          "newborn"
        ],
        "education": [
          "study tips",
          "exam",
          "SAT",
          "math",
          "science",
          "reading",
          "learn faster",
          "homework"
        ],
        "gaming": [
          "gaming",
          "stream",
          "twitch",
          "fortnite",
          "warzone",
          "gamer rage",
          "controller"
        ],
        "fashion": [
          "style",
          "outfit",
          "hairstyle",
          "closet",
          "lookbook",
          "what I wore",
          "makeup"
        ],
        "food": [
          "recipe",
          "cooking",
          "meal",
          "kitchen",
          "taste test",
          "mukbang",
          "snack",
          "what I eat"
        ],
        "travel": [
          "travel",
          "adventure",
          "vacation",
          "trip",
          "destination",
          "exploring"
        ],
        "celebrity": [
          "celebrity",
          "kardashian",
          "famous",
          "influencer",
          "actor",
          "viral"
        ]
      },
      "use_sentiment_for_pose": true,
      "use_keywords_for_visuals": true,
      "allow_override": true
    }
  },
  "person": {
    "image_input": "[UPLOAD_IMAGE]",
    "pose": "auto from detected_theme",
    "expression": "auto from sentiment",
    "position": "left",
    "outline": {
      "enabled": true,
      "color": "#FFFFFF",
      "glow": true,
      "glow_intensity": "moderate"
    },
    "shadow": {
      "enabled": true,
      "opacity": 0.4,
      "blur_radius": 20
    }
  },
  "text": {
    "main_text": "[YOUR MAIN TEXT]",
    "sub_text": "[YOUR SUB TEXT]",
    "font": {
      "style": "bold uppercase",
      "family": "Bebas Neue or similar",
      "weight": "900"
    },
    "color_mode": "auto",
    "outline": true,
    "drop_shadow": true,
    "verbatim": true
  },
  "background": {
    "preset": "auto from theme",
    "options_by_theme": {
      "money": [
        "cash rain",
        "green starburst gradient",
        "gold explosion with spark trails",
        "neon green stacks over black fade",
        "green lightning over dark grid"
      ],
      "warning": [
        "black-red radial burst",
        "flashing light grid",
        "fire hazard backdrop",
        "red lightning bolts over dark sky"
      ],
      "drama": [
        "fire blast",
        "orange smoke with sparks",
        "exploding emoji shockwave",
        "purple lightning with shattered glass effect"
      ],
      "tech": [
        "deep blue radial grid",
        "neon circuitry lines with pulsing energy",
        "electric arc streaks across dark blue"
      ],
      "motivational": [
        "gold beam burst",
        "sunrise glow with energy rays",
        "orange fire swirl",
        "red and gold electric charge background"
      ],
      "default": [
        "bold lightning bolts on gradient",
        "neon radial burst",
        "comic-style explosion background",
        "deep blue starburst"
      ]
    },
    "depth_layers": 4,
    "glow_effects": true,
    "randomize_variants": true
  },
  "icons_and_effects": {
    "enabled": true,
    "auto_icons": true,
    "icons_by_theme": {
      "money": [
        "cash stack",
        "green lightning bolt",
        "dollar rain",
        "green fireball",
        "up arrow with glow"
      ],
      "warning": [
        "red siren light",
        "exploding triangle",
        "burning exclamation mark"
      ],
      "drama": [
        "🔥",
        "⚡",
        "😱",
        "shatter crack",
        "fire swirl"
      ],
      "motivational": [
        "gold medal glow",
        "rising sun burst",
        "⚡ energy zap",
        "🔥 flame bar"
      ],
      "tech": [
        "neon chip icon",
        "pulse grid",
        "AI spark flare"
      ]
    },
    "placement": "around person and text, balanced",
    "dynamic_scaling": true
  },
  "style": {
    "theme_vibe": "auto",
    "balance": "50/50 person-text",
    "contrast": "max",
    "readability_at_small_scale": true
  },
  "render": {
    "format": "JPG",
    "resolution": "1280x720",
    "dpi": 72,
    "optimized_for_youtube": true,
    "max_file_size": "2MB"
  },
  "mood": "Auto-driven by text tone"
}
This setup pulls everything from your text and image. It understands context, not just aesthetics.
The AI reads your intent and builds a thumbnail around it - not some random template. Actual intelligence, not just "design automation."
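If you want a feel for what the "match_logic" and "fallback_theme" fields are asking for, here is a rough Python sketch. It is not how the image model works internally; the model just reads the JSON as instructions. The function names (count_hits, detect_theme, backgrounds_for) are made up for illustration, the tables are trimmed copies of the prompt, and counting whole-word keyword hits is my assumption about a reasonable way to score themes.

```python
import re

# Trimmed copy of the prompt's "match_logic"; only three themes are shown.
MATCH_LOGIC = {
    "money": ["$10K", "profit", "income", "sales", "revenue", "made money", "passive income"],
    "warning": ["warning", "urgent", "don't do", "scam", "mistake", "exposed"],
    "tech": ["AI", "GPT", "app", "dev", "startup", "code", "API", "programming"],
}

# Trimmed copy of "background.options_by_theme".
BACKGROUNDS = {
    "money": ["cash rain", "green starburst gradient"],
    "warning": ["black-red radial burst", "flashing light grid"],
    "default": ["bold lightning bolts on gradient", "neon radial burst"],
}


def count_hits(text: str, keywords: list[str]) -> int:
    """Count keywords that appear in text as whole words or phrases, ignoring case."""
    hits = 0
    for keyword in keywords:
        # Lookarounds instead of \b so keywords such as "$10K" still match,
        # while "AI" does not match inside a word like "rain".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits += 1
    return hits


def detect_theme(text: str, match_logic: dict[str, list[str]], fallback: str = "default") -> str:
    """Return the theme with the most keyword hits, or the fallback if nothing matches."""
    scores = {theme: count_hits(text, kws) for theme, kws in match_logic.items()}
    if not scores:
        return fallback
    best = max(scores, key=scores.get)  # ties go to the theme listed first
    return best if scores[best] > 0 else fallback


def backgrounds_for(theme: str, options: dict[str, list[str]]) -> list[str]:
    """Look up background options, using "default" for themes that have no entry."""
    return options.get(theme, options["default"])


if __name__ == "__main__":
    title = "EXTREME LEADERSHIP POWER DOESN'T ASK PERMISSION"
    theme = detect_theme(title, MATCH_LOGIC)
    print(theme, backgrounds_for(theme, BACKGROUNDS))
```

Against this trimmed table the example title matches no keywords, so it lands on the "default" theme. The same lookup shows why the fallback matters: the prompt lists themes such as "fitness" in "themes_supported" but gives them no entry under "options_by_theme", so they would use the default backgrounds. A keyword such as "exposed" sits under both "warning" and "drama", so a tie-break rule like the first-listed one here is needed.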
Why This Works
YouTube thumbnails are half science, half vibe. This prompt nails both:
- Emotion-based visuals that feel natural, not forced
- Consistent layout built for clickability
- Theme-driven graphics that match what viewers expect (and respond to)
- Small-screen readability that doesn't fall apart in the feed
It doesn't just look good, it performs.
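The prompt only says "contrast": "max" and leaves the rest to the model. If you want to sanity-check a finished thumbnail yourself, one option is the WCAG contrast-ratio formula, which is a web accessibility measure and not something the prompt computes. The sketch below assumes you have sampled two hex colors from the output (say, the text fill and the background right behind it); the example colors are hypothetical.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance (0.0 to 1.0) of a color written as '#RRGGBB'."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Hypothetical samples: white text on black, then white text on bright yellow.
    print(round(contrast_ratio("#FFFFFF", "#000000"), 2))
    print(round(contrast_ratio("#FFFFFF", "#FFFF00"), 2))
```

White on black gives the maximum ratio of 21:1. White text over a bright yellow burst scores close to 1:1, which is the kind of pairing that turns to mush at feed size.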
Who This Is For
- Creators who don't want to spend another second in Canva
- Course sellers, marketers, or indie hackers trying to scale
- Anyone building content at speed and sick of fiddling with thumbnails
- People who want AI-generated YouTube thumbnails that don't look AI-generated
Want to Try It?
You don't need to learn anything.
Just upload a photo and paste the JSON.
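If you would rather script the copy-paste step, here is a minimal Python sketch that swaps the two text placeholders and prints JSON ready to paste. The fill_prompt helper is hypothetical, the template is cut down to the placeholder fields, and two choices are my assumptions: the subhead key is dropped when you leave it empty, and "[UPLOAD_IMAGE]" is left untouched because the photo is uploaded alongside the prompt.

```python
import copy
import json

# Only the fields that carry placeholders are shown; the full prompt has many more keys.
TEMPLATE = {
    "person": {"image_input": "[UPLOAD_IMAGE]"},
    "text": {
        "main_text": "[YOUR MAIN TEXT]",
        "sub_text": "[YOUR SUB TEXT]",
        "verbatim": True,
    },
}


def fill_prompt(template: dict, main_text: str, sub_text: str = "") -> str:
    """Return the prompt as a JSON string with the text placeholders filled in."""
    prompt = copy.deepcopy(template)  # leave the reusable template untouched
    prompt["text"]["main_text"] = main_text
    if sub_text:
        prompt["text"]["sub_text"] = sub_text
    else:
        # Assumption: an optional subhead is omitted rather than sent as a placeholder.
        prompt["text"].pop("sub_text", None)
    # ensure_ascii=False keeps emoji and accented characters readable in the output.
    return json.dumps(prompt, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    print(fill_prompt(TEMPLATE, "EXTREME LEADERSHIP", "POWER DOESN'T ASK PERMISSION"))
```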