Day 8 of 28 · ChatGPT Course
Image Generation & Analysis
7 min
Beginner
ChatGPT is not just a text tool. It can create images from scratch, analyze images you upload, and edit existing images using natural language instructions. This is powered by OpenAI's DALL-E integration, built directly into the ChatGPT interface.
Today you'll learn how to generate stunning images, get AI-powered analysis of any visual content, and edit images without touching Photoshop. By the end of this lesson, you'll have the skills to create and manipulate visuals for presentations, social media, marketing materials, and more.
Generating images from text descriptions
ChatGPT uses DALL-E to turn your text prompts into images. You simply describe what you want, and the AI generates it. No design skills needed.
The key to great image generation is specificity. Compare these two prompts:
Vague: "A dog in a park"
Specific: "A golden retriever sitting in a sunlit autumn park, fallen orange and red leaves scattered on the ground, soft bokeh background, warm afternoon light, photorealistic style"
The more detail you provide about subject, setting, lighting, style, and composition, the better your results will be. Think of yourself as an art director giving instructions to a designer.
Here are the style categories you can use:
Photorealistic: looks like a photograph, great for product mockups and stock imagery
Illustration: hand-drawn, digital art, or vector styles for blog posts and social media
3D render: clean, modern 3D objects for tech presentations and app visuals
Watercolor / Oil painting: artistic styles for creative projects and wall art
Flat design / Minimalist: clean icons and simple visuals for UI and branding
Create an image of a modern home office with a standing desk, dual monitors, a potted monstera plant, and warm natural light coming through large windows. Minimalist Scandinavian style.
Here's your image. I've created a minimalist Scandinavian home office featuring:
- A light oak standing desk with clean lines
- Dual monitors with thin bezels on a monitor arm
- A large monstera plant in a white ceramic pot
- Floor-to-ceiling windows with warm golden-hour light
- Neutral color palette with white walls and light wood tones
Would you like me to adjust anything? I can change the color scheme, add more furniture, or change the perspective.
Make it a nighttime scene with the monitors glowing and add some ambient LED strip lighting under the desk.
Done. I've updated the scene to nighttime:
- The windows now show a city skyline at night
- Both monitors are glowing with a soft blue-white light
- Purple-blue LED strip lighting runs under the desk
- The monstera casts subtle shadows from the ambient light
The mood is now cozy and focused. Want any other changes?
Knowledge Check
What is the most important factor for getting high-quality AI-generated images?
A
Using the most expensive ChatGPT plan
B
Keeping prompts as short as possible
C
Generating the image multiple times until you get lucky
D
Providing specific, detailed descriptions including subject, style, lighting, and composition
Specificity is everything in image generation. The more detail you provide about what you want (subject, setting, style, lighting, composition, and mood), the more closely the output will match your vision. Think like an art director, not a search engine user.
Uploading and analyzing images
ChatGPT's vision capabilities let you upload an image and have a conversation about it. This is useful for everyday tasks such as explaining error messages, transcribing notes, and reviewing designs.
What you can upload and analyze:
Screenshots: paste a screenshot of an error message, a website design, or a dashboard, and ChatGPT will explain what it sees and offer suggestions.
Diagrams and charts: upload a flowchart, org chart, or data visualization. ChatGPT can read it, explain it, and even recreate it in a different format.
Photos of documents: snap a picture of a whiteboard, handwritten notes, or a printed document. ChatGPT will transcribe and organize the content.
Product photos: upload images for analysis, comparison, or to generate marketing copy based on what it sees.
Design mockups: share wireframes or UI designs and get feedback on layout, usability, and accessibility.
To upload an image, click the attachment icon (paperclip) in the ChatGPT message bar, or simply paste an image from your clipboard.
[Uploads a screenshot of a complex Excel spreadsheet with sales data] Can you explain what this spreadsheet is tracking and suggest improvements to the layout?
I can see a sales tracking spreadsheet with the following structure:
**Columns identified:**
- Column A: Sales rep names
- Columns B-M: Monthly revenue figures (Jan-Dec)
- Column N: Annual totals
- Column O: Commission percentage
**Issues I noticed:**
1. No header row formatting, which makes it hard to scan
2. Mixed currency formats ($1,000 vs 1000.00)
3. The commission column uses hardcoded values instead of formulas
4. No conditional formatting to highlight top/bottom performers
**Suggested improvements:**
- Add a frozen header row with bold formatting
- Standardize currency format across all cells
- Add conditional formatting (green for above target, red for below)
- Create a summary dashboard row at the top
- Add sparkline charts for trend visualization
Want me to write the formulas or create an improved version?
Knowledge Check
Which of the following can you NOT do with ChatGPT's image analysis?
A
Analyze data from a screenshot of a chart
B
Get feedback on a UI design mockup
C
Directly edit the pixels of an uploaded photograph in real time
D
Transcribe handwritten notes from a photo
ChatGPT can analyze, describe, and discuss uploaded images in detail. It can also generate new images based on your instructions. However, it doesn't work as a real-time pixel editor like Photoshop; it generates new images rather than directly manipulating uploaded ones at the pixel level.
Editing images with natural language
One of the most powerful features is image editing through conversation. You can ask ChatGPT to modify generated images or even work with images you upload.
Here are practical editing workflows:
Iterative refinement: Generate an image, then ask for specific changes. "Move the text to the left." "Make the background darker." "Add a subtle drop shadow." Each instruction builds on the previous result.
Style transfer: Take a concept and change its visual style. "Now make this same scene as a pixel art illustration." "Convert this to a vintage poster style."
Element addition/removal: "Remove the person in the background." "Add a coffee cup on the desk." "Replace the sky with a sunset."
Text and branding: "Add the text 'Launch Day' in a bold sans-serif font at the top." "Place our logo in the bottom-right corner."
The editing workflow is conversational. You don't need to start over; just keep refining until you get exactly what you want.
Knowledge Check
What is the recommended workflow for getting the perfect AI-generated image?
A
Use a separate image editing tool after generation
B
Generate an initial image, then iteratively refine it with specific editing instructions
C
Write one perfect prompt and accept whatever it generates
D
Generate dozens of images and pick the best one
The iterative approach works best. Start with a solid initial prompt, then refine through conversation by adjusting colors, moving elements, or changing styles. Each edit builds on the previous result, so you converge on exactly what you want.
Best practices for image prompts
Here's a formula for writing effective image generation prompts. Include as many of these elements as are relevant:
1. Subject: What is the main focus? ("A woman working on a laptop", "A futuristic city skyline")
2. Setting/Environment: Where is it? ("in a cozy cafe", "on a rooftop at sunset")
3. Style: What visual style? ("photorealistic", "watercolor illustration", "flat vector design", "cinematic")
4. Lighting: What's the mood? ("soft natural light", "dramatic side lighting", "neon glow")
5. Composition: How is it framed? ("close-up portrait", "wide-angle aerial view", "centered symmetrical")
6. Color palette: Any color preferences? ("warm earth tones", "monochrome blue", "vibrant and saturated")
7. Mood/Atmosphere: What feeling? ("peaceful and calm", "energetic and dynamic", "mysterious and moody")
Pro tip: You can also use reference styles like "in the style of a New Yorker magazine illustration" or "like a Studio Ghibli background painting" to guide the aesthetic.
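If you write prompts often, the seven-element formula above can be sketched as a small Python helper that assembles a prompt from its parts. This is only an illustration: the class name `ImagePrompt`, its field names, and the `render` method are hypothetical and invented for this sketch. It produces a plain string to paste into ChatGPT and does not call any OpenAI API.

```python
from dataclasses import dataclass, fields


@dataclass
class ImagePrompt:
    """Hypothetical container for the seven prompt elements in the formula."""

    subject: str
    setting: str = ""
    style: str = ""
    lighting: str = ""
    composition: str = ""
    color_palette: str = ""
    mood: str = ""

    def render(self) -> str:
        # Join the non-empty elements, in declaration order, with commas.
        parts = (getattr(self, f.name).strip() for f in fields(self))
        return ", ".join(p for p in parts if p)


# Example values are taken from the formula's own sample phrases.
prompt = ImagePrompt(
    subject="A woman working on a laptop",
    setting="in a cozy cafe",
    style="photorealistic",
    lighting="soft natural light",
    composition="close-up portrait",
    color_palette="warm earth tones",
    mood="peaceful and calm",
)
print(prompt.render())
```

Any element left blank is simply omitted from the rendered prompt, so the same helper works for short and detailed prompts.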
Final Check
You need a header image for a blog post about sustainable energy. Which prompt would produce the best result?
A
"Sustainable energy image"
B
"A green energy picture for my blog"
D
"A wide-angle photograph of a modern solar farm at golden hour, rows of sleek panels reflecting warm sunset light, rolling green hills in the background, clear sky with soft clouds, clean and optimistic mood, editorial photography style"
The detailed prompt specifies subject (solar farm), composition (wide-angle), lighting (golden hour/sunset), setting (green hills), mood (clean, optimistic), and style (editorial photography). This gives the AI everything it needs to generate a professional, on-brand image.
Day 8 Complete
"ChatGPT turns you into a visual creator. Describe it, generate it, refine it. No design degree required."
Tomorrow: Day 9
Canvas
Discover Canvas, ChatGPT's interactive workspace for writing and coding side by side with AI.