🦞 4minAI.com
Day 10 of 28 · ChatGPT Course

Voice Mode

Typing is great, but sometimes you just want to talk. ChatGPT's Voice Mode lets you have natural, real-time conversations with AI using your voice, whether on your phone, on your desktop, or in the browser.

It's not just speech-to-text. Voice Mode is a fully conversational experience where ChatGPT listens, understands context, responds with natural-sounding speech, and can even detect emotion and adjust its tone. Today you'll learn how to activate it, pick your preferred voice, and use it in everyday situations where your hands are busy.

How to use voice conversations

Voice Mode is available across all ChatGPT platforms. Here's how to get started on each:

Mobile (iOS and Android): Tap the headphone icon in the bottom-right corner of the ChatGPT app. The screen shifts to a visual waveform display, and you can start talking immediately. Tap the waveform to pause, or tap the X to end the conversation.

Desktop app: Click the headphone icon next to the message input. A voice interface overlay appears. You can talk naturally while still seeing your previous chat history.

Web browser: The voice icon appears in the message bar at chatgpt.com. Click it to activate Voice Mode, and allow microphone access when prompted.

Key things to know:

- You can interrupt ChatGPT mid-response: just start talking and it will stop and listen

- Voice Mode maintains full conversation context, so you can reference things said earlier

- You can switch between voice and text seamlessly in the same conversation

- Voice Mode works with your custom instructions and memory, so ChatGPT knows your preferences

You: Hey ChatGPT, I'm driving to a client meeting right now. Can you give me a quick refresher on the three main points from the proposal we worked on yesterday?

ChatGPT: Of course! Here are the three key points from your proposal. First, the cost reduction: you're projecting a 30% decrease in operational costs by automating the invoice processing workflow. Second, the implementation timeline: you proposed a 90-day rollout in three phases (setup, pilot testing, and full deployment). Third, the ROI: based on current processing volumes, the system should pay for itself within six months. Want me to help you rehearse any talking points before you arrive?

You: Yeah, help me practice how to respond if they push back on the 90-day timeline.

ChatGPT: Good thinking. Here's a strong response: "I understand the timeline feels ambitious. The reason we structured it in three phases is to reduce risk. Phase one is just setup, with no disruption to your current workflow. Phase two is a controlled pilot with one team. Only in phase three do we go company-wide, and only after the pilot proves the results. So while it's 90 days total, the real commitment doesn't start until day 30." That positions the timeline as cautious rather than aggressive. Want to run through any other objections?
Knowledge Check
What makes Voice Mode different from a simple speech-to-text feature?
A. It's a fully conversational experience that maintains context, detects emotion, and responds with natural speech
B. It requires a separate subscription
C. It only works on mobile devices
D. It converts your speech to text and sends it as a regular message
Voice Mode is not just dictation. It's a real-time conversational AI experience. ChatGPT listens, understands the context of the full conversation, responds naturally, and can even pick up on emotional cues to adjust its tone. You can interrupt, ask follow-ups, and switch topics naturally.

Choosing your preferred voice

ChatGPT offers several distinct voices, each with its own personality and tone. You can choose the one that feels most natural and comfortable for your conversations.

The available voices range from warm and friendly to clear and professional. You can preview each voice in Settings and pick the one you prefer. Your choice persists across all conversations until you change it.

How to change your voice:

1. Go to Settings in ChatGPT

2. Select Voice

3. Preview the available voices by tapping the play button next to each

4. Select your preferred voice

Tips for choosing:

- If you use Voice Mode for professional tasks (meeting prep, presentations), pick a voice with a clear, steady cadence

- If you use it for casual conversations (brainstorming, learning), pick a voice with a warm, conversational feel

- Try a few different voices for a week before committing, since what sounds good in a preview might feel different in a 10-minute conversation

Knowledge Check
Where do you change ChatGPT's voice?
A. You cannot change the voice
B. In ChatGPT's Settings under the Voice section
C. By asking ChatGPT to change its voice during a conversation
D. In your device's accessibility settings
Navigate to Settings in ChatGPT and select the Voice section. There you can preview all available voices and select the one you prefer. Your selection applies to all future voice conversations.

Hands-free use cases

Voice Mode unlocks ChatGPT for situations where typing is impossible or inconvenient. Here are the most popular hands-free scenarios:

Commuting: Prepare for your day while driving or riding transit. Get briefed on your calendar, rehearse a presentation, brainstorm ideas, or catch up on industry news. Voice Mode turns dead commute time into productive time.

Cooking: Ask for recipes step by step, get ingredient substitutions on the fly, set verbal reminders for timers, or have ChatGPT walk you through a new technique. No need to touch your phone with messy hands.

Exercising: Get workout suggestions, have ChatGPT count reps or guide you through a stretching routine, or listen to a summary of an article while on the treadmill. Voice Mode keeps your hands free and your mind engaged.

Walking and brainstorming: Some of the best ideas come when you're moving. Use Voice Mode to capture ideas, develop them in real time, and have ChatGPT organize your stream of consciousness into structured notes.

Multitasking at your desk: Even when you're at a computer, sometimes it's faster to talk than to type, especially for complex, nuanced requests that are easier to explain verbally.

Record mode and meeting summaries

One of the most powerful Voice Mode features is Record mode. This lets you record real-world audio, such as a meeting, lecture, or phone call, and have ChatGPT process it into useful output.

How Record mode works:

1. Open ChatGPT and tap the Record button (microphone icon with a circle)

2. Place your phone where it can capture the audio (on the conference table, next to your laptop speaker, etc.)

3. ChatGPT listens and transcribes in real time

4. When the recording is done, tap Stop

5. Ask ChatGPT to process the recording: summarize it, extract action items, or create a document in Canvas

What you can do with recordings:

- Meeting summaries: Record a team meeting and get a structured summary with decisions made, action items assigned, and next steps

- Lecture notes: Record a class or webinar and get organized notes with key concepts highlighted

- Interview transcripts: Record an interview and get a clean transcript with speaker labels

- Voice memos to Canvas: Record your thoughts on a topic, then have ChatGPT turn your rambling voice memo into a polished document in Canvas

This workflow is especially powerful when combined with Canvas. You can go from a raw meeting recording to a formatted summary document with action items in under a minute.

Knowledge Check
What can you do with a meeting recording in ChatGPT's Record mode?
A. Only get a raw transcription of what was said
B. Only share the audio file with others
C. Record mode is only available on desktop
D. Get structured summaries, extract action items, identify decisions, and create formatted documents in Canvas
Record mode goes far beyond simple transcription. ChatGPT can process the recording to create structured meeting summaries, extract action items with assigned owners, identify key decisions, and even draft follow-up emails or create polished documents in Canvas, all from a single recording.
Final Check
Which scenario is the BEST use of ChatGPT Voice Mode?
A. Preparing for a client meeting while driving to the office
B. Formatting a spreadsheet with precise column widths
C. Writing a 2,000-word blog post from scratch
D. Debugging a complex piece of code with multiple files
Voice Mode excels in hands-free, conversational scenarios. Preparing for a meeting while driving is a perfect fit: it's conversational, doesn't require visual output, and turns unproductive time into preparation time. Tasks requiring visual precision (code debugging, spreadsheet formatting) are better suited to text mode.
Day 10 Complete
"Voice Mode frees ChatGPT from the keyboard. Talk, listen, and get things done, wherever you are."
Tomorrow: Day 11
Video Generation with Sora
Learn how to generate HD video clips with Sora, OpenAI's video generation tool built into ChatGPT.