Audio and Video
Embedding audio and video with accessibility controls, captions, and fallback content.
The <video> and <audio> elements embed media directly in a page without requiring a plugin. Both elements look simple at first, but each has a set of attributes that control playback behavior, accessibility, and browser compatibility.
The <video> element
The most common mistake is placing a <video> element on a page and giving it no way to play.
What breaks without controls
<!-- The video loads but the user cannot play it. No play button. No timeline. Nothing. -->
<video src="intro.mp4"></video>
The video exists in the page, and the browser may even preload it, but there is no interface. Without the controls attribute, and without a script that drives playback, a user has no way to start, pause, mute, or seek through the video.
Add the controls attribute to render the browser's built-in playback interface:
<video src="intro.mp4" controls width="800" height="450">
<p>Your browser does not support video. <a href="intro.mp4">Download the video</a>.</p>
</video>
The content inside <video> is fallback content. It displays only if the browser cannot play the video at all. Most modern browsers support video natively, so this fallback is primarily for very old browsers. A download link is the most practical fallback.
Key attributes
controls renders the built-in playback UI. It is not on by default. Without it, keyboard users cannot interact with the video at all. Never omit it unless you are building a fully custom player in JavaScript with complete keyboard support.
width and height reserve space before the video loads, preventing layout shift. Use the intrinsic dimensions of the video file.
poster sets an image to display before the video plays. Without a poster, the browser shows either a black rectangle or the first frame of the video. A black rectangle before the video loads creates a poor first impression and can look like a broken element.
<video controls width="1280" height="720" poster="https://placehold.co/1280x720">
<source src="intro.mp4" type="video/mp4">
</video>
preload hints to the browser how much to load before the user presses play. Options:
- none: do not load anything until the user starts playback. Best for bandwidth-sensitive pages where users may not watch the video.
- metadata: load only the duration, dimensions, and first frame. The default in most browsers.
- auto: the browser decides how much to buffer. Appropriate for a page where the video is the primary content and likely to be watched.
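As a rough illustration of how these values might be chosen, the sketch below contrasts an optional clip with a video that is the main content of its page. The file names (clip-a.mp4, keynote.mp4) and poster URLs are placeholders, not files from this lesson, and because preload is only a hint, a browser may still load more or less than requested.

```html
<!-- Optional clip far down the page: ask the browser to fetch nothing
     until the user presses play. The poster matters more here because
     no frame is loaded up front. -->
<video controls preload="none" width="800" height="450" poster="https://placehold.co/800x450">
  <source src="clip-a.mp4" type="video/mp4">
</video>

<!-- The video is the main content of the page: let the browser buffer ahead. -->
<video controls preload="auto" width="1280" height="720" poster="https://placehold.co/1280x720">
  <source src="keynote.mp4" type="video/mp4">
</video>
```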
muted starts the video with audio muted.
autoplay starts playing immediately without user interaction. Most browsers block autoplay with sound. Use autoplay only combined with muted. Even then, be careful: videos that start playing automatically without sound can still be disruptive.
<!-- Blocked by most browsers: autoplay with sound -->
<video src="background.mp4" autoplay controls></video>
<!-- Allowed: muted autoplay, commonly used for background loops -->
<video src="background.mp4" autoplay muted loop playsinline controls></video>
playsinline prevents iOS Safari from forcing video into full-screen mode. Include it whenever you use autoplay muted loop for background videos.
loop replays the video from the beginning when it ends.
Multiple formats with <source>
Not every browser supports every video format. WebM (with the VP9 or AV1 codec) offers better compression but has incomplete support in older browsers. MP4 (with the H.264 codec) is supported by virtually every browser in current use, which makes it the safe second choice.
Use multiple <source> elements so the browser picks the best format it can play:
<video controls width="1280" height="720" poster="https://placehold.co/1280x720">
<source src="lesson.webm" type="video/webm">
<source src="lesson.mp4" type="video/mp4">
<p>Your browser does not support video. <a href="lesson.mp4">Download the lesson</a>.</p>
</video>
The browser reads the <source> elements in order and uses the first one it can play. The type attribute tells the browser the MIME type so it can skip a source without attempting to download it. Without type, the browser has to start downloading the file to figure out the format.
The src attribute on the <video> element is replaced by <source> children when using multiple formats. Do not use both at the same time.
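To make that last point concrete, here is a small sketch of the conflict. When src is set on <video>, the browser uses it and ignores the <source> children, so the WebM alternative in the first block is never considered. The file names reuse lesson.webm and lesson.mp4 from the example above.

```html
<!-- Avoid: src on <video> takes over, and the <source> children are ignored. -->
<video src="lesson.mp4" controls width="1280" height="720">
  <source src="lesson.webm" type="video/webm">
  <source src="lesson.mp4" type="video/mp4">
</video>

<!-- Prefer: no src on <video>; the browser chooses among the <source> children. -->
<video controls width="1280" height="720">
  <source src="lesson.webm" type="video/webm">
  <source src="lesson.mp4" type="video/mp4">
</video>
```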
Captions and subtitles with <track>
A video without captions is inaccessible to deaf and hard-of-hearing users. It is also inaccessible to anyone watching without headphones, in a noisy environment, or learning in a second language.
The <track> element links a WebVTT file to the video. WebVTT (Web Video Text Tracks) is a plain text format that pairs timestamps with lines of text.
<video controls width="1280" height="720">
<source src="lesson.webm" type="video/webm">
<source src="lesson.mp4" type="video/mp4">
<track
kind="captions"
src="lesson-captions-en.vtt"
srclang="en"
label="English captions"
default
>
<track
kind="subtitles"
src="lesson-subtitles-fr.vtt"
srclang="fr"
label="French"
>
</video>
<track> attributes
kind sets the type of text track:
- captions: dialogue and non-speech sounds ("[door slams]", "[upbeat music]"). Designed for deaf and hard-of-hearing users. This is what most informational video needs.
- subtitles: dialogue only, intended as a translation. Designed for viewers who can hear but do not speak the language.
- descriptions: text descriptions of on-screen visual content, for blind and low-vision users. Intended to be read aloud by assistive technology, though browser support is limited.
- chapters: navigation points for jumping to sections of a longer video.
- metadata: machine-readable cues, not displayed to users.
src is the path to the .vtt file.
srclang is the language code for the track (BCP 47 format: en, fr, de, pt-BR).
label is the human-readable name shown in the browser's caption menu.
default activates this track automatically. Only one track per video should have default.
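The earlier example covers captions and subtitles. As a sketch of two of the remaining kind values, the markup below adds a chapters track and a metadata track. The .vtt file names for those two tracks are hypothetical, and whether a browser's built-in controls expose chapters at all varies, so treat this as markup to test rather than guaranteed behavior.

```html
<video controls width="1280" height="720">
  <source src="lesson.mp4" type="video/mp4">
  <!-- Captions remain the only track marked default. -->
  <track kind="captions" src="lesson-captions-en.vtt" srclang="en" label="English captions" default>
  <!-- Hypothetical chapter list: each cue names a section of the video. -->
  <track kind="chapters" src="lesson-chapters-en.vtt" srclang="en" label="Chapters">
  <!-- Hypothetical machine-readable cues for scripts; never shown to users. -->
  <track kind="metadata" src="lesson-cues.vtt">
</video>
```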
The WebVTT format
A WebVTT file is a plain text file. It starts with the header WEBVTT, followed by a blank line, then lists cues in timestamp order with a blank line between cues. Each cue has a start time, an end time, and the caption text.
WEBVTT

00:00:01.000 --> 00:00:04.500
Welcome to this lesson on HTML media elements.

00:00:05.000 --> 00:00:09.000
Today you will learn how to embed video, audio,
and text tracks into a web page.

00:00:09.500 --> 00:00:13.000
[Keyboard typing sounds]
Let's look at the video element first.

The time format is hours:minutes:seconds.milliseconds, and the hours component may be omitted. The text on the lines below each timestamp pair is the caption text. Square brackets mark non-speech sounds in captions.
The <audio> element
<audio> works identically to <video> in principle. It embeds audio with native playback controls.
<audio controls>
<source src="episode-12.ogg" type="audio/ogg">
<source src="episode-12.mp3" type="audio/mpeg">
<p>Your browser does not support audio. <a href="episode-12.mp3">Download the episode</a>.</p>
</audio>
The same attributes apply: controls, autoplay, muted, loop, preload. The controls attribute is just as important here as it is on <video>. An <audio> element without controls is invisible and non-interactive.
You do not set width and height on <audio>. The browser's audio player is a narrow inline control bar.
Common audio formats and their MIME types:
- MP3: type="audio/mpeg"
- OGG Vorbis: type="audio/ogg"
- WAV: type="audio/wav"
- AAC: type="audio/aac"
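A short sketch of how these MIME types pair with <source> elements. The file names are placeholders, and the order (compressed formats first, WAV last) is one reasonable choice rather than a rule.

```html
<audio controls preload="metadata">
  <!-- The browser uses the first source whose type it can play. -->
  <source src="chime.ogg" type="audio/ogg">
  <source src="chime.mp3" type="audio/mpeg">
  <!-- WAV files are typically uncompressed and large, so this one is listed last. -->
  <source src="chime.wav" type="audio/wav">
  <p>Your browser does not support audio. <a href="chime.mp3">Download the sound</a>.</p>
</audio>
```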
Audio accessibility: transcripts
Captions work for video because text can overlay the frames. For audio-only content, there is no visual track to overlay. The accessibility requirement for audio is a text transcript.
A transcript is a text document containing everything spoken in the audio. For a podcast or lecture, link to the transcript near the player:
<figure>
<audio controls>
<source src="episode-12.mp3" type="audio/mpeg">
<p>Your browser does not support audio. <a href="episode-12.mp3">Download the episode</a>.</p>
</audio>
<figcaption>
Episode 12: Building accessible forms.
<a href="episode-12-transcript.html">Read the transcript</a>.
</figcaption>
</figure>
If the audio contains important information -- a lecture, an interview, a course recording -- the transcript is not optional. Without it, the content is unavailable to deaf users and to anyone who prefers reading over listening.
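When a separate transcript page is not practical, one alternative is to put the transcript on the same page in a collapsible <details> element. This is a sketch of that option, not the approach the example above uses, and the transcript text and speaker labels are invented for illustration.

```html
<figure>
  <audio controls>
    <source src="episode-12.mp3" type="audio/mpeg">
    <p>Your browser does not support audio. <a href="episode-12.mp3">Download the episode</a>.</p>
  </audio>
  <figcaption>Episode 12: Building accessible forms.</figcaption>
</figure>
<!-- Collapsed by default; users open it with the keyboard or a pointer. -->
<details>
  <summary>Transcript</summary>
  <p><strong>Host:</strong> Welcome to episode 12. (Placeholder transcript text.)</p>
  <p><strong>Guest:</strong> Thanks for having me. (Placeholder transcript text.)</p>
</details>
```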
Common mistakes
No controls attribute. The media loads but cannot be interacted with. Users have no play button, no volume control, no way to seek. This also breaks keyboard accessibility entirely.
autoplay without muted. Browsers block unmuted autoplay to prevent pages from blasting audio at users without consent. If you need autoplay (background loop, demo video), combine it with muted.
No captions on informational video. If the video teaches something or contains spoken content users need, it requires captions. A <track kind="captions"> is not optional for accessibility-compliant video.
Wrong MIME type on <source>. If type names a format the browser cannot play, the browser skips that source even when the file itself is playable. If type is missing entirely, the browser has to start downloading the file to identify its format before it can move on to the next source. Always specify the correct MIME type.
No fallback content. Content inside <video> and <audio> is shown when the browser cannot play the media. A download link is the minimum. Without it, users on unsupported browsers see a blank space with no explanation.
No poster on video. Before the user presses play, a black box or a random first frame creates a poor first impression. Always set a poster image.
Putting <track> outside <video>. A <track> element is only valid as a direct child of <video> or <audio>. Outside those elements, it does nothing.
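Several of these mistakes are easiest to spot side by side. The sketch below shows a <track> placed outside its media element, then a corrected version that also has controls, a poster, a type on each source, and fallback content. The file names reuse the lesson examples from earlier sections.

```html
<!-- Avoid: no controls, no poster, no fallback, and the <track> sits outside <video>. -->
<video src="lesson.mp4"></video>
<track kind="captions" src="lesson-captions-en.vtt" srclang="en" label="English captions">

<!-- Prefer: the <track> is a direct child of <video>, alongside the sources. -->
<video controls width="1280" height="720" poster="https://placehold.co/1280x720">
  <source src="lesson.webm" type="video/webm">
  <source src="lesson.mp4" type="video/mp4">
  <track kind="captions" src="lesson-captions-en.vtt" srclang="en" label="English captions" default>
  <p>Your browser does not support video. <a href="lesson.mp4">Download the lesson</a>.</p>
</video>
```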
Exercise
Build the HTML structure for a video player and an audio player. The focus is the markup, not the media files. The video will show the poster image since no real video file exists, which correctly demonstrates what poster is for.
- Build a <video> element with:
  - Two <source> children (WebM first, MP4 second) with correct type attributes
  - A poster attribute using https://placehold.co/800x450 as the poster image
  - controls, width="800", and height="450"
  - A <track> for English captions with kind, src, srclang, label, and default
  - A fallback paragraph with a download link
- Build an <audio> element with:
  - Two <source> children (OGG first, MP3 second) with correct type attributes
  - controls
  - A fallback paragraph with a download link