Video accessibility is the practice of designing and producing video content so that people with disabilities can perceive, operate, and understand it. The legal frameworks (WCAG, Section 508, ADA, EN 301 549, AODA) define minimum requirements; thoughtful production goes further.
What is video accessibility?
A video is accessible when:
- A deaf or hard-of-hearing viewer can get the same content as a hearing viewer (captions or transcript).
- A blind or low-vision viewer can get the same content as a sighted viewer (audio descriptions for visual-only information; well-labeled controls for the player).
- A motor-impaired viewer can operate the player without a mouse (keyboard navigation; large enough hit targets).
- A photosensitive viewer is not put at risk (no flashing content above the seizure threshold).
- A viewer with cognitive differences can follow along (clear pacing, no autoplay without user choice, pause and stop controls).
Video accessibility matters legally (lawsuits and procurement requirements), ethically (you do not need a reason to include people), and practically (captions help everyone, not just deaf and hard-of-hearing viewers).
Core requirements
| Requirement | Who benefits | What it looks like |
|---|---|---|
| Captions | Deaf, hard-of-hearing, sound-off viewers, non-native speakers | Text track synced with dialogue, including non-speech audio cues |
| Audio descriptions | Blind, low-vision viewers | Spoken narration of important visual elements |
| Keyboard navigation | Motor-impaired, screen reader users | Every player control reachable and operable from the keyboard |
| Screen reader support | Blind, low-vision viewers | Player labeled as a media element; state announced; controls labeled |
| Sufficient color contrast | Low-vision viewers | Text and controls meet WCAG AA contrast ratios |
| No flashing content | Photosensitive viewers | Nothing flashes more than three times in any one-second period |
| Pause/stop controls | Cognitive, vestibular, photosensitive viewers | Auto-playing video must be pausable or stoppable |
Standards and laws
| Standard | Where it applies | Common level |
|---|---|---|
| WCAG 2.1 / 2.2 | International web standard | AA most common, AAA for some contexts |
| Section 508 | US federal procurement | Incorporates WCAG 2.0 AA (since the 2017 refresh) |
| ADA Title III | US private sector public accommodations | Often interpreted via WCAG 2.1 AA |
| EN 301 549 | European public sector procurement | Aligned with WCAG 2.1 AA |
| AODA | Ontario, Canada | WCAG 2.0 AA |
Each framework defines what conformant means, who must conform, and what proof of conformance looks like. Most organizations target WCAG 2.1 or 2.2 at Level AA as the baseline because that single target also satisfies most of the others.
Captions vs subtitles vs transcripts
These three are often confused. They are not interchangeable.
- Captions: Text version of dialogue and significant non-speech audio (laughter, music, door slams), timed to the video. Designed for viewers who cannot hear the audio. Required by WCAG 1.2.2 (Level A) for prerecorded video with audio.
- Subtitles: Text translation of dialogue into another language. They assume the viewer can hear, so they omit non-speech audio cues. Not a replacement for captions.
- Transcripts: Full text equivalent of the video, separate from the timed track. Useful for search, reading, and as an alternative for users who cannot or prefer not to play the video. Under WCAG 1.2.1, a text alternative of this kind is one way to make prerecorded video with no audio accessible (the other is an audio track).
Most accessible video pages provide both captions and a transcript. They serve different needs.
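To make the idea of a timed caption track concrete, here is a minimal Python sketch that writes a small WebVTT file, the text-track format browsers accept for HTML video. The `Cue` type, the `to_webvtt` helper, and the sample lines are hypothetical; the point is only that non-speech cues such as `[door slams]` live in the same timed track as the dialogue.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # dialogue, or a bracketed non-speech cue


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Serialize cues as WebVTT text, with a blank line between blocks."""
    blocks = ["WEBVTT"]
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


captions = to_webvtt([
    Cue(0.0, 2.5, "Welcome back to the workshop."),
    Cue(2.5, 4.0, "[door slams]"),
    Cue(4.0, 7.25, "Sorry I'm late!"),
])
print(captions)
```

On a web page, a file like this would typically be attached to the video with a `<track kind="captions">` element.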
Audio descriptions
Audio descriptions are spoken narration of visual information that the existing audio does not cover. WCAG 1.2.5 requires them at Level AA for prerecorded video when important visual content is not conveyed by the existing audio.
There are three production approaches:
- Integrated narration: write the script so the narrator describes visuals naturally. Cheapest; works for tutorials and explainers.
- Recorded second track: an editor records description audio and slots it into the gaps between dialogue. Standard approach for produced video.
- Extended audio descriptions: pause the video to allow longer descriptions where dialogue leaves no room. Covered by WCAG 1.2.7 (Level AAA); supported by few players.
No current tool auto-generates audio descriptions of sufficient quality. Plan for the production cost up front.
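For the recorded-second-track approach, an early planning step is finding where the dialogue leaves room for description. The Python sketch below assumes dialogue timings are already available as (start, end) pairs in seconds, for example taken from the caption track. The `find_description_gaps` name and the two-second minimum are illustrative assumptions, not part of any standard.

```python
def find_description_gaps(dialogue, duration, min_gap=2.0):
    """Return (start, end) silences lasting at least min_gap seconds.

    dialogue: iterable of (start, end) times in seconds, in any order.
    duration: total length of the video in seconds.
    """
    gaps = []
    cursor = 0.0  # end of the latest dialogue seen so far
    for start, end in sorted(dialogue):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # tolerate overlapping speech
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))
    return gaps


dialogue = [(1.0, 4.0), (4.5, 9.0), (14.0, 20.0)]
gaps = find_description_gaps(dialogue, duration=25.0)
print(gaps)  # [(9.0, 14.0), (20.0, 25.0)]
```

Scenes that need description but fall outside every gap are the candidates for a script change or for extended audio descriptions.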
Keyboard navigation
A keyboard user expects:
- Tab to move focus through player controls in a sensible order.
- Space or Enter on the focused control to activate it (play/pause is the most common).
- Left and right arrows to seek backward and forward.
- Up and down arrows to adjust volume.
- F for fullscreen, C for captions, M for mute (common but not required).
- Visible focus indicators on every focusable element.
If any control cannot be reached and operated from the keyboard, the player fails WCAG 2.1.1 (Keyboard). If a focused control has no visible indicator, it fails WCAG 2.4.7 (Focus Visible).
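The bindings above can be written down as a small, testable model. The Python sketch below is not a player implementation; it only models player state so the expected behavior can be checked in isolation. `PlayerState`, `handle_key`, and the 5-second seek and 10% volume steps are assumptions chosen for illustration, and the key names follow the values of the DOM `KeyboardEvent.key` property.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PlayerState:
    playing: bool = False
    position: float = 0.0   # seconds
    duration: float = 60.0  # seconds
    volume: float = 1.0     # 0.0 to 1.0
    muted: bool = False
    captions: bool = False
    fullscreen: bool = False


SEEK_STEP = 5.0    # seconds per arrow press (assumed)
VOLUME_STEP = 0.1  # volume change per arrow press (assumed)


def clamp(value, low, high):
    return max(low, min(high, value))


def handle_key(state: PlayerState, key: str) -> PlayerState:
    """Return the state after a key press; unknown keys change nothing."""
    if key in (" ", "Enter"):  # assumes the play/pause button has focus
        return replace(state, playing=not state.playing)
    if key == "ArrowRight":
        return replace(state, position=clamp(state.position + SEEK_STEP, 0.0, state.duration))
    if key == "ArrowLeft":
        return replace(state, position=clamp(state.position - SEEK_STEP, 0.0, state.duration))
    if key == "ArrowUp":
        return replace(state, volume=round(clamp(state.volume + VOLUME_STEP, 0.0, 1.0), 2))
    if key == "ArrowDown":
        return replace(state, volume=round(clamp(state.volume - VOLUME_STEP, 0.0, 1.0), 2))
    if key.lower() == "f":
        return replace(state, fullscreen=not state.fullscreen)
    if key.lower() == "c":
        return replace(state, captions=not state.captions)
    if key.lower() == "m":
        return replace(state, muted=not state.muted)
    return state


state = PlayerState()
for key in [" ", "ArrowRight", "ArrowDown", "c"]:
    state = handle_key(state, key)
print(state)
```

A model like this can serve as the expected-behavior side of an automated keyboard test, with the real player driven by a browser automation tool.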
Screen reader expectations
A screen reader user expects:
- The player labeled as a media element with the video title.
- Play state announced when it changes ("Playing", "Paused").
- Time announced when the user seeks.
- The captions toggle announced as a button with its current state.
- The volume control announced as a slider with its current value.
Test with a real screen reader (VoiceOver on macOS, NVDA on Windows). Automated tools catch some issues; manual testing catches the rest.
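In a web player, one common way to meet these expectations is with WAI-ARIA attributes. The Python sketch below only computes the attribute sets a server-side template might render; the helper names and label wording are hypothetical, while `role`, `aria-label`, `aria-pressed`, and the `aria-value*` attributes are standard ARIA.

```python
def captions_toggle_attrs(enabled: bool) -> dict[str, str]:
    """Attributes for a captions toggle announced as a button with state."""
    return {
        "role": "button",
        "aria-label": "Captions",
        "aria-pressed": "true" if enabled else "false",
    }


def volume_slider_attrs(volume: float) -> dict[str, str]:
    """Attributes for a volume slider; volume is given as 0.0 to 1.0."""
    percent = round(max(0.0, min(1.0, volume)) * 100)
    return {
        "role": "slider",
        "aria-label": "Volume",
        "aria-valuemin": "0",
        "aria-valuemax": "100",
        "aria-valuenow": str(percent),
        "aria-valuetext": f"{percent} percent",
    }


def render_attrs(attrs: dict[str, str]) -> str:
    """Render an attribute dict as an HTML attribute string."""
    return " ".join(f'{name}="{value}"' for name, value in attrs.items())


print(render_attrs(captions_toggle_attrs(True)))
print(render_attrs(volume_slider_attrs(0.8)))
```

Native `<button>` and `<input type="range">` elements already expose these roles, so explicit `role` attributes mainly matter for custom-built controls.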
Compliance and procurement
Buyers in regulated industries (US federal, US healthcare, US higher education, EU public sector) often ask for two documents before they buy:
- VPAT (Voluntary Product Accessibility Template): a standardized template for reporting how the product addresses each applicable accessibility criterion, including the WCAG success criteria.
- ACR (Accessibility Conformance Report): the completed VPAT with evidence and known gaps.
Producing a VPAT honestly takes 4-8 weeks of work, including manual testing with assistive technology. Specialized firms such as accessible.org prepare VPATs and ACRs for vendors who want a credible third-party document.
Testing your video player
A short self-test before publishing any video page:
- Tab through every control. Can you reach all of them? Does each have a visible focus ring?
- Use a screen reader. Open the page with VoiceOver or NVDA. Is the player announced? Are the controls labeled?
- Toggle captions. Do they appear correctly? Are non-speech cues included?
- Watch with audio off. Can you understand the video with captions alone? Is any meaningful sound missing from the captions?
- Watch with the screen turned away. Can you understand the video with audio alone? Does anything visual need audio description?
- Check color contrast. Run an automated tool or use the WCAG ratios manually. Body text should meet 4.5:1; large text should meet 3:1.
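For the manual contrast check, the ratio can be computed directly from the WCAG 2.x definitions of relative luminance and contrast ratio. The Python sketch below assumes colors are given as `#rrggbb` strings; the function names are illustrative, and `meets_aa` encodes only the two text thresholds named in the checklist.

```python
def linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color written as '#rrggbb'."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """Check the AA text thresholds: 4.5:1 for body text, 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


print(round(contrast_ratio("#767676", "#ffffff"), 2))
print(meets_aa("#777777", "#ffffff"))
print(meets_aa("#777777", "#ffffff", large_text=True))
```

The gray `#767676` on white passes for body text while the slightly lighter `#777777` does not, which is why measuring beats judging contrast by eye.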
VideoPlayer.ai accessibility
VideoPlayer.ai's player supports keyboard navigation, screen reader announcements, captions, and visible focus indicators on the public surfaces and the player. Audio descriptions are not auto-generated; producers supply them as a separate track or use integrated narration. WCAG conformance details, with current status and known gaps, live at /docs/accessibility.
Related
- VideoPlayer.ai accessibility documentation
- accessible.org for VPAT and accessibility consulting