ai-sdk-elements/Components

Voice Components

Voice and audio components, including AudioPlayer, SpeechInput, Transcription, Persona, VoiceSelector, and MicSelector.

Tags: voice, audio, speech, transcription

AudioPlayer Component

A composable audio player component built on media-chrome with shadcn styling.

Features

  • Built on media-chrome for reliable audio playback
  • Fully composable architecture with granular control components
  • ButtonGroup integration for cohesive control layout
  • Individual control components (play, seek, volume)
  • CSS custom properties for deep theming
  • shadcn/ui Button component styling
  • Responsive design

<AudioPlayer />

Root component that wraps the media-chrome MediaController. Accepts all MediaController props except audio, which the component itself sets to true.

| Prop | Type | Description |
|------|------|-------------|
| style | CSSProperties | Custom CSS properties for media-chrome theming |
| ...props | Omit<React.ComponentProps<typeof MediaController>, "audio"> | Spread to MediaController |
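The style prop is where media-chrome's CSS custom properties can be set. Below is a minimal sketch of such a theme object. The variable names are media-chrome custom properties; the values and the names `MediaChromeVars` and `audioPlayerTheme` are illustrative assumptions, not part of the component.

```typescript
// Keys are media-chrome CSS custom properties; the values are placeholder choices.
type MediaChromeVars = Record<`--media-${string}`, string>;

const audioPlayerTheme: MediaChromeVars = {
  "--media-primary-color": "currentColor",
  "--media-secondary-color": "transparent",
  "--media-text-color": "currentColor",
  "--media-background-color": "transparent",
  "--media-control-hover-background": "rgba(0, 0, 0, 0.08)",
};
```

In a React file, an object like this would typically be passed as `style={audioPlayerTheme as React.CSSProperties}`, since CSSProperties does not list custom properties by default.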

<AudioPlayerElement />

The audio element that contains the media source.

| Prop | Type | Description |
|------|------|-------------|
| src | string | URL of the audio file (for remote audio) |
| data | SpeechResult["audio"] | AI SDK Speech Result audio data |
| ...props | Omit<React.ComponentProps<"audio">, "src"> | Spread to audio element |
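The two source props can be thought of as alternative ways to produce the audio element's src. The sketch below shows one plausible way to turn generated speech audio into a data URL. It assumes the audio object exposes `base64` and `mediaType` fields, and it gives `data` precedence over `src`; both points, along with the names `SpeechAudioLike` and `resolveAudioSrc`, are assumptions rather than the component's confirmed behavior.

```typescript
// Hypothetical shape: assumes the speech audio exposes base64 data and a media type.
interface SpeechAudioLike {
  base64: string;
  mediaType: string;
}

// Returns the value to use as the <audio> src: a data URL built from generated
// audio when `data` is present, otherwise the remote URL (if any).
function resolveAudioSrc(options: {
  src?: string;
  data?: SpeechAudioLike;
}): string | undefined {
  if (options.data) {
    return `data:${options.data.mediaType};base64,${options.data.base64}`;
  }
  return options.src;
}
```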

<AudioPlayerControlBar />

Container for control buttons.

| Prop | Type | Description |
|------|------|-------------|
| ...props | React.ComponentProps<typeof MediaControlBar> | Spread to MediaControlBar |

<AudioPlayerPlayButton />

Play/pause button.

| Prop | Type | Description |
|------|------|-------------|
| ...props | React.ComponentProps<typeof MediaPlayButton> | Spread to MediaPlayButton |

<AudioPlayerSeekBackwardButton />

Seek backward button.

| Prop | Type | Default | Description |
|------|------|---------|-------------|
| seekOffset | number | 10 | Seconds to skip backward |
| ...props | React.ComponentProps<typeof MediaSeekBackwardButton> | - | Spread to MediaSeekBackwardButton |

<AudioPlayerSeekForwardButton />

Seek forward button.

| Prop | Type | Default | Description |
|------|------|---------|-------------|
| seekOffset | number | 10 | Seconds to skip forward |
| ...props | React.ComponentProps<typeof MediaSeekForwardButton> | - | Spread to MediaSeekForwardButton |
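To illustrate what seekOffset means for both seek buttons, here is a small sketch of the arithmetic: the offset in seconds is added to or subtracted from the current time, and the result is kept within the track. media-chrome handles this internally; the function `seekTarget` is a hypothetical helper written only to show the semantics.

```typescript
// Computes the playback position after a seek button press.
// The result is clamped to the range [0, duration].
function seekTarget(
  currentTime: number,
  duration: number,
  direction: "backward" | "forward",
  seekOffset = 10 // matches the documented default
): number {
  const delta = direction === "forward" ? seekOffset : -seekOffset;
  return Math.min(Math.max(currentTime + delta, 0), duration);
}
```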

<AudioPlayerTimeDisplay />

Displays current playback time.

| Prop | Type | Description |
|------|------|-------------|
| ...props | React.ComponentProps<typeof MediaTimeDisplay> | Spread to MediaTimeDisplay |

<AudioPlayerTimeRange />

Seek slider for controlling playback position.

| Prop | Type | Description |
|------|------|-------------|
| ...props | React.ComponentProps<typeof MediaTimeRange> | Spread to MediaTimeRange |

<AudioPlayerDurationDisplay />

Displays total duration.

| Prop | Type | Description |
|------|------|-------------|
| ...props | React.ComponentProps<typeof MediaDurationDisplay> | Spread to MediaDurationDisplay |

<AudioPlayerMuteButton />

Mute/unmute button.

| Prop | Type | Description |
|------|------|-------------|
| ...props | React.ComponentProps<typeof MediaMuteButton> | Spread to MediaMuteButton |

<AudioPlayerVolumeRange />

Volume slider control.

| Prop | Type | Description |
|------|------|-------------|
| ...props | React.ComponentProps<typeof MediaVolumeRange> | Spread to MediaVolumeRange |

SpeechInput Component

A button component that captures voice input and converts it to text.

Features

  • Built on Web Speech API with MediaRecorder fallback
  • Cross-browser support (Chrome, Edge, Firefox, Safari)
  • Continuous speech recognition with interim results
  • Visual feedback with pulse animation when listening
  • Loading state during transcription processing

Browser Support

| Browser | API Used | Requirements |
|---------|----------|--------------|
| Chrome | Web Speech API | None |
| Edge | Web Speech API | None |
| Firefox | MediaRecorder | onAudioRecorded prop |
| Safari | MediaRecorder | onAudioRecorded prop |
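The choice between the two APIs can be pictured as a feature test. The sketch below is one way to express it, not the component's actual detection code: the names `BrowserSpeechGlobals`, `SpeechInputMode`, and `detectSpeechInputMode` are hypothetical, and the real component may apply further per-browser checks to arrive at the table above.

```typescript
type SpeechInputMode = "speech-recognition" | "media-recorder" | "none";

// Minimal view of the browser globals this sketch inspects.
interface BrowserSpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
  MediaRecorder?: unknown;
}

// Prefers the Web Speech API; falls back to MediaRecorder only when a
// transcription handler (onAudioRecorded) has been supplied.
function detectSpeechInputMode(
  globals: BrowserSpeechGlobals,
  hasOnAudioRecorded: boolean
): SpeechInputMode {
  if (globals.SpeechRecognition || globals.webkitSpeechRecognition) {
    return "speech-recognition";
  }
  if (globals.MediaRecorder && hasOnAudioRecorded) {
    return "media-recorder";
  }
  return "none";
}
```

In a browser this could be called with the window object cast to `BrowserSpeechGlobals`.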

<SpeechInput />

| Prop | Type | Default | Description |
|------|------|---------|-------------|
| onTranscriptionChange | (text: string) => void | - | Final transcription callback |
| onAudioRecorded | (audioBlob: Blob) => Promise<string> | - | MediaRecorder fallback |
| lang | string | "en-US" | Speech recognition language |
| ...props | React.ComponentProps<typeof Button> | - | Spread to Button |

Visual States

  • Default: Standard button with microphone icon
  • Listening: Pulsing animation with accent colors
  • Processing: Loading spinner while waiting for transcription
  • Disabled: Button is disabled when no supported API is available
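The four states listed above can be read as a simple priority order. The following sketch maps a few boolean flags to one state; the flag names and the function `speechInputVisualState` are hypothetical and only illustrate the precedence, not the component's internal state handling.

```typescript
type SpeechInputVisualState = "default" | "listening" | "processing" | "disabled";

// Resolves the visual state from three flags, checked in priority order:
// no API available, then transcription in progress, then actively listening.
function speechInputVisualState(flags: {
  apiAvailable: boolean;
  isListening: boolean;
  isProcessing: boolean;
}): SpeechInputVisualState {
  if (!flags.apiAvailable) return "disabled";
  if (flags.isProcessing) return "processing";
  if (flags.isListening) return "listening";
  return "default";
}
```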

Usage with MediaRecorder Fallback

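import { SpeechInput } from "@/components/ai-elements/speech-input"; // assumed install path

// Uploads the recorded audio to a transcription endpoint and resolves with the text.
// Caution: process.env.OPENAI_API_KEY is normally not available in browser code and
// should not be exposed to the client; in production, post the blob to your own
// server route and call the transcription API from there.
const handleAudioRecorded = async (audioBlob: Blob): Promise<string> => {
  const formData = new FormData();
  formData.append("file", audioBlob, "audio.webm");
  formData.append("model", "whisper-1");

  const response = await fetch(
    "https://api.openai.com/v1/audio/transcriptions",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: formData,
    }
  );

  // Surface HTTP errors instead of silently returning undefined text.
  if (!response.ok) {
    throw new Error(`Transcription request failed with status ${response.status}`);
  }

  const data = await response.json();
  return data.text;
};

<SpeechInput
  onTranscriptionChange={(text) => console.log(text)}
  onAudioRecorded={handleAudioRecorded}
/>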

Requirements

  • Requires secure context (HTTPS or localhost)
  • Browser may prompt for microphone permission
  • Only final transcripts trigger onTranscriptionChange
  • onAudioRecorded required for Firefox/Safari support
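The rule that only final transcripts reach onTranscriptionChange can be sketched against the shape of a Web Speech API result event, where each result carries an isFinal flag. The interfaces below are trimmed stand-ins for the browser types, and `collectFinalTranscript` and `emitFinalTranscript` are hypothetical helpers, not the component's own code.

```typescript
// Trimmed stand-ins for the Web Speech API result types.
interface RecognitionAlternativeLike {
  transcript: string;
}
interface RecognitionResultLike {
  isFinal: boolean;
  0: RecognitionAlternativeLike; // top alternative
}
interface RecognitionEventLike {
  resultIndex: number; // index of the first result that changed in this event
  results: ArrayLike<RecognitionResultLike>;
}

// Concatenates the finalized results that changed in this event, skipping interim ones.
function collectFinalTranscript(event: RecognitionEventLike): string {
  let finalText = "";
  for (const result of Array.from(event.results).slice(event.resultIndex)) {
    if (result.isFinal) {
      finalText += result[0].transcript;
    }
  }
  return finalText;
}

// Invokes the callback only when the event contains finalized text.
function emitFinalTranscript(
  event: RecognitionEventLike,
  onTranscriptionChange: (text: string) => void
): void {
  const text = collectFinalTranscript(event);
  if (text) {
    onTranscriptionChange(text);
  }
}
```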

Transcription Component

Displays transcription results with speaker labels.

Features

  • Speaker label display
  • Timestamp support
  • MessagePart component for displaying transcription segments
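As a rough illustration of speaker labels and timestamps on a transcription segment, the sketch below formats one segment as a line of text. The segment shape (`speaker`, `startSecond`, `text`) and the helper names are assumptions made for this example; the component's actual props are not documented in this section.

```typescript
// Hypothetical segment shape used only for this example.
interface TranscriptSegmentLike {
  speaker?: string;
  startSecond: number;
  text: string;
}

// Formats a number of seconds as m:ss, ignoring fractions and negative values.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}

// Builds a display line with the timestamp, an optional speaker label, and the text.
function formatSegment(segment: TranscriptSegmentLike): string {
  const label = segment.speaker ? `${segment.speaker}: ` : "";
  return `[${formatTimestamp(segment.startSecond)}] ${label}${segment.text}`;
}
```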

Persona Component

Voice persona selection component.

VoiceSelector Component

Selector for choosing voice options.

MicSelector Component

Microphone selection component.