Voice Mode

Voice mode lets you speak tasks instead of typing them.

Prerequisites

Install the voice extras:

pip install "nimbus-ai[voice]"

This installs:

  • sounddevice — microphone capture
  • numpy — audio processing
  • scipy — signal utilities
  • openai-whisper — local speech transcription (base model, ~140MB, downloaded on first use)

No audio is sent to external servers. Whisper runs entirely locally.
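To confirm that the extras installed correctly, a quick check such as the following can help. This is a minimal sketch, not part of Nimbus: the helper name is made up for illustration, and the module list assumes the usual import names (the openai-whisper package is imported as "whisper").

```python
import importlib.util

# Import names for the packages listed above. The openai-whisper
# distribution is imported as "whisper".
VOICE_MODULES = ["sounddevice", "numpy", "scipy", "whisper"]


def missing_voice_modules(modules=VOICE_MODULES):
    """Return the module names that cannot be found in this environment.

    find_spec only locates a top-level module; it does not import it, so
    this check does not touch the microphone or load any model.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_voice_modules()
    if missing:
        print("Missing voice dependencies:", ", ".join(missing))
    else:
        print("All voice dependencies found.")
```

An empty result only means the packages are importable. It does not verify microphone access.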

Launch

nimbus --voice

The --voice flag starts the REPL with voice mode enabled.

Recording

When you press Enter at the prompt, Nimbus starts recording:

  ●  recording... speak your task (press Enter to stop)

Recording stops when you press Enter again, or after 15 seconds of silence.
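The silence timeout can be pictured as a timer that resets whenever speech is detected. The sketch below illustrates that idea only; it is not Nimbus's implementation. The RMS threshold, the chunk duration, and the function names are assumptions. Only the 15-second limit comes from the text above.

```python
import math

SILENCE_TIMEOUT_S = 15.0  # from the documentation above
RMS_THRESHOLD = 0.01      # assumed: amplitude below this counts as silence
CHUNK_SECONDS = 0.5       # assumed: duration of each audio chunk


def rms(samples):
    """Root-mean-square amplitude of a chunk of float samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def chunks_until_stop(chunks, chunk_seconds=CHUNK_SECONDS,
                      timeout_s=SILENCE_TIMEOUT_S, threshold=RMS_THRESHOLD):
    """Return how many chunks are consumed before the silence timeout fires.

    Returns None if the stream ends before the timeout is reached.
    """
    silent_for = 0.0
    for index, chunk in enumerate(chunks, start=1):
        if rms(chunk) < threshold:
            silent_for += chunk_seconds
        else:
            silent_for = 0.0  # any speech resets the silence timer
        if silent_for >= timeout_s:
            return index
    return None
```

With 0.5-second chunks, the timer fires on the 30th consecutive quiet chunk. A single loud chunk in between starts the count over.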

Transcription

Nimbus transcribes your audio using the Whisper base model and displays the result:

  heard: add input validation to the user registration endpoint

You can edit the transcribed text before approving it. Once the transcription is approved, Nimbus proceeds with the task.
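For reference, local transcription with the openai-whisper package looks roughly like this. It is a sketch under assumptions, not Nimbus's own code: the helper names and the file name "task.wav" are hypothetical. The calls whisper.load_model("base") and model.transcribe(path) are part of the openai-whisper API. Transcribing from a file path also requires ffmpeg to be installed.

```python
def load_base_model():
    """Load the Whisper base model (downloaded on first use)."""
    import whisper  # provided by the openai-whisper package

    return whisper.load_model("base")


def transcribe_file(model, audio_path):
    """Transcribe an audio file and return the text without surrounding spaces."""
    result = model.transcribe(audio_path)  # returns a dict with a "text" key
    return result["text"].strip()


def format_heard(text):
    """Format a transcript in the style of the prompt shown above."""
    return f"  heard: {text}"


# Example usage (hypothetical file name, requires openai-whisper and ffmpeg):
#   model = load_base_model()
#   print(format_heard(transcribe_file(model, "task.wav")))
```

Taking the model as a parameter means it is loaded once and reused across recordings rather than reloaded for every task.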

Keyword detection

During task execution, Nimbus listens for spoken approval keywords:

  Spoken word            Action
  "yes" / "approve"      Approve the plan or diff
  "reject" / "cancel"    Reject the plan or diff
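The keyword mapping above can be sketched as a small lookup over the transcribed words. This illustrates the idea and is not Nimbus's actual matcher: the function name, the punctuation handling, and the rule that rejection wins when both kinds of keyword appear are all assumptions.

```python
APPROVE_WORDS = {"yes", "approve"}
REJECT_WORDS = {"reject", "cancel"}


def classify_keyword(transcript):
    """Map a short spoken phrase to "approve", "reject", or None."""
    # Lowercase each word and drop punctuation a transcriber may attach.
    words = [w.strip(".,!?\"'").lower() for w in transcript.split()]
    if any(w in REJECT_WORDS for w in words):
        return "reject"  # assumed: rejection wins if both kinds appear
    if any(w in APPROVE_WORDS for w in words):
        return "approve"
    return None
```

Preferring rejection on ambiguous input is the conservative choice, since an accidental approval is harder to undo than an accidental rejection.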

Limitations

  • Requires microphone access. On macOS, grant permission when prompted by the system.
  • The base model is optimized for English. Other languages have reduced accuracy.
  • Voice mode does not work in SSH sessions without audio forwarding.
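A script that wraps Nimbus could guard against the SSH limitation with a heuristic like the one below. This is a sketch, and nothing here states that Nimbus performs this check itself. The function name is hypothetical. The environment variables are the ones OpenSSH sets for remote sessions.

```python
import os


def looks_like_ssh_session(env=None):
    """Heuristic: OpenSSH sets these variables in remote sessions."""
    env = os.environ if env is None else env
    return any(env.get(name) for name in ("SSH_CONNECTION", "SSH_CLIENT", "SSH_TTY"))


if __name__ == "__main__":
    if looks_like_ssh_session():
        print("SSH session detected: voice mode needs audio forwarding.")
```

The check cannot tell whether audio forwarding is configured, so it is better suited to a warning than a hard failure.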