$5 early access price

The uncompromising transcription tool for low-spec Apple Silicon.

Airakeet wraps NVIDIA's Parakeet ASR into a silky macOS menubar experience that unloads when idle, respects 8GB RAM ceilings, and keeps every word on-device.

0 MB model footprint unloaded when idle

0 × real-time factor on M2 MacBook Air

0 minute auto-off to return all RAM

Why it matters

Designed around the constraints of an 8GB M1 MacBook Air.

Most dictation apps idle at 2–3GB. Airakeet uses aggressive unloads, streaming buffers, and CoreML tuning to stay invisible until you need it.

Zero-overhead idle

The 800MB Parakeet model is evicted after five minutes of inactivity, returning memory to the OS automatically.
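A minimal sketch of how a five-minute idle eviction could be tracked. The `IdleUnloader` type, its method names, and the injected `unload` closure are assumptions for illustration, not Airakeet's actual code; the clock is passed in so the logic can be exercised without waiting.

```swift
import Foundation

/// Hypothetical helper: remembers when the model was last used and evicts it
/// once it has been idle for `idleLimit` seconds.
final class IdleUnloader {
    private let idleLimit: TimeInterval
    private let unload: () -> Void
    private var lastUsed: Date
    private(set) var isLoaded = false

    init(idleLimit: TimeInterval = 5 * 60,
         now: Date = Date(),
         unload: @escaping () -> Void) {
        self.idleLimit = idleLimit
        self.unload = unload
        self.lastUsed = now
    }

    /// Call whenever the model is loaded or used for a transcription.
    func markUsed(at now: Date = Date()) {
        isLoaded = true
        lastUsed = now
    }

    /// Call periodically (for example from a repeating timer).
    /// Runs the unload closure once the idle limit has elapsed and
    /// returns true if an eviction happened on this call.
    @discardableResult
    func evictIfIdle(at now: Date = Date()) -> Bool {
        guard isLoaded, now.timeIntervalSince(lastUsed) >= idleLimit else {
            return false
        }
        unload()
        isLoaded = false
        return true
    }
}
```

In an app, `markUsed()` would be called on each dictation and `evictIfIdle()` from a low-frequency timer, so the check itself costs almost nothing while idle.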

ANE-first execution

Inference runs exclusively on the Apple Neural Engine, keeping the CPU free and the laptop fanless during long sessions.

Waveform overlay

A translucent HUD shows a live waveform of your mic input, and its colors are fully customizable.

Clipboard injection

Dictation drops straight into the active text field via the clipboard and a simulated Cmd+V: no extra permissions, no network.

Configurable hotkeys

Supports standard shortcuts, Fn combos, and a dedicated Shift+Fn gesture for quick starts even on compact keyboards.

Audio cache & debug

Replay exactly what the engine heard via a safety cache so you can validate inputs without uploading sensitive data.

Engineering story

Systems thinking for a pure native build.

From CoreML conversion to macOS UX polish, Airakeet is a full-stack native build that spotlights low-level craftsmanship.

  • 🧠

    Model distillation

    Converted NVIDIA Parakeet TDT 0.6B to CoreML with quantization and ANE-friendly ops.

  • 🛡

    Security-first footprint

    Menubar-only surface, no analytics, and scoped macOS permissions keep the threat model tiny.

  • 🎛

    Hotkey architecture

    Custom event tap avoids global listeners to reduce CPU wakeups while preserving instant response.

  • 🧊

    Memory choreography

    Extract-and-clear buffers plus timed auto-unload keep RAM usage flat during long recordings.
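The extract-and-clear idea can be sketched as a buffer that hands over everything it has collected and keeps nothing behind. `SampleBuffer` and its methods are hypothetical names, and the `[Float]` sample type is an assumption.

```swift
import Foundation

/// Hypothetical sample buffer: accumulates microphone samples and hands them
/// to the caller in a single move, leaving the buffer empty.
struct SampleBuffer {
    private var samples: [Float] = []

    init() {}

    /// Number of samples currently held.
    var count: Int { samples.count }

    /// Appends one chunk of captured audio.
    mutating func append(_ chunk: [Float]) {
        samples.append(contentsOf: chunk)
    }

    /// Returns everything collected so far and resets the buffer.
    /// Swapping with an empty array transfers the storage instead of copying
    /// it, so no second copy of the audio lingers in the buffer.
    mutating func extractAndClear() -> [Float] {
        var out: [Float] = []
        swap(&out, &samples)
        return out
    }
}
```

Because each extraction empties the buffer, memory held by the recorder is bounded by the audio captured since the last extraction rather than by the length of the whole session.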

Results

Feels like hardware, not software.

Transcribes five seconds of speech in 0.11s, injects text instantly, and stays invisible until summoned. Perfect companion for essays, code reviews, or meeting notes.
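For reference, the speed-up implied by those two numbers is simple arithmetic. The function name below is illustrative; note that some ASR literature defines "real-time factor" as the inverse ratio (processing time divided by audio time).

```swift
import Foundation

/// Speed-up over real time: seconds of audio handled per second of processing.
func realTimeSpeedup(audioSeconds: Double, processingSeconds: Double) -> Double {
    precondition(processingSeconds > 0, "processing time must be positive")
    return audioSeconds / processingSeconds
}

// Five seconds of speech transcribed in 0.11 s.
let speedup = realTimeSpeedup(audioSeconds: 5.0, processingSeconds: 0.11)
print(String(format: "%.1fx real time", speedup))  // prints "45.5x real time"
```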

Swift + AppKit · CoreML · Metal Performance Shaders · AudioKit

Under the hood

Built on NVIDIA Parakeet, tuned for everyday workflows.

Parakeet is NVIDIA’s family of speech recognition models, and Airakeet runs a CoreML conversion of it. Think of it as a musician trained on billions of sentences who performs directly on your Mac instead of on a cloud stage.

What Parakeet brings

  • Accent friendly. Trained on noisy, multi-accent corpora so code-switching and filler words aren’t dropped.
  • ANE-ready tensors. Converted into CoreML operators that map cleanly to the Apple Neural Engine.
  • Streaming aware. Supports chunk-by-chunk inference without waiting for full clips.

Future engines

The 1.1B Parakeet-EOU build will unlock live dictation with punctuation, multilingual translation, and smarter “keep listening” behavior without adding cloud latency.

Parakeet TDT 0.6B · Parakeet EOU 1.1B · NVIDIA Canary

Parakeet vs Local Whisper (both running on-device)

| Attribute | Parakeet | Local Whisper |
| --- | --- | --- |
| Latency target | Optimized for low-latency partials so you see text mid-sentence. | Batch-first decoding introduces a pause before the first characters appear. |
| Hardware sweet spot | Runs comfortably on the Apple Neural Engine with 8GB RAM. | Prefers discrete GPU or 16GB+ unified memory to stay smooth. |
| Streaming feel | Designed for incremental injection with ANE offload. | Often buffers a full sentence before emitting, so text arrives in bursts. |

Forward-looking

Dual-engine roadmap for future Apple Silicon.

Work on the higher-capacity engine resumes when I upgrade to a 32GB MacBook Air so I can validate the 1.1B model end-to-end.

Phase 1 · Engine abstraction

Refactor `ASREngine` to load either 0.6B or 1.1B models on demand and prevent RAM collisions.
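One way this refactor could look, as a sketch only: `ASREngine` is named in the roadmap, but the `ModelVariant` enum, the `ModelLoading` protocol, and every method below are assumptions. The point is that the engine evicts whichever model is resident before loading the other, so the two never share RAM.

```swift
import Foundation

/// The two model sizes named in the roadmap.
enum ModelVariant: String {
    case tdt0_6B = "Parakeet TDT 0.6B"
    case eou1_1B = "Parakeet EOU 1.1B"
}

/// Hypothetical loader interface; real loading would go through CoreML.
protocol ModelLoading {
    func load(_ variant: ModelVariant)
    func unload(_ variant: ModelVariant)
}

/// Sketch of an engine that keeps at most one model resident at a time.
final class ASREngine {
    private let loader: ModelLoading
    private(set) var resident: ModelVariant?

    init(loader: ModelLoading) {
        self.loader = loader
    }

    /// Loads `variant` on demand. If the other model is resident it is
    /// unloaded first; if `variant` is already resident nothing happens.
    func activate(_ variant: ModelVariant) {
        if resident == variant { return }
        if let current = resident {
            loader.unload(current)
        }
        loader.load(variant)
        resident = variant
    }

    /// Releases whichever model is currently resident, if any.
    func unloadAll() {
        if let current = resident {
            loader.unload(current)
        }
        resident = nil
    }
}
```

Keeping the loader behind a protocol also means the switching logic can be checked with a stub that only records calls, without touching real model weights.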

Phase 2 · Live injection

Streaming text, word-by-word insertion, and silence detection using Parakeet-EOU for instant feedback.

Phase 3 · Hardware validation

Benchmark M5 hardware, monitor thermals, and explore NVIDIA Canary for multilingual translation.

Future: Streaming engine

Queued for my next hardware upgrade.

Rolls out once I’m on a new 32GB MacBook Air so the 1.1B model fits comfortably—nothing required on your end.

With this update you’ll be able to choose between today’s ultra-efficient 0.6B engine and a higher-capacity Parakeet EOU 1.1B build. That bigger model will enable true word-by-word streaming and EOU (End of Utterance) timing so Airakeet feels like it’s reading your mind.

  • Words appear as you speak: Streaming injection means paragraphs grow in the active text field without waiting for the clip to finish.
  • EOU = natural pauses: The engine listens for the tiny silence after each thought, then auto-stops recording with punctuation so you never overshoot a sentence.
  • Instant re-entry: If you keep talking, the future engine jumps back into capture without reloading gigabytes of weights.
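How Parakeet-EOU signals the end of an utterance isn't described here. As a rough stand-in for the "pause after each thought" behavior, the sketch below uses a plain energy threshold: it is not the model's EOU mechanism, and `PauseDetector`, its threshold, and its frame counts are all assumed values for illustration.

```swift
import Foundation

/// Hypothetical energy-based pause detector (a stand-in, not Parakeet-EOU).
struct PauseDetector {
    let silenceThreshold: Float   // RMS level below which a frame counts as silent
    let framesToStop: Int         // consecutive silent frames that end the utterance
    private var silentRun = 0
    private var heardSpeech = false

    init(silenceThreshold: Float = 0.01, framesToStop: Int = 3) {
        self.silenceThreshold = silenceThreshold
        self.framesToStop = framesToStop
    }

    /// Root-mean-square level of one audio frame.
    static func rms(_ frame: [Float]) -> Float {
        guard !frame.isEmpty else { return 0 }
        let sumSquares = frame.reduce(Float(0)) { $0 + $1 * $1 }
        return (sumSquares / Float(frame.count)).squareRoot()
    }

    /// Feeds one frame and returns true once speech has been followed by
    /// enough consecutive silent frames. Silence before any speech is ignored.
    mutating func process(_ frame: [Float]) -> Bool {
        if Self.rms(frame) < silenceThreshold {
            silentRun += 1
        } else {
            heardSpeech = true
            silentRun = 0
        }
        return heardSpeech && silentRun >= framesToStop
    }
}
```

A recorder would call `process` on each incoming frame and stop capture the first time it returns true; resuming speech before the limit simply resets the silent run.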

Ready when you are

Keep your voice on your device.

Airakeet keeps every syllable on your hardware and is available through a private early-access program. Secure the introductory $5 access (regular $10) while seats are open.