stt

2025-07-22 22:02:48 +08:00
parent 0726aa60ed
commit dcb3f9d368
5 changed files with 775 additions and 9 deletions
@@ -170,6 +170,221 @@ ocr-screenshot --screenshot-method interactive --save-image
 ocr-screenshot --screenshot-method mss --monitor-number 2
 ```
 ## Speech-to-Text (STT) Tool
 A real-time speech-to-text tool using RealtimeSTT with wake word activation. Features the "jarvis" wake word by default and supports live transcription with various output options.
 ### Features
 - 🎙️ **Real-time transcription** - Live speech-to-text conversion
 - 🎯 **Wake word activation** - Multiple wake words including "jarvis"
 - ⚡ **GPU acceleration** - CUDA support for faster processing
 - 🔄 **Live display** - Real-time transcription preview
 - 💾 **File output** - Save transcriptions to text files
 - 🎛️ **Multiple models** - Choose from tiny to large Whisper models
 - 🌍 **Multi-language** - Support for multiple languages
 - 🧪 **Test mode** - Test functionality without wake words
 ### Installation
 The STT dependencies are included in the base installation:
 ```bash
 pip install .
 ```
 For optimal performance with GPU acceleration:
 ```bash
 # For CUDA 11.8
 pip install torch==2.5.1+cu118 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118
 # For CUDA 12.X
 pip install torch==2.5.1+cu121 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
 ```
 ### Usage
 #### Basic Commands
 Start STT with jarvis wake word:
 ```bash
 tooling stt listen
 ```
 Test STT without wake words:
 ```bash
 tooling stt test
 ```
 Show system information:
 ```bash
 tooling stt info
 ```
 #### Wake Word Options
 Use different wake words:
 ```bash
 # Use alexa wake word
 tooling stt listen --wake-word alexa
 # Use hey google wake word  
 tooling stt listen --wake-word "hey google"
 # Use computer wake word
 tooling stt listen --wake-word computer
 ```
 #### Model Selection
 Choose different Whisper models for speed vs accuracy:
 ```bash
 # Fastest (tiny model)
 tooling stt listen --model tiny
 # Balanced (base model, default)
 tooling stt listen --model base
 # Best accuracy (large model)
 tooling stt listen --model large-v2
 ```
 #### Advanced Features
 Save transcriptions to file:
 ```bash
 tooling stt listen --save-to-file transcripts.txt
 ```
 Disable real-time display for better performance:
 ```bash
 tooling stt listen --no-realtime
 ```
 Set custom sensitivity and language:
 ```bash
 tooling stt listen --sensitivity 0.8 --language en --verbose
 ```
 Force CPU usage:
 ```bash
 tooling stt listen --device cpu
 ```
 ### Available Wake Words
 The following wake words are supported:
 - **jarvis** (default)
 - alexa
 - americano  
 - blueberry
 - bumblebee
 - computer
 - grapefruits
 - grasshopper
 - hey google
 - hey siri
 - ok google
 - picovoice
 - porcupine
 - terminator
 ### Available Models
 | Model | Speed | Accuracy | Memory | Use Case |
 |-------|-------|----------|--------|----------|
 | **tiny** | ⚡⚡⚡ | ⭐⭐ | 39MB | Testing, low-power devices |
 | **base** | ⚡⚡ | ⭐⭐⭐ | 74MB | Balanced (default) |
 | **small** | ⚡ | ⭐⭐⭐⭐ | 244MB | Better accuracy |
 | **medium** | 🐌 | ⭐⭐⭐⭐⭐ | 769MB | High accuracy |
 | **large-v2** | 🐌🐌 | ⭐⭐⭐⭐⭐ | 1550MB | Best accuracy |
 ### Command Line Options
 ```bash
 tooling stt listen [OPTIONS]
 Options:
  --wake-word TEXT        Wake word to activate recording [default: jarvis]
  --model TEXT           Whisper model (tiny, base, small, medium, large-v2) [default: base]
  --language TEXT        Language code for transcription (empty for auto-detection)
  --realtime/--no-realtime    Enable real-time transcription display [default: realtime]
  --save-to-file PATH    Save transcriptions to a file
  --sensitivity FLOAT    Wake word sensitivity (0.0 to 1.0) [default: 0.6]
  --device TEXT          Device to use (auto, cuda, cpu) [default: auto]
  --verbose              Show verbose output and configuration
  --help                 Show this message and exit
 ```
 ### Examples
 **Basic usage with jarvis:**
 ```bash
 tooling stt listen
 ```
 **Fast transcription with tiny model:**
 ```bash
 tooling stt listen --model tiny --wake-word computer
 ```
 **High accuracy with file output:**
 ```bash
 tooling stt listen --model large-v2 --save-to-file meeting_notes.txt --verbose
 ```
 **Quick test without wake words:**
 ```bash
 tooling stt test --duration 5 --model tiny
 ```
 **Custom language and sensitivity:**
 ```bash
 tooling stt listen --language es --sensitivity 0.8 --wake-word "hey google"
 ```
 ### How it Works
 1. **Initialization**: Loads the selected Whisper model and sets up audio processing
 2. **Wake Word Detection**: Listens for the specified wake word using Porcupine or OpenWakeWord
 3. **Voice Activity Detection**: Uses WebRTC VAD and Silero VAD for accurate speech detection  
 4. **Real-time Transcription**: Processes audio chunks in real-time (optional)
 5. **Final Transcription**: Generates high-quality final transcription when speech ends
 6. **Output**: Displays results and optionally saves to file
 ### Performance Tips
 - **GPU**: Use CUDA for 3-5x faster transcription
 - **Model**: Use `tiny` or `base` for real-time applications
 - **Sensitivity**: Adjust wake word sensitivity based on environment noise
 - **Device**: Set `--device cpu` if experiencing GPU memory issues
 - **Real-time**: Disable `--no-realtime` for better final transcription performance
 ### Troubleshooting
 **No microphone detected:**
 ```bash
 # Check audio devices
 tooling stt info
 ```
 **CUDA not available:**
 ```bash
 # Install CUDA-enabled PyTorch
 pip install torch==2.5.1+cu121 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
 ```
 **Wake word not detected:**
 ```bash
 # Increase sensitivity
 tooling stt listen --sensitivity 0.8 --verbose
 ```
 **Poor transcription quality:**
 ```bash
 # Use larger model
 tooling stt listen --model large-v2
 ```
 ## Development Guide
 ### How to Add New Packages
@@ -30,6 +30,7 @@ screenshot-all = [
 [project.scripts]
 ocr-screenshot = "tooling.cli:cli_main"
 tooling = "tooling.cli:cli_main"
 [build-system]
 requires = ["hatchling"]
@@ -2,34 +2,63 @@
 #    uv pip compile pyproject.toml -o requirements.txt
 anyascii==0.3.3
    # via python-doctr
 av==15.0.0
    # via faster-whisper
 certifi==2025.7.14
    # via requests
 cffi==1.17.1
    # via soundfile
 charset-normalizer==3.4.2
    # via requests
 click==8.2.1
    # via typer
 colorama==0.4.6
    # via
    #   halo
    #   log-symbols
 coloredlogs==15.0.1
    # via onnxruntime
 ctranslate2==4.6.0
    # via faster-whisper
 defusedxml==0.7.1
    # via python-doctr
 enum34==1.1.10
    # via pvporcupine
 faster-whisper==1.1.1
    # via realtimestt
 filelock==3.18.0
    # via
    #   huggingface-hub
    #   torch
 flatbuffers==25.2.10
    # via onnxruntime
 fsspec==2025.7.0
    # via
    #   huggingface-hub
    #   torch
 h5py==3.14.0
    # via python-doctr
 halo==0.0.31
    # via realtimestt
 hf-xet==1.1.5
    # via huggingface-hub
 huggingface-hub==0.33.4
-    # via python-doctr
+    # via
    #   faster-whisper
    #   python-doctr
    #   tokenizers
 humanfriendly==10.0
    # via coloredlogs
 idna==3.10
    # via requests
 jinja2==3.1.6
    # via torch
 joblib==1.5.1
    # via scikit-learn
 langdetect==1.0.9
    # via python-doctr
 log-symbols==0.0.14
    # via halo
 markdown-it-py==3.0.0
    # via rich
 markupsafe==3.0.2
@@ -42,30 +71,55 @@ networkx==3.5
    # via torch
 numpy==2.3.1
    # via
    #   ctranslate2
    #   h5py
    #   onnx
    #   onnxruntime
    #   opencv-python
    #   pvporcupine
    #   python-doctr
    #   scikit-learn
    #   scipy
    #   shapely
    #   soundfile
    #   torchvision
 onnx==1.18.0
    # via python-doctr
 onnxruntime==1.22.1
    # via
    #   faster-whisper
    #   openwakeword
 opencv-python==4.11.0.86
    # via python-doctr
 openwakeword==0.6.0
    # via realtimestt
 packaging==25.0
-    # via huggingface-hub
+    # via
    #   huggingface-hub
    #   onnxruntime
 pillow==11.3.0
    # via
    #   tooling (pyproject.toml)
    #   python-doctr
    #   torchvision
 protobuf==6.31.1
-    # via onnx
+    # via
    #   onnx
    #   onnxruntime
 pvporcupine==1.9.5
    # via realtimestt
 pyaudio==0.2.14
    # via realtimestt
 pyclipper==1.3.0.post6
    # via python-doctr
 pycparser==2.22
    # via cffi
 pygments==2.19.2
    # via rich
 pyobjc-core==11.1
    # via pyobjc-framework-cocoa
 pyobjc-framework-cocoa==11.1
    # via rumps
 pypdfium2==4.30.0
    # via python-doctr
 pyperclip==1.9.0
@@ -73,34 +127,70 @@ pyperclip==1.9.0
 python-doctr==1.0.0
    # via tooling (pyproject.toml)
 pyyaml==6.0.2
-    # via huggingface-hub
+    # via
    #   ctranslate2
    #   huggingface-hub
 rapidfuzz==3.13.0
    # via python-doctr
 realtimestt==0.3.104
    # via tooling (pyproject.toml)
 requests==2.32.4
-    # via huggingface-hub
+    # via
    #   huggingface-hub
    #   openwakeword
 rich==14.0.0
    # via
    #   tooling (pyproject.toml)
    #   typer
-scipy==1.16.0
+rumps==0.4.0
-    # via python-doctr
+    # via tooling (pyproject.toml)
 scikit-learn==1.7.1
    # via openwakeword
 scipy==1.15.2
    # via
    #   openwakeword
    #   python-doctr
    #   realtimestt
    #   scikit-learn
 setuptools==80.9.0
    # via ctranslate2
 shapely==2.1.1
    # via python-doctr
 shellingham==1.5.4
    # via typer
 six==1.17.0
-    # via langdetect
+    # via
    #   halo
    #   langdetect
 soundfile==0.13.1
    # via realtimestt
 spinners==0.0.24
    # via halo
 sympy==1.14.0
-    # via torch
+    # via
    #   onnxruntime
    #   torch
 termcolor==3.1.0
    # via halo
 threadpoolctl==3.6.0
    # via scikit-learn
 tokenizers==0.21.2
    # via faster-whisper
 torch==2.7.1
    # via
    #   python-doctr
    #   realtimestt
    #   torchaudio
    #   torchvision
 torchaudio==2.7.1
    # via realtimestt
 torchvision==0.22.1
    # via python-doctr
 tqdm==4.67.1
    # via
    #   faster-whisper
    #   huggingface-hub
    #   openwakeword
    #   python-doctr
 typer==0.16.0
    # via tooling (pyproject.toml)
@@ -114,3 +204,9 @@ urllib3==2.5.0
    # via requests
 validators==0.35.0
    # via python-doctr
 webrtcvad-wheels==2.0.14
    # via realtimestt
 websocket-client==1.8.0
    # via realtimestt
 websockets==15.0.1
    # via realtimestt
@@ -9,6 +9,7 @@ import typer
 from rich.console import Console
 from .ocr_cli import ocr_app
 from .stt_cli import stt_app
 # Create main app
 app = typer.Typer(
@@ -22,6 +23,9 @@ console = Console()
 # Add OCR subcommand
 app.add_typer(ocr_app, name="ocr", help="OCR screenshot tools")
 # Add STT subcommand
 app.add_typer(stt_app, name="stt", help="Speech-to-text tools with wake word activation")
@app.command()
 def version():
    """Show version information."""
@@ -0,0 +1,450 @@
 #!/usr/bin/env python3
 """
 Speech-to-Text CLI Tool
 A command-line tool that provides real-time speech-to-text transcription
 using RealtimeSTT with wake word activation and various output options.
 """
 import datetime
 import os
 import tempfile
 from pathlib import Path
 from typing import Optional, Callable
 import threading
 import time
 import typer
 from rich.console import Console
 from rich.panel import Panel
 from rich.progress import Progress, SpinnerColumn, TextColumn
 from rich.live import Live
 from rich.text import Text
 from rich.table import Table
 # Create STT app that can be imported as a subcommand
 stt_app = typer.Typer(
    name="stt",
    help="Real-time speech-to-text with wake word activation",
    rich_markup_mode="rich"
 )
 console = Console()
 # Global variables for managing the recorder
 _recorder = None
 _recording_active = False
 _transcription_buffer = []
 class TranscriptionDisplay:
    """Handle live display of transcriptions."""
    def __init__(self, show_realtime: bool = True):
        self.show_realtime = show_realtime
        self.realtime_text = ""
        self.final_text = ""
        self.status = "Initializing..."
    def create_display(self) -> Table:
        """Create the display table."""
        table = Table.grid(padding=1)
        table.add_column(style="cyan", no_wrap=False)
        # Status
        table.add_row(f"[bold blue]Status:[/bold blue] {self.status}")
        table.add_row("")
        # Realtime transcription
        if self.show_realtime and self.realtime_text:
            table.add_row("[bold yellow]🎙️  Live transcription:[/bold yellow]")
            table.add_row(f"[dim]{self.realtime_text}[/dim]")
            table.add_row("")
        # Final transcription
        if self.final_text:
            table.add_row("[bold green]✅ Final transcription:[/bold green]")
            table.add_row(self.final_text)
            table.add_row("")
        return table
    def update_status(self, status: str):
        """Update the status."""
        self.status = status
    def update_realtime(self, text: str):
        """Update realtime transcription."""
        self.realtime_text = text
    def add_final(self, text: str):
        """Add final transcription."""
        if text.strip():
            timestamp = datetime.datetime.now().strftime("%H:%M:%S")
            self.final_text += f"[{timestamp}] {text}\n"
@stt_app.command("listen")
 def listen_cmd(
    wake_word: str = typer.Option(
        default="jarvis",
        help="Wake word to activate recording (jarvis, alexa, hey google, etc.)"
    ),
    model: str = typer.Option(
        default="base",
        help="Whisper model to use (tiny, base, small, medium, large-v2)"
    ),
    language: str = typer.Option(
        default="",
        help="Language code for transcription (empty for auto-detection)"
    ),
    realtime: bool = typer.Option(
        default=True,
        help="Enable real-time transcription display"
    ),
    save_to_file: Optional[Path] = typer.Option(
        default=None,
        help="Save transcriptions to a file"
    ),
    sensitivity: float = typer.Option(
        default=0.6,
        help="Wake word sensitivity (0.0 to 1.0)"
    ),
    device: str = typer.Option(
        default="auto",
        help="Device to use (auto, cuda, cpu)"
    ),
    verbose: bool = typer.Option(
        default=False,
        help="Show verbose output and configuration"
    )
 ):
    """Start real-time speech-to-text with wake word activation."""
    try:
        from RealtimeSTT import AudioToTextRecorder
    except ImportError:
        console.print("[bold red]❌ RealtimeSTT not installed.[/bold red]")
        console.print("Install with: [bold]pip install RealtimeSTT[/bold]")
        raise typer.Exit(1)
    # Validate wake word
    valid_wake_words = [
        "alexa", "americano", "blueberry", "bumblebee", "computer", 
        "grapefruits", "grasshopper", "hey google", "hey siri", "jarvis", 
        "ok google", "picovoice", "porcupine", "terminator"
    ]
    if wake_word.lower() not in valid_wake_words:
        console.print(f"[bold red]❌ Invalid wake word: {wake_word}[/bold red]")
        console.print(f"Valid options: {', '.join(valid_wake_words)}")
        raise typer.Exit(1)
    # Determine device
    if device == "auto":
        try:
            import torch
            device = "cuda" if torch.cuda.is_available() else "cpu"
        except ImportError:
            device = "cpu"
    # Create transcription display
    display = TranscriptionDisplay(show_realtime=realtime)
    # File output setup
    output_file = None
    if save_to_file:
        save_to_file.parent.mkdir(parents=True, exist_ok=True)
        output_file = open(save_to_file, 'a', encoding='utf-8')
        timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        output_file.write(f"\n=== STT Session Started: {timestamp} ===\n")
        output_file.flush()
    # Show configuration if verbose
    if verbose:
        config_table = Table(title="STT Configuration")
        config_table.add_column("Setting", style="cyan")
        config_table.add_column("Value", style="green")
        config_table.add_row("Wake Word", wake_word)
        config_table.add_row("Model", model)
        config_table.add_row("Language", language if language else "Auto-detect")
        config_table.add_row("Device", device)
        config_table.add_row("Realtime Display", str(realtime))
        config_table.add_row("Sensitivity", str(sensitivity))
        if save_to_file:
            config_table.add_row("Output File", str(save_to_file))
        console.print(config_table)
        console.print()
    # Callback functions
    def on_realtime_transcription(text: str):
        """Handle real-time transcription updates."""
        if realtime:
            display.update_realtime(text)
    def on_transcription_complete(text: str):
        """Handle completed transcriptions."""
        if text.strip():
            display.add_final(text)
            # Save to file if specified
            if output_file:
                timestamp = datetime.datetime.now().strftime("%H:%M:%S")
                output_file.write(f"[{timestamp}] {text}\n")
                output_file.flush()
    def on_recording_start():
        """Called when recording starts."""
        display.update_status("🎙️ Recording... (speak now)")
    def on_recording_stop():
        """Called when recording stops."""
        display.update_status("⏸️ Processing transcription...")
    def on_wakeword_detected():
        """Called when wake word is detected."""
        display.update_status(f"🎯 Wake word '{wake_word}' detected!")
    def on_wakeword_timeout():
        """Called when wake word times out."""
        display.update_status(f"⏰ Waiting for wake word '{wake_word}'...")
    def on_wakeword_detection_start():
        """Called when starting to listen for wake words."""
        display.update_status(f"👂 Listening for wake word '{wake_word}'...")
    try:
        display.update_status("🔧 Initializing STT engine...")
        # Configure recorder parameters
        recorder_config = {
            "model": model,
            "wake_words": wake_word,
            "wake_words_sensitivity": sensitivity,
            "device": device,
            "on_recording_start": on_recording_start,
            "on_recording_stop": on_recording_stop,
            "on_wakeword_detected": on_wakeword_detected,
            "on_wakeword_timeout": on_wakeword_timeout,
            "on_wakeword_detection_start": on_wakeword_detection_start,
        }
        if language:
            recorder_config["language"] = language
        if realtime:
            recorder_config.update({
                "enable_realtime_transcription": True,
                "on_realtime_transcription_update": on_realtime_transcription,
            })
        # Initialize recorder
        recorder = AudioToTextRecorder(**recorder_config)
        # Show initial instructions
        console.print(Panel(
            f"[bold]Speech-to-Text Ready![/bold]\n\n"
            f"• Say '[bold cyan]{wake_word}[/bold cyan]' to activate recording\n"
            f"• Speak clearly after activation\n"
            f"• Press [bold red]Ctrl+C[/bold red] to stop\n"
            f"• Model: [bold]{model}[/bold] | Device: [bold]{device}[/bold]",
            title="🎤 STT Instructions",
            border_style="green"
        ))
        # Start live display
        with Live(display.create_display(), refresh_per_second=10, console=console) as live:
            try:
                while True:
                    # Get transcription (this will wait for wake word and then record)
                    text = recorder.text()
                    if text:
                        on_transcription_complete(text)
                        live.update(display.create_display())
                    # Small delay to prevent high CPU usage
                    time.sleep(0.1)
            except KeyboardInterrupt:
                display.update_status("🛑 Stopping STT...")
                live.update(display.create_display())
                raise
    except KeyboardInterrupt:
        console.print("\n[bold yellow]⚠️  STT stopped by user.[/bold yellow]")
    except Exception as e:
        console.print(f"\n[bold red]❌ STT error: {e}[/bold red]")
        if verbose:
            import traceback
            console.print(f"[dim]{traceback.format_exc()}[/dim]")
        raise typer.Exit(1)
    finally:
        # Cleanup
        if 'recorder' in locals():
            try:
                recorder.shutdown()
            except:
                pass
        if output_file:
            timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            output_file.write(f"=== STT Session Ended: {timestamp} ===\n\n")
            output_file.close()
            console.print(f"\n[green]💾 Transcriptions saved to: {save_to_file}[/green]")
@stt_app.command("test")
 def test_cmd(
    duration: int = typer.Option(
        default=10,
        help="Test duration in seconds"
    ),
    model: str = typer.Option(
        default="tiny",
        help="Whisper model to use for testing"
    )
 ):
    """Test STT functionality without wake words."""
    try:
        from RealtimeSTT import AudioToTextRecorder
    except ImportError:
        console.print("[bold red]❌ RealtimeSTT not installed.[/bold red]")
        console.print("Install with: [bold]pip install RealtimeSTT[/bold]")
        raise typer.Exit(1)
    console.print(Panel(
        f"[bold]STT Test Mode[/bold]\n\n"
        f"• Duration: [bold]{duration}[/bold] seconds\n"
        f"• Model: [bold]{model}[/bold]\n"
        f"• No wake word required\n"
        f"• Start speaking when you see 'Recording...'",
        title="🧪 Test Configuration",
        border_style="blue"
    ))
    try:
        with Progress(
            SpinnerColumn(),
            TextColumn("[progress.description]{task.description}"),
            console=console,
        ) as progress:
            init_task = progress.add_task("[cyan]Initializing STT engine...", total=None)
            recorder = AudioToTextRecorder(
                model=model,
                wake_words="",  # No wake words for test
            )
            progress.update(init_task, description="[green]✓ STT engine ready")
            progress.stop()
        console.print(f"\n[bold green]🎙️  Recording for {duration} seconds...[/bold green]")
        console.print("[yellow]Start speaking now![/yellow]")
        # Manual recording for test
        recorder.start()
        # Show countdown
        for remaining in range(duration, 0, -1):
            console.print(f"\r[bold cyan]⏰ {remaining} seconds remaining...[/bold cyan]", end="")
            time.sleep(1)
        console.print(f"\r[bold blue]⏸️  Processing transcription...[/bold blue]")
        recorder.stop()
        text = recorder.text()
        if text:
            console.print("\n[bold green]✅ Test completed successfully![/bold green]")
            console.print(Panel(
                text,
                title="📝 Transcribed Text",
                border_style="green"
            ))
        else:
            console.print("\n[bold yellow]⚠️  No speech detected during test.[/bold yellow]")
            console.print("[dim]Try speaking louder or check your microphone.[/dim]")
    except KeyboardInterrupt:
        console.print("\n[bold yellow]⚠️  Test cancelled by user.[/bold yellow]")
    except Exception as e:
        console.print(f"\n[bold red]❌ Test failed: {e}[/bold red]")
        raise typer.Exit(1)
    finally:
        if 'recorder' in locals():
            try:
                recorder.shutdown()
            except:
                pass
@stt_app.command("info")
 def info_cmd():
    """Show STT system information and available options."""
    console.print(Panel(
        "[bold blue]STT System Information[/bold blue]",
        border_style="blue"
    ))
    # Check RealtimeSTT installation
    try:
        from RealtimeSTT import AudioToTextRecorder
        console.print("[green]✅ RealtimeSTT installed[/green]")
        # Check CUDA availability
        try:
            import torch
            cuda_available = torch.cuda.is_available()
            if cuda_available:
                console.print(f"[green]✅ CUDA available (GPU: {torch.cuda.get_device_name()})[/green]")
            else:
                console.print("[yellow]⚠️  CUDA not available (CPU only)[/yellow]")
        except ImportError:
            console.print("[yellow]⚠️  PyTorch not available[/yellow]")
    except ImportError:
        console.print("[red]❌ RealtimeSTT not installed[/red]")
        console.print("Install with: [bold]pip install RealtimeSTT[/bold]")
    # Available wake words
    wake_words = [
        "alexa", "americano", "blueberry", "bumblebee", "computer", 
        "grapefruits", "grasshopper", "hey google", "hey siri", "jarvis", 
        "ok google", "picovoice", "porcupine", "terminator"
    ]
    console.print(f"\n[bold cyan]Available Wake Words:[/bold cyan]")
    console.print(", ".join(wake_words))
    # Available models
    models = ["tiny", "tiny.en", "base", "base.en", "small", "small.en", "medium", "medium.en", "large-v1", "large-v2"]
    console.print(f"\n[bold cyan]Available Models:[/bold cyan]")
    console.print(", ".join(models))
    # Usage examples
    console.print(f"\n[bold cyan]Usage Examples:[/bold cyan]")
    examples = [
        "tooling stt listen                    # Use jarvis wake word with base model",
        "tooling stt listen --wake-word alexa  # Use alexa wake word",
        "tooling stt listen --model tiny       # Use faster tiny model",
        "tooling stt test --duration 5         # Test for 5 seconds",
        "tooling stt listen --save-to-file transcripts.txt  # Save to file"
    ]
    for example in examples:
        console.print(f"  [dim]${example}[/dim]")
 # For backward compatibility when run directly
 def cli_main():
    """Entry point for the STT CLI script when run directly."""
    stt_app()
 if __name__ == "__main__":
    stt_app()