Chirp
Real-Time Audio Monitoring, Visualization & Triggered Recording
v2.2.1
Overview
Chirp is a desktop application built with Python, PyQt5, and matplotlib for real-time audio monitoring, spectral visualization, and threshold-triggered recording. It was designed for bioacoustics research but is suitable for any scenario requiring continuous audio surveillance with automatic capture of events of interest.
Key Capabilities
- Live spectrogram, waveform, and amplitude visualization
- Threshold-triggered recording with configurable pre- and post-trigger buffers
- Spectral entropy trigger for tonal sound detection
- Multi-stream support — monitor several audio devices simultaneously
- Stereo channel analysis with per-channel trigger logic
- Optional bandpass filtering for frequency-specific triggering
- Configurable display: FFT size, window function, frequency scale, view modes
- Dark-themed UI using the Catppuccin Mocha color palette
- Save and load full session configurations as JSON
The entire application is contained in a single Python file (chirp.py), making
deployment and modification straightforward.
Installation
Requirements
- Python 3.11 or later
- sounddevice: audio I/O via PortAudio
- numpy: numerical computation
- scipy: signal processing (bandpass filter)
- matplotlib: plotting engine
- PyQt5: GUI framework
Quick Start
# Create and activate a conda environment (recommended)
conda create -n chirp python=3.11 -y
conda activate chirp
# Install dependencies
pip install sounddevice numpy scipy matplotlib PyQt5
# Launch
python chirp.py
Interface Layout
The main window is divided into six functional areas: the sidebar, the central canvas, the transport bar, and three panels along the bottom (trigger, display, and settings).
Sidebar (Left)
The sidebar lists all configured recording streams. Each entry displays:
- Status indicators — colored dots showing stream state: ACQ (acquiring, green), REC (recording, green), TRIG (triggered, red)
- Sticky health badges (v2.1) — S latches when this stream has saturated (clipped) at any point during the session; D latches when the audio capture has dropped any callback chunks (with the total count in its tooltip). Both stay lit until you click the badge to clear, so a brief clip or a single dropped callback can't slip by unnoticed.
- Transient drop flash — a short per-tick indicator that blinks when a drop occurs in the current UI tick, alongside the sticky D badge.
- Mini amplitude preview — a small real-time level meter
- Reorder buttons — move streams up or down in the list
- Delete button — remove a stream
Click Add Recording at the bottom of the sidebar to create a new stream. Right-click the sidebar background for Start All and Stop All context actions.
Canvas (Center)
The central area hosts the matplotlib figure showing the spectrogram, waveform, and amplitude plots for the currently selected stream. A vertical green cursor line indicates the current write position in the circular buffer.
A thin detect / record events strip (v2.1) sits directly beneath the amplitude plot with two rows: a yellow bar lights up for every sample the trigger condition is met (detect), and a green bar lights up for every sample the writer is actually capturing (record). The detect row is driven by the same per-sample trigger mask the recording state machine consumes — there is no second computation, so what you see is exactly what the trigger did, including any bandpass filtering and spectral-entropy gating.
Transport Bar
| Button | Action |
|---|---|
| Start Acq | Begin audio acquisition (fills buffer, enables display) |
| Stop Acq | Stop acquisition and clear the buffer |
| Start Rec | Arm threshold-triggered recording |
| Stop Rec | Disarm recording (finishes any in-progress file) |
| Reset Params | Restore all parameters to defaults |
| Save / Save As | Save current configuration to JSON |
| Load | Load a previously saved configuration |
| View Mode | Switch to multi-stream grid view |
Status indicators in the transport bar show the global state: ACQ, REC, and TRIG.
Trigger Panel (Bottom Left)
Controls that govern when and how recordings are triggered:
| Control | Description |
|---|---|
| Threshold | Amplitude level that triggers recording. Also draggable on the plot. |
| Min Cross | Minimum duration the signal must stay above threshold to confirm a trigger. |
| Hold | Duration to keep recording after the signal drops below threshold. |
| Post-Trigger | Additional time appended after the last threshold crossing. |
| Max Rec | Maximum duration of a single recording segment (auto-splits if exceeded). |
| Pre-Trigger | Duration of audio captured before the threshold crossing via ring buffer. |
| Band Lo / Hi Hz | Bandpass filter frequency range for trigger analysis. |
| Detect Mode | Trigger detection strategy (see Spectral Entropy). |
| Entropy Threshold | Spectral entropy level below which a tonal trigger fires. |
| Auto Calibrate | Automatically set threshold from ambient noise measurement. |
Display Panel (Bottom Right)
| Control | Description |
|---|---|
| Gain | Amplification applied to the input signal for display. |
| dB Floor / Ceil | Dynamic range limits for the spectrogram colormap. |
| FFT Size | Number of FFT points (256, 512, 1024, 2048, or 4096). |
| Window | FFT window function: Hann, Hamming, Blackman, Bartlett, or Flattop. |
| Freq Scale | Frequency axis mapping: Linear, Logarithmic, or Mel. |
| Display Lo / Hi | Visible frequency range on the spectrogram. |
| Buffer Duration | Length of the circular display buffer (5 to 60 seconds). |
| View | Visualization mode: Spectrogram, Waveform, or Both. |
| Sync | Synchronize display parameters across all streams. |
Settings Panel (Bottom)
| Control | Description |
|---|---|
| Output Folder | Directory where WAV recordings are saved. |
| Prefix / Suffix | Custom text prepended/appended to recording filenames. |
| Input Device | Audio input device selector with refresh button. |
| Channel Mode | Mono, Left, Right, or Stereo. |
| Trigger Mode | Channel logic for stereo triggering (Average, Any, Both, Left, Right). |
| Sample Rate | Audio sample rate from 8,000 Hz to 96,000 Hz. |
| Reference Date | Base date for day-count subfolder naming (e.g., days post hatch). |
Visualization Modes
The View combo box in the Display panel selects which plots are shown on the canvas. All modes include an amplitude envelope subplot at the bottom.
Spectrogram
The default mode. The upper subplot shows a scrolling spectrogram rendered with matplotlib's
imshow using the inferno colormap. The lower subplot shows the amplitude
envelope (absolute sample values) as a blue line. A yellow dashed threshold line is drawn when
recording is armed.
Waveform
The upper subplot displays the raw signed waveform as a teal line. The lower subplot shows the amplitude envelope, identical to spectrogram mode. This view is useful for inspecting transient shape and polarity.
Both
Three subplots stacked vertically: spectrogram on top, waveform in the middle, and amplitude envelope at the bottom. This provides the most complete picture at the cost of vertical space.
Stereo Display
When the channel mode is set to Stereo, an additional subplot is added for the right channel spectrogram or waveform. The amplitude plot combines both channels: left in blue, right in pink.
Visual Indicators
- Cursor line (green) — marks the current write position in the circular buffer.
- Saturation warning — the waveform and amplitude lines turn red when the signal reaches 99% of full scale or higher. The transient red tint clears as soon as a clean chunk arrives; the sticky S badge (v2.1) on the sidebar entry latches until you click it, so a brief clip is never missed.
- Drop warning (v2.1) — the sticky D badge on the sidebar entry latches as soon as the audio capture callback drops any chunk (queue full, slow consumer, etc.), with a running session total in its tooltip. Click to clear.
Threshold-Triggered Recording
Chirp continuously monitors the amplitude envelope and automatically records audio segments when the signal exceeds a configurable threshold. This enables unattended data collection over extended periods.
State Machine
Recording progresses through three states:
- IDLE — Acquisition is running but recording is not armed, or the signal is below threshold.
- PENDING — The signal has crossed the threshold but the Min Cross duration has not yet elapsed. If the signal drops back below threshold before this duration, the trigger is cancelled (false-trigger rejection).
- RECORDING — A confirmed trigger. Audio is being written to a WAV file. Recording continues until the signal has been below threshold for the Hold duration, plus any Post-Trigger extension, or until Max Rec is reached.
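The three states can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not Chirp's actual code: the class name `TriggerMachine`, its fields, and the per-chunk `step` interface are hypothetical, and the end of a recording is modeled as the signal staying below threshold for Hold plus Post-Trigger.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    IDLE = "idle"
    PENDING = "pending"
    RECORDING = "recording"


@dataclass
class TriggerMachine:
    """Hypothetical trigger state machine; all durations are in seconds."""
    min_cross: float = 0.05
    hold: float = 0.5
    post_trigger: float = 0.25
    max_rec: float = 60.0
    state: State = State.IDLE
    above_time: float = 0.0  # continuous time above threshold while PENDING
    below_time: float = 0.0  # continuous time below threshold while RECORDING
    rec_time: float = 0.0    # length of the current segment
    segments: int = 0        # number of files that would have been opened

    def step(self, above: bool, dt: float) -> State:
        """Advance by one chunk of length dt; `above` is the trigger condition."""
        if self.state is State.IDLE and above:
            self.state, self.above_time = State.PENDING, 0.0
        if self.state is State.PENDING:
            if not above:
                self.state = State.IDLE  # false-trigger rejection
            else:
                self.above_time += dt
                if self.above_time >= self.min_cross:
                    self.state = State.RECORDING
                    self.below_time = self.rec_time = 0.0
                    self.segments += 1
        elif self.state is State.RECORDING:
            self.rec_time += dt
            self.below_time = 0.0 if above else self.below_time + dt
            if self.below_time >= self.hold + self.post_trigger:
                self.state = State.IDLE  # segment finished
            elif self.rec_time >= self.max_rec:
                self.rec_time = 0.0      # auto-split: start a new segment
                self.segments += 1
        return self.state
```

Feeding the machine one boolean per audio chunk keeps the timing logic independent of how the trigger condition itself (amplitude, entropy, or both) is computed.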
Parameter Reference
Threshold
The amplitude level that initiates the trigger sequence. This can be set numerically or by dragging the yellow dashed line directly on the amplitude plot. Values are in linear amplitude (0.0 to 1.0 of full scale).
Min Cross
The minimum continuous duration (in seconds) the signal must remain above the threshold before a trigger is confirmed. This prevents brief transient spikes (e.g., a click or bump) from creating recordings. Typical values range from 0.01 s to 0.5 s.
Pre-Trigger Buffer
Duration of audio (in seconds) captured before the threshold crossing. A ring buffer continuously stores recent audio so that the onset of a vocalization is not lost. The pre-trigger audio is prepended to each recording file.
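A ring buffer of this kind can be approximated with a bounded deque. The sketch below is illustrative only; `PreTriggerBuffer` and its methods are hypothetical names, and Chirp's own buffer may be implemented differently (for example, as a NumPy array with a write index).

```python
from collections import deque


class PreTriggerBuffer:
    """Keeps only the most recent `seconds` of samples (hypothetical helper)."""

    def __init__(self, seconds: float, sample_rate: int):
        # maxlen makes the deque discard the oldest samples automatically.
        self._samples = deque(maxlen=int(seconds * sample_rate))

    def push(self, chunk) -> None:
        """Append a chunk of samples from the audio callback."""
        self._samples.extend(chunk)

    def snapshot(self) -> list:
        """Return the buffered samples, oldest first, to prepend to a recording."""
        return list(self._samples)
```

When a trigger is confirmed, the snapshot would be written at the start of the file before the live audio.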
Hold
How long (in seconds) recording continues after the signal drops below the threshold. This bridges short pauses within a single vocalization to avoid splitting it into many small files.
Post-Trigger
An additional duration (in seconds) appended after the last threshold crossing. Unlike Hold, Post-Trigger always adds this full duration once the hold period has expired, capturing any trailing reverberation or echo.
Max Rec
The maximum allowed duration (in seconds) of a single recording segment. If a continuous sound exceeds this limit, the current file is closed and a new one is started immediately. This prevents runaway recordings from filling disk space.
Output Files
Recordings are saved as uncompressed WAV files (PCM 16-bit) with the following naming pattern:
<prefix>_<epochMs>_YYYYMMDD_HHMMSS_mmm_<suffix>.wav
Where epochMs is the Unix epoch in milliseconds, and the timestamp reflects local
time down to the millisecond. Files are written in a background thread to avoid blocking the
audio processing pipeline.
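To make the naming pattern concrete, here is a small sketch that builds a filename from an epoch timestamp. The function name `recording_filename` is hypothetical, and how Chirp handles an empty prefix or suffix is not covered here.

```python
from datetime import datetime


def recording_filename(prefix: str, suffix: str, epoch_seconds: float) -> str:
    """Build <prefix>_<epochMs>_YYYYMMDD_HHMMSS_mmm_<suffix>.wav (local time)."""
    epoch_ms = int(round(epoch_seconds * 1000))
    local = datetime.fromtimestamp(epoch_ms / 1000)  # local time zone
    stamp = local.strftime("%Y%m%d_%H%M%S")
    millis = epoch_ms % 1000
    return f"{prefix}_{epoch_ms}_{stamp}_{millis:03d}_{suffix}.wav"
```

For example, `recording_filename("bird", "ch1", time.time())` would produce a name for a recording that starts now; the date and time portion depends on the machine's time zone.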
Spectral Entropy Trigger
Introduced in v2.0, the spectral entropy trigger provides a frequency-domain criterion for identifying sounds of interest. It computes the normalized Shannon entropy of the FFT magnitude spectrum, producing a value between 0 and 1:
- 0 — pure tone (all energy at a single frequency)
- 1 — white noise (energy uniformly spread across all frequencies)
Tonal vocalizations (birdsong, bat calls, whale sounds) produce low entropy values. The trigger fires when entropy falls below the configured threshold, meaning a structured tonal signal has been detected.
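The entropy measure described above can be sketched with NumPy as follows. This is an illustration under assumptions: the magnitudes (rather than power) are normalized into a probability distribution, no window is applied, and silence is treated as maximally noisy. Chirp's exact computation may differ in these details.

```python
import numpy as np


def spectral_entropy(frame: np.ndarray) -> float:
    """Normalized Shannon entropy of the FFT magnitude spectrum, in [0, 1]."""
    mag = np.abs(np.fft.rfft(frame))
    total = mag.sum()
    if total == 0 or mag.size < 2:
        return 1.0  # assumption: treat silence as "not tonal"
    p = mag / total          # magnitudes as a probability distribution
    p = p[p > 0]             # skip empty bins so log() is defined
    return float(-(p * np.log(p)).sum() / np.log(mag.size))
```

A sine wave that falls exactly on one FFT bin gives a value near 0, while a flat spectrum gives 1. Real tonal signals with harmonics or leakage land somewhere in between, which is why the threshold is adjustable.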
Detect Modes
The Detect Mode dropdown in the Trigger panel selects how amplitude and spectral entropy criteria are combined:
| Mode | Trigger Condition |
|---|---|
| Amplitude Only | Legacy behavior. Only amplitude threshold is checked. |
| Spectral Only | Only entropy is checked. Amplitude threshold is ignored. |
| Amp AND Spectral | Both conditions must be met simultaneously. |
| Amp OR Spectral | Either condition alone is sufficient to trigger. |
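The four modes in the table reduce to simple boolean combinations. The sketch below uses the mode labels from the table as strings; the function name `trigger_fires` and its signature are hypothetical.

```python
def trigger_fires(mode: str, amplitude: float, amp_threshold: float,
                  entropy: float, entropy_threshold: float) -> bool:
    """Combine the amplitude and spectral-entropy criteria for one analysis frame."""
    amp_hit = amplitude > amp_threshold      # loud enough
    tonal_hit = entropy < entropy_threshold  # low entropy means tonal
    if mode == "Amplitude Only":
        return amp_hit
    if mode == "Spectral Only":
        return tonal_hit
    if mode == "Amp AND Spectral":
        return amp_hit and tonal_hit
    if mode == "Amp OR Spectral":
        return amp_hit or tonal_hit
    raise ValueError(f"unknown detect mode: {mode!r}")
```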
Entropy Trace Plot
When any spectral mode is active, an additional subplot appears below the amplitude plot showing the real-time entropy trace:
- Mauve line — current entropy value over time
- Peach dashed line — entropy threshold (draggable, like the amplitude threshold)
- Scroll-wheel zoom — adjusts Y-axis range, centered on the mouse cursor position
A numeric entropy display label in the status area shows the current instantaneous value. For stereo streams, per-channel entropy values are combined using the same logic as the channel trigger mode setting (Average, Any, Both, Left, or Right).
Auto-Calibrate
The Auto Calibrate button in the Trigger panel measures the ambient noise floor and automatically sets the amplitude threshold above it.
How It Works
- Acquisition must already be running.
- Click Auto Calibrate. The system records ambient levels for a configurable duration (1 to 10 seconds).
- The threshold is set to the 95th percentile of measured amplitude, multiplied by a safety margin.
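The calibration rule can be sketched in a few lines. The margin value of 1.5 is an assumption chosen for illustration (the actual safety margin is not stated here), and `calibrate_threshold` is a hypothetical name.

```python
import numpy as np


def calibrate_threshold(ambient, margin: float = 1.5) -> float:
    """Return the 95th percentile of |ambient| times a safety margin, capped at 1.0."""
    envelope = np.abs(np.asarray(ambient, dtype=float))
    return float(min(1.0, np.percentile(envelope, 95) * margin))
```

Using a high percentile rather than the maximum makes the result less sensitive to a single stray click during the measurement window.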
Bandpass Filter
Each stream can optionally apply a 4th-order Butterworth bandpass filter. When enabled, the amplitude measurement used for threshold triggering is computed from the filtered signal, focusing detection on a specific frequency band.
- Set the Lo Hz and Hi Hz fields in the Trigger panel to define the passband.
- The spectrogram always displays the unfiltered signal so you retain full spectral context.
- The filter is particularly useful when the target vocalization occupies a narrow frequency band but the environment contains broadband noise.
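A Butterworth bandpass of this kind can be built with SciPy. The sketch below is an assumption about how such a filter might be applied, not Chirp's code: `bandpass_envelope` is a hypothetical name, and note that `butter(4, ..., btype="bandpass")` yields an 8-pole filter overall, so passing `2` instead would be the choice if "4th-order" refers to the total order.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def bandpass_envelope(samples, sample_rate: float, lo_hz: float, hi_hz: float):
    """Filter to [lo_hz, hi_hz] and return the absolute amplitude of the result."""
    # Second-order sections are numerically more stable than (b, a) coefficients.
    sos = butter(4, [lo_hz, hi_hz], btype="bandpass", fs=sample_rate, output="sos")
    return np.abs(sosfilt(sos, samples))
```

The returned envelope would then be compared against the threshold in place of the unfiltered amplitude.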
Channel Modes
Input Channel Selection
| Mode | Behavior |
|---|---|
| Mono | Uses a single-channel input. |
| Left | Extracts the left channel from a stereo device. |
| Right | Extracts the right channel from a stereo device. |
| Stereo | Uses both channels. A separate spectrogram or waveform is shown for each channel; the amplitude plot draws left in blue and right in pink. |
Stereo Trigger Modes
When operating in stereo, the Trigger Mode control determines how per-channel amplitude values are combined for threshold comparison:
| Trigger Mode | Logic |
|---|---|
| Average | Mean of left and right amplitudes must exceed threshold. |
| Any Channel | Either channel exceeding threshold is sufficient. |
| Both Channels | Both channels must independently exceed threshold. |
| Left Channel | Only the left channel is evaluated. |
| Right Channel | Only the right channel is evaluated. |
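The table above maps directly onto a small decision function. This is a sketch with a hypothetical name (`stereo_trigger`) that uses the mode labels from the table as strings.

```python
def stereo_trigger(mode: str, left: float, right: float, threshold: float) -> bool:
    """Decide whether a stereo frame exceeds the threshold under the given mode."""
    if mode == "Average":
        return (left + right) / 2 > threshold
    if mode == "Any Channel":
        return left > threshold or right > threshold
    if mode == "Both Channels":
        return left > threshold and right > threshold
    if mode == "Left Channel":
        return left > threshold
    if mode == "Right Channel":
        return right > threshold
    raise ValueError(f"unknown trigger mode: {mode!r}")
```

With a loud left channel and a quiet right channel, Any Channel and Left Channel fire while Both Channels and Right Channel do not; Average depends on how far the loud channel exceeds the threshold.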
View Mode
Click View Mode in the transport bar to switch from the single-stream configuration view to a multi-stream monitoring grid.
- All active streams are displayed simultaneously in a configurable grid (1 to 6 columns).
- Each cell shows a compact spectrogram or waveform, amplitude plot, threshold line, detect / record events strip (v2.1), and status text.
- The sidebar and configuration panels are hidden to maximize display area.
- Sticky health badges (v2.1) are overlaid on each tile: SAT lights up if the stream has ever clipped, and DROP×N lights up if the capture has ever dropped chunks (with the session-total count). Since view mode hides the sidebar, these overlays are how you spot health problems while monitoring; clearing them is done from config mode by clicking the sidebar S / D badges.
- Panel height is adjustable for each stream.
- Click Config Mode to return to the single-stream editing view.
Settings Management
All configuration parameters can be saved to and loaded from JSON files using the Save, Save As, and Load buttons in the transport bar.
What Is Saved
- Audio device names and sample rates
- All trigger parameters (threshold, timings, band filter, detect mode, entropy settings)
- Display options (FFT size, window, frequency scale, view mode, gain, dB range)
- Channel and trigger mode selections
- Output folder, prefix, suffix, and reference date
Device Resolution
When loading a configuration, Chirp resolves audio devices by name. If an exact match is not found, it attempts a partial string match. This allows configurations to be portable across systems where devices may have slightly different names.
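The exact-then-partial lookup might look like the sketch below. It works on a plain list of device names; the case-insensitive comparison and the two-way substring check are assumptions, and `resolve_device` is a hypothetical name.

```python
from typing import Optional, Sequence


def resolve_device(saved_name: str, available: Sequence[str]) -> Optional[int]:
    """Return the index of the best-matching device name, or None if nothing matches."""
    # Pass 1: exact match.
    for idx, name in enumerate(available):
        if name == saved_name:
            return idx
    # Pass 2: partial match (assumed case-insensitive, in either direction).
    needle = saved_name.lower()
    if not needle:
        return None
    for idx, name in enumerate(available):
        hay = name.lower()
        if needle in hay or hay in needle:
            return idx
    return None
```

In practice the list of names could come from `sounddevice.query_devices()`, whose entries include a `name` field.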
Reference Date & Subfolder Naming
The Reference Date field enables automatic subfolder naming based on the number
of days elapsed since a reference event. For example, in bird vocalization studies, you can set
the hatch date. Recordings are then saved into subfolders named by the day count (e.g.,
day_042/), simplifying longitudinal data organization.
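The day-count subfolder name can be derived from two dates. In this sketch the reference date itself counts as day 0, which is an assumption, and `day_subfolder` is a hypothetical name.

```python
from datetime import date


def day_subfolder(reference: date, recording_day: date) -> str:
    """Name the subfolder by days elapsed since the reference date (day 0)."""
    return f"day_{(recording_day - reference).days:03d}"
```

With a reference date of 1 March 2024, a recording made on 12 April 2024 would go into `day_042`.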
Mouse & Keyboard Interactions
| Action | Target | Effect |
|---|---|---|
| Click & drag | Amplitude threshold line | Adjust amplitude trigger threshold |
| Click & drag | Entropy threshold line | Adjust spectral entropy threshold |
| Scroll wheel | Amplitude plot | Zoom Y-axis (anchored at zero) |
| Scroll wheel | Waveform plot | Zoom Y-axis (symmetric around zero) |
| Scroll wheel | Entropy plot | Zoom Y-axis (centered on mouse cursor) |
| Scroll wheel | Spectrogram | No effect (use Display panel frequency controls) |
| Right-click | Sidebar background | Context menu: Start All / Stop All |
Tips & Best Practices
Setting the Threshold
Use Auto-Calibrate to quickly set a threshold above the ambient noise floor. Fine-tune by dragging the yellow threshold line on the amplitude plot while monitoring a live signal.
Capturing Complete Vocalizations
Set a generous Pre-Trigger buffer (0.5–2 s) to ensure you capture the onset of each vocalization. Use Hold to bridge short pauses within a phrase, and Post-Trigger to capture trailing echoes or reverberation.
Reducing False Triggers
Increase Min Cross to reject brief transients. Enable the bandpass filter to ignore out-of-band noise. In environments with broadband noise, switch to Amp AND Spectral detect mode so that only tonal sounds trigger recording.
Frequency Scale Selection
Use the Mel frequency scale for bioacoustics work. Mel spacing approximates human perceptual frequency resolution and often gives better visual separation of harmonics in birdsong and other animal vocalizations. Use Log for general-purpose analysis and Linear when precise frequency measurements are needed.
Multi-Stream Monitoring
Configure all your streams in Config Mode, arm recording on each, then switch to View Mode for a clean monitoring display. Use the Sync checkbox to keep display parameters consistent across streams.
Long Recording Sessions
Always save your configuration before starting a long session. Set Max Rec to a reasonable limit (e.g., 300 s) to prevent individual files from growing too large. Monitor available disk space: uncompressed 16-bit WAV files consume approximately 88 KB/s per channel at 44,100 Hz (about 176 KB/s for stereo).
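As a rough check on disk usage, 16-bit PCM needs two bytes per sample per channel. The helpers below are hypothetical and give an upper bound that assumes recording never stops; triggered recording will normally use less.

```python
def wav_bytes_per_second(sample_rate: int, channels: int = 1, bits: int = 16) -> int:
    """Data rate of uncompressed PCM audio, ignoring the small WAV header."""
    return sample_rate * channels * bits // 8


def hours_until_full(free_bytes: int, sample_rate: int, channels: int = 1) -> float:
    """Hours of continuous 16-bit recording that fit in the given free space."""
    return free_bytes / wav_bytes_per_second(sample_rate, channels) / 3600
```

At 44,100 Hz this works out to 88,200 bytes per second for one channel and 176,400 bytes per second for stereo.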
Reference Date for Experiments
If you are tracking development over time (e.g., days post hatch in bird research), set the Reference Date to the start event. Recordings will be automatically organized into day-count subfolders, making longitudinal analysis straightforward.
FFT Settings
Larger FFT sizes (2048, 4096) provide finer frequency resolution but coarser time resolution. Smaller sizes (256, 512) give better time resolution at the expense of frequency detail. For birdsong, 1024 is usually a good compromise at 44,100 Hz sample rate. The Flattop window is best for accurate amplitude measurement of pure tones; Hann is the best general-purpose choice.
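The trade-off can be quantified: the frequency bin width is the sample rate divided by the FFT size, and the time span of one analysis window is the inverse. The helper name `fft_resolution` is hypothetical, and the time figure ignores any overlap between successive frames.

```python
def fft_resolution(fft_size: int, sample_rate: float) -> tuple[float, float]:
    """Return (Hz per frequency bin, seconds covered by one FFT window)."""
    return sample_rate / fft_size, fft_size / sample_rate
```

At 44,100 Hz, an FFT size of 1024 gives bins about 43 Hz wide over a window of roughly 23 ms, while 4096 narrows the bins to about 11 Hz at the cost of a window of roughly 93 ms.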