Chirp

Real-Time Audio Monitoring, Visualization & Triggered Recording

v2.2.1

Overview

Chirp is a desktop application built with Python, PyQt5, and matplotlib for real-time audio monitoring, spectral visualization, and threshold-triggered recording. It was designed for bioacoustics research but is suitable for any scenario requiring continuous audio surveillance with automatic capture of events of interest.

Key Capabilities

Live spectrogram, waveform, and amplitude visualization
Threshold-triggered recording with configurable pre- and post-trigger buffers
Spectral entropy trigger for tonal sound detection
Multi-stream support — monitor several audio devices simultaneously
Stereo channel analysis with per-channel trigger logic
Optional bandpass filtering for frequency-specific triggering
Configurable display: FFT size, window function, frequency scale, view modes
Dark-themed UI using the Catppuccin Mocha color palette
Save and load full session configurations as JSON

The entire application is contained in a single Python file (chirp.py), making deployment and modification straightforward.

Installation

Requirements

Python 3.11 or later
sounddevice — audio I/O via PortAudio
numpy — numerical computation
scipy — signal processing (bandpass filter)
matplotlib — plotting engine
PyQt5 — GUI framework

Quick Start

# Create and activate a conda environment (recommended)
conda create -n chirp python=3.11 -y
conda activate chirp

# Install dependencies
pip install sounddevice numpy scipy matplotlib PyQt5

# Launch
python chirp.py

Tip: A conda or venv environment is recommended to avoid dependency conflicts with other Python projects on your system.

Interface Layout

The main window is divided into six functional areas: the sidebar, the central canvas, the transport bar, and three panels at the bottom (trigger, display, and settings).

Sidebar (Left)

The sidebar lists all configured recording streams. Each entry displays:

Status indicators — colored dots showing stream state: ACQ (acquiring, green), REC (recording, green), TRIG (triggered, red)
Sticky health badges (v2.1) — S latches when this stream has saturated (clipped) at any point during the session; D latches when the audio capture has dropped any callback chunks (with the total count in its tooltip). Both stay lit until you click the badge to clear, so a brief clip or a single dropped callback can't slip by unnoticed.
Transient drop flash — a short per-tick indicator that blinks when a drop occurs in the current UI tick, alongside the sticky D badge.
Mini amplitude preview — a small real-time level meter
Reorder buttons — move streams up or down in the list
Delete button — remove a stream

Click Add Recording at the bottom of the sidebar to create a new stream. Right-click the sidebar background for Start All and Stop All context actions.

Canvas (Center)

The central area hosts the matplotlib figure showing the spectrogram, waveform, and amplitude plots for the currently selected stream. A vertical green cursor line indicates the current write position in the circular buffer.

A thin detect / record events strip (v2.1) sits directly beneath the amplitude plot with two rows: a yellow bar lights up for every sample the trigger condition is met (detect), and a green bar lights up for every sample the writer is actually capturing (record). The detect row is driven by the same per-sample trigger mask the recording state machine consumes — there is no second computation, so what you see is exactly what the trigger did, including any bandpass filtering and spectral-entropy gating.

Transport Bar

Button	Action
Start Acq	Begin audio acquisition (fills buffer, enables display)
Stop Acq	Stop acquisition and clear the buffer
Start Rec	Arm threshold-triggered recording
Stop Rec	Disarm recording (finishes any in-progress file)
Reset Params	Restore all parameters to defaults
Save / Save As	Save current configuration to JSON
Load	Load a previously saved configuration
View Mode	Switch to multi-stream grid view

Status indicators in the transport bar show the global state: ACQ, REC, and TRIG.

Trigger Panel (Bottom Left)

Controls that govern when and how recordings are triggered:

Control	Description
Threshold	Amplitude level that triggers recording. Also draggable on the plot.
Min Cross	Minimum duration the signal must stay above threshold to confirm a trigger.
Hold	Duration to keep recording after the signal drops below threshold.
Post-Trigger	Additional time appended after the last threshold crossing.
Max Rec	Maximum duration of a single recording segment (auto-splits if exceeded).
Pre-Trigger	Duration of audio captured before the threshold crossing via ring buffer.
Band Lo / Hi Hz	Bandpass filter frequency range for trigger analysis.
Detect Mode	Trigger detection strategy (see Spectral Entropy).
Entropy Threshold	Spectral entropy level below which a tonal trigger fires.
Auto Calibrate	Automatically set threshold from ambient noise measurement.

Display Panel (Bottom Right)

Control	Description
Gain	Amplification applied to the input signal for display.
dB Floor / Ceil	Dynamic range limits for the spectrogram colormap.
FFT Size	Number of FFT points (256, 512, 1024, 2048, or 4096).
Window	FFT window function: Hann, Hamming, Blackman, Bartlett, or Flattop.
Freq Scale	Frequency axis mapping: Linear, Logarithmic, or Mel.
Display Lo / Hi	Visible frequency range on the spectrogram.
Buffer Duration	Length of the circular display buffer (5 to 60 seconds).
View	Visualization mode: Spectrogram, Waveform, or Both.
Sync	Synchronize display parameters across all streams.

Settings Panel (Bottom)

Control	Description
Output Folder	Directory where WAV recordings are saved.
Prefix / Suffix	Custom text prepended/appended to recording filenames.
Input Device	Audio input device selector with refresh button.
Channel Mode	Mono, Left, Right, or Stereo.
Trigger Mode	Channel logic for stereo triggering (Average, Any, Both, Left, Right).
Sample Rate	Audio sample rate from 8,000 Hz to 96,000 Hz.
Reference Date	Base date for day-count subfolder naming (e.g., days post hatch).

Visualization Modes

The View combo box in the Display panel selects which plots are shown on the canvas. All modes include an amplitude envelope subplot at the bottom.

Spectrogram

The default mode. The upper subplot shows a scrolling spectrogram rendered with matplotlib's imshow using the inferno colormap. The lower subplot shows the amplitude envelope (absolute sample values) as a blue line. A yellow dashed threshold line is drawn when recording is armed.

Waveform

The upper subplot displays the raw signed waveform as a teal line. The lower subplot shows the amplitude envelope, identical to spectrogram mode. This view is useful for inspecting transient shape and polarity.

Both

Three subplots stacked vertically: spectrogram on top, waveform in the middle, and amplitude envelope at the bottom. This provides the most complete picture at the cost of vertical space.

Stereo Display

When the channel mode is set to Stereo, an additional subplot is added for the right channel spectrogram or waveform. The amplitude plot combines both channels: left in blue, right in pink.

Visual Indicators

Cursor line (green) — marks the current write position in the circular buffer.
Saturation warning — the waveform and amplitude lines turn red when the signal reaches 99% of full scale or higher. The transient red tint clears as soon as a clean chunk arrives; the sticky S badge (v2.1) on the sidebar entry latches until you click it, so a brief clip is never missed.
Drop warning (v2.1) — the sticky D badge on the sidebar entry latches as soon as the audio capture callback drops any chunk (queue full, slow consumer, etc.), with a running session total in its tooltip. Click to clear.

Saturation: Persistent red lines indicate clipping. Reduce the input gain on your audio interface or move the microphone further from the source.

Amplitude Y-Axis Scale

The amplitude envelope plot has two Y-axis scales:

Log (dB) — the default. Y axis runs -80 dB to 0 dB full scale. Best for surfacing quiet signals that compress against zero on the linear scale (faint vocalisations, distant calls).
Linear — raw envelope, scrollable Y zoom. Best when you care about the absolute peak level and want to see the shape of high-amplitude transients.

Right-click on the amplitude plot (in either Config or View mode) to pick Linear or Log (dB) for that stream. The choice is per-stream and is saved to the JSON settings file along with the rest of the configuration. The threshold line and its label track the chosen scale automatically — in dB mode the label reads e.g. thr = -26.0 dB; in linear mode it reads thr = 0.050.
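The two label formats are related by the usual 20·log10 conversion. The sketch below assumes the dB scale is referenced to full scale (1.0) and clamped at the -80 dB axis floor; the function name `to_dbfs` is illustrative and not taken from chirp.py.

```python
import math


def to_dbfs(linear: float, floor_db: float = -80.0) -> float:
    """Convert a linear amplitude (0.0 to 1.0 of full scale) to dB full scale.

    Values at or below zero, or below the floor, are clamped to floor_db
    (assumption: the plot clamps at its -80 dB lower axis limit).
    """
    if linear <= 0.0:
        return floor_db
    return max(20.0 * math.log10(linear), floor_db)


# The linear threshold 0.050 corresponds to the dB label quoted above.
print(f"thr = {to_dbfs(0.050):.1f} dB")  # thr = -26.0 dB
```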

Threshold-Triggered Recording

Chirp continuously monitors the amplitude envelope and automatically records audio segments when the signal exceeds a configurable threshold. This enables unattended data collection over extended periods.

State Machine

Recording progresses through three states:

IDLE — Acquisition is running but recording is not armed, or the signal is below threshold.
PENDING — The signal has crossed the threshold but the Min Cross duration has not yet elapsed. If the signal drops back below threshold before this duration, the trigger is cancelled (false-trigger rejection).
RECORDING — A confirmed trigger. Audio is being written to a WAV file. Recording continues until the signal has been below threshold for the Hold duration, plus any Post-Trigger extension, or until Max Rec is reached.
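The three states can be sketched as a small chunk-driven state machine. This is an illustration of the behavior described above, not the code in chirp.py: the class name, field names, default values, and the choice to treat Hold plus Post-Trigger as one combined release time are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    IDLE = "idle"
    PENDING = "pending"
    RECORDING = "recording"


@dataclass
class TriggerMachine:
    """Hypothetical sketch of the trigger state machine (times in seconds)."""
    min_cross: float = 0.05
    hold: float = 0.5
    post_trigger: float = 0.25
    max_rec: float = 300.0
    state: State = State.IDLE
    above_time: float = 0.0  # time spent above threshold while PENDING
    below_time: float = 0.0  # time spent below threshold while RECORDING
    rec_time: float = 0.0    # length of the current recording segment

    def step(self, above: bool, dt: float) -> State:
        """Advance by one chunk of duration dt; `above` is the trigger condition."""
        if self.state is State.IDLE and above:
            self.state = State.PENDING
            self.above_time = 0.0
        if self.state is State.PENDING:
            if not above:
                self.state = State.IDLE  # false-trigger rejection
            else:
                self.above_time += dt
                if self.above_time >= self.min_cross:
                    self.state = State.RECORDING
                    self.below_time = 0.0
                    self.rec_time = 0.0
        elif self.state is State.RECORDING:
            self.rec_time += dt
            self.below_time = 0.0 if above else self.below_time + dt
            if self.below_time >= self.hold + self.post_trigger:
                self.state = State.IDLE  # segment finished
            elif self.rec_time >= self.max_rec:
                self.rec_time = 0.0      # a real writer would split the file here
        return self.state
```

Feeding the machine one boolean per audio chunk keeps the timing logic independent of how the trigger condition itself is computed (amplitude, entropy, or both).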

Parameter Reference

Threshold

The amplitude level that initiates the trigger sequence. This can be set numerically or by dragging the yellow dashed line directly on the amplitude plot. Values are in linear amplitude (0.0 to 1.0 of full scale).

Min Cross

The minimum continuous duration (in seconds) the signal must remain above the threshold before a trigger is confirmed. This prevents brief transient spikes (e.g., a click or bump) from creating recordings. Typical values range from 0.01 s to 0.5 s.

Pre-Trigger Buffer

Duration of audio (in seconds) captured before the threshold crossing. A ring buffer continuously stores recent audio so that the onset of a vocalization is not lost. The pre-trigger audio is prepended to each recording file.
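A ring buffer of this kind can be sketched with a bounded deque. The class and method names below are hypothetical, and the actual implementation may use a NumPy array instead.

```python
from collections import deque


class PreTriggerBuffer:
    """Keeps only the most recent `seconds` of samples (illustrative sketch)."""

    def __init__(self, seconds: float, sample_rate: int):
        # A deque with maxlen silently discards the oldest samples when full.
        self._samples = deque(maxlen=int(seconds * sample_rate))

    def push(self, chunk) -> None:
        """Append a chunk of samples from the audio callback."""
        self._samples.extend(chunk)

    def snapshot(self) -> list:
        """Return the buffered audio, oldest first, to prepend to a recording."""
        return list(self._samples)


# Tiny example: a 0.5 s buffer at 8 samples per second holds 4 samples.
buf = PreTriggerBuffer(seconds=0.5, sample_rate=8)
buf.push(range(10))
print(buf.snapshot())  # [6, 7, 8, 9]
```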

Hold

How long (in seconds) recording continues after the signal drops below the threshold. This bridges short pauses within a single vocalization to avoid splitting it into many small files.

Post-Trigger

An additional duration (in seconds) appended after the last threshold crossing. Unlike Hold, Post-Trigger always adds this full duration once the hold period has expired, capturing any trailing reverberation or echo.

Max Rec

The maximum allowed duration (in seconds) of a single recording segment. If a continuous sound exceeds this limit, the current file is closed and a new one is started immediately. This prevents runaway recordings from filling disk space.

Output Files

Recordings are saved as uncompressed WAV files (PCM 16-bit) with the following naming pattern:

<prefix>_<epochMs>_YYYYMMDD_HHMMSS_mmm_<suffix>.wav

Where epochMs is the Unix epoch in milliseconds, and the timestamp reflects local time down to the millisecond. Files are written in a background thread to avoid blocking the audio processing pipeline.

Tip: The millisecond epoch timestamp keeps filenames unique and sortable regardless of time zone or daylight-saving changes to the local-time portion of the name.
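As an illustration of the naming pattern, the following sketch builds a filename from a local timestamp. The function name is hypothetical, and how Chirp handles an empty prefix or suffix is not specified here, so the sketch assumes both are non-empty.

```python
from datetime import datetime


def recording_filename(prefix: str, suffix: str, when: datetime) -> str:
    """Build <prefix>_<epochMs>_YYYYMMDD_HHMMSS_mmm_<suffix>.wav (sketch)."""
    millis = when.microsecond // 1000
    # A naive datetime is interpreted as local time by timestamp().
    epoch_ms = int(when.timestamp()) * 1000 + millis
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return f"{prefix}_{epoch_ms}_{stamp}_{millis:03d}_{suffix}.wav"


print(recording_filename("MicA", "trial", datetime.now()))
```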

Spectral Entropy Trigger

Introduced in v2.0, the spectral entropy trigger provides a frequency-domain criterion for identifying sounds of interest. It computes the normalized Shannon entropy of the FFT magnitude spectrum, producing a value between 0 and 1:

0 — pure tone (all energy at a single frequency)
1 — white noise (energy uniformly spread across all frequencies)

Tonal vocalizations (birdsong, bat calls, whale sounds) produce low entropy values. The trigger fires when entropy falls below the configured threshold, meaning a structured tonal signal has been detected.
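A minimal sketch of a normalized spectral entropy computation follows. It is not the code from chirp.py: the Hann window, the use of `rfft`, and the handling of silent frames are assumptions made for illustration.

```python
import numpy as np


def spectral_entropy(frame: np.ndarray) -> float:
    """Normalized Shannon entropy of the FFT magnitude spectrum, in [0, 1]."""
    windowed = frame * np.hanning(len(frame))  # assumption: Hann window
    mag = np.abs(np.fft.rfft(windowed))
    total = mag.sum()
    if total <= 0.0:
        return 1.0  # assumption: treat digital silence as "not tonal"
    p = mag / total                # magnitudes as a probability distribution
    p = p[p > 0]                   # skip empty bins to avoid log(0)
    entropy_bits = -(p * np.log2(p)).sum()
    return float(entropy_bits / np.log2(len(mag)))  # divide by the maximum


fs, n = 44_100, 1024
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * 3000 * t)
noise = np.random.default_rng(0).standard_normal(n)
print(spectral_entropy(tone), spectral_entropy(noise))
```

With a finite window a pure tone leaks into neighboring bins, so its measured entropy is low but not exactly 0; the noise frame should score close to 1.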

Detect Modes

The Detect Mode dropdown in the Trigger panel selects how amplitude and spectral entropy criteria are combined:

Mode	Trigger Condition
Amplitude Only	Legacy behavior. Only amplitude threshold is checked.
Spectral Only	Only entropy is checked. Amplitude threshold is ignored.
Amp AND Spectral	Both conditions must be met simultaneously.
Amp OR Spectral	Either condition alone is sufficient to trigger.

Tip: In noisy environments, use Amp AND Spectral mode to dramatically reduce false triggers. Broadband noise has high entropy and will not satisfy the spectral condition.
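The four modes reduce to simple boolean combinations of an amplitude test and an entropy test. The sketch below uses the mode labels from the table; the function name and the strict comparisons are assumptions.

```python
def trigger_condition(mode: str, amplitude: float, amp_threshold: float,
                      entropy: float, entropy_threshold: float) -> bool:
    """Combine the amplitude and spectral criteria for one analysis frame."""
    amp_hit = amplitude > amp_threshold      # loud enough
    tonal_hit = entropy < entropy_threshold  # low entropy means tonal
    if mode == "Amplitude Only":
        return amp_hit
    if mode == "Spectral Only":
        return tonal_hit
    if mode == "Amp AND Spectral":
        return amp_hit and tonal_hit
    if mode == "Amp OR Spectral":
        return amp_hit or tonal_hit
    raise ValueError(f"unknown detect mode: {mode!r}")


# Loud broadband noise: above the amplitude threshold but high entropy.
print(trigger_condition("Amp AND Spectral", 0.2, 0.05, 0.95, 0.6))  # False
```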

Entropy Trace Plot

When any spectral mode is active, an additional subplot appears below the amplitude plot showing the real-time entropy trace:

Mauve line — current entropy value over time
Peach dashed line — entropy threshold (draggable, like the amplitude threshold)
Scroll-wheel zoom — adjusts Y-axis range, centered on the mouse cursor position

A numeric entropy display label in the status area shows the current instantaneous value. For stereo streams, per-channel entropy values are combined using the same logic as the channel trigger mode setting (Average, Any, Both, Left, or Right).

Auto-Calibrate

The Auto Calibrate button in the Trigger panel measures the ambient noise floor and automatically sets the amplitude threshold above it.

How It Works

Acquisition must already be running.
Click Auto Calibrate. The system records ambient levels for a configurable duration (1 to 10 seconds).
The threshold is set to the 95th percentile of measured amplitude, multiplied by a safety margin.

Note: Ensure the target sound source is silent during calibration. Any vocalizations during the measurement period will raise the threshold unnecessarily.
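The calculation can be sketched in a few lines. The margin value of 1.5 and the clamp to full scale are assumptions; the actual safety margin used by Chirp is not stated here.

```python
import numpy as np


def auto_threshold(ambient: np.ndarray, margin: float = 1.5) -> float:
    """Return the 95th percentile of |ambient| times a safety margin (sketch)."""
    noise_level = np.percentile(np.abs(ambient), 95)
    # Assumption: never return a threshold above full scale (1.0).
    return float(min(noise_level * margin, 1.0))


# Two seconds of simulated low-level ambient noise at 44,100 Hz.
ambient = 0.01 * np.random.default_rng(1).standard_normal(2 * 44_100)
print(auto_threshold(ambient))
```

Using a percentile rather than the maximum makes the result less sensitive to a single stray click during the measurement window.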

Bandpass Filter

Each stream can optionally apply a 4th-order Butterworth bandpass filter. When enabled, the amplitude measurement used for threshold triggering is computed from the filtered signal, focusing detection on a specific frequency band.

Set the Lo Hz and Hi Hz fields in the Trigger panel to define the passband.
The spectrogram always displays the unfiltered signal so you retain full spectral context.
The filter is particularly useful when the target vocalization occupies a narrow frequency band but the environment contains broadband noise.
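A band-limited amplitude measurement of this kind can be sketched with SciPy. This is an illustration rather than Chirp's own code: the function name is hypothetical, and whether "4th-order" corresponds to `order=4` in `scipy.signal.butter` (which yields an 8th-order bandpass overall) or to a lower prototype order is an assumption.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def bandpass_envelope(x: np.ndarray, fs: float, lo_hz: float, hi_hz: float,
                      order: int = 4) -> np.ndarray:
    """Absolute value of the signal after a Butterworth bandpass filter."""
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = butter(order, [lo_hz, hi_hz], btype="bandpass", fs=fs, output="sos")
    return np.abs(sosfilt(sos, x))


fs = 44_100
t = np.arange(fs) / fs
in_band = np.sin(2 * np.pi * 3000 * t)   # inside a 2-4 kHz passband
out_band = np.sin(2 * np.pi * 200 * t)   # low-frequency rumble
print(bandpass_envelope(in_band, fs, 2000, 4000)[fs // 2:].max())
print(bandpass_envelope(out_band, fs, 2000, 4000)[fs // 2:].max())
```

The in-band tone passes almost unchanged while the 200 Hz tone is strongly attenuated, so only the former would contribute to a threshold crossing.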

Channel Modes

Input Channel Selection

Mode	Behavior
Mono	Uses a single-channel input.
Left	Extracts the left channel from a stereo device.
Right	Extracts the right channel from a stereo device.
Stereo	Uses both channels. A second spectrogram or waveform subplot is shown for the right channel; the amplitude plot draws left in blue and right in pink.

Stereo Trigger Modes

When operating in stereo, the Trigger Mode control determines how per-channel amplitude values are combined for threshold comparison:

Trigger Mode	Logic
Average	Mean of left and right amplitudes must exceed threshold.
Any Channel	Either channel exceeding threshold is sufficient.
Both Channels	Both channels must independently exceed threshold.
Left Channel	Only the left channel is evaluated.
Right Channel	Only the right channel is evaluated.
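The table maps directly onto a small decision function. The sketch below uses the mode labels from the table; the function name and the strict greater-than comparison are assumptions.

```python
def stereo_trigger(mode: str, left: float, right: float, threshold: float) -> bool:
    """Combine per-channel amplitudes according to the stereo trigger mode."""
    if mode == "Average":
        return (left + right) / 2.0 > threshold
    if mode == "Any Channel":
        return left > threshold or right > threshold
    if mode == "Both Channels":
        return left > threshold and right > threshold
    if mode == "Left Channel":
        return left > threshold
    if mode == "Right Channel":
        return right > threshold
    raise ValueError(f"unknown trigger mode: {mode!r}")


# A call picked up strongly on the left microphone only.
for mode in ("Average", "Any Channel", "Both Channels"):
    print(mode, stereo_trigger(mode, left=0.30, right=0.02, threshold=0.10))
```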

View Mode

Click View Mode in the transport bar to switch from the single-stream configuration view to a multi-stream monitoring grid.

All active streams are displayed simultaneously in a configurable grid (1 to 6 columns).
Each cell shows a compact spectrogram or waveform, amplitude plot, threshold line, detect / record events strip (v2.1), and status text.
The sidebar and configuration panels are hidden to maximize display area.
Sticky health badges (v2.1) are overlaid on each tile: SAT lights up if the stream has ever clipped, and DROP×N lights up if the capture has ever dropped chunks (with the session-total count). Since view mode hides the sidebar, these overlays are how you spot health problems while monitoring; clearing them is done from config mode by clicking the sidebar S / D badges.
Panel height is adjustable for each stream.
Click Config Mode to return to the single-stream editing view.

Tip: View Mode is ideal for unattended multi-device monitoring sessions. Set up your streams, arm recording, switch to View Mode, and leave the system running.

Settings Management

All configuration parameters can be saved to and loaded from JSON files using the Save, Save As, and Load buttons in the transport bar.

What Is Saved

Audio device names and sample rates
All trigger parameters (threshold, timings, band filter, detect mode, entropy settings)
Display options (FFT size, window, frequency scale, view mode, gain, dB range)
Channel and trigger mode selections
Output folder, prefix, suffix, and reference date

Device Resolution

When loading a configuration, Chirp resolves audio devices by name. If an exact match is not found, it attempts a partial string match. This allows configurations to be portable across systems where devices may have slightly different names.
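The two-stage lookup can be sketched as follows. The function name is hypothetical, and the case-insensitive substring test in either direction is an assumption about what "partial string match" means.

```python
from typing import Optional, Sequence


def resolve_device(saved_name: str, available: Sequence[str]) -> Optional[int]:
    """Return the index of the best-matching device name, or None (sketch)."""
    # Stage 1: exact match.
    for index, name in enumerate(available):
        if name == saved_name:
            return index
    # Stage 2: partial match (assumption: case-insensitive substring).
    wanted = saved_name.lower()
    for index, name in enumerate(available):
        candidate = name.lower()
        if wanted in candidate or candidate in wanted:
            return index
    return None


devices = ["Built-in Microphone", "USB Audio CODEC (2 in, 0 out)"]
print(resolve_device("USB Audio CODEC", devices))  # 1
```

In practice the list of names would likely come from the `name` field of each entry returned by `sounddevice.query_devices()`.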

Reference Date & Subfolder Naming

The Reference Date field enables automatic subfolder naming based on the number of days elapsed since a reference event. For example, in bird vocalization studies, you can set the hatch date. Recordings are then saved into subfolders named by the day count (e.g., day_042/), simplifying longitudinal data organization.
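The day count is a plain date difference. In the sketch below the function name is hypothetical, and two details are assumptions: the reference date itself counts as day 0, and the folder name is zero-padded to three digits as in the day_042 example.

```python
from datetime import date


def day_subfolder(reference: date, recorded: date) -> str:
    """Name the subfolder after the number of days since the reference date."""
    days = (recorded - reference).days  # assumption: reference date is day 0
    return f"day_{days:03d}"


print(day_subfolder(date(2026, 3, 16), date(2026, 4, 27)))  # day_042
```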

Error Logging

Every pipeline failure surfaced by the sidebar S / D / ! indicators is also appended to a plain-text log file named chirp_errors.log, written to the folder Chirp is launched from. The file is created on the first event, opened in append mode (so it survives across runs), and never deleted automatically; rotate it manually if it grows.

Line Format

Each event is a single tab-separated line:

<ISO timestamp>   <category>   stream=<name>   [file=<path>]   <message>

Example:

2026-04-27T14:32:18.412   saturation   stream=Mic A   file=D:\recordings\day_042\Mic_A_..._.wav   recording contains clipped samples (peak=1.0000)
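For post-session analysis, a line in this format can be split back into fields. The parser below is a sketch that assumes fields are separated by single tab characters and that the optional file field, when present, follows the stream field; the function name and the sample path are hypothetical.

```python
def parse_event(line: str) -> dict:
    """Split one chirp_errors.log line into its fields (illustrative sketch)."""
    parts = line.rstrip("\n").split("\t")
    event = {"timestamp": parts[0], "category": parts[1],
             "stream": None, "file": None}
    rest = parts[2:]
    if rest and rest[0].startswith("stream="):
        event["stream"] = rest.pop(0)[len("stream="):]
    if rest and rest[0].startswith("file="):  # optional field
        event["file"] = rest.pop(0)[len("file="):]
    event["message"] = "\t".join(rest)
    return event


sample = "\t".join([
    "2026-04-27T14:32:18.412", "saturation", "stream=Mic A",
    "file=out/day_042/clip.wav",  # hypothetical path
    "recording contains clipped samples (peak=1.0000)",
])
print(parse_event(sample))
```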

Categories

Category	What it means	File path
`queue_full`	Python audio queue overflowed — the ingestion thread fell behind and a chunk was dropped (the sidebar D badge). Throttled to one line per stream per second; the cumulative count is stamped on each line so you can see how many drops were suppressed between two log entries.	—
`os_drop`	PortAudio reported `input_overflow` — the driver / OS lost samples upstream of our queue (the `OS` tag on the ! badge). Same throttling as `queue_full`.	—
`ingest`	An exception was raised inside the per-entity ingest loop (DSP / FFT / trigger). The thread is preserved; a 3-frame traceback is included in the message. Every event is logged.	—
`open`	The audio capture (`AudioCapture`) or WAV-replay capture (`WavFileCapture`) failed to open. Every event is logged.	WAV-replay path on a WAV-open failure.
`wav_writer`	The writer pool failed to write a triggered WAV (disk full, bad path, permission, scipy crash). Every event is logged. A worker that dies entirely is also logged with a `worker died: ...` message before the supervisor respawns it.	Output folder of the failed write.
`saturation`	A successfully-written WAV contained at least one sample at `|x| ≥ 0.99` of full scale. Logged once per file (not per sample) so you can locate and review every recording that clipped without flooding the log.	Full path of the WAV that clipped.

Throttling

queue_full and os_drop can fire on every audio chunk (50+/second at the default sample rate / chunk size). To keep the log bounded and useful, those two categories are limited to one entry per (stream, category) per second. The first event in any burst always logs immediately; subsequent events within the window are suppressed. Because each line carries a cumulative count, the difference between two adjacent lines tells you how many events were suppressed in between.

All other categories — ingest, open, wav_writer, saturation — are not throttled; every event produces a line.
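The throttling rule for the two high-rate categories can be sketched as a per-key rate limiter that still counts every event. The class and method names are hypothetical; only the one-line-per-second window and the cumulative count come from the description above.

```python
import time
from typing import Optional


class EventThrottle:
    """Allow one log line per (stream, category) per window (sketch)."""

    def __init__(self, window_s: float = 1.0):
        self.window_s = window_s
        self._last_logged = {}  # (stream, category) -> time of last line
        self.counts = {}        # (stream, category) -> cumulative events

    def should_log(self, stream: str, category: str,
                   now: Optional[float] = None) -> bool:
        """Count the event and report whether it should produce a log line."""
        now = time.monotonic() if now is None else now
        key = (stream, category)
        self.counts[key] = self.counts.get(key, 0) + 1
        last = self._last_logged.get(key)
        if last is not None and now - last < self.window_s:
            return False  # suppressed, but still included in the count
        self._last_logged[key] = now
        return True


throttle = EventThrottle()
for t in (0.0, 0.2, 0.4, 1.1):
    print(t, throttle.should_log("Mic A", "queue_full", now=t))
```

Stamping `counts[key]` on each emitted line is what lets the difference between two adjacent lines reveal how many events were suppressed.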

Reliability

The logger is wrapped in a try/except — any I/O failure (path locked, disk full, permission error) is swallowed silently. Losing log lines is strictly preferable to crashing the audio pipeline.

Tip: When you spot a sticky badge mid-session, open chirp_errors.log and search for the stream name. The timestamp on the first matching line is when the failure actually occurred — the badge only tells you it happened, the log tells you when, where, and (for writer / saturation events) which file.

Mouse & Keyboard Interactions

Action	Target	Effect
Click & drag	Amplitude threshold line	Adjust amplitude trigger threshold
Click & drag	Entropy threshold line	Adjust spectral entropy threshold
Scroll wheel	Amplitude plot	Zoom Y-axis (anchored at zero)
Scroll wheel	Waveform plot	Zoom Y-axis (symmetric around zero)
Scroll wheel	Entropy plot	Zoom Y-axis (centered on mouse cursor)
Scroll wheel	Spectrogram	No effect (use Display panel frequency controls)
Right-click	Amplitude plot	Context menu: switch Y axis between Linear and Log (dB)
Right-click	Sidebar background	Context menu: Start All / Stop All

Tips & Best Practices

Setting the Threshold

Use Auto-Calibrate to quickly set a threshold above the ambient noise floor. Fine-tune by dragging the yellow threshold line on the amplitude plot while monitoring a live signal.

Capturing Complete Vocalizations

Set a generous Pre-Trigger buffer (0.5–2 s) to ensure you capture the onset of each vocalization. Use Hold to bridge short pauses within a phrase, and Post-Trigger to capture trailing echoes or reverberation.

Reducing False Triggers

Increase Min Cross to reject brief transients. Enable the bandpass filter to ignore out-of-band noise. In environments with broadband noise, switch to Amp AND Spectral detect mode so that only tonal sounds trigger recording.

Frequency Scale Selection

Use Mel frequency scale for bioacoustics work. Mel spacing approximates perceptual frequency resolution and provides better visual separation of harmonics in birdsong and other animal vocalizations. Use Log for general-purpose analysis and Linear when precise frequency measurements are needed.

Multi-Stream Monitoring

Configure all your streams in Config Mode, arm recording on each, then switch to View Mode for a clean monitoring display. Use the Sync checkbox to keep display parameters consistent across streams.

Long Recording Sessions

Always save your configuration before starting a long session. Set Max Rec to a reasonable limit (e.g., 300 s) to prevent individual files from growing too large. Monitor available disk space: uncompressed 16-bit WAV files consume approximately 88 KB/s per channel at 44,100 Hz (about 176 KB/s for stereo).
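Disk usage for 16-bit PCM follows from sample rate × bytes per sample × channels. The helper below is an illustrative calculation that ignores the small WAV header; the function name is not from chirp.py.

```python
def wav_bytes_per_second(sample_rate: int, channels: int = 1,
                         bits_per_sample: int = 16) -> int:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate * channels * bits_per_sample // 8


mono = wav_bytes_per_second(44_100)                 # 88,200 bytes/s
stereo = wav_bytes_per_second(44_100, channels=2)   # 176,400 bytes/s
print(mono, stereo)
print(f"{mono * 3600 / 1e6:.0f} MB per hour of continuous mono recording")
```

Triggered recording usually writes far less than this upper bound, since only the segments around each event are saved.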

Reference Date for Experiments

If you are tracking development over time (e.g., days post hatch in bird research), set the Reference Date to the start event. Recordings will be automatically organized into day-count subfolders, making longitudinal analysis straightforward.

FFT Settings

Larger FFT sizes (2048, 4096) provide finer frequency resolution but coarser time resolution. Smaller sizes (256, 512) give better time resolution at the expense of frequency detail. For birdsong, 1024 is usually a good compromise at 44,100 Hz sample rate. The Flattop window is best for accurate amplitude measurement of pure tones; Hann is the best general-purpose choice.
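The trade-off can be quantified: the frequency bin width is the sample rate divided by the FFT size, and the analysis window duration is the reciprocal of that. The short calculation below is illustrative and ignores window overlap and the extra smearing introduced by the window function.

```python
def fft_resolution(sample_rate: float, fft_size: int) -> tuple:
    """Return (bin width in Hz, window duration in seconds)."""
    return sample_rate / fft_size, fft_size / sample_rate


for size in (256, 512, 1024, 2048, 4096):
    bin_hz, window_s = fft_resolution(44_100, size)
    print(f"{size:5d} points: {bin_hz:7.2f} Hz per bin, {window_s * 1000:6.2f} ms window")
```

At 44,100 Hz, a 1024-point FFT gives bins of about 43 Hz over a window of about 23 ms.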
