Chirp

Real-Time Audio Monitoring, Visualization & Triggered Recording

v2.2.1

Overview

Chirp is a desktop application built with Python, PyQt5, and matplotlib for real-time audio monitoring, spectral visualization, and threshold-triggered recording. It was designed for bioacoustics research but is suitable for any scenario requiring continuous audio surveillance with automatic capture of events of interest.

Key Capabilities

The entire application is contained in a single Python file (chirp.py), making deployment and modification straightforward.

Installation

Requirements

Quick Start

# Create and activate a conda environment (recommended)
conda create -n chirp python=3.11 -y
conda activate chirp

# Install dependencies
pip install sounddevice numpy scipy matplotlib PyQt5

# Launch
python chirp.py

Tip: A conda or venv environment is recommended to avoid dependency conflicts with other Python projects on your system.

Interface Layout

The main window is divided into six functional areas: the sidebar, the central canvas, the transport bar, and three panels at the bottom (trigger, display, and settings).

The sidebar lists all configured recording streams. Each entry displays:

Click Add Recording at the bottom of the sidebar to create a new stream. Right-click the sidebar background for Start All and Stop All context actions.

Canvas (Center)

The central area hosts the matplotlib figure showing the spectrogram, waveform, and amplitude plots for the currently selected stream. A vertical green cursor line indicates the current write position in the circular buffer.

A thin detect / record events strip (added in v2.1) sits directly beneath the amplitude plot and has two rows: a yellow bar lights up for every sample at which the trigger condition is met (detect), and a green bar lights up for every sample the writer is actually capturing (record). The detect row is driven by the same per-sample trigger mask that the recording state machine consumes. There is no second computation, so what you see is exactly what the trigger did, including any bandpass filtering and spectral-entropy gating.

Transport Bar

Button          Action
Start Acq       Begin audio acquisition (fills buffer, enables display)
Stop Acq        Stop acquisition and clear the buffer
Start Rec       Arm threshold-triggered recording
Stop Rec        Disarm recording (finishes any in-progress file)
Reset Params    Restore all parameters to defaults
Save / Save As  Save current configuration to JSON
Load            Load a previously saved configuration
View Mode       Switch to multi-stream grid view

Status indicators in the transport bar show the global state: ACQ, REC, and TRIG.

Trigger Panel (Bottom Left)

Controls that govern when and how recordings are triggered:

Control            Description
Threshold          Amplitude level that triggers recording. Also draggable on the plot.
Min Cross          Minimum duration the signal must stay above threshold to confirm a trigger.
Hold               Duration to keep recording after the signal drops below threshold.
Post-Trigger       Additional time appended after the last threshold crossing.
Max Rec            Maximum duration of a single recording segment (auto-splits if exceeded).
Pre-Trigger        Duration of audio captured before the threshold crossing via ring buffer.
Band Lo / Hi Hz    Bandpass filter frequency range for trigger analysis.
Detect Mode        Trigger detection strategy (see Spectral Entropy Trigger).
Entropy Threshold  Spectral entropy level below which a tonal trigger fires.
Auto Calibrate     Automatically set threshold from ambient noise measurement.

Display Panel (Bottom Right)

Control          Description
Gain             Amplification applied to the input signal for display.
dB Floor / Ceil  Dynamic range limits for the spectrogram colormap.
FFT Size         Number of FFT points (256, 512, 1024, 2048, or 4096).
Window           FFT window function: Hann, Hamming, Blackman, Bartlett, or Flattop.
Freq Scale       Frequency axis mapping: Linear, Logarithmic, or Mel.
Display Lo / Hi  Visible frequency range on the spectrogram.
Buffer Duration  Length of the circular display buffer (5 to 60 seconds).
View             Visualization mode: Spectrogram, Waveform, or Both.
Sync             Synchronize display parameters across all streams.

Settings Panel (Bottom)

Control          Description
Output Folder    Directory where WAV recordings are saved.
Prefix / Suffix  Custom text prepended/appended to recording filenames.
Input Device     Audio input device selector with refresh button.
Channel Mode     Mono, Left, Right, or Stereo.
Trigger Mode     Channel logic for stereo triggering (Average, Any, Both, Left, Right).
Sample Rate      Audio sample rate from 8,000 Hz to 96,000 Hz.
Reference Date   Base date for day-count subfolder naming (e.g., days post hatch).

Visualization Modes

The View combo box in the Display panel selects which plots are shown on the canvas. All modes include an amplitude envelope subplot at the bottom.

Spectrogram

The default mode. The upper subplot shows a scrolling spectrogram rendered with matplotlib's imshow using the inferno colormap. The lower subplot shows the amplitude envelope (absolute sample values) as a blue line. A yellow dashed threshold line is drawn when recording is armed.

Waveform

The upper subplot displays the raw signed waveform as a teal line. The lower subplot shows the amplitude envelope, identical to spectrogram mode. This view is useful for inspecting transient shape and polarity.

Both

Three subplots stacked vertically: spectrogram on top, waveform in the middle, and amplitude envelope at the bottom. This provides the most complete picture at the cost of vertical space.

Stereo Display

When the channel mode is set to Stereo, an additional subplot is added for the right channel spectrogram or waveform. The amplitude plot combines both channels: left in blue, right in pink.

Visual Indicators

Saturation: Persistent red lines indicate clipping. Reduce the input gain on your audio interface or move the microphone further from the source.

Threshold-Triggered Recording

Chirp continuously monitors the amplitude envelope and automatically records audio segments when the signal exceeds a configurable threshold. This enables unattended data collection over extended periods.

State Machine

Recording progresses through three states:

  1. IDLE — Acquisition is running but recording is not armed, or the signal is below threshold.
  2. PENDING — The signal has crossed the threshold but the Min Cross duration has not yet elapsed. If the signal drops back below threshold before this duration, the trigger is cancelled (false-trigger rejection).
  3. RECORDING — A confirmed trigger. Audio is being written to a WAV file. Recording continues until the signal has been below threshold for the Hold duration, plus any Post-Trigger extension, or until Max Rec is reached.
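The three states can be sketched as a small per-sample state machine. This is an illustrative reconstruction rather than Chirp's actual code: the class and attribute names are hypothetical, durations are counted in samples instead of seconds, and Hold and Post-Trigger are assumed to add up to a single release time.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    PENDING = auto()
    RECORDING = auto()


class TriggerStateMachine:
    """Hypothetical trigger logic; all durations are counted in samples."""

    def __init__(self, min_cross, hold, post_trigger, max_rec):
        self.min_cross = min_cross
        self.hold = hold
        self.post_trigger = post_trigger
        self.max_rec = max_rec
        self.state = State.IDLE
        self.above = 0      # consecutive samples above threshold
        self.below = 0      # consecutive samples below threshold
        self.recorded = 0   # samples written to the current segment
        self.segments = 0   # number of files opened so far

    def step(self, triggered):
        """Consume one sample of the trigger mask and return the new state."""
        if self.state is State.IDLE and triggered:
            self.state = State.PENDING
            self.above = 0
        if self.state is State.PENDING:
            if not triggered:
                self.state = State.IDLE           # false-trigger rejection
            else:
                self.above += 1
                if self.above >= self.min_cross:
                    self.state = State.RECORDING  # confirmed trigger
                    self.below = 0
                    self.recorded = 0
                    self.segments += 1
        elif self.state is State.RECORDING:
            self.recorded += 1
            self.below = 0 if triggered else self.below + 1
            if self.below >= self.hold + self.post_trigger:
                self.state = State.IDLE           # close the file
            elif self.recorded >= self.max_rec:
                self.recorded = 0                 # Max Rec: split into a new file
                self.segments += 1
        return self.state
```

Feeding the machine one boolean per sample mirrors the per-sample trigger mask described for the detect / record strip; a real implementation would more likely process whole audio blocks at a time.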

Parameter Reference

Threshold

The amplitude level that initiates the trigger sequence. This can be set numerically or by dragging the yellow dashed line directly on the amplitude plot. Values are in linear amplitude (0.0 to 1.0 of full scale).

Min Cross

The minimum continuous duration (in seconds) the signal must remain above the threshold before a trigger is confirmed. This prevents brief transient spikes (e.g., a click or bump) from creating recordings. Typical values range from 0.01 s to 0.5 s.

Pre-Trigger Buffer

Duration of audio (in seconds) captured before the threshold crossing. A ring buffer continuously stores recent audio so that the onset of a vocalization is not lost. The pre-trigger audio is prepended to each recording file.
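A pre-trigger ring buffer can be sketched with a bounded deque. The class name and methods below are hypothetical, and Chirp's own buffer may well be a NumPy array with a write index instead; the point is only that old samples fall off the front as new ones arrive.

```python
from collections import deque


class PreTriggerBuffer:
    """Keeps only the most recent `seconds` of audio samples."""

    def __init__(self, seconds, sample_rate):
        # A deque with maxlen silently discards the oldest items when full.
        self._buf = deque(maxlen=int(seconds * sample_rate))

    def push(self, samples):
        """Append a block of incoming samples."""
        self._buf.extend(samples)

    def snapshot(self):
        """Return the buffered audio, oldest first, to prepend to a new file."""
        return list(self._buf)
```

When a trigger is confirmed, the writer would call `snapshot()` once and write the result ahead of the live audio.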

Hold

How long (in seconds) recording continues after the signal drops below the threshold. This bridges short pauses within a single vocalization to avoid splitting it into many small files.

Post-Trigger

An additional duration (in seconds) appended to the end of a recording. Unlike Hold, Post-Trigger always adds its full duration once the hold period has expired, capturing any trailing reverberation or echo.

Max Rec

The maximum allowed duration (in seconds) of a single recording segment. If a continuous sound exceeds this limit, the current file is closed and a new one is started immediately. This prevents runaway recordings from filling disk space.

Output Files

Recordings are saved as uncompressed WAV files (PCM 16-bit) with the following naming pattern:

<prefix>_<epochMs>_YYYYMMDD_HHMMSS_mmm_<suffix>.wav

Where epochMs is the Unix epoch in milliseconds, and the timestamp reflects local time down to the millisecond. Files are written in a background thread to avoid blocking the audio processing pipeline.
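As a rough illustration of the naming pattern, the following sketch builds a filename from a local timestamp. The function name is hypothetical, and how Chirp handles an empty prefix or suffix is not specified here, so both are assumed to be non-empty.

```python
from datetime import datetime


def recording_filename(prefix, suffix, when):
    """Build <prefix>_<epochMs>_YYYYMMDD_HHMMSS_mmm_<suffix>.wav from local time."""
    millis = when.microsecond // 1000
    # Whole seconds since the Unix epoch, then add the millisecond part.
    epoch_ms = int(when.replace(microsecond=0).timestamp()) * 1000 + millis
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return f"{prefix}_{epoch_ms}_{stamp}_{millis:03d}_{suffix}.wav"
```

Calling `recording_filename("bird", "mic1", datetime.now())` yields a name whose epoch field depends on the machine's time zone, while the date and time fields follow the local clock.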

Tip: Because the epoch timestamp does not depend on the local time zone, filenames stay unique and sort chronologically even when local time shifts during a session (for example, at a daylight-saving transition).

Spectral Entropy Trigger

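Introduced in v2.0, the spectral entropy trigger provides a frequency-domain criterion for identifying sounds of interest. It computes the normalized Shannon entropy of the FFT magnitude spectrum, producing a value between 0 and 1. Values near 0 indicate energy concentrated in a few frequency bins (tonal sounds); values near 1 indicate a flat, noise-like spectrum.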

Tonal vocalizations (birdsong, bat calls, whale sounds) produce low entropy values. The trigger fires when entropy falls below the configured threshold, meaning a structured tonal signal has been detected.
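A minimal sketch of normalized spectral entropy follows. It assumes a one-sided FFT with no window function and treats a silent frame as maximally flat; Chirp's actual windowing, frame length, and edge-case handling may differ, and the function name is hypothetical.

```python
import numpy as np


def spectral_entropy(frame):
    """Normalized Shannon entropy of the FFT magnitude spectrum, in [0, 1]."""
    magnitude = np.abs(np.fft.rfft(frame))
    total = magnitude.sum()
    if total == 0:
        return 1.0  # assumption: treat pure silence as a flat spectrum
    p = magnitude / total          # spectrum as a probability distribution
    p = p[p > 0]                   # skip empty bins so log2 is defined
    entropy = -np.sum(p * np.log2(p))
    # Dividing by log2(number of bins) maps the result onto 0..1.
    return float(entropy / np.log2(magnitude.size))
```

A pure tone centered on an FFT bin scores close to 0, while white noise scores close to 1, which is why an entropy value below the threshold suggests a tonal signal.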

Detect Modes

The Detect Mode dropdown in the Trigger panel selects how amplitude and spectral entropy criteria are combined:

Mode              Trigger Condition
Amplitude Only    Legacy behavior. Only amplitude threshold is checked.
Spectral Only     Only entropy is checked. Amplitude threshold is ignored.
Amp AND Spectral  Both conditions must be met simultaneously.
Amp OR Spectral   Either condition alone is sufficient to trigger.

Tip: In noisy environments, use Amp AND Spectral mode to dramatically reduce false triggers. Broadband noise has high entropy and will not satisfy the spectral condition.
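The four detect modes reduce to simple boolean combinations of two tests. The sketch below uses the mode names from the table as string keys; the function name and signature are hypothetical, and whether the comparisons are strict is an assumption.

```python
def trigger_condition(mode, amplitude, amp_threshold, entropy, entropy_threshold):
    """Return True when the selected detect mode considers the signal a trigger."""
    amp_ok = amplitude > amp_threshold        # loud enough
    tonal_ok = entropy < entropy_threshold    # low entropy means tonal
    if mode == "Amplitude Only":
        return amp_ok
    if mode == "Spectral Only":
        return tonal_ok
    if mode == "Amp AND Spectral":
        return amp_ok and tonal_ok
    if mode == "Amp OR Spectral":
        return amp_ok or tonal_ok
    raise ValueError(f"unknown detect mode: {mode}")
```

For loud broadband noise (high amplitude, high entropy), only Amplitude Only and Amp OR Spectral fire, which is the behavior the tip relies on.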

Entropy Trace Plot

When any spectral mode is active, an additional subplot appears below the amplitude plot showing the real-time entropy trace:

A numeric entropy display label in the status area shows the current instantaneous value. For stereo streams, per-channel entropy values are combined using the same logic as the channel trigger mode setting (Average, Any, Both, Left, or Right).

Auto-Calibrate

The Auto Calibrate button in the Trigger panel measures the ambient noise floor and automatically sets the amplitude threshold above it.

How It Works

  1. Acquisition must already be running.
  2. Click Auto Calibrate. The system records ambient levels for a configurable duration (1 to 10 seconds).
  3. The threshold is set to the 95th percentile of measured amplitude, multiplied by a safety margin.

Note: Ensure the target sound source is silent during calibration. Any vocalizations during the measurement period will raise the threshold unnecessarily.
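The calibration rule can be sketched in a few lines of NumPy. The function name is hypothetical and the default margin of 1.5 is an assumed placeholder, since the actual safety margin is not stated here.

```python
import numpy as np


def calibrate_threshold(ambient, margin=1.5):
    """Return the 95th percentile of |ambient| scaled by a safety margin.

    `margin` is an illustrative default, not Chirp's documented value.
    """
    envelope = np.abs(np.asarray(ambient, dtype=float))
    return float(np.percentile(envelope, 95) * margin)
```

Using a high percentile rather than the maximum keeps a single stray click during calibration from inflating the threshold.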

Bandpass Filter

Each stream can optionally apply a 4th-order Butterworth bandpass filter. When enabled, the amplitude measurement used for threshold triggering is computed from the filtered signal, focusing detection on a specific frequency band.
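A Butterworth bandpass of this kind can be built with SciPy, as in the sketch below. The function name is hypothetical, and the order convention is an assumption: `scipy.signal.butter` with `N=4` and `btype="bandpass"` produces a filter whose overall order is 8, so Chirp may pass a different `N` to obtain what it calls 4th order.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def bandpass(signal, low_hz, high_hz, sample_rate, order=4):
    """Apply a Butterworth bandpass and return the filtered signal."""
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = butter(order, [low_hz, high_hz], btype="bandpass",
                 fs=sample_rate, output="sos")
    return sosfilt(sos, signal)
```

The trigger amplitude would then be taken from `np.abs(bandpass(...))` rather than from the raw input, so out-of-band noise no longer counts toward the threshold.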

Channel Modes

Input Channel Selection

Mode    Behavior
Mono    Uses a single-channel input.
Left    Extracts the left channel from a stereo device.
Right   Extracts the right channel from a stereo device.
Stereo  Uses both channels. Dual spectrograms are displayed, and the amplitude plot shows left in blue and right in pink.

Stereo Trigger Modes

When operating in stereo, the Trigger Mode control determines how per-channel amplitude values are combined for threshold comparison:

Trigger Mode   Logic
Average        Mean of left and right amplitudes must exceed threshold.
Any Channel    Either channel exceeding threshold is sufficient.
Both Channels  Both channels must independently exceed threshold.
Left Channel   Only the left channel is evaluated.
Right Channel  Only the right channel is evaluated.
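The table above translates directly into code. In this sketch the function name and the use of the table's mode labels as string keys are assumptions made for illustration.

```python
def stereo_trigger(mode, left, right, threshold):
    """Combine per-channel amplitudes according to the stereo trigger mode."""
    if mode == "Average":
        return (left + right) / 2 > threshold
    if mode == "Any Channel":
        return left > threshold or right > threshold
    if mode == "Both Channels":
        return left > threshold and right > threshold
    if mode == "Left Channel":
        return left > threshold
    if mode == "Right Channel":
        return right > threshold
    raise ValueError(f"unknown trigger mode: {mode}")
```

With a loud left channel and a quiet right channel, Any Channel and Left Channel fire while Both Channels and Right Channel do not; Average depends on how far the mean lands from the threshold.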

View Mode

Click View Mode in the transport bar to switch from the single-stream configuration view to a multi-stream monitoring grid.

Tip: View Mode is ideal for unattended multi-device monitoring sessions. Set up your streams, arm recording, switch to View Mode, and leave the system running.

Settings Management

All configuration parameters can be saved to and loaded from JSON files using the Save, Save As, and Load buttons in the transport bar.

What Is Saved

Device Resolution

When loading a configuration, Chirp resolves audio devices by name. If an exact match is not found, it attempts a partial string match. This allows configurations to be portable across systems where devices may have slightly different names.
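The two-step lookup might look like the following sketch. The function name is hypothetical, and the case-insensitive, either-direction substring test is an assumption about what "partial string match" means; in practice the list of names would come from the audio backend.

```python
def resolve_device(saved_name, available):
    """Return the index of the best-matching device name, or None."""
    # Pass 1: exact match.
    for index, name in enumerate(available):
        if name == saved_name:
            return index
    # Pass 2: partial match (assumed case-insensitive, either direction).
    wanted = saved_name.lower()
    for index, name in enumerate(available):
        candidate = name.lower()
        if wanted in candidate or candidate in wanted:
            return index
    return None
```

Returning `None` leaves it to the caller to decide whether to fall back to a default device or prompt the user.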

Reference Date & Subfolder Naming

The Reference Date field enables automatic subfolder naming based on the number of days elapsed since a reference event. For example, in bird vocalization studies, you can set the hatch date. Recordings are then saved into subfolders named by the day count (e.g., day_042/), simplifying longitudinal data organization.
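A day-count subfolder name can be derived from two dates as sketched below. The function name is hypothetical, and the sketch assumes the reference date itself counts as day 0 and that names are zero-padded to three digits, as the day_042/ example suggests.

```python
from datetime import date


def day_subfolder(reference, today):
    """Name a subfolder by whole days elapsed since the reference date."""
    elapsed = (today - reference).days  # assumption: reference date is day 0
    return f"day_{elapsed:03d}"
```

For a hatch date of 1 January 2024, a recording made on 12 February 2024 would land in `day_042`.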

Mouse & Keyboard Interactions

Action        Target                    Effect
Click & drag  Amplitude threshold line  Adjust amplitude trigger threshold
Click & drag  Entropy threshold line    Adjust spectral entropy threshold
Scroll wheel  Amplitude plot            Zoom Y-axis (anchored at zero)
Scroll wheel  Waveform plot             Zoom Y-axis (symmetric around zero)
Scroll wheel  Entropy plot              Zoom Y-axis (centered on mouse cursor)
Scroll wheel  Spectrogram               No effect (use Display panel frequency controls)
Right-click   Sidebar background        Context menu: Start All / Stop All

Tips & Best Practices

Setting the Threshold

Use Auto-Calibrate to quickly set a threshold above the ambient noise floor. Fine-tune by dragging the yellow threshold line on the amplitude plot while monitoring a live signal.

Capturing Complete Vocalizations

Set a generous Pre-Trigger buffer (0.5–2 s) to ensure you capture the onset of each vocalization. Use Hold to bridge short pauses within a phrase, and Post-Trigger to capture trailing echoes or reverberation.

Reducing False Triggers

Increase Min Cross to reject brief transients. Enable the bandpass filter to ignore out-of-band noise. In environments with broadband noise, switch to Amp AND Spectral detect mode so that only tonal sounds trigger recording.

Frequency Scale Selection

Use Mel frequency scale for bioacoustics work. Mel spacing approximates perceptual frequency resolution and provides better visual separation of harmonics in birdsong and other animal vocalizations. Use Log for general-purpose analysis and Linear when precise frequency measurements are needed.
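For reference, one widely used Hz-to-mel mapping is shown below. Which mel formula Chirp applies is not stated, so treat this as an assumption; the function names are illustrative.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (common 2595 * log10(1 + f/700) form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

The mapping is close to linear below roughly 1 kHz and compresses higher frequencies, so equal steps on the mel axis cover progressively wider bands in Hz.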

Multi-Stream Monitoring

Configure all your streams in Config Mode, arm recording on each, then switch to View Mode for a clean monitoring display. Use the Sync checkbox to keep display parameters consistent across streams.

Long Recording Sessions

Always save your configuration before starting a long session. Set Max Rec to a reasonable limit (e.g., 300 s) to prevent individual files from growing too large. Monitor available disk space: uncompressed 16-bit WAV files consume approximately 88 KB/s per channel at 44,100 Hz (about 176 KB/s for stereo).
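The data rate of 16-bit PCM follows from sample rate times channels times two bytes per sample. The helper below is a hypothetical convenience for budgeting disk space and ignores the small WAV header.

```python
def wav_bytes_per_second(sample_rate, channels=1, bits_per_sample=16):
    """Raw PCM data rate in bytes per second (WAV header not included)."""
    return sample_rate * channels * bits_per_sample // 8


def wav_bytes_for(seconds, sample_rate, channels=1):
    """Approximate file size for a recording of the given length."""
    return seconds * wav_bytes_per_second(sample_rate, channels)
```

At 44,100 Hz a mono stream produces 88,200 bytes per second, so a 300 s segment is about 26.5 MB and an hour of continuous mono recording is roughly 318 MB.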

Reference Date for Experiments

If you are tracking development over time (e.g., days post hatch in bird research), set the Reference Date to the start event. Recordings will be automatically organized into day-count subfolders, making longitudinal analysis straightforward.

FFT Settings

Larger FFT sizes (2048, 4096) provide finer frequency resolution but coarser time resolution. Smaller sizes (256, 512) give better time resolution at the expense of frequency detail. For birdsong, 1024 is usually a good compromise at 44,100 Hz sample rate. The Flattop window is best for accurate amplitude measurement of pure tones; Hann is the best general-purpose choice.
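The trade-off can be made concrete with a short calculation: frequency bin width is the sample rate divided by the FFT size, and the analysis window length is the FFT size divided by the sample rate. The function name is illustrative, and overlap between successive frames is ignored.

```python
def fft_resolution(fft_size, sample_rate):
    """Return (bin width in Hz, window length in milliseconds)."""
    bin_hz = sample_rate / fft_size
    window_ms = 1000.0 * fft_size / sample_rate
    return bin_hz, window_ms
```

At 44,100 Hz, an FFT size of 1024 gives bins about 43 Hz wide over a window of about 23 ms, while 4096 narrows the bins to about 10.8 Hz at the cost of a window of about 93 ms.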