Recording audio in C across multiple platforms (Linux, macOS, Windows, and Raspberry Pi) requires careful handling of low-level buffers, sample formats, and platform-specific audio APIs. The C standard library provides no audio or microphone interface, so capture code has to talk to each operating system's audio API directly.
Each platform uses different audio subsystems with distinct APIs:
- macOS: Core Audio framework
- Windows: WASAPI (Windows Audio Session API) or DirectSound
- Linux & Raspberry Pi: ALSA (Advanced Linux Sound Architecture) or PulseAudio
Without a unified abstraction layer, developers must maintain separate initialization sequences, buffer management strategies, and error handling patterns for each operating system—significantly increasing complexity and maintenance burden.
This tutorial uses PvRecorder: a lightweight, cross-platform C library that provides a unified audio capture interface for real-time audio streaming. With PvRecorder, you can capture high-quality microphone input consistently across all major platforms, making it ideal for cross-platform voice-controlled applications, streaming speech-to-text, wake-word detection, and other real-time audio processing tasks.
By the end of this tutorial, you will be able to:
- build the example from the command line
- dynamically load the PvRecorder shared library at runtime
- open the microphone and stream audio frames in real time
- stop and clean up safely
This is a practical foundation for:
- wake word detection
- voice commands
- voice activity detection
- streaming speech-to-text
- speaker recognition
- speaker diarization
- noise suppression
Prerequisites
- C99-compatible compiler
- Windows: MinGW
Supported Platforms
- Linux (x86_64)
- macOS (x86_64, arm64)
- Windows (x86_64, arm64)
- Raspberry Pi (Zero, 3, 4, 5)
Project Setup
This is the folder structure used in this tutorial. You can organize your files differently if you like, but make sure to update the paths in the examples accordingly:
Step 1. Add PvRecorder library files
- Create a folder named pvrecorder/.
- Download the PvRecorder header files from GitHub and place them in:
- Download the correct library file for your platform and place it in:
Implement Dynamic Loading
PvRecorder distributes pre-built platform libraries, meaning:
- the shared library (.so, .dylib, .dll) is not linked at compile time
- the program loads it at runtime
- functions must be retrieved by name
So, we need to write small helper functions to:
- open the shared library
- look up function pointers
- close the library
Step 2. Include platform-specific headers
Why these matter
- On Windows systems, windows.h provides the LoadLibrary function to load a shared library and GetProcAddress to retrieve individual function pointers.
- On Unix-based systems, dlopen and dlsym from the dlfcn.h header provide the same functionality.
- Lastly, signal.h allows us to handle Ctrl-C later in this example.
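The includes described above, together with a Ctrl-C handler, might look like the following. This is a minimal sketch rather than the tutorial's exact code: the names is_interrupted, interrupt_handler, and install_interrupt_handler are hypothetical and chosen only for illustration.

```c
#include <signal.h>

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>   /* LoadLibrary, GetProcAddress, FreeLibrary */
#else
#include <dlfcn.h>     /* dlopen, dlsym, dlclose, dlerror */
#endif

/* Set by the handler and polled by the capture loop. volatile sig_atomic_t
   is the type the C standard guarantees can be written from a signal handler. */
static volatile sig_atomic_t is_interrupted = 0;

/* Hypothetical handler: it only records that Ctrl-C was pressed. */
static void interrupt_handler(int signal_number) {
    (void) signal_number;
    is_interrupted = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int install_interrupt_handler(void) {
    return (signal(SIGINT, interrupt_handler) == SIG_ERR) ? -1 : 0;
}
```

Keeping the handler down to a single flag write leaves the actual shutdown work (stopping the recorder, freeing memory) to the main loop, where it is safe to call library functions.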
Step 3. Define dynamic loading helper functions
3a. Open the shared library
3b. Load function symbols
3c. Close the library
3d. Print platform-correct errors
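Steps 3a to 3d can be sketched as four small wrappers around the platform loader. The function names open_dl, load_symbol, close_dl, and print_dl_error are assumptions made for this sketch; only the underlying calls (LoadLibraryA, GetProcAddress, FreeLibrary, GetLastError, dlopen, dlsym, dlclose, dlerror) come from the operating system.

```c
#include <stdio.h>

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* 3a. Open the shared library; returns NULL on failure. */
static void *open_dl(const char *path) {
#if defined(_WIN32) || defined(_WIN64)
    return (void *) LoadLibraryA(path);
#else
    return dlopen(path, RTLD_NOW);
#endif
}

/* 3b. Look up a symbol by name; returns NULL if it is not found. */
static void *load_symbol(void *handle, const char *symbol) {
#if defined(_WIN32) || defined(_WIN64)
    return (void *) GetProcAddress((HMODULE) handle, symbol);
#else
    return dlsym(handle, symbol);
#endif
}

/* 3c. Close the library once none of its functions are needed any more. */
static void close_dl(void *handle) {
#if defined(_WIN32) || defined(_WIN64)
    FreeLibrary((HMODULE) handle);
#else
    dlclose(handle);
#endif
}

/* 3d. Report the most recent loader error in the platform's own terms. */
static void print_dl_error(const char *message) {
#if defined(_WIN32) || defined(_WIN64)
    fprintf(stderr, "%s (error code %lu)\n", message, (unsigned long) GetLastError());
#else
    const char *detail = dlerror();
    fprintf(stderr, "%s: %s\n", message, detail ? detail : "no error reported");
#endif
}
```

A typical call sequence is open_dl, then one load_symbol per function you need, then close_dl at exit. Check every returned pointer for NULL and call print_dl_error before bailing out, since a wrong path or a library built for another architecture only shows up at runtime.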
Capturing Microphone Audio
Now that we've set up dynamic loading, we can actually use the PvRecorder API.
Step 4. Load the library file
Point library_path to the library file you previously downloaded, and dynamically load the library:
Step 5. Initialize the recorder
Dynamically load and call pv_recorder_init to initialize the recorder:
What these values mean
- frame_length: Number of audio samples captured per read operation
- device_index: Selected microphone (-1 = default device)
- buffered_frames_count: Number of audio frames buffered internally
You can choose any available microphone instead of using the default device—see selecting an available device for details.
Most speech recognition engines expect:
- 16-bit samples (int16_t)
- 16 kHz audio
- frames of 512–1024 samples
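To get a feel for these numbers, it helps to convert a frame length into time and memory. The helpers below are illustrative only (frame_duration_ms and frame_size_bytes are hypothetical names, not part of any library) and assume mono audio with int16_t samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Duration of one frame in milliseconds: samples / (samples per second) * 1000. */
static double frame_duration_ms(int32_t frame_length, int32_t sample_rate) {
    return 1000.0 * (double) frame_length / (double) sample_rate;
}

/* Size in bytes of the buffer that holds one frame of 16-bit mono samples. */
static size_t frame_size_bytes(int32_t frame_length) {
    return (size_t) frame_length * sizeof(int16_t);
}
```

At 16 kHz, a 512-sample frame covers 32 ms of audio and occupies 1024 bytes, and a 1024-sample frame covers 64 ms and occupies 2048 bytes. Your application therefore has roughly one frame duration to process each frame before the next one is ready.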
Step 6. Capture audio
Start the recorder and continuously read audio frames in real time:
If you're building a speech recognition pipeline, pass frame to your speech recognition engine for processing.
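If you don't have an engine wired up yet, a simple per-frame measurement is a useful stand-in for "processing". The sketch below computes the peak amplitude of a frame, which is enough to tell silence from speech at a glance. frame_peak is a hypothetical helper written for this example and assumes the frame holds int16_t samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest absolute sample value in the frame, in the range 0..32768.
   A peak that stays at or near 0 suggests a muted or wrong input device. */
static int32_t frame_peak(const int16_t *frame, int32_t frame_length) {
    int32_t peak = 0;
    if (frame == NULL) {
        return 0;
    }
    for (int32_t i = 0; i < frame_length; i++) {
        /* Widen before negating so that -32768 does not overflow int16_t. */
        int32_t magnitude = frame[i] < 0 ? -(int32_t) frame[i] : (int32_t) frame[i];
        if (magnitude > peak) {
            peak = magnitude;
        }
    }
    return peak;
}
```

Calling frame_peak(frame, frame_length) after each read and printing the result gives a quick live level meter while you speak into the microphone.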
Step 7. Stop and clean up
When done, stop and delete the recorder to free acquired memory:
Complete Example: Recording Audio in C
Here is the complete pvrecorder_tutorial.c you can copy, build, and run (update library_path to point to the correct library for your platform):
This is a simplified example but includes all the necessary components to get started. Check out the PvRecorder C demo on GitHub for a complete demo application.
Build & Run
Build and run the application:
Linux (gcc) and Raspberry Pi (gcc)
macOS (clang)
Windows (MinGW)
Troubleshooting
Even with correct setup, microphone recording in C can be affected by device configuration, buffering behavior, or platform audio drivers. The following checks help diagnose the most common issues when working with real-time audio capture.
1. Enable Debug Logging
Debug logging provides diagnostic messages that can reveal buffer overflows or silent frames. This is often the fastest way to identify timing or hardware issues during development.
2. Verify the Selected Input Device
Confirming the active microphone is useful if:
- your system has multiple input devices
- virtual audio devices are installed
- audio appears silent or distorted
To double-check what's being captured, record to a WAV file and listen back. The official PvRecorder C demo on GitHub includes a reference implementation.
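If you'd rather not pull in the demo, a WAV file for 16-bit mono PCM is simple enough to write by hand: a 44-byte header followed by the raw samples. The following is a minimal sketch, not the demo's implementation. write_le and write_wav are hypothetical names, and the sketch assumes you have already collected the captured frames into one contiguous buffer.

```c
#include <stdint.h>
#include <stdio.h>

/* Write `value` as `size` little-endian bytes, independent of host byte order. */
static int write_le(FILE *file, uint32_t value, int size) {
    for (int i = 0; i < size; i++) {
        if (fputc((int) ((value >> (8 * i)) & 0xFFu), file) == EOF) {
            return -1;
        }
    }
    return 0;
}

/* Write a 44-byte header for 16-bit mono PCM, then the samples.
   Returns 0 on success and -1 on any I/O error. */
static int write_wav(const char *path, const int16_t *samples,
                     uint32_t num_samples, uint32_t sample_rate) {
    FILE *file = fopen(path, "wb");
    if (file == NULL) {
        return -1;
    }
    const uint32_t data_bytes = num_samples * 2u;
    int failed = 0;
    failed |= (fwrite("RIFF", 1, 4, file) != 4);
    failed |= write_le(file, 36u + data_bytes, 4);  /* RIFF chunk size */
    failed |= (fwrite("WAVEfmt ", 1, 8, file) != 8);
    failed |= write_le(file, 16u, 4);               /* fmt chunk size */
    failed |= write_le(file, 1u, 2);                /* audio format: PCM */
    failed |= write_le(file, 1u, 2);                /* channels: mono */
    failed |= write_le(file, sample_rate, 4);
    failed |= write_le(file, sample_rate * 2u, 4);  /* byte rate */
    failed |= write_le(file, 2u, 2);                /* block align */
    failed |= write_le(file, 16u, 2);               /* bits per sample */
    failed |= (fwrite("data", 1, 4, file) != 4);
    failed |= write_le(file, data_bytes, 4);
    for (uint32_t i = 0; i < num_samples && !failed; i++) {
        failed |= write_le(file, (uint16_t) samples[i], 2);
    }
    if (fclose(file) != 0) {
        failed = 1;
    }
    return failed ? -1 : 0;
}
```

For example, write_wav("check.wav", buffer, num_samples, 16000) produces a file that any audio player can open, assuming the audio was captured at 16 kHz. For long recordings you would write the samples as they arrive and patch the two size fields at the end instead of buffering everything in memory.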
You can also verify that recording is active before reading frames:
3. Fix Skipping, Stuttering, or Choppy Audio
Audio that sounds uneven or intermittently silent typically indicates the application is not reading frames quickly enough. This results in internal buffer overflow.
To resolve:
- enable debug logging (see above)
- check for "overflow" messages
- increase the buffered_frames_count used during initialization
A higher buffer count uses more memory but tolerates longer gaps between reads before audio is lost.
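The trade-off can be estimated with simple arithmetic. The sketch below assumes the internal buffer holds buffered_frames_count frames of frame_length 16-bit samples each, which is an assumption about the library's internals inferred from the parameter description. Both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Approximate time, in milliseconds, the application can stall between reads
   before the buffer fills, assuming capacity = count * frame_length samples. */
static double buffer_tolerance_ms(int32_t buffered_frames_count,
                                  int32_t frame_length,
                                  int32_t sample_rate) {
    return 1000.0 * (double) buffered_frames_count * (double) frame_length
           / (double) sample_rate;
}

/* Approximate memory used by that buffer for 16-bit mono samples. */
static size_t buffer_size_bytes(int32_t buffered_frames_count, int32_t frame_length) {
    return (size_t) buffered_frames_count * (size_t) frame_length * sizeof(int16_t);
}
```

Under these assumptions, 50 frames of 512 samples at 16 kHz give about 1.6 seconds of slack for 51,200 bytes, and doubling the count to 100 gives about 3.2 seconds for 102,400 bytes. The memory cost is small, so if overflow messages persist after raising the count, the processing in the read loop is more likely the bottleneck.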
4. Investigate Low-Level Audio Backend Issues
PvRecorder uses miniaudio internally. If you suspect driver, hardware, or OS-level issues, try miniaudio's standalone capture example.
If miniaudio exhibits the same symptoms, the root cause is likely system-level rather than application code.
Next Steps: Build a Speech Recognition Pipeline
With reliable microphone input streaming into your C application, you can extend the project into voice processing and transcription. A typical next step is feeding captured PCM frames into a real-time speech-to-text engine.
Real-Time Streaming Speech Recognition in C
Cheetah Streaming Speech-to-Text: convert speech audio into text continuously with low latency, suitable for voice assistants and transcription tools.