
Recording audio in C across multiple platforms (Linux, macOS, Windows, and Raspberry Pi) requires careful handling of low-level buffers, sample formats, and platform-specific audio APIs. Unlike many higher-level languages, C has no standard microphone interface: the standard library offers nothing for audio capture, so each operating system needs its own code path.

Each platform uses a different audio subsystem with its own API, for example ALSA on Linux and Raspberry Pi OS, Core Audio on macOS, and WASAPI on Windows.

Without a unified abstraction layer, developers must maintain separate initialization sequences, buffer management strategies, and error handling patterns for each operating system, which significantly increases complexity and maintenance burden.

This tutorial uses PvRecorder: a lightweight, cross-platform C library that provides a unified audio capture interface for real-time audio streaming. With PvRecorder, you can capture high-quality microphone input consistently across all major platforms, making it ideal for cross-platform voice-controlled applications, streaming speech-to-text, wake-word detection, and other real-time audio processing tasks.

By the end of this tutorial, you will be able to:

  • build the example from the command line
  • dynamically load the PvRecorder shared library at runtime
  • open the microphone and stream audio frames in real time
  • stop and clean up safely

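This is a practical foundation for the real-time audio applications mentioned above, such as streaming speech-to-text and wake-word detection.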

Prerequisites

  • C99-compatible compiler
  • Windows: MinGW

Supported Platforms

  • Linux (x86_64)
  • macOS (x86_64, arm64)
  • Windows (x86_64, arm64)
  • Raspberry Pi (Zero, 3, 4, 5)

Project Setup

This is the folder structure used in this tutorial. You can organize your files differently if you like, but make sure to update the paths in the examples accordingly:

Step 1. Add PvRecorder library files

  1. Create a folder named pvrecorder/.
  2. Download the pvrecorder header files from GitHub and place them in:
  3. Download the correct library file for your platform and place it in:

Implement Dynamic Loading

PvRecorder distributes pre-built platform libraries, meaning:

  • the shared library (.so, .dylib, .dll) is not linked at compile time
  • the program loads it at runtime
  • functions must be retrieved by name

So, we need to write small helper functions to:

  1. open the shared library
  2. look up function pointers
  3. close the library

Step 2. Include platform-specific headers

Why these matter

  • On Windows systems, windows.h provides the LoadLibrary function to load a shared library and GetProcAddress to retrieve individual function pointers.
  • On Unix-based systems, dlopen and dlsym from the dlfcn.h header provide the same functionality.
  • Lastly, signal.h allows us to handle Ctrl-C later in this example.
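A minimal sketch of the conditional includes described above. The `_WIN32` / `_WIN64` check is a common convention and an assumption here, not necessarily the exact guard used by the original example.

```c
#include <signal.h>   /* signal() and SIGINT, used later to handle Ctrl-C */
#include <stdio.h>    /* fprintf() for error reporting */

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>  /* LoadLibrary, GetProcAddress, FreeLibrary */
#else
#include <dlfcn.h>    /* dlopen, dlsym, dlclose, dlerror */
#endif
```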

Step 3. Define dynamic loading helper functions

3a. Open the shared library
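One possible shape for this helper is sketched below. The name `open_dl` is hypothetical; the function simply wraps `LoadLibraryA` on Windows and `dlopen` elsewhere, and returns `NULL` when the library cannot be loaded.

```c
#include <stddef.h>

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* Hypothetical helper: returns an opaque library handle, or NULL on failure. */
void *open_dl(const char *dl_path) {
#if defined(_WIN32) || defined(_WIN64)
    return (void *) LoadLibraryA(dl_path);
#else
    /* RTLD_NOW resolves all symbols immediately, so problems surface here. */
    return dlopen(dl_path, RTLD_NOW);
#endif
}
```

On older Linux toolchains, programs that call `dlopen` may also need `-ldl` on the link line.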

3b. Load function symbols
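A sketch of a symbol lookup helper, assuming a handle obtained from the platform's library loader. The name `load_symbol` is hypothetical; it wraps `GetProcAddress` on Windows and `dlsym` elsewhere, and returns `NULL` if the symbol is not found.

```c
#include <stddef.h>

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* Hypothetical helper: looks up `symbol` by name in an already opened library. */
void *load_symbol(void *handle, const char *symbol) {
#if defined(_WIN32) || defined(_WIN64)
    return (void *) GetProcAddress((HMODULE) handle, symbol);
#else
    return dlsym(handle, symbol);
#endif
}
```

The returned pointer is untyped, so the caller casts it to a function pointer type that matches the declaration in the PvRecorder header before calling it.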

3c. Close the library
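A sketch of the matching close helper. The name `close_dl` and the integer return convention (0 for success, -1 for failure) are assumptions made for illustration.

```c
#include <stddef.h>

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* Hypothetical helper: releases a library handle. Returns 0 on success, -1 on failure. */
int close_dl(void *handle) {
    if (handle == NULL) {
        return 0; /* nothing to close */
    }
#if defined(_WIN32) || defined(_WIN64)
    /* FreeLibrary returns nonzero on success. */
    return FreeLibrary((HMODULE) handle) ? 0 : -1;
#else
    /* dlclose returns 0 on success. */
    return (dlclose(handle) == 0) ? 0 : -1;
#endif
}
```

Function pointers obtained from the library must not be called after it has been closed.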

3d. Print platform-correct errors
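Windows reports loader failures through `GetLastError`, while Unix-like systems expose a message through `dlerror`. The sketch below, with the hypothetical name `print_dl_error`, prints whichever is available and returns the number of characters written.

```c
#include <stdio.h>

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* Hypothetical helper: prints `message` plus the platform's loader error detail. */
int print_dl_error(const char *message) {
#if defined(_WIN32) || defined(_WIN64)
    return fprintf(stderr, "%s with code %lu.\n", message, (unsigned long) GetLastError());
#else
    /* dlerror() returns NULL when no error has occurred since the last call. */
    const char *detail = dlerror();
    return fprintf(stderr, "%s: %s.\n", message, detail ? detail : "no additional detail");
#endif
}
```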

Capturing Microphone Audio

Now that we've set up dynamic loading, we can actually use the PvRecorder API.

Step 4. Load the library file

Point library_path to the library file you previously downloaded, and dynamically load the library:
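If you want to avoid hard-coding a different `library_path` on every platform, one option is to derive the file extension at compile time. This is only a sketch: `shared_library_extension` and `build_library_path` are hypothetical helpers, and the path you pass in is whatever location you chose for the downloaded library.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Shared library extension for the platform being compiled for. */
const char *shared_library_extension(void) {
#if defined(_WIN32) || defined(_WIN64)
    return ".dll";
#elif defined(__APPLE__)
    return ".dylib";
#else
    return ".so";
#endif
}

/* Writes "<path_without_extension><ext>" into `out`.
 * Returns 0 on success, -1 if the buffer is too small. */
int build_library_path(char *out, size_t out_size, const char *path_without_extension) {
    int written = snprintf(out, out_size, "%s%s", path_without_extension, shared_library_extension());
    return (written >= 0 && (size_t) written < out_size) ? 0 : -1;
}
```

The resulting string is what you would hand to the library loader from Step 3a.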

Step 5. Initialize the recorder

Dynamically load and call pv_recorder_init to initialize the recorder:
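The following sketch shows the general pattern, not the original listing. The function pointer type is assumed to mirror `pv_recorder_init` as declared in `pv_recorder.h`, the `int` return type stands in for the library's status enum with 0 assumed to mean success, and `init_recorder` is a hypothetical wrapper. In a real program, include the PvRecorder header instead of the stand-in declarations.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in so the sketch compiles on its own; real code includes pv_recorder.h. */
typedef struct pv_recorder pv_recorder_t;

/* Assumed signature of pv_recorder_init (status 0 assumed to mean success). */
typedef int (*pv_recorder_init_func_t)(
        int32_t frame_length,
        int32_t device_index,
        int32_t buffered_frames_count,
        pv_recorder_t **object);

/* `init_func` is the pointer obtained by looking up "pv_recorder_init" in the loaded library. */
pv_recorder_t *init_recorder(
        pv_recorder_init_func_t init_func,
        int32_t frame_length,
        int32_t device_index,
        int32_t buffered_frames_count) {
    if (init_func == NULL) {
        fprintf(stderr, "Failed to load 'pv_recorder_init'.\n");
        return NULL;
    }

    pv_recorder_t *recorder = NULL;
    int status = init_func(frame_length, device_index, buffered_frames_count, &recorder);
    if (status != 0) {
        fprintf(stderr, "Failed to initialize recorder (status %d).\n", status);
        return NULL;
    }
    return recorder;
}
```

A call such as `init_recorder(init_func, 512, -1, 50)` would request 512-sample frames from the default device; the value 50 is only an illustrative buffer count.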

What these values mean

  • frame_length: Number of audio samples captured per read operation
  • device_index: Selected microphone (-1 = default device)
  • buffered_frames_count: Number of audio frames buffered internally

You can choose any available microphone instead of using the default device; see selecting an available device for details.
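To find a device index, the available devices can be listed first. The sketch below assumes the signatures of `pv_recorder_get_available_devices` and `pv_recorder_free_available_devices` as they appear in recent versions of `pv_recorder.h`; verify them against your copy of the header. `print_available_devices` is a hypothetical wrapper that takes the two function pointers after they have been looked up in the loaded library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed signatures; the int return stands in for the library's status enum (0 = success). */
typedef int (*pv_recorder_get_available_devices_func_t)(int32_t *device_list_length, char ***device_list);
typedef void (*pv_recorder_free_available_devices_func_t)(int32_t device_list_length, char **device_list);

/* Prints each device with its index. Returns the device count, or -1 on failure. */
int32_t print_available_devices(
        pv_recorder_get_available_devices_func_t get_devices_func,
        pv_recorder_free_available_devices_func_t free_devices_func) {
    if (get_devices_func == NULL || free_devices_func == NULL) {
        return -1;
    }

    int32_t count = 0;
    char **devices = NULL;
    if (get_devices_func(&count, &devices) != 0) {
        fprintf(stderr, "Failed to get available devices.\n");
        return -1;
    }

    for (int32_t i = 0; i < count; i++) {
        printf("index: %d, name: %s\n", (int) i, devices[i]);
    }

    /* The list is allocated by the library, so the library must free it. */
    free_devices_func(count, devices);
    return count;
}
```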

Most speech recognition engines expect:

  • 16-bit samples (int16_t)
  • 16 kHz audio
  • frames of 512–1024 samples
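These numbers translate directly into timing and memory. As a small worked example (the helper names are hypothetical), a 512-sample frame at 16 kHz covers 32 ms of audio and occupies 1024 bytes:

```c
#include <stddef.h>
#include <stdint.h>

/* Duration of one frame in milliseconds. */
double frame_duration_ms(int32_t frame_length, int32_t sample_rate) {
    return 1000.0 * frame_length / sample_rate;
}

/* Size of one frame of 16-bit mono samples in bytes. */
size_t frame_size_bytes(int32_t frame_length) {
    return (size_t) frame_length * sizeof(int16_t);
}
```

So a loop that reads 512-sample frames has roughly 32 ms to process each frame before the next one arrives.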

Step 6. Capture audio

Start the recorder and continuously read audio frames in real time:
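A sketch of such a loop follows. The function pointer types are assumed to mirror `pv_recorder_start` and `pv_recorder_read` from `pv_recorder.h` (with `int` standing in for the status enum and 0 assumed to mean success). `capture_until_interrupted`, `interrupt_handler`, and `is_interrupted` are hypothetical names.

```c
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in so the sketch compiles on its own; real code includes pv_recorder.h. */
typedef struct pv_recorder pv_recorder_t;
typedef int (*pv_recorder_start_func_t)(pv_recorder_t *object);
typedef int (*pv_recorder_read_func_t)(pv_recorder_t *object, int16_t *frame);

/* Set from the signal handler, so it must be volatile sig_atomic_t. */
volatile sig_atomic_t is_interrupted = 0;

void interrupt_handler(int signum) {
    (void) signum;
    is_interrupted = 1;
}

/* Reads frames until Ctrl-C. Returns the number of frames read, or -1 on error. */
long capture_until_interrupted(
        pv_recorder_start_func_t start_func,
        pv_recorder_read_func_t read_func,
        pv_recorder_t *recorder,
        int32_t frame_length) {
    if (start_func == NULL || read_func == NULL || recorder == NULL || frame_length <= 0) {
        return -1;
    }
    signal(SIGINT, interrupt_handler);

    /* One frame holds frame_length 16-bit samples. */
    int16_t *frame = malloc((size_t) frame_length * sizeof(int16_t));
    if (frame == NULL) {
        return -1;
    }
    if (start_func(recorder) != 0) {
        fprintf(stderr, "Failed to start recorder.\n");
        free(frame);
        return -1;
    }

    long frames_read = 0;
    while (!is_interrupted) {
        if (read_func(recorder, frame) != 0) {
            fprintf(stderr, "Failed to read a frame.\n");
            free(frame);
            return -1;
        }
        frames_read++;
        /* `frame` now holds frame_length samples; process or forward them here. */
    }

    free(frame);
    return frames_read;
}
```

Keep the work inside the loop short, since each read is expected to keep pace with the microphone.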

If you're building a speech recognition pipeline, pass frame to your speech recognition engine for processing.

Step 7. Stop and clean up

When done, stop the recorder and delete it to release the resources it acquired:
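A sketch of the teardown order, assuming `pv_recorder_stop` returns a status (0 assumed to mean success) and `pv_recorder_delete` returns nothing, as in `pv_recorder.h`. `shutdown_recorder` is a hypothetical wrapper.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in so the sketch compiles on its own; real code includes pv_recorder.h. */
typedef struct pv_recorder pv_recorder_t;
typedef int (*pv_recorder_stop_func_t)(pv_recorder_t *object);
typedef void (*pv_recorder_delete_func_t)(pv_recorder_t *object);

/* Stops, then deletes the recorder. Returns 0 on success, -1 if any step failed. */
int shutdown_recorder(
        pv_recorder_stop_func_t stop_func,
        pv_recorder_delete_func_t delete_func,
        pv_recorder_t *recorder) {
    if (recorder == NULL) {
        return 0; /* nothing to release */
    }

    int result = 0;
    if (stop_func == NULL || stop_func(recorder) != 0) {
        fprintf(stderr, "Failed to stop recorder.\n");
        result = -1;
    }

    /* Delete even if stopping failed, so the recorder is not leaked. */
    if (delete_func != NULL) {
        delete_func(recorder);
    } else {
        result = -1;
    }
    return result;
}
```

Close the shared library only after the recorder has been deleted, because both functions live inside it.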

Complete Example: Recording Audio in C

Here is the complete pvrecorder_tutorial.c you can copy, build, and run (update library_path to point to the correct library for your platform):

This is a simplified example but includes all the necessary components to get started. Check out the PvRecorder C demo on GitHub for a complete demo application.

Build & Run

Build and run the application:

Linux (gcc) and Raspberry Pi (gcc)

macOS (clang)

Windows (MinGW)

Troubleshooting

Even with correct setup, microphone recording in C can be affected by device configuration, buffering behavior, or platform audio drivers. The following checks help diagnose the most common issues when working with real-time audio capture.

1. Enable Debug Logging

Debug logging provides diagnostic messages that can reveal buffer overflows or silent frames. This is often the fastest way to identify timing or hardware issues during development.
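A sketch of turning it on, assuming `pv_recorder_set_debug_logging` takes the recorder and a boolean flag as declared in `pv_recorder.h`. `enable_debug_logging` is a hypothetical wrapper around the looked-up function pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in so the sketch compiles on its own; real code includes pv_recorder.h. */
typedef struct pv_recorder pv_recorder_t;
typedef void (*pv_recorder_set_debug_logging_func_t)(pv_recorder_t *object, bool is_debug_logging_enabled);

/* Returns true if logging was switched on, false if an argument was missing. */
bool enable_debug_logging(
        pv_recorder_set_debug_logging_func_t set_debug_logging_func,
        pv_recorder_t *recorder) {
    if (set_debug_logging_func == NULL || recorder == NULL) {
        return false;
    }
    set_debug_logging_func(recorder, true);
    return true;
}
```

Call it after initialization and before starting the recorder so that messages from the first frames are not missed.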

2. Verify the Selected Input Device

Confirming the active microphone is useful if:

  • your system has multiple input devices
  • virtual audio devices are installed
  • audio appears silent or distorted

To double-check what's being captured, record to a WAV file and listen back. The official PvRecorder C demo on GitHub includes a reference implementation.

You can also verify that recording is active before reading frames:
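A sketch of such a check, assuming `pv_recorder_get_is_recording` and `pv_recorder_get_selected_device` exist with the signatures shown (as in recent versions of `pv_recorder.h`). `report_recorder_state` is a hypothetical wrapper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in so the sketch compiles on its own; real code includes pv_recorder.h. */
typedef struct pv_recorder pv_recorder_t;
typedef bool (*pv_recorder_get_is_recording_func_t)(pv_recorder_t *object);
typedef const char *(*pv_recorder_get_selected_device_func_t)(pv_recorder_t *object);

/* Prints the selected device and recording state. Returns true only if recording is active. */
bool report_recorder_state(
        pv_recorder_get_is_recording_func_t get_is_recording_func,
        pv_recorder_get_selected_device_func_t get_selected_device_func,
        pv_recorder_t *recorder) {
    if (get_is_recording_func == NULL || get_selected_device_func == NULL || recorder == NULL) {
        return false;
    }

    bool is_recording = get_is_recording_func(recorder);
    printf("Selected device: %s\n", get_selected_device_func(recorder));
    printf("Recording: %s\n", is_recording ? "yes" : "no");
    return is_recording;
}
```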

3. Fix Skipping, Stuttering, or Choppy Audio

Audio that sounds uneven or intermittently silent typically indicates that the application is not reading frames quickly enough. Unread frames accumulate until the internal buffer overflows and audio is dropped.

To resolve:

  1. enable debug logging (see above)
  2. check for "overflow" messages
  3. increase the buffered_frames_count used during initialization

A higher buffer count increases memory usage but lets the recorder absorb longer pauses in your read loop before audio is lost.
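To get a feel for the trade-off, the rough arithmetic can be written down as below. This is an estimate that assumes the internal buffer holds `buffered_frames_count` frames of `frame_length` 16-bit samples; the library's actual internals may differ, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Approximate audio duration the buffer can hold, in milliseconds. */
double buffer_capacity_ms(int32_t frame_length, int32_t buffered_frames_count, int32_t sample_rate) {
    return 1000.0 * frame_length * buffered_frames_count / sample_rate;
}

/* Approximate memory used by the buffered samples, in bytes. */
size_t buffer_size_bytes(int32_t frame_length, int32_t buffered_frames_count) {
    return (size_t) frame_length * (size_t) buffered_frames_count * sizeof(int16_t);
}
```

With 512-sample frames at 16 kHz, 50 buffered frames cover about 1.6 seconds of audio for roughly 50 KB of memory, so raising the count is usually cheap.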

4. Investigate Low-Level Audio Backend Issues

PvRecorder uses miniaudio internally. If you suspect driver, hardware, or OS-level issues, try miniaudio's standalone capture example.

If miniaudio exhibits the same symptoms, the root cause is likely system-level rather than application code.

Next Steps: Build a Speech Recognition Pipeline

With reliable microphone input streaming into your C application, you can extend the project into voice processing and transcription. A typical next step is feeding captured PCM frames into a real-time speech-to-text engine.

Real-Time Streaming Speech Recognition in C

Cheetah Streaming Speech-to-Text: convert speech audio into text continuously with low latency, suitable for voice assistants and transcription tools.


Frequently Asked Questions

What is the best audio format for speech recognition in C?
Most speech recognition engines expect single-channel, 16-bit PCM audio sampled at 16 kHz. Capturing in this format from the start avoids resampling or converting samples before they reach the engine.

How do I choose the correct microphone device in C?
Use the device enumeration API (e.g., pv_recorder_get_available_devices) to list all connected audio devices. Each device has an index; pass that index to the recorder initialization function. To use the default device, pass -1.

Why is my audio choppy or skipping?
Choppy audio usually indicates buffer overflows. You can resolve this by increasing the internal buffer count (buffered_frames_count) when initializing the recorder. Also, ensure your frame length is appropriate for real-time processing (commonly 512–1024 samples).

Can I record audio on multiple platforms with the same C code?
Yes. By using a cross-platform audio library like PvRecorder, you can write a single C codebase that captures audio on Linux, Windows, macOS, and Raspberry Pi.

How do I integrate captured audio with a speech recognition engine?
Once you have raw PCM frames from your microphone, pass them directly into the speech recognition engine's streaming API or buffer. Ensure the audio format (sample rate, bit depth, and channel count) matches the engine's requirements.