ALSA / PulseAudio Audio Processing | Shortwave Radio | RTL-SDR | KiwiSDR

Transmitting an effective voice signal, especially in the HF spectrum, requires more than a good antenna, lots of power, and a stable, clean-sounding transmitter. Attention must be devoted to the audio signal chain, and some manipulation of the amplitude dynamics is necessary. Specifically, the human voice normally has a peak-to-average ratio that suits face-to-face conversation but works poorly in noisy radio conditions. A voice is heard much better if its spectral components are enhanced and the peak-to-average ratio is reduced. It is also essential to incorporate hard limiting to prevent overdriving downstream components in the transmit signal path. Traditionally, any well-equipped radio studio or ham shack included an audio processor rack to do these tasks. More spartan stations relied on internal transmitter audio processing, which works but offers less flexibility.
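To make the peak-to-average idea concrete, here is a minimal Python sketch. It is not part of the ALSA setup described on this page; the test signal and the 0.2 ceiling are made-up values chosen only to show how clamping peaks lowers the ratio.

```python
import math

def crest_factor_db(samples):
    """Peak-to-average (crest) ratio in dB: peak amplitude over RMS amplitude."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

def hard_limit(samples, ceiling):
    """Clamp every sample to +/- ceiling (a crude stand-in for a real limiter)."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

# HYPOTHETICAL test signal: a quiet tone with a few tall, sparse peaks.
signal = [0.1 * math.sin(2 * math.pi * n / 50) for n in range(1000)]
for n in range(0, 1000, 100):
    signal[n] = 1.0

before = crest_factor_db(signal)
after = crest_factor_db(hard_limit(signal, 0.2))
print(f"crest factor before limiting: {before:.1f} dB")
print(f"crest factor after limiting:  {after:.1f} dB")
```

With the peaks clamped, the same average level can be raised closer to the transmitter's maximum without overdriving it.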

Ten-Tec Speech Processor

Behringer Speech Processor

Software Speech Processor

Send that expensive outboard audio processing equipment to the local radio museum. Audio processing for broadcasting, podcasting, or amateur radio can be done very nicely on a personal computer using ALSA with LADSPA plugins. Recently, two amateur radio operators and bloggers, VK2MEV and VK5FSCK, have made substantial progress implementing real-time software speech processing. It is not at all difficult to do on modest computers! Most Linux distributions come with the basic audio architecture already installed; getting the plugins requires a short download. Configuring the system for very nice equalization, compression, and limiting takes a few more minutes, using the data given here.

LADSPA Plugins

These software modules have been very popular in audio editing. Any user of a digital audio workstation has seen and used the LADSPA plugins for cleaning up, adding special effects, and other audio processing tasks. The "Tom's Audio Processing" and "Steve Harris" processing modules, designed to alter the amplitude dynamics of audio, will be used here. Yes, it is also possible to add reverb, phasing, pitch change, and other effects in real time.

Install the LADSPA and LV2 plugins using a typical Linux package manager. For example, use apt on most Debian or Ubuntu systems:

sudo apt update; sudo apt install tap-plugins swh-plugins lsp-plugins

Note (January 26, 2023): The information on this page shows you how to do realtime audio processing using audio plugins and config files. It is far easier to simply manage the plugins with PulseEffects; that is how I set it up in Skywave Linux and other systems. Do this to install it:

sudo apt update; sudo apt install pulseeffects

To use PulseEffects, click the launcher then set up your audio chain as you wish!

It is also possible, if you are an advanced Linux user, to get higher performance by downloading and compiling the plugins from source code. In particular, the tap_dynamics plugin can be modified very nicely for combined compression and gating by editing the source code. The presets are in the file tap_dynamics_presets.h. Preset number 7 (the eighth entry in the file, since numbering starts at zero) can be rewritten as shown:


	{ /* Compressor/Gate, threshold at -20 dB */
		5,
		{
			{-80.0f, -110.0f},
			{-62.0f, -90.0f},
			{-20.0f, -20.0f},
			{0.0f, -10.0f},
			{20.0f, -8.0f},
		},
	},
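The five pairs in the preset above read as points on an input/output level curve. The Python sketch below shows what such a curve does to a few input levels. It assumes each pair is (input level in dB, output level in dB) and that levels between points are interpolated along a straight line; the plugin's actual interpolation may differ, so treat this as an illustration only.

```python
# Points copied from the preset above: (input level dB, output level dB).
POINTS = [(-80.0, -110.0), (-62.0, -90.0), (-20.0, -20.0), (0.0, -10.0), (20.0, -8.0)]

def output_level_db(input_db, points=POINTS):
    """Interpolate the output level between neighboring curve points.

    ASSUMPTION: straight-line interpolation in dB, clamped at the end
    points. The real plugin may behave differently.
    """
    if input_db <= points[0][0]:
        return points[0][1]
    if input_db >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= input_db <= x1:
            return y0 + (input_db - x0) * (y1 - y0) / (x1 - x0)

for level in (-70, -40, -20, -10, 10):
    print(f"in {level:>4} dB -> out {output_level_db(level):7.1f} dB")
```

Under these assumptions, signals below -20 dB are pushed down progressively (the gating action), while signals above -20 dB rise only 1 dB for every 2 dB of input, and above 0 dB only 1 dB for every 10 dB (the compression action).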

Code for other plugins can also be tweaked to suit operator preferences when the need arises. With the original code, these plugins meet most operators' needs, but any of them may be adjusted if a truly unique, bleeding-edge sound system is desired.

Configuring PulseAudio for System-Wide Audio EQ / Compression / Limiting

On computers running Ubuntu, Linux Mint, Arch, or other operating systems with PulseAudio, PulseAudio must be configured to load the LADSPA plugins, route audio data through them, and provide processed audio as the default sink. This is done with two files: /etc/pulse/default.pa and /etc/asound.conf.

First, paste the following code into /etc/asound.conf:

pcm.pulse {
    type pulse
}
ctl.pulse {
    type pulse
}
pcm.!default {
    type pulse
}
ctl.!default {
    type pulse
}

Next, paste the following code to the top of /etc/pulse/default.pa to enable the alsa-sink module:

load-module module-alsa-sink device=hw:0,0 sink_name=processed_output

Finally, paste the following code to the bottom of /etc/pulse/default.pa. The system will send all sounds to the default sink, ladspa_output.tap_eq. Note that the limiter sends its output to the ALSA sink we called processed_output above:

#EQ and Dynamics
load-module module-ladspa-sink sink_name=ladspa_output.fastLookaheadLimiter label=fastLookaheadLimiter plugin=fast_lookahead_limiter_1913 master=processed_output control=6,0,0.3

load-module module-ladspa-sink sink_name=ladspa_output.tap_dynamics_m label=tap_dynamics_m plugin=tap_dynamics_m master=ladspa_output.fastLookaheadLimiter control=4,700,15,15,13

load-module module-ladspa-sink sink_name=ladspa_output.tap_eq label=tap_equalizer plugin=tap_eq master=ladspa_output.tap_dynamics_m control=-6,-6,-3,0,0,0,0,0,100,200,400,1000,3000,6000,12000,15000

set-default-sink ladspa_output.tap_eq

The ideal settings for each stage of processing depend on the nature of the audio content, and should be determined through testing. See the examples given below, for ALSA, that are suitable for normal voice, broadcast, and hardcore DX usage. Note that PulseAudio takes the control parameters as a string of comma-separated numbers.
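Since the same control values appear in both formats on this page, a small helper can reduce copy errors when experimenting. The following Python sketch formats a module-ladspa-sink line from a list of numbers. The function name is hypothetical and the script only builds text; it does not talk to PulseAudio.

```python
def pulse_ladspa_line(sink_name, label, plugin, master, controls):
    """Format one module-ladspa-sink line; controls are joined with commas."""
    control_str = ",".join(f"{c:g}" for c in controls)
    return (f"load-module module-ladspa-sink sink_name={sink_name} "
            f"label={label} plugin={plugin} master={master} "
            f"control={control_str}")

# Values taken from the limiter line shown above.
limiter = pulse_ladspa_line(
    "ladspa_output.fastLookaheadLimiter",
    "fastLookaheadLimiter",
    "fast_lookahead_limiter_1913",
    "processed_output",
    [6, 0, 0.3],
)
print(limiter)

# An ALSA-style, space-separated list converts the same way.
alsa_controls = "4 700 15 15 13"
print(",".join(alsa_controls.split()))
```

Paste the printed line into /etc/pulse/default.pa in place of the hand-written one when trying new settings.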

Dynamics Processing Without ALSA-Sink

In some situations, there may be an advantage to not using the alsa-sink module. In that case, do not put any code into /etc/asound.conf. The limiter should send its output specifically to the sound card. Get the sound card identifier using the following command:

pacmd list-sinks

Then designate the sound card as the "master" sink in the limiter configuration line. Here is the code in its entirety:

#EQ and Dynamics
load-module module-ladspa-sink sink_name=ladspa_output.fastLookaheadLimiter label=fastLookaheadLimiter plugin=fast_lookahead_limiter_1913 master=alsa_output.pci-0000_05_01.0.analog-surround-71 control=6,0,0.3

load-module module-ladspa-sink sink_name=ladspa_output.tap_dynamics_m label=tap_dynamics_m plugin=tap_dynamics_m master=ladspa_output.fastLookaheadLimiter control=4,700,15,15,13

load-module module-ladspa-sink sink_name=ladspa_output.tap_eq label=tap_equalizer plugin=tap_eq master=ladspa_output.tap_dynamics_m control=-6,-6,-3,0,0,0,0,0,100,200,400,1000,3000,6000,12000,15000

set-default-sink ladspa_output.tap_eq
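The "master" name in the limiter line must match a sink name reported by pacmd list-sinks. As a rough aid, the Python sketch below pulls sink names out of that output. The sample text is an abbreviated, illustrative excerpt rather than a real capture, and the function names are hypothetical.

```python
import re

# Illustrative excerpt only; real output lists many more fields per sink.
SAMPLE = """\
2 sink(s) available.
  * index: 0
    name: <alsa_output.pci-0000_05_01.0.analog-surround-71>
    driver: <module-alsa-card.c>
    index: 1
    name: <ladspa_output.tap_eq>
    driver: <module-ladspa-sink.c>
"""

def sink_names(pacmd_output):
    """Return every sink name found on a 'name: <...>' line."""
    return re.findall(r"^[ \t]*name:[ \t]*<([^>]+)>", pacmd_output,
                      flags=re.MULTILINE)

def hardware_sinks(pacmd_output):
    """Keep only sinks whose names start with 'alsa_output.'."""
    return [n for n in sink_names(pacmd_output) if n.startswith("alsa_output.")]

print(hardware_sinks(SAMPLE))
```

On a live system, the text could be obtained with subprocess.run(["pacmd", "list-sinks"], capture_output=True, text=True).stdout and passed to hardware_sinks() in the same way.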

One last note about PulseAudio and ALSA: it is wise to start the alsamixer application and adjust the sound levels, as the defaults may be too low or too high. Save them with the sudo alsactl store command.

Configuring ALSA (without PulseAudio, no dmix)

System-wide audio processing without PulseAudio requires configuring ALSA. ALSA will take raw audio, pass the data through a series of LADSPA plugins in real time, and export processed audio as the default sound. All of this is set up, system wide, in the file /etc/asound.conf.

First, it is necessary for the audio data to be in float format. That is accomplished in the first section of asound.conf, for the virtual device "ladcomp". Next, the "tap_equalizer" plugin is set to receive audio data from the first audio interface, "plughw:0,0", and process it according to the controls. There is flexibility here: the gain levels and bands can be changed to suit the EQ task at hand.

Next, the programmable "tap_dynamics_m" plugin is used for audio amplitude compression. There are several different compression, gating, and expansion presets for this plugin; here number 13 was chosen since it combines compression and mild gating to reduce background noise. Some makeup gain is used after the audio is compressed, to bring it back to a nominal level. Preset 7 also works well.

Finally, the "Fast Lookahead Limiter" is applied to hard limit the audio for additional lowering of the peak-to-average ratio and prevent voice peaks from going above a set limit. This is important in many transmitters due to negative effects of triggering internal protective circuits. Few things sound worse than a sideband transmitter driven so hard that the ALC kicks in, or an FM transmitter modulated strongly enough to reach the deviation limiter. This "Fast Lookahead Limiter" does its job with finesse.

Here is the actual speech processor data. Paste it into the file /etc/asound.conf. If there presently is no /etc/asound.conf, then create it. For individual users on a system who want a customized audio configuration, this data may be written to the file ~/.asoundrc (a hidden file in the user's home directory).

# ALSA / LADSPA Speech Processor
#Place this in /etc/asound.conf or ~/.asoundrc

#Convert audio to float data for the equalizer plugin.
pcm.ladcomp {
    type plug
    slave.pcm "plughw:0,0";
}

#LADSPA Equalizer plugin.
#Set levels and bands as desired.
#Input is float data from microphone
pcm.ladcomp_eq {
      type ladspa
      slave.pcm "plughw:0,0";
      path "/usr/lib/ladspa";
      plugins [
          {
              label tap_equalizer
              input {
                        #The first 8 numbers set levels (dB); the second 8 set the center frequencies (Hz).
                        controls [-20 0 -6 0 0 0 -10 -30 100 300 500 1000 3000 6000 12000 15000]
              }
          }
      ]
  }

#LADSPA compressor plugin.
#Set parameters as desired.
#Input is from equalizer plugin.
pcm.ladcomp_compressor {
      type ladspa
      slave.pcm "ladcomp_eq";
      path "/usr/lib/ladspa";
      plugins [
          {
              label tap_dynamics_m
              input {
                        #attack time (ms), release time (ms), offset gain (dB), makeup gain (dB), function
                        controls [3 300 15 15 13]
              }
          }
      ]
  }

#LADSPA look-ahead limiter plugin.
#Set parameters as desired.
#Input is compressor plugin.
pcm.ladcomp_limiter {
      type ladspa
      slave.pcm "ladcomp_compressor";
      path "/usr/lib/ladspa";
      plugins [
          {
              label fastLookaheadLimiter
              input {
                        #InputGain,Limit,Releasetime
                        controls [10 0 0.3]
              }
          }
      ]
  }

#Whatever is fed to the audio
#processor is available here.
#EQ, Compression, and Limiting
pcm.!combo {
type plug;
slave.pcm ladcomp_limiter;
}

#default device
pcm.!default {
type plug;
slave.pcm "hw:0,0";
}

ctl.!default {
type hw
card 0
}

The actual parameters used in the above asound.conf will vary depending on the intended usage. For DX or contest operation it is desirable to use a narrow bandpass, heavy compression, and moderate limiting. Normal voice communications don't need such extreme processing, and a wider bandpass sounds quite decent with moderate compression and light limiting. Similarly, broadcast quality audio requires a wide bandpass, light compression with a longer release time, and only enough gain to cause limiting of occasional voice peaks. Be aware that control parameters for ALSA should be separated by spaces, not commas as is the case for PulseAudio.

Parameters for different uses:

Broadcast Quality
EQ controls		[3 3 0 -8 0 0 0 0 100 300 500 1000 3000 6000 12000 15000]
Compressor		[4 700 8 20 13]
Limiter			[6 0 1.0]

Normal Voice Quality
EQ controls		[-20 0 -6 0 0 0 -10 -30 100 300 500 1000 3000 6000 12000 15000]
Compressor		[4 300 15 15 13]
Limiter			[10 0 0.3]

Hardcore DX / Contesting
EQ controls		[-50 -50 -50 -5 -5 -50 -50 -50 100 300 500 1000 3000 6000 12000 15000]
Compressor		[4 300 20 20 13]
Limiter			[15 0 0.3]
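A missing space or a dropped value in one of these lists is easy to overlook. The Python sketch below checks a list before it goes into asound.conf. The expected counts (16 for the equalizer, 5 for the compressor, 3 for the limiter) follow the comments in the configuration above; the function names are hypothetical.

```python
# Control counts per stage, following the comments in asound.conf:
# 8 gains + 8 frequencies, 5 dynamics values, 3 limiter values.
EXPECTED = {"eq": 16, "compressor": 5, "limiter": 3}

def parse_controls(stage, text):
    """Split a space-separated list and verify the number of values.

    float() raises ValueError for a fused token such as '-50-50'.
    """
    values = [float(tok) for tok in text.split()]
    if len(values) != EXPECTED[stage]:
        raise ValueError(
            f"{stage}: expected {EXPECTED[stage]} values, got {len(values)}")
    return values

def alsa_controls_line(stage, text):
    """Return a validated 'controls [...]' line for asound.conf."""
    values = parse_controls(stage, text)
    return "controls [" + " ".join(f"{v:g}" for v in values) + "]"

# "Normal Voice Quality" values from the list above.
print(alsa_controls_line(
    "eq", "-20 0 -6 0 0 0 -10 -30 100 300 500 1000 3000 6000 12000 15000"))
print(alsa_controls_line("compressor", "4 300 15 15 13"))
print(alsa_controls_line("limiter", "10 0 0.3"))
```

A list with the wrong number of values, or with two numbers run together, raises ValueError instead of producing a line.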

After making any changes to /etc/asound.conf, restart ALSA with the following command:

sudo alsa reload

Monitor the audio or feed it to your transmitter using this command as a normal user:

arecord -B 6000 -D combo -r 48000 | aplay -B 6000
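The -B option of arecord and aplay sets the buffer duration in microseconds, so the command above asks for 6 ms on each side. The short Python calculation below turns that into frames and a rough total. It counts only the two requested buffers and ignores any delay added by the plugins or the driver, so treat the result as a lower bound.

```python
def buffer_frames(buffer_us, rate_hz):
    """Number of audio frames held by a buffer of the given duration."""
    return round(buffer_us * rate_hz / 1_000_000)

BUFFER_US = 6000   # value passed to -B in the command above
RATE_HZ = 48000    # value passed to -r in the command above

frames = buffer_frames(BUFFER_US, RATE_HZ)
total_ms = 2 * BUFFER_US / 1000  # one capture buffer plus one playback buffer

print(f"{frames} frames per buffer")
print(f"about {total_ms:.0f} ms of buffering, capture plus playback")
```

If the audio stutters or underruns, a larger -B value trades latency for stability.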

System-Wide Audio Processing With ALSA and LADSPA Plugins (on dmix)

Audio can be processed system-wide without PulseAudio, with a few changes to the default ALSA configuration. Here the "ladcomp" device is set to use "dmix", which carries any audio running on the system. Also, the default device is configured to route audio through the last processor stage (ladcomp_limiter).

# ALSA / LADSPA Speech Processor
#Place this in /etc/asound.conf or ~/.asoundrc

#Convert audio to float data for the equalizer plugin.
# Speech Processor
pcm.ladcomp {
    type plug
# Use this for single channel processing
#    slave.pcm "plughw:0,0";
# Use this for systemwide processing
    slave.pcm "plug:dmix";
}

#LADSPA Equalizer plugin.
#Set levels and bands as desired.
#Input is float data from microphone
pcm.ladcomp_eq {
      type ladspa
      slave.pcm "plughw:0,0";
      path "/usr/lib/ladspa";
      plugins [
          {
              label tap_equalizer
              input {
                        #The first 8 numbers set levels (dB); the second 8 set the center frequencies (Hz).
                        controls [3 0 -6 0 0 0 0 0 100 300 500 1000 3000 6000 12000 15000]
              }
          }
      ]
  }

#LADSPA compressor plugin.
#Set parameters as desired.
#Input is from equalizer plugin.
pcm.ladcomp_compressor {
      type ladspa
      slave.pcm "ladcomp_eq";
      path "/usr/lib/ladspa";
      plugins [
          {
              label tap_dynamics_m
              input {
                        #attack time (ms), release time (ms), offset gain (dB), makeup gain (dB), function
                        controls [3 800 12 20 7]
              }
          }
      ]
  }

#LADSPA look-ahead limiter plugin.
#Set parameters as desired.
#Input is compressor plugin.
pcm.ladcomp_limiter {
      type ladspa
      slave.pcm "ladcomp_compressor";
      path "/usr/lib/ladspa";
      plugins [
          {
              label fastLookaheadLimiter
              input {
                        #InputGain,Limit,Releasetime
                        controls [10 0 0.2]
              }
          }
      ]
  }

#Whatever is fed to the audio
#processor is available here.
#EQ, Compression, and Limiting
pcm.!combo {
type plug;
slave.pcm ladcomp_limiter;
}

#default device
pcm.!default {
type plug
#Single channel processing
#slave.pcm "hw:0,0";
#Systemwide processing
slave.pcm "ladcomp_limiter";
}

ctl.!default {
type hw
card 0
}

Give the above asound.conf a try with some multimedia files or streaming audio and note the difference from an unprocessed playback. Each user's actual content and application is a bit different, so experiment with the settings and find what works best. Certainly, a system running streaming broadcast audio will need different settings from one running utility / interphone sound or content from a CD / DVD source. A dedicated amateur contesting station would ratchet up the processing to the highest degree.

For feeding the processed audio to a conventional radio transmitter, consider using an isolation transformer connected to the modulator stage. Connecting to the microphone input is also possible, but use an attenuator to prevent overdriving subsequent audio stages. For best audio quality, feed the modulator directly. Set the incoming audio level to stay below the transmitter's clipping or ALC threshold.
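How much attenuation the microphone input needs depends on the two signal levels involved. The Python sketch below shows the arithmetic only. Both voltage figures are hypothetical placeholders, not measurements of any particular sound card or transmitter, so measure your own equipment before building an attenuator.

```python
import math

def attenuation_db(source_volts, target_volts):
    """Voltage attenuation in dB needed to bring source down to target."""
    return 20.0 * math.log10(source_volts / target_volts)

# HYPOTHETICAL levels for illustration only.
LINE_OUT_VOLTS = 1.0   # assumed sound card output level
MIC_IN_VOLTS = 0.01    # assumed microphone input level (10 mV)

needed = attenuation_db(LINE_OUT_VOLTS, MIC_IN_VOLTS)
print(f"attenuation needed: {needed:.0f} dB")
```

With these example figures the answer is a 100:1 voltage reduction; substitute measured levels to get a figure for a real station.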

Feeding audio to a software defined radio entails properly routing the processed audio to the transmitter input, and exactly how to do it varies with each kind of transmitter. Check the radio's documentation for the proper audio routing.

What would make this an even better audio setup? Multi-band compression would be a great feature, as would a direct digital output to a soundcard in the transmitter. Actually, there are plugins for the former, and most modern software defined radios can directly accept a digital audio stream to an audio interface in the transmitter. For the best bandwidth and overall performance with your transceiver, give thought to using a high definition audio interface.

Using a PC for transmitter speech processing is an evolutionary change in radio. SDR equipment is sure to include the capabilities shown here. Classic transmitters in amateur radio, utility, and smaller broadcast stations can certainly benefit from the efficiency and flexibility of accomplishing audio processing in this manner. Will it replace the million-dollar racks of specialized audio gear found in top-of-the-market radio stations? Probably not, but real-time processing on a PC is capable of things bound only by CPU power and the imaginations of programmers such as the LADSPA people.

© 2005 - 2024 AB9IL.net, All Rights Reserved.
About Philip Collier / AB9IL, Commentaries and Op-Eds, Contact, Privacy Policy and Affiliate Disclosure, XML Sitemap.

AB9IL.net: A Realtime Software Audio Processor