Example code for audio data cleaning – Exploring Audio Data
Audio data cleanup is essential to enhance the quality and accuracy of subsequent analyses or applications. It helps remove unwanted artifacts, background noise, or distortions, ensuring that the processed audio is more suitable for tasks such as speech recognition, music analysis, and other audio-based applications, ultimately improving overall performance and interpretability.
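Cleaning audio data often involves background noise removal. One popular technique is spectral subtraction, which estimates the noise spectrum and subtracts it from the spectrum of the signal. Python provides several libraries for audio processing, and Librosa is one of the most commonly used. Note that the walkthrough in this section uses harmonic-percussive source separation as a simple stand-in rather than spectral subtraction itself.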
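Spectral subtraction is mentioned above but not shown in the walkthrough, so here is a minimal sketch of the idea. It uses scipy.signal instead of Librosa and a synthetic signal so that it runs without an audio file. The function name spectral_subtract, the noise_seconds parameter, and the assumption that the first 0.4 seconds contain only noise are all illustrative choices, not part of the original example.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_subtract(y, sr, noise_seconds=0.4, nperseg=1024):
    """Subtract an average noise magnitude spectrum from every STFT frame."""
    _, frame_times, Z = stft(y, fs=sr, nperseg=nperseg)
    magnitude, phase = np.abs(Z), np.angle(Z)
    # Assumption: the first `noise_seconds` of the recording contain only noise.
    noise_frames = frame_times <= noise_seconds
    noise_profile = magnitude[:, noise_frames].mean(axis=1, keepdims=True)
    # Subtract the noise profile and clip negative magnitudes to zero.
    cleaned = np.maximum(magnitude - noise_profile, 0.0)
    # Rebuild the waveform using the original (noisy) phase.
    _, y_clean = istft(cleaned * np.exp(1j * phase), fs=sr, nperseg=nperseg)
    return y_clean[:len(y)]

# Synthetic demo: 0.5 s of noise only, then a 440 Hz tone plus noise.
sr = 22050
rng = np.random.default_rng(0)
time = np.arange(2 * sr) / sr
tone = np.where(time >= 0.5, 0.5 * np.sin(2 * np.pi * 440 * time), 0.0)
y_noisy = tone + 0.05 * rng.standard_normal(time.size)
y_denoised = spectral_subtract(y_noisy, sr)
```

On real recordings, the quality of the result depends heavily on how well the noise-only segment represents the noise in the rest of the file.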
The following code utilizes the Librosa library for audio processing to demonstrate background noise removal.
Loading the audio file
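The code begins by loading an audio file using Librosa. The file path is specified as audio_file_path, and the librosa.load function returns the audio signal (y) and the sampling rate (sr). By default, librosa.load resamples the audio to 22,050 Hz and mixes it down to mono; pass sr=None to keep the file's native sampling rate:
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
# Load the audio file (replace with the path to your audio file)
audio_file_path = "../PacktPublishing/DataLabeling/ch10/cats_dogs/cat_1.wav"
y, sr = librosa.load(audio_file_path)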
Displaying the original spectrogram
The original spectrogram of the audio signal is computed using the short-time Fourier transform (STFT), converted to decibels relative to its peak value, and displayed using librosa.display.specshow. This shows how the frequency content of the signal changes over time:
D_original = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
plt.figure(figsize=(12, 8))
librosa.display.specshow(D_original, sr=sr, x_axis='time', y_axis='log')
plt.colorbar(format='%+2.0f dB')
plt.title('Original Spectrogram')
plt.show()
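To make the decibel scaling in the spectrogram easier to read, the following NumPy sketch mimics what librosa.amplitude_to_db does when called with ref=np.max and its default amin and top_db settings. The function name amplitude_to_db_sketch is hypothetical, and this is an approximation for illustration rather than Librosa's actual implementation.

```python
import numpy as np

def amplitude_to_db_sketch(S, amin=1e-5, top_db=80.0):
    """Convert magnitudes to dB relative to the largest magnitude in S."""
    S = np.abs(S)
    ref = S.max()
    # 20 * log10 of the ratio to the peak, with tiny values clamped to `amin`.
    db = 20.0 * np.log10(np.maximum(S, amin)) - 20.0 * np.log10(max(ref, amin))
    # Keep at most `top_db` of dynamic range below the peak.
    return np.maximum(db, db.max() - top_db)

magnitudes = np.array([1.0, 0.5, 0.1, 1e-7])
db_values = amplitude_to_db_sketch(magnitudes)
print(db_values)  # approximately 0, -6.02, -20, and -80 dB
```

The peak maps to 0 dB, halving the amplitude costs about 6 dB, and anything more than 80 dB below the peak is clipped, which is why the color bar in the plot ranges from 0 dB down to -80 dB.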
Applying background noise removal
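Harmonic-percussive source separation (librosa.effects.hpss) decomposes the audio signal into harmonic and percussive components. The example treats the harmonic component as the background and subtracts it from the original signal, so y_noise_removed is approximately the percussive part of the recording. This is a rough heuristic rather than a general-purpose denoiser: it suits recordings where the unwanted content is a steady hum or tone, and it also removes sustained, pitched parts of the target sound: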
# Apply background noise removal
y_harmonic, y_percussive = librosa.effects.hpss(y)
y_noise_removed = y - y_harmonic
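To show what the subtraction above amounts to, here is a minimal sketch of median-filter-based harmonic-percussive separation on a synthetic signal. It uses scipy.signal and scipy.ndimage rather than Librosa, and the function name hpss_sketch, the kernel size, and the test signal are illustrative assumptions. It is not Librosa's implementation, only an instance of the same general technique.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import stft, istft

def hpss_sketch(y, sr, nperseg=1024, kernel=17):
    """Split y into harmonic and percussive parts using median-filter soft masks."""
    _, _, Z = stft(y, fs=sr, nperseg=nperseg)
    S = np.abs(Z)
    # Harmonic energy is smooth along time (axis 1),
    # percussive energy is smooth along frequency (axis 0).
    H = median_filter(S, size=(1, kernel))
    P = median_filter(S, size=(kernel, 1))
    mask_h = H**2 / (H**2 + P**2 + 1e-12)
    mask_p = 1.0 - mask_h  # complementary masks: the two parts sum to the input
    _, y_h = istft(Z * mask_h, fs=sr, nperseg=nperseg)
    _, y_p = istft(Z * mask_p, fs=sr, nperseg=nperseg)
    return y_h[:len(y)], y_p[:len(y)]

# Synthetic demo: a steady 440 Hz tone (harmonic) plus three clicks (percussive).
sr = 22050
time = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * time)
clicks = np.zeros_like(time)
clicks[[5000, 11000, 17000]] = 1.0
y = tone + clicks
y_harmonic, y_percussive = hpss_sketch(y, sr)
residual = y - y_harmonic  # same operation as in the walkthrough
```

Because the two masks are complementary, residual matches y_percussive up to floating-point error: subtracting the harmonic part keeps the clicks and discards the tone.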
Displaying the spectrogram after background noise removal
The cleaned audio’s spectrogram is computed and displayed, allowing a comparison with the original spectrogram. This step visualizes the impact of background noise removal on the frequency content of the audio signal:
# Display the spectrogram after background noise removal
D_noise_removed = librosa.amplitude_to_db(
    np.abs(librosa.stft(y_noise_removed)), ref=np.max)
plt.figure(figsize=(12, 8))
librosa.display.specshow(D_noise_removed, sr=sr,
                         x_axis='time', y_axis='log')
plt.colorbar(format='%+2.0f dB')
plt.title('Spectrogram after Background Noise Removal')
plt.show()
Saving the cleaned audio file
The cleaned audio signal (y_noise_removed) is saved as a new WAV file specified by output_file_path using the scipy.io.wavfile.write function:
from scipy.io.wavfile import write
# Convert the audio signal to a NumPy array
y_noise_removed_np = np.asarray(y_noise_removed)
# Save the cleaned audio file
output_file_path = "../PacktPublishing/DataLabeling/ch10/cleaned_audio_file.wav"
write(output_file_path, sr, y_noise_removed_np)
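One detail worth knowing when saving: scipy.io.wavfile.write picks the WAV sample format from the array's dtype, and Librosa returns floating-point samples, so the file above is written as floating-point WAV. Some tools expect 16-bit PCM instead. The sketch below shows both options on a synthetic tone written to a temporary directory; the file names and the scaling by 32767 are illustrative assumptions.

```python
import os
import tempfile

import numpy as np
from scipy.io.wavfile import read, write

sr = 22050
# Synthetic float32 signal in [-1, 1], standing in for the cleaned audio.
y_float = (0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)).astype(np.float32)
# Scale to the 16-bit integer range for PCM output.
y_int16 = (np.clip(y_float, -1.0, 1.0) * 32767).astype(np.int16)

with tempfile.TemporaryDirectory() as tmp_dir:
    float_path = os.path.join(tmp_dir, "cleaned_float32.wav")
    pcm_path = os.path.join(tmp_dir, "cleaned_int16.wav")
    write(float_path, sr, y_float)  # 32-bit floating-point WAV
    write(pcm_path, sr, y_int16)    # 16-bit PCM WAV
    sr_float, data_float = read(float_path)
    sr_pcm, data_pcm = read(pcm_path)
```

Reading the files back returns the same sampling rate and the dtype that was written, which is a quick way to confirm the format before passing the file to another tool.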
We have now seen an example of how Librosa can be utilized for preprocessing and cleaning audio data, particularly for removing background noise from an audio signal.
Extracting properties from audio data
In this section, we will learn how to extract properties from audio data. Librosa provides many tools for extracting features from audio, and these features are useful for audio data classification and labeling. For example, Mel-frequency cepstral coefficients (MFCCs) have been used to classify cough recordings and predict whether a cough indicates tuberculosis.