Pydub
Dhanushkumar R
Microsoft Learn Student Ambassador - BETA|Data Scientist-Intern @BigTapp Analytics|Ex-Intern @IIT Kharagpur| Azurex2 |Machine Learning|Deep Learning|Data Science|Gen AI|Azure AI&Data |Technical Blogger
Audio files are a widespread means of transferring information, so let’s see how to work with them in Python. pydub is a third-party Python library for working with audio files. It reads and writes .wav files on its own, and handles other formats such as MP3 when FFmpeg is installed. With this library we can play, split, merge, and edit audio files. pydub builds on FFmpeg and Python's audioop module, which makes it easy to use and compatible with a wide range of audio formats.
Installation
pip install pydub
Here are some of the key features and modules of pydub:
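AudioSegment Class:
AudioSegment is the core class of pydub. It represents a piece of audio and provides the methods used to load, slice, combine, and export it.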
File Formats:
pydub supports a wide range of audio file formats, including MP3, WAV, AAC, FLAC, OGG, and more. You can specify the format when importing audio and export audio to different formats.
Audio Manipulation:
You can slice audio by time, concatenate segments, apply fades, and raise or lower the volume.
Format Conversion:
You can easily convert audio between different formats using export() with the desired format.
Playback:
While pydub is primarily used for audio processing, it also provides a basic playback feature that allows you to play audio in your Python code.
Integration with FFmpeg:
pydub relies on the FFmpeg library to support many audio formats. It can automatically convert between formats that FFmpeg supports.
The examples below show how these features are used in practice, from loading and slicing audio to applying a fade-in effect and exporting in a different format:
AudioSegment Class:
from pydub import AudioSegment
# Create an AudioSegment instance from audio file
audio = AudioSegment.from_file("input.mp3")
# Slice audio from 10 to 20 seconds
sliced_audio = audio[10000:20000]
# Export the sliced audio to a different format
sliced_audio.export("output.wav", format="wav")
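Slice indices on an AudioSegment are in milliseconds, which is why audio[10000:20000] selects seconds 10 to 20. As a rough illustration of what that means for the underlying raw data, here is a plain-Python sketch. It is not pydub's actual implementation: ms_to_frame and slice_byte_range are hypothetical helpers, and the 44.1 kHz, stereo, 16-bit parameters are assumed for the example.

```python
def ms_to_frame(ms, frame_rate):
    """Convert a position in milliseconds to a frame index."""
    return ms * frame_rate // 1000


def slice_byte_range(start_ms, end_ms, frame_rate, channels, sample_width):
    """Return the (start, end) byte offsets covered by a millisecond slice."""
    # One frame holds one sample per channel
    frame_width = channels * sample_width
    start = ms_to_frame(start_ms, frame_rate) * frame_width
    end = ms_to_frame(end_ms, frame_rate) * frame_width
    return start, end


# Assumed format: 44.1 kHz, stereo, 2 bytes per sample
start, end = slice_byte_range(10000, 20000, frame_rate=44100,
                              channels=2, sample_width=2)
print(start, end)
```

Ten seconds of audio in this assumed format spans 441,000 frames of 4 bytes each, so the slice covers about 1.7 MB of raw data.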
File Formats:
from pydub import AudioSegment
# Create an AudioSegment instance from an MP3 file
audio = AudioSegment.from_mp3("input.mp3")
# Export the audio to an OGG file
audio.export("output.ogg", format="ogg")
Format Conversion:
from pydub import AudioSegment
# Create an AudioSegment instance from an MP3 file
audio = AudioSegment.from_mp3("input.mp3")
# Export the audio to a WAV file
audio.export("output.wav", format="wav")
Playback:
from pydub import AudioSegment, playback
# Create an AudioSegment instance from an audio file
audio = AudioSegment.from_file("input.mp3")
# Play the audio
playback.play(audio)
Audio Manipulation:
from pydub import AudioSegment
# Create an AudioSegment instance from an audio file
audio = AudioSegment.from_file("input.mp3")
# Apply a fade-in effect (2 seconds)
faded_audio = audio.fade_in(2000)
# Concatenate two audio segments
concatenated_audio = audio + faded_audio
# Export the concatenated audio
concatenated_audio.export("output.wav", format="wav")
Other Functions:
pydub.utils.make_chunks():
from pydub import AudioSegment, utils
# Create an AudioSegment instance
audio = AudioSegment.from_file("input.mp3")
# Split audio into 10-second chunks
chunks = utils.make_chunks(audio, 10000)
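To make the chunking behaviour concrete, the sketch below computes the millisecond ranges that a fixed-length split would produce, including a shorter final chunk when the total length is not a multiple of the chunk length. It is a plain-Python illustration of the idea, with chunk_ranges as a hypothetical helper, not pydub's own code.

```python
import math


def chunk_ranges(total_ms, chunk_ms):
    """Return (start_ms, end_ms) pairs covering total_ms in chunk_ms steps."""
    count = math.ceil(total_ms / chunk_ms)
    return [
        (i * chunk_ms, min((i + 1) * chunk_ms, total_ms))
        for i in range(count)
    ]


# A 25-second file split into 10-second chunks
print(chunk_ranges(25000, 10000))
# [(0, 10000), (10000, 20000), (20000, 25000)]
```

Each pair corresponds to a slice such as audio[start_ms:end_ms] on an AudioSegment.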
pydub.silence.detect_silence():
from pydub import AudioSegment, silence
# Create an AudioSegment instance
audio = AudioSegment.from_file("input.mp3")
# Detect silence (silence_thresh in dBFS, min_silence_len in milliseconds)
silent_ranges = silence.detect_silence(audio, min_silence_len=1000, silence_thresh=-40)
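The silence threshold is expressed in dBFS, that is, decibels relative to the loudest level the sample format can represent. The following sketch shows how a dBFS value can be derived from raw samples and compared against a threshold such as -40. It assumes 16-bit samples; rms, dbfs, and is_silent are hypothetical helpers written for illustration rather than pydub functions.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of integer samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def dbfs(samples, sample_width=2):
    """Level in dBFS: 0 is full scale, quieter audio is more negative."""
    max_amplitude = 2 ** (8 * sample_width - 1)  # 32768 for 16-bit audio
    level = rms(samples)
    if level == 0:
        return float("-inf")
    return 20 * math.log10(level / max_amplitude)


def is_silent(samples, silence_thresh=-40, sample_width=2):
    """Treat a window as silent when its level is at or below the threshold."""
    return dbfs(samples, sample_width) <= silence_thresh


quiet = [100, -100] * 50      # roughly -50 dBFS
loud = [10000, -10000] * 50   # roughly -10 dBFS
print(is_silent(quiet), is_silent(loud))
```

A lower (more negative) threshold makes detection stricter, so fewer passages count as silence.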
AudioSegment.fade_out():
from pydub import AudioSegment
# Create an AudioSegment instance
audio = AudioSegment.from_file("input.mp3")
# Apply a fade-out effect (2 seconds); fades are methods of AudioSegment
faded_audio = audio.fade_out(2000)
pydub.generators:
from pydub import AudioSegment, generators
# Generate a 3-second sine wave at 440 Hz
sine_wave = generators.Sine(440).to_audio_segment(duration=3000)
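For a sense of what a generated tone contains, here is a standard-library sketch that builds the same kind of signal by hand: a 3-second, 440 Hz sine wave written as 16-bit mono WAV data. The function name sine_wav_bytes, the 44.1 kHz frame rate, and the half-scale amplitude are assumptions made for this example, and it does not reflect how pydub's generators are implemented.

```python
import io
import math
import struct
import wave


def sine_wav_bytes(freq_hz=440, duration_ms=3000, frame_rate=44100,
                   amplitude=0.5):
    """Return WAV file bytes holding a mono 16-bit sine tone."""
    n_frames = frame_rate * duration_ms // 1000
    peak = int(amplitude * 32767)
    samples = (
        int(peak * math.sin(2 * math.pi * freq_hz * i / frame_rate))
        for i in range(n_frames)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes = 16-bit samples
        wav.setframerate(frame_rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    return buffer.getvalue()


data = sine_wav_bytes()
with wave.open(io.BytesIO(data), "rb") as wav:
    print(wav.getnframes(), wav.getframerate())
```

The resulting bytes could be saved to disk and loaded with AudioSegment.from_file(..., format="wav") if you want to continue in pydub.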
Applications:
Inspecting a .wav file: we can read the properties of the audio from the attributes of the AudioSegment object.
# import required library
from pydub import AudioSegment
# import the audio file
wav_file = AudioSegment.from_file(file="Sample.wav", format="wav")
# data type for the file
print(type(wav_file))
# OUTPUT: <class 'pydub.audio_segment.AudioSegment'>
# To find frame rate of song/file
print(wav_file.frame_rate)
# OUTPUT: 22050
# To know about channels of file
print(wav_file.channels)
# OUTPUT: 1
# Find the number of bytes per sample
print(wav_file.sample_width)
# OUTPUT: 2
# Find the maximum amplitude
print(wav_file.max)
# OUTPUT: 17106
# To know the length of the audio file in milliseconds
print(len(wav_file))
# OUTPUT: 60000
# Attributes are changed with the matching setter, which returns a new segment:
# changed_audio_segment = audio_segment.set_ATTRIBUTENAME(x)
wav_file_new = wav_file.set_frame_rate(16000)
print(wav_file_new.frame_rate)
# OUTPUT: 16000
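The same properties are stored in the WAV header, so they can be cross-checked with Python's built-in wave module. The sketch below is only an illustration: describe_wav is a hypothetical helper, and the one-second silent clip is generated in memory so the example does not depend on any file on disk.

```python
import io
import wave


def describe_wav(file_obj):
    """Read basic properties from a WAV file or file-like object."""
    with wave.open(file_obj, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "frame_rate": rate,
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "length_ms": round(frames * 1000 / rate),
        }


# Build one second of 16-bit mono silence at 22050 Hz in memory
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(22050)
    wav.writeframes(b"\x00\x00" * 22050)
buffer.seek(0)

info = describe_wav(buffer)
print(info)
```

The length in milliseconds follows from the frame count divided by the frame rate, which matches the unit that len() reports for an AudioSegment.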
Increasing/decreasing the volume of a file: this is done with the ‘+’ and ‘-’ operators, which apply a gain in dB.
# import required library
import pydub
from pydub.playback import play
wav_file = pydub.AudioSegment.from_file(file="Sample.wav", format="wav")
# Increase the volume by 10 dB
louder_wav_file = wav_file + 10
# Reduce the volume by 5 dB
quieter_wav_file = wav_file - 5
# Play the quieter file
play(quieter_wav_file)
# Play the original file
play(wav_file)
# Play the louder file
play(louder_wav_file)
# Feel the difference!
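Because the gain is given in decibels, the change in amplitude is logarithmic rather than linear. A small sketch of the standard conversion, with db_to_amplitude_ratio as a hypothetical helper name, shows what +10 dB and -5 dB mean as multipliers on the sample values:

```python
def db_to_amplitude_ratio(db):
    """Convert a gain in dB to the factor applied to sample amplitudes."""
    return 10 ** (db / 20)


print(round(db_to_amplitude_ratio(10), 3))   # 3.162
print(round(db_to_amplitude_ratio(-5), 3))   # 0.562
```

So a 10 dB boost roughly triples the amplitude, which is why large positive gains can push loud recordings into clipping.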
Merging files: this is done using the ‘+’ operator, which appends the second segment to the end of the first.
# import required libraries
from pydub import AudioSegment
from pydub.playback import play
wav_file_1 = AudioSegment.from_file("noice.wav")
wav_file_2 = AudioSegment.from_file("Sample.wav")
# Combine the two audio files
wav_file_3 = wav_file_1 + wav_file_2
# play the sound
play(wav_file_3)
Exporting files: this is done using the export() method.
# import required library
from pydub import AudioSegment
# import the audio file
a = AudioSegment.from_file("pzm12.wav")
# Write the audio to a new file in the chosen format
a.export(out_f="outNow.wav", format="wav")
Splitting audio: a stereo file can be split into separate mono channels using the split_to_mono() method.
# import required library
from pydub import AudioSegment
# import the audio file
a = AudioSegment.from_file("pzm12.wav")
# Split stereo into a list of mono segments, one per channel
b = a.split_to_mono()
print(b)
print(b[0].channels)
# OUTPUT: 1
# Export the first channel
b[0].export(out_f="outNow.wav", format="wav")
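Stereo audio is normally stored with the left and right samples interleaved, so splitting into mono amounts to taking every second sample. The sketch below illustrates that idea on a tiny hand-made array. split_interleaved is a hypothetical helper and the sample values are invented for the example; this is not pydub's implementation.

```python
import array


def split_interleaved(samples, channels=2):
    """Separate interleaved samples into one sequence per channel."""
    return [samples[i::channels] for i in range(channels)]


# Interleaved 16-bit stereo: left, right, left, right, ...
stereo = array.array("h", [1, -1, 2, -2, 3, -3])
left, right = split_interleaved(stereo)
print(left.tolist(), right.tolist())
# [1, 2, 3] [-1, -2, -3]
```

Each resulting sequence has half as many samples as the stereo data but the same duration, since the frame rate is unchanged.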
Conclusion:
These are simplified examples to demonstrate the usage of different modules and functions in pydub. You can refer to the official pydub documentation for more detailed information and additional features: https://pydub.com/