Sophia Iroegbu

Turn Your PDF Into an Audiobook Using Python

#3 Python Weekly

Sophia Iroegbu · Oct 2, 2021


Hello there! 👋

In this article, I will teach you how to turn your PDF notes into an audiobook with just a few lines of Python code. This project is inspired by my younger sister, Lydia, who hates reading but loves listening to music.

Table of Contents

  • Creating & Activating Virtual Environment

  • Installing and Importing Pyttsx3

  • Installing and Importing PyPDF2

  • Downloading/Saving AudioBook

  • Additional Information

Let's get started, shall we? 🚀

Create and activate your virtual environment

# Move into the project directory

cd audio_book

# To create a virtual environment

python -m venv name-of-virtual-environment

# To activate the virtual environment (Windows)

name-of-virtual-environment\Scripts\activate

audio_book is the name of my project directory.

Replace name-of-virtual-environment with the name you want for your virtual environment.
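If you are not sure whether the activation worked, a quick check from Python can help. This is a minimal sketch; in_virtualenv is a name I made up for illustration, and it relies on the standard sys.prefix and sys.base_prefix values, which differ when a venv is active.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```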

Once that is done, start writing the Python script 🚀

Install the text-to-speech Python package

pip install pyttsx3

pyttsx3 is a text-to-speech conversion library for Python. It is an easy-to-use tool that converts the text you give it into speech. On Windows, it uses the "sapi5" speech driver, which typically provides two voices: one female and one male.

Once you have installed the package, import it into your .py file:

import pyttsx3

Test the speech and see if it works

speaker = pyttsx3.init()
speaker.say('Hello User')
speaker.runAndWait()

init() initializes the speech engine.

.say() queues the text you want your script to speak.

.runAndWait() processes everything that has been queued and blocks until the speech has finished.
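The voices mentioned earlier depend on what is installed on your machine, so it can help to look them up by name before speaking. Here is a rough sketch under a few assumptions: find_voice_id and use_voice are hypothetical helper names, and the code relies on the "voices" and "voice" properties of the pyttsx3 engine, whose voice objects carry id and name attributes.

```python
def find_voice_id(voices, keyword):
    """Return the id of the first voice whose name contains keyword, or None."""
    keyword = keyword.lower()
    for voice in voices:
        # pyttsx3 voice objects expose .id and .name attributes.
        if keyword in (voice.name or "").lower():
            return voice.id
    return None


def use_voice(engine, keyword):
    """Switch the engine to the first matching voice; return True on success."""
    voice_id = find_voice_id(engine.getProperty("voices"), keyword)
    if voice_id is None:
        return False
    engine.setProperty("voice", voice_id)
    return True


# Example usage (needs pyttsx3 and a working speech driver):
#     import pyttsx3
#     speaker = pyttsx3.init()
#     use_voice(speaker, "zira")  # "zira" is only an example name
```

If use_voice returns False, no installed voice matched the keyword and the engine keeps its current voice.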

Install PyPDF2

For this to work, you will need to add a PDF document to your project directory.

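pip install "PyPDF2<3"

The version is kept below 3.0 because the PdfFileReader, numPages, getPage, and extractText names used in this article were removed in PyPDF2 3.0.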

PyPDF2 is a pure-Python library built as a toolkit for PDF files. This package is capable of:

  • Extracting document information (such as author, book title, etc.),

  • Splitting documents page by page,

  • Merging documents page by page,

  • Cropping pages,

  • Merging multiple pages into a single page,

  • Decrypting and encrypting PDF files.

Import PyPDF2 to open the PDF document

import PyPDF2

book = open('django-admin-cookbook.pdf', 'rb')
pdfRead = PyPDF2.PdfFileReader(book)
pages = pdfRead.numPages

django-admin-cookbook.pdf is a PDF file that is in my project directory.

The open() function opens a file in text mode by default. To open a file in binary mode, add 'b' to the mode parameter. Hence the "rb" mode opens the file in binary mode for reading.

PdfFileReader() reads the PDF document from the file object stored in the book variable.

pages: stores the number of pages the PDF has.
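Since "rb" gives you raw bytes, you can also peek at the start of a file before handing it to the PDF reader. This is only a sketch: looks_like_pdf is a hypothetical helper, and it assumes the file begins with the usual %PDF- signature, which is the normal case but not something every malformed file respects.

```python
def looks_like_pdf(path):
    """Return True if the file at path starts with the PDF signature."""
    # "rb" returns bytes, so the comparison must use a bytes literal.
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


# Example usage:
#     if looks_like_pdf("django-admin-cookbook.pdf"):
#         print("Ready to read")
```

The with statement closes the file automatically once the check is done.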

Read the PDF file

speaker = pyttsx3.init()
speaker.setProperty("rate", 140)

speaker: holds the initialized pyttsx3 engine.

setProperty("rate", 140): sets the speech rate in words per minute. You can change the value to your taste.
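The rate is not the only setting you can adjust; the engine also has a "volume" property that takes a value between 0.0 and 1.0. Below is a small sketch that sets both in one place. configure_speaker is a name I chose for illustration, and clamping the volume is my own addition rather than something pyttsx3 requires.

```python
def configure_speaker(engine, rate=140, volume=1.0):
    """Apply a speech rate (words per minute) and a volume (0.0 to 1.0)."""
    # Keep the volume inside the range the engine expects.
    volume = min(1.0, max(0.0, float(volume)))
    rate = int(rate)
    engine.setProperty("rate", rate)
    engine.setProperty("volume", volume)
    # Return the values that were applied so the caller can log them.
    return rate, volume


# Example usage (needs pyttsx3 and a working speech driver):
#     import pyttsx3
#     speaker = pyttsx3.init()
#     configure_speaker(speaker, rate=160, volume=0.8)
```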

To get the full content of the PDF

for num in range(pages):
    page = pdfRead.getPage(num)
    text = page.extractText()

This loop goes through every page of the PDF and extracts its text. Note that text is overwritten on each pass, so it only holds the last page once the loop ends.

text: holds the text extracted from the current page.

To save the full content of the PDF, create an empty string before the speaker variable and append each page's text to it, like this:

# This opens the PDF file

book = open('django-admin-cookbook.pdf', 'rb')
pdfRead = PyPDF2.PdfFileReader(book)
pages = pdfRead.numPages

full_content = ""

# This reads the PDF file
speaker = pyttsx3.init()
speaker.setProperty("rate", 140)
for num in range(pages):
    page = pdfRead.getPage(num)
    text = page.extractText()
    full_content += text
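Text pulled out of a PDF often keeps the line breaks of the page layout, which can make the speech pause in odd places. The sketch below shows one possible cleanup step; clean_text is a hypothetical helper, and its hyphen rule is a simplification that will also join genuine compounds that happen to break across lines.

```python
import re


def clean_text(text):
    """Tidy extracted PDF text so it reads as continuous prose."""
    # Rejoin words hyphenated across a line break: "exam-\nple" -> "example".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse every run of whitespace, including newlines, into one space.
    return re.sub(r"\s+", " ", text).strip()


# Example usage:
#     full_content = clean_text(full_content)
```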

Download/Save your audio

#This reads the PDF file

speaker = pyttsx3.init()
speaker.setProperty("rate", 140)
for num in range(pages):
    page = pdfRead.getPage(num)
    text = page.extractText()
    full_content += text

speaker.save_to_file(full_content, "myaudio.mp3")
speaker.runAndWait()

myaudio.mp3 is the name of your audiobook file.

You can change it to whatever you want.
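For a long book, one huge audio file can be awkward to handle, so you may prefer several smaller files. Here is a rough sketch of that idea. split_into_chunks and save_chunks are hypothetical helpers, the 5000-character limit is an arbitrary choice, and I have not checked how every speech driver behaves when several files are queued before runAndWait().

```python
def split_into_chunks(text, max_chars=5000):
    """Split text at spaces into pieces of roughly max_chars or fewer.

    A single word longer than max_chars is kept whole in its own piece.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def save_chunks(engine, chunks, prefix="myaudio"):
    """Queue one output file per chunk, then process the whole queue."""
    filenames = []
    for index, chunk in enumerate(chunks, start=1):
        filename = f"{prefix}_{index:03d}.mp3"
        engine.save_to_file(chunk, filename)
        filenames.append(filename)
    engine.runAndWait()
    return filenames


# Example usage (needs pyttsx3 and a working speech driver):
#     save_chunks(speaker, split_into_chunks(full_content))
```

The numbered file names keep the parts in listening order.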

That's it! 🎉 You can now create your own audiobooks instead of buying them from Amazon or Kindle.

Now, you can read and sleep at the same time.


Additional Information

I added two Python ebooks to this project's GitHub repository. Have fun, and thank you for reading till the end ❤️.

Here is the GitHub link: github.com/Sophyia7/easy-read

I appreciate you!

Keep in touch:

You can contact me via Email and Discord:

Email -

Discord - Sophyia#8929
