Convert any .pdf file 📚 into an audio 🔈 book with Python
Mustafa Anas

Publish Date: Jan 7 '20

(edit: I am glad you all liked this project! It got to be the top Python article of the week!)

A while ago I was messing around with Google's Text-to-Speech Python library (gTTS).
This library reads out any piece of text and saves the speech as an .mp3 file. Then I started thinking of making something useful out of it.

My installed, saved, and unread pdf books 😕

I like reading books. I really do. I think language and the sharing of ideas are fascinating. I have a directory where I store PDF books that I plan on reading but never do. So I thought: hey, why don't I turn them into audiobooks and listen to them while I do something else 😄!

So I started planning what the script should do:

  • Allow the user to pick a .pdf file
  • Convert the file into one string
  • Output an .mp3 file

Without further ado, let's get to it.

Allow user to pick a .pdf file

Python can read files easily. I just need to call the built-in open("filelocation", "rb") function to open the file in binary read mode. I don't want to copy and paste files into the code's directory every time I use the script, though. So, to make it easier, we will use the tkinter library to open a dialog that lets us choose the file.

from tkinter import Tk
from tkinter.filedialog import askopenfilename

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

Great. Now we have the file location stored in a filelocation variable.
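The dialog can also be limited to PDF files, and the script can check whether the user cancelled. The following is only a sketch: the helper name pick_pdf and its ask parameter are made up for illustration and are not part of the original script. The ask parameter exists so the helper can be tried without opening a window.

```python
def pick_pdf(ask=None):
    """Return the chosen PDF path, or None if the dialog was cancelled.

    `ask` is a hypothetical hook for illustration: any callable that
    accepts the same keyword arguments as tkinter's askopenfilename.
    """
    if ask is None:
        # Import lazily so the helper can be exercised without a display.
        from tkinter import Tk
        from tkinter.filedialog import askopenfilename

        Tk().withdraw()  # hide the empty root window
        ask = askopenfilename

    # Only show files ending in .pdf in the dialog.
    path = ask(title="Choose a PDF", filetypes=[("PDF files", "*.pdf")])

    # The dialog gives back an empty value when the user cancels.
    return path or None


# Typical use in the script:
#     filelocation = pick_pdf()
#     if filelocation is None:
#         raise SystemExit("No file selected")
```

Returning None on cancel lets the rest of the script stop early instead of failing later when open() receives an empty path.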

Allow user to pick a .pdf file ✔️

Convert the file into one string

As I said before, to open a file in Python we just need to use the open() function. But we also want to convert the PDF file into plain text, so we might as well do it now.
To do that we will use a library called pdftotext. It is built on Poppler, so pip may need a C++ compiler and the Poppler development packages to be installed first.
Let's install it:

sudo pip install pdftotext

Then:

from tkinter import Tk
from tkinter.filedialog import askopenfilename
import pdftotext

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

with open(filelocation, "rb") as f:  # open the file in reading (rb) mode and call it f
    pdf = pdftotext.PDF(f)  # store a text version of the pdf file f in pdf variable

Great. Now we have the file stored in the variable pdf.
This object behaves like a list of strings, where each string is the text of one page. To get them all into one .mp3 file, we have to make sure they are stored as one string. So let's loop through the pages and add them all to one string.

from tkinter import Tk
from tkinter.filedialog import askopenfilename
import pdftotext

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

with open(filelocation, "rb") as f:  # open the file in reading (rb) mode and call it f
    pdf = pdftotext.PDF(f)  # store a text version of the pdf file f in pdf variable

string_of_text = ''
for text in pdf:
    string_of_text += text

Sweet 😄. Now we have it all as one string.
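The same concatenation can be written with str.join, which avoids building the string piece by piece. This is a small sketch that uses a plain list of made-up page strings in place of the pdf object; the helper name pages_to_text is only for illustration.

```python
def pages_to_text(pages):
    """Join per-page strings into one string, with a blank line between pages."""
    return "\n\n".join(pages)


# Stand-in for the pdf object: one string per page.
sample_pages = ["Page one text.", "Page two text."]
string_of_text = pages_to_text(sample_pages)
```

Unlike the loop, this version puts a blank line between pages; "".join(pages) would reproduce the loop's output exactly.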

Convert the file into one string ✔️

Output .mp3 file 🔈

Now we are ready to use the gTTS (Google Text-to-Speech) library. All we need to do is pass in the string we made, store the result in a variable, then use the save() method to write the file to disk.
Let's install it:

sudo pip install gtts

Then:

from tkinter import Tk
from tkinter.filedialog import askopenfilename
import pdftotext
from gtts import gTTS

Tk().withdraw() # we don't want a full GUI, so keep the root window from appearing
filelocation = askopenfilename() # open the dialog GUI

with open(filelocation, "rb") as f:  # open the file in reading (rb) mode and call it f
    pdf = pdftotext.PDF(f)  # store a text version of the pdf file f in pdf variable

string_of_text = ''
for text in pdf:
    string_of_text += text

final_file = gTTS(text=string_of_text, lang='en')  # store file in variable
final_file.save("Generated Speech.mp3")  # save file to computer

As simple as that! We are done 🎇
(edit: I am glad you all liked this article! The intention of all my writings is to be as simple as possible so readers of all levels can understand. If you wish to know more about customizing this API, please check this page: https://gtts.readthedocs.io/en/latest/)
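For long books, one option is to convert the text in pieces and save one .mp3 per piece, so a failed request does not lose the whole conversion. The sketch below is an assumption-laden illustration, not part of the original script: the helper names, the 4000-character limit, and the numbered file naming are all arbitrary choices, and splitting the text is not guaranteed to prevent errors from the TTS service.

```python
def split_into_chunks(text, max_chars=4000):
    """Split text into pieces of at most max_chars, breaking on whitespace.

    max_chars=4000 is an arbitrary illustrative limit, not a gTTS setting.
    """
    chunks = []
    current = ""
    for word in text.split():
        # Hard-split any single token that is longer than the limit.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def save_chunks(text, out_prefix, lang="en", max_chars=4000):
    """Save one numbered .mp3 per chunk and return the list of file paths."""
    from gtts import gTTS  # imported lazily; saving needs network access

    paths = []
    for index, chunk in enumerate(split_into_chunks(text, max_chars), start=1):
        path = f"{out_prefix}_{index:03d}.mp3"  # hypothetical naming scheme
        gTTS(text=chunk, lang=lang).save(path)
        paths.append(path)
    return paths


# Typical use in the script:
#     save_chunks(string_of_text, "Generated Speech")
```

Numbered files sort in reading order, so most players will play them back to back.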

Buy Me A Coffee

I am on a lifetime mission to support and contribute to the general knowledge of the web community as much as possible. Some of my writings might sound too silly, or too difficult, but no knowledge is ever useless. If you like my articles, feel free to help me keep writing by getting me coffee :)

Comments 55 total

  • Kristina Gocheva, Jan 7, 2020

    My favorite part is (if I am not mistaken) that this would work for any language PDF as long as google text to speech supports the language.

  • Belkin, Jan 7, 2020

    Do you have any demo audio files? I'm really interested to hear it. :)

    • Mustafa Anas, Jan 7, 2020

      Run this code and hear the result

      from gtts import gTTS
      final_file = gTTS(text='Demo String', lang='en')  # store file in variable
      final_file.save("Generated Speech.mp3")  # save file to computer
      
  • Rishabh Aggarwal, Jan 7, 2020

    Hey, this is really cool.

    • Mustafa Anas, Jan 7, 2020

      hey thanks buddy!
      glad you liked it

  • Cristian Carvajal 👽, Jan 7, 2020

    Great!!
    Does it work in any language?

  • Blake Stansell, Jan 7, 2020

    Awesome, awesome, awesome! I'm guessing they're ok to listen to?

  • bga, Jan 7, 2020

    Really useful article.

  • SURAJ BRANWAL, Jan 8, 2020

    Thanks a lot for the article. I tried hard to find something like this, and now I am able to read (listen to) all my untouched PDFs.

  • schwepmo, Jan 9, 2020

    Really cool and quick project! One thing I would suggest is to use python's join() method instead of looping over the list of strings. I think that's the more "pythonic" way and should also perform a little better.

    • Mustafa Anas, Jan 9, 2020

      Thanks for the tip!
      I sure will start using that

  • Ashwanth, Jan 10, 2020

    I am really intrigued by this article. I tried everything to install the pdftotext lib on my Mac but was unsuccessful. I keep getting this error --> "error: command 'gcc' failed with exit status 1"
    I installed the OS dependencies and Poppler using brew, but it didn't work. Can anyone help me?

    • Mustafa Anas, Jan 10, 2020

      make sure you have these two installed:
      python-dev
      libevent-dev

      • Ashwanth, Jan 10, 2020

        Yup, I installed them. No matter what I do, I keep getting this error --> "ERROR: Command errored out with exit status 1"
        and I installed gcc too!

        • Kelvin Thompson, Jan 10, 2020

          I just started getting the same thing on my system (Ubuntu). After a lot of Google/StackExchange, this worked (copy from my annotations):

          For whatever reason, in order to install the following two, I had to install some stuff on my Ubuntu MATE system-wide to get rid of compile errors:

          sudo apt-get install python3-setuptools python3-dev libpython3-dev
          sudo apt-get update
          sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python-dev

          I'm using PyCharmCE. After the above, I could use this in the PyCharm terminal:

          pip3 install pdftotext
          pip3 install gtts

          After I did all of that, successful! Program works like a charm (hehe).

          Cheers!

          • Mustafa Anas, Jan 11, 2020

            Thanks for sharing your solution!

            • Kelvin Thompson, Jan 11, 2020

              A pleasure to finally be able to give back a little!

              • Ashwanth, Jan 11, 2020

                I have a Mac, brother. I can't use apt-get. What should I do now?

                • David Souza, Jan 14, 2020

                  Are you using the default Python 2.7?? You may need to use Python 3.x

                  • David Souza, Jan 14, 2020

                    I got this working on the Mac using Python 3.7.4 using virtual env and brew. Works fine.

                    • Jogesh, Jan 14, 2020

                      I am using docker with my Macbook without any issue. And it is a great alternative to start working on any environment, stack, etc.

      • Rohit Prasad, Jan 16, 2020

        The project page lists what has to be installed for each OS: pypi.org/project/pdftotext/

    • Harald Nezbeda, Jan 25, 2020

      Have you tried to install the OS dependencies as specified in the docs? github.com/jalan/pdftotext#macos

  • Narendra Kumar Vadapalli, Jan 14, 2020

    I am on Fedora and had to install the following dependencies before I could pip install pdftotext.

    The sequence would be:

    sudo dnf install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
    pip install pdftotext gtts
    
  • Steve (Gadget) Barnes, Jan 14, 2020

    I would suggest adding two lines to save the MP3 file to the same location and name as the PDF file.

    from os.path import splitext

    outname = splitext(filelocation)[0] + '.mp3'

    then use:

    final_file.save(outname)

    • Mustafa Anas, Jan 14, 2020

      That would be a nice add!

    • sadorect, Jan 14, 2020

      Oh, fantastic! I was looking to add this by myself but I don't know python coding. Thanks for bringing it up!

  • Usman Kamal, Jan 14, 2020

    Nice one Mustafa!

    I'm curious what would happen if the PDF has images or mathematical equations?

  • sadorect, Jan 14, 2020

    This is a life-saving procedure you shared. I tried it and it works like a charm. Thank you so very much.

    I have a question though...
    I know this is a simplistic approach to just explain the basics (and it's awesome). Please, is it possible to change the reader's voice and reading speed?

    • Mustafa Anas, Jan 15, 2020

      I am glad you liked it!
      The intention of all my writings is to be as simple as possible so readers of all levels can understand.
      If you wish to know more about customizing this API, please check this page:
      gtts.readthedocs.io/en/latest/

  • sadorect, Jan 14, 2020

    An observation here (I'm sure this has to do with the gTTS engine though):

    The reader sometimes spells words out rather than pronouncing them, and it's a bit strange. I did a conversion where the word "first" was spelt rather than pronounced. Initially, I thought this happens when words are not properly written and the text recognition engine is affected. "Five" was pronounced fai-vee-e, and there were other spellings like that.

    Overall though, it is manageable and one can make good sense out of the readings. Now I can "read" my e-books faster with this ingenious solution.

    Thanks again, @mustapha

  • Abhinav Kumar Srivastava, Jan 14, 2020

    Really cool!
    However, when I tried to convert a decent-sized PDF file (3.0 MB), I got the following error:

    "gtts.tts.gTTSError: 500 (Internal Server Error) from TTS API. Probable
    cause: Uptream API error. Try again later."

    Is gTTS blocking me from using their API? How shall I resolve this?

  • Abhinav Kumar Srivastava, Jan 14, 2020

    Suggestion: display the status of the conversion.

  • Mustafa Anas, Jan 15, 2020

    Thank you for sharing the repo Harlin!

  • Dima Naboka, Jan 15, 2020

    I have a problem running [vagrant@centos8 ~]$ sudo pip3 install pdftotext on CentOS 8:
    error: command 'gcc' failed with exit status 1
    Command "/usr/bin/python3.6 -u -c "import setuptools, tokenize;file='/tmp/pip-build-7_3v7vuh/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-ac0irxfy-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-7_3v7vuh/pdftotext/

    I'm running Python 3.6.8, do I have to use Python 3.8 explicitly?

  • Ankur Tiwari, Jan 18, 2020

    Cool stuff!

  • Oyeladun Rapheal Kunle, Apr 1, 2020

    I copied this code and pasted it into Python 3 (Anaconda) and nothing was displayed: no error, no output. Why is that? Thanks.

    • Mustafa Anas, Apr 1, 2020

      I do not use Anaconda so I can't guess what the problem is.
      Just make sure you have all the needed packages installed and it should run smoothly.

  • malraharsh, Sep 10, 2020

    This idea is great. But if you just want to listen, use the Moon+ Reader app. It converts text to speech.

  • Priyanshu Kumar, Oct 10, 2020

    Will it also read page numbers, footers, or any other extra garbage text?

    • Vaibhav Kaushik, Nov 5, 2020

      Yes, of course, as they are also text.

      • Priyanshu Kumar, Nov 5, 2020

        Using Machine learning you can avoid those things.

  • Kushagra0347, Nov 20, 2020

    This code gets stuck after I add a PDF. Can anyone provide a solution to this?

    from tkinter import *
    import pygame
    import PyPDF2
    from gtts import gTTS
    from tkinter import filedialog
    from os.path import splitext

    root = Tk()
    root.title('PDF Audio Player')
    root.geometry("500x300")

    # Initialise Pygame Mixer
    pygame.mixer.init()

    # Add PDF Function
    def addPDF():
        PDF = filedialog.askopenfilename(title="Choose a PDF", filetypes=(("PDF Files", "*.PDF"), ))
        PDF_dir = PDF

        # Strip Out the Directory Info and .pdf extension
        # So That Only the Title Shows Up
        PDF = PDF.replace('C:/Users/kusha/Downloads/', '')
        PDF = PDF.replace(".pdf", '')

        audioBookBox.insert(END, PDF)
        PDFtoAudio(PDF_dir)

    def PDFtoAudio(PDF_dir):
        file = open(PDF_dir, 'rb')
        reader = PyPDF2.PdfFileReader(file)
        totalPages = reader.numPages
        string = ""

        for i in range(0, totalPages):
            page = reader.getPage(i)
            text = page.extractText()
            string += text

        outName = splitext(PDF_dir)[0] + '.mp3'
        audioFile = gTTS(text=string, lang='en') # store file in variable
        audioFile.save(outName) # save file to computer

    # Play Selected PDF Function
    def play():
        audio = audioBookBox.get(ACTIVE)
        audio = f'C:/Users/kusha/Downloads/{audio}.mp3'

        pygame.mixer.music.load(audio)
        pygame.mixer.music.play(loops=0)

    # Create Playlist Box
    audioBookBox = Listbox(root, bg="black", fg="red", width = 70, selectbackground="gray", selectforeground="black")
    audioBookBox.pack(pady=20)

    # Define Player Control Button Images
    backBtnImg = PhotoImage(file='Project Pics/back50.png')
    forwardBtnImg = PhotoImage(file='Project Pics/forward50.png')
    playBtnImg = PhotoImage(file='Project Pics/play50.png')
    pauseBtnImg = PhotoImage(file='Project Pics/pause50.png')
    stopBtnImg = PhotoImage(file='Project Pics/stop50.png')

    # Create Player Control Frame
    controlsFrame = Frame(root)
    controlsFrame.pack()

    # Create Player Control Buttons
    backBtn = Button(controlsFrame, image=backBtnImg, borderwidth=0)
    forwardBtn = Button(controlsFrame, image=forwardBtnImg, borderwidth=0)
    playBtn = Button(controlsFrame, image=playBtnImg, borderwidth=0, command=play)
    pauseBtn = Button(controlsFrame, image=pauseBtnImg, borderwidth=0)
    stopBtn = Button(controlsFrame, image=stopBtnImg, borderwidth=0)

    backBtn.grid(row=0, column=0, padx=10)
    forwardBtn.grid(row=0, column=1, padx=10)
    playBtn.grid(row=0, column=2, padx=10)
    pauseBtn.grid(row=0, column=3, padx=10)
    stopBtn.grid(row=0, column=4, padx=10)

    # Create Menu
    myMenu = Menu(root)
    root.config(menu=myMenu)

    # Add the converted audio file in the menu
    addAudioMenu = Menu(myMenu)
    myMenu.add_cascade(label="Add PDF", menu=addAudioMenu)
    addAudioMenu.add_command(label="Add One PDF", command=addPDF)

    root.mainloop()

  • abragred, Jan 2, 2021

    This seems a very nice idea, might get my friend that knows how to do stuff with Python to get this done for me. I'm not at such a high level of technology use to be able to do this stuff alone, but I'd like to learn. One thing that I'm proud of myself that I can help my family with is working with PDF forms. I especially help my mom a lot cause she has a lot of forms to fill for her job, but has a big lack of technology talent. I found this site pdfliner.com/alternative/sejda_alt... that lets me edit anything I want.

  • dennisboscodemello1989, Aug 13, 2021

    Is there any way to pop up an option for choosing the page from which the reading will start? The option for choosing the PDF file is already there. I am pasting the code:

    import pyttsx3 as py
    import PyPDF2 as pd

    pdfReader = pd.PdfFileReader(open('Excel-eBook.pdf', 'rb'))

    from tkinter.filedialog import *

    speaker = py.init()

    voices = speaker.getProperty('voices')

    for voice in voices:
        speaker.setProperty('voice', voice.id)

    book = askopenfilename()
    pdfreader = pd.PdfFileReader(book)
    pages = pdfreader.numPages

    for num in range(0, pages): # 0 is the page number from which the reading will start
        page = pdfreader.getPage(num)
        text = page.extractText()
        player = py.init()
        player.say(text)
        player.runAndWait()
