.doc to pdf using python
A simple example using comtypes that converts a single file, with the input and output filenames given as command-line arguments:
import sys
import os
import comtypes.client
wdFormatPDF = 17
in_file = os.path.abspath(sys.argv[1])
out_file = os.path.abspath(sys.argv[2])
word = comtypes.client.CreateObject('Word.Application')
doc = word.Documents.Open(in_file)
doc.SaveAs(out_file, FileFormat=wdFormatPDF)
doc.Close()
word.Quit()
You could also use pywin32, which would be the same except for:
import win32com.client
and then:
word = win32com.client.Dispatch('Word.Application')
Convert Microsoft Word document to PDF using Python
I solved this problem and fixed the code as follows:
import os
import win32com.client

path = r'D:\programing\test'
wdFormatPDF = 17  # Word's file-format constant for PDF

word = win32com.client.Dispatch('Word.Application')
try:
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            base, ext = os.path.splitext(f)
            # Compare the extension case-insensitively so ".DOCX" is handled too
            if ext.lower() not in ('.doc', '.docx'):
                continue
            in_file = os.path.join(dirpath, f)
            new_file = os.path.join(dirpath, base + '.pdf')
            doc = word.Documents.Open(in_file)
            doc.SaveAs(new_file, FileFormat=wdFormatPDF)
            doc.Close()
finally:
    # Quit Word even if a conversion fails, so no hidden instance keeps files locked
    word.Quit()
Error when trying to convert .doc to .pdf with Python
Example:
# Assumes pywin32 is installed; "client" here is win32com.client
from win32com import client
# Open Microsoft Word
app = client.Dispatch("Word.Application")
# Read the document
doc = app.Documents.Open('C:/Users/<User>/Downloads/document.docx')
# Export as PDF (17 = wdExportFormatPDF)
doc.ExportAsFixedFormat('C:/Users/<User>/Downloads/document.pdf', 17, Item=7, CreateBookmarks=0)
doc.Close()
app.Quit()
If it still doesn't work, try deleting the cache files inside the "gen" folder in the path:
C:\Users\\AppData\Local\Programs\Python\Python39\Lib\site-packages\comtypes\gen
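The cleanup step above could be scripted roughly as follows. This is only a sketch: the function name clear_gen_cache and its keep parameter are made up for illustration, and you have to pass in the path of your own "gen" folder. It assumes the folder's `__init__.py` should be left in place and that comtypes regenerates the deleted wrapper modules the next time it needs them.

```python
from pathlib import Path


def clear_gen_cache(gen_dir, keep=("__init__.py",)):
    """Delete generated files directly inside gen_dir and return their names.

    Hypothetical helper: gen_dir is the comtypes "gen" folder you pass in.
    """
    removed = []
    for entry in sorted(Path(gen_dir).iterdir()):
        # Leave subdirectories and the package marker file in place.
        if entry.is_file() and entry.name not in keep:
            entry.unlink()
            removed.append(entry.name)
    return removed
```

Call it with the gen path for your Python installation, then rerun the conversion script.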
Alternatively, try the msoffice2pdf library, which uses Microsoft Office or LibreOffice installed in the environment.
https://pypi.org/project/msoffice2pdf/
I need to convert .doc and .docx files to .pdf using python
I ran into a similar problem earlier.
My suggestion:
As far as I know, there is no Python library that directly handles Microsoft Office formats, especially .doc.
So try using LibreOffice as a service. On Ubuntu the executable is "libreoffice";
on Windows it is "soffice.exe". Run it from the command line to convert the document to PDF without opening the LibreOffice window.
It is easy and fast, and it gives an almost perfect conversion of the file.
A sample:
For Windows:
"C:\Program Files (x86)\LibreOffice 4\program\soffice.exe" --headless --convert-to pdf "input_file_path" --outdir "output_dir_path"
This converts the input file to PDF in the given output directory without opening LibreOffice, using it only as a service.
To run this command from Python, you can use a library such as subprocess.
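A minimal sketch of calling that command through subprocess might look like this. The function names, the timeout value, and the idea of returning the expected output path are my own assumptions, not part of LibreOffice. The command-line flags are the ones from the sample above.

```python
import subprocess
from pathlib import Path


def build_convert_command(soffice, input_path, output_dir):
    """Build the argument list for a headless LibreOffice PDF conversion."""
    return [
        str(soffice),
        "--headless",
        "--convert-to", "pdf",
        str(input_path),
        "--outdir", str(output_dir),
    ]


def convert_to_pdf(soffice, input_path, output_dir, timeout=120):
    """Run the conversion and return the path where the PDF is expected."""
    command = build_convert_command(soffice, input_path, output_dir)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(command, check=True, timeout=timeout)
    # LibreOffice names the output after the input file's stem.
    return Path(output_dir) / (Path(input_path).stem + ".pdf")
```

Passing the arguments as a list avoids the quoting problems that paths with spaces, such as "Program Files (x86)", cause in a single command string. On Ubuntu you would pass "libreoffice" as the first argument; on Windows, the full path to soffice.exe.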
Docx to pdf using pandoc in python
The second argument to convert_file is the output format or, in this case, the format through which pandoc generates the PDF. Pandoc doesn't know how to produce a PDF through docx, hence the error.
Use pypandoc.convert_file('thisisdoc.docx', 'latex', outputfile="thisisdoc.pdf")
or pypandoc.convert_file('thisisdoc.docx', 'pdf', outputfile="thisisdoc.pdf")
instead.
Converting .docx to .pdf in Python (File locked for editing)
It works fine when I run your code, no matter the path. Maybe you need to exit MS Word before you try to run it?