Java Convert Word to PDF with UNO - Part 1
Libre Office and Open Office are free office suites that provide, among other things, a programmatic interface (the UNO API) to load, manipulate and save documents.
The steps below create a program to load a Microsoft Word document, make changes, and save it to PDF format.
Objective
We'll use the code below to mail merge a DOC file to PDF. The process is:
- set up your environment
- initialise
- load the document
- substitute the data (mail merge)
- save as PDF
- shutdown
This starting point will let you test all sorts of document conversions and mail merging scenarios.
Step 1 - Setup
We need to add the Libre Office (or Open Office if that's what you've installed) JARs to our class path. These JARs give us access to the Java UNO API that we'll be calling to do all sorts of magic. In your install of Libre Office, look for the following files and make sure they are in your project class path:
- Install Libre Office or Open Office.
- Create a Java project in your favourite editor and add these to your class path:
  [Libre Office Dir]/URE/java/juh.jar
  [Libre Office Dir]/URE/java/jurt.jar
  [Libre Office Dir]/URE/java/ridl.jar
  [Libre Office Dir]/program/classes/unoil.jar
- Create a new Java class.
The following code snippets can be copied into the main() method to create our working program. You can Download the code and template to save time.
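Before wiring the JARs into a project, it can help to confirm they are where the list above says they are. The sketch below is a hypothetical helper (the class name UnoJarCheck and its method are not part of the UNO API) that takes an install directory and reports which of the four JARs are missing. It assumes the relative paths listed above, which may differ between versions and platforms.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the UNO API): reports which of the
// four JARs listed above are missing from a given install directory.
class UnoJarCheck {

    // Relative locations taken from the list above; they may differ
    // between Libre Office / Open Office versions and platforms.
    static final String[] REQUIRED_JARS = {
        "URE/java/juh.jar",
        "URE/java/jurt.jar",
        "URE/java/ridl.jar",
        "program/classes/unoil.jar"
    };

    static List<String> missingJars(File installDir) {
        List<String> missing = new ArrayList<>();
        for (String relativePath : REQUIRED_JARS) {
            if (!new File(installDir, relativePath).isFile()) {
                missing.add(relativePath);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pass your own install directory as the first argument.
        File installDir = new File(args.length > 0 ? args[0] : ".");
        List<String> missing = missingJars(installDir);
        if (missing.isEmpty()) {
            System.out.println("All UNO JARs found under " + installDir);
        } else {
            System.out.println("Missing under " + installDir + ": " + missing);
        }
    }
}
```

Run it with your install directory as the only argument. An empty "missing" list means all four paths resolved to files.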
Step 2 - Starting the Office Process
Boot a Libre Office process that will listen to our requests.
import java.util.Date;
import java.io.File;
import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XDesktop;
import com.sun.star.frame.XStorable;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.text.XTextDocument;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.XReplaceDescriptor;
import com.sun.star.util.XReplaceable;
public class MyDocEngine {
    public static void main(String[] args) throws Exception {
        // Initialise
        XComponentContext xContext = Bootstrap.bootstrap();
        XMultiComponentFactory xMCF = xContext.getServiceManager();
        Object oDesktop = xMCF.createInstanceWithContext(
                "com.sun.star.frame.Desktop", xContext);
        XDesktop xDesktop = (XDesktop) UnoRuntime.queryInterface(
                XDesktop.class, oDesktop);
Step 3 - Loading a Document
The code below loads a template into the Libre Office engine. Notice 2 things:
1. It expects to find the template as c:/projects/letterTemplate.doc (so you should change this as required).
2. The load process uses a "Hidden" flag. This can be set to false to see the process working.
        // Load the Document
        String workingDir = "C:/projects/";
        String myTemplate = "letterTemplate.doc";
        if (!new File(workingDir + myTemplate).canRead()) {
            throw new RuntimeException("Cannot load template:"
                    + new File(workingDir + myTemplate));
        }
        XComponentLoader xCompLoader = (XComponentLoader) UnoRuntime
                .queryInterface(com.sun.star.frame.XComponentLoader.class, xDesktop);
        String sUrl = "file:///" + workingDir + myTemplate;
        // "Hidden" = true loads the document without showing a window
        PropertyValue[] propertyValues = new PropertyValue[1];
        propertyValues[0] = new PropertyValue();
        propertyValues[0].Name = "Hidden";
        propertyValues[0].Value = Boolean.TRUE;
        XComponent xComp = xCompLoader.loadComponentFromURL(
                sUrl, "_blank", 0, propertyValues);
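The load step builds its URL by prefixing "file:///" to the path, which works for the simple path used here but leaves characters such as spaces unencoded. As a possible alternative, the sketch below lets the standard library build the file URL. FileUrls and toFileUrl are hypothetical names, and the assumption that the office process accepts the percent-encoded form should be checked against your own install.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: builds a file URL from a directory and file name.
class FileUrls {

    // Converts a local path to a file URL string, percent-encoding
    // characters such as spaces that plain concatenation leaves as-is.
    static String toFileUrl(String directory, String fileName) {
        Path path = Paths.get(directory, fileName).toAbsolutePath();
        return path.toUri().toASCIIString();
    }

    public static void main(String[] args) {
        // Same directory and template name as the load step above.
        System.out.println(toFileUrl("C:/projects/", "letterTemplate.doc"));
    }
}
```

In the program above, the result of toFileUrl(workingDir, myTemplate) would take the place of sUrl.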
Step 4 - Search and Replace
The search and replace looks for three placeholders in the template:
- "<date>", replaced with the current date and time
- "<addressee>", replaced with "Best Friend"
- "<signatory>", replaced with "John Steady"
        // Search and replace
        XTextDocument xTextDocument = (XTextDocument) UnoRuntime
                .queryInterface(XTextDocument.class, xComp);
        XReplaceable xReplaceable = (XReplaceable) UnoRuntime
                .queryInterface(XReplaceable.class, xTextDocument);
        XReplaceDescriptor xReplaceDescr = xReplaceable.createReplaceDescriptor();
        // mail merge the date
        xReplaceDescr.setSearchString("<date>");
        xReplaceDescr.setReplaceString(new Date().toString());
        xReplaceable.replaceAll(xReplaceDescr);
        // mail merge the addressee
        xReplaceDescr.setSearchString("<addressee>");
        xReplaceDescr.setReplaceString("Best Friend");
        xReplaceable.replaceAll(xReplaceDescr);
        // mail merge the signatory
        xReplaceDescr.setSearchString("<signatory>");
        xReplaceDescr.setReplaceString("John Steady");
        xReplaceable.replaceAll(xReplaceDescr);
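With more than a handful of placeholders, repeating the three calls per field gets tedious. One option is to keep the fields in a map and loop over it. The sketch below shows only that loop: MergeFields and mergeAll are hypothetical names, and a plain string stands in for the document so the example runs without an office install. In the real program the callback would wrap setSearchString, setReplaceString and replaceAll.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper: drives the substitutions from a map of fields.
class MergeFields {

    // Applies every placeholder/value pair through the supplied callback.
    // In the UNO program the callback would wrap setSearchString,
    // setReplaceString and replaceAll; here it is left abstract.
    static void mergeAll(Map<String, String> fields,
                         BiConsumer<String, String> replaceAll) {
        for (Map.Entry<String, String> field : fields.entrySet()) {
            replaceAll.accept(field.getKey(), field.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("<addressee>", "Best Friend");
        fields.put("<signatory>", "John Steady");

        // Stand-in for the document: a plain string edited in place.
        StringBuilder text = new StringBuilder("Dear <addressee>, from <signatory>");
        mergeAll(fields, (search, replacement) -> {
            String updated = text.toString().replace(search, replacement);
            text.setLength(0);
            text.append(updated);
        });
        System.out.println(text);
    }
}
```

A LinkedHashMap keeps the fields in insertion order, which matters only if one replacement value could contain another placeholder.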
Step 5 - Export to PDF
The Libre Office filter name "writer_pdf_Export" is used to save as a PDF document. The name must match exactly, including the capital "E".
        // save as a PDF
        XStorable xStorable = (XStorable) UnoRuntime
                .queryInterface(XStorable.class, xComp);
        propertyValues = new PropertyValue[2];
        propertyValues[0] = new PropertyValue();
        propertyValues[0].Name = "Overwrite";
        propertyValues[0].Value = Boolean.TRUE;
        propertyValues[1] = new PropertyValue();
        propertyValues[1].Name = "FilterName";
        propertyValues[1].Value = "writer_pdf_Export";
        // Write the PDF into the working directory
        String myResult = workingDir + "letter1.pdf";
        xStorable.storeToURL("file:///" + myResult, propertyValues);
        System.out.println("Saved " + myResult);
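The export step writes to a fixed name, letter1.pdf. If you would rather derive the output name from the template name, a small helper along these lines could do it. PdfNames and toPdfName are hypothetical names used only for illustration.

```java
// Hypothetical helper: derives a PDF file name from a source file name.
class PdfNames {

    // Replaces the last extension of a file name with ".pdf";
    // names without an extension simply get ".pdf" appended.
    static String toPdfName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String base = (dot > 0) ? fileName.substring(0, dot) : fileName;
        return base + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(toPdfName("letterTemplate.doc")); // prints letterTemplate.pdf
    }
}
```

In the program above, workingDir + toPdfName(myTemplate) would then replace the hard-coded result name.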
Step 6 - Shutdown
This terminates the process launched in step 2 above. Instead of terminating, more load, manipulate and save processing could be done.
        // shutdown
        xDesktop.terminate();
}
}
You can put all the above code together by copy-and-pasting, or you can download the Code and Template.
Gotchas 1 - Multithreading
It's possible, but not advisable, to call a single Libre Office process from several threads at once. Experience has shown that this leads to instability and unpredictable results. To handle many requests, you could instead launch multiple Libre Office processes and use each one in a single-threaded manner.
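One way to keep a single office process single-threaded is to funnel every request through one worker thread. The sketch below uses a standard single-thread executor for that. ConversionQueue is a hypothetical name, and the jobs are placeholders that return a string where a real job would run the load, replace and save steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical wrapper: serialises conversion jobs onto one thread.
class ConversionQueue implements AutoCloseable {

    // One worker thread: jobs run one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    <T> Future<T> submit(Callable<T> job) {
        return worker.submit(job);
    }

    @Override
    public void close() {
        // Lets queued jobs finish, then stops the worker thread.
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (ConversionQueue queue = new ConversionQueue()) {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 1; i <= 3; i++) {
                final int n = i;
                // Placeholder job: a real one would run the load/replace/save steps.
                results.add(queue.submit(() ->
                        "letter" + n + ".pdf on " + Thread.currentThread().getName()));
            }
            for (Future<String> result : results) {
                System.out.println(result.get());
            }
        }
    }
}
```

Callers on any thread can submit work, but the office process only ever sees calls from the one worker. Running several processes would mean one such queue per process.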
Gotchas 2 - Process and Crash Management
Under a realistic workload, some documents will crash the process. A production version of this approach needs to expect the occasional failure, clean up, and restart the process. Ideally this would all be transparent to the calling user or program.
Likewise, make sure you clean up any resources you use, in cases of success and in cases of failure. Here we are spawning a separate process, which is definitely something you always want to clean up.
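The expect-failure, clean-up and restart pattern can be sketched as a small retry wrapper. Everything here is an assumption for illustration: RestartingRunner and runWithRestart are hypothetical names, the job is simulated, and the restart action is a placeholder where real code would terminate the old office process and bootstrap a new one.

```java
import java.util.concurrent.Callable;

// Hypothetical wrapper: retries a job, restarting the process between tries.
class RestartingRunner {

    // Runs the job; if it throws, calls restart and tries again,
    // up to maxAttempts. Restart also runs after the final failure,
    // so a broken process is never left behind.
    static <T> T runWithRestart(Callable<T> job, Runnable restart, int maxAttempts)
            throws Exception {
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return job.call();
            } catch (Exception e) {
                lastFailure = e;
                // Clean up and relaunch the office process before retrying.
                restart.run();
            }
        }
        throw new Exception("Gave up after " + maxAttempts + " attempts", lastFailure);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = runWithRestart(
                () -> {
                    // Simulated job that fails on its first call only.
                    if (++calls[0] == 1) {
                        throw new IllegalStateException("simulated crash");
                    }
                    return "letter1.pdf";
                },
                () -> System.out.println("restarting office process (placeholder)"),
                3);
        System.out.println("Saved " + result);
    }
}
```

Capping the attempts matters because a document that crashes the process once will often crash it every time. After the cap is reached, the failure is reported to the caller instead of looping forever.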
Downloads / Resources
You can Download a Zip of this example.
There are many more examples of using the UNO API in the Libre Office SDK.