07 February 2015

Java Convert Word to PDF with UNO - Part 1

Libre Office and Open Office are free office suites that provide, among other things, a programmatic interface (called the UNO API) to load, manipulate and save documents.

The steps below create a program to load a Microsoft Word document, make changes, and save it to PDF format.

Objective

We'll use the code below to mail merge a DOC file to PDF. The process is:

  1. set up your environment
  2. initialise
  3. load the document
  4. substitute the data (mailmerge)
  5. save as PDF
  6. shutdown

This starting point will let you test all sorts of document conversions and mail merging scenarios.

Step 1 - Setup

We need to add the Libre Office (or Open Office, if that's what you've installed) JARs to our class path. These JARs give us access to the Java UNO API that we'll be calling to do all sorts of magic.

Install Libre Office or Open Office.

Create a Java project in your favourite editor and add these files from your install to your class path:
  [Libre Office Dir]/URE/java/juh.jar
  [Libre Office Dir]/URE/java/jurt.jar
  [Libre Office Dir]/URE/java/ridl.jar
  [Libre Office Dir]/program/classes/unoil.jar

Create a new Java class. The code snippets in the following steps can be copied, in order, into that class to create our working program.
You can download the code and template to save time.
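
Before going further, it can help to confirm that the JARs really are visible to the JVM. The following is a minimal sketch and not part of the original program: the class name UnoClasspathCheck and the helper isOnClasspath are made up for illustration, and the class names it probes are simply taken from the imports used in Step 2.

```java
// Sketch: report whether the UNO classes used later can be found on the class path.
class UnoClasspathCheck {

    // Returns true if the named class can be located (hypothetical helper).
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initialisers
            Class.forName(className, false, UnoClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names copied from the import list in Step 2
        String[] required = {
            "com.sun.star.comp.helper.Bootstrap",
            "com.sun.star.uno.UnoRuntime",
            "com.sun.star.frame.XDesktop",
            "com.sun.star.beans.PropertyValue"
        };
        for (String name : required) {
            System.out.println((isOnClasspath(name) ? "found   " : "MISSING ") + name);
        }
    }
}
```

If any line reports MISSING, the class path most likely needs fixing before the rest of the steps will compile or run.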


Step 2 - Starting the Office Process

Boot a Libre Office process that will listen to our requests.

import java.util.Date;
import java.io.File;
import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XDesktop;
import com.sun.star.frame.XStorable;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.text.XTextDocument;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.XReplaceDescriptor;
import com.sun.star.util.XReplaceable;

public class MyDocEngine {

public static void main(String[] args) throws Exception {
 // Initialise
 XComponentContext xContext = Bootstrap.bootstrap();
 XMultiComponentFactory xMCF = xContext.getServiceManager();
 Object oDesktop = xMCF.createInstanceWithContext(
   "com.sun.star.frame.Desktop", xContext);
 XDesktop xDesktop = (XDesktop) UnoRuntime.queryInterface(
   XDesktop.class, oDesktop);

Step 3 - Loading a Document

The code below loads a template into the Libre Office engine. Notice 2 things:
1. It expects to find the template at C:/projects/letterTemplate.doc (change the workingDir and myTemplate values to suit your setup).
2. The load process uses a "Hidden" flag. This can be set to false to see the process working.

 // Load the Document
 String workingDir = "C:/projects/";
 String myTemplate = "letterTemplate.doc";

 if (!new File(workingDir + myTemplate).canRead()) {
  throw new RuntimeException("Cannot load template:" + new File(workingDir + myTemplate));
 }

 XComponentLoader xCompLoader = (XComponentLoader) UnoRuntime
  .queryInterface(com.sun.star.frame.XComponentLoader.class, xDesktop);

 String sUrl = "file:///" + workingDir + myTemplate;
 
 // Load the document hidden (set to Boolean.FALSE to watch the process)
 PropertyValue[] propertyValues = new PropertyValue[1];
 propertyValues[0] = new PropertyValue();
 propertyValues[0].Name = "Hidden";
 propertyValues[0].Value = Boolean.TRUE;
 
 XComponent xComp = xCompLoader.loadComponentFromURL(
  sUrl, "_blank", 0, propertyValues);
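
Building the URL by concatenating "file:///" with the path works for the simple path above, but it may not hold up once the path contains spaces or other characters that need escaping. As a hedged alternative, the standalone sketch below lets the JDK build the URL from a path. The class FileUrls and the method toFileUrl are hypothetical names introduced for illustration only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: derive a file URL from a directory and file name using the JDK.
class FileUrls {

    // Hypothetical helper: returns an escaped file URL for dir + fileName.
    static String toFileUrl(String dir, String fileName) {
        Path path = Paths.get(dir, fileName).toAbsolutePath().normalize();
        // Path.toUri() percent-encodes characters such as spaces
        return path.toUri().toString();
    }

    public static void main(String[] args) {
        // Same values as workingDir and myTemplate in the step above
        System.out.println(toFileUrl("C:/projects/", "letterTemplate.doc"));
    }
}
```

The result could then be passed to loadComponentFromURL in place of sUrl. The exact string printed depends on the operating system, so it is worth checking on your own machine.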

Step 4 - Search and Replace

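The search and replace looks for three placeholders in the template:
"<date>", which is replaced with the current date and time
"<addressee>", which is replaced with "Best Friend"
"<signatory>", which is replaced with "John Steady"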

 // Search and replace
 XReplaceDescriptor xReplaceDescr = null;
 XReplaceable xReplaceable = null;

 XTextDocument xTextDocument = (XTextDocument) UnoRuntime
   .queryInterface(XTextDocument.class, xComp);

 xReplaceable = (XReplaceable) UnoRuntime
   .queryInterface(XReplaceable.class, xTextDocument);

 xReplaceDescr = (XReplaceDescriptor) xReplaceable
   .createReplaceDescriptor();

 // mail merge the date
 xReplaceDescr.setSearchString("<date>");
 xReplaceDescr.setReplaceString(new Date().toString());
 xReplaceable.replaceAll(xReplaceDescr);
 
 // mail merge the addressee
 xReplaceDescr.setSearchString("<addressee>");
 xReplaceDescr.setReplaceString("Best Friend");
 xReplaceable.replaceAll(xReplaceDescr);
 
 // mail merge the signatory
 xReplaceDescr.setSearchString("<signatory>");
 xReplaceDescr.setReplaceString("John Steady");
 xReplaceable.replaceAll(xReplaceDescr);
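
The three replacements above follow the same pattern, so with more fields it may be tidier to keep the merge data in a map and loop over it. The sketch below shows that idea on a plain String so that it runs without Office installed. MergeFields and merge are hypothetical names, and the sample text is invented. In the UNO version, the loop body would instead call setSearchString, setReplaceString and replaceAll as shown in this step.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: data-driven placeholder substitution, demonstrated on a String.
class MergeFields {

    // Replaces every occurrence of each placeholder key with its value.
    static String merge(String text, Map<String, String> fields) {
        String result = text;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Literal (non-regex) replacement of the placeholder
            result = result.replace(field.getKey(), field.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("<addressee>", "Best Friend");
        fields.put("<signatory>", "John Steady");
        System.out.println(merge("Dear <addressee>, regards, <signatory>", fields));
    }
}
```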

Step 5 - Export to PDF

The Libre Office filter name "writer_pdf_Export" is used to save as a PDF document.

 // save as a PDF 
 XStorable xStorable = (XStorable) UnoRuntime
   .queryInterface(XStorable.class, xComp);

 propertyValues = new PropertyValue[2];
 propertyValues[0] = new PropertyValue();
 propertyValues[0].Name = "Overwrite";
 propertyValues[0].Value = Boolean.TRUE;
 propertyValues[1] = new PropertyValue();
 propertyValues[1].Name = "FilterName";
 propertyValues[1].Value = "writer_pdf_Export";

 // Write the result next to the template under a fixed file name
 String myResult = workingDir + "letter1.pdf";
 xStorable.storeToURL("file:///" + myResult, propertyValues);

 System.out.println("Saved " + myResult);

Step 6 - Shutdown

This terminates the process launched in step 2 above. Instead of terminating here, you could keep the process running and load, manipulate and save more documents.

 // shutdown
 xDesktop.terminate();
}
}

You can put all the above code together by copying and pasting, or you can download the code and template.

Gotchas 1 - Multithreading

It's possible, but not advisable, to use this approach in a multi-threaded fashion. Experience has shown that this leads to instability and unpredictable results. Of course, you could launch multiple Libre Office processes to handle many requests, each in a single-threaded manner.
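
One way to keep access to a single Office process single-threaded, even when requests arrive from many threads, is to funnel every job through a single-threaded executor. This is a minimal sketch under that assumption: SerialConverter and runAndWait are hypothetical names, and the job passed in stands for the load, replace and save code from steps 3 to 5.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: serialise all conversion jobs onto one worker thread.
class SerialConverter implements AutoCloseable {

    // A single worker thread, so jobs never run concurrently
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Queues the job, blocks until it has run, and returns its result.
    <T> T runAndWait(Callable<T> job) {
        try {
            return worker.submit(job).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the job", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Conversion job failed", e.getCause());
        }
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

To scale out, you could create one SerialConverter per Office process, which matches the "multiple processes, each single-threaded" arrangement described above.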

Gotchas 2 - Process and Crash Management

Under a realistic workload, some documents will crash the process. This means a production version of this approach needs to expect the occasional failure, clean up and restart the process. Ideally this would all be transparent to the calling user or program.

Likewise, you want to make sure you clean up any resources you use, in cases of both success and failure. Here we are spawning a separate process, which is definitely something you always want to clean up.
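
The "expect failure, clean up, restart" idea can be sketched as a small retry wrapper. Everything here is an assumption made for illustration: RetryingRunner and runWithRestart are hypothetical names, the job stands for one document conversion, and restartOffice stands for whatever code you write to kill the old Office process and bootstrap a new one.

```java
import java.util.concurrent.Callable;

// Sketch: retry a job, running a cleanup/restart hook after every failure.
class RetryingRunner {

    static <T> T runWithRestart(Callable<T> job, Runnable restartOffice, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return job.call();
            } catch (Exception e) {
                lastFailure = new IllegalStateException("Attempt " + attempt + " failed", e);
                // Clean up and restart even after the final attempt,
                // so the next caller starts with a fresh process
                restartOffice.run();
            }
        }
        throw lastFailure;
    }
}
```

A caller only sees an exception once every attempt has failed, which keeps the occasional crash mostly transparent to the calling user or program.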

 

Downloads / Resources

You can download a zip of this example.


There are many more examples of using the UNO API in the Libre Office SDK.

Tags : Java, LibreOffice, Microsoft Word, OpenOffice, PDF
Author : Paul Jowett