Libre Office and Open Office are free office suites that provide, among other things, a programmatic interface (called the UNO API) to load, manipulate, and save documents.
The steps below create a program to load a Microsoft Word document, make changes, and save it to PDF format.
We’ll use the code below to mail merge a DOC file to PDF. The process is:
- set up your environment
- load the document
- substitute the data (mail merge)
- save as PDF
This starting point will let you test all sorts of document conversions and mail merging scenarios.
Step 1 – Setup
We need to add the Libre Office (or Open Office if that’s what you’ve installed) JARs to our classpath. These JARs give us access to the Java UNO API that we’ll be calling to do all sorts of magic. In your install of Libre Office, look for the following files and make sure they are in your project classpath:
You can download the code and template to save time.
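A quick way to confirm the classpath is set up is to try loading a few UNO classes before doing anything else. The sketch below is only an illustration: the class `UnoClasspathCheck` is a made-up helper, and the three class names it probes are well-known UNO types chosen as an assumption about what the rest of the code will need.

```java
import java.util.ArrayList;
import java.util.List;

/** Checks that the UNO classes this walkthrough relies on can be loaded. */
final class UnoClasspathCheck {

    // Well-known UNO class names (an assumption about what your code will use);
    // which JAR ships each one depends on your install.
    static final String[] REQUIRED = {
        "com.sun.star.comp.helper.Bootstrap",
        "com.sun.star.uno.UnoRuntime",
        "com.sun.star.frame.XComponentLoader"
    };

    /** Returns the class names that could NOT be found on the classpath. */
    static List<String> missing(String... classNames) {
        List<String> notFound = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run static initialisers
                Class.forName(name, false, UnoClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        List<String> gaps = missing(REQUIRED);
        System.out.println(gaps.isEmpty() ? "UNO classes found" : "Missing: " + gaps);
    }
}
```

If anything is reported missing, the JARs are not on the classpath your program is actually running with.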
Step 2 – Starting the Office Process
Start a Libre Office process that will listen for our requests.
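One way to start such a process, using nothing but the Java standard library, is to launch the office executable with its headless and socket-listening switches. This is a minimal sketch under assumptions: the helper `OfficeLauncher`, the executable path you pass in, and the port number are all illustrative choices, and your own code may bootstrap the process differently.

```java
import java.io.IOException;
import java.util.List;

final class OfficeLauncher {

    /** Builds the command line for a headless office process listening on a local socket. */
    static List<String> command(String sofficePath, int port) {
        return List.of(
            sofficePath,
            "--headless",   // no user interface
            "--norestore",  // skip document recovery after a previous crash
            "--accept=socket,host=localhost,port=" + port + ";urp;");
    }

    /** Starts the process; the caller is responsible for stopping it. */
    static Process start(String sofficePath, int port) throws IOException {
        return new ProcessBuilder(command(sofficePath, port))
            .inheritIO()  // avoid the child blocking on a full output pipe
            .start();
    }
}
```

For example, `OfficeLauncher.start("soffice", 2002)` assumes `soffice` is on your PATH and that port 2002 is free.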
Step 3 – Loading a Document
The code below loads a template into the Libre Office engine. Notice two things:
- It expects to find the template as c:/projects/letterTemplate.doc (so you should change this as required).
- The load process uses a “Hidden” flag. This can be set to false to see the process working.
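One detail that trips people up here: UNO's `loadComponentFromURL` takes a file URL rather than a plain path, so c:/projects/letterTemplate.doc has to become file:///c:/projects/letterTemplate.doc first (the "Hidden" flag is then passed alongside it as a load property). A minimal sketch of that conversion follows; `TemplateUrl` is a hypothetical helper, it assumes an absolute path, and it only escapes spaces.

```java
final class TemplateUrl {

    /** Converts an absolute local path such as c:/projects/letterTemplate.doc into a file URL. */
    static String toFileUrl(String path) {
        String p = path.replace('\\', '/')    // Windows separators
                       .replace(" ", "%20");  // minimal escaping: spaces only
        // Drive-letter paths have no leading slash; absolute Unix paths already do.
        return p.startsWith("/") ? "file://" + p : "file:///" + p;
    }

    public static void main(String[] args) {
        System.out.println(toFileUrl("c:/projects/letterTemplate.doc"));
        // prints file:///c:/projects/letterTemplate.doc
    }
}
```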
Step 4 – Search and Replace
The search and replace looks for “<date>” and replaces it with the current date and time.
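It helps to keep the merge data separate from the UNO calls, so you can test it without an office process running. The sketch below builds a placeholder-to-value table and applies it to plain text as a stand-in for the document. The names `MergeFields`, `valuesFor`, and `apply`, and the date format, are assumptions for illustration. In the real document each entry would be handed to UNO's search-and-replace instead.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

final class MergeFields {

    // Hypothetical output format; pick whatever suits the letter.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    /** Builds the placeholder-to-value table for one merged letter. */
    static Map<String, String> valuesFor(LocalDateTime now) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("<date>", now.format(FORMAT));
        return values;
    }

    /** Applies every replacement to plain text, treating placeholders literally. */
    static String apply(String text, Map<String, String> values) {
        String result = text;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(apply("Sent on <date>.", valuesFor(LocalDateTime.now())));
    }
}
```

Adding another merge field is then just one more `values.put(...)` line.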
Step 5 – Export to PDF
The Libre Office filter name “writer_pdf_export” is used to save as a PDF document.
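When saving, you need somewhere to put the PDF. A simple convention is to write it next to the template with the extension swapped. The helper below is a small sketch of that idea; `PdfTarget` and `pdfFor` are hypothetical names, and the only value taken from this article is the filter name itself.

```java
final class PdfTarget {

    /** The export filter name used for Writer documents in this walkthrough. */
    static final String FILTER_NAME = "writer_pdf_export";

    /** Replaces the final extension of a path or URL with .pdf (appends .pdf if there is none). */
    static String pdfFor(String source) {
        int slash = Math.max(source.lastIndexOf('/'), source.lastIndexOf('\\'));
        int dot = source.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is in the last path segment.
        String stem = dot > slash ? source.substring(0, dot) : source;
        return stem + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(pdfFor("file:///c:/projects/letterTemplate.doc"));
        // prints file:///c:/projects/letterTemplate.pdf
    }
}
```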
Step 6 – Shutdown
This terminates the process launched in step 2 above. Instead of terminating, you could go on to load, manipulate, and save more documents with the same process.
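If you hold a `java.lang.Process` handle for the office process, a defensive shutdown asks it to exit first and only kills it if it does not go away in time. This is a sketch rather than the exact shutdown code for this example: `OfficeShutdown` is a hypothetical helper and the timeout is up to you.

```java
import java.util.concurrent.TimeUnit;

final class OfficeShutdown {

    /**
     * Asks the process to exit, then kills it if it has not gone within the timeout.
     * Returns true once the process is known to have exited.
     */
    static boolean stop(Process process, long timeoutSeconds) {
        if (process == null) {
            return true; // nothing was started, so nothing to stop
        }
        process.destroy(); // polite request first
        try {
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly(); // escalate
                return process.waitFor(timeoutSeconds, TimeUnit.SECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            process.destroyForcibly();
            return false;
        }
    }
}
```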
You can put all the above code together by copy-and-pasting, or you can download the code and template.
Gotchas 1 – Multithreading
It’s possible, but not advisable, to use this approach in a multi-threaded fashion. Experience has shown that this leads to instability and unpredictable results. Of course, you could launch multiple Libre Office processes to handle many requests, each in a single-threaded manner.
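A straightforward way to keep each office process single-threaded is to funnel all of its work through a single-thread executor. The sketch below shows the idea with a hypothetical `SerialOfficeWorker` class. You would create one worker per office process and submit your load, merge, and save jobs to it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Funnels every request for one office process through a single worker thread. */
final class SerialOfficeWorker implements AutoCloseable {

    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a job; jobs run one at a time, in submission order. */
    <T> Future<T> submit(Callable<T> job) {
        return worker.submit(job);
    }

    @Override
    public void close() {
        worker.shutdown(); // lets queued jobs finish, accepts no new ones
    }
}
```

Callers on any thread can submit work and wait on the returned `Future`, while the office process only ever sees one request at a time.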
Gotchas 2 – Process and Crash Management
Under a realistic workload, some documents will crash the process. A production version of this approach needs to expect the occasional failure, clean up, and restart the process. Ideally this is all transparent to the calling user or program.
Likewise, make sure you clean up any resources you use, in cases of success and cases of failure alike. Here we spawn a separate process, which is definitely something you always want to clean up.
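The restart-and-clean-up pattern can be sketched as a small retry loop with a `finally` block. Everything here is illustrative: `RestartingRunner` and the `OfficeLifecycle` interface are hypothetical stand-ins for however you start and stop your office process, and starting a fresh process per attempt is a simplifying assumption.

```java
import java.util.concurrent.Callable;

final class RestartingRunner {

    /** Hypothetical hook for whatever starts and stops your office process. */
    interface OfficeLifecycle {
        void start() throws Exception;
        void stop();
    }

    /**
     * Runs the job against a freshly started office process, retrying on failure.
     * The process is always stopped afterwards, on success and on failure.
     */
    static <T> T runWithRestart(OfficeLifecycle office, Callable<T> job, int maxAttempts) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                office.start();
                return job.call();
            } catch (Exception e) {
                last = e; // remember the failure, then clean up and retry
                if (e instanceof InterruptedException) {
                    Thread.currentThread().interrupt(); // preserve interruption, stop retrying
                    break;
                }
            } finally {
                office.stop(); // also runs before the successful return completes
            }
        }
        throw new IllegalStateException("Job failed; no attempt succeeded", last);
    }
}
```

The caller only ever sees a result or a single final exception, which keeps the crash handling out of the calling code.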
Downloads / Resources
You can download a zip of this example.
There are many more examples of using the UNO API in the Libre Office SDK.