
Caching of basic XWPFDocument templates and reusing them for document generation

user48013, published on September 19, 2018, 9:09 am

We are implementing a portal that handles requests for modifying and generating Microsoft Office 2007 documents (docx). The back-end is implemented in Java and uses Apache POI as the API for manipulating the contents of the docx files. The back-end is accessed through REST API calls coming from a front-end written in JavaScript.

The back-end acts as a document server that handles about 15 different docx documents. These act as templates and contain tokens that need to be replaced with actual values. Each request coming from the front-end is a token-value map; the back-end replaces the tokens in the template with the mapped values and generates a new document for each request. The workflow is as follows:

  • receive request from front-end: token-value map
  • read template document as XWPFDocument object
  • parse and replace text in all XWPFParagraph/XWPFTable elements of the XWPFDocument
  • write the modified XWPFDocument to a different file path
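The replacement step in the workflow above can be sketched independently of POI. This is a minimal illustration only, assuming tokens are matched literally as they appear in the map keys (the question does not state the token syntax); the class and method names are hypothetical and not part of Apache POI.

```java
import java.util.Map;

/** Illustrative helper (not part of Apache POI): literal token replacement on plain text. */
final class TokenReplacer {

    private TokenReplacer() {}

    /**
     * Returns a copy of the text in which every occurrence of each map key
     * is replaced by the corresponding value. Keys are matched literally,
     * not as regular expressions. Replacements are applied one key after
     * another, so a value that itself contains a later key would be
     * replaced again.
     */
    static String replaceTokens(String text, Map<String, String> tokenValues) {
        if (text == null || text.isEmpty()) {
            return text;
        }
        String result = text;
        for (Map.Entry<String, String> entry : tokenValues.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

With POI, a helper like this would typically be applied to the text of each XWPFRun, for example via getText(0) and setText(newText, 0). Note that Word may split a single token across several runs, in which case run-by-run replacement would miss it.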

I am currently trying to implement a caching mechanism, because going to disk and reading the template files for each request is a real performance issue. I would like to treat each template document as a Prototype and return a clone for each request that the back-end receives, something similar to this:

XWPFDocument theDocument = documentCache.clone(documentConfiguration.getInputType());

The clone method is currently implemented as follows:

public XWPFDocument clone(DocumentDictionary.DocumentType type){

    if(PACKAGE_MAP.isEmpty())
        getPackages();

    XWPFDocument document = null;
    try {
        document = new XWPFDocument(PACKAGE_MAP.get(type));
    }catch(IOException exception){
        logger.error("Unable to clone document for input type {}", type);
    }

    return document;
}

This implementation does not yield the desired results. Processing of the first request works as expected, but the second request fails when writing the document, with the error:

Caused by: org.xml.sax.SAXParseException: The processing instruction target matching "[xX][mM][lL]" is not allowed.

The exception above does not occur when the document is read fresh for each request. Looking at the Apache POI API, the clone() methods of XWPFDocument and ZipPackage, which are used in the reading/writing process, are protected, so I cannot use the basic cloning functionality offered by the language. The issue seems to come from the fact that the same ZipPackage is shared and used for both reading and writing the document.

Has anyone been able to implement such a mechanism using Java and Apache POI?

  • Maybe I am a fool, but I have not understood what exactly you are trying to achieve. As far as I can see, there is a PACKAGE_MAP which is a Map<DocumentType, OPCPackage>, right? And your approach is to hold all of this in random access memory because of the "performance issue going to the disk and reading the files for each request"? But how much RAM do you have then? In my opinion this will lead to an OutOfMemoryError very fast. – Axel Richter Feb 13 at 7:39
  • But to your error: if you are creating the XWPFDocument from an OPCPackage, then this OPCPackage cannot be shared between multiple XWPFDocuments at the same time. This is pretty clear. – Axel Richter Feb 13 at 7:53
  • Sorry for the late answer. Basically, what I am trying to achieve is to have a set of template Word documents (docx) from which, after merging with a token-value map, I create a new series of documents that I write to disk. In this process I want to have the templates loaded in memory and not have to go to disk every time. I have not yet found a good method for achieving this. – Radu-Cristian Stefanescu Mar 7 at 8:19
  • Regarding the OOM exception, that would not be the case in my opinion, as the only objects persisted in RAM are the template documents. For all the other documents, I want to copy the template, modify it, write it to disk, close the handles and clean up. – Radu-Cristian Stefanescu Mar 7 at 8:24
  • As said, the same template OPCPackage cannot be shared by multiple XWPFDocuments, since all the changes are made in this template OPCPackage first, before the new XWPFDocument is written out. So I cannot think of any way of caching template OPCPackages, or template XWPFDocuments, in memory. – Axel Richter Mar 7 at 8:59
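
One way to keep the templates in memory without sharing an OPCPackage, as the comments above rule out, is to cache the raw bytes of each template file and build a fresh document from a new stream for every request. The sketch below only shows the caching part, using the JDK alone; the class name TemplateByteCache and its methods are hypothetical, and the key type would be DocumentDictionary.DocumentType in the code from the question. It is an illustration under these assumptions, not a tested POI integration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache that keeps raw template bytes, keyed by template type. */
final class TemplateByteCache<K> {

    private final Map<K, byte[]> bytesByType = new ConcurrentHashMap<>();

    /** Reads the template file once and stores its bytes under the given key. */
    void register(K type, Path templateFile) throws IOException {
        bytesByType.put(type, Files.readAllBytes(templateFile));
    }

    /** Stores already-loaded bytes; copies them so callers cannot mutate the cache. */
    void register(K type, byte[] templateBytes) {
        bytesByType.put(type, templateBytes.clone());
    }

    /**
     * Returns a new, independent stream over the cached bytes on every call.
     * The stream only reads the array, so concurrent requests do not
     * interfere with each other.
     */
    InputStream open(K type) {
        byte[] bytes = bytesByType.get(type);
        if (bytes == null) {
            throw new IllegalArgumentException("No template registered for " + type);
        }
        return new ByteArrayInputStream(bytes);
    }
}
```

Per request, the clone method from the question could then become new XWPFDocument(cache.open(type)), which parses a private copy of the package from memory; the document is modified, written with write(OutputStream) and closed as before. This still parses the docx on every request, so it removes only the disk read, not the parsing cost.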
