Post processing AsciiDoc outputs with a #SpringBoot script

By Greg Turnquist

Greg L. Turnquist worked on the Spring team for over thirteen years and is a senior staff technical content engineer at Cockroach Labs. He was the lead for Spring Data JPA and Spring Web Services. He wrote Packt's best-selling title, Learning Spring Boot 2.0 2nd Edition, and its 3rd Edition follow-up, along with many others.

May 31, 2014

I’ve made great strides getting things to look nice with my AsciiDoc-to-LibreOffice outputs. But I ran into issues that I couldn’t solve with asciidoc-packt.

  • Text wrapped in **double asterisks** wouldn’t appear with the Key Word [PACKT] styling applied.
  • Bullet point lists have a special style for the last entry, probably to add extra space after the list before the next Normal block begins.

As my colleagues are well aware, when I can’t beat something, I write a script! So I decided to see if I could write a bit of code to post-process the FODT XML file generated by AsciiDoc.

This time, I didn’t reach for Python to solve it. Instead, I reached for Spring Boot CLI + Groovy. Something I’ve been coming to realize is that Groovy is a powerful scripting language. But not only that, it provides easy access to the Java ecosystem through @Grab annotations. Spring Boot CLI dials things up a notch by making it super easy to write a command line runner and even tack on a web page later on via Spring MVC.

So I embarked on doing some XML parsing and editing. Naturally I reached for Jsoup. Jsoup by default is made for parsing HTML, but it also has an XML parser. With it, it’s easy to search for tokens because it has a jQuery-like selector API. I was able to look for patterns, like a bullet point list, and then find the last row’s entry. Then it was simply a matter of changing one attribute to adjust the style. Write the updated DOM document back out to disk, and POOF! I had proper formatting for all my bullet point lists.
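To make the bullet list fix concrete, here is a minimal sketch of the idea. It is not the script from this post: it uses the JDK's built-in DOM API in place of Jsoup so that it runs without extra dependencies, and the class name, method name, and style values ("Bullet", "BulletEnd") are hypothetical. The style names in a real FODT file may be encoded differently, so check the generated XML before relying on them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: restyles the last paragraph of every text:list.
final class BulletEndStyler {
    // OpenDocument text namespace (text:list, text:p, text:style-name).
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    static String restyleLastItems(String xml, String endStyle) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList lists = doc.getElementsByTagNameNS(TEXT_NS, "list");
            for (int i = 0; i < lists.getLength(); i++) {
                Element list = (Element) lists.item(i);
                // Descendant paragraphs in document order; the last one closes the list.
                NodeList paragraphs = list.getElementsByTagNameNS(TEXT_NS, "p");
                if (paragraphs.getLength() > 0) {
                    Element last = (Element) paragraphs.item(paragraphs.getLength() - 1);
                    last.setAttributeNS(TEXT_NS, "text:style-name", endStyle);
                }
            }

            // Serialize the edited DOM back to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process the document", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<office:text"
                + " xmlns:office=\"urn:oasis:names:tc:opendocument:xmlns:office:1.0\""
                + " xmlns:text=\"" + TEXT_NS + "\"><text:list>"
                + "<text:list-item><text:p text:style-name=\"Bullet\">one</text:p></text:list-item>"
                + "<text:list-item><text:p text:style-name=\"Bullet\">two</text:p></text:list-item>"
                + "</text:list></office:text>";
        System.out.println(restyleLastItems(sample, "BulletEnd"));
    }
}
```

Because the lookup covers all descendant paragraphs, a nested list that ends an outer list gets its final paragraph restyled once for both. Whether that matches Packt's template is something to verify by eye in LibreOffice.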

I was able to get **key word** formatting to work by looking for strong styling in specific situations, like inside Normal, Bullet, and Tip paragraphs (but NOT Code listings).
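The "only in specific situations" rule can be sketched as a check on the enclosing paragraph's style before touching a span. As before, this is an illustration built on the JDK DOM API, not Jsoup and not the original script. The names `KeywordStyler` and `applyKeywordStyle`, and the style values passed in, are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: swaps the strong style for a keyword style,
// but only inside paragraphs whose style is on an allow list.
final class KeywordStyler {
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    static String applyKeywordStyle(String xml, String strongStyle,
                                    String keywordStyle, Set<String> allowedParagraphStyles) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList spans = doc.getElementsByTagNameNS(TEXT_NS, "span");
            for (int i = 0; i < spans.getLength(); i++) {
                Element span = (Element) spans.item(i);
                if (!strongStyle.equals(span.getAttributeNS(TEXT_NS, "style-name"))) {
                    continue;
                }
                // Walk up to the nearest enclosing text:p element.
                Node node = span.getParentNode();
                while (node != null && !isParagraph(node)) {
                    node = node.getParentNode();
                }
                if (node == null) {
                    continue;
                }
                String paragraphStyle = ((Element) node).getAttributeNS(TEXT_NS, "style-name");
                if (allowedParagraphStyles.contains(paragraphStyle)) {
                    span.setAttributeNS(TEXT_NS, "text:style-name", keywordStyle);
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process the document", e);
        }
    }

    private static boolean isParagraph(Node node) {
        return node.getNodeType() == Node.ELEMENT_NODE
                && TEXT_NS.equals(node.getNamespaceURI())
                && "p".equals(node.getLocalName());
    }

    public static void main(String[] args) {
        String sample = "<office:text"
                + " xmlns:office=\"urn:oasis:names:tc:opendocument:xmlns:office:1.0\""
                + " xmlns:text=\"" + TEXT_NS + "\">"
                + "<text:p text:style-name=\"Normal\">a "
                + "<text:span text:style-name=\"strong\">key word</text:span></text:p>"
                + "<text:p text:style-name=\"Code\">x "
                + "<text:span text:style-name=\"strong\">y</text:span></text:p>"
                + "</office:text>";
        System.out.println(applyKeywordStyle(sample, "strong", "KeyWord",
                Set.of("Normal", "Bullet", "Tip")));
    }
}
```

Using an allow list of paragraph styles, rather than a block list containing only the code style, means any paragraph style the script has not been told about is left alone.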

I tried to hammer out code blocks, but was unsuccessful. Packt’s style sheet shows each line of a code listing being wrapped like a separate paragraph with Code [PACKT]. Then the last line can be styled as Code End [PACKT]. But my solution with asciidoc-packt was to wrap the whole listing as a single paragraph. It’s littered with <text:line-break /> tags. I tried splitting it up and rejoining it, but I could never replace that DOM element with a new one.
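For reference, the split described above might look roughly like the following. This is a sketch under assumptions, not a fix that was applied to the script in this post: it uses the JDK DOM API in place of Jsoup, the names `CodeListingSplitter` and `splitListings` are made up, the style values are placeholders, and it has not been run against real asciidoc-packt output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: turns one paragraph full of text:line-break
// elements into one paragraph per line, with a distinct style on the last.
final class CodeListingSplitter {
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    static String splitListings(String xml, String codeStyle, String codeEndStyle) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Copy matches first: the NodeList is live and shifts while we edit.
            List<Element> listings = new ArrayList<>();
            NodeList paragraphs = doc.getElementsByTagNameNS(TEXT_NS, "p");
            for (int i = 0; i < paragraphs.getLength(); i++) {
                Element p = (Element) paragraphs.item(i);
                if (codeStyle.equals(p.getAttributeNS(TEXT_NS, "style-name"))) {
                    listings.add(p);
                }
            }

            for (Element listing : listings) {
                Node parent = listing.getParentNode();
                Element line = newParagraph(doc, codeStyle);
                parent.insertBefore(line, listing);
                // Move children into the current line; a line break starts a new one.
                while (listing.getFirstChild() != null) {
                    Node child = listing.removeChild(listing.getFirstChild());
                    if (isLineBreak(child)) {
                        line = newParagraph(doc, codeStyle);
                        parent.insertBefore(line, listing);
                    } else {
                        line.appendChild(child);
                    }
                }
                line.setAttributeNS(TEXT_NS, "text:style-name", codeEndStyle);
                parent.removeChild(listing); // the now-empty original paragraph
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process the document", e);
        }
    }

    private static Element newParagraph(Document doc, String style) {
        Element p = doc.createElementNS(TEXT_NS, "text:p");
        p.setAttributeNS(TEXT_NS, "text:style-name", style);
        return p;
    }

    private static boolean isLineBreak(Node node) {
        return node.getNodeType() == Node.ELEMENT_NODE
                && TEXT_NS.equals(node.getNamespaceURI())
                && "line-break".equals(node.getLocalName());
    }

    public static void main(String[] args) {
        String sample = "<office:text"
                + " xmlns:office=\"urn:oasis:names:tc:opendocument:xmlns:office:1.0\""
                + " xmlns:text=\"" + TEXT_NS + "\">"
                + "<text:p text:style-name=\"Code\">a<text:line-break/>b<text:line-break/>c</text:p>"
                + "</office:text>";
        System.out.println(splitListings(sample, "Code", "CodeEnd"));
    }
}
```

The sketch inserts the new paragraphs before the original and then removes the emptied original, which sidesteps a single replace-this-element call. A one-line listing ends up with only the end style, and whether that is what Packt's template expects would need checking.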

Since I needed to not spend my entire writing schedule on tools, I put that feature down and decided I will either duck the whole thing, or put in that particular styling edit manually. After all, I’m already getting major efficiency improvements by having AsciiDoc do most of the leg work for me. So it’s probably best if I not get too wrapped up working on this post processing script, and instead focus the right amount of time every night on my manuscript.

I’ve already roughed out chapter one. I am now going through each section and making sure it flows nicely. I’m moving chunks around, and ensuring everything is clearly and cleanly explained. I don’t want any reader to suffer jolts with things appearing to jump too hastily.

The whole idea isn’t to dump a bunch of code in the reader’s lap, but instead tell an enjoyable story that just happens to be about the most innovative way to build rock solid apps on the battle tested Spring + Apache Tomcat + JVM stack.

