In a previous article, I mentioned how Markdown can be used to edit documents across different platforms, and how some people even use it to write books. I also touched on how teams can use GitLab to work together remotely.
Now these two tools (Markdown and GitLab) can be used together to publish a book.
For a start, GitLab enables the author and the proofreader to work in parallel. As the authors commit their work to GitLab, the proofreaders continuously receive updates and new material. This allows proofreading to begin much earlier and lets both sides work at the same time. It also means that authors do not need to manually compile all their work into a book before submitting it for proofreading.
So how do we publish all the articles as a book? Every time a document is added or modified, the team has to consolidate everything and publish it as a full (work in progress) book. This becomes a hassle for the authors. With Markdown, all of this manual work can be done by typing a single command.
Pandoc
To do this, we can use Pandoc. Pandoc is a free document converter that is widely used in publishing workflows. It can convert HTML to PDF, or Markdown to EPUB, PDF, and a variety of other formats, ranging from simple output to advanced output with defined styles, a table of contents, and reference links.
Pandoc is available for Windows, Ubuntu and other Linux distributions. However, the procedure on Windows is more complicated. In this article, let's focus on the Ubuntu environment.
Open a terminal in Ubuntu, then type the commands below:
$ sudo apt update
$ sudo apt upgrade
$ sudo apt install pandoc
$ sudo apt install texlive-latex-base
$ sudo apt install texlive-xetex
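Before moving on, it can help to confirm that the tools actually landed on the PATH. The following is a minimal sketch; the missing_tools function name is illustrative, and it assumes the packages above provide the pandoc and xelatex commands.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH.
# "missing_tools" is a hypothetical helper name, not part of pandoc or apt.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage: prints nothing when both tools are installed.
#   missing_tools pandoc xelatex
```

If the function prints a name, the corresponding package is probably not installed yet.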
Once all the prerequisites are installed, you can use the command below to compile all the Markdown documents in a folder into a PDF book.
$ pandoc -o book.pdf *.md
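The shell expands *.md into a sorted list of filenames, and pandoc combines the files in the order it receives them, so the filenames decide the chapter order. Here is a minimal sketch, assuming numeric filename prefixes such as 01-intro.md. The build_book function and the PANDOC variable are illustrative names, not part of pandoc; PANDOC exists only so the command can be previewed with echo.

```shell
#!/bin/sh
# Sketch: build book.pdf from the Markdown files in the current folder.
# PANDOC is a hypothetical override, e.g. PANDOC=echo for a dry run.
PANDOC="${PANDOC:-pandoc}"

build_book() {
    # *.md expands in sorted order; that order becomes the chapter order.
    set -- *.md
    # With no match the pattern stays unexpanded, so stop with a clear message.
    if [ ! -e "$1" ]; then
        echo "no Markdown files found in $(pwd)" >&2
        return 1
    fi
    $PANDOC -o book.pdf "$@"
}

# Usage (dry run, prints the arguments pandoc would receive):
#   PANDOC=echo; build_book
```

Naming the files 01-intro.md, 02-setup.md and so on keeps the order explicit instead of relying on the titles happening to sort correctly.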
CI/CD Automation
To recap, we started by using Markdown to work on the individual documents more easily. We then used GitLab to sync work and progress between different authors. With pandoc installed, we are able to combine all these documents into a final PDF file. Now, let's talk about how to have GitLab call the pandoc command to combine all the Markdown files into a PDF.
To automate the process, we can rely on GitLab Continuous Integration/Continuous Delivery (CI/CD) to call the command that compiles all the Markdown files into a PDF document every time changes are made.
There are 2 steps:
- Make sure all your markdown documents are in the root folder.
- Add a .gitlab-ci.yml script.
After adding the .gitlab-ci.yml script, every new push to GitLab triggers a new pipeline that compiles all the Markdown documents into a single file for users to download. The PDF book can then be downloaded from CI/CD > Pipeline > Download artifacts.
The .gitlab-ci.yml script does all the work for you.
In GitLab CI/CD, the script uses the pandoc Docker image to generate the PDF. The complete .gitlab-ci.yml is available in the example project linked later in this article.
A simple command is all that is required to compile all the Markdown documents into book.pdf.
pandoc --toc --pdf-engine=xelatex -o book.pdf *.md
Each parameter is explained below.
--toc
Adds a table of contents.
--pdf-engine=xelatex
Specifies the PDF engine. If your document uses only English and does not contain any Unicode characters, it is safe to omit this option.
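To make the CI step more concrete, here is a minimal sketch of what such a .gitlab-ci.yml could look like, written out with a shell here-document. This is not the author's actual file. It assumes the public pandoc/latex Docker image (which is expected to bundle pandoc with a LaTeX toolchain), and the job name build_book is made up for illustration.

```shell
#!/bin/sh
# Write a minimal .gitlab-ci.yml into the current folder (a sketch, not the author's file).
cat > .gitlab-ci.yml <<'EOF'
# Assumption: the public pandoc/latex image provides pandoc and xelatex.
image:
  name: pandoc/latex
  entrypoint: [""]   # clear the image's pandoc entrypoint so the runner can start a shell

# "build_book" is a hypothetical job name.
build_book:
  script:
    - pandoc --toc --pdf-engine=xelatex -o book.pdf *.md
  artifacts:
    paths:
      - book.pdf
EOF
```

The artifacts section is what makes book.pdf downloadable from the pipeline page after the job finishes.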
You can check the complete example in my project at https://gitlab.com/filpals/book.
Points of Interest
Pandoc does not fully support mathematics syntax, in case you plan to include it. For example, pandoc can interpret $` a²+b²=c² `$ correctly. However, the syntax below fails to render correctly in the PDF output.
Unfortunately, it also does not support Mermaid, which is widely used to render diagrams and charts in Markdown.
Folder Structure
As the book grows, or to accommodate branching in Git, we sometimes group several topics into the same folder. Pandoc combines input files in the order it receives them, so we can take advantage of this by naming each folder after its chapter, i.e. ‘Chapter 1’, ‘Chapter 2’, …, and storing all the subtopic documents inside the chapter folder. By doing this, we can sort the book in the sequence we prefer.
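Note that a plain *.md only matches files in the current folder, so chapter folders need a pattern that looks one level down. The sketch below assumes zero-padded folder names such as ‘Chapter 01’ (plain text sorting would otherwise place ‘Chapter 10’ before ‘Chapter 2’). The list_chapter_files function and the example names are illustrative only.

```shell
#!/bin/sh
# Print the chapter files in the order the shell would pass them to pandoc.
# "list_chapter_files" is a hypothetical helper for previewing the book order.
list_chapter_files() {
    # */*.md matches Markdown files exactly one folder below the current one,
    # sorted by folder name first and then by filename.
    printf '%s\n' */*.md
}

# Usage: preview the order, then build with the same pattern.
#   list_chapter_files
#   pandoc -o book.pdf */*.md
```

Previewing the list first is a cheap way to catch a misnamed folder before the PDF is generated.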
AsciiDoc
For those who want to publish a book with more advanced formatting, I would recommend the approach used by O’Reilly, which uses AsciiDoc as its markup language. It lets you configure more settings, such as page orientation, and allows custom styles and themes.
GitBook
If you would like to host the book online, you can consider using GitBook to publish a full documentation website. It also allows publishing a single book based on all the content of a website.
- https://gitlab.com/pandoc/pandoc-ci-example/
- https://pandoc.org/
- https://www.allendowney.com/blog/2018/12/27/how-to-write-a-book/
Please leave your comment below if you have any ideas or thoughts. Thanks.
Originally published at http://filpal.wordpress.com on November 28, 2021.