MBook – a proposal for a new, simple e-book format based on Markdown



Kevin Boone



This article proposes a new, simple format for storing e-books, particularly novels. I’m calling this format “MBook” for now, because its text format is a subset of Markdown. While MBook lacks (by design) the sophistication of other formats, its intentional simplicity provides some advantages.

Why e-book formats are problematic

Do we really need a new document format?

There are many ways we can encode a book for viewing on a computing device. In the simplest case we might just use plain ASCII or Unicode text. The problem with this approach is that many books, even novels, benefit from non-text material like cover art, and the ability to apply simple emphasis, like bold and italic.

We might instead use word processor formats, like Microsoft Word, RTF, or ODF. These are all capable of representing books, but they are complex, and may be proprietary. It isn’t easy to write software to handle such formats, so they haven’t been widely adopted for e-books.

Then there are file formats specifically for e-books, such as EPUB, MOBI, and AZW. These formats are complex and, again, some are proprietary. All offer far more features than we require to encode most novels.

A more subtle problem is that the sophistication of these formats leads to their being abused. The EPUB format, for example, being based on HTML, allows the author very fine control over text appearance and layout. While there are circumstances in which this level of control might be necessary, it’s rarely appropriate for novels. When I read an EPUB book on my Kobo reader, for example, I invariably have to start with a bunch of configuration changes, to make the text look the way I want, not the way the author wanted. It isn’t uncommon for authors to insist on specific typefaces or specific text sizes, knowing nothing of the reader’s preferences, or the device on which the book will be read. This can be a nuisance.

If you tend to be a bit paranoid – and we all should be, the Internet being what it is – you might also worry about how an e-book reader might be used to compromise your privacy or security. A format as complex as EPUB requires correspondingly complicated software to read, and it’s plausible that a rogue EPUB might exploit defects in this software. Unlikely, to be sure; but not impossible.

Frankly, with these factors in mind, plain text looks increasingly like a good idea. The problem lies in adding just enough additional material to plain text, without introducing any of the problems of EPUB and the like.

Gempub – a step in the right direction

Any format for representing e-books needs some fundamental features.

There must be a way to divide a long text into chapters. There should be a way to tell the reader the order in which these chapters appear within the book, if the chapters are in separate files.

Ideally, the format should support images: at least a cover image, but perhaps illustrations within the text itself.

There should be a way to store metadata, like the author, publication date, and title.

There should be provision for rudimentary text formatting, like headings and sub-headings.

Additionally, for our present purposes, the format should be easy to author and to display using software.

Gempub is an e-book format that uses Gemtext as its text format. Gempub comes out of the Gemini Protocol project – Gemtext is the default document format with this protocol. Gemtext looks like a highly simplified Markdown.

A Gempub document is a zipfile that contains Gemtext files, one for each chapter, along with whatever images are required, and some metadata.

It is trivially easy to hack up a viewer for Gempub books, particularly if you regard the display of images as optional. In practice, support for Gempub has been added to applications that already support Gemtext – typically browsers for small-net protocols like Gemini.
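To give a sense of how little a text-only viewer needs, here is a minimal Python sketch. It rests on several assumptions that are not part of the description above: chapter files are taken to end in `.gmi`, they are read in sorted filename order (a real reader would follow whatever order the book's metadata gives), and "rendering" means nothing more than upper-casing heading lines. Images are ignored, and preformatted blocks are not treated specially.

```python
import io
import zipfile


def read_chapters(gempub_bytes, suffix=".gmi"):
    """Return (name, text) pairs for every Gemtext file in the archive."""
    chapters = []
    with zipfile.ZipFile(io.BytesIO(gempub_bytes)) as archive:
        # Assumption: sorted filename order stands in for the real chapter
        # order, which a proper reader would take from the book's metadata.
        for name in sorted(archive.namelist()):
            if name.endswith(suffix):
                chapters.append((name, archive.read(name).decode("utf-8")))
    return chapters


def render_plain(text):
    """Rough text rendering: upper-case headings, pass other lines through."""
    out = []
    for line in text.splitlines():
        if line.startswith("#"):
            # Gemtext headings start with one to three '#' characters.
            out.append(line.lstrip("#").strip().upper())
        else:
            out.append(line)
    return "\n".join(out)
```

Calling `read_chapters` on the bytes of a book and printing `render_plain` for each chapter is, roughly, the whole viewer.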

In addition, it’s easy to create Gempub books, either from scratch or by converting from other formats. You don’t need any specific tooling – once you understand the format, all you need is a text editor and a zip utility.
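The same job can be scripted. The Python sketch below packs chapters into a zip using only the standard library. The file names `metadata.txt` and `index.gmi`, the `chapterNN.gmi` naming, and the `title`/`author`/`index`/`cover` keys are assumptions made for illustration; check them against the Gempub specification before relying on them.

```python
import zipfile


def build_gempub(path, title, author, chapters, cover=None):
    """Write a zip holding one Gemtext file per chapter plus a metadata file.

    chapters: list of (heading, body_text) tuples, in reading order.
    cover: optional (filename, bytes) pair for a cover image.
    """
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        index_lines = [f"# {title}", ""]
        for number, (heading, body) in enumerate(chapters, start=1):
            name = f"chapter{number:02d}.gmi"  # hypothetical naming scheme
            archive.writestr(name, f"# {heading}\n\n{body}\n")
            # A Gemtext link line: "=> target label"
            index_lines.append(f"=> {name} {heading}")
        archive.writestr("index.gmi", "\n".join(index_lines) + "\n")

        # Assumed metadata layout: one "key: value" pair per line.
        metadata = [f"title: {title}", f"author: {author}", "index: index.gmi"]
        if cover is not None:
            cover_name, cover_bytes = cover
            archive.writestr(cover_name, cover_bytes)
            metadata.append(f"cover: {cover_name}")
        archive.writestr("metadata.txt", "\n".join(metadata) + "\n")
```

For example, `build_gempub("emma.gpub", "Emma", "Jane Austen", [("Chapter 1", "...")])` would write a single-chapter book; the `.gpub` extension here is likewise an assumption.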

A great many novels can be delivered perfectly well in Gempub format, particularly the classics. Dickens, Melville, and Jane Austen need nothing more. Gempub alleviates all the complexity-related problems of EPUB and the like, and allows the reader complete control over formatting and layout.

Unfortunately, Gemtext is just too simple for many books. Not only does it not support even the most rudimentary text formatting – not even bold or italic – it does not distinguish clearly between line breaks and paragraph breaks. A number of conventions have grown up among Gemtext users that mitigate these problems but, unfortunately, they aren’t universally applied.
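The line-break ambiguity is easy to demonstrate. The Python sketch below applies two plausible readings to the same text: one treats every non-empty line as a paragraph, the other joins consecutive lines and lets a blank line end the paragraph. Both functions are illustrative assumptions, not conventions taken from any specification.

```python
def paragraphs_line_per_paragraph(text):
    """Reading 1: every non-empty line is its own paragraph."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def paragraphs_blank_line_separated(text):
    """Reading 2: consecutive lines are joined; a blank line ends a paragraph."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


sample = "It was the best of times,\nit was the worst of times.\n\nA new paragraph."
print(paragraphs_line_per_paragraph(sample))    # three paragraphs
print(paragraphs_blank_line_separated(sample))  # two paragraphs
```

A viewer that assumes one reading will badly lay out a book authored under the other, which is why unevenly applied conventions are a real problem.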

The ‘MBook’ format

The...
