
Project Lovecraft

Posted: Sat Sep 17, 2022 12:38 am
by theresnobodyhere234

I will be using this thread instead of my personal blog to post updates on my LaTeX project of transcribing the fictional works of H. P. Lovecraft to PDF, ePub, and web (HTML/CSS and some JavaScript?). I'll keep this initial post updated, and later posts will elaborate on updates and changes. For reference, the initial project posting: https://blahg.so-no.xyz/finance/2022/07 ... craft.html

LaTeX -> PDF
Status: released
draft version: draft 4 or 5 (0.4.0 and 0.5.0)
Version: 1.0.0

LaTeX -> ePub
Status: released
draft version: 1, 2
Version: 1.0.0

HTML web book
Status: Not started
draft version: 0.0.0


Additional formats to produce versions of the compilation.

Markdown
Status: not started

DVI
Status: not started

GROFF
Status: not started

XeLaTeX Source Code
Status: not started

LaTeX Source Code
Status: not started

Archived XHTML Files
Status: not started


Re: Project Lovecraft

Posted: Sun Jan 08, 2023 10:47 am
by theresnobodyhere234

An update

I'll go over this post in more depth elsewhere, probably on my personal blog, which I'm in the middle of repairing because the sub-domain where I host my photo gallery and store images sort of broke after updates that wrecked a lot of images and database associations along with URLs. So I had to clean the blog of any image links, which has left me pretty annoyed. :evil:

But anyway here is where I am with the project after 5 months:

  1. I have finally managed to finish and release a PDF file that's roughly 7MB, compared to the mammoth 600MB or so file from my initial test with Microsoft Word. The underlying tech for producing the PDF was a combination of XeLaTeX (LaTeX running on the XeTeX engine, which allows using TTF and OTF fonts), Bibliotex, and some other program for outputting the PDF file. It's really quite something when hundreds, if not thousands, of lines of code get compiled into beautifully typeset text. It was fairly challenging to learn, because information about TeX and LaTeX is scattered all over the web: you can't throw the proverbial stone without hitting some site that at least mentions the technology in passing or supports it somehow. Where you'll find it mentioned most is LaTeX markup for writing math equations and rendering them in HTML (something I used once on a financial blog that I started and haven't updated in a while). The downside of TeX and LaTeX is that most of the questions and answers out there come from experienced users and are inherently complex, so sometimes you can't find bare-basics answers to the more common newcomer questions. I plan to buy books or look for resources that go over the actual bare basics, and to see whether I could author a textbook about getting started with practical, perhaps latter-day, applications of LaTeX: using the typefaces common on modern desktop computers, and templates for things such as common letters or a high-school-style essay (thesis and three supporting statements).

  2. After producing the finished LaTeX PDF, it was time to move on to making the ePub. ePub is the easiest format and yet the least well documented, and it's not that commonly used for publishing electronic documents, even though one can be made by hand, albeit with some difficulty. That's because an ePub is nothing more than archived XHTML web pages, plus a specialized format for the table of contents (TOC), all compressed via ZIP, which is now considered an international standard for compressed archives sent over computer networks. That design, where the archive gets decompressed in real time and rendered by a hardware or software ePub reader, makes ePub an ideal way to transmit published text with whatever typesetting can be mustered via HTML, styling via cascading stylesheets (CSS), any audio and visual media, and any necessary JavaScript. It's quite a remarkable piece of technology that has yet to be used much for more interactive electronic publications. And given that it's all archived plaintext files, it makes for a smaller file on average than the Portable Document Format (PDF), which stores every page fully laid out along with its embedded fonts and images.
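
To make the "zipped XHTML plus a table of contents" description concrete, here is a minimal sketch using only Python's standard library. It is not how the compilation itself was built (that isn't shown in this thread); the function name build_epub, the OEBPS/ folder, and the ch1.xhtml-style file names are all assumptions picked for illustration. What the container format does fix is the mimetype entry (first in the archive, stored uncompressed) and META-INF/container.xml pointing at the package file.

```python
import io
import uuid
import zipfile
from datetime import datetime, timezone
from html import escape

# Fixed location in every EPUB: tells the reader where the package file lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

XHTML_PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>{title}</title></head>
<body>{body}</body>
</html>
"""


def build_epub(title, chapters):
    """Return the bytes of a minimal EPUB 3 file.

    chapters is a list of (chapter_title, plain_text) tuples.
    Names such as OEBPS/ch1.xhtml are illustrative choices, not requirements.
    """
    names = [f"ch{i}.xhtml" for i in range(1, len(chapters) + 1)]
    manifest = "\n".join(
        f'    <item id="ch{i}" href="{name}" media-type="application/xhtml+xml"/>'
        for i, name in enumerate(names, 1))
    spine = "\n".join(f'    <itemref idref="ch{i}"/>' for i in range(1, len(names) + 1))
    toc = "\n".join(
        f'<li><a href="{name}">{escape(ch_title)}</a></li>'
        for name, (ch_title, _) in zip(names, chapters))
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # The package document: metadata, the list of files, and the reading order.
    opf = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:{uuid.uuid4()}</dc:identifier>
    <dc:title>{escape(title)}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>
"""
    # The table of contents is itself an XHTML page containing a <nav> list.
    nav = XHTML_PAGE.format(
        title="Contents",
        body=f'<nav epub:type="toc"><h1>Contents</h1><ol>{toc}</ol></nav>')

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry must come first and must be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", opf)
        zf.writestr("OEBPS/nav.xhtml", nav)
        for name, (ch_title, text) in zip(names, chapters):
            body = f"<h1>{escape(ch_title)}</h1><p>{escape(text)}</p>"
            zf.writestr(f"OEBPS/{name}",
                        XHTML_PAGE.format(title=escape(ch_title), body=body))
    return buf.getvalue()
```

Writing the returned bytes to a file ending in .epub should give something a reader can open, though a validator such as EPUBCheck is the better judge of whether a real book passes. A full compilation would also need a stylesheet, cover, and licensing pages listed in the manifest.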

The difference between these two files is like night and day, because the typesetting quality is just that: night and day.
PDF via LaTeX gives a beautiful and satisfying result, whereas ePub is a matter of compromise, expectation and effort. I say that because there's a kind of self-inflicted delusion that with HTML and CSS you can get away with the results you'd get from TeX and LaTeX. The truth is that if you have any experience with web development, amateur or professional, you know very well that you can only do so much about typesetting. Consider that just 6 years ago or so, web fonts (SVG vector fonts, TrueType fonts, and OpenType fonts - all fonts used for desktop publishing) were only finally being supported, beyond the standard serif, sans-serif, and monospaced fonts like Georgia, Times New Roman, Helvetica, Impact, or any of the common proprietary fonts that ship with Microsoft Windows or Apple Mac OS. And although auto-hyphenation, justification, and margin, padding, and border sizing can be manipulated, there comes a limit past which you need programmatic solutions scripted in JavaScript. When it gets to that point, the fundamental challenge of juggling three different languages (two declarative and one imperative) seems absurd next to the more elegant TeX and LaTeX.

Licensing, Credit, Bibliographic and Appendix authoring

The thing about this project is that I learned there is a basic structure to most books in modern publishing:

  1. Title Cover
  2. Title Page
  3. Publishing License and Publishing legal information (ISBN number, Library of Congress ID number, UUID, etc)
  4. Dedication Page
  5. Table of Contents page(s)
  6. Section Title Page (optional)
  7. Chapters
  8. Appendix and/or Indices
  9. Back Cover

It's more or less like this, and the most important parts of the book are the publishing license and legal information pages and the appendix and index pages. It's a bit of a science to take care of each of these sections before the actual content of the book. They complement the content and really solidify how a book is published in real life. The major difference is that, unlike a typical publishing project - whether at a professional printer or a desktop publishing house - updating the contents of such sections is cheap and doesn't come with the immense detail and pressure a printing project would. But even with the electronic advantages on your side, since most modern books are drafted and edited mostly electronically today, I'd say you should always take the same care you would for a physically published book, because who knows? Skills like acquiring a UUID and ISBN, along with copyright authoring, licensing rights, publishing information, and accreditation, are the things that can later make or break a career in the publishing business, whether electronic or traditional print.

I'm not saying that I've mastered it. But if I had original written work on the line, I'd definitely go the extra mile to register a copyright and get an ISBN for the electronic publication of my original work. Which brings me to licensing.

Licensing is quite tricky, because here's the thing with Lovecraft's writing: most of it is in the public domain, but some stories were co-authored, and those co-authors may have claims to copyright. That's perhaps why Arkham Archivist only compiled some of the more famous stories of H. P. Lovecraft rather than all the works: the risk of bumping into some estate that holds partial copyright on a story that's otherwise public domain. (I'm not entirely sure how it works, but I wouldn't put it past the law for something to be both public domain and yet copyright-claimed in some jurisdictions and circumstances.) Let's hope that it's more clear-cut.

Let's just hope for the best. That's why, instead of plainly stating that the works themselves - the compilation in PDF and ePub formats - are public domain in the American legal sense, I chose to release them under a particular public-domain-like license, the WTFPL (Do What The Fuck You Want To Public License). This license is so permissive that it's practically the same as saying "Public Domain". Yes, there is the Creative Commons public domain dedication, or "copyleft" licensing, but I'm not really fond of the presumptions such licensing schemes sometimes make, and for a piece of work that's already public domain, it's redundant.

But it is something I kept in mind, and in the future I will certainly have to be mindful of it when I author something original of my own.

What else is there to do?

The Sequoia Publishing Works "company" is really a project that may end up being something big in the future, should it garner attention and interest from others. I have a lot to do, as this is a massive website and publishing undertaking. I'm tempted to count the lines of code (including spaces) in just the LaTeX and ePub versions alone, which should come to almost 900,000 or so. And although it was mostly a copy-and-paste affair, it was still a lot of work. Coding isn't just slapping things together and calling it a day; it takes time to edit, much like natural-language writing for things such as law and mathematics.

So to brass tacks:

  1. I have to go over the PDF, both to read for pleasure and to write notes, and even print physical copies for note-taking in the margins, for later editing of the LaTeX version of the compilation if necessary.
  2. I have to go over the ePub and test it out in various ePub readers (mostly software, but I'll be buying a Kobo Elipsa to test with, since it's a superior product to the Amazon Kindle that just came out to take the place of the Kindle DX - a reader I once coveted).
  3. I will have to split the single LaTeX file responsible for version 1.0.0 of the PDF into smaller chunks, one per story, linked from a master LaTeX file, so that I can distribute it via GitHub or some other repository service and have people review the code. But I think I'll just 7zip it and have people download and check the code locally. I'll figure it out once it comes time to decide.
  4. I will possibly have to rewrite the CSS rules for the ePub version with class names and rules that live in the stylesheet, even though some rules will at times remain in-line in <span> or <div> tags, because of how ePub readers render some CSS rules from an associated stylesheet versus in-line. Case in point: CSS Grid rules worked in-line but not when defined in a class in a stylesheet. It's the weirdest thing, but I should test it on a few readers. That was in Sigil's own Chrome-based rendering engine (Blink), which, given how it's integrated into the IDE, was probably an older version of Blink that didn't yet fully support CSS Grid rules from a stylesheet but did support inline rules. I'll have to test more.
  5. I'll have to check out the stylesheets of other ePub publications and projects like Standard Ebooks. That way I can work out a standard CSS stylesheet for future projects while editing the one for the compilation. There will likely be a few things I'll borrow and steal :mrgreen:
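
For item 3, the split could be scripted rather than done by hand. The following is only a sketch under stated assumptions: it assumes every story in the monolithic file starts with a \chapter{...} line at the start of a line, and the names split_document, write_split, master.tex and the stories/ folder are hypothetical, not taken from the actual project files.

```python
import re
from pathlib import Path

# Assumption: each story begins with \chapter{Title} (or \chapter*{Title}).
CHAPTER_RE = re.compile(r"^\\chapter\*?\{(.+?)\}", re.MULTILINE)


def slugify(title):
    """Turn a story title into a safe file-name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def split_document(source):
    """Split one big LaTeX source into a master file and per-story files.

    Returns (master_text, {file_stem: story_text}).
    """
    begin = source.index(r"\begin{document}") + len(r"\begin{document}")
    end = source.rindex(r"\end{document}")
    preamble, body = source[:begin], source[begin:end]

    matches = list(CHAPTER_RE.finditer(body))
    # Anything before the first chapter (title page, TOC) stays in the master.
    front = body[:matches[0].start()] if matches else body

    files = {}
    includes = []
    for i, match in enumerate(matches):
        stop = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        stem = f"{i + 1:03d}-{slugify(match.group(1))}"
        files[stem] = body[match.start():stop].strip() + "\n"
        includes.append(f"\\include{{stories/{stem}}}")

    master = (preamble + front.rstrip() + "\n"
              + "\n".join(includes) + "\n\\end{document}\n")
    return master, files


def write_split(source_path, out_dir):
    """Write master.tex plus stories/*.tex under out_dir."""
    text = Path(source_path).read_text(encoding="utf-8")
    master, files = split_document(text)
    out = Path(out_dir)
    (out / "stories").mkdir(parents=True, exist_ok=True)
    for stem, story in files.items():
        (out / "stories" / f"{stem}.tex").write_text(story, encoding="utf-8")
    (out / "master.tex").write_text(master, encoding="utf-8")
```

The sketch uses \include, which starts each story on a new page and allows \includeonly{...} for compiling a single story while editing; \input would work too if page breaks should stay under manual control. Back matter after the last story ends up in the last story's file, so it would need moving back to the master by hand.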

In Conclusion

There is more, much more, that I'll have to do. I'll definitely promote this project within the year, once it's good and ready, contact the people at hplovecraft.com, and do a bit of offline promotion (I have something involving NFC tags and QR codes in mind). I'm really excited about this project because I get the feeling that, for once, I have something to offer. I've got a lot to learn, and best of all I can now publish material that needs organization and distribution in a form people are willing to consume and can easily take with them in very portable, offline ways.

You see, in a world of over-centralized, networked walled gardens a la AOL (America Online), I think it's good to have more decentralization and more methods of decentralized communication, and electronic desktop publishing is one of those for sure. It's what we hoped for in the 1990s, after all.


Re: Project Lovecraft

Posted: Sat Feb 04, 2023 1:48 pm
by theresnobodyhere234

LaTeX Source Code release

After some time, I have finally been able to rearrange the source code of the original monolithic LaTeX file into a manageable directory that can be edited more easily. It still demands a lot from TeXstudio to make the compilation happen, but in the end it does the job.

https://misc.so-no.xyz/2023-Master-TeX-Files-3.7z

This deserves a victory song: