Copying Web Content to Gemini

created May 21, 2020

My related posts:

Last week, I enabled TLS at sawv.org, and I installed a Gemini server called Molly Brown.

https://tildegit.org/solderpunk/molly-brown

Now I have a Gemini site at gemini://sawv.org.

Also last week, I installed a graphical Gemini client called Castor on my Linux laptop.

https://sr.ht/~julienxx/Castor

I made a slight modification to Castor's Rust code to increase the default font size for the main body of Gemini content displayed within Castor.

I like the fact that Gemini permits some Markdown-like formatting: headings and bullet points. And I like how Castor permits READERS to make some custom typographical modifications.

The readers should be able to control how content appears. Some typographical controls exist within web browser reader mode or readability options. For Castor, the typographical changes are made by editing a config file. This is a good start. Some day, Gemini clients might permit readers to make these changes within the Gemini browser.

I like Gemini a lot because it aligns with the thoughts that I expressed in this January 2019 post.

https://sawv.org/2019/01/22/markdownonly-web-browser.html

No HTML, no CSS, no JavaScript. Only plain, raw CommonMark (Markdown) text.

Here at https://sawv.org, my web-based static site generator app was written in Perl. A year or two ago, I wrote a similar one in Lua. I use the Lua version at this test website:

http://sora.soupmode.com

Last year in my Lua version, I changed to using the CommonMark Markdown library. I also use my Lua web-based static site generator at my Markdown-only test site.

http://md.soupmode.com/home.md - Actually, an index.html file exists to describe what in the heck this site is about.

For that md.soupmode.com site, I rely on web browser extensions to convert the CommonMark (Markdown) plain text into HTML for display. A view-source still shows only raw Markdown text. The browser extensions come with themes, but the extensions permit me to upload my own custom CSS to control how content looks. I have used the following web browser extensions:

But those two extensions do not offer an easy way to add support for another file extension. I wound up installing an extension in Firefox, called Markdown Viewer. Here's its source.

https://github.com/simov/markdown-viewer

The Markdown Viewer extension allowed me to add support for the .gmi file extension. It comes with numerous themes, but it does not permit me to upload my own custom CSS. It's hard to get everything that I want.

I suppose that I could git clone the source and add my custom CSS to the appropriate directory and try to install the extension manually if that's possible. It sounds like more trouble than it's worth. I settled on using a theme called Foghorn. Despite the many themes included with the extension, this was the only one worth using, in my opinion.

The ideal Markdown viewer web browser extension would permit adding supported file extensions and uploading custom CSS. For now, Markdown Viewer in Firefox with its built-in Foghorn theme is good enough to view .gmi files over the web.

For my Gemini site, I'll start by copying a lot of content from the current homepage of the web version of sawv.org.

On May 19, 2020, I set up a new sub-domain website called http://gemini.soupmode.com. At this site, I installed the Lua version of my web-based static site generator. I copied the version used at md.soupmode.com, but I made some changes, obviously, to output .gmi files instead of .md files.

In my config file for the Molly Brown Gemini server, I pointed document root to the doc root directory used at http://gemini.soupmode.com.

Yesterday, May 20, 2020, I installed a Python script on my Linux laptop called md2gemini.

https://github.com/makeworld-the-better-one/md2gemini

At https://sawv.org, I choose a post to edit, copy the Markdown text from the textarea box, and save it to my laptop with the Vim editor.

Then I run the md2gemini script on the .md file to create a .gmi file. Not much is changed. The main part is making links work for Gemini.
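
A minimal Python sketch of the kind of link handling involved. This is not md2gemini's actual implementation; the function name and the regular expression are assumptions made for illustration. It keeps the label in the running text and lists each link on its own "=>" line, since Gemini has no inline links.

```python
import re

# Matches inline Markdown links such as [label](url).
# The lookbehind skips image syntax, ![alt](src).
INLINE_LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def markdown_links_to_gemini(markdown_text):
    """Replace inline links with their labels and add '=>' lines below."""
    output = []
    for line in markdown_text.splitlines():
        links = INLINE_LINK.findall(line)
        # Keep only the label in the sentence itself.
        output.append(INLINE_LINK.sub(r"\1", line))
        # Gemini links must sit on their own lines: "=> URL label".
        for label, url in links:
            output.append(f"=> {url} {label}")
    return "\n".join(output)
```

For example, the line "See [my post](https://sawv.org/a.html) today." becomes "See my post today." followed by "=> https://sawv.org/a.html my post".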

I use a text editor on my laptop to copy the contents of my .gmi file, which I paste into the textarea box at http://gemini.soupmode.com to create a new post, which is saved as plain, raw Markdown text in a .gmi file.

Then I edit http://gemini.soupmode.com/index.gmi and add a link and text for the above new post that I created.
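
The index edit amounts to adding one Gemini link line. A small Python sketch, assuming the index is a plain text file; the function name and the file name in the example are hypothetical.

```python
def add_index_link(index_text, path, title):
    """Return the index text with a '=> path title' line appended."""
    link_line = f"=> {path} {title}"
    lines = index_text.splitlines()
    if link_line in lines:
        # The post is already listed, so leave the index alone.
        return index_text
    lines.append(link_line)
    return "\n".join(lines) + "\n"
```

Calling add_index_link(text, "copying-web-content.gmi", "Copying Web Content to Gemini") would add a link line for this post to the end of the index.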

Again, I use the web-based static site generator that I wrote in Lua to create and update posts at http://gemini.soupmode.com.

After the new post has been created and the index.gmi file updated, then I can use Castor to view the new changes at my Gemini site gemini://sawv.org.

I will need to edit the Gemini content again to ensure links work properly. When I use relative URLs to point to content on my website, the md2gemini script preserves those relative URLs. The problem is, those relative URLs will most likely not work, since I don't have much content copied to Gemini yet.
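
One possible cleanup, sketched in Python, is to point relative links back at the web version until the target exists on Gemini. The base URL constant and the function name are assumptions for illustration, and the sketch only touches Gemini link lines.

```python
from urllib.parse import urljoin, urlsplit

# Assumed base: relative links are resolved against the web site.
WEB_BASE = "https://sawv.org/"


def absolutize_link_line(line, base=WEB_BASE):
    """Rewrite a '=> relative/url label' line to use an absolute URL."""
    if not line.startswith("=>"):
        return line  # not a link line
    parts = line[2:].strip().split(maxsplit=1)
    if not parts:
        return line  # "=>" with no URL; leave it alone
    url = parts[0]
    label = parts[1] if len(parts) > 1 else ""
    # A URL without a scheme (https, gemini, ...) is treated as relative.
    if not urlsplit(url).scheme:
        url = urljoin(base, url)
    return f"=> {url} {label}".rstrip()
```

Links that already carry a scheme, such as gemini://sawv.org/, pass through unchanged.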

Even though the Gemini spec does not support rendering all of CommonMark formatting on the client side, I will leave my Markdown formatting in my posts, such as blockquotes, bolding, and italicizing, since those Markdown formatting functions can be self-describing or a bit semantic to people familiar with Markdown.

And maybe I will create my own Gemini client some day that supports rendering all of CommonMark markup.

Gemini clients support the three backticks too, which exist in CommonMark but not in the original Markdown. CommonMark is an attempt to create a standard Markdown, since many flavors of Markdown exist.

https://commonmark.org
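
To show how little a client has to recognize, here is a Python sketch that classifies the lines of a Gemini document: headings, bullet points, link lines, and text between three-backtick lines. The function name and the kind labels are hypothetical, and it covers only the line types mentioned in this post.

```python
FENCE = "`" * 3  # the three-backtick toggle line


def classify_gemtext(text):
    """Return a list of (kind, content) pairs, one per rendered line."""
    preformatted = False
    result = []
    for line in text.splitlines():
        if line.startswith(FENCE):
            # A fence line only switches preformatted mode on or off.
            preformatted = not preformatted
            continue
        if preformatted:
            result.append(("pre", line))
        elif line.startswith("=>"):
            result.append(("link", line[2:].strip()))
        elif line.startswith("#"):
            result.append(("heading", line.lstrip("#").strip()))
        elif line.startswith("* "):
            result.append(("bullet", line[2:]))
        else:
            result.append(("text", line))
    return result
```

Anything else, including Markdown blockquotes, bolding, and italics, falls through as plain text in this sketch, so a reader sees the raw markup.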

Anyway, I'm taking a "few" steps to copy Markdown content from https://sawv.org to gemini://sawv.org even though both sites reside on the same Digital Ocean server.

I have several thousand posts at https://sawv.org. If I want everything to be Gemini-friendly, then I'll need to create a program to speed this up. But this manual approach enables me to learn more about what works and what doesn't and how I want my content to exist at my Gemini site.
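
Such a program could start as a small batch loop. The Python sketch below assumes the posts are available as .md files in a directory tree, which may not match how the site stores them; the function name is hypothetical, and the convert argument stands in for whatever Markdown-to-Gemini conversion is used.

```python
from pathlib import Path


def convert_tree(source_dir, dest_dir, convert=lambda text: text):
    """Write a .gmi file under dest_dir for every .md file under source_dir."""
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    written = []
    for md_path in sorted(source_dir.rglob("*.md")):
        # Mirror the directory layout, swapping the file extension.
        relative = md_path.relative_to(source_dir).with_suffix(".gmi")
        gmi_path = dest_dir / relative
        gmi_path.parent.mkdir(parents=True, exist_ok=True)
        text = md_path.read_text(encoding="utf-8")
        gmi_path.write_text(convert(text), encoding="utf-8")
        written.append(gmi_path)
    return written
```

The default convert leaves the text untouched, so the loop can be tried on a copy of a few posts before any real conversion is plugged in.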

Currently, I have migrated 20 of my web posts to Gemini. I'll migrate at least 100 posts by using the above steps and going back and doing some more edit cleanup. Then I'll figure out how best to migrate more pages.

After only one week of exposure to Gemini, I like the protocol so much that in a year or so, I could see myself using Gemini exclusively, and the only content that would exist at https://sawv.org would be the URL for my Gemini site.

I first learned about https://indieweb.org in the summer of 2013. I support some IndieWeb concepts at https://sawv.org. I like what the IndieWeb community has done over the past 10 years, but it's still the web. I created my first server-side web program in the fall of 1996, and it changed everything for me, regarding IT and programming. I loved the web. Now, I tolerate the web.

Obviously, the internet is more important than the web. The web is one of many application protocols that run over the internet. I still like email, Gopher, and FTP (sftp). And now Gemini is another app that runs over the internet.

Hopefully, the Gemini spec for clients and servers remains small and lightweight. In the past week, more client and server software has been created by people interested in the project.

Gemini got started in 2019. Today would be like 1990 or 1991 for the web and Gopher, since both the web and Gopher got started around 1989 to 1990.

In my opinion, Gemini is closer to Gopher than to the web, which is great. I like the light formatting options that exist in the Gemini client spec that do not exist in Gopher. Gemini seems simpler than Gopher in some aspects.

Since the tech industry did not create a bloat:// application protocol around the year 2000 to take over for the increasingly complex and bloated web, we can now let the current web assume the role of bloat://.

http:// or https:// equals bloat://.

gemini:// now equals the original spirit of the web, back around 1992.

I've seen mention of something like a Common Gateway Interface being supported by Gemini servers, but I don't yet know if that's true or how it works. I would be okay with CGI. I never want to see anything like CSS and JavaScript introduced on the client-side of Gemini. If people want that, then they can use bloat://, I mean the web.

It would be interesting if server-side Gemini programs (CGI) could dynamically generate Markdown files. Plain, raw Markdown text would get sent to the client apps, where it's formatted according to the readers' preferences. This would be hard for corporations to pervert and ruin.
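
As a rough Python sketch of what such a program might produce: a Gemini response is a single header line (a status code, a space, a meta string such as the MIME type, then CRLF) followed by the body. Whether a given server expects a CGI script to write that header itself is an assumption here, and both function names are hypothetical.

```python
def dynamic_page(title, items):
    """Build a small text/gemini body: a heading and a bullet list."""
    lines = [f"# {title}", ""]
    lines += [f"* {item}" for item in items]
    return "\n".join(lines) + "\n"


def gemini_response(body, status=20, meta="text/gemini"):
    """Prefix the body with a '<status> <meta>' header line ending in CRLF."""
    header = f"{status} {meta}\r\n"
    return (header + body).encode("utf-8")
```

Status 20 means success, and the client decides how the heading and bullet points in the body are displayed.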

For now, I'm only interested in static content. I like to read text. Gemini focuses on text.

-30-