Quote - Jan 31, 2020

"... JavaScript is murdering the Web ..."

Uh oh. This post became a long rant.

That quote comes from a Hacker News comment, dated Jan 14, 2020, with my emphasis added. I did not write the comment.

People who choose to turn off JS are excluding themselves. JS is part of the web platform, and there are tons of amazing things it allows you to do.

People who choose to turn off JavaScript are protecting themselves, because JavaScript enables all sorts of attacks on one's security and privacy.

Disabling JavaScript also blocks possibly nefarious ad-tracking tech, which falls under the same security and privacy concerns. Media orgs create reader-hostile websites.

More from the comment:

JavaScript is not part of the web platform. The Web is a web (hence the name) of interlinked documents: the core requirements for a web of interlinked documents are a protocol (we have HTTP) and a document format (HTML); a style format (CSS) is nice too, albeit not strictly necessary (one can read unstyled documents in links, lynx, elinks, eww, w3m or whatever else just fine).

Or use a browser extension, such as uMatrix, configured to block nearly everything. The result is an unstyled page that loads fast and is still readable. Firefox also provides an option to view a webpage without styling. The web of documents should be readable without JavaScript and CSS.

Back to the comment:

There is absolutely no requirement for executing arbitrary, potentially malicious, unverified code from untrusted authors distributed globally. Are there great examples of cool JavaScript applications? Yes, certainly.

Fastmail is an example of a well-done, client-side JavaScript web app that I use regularly. I log into the web app and perform actions that are NOT available to the open web.

Similar private actions occur when accessing websites for banking, health care, tax prep, etc. These are applications that are accessed over the web, maybe unfortunately, but the interactions are private, hopefully. These web apps are not a part of the web of documents.

Could these so-called modern web apps and services be built without client-side JavaScript? Of course. Similar sites and services probably existed 20-plus years ago, and those "web apps" used little to no JavaScript. Craigslist works without much client-side JavaScript. If users demanded app-like experiences everywhere, then nobody would be using Craigslist in 2020.

I'm updating this web page by using my own web-based, client-side JavaScript editor called Tanager that I created a few years ago. I enjoy using it when I spend a long time updating a post. More about Tanager can be read here:

http://sawv.org/2017/01/13/tanager-readme.html

Tanager is a small, useful (at least to me), JavaScript-based tool. Client-side JavaScript has its advantages, and it can enhance a user's experience when necessary.

I started personal publishing on the web in 2001, and since then I have used HTML's simple textarea box to perform a ton of creates and updates to web posts. I still rely on the textarea box often for quick creates and updates.

But I added an auto-update feature to Tanager. Tanager accesses my CMS's API endpoint, sending and receiving JSON. The default auto-update interval is five minutes, but I can modify that timing within Tanager.
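
The whole feature amounts to a timer plus a JSON request. Below is only a rough sketch of that loop, not Tanager's actual code; the endpoint URL, payload fields, and response field are hypothetical.

// Rough sketch of an auto-update loop; not Tanager's actual code.
// The endpoint, payload fields, and response fields are hypothetical.
const API_ENDPOINT = "/api/posts/12345";   // hypothetical CMS API endpoint
const AUTO_UPDATE_MS = 5 * 60 * 1000;      // default: every five minutes

async function autoUpdate() {
  const payload = { body: document.querySelector("textarea").value };
  const response = await fetch(API_ENDPOINT, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload)
  });
  const result = await response.json();    // e.g., the CMS echoes back a timestamp
  console.log("Saved at", result.updated);
}

setInterval(autoUpdate, AUTO_UPDATE_MS);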

Anyway, back to the HN comment:

Is JavaScript a core requirement to support a web of interlinked documents — i.e., the Web? No, not at all.

Now the main quote:

To the contrary, JavaScript is murdering the Web, and this page's misuse is a prime example. JavaScript delenda est!

The problem is not JavaScript. The problem is the unnecessary usage of client-side JavaScript that is murdering the open web of documents.

Our atrociously designed local newspaper website https://toledoblade.com is murdering the web.

http://sawv.org/2019/04/17/the-medias-war-on-the-open-web.html

I still believe that https://text.npr.org is the best designed media website.

The Blade should provide logged-in subscribers with a slightly enhanced version of text.npr.org's design. To hide its content behind a hard paywall, the Blade's server-side software would have to verify that the reader is logged in (a paying customer), which requires the use of cookies.

Then the Blade's server-side system either retrieves the HTML file (the article) from the file system outside of the document root and sends it to the logged-in reader, or it dynamically creates the article page on the fly, probably by pulling info from a database, and sends it to the reader's web browser.
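
Neither approach requires client-side JavaScript. Here is a minimal sketch of the first approach in Node.js; the cookie name, session check, and article directory are hypothetical, and this is an illustration, not the Blade's actual software:

// Sketch: serve a pre-rendered HTML article only to logged-in readers.
// The cookie name, session check, and article directory are made up.
const http = require("http");
const fs = require("fs");
const path = require("path");

function isPayingCustomer(cookieHeader) {
  // Real code would validate a session ID against a session store.
  return Boolean(cookieHeader && cookieHeader.includes("session="));
}

http.createServer((req, res) => {
  if (!isPayingCustomer(req.headers.cookie)) {
    res.writeHead(302, { Location: "/login" });
    return res.end();
  }
  // Articles live as HTML files outside the document root, keyed by URL path.
  // path.normalize() is a crude guard against "../" tricks in this sketch.
  const file = path.join("/var/articles", path.normalize(req.url) + ".html");
  fs.readFile(file, (err, article) => {
    if (err) {
      res.writeHead(404, { "Content-Type": "text/plain" });
      return res.end("Not found\n");
    }
    res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
    res.end(article);
  });
}).listen(8080);

The dynamic approach would simply replace the file read with a database query plus a template; either way, the reader receives a complete HTML document.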

In my opinion, both methods of sending content to readers qualify as a web of documents because, in simple terms, I'm reading articles.

But when it comes to supporting journalism, I have to log in because the content is available to paying customers only. These documents (articles) that appear in my web browser should contain NO JavaScript, NO ads, and minimal CSS because I'm a paying customer.

I love this October 2019 article.

http://blog.danieljanus.pl/2019/10/07/web-of-documents

Here's the January 2020 HN thread that pointed to that post.

https://news.ycombinator.com/item?id=22121462

The author of the Web of Documents post suggested no cookies and no POST requests. I suppose that a user login form could use a GET request, but in order to maintain a login session, a cookie or cookies would be needed.
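
As a rough sketch of that idea, with everything hypothetical and again in Node.js: credentials can arrive as GET query parameters, but the session that follows still rides on a cookie.

// Hypothetical sketch: a login form submitted via GET can work, but the
// session afterward still requires a cookie. (GET also leaks credentials
// into URLs and server logs, which is one reason real logins use POST.)
const http = require("http");
const crypto = require("crypto");

http.createServer((req, res) => {
  const url = new URL(req.url, "http://localhost");
  if (url.pathname === "/login") {
    // Real code would verify url.searchParams.get("user") and ("pass")
    // against stored account data before issuing a session.
    const sessionId = crypto.randomBytes(16).toString("hex");
    res.writeHead(302, {
      "Set-Cookie": "session=" + sessionId + "; HttpOnly; Secure; Path=/",
      Location: "/"
    });
    return res.end();
  }
  res.writeHead(404, { "Content-Type": "text/plain" });
  res.end("Not found\n");
}).listen(8081);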

For READERS of sawv.org, yes, the restrictions outlined by the Web of Documents author would apply and work fine.

But unless toledoblade.com switches its funding model to something like public radio's, where the content is free and open to all and exists because some people donate money, the Blade will need to keep its content behind a paywall.

Public radio, however, also receives taxpayer money, I think, and that should not occur with our local "newspaper" websites.

Fastmail is a web application. toledoblade.com should be a web of documents for logged-in subscribers, but that's not the case with the Blade's current web design.

My seasonal blog http://toledowinter.com, which I operated for a few winters, is a web of documents even though no static HTML files exist.

A static HTML file does not automatically equate to a document. A static HTML file could contain mostly JavaScript and no content, which means code needs to be executed on readers' computers to fetch the content and display it to the readers.

Sometimes, the content is stored in JSON that exists within a static HTML page, and the page requires JavaScript to display the content. That's un-cURL-able. That setup does not qualify as a web of documents, and that's basically how the Toledo Blade website functions.
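
A contrived example of that kind of page is below; the markup and field names are made up, not the Blade's actual source. Text tools such as curl piped to html2text, or lynx, skip script elements, so the article body stays invisible unless a browser executes the script:

<!-- Contrived example of a JavaScript-dependent "document." -->
<html>
  <body>
    <div id="article"></div>
    <script id="data" type="application/json">
      {"headline": "Example headline", "body": "The article text hides in here."}
    </script>
    <script>
      // Without running this script, the rendered page is empty.
      var data = JSON.parse(document.getElementById("data").textContent);
      document.getElementById("article").innerHTML =
        "<h1>" + data.headline + "</h1><p>" + data.body + "</p>";
    </script>
  </body>
</html>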

If program code needs to be executed on the client side in order to display a document, then it's obviously a web app. I don't want to execute code in the web browser to display a document that contains mainly text. The rendering should occur on the server.

As a client (reader), I don't know nor should I care if the page that I'm downloading came from the server's file system, from the server's RAM, or from a database. I should see a document that displays simply without JavaScript and, ideally, with little CSS.

For toledowinter.com, a READER receives a web page that the Nginx web server pulled from Memcached. If the page is not cached, then Nginx executes my server-side code, which pulls content from the database and applies a template to create the web page dynamically, and that page gets sent to the reader. A dynamically generated page also gets stuffed into Memcached, which means that if a reader refreshed the page, the reader would see the cached version. If I update the page's contents, then the updated page gets stuffed into Memcached.
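
That flow is a standard cache-aside pattern. Here is a rough sketch in JavaScript, with a plain Map standing in for Memcached and a placeholder function standing in for my real database-plus-template code:

// Cache-aside sketch. A Map stands in for Memcached, and
// renderFromDatabase() stands in for the real query-plus-template step.
const cache = new Map();

function renderFromDatabase(urlPath) {
  // Real code: pull the post from the database and apply an HTML template.
  return "<html><body><p>Rendered page for " + urlPath + "</p></body></html>";
}

function servePage(urlPath) {
  if (cache.has(urlPath)) {
    return cache.get(urlPath);              // cache hit: no database work
  }
  const page = renderFromDatabase(urlPath); // cache miss: render dynamically
  cache.set(urlPath, page);                 // stuff the page into the cache
  return page;
}

function updatePage(urlPath) {
  // After an edit, regenerate the page and overwrite the cached copy.
  cache.set(urlPath, renderFromDatabase(urlPath));
}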

Using the caching server means that my server-side code executes less often, which means that the database is accessed less often, and the resulting page is downloaded faster.

But all of that still qualifies toledowinter.com as supporting a web of documents, in my opinion, because pages can be accessed from command-line utilities and then optionally converted to text/plain files.

One method:

curl http://toledowinter.com/7769/three-more-days-of-mild-temps-then-normal-temps | html2text -style pretty

The html2text program is an old utility that, for some reason, does not preserve the whitespace created by the HTML paragraph tag. The resulting text shows paragraphs with no spacing between them. But it's still readable plain, raw text.

This method displays the plain, raw text better:

lynx --dump http://toledowinter.com/7769/three-more-days-of-mild-temps-then-normal-temps

Paragraph spacing is preserved.

Of course, I can view toledowinter.com and websites that support the web of documents within text-based web browsers, such as Lynx, links, elinks, w3m, etc. And I can view the site within a limited GUI-based web browser, such as NetSurf.

NetSurf supports HTML through v4.x and CSS through v2. NetSurf might support a smidgen of HTML5 and CSS3, but I would have to double-check. But NetSurf does not support JavaScript.

The top of toledowinter.com's homepage, which contains the image and site title, looks wonky within NetSurf because of the CSS that I'm using, but the main content on the homepage still loads fine.

I like the idea that if the web content is not cURL-able, then that website does not support the open web.

Here's a tough one. Does cjr.org support the open web or the idea of the web of documents? The answer is half and half. cjr.org definitely uses an idiotic web design for a website that contains mainly text-based information. It's ridiculous.

When using a so-called modern web browser, such as Chrome or Firefox, with JavaScript disabled, a cjr.org article page is blank, except for an orange horizontal bar displayed at the top of the page.

This article page consists mainly of text and one large, useless stock image, yet no content displays when only JavaScript is disabled within Firefox.

https://www.cjr.org/politics/drudge-report-trump.php

If I set the uMatrix browser extension to block CSS and to block JavaScript, then I see the text.

If I view the article within NetSurf, then the page is blank, except for the orange bar.

If I view the article within elinks, then the text displays. I can read the article easily.

But for some reason, that same cjr.org article produces a "403 Forbidden" error message when I try to access it within the Lynx web browser. It also errors out when I use Lynx from the command line with the --dump option.

But this works:

curl https://www.cjr.org/politics/drudge-report-trump.php | html2text -style pretty

I can read the text fine after executing the above command.

Strange. Actually, it's absurd on cjr.org's part.

At times, I read websites with elinks. I prefer NetSurf, but, again, some websites fail to display content unless CSS AND JavaScript are disabled. It's whacked.

https://politico.com is another example of how modern web design is ruining the web. It should be a website that supports the web of documents, but it's not, or it only halfway supports the web of documents.

If I have only JavaScript disabled within Firefox, the Politico website displays as a blank page.

Interestingly, the cjr.org homepage will display content within NetSurf and within Firefox with JavaScript disabled. But cjr.org article pages do not display content under those same conditions.

The Politico homepage and article pages display nothing within NetSurf and within Firefox if only JavaScript is disabled.

If I use uMatrix in Firefox to disable CSS and JavaScript, then the Politico website displays content.

Example article:

https://www.politico.com/news/2020/01/30/joe-biden-impeachment-witness-109730

If I access that Politico article from within text-based web browsers, such as Lynx and elinks, then I can read the content. CSS and JavaScript have to be non-existent.

I wish that NetSurf contained an option to disable styling, like Firefox does.

It's interesting that the best way to view these websites without JavaScript is to use text-based web browsers. Even when using uMatrix cranked up within Firefox, the resulting page looks horrible because of the massive size of the SVG icons/images. I have to scroll down more than 50 percent of the page to get to the meat of the article.

The lynx --dump command works with the above Politico article page. That command produces easily readable plain text.

Modern web design is ruining the open web and/or the web of documents.

Back to our old, wretched friend toledoblade.com, which uses a wonderfully hideous modern web design.

This is a small editorial piece.

http://www.toledoblade.com/opinion/editorials/2020/01/31/term-limits-tango-michigan-politics/stories/20200127008

It's text. The editorial contains a little over 300 words and one large useless image. I know this because of how the Blade CMS constructs its web pages.

Regardless of what I do, the Blade's website displays a blank page in Firefox with uMatrix disabled, Privacy Badger disabled, and JavaScript enabled. I block third-party cookies. I might have some other security and privacy features enabled within Firefox.

When I open Firefox in private mode and access the above editorial, part of the page displays, and then I receive a message about disabling my ad blocker. ???

It doesn't matter. The Blade's article content is not stored within HTML tags, such as the venerable <p> paragraph tag. Within a Blade article page, the body of the article is contained within JSON. A view-source on the above page shows this. Client-side JavaScript is obviously needed to process the blob of JSON and display it to the reader. WTF???

I'm a Blade paying customer, but I do not use any of the Blade's poor digital products, which include the Blade's website and its apps.

I created my own web setup to read Blade articles.

http://sawv.org/2019/08/17/how-i-read-the-toledo-blade.html

Using the lynx --dump command to access the above Blade editorial produces text that only contains information about the website's navigation. No article content, of course.

The Blade newspaper has existed since the 1800s. It's stunning that the Blade has failed to display text on the web.

Paying customers who log into the Blade's website get pummeled with ads and potentially nefarious crapware that gets downloaded to users' machines because the Blade's website requires JavaScript to read TEXT. I consider the Blade's website to be a security and privacy concern.

Organizations such as the Blade should be forced to provide the public, or at least paying customers, with quarterly security and privacy audits of their websites.

Here are the webpagetest.org results for the above Blade editorial, which contains about 350 words:

From: Dulles, VA - Chrome - Cable
1/31/2020, 4:16:37 PM
First View Fully Loaded:
Download time: 11.425 seconds
Web requests: 405 !!!
Bytes downloaded: 2,864 KB

Actually, "only" 2.8 megabytes for 350 words is small for the Blade and other newspaper websites today, which is sad.

The text/plain version of War and Peace is 3.2 megabytes. The printed version of that Tolstoy book contains over 1,000 pages.

Of the 2.8 megabytes downloaded to read 300-plus words, 1.2 megabytes were JavaScript.

My Blade web reading app displays content to me in a manner that is a slightly enhanced version of text.npr.org's design. And if I desire, I can use Firefox's and Safari's reader-mode capability.

Only I use my Blade web reading app. Here are the webpagetest.org results for the same Blade editorial, displayed humanely by my code:

From: Dulles, VA - Chrome - Cable
1/31/2020, 4:24:08 PM
First View Fully Loaded:
Download time: 0.595 seconds
Web requests: 2
Bytes downloaded: 4 KB

The minimal CSS that I use is contained within the HTML output that is dynamically generated on my server. 100 percent of the downloaded bytes went toward HTML.

That's how a web page, an article from a media org, should display for paying customers.

The lynx --dump command works great with my Blade web reading app. Content displays fine within NetSurf, Lynx, and elinks. It displays fine within Firefox with JavaScript disabled.

A usable web of documents is easy to create. Maybe that's the problem. It's too easy. We need a complex web to justify the need for some tech people.

When I access the Blade's content, I'm looking to be informed and not abused.

Etc.

I like this January 2020 post.

https://anderspitman.net/17/#curlable

More comments from the above HN thread.

https://news.ycombinator.com/item?id=22038853

anyone who browses that way should expect that many sites won't work and will need to be manually whitelisted.

I never whitelist sites that don't work without JS unless the site is actually critical for some reason (doing so is too risky). I expect that this means some parts of the web effectively no longer exist for people like me, and accept that, but I wonder if the authors of these badly engineered pages really know that they're excluding people.

https://news.ycombinator.com/item?id=22039378

Rendering markdown client-side seems like a fine way to implement a web page.

Requiring readers to execute arbitrary code in order to read content seems like a terrible way to implement a web page.

Nor is it cheap: it requires every single reader to execute the same code, burning CPU over and over and over when it could be done once for all readers, by the server.

Yes, some people choose to browse with JS disabled, but anyone who browses that way should expect that many sites won't work and will need to be manually whitelisted.

Yes, you can require execute privileges in order to publish content, but anyone who publishes that way should expect that many people won't read what he writes.

https://news.ycombinator.com/item?id=22039084

JS can be a highly supported tool, but it isn't itself a primitive use-case. It is an enabler of use-cases and, to OP's point, JS for the sake of loading JS isn't useful if the content of the site really didn't need it.

https://news.ycombinator.com/item?id=22039452

How is JS related to a page with only text and hyperlinks?

https://news.ycombinator.com/item?id=22038633

Please render your markdown on your server and not my client. You're wasting my battery, man. Shame on you!

Hah. Reminds me of these posts.

Back to comments excerpted from the HN thread.

https://news.ycombinator.com/item?id=22041166

JS search is not generally a critical site function.

Rendering the goddamned content is.

If you can't at a minimum give me a title, byline, dateline, main body text and/or some level of summary or description of non-textual content (as with graphics, audio, video, or interactive elements), then you're failing.

(SPAs or web applications should at least provide context for understanding what the application is/does. I'm not calling for all functionality to be rendered in HTML, but sufficient context to determine WTF the site is about.

Your "but I cannot implement search" is a strawman, and really doesn't address the core complaint.

-30-