There was never an ‘open web’

There was never an ‘open web’ because, at a fundamental level, the web enforced hierarchy through the way URLs were defined to work. For most protocols (including gopher), a URL combines a hostname with a path. That hostname gets resolved (via DNS, or simply by being an IP address) to a particular machine or a small set of machines. This was fine in the early 90s, when a 486 connected over ISDN could handle the simultaneous traffic of nearly everybody on earth with a web browser. But as soon as the number of users began to increase, it became clear that scale was a problem: that single host needed to be beefy and have a thick pipe, or else the host for a popular page or image would get the ‘hug of death’.
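To make the coupling concrete, here is a minimal sketch (an illustration with a made-up URL, not anything from a spec) of how a document’s address and its host’s identity are welded together: the path names the content, but the hostname resolves to the handful of machines that must actually serve it.

```python
import socket
from urllib.parse import urlparse

# A hypothetical URL, used only to illustrate the hostname/path split.
url = "http://example.com/essays/open-web.html"
parts = urlparse(url)

# The path names the document...
print("path:", parts.path)

# ...but the hostname names the machines that must serve it. Resolving it
# (via DNS here) yields the small set of IP addresses every request lands on.
for *_, sockaddr in socket.getaddrinfo(parts.hostname, parts.port or 80,
                                       proto=socket.IPPROTO_TCP):
    print("served only by:", sockaddr[0])
```

However many people want the page, every request funnels down to those few addresses.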

A number of hacks were developed to get around this, without ever really addressing the underlying problem. Beefy gateways running extremely lightweight software would forward requests to a large number of servers in order to balance load. Other beefy gateways would cache static or static-enough content. Layers upon layers of networks of expensive rented servers would be put between the user and the ultimate web host in order to ensure that the host at the end didn’t go down under high traffic (and the landlords who own these hosts, Akamai, Cloudflare, and now Amazon, Google, and Microsoft, got very rich). Meanwhile, the W3C’s proposed idea of a translation layer between permanent URIs (URNs) that reference content and temporary URLs that reference hosts was never implemented, or even really adequately planned.
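The shape of those hacks is simple enough to sketch (this is a toy with invented hostnames, not any vendor’s implementation): a gateway that serves static-enough content from its own cache and spreads cache misses across a pool of backends, round-robin.

```python
from itertools import cycle

# Hypothetical pool of origin servers hidden behind one beefy gateway.
BACKENDS = cycle(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
CACHE = {}  # path -> cached response body

def fetch_from_backend(host: str, path: str) -> str:
    # Stand-in for a real HTTP request to the chosen backend.
    return f"<html>{path} served by {host}</html>"

def handle_request(path: str) -> str:
    # Static-enough content never touches the origins...
    if path in CACHE:
        return CACHE[path]
    # ...and everything else is balanced across the pool.
    host = next(BACKENDS)
    CACHE[path] = fetch_from_backend(host, path)
    return CACHE[path]

print(handle_request("/index.html"))  # first hit goes to 10.0.0.1
print(handle_request("/index.html"))  # second hit comes from the gateway's cache
```

Note that nothing in this arrangement changes the coupling of address to host; it just hides the host behind more machines.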

This gap, the lack of any kind of guarantee that the content at an address would remain static, became very lucrative for other organizations too, for other reasons. It made the web-app ecosystem possible: you can charge for open source software & keep proprietary software away from disassembler-wielding pirates and crack authors by hosting it on a web server and using the mutability of web pages as a delivery system for what would otherwise be a native application. (This also lets you change or cut off access any time you want, making you the permanent landlord of the software rather than a mere manufacturer and distributor of software products.) It made scammy, ad-ridden domain-squatting outfits a viable business strategy, and it made piracy slightly harder by making it possible to seize the domains of sites hosting ‘unauthorized’ copies of ‘copyrighted’ content (never mind that in all Berne Convention countries, all copyrightable content is implicitly copyrighted). It made targeted advertising possible, because pages could be changed to serve different ads depending on the person viewing the site.

What most people call the ‘loss of the open web’ is an Eternal September situation: most contributors to the content of the web were no longer people who had the time & money to buy their own domains & host off their own machines.

It’s still a problem, and the problem does not lie at the feet of the influx of new users. Everybody in 1990 should have been able to see that by 2000 lots of people would be online, and that the people willing to buy a domain name would be largely the same ones who already had one in 1990.

Instead of making it easy for people to host things on their own machines & optimizing for inconsistent connections in the dialup era, we took the easy way out & relied on ISPs to host user content. That seemed OK at the time, because we were doing the same thing with Usenet. We never really fixed this as the dialup era ended.

Of course, we’re addressing this a bit now with IPFS, Dat, and similar projects. But it might be too late. Already, hosting content is “something that nerds do” and “something that big companies do”, never “something that I can do”, from the perspective of non-technical users.

Everybody has a browser cache & so it’s pretty stupid that we don’t have cache sharing along the lines of BitTorrent built into every major browser and turned on by default. We could have done that in 1995: not with BitTorrent, but just with full-file hashing or something, among users on the same ISP, with the ISP coordinating. Federated peer-to-peer sharing of cache contents is a lot easier to gradually transform into a hostless, fully distributed storage system. Instead, when AT&T WorldNet stopped hosting our home pages we turned to GeoCities, and when GeoCities stopped we went to MySpace.
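Here is roughly what that 1995-era scheme could have looked like (my own sketch with invented names, not a description of any real protocol): browsers key their caches by a hash of the full file, peers on the same ISP advertise what they hold, and the origin host is only contacted on a total miss.

```python
import hashlib

def content_id(data: bytes) -> str:
    # Address content by what it is, not by which host happens to serve it.
    return hashlib.sha256(data).hexdigest()

# Hypothetical per-subscriber caches, coordinated by the ISP and keyed by hash.
peer_caches = {"alice": {}, "bob": {}}

def publish(peer: str, data: bytes) -> str:
    cid = content_id(data)
    peer_caches[peer][cid] = data
    return cid

def fetch(cid: str, fetch_from_origin):
    # Ask the neighbourhood first; only a total miss touches the single origin host.
    for peer, cache in peer_caches.items():
        if cid in cache:
            return cache[cid], f"peer:{peer}"
    return fetch_from_origin(), "origin"

page = b"<html>my 1995 home page</html>"
cid = publish("alice", page)          # Alice's browser cached the page earlier
_, source = fetch(cid, lambda: page)  # Bob's browser asks the neighbourhood first
print(source)                         # -> peer:alice; the origin never gets hit
```

The scheme degrades gracefully: a peer with nothing cached just falls back to the origin, which is part of why it could have been switched on by default without breaking anything.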