Memoirs from the old web: IE's crazy content rating system
Today, Internet Explorer has been consigned to the dustbin of history, yet its quirks and peculiar features remain an interesting area of discussion from a historical perspective. There was one particular feature of IE which not only now seems comically naive, but also completely impractical: namely, IE tried to pioneer a system of content rating.
This was essentially a standardised “parental controls” system. The idea was that a webpage could be rated in terms of the profanity, nudity, sex, violence, etc. that it contained.
Alright, hands up: how many people remember this dialog in IE's Internet Options?
RSACi stands for “Recreational Software Advisory Council — Internet”, the RSAC
being an organisation which existed to come up with a system of classification
of various kinds of content. The RSACi v1 vocabulary, which is shown in the
above dialog, allowed a web page to add a special <meta/> element to indicate
how its content should be classified on four different axes:
| RSACi Axis | Level | Description (in Internet Options) |
|---|---|---|
| Language | Level 0 | Inoffensive slang. Inoffensive slang; no profanity. |
| | Level 1 | Mild expletives. Mild expletives or mild terms for body functions. |
| | Level 2 | Moderate expletives. Expletives; non-sexual anatomical references. |
| | Level 3 | Obscene gestures. Strong, vulgar language; obscene gestures. Use of epithets. |
| | Level 4 | Explicit or crude language. Extreme hate speech or crude language. Explicit sexual references. |
| Nudity | Level 0 | None. No nudity. |
| | Level 1 | Revealing attire. Revealing attire. |
| | Level 2 | Partial nudity. Partial nudity. |
| | Level 3 | Frontal nudity. Frontal nudity. |
| | Level 4 | Provocative frontal nudity. Provocative display of frontal nudity. |
| Sex | Level 0 | None. No sexual activity portrayed. Romance. |
| | Level 1 | Passionate kissing. Passionate kissing. |
| | Level 2 | Clothed sexual touching. Clothed sexual touching. |
| | Level 3 | Non-explicit sexual touching. Non-explicit sexual touching. |
| | Level 4 | Explicit sexual activity. Explicit sexual activity. |
| Violence | Level 0 | No violence. No aggressive violence; no natural or accidental violence. |
| | Level 1 | Fighting. Creatures injured or killed; damage to realistic objects. |
| | Level 2 | Killing. Humans or creatures injured or killed. Rewards injuring non-threatening creatures. |
| | Level 3 | Killing with blood and gore. Humans injured or killed. |
| | Level 4 | Wanton and gratuitous violence. Wanton and gratuitous violence. |
IE's support for this scheme allowed you to configure a maximum level for each of these RSACi criteria, and then secure these settings against being changed by setting a special supervisor password.
The idea, in other words, is that websites would conscientiously add this metadata to each page so that IE could determine whether the user should be allowed to access it. Of course, where this idea falls down is that basically no websites actually did this. You could choose to enable the “Users can see sites that have no rating” option, making the mechanism largely ineffective because so few sites actually included the rating metadata, or disable it (the default), making almost all of the web inaccessible; although you could also explicitly whitelist or blacklist specific websites:
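To make the decision logic described above concrete, here is a minimal Python sketch. It is not IE's actual code: the function name, the category keys and the rule that a category missing from a rated page counts as level 0 are all assumptions made for illustration.

```python
from typing import Dict, Optional


def is_allowed(page_ratings: Optional[Dict[str, int]],
               max_levels: Dict[str, int],
               allow_unrated: bool = False) -> bool:
    """Decide whether a page may be shown (hypothetical model, not IE's code)."""
    if page_ratings is None:
        # No rating metadata at all: the outcome depends only on the
        # "Users can see sites that have no rating" option (off by default).
        return allow_unrated
    # Each category must be at or below the configured maximum.
    # Assumption: a category the page does not mention is treated as level 0.
    return all(page_ratings.get(category, 0) <= limit
               for category, limit in max_levels.items())


# Example limits a supervisor might configure (illustrative values).
limits = {"language": 1, "nudity": 0, "sex": 0, "violence": 2}
print(is_allowed({"language": 1, "violence": 2}, limits))  # within limits
print(is_allowed({"violence": 3}, limits))                 # exceeds violence limit
print(is_allowed(None, limits))                            # unrated page
```

With the default setting, the unrated case is the one that matters most in practice, since it is the branch nearly every real page would have taken.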
Standards. Interestingly, this scheme wasn't some IE proprietary extension. It was actually the product of a W3C standards effort, Platform for Internet Content Selection (PICS). This was an abortive effort at a web standard for adding content ratings metadata to web pages. Other browsers could have implemented it, though I am unsure if any actually did.
However, the rabbit hole actually goes even deeper than I previously knew. You see, the PICS standard doesn't just define support for one particular ratings scheme (RSACi). Oh, no. Instead, the PICS standard defines an entire DSL for defining custom ratings schemes. Not only can arbitrarily many ratings vocabularies exist, but a web page can add metadata classifying itself according to as many of them as it wishes. In fact, a website could even invent its own content ratings scheme in the PICS language and then classify itself using that scheme.
...Did I mention this DSL uses S-expressions?1
IE didn't cop out of implementing the whole shebang, either. You can add your own custom PICS ratings system definitions; Windows just included the .rat file for RSACi by default:
Metadata format. OK, so we have some ratings scheme we want to use, so how do we add the appropriate metadata to a web page? The metadata looks something like this (newlines added for ease of reading):
As the syntax implies, you could also use a PICS-Label HTTP header directly.
The above metadata would assert a lack of any violence, sex, nudity or bad
language on the page under the RSACi v1 scheme; the v, s, n and l characters
match up with the transmit-as directive in the PICS DSL example given above.
The ratings scheme is identified by the scheme URI. This syntax isn't
restricted to a single ratings scheme and a single PICS-Label HTTP header can
express a rating in arbitrarily many schemes, each identified by a URI. The
http://www.example.com string designates the prefix for which the rating is
valid; this might be an entire website or a single web page.
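As a rough illustration of the label format just described, the following Python sketch assembles a PICS-1.1 label and wraps it in a `<meta/>` element. The helper names are hypothetical, and the RSACi service URL and the `l gen true ... r (...)` layout are written from my recollection of the PICS label syntax, so treat them as assumptions and not as a reproduction of the original example.

```python
from typing import Dict

# Assumed scheme URI for RSACi v1 (not taken from the article text).
RSACI_V1 = "http://www.rsac.org/ratingsv01.html"


def pics_label(service: str, prefix: str, ratings: Dict[str, int]) -> str:
    """Build a generic PICS-1.1 label valid for every URL under `prefix`."""
    # Each rating is a transmit-as name followed by its numeric level.
    pairs = " ".join(f"{name} {level}" for name, level in ratings.items())
    return f'(PICS-1.1 "{service}" l gen true for "{prefix}" r ({pairs}))'


def pics_meta(label: str) -> str:
    """Wrap a label in a <meta/> element; single quotes avoid escaping."""
    return f"<meta http-equiv=\"PICS-Label\" content='{label}' />"


label = pics_label(RSACI_V1, "http://www.example.com",
                   {"v": 0, "s": 0, "n": 0, "l": 0})
print(pics_meta(label))
```

The same label string could be sent as the value of a PICS-Label HTTP header in place of the `<meta/>` element.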
Internet Explorer 7. Weirdly enough, despite its poor uptake, Microsoft held out with its PICS implementation. In IE7, they started to install the newer ICRAv3 rating scheme by default (RAT file). The RSACi RAT file was still bundled and available in Windows's system directory, but you had to explicitly add it. A RAT file for a Taiwanese content rating scheme known as TICRF (ticrf.rat) was also bundled, which also needed to be explicitly added. The TICRF file contains Chinese characters; interestingly, when I tried it, Windows XP's UI displayed the settings for this ratings file as mojibake, an example of Microsoft failing to use their own “W” Unicode APIs.2
IE7 also introduces a password hint for the supervisor password.
UI when a page is blocked. When accessing a web page which Content Advisor wouldn't allow, IE would pop up the following dialog:
HTTP headers. Besides the PICS-Label HTTP header, PICS also defined an
HTTP header to explicitly request that a PICS label header be provided in the
response:
(Note that HTTP/1.x supports breaking header lines over a new line; each continuation line has to begin with a space.)
The full keyword in the Protocol-Request header indicates how verbose the
PICS-Label response header should be; the options are minimal, short, full or
signed. A list of ratings schemes the client is interested in is sent, and the
server responds with a Protocol and PICS-Label header in an otherwise normal
HTTP response.
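For a sense of what such a request header might look like, here is a small Python sketch that builds a Protocol-Request line. The brace-delimited layout is my recollection of the PICS label distribution specification and should be read as an assumption; the function name and the example scheme URL are hypothetical.

```python
from typing import Iterable

# Verbosity keywords listed in the text above.
DETAIL_LEVELS = ("minimal", "short", "full", "signed")


def protocol_request(services: Iterable[str], detail: str = "full") -> str:
    """Build a Protocol-Request header line (assumed syntax, see lead-in)."""
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail!r}")
    # Each ratings scheme the client cares about is sent as a quoted URI.
    quoted = " ".join(f'"{uri}"' for uri in services)
    return ("Protocol-Request: {PICS-1.1 {params " + detail
            + " {services " + quoted + "}}}")


print(protocol_request(["http://www.example.org/v2.5"]))
```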
By the way, despite their general naming, the Protocol and Protocol-Request
headers are, as far as I can tell, entirely PICS-specific and have never been
used for any other purpose; see RFC 4229, which lists known HTTP header names
at the time it was published, and explicitly lists Protocol and
Protocol-Request as relating solely to PICS.
PICS rules files. PICS didn't only define a format for defining ratings schemes and a format for applying metadata to web pages. It also defined a “PICSRules” file format, which can contain a set of rules dictating whether a webpage can be viewed. These rules could whitelist or blacklist sites based on URL patterns, or based on logic and inequality expressions over numerical content rating values. Somewhat amazingly, Content Advisor even includes support for these files:
A complex example of a PICSRules file:
Ratings bureaus. Observant readers will notice the “ratings bureau” option found on the Advanced tab. By default, “None” is the only available option to select. The idea is that since a lot of content on the web wouldn't be rated, content ratings agencies could run a web API which allowed PICS labels to be distributed by the ratings agency rather than by the website itself. This formed an alternative distribution channel for a set of PICS labels. The API is documented in the PICS specification, but can be summarised as:
GET http://some-ratings-scheme.example.org/Ratings?opt=generic
&u="http%3A%2F%2Fwww.example.com%2Ffoobar"
&s="http%3A%2F%2Fwww.example.org%2Fv2.5" HTTP/1.1
HTTP/1.0 200 OK
Content-Type: application/pics-labels
...
Here, the s query string argument specifies the rating scheme, and the u
argument specifies the web page being inquired about. The content returned is a
set of PICS labels expressed in the same S-expression syntax as used by the
PICS-Label HTTP header. There are also different query types which allow
an entire tree of ratings for a website to be obtained.
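The query URL shown above can be assembled mechanically. This Python sketch reproduces it with the standard library; the function name is hypothetical, and the only behaviour assumed is what the example itself shows, namely that each value is percent-encoded and then wrapped in literal double quotes.

```python
from urllib.parse import quote


def bureau_query(bureau: str, page_url: str, scheme_url: str,
                 opt: str = "generic") -> str:
    """Build a ratings bureau query URL in the style of the example above."""
    def quoted(value: str) -> str:
        # Percent-encode everything reserved (including ':' and '/'),
        # then wrap the result in literal double quotes.
        return '"' + quote(value, safe="") + '"'

    return (f"{bureau}/Ratings?opt={opt}"
            f"&u={quoted(page_url)}&s={quoted(scheme_url)}")


print(bureau_query("http://some-ratings-scheme.example.org",
                   "http://www.example.com/foobar",
                   "http://www.example.org/v2.5"))
```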
Signatures. PICS even defined a scheme for digitally signing PICS labels(!). Being a specification from 1998, the scheme is based on RSA and MD5.
Creating your own ratings scheme
So... PICS, and IE's “Content Advisor” scheme, allows you to define custom ratings schemes. There's really only one thing for it, isn't there? Clearly, I have to come up with my own content rating scheme.
Here goes. Presenting the Devever Content Rating Scheme:
| Devever Axis | Level | Description |
|---|---|---|
| Cats | | Content which features pictures of cats. This may cause lost productivity in the user due to being lulled into a sense of adorableness. |
| | Level 0 | No cats. The content is completely cat-free. |
| | Level 1 | Gruff and unsightly cats. While the content features cats, the cats do at least offset some of their inherent cuteness by being gruff or unsightly. |
| | Level 2 | Cute cats. Cute cats. |
| | Level 3 | Severely cute cats. Severely cute cats. Danger! |
| | Level 4 | Kittens. Kittens. Extreme cuteness hazard. Consult a doctor before viewing pictures of kittens. May be classified as a munition by the Wassenaar Treaty. |
| Connector Mating | | Content which features pictures of electrical or electronics connectors or sockets being bonded together. |
| | Level 0 | No connector mating. The content features no connector mating. |
| | Level 1 | Plug sockets only. Only electrical mains plug sockets are shown mating. |
| | Level 2 | External electronics connector mating. The sensitive mating of electronics connectors on the outside of electronic equipment is depicted. |
| | Level 3 | Internal electronics connector mating. The delicate mating of internal electronics connectors within electronic equipment is depicted. |
| | Level 4 | Hermaphroditic connector mating. IBM TOKEN RING!! |
| Disturbing Technology | | Content which discusses hardware or software which is disturbing, due to being insane, badly designed, or otherwise horrifying. |
| | Level 0 | No disturbing technology. The content features no disturbing technology. |
| | Level 1 | Mild technical debt. The content features mild technical debt. |
| | Level 2 | Serious kludges. The content features serious kludges which may traumatise the reader for a few hours. |
| | Level 3 | Nightmare fuel. The content features descriptions of technology so disturbing, the reader is likely to have nightmares about it for months afterwards. |
| | Level 4 | Eldritch abomination. The content features descriptions of technology so terrible, you will lose your sanity just reading about it. |
| Esotericism | | Content which deals with bizarrely esoteric computer technologies and aspects of computing. |
| | Level 0 | Not esoteric at all. As common as ASCII. |
| | Level 1 | Substantially obscure knowledge. Not many people know about this area. |
| | Level 2 | Ridiculously obscure knowledge. The technological equivalent of having discovered something at the bottom of a locked filing cabinet in a disused lavatory hidden behind a door saying “Beware of the Leopard”. |
| | Level 3 | Critically endangered knowledge. Knowledge of this technology is so obscure, those still retaining it should be considered critically endangered. It may be necessary to clone these people in the future. |
| | Level 4 | The only person on the internet to ever write about this. Knowledge of this technology is so obscure, there is only one person ever known to have written about it, and nobody knows what has happened to them. Possibly they are some kind of time traveller from the past or future. |
| Rantiness | | Content which involves angry techies ranting about awful technology issues. |
| | Level 0 | No ranting. The content is free of ranting. |
| | Level 1 | Mildly acerbic. The irritation of the writer is sensible, but does not reach the threshold of actual ranting. |
| | Level 2 | Substantial ranting. The writer is ranting to a substantial degree. |
| | Level 3 | Severe ranting. The writer is ranting to an extremely agitated degree. |
| | Level 4 | Risk of aneurism in writer. The writer was so agitated while writing this content, they'd probably die if the wind blew too strongly on them. |
You can find the .RAT file for this scheme here. After downloading the file, you can install it using the dialog in Content Advisor's settings:
You can also install it by just double clicking the .RAT file in Windows Explorer. This file type even has its own bespoke icon:
Once installed, our rating scheme appears in Content Advisor's settings:
After installing the scheme, I've elected to limit Cats to Level 1, Connector Mating to Level 4, Disturbing Technology to Level 2, Esotericism to Level 3 and Rantiness to Level 0. I left all the ICRA3 settings at Level 0.
I produced a set of PICS test pages which you can access here. Note that since this website requires TLS 1.2 or later, you can't access it using old versions of IE. Therefore, these test pages are also available as a zipped download which you can use locally.
Sure enough, it works:
In terms of how multiple ratings systems interact, it seems like Content Advisor doesn't consider a page “unrated” so long as at least one of the ratings systems you have installed is used to label the page, even if you also have other systems installed that the page doesn't use to label itself.
Creating your own PICS rules file. As mentioned above, Content Advisor also supports PICS rules files. Here I import the following PRF file:
PRF files also have an icon set in Explorer, and you can double click them to import them:
I found Microsoft's implementation of the PRF format a bit strange, which seems
to relate to how its decision is combined with the main Content Advisor
settings. AcceptByURL and RejectByURL work well, and override ordinary
decisions, ignoring any labels. For example, they allow unrated pages to be
viewed when this isn't normally the case, or prohibit viewing of pages which
would usually be allowed by virtue of their labelling. However, I was unable to
get label-based policies like AcceptIf or RejectIf to work, even though
these are supposedly supported. I also was unable to get a ratings bureau,
which is supposed to be specified via bureauURL in a PRF file, to show up in
the UI. The otherwise clauses appear to be implemented in a non-standard way,
where IE applies the standard Content Advisor handling logic set in the UI,
regardless of whether RejectIf "otherwise" or AcceptIf "otherwise" is used.
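The behaviour I observed can be modelled roughly as follows. This Python sketch is only a model of those observations, not Microsoft's implementation: the function name is hypothetical, glob patterns stand in for the real PICSRules URL pattern syntax, and the first-match-wins ordering is an assumption.

```python
from fnmatch import fnmatchcase
from typing import Callable, Sequence, Tuple


def decide(url: str,
           policies: Sequence[Tuple[str, str]],
           content_advisor: Callable[[str], bool]) -> bool:
    """Model of the observed precedence between URL policies and the UI settings."""
    for action, pattern in policies:
        if action not in ("AcceptByURL", "RejectByURL"):
            raise ValueError(f"unsupported policy: {action!r}")
        # Assumption: policies are tried in order and the first match wins,
        # overriding whatever the page's labels would otherwise imply.
        if fnmatchcase(url, pattern):
            return action == "AcceptByURL"
    # No URL policy matched (the "otherwise" case): fall back to the
    # ordinary Content Advisor decision configured in the UI.
    return content_advisor(url)


rules = [("RejectByURL", "http://blocked.example.com/*"),
         ("AcceptByURL", "http://www.example.com/*")]
# An unrated page that Content Advisor alone would block is allowed by URL.
print(decide("http://www.example.com/unrated", rules, lambda u: False))
```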
Here's an example of accessing the rsaci-min test page, which should
ordinarily be allowed but which is blocked by the rules file:
Miscellanea
API. Microsoft's PICS implementation is actually part of the operating
system, implemented in msrating.dll. The API is even documented so that other
applications can take advantage. Microsoft also has a technical specification
describing its PICS implementation.
Further reading. The PICS specification and W3C working group pages are still available.
Conclusions. Internet Explorer's “Content Advisor” and the PICS standard that turns out to underlie it offer a fascinating view into a completely obsolete technical ecosystem. There's something vaguely amazing about the amount of effort that was put into developing the several different standards comprising PICS, let alone adding support for not just content ratings but all of the other PICS functionality, such as rules files and ratings bureaus, to IE. Were any ratings bureaus actually set up? Apparently, yes. How many people actually successfully used PICS? If you have your own story about this microcosm, do let me know. Now, it exists only as a strange historical curiosity; I myself only know about it because of the prominence which was given to it in IE's Internet Options dialog. And having written this article, I now return PICS to the crypt from which I unearthed it.
1. Well, almost. Sadly the PICS specification makes no actual mention of S-expressions despite the obvious influence, instead specifying the grammar manually. Not only that, W3C then tried to replace PICS with an XML-based successor named POWDER. This is of course yet another example of an inappropriate use of XML, and demonstrates how W3C doesn't even understand their own standard. It also demonstrates how W3C began to misuse XML as soon as they created it and, most likely, helped establish an extremely pervasive trend of XML misuse in the computing industry which lasted for decades.
2. To make it work correctly, you would need to set the Language for non-Unicode applications to Chinese (Traditional) in Windows's locale settings and then (sigh) reboot. This tiresome design aspect of Windows's “A” APIs resulted in various tools to launch applications with different non-Unicode locales, including Microsoft's own AppLocale. Incidentally, Windows 10 finally (finally!) received support for a UTF-8 codepage in 2019, making it possible to set the locale for non-Unicode applications to UTF-8. This probably doesn't benefit existing applications much, though.