HTML and the Internet

Image Copyright CERN

Screenshot of how a web page would look on the first browser designed to allow anyone with an Internet connection to access information on the Web. The universal line mode browser: http://info.cern.ch/

HTML was originally designed as a simple method of formatting scientific research documents so that they could be transferred easily between different types of computer system and read on any of them. The original browsers only displayed text, so the formatting options were fairly basic. More recent developments have introduced new formatting and layout methods.

As no single person or company owns the Internet, the different organisations involved (computer manufacturers, software developers, ISPs, etc.) need a set of standard guidelines to work from. Major companies support and are represented in an organisation known as the World Wide Web Consortium (W3C), www.w3.org. The W3C discusses and develops standards and publishes them so that all interested parties can produce compatible systems.

Document Type Definitions

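The rules for each version of HTML are set out in a Document Type Definition (DTD), and a page states which DTD it follows in a DOCTYPE declaration on its first line. The current most widely used DTD for HTML is HTML 4.01. Its predecessor, HTML 4.0, was announced in July 1997 and, following a few changes, was accepted by the W3C as a Proposed Recommendation and became the standard in December 1997. The HTML 4.01 revision followed in December 1999.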

The W3C has a list of recommended DTDs: http://www.w3.org/QA/2002/04/valid-dtd-list.html

Although you should include the DTD information in your HTML document, web browsers will still display pages even if it is omitted. However, if you use Cascading Style Sheets to format your pages, some browsers may render them differently when the DTD information is missing.
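As an illustration, a minimal page that declares the HTML 4.01 Strict DTD might look like the sketch below. The DOCTYPE line is the one given in the W3C list of recommended DTDs; the title and paragraph text are placeholders invented for this example.

```html
<!-- The DOCTYPE declaration must come first, before the <html> tag -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- Placeholder title; a title element is required by the DTD -->
<title>Example page</title>
</head>
<body>
<p>This page declares which version of HTML it follows.</p>
</body>
</html>
```

If the first two lines are left out, the page will still display, but the browser then has to guess which set of rules the author had in mind.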

The next step...

In January 2000, the W3C published a new standard for web page authoring called XHTML 1.0 (Extensible HyperText Markup Language). It is generally compatible with the rules of HTML 4.01, but there are quite a few areas where the rules are stricter.

For example:

  • Every tag must be closed; for example, there must be a </p> tag at the end of a paragraph, whereas in HTML 4.01 you can get away with simply putting a <p> between paragraphs and the browser will still display them as you intended.
  • Empty tags must have a terminating slash; for example, <br> becomes <br />
  • All tag and attribute names must be lower case (attribute values may still use mixed case), and quotes become mandatory around attribute values.
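The rules above can be illustrated with a short before-and-after fragment. This is only a sketch: the file name logo.gif and the text are made up for the example, and the first half shows a loose style that HTML 4.01 tolerates rather than a recommended one.

```html
<!-- HTML 4.01: upper-case tags, unclosed paragraphs, unquoted values -->
<P>First paragraph
<P>Second paragraph<BR>
<IMG SRC=logo.gif ALT=Logo WIDTH=100>

<!-- The same content written to the XHTML 1.0 rules -->
<p>First paragraph</p>
<p>Second paragraph<br />
<img src="logo.gif" alt="Logo" width="100" /></p>
```

In the second version every paragraph has a closing </p>, the empty br and img tags end with a slash, the tag names are lower case, and every attribute value is in quotes.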