Problem Solving and Computing (CSC-103 98S)
Outline of Class 4: Organizing and Representing Information; HTML
Held: Thursday, January 29, 1998
- I still haven't graded your assignments (or, at least, when I wrote
this it seemed likely that I wouldn't have graded them by class time).
Hopefully, I'll get them done by Tuesday.
- Today's assignment: computing
assignment 3.
- We'll spend most of today on the computers. Since computers are
malicious, prepare for some problems, but don't let them bother you.
- Much of computer science is concerned with the algorithms
we write.
- However, how we represent and structure data is also
important.
- What is meaningful and readable to the human may be difficult for
the computer, and vice versa.
- In addition, the way in which we organize data can significantly
affect the algorithms we write. For example, it is much easier to
find a word in a sorted list than in an unsorted one.
- Hence, part of our study of computing will be a study of data
representation.
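The sorted-list example above can be sketched in a few lines of Python. This is only an illustration, not part of the assignment: the function names, the step counters, and the made-up word list are all assumptions chosen to show how the organization of the data changes the algorithm we can use.

```python
def linear_search(words, target):
    """Check each word in turn; works on any list, sorted or not.

    Returns (found, steps), where steps counts the words examined.
    """
    steps = 0
    for word in words:
        steps += 1
        if word == target:
            return True, steps
    return False, steps


def binary_search(sorted_words, target):
    """Repeatedly halve the range to search; only correct on a sorted list.

    Returns (found, steps), where steps counts the words examined.
    """
    low, high = 0, len(sorted_words)
    steps = 0
    while low < high:
        steps += 1
        mid = (low + high) // 2
        if sorted_words[mid] == target:
            return True, steps
        if sorted_words[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid      # target can only be in the lower half
    return False, steps


# A made-up, already sorted list of 1000 "words" (word0000 ... word0999).
words = [f"word{i:04d}" for i in range(1000)]

found_linear, linear_steps = linear_search(words, "word0999")
found_binary, binary_steps = binary_search(words, "word0999")
print(found_linear, linear_steps)
print(found_binary, binary_steps)
```

The linear search has to look at every word before it reaches the last one, while the binary search needs only about ten looks for a thousand words. The catch is that the second approach is available only because the data was organized (sorted) ahead of time.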
- As you may have observed while playing with a variety of web pages
(and other types of computer documents), there is much more to an
electronic document than just the contents of the document.
- Some of the "extra" stuff includes
- Formatting of the text
- Images
- Hypertext links
- ...
- How does the computer remember what everything is?
- With an internal
representation of the text that includes information on all of these
other details.
- Often, this representation is helped by a set of "standard" pieces
of information, such as the appearance of fonts.
- How can we tell the computer about all of these components?
- Interactively, using a program that lets us specify each of these
details.
- Formally, by annotating (or "marking up") the text
with information on these extra facets.
- The languages used to annotate texts are called
markup languages.
- There are two primary trends in markup: physical markup and logical
markup.
- In physical markup, you indicate the appearance
of each piece of text. For example, the following describes a particular
piece of text (but not necessarily itself).
"This text should be displayed in Helvetica, Italic, Medium Weight, 12 pt.
It should be indented one inch from the left margin and one inch from the
right margin. It should be separated from the previous text by 1/10 inch.
And so on and so forth."
Obviously, many of these facets can be set for the whole document or
automatically guessed.
- In logical markup, you indicate the role of
each piece of text. For example,
"This is a quotation."
The choice of how to format each type of text
is then left to the reader/computer/browser.
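As a rough sketch of the idea behind logical markup, the Python below tags each piece of text only with its role and leaves the formatting to whoever is reading. Everything here is hypothetical: the role names, the two "readers" (one for print, one for speech), and the style descriptions are invented for illustration and do not correspond to any real browser or markup language.

```python
# Logical markup: each piece of text carries only its role, not its appearance.
document = [
    ("heading", "Markup Languages"),
    ("paragraph", "Markup annotates text with extra information."),
    ("quotation", "This is a quotation."),
]

# Two hypothetical readers, each with its own idea of how a role should look
# (or sound). These tables supply the physical details.
PRINT_STYLE = {
    "heading": "Helvetica Bold 18pt",
    "paragraph": "Times Roman 12pt",
    "quotation": "Helvetica Italic 12pt, indented 1 inch",
}
SPEECH_STYLE = {
    "heading": "loud voice",
    "paragraph": "normal voice",
    "quotation": "deep voice",
}


def render(doc, style):
    """Pair each piece of text with the formatting this reader chose for its role."""
    return [f"[{style[role]}] {text}" for role, text in doc]


for line in render(document, PRINT_STYLE):
    print(line)
for line in render(document, SPEECH_STYLE):
    print(line)
```

The document itself never changes; only the table of styles does. With physical markup, by contrast, the font and indentation would be written into the document next to each piece of text, and every reader would get the same appearance.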
- There are a number of arguments for each type of markup.
- Advocates of physical markup claim that
- There is a wide variety of literature describing the effects of
design on readability. Without physical markup, it is much more
difficult to produce appropriate designs.
- In many arenas, the appearance of your text is as important as
what you say and how you say it (an extension of the old "what you
say is not as important as how you say it").
- Advocates of logical markup claim that
- Logical markup supports more intelligent document retrieval.
- Logical markup permits different readers to read documents in
a way that makes more sense to them. This includes simple
concepts, like which font to use.
- Logical markup better supports visually impaired readers, as the
computer can be programmed to read different types of text
differently (and even to announce the type of each piece of text).
For example, I've been told that one noted computer scientist has
the computer read emphasized text with a James Earl Jones voice.
- Few authors are good designers; logical markup can separate the two.
- I'll admit that I'm an advocate of logical markup.
- HTML, the Hypertext Markup Language, was designed as a simple
markup language for documents in the World-Wide Web.
- It originally included both physical and logical markup mechanisms, but only
a limited number of each.
- More recently, the web has been moving toward a clearer relationship
between logical and physical markup, with an emphasis on logical markup
of documents and accompanying style sheets that provide
the physical details.
- One clear goal of HTML is that we should be able to type HTML on any
"typical" (or at least "typical U.S.") keyboard.
- How do we mark documents? By using tags to surround pieces
of text.
- How do we distinguish tags from regular text? Tim Berners-Lee, the
designer of HTML, decided that we'd surround tags with less-than
(<) and greater-than (>) signs.
- What are some typical tags? Think about the web pages you've seen,
and then see the handout.
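To make the tag convention above concrete, here is a small Python sketch that separates tags from regular text by looking for the less-than and greater-than signs. It is a simplification and not how a real browser reads HTML: the function name split_markup and the sample sentence are made up, and the sketch ignores complications such as comments or a stray < in ordinary text.

```python
import re

# Simplifying assumption: a tag is a "<", then one or more characters that
# are neither "<" nor ">", then a ">".
TAG_PATTERN = re.compile(r"<[^<>]+>")


def split_markup(source):
    """Split marked-up text into ("tag", ...) and ("text", ...) pieces, in order."""
    pieces = []
    position = 0
    for match in TAG_PATTERN.finditer(source):
        if match.start() > position:
            # Everything between the previous tag and this one is regular text.
            pieces.append(("text", source[position:match.start()]))
        pieces.append(("tag", match.group()))
        position = match.end()
    if position < len(source):
        # Regular text left over after the last tag.
        pieces.append(("text", source[position:]))
    return pieces


sample = "This is <em>important</em> text."
for kind, content in split_markup(sample):
    print(kind, repr(content))
```

Here the pair of tags surrounds the word "important", marking its role (emphasis) without saying how it should look.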
For today, we'll work on writing HTML, as specified in
computing assignment 3.