The Languages and Tools of the Web
Based on material by Carlos Scheidegger and Kevin Sun
Next, we go over the basics of how the content of a web page is represented in HTML. We will learn about CSS, which lets us separate the content of a web page from how we want it to look. Then we’ll go into the essential technology for rendering visualizations: SVG (Scalable Vector Graphics).
HTML
In this course we will use HTML to create our data visualizations. HTML stands for “HyperText Markup Language”. 25 years ago, that used to be a meaningful description of what HTML actually did: it has links (hypertext), and it is a markup language. But we will be using many things from the HTML5 standard, which does much, much more: graphics, audio, video, etc. So it is easier to think of HTML as “whatever it is that web browsers know how to interpret”, and just not think about the actual term.
Elements
The important thing about HTML is that the markup is represented by elements. An HTML element is a portion of the content that is surrounded by a pair of tags of the same name. Like this:
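A minimal sketch of such an element, reconstructed from the description that follows:

```html
<strong>This is an HTML element</strong>
```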
In this element, strong is the name of the tag; the open tag is <strong>, and the matching closing tag is </strong>. The way you should interpret this is that the text “This is an HTML element” should be “strong”, i.e., typically this will be bold text.
HTML elements can and commonly do nest:
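As a small sketch of nesting (the sentence itself is made up for illustration), a paragraph can contain a strong element, which in turn contains an em element:

```html
<p>
  This paragraph contains <strong>bold text with <em>emphasis</em> inside it</strong>.
</p>
```

Note that the inner element is closed before the outer one: tags must be closed in the reverse order in which they were opened.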
In addition to the names, opening tags can contain extra information about the element. These are called attributes:
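A sketch of an element with one attribute; the URL and the link text are assumptions chosen for illustration:

```html
<a href="https://developer.mozilla.org/">A link to MDN</a>
```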
See output in new page.In this case, we’re using the a
element (which stood for “anchor”, but now is almost universally used as a “link”). The attribute href
means “HTML reference”, which actually makes sense for a change. The meaning given to each attribute changes from element to element.
We will use element attributes in pretty much every example from now on. The most important ones are id, class, and style. The id attribute gives the element a unique identifier, which can then be used to access the element via JavaScript (we’ll see how next week). Think of it as making the element accessible via a global variable. This is as convenient as a global variable, and potentially just as confusing: needing lots of different element ids might be a sign that you could organize your code better (in the coming weeks, we’ll learn about good practices like this). The class and style attributes will be explained in the CSS section below.
Self-closing elements
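Some elements rarely or never have content between their opening and closing tags, and it becomes a bit of a pain to type the closing tags every time. In that case, you can use the following shorthand notation: <foo /> stands for <foo></foo> (the charset declaration, <meta charset="utf-8" />, is a typical example). Strictly speaking, HTML only honors this shorthand for elements that cannot have content, such as <meta>, <br>, and <img>; on other HTML elements the trailing slash is ignored and the element stays open. In SVG, which we cover below, the shorthand works for every element.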
Here are some important tags:
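As a rough sketch, these are some commonly used content tags. The selection, the sample text, and the file name picture.png are assumptions made for illustration, not necessarily the set used in the lecture:

```html
<h1>A top-level heading</h1>
<h2>A second-level heading</h2>
<p>A paragraph with <em>emphasized</em> and <strong>strong</strong> text.</p>
<ul>
  <li>An item in an unordered list</li>
  <li>Another item</li>
</ul>
<div>A generic block-level container</div>
<span>A generic inline container</span>
<!-- picture.png is a hypothetical file name -->
<img src="picture.png" alt="Text shown if the image cannot be loaded">
```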
Below are several tags that don’t have visual equivalents on the website, but are used to define the document’s structure and metadata:
<html> creates the entire HTML container.
<head> creates the header (generally where the title and links to style sheets/scripts are found).
<script> links to or embeds a script (we will do that a lot).
<style> embeds a style in the website.
<link> references an external document, often a CSS document, like this: <link rel="stylesheet" type="text/css" href="theme.css">. The rel attribute defines the relationship to the active document; the type attribute tells the browser which type of file to expect.
<body> marks the container of the content of the website.
A comprehensive and well structured list of all elements can be found at MDN.
Necessary Boilerplate
An HTML5 document has a little bit of necessary boilerplate that you should just copy and paste every time you need to get started. Every HTML5 document you create in class should have this skeleton:
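A commonly used minimal skeleton looks roughly like the sketch below. The title text and the lang value are placeholders, and the skeleton used in class may differ in its details:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Page title</title>
</head>
<body>
  <!-- page content goes here -->
</body>
</html>
```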
DOM
As we have seen above, a markup document looks a lot like a tree: it has a root, the html element, and elements can have children that are themselves elements.
While HTML is a textual representation of a markup document, the DOM is a programming interface for it. DOM stands for “Document Object Model”, and in this class we will use “DOM” to mean the tree created by the web browsers to represent the document, and the API that they provide in order to access it. This week, we will not use the API part of it, but the examples we’ll go over in class will highlight the tree structure of the DOM.
INSPECTING THE DOM IN A LIVE BROWSER
Perhaps the most important habit you will learn in these first web lessons is the following: when in doubt, go to the Developer Tools. In this case, we’ll look at the Elements tree, which in Chrome you can open from the menu bar: View → Developer → Developer Tools. Alternatively, you can right-click on any part of the webpage and choose “Inspect Element”. Notice that there can be a big difference between what is in the DOM and what is in the source. In fact, much of this class is about dynamically generating DOM elements. Here is a good overview of the developer tools.
Cascading Style Sheets and Selectors
HTML specifies the content of a web page, but plain HTML says relatively little about how the content looks. This is where CSS comes in. CSS stands for Cascading Style Sheets: they are external declarations that control the way your elements will get rendered by a web browser. A full discussion of CSS syntax is, as usual, given at the MDN CSS website; we show the very basics here.
A stylesheet will usually consist of a list of CSS rules that are inserted in <style> elements in the HTML header, <head>. A single CSS rule associates a CSS selector with a set of CSS declarations. These are easier to see through examples.
More important than styling, however, are the rules for identifying elements in the DOM: the selectors.
CSS Element Selectors
Let’s look at a very simple CSS rule:
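A sketch of such a rule, reconstructed from the description that follows and wrapped in a <style> element so that it can be pasted into a page’s <head>:

```html
<style>
  strong {
    background-color: red;
    font-size: 300%;
  }
</style>
```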
In this rule, strong is the selector, and each line inside the curly brackets is a declaration. Roughly, the way this goes is: for every DOM element with tag name strong, make its background color red, and its font size 300% of the base size. CSS rules are applied in the order in which they appear in the document, and if more than one rule matches the element, then all of them apply. For instance, the example below is entirely equivalent to the above:
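One way to write that equivalent version, as a sketch, is to split the two declarations into two rules with the same selector; both match every strong element, so both declarations still apply:

```html
<style>
  strong {
    background-color: red;
  }
  strong {
    font-size: 300%;
  }
</style>
```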
CSS Class Selectors
CSS selectors can be much more powerful than selecting on element names. For example, we can create user-defined “classes” of style. Classes allow us to define certain elements to be of a specific type that is then formatted consistently. Here is an example with “important” text and a “footnote”. Both are in a div, but they have different semantics, and we also want to display them differently. Class selectors are identified by a leading period (.), e.g., .menu.
Note that we can also apply multiple classes to a single element, as you can see in the important footnote.
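A sketch of what such an example might look like. The class names come from the text above; the specific declarations and the sample text are assumptions chosen so that the two classes do not conflict:

```html
<style>
  .important {
    font-weight: bold;
    background-color: yellow;
  }
  .footnote {
    font-size: 80%;
    font-style: italic;
  }
</style>

<div class="important">This text is important.</div>
<div class="footnote">This is a footnote.</div>
<!-- two classes, separated by a space: both rules apply -->
<div class="important footnote">This is an important footnote.</div>
```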
CSS ID Selectors
ID selectors work similarly to class selectors, but an ID may be used only once in a document. ID selectors are identified by a leading #, e.g., #header. You can use IDs as “anchors” to refer to a specific place on a page by appending #idname to the URL. But you can also use IDs to apply custom styles in CSS:
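A minimal sketch of styling through an ID; the ID header is taken from the text above, while the declarations are assumptions for illustration. Appending #header to the page’s URL would also make the browser scroll to this element:

```html
<style>
  #header {
    background-color: lightgray;
    padding: 10px;
  }
</style>

<div id="header">This div is styled through its ID.</div>
```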
You can also use the combination of IDs and CSS to create layouts of a page:
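One possible sketch of a simple two-column layout. The IDs sidebar and content, the widths, and the colors are all hypothetical, and floats are only one of several layout techniques available in CSS:

```html
<style>
  #sidebar {
    float: left;
    width: 30%;
    background-color: lightblue;
  }
  #content {
    float: left;
    width: 70%;
    background-color: lightyellow;
  }
</style>

<div id="sidebar">Sidebar</div>
<div id="content">Main content</div>
```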
CSS Relationship Selectors
CSS selectors let you match elements based on their relationship with other elements. While I will simply refer you to the MDN Selectors webpage for the full reference, I want to highlight two particularly important ones: the child selector and the descendant selector.
The child selector matches every time an element is directly enclosed by a different element. For example, consider the following rule involving the child selector:
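A sketch contrasting the two selectors; the declarations and the sample text are assumptions for illustration. The child selector is written with a >, the descendant selector with a plain space:

```html
<style>
  /* child selector: only strong elements directly inside a div */
  div > strong {
    background-color: red;
  }
  /* descendant selector: strong elements anywhere inside a div */
  div strong {
    font-size: 150%;
  }
</style>

<div>
  <strong>Direct child: red background and larger text.</strong>
  <p><strong>Nested in a paragraph: larger text only.</strong></p>
</div>
```

The second strong element is a descendant of the div but not a direct child (the p sits in between), so only the descendant rule matches it.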
Multiple Rules in CSS
When more than one CSS rule matches, different CSS declarations might conflict with one another. In that case, “the most specific declaration wins”. The rules for what counts as more specific are really disgusting, so if you find yourself debugging CSS code because the styles “don’t take”, the first thing you should try is to give the element a single, completely new class and add all the declarations to that class. Then, with help from the Developer Tools, you can add the other classes back to your element one at a time to see which declarations are winning the specificity race. To avoid this kind of trouble, it’s better to stick to simple rules as much as possible.
Other ways of declaring CSS
You can provide CSS stylesheets as an external file. This is very useful when you want to share CSS rules across many different documents. In that case, you include the following element in your <head>:
<link rel="stylesheet" href="style.css"/>
In this case, style.css should be an additional file that consists entirely of CSS rules.
Finally, you can place CSS declarations directly inside an element. You do this using the style attribute, which most HTML elements support. For example, if you have this CSS rule:
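As a sketch (the class name highlight and the declaration are assumptions chosen for illustration), a rule in a stylesheet and the same declaration placed directly on an element might look like this:

```html
<!-- the declaration lives in a rule and is attached through a class -->
<style>
  .highlight {
    background-color: red;
  }
</style>
<span class="highlight">Styled through a class.</span>

<!-- the same declaration placed directly on the element -->
<span style="background-color: red;">Styled through the style attribute.</span>
```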
It’s a bad idea to do this in HTML that you write manually: you’re mixing content with presentation, and making it hard to reuse the declarations. But later on we will be writing code to generate elements in the DOM for us, and in that case, this will be a very common and good thing to do. In this latter situation, it’s a good thing to do because the reusability will be represented in our JavaScript source code.
SVG
So far we have only seen textual content in HTML, and this is a data visualization course. SVG (“Scalable Vector Graphics”) is a vector graphics format that HTML5 lets us embed directly in a web page, and it will provide us with essentially all of our graphical needs. SVG is extremely powerful, broadly supported, and very easy to program for. It’s also the preferred target for d3, the JavaScript library we’ll use for our visualization design.
Minimal Example
An SVG drawing starts with an svg element, which should be given width and height attributes; plain numbers are interpreted as pixels:
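A minimal sketch; the dimensions are arbitrary values chosen for illustration:

```html
<svg width="300" height="200">
</svg>
```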
This results in a blank canvas, which is kind of boring, but you should be able to verify, using the Developer Tools, that there is in fact an honest-to-goodness SVG element there. In the following, you’ll learn how to add basic graphical shapes to the SVG element.
Circle
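A sketch of a circle, with coordinates and colors chosen arbitrarily for illustration. The cx and cy attributes give the center, and r gives the radius:

```html
<svg width="300" height="200">
  <circle cx="150" cy="100" r="50" fill="steelblue" stroke="black" />
</svg>
```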
Ellipse
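A sketch of an ellipse, again with arbitrary illustrative values. It works like a circle, but has separate horizontal and vertical radii, rx and ry:

```html
<svg width="300" height="200">
  <ellipse cx="150" cy="100" rx="100" ry="50" fill="orange" />
</svg>
```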
Rectangle
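A sketch of a rectangle with arbitrary illustrative values. The x and y attributes give the position of the top-left corner:

```html
<svg width="300" height="200">
  <rect x="50" y="50" width="200" height="100" fill="seagreen" />
</svg>
```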
Lines
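A sketch of a line from (x1, y1) to (x2, y2), with arbitrary illustrative values. A line has no interior to fill, so it needs a stroke to be visible:

```html
<svg width="300" height="200">
  <line x1="20" y1="20" x2="280" y2="180" stroke="black" stroke-width="3" />
</svg>
```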
Text
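A sketch of a text element; the position, size, and string are arbitrary illustrative values. The y attribute positions the baseline of the text, and text-anchor="middle" centers the string horizontally on x:

```html
<svg width="300" height="200">
  <text x="150" y="100" text-anchor="middle" font-size="24">Hello, SVG</text>
</svg>
```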
Path
The SVG path element is how you “escape” the basic SVG shapes. In case none of the predefined shapes are good enough for you, you can draw any arbitrary shape you want using the path element. We will not use it very often in class, but it’s important that you know it exists, because it helps you understand how much of d3 works under the hood.
Instead of giving you just one simple example that wouldn’t do the path element justice, I will ask you to simply take a look at the MDN path tutorial.
Ordering
Elements are drawn in the order in which they appear inside the svg element, so later elements are painted on top of earlier ones:
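A sketch with two overlapping circles; the positions and colors are assumptions for illustration:

```html
<svg width="300" height="200">
  <!-- drawn first, so it ends up underneath -->
  <circle cx="120" cy="100" r="60" fill="red" />
  <!-- drawn second, so it covers part of the red circle -->
  <circle cx="180" cy="100" r="60" fill="blue" />
</svg>
```

Swapping the two circle elements in the source would put the red circle on top instead.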
Grouping
Grouping elements is a very powerful idea, and we will use it extensively when we get to use SVG for actual visualizations. It is powerful because it gives us abstraction, in the same way that a procedure groups a sequence of operations under a single name. In dynamic visualizations, this makes it possible for us to move a large number of elements by simply taking one branch of the DOM and placing it in a different subtree; without groups, we would have to remember over and over again which elements we cared about.
In addition, SVG groups give us geometric transformations. Geometric transformations are amazingly useful when we want to change the positions of a large number of elements in the same way, or when we want to express the positions of the elements in a more convenient manner. For example, recall that SVG’s basic coordinate system increases the y coordinate in the downward direction. If we want to draw a scatterplot, then we’d have to remember every time to subtract the y coordinate we want from the height of the SVG element:
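A sketch of that manual bookkeeping, assuming an SVG element of height 200 and three made-up data points:

```html
<svg width="200" height="200">
  <!-- data point (50, 30): cy = 200 - 30 -->
  <circle cx="50" cy="170" r="5" />
  <!-- data point (100, 80): cy = 200 - 80 -->
  <circle cx="100" cy="120" r="5" />
  <!-- data point (150, 160): cy = 200 - 160 -->
  <circle cx="150" cy="40" r="5" />
</svg>
```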
This is annoying and error-prone. Instead, we can encode that transformation directly, using SVG’s grouping node g, and its transform attribute:
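A sketch of such a group, assuming an SVG element of height 200 and three made-up data points. Inside the group, the circles use their data coordinates directly:

```html
<svg width="200" height="200">
  <g transform="translate(0, 200) scale(1, -1)">
    <circle cx="50" cy="30" r="5" />
    <circle cx="100" cy="80" r="5" />
    <circle cx="150" cy="160" r="5" />
  </g>
</svg>
```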
The transform attribute is read right-to-left, and it’s saying: to get the outer y coordinate, multiply the inner y coordinate by -1, and then add 200. In other words, outer_y = 200 - inner_y, which is precisely the flipping we need. Now the y coordinates behave as we would expect them in a scatterplot: increasing y means going up.
The main problem with these transformations is that they apply to everything:
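A sketch of the problem, assuming a flipping group on an SVG element of height 200 and a made-up label: the text is flipped along with the circle and is drawn as a vertical mirror image:

```html
<svg width="200" height="200">
  <g transform="translate(0, 200) scale(1, -1)">
    <circle cx="100" cy="100" r="5" />
    <!-- the text is mirrored vertically along with everything else in the group -->
    <text x="110" y="100">a label</text>
  </g>
</svg>
```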
Clearly, we don’t want that to happen in every situation.
PRESENTATION ATTRIBUTES
As you might have noticed by comparing the HTML examples with the SVG ones, some appearance aspects are controlled by HTML attributes; others are controlled by CSS properties. This is a perennial source of confusion, and unfortunately there’s no good way around it. To add to the confusion, a subset of SVG attributes can also be specified via CSS: these are the “presentation attributes”.
It’s worth remembering this because CSS declarations for these attributes will override inline attribute definitions in the DOM. This is in turn inconsistent with the rule for the style attribute itself, which overrides CSS definitions (on behalf of whoever designed this standard: I am sorry).
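A sketch of both behaviors, using fill as the presentation attribute; the colors and positions are assumptions for illustration:

```html
<style>
  circle {
    fill: blue;
  }
</style>

<svg width="300" height="100">
  <!-- the CSS rule overrides the fill attribute: drawn blue -->
  <circle cx="50" cy="50" r="40" fill="red" />
  <!-- the style attribute overrides the CSS rule: drawn green -->
  <circle cx="150" cy="50" r="40" style="fill: green;" />
</svg>
```

When a shape refuses to take the color given in its attribute, the Developer Tools will show which CSS rule is overriding it.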