Semantic markup is a wonderful thing. With it, HTML becomes a rich landscape of meaningful, structured text that is transparent to screen readers and search engines. Without it, the web becomes just a featureless blob of text. As Jeffrey Zeldman, godfather of the web, says in Designing with Web Standards, Third Edition (2010):
[N]onsemantic slop […] makes pages fat, unfindable, and unfriendly to people with (permanent or temporary) disabilities.
One problem that I occasionally come across, though, is that not every content type in the analogue world has a clearly corresponding HTML element. In such cases, I don't feel 100% confident about my markup choices. For a great example, look no further than this very site. I suppose that my blog posts should be contained in an <article> element, but what about the teasers on the index page? Should they appear in <article> or <section> tags, or perhaps something else? And what about the Comments section at the bottom of this page? Is this a subsection of the main article, or should it be in a separate sibling section (or article or whatever)? Ach!
What prompted this line of thought was running a page of markup from CSS Wizardry through the W3C HTML Validation Service. It returned the following warnings:
Warning: Article lacks heading. Consider using h2-h6 elements to add identifying headings to all articles.
Warning: Section lacks heading. Consider using h2-h6 elements to add identifying headings to all sections.
So, when you use <article> and <section> tags, it looks like you're supposed to give each one a heading. Great! What else don't I know about them? Let's find out.
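To make the validator's point concrete, here is a minimal sketch of markup that should avoid both warnings. The heading text and paragraph content are placeholders, not taken from CSS Wizardry or from this site.

```html
<!-- Each article and each section carries its own identifying heading. -->
<article>
  <h2>Placeholder post title</h2>
  <p>Placeholder introduction.</p>

  <section>
    <h3>Placeholder section title</h3>
    <p>Placeholder section text.</p>
  </section>
</article>
```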
Here is an abbreviated version of the description of the article element on W3C.
Aaron Gustafson also writes in Adaptive Web Design, Second Edition (2015), p72:
An <article> is best thought of as an autonomous unit of content—it can exist on its own. […] If the content in question could be removed from the document or the content itself, <article> is your best bet.
So, it looks like <article> should be used for standalone content that, for example, could appear in an RSS feed. Thinking about this site, I guess each of the blog posts should appear in an <article> element, although the same W3C page above also says this:
When the main content of the page (i.e. excluding footers, headers, navigation blocks, and sidebars) is all one single self-contained composition, the content should be marked up with a main element. The content may also be marked with an article, but it is technically redundant in this case.
Mmm. Looks like I should be marking up my blog posts with <main> rather than <article>. According to caniuse.com, 71% of users globally are using a browser that fully supports <main>. Around 25% are on browsers that don't apply default formatting to <main>, so it needs to be styled manually with display: block. Almost 4% of global users are using a browser that doesn't support <main> at all.
To maximise coverage, caniuse.com proposes some solutions, all involving JavaScript, which I'm not keen on. I guess I could use an <article> element inside a <main> element to wrap my blog post content, and add a CSS rule that sets display: block on main, as mentioned above. There might be some scruffy text at the beginning and end of the post, but because HTML ignores unknown tags, at least one of the two containers will hopefully be recognised.
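As a rough sketch of that fallback idea (the page title and text are placeholders, and this is not this site's actual markup), the wrapper could look something like this:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Placeholder blog post</title>
  <style>
    /* Browsers that lack default styling for main
       need it declared as a block explicitly. */
    main {
      display: block;
    }
  </style>
</head>
<body>
  <main>
    <!-- The article is technically redundant inside main,
         but acts as a second container for older browsers. -->
    <article>
      <h1>Placeholder blog post</h1>
      <p>Placeholder post content.</p>
    </article>
  </main>
</body>
</html>
```

The design choice here is simply redundancy: if one of the two containers is not recognised, the other still wraps the post.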
If I understand correctly then, a <section> is any part of a web page, other than an <article>, <aside> or <nav>, whose content is thematically linked, and this link is often expressed in the form of a heading.
To clarify the difference between other similar elements, the W3C section description further states:
Authors are encouraged to use the article element instead of the section element when it would make sense to syndicate the contents of the element.
The section element is not a generic container element. When an element is needed only for styling purposes or as a convenience for scripting, authors are encouraged to use the div element instead. A general rule is that the section element is appropriate only if the element's contents would be listed explicitly in the document's outline.
So, the difference between a <section> and an <article> is that the former represents part of a text, while the latter represents a whole text. The difference between <section> and <div> is that <section> is semantic: it represents a portion of a whole text, as determined by the author. A <div>, on the other hand, is simply a nonsemantic, block-level hook for styles and scripts.
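Here is a small sketch of how the three elements might sit together. The class name two-columns and all of the text are made up for illustration.

```html
<!-- article: a whole, standalone text -->
<article>
  <h2>Placeholder article title</h2>

  <!-- section: a meaningful part of that text, with its own heading -->
  <section>
    <h3>Placeholder part title</h3>

    <!-- div: no meaning of its own, only a hook for styles or scripts -->
    <div class="two-columns">
      <p>Placeholder left column.</p>
      <p>Placeholder right column.</p>
    </div>
  </section>
</article>
```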
That's gone some way to clarify the difference between <article> and <section> elements. Now when is it appropriate to nest these elements inside each other?
Once again, www.w3.org tells us in what situation it might be necessary to nest an article inside another article:
When article elements are nested, the inner article elements represent articles that are in principle related to the contents of the outer article. For instance, a blog entry on a site that accepts user-submitted comments could represent the comments as article elements nested within the article element for the blog entry.
So the chief context for nested article elements is user-generated comments. In the case of this site, it would be an important consideration if I were maintaining my own commenting system, say on WordPress or Drupal. But since I'm using Disqus for the comments, I don't need to worry too much. Another context that comes to mind might be testimonials from clients: each testimonial is a complete text in its own right, but it appears on a page that gathers these testimonials in one place.
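For a self-hosted commenting system, the nesting described in the W3C quote could be sketched roughly as follows. The headings and text are placeholders, and wrapping the comments in a section is my own assumption about how to group them.

```html
<article>
  <h2>Placeholder blog post title</h2>
  <p>Placeholder post body.</p>

  <section>
    <h3>Comments</h3>

    <!-- Each comment is a complete text in its own right,
         related to the outer article. -->
    <article>
      <h4>Comment from a placeholder reader</h4>
      <p>Placeholder comment text.</p>
    </article>

    <article>
      <h4>Comment from another placeholder reader</h4>
      <p>Placeholder comment text.</p>
    </article>
  </section>
</article>
```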
As explained above, a <section> is a generic container for a thematically linked part of a document. I guess <section> elements can be used to semantically break up any text into its component parts. As you'll see with this page, I've chosen to enclose each main part of the blog post in a section: the introduction, The <article> element, The <section> element, etc. (each of these are tags).
If a <section> is used to mark up a long part of a text, such as the chapter of a book, it's easy to see how that long chapter can also be broken up into smaller chunks. A book about animals (<article>) might contain a chapter on reptiles (<section>), which in turn contains sub-sections on chameleons (another <section>).
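The animal book example might be marked up along these lines. This is only a sketch, and the heading levels are my own choice.

```html
<!-- The whole book -->
<article>
  <h1>Animals</h1>

  <!-- A chapter -->
  <section>
    <h2>Reptiles</h2>

    <!-- A sub-section of the chapter -->
    <section>
      <h3>Chameleons</h3>
      <p>Placeholder text about chameleons.</p>
    </section>
  </section>
</article>
```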
Frequently, a <section>, for instance on a web site's home page, might contain a teaser to a recent blog post (guilty). I suppose this is a valid reason to place an <article> inside a <section>.
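A home page teaser could be sketched like this. The URL /blog/placeholder-post/ and the text are invented for the example.

```html
<section>
  <h2>Recent posts</h2>

  <!-- The teaser stands in for a complete blog post,
       so it sits in its own article. -->
  <article>
    <h3><a href="/blog/placeholder-post/">Placeholder post title</a></h3>
    <p>Placeholder teaser text.</p>
  </article>
</section>
```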
One final curiosity about <article> and <section> is that they both belong to a category of elements (along with <aside> and <nav>) that are called sectioning elements. As Aaron Gustafson explains in Adaptive Web Design, Second Edition, the idea behind sectioning elements was to overcome the problem of having only six levels of heading tags (h1 to h6). When we use a sectioning element, the heading hierarchy continues where its parent left off. So, as I understand it, if a parent element contains all h1 to h6 heading tags, an h1 tag in one of these sectioning elements is treated by HTML as if it were h7.
All of this has little meaning for most day-to-day web design, however, since browsers never actually implemented this outline behaviour, and these nested headings cannot easily be styled by CSS.
The roles of these two HTML elements are a little clearer for me now. <article> is for whole, complete texts, while <section> is for parts of texts. This same rule applies to nested elements. An <article> within an <article> is most likely appropriate for a user comment (whole text) within an article (whole text). A <section> within an <article> might be used to split up the introduction, main body paragraphs and conclusion (parts of a text) within an article (whole text).
On a side note, all of this may be moot, anyway. As I discovered at a recent talk about intertwingularity at Staffs Web Meetup, by Andy Wootton, we can try as much as we like to carve the world up into convenient, bite-size chunks by attaching labels like <article> and <section>. But in truth everything is interconnected, and our labels are rarely an accurate depiction of how the world really is. So, I guess the corollary is: do your best, throw in some semantic HTML tags, and then chill out!
Here are some more articles about the differences between <article> and <section> elements:
And here are a couple of articles on that last point, intertwingularity:
James Turner Web development / Tabletop RPGs