[This post is a work in progress, as I’ve not found time to finish it. It is only published publicly because I can’t get WordPress to make it visible without it also appearing on the front page or archive pages. Your comments will influence the outcome.]
As I noted in a comment, markup is a framework. We don’t need a CSS framework with special attribute names to give meaning to the document - the markup should be doing this.
The role of markup is to convey information, in human or computer readable form.
This could be presentational information or content - I have no problem with either (I do have a problem with mixing both in the same document… although HTML works).
Thus the fuss over the Blueprint CSS framework and the level of separation - well yes, fuss all you like, but many (most?) ‘valid’ ‘CSS-based’ websites are nothing more than presentational tag soup - the result of table-based designers jumping on the buzzwords and bastardising them to get their results.
I like things to be perfect and so want perfect separation, and my pragmatic friends have been telling me to get a life since 2003. I have, and have left that approach at the theoretical level - I see no fun in compromising by transforming the XML to something meaningless.
But in fact you have to. HTML is a presentational layer, with some semantics built-in. When we type <p>…</p> we are saying the content is a paragraph (supposedly), but subconsciously we are also saying we want it displayed as a paragraph, with the margins that entails.
Had HTML elements never had styles built-in then the situation might be different; we however are used to it. And I think we wouldn’t want it the other way round - our minds work on concepts like paragraphs.
Defining an abstract tag name and a set of rules to style it into a paragraph wouldn’t be done frequently - we’d probably call it <para> and have a global style system we used on all documents which made it display just like <p> does.
I would say that adding class names like span-4 is the least of our worries. If we’ve got to this stage of degradation, what harm does a little more do?
By the time we add one wrapper div, or a clearing div, we’ve polluted the page.
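To make that concrete, here is a minimal sketch of the kind of markup I mean. Only span-4 comes from the discussion above; the "wrapper" and "clear" class names and all the widths are assumptions for illustration, not Blueprint's actual CSS.

```html
<!-- Illustrative only: "wrapper", "clear" and the widths are hypothetical -->
<style>
  .wrapper { width: 760px; margin: 0 auto; }  /* exists only to centre the page */
  .span-4  { float: left; width: 230px; }     /* assumed width for a grid class */
  .clear   { clear: both; }                   /* exists only to end the floats */
</style>

<div class="wrapper">
  <div class="span-4">
    <p>Sidebar text</p>
  </div>
  <div class="clear"></div>
</div>
```

Of the three divs, only the one holding the paragraph groups any content; the other two are there purely for layout.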
In itself a class name has no semantic meaning. No text has semantics, save for those we give it. What structural class names lack, as Simon Willison has already pointed out, is memorability and a clear meaning for when we read through the code later.
There again, when you define new tags in XML, it is the human to whom they have semantic meaning, the human who empowers them. To a computer they are just characters to process. Any attempt to argue otherwise is to imply the computer has intelligence. I detect this when people argue over whether to wrap form elements in <p> or <div>. In that situation either has a valid meaning (but only because of the presentational styling applied to <p> by default).
[[Define semantic??]]
I’ll end by saying I don’t entirely grasp the meaning of semantic, when something is semantic and when something isn’t. I’ve used words here which if you understand a narrow meaning for then may not fit. Have a broad mind, and if you want to educate me on the real meanings (or an alternative word that would fit more accurately) go for it!
I’m no expert on what semantics really are - if you want one, talk to Steve Hesketh or any member of the webdesign-l mailing list.
So is HTML a semantic (structural) level, or presentational?
How does it differ from RTF, LaTeX etc, in its outcome? Every format developed carries some description of its content. The ‘modern’ XML-based forms are certainly my favourites (see above re: perfectionism) but all these previous ones worked.
Is there a difference in semantics? I’d say LaTeX (and possibly RTF, which I don’t know well) transmit as much meaning as HTML.
=======
Now to move onto the Blueprint framework which has been receiving so much attention.
It doesn’t deal with two important problems:
1. Source ordering
2. Equal-height columns
The first is the more serious of the two. When I have a standard 2 or 3 column page layout, I want the contents to come first - this is after all more important. Using absolute positioning this can be done, however I’ve found that to open up a whole new set of problems. Which leaves floats… so I give up and put the source code in the order needed to float the elements as needed.
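A minimal sketch of that compromise, assuming a two-column layout where both columns float left. The ids and widths are hypothetical, not taken from any real site or from Blueprint.

```html
<!-- Hypothetical ids and widths, for illustration only -->
<style>
  #sidebar { float: left; width: 200px; }
  #content { float: left; width: 540px; }
  #footer  { clear: both; }
</style>

<div id="sidebar">Navigation links</div>    <!-- first in source so it lands on the left -->
<div id="content">The article itself</div>  <!-- the important part, but second in source -->
<div id="footer">Footer</div>
```

Because both columns float left, their visual order follows the source order, so the navigation ends up ahead of the content in the markup.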
I tried using Blueprint on a simple new 2-column site I was building, but it proved too inflexible - the grid sections were too wide. Admittedly I didn’t spend long reading the documentation, but dived in with the grid and examples.
I chose the wrong column widths at the start, and when I needed to go back through my mark-up and change span-3 and span-8 to span-2 and span-9 I gave up and wrote my own CSS. A flexible-width system, based on the size of the container, would be acceptable, but as it stands I have widths defined in the CSS *and* the HTML. (After all, changing widths in the framework defeats the object - I could write my own CSS instead!)
I did appreciate the grid support in Blueprint - we all know that typography on a well thought-out grid looks gorgeous.
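Roughly what that change looks like in the markup. This is a sketch using only the class names mentioned above; real Blueprint markup may need additional classes or a container that I have left out.

```html
<!-- Before: the column widths are baked into the class names -->
<div class="span-3">Sidebar</div>
<div class="span-8">Content</div>

<!-- After: a layout change means editing the markup of every page -->
<div class="span-2">Sidebar</div>
<div class="span-9">Content</div>
```

Nothing in the stylesheet changes here; the layout decision lives in the HTML, which is the part that bothered me.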
=============
Does it matter if semantics are preserved in HTML markup?
——————-
To some yes, but in the end the web and web design today are commercial occupations, and what matters (in 99.9% of cases) is that the technology and approach used gets the job done, works for people, and pays back the investment made. If that can be done faster without worrying about ‘academic’ matters like the semantics of the markup (which, let’s face it, only technical people bother about) and how elements pollute the DOM, then that’s how it’s going to be done.
I’ve just reached the end of a long week dealing with CSS inconsistencies and browser bugs, so I’m not at my most positive about web technology. Clients don’t understand why it can’t be pixel perfect (they also don’t understand that it takes time to convert a Photoshop mockup to a web page - “It’s all laid out nicely and the graphics are there, what’s the problem?”) and I can’t blame them for that. If a printer delivered 4 boxes of brochures with the contents of each box looking different, and blamed it on using 4 different printing machines and “nothing he can do about it mate” I’d be pretty darn annoyed.
What do we mean by semantics? I’ve just spent an hour trying to create a CSS thumbnail gallery with definition lists (because I believe these to have more suitable semantics than an unordered list or divs) and failing, due to the odd-sized images and caption requirements I have. I’m certainly not ruling out lack of skill on my part, but if I can’t do it why should most people be able to? And does bending HTML to display an image gallery destroy the semantics, no matter what tags are used?
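For reference, a sketch of what a definition-list gallery might look like. This is not the exact code I was fighting with: the filenames, captions, class name and widths are all placeholders.

```html
<!-- Placeholder filenames and captions; the float and width are assumptions -->
<style>
  dl.gallery    { float: left; width: 160px; margin: 0 10px 10px 0; }
  dl.gallery dd { margin: 0; }
</style>

<dl class="gallery">
  <dt><img src="thumb-1.jpg" alt="First photo"></dt>
  <dd>Caption for the first photo</dd>
</dl>
<dl class="gallery">
  <dt><img src="thumb-2.jpg" alt="Second photo"></dt>
  <dd>Caption for the second photo</dd>
</dl>
```

With evenly sized thumbnails and short captions something like this can work; once the images and captions vary in size, the floated items end up with different heights and may snag on each other as they wrap.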
Semantics will always give way to ease of use.
======================
And (the million dollar question) where is the line between pragmatism and being slovenly and letting anything that’s easier be done? Which side of the line does a wrapper div go? A div to clear floated elements?
=======================
“if I have the tech-savviness of a guy who is barely able to program with html, will I be able to do simple things like add new designs and modify existing ones without pulling my hair out?”
We may say that people shouldn’t have such unrealistic expectations, but they do and we need to work with that. Microsoft Word has made them expect they can do everything they want themselves. Once that mindset’s in place, it’s incredibly difficult to go against the flow.
Saying “If they can’t write HTML they have no place on the web” is fine, and a view I’m sympathetic with, but for information sharing it limits people with valuable information who don’t have the technical skills (it also limits the people with junk, but that’s by the by…).
A two-tier web would be great, if the distinction can be enforced (“to be a pro you have to do it this way, everyone else can do whatever they like but can’t charge for it”).
I build a lot of content management systems, and am frequently asked by the more savvy users to enable TinyMCE’s table support, so they can lay out pages. When they do that in MS Word and there is no alternative toolbar in an online WYSIWYG editor, it’s very hard to explain to them that I won’t/shouldn’t because it doesn’t match up to web standards pipe-dreams — even though this is something they want in order to gain control of their website and serve their business needs!
‘Purity’ and ‘Practicality’ have nasty edge cases like this. (And a more granular CMS is not always a solution - nor are these users technical enough to work with HTML directly.)
=======================
What does it mean for something to be semantic? When is something semantic and when isn’t it? What defines if it has a semantic meaning? Markup cannot be semantic - it transmits the semantics(???)
“For example, with HTML and a tool to render it (perhaps Web browser software, perhaps another user agent), one can create and present a page that lists items for sale. The HTML of this catalog page can make simple, document-level assertions such as “this document’s title is ‘Widget Superstore’”. But there is no capability within the HTML itself to assert unambiguously that, for example, item number X586172 is an Acme Gizmo with a retail price of £199, or that it is a consumer product. Rather, HTML can only say that the span of text “X586172” is something that should be positioned near “Acme Gizmo” and “£199”, etc. There is no way to say “this is a catalog” or even to establish that “Acme Gizmo” is a kind of title or that “£199” is a price. There is also no way to express that these pieces of information are bound together in describing a discrete item, distinct from other items perhaps listed on the page.”
Source - http://en.wikipedia.org/wiki/Semantic_Web
References:
http://en.wikipedia.org/wiki/Semantic_HTML#Semantic_HTML
Summary questions:
- Do semantics have anything to do with code cleanliness? Does polluting a page with unnecessary structural markup (wrapper divs…) remove the semantics?
- How can any markup have semantics?
- Why is <div id="content"> any more semantic than <div id="s876rts">?
- How does (X)HTML semantic markup (at the level of <p> vs <div> vs <table>) make any difference to anyone?
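To make the question about id names concrete, a small sketch. The style rule and text are made up; the two ids are the ones from the question.

```html
<!-- The widths are hypothetical; the ids are the two from the question above -->
<style>
  #content, #s876rts { width: 540px; margin: 0 auto; }
</style>

<div id="content">A browser lays out this block...</div>
<div id="s876rts">...exactly as it lays out this one.</div>
```

To the browser both ids are just identifiers to match against the stylesheet; any difference in meaning exists only for the person reading the source.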
Changelog:
29 Aug, 9pm: Added summary questions, cleaned up (some) muddled writing that was clear to me but no one else!
4 comments ↓
“If a printer delivered 4 boxes of brochures with the contents of each box looking different, and blamed it on using 4 different printing machines …”
that’s actually not a bad thing for a client to say, because it has a very great answer which the client can surely understand
put it to the client this way: “the printer is at your door with the ad copy and the paper stock, and wants to know where’s your print machine?”
that’s the funny thing about the difference between the web and print — on the web, the user has the print machine, and every user’s print machine is different!!
therefore, we need to plan carefully how to deliver the content, so that it will look acceptable on as many print machines as possible, without needing to spend 2x, 3x, … umpteen times the cost (clients love this part) to customize it to that many different print machines
and regarding the very last quote from the wiki page about semantics, i don’t buy that argument, because for the example data given, that would be laid out in a TABLE, which has quite acceptable semantics
Curious place to stop the Acme quote since the next paragraph answers the question of how to meaningfully describe the data through the semantic web:
“The semantic web addresses this shortcoming, using the descriptive technologies Resource Description Framework (RDF) and Web Ontology Language (OWL), and the data-centric, customizable Extensible Markup Language (XML). These technologies are combined in order to provide descriptions that supplement or replace the content of Web documents. Thus, content may manifest as descriptive data stored in Web-accessible databases, or as markup within documents (particularly, in Extensible HTML (XHTML) interspersed with XML, or, more often, purely in XML, with layout/rendering cues stored separately). The machine-readable descriptions enable content managers to add meaning to the content, i.e. to describe the structure of the knowledge we have about that content. In this way, a machine can process knowledge itself, instead of text, using processes similar to human deductive reasoning and inference, thereby obtaining more meaningful results and facilitating automated information gathering and research by computers.”
Standards, then, help establish the meaning and ultimately the veracity of content. Stopping the process compromises the meaning. Browsers which don’t fully support standards block effective research.
To be sure there is a long way to go but what we do now sets the direction towards or away from this definition of a semantic web.
drew
Drew: I stopped the quote there because the current debate is about using HTML (potentially XHTML) and the semantics transmitted through this.
At some point in the cycle, from semantically stored data to humans reading it, presentation is going to corrupt the purity. I guess what RDF and similar gain is a clear separation from the presentation layer, which is confused in (X)HTML. That, however, is deviating from the post above into a whole new topic.
Seems to me that the second paragraph nicely answers your summary questions by reminding us that the purpose of a semantic web is machine-readability of an author’s intent. Semantics describes and extends the data/content so it becomes more than just text or multi-media by describing the data and/or the source of the data. This is achieved by following specified standards which allow direct reading or mediating through intermediary standards which allow conversion between disparate systems.
For the most part this currently doesn’t mean much. The advantages of progressive enhancement/separation of content and presentation presently lie in the practical realm of light weight code, ease of maintenance, etc. When more sophisticated agents using sparql, owl, rif, rdf, xml, trust, etc come into wider use, not following standards may well result in pages not being displayed, being severely broken in those agents, or worst of all broken code would mean not showing up in search results at all depriving the rest of us of certain deathless prose.
But fear not! There will also still be browsers making best guess efforts to render the unrenderable. Individual users and large entities will still be able to follow those standards they choose to and ignore the rest. You may get your two-tiers with perfect enforcement — self selected code based separation.
So your answers are: 1a) Yes. 1b) It’s not pollution, so no. 2) Through the use of standards which assign or convert descriptions about the data. 3) Equivalent. Semantic value derives from additional descriptive technologies. 4) Not a lot of difference yet. Microformats offer a small hint at what might be, as does MathML.