Learning Technology

The day that names run out

February 1st, 2011 at 03:02

Do a vanity search on Google for yourself: which pages come up? For me, it’s Facebook (as my name is in the title and the URL), LinkedIn (same reason), Vimeo (same reason) and YouTube. The Xpert presentation that Julian Tenney and I gave at Open Nottingham comes first, ahead even of my Vimeo account.

If you imagine the Internet as some form of notary, then you could possibly get an idea of my professional life from these links. It doesn’t find my Flickr or personal YouTube accounts, as they live under other names. Not deliberately, I might add; that was just the mood I was in when I made them. Even those of us who don’t use LinkedIn or Facebook understand what they are, and as such they give our names and lives a limited form of Internet provenance.

Google almost certainly knows this as well, and so it has a series of rules for avoiding spam and bringing back the best links. I would assume YouTube finishes first because Google, as YouTube’s owner, would like to drive traffic through to that site where possible. The fact that most of the URLs and page titles also contain my name suggests that Google favours these over pages where my name appears only in the document itself. Understanding the rules that Google uses has led to the development of a new trade: search engine optimisation (SEO for short). SEO people work to ensure that certain sites appear as high as possible in Google searches. We’ve come across a few of these rules above, but the Google algorithm is always changing, so keeping your results at the top of the search is very much a continual process.

However, if we know the word appearing in the URL makes for better results, then we have a head start.

Or do we? By and large we don’t control the URL for our content. More often than not an arbitrary string or number is used to represent us. On this blog, you can find all my blog posts by using this link:

http://blogs.nottingham.ac.uk/learningtechnology/author/cczpl/

But nowhere in the link is my name, and only a few people would know that CCZPL is my username, although a Google search for that word brings back results that definitely represent me.

So the issue arises of how the cataloguer, or the part of the cataloguing system that produces URLs, can best rectify this. I would assume, though, that URL customisation and SEO aren’t a concern of the day-to-day cataloguer. Most catalogues use an arbitrary identifier to represent an item. In using this arbitrary identifier in the URL, though, the cataloguer and their system make the item harder to find than if a more structured URL format were used. The identifier could be the title of the resource, the author, or both. The only problem would be if the author or title wasn’t unique. As long as titles and author names remain unique, arbitrary identifiers can be dropped.
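To make that concrete, here is a minimal sketch in Python of how a catalogue might build a URL path from author and title, keeping the arbitrary identifier only when the two together are not unique. Everything in it is an assumption made for illustration: the function names, the `author/title` layout, and the `taken` set standing in for whatever lookup a real catalogue would offer.

```python
import re
import unicodedata


def slugify(text):
    """Lower-case text, strip accents, and join words with hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def descriptive_path(author, title, item_id, taken):
    """Build an author/title path, adding the identifier only on a clash.

    `taken` is the set of paths already in use (a hypothetical stand-in
    for the catalogue's own record of existing URLs).
    """
    path = f"{slugify(author)}/{slugify(title)}"
    if path in taken:
        # Author and title are not unique, so fall back on the identifier.
        path = f"{path}-{item_id}"
    taken.add(path)
    return path
```

Called twice with the same made-up author and title, the first item would get a purely descriptive path and the second would pick up its identifier as a suffix, so the arbitrary number only appears where names genuinely collide.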

But we don’t have unique names, and if we aimed to, what would happen when we ran out of names? A first in, first out system where you took the name of the last person who died? Interesting, slightly scary, and perhaps with a hat-tip to Logan’s Run. We do have unique domain names on the Internet though, and almost all the common words in the English language have already been bought. But after the domain name, the rest of the web address is open for change. Could this be optimised for searching?

Changing titles and web addresses in this way would help the content be discovered. The only downside would be longer titles and URLs, perhaps looking a tad ugly. But is there really a movement for prettier URLs? I doubt that somehow. Perhaps the issue of the dreaded silos, and the need for silo APIs, could be removed with an increase in data-rich URLs and titles. Metadata creation remains an issue for many, and as a process it often struggles to deal with the scale of new items deposited.

If we assume that Google is the primary tool for content discovery, then the issue for the content holder is to find a way of maximising the chances of the resource being found. The repository no doubt has the most perfect metadata trapped inside it (the meta-silo, if you will), but this often doesn’t manifest itself in a way that can be searched by search engines.
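One way of letting that metadata out of the silo is simply to write it into the page a crawler sees. The following is a rough sketch only, assuming the repository can hand over a title, author and description for each item; the function name and the exact wording of the page title are my own invention, and how much weight any search engine gives these tags is not something I can vouch for.

```python
from html import escape


def head_tags(title, author, description):
    """Return HTML <head> lines exposing catalogue fields to crawlers."""
    return "\n".join([
        # Put both title and author in the page title, as argued above.
        f"<title>{escape(title)} by {escape(author)}</title>",
        f'<meta name="author" content="{escape(author)}">',
        f'<meta name="description" content="{escape(description)}">',
    ])
```

The `escape` call is there so that ampersands or quotation marks in a catalogue record don’t break the markup.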

The scope exists to look inside the metadata, to create an “intradata”. Intradata would function where the metadata is arbitrary (link values, for example) and is not catalogued but is defined by the code of the system. In cases such as this, the arbitrary nature runs contrary to the effort of cataloguing and also to the need for the content to be discovered, which is surely why we place it in the systems we do.

Could OER discovery be improved by mandating the use of author, title and OER in the URL and title?
