2.2.10

The Deep Web (Class 18) - Chris Edwards

How do we browse things that are not browsable?

Deep/Invisible/Hidden/Dark Web
The idea centres on searchability: content that is not indexed by search engines is unlikely to be found.

So, how do we find things on the dark web? Traditional browsing, social bookmarking, word of mouth, blogs.

Libraries as an analogy: they are highly organised and can be browsed or searched via an OPAC, whereas a closed-stack library with no OPAC is of little use to the public. Each website is like a library, with a "shallow web" (publicly accessible, its existence advertised) and a "deep web" (not open or known to the public). "Deep" does not mean concealed, merely obscure.

Organisation leads to usability because structure enables automation. Web indexes create a highly organised structure external to the content, and that structure is what makes finding your material so much quicker.
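On the site's side, one concrete way to hand crawlers an organised structure that lives outside the content itself is a sitemap. A minimal sketch in the standard sitemaps.org format, with an invented URL:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/collections/maps/</loc>
      </url>
    </urlset>

The file simply lists the pages you want found, independently of how (or whether) they are linked from elsewhere, which is exactly the kind of external structure an index can automate against.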

Why is it Hidden?
Intentionally: content may sit behind a log-in, a site's reputation may be fine without its deep content being indexed, or it may be technically difficult to expose the data in an indexable form (see the snippet after this list).
Accidentally: laziness, carelessness, indifference, or lack of knowledge.
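For the intentional case, a page can ask well-behaved crawlers not to index it at all. A minimal sketch using the standard robots meta tag (the page name is made up for illustration):

    <!-- members-only.html: keep this page out of search indexes -->
    <meta name="robots" content="noindex, nofollow">

Placed in the page's head, this tells compliant bots not to add the page to their index or follow its links; the content stays reachable to people who know the URL, i.e. it stays in the deep web.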

Data and Metadata
Metadata sit at a higher level of abstraction than the data they describe, and that abstraction brings greater utility. Adding good-quality metadata makes your data more findable. Metadata are usually concise, precise, well specified, and uniform across a variety of content. Web authors can provide their own metadata (self-cataloguing).
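Self-cataloguing on the web mostly means filling in the document head. A sketch of what that can look like (the title, description, and keywords are invented for illustration):

    <head>
      <title>Special Collections: 19th-century Maps</title>
      <meta name="description" content="Browsable catalogue of digitised 19th-century maps.">
      <meta name="keywords" content="maps, cartography, special collections">
    </head>

The title and description are what most search engines actually display, so they repay the effort more than the keywords list, which many engines now largely ignore.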

Complement any search facility with a browsing interface. Avoid relying on JavaScript for navigation: bots don't handle it well. Keep URLs distinctive (the URL should determine the content). Flickr is a good example of a site that is both searchable and browsable.
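One way to read that advice: give crawlers plain links with distinctive URLs rather than navigation that only works through script. A sketch with invented paths and an invented script function:

    <!-- Browsable, crawlable navigation: plain links, one distinctive URL per page -->
    <nav>
      <a href="/maps/1850s/">Maps of the 1850s</a>
      <a href="/maps/1860s/">Maps of the 1860s</a>
    </nav>

    <!-- Harder for bots: the destination only exists inside a script call -->
    <a href="#" onclick="loadMaps('1860s')">Maps of the 1860s</a>

The first form gives every page an address a robot can fetch and a person can bookmark; the second hides the content behind behaviour a crawler may never execute.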

The HTML IMG element requires an ALT attribute: it improves accessibility for vision-impaired users and lets images be indexed by keyword.
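A minimal example (file name and text invented for illustration):

    <img src="reading-room.jpg" alt="Main reading room with open stacks">

Without the alt text the image is an opaque binary to both a screen reader and an indexing robot; with it, both get something searchable.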

Remember: not only humans are looking at your site, so love your robot friends! (Or Clamps will initiate the clampage...)
