Continuing our discussion on how to find
information on the Internet, we look at ways to make
search results more fruitful.
At the heart of any search endeavour,
no matter what kind of search tool you are using,
there are three areas that can significantly affect
your search results:
- Content of search engine
- Search logic or algorithm
- Presentation of search results
Content of search engine
A search engine collects information for
its database by accepting listings sent by websites
that want exposure, by gathering pages with its own
spiders (please see earlier discussion), or by simply
using the databases of other search engines (e.g. meta
search engines). There are two issues in the process
that you, as an information searcher, should be aware of:
- Focus of search engine
- Degree of information collection
There are thousands of search engines,
and each has a focus area. A few big ones like Yahoo!,
Alta Vista or Google are universal: they accept information
on any subject or from any geographical area so long
as the website satisfies their respective editorial
policies. Most others, however, are selective on content.
For example, country-specific search engines accept
webpages only from or about the country concerned.
Subject-specific search engines do not accept webpages
on unrelated subjects. Even universal search engines
like Yahoo! and MSN have country-specific versions
(e.g. Yahoo! India).
So, if you are looking for information on Australia,
look for Australia-specific search engines.
There are many sources on the Net that
compile information on search engines. Following
are a few for your convenience:
Degree of information collection
Though the actual working of spiders is a closely
guarded secret in many cases, it is generally assumed
that they start with a historical list of links, such
as server lists and lists of the most popular or
best sites, and follow the links on these pages to
find more links to add to the database. A spider could
send back just the title and URL of each page it visits,
or just parse some HTML tags, or it could send back
the entire text of each page. The coverage and degree
of indexing can have a bearing on the quality of your
search results.
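The link-following behaviour described above can be sketched in a few lines of Python. This is only an illustration of the generally assumed approach, not how any particular search engine works: the dictionary TOY_WEB stands in for the real Web, the example URLs are made up, and the function name crawl is our own. The spider here records only the title and URL of each page, the shallowest of the indexing options mentioned.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page title, outgoing links).
# A real spider would fetch each page over the network instead.
TOY_WEB = {
    "http://a.example/": ("Server list", ["http://b.example/", "http://c.example/"]),
    "http://b.example/": ("Export guide", ["http://c.example/", "http://d.example/"]),
    "http://c.example/": ("Trade news", ["http://a.example/"]),
    "http://d.example/": ("Bazaar directory", []),
}


def crawl(seed_urls, web, max_pages=100):
    """Breadth-first crawl from a seed list; returns {url: title}."""
    index = {}
    queue = deque(seed_urls)
    seen = set(seed_urls)  # URLs already queued, so no page is visited twice
    while queue and len(index) < max_pages:
        url = queue.popleft()
        page = web.get(url)
        if page is None:  # dead link: nothing to record
            continue
        title, links = page
        index[url] = title  # shallow indexing: title and URL only
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


if __name__ == "__main__":
    for url, title in crawl(["http://a.example/"], TOY_WEB).items():
        print(url, "-", title)
```

Lowering max_pages shows how limited coverage affects what a searcher can find: a page the spider never reached cannot appear in any result.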
Many search engines use 'fields' to store
information collected from various parts of a webpage.
The title, the URL, image tags, hypertext links etc.
are common fields on a Web page. Field searching allows
the searcher to designate where a specific search
term must appear. Because it does not look for words
anywhere on a Web page, field-specific searching can
considerably reduce unwanted or junk information in
search results.
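As a rough illustration of how a page might be split into fields, the sketch below uses Python's standard html.parser module. The class name FieldExtractor, the function extract_fields, the field names and the sample page are all assumptions made for this example; real search engines may store quite different fields.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Sorts the parts of one page into title, link, image and text fields."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": [], "link": [], "image": [], "text": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            self.fields["link"].append(attrs["href"])  # link target
        elif tag == "img" and attrs.get("alt"):
            self.fields["image"].append(attrs["alt"])  # image description

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        words = data.split()
        if words:
            # Words inside <title> go to the title field, all others to text.
            key = "title" if self._in_title else "text"
            self.fields[key].extend(words)


def extract_fields(url, html):
    """Return a dict of fields for one page, including its URL."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    record = dict(parser.fields)
    record["url"] = [url]
    return record


# Made-up sample page for demonstration only.
SAMPLE = (
    "<html><head><title>The Great Indian Bazaar</title></head>"
    "<body><p>Export leads from <a href='http://www.example.com/export'>"
    "our partners</a>.</p><img src='logo.gif' alt='bazaar logo'></body></html>"
)

if __name__ == "__main__":
    for name, values in extract_fields("http://www.example.com/", SAMPLE).items():
        print(name, values)
```

Once each page is stored this way, a search restricted to one field only has to look at that field's entry, which is what keeps unrelated matches out of the results.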
For example, Alta Vista supports the following
field searches:
text:infobanc
Finds pages that contain the specified
text (i.e. infobanc) in any part of the page other
than an image tag, link, or URL.
title:'The Great Indian Bazaar'
Finds pages that contain the specified
phrase 'The Great Indian Bazaar' in the page title
(which appears in the title bar of most browsers).
url:text
Finds pages with a specific word or phrase
in the URL. For example, url:export will find all
pages on all servers that have the word export anywhere
in the host name, path, or filename.
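To make the idea concrete, here is a minimal Python sketch of field-restricted matching in the spirit of the searches above. It is not Alta Vista's actual implementation: the page records in PAGES are invented, the function names parse_query and field_search are our own, and simple case-insensitive substring matching is assumed for brevity.

```python
# Hypothetical page records; each key is a field of the page.
PAGES = [
    {"url": "http://www.example.com/export/index.html",
     "title": "The Great Indian Bazaar",
     "text": "Trade leads and company listings"},
    {"url": "http://www.example.org/news.html",
     "title": "Export news",
     "text": "infobanc publishes export statistics"},
]


def parse_query(query):
    """Split 'field:term' into (field, term); strip quotes around a phrase."""
    field, sep, term = query.partition(":")
    if not sep:
        # No field given: assume the searcher means the body text.
        return "text", query.strip().strip("'\"")
    return field.strip().lower(), term.strip().strip("'\"")


def field_search(query, pages):
    """Return URLs of pages whose named field contains the term."""
    field, term = parse_query(query)
    term = term.lower()
    # Only the requested field is examined; other fields are ignored.
    return [p["url"] for p in pages if term in p.get(field, "").lower()]


if __name__ == "__main__":
    for q in ("text:infobanc", "title:'The Great Indian Bazaar'", "url:export"):
        print(q, "->", field_search(q, PAGES))
```

Note that url:export matches only the first record even though the second one mentions export in its title and text, which is the narrowing effect field searching is meant to give.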
More search tips in coming issues.