What is a Web Search Engine?
A web search engine is a programme that reads content available on the internet. Having read it, the search engine builds an index of keywords and an index of the corresponding documents in which those keywords were found.
In general, a search engine sends out a crawler (or spider) to read as many documents as possible. Another programme, called an indexer, then processes these documents and creates a catalogue based on the words contained in each one. Each search engine uses a proprietary algorithm to create its indices so that, ideally, only meaningful results are returned for each query.
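The indexing step described above can be sketched as a simple inverted index, which maps each word to the documents that contain it. This is only an illustrative sketch: real engines use proprietary algorithms, and the names `tokenize`, `build_index` and the sample `documents` are assumptions made for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

# Hypothetical pages that a crawler might have fetched.
documents = {
    "page1": "Search engines index the web",
    "page2": "A crawler reads web pages",
}

index = build_index(documents)
print(sorted(index["web"]))      # ['page1', 'page2']
print(sorted(index["crawler"]))  # ['page2']
```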
How Web Search Engines Work
Search engines are instrumental in finding specific information across the vast expanse of the World Wide Web. With almost 231 million unique websites, it would be virtually impossible to find anything on the internet without sophisticated search engines unless one knew the specific URL. But how do search engines work? And why are some search engines more effective than others?
When a user enters a search query, a web search engine searches its prearranged keyword index and returns the web pages most relevant to the searched keyword or keyword combination.
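As a rough illustration of looking up a keyword combination in a prearranged index, the sketch below returns only the pages that contain every keyword in the query. The `search` function and the sample `index` are hypothetical, and ranking by relevance is left out to keep the example short.

```python
def search(index, query):
    """Return the ids of documents containing every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return set()
    # Start with the documents for the first keyword, then narrow down.
    results = set(index.get(keywords[0], set()))
    for word in keywords[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical prearranged index: keyword -> documents containing it.
index = {
    "search": {"page1", "page3"},
    "engine": {"page1", "page2", "page3"},
    "crawler": {"page2"},
}

print(sorted(search(index, "search engine")))   # ['page1', 'page3']
print(sorted(search(index, "search crawler")))  # []
```

Note that the query never touches the pages themselves; it is answered entirely from the index.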
We can identify three basic types of search engines.
- Search engines powered by robots (called crawlers, ants or spiders)
- Search engines powered by human submissions
- Search engines that are a hybrid of the two
Robot-based search engines use automated software (called crawlers) to visit a website, read the information on the site itself, read the site’s meta content and follow the links that the site contains, indexing the linked websites in the same way. The crawler returns all of this information to a central repository, where the data is indexed. The crawler periodically returns to each site to check for changes. How often this happens is determined by the search engine’s administrators, who take into account how frequently the website changes as well as any guidelines set by the site’s administrators.
Human-powered search engines rely on people to submit information, which is subsequently indexed and catalogued. Only information that is submitted is put into the index. However, this type of search engine is fast disappearing because it is extremely difficult to use and maintain.
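The link-following behaviour of a crawler can be sketched as a breadth-first traversal. To keep the example self-contained and runnable offline, it crawls a hypothetical in-memory "web" (`FAKE_WEB`) rather than fetching over HTTP. Revisit scheduling and site administrators' guidelines are deliberately left out, and the class and function names are assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical in-memory "web": URL -> HTML. A real crawler would fetch over HTTP.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">B</a> <a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://a.example/">A</a>',
    "http://c.example/": "<p>No links here</p>",
}

def crawl(start_url, fetch):
    """Breadth-first crawl; returns a repository of URL -> page content."""
    repository = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in repository:
            continue  # already visited
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved
        repository[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        queue.extend(parser.links)
    return repository

repository = crawl("http://a.example/", FAKE_WEB.get)
print(sorted(repository))
```

Starting from a single URL, the crawler reaches all three pages because each is linked from a page it has already read. The resulting repository is what an indexer would then process.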
In both cases, when a user queries a search engine to locate information, the user is not actually searching the Web. Instead, the search engine queries the database(s) that it maintains. This explains why a search on a commercial search engine, such as Yahoo! or Google, will sometimes return results that are, in fact, dead links. Because the search results are based on the index, if the index hasn’t been updated since a Web page became invalid, the search engine still treats the page as an active link. It will remain that way until the index is updated.
Another question that arises is why the same search on different search engines produces different results. The simple answer is that not all search engines use the same index or the same algorithm to rank web pages. The quality of the index depends on what the spiders find or what humans have submitted. The search algorithm is what a search engine uses to determine how relevant the information in the index is to what the user is searching for.
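The dead-link effect can be reproduced with a toy example in which the index is a snapshot that lags behind the live web. Everything here (the `index`, `live_pages`, `search` and `reindex` names and the sample URLs) is hypothetical and exists only to illustrate the point.

```python
# Hypothetical snapshot taken when the crawler last visited.
index = {"seo": ["http://a.example/guide", "http://b.example/tips"]}

# Hypothetical live web now: b.example/tips has since been removed.
live_pages = {"http://a.example/guide"}

def search(keyword):
    """Answer from the index only; the live web is never consulted."""
    return index.get(keyword, [])

def reindex():
    """Simulate a crawler revisit that drops pages which no longer exist."""
    for keyword, urls in index.items():
        index[keyword] = [u for u in urls if u in live_pages]

results = search("seo")
dead = [u for u in results if u not in live_pages]
print(dead)            # ['http://b.example/tips']

reindex()
print(search("seo"))   # ['http://a.example/guide']
```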
One of the elements that a search engine algorithm looks for is the frequency and location of keywords on a Web page. Pages on which a keyword appears more frequently are typically considered more relevant. However, search engine technology is becoming increasingly sophisticated in its attempts to discourage what is known as keyword stuffing, or spamdexing: the practice of repeating keywords artificially to inflate a page’s ranking.
Search engine algorithms also analyse the manner in which web pages are linked to one another. By doing so, a search engine can determine what a page is about (provided that the keywords on a page are related to the pages it is linked with, and vice versa) and whether a given page is important for a given keyword. Because these signals can be manipulated, there is an ongoing cat-and-mouse game between webmasters who try to build artificially ranked websites and search engines that try to prevent such websites from appearing in their results.
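A minimal sketch of scoring by keyword frequency and location might look like the following. The weights (`TITLE_WEIGHT`, `MAX_DENSITY`) and the idea of capping keyword density are assumptions chosen for illustration, not how any particular search engine works. The cap simply shows one way that repeating a keyword can be made to stop paying off.

```python
import re

TITLE_WEIGHT = 3.0   # assumed: a title match counts more than a body match
MAX_DENSITY = 0.10   # assumed cap so that repeating a keyword stops paying off

def words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(page, keyword):
    """Score a page by where the keyword appears and how often."""
    keyword = keyword.lower()
    title_hit = 1.0 if keyword in words(page["title"]) else 0.0
    body = words(page["body"])
    density = body.count(keyword) / len(body) if body else 0.0
    return TITLE_WEIGHT * title_hit + min(density, MAX_DENSITY) * 10

# Hypothetical pages: one ordinary, one stuffed with the keyword.
pages = {
    "guide": {"title": "Coffee brewing guide",
              "body": "How to brew coffee at home with fresh beans"},
    "stuffed": {"title": "Welcome",
                "body": "coffee coffee coffee coffee coffee coffee"},
}

ranked = sorted(pages, key=lambda name: score(pages[name], "coffee"), reverse=True)
print(ranked)  # ['guide', 'stuffed']
```

Without the density cap, the stuffed page would win on raw frequency alone, which is the behaviour that anti-spamdexing measures aim to prevent.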
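Link analysis can be illustrated with a simplified PageRank-style calculation, in which each page repeatedly shares its score among the pages it links to. The text above does not name a specific algorithm, so this is a swapped-in, well-known technique used purely as an example. The `link_scores` function, the damping value and the sample graph are assumptions.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Iteratively share each page's score among the pages it links to."""
    pages = list(links)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of inbound links.
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages  # a page with no links shares with everyone
            share = damping * scores[page] / len(targets)
            for target in targets:
                new[target] += share
        scores = new
    return scores

# Hypothetical link graph: page -> pages it links to.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
    "orphan": ["home"],
}

scores = link_scores(links)
best = max(scores, key=scores.get)
print(best)  # home
```

The page that receives the most links ends up with the highest score, while a page that nobody links to keeps only the baseline.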
As a leading UK SEO company, SEO Vantage is in a unique position to evaluate and implement a campaign that best suits your requirements. For further information, please contact us. For a free SEO analysis of your website, visit our free SEO Analysis page.