|
The search engine features chart below is designed primarily for webmasters who care about how crawler-based search engines index their sites. It provides a summary of important factors and features that can affect how sites are indexed and ranked. Full explanations of items can be found immediately below the comparison chart.
Human-powered search engines like the Open Directory are not listed on this chart because they do not crawl the web to create their listings. See the How Search Engines Work page for an explanation of the differences between crawler-based and human-powered services.
This chart covers the crawlers of AltaVista, FAST, Google, Inktomi and Teoma. Some of these crawlers power other search engines.
| Crawling | Yes | No | Notes |
|---|---|---|---|
| Deep Crawl | FAST, Google, Inktomi | AltaVista, Teoma | |
| Frames Support | All | n/a | |
| robots.txt | All | n/a | |
| Meta Robots Tag | All | n/a | |
| Paid Inclusion | All but... | Google | |
| Full Body Text | All | n/a | Some stop words may not be indexed |
| Stop Words | AltaVista, Inktomi, Google | FAST | Teoma unknown |
| Meta Description | All provide some support, but AltaVista, FAST and Teoma make the most use of the tag | | |
| Meta Keywords | Inktomi, Teoma | AltaVista, FAST, Google | Teoma support is "unofficial" |
| ALT text | AltaVista, Google, Teoma | FAST, Inktomi | |
| Comments | Inktomi | Others | |
Deep Crawl
All crawlers will find pages to add to their web page indexes, even if those pages have never been submitted to them. However, some crawlers are better at this than others. This section of the chart shows which search engines are likely to do a "deep crawl" and gather many pages from your web site, even if these pages were never submitted. In general, the larger a search engine's index is, the more likely it is to list many pages per site.
Frames Support
This shows which search engines can follow frame links. Those that can't will probably miss listing much of your site. However, even for those that can, having individual frame links indexed can pose problems.
robots.txt
The robots.txt file is a means for webmasters to keep search engines out of their sites. More information about robots.txt can also be found here:
The Web Robots Pages: The Robots Exclusion Protocol
http://www.robotstxt.org/wc/exclusion.html
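
As a rough illustration of how a well-behaved crawler might consult robots.txt before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser` module. The file contents, the `ExampleBot` and `AnyCrawler` names, and the example.com URLs are all made up for the example; none of them come from the chart above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: every crawler is kept out of /private/,
# and one made-up crawler ("ExampleBot") is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether a given crawler may fetch a given URL.
print(parser.can_fetch("AnyCrawler", "http://www.example.com/index.html"))
print(parser.can_fetch("AnyCrawler", "http://www.example.com/private/a.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
```

With these rules, the first check is allowed and the other two are refused. In practice a crawler would download the file from the site root rather than use a hard-coded string.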
Meta Robots Tag
This is a special meta tag that allows site owners to specify that a page shouldn't be indexed. It is explained more on the Meta Tags page. More details can also be found here:
The Web Robots Pages: The Robots META tag
http://www.robotstxt.org/wc/exclusion.html
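
To make the tag concrete, the sketch below reads a page's meta robots tag with Python's standard `html.parser` module and reports the directives it finds. The sample page, the `RobotsMetaParser` class, and the `robots_directives` helper are hypothetical names chosen for this example; how each search engine actually processes the tag is not shown here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives listed in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values keep their original case.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Made-up page asking crawlers not to index it or follow its links.
PAGE = (
    '<html><head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"></head>'
    "<body>Hello</body></html>"
)

directives = robots_directives(PAGE)
print("noindex" in directives)   # page asks not to be indexed
print("nofollow" in directives)  # page asks that its links not be followed
```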
Paid Inclusion
Shows whether a search engine offers a program where you can pay to be guaranteed that your pages will be included in its index. This is NOT the same as paid placement, which guarantees a particular position in relation to a particular search term.
Full Body Text
All of the major search engines say they index the full visible body text of a page, though some will not index stop words or will exclude copy deemed to be spam (explained further below). Google may not index past the first 101K of long pages.
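
A quick way to see whether a long page runs past that figure is to measure its size. The sketch below is only an approximation: it assumes "101K" means 101 * 1024 bytes of raw HTML and that the page is encoded as UTF-8, neither of which is stated in the text above. The `bytes_past_limit` helper and the sample pages are made up for illustration.

```python
# Assumption: "101K" is read as 101 * 1024 bytes of raw HTML.
LIMIT_BYTES = 101 * 1024


def bytes_past_limit(html: str, limit: int = LIMIT_BYTES,
                     encoding: str = "utf-8") -> int:
    """Return how many bytes of the page fall beyond the limit (0 if none)."""
    size = len(html.encode(encoding))
    return max(0, size - limit)


# Two made-up pages: one short, one padded well past the limit.
short_page = "<html><body>" + "word " * 100 + "</body></html>"
long_page = "<html><body>" + "word " * 30000 + "</body></html>"

print(bytes_past_limit(short_page))  # 0: the whole page fits
print(bytes_past_limit(long_page))   # positive: the tail may go unindexed
```

If the result is positive, moving important copy nearer the top of the page or splitting the page are the obvious remedies.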
Stop Words
Some search engines either leave out words when they index a page or may not search for these words during a query. These "stop words" are excluded as a way to save storage space or to speed searches.
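
The idea can be sketched with a toy index that drops stop words both when a page is indexed and when a query is run. The stop-word list, the sample documents, and the function names below are illustrative assumptions; real search engines use their own lists and far more elaborate indexes.

```python
import re
from collections import defaultdict

# Illustrative stop-word list; each engine decides its own.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def index_terms(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [word for word in words if word not in stop_words]


def build_index(docs):
    """Map each remaining term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in index_terms(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every non-stop word in the query."""
    terms = index_terms(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


docs = {
    1: "The history of the web",
    2: "A guide to the search engines",
}
index = build_index(docs)

print(sorted(index))                   # stop words never reach the index
print(search(index, "the history"))    # "the" is ignored at query time
```

Because "the" is never stored, the index is smaller, and a query made up only of stop words returns nothing in this sketch.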
Meta Description
All the major crawlers support the meta description tag to some degree. The ones actually named on the chart are very consistent in using it. If you have a meta description tag on your pages, you'll most likely see the content used in some way.
The Meta Tags page explains how to use the meta description tag.
Meta Keywords
Shows which search engines support the meta keywords tag, as explained on the Meta Tags page.
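
For readers who want to check what their own pages expose, here is a small sketch that pulls the meta description and meta keywords values out of a page using Python's standard `html.parser` module. The sample page and the `MetaTagParser` and `meta_tags` names are hypothetical; the sketch only shows what a crawler could read, not how any engine weighs it.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name:
            # Keep the first value if a tag name is repeated.
            self.meta.setdefault(name, attr.get("content") or "")


def meta_tags(html: str) -> dict:
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


# Made-up page with both tags present.
PAGE = """\
<html><head>
<title>Example Widgets</title>
<meta name="description" content="Hand-made widgets shipped worldwide.">
<meta name="keywords" content="widgets, hand-made, gifts">
</head><body>Welcome.</body></html>
"""

tags = meta_tags(PAGE)
print(tags.get("description"))
print([kw.strip() for kw in tags.get("keywords", "").split(",")])
```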
ALT Text / Comments
This shows which search engines index ALT text associated with images or text in comment tags.
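
The sketch below shows where this text lives in a page by collecting image ALT text and HTML comments with Python's standard `html.parser` module. The sample markup and the `AltAndCommentParser` class are made up for the example; whether a given engine indexes either kind of text is what the chart above records.

```python
from html.parser import HTMLParser


class AltAndCommentParser(HTMLParser):
    """Collects ALT text from <img> tags and the text of HTML comments."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []
        self.comments = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt and alt.strip():
                self.alt_texts.append(alt.strip())

    def handle_comment(self, data):
        text = data.strip()
        if text:
            self.comments.append(text)


# Made-up page fragment with one image and one comment.
PAGE = """\
<body>
<!-- widget catalog for spring -->
<img src="widget.gif" alt="Blue hand-made widget">
<p>Our newest widget.</p>
</body>
"""

parser = AltAndCommentParser()
parser.feed(PAGE)
print(parser.alt_texts)
print(parser.comments)
```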
|