Click here for a trial version of the site search engine :
Download >>
Click here to purchase :
Purchase >>
Summary
  • The cost of searchdb is 49 US Dollars per installation including source code
  • Set up via web pages
  • Indexes dynamic and static pages
  • Fixed text may be changed
  • Uses either MS Access or SQL Server

Installation

Installation of searchdb is the same as for any ASP.NET version 2 application and is normally straightforward.

Using IIS, create a virtual directory where you want the search application to be installed, usually named searchdb.

Copy or ftp all the files to the folder.

The web.config file should be placed at the root of the site. If a web.config file already exists at the root of your site for another application, copy the settings inside the 'appSettings' section of the searchdb web.config file into the 'appSettings' section of your existing web.config file.

The database

The application may use either a Microsoft Access or a SQL Server database.

Microsoft Access Database

    If you are using a Microsoft Access database (searchdb.mdb), it may be located wherever you want - you define its location in the web.config file. Depending on the host server you are using, you may be restricted to a particular directory for the Access database. Some hosts define a specific location for databases because of security and permissions.

    The Access database file must have write permission, otherwise you will get a database update error when you attempt to crawl sites.

    Make sure that the FOLDER where the MS Access file is located also has write access, because MS Access has to create a lock file there.

    If you are using a hosting organisation, they usually identify a folder outside the root of the web site which has the correct permissions.

    Otherwise you have to set the permissions yourself: right click on the Access database file, select Properties and then the Security tab, and change the permissions for the appropriate user.

    I have created a button on the define site admin page which tests if the database can be written to.
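
If you have shell access to the server and want to check the folder before running the admin page test, a rough sketch of such a check is shown below. It is not how the searchdb button is implemented; the function name folder_is_writable is made up for this example, and the script must run under the same account as the web application for the result to mean anything.

```python
import os
import tempfile


def folder_is_writable(folder: str) -> bool:
    """Return True if a file can be created and deleted in `folder`.

    Hypothetical helper: it creates a real temporary file, which mirrors
    what MS Access needs to do when it writes its lock file next to the
    .mdb file.
    """
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=folder, prefix="searchdb_", suffix=".tmp"):
            pass
        return True
    except OSError:
        # Covers a missing folder as well as a permission failure.
        return False


if __name__ == "__main__":
    # Check the current directory as a stand-in for the database folder.
    print(folder_is_writable(os.getcwd()))
```
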

SQL Server Database

    If you are using SQL Server, a SQL installation script is provided (searchdb.sql). First create a database called searchdb, then create a new user for this database, and finally run the SQL script against the database. The script will create all the tables and set up default values where appropriate.

    The connection string for the database must be entered into the web.config file.

web.config file

The application settings are within the web.config file.

You must include the database connection string to identify the database - the other settings are optional and default settings will be used if they are not included.


<configuration>
   <appSettings>
    <add key="pg_CrawlerConnect" value="your provider string" />

    <add key="pg_ButtonText" value="Search" />
    <add key="pg_Instructions" value="Enter one or two words here" />
    <add key="pg_NoRecordsFoundText" value="No records found - try again" />
    <add key="pg_XRecordsFoundText" value="records found" />

    <add key="pg_CodePage" value="1252" />

    <add key="pg_ErrorLogging" value="true" />

   </appSettings>
</configuration>


Settings

The application settings in the web.config file allow you to define your own static text so that other languages may be accommodated.

The file search.aspx is a basic search page displaying the search engine results. Modify search.aspx to fit in with your site design layout.

The user options in web.config are :

pg_CrawlerConnect   This is the data provider connection string for the database.

You must construct the provider string which will depend on the type of database you are using.

If it is an Access database then it will be of the form :

<add key="pg_CrawlerConnect" value="Provider=Microsoft.Jet.OLEDB.4.0;Data Source=d:\inetpub\aspnet\db\searchdb.mdb;Persist Security Info=False" />

To help identify the location of the mdb directory on your server, I have included a file called a_map.aspx which displays the file name and path of a_map.aspx. From that it should be possible to work out the file name and path for the Access database.

If it is a SQL Server database then it will be of the form :

<add key="pg_CrawlerConnect" value="Provider=SQLOLEDB.1;Password=people;Persist Security Info=True;User ID=searchuser;Initial Catalog=searchDB;Data Source=169.254.219.170" />

The provider string is just a standard OLE DB connection string.

If it is an MSDE database then it will be of the form :

<add key="pg_CrawlerConnect" value="Provider=MSDASQL;Persist Security Info=False;User ID=user_name;Password=user_password;Initial Catalog=user_catalog;Data Source=user_databasename;Connect Timeout=15" />

The admin page includes a button which you may use to check that the database connection is correct.

pg_ButtonText   A string identifying what text the search button should display. This is defined as 'Search' by default.

pg_Instructions   A string which appears above the text box and is intended to display a few words of brief instructions.

pg_NoRecordsFoundText   A string which is displayed when no records are found.

pg_XRecordsFoundText   A string which is displayed following the number of records, e.g. "32 records found". In this case pg_XRecordsFoundText contains the string "records found".

pg_CodePage   This is an optional entry which can be used to define the code page used when the web pages are read. If this setting is left out, then a standard code page of 1252 is used, which should be correct for Western character sets.

If you wish to use a different code page, the value is an integer, for example value="950".

In some circumstances you may see ? characters when the pages are indexed. This is an indication that the code page is incorrect.
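
To see why the wrong code page garbles the indexed text, the sketch below decodes the same bytes with two different code pages. It is only an illustration in Python, not the searchdb code; decode_page is a hypothetical helper, and exactly which substitute character appears (a ? or something else) depends on the application doing the decoding.

```python
def decode_page(raw: bytes, code_page: int = 1252) -> str:
    """Decode fetched page bytes with a Windows code page (hypothetical helper)."""
    # Python names Windows code pages "cp<number>", e.g. cp1252 or cp950.
    # Bytes with no meaning in the chosen code page become a replacement character.
    return raw.decode(f"cp{code_page}", errors="replace")


# Bytes as a Traditional Chinese page encoded in code page 950 would send them.
raw = "中文".encode("cp950")

print(decode_page(raw, 950))   # matching code page: the original text is recovered
print(decode_page(raw, 1252))  # default Western code page: garbled characters
```

If the indexed text looks like the second line, setting pg_CodePage to the code page the site is actually written in is the first thing to try.
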
pg_ErrorLogging   This is an optional entry which can be used to define if error logging is to be enabled. If this setting is left out, then no error logging occurs.

Logging covers database update errors, web crawler errors and indexing errors.

It also logs the search words which are entered in the search dialog box.

Logging details are stored in database tables and can be viewed using the administration web pages.

The search page

You may use either search.aspx or searchUC.aspx as the main search page, which also displays the results from the search engine. The file searchUC.aspx uses a user control.

Some users wish to have a search box on each page, so that clicking the search button displays the search engine results on a separate page. The page default.htm is an example of how to do this. It contains a form which you should place on each of your web pages. When you click on the search button, it will direct the output to search.aspx.

Operation

You log on to the management display system through a simple logon form at http://www.yourserver.com/searchdb/admin/default.aspx. The user names and passwords are stored in the database.

The default username / password is admin / admin.

Once logged in you are able to set up the system :

Base URL :

    The root of the domain. This is in the form http://www.yourserver.com

Start spider from this URL :

    The page from which you want crawling to start. This is in the form http://www.yourserver.com/default.htm. Usually this is the home page of the site, but it may be any page, such as a site map.

Directories to be excluded :

    A comma separated list of directories which are to be excluded from crawling.

Active :

    Defines if this URL is to be crawled. Tick the box to indicate that you want this URL to be crawled.

Words to be excluded :

    A comma separated list of words which you want the indexer to ignore. Single character words are never indexed.
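
As an illustration of how such a list behaves, here is a small Python sketch of word filtering. It is not the searchdb indexer; the function names are made up, and the sketch assumes that matching is case-insensitive and that spaces around the commas are ignored.

```python
import re


def parse_exclude_list(setting: str) -> set[str]:
    """Split a comma separated setting into a set of lower-case words."""
    return {word.strip().lower() for word in setting.split(",") if word.strip()}


def indexable_words(text: str, exclude_setting: str) -> list[str]:
    """Return the words of `text` that would be kept for indexing."""
    excluded = parse_exclude_list(exclude_setting)
    words = re.findall(r"\w+", text.lower())
    # Drop single character words and anything on the exclude list.
    return [w for w in words if len(w) > 1 and w not in excluded]


print(indexable_words("A cat and the dog", "the, and"))
```
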

Valid file extension :

    A comma separated list of valid file extensions.
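
The settings above (base URL, excluded directories and valid file extensions) together decide which links a crawler follows. The Python sketch below shows one plausible way to combine them. It is an assumption for illustration only: should_crawl is a hypothetical function, excluded directories are matched as path prefixes from the site root, and extensions are compared without the leading dot. The real crawler may apply different rules.

```python
import posixpath
from urllib.parse import urlparse


def should_crawl(url: str, base_url: str, excluded_dirs: str, valid_exts: str) -> bool:
    """Decide whether `url` passes the site settings (hypothetical rules)."""
    base = urlparse(base_url)
    target = urlparse(url)

    # Stay on the site identified by the base URL.
    if target.netloc.lower() != base.netloc.lower():
        return False

    path = (target.path or "/").lower()

    # Skip anything below an excluded directory, e.g. "admin" -> "/admin/...".
    for name in excluded_dirs.split(","):
        name = name.strip().strip("/").lower()
        if name and path.startswith("/" + name + "/"):
            return False

    # Keep only pages whose extension is listed; a path with no
    # extension (such as "/") is rejected in this sketch.
    allowed = {e.strip().lstrip(".").lower() for e in valid_exts.split(",") if e.strip()}
    ext = posixpath.splitext(path)[1].lstrip(".")
    return ext in allowed


print(should_crawl("http://www.yourserver.com/docs/page.htm",
                   "http://www.yourserver.com", "admin, images", "htm, html, aspx"))
```
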

Start of extract :

    Start of extract. In many web sites, the first few characters of text do not give information which would be of value as an extract. This parameter effectively shifts the start point. The default value is 1.

Length of extract :

    Length of extract. Up to 255 characters can be used as the extract text. The fewer the characters, the smaller the database needed to store it. The default value is 150.
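
The two extract settings amount to taking a slice of the page text. A minimal Python sketch, assuming the start position is counted from 1 (suggested by the default value of 1) and that the length is capped at 255 characters; make_extract is a made-up name, not part of searchdb:

```python
def make_extract(body_text: str, start: int = 1, length: int = 150) -> str:
    """Return the extract for a page, using 1-based `start` (assumed)."""
    length = min(length, 255)      # the extract is limited to 255 characters
    begin = max(start, 1) - 1      # convert the 1-based setting to a 0-based index
    return body_text[begin:begin + length]


page_text = "Home | About | Contact Welcome to the site"
# Moving the start point past the menu text gives a more useful extract.
print(make_extract(page_text, start=24, length=19))
```
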

Script timeout of crawler in seconds :

    A timeout period for crawling the site. If the site is large, this period may have to be extended, otherwise the program will stop and an error message will be displayed.

Use meta description as extract :

    The 'Use meta description as extract' tick box defines whether the meta description will be used as the extract text. The meta description appears in the head of the web page and is of the form <meta name="description" content="Javascript code for downloading">.

    If the box is ticked but the web page does not contain a meta description tag, then the extract will be made from the body of the web page. When unticked, the extract text always comes from the body of the web page. The problem with taking the extract from the body of the web page is that the text may not be sensible, as it may include menu entries etc. Using the meta description tag should give more sensible and controllable results.
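
The fallback described above can be sketched in Python as follows. This is an illustration rather than the searchdb implementation: the class and function names are hypothetical, and trimming the meta description to the extract length is an assumption.

```python
from html.parser import HTMLParser


class MetaDescriptionFinder(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.description is None:
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = attributes.get("content")


def choose_extract(html: str, body_text: str, use_meta: bool = True, length: int = 150) -> str:
    """Pick the extract text: meta description if enabled and present, else body."""
    if use_meta:
        finder = MetaDescriptionFinder()
        finder.feed(html)
        if finder.description:
            return finder.description[:length]  # trimming is an assumption
    return body_text[:length]


html = ('<html><head><meta name="description" '
        'content="Javascript code for downloading"></head>'
        '<body>Home Menu Text</body></html>')
print(choose_extract(html, "Home Menu Text"))
```
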

Include meta data in indexed data :

    The meta data tick box defines whether meta data such as keywords and description, which can appear in the head of a web page, will be indexed into the database. By default, meta data is not indexed. The reason for not indexing meta data is that quite often it has little relevance to the rest of the text on the page and will cause the search to return irrelevant results.


The search form

Once the site has been crawled and then indexed, you search the site using the search form, which is accessed by one of :

    http://www.yourserver.com/searchdb/search/search.aspx
    http://www.yourserver.com/searchdb/search/searchUC.aspx
    http://www.yourserver.com/searchdb/search/default.htm