A search engine spider using an ASP.NET script
This ASP.NET search engine spider enables users to search your web site.
The program searches through the files each time it runs, so it is recommended only for relatively small sites of up to a few hundred web pages. I have not tested it on larger sites.
You are presented with a text box in which you enter one or two words. Words shorter than three characters are ignored, as are common words such as 'them' and 'they'. The words are ORed together; there is currently no facility for more complex processing such as AND.
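The filtering rules above can be sketched as follows. This is an illustrative Python sketch, not the component's actual code (which ships compiled in the DLL). The function name `filter_query_words`, the sample word set, and the case-insensitive comparison are assumptions; the real list of common words comes from the pg_wordexcludes setting.

```python
# Illustrative sketch only: names and the sample word list are assumptions.
COMMON_WORDS = {"them", "they"}  # stands in for the pg_wordexcludes list


def filter_query_words(query, common_words=COMMON_WORDS, min_length=3):
    """Return the query words that would actually be searched for."""
    kept = []
    for word in query.lower().split():
        if len(word) < min_length:
            continue  # words shorter than three characters are ignored
        if word in common_words:
            continue  # common words are ignored
        kept.append(word)
    return kept


print(filter_query_words("of spiders"))  # ['spiders']
print(filter_query_words("They crawl"))  # ['crawl']
```

If every word is filtered out, the sketch returns an empty list; how the real component reacts in that case is not documented here.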
The engine searches through the web site files and returns the title and an extract of each matching page, sorted by number of matches. The display is similar to that of popular internet search engines such as Google.
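A minimal Python sketch of OR matching with ranking by match count is shown below. It is only an illustration under stated assumptions: the function name `rank_pages` and the sample pages are hypothetical, and counting every substring occurrence of each word is one plausible reading of "number of matches", not a confirmed description of the component.

```python
# Illustrative sketch only: the function name and sample data are hypothetical.
def rank_pages(pages, words):
    """pages maps a page title to its text. Returns (title, matches) pairs, best first."""
    results = []
    for title, text in pages.items():
        lowered = text.lower()
        # OR semantics: a page qualifies if any of the words occurs in it.
        matches = sum(lowered.count(word) for word in words)
        if matches > 0:
            results.append((title, matches))
    # Highest match count first; ties are ordered by title.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


pages = {
    "Home": "Welcome to the spider search demo",
    "Docs": "The spider crawls files. The spider ranks pages for search.",
    "Contact": "Email us",
}
print(rank_pages(pages, ["spider", "search"]))  # [('Docs', 3), ('Home', 2)]
```

Pages with no match at all, such as "Contact" here, are left out of the result list.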
An example web page is provided as a starting point for your own search page.
Set up in web.config
Configuration options are stored in the web.config file.
To use the search engine on your site you must set the following:
| pg_rootdir | The full path name of the web site root |
| pg_validfiles | A list of the file extensions to look for, in the form *.html,*.htm |
| pg_excludedir | A list of directories which need to be excluded from the search. All the listed directories and their subdirectories will be excluded. |
| pg_wordexcludes | A list of all the common words to be ignored. |
| pg_httpdir | The HTTP root directory of the site, corresponding to pg_rootdir above. |
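To make the roles of these settings concrete, the following Python sketch shows one way they could work together. It is not the component's code. The function name `collect_pages` is hypothetical, and the sketch assumes that pg_validfiles and pg_excludedir are comma-separated lists and that excluded directories are matched by name; the exact format of pg_excludedir is not specified above.

```python
# Illustrative sketch only: names and setting formats are assumptions.
import fnmatch
import os


def collect_pages(root_dir, valid_files, exclude_dirs, http_dir):
    """Return (file_path, url) pairs for the files that would be searched."""
    patterns = [p.strip().lower() for p in valid_files.split(",") if p.strip()]
    excluded = {d.strip().lower() for d in exclude_dirs.split(",") if d.strip()}
    found = []
    for current, dirs, files in os.walk(root_dir):
        # Prune excluded directories in place so their subdirectories are skipped too.
        dirs[:] = [d for d in dirs if d.lower() not in excluded]
        for name in sorted(files):
            if any(fnmatch.fnmatch(name.lower(), p) for p in patterns):
                path = os.path.join(current, name)
                # Map the path under root_dir to the matching URL under http_dir.
                rel = os.path.relpath(path, root_dir).replace(os.sep, "/")
                found.append((path, http_dir.rstrip("/") + "/" + rel))
    return found
```

For example, with a root directory containing index.html and a private subdirectory, `collect_pages(root, "*.html,*.htm", "private", "http://www.example.com/")` would return only the entry for index.html, mapped to http://www.example.com/index.html (the host name is a placeholder).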
To help fill in the pg_rootdir entry, run the file map.aspx, which displays its own path name. From the result, it should be possible to work out your root path on the web server.
All 'fixed' text such as the name of the search button can be configured so that the program can be used in any language.
The user options are also part of the web.config file:
| pg_ButtonText | A string identifying what text the search button should display. This is defined as 'Search' by default. |
| pg_Instructions | A string which appears above the text box and is intended to display a few words of brief instructions. |
| pg_NoRecordsFoundText | A string which is displayed when no records were found. |
| pg_XRecordsFoundText | A string which is displayed following the number of records, e.g. for "32 records found", pg_XRecordsFoundText contains the string "records found". |
| pg_UntitledText | A string which defines what will appear as the title when a web page has a blank title. |
| pg_Title | A string which defines what text will be displayed in the list next to the title of the page, default 'Title'. |
| pg_Summary | A string which defines what text will be displayed in the list next to the extract summary, default of 'Extract'. |
Installation
Place the search.dll into the appropriate bin directory on your web server. The location of this directory will be defined by your host.
Place the Styles.css (the style sheet) file into the required directory on the web server.
Configuration information is stored in the appSettings section of the web.config file. Copy the configuration entries from the example file into your own web.config file and modify them to suit your site.
Now create the search page itself.
There are two ways to create the search page: either use the example page default.aspx, or use searchUC.aspx, which has the user control (searchUC.ascx) on it.
Some users wish to have a search box on each page, so that clicking the search button displays the results on a separate page. The page default.htm is an example of how to do this. It contains a form which you should place on each of your web pages. When a user clicks the search button, the output is directed to default.aspx.
Files provided that need to be installed
| Web.config | An example configuration file |
| Styles.css | The style sheet file |
| searchUC.dll | The compiled code |
| default.aspx | The example main page |
| default.htm | An example main page to be used when you want a search box to appear on every page. |
| searchUC.aspx | An example page with a user control |
| searchUC.ascx | The User control |
| map.aspx | Displays the path name of this file |
Copyright © 2007. Page updated July 2007.