The package contains three pages (keywords.aspx, kpUpdate.aspx and seek.aspx) along with configuration files and code-behind files.
keywords.aspx parses every page in the website whose extension is listed in search.config, except pages whose filename contains an underline ("_") and pages in a folder whose name contains an underline. For example, the following will not be parsed:
|my_page.htm|underline in file name|
|/folder_name/mypage.htm|underline in folder name|
|/folder_name/mysubfolder/mypage.htm|underline in folder name|
|/folder/my_page.htm|underline in file name|
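The exclusion rule above can be sketched in a few lines. This is only an illustration in Python, not the package's code-behind; the function name `is_indexable` and the extension set `INCLUDED` are hypothetical, and the real list of extensions comes from search.config.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the extensions listed in search.config.
INCLUDED = {".htm", ".html", ".aspx"}

def is_indexable(url_path, extensions=INCLUDED):
    """Return True if a page would be parsed under the rules described above."""
    path = PurePosixPath(url_path)
    # An underline anywhere in a folder name or the file name excludes the page.
    if any("_" in part for part in path.parts):
        return False
    # Only pages with a listed extension are parsed.
    return path.suffix.lower() in extensions

for page in ["my_page.htm", "/folder_name/mypage.htm", "/folder/mypage.htm"]:
    print(page, is_indexable(page))
```

Checking every path component, rather than only the file name, is what makes a page deep inside `/folder_name/` inherit the exclusion.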
While parsing each page, keywords.aspx looks for three HTML comments:
<!-- search-page-ignore -->
If the first comment is found in a page, no keywords are indexed and the page's content is excluded from all searches.
The second and third comments surround sections of the page that should be ignored, such as navigation or copyright notices; keywords in these sections are not indexed.
"Boilerplate" or navigation sections added by master pages or Server Side Includes (SSI) will not be indexed within a page. It is advisable to name SSI files with an underline, or with an excluded extension, to avoid these being indexed as individual pages. Sections added with a Dynamic Web Template (DWT) or Design Time Includes (also known as FrontPage include pages) will be indexed, unless these sections are excluded using the <!--/nosearch--> comments. Again, the individual include pages should be named with an underline in the name or path.
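The comment handling described above might look roughly like the following Python sketch. It is not the package's implementation. The opening marker `<!--nosearch-->` is an assumption, since only the closing `<!--/nosearch-->` appears in the text above, and the function name `extract_keywords` and the simple word-splitting rule are hypothetical.

```python
import re

# Marker that excludes the whole page from the index.
PAGE_IGNORE = re.compile(r"<!--\s*search-page-ignore\s*-->", re.IGNORECASE)
# ASSUMPTION: the opening marker is <!--nosearch-->; only the closing
# <!--/nosearch--> marker is named in the documentation.
NOSEARCH_BLOCK = re.compile(
    r"<!--\s*nosearch\s*-->.*?<!--\s*/nosearch\s*-->",
    re.IGNORECASE | re.DOTALL,
)
TAG = re.compile(r"<[^>]+>")    # crude tag stripper, enough for a sketch
WORD = re.compile(r"[a-z0-9]+")

def extract_keywords(html):
    """Return the set of keywords a page would contribute to the index."""
    if PAGE_IGNORE.search(html):
        return set()                      # whole page is ignored
    html = NOSEARCH_BLOCK.sub(" ", html)  # drop the excluded sections
    text = TAG.sub(" ", html)             # drop the remaining markup
    return set(WORD.findall(text.lower()))

page = ("<body><!--nosearch--><ul><li>Home</li></ul><!--/nosearch-->"
        "<p>Cheese scones</p></body>")
print(sorted(extract_keywords(page)))
```

Removing the excluded sections before stripping tags keeps navigation words such as "Home" out of the keyword set.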
Any resulting keywords are stored in the Keywords table, and page information (filename, page title) in the Webpages table.
The keywords.aspx page has (essentially) three buttons:
- Generate Website Index
- The first button scans the entire website for new pages and indexes all those found (with the exceptions mentioned earlier). The first use may take a long time, and on large websites the server may time out; pressing the button repeatedly should eventually complete the first index. Pages that have already been indexed are not reindexed by this option.
- Generate Page Index
- The second button presents a list of indexable pages in the website and allows any individual page to be reindexed by clicking on its record. The original keywords are deleted and replaced.
- View Keywords
- The third button displays the indexed keywords, along with the page URL and page title.
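The difference between the first two buttons can be illustrated with a small Python and SQLite sketch. The table names follow the text above, but the column names (`PageURL`, `PageTitle`, `Keyword`) and both function names are hypothetical, and the package's actual database schema is not documented here.

```python
import sqlite3

def create_tables(db):
    # Column names are assumptions made for this sketch.
    db.execute("CREATE TABLE Webpages (PageURL TEXT PRIMARY KEY, PageTitle TEXT)")
    db.execute("CREATE TABLE Keywords (PageURL TEXT, Keyword TEXT)")

def reindex_page(db, url, title, keywords):
    """'Generate Page Index': delete the page's old keywords and store new ones."""
    db.execute("DELETE FROM Keywords WHERE PageURL = ?", (url,))
    # Keep an existing Webpages row untouched (an assumption of this sketch).
    db.execute("INSERT OR IGNORE INTO Webpages VALUES (?, ?)", (url, title))
    db.executemany("INSERT INTO Keywords VALUES (?, ?)",
                   [(url, k) for k in sorted(keywords)])

def index_new_pages(db, pages):
    """'Generate Website Index': index only pages not yet in Webpages."""
    added = 0
    for url, (title, keywords) in pages.items():
        seen = db.execute("SELECT 1 FROM Webpages WHERE PageURL = ?",
                          (url,)).fetchone()
        if seen:
            continue  # already indexed, so skipped on repeated runs
        reindex_page(db, url, title, keywords)
        added += 1
    return added

db = sqlite3.connect(":memory:")
create_tables(db)
site = {"/a.htm": ("A", {"apple"}), "/b.htm": ("B", {"banana"})}
print(index_new_pages(db, site), index_new_pages(db, site))
```

Because already-indexed pages are skipped, a run that was cut short by a timeout can simply be repeated, and each repeat only has to process the pages that remain.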
This page uses the same password as kpUpdate.aspx, and is protected to restrict access to website administrators.
The search page (seek.aspx) accepts a space-separated list of keywords and can look for all of the words, any of the words, or the exact phrase. A phrase search parses every page (with the aforementioned exceptions) rather than using the database, so it may take a long time to complete.
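The three search modes can be sketched as follows. This Python example is illustrative only: the function name `search`, the in-memory `index` (page URL to keyword set) and `pages` (page URL to page text) are hypothetical stand-ins for the database and the website files.

```python
def search(index, pages, query, mode="all"):
    """Return the sorted URLs of pages matching a space-separated query."""
    words = query.lower().split()
    if not words:
        return []
    if mode == "phrase":
        # A phrase cannot be answered from single keywords, so the page
        # text itself is scanned; this is the slow path.
        phrase = " ".join(words)
        return sorted(url for url, text in pages.items()
                      if phrase in " ".join(text.lower().split()))
    if mode not in ("all", "any"):
        raise ValueError("mode must be 'all', 'any' or 'phrase'")
    match = all if mode == "all" else any
    # Word searches only consult the keyword index.
    return sorted(url for url, keywords in index.items()
                  if match(w in keywords for w in words))

index = {"/a.htm": {"cheese", "scones"}, "/b.htm": {"cheese", "toast", "with"}}
pages = {"/a.htm": "Cheese scones", "/b.htm": "Toast with cheese"}
print(search(index, pages, "cheese scones", "all"))
print(search(index, pages, "with cheese", "phrase"))
```

A keyword index records which words occur on a page but not their order, which is why the phrase mode falls back to reading the pages.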
Results from seek.aspx will be displayed as links to pages containing the required text - the page title is used as the link text, unless changed using kpUpdate.aspx.
The link text displayed for any page can be changed using kpUpdate.aspx; the change applies to search results only.
kpUpdate.aspx displays the Webpages table in a GridView, showing Page Title and Page URL. This is, in effect, a list of all the pages in the website except those containing an underline in the file name or path, and may be useful for other purposes.
This page uses the same password as keywords.aspx, and is protected to restrict access to website administrators.