Site Overview Page

The first page displayed is the Site Overview page. If your Siteimprove account is used to administer SearchImprove solutions for more than one site, all of those sites are listed. Click a site name to go to the Overview screen for that site.

Search summary

The Site Overview page displays the current status of the search: site name, URL, active status, and the number of pages currently being searched.

Main setup buttons

These buttons give access to search configuration, crawl setup, group setup, and search ranking setup.

The setup buttons are:

  • Configure: Opens the main configuration for the site: which pages to include in or exclude from the search index, which file types to index, which content sections to include, and so on.
  • Delete Site: Removes the current site.
  • Add new crawl: Creates a new instance of the SearchImprove crawler. You enter the URL from which the crawler should start and select how often the crawl should run to re-index its content.
  • Groups: Lets you create and manage groups for presentational or organisational purposes.
  • Search Ranking: Lets you edit the weighting score for each content element to tune the search for your content.

Crawls

How do crawls work?

One of the main virtues of SearchImprove is the ability to set up several crawls, each of which serves its own purpose.

Some sections of a website may be updated frequently while others remain static for long periods. For example, a news section, where items are added, edited, and removed many times a day, needs to be indexed frequently. A section containing static information, such as annual reports, in theory only needs indexing once or twice a year.

By limiting the scope of each crawl to only what is necessary, SearchImprove offers a fast and economical approach to search. Indexing places minimal strain on web servers because requests are made only when needed.
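
The idea of giving each section its own crawl frequency can be sketched in a few lines of Python. This is only an illustration, not SearchImprove's implementation: the `Crawl` class, the `due_crawls` function, the example.com URLs, and the intervals are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Crawl:
    """Hypothetical crawl definition: a start URL plus a re-index interval."""
    name: str
    start_url: str
    interval: timedelta
    last_run: datetime

    def next_run(self) -> datetime:
        # The next scheduled crawl is one interval after the last crawl.
        return self.last_run + self.interval


def due_crawls(crawls, now):
    """Return the names of crawls whose next scheduled run is not in the future."""
    return [c.name for c in crawls if c.next_run() <= now]


# Assumed example setup: news is re-indexed hourly, annual reports twice a year.
crawls = [
    Crawl("News", "https://example.com/news/",
          timedelta(hours=1), datetime(2024, 1, 2, 10, 30)),
    Crawl("Annual reports", "https://example.com/reports/",
          timedelta(days=180), datetime(2024, 1, 1)),
]

now = datetime(2024, 1, 2, 12, 0)
print(due_crawls(crawls, now))
```

With these assumed values only the News crawl is due, so the static reports section generates no requests at all until its own interval has passed.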

The Crawls section displays the crawls that have been set up for this site, with name, date of last crawl, date and time of next scheduled crawl, and the type of crawl.

A Complete crawl follows every link and indexes every page throughout the website from a designated starting point.

A level 1 crawl starts from a designated start page and indexes that page and the pages it links to. A level 2 crawl also indexes the pages linked from those pages, one level deeper; and so on.

A level 0 crawl does not follow any links; it simply indexes the designated start page.
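
The difference between the crawl types can be illustrated with a small breadth-first traversal over an in-memory link map. This is a sketch under assumptions: the `crawl` function, the `site` dictionary, and its paths are hypothetical, and a real crawler would fetch pages over HTTP rather than read a dictionary.

```python
from collections import deque


def crawl(links, start, max_level=None):
    """Return the pages indexed by a crawl that begins at `start`.

    links:     dict mapping a page to the list of pages it links to
               (a hypothetical stand-in for a real website).
    max_level: 0 indexes only the start page, 1 adds the pages linked
               from it, and so on; None means a Complete crawl.
    """
    indexed = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, level = queue.popleft()
        # Do not follow links from pages that are already at the level limit.
        if max_level is not None and level >= max_level:
            continue
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                indexed.append(target)
                queue.append((target, level + 1))
    return indexed


# Assumed example site structure.
site = {
    "/": ["/news", "/reports"],
    "/news": ["/news/item-1"],
    "/reports": ["/reports/2023"],
    "/reports/2023": ["/reports/2023/q1"],
    "/news/item-1": ["/"],
}

print(crawl(site, "/", 0))     # level 0: start page only
print(crawl(site, "/", 1))     # level 1: start page and the pages it links to
print(crawl(site, "/", None))  # Complete crawl: every reachable page
```

The `seen` set ensures that a page reachable through several links, or through a link back to the start page, is indexed only once.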

Measures to reduce redundant crawls

To further limit the impact on server resources and to ensure speedy indexing, only pages that have been altered since the previous crawl are indexed.

A unique identifier, an MD5 key, is calculated for each page. This identifier changes whenever the page is altered, allowing the SearchImprove crawler to recognize altered pages. If the MD5 key is unchanged, the SearchImprove crawler simply uses its existing page index and proceeds to the next page in that crawl.

This means that if the majority of pages on the website are unaltered, the search index is updated rapidly and without undue strain on web servers.
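
The change check can be sketched with Python's standard `hashlib` module. This is a minimal illustration, assuming the key is computed over the raw page content; the function names `page_key` and `pages_to_reindex` and the stored-key dictionary are hypothetical, and the source does not say exactly what SearchImprove hashes or how it stores keys.

```python
import hashlib


def page_key(content: bytes) -> str:
    """Compute an MD5 key for a page's content."""
    return hashlib.md5(content).hexdigest()


def pages_to_reindex(fetched, stored_keys):
    """Return the URLs whose content changed since the previous crawl.

    fetched:     dict mapping URL -> page content (bytes) from this crawl.
    stored_keys: dict mapping URL -> MD5 key from the previous crawl;
                 updated in place for pages that are new or altered.
    """
    changed = []
    for url, content in fetched.items():
        key = page_key(content)
        if stored_keys.get(url) != key:
            changed.append(url)      # new or altered page: re-index it
            stored_keys[url] = key   # remember the new key
        # Unchanged key: keep the existing index entry and move on.
    return changed


stored = {}
first = pages_to_reindex({"/a": b"hello", "/b": b"world"}, stored)
second = pages_to_reindex({"/a": b"hello", "/b": b"world, edited"}, stored)
print(first)   # both pages are new on the first crawl
print(second)  # only the altered page on the second crawl
```

On the second pass only `/b` is returned, which mirrors the behaviour described above: unaltered pages are skipped and only the altered page is re-indexed.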

Click a crawl name to view a summary of that crawl.

To create a new crawl, click the Add new crawl button, which starts the crawl configuration procedure.