instruction manual
1. Introduction
2. Installation instructions
3. Administration Interface
4. Statistics Tool
5. Searching
6. Layout
7. Important information
1) Introduction
phpMySearch provides a search engine for your own web site. It is not meant to replace a powerful internet-wide search system like Google, Lycos or AltaVista.
phpMySearch supports the same full Boolean search queries found in the 'big' search engines, e.g. download +software or download OR software.
phpMySearch is very easy to install and this guide will take you through all the steps you need to set it up for your web site.
You do not need programming or web server knowledge for the installation. The complete installation takes about 5 minutes, and all you need is a browser and an FTP program. The configuration can be changed at any time through a browser control panel.
The visual design of the search results is easily customised to match the rest of your site by use of templates. It's a simple process to modify the templates with an HTML editor to make unique layouts.
Documents can be searched in HTML, PDF, DOC or TXT format. Outdated documents are detected automatically and deleted from the database, so only the most current information is displayed to your visitors.
phpMySearch runs reliably on Linux, UNIX and Windows operating systems, and supports all the usual protocols: HTTP, HTTPS, FTP, FTPS. phpMySearch can be deployed on either internet or intranet-based web sites.
There is an online forum available at our web site where you can exchange ideas and support with other users of phpMySearch.
phpMySearch may be used privately as well as commercially, free of charge and without restrictions as long as the copyright notice on phpMySearch at the end of the search output is preserved. Support licenses are available to help you with larger and more complex projects.
2) Installation instructions
System Requirements:
- MySQL 3.23.32 or higher (http://www.mysql.com)
- PHP 4.0.2 or higher (http://www.php.net)
- CURL 7.0.2 or higher (http://curl.haxx.se)
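The minimum versions above can be checked mechanically before you install. The following Python sketch is only an illustration and is not part of phpMySearch: the function names are hypothetical, and it assumes you have already read the installed version strings from your server (for example from a phpinfo page).

```python
import re

# Minimum versions taken from the System Requirements list above.
MINIMUM_VERSIONS = {
    "MySQL": "3.23.32",
    "PHP": "4.0.2",
    "CURL": "7.0.2",
}


def parse_version(text):
    """Turn a dotted version string such as '4.0.2' into a tuple of ints."""
    # Only the leading numeric part is used; suffixes like '-log' are ignored.
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_requirement(component, installed):
    """Return True if the installed version is at least the documented minimum."""
    required = parse_version(MINIMUM_VERSIONS[component])
    found = parse_version(installed)
    # Pad the shorter tuple with zeros so that '4.1' compares like '4.1.0'.
    width = max(len(required), len(found))
    required += (0,) * (width - len(required))
    found += (0,) * (width - len(found))
    return found >= required
```

For example, meets_requirement("PHP", "4.0.1") is False, while meets_requirement("MySQL", "3.23.58-log") is True. Versions are compared number by number, not as text, so that 3.23.9 is correctly treated as older than 3.23.32.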
Installation:
- Download phpMySearch from http://phpmysearch.web4.hm to your system. For *nix systems download the phpMySearch.tar.gz file; for Windows systems download the phpMySearch.exe file.
- Windows: Double click on phpMySearch.exe and follow the instructions to extract the contents.
- Unix: Extract phpMySearch in a Unix shell with: tar -xzf phpMySearch.tar.gz
- Copy all files from the newly created phpMySearch folder to a folder on your web space that is accessible from the web (e.g. http://www.example.com/phpMySearch/). Then open install.php in a browser (e.g. http://www.example.com/phpMySearch/install.php).
- Set up phpMySearch with admin.php from a browser window.
Support: you can get commercial support licenses at http://phpMySearch.web4.hm.
3) Administration Interface
To access the administration interface, open admin.php in the browser.
Default login and password are:
- login: admin
- password: admin
It is strongly recommended that you change these to keep your installation secure.
Table 1-1: Fields and buttons on the Administration page:
Field or button |
Description |
Search start URLs: |
A list of URLs from which the crawler starts gathering information. To add a new URL to the list, type it in the field below and push the ADD button. To remove URLs, check the boxes next to the URLs you'd like to delete and push the REMOVE button. To add URLs that require (HTTP) user identification, include the login data in the URL, e.g. (Password and host stand for your own values): http://Username:Password@host/file.html ftp://Username:Password@host/pub/file.html |
Number of attempts to retrieve the page: |
If a document or server is not available, phpMySearch will try to connect to it this many times. |
Not indexed URLs (Black list) |
A list of URLs that will be ignored by the crawler. To add a new URL to the list, type it in the field below and push the ADD button. To remove URLs, check the boxes next to the URLs you'd like to delete and push the REMOVE button. |
Document extensions to index |
A list of document extensions that the spider should try to index. To add a new document extension to the list, type it in the field below and push the ADD button. To remove extensions, check the boxes next to the extensions you'd like to delete and push the REMOVE button. |
Search depth |
Search depth tells the spider how many levels of links it should follow from the start pages while crawling. |
Re parse all |
If this box is checked, the spider will clean the database and crawl everything again; otherwise it will parse only pages that were updated. By default it is unchecked. |
Automatic spider start |
Check this box if you do not have access to the crontab tool on *nix systems or the task scheduler on Windows systems. If it is checked, each time a visitor uses the search script, the script checks whether it is time to start the crawler. If it is time, or if the crawler was not started at the specified time, the script starts the spider. It is recommended that you use a system scheduling utility (cron) to start the spider. |
Start time |
Time to start the spider |
Start spider each (days) |
Interval in days after which the spider is started again |
Force crawling |
Click on the Start Spider button to start the spider immediately |
Number of links per page |
Specify the number of links that should be displayed on a single page. |
Max pages block |
Specify how many pages should be visible in the pages menu |
Proxy settings active |
Check this box if you want the spider to connect through a proxy. Please note that this function is in BETA and intended for testing. If you have any problems with proxy support, we will be glad to receive your feedback. |
Proxy host |
Enter the proxy host here (e.g. an IP address such as 192.168.0.1, or a host name). If you must use a special port, type host:port |
Proxy user and pass |
Fill this in if you need a user name and password for the proxy. Please use the following format: User:Password |
Search Engine log file name |
Enter the path to the file where the search engine will log its work. |
Spider Engine log file name |
Enter the path to the file where the spider will log its work |
Admin Tool log file name |
Enter the path to the file where all changes made in the admin tool will be logged |
Templates path |
Enter the path to the template files |
PHP Full Path: |
Enter the path to PHP here. Often PHP alone is enough; sometimes you must type in the full path to PHP, e.g. PHP, /usr/bin/local/PHP or c:\PHP\php.exe. On Windows platforms you can use either backslashes or forward slashes. |
Converter URLs: |
Defines the URLs used to convert files of type PDF, DOC and XLS into HTML. To transform PDF files into HTML you can use this URL: For DOC or XLS files you can try the converters supplied with phpMySearch. These converters are in your phpMySearch folder, e.g.: To install these converters you need full root or administrator rights on your server. For the converters we use:
For installation help or questions about these tools, consult their own home pages, forums and mailing lists. The phpMySearch Group cannot give you any support with installing or troubleshooting these tools. |
Admin Login |
Login name of administrator |
Admin Password |
Administrator's password |
Confirm Password |
Confirmation of the administrator's password |
Submit |
Click the Submit button to save all changes |
Logout |
Click the Logout button to log out of the admin tool |
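To show how the start URLs, the black list, the document extensions and the search depth from Table 1-1 work together, here is a small Python sketch. It is not the phpMySearch spider. It assumes that depth 0 means "start pages only", that black list entries are matched as URL prefixes, and it uses a hypothetical in-memory link map instead of fetching real pages.

```python
from collections import deque
from urllib.parse import urlsplit
import posixpath


def crawl(start_urls, links, max_depth, blacklist=(), extensions=("html",)):
    """Return the URLs a depth-limited crawl would index, in visiting order.

    `links` is a hypothetical map {url: [linked urls]} that stands in for
    real page fetches, so the sketch runs without network access.
    """
    def allowed(url):
        # Black list (assumption): skip URLs that start with a listed prefix.
        if any(url.startswith(prefix) for prefix in blacklist):
            return False
        # Extension filter: compare the path's extension with the allowed list.
        ext = posixpath.splitext(urlsplit(url).path)[1].lstrip(".").lower()
        return ext in extensions

    seen = set()
    indexed = []
    queue = deque((url, 0) for url in start_urls)
    while queue:
        url, depth = queue.popleft()
        if url in seen or not allowed(url):
            continue
        seen.add(url)
        indexed.append(url)
        # Follow links only while the depth limit has not been reached.
        if depth < max_depth:
            for target in links.get(url, []):
                queue.append((target, depth + 1))
    return indexed
```

With a search depth of 1, only the start pages and the pages they link to directly are indexed; raising the depth by one adds one more level of links. Note that this sketch rejects URLs without a file extension, which the real spider may treat differently.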
4) Statistics Tool
This tool stores search terms in a database and lets you display your statistics in search.php. First you have to activate the tool in the admin tool, next to the proxy settings. After that, you can decide how many statistics entries you would like to show in search.php. With the clear button next to "Clear Stats DB:" you can clear the stats table (for example, if you want to show the most frequent search terms for this week or this month). When you have finished the settings, press the Submit button.
When you now open search.php, the first term you enter is stored in the database. When you enter a second search term, the database output is displayed according to your settings.
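The behaviour described above (store each term, show the most frequent ones, clear on request) can be sketched in a few lines of Python. This is an in-memory illustration only: the real tool stores terms in the database, and the class name, the lower-casing of terms and the default of 5 outputs are assumptions made for the example.

```python
from collections import Counter


class SearchStats:
    """In-memory stand-in for the statistics table."""

    def __init__(self, outputs=5):
        # `outputs` mirrors the admin setting for how many entries to show.
        self.outputs = outputs
        self.terms = Counter()

    def record(self, term):
        """Store one search term; case and surrounding spaces are normalised."""
        cleaned = term.strip().lower()
        if cleaned:
            self.terms[cleaned] += 1

    def top_terms(self):
        """Return the most frequent terms as (term, count) pairs."""
        return self.terms.most_common(self.outputs)

    def clear(self):
        """Equivalent of the 'Clear Stats DB' button: forget all stored terms."""
        self.terms.clear()
```

Clearing the counts at the start of each week or month gives the "most frequent search terms for this week or this month" view mentioned above.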
5) Searching
Searching for a single word is simple. Just open search.php in your browser, type the word you'd like to search for in the text field and press Submit.
You can also use Boolean logic to narrow your search. See the table below for the allowed operators.
Table 2-1: Search Boolean logic.
Operator |
Description |
AND + |
Finds documents containing all of the specified words or phrases. Peanut AND butter finds documents with both the word peanut and the word butter. |
OR & |
Finds documents containing at least one of the specified words or phrases. Peanut OR butter finds documents containing either peanut or butter. The found documents could contain both items, but not necessarily. |
AND NOT + - |
Excludes documents containing the specified word or phrase. Peanut AND NOT butter finds documents that contain peanut but not butter. NOT must be used with another operator, like AND. The search engine does not accept 'peanut NOT butter'; instead, specify peanut AND NOT butter. |
OR NOT & - |
Finds documents that contain one of the specified words or phrases, or that do not contain the other word. Peanut OR NOT butter finds documents that contain peanut or do not contain butter. |
" " |
Quotation marks are used to denote exact phrases. For example, a search on "New York Times" will match only documents containing the words as an exact phrase. It will not find pages with the words used in a different order, such as "New times in York!" |
|
Braces are used to denote folders. For example, a search on "CPAN/objects" will match only documents stored in www.servername.com/currentlocation/CPAN/objects |
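The operators in Table 2-1 can be illustrated with a short Python sketch that tests a single document against a query. It is a simplified model, not the phpMySearch implementation: it handles only the keyword forms (AND, OR, NOT) and quoted phrases, evaluates strictly from left to right, and treats two terms without an operator between them as AND, which is an assumption.

```python
import re

TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def tokenize(query):
    """Split a query into operators and terms; quoted text is one phrase."""
    tokens = []
    for phrase, word in TOKEN.findall(query):
        if phrase:
            tokens.append(("TERM", phrase.lower()))
        elif word in ("AND", "OR", "NOT"):
            tokens.append((word, word))
        else:
            tokens.append(("TERM", word.lower()))
    return tokens


def contains(document, term):
    """True if the word or exact phrase occurs in the document (any case)."""
    words = r"\s+".join(re.escape(w) for w in term.split())
    return re.search(r"\b" + words + r"\b", document.lower()) is not None


def matches(document, query):
    """Evaluate the query left to right against one document."""
    result = None    # truth value of the part of the query read so far
    pending = None   # operator waiting for its right-hand term
    negate = False
    for kind, value in tokenize(query):
        if kind in ("AND", "OR"):
            pending = kind
        elif kind == "NOT":
            # As in the table, NOT must follow another operator.
            if pending is None:
                raise ValueError("NOT must follow AND or OR")
            negate = True
        else:
            hit = contains(document, value) != negate
            if result is None:
                result = hit
            elif pending == "OR":
                result = result or hit
            else:
                # AND, or adjacent terms (treated as AND: an assumption).
                result = result and hit
            pending, negate = None, False
    return bool(result)
```

For example, matches("Peanut brittle", "peanut AND NOT butter") is True, while matches("New times in York!", '"New York Times"') is False because the words do not appear as an exact phrase. A query such as peanut NOT butter raises an error, mirroring the rule that NOT needs another operator.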
You can also navigate through the site folder structure. In the drop-down box before the Submit button you will see a list of the sub-folders of the current folder. Selecting a folder name restricts the search to this folder and its sub-folders. '..' takes you one level up.
6) Layout
You can make your own design by using the templates and a WYSIWYG editor [e.g. Dreamweaver]. You find the templates in the folder "templates" in your phpMySearch directory or, if you changed it in the admin panel, in the "TemplatesPath" directory. All templates with the "adm" prefix are for the admin tool. All others [e.g. main.tpl, body.tpl, body_docfrom.tpl, body_ok.tpl, refs.tpl] are the search engine templates. Here you see an overview of how the individual templates are arranged:
Here is a short list of the variables you can use. Remember that all searches are case sensitive and that all variables in the templates stand in braces [e.g. {VARIABLE}].
Template variables |
Description |
|
Search String |
|
Search Result |
|
Error Message |
|
Search path |
|
Result URL |
{pageDate} |
Result Date |
{expiresDate} |
Expires Date |
{title} |
Website Title |
{description} |
META tag: Description |
{keywords} |
META tag: keywords |
{author} |
META tag: Author |
{replyTo} |
META tag: replyTo |
{publisher} |
META tag: Publisher |
{copyright} |
META tag: copyright |
{contentLanguage} |
META tag: language |
{pageTopic} |
META tag: page Topic |
{pageType} |
META tag: page type |
{abstract} |
META tag: abstract |
{classification} |
META tag: classification |
{body_1} |
Text from the website (first 255 chars) |
{body_2} |
text from the website (all others) |
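The brace variables listed above behave like placeholders that are replaced by values when a result is displayed. The following Python sketch shows the idea. It is not the phpMySearch template engine: the function name and the template fragment are hypothetical, and leaving unknown variables untouched is an assumption.

```python
import re

PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def render(template, values):
    """Replace each {name} placeholder with its value; unknown names stay as-is."""
    def substitute(match):
        name = match.group(1)
        # Names are matched exactly, so {Title} and {title} are different here.
        return str(values.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)
```

For example, render("<b>{title}</b> ({pageDate})", {"title": "Home", "pageDate": "2004-01-01"}) gives "<b>Home</b> (2004-01-01)". Because names are matched exactly, a misspelt variable simply stays visible in the output, which makes template errors easy to spot.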
When you change the drop down menus to check boxes, radio buttons or hidden fields, ensure the variable names are the same as in the default templates.
search.php needs some variables to work [e.g. page, currPath]. You can also set these with hidden fields. The form method is always GET. If you have problems, have a look at the default templates.
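Because the form uses GET, the variables end up in the query string of the search.php URL. The Python sketch below builds such a URL. Only page and currPath are named in this manual; the field name "query" for the search text and the value formats are assumptions, so check the default templates for the real names.

```python
from urllib.parse import urlencode


def build_search_url(base_url, term, page=1, curr_path=""):
    """Build the GET request a search form would send to search.php."""
    params = {
        "query": term,          # hypothetical field name for the search text
        "page": page,           # variable named in the manual
        "currPath": curr_path,  # variable named in the manual
    }
    # urlencode escapes spaces and slashes so the URL stays valid.
    return base_url + "?" + urlencode(params)
```

A hidden field in a template has the same effect as one of the fixed entries in params: its value is sent along with every search without the visitor seeing it.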
If you send us your template or a link to your search page, we will consider adding it to the official templates list in the next version of phpMySearch!
7) Important information
Please note that if phpMySearch visits web sites and stores their data in your database, you need the agreement of the web site owner to do this.
If the phpMySearch spider visits web sites that are not hosted on the same web server, a lot of data may be transferred between the two servers. This could become expensive if you exceed your hosting limits.
To receive updates regarding phpMySearch you can subscribe to the newsletter service at http://phpMySearch.web4.hm. Also, all updates will be made available on this home page.
For more information on the phpMySearch Group and the phpMySearch project, please see http://phpMySearch.web4.hm.
phpMySearch is a project from:
Webagentur web4.hm
Pyrmonter Strasse 42
D-31789
Hameln
Germany.
Tel: +49-5151-609970-0
http://www.web4.hm