Part Two: Introduction
Part 1 of this article covered a document-based search that ranks results by the
number of search words found in each document. This part extends that approach:
results are ranked by the number of search words found plus the number of
occurrences of each search word in the document.
Suppose the search string is “php tutorials and examples”. The following table
shows, for each article, the number of occurrences of each search word. Common
words such as ‘is’, ‘was’ and ‘and’ are removed from the search string by the
program, so this example has three search words: ‘php’, ‘tutorials’ and
‘examples’.
| No. | Article      | PHP | Tutorials | Examples | Total Occurrences | Rank |
|-----|--------------|-----|-----------|----------|-------------------|------|
| 1   | Article #189 | 15  | 11        | 16       | 42                | 3    |
| 2   | Article #203 | 25  | 12        | 8        | 45                | 1    |
| 3   | Article #257 | 18  | 16        | 5        | 39                | 4    |
| 4   | Article #145 | 6   | 8         | 17       | 31                | 5    |
| 5   | Article #526 | 5   | 17        | 21       | 43                | 2    |
| 6   | Article #86  | 14  | 4         | 10       | 28                | 6    |
Article #203 has the highest total occurrence (45), so it is given rank 1. The
other results are ranked the same way, in descending order of total occurrence.
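The ranking in the table can be reproduced with a few lines of PHP. This is only a sketch that works on the numbers copied from the table; the array layout and variable names are made up for illustration and are not part of the search engine itself.

```php
<?php
// Occurrence counts copied from the table above:
// article label => [php, tutorials, examples].
$counts = array(
    "Article #189" => array(15, 11, 16),
    "Article #203" => array(25, 12, 8),
    "Article #257" => array(18, 16, 5),
    "Article #145" => array(6, 8, 17),
    "Article #526" => array(5, 17, 21),
    "Article #86"  => array(14, 4, 10),
);

// Total occurrences per article (array_map keeps the string keys).
$totals = array_map('array_sum', $counts);

// Sort by total, highest first, keeping the article labels as keys.
arsort($totals);

// Assign ranks 1, 2, 3, ... in sorted order.
$ranks = array();
$rank = 1;
foreach ($totals as $article => $total) {
    $ranks[$article] = $rank;
    echo $rank, ". ", $article, " (", $total, ")\n";
    $rank++;
}
```

Article #203 comes out first with a total of 45 and Article #86 last with 28, which matches the Rank column.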
Building Database:
The database consists of three tables, viz. the Content table, the Keyword table
and the Link table. The Content table holds each article’s title and abstract.
The Keyword table holds the keywords, and its keyword field is indexed. The Link
table holds the keyword id, the content id and the number of occurrences.
The SQL statements for creating these three tables are shown below.
Content Table:
CREATE TABLE content (
    contid mediumint NOT NULL auto_increment,
    title text NOT NULL,
    abstract longtext NOT NULL,
    PRIMARY KEY (contid)
) ENGINE=MyISAM;
Keyword Table:
CREATE TABLE keytable (
    keyid mediumint NOT NULL auto_increment,
    keyword varchar(100) NOT NULL,
    PRIMARY KEY (keyid),
    KEY keyword (keyword)
) ENGINE=MyISAM;
Link Table:
CREATE TABLE link (
    keyid mediumint NOT NULL,
    contid mediumint NOT NULL,
    occurrences mediumint NOT NULL
) ENGINE=MyISAM;
Preparing Database:
The upload engine parses every word in the abstract and processes the whole text.
It removes common words such as ‘is’, ‘was’, ‘and’ and ‘that’. In Part 1,
duplicate words were discarded; here every repetition of a word is counted as an
occurrence. The $wordMap array is an associative array that maps each word to its
number of occurrences.
Next, every word in $wordMap is looked up in the keyword table. If a match is
found, the existing key id is stored in the link table together with the content
id and the occurrence count. Otherwise the new keyword is first inserted into the
keyword table, and the link table is then updated with the newly generated key
id, the content id and the occurrence count.
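The bookkeeping described above can be tried out without MySQL. In the following sketch, plain PHP arrays stand in for the keyword and link tables, and the helper name indexDocument() is hypothetical; it is meant only to illustrate the lookup-or-insert logic, not to replace the upload code.

```php
<?php
// In-memory stand-ins for the two tables (an assumption for this sketch).
$keytable = array();   // keyword => keyid
$link     = array();   // rows of keyid, contid, occurrences

// Hypothetical helper: record the word counts of one document.
function indexDocument(array &$keytable, array &$link, $contentId, array $wordMap)
{
    foreach ($wordMap as $word => $occurrences) {
        if (!isset($keytable[$word])) {
            // New keyword: give it the next key id.
            $keytable[$word] = count($keytable) + 1;
        }
        // Existing or new, the link row always uses the keyword's key id.
        $link[] = array(
            "keyid"       => $keytable[$word],
            "contid"      => $contentId,
            "occurrences" => $occurrences,
        );
    }
}

indexDocument($keytable, $link, 1, array("php" => 3, "tutorials" => 1));
indexDocument($keytable, $link, 2, array("php" => 2, "examples" => 4));

print_r($keytable);
echo count($link), " link rows\n";
```

The word ‘php’ appears in both documents but is stored only once in the keyword array; each document gets its own link row pointing at the same key id.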
FormWordList() Function:
This is the core part of the program. It is called after the ExtractWords()
function. It walks through the extracted words and drops common words such as
‘a’, ‘is’, ‘was’ and ‘and’; all other words are treated as valid. It returns an
associative array, $wordMap, which stores each valid word and its number of
occurrences in the document.
function FormWordList( $wordList ) {
    global $COMMON_WORDS;
    global $MAX_WORD_LENGTH;
    $wordMap = array();
    foreach ( $wordList as $word ) {
        $len = strlen( $word );
        if ( ($len > 1) && ($len < $MAX_WORD_LENGTH) ) {
            // isset() avoids an "undefined index" notice for unseen words
            if ( !isset($COMMON_WORDS[$word]) ) {
                if ( !isset($wordMap[$word]) ) {
                    $wordMap[$word] = 1;
                }else{
                    $wordMap[$word]++;
                }
            }
        }
    }
    return $wordMap;
}
Each word in $wordList is first checked against the length limits and the
common-word list. A common word is skipped and the loop continues with the next
word. Otherwise the function checks whether the word already exists in the
$wordMap associative array: if it does not, the word is added with an occurrence
count of 1; if it does, its occurrence count is incremented by 1.
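As a cross-check of the counting logic, PHP's built-in array_count_values() produces the same kind of word-to-count map. The sketch below is an alternative formulation, not the function used in this article; the short stop-word list and the sample sentence are invented for the example.

```php
<?php
// Assumption: a tiny stop-word list, far shorter than $COMMON_WORDS.
$stopWords = array("is", "a", "and", "the", "for");

// Sample text, invented for this example.
$text = "PHP is a language and PHP tutorials and examples are examples for PHP";

// Lower-case the text and split on anything that is not a letter.
$words = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

// Drop stop words and one-letter words, then count what is left.
$kept = array();
foreach ($words as $word) {
    if (strlen($word) > 1 && !in_array($word, $stopWords, true)) {
        $kept[] = $word;
    }
}
$wordMap = array_count_values($kept);

print_r($wordMap);
```

Here ‘php’ is counted three times and ‘examples’ twice, while ‘is’, ‘a’, ‘and’ and ‘for’ never reach the map.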
ProcessForm() Function:
The code is similar to the Part 1 code; the difference is that the occurrence
count is now stored in the link table along with the key id and content id.
Here is the code.
foreach ( $wordList as $word => $occurrences ) {
    $keyId = "";
    if ( !isset($allWords[$word]) ) {
        // new keyword: insert it and remember the generated key id
        mysql_query( sprintf( "INSERT INTO keytable ( keyword ) VALUES ( '%s' )",
            mysql_escape_string($word) ) );
        $keyId = mysql_insert_id();
        $allWords[$word] = $keyId;
    }
    else {
        $keyId = $allWords[$word];
    }
    // insert the link
    mysql_query( sprintf( "INSERT INTO link (keyid, contid, occurrences) VALUES ( %d, %d, %d)",
        $keyId, $contentId, $occurrences ) );
}
Search Engine:
As discussed in the Introduction, the search now takes into account the number
of occurrences of the search words in each document.
Here is the code.
while($lRow=mysql_fetch_array($lResult)){
    $thisContentId=$lRow["contid"];
    if(!isset($contArray[$thisContentId])){
        //first search word found in this document
        $contArray[$thisContentId]["oc"]=$lRow["occurrences"];
        $contArray[$thisContentId]["id"]=$lRow["contid"];
        $contArray[$thisContentId]["wrank"]=1;
    }else{
        //another search word found in the same document
        $contArray[$thisContentId]["oc"]+=$lRow["occurrences"];
        $contArray[$thisContentId]["wrank"]++;
    }
}
For every record returned from the link table, the content id and the occurrence
count are stored in the associative array $contArray. If the content id already
exists in $contArray, the new occurrence value is added to the stored total and
the count of matched search words (wrank) is incremented.
If $contArray holds any entries, some results were found in the database tables.
Otherwise the program skips to the part that displays the message “NO RESULTS
FOUND”.
if(isset($contArray)){
    //declare an array to store the results
    $FoundRef=array();
    //sort in descending order of total occurrences ("oc")
    uasort($contArray, function($a, $b){ return $b["oc"] - $a["oc"]; });
    //Store the results in the $FoundRef array
    //(the code for this is given in the next step)
}
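In the next step, the title and the first 200 characters of the abstract are
fetched from the content table into the array $FoundRef. Only documents that
contain every search word (wrank equal to $noofSearchWords) are kept.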
foreach($contArray as $cont){
    $rank=$cont["wrank"];
    if ($rank == $noofSearchWords ) {
        $contentId = $cont["id"];
        $occurances = $cont["oc"];
        $aQuery = "select contid,title,left(abstract,200) as summary from content where contid = " . $contentId;
        $aResult = mysql_query($aQuery);
        if(mysql_num_rows($aResult) > 0){
            $aRow = mysql_fetch_array($aResult);
            $FoundRef[] = array (
                "contid" => $aRow["contid"],
                "title" => $aRow["title"],
                "summary" => $aRow["summary"],
                "occurance"=>$occurances );
        }//end of if
    }//end of rank == no of search words
} //end of for each
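The accumulate, filter and sort steps above can be tried without a database. The following sketch uses invented link rows in place of the MySQL result sets; the $linkRows and $matches names are assumptions made for this example only.

```php
<?php
// Invented link rows for two search words (keyid 1 and keyid 2).
$linkRows = array(
    array("keyid" => 1, "contid" => 10, "occurrences" => 5),
    array("keyid" => 1, "contid" => 11, "occurrences" => 9),
    array("keyid" => 2, "contid" => 10, "occurrences" => 7),
    array("keyid" => 2, "contid" => 12, "occurrences" => 20),
    array("keyid" => 2, "contid" => 11, "occurrences" => 4),
);
$noofSearchWords = 2;

// Accumulate total occurrences ("oc") and matched-word count ("wrank").
$contArray = array();
foreach ($linkRows as $row) {
    $id = $row["contid"];
    if (!isset($contArray[$id])) {
        $contArray[$id] = array("id" => $id, "oc" => 0, "wrank" => 0);
    }
    $contArray[$id]["oc"] += $row["occurrences"];
    $contArray[$id]["wrank"]++;
}

// Keep only documents that contain every search word.
$matches = array();
foreach ($contArray as $cont) {
    if ($cont["wrank"] == $noofSearchWords) {
        $matches[] = $cont;
    }
}

// Highest total occurrence first.
usort($matches, function ($a, $b) {
    return $b["oc"] - $a["oc"];
});

foreach ($matches as $m) {
    echo "contid ", $m["id"], ": ", $m["oc"], " occurrences\n";
}
```

Document 12 has the single highest count but contains only one of the two search words, so it is filtered out; documents 11 and 10 remain, in that order.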
Finally we have to display the results in the browser. Here is the code.
if(isset($FoundRef)) {
    echo "<table width=\"100%\"><tr><th class=\"title\">Search Result</th></tr></table>";
    echo "<a href=\"#\" onclick=\"history.back()\">Back</a>";
    echo "<br>";
    echo sizeof($FoundRef);
    echo (sizeof($FoundRef) == 1 ? " reference" : " references");
    echo " found";
    echo "<h5>";
    if($junkWords){
        echo "Common words like";
        foreach($junkWords as $jWords){
            echo " '".htmlspecialchars($jWords)."'";
        }
        echo " are removed from the search string";
    }
    echo "</h5>";
    echo "<table>";
    foreach($FoundRef as $ref){
        echo "<tr><td valign=\"top\">";
        //link to the full document, title in bold
        echo "<a href=\"showdoc.php?refid=".(int)$ref["contid"]."\"><em><b>"
            .htmlspecialchars($ref["title"])."</b></em></a>";
        echo "<div align=\"right\">Occurrence(s): ".(int)$ref["occurance"]."</div>";
        echo "<br><small>".htmlspecialchars($ref["summary"])."...</small><br><br>";
        echo "</td></tr>";
    }
    echo "</table>";
}//end of isset FoundRef
Timer to calculate the time taken to search the documents:
You can include a timer that measures how long the search operation takes.
Here is the code.
//START TIMER
$start=getmicrotime();
//PERFORM SEARCH OPERATION
//END TIMER
$end=getmicrotime();
//TOTAL TIME TAKEN TO DO SEARCH OPERATION
$time_taken=(float)($end-$start);
$time_taken=number_format($time_taken,2,'.','');
The following function returns the current time in seconds, as a float with
microsecond resolution:
function getmicrotime()
{
list($usec,$sec)=explode(" ",microtime());
return ((float)$usec+(float)$sec);
}
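On PHP 5 and later, microtime(true) returns the same float directly, so the helper function is optional. The sketch below times a stand-in operation; the usleep() call is only a placeholder for the search and is not part of the article's code.

```php
<?php
// microtime(true) returns the current Unix time as a float, in seconds.
$start = microtime(true);

// Stand-in for the search operation (assumption: any work to be timed).
usleep(10000); // sleep for about 10 milliseconds

$end = microtime(true);

// Format the elapsed time with two decimal places, as in the article.
$time_taken = number_format($end - $start, 2, '.', '');
echo "Query executed in ", $time_taken, " seconds\n";
```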
Conclusion:
This brings us to the end of the document-based search that ranks results by the
number of search words found plus the number of occurrences of each search word
in each document.
I implemented this technique and optimized it several times to reduce the search
time, testing it on over 60000 distinct documents. Initially the search time was
around 23.35 seconds; successive optimizations reduced it to 10.89 seconds, then
3.56 seconds and finally 0.71 seconds. Note that the search time also varies with
the hardware setup. I welcome comments on this article that would help optimize
the performance further.
Source Code:
upload.php
<?
$MAX_WORD_LENGTH = 50;
//COMMON WORD LIST
$COMMON_WORDS = array(
"a"=>1, "as"=>1, "any"=>1, "all"=>1, "ate"=>1, "after"=>1, "am"=>1, "an"=>1,
"and"=>1, "are"=>1, "at"=>1, "away"=>1, "about"=>1, "ago"=>1, "almost"=>1, "along"=>1,
"answer"=>1, "anybody"=>1, "anywhere"=>1, "arent"=>1, "around"=>1, "ask"=>1, "also"=>1,
"b"=>1, "be"=>1, "better"=>1, "black"=>1, "brown"=>1, "but"=>1, "both"=>1, "bring"=>1,
"because"=>1, "been"=>1, "before"=>1, "big"=>1, "blue"=>1, "best"=>1, "by"=>1, "beg"=>1,
"bad"=>1, "being"=>1, "best"=>1, "between"=>1, "based"=>1,
"c"=>1, "call"=>1, "can"=>1, "cut"=>1, "carry"=>1, "cold"=>1, "could"=>1, "clean"=>1, "cant"=>1,
"come"=>1, "couldnt"=>1, "consider"=>1, "called"=>1,
"d"=>1, "did"=>1, "does"=>1, "do"=>1, "down"=>1, "dont"=>1, "day"=>1, "didnt"=>1,
"e"=>1, "eat"=>1, "every"=>1, "eve"=>1, "egg"=>1, "end"=>1, "eve"=>1, "era"=>1, "eye"=>1,
"each"=>1, "either"=>1, "else"=>1, "even"=>1, "ever"=>1, "every"=>1, "everybody"=>1, "everyone"=>1,
"f"=>1, "for"=>1, "from"=>1, "full"=>1, "found"=>1, "far"=>1, "fly"=>1, "fall"=>1, "first"=>1,
"fast"=>1, "five"=>1, "fall"=>1, "find"=>1, "four"=>1, "funny"=>1,
"g"=>1, "go"=>1, "get"=>1, "goes"=>1, "give"=>1, "gun"=>1, "good"=>1, "god"=>1, "give"=>1,
"got"=>1, "green"=>1, "grow"=>1, "good"=>1, "green"=>1, "grow"=>1, "got"=>1, "gave"=>1,
"going"=>1, "gone"=>1, "given"=>1,
"h"=>1, "hi"=>1, "hoo"=>1, "he"=>1, "his"=>1, "him"=>1, "her"=>1, "has"=>1, "how"=>1,
"hold"=>1, "how"=>1, "hot"=>1, "had"=>1, "here"=>1, "help"=>1, "hurt"=>1, "have"=>1,
"havet"=>1, "having"=>1, "hers"=>1, "home"=>1, "home"=>1, "href"=>1,
"i"=>1, "in"=>1, "is"=>1, "if"=>1, "its"=>1, "i"=>1, "it"=>1, "into"=>1, "im"=>1, "ill"=>1, "id"=>1,
"j"=>1, "just"=>1, "jump"=>1, "jet"=>1, "jaw"=>1, "jar"=>1, "jag"=>1, "jam"=>1, "job"=>1,
"jog"=>1, "joy"=>1, "jot"=>1,
"k"=>1, "kind"=>1, "keep"=>1, "kiss"=>1, "kinder"=>1, "kind"=>1, "kid"=>1, "key"=>1,
"kit"=>1, "ken"=>1, "know"=>1,
"l"=>1, "like"=>1, "little"=>1, "lust"=>1, "led"=>1, "lap"=>1, "let"=>1, "live"=>1,
"long"=>1, "live"=>1, "let"=>1, "look"=>1, "law"=>1, "leg"=>1, "lie"=>1, "lid"=>1,
"less"=>1, "look"=>1, "looking"=>1,
"m"=>1, "my"=>1, "may"=>1, "me"=>1, "many"=>1, "must"=>1, "much"=>1, "made"=>1,
"my"=>1, "make"=>1, "met"=>1, "mix"=>1, "mom"=>1, "mud"=>1, "mug"=>1, "mum"=>1,
"myself"=>1, "more"=>1, "most"=>1, "max"=>1, "maximun"=>1,
"n"=>1, "no"=>1, "nose"=>1, "not"=>1, "new"=>1, "now"=>1, "nor"=>1, "nod"=>1, "now"=>1,
"nil"=>1, "nib"=>1, "nut"=>1, "nun"=>1, "never"=>1, "near"=>1, "news"=>1, "none"=>1,
"nothing"=>1, "next"=>1,
"o"=>1, "of"=>1, "on"=>1, "or"=>1, "old"=>1, "open"=>1, "once"=>1, "only"=>1, "off"=>1,
"our"=>1, "oops"=>1, "out"=>1, "oil"=>1, "old"=>1, "oak"=>1, "oak"=>1, "ohm"=>1,
"oho"=>1, "ore"=>1, "owl"=>1, "often"=>1, "other"=>1, "ours"=>1, "out"=>1, "over"=>1, "one"=>1,
"p"=>1, "play"=>1, "pull"=>1, "pretty"=>1, "put"=>1, "push"=>1, "pad"=>1, "pop"=>1,
"pan"=>1, "pap"=>1, "pay"=>1, "peg"=>1, "pet"=>1, "phi"=>1, "pie"=>1, "pig"=>1,
"pet"=>1, "pub"=>1, "pin"=>1, "pit"=>1, "ply"=>1, "pod"=>1, "pus"=>1, "page"=>1, "please"=>1,
"q"=>1, "question"=>1, "quick"=>1, "quest"=>1,
"r"=>1, "ran"=>1, "red"=>1, "run"=>1, "ride"=>1, "read"=>1, "rag"=>1, "rat"=>1,
"ran"=>1, "ram"=>1, "red"=>1, "ray"=>1, "rev"=>1, "rid"=>1, "rib"=>1, "rig"=>1,
"rim"=>1, "rip"=>1, "rob"=>1, "rod"=>1, "roe"=>1, "row"=>1, "rum"=>1, "rug"=>1,
"rut"=>1, "rather"=>1, "recent"=>1,
"s"=>1, "so"=>1, "some"=>1, "stop"=>1, "say"=>1, "sing"=>1, "say"=>1, "she"=>1,
"stay"=>1, "said"=>1, "start"=>1, "soon"=>1, "six"=>1, "seven"=>1, "see"=>1, "sit"=>1,
"sitting"=>1, "son"=>1, "soap"=>1, "spy"=>1, "sum"=>1, "say"=>1, "sea"=>1, "sex"=>1,
"shy"=>1, "sib"=>1, "sic"=>1, "sin"=>1, "sip"=>1, "sir"=>1, "sky"=>1, "ski"=>1, "sly"=>1,
"sob"=>1, "sow"=>1, "sod"=>1, "should"=>1, "something"=>1, "sometime"=>1,
"somewhere"=>1, "set"=>1, "simple"=>1, "such"=>1, "side"=>1,
"t"=>1, "to"=>1, "the"=>1, "then"=>1, "that"=>1, "this"=>1, "those"=>1, "than"=>1,
"these"=>1, "those"=>1, "they"=>1, "thank"=>1, "tank"=>1, "tell"=>1, "take"=>1,
"together"=>1, "try"=>1, "today"=>1, "three"=>1, "tie"=>1, "thy"=>1, "tax"=>1, "tea"=>1,
"tap"=>1, "taxi"=>1, "ten"=>1, "tin"=>1, "tip"=>1, "tit"=>1, "toe"=>1, "tog"=>1,
"tom"=>1, "ton"=>1, "top"=>1, "tow"=>1, "toy"=>1, "two"=>1, "tub"=>1, "tug"=>1,
"tun"=>1, "tux"=>1, "true"=>1, "thank"=>1, "theirs"=>1, "them"=>1, "there"=>1,
"though"=>1, "through"=>1, "thus"=>1, "time"=>1, "times"=>1, "too"=>1, "type"=>1,
"u"=>1, "use"=>1, "us"=>1, "using"=>1, "usage"=>1, "useful"=>1, "up"=>1, "upon"=>1,
"ups"=>1, "under"=>1, "until"=>1, "untrue"=>1, "users"=>1,
"v"=>1, "van"=>1, "vex"=>1, "via"=>1, "vow"=>1, "vat"=>1, "vim"=>1, "version"=>1, "very"=>1,
"w"=>1, "was"=>1, "waste"=>1, "why"=>1, "who"=>1, "whose"=>1, "well"=>1,
"walk"=>1, "were"=>1, "which"=>1, "wish"=>1, "white"=>1, "with"=>1, "would"=>1,
"write"=>1, "when"=>1, "what"=>1, "wash"=>1, "warm"=>1, "want"=>1, "went"=>1, "will"=>1,
"won"=>1, "woe"=>1, "wow"=>1, "woo"=>1, "wins"=>1, "where"=>1, "web"=>1, "way"=>1,
"were"=>1, "where"=>1, "whom"=>1, "wide"=>1, "within"=>1, "without"=>1, "world"=>1,
"worse"=>1, "worst"=>1, "www"=>1, "we"=>1, "whether"=>1,
"y"=>1, "yes"=>1, "ya"=>1, "you"=>1, "yellow"=>1, "your"=>1, "yet"=>1, "yen"=>1,
"year"=>1, "yep"=>1, "yon"=>1, "yours"=>1,
"z"=>1, "zoo"=>1, "zip"=>1, "zed"=>1, "zinc"=>1, "zoom"=>1, "zero"=>1, "zeal"=>1, "zone"=>1);
$allWords = array();
if($submit){
global $allWords;
mysql_connect( "localhost", "root", "" ) or
die( "Unable to connect to database" );
mysql_select_db( "test" ) or die( "Unable to select database"
);
LoadCurrentWords();
if ( $title and $body){
ProcessForm($title ,$body);
echo "Successfully Finished Parsing and Uploading Content";
}else{
$err="Please fill in the fields to upload\n";
form($err);
}
}else{ //end of main
form($err);
}
function form($errmsg)
{ ?>
<h4 align="center">File Parser & Uploader</h4>
<div align="center"><b><? echo $errmsg; ?></b></div>
<center>
<form method="POST" action="<? echo $PHP_SELF ?>">
Title: <input type="text" name="title" size="50" maxlength="100"><p>
Abstract: <textarea rows=20 cols=50 wrap="off" name="body"></textarea><p>
<input type="submit" name="submit" value="Start Parsing and Upload Content">
</form>
</center>
<?
}
function LoadCurrentWords(){
global $allWords;
$result = mysql_query( "select keyid, keyword from keytable" ) or
die( "Error in executing mysql query" );
while ( $row = mysql_fetch_array($result) ) {
$allWords[$row['keyword']] = $row['keyid'];
}
}
function ExtractWords($text){
$STATE0 = 0; //Numeric / Other Characters
$STATE1= 1; //Alpha Characters
$state = $STATE0;
$wordList = array();
$curWord = "";
for ( $i = 0; $i < strlen($text); ++$i ) {
$ch = $text[$i];
$isAlpha = ctype_alpha( $ch );
if ( $state == $STATE0) {
if ( $isAlpha ) {
$curWord = $ch;
$state = $STATE1;
}
}
else if ( $state == $STATE1) {
if ( $isAlpha ) {
$curWord .= $ch;
}
else {
$wordList[] = strtolower( $curWord );
$state = $STATE0;
}
}
}
if ( $state == $STATE1) {
$wordList[] = strtolower( $curWord );
}
return $wordList;
}
function FormWordList( $wordList ) {
global $COMMON_WORDS;
global $MAX_WORD_LENGTH;
$wordMap = array();
foreach ( $wordList as $word ) {
$len = strlen( $word );
if ( ($len > 1) && ($len < $MAX_WORD_LENGTH) ) {
if ( !isset($COMMON_WORDS[$word]) ) {
if ( !isset($wordMap[$word]) ) {
$wordMap[$word] = 1;
}else{
$wordMap[$word]++;
}
}
}
}
return $wordMap;
}
function FilterCommonAndDuplicateWords( $wordList ) {
global $COMMON_WORDS;
global $MAX_WORD_LENGTH;
$wordMap = array();
foreach ( $wordList as $word ) {
$len = strlen( $word );
if ( ($len > 1) && ($len < $MAX_WORD_LENGTH) ) {
if ( !$wordMap[$word] ) {
if ( !$COMMON_WORDS[$word] ) {
$wordMap[$word] = 1;
}
}
}
}
return $wordMap;
}
function ProcessForm($title ,$body){
global $allWords;
$tempWordList = ExtractWords( $body );
$wordList = FormWordList($tempWordList);
// insert into content
mysql_query( sprintf( "INSERT INTO content (title, abstract) VALUES ('%s',
'%s')",
mysql_escape_string($title), mysql_escape_string($body) ) );
//store the newly generated content id in $contentId
$contentId = mysql_insert_id();
// insert all the new words and links
foreach($wordList as $word => $occurances){
$keyId = "";
if ( !isset($allWords[$word]) ) {
mysql_query( sprintf( "INSERT INTO keytable ( keyword ) VALUES ( '%s' )",
mysql_escape_string($word) ) );
$keyId = mysql_insert_id();
$allWords[$word] = $keyId;
}
else {
$keyId = $allWords[$word];
}
// insert the link
mysql_query( sprintf( "INSERT INTO link (keyid, contid, occurrences) VALUES
( %d, %d, %d)", $keyId, $contentId,$occurances ) );
echo mysql_error();
}
//End of Processing Form.
}
?>
search.php
<html>
<head>
<title>Search Engine</title>
<style type="text/css">
body{ font-size:20; font-weight:bold; font-stretch:semi-expand; font-family:MSserif;
color:#0066CC; background-color:#EEEEE4;
align:center; background-color:white }
h4{ background-color:#0066CC; color:#FFFFFF; font-family:verdana; }
h3{ color:#0066CC; }
th{ background-color:#6996ED; color:#FFFFFF; font-family:Arial; }
a{text-decoration:none;}
</style>
</head>
<body>
<?php
if($submit)
{
if(!$keywords){
$errmsg="Sorry, Please fill in search field";
form($errmsg);
}else{
// Connect to the database
$dServer = "localhost";
$dDb = "test";
$dUser = "admin";
$dPass = "";
$s = @mysql_connect($dServer, $dUser, $dPass)
or die("Couldn't connect to database server");
@mysql_select_db($dDb, $s)
or die("Couldn't connect to database");
$CommonWords=array("a"=>1, "as"=>1, "any"=>1, "all"=>1, "am"=>1, "an"=>1, "and"=>1, "are"=>1, "at"=>1,
"b"=>1, "be"=>1, "but"=>1, "by"=>1,
"c"=>1, "can"=>1,
"d"=>1, "did"=>1, "does"=>1, "do"=>1,
"e"=>1, "each"=>1, "else"=>1, "even"=>1, "ever"=>1,
"f"=>1, "for"=>1, "from"=>1,
"g"=>1, "go"=>1, "get"=>1,
"h"=>1, "hi"=>1, "he"=>1, "his"=>1, "him"=>1, "her"=>1, "has"=>1, "how"=>1,
"had"=>1, "here"=>1, "have"=>1,
"i"=>1, "in"=>1, "is"=>1, "if"=>1, "its"=>1,
"j"=>1, "just"=>1, "k"=>1,
"l"=>1, "like"=>1, "led"=>1, "lap"=>1, "let"=>1,
"m"=>1, "my"=>1, "me"=>1, "many"=>1, "must"=>1, "more"=>1,
"n"=>1, "no"=>1, "not"=>1, "new"=>1, "now"=>1,
"o"=>1, "of"=>1, "on"=>1, "or"=>1, "once"=>1,
"p"=>1, "q"=>1, "r"=>1,
"s"=>1, "so"=>1, "some"=>1, "say"=>1, "she"=>1,
"t"=>1, "to"=>1, "the"=>1, "then"=>1, "that"=>1,
"u"=>1, "use"=>1, "us"=>1, "up"=>1, "upon"=>1,
"v"=>1, "via"=>1, "vow"=>1,
"w"=>1, "was"=>1, "why"=>1, "who"=>1, "whose"=>1, "were"=>1,
"y"=>1, "yes"=>1, "ya"=>1, "you"=>1, "your"=>1,
"z"=>1, "zoo"=>1,);
//START TIMER
$start=getmicrotime();
$search_keywords=strtolower(trim($keywords));
$arrWords = explode(" ", $search_keywords);
//remove duplicates
$arrWords=array_unique($arrWords);
$searchWords=array();
$junkWords=array();
foreach($arrWords as $word)
//remove common words
if(!isset($CommonWords[$word])){
$searchWords[]=$word;
}else{
$junkWords[]=$word;
}
//count no of words in the search words and store in a variable
$noofSearchWords=count($searchWords);
//join the search words into an OR condition, escaping each one
$arrWords = implode("' OR keyword='", array_map('mysql_real_escape_string', $searchWords));
//get the key ids from the key table
$query = "select * from keytable where keyword='$arrWords'";
$kResult = mysql_query($query);
//array to store the content id and occurances
$contArray=array();
$rescount=0;
//search for the link table only if all the given keywords present in the keytable
if(mysql_num_rows($kResult) == $noofSearchWords){
while($kRow=mysql_fetch_array($kResult))
{
//get the link ids for each key id
$kid= $kRow['keyid'];
$query = "SELECT * FROM link WHERE keyid=$kid";
$lResult = mysql_query($query);
//echo mysql_num_rows($lResult);
while($lRow=mysql_fetch_array($lResult)){
$thisContentId=$lRow["contid"];
if(!isset($contArray[$thisContentId])){
$contArray[$thisContentId]["oc"]=$lRow["occurrences"];
$contArray[$thisContentId]["id"]=$lRow["contid"];
$contArray[$thisContentId]["wrank"]=1;
}else{
$contArray[$thisContentId]["oc"]+=$lRow["occurrences"];
$contArray[$thisContentId]["wrank"]++;
}
}
}//end of while
if(isset($contArray)){
//declare an array to store the results
$FoundRef=array();
//sort in descending order of total occurrences ("oc")
uasort($contArray, function($a, $b){ return $b["oc"] - $a["oc"]; });
// while(list($contentId,$occurances)=each($contArray)){
foreach($contArray as $cont){
$rank=$cont["wrank"];
if ($rank == $noofSearchWords ) {
$contentId = $cont["id"];
$occurances = $cont["oc"];
$aQuery = "select contid,title,left(abstract,200) as summary from content
where contid = " . $contentId;
$aResult = mysql_query($aQuery);
if(mysql_num_rows($aResult) > 0){
$aRow = mysql_fetch_array($aResult);
$FoundRef[] = array (
"contid" => $aRow["contid"],
"title" => $aRow["title"],
"summary" => $aRow["summary"],
"occurance"=>$occurances );
}//end of if
}//end of rank == no of search words
} //end of for each
//end TIMER
$end=getmicrotime();
//TOTAL TIME TAKEN TO DO SEARCH OPERATION
$time_taken=(float)($end-$start);
$time_taken=number_format($time_taken,2,'.','');
}//end of if countwords == mysql_number_of _ records
//end TIMER
$end=getmicrotime();
//TOTAL TIME TAKEN TO DO SEARCH OPERATION
$time_taken=(float)($end-$start);
$time_taken=number_format($time_taken,2,'.','');
if(isset($FoundRef))
{
echo "<table width=\"100%\"><tr><th class=\"title\">Search Result</th></tr></table>";
echo "<a href=\"#\" onclick=\"history.back()\">Back</a>";
echo "<br>";
echo sizeof($FoundRef);
echo (sizeof($FoundRef) == 1 ? " reference" : " references");
echo " found";
echo "<p>";
echo "<h5>Your Query executed in ".$time_taken." Seconds</h5>";
echo "<h5>";
if($junkWords){
echo "Common words like";
foreach($junkWords as $jWords){
echo " "."'".$jWords."'";
}
echo "are removed from the search string";
}
echo "</h5>";
echo "<table>";
foreach($FoundRef as $a => $value)
{
echo "<tr><td valign=\"top\">";
// echo $FoundRef[$a]["contid"];
?>
<a href="showdoc.php?refid=<? echo $FoundRef[$a]["contid"]?>"><em><b><?
echo htmlspecialchars($FoundRef[$a]["title"])?></b></em></a><div
align="right"> Occurrence(s): <? echo $FoundRef[$a]["occurance"]
?></div>
<br><small><? echo htmlspecialchars($FoundRef[$a]["summary"]) ?>...</small><br><br>
<? echo "</td></tr>";
}?>
<?
echo "</table>";
}//end of isset FoundRef
}else {
//end TIMER
$end=getmicrotime();
//TOTAL TIME TAKEN TO DO SEARCH OPERATION
$time_taken=(float)($end-$start);
$time_taken=number_format($time_taken,2,'.','');
echo "<p>Your Query Executed in $time_taken Seconds";
$errmsg="<p>No Search result found for '$keywords'";
echo $errmsg;
echo "<br><a href=\"#\" onclick=\"history.back()\">Back</a>";
}//endof isset ref
}//end of if key word exists
} else{ //display the form
form("");
} //END OF FORM DISPLAY ?>
</body>
</html>
<?
function form($errmsg)
{ ?>
<h4 align="center">Search Engine</h4>
<b><? echo $errmsg; ?></b>
<center>
<form method="POST" action="<? echo $PHP_SELF ?>">
Enter keywords to search on:
<input type="text" name="keywords" maxlength="100">
<input type="submit" name="submit" value="Search">
</form>
</center>
<?
}
function getmicrotime()
{
list($usec,$sec)=explode(" ",microtime());
return ((float)$usec+(float)$sec);
}
?>
showdoc.php
<?
//cast to int so the id can be used safely in the query below
$contid=(int)$_GET["refid"];
//Connect to Database
$dServer = "localhost";
$dDb = "test";
$dUser = "admin";
$dPass = "";
$s = @mysql_connect($dServer, $dUser, $dPass)
or die("Couldn't connect to database server");
@mysql_select_db($dDb, $s)
or die("Couldn't connect to database");
//Get the data from the database
$query=mysql_query("SELECT * FROM content WHERE contid={$contid}");
$result=mysql_fetch_array($query);
?>
<html>
<head><title>Search Display</title>
<style type="text/css">
h2{
font-family:verdana;
font-size:15;
color:#123453;
}
th{
font-family:verdana;
font-size:12;
color:#123453;
}
td{
font-family:verdana;
font-size:12;
color:#123453;
}
th.title{
background-color:#10B0B0;
color:white;
}
td.data{
background-color:#E4E4E4;
}
a{
text-decoration:none;
font-family:verdana;
font-size:12;
background-color:#E4E4E4;
}
h3{
background-color:#5494E4;
color:white;
}
</style>
</head>
<body bgcolor=#F8F8F8>
<div align="center"><h3>Display Article</h3></div>
<table><tr><th>|</th>
<th> <a href="#" onclick="history.back()">Back
to Results</a></th>
<th>|</th></tr>
</table>
<table bgcolor="#F0F0F0" cellpadding="10" cellspacing="10">
<tr><td>Title</td><td><? echo $result["title"]
?></td></tr>
<tr><td valign="top">Abstract</td><td><?
echo $result["abstract"] ?></td></tr>
</table>
</body>
</html>
-----------------
Murali Dhar at
Indiaclen
I have nearly 2 years experience in IT and 8 months with PHP. My interests
include designing, programming and learning new technologies. I am presently working for a private
organization in Chennai, India.