Creating a Code Search Engine with PHP and MySQL(1)

原文:http://www.developer.com/db/article.php/3671021

 

I'm just a few days away from launching a comprehensive support website for my book, "Beginning PHP and MySQL 5, Second Edition", and among other features, have built a search engine for sifting through the more than 500 code snippets found throughout the book. This was an interesting exercise because it involves a number of storing a fairly significant amount of text within a MySQL database, using MySQL's full-text search facility, and devising an effective way to extract and display the code in the browser.

In this article I'll offer a simplified version of this search engine, introducing you to some compelling PHP and MySQL features along the way. You might adopt what you learn towards building your own search engine, or towards other applications.

The Database Schema

Just a single table is required for the engine's operation. The table, code, serves as the code repository. Each example is stored along with a suitable title and the chapter number in which it appears. Because the search engine should retrieve examples based on keywords found in the example title or in the code itself, a FULLTEXT index has been added for these columns. Because the table contents will rarely change beyond the occasional bug fix, its backed by the read-optimized MyISAM storage engine. The table follows:

CREATE TABLE code (
 id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 title VARCHAR(50) NOT NULL,
 chapter TINYINT UNSIGNED NOT NULL,
 code TEXT NOT NULL,
 FULLTEXT (title,code)
) TYPE = MYISAM;

Loading the Table

The downloadable zip file containing all of the book's code should be easily navigable so readers can easily retrieve the desired example. To meet this requirement, the zip file contains a number of directories labeled according to chapter number (1, 2, 3, ... 37), and each script is aptly named with a lowercase title and series of underscores, for example retrieving_array_keys.php. Therefore a script capable of dealing with these two organizational matters is required in order to automate the process of loading the scripts into the database.

You might recognize this task as one well suited for recursion, and indeed it is. The following script does the job nicely:

<?php

mysql_connect("localhost","gilmore","secret");

mysql_select_db("beginningphpandmysqlcom");

// Running on Windows or Linux/Unix?
$delimiter = strstr(PHP_OS, "WIN") ? "/" : "/";

function parseCodeTree($path) {

  global $delimiter;

  if ($dir = opendir($path)) {
	
    while ($item = readdir($dir)) {

      // If $item is a directory, recurse
      if (is_dir($path.$delimiter.$item) && $item != "." && $item != "..") {
			
        //printf("Directory: %s <br />", $item);
        parseCodeTree($path.$delimiter.$item);

      // $item is a file, so insert it into database
      } elseif ($item != "." && $item != "..") {

        // Retrieve the chapter number
        $directory = substr(strrchr($path, "$delimiter"), 1);

        //printf("File: %s <br />", $item);

        // Convert the file name to a readable title
        $scriptTitle = str_replace(".php", "", $item);
        $scriptTitle = str_replace("_", " ", $scriptTitle);
		
        // Retrieve the file contents
        $scriptContents = file_get_contents($path.$delimiter.$item);

        // Insert the file information into database
        $query = "INSERT INTO code VALUES('NULL', '$scriptTitle', '$directory', '$scriptContents')";
        $result = mysql_query($query);

      }
    }
    closedir($dir);
  }
  return 1;
}

parseCodeTree('code');

?>

I've purposefully left two printf() statements in the script so you can view the script's logic. Some sample output follows:

Directory: 4
File: array_key.php
File: is_array.php
Directory: 5
File: multidimensional_array.php
File: retrieving_array_keys.php
File: retrieving_array_values.php
File: slicing_an_array.php

Building the Search Engine

With the code and corresponding metadata inserted into the database, all that's left to do is build the search engine. Believe it or not, this is perhaps the easiest part of the project, thanks to MySQL's fulltext search capabilities. Although I've used the symfony framework to abstract the database interaction, for the purposes of this article I've used POPM (Plain Old PHP and MySQL) to build the search engine. The search form is exceedingly simple, and looks like this:

<form method="POST" action="search.php">
Search the code repository:<br />
<input type="text" id="keyword" name="keyword" /><br />
<input type="submit" value="Search!" />
</form>

The search script (search.php) looks something like this. Provided you've used PHP to interact with MySQL before, there shouldn't be any surprises, except for perhaps the query itself. This query takes advantage of MySQL's fulltext feature to compare the keyword against those columns that have been identified as searchable using MySQL's fulltext conditions. These conditions can produce unexpected results without doing some advance reading, so be sure to peruse the appropriate section of the MySQL documentation before building your own queries.

<?php

  mysql_connect("localhost","gilmore","secret");
  mysql_select_db("beginningphpandmysqlcom");

  $keyword = mysql_real_escape_string(

原文:http://www.developer.com/db/article.php/3671021

 

I'm just a few days away from launching a comprehensive support website for my book, "Beginning PHP and MySQL 5, Second Edition", and among other features, have built a search engine for sifting through the more than 500 code snippets found throughout the book. This was an interesting exercise because it involves a number of storing a fairly significant amount of text within a MySQL database, using MySQL's full-text search facility, and devising an effective way to extract and display the code in the browser.

In this article I'll offer a simplified version of this search engine, introducing you to some compelling PHP and MySQL features along the way. You might adopt what you learn towards building your own search engine, or towards other applications.

The Database Schema

Just a single table is required for the engine's operation. The table, code, serves as the code repository. Each example is stored along with a suitable title and the chapter number in which it appears. Because the search engine should retrieve examples based on keywords found in the example title or in the code itself, a FULLTEXT index has been added for these columns. Because the table contents will rarely change beyond the occasional bug fix, its backed by the read-optimized MyISAM storage engine. The table follows:

CREATE TABLE code (
 id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 title VARCHAR(50) NOT NULL,
 chapter TINYINT UNSIGNED NOT NULL,
 code TEXT NOT NULL,
 FULLTEXT (title,code)
) TYPE = MYISAM;

Loading the Table

The downloadable zip file containing all of the book's code should be easily navigable so readers can easily retrieve the desired example. To meet this requirement, the zip file contains a number of directories labeled according to chapter number (1, 2, 3, ... 37), and each script is aptly named with a lowercase title and series of underscores, for example retrieving_array_keys.php. Therefore a script capable of dealing with these two organizational matters is required in order to automate the process of loading the scripts into the database.

You might recognize this task as one well suited for recursion, and indeed it is. The following script does the job nicely:

<?php

mysql_connect("localhost","gilmore","secret");

mysql_select_db("beginningphpandmysqlcom");

// Running on Windows or Linux/Unix?
$delimiter = strstr(PHP_OS, "WIN") ? "/" : "/";

function parseCodeTree($path) {

  global $delimiter;

  if ($dir = opendir($path)) {
	
    while ($item = readdir($dir)) {

      // If $item is a directory, recurse
      if (is_dir($path.$delimiter.$item) && $item != "." && $item != "..") {
			
        //printf("Directory: %s <br />", $item);
        parseCodeTree($path.$delimiter.$item);

      // $item is a file, so insert it into database
      } elseif ($item != "." && $item != "..") {

        // Retrieve the chapter number
        $directory = substr(strrchr($path, "$delimiter"), 1);

        //printf("File: %s <br />", $item);

        // Convert the file name to a readable title
        $scriptTitle = str_replace(".php", "", $item);
        $scriptTitle = str_replace("_", " ", $scriptTitle);
		
        // Retrieve the file contents
        $scriptContents = file_get_contents($path.$delimiter.$item);

        // Insert the file information into database
        $query = "INSERT INTO code VALUES('NULL', '$scriptTitle', '$directory', '$scriptContents')";
        $result = mysql_query($query);

      }
    }
    closedir($dir);
  }
  return 1;
}

parseCodeTree('code');

?>

I've purposefully left two printf() statements in the script so you can view the script's logic. Some sample output follows:

Directory: 4
File: array_key.php
File: is_array.php
Directory: 5
File: multidimensional_array.php
File: retrieving_array_keys.php
File: retrieving_array_values.php
File: slicing_an_array.php

Building the Search Engine

With the code and corresponding metadata inserted into the database, all that's left to do is build the search engine. Believe it or not, this is perhaps the easiest part of the project, thanks to MySQL's fulltext search capabilities. Although I've used the symfony framework to abstract the database interaction, for the purposes of this article I've used POPM (Plain Old PHP and MySQL) to build the search engine. The search form is exceedingly simple, and looks like this:

<form method="POST" action="search.php">
Search the code repository:<br />
<input type="text" id="keyword" name="keyword" /><br />
<input type="submit" value="Search!" />
</form>

The search script (search.php) looks something like this. Provided you've used PHP to interact with MySQL before, there shouldn't be any surprises, except for perhaps the query itself. This query takes advantage of MySQL's fulltext feature to compare the keyword against those columns that have been identified as searchable using MySQL's fulltext conditions. These conditions can produce unexpected results without doing some advance reading, so be sure to peruse the appropriate section of the MySQL documentation before building your own queries.

___FCKpd___4
POST['keyword']); // Perform the fulltext search $query = "SELECT id, title, chapter, code FROM code WHERE MATCH(title, code) AGAINST ('$keyword')"; $result = mysql_query($query); // If results were found, output them if (mysql_num_rows($result) > 0) { printf("Results: <br />"); while ($row = mysql_fetch_array($result)) { printf("Chapter %s: <a href='displaycode.php?id=%s'>%s</a>", $row['chapter'], $row['id'], ucfirst($row['title'])); } } else { printf("No results found"); } ?>
Serialization failure with the error code 1213 typically occurs in database systems like MySQL when there's a deadlock detected during a transaction. A deadlock happens when two or more processes are waiting for each other to release resources they hold, creating a circular dependency that cannot be resolved without manual intervention. The message suggests that one of the transactions is unable to proceed because it can't acquire a lock on the data it needs, while another transaction has already locked the same resource, causing an impasse. To resolve this issue: 1. **Retry Transaction**: You may try restarting the transaction by rolling back any changes made and then retrying, hoping that the deadlock situation resolves itself due to different execution order. 2. **Check Locks**: Analyze your SQL queries to ensure no long-running SELECT statements with LOCK IN SHARE MODE, which can cause deadlocks if not handled properly. 3. **Optimize Query Execution**: Review your indexing strategy and query logic to minimize blocking operations and reduce the likelihood of deadlocks. 4. **Database Configuration**: Configure your database server's settings for deadlock detection and resolution, such as increasing the `innodb_deadlock_detect` parameter or setting a higher `innodb_lock_wait_timeout`. 5. **Deadlock Detection Tools**: Use tools like MySQL's `SHOW ENGINE INNODB STATUS` command or monitoring software to identify and resolve deadlocks manually. If the problem persists frequently, you might need to investigate the application's concurrency patterns and improve its design to avoid deadlocks altogether.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值