首页 > PHP开发 > php中级 > PHP扩展之针对搜索引擎的扩展(一)——Apache Solr
2014
11-07

PHP扩展之针对搜索引擎的扩展(一)——Apache Solr

简介及安装

Solr扩展允许你在PHP5中有效的和Apache Solr服务器通话。

Solr扩展快速, 轻量级, 有允许PHP开发者和Solr服务器实例有效通话的特性丰富的库。

兼容APache Solr 1.3 和 1.4。

有内置工具添加文档和更新到solr服务器,还有工具允许你在搜索文档时构建高级查询。

需要PHP5.2.11及以上版本。

需要安装libxml及curl扩展。

使用示例

Example #1 启动文件的内容

<?php

/* Solr服务器的主机域名*/
define('SOLR_SERVER_HOSTNAME', 'solr.example.com');

/* 是否以安全模式运行 */
define('SOLR_SECURE', true);

/* 连接到的HTTP端口 */
define('SOLR_SERVER_PORT', ((SOLR_SECURE) ? 8443 : 8983));

/* HTTP 基本认证名 */
define('SOLR_SERVER_USERNAME', 'admin');

/* HTTP 基本认证密码 */
define('SOLR_SERVER_PASSWORD', 'changeit');

/* HTTP 连接超时 */
/* http数据传输操作时有一个最大执行时间(秒),默认是30秒 */
define('SOLR_SERVER_TIMEOUT', 10);

/* File name to a PEM-formatted private key + private certificate */
define('SOLR_SSL_CERT', 'certs/combo.pem');

/* File name to a PEM-formatted private certificate only */
define('SOLR_SSL_CERT_ONLY', 'certs/solr.crt');

/* File name to a PEM-formatted private key */
define('SOLR_SSL_KEY', 'certs/solr.key');

/* Password for PEM-formatted private key file */
define('SOLR_SSL_KEYPASSWORD', 'StrongAndSecurePassword');

/* Name of file holding one or more CA certificates to verify peer with*/
define('SOLR_SSL_CAINFO', 'certs/cacert.crt');

/* Name of directory holding multiple CA certificates to verify peer with */
define('SOLR_SSL_CAPATH', 'certs/');

?>

Example #2 添加文档到索引

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$doc = new SolrInputDocument();

$doc->addField('id', 334455);
$doc->addField('cat', 'Software');
$doc->addField('cat', 'Lucene');

$updateResponse = $client->addDocument($doc);

print_r($updateResponse->getResponse());

?>

以上例程的输出类似于:

SolrObject Object
(
    [responseHeader] => SolrObject Object
        (
            [status] => 0
            [QTime] => 446
        )
)

Example #3 合并一个文档到另一个文档

<?php

include "bootstrap.php";

$doc = new SolrDocument();

$second_doc = new SolrDocument();

$doc->addField('id', 1123);

$doc->features = "PHP Client Side";
$doc->features = "Fast development cycles";

$doc['cat'] = 'Software';
$doc['cat'] = 'Custom Search';
$doc->cat   = 'Information Technology';

$second_doc->addField('cat', 'Lucene Search');

$second_doc->merge($doc, true);

print_r($second_doc->toArray());
?>

以上例程的输出类似于:

Array
(
    [document_boost] => 0
    [field_count] => 3
    [fields] => Array
        (
            [0] => SolrDocumentField Object
                (
                    [name] => cat
                    [boost] => 0
                    [values] => Array
                        (
                            [0] => Software
                            [1] => Custom Search
                            [2] => Information Technology
                        )
                )

            [1] => SolrDocumentField Object
                (
                    [name] => id
                    [boost] => 0
                    [values] => Array
                        (
                            [0] => 1123
                        )
                )

            [2] => SolrDocumentField Object
                (
                    [name] => features
                    [boost] => 0
                    [values] => Array
                        (
                            [0] => PHP Client Side
                            [1] => Fast development cycles
                        )
                )
        )
)

Example #4 搜索文档 - SolrObject响应

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$query = new SolrQuery();

$query->setQuery('lucene');

$query->setStart(0);

$query->setRows(50);

$query->addField('cat')->addField('features')->addField('id')->addField('timestamp');

$query_response = $client->query($query);

$response = $query_response->getResponse();

print_r($response);

?>

以上例程的输出类似于:

SolrObject Object
(
    [responseHeader] => SolrObject Object
        (
            [status] => 0
            [QTime] => 1
            [params] => SolrObject Object
                (
                    [wt] => xml
                    [rows] => 50
                    [start] => 0
                    [indent] => on
                    [q] => lucene
                    [fl] => cat,features,id,timestamp
                    [version] => 2.2
                )
        )

    [response] => SolrObject Object
        (
            [numFound] => 3
            [start] => 0
            [docs] => Array
                (
                    [0] => SolrObject Object
                        (
                            [cat] => Array
                                (
                                    [0] => Software
                                    [1] => Lucene
                                )
                            [id] => 334456
                        )

                    [1] => SolrObject Object
                        (
                            [cat] => Array
                                (
                                    [0] => Software
                                    [1] => Lucene
                                )
                            [id] => 334455
                        )

                    [2] => SolrObject Object
                        (
                            [cat] => Array
                                (
                                    [0] => software
                                    [1] => search
                                )
                            [features] => Array
                                (
                                    [0] => Advanced Full-Text Search Capabilities using Lucene
                                    [1] => Optimized for High Volume Web Traffic
                                    [2] => Standards Based Open Interfaces - XML and HTTP
                                    [3] => Comprehensive HTML Administration Interfaces
                                    [4] => Scalability - Efficient Replication to other Solr Search Servers
                                    [5] => Flexible and Adaptable with XML configuration and Schema
                                    [6] => Good unicode support: héllo (hello with an accent over the e)
                                )
                            [id] => SOLR1000
                            [timestamp] => 2009-09-04T20:38:55.906
                        )
                )
        )
)

Example #5 搜索文档 - SolrDocument响应

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$query = new SolrQuery();

$query->setQuery('lucene');

$query->setStart(0);

$query->setRows(50);

$query->addField('cat')->addField('features')->addField('id')->addField('timestamp');

$query_response = $client->query($query);

$query_response->setParseMode(SolrQueryResponse::PARSE_SOLR_DOC);

$response = $query_response->getResponse();

print_r($response);

?>

以上例程的输出类似于:

SolrObject Object
(
    [responseHeader] => SolrObject Object
        (
            [status] => 0
            [QTime] => 1
            [params] => SolrObject Object
                (
                    [wt] => xml
                    [rows] => 50
                    [start] => 0
                    [indent] => on
                    [q] => lucene
                    [fl] => cat,features,id,timestamp
                    [version] => 2.2
                )
        )

    [response] => SolrObject Object
        (
            [numFound] => 3
            [start] => 0
            [docs] => Array
                (
                    [0] => SolrDocument Object
                        (
                            [_hashtable_index:SolrDocument:private] => 19740
                        )
                    [1] => SolrDocument Object
                        (
                            [_hashtable_index:SolrDocument:private] => 25485
                        )
                    [2] => SolrDocument Object
                        (
                            [_hashtable_index:SolrDocument:private] => 25052
                        )
                )
        )
)

Example #6 简单 TermsComponent 实例 - 基本

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$query = new SolrQuery();

$query->setTerms(true);

$query->setTermsField('cat');

$updateResponse = $client->query($query);

print_r($updateResponse->getResponse());

?>

以上例程的输出类似于:

SolrObject Object
(
    [responseHeader] => SolrObject Object
        (
            [status] => 0
            [QTime] => 2
        )

    [terms] => SolrObject Object
        (
            [cat] => SolrObject Object
                (
                    [electronics] => 14
                    [Lucene] => 4
                    [Software] => 4
                    [memory] => 3
                    [card] => 2
                    [connector] => 2
                    [drive] => 2
                    [graphics] => 2
                    [hard] => 2
                    [monitor] => 2
                )
        )
)

Example #7 简单 TermsComponent 实例 - 使用前缀

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$query = new SolrQuery();

$query->setTerms(true);

/* Return only terms starting with $prefix */
$prefix = 'c';

$query->setTermsField('cat')->setTermsPrefix($prefix);

$updateResponse = $client->query($query);

print_r($updateResponse->getResponse());

?>

以上例程的输出类似于:

SolrObject Object
(
    [responseHeader] => SolrObject Object
        (
            [status] => 0
            [QTime] => 1
        )

    [terms] => SolrObject Object
        (
            [cat] => SolrObject Object
                (
                    [card] => 2
                    [connector] => 2
                    [camera] => 1
                    [copier] => 1
                )
        )
)

Example #8 简单 TermsComponent 实例 - 指定最小频率

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$query = new SolrQuery();

$query->setTerms(true);

/* Return only terms starting with $prefix */
$prefix = 'c';

/* Return only terms with a frequency of 2 or greater */
$min_frequency = 2;

$query->setTermsField('cat')->setTermsPrefix($prefix)->setTermsMinCount($min_frequency);

$updateResponse = $client->query($query);

print_r($updateResponse->getResponse());

?>

以上例程的输出类似于:

SolrObject Object
(
    [responseHeader] => SolrObject Object
        (
            [status] => 0
            [QTime] => 0
        )

    [terms] => SolrObject Object
        (
            [cat] => SolrObject Object
                (
                    [card] => 2
                    [connector] => 2
                )
        )
)

Example #9 简单 Facet 实例

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$query = new SolrQuery('*:*');

$query->setFacet(true);

$query->addFacetField('cat')->addFacetField('name')->setFacetMinCount(2);

$updateResponse = $client->query($query);

$response_array = $updateResponse->getResponse();

$facet_data = $response_array->facet_counts->facet_fields;

print_r($facet_data);

?>

以上例程的输出类似于:

SolrObject Object
(
    [cat] => SolrObject Object
        (
            [electronics] => 14
            [memory] => 3
            [Lucene] => 2
            [Software] => 2
            [card] => 2
            [connector] => 2
            [drive] => 2
            [graphics] => 2
            [hard] => 2
            [monitor] => 2
            [search] => 2
            [software] => 2
        )

    [name] => SolrObject Object
        (
            [gb] => 6
            [1] => 3
            [184] => 3
            [2] => 3
            [3200] => 3
            [400] => 3
            [500] => 3
            [ddr] => 3
            [i] => 3
            [ipod] => 3
            [memori] => 3
            [pc] => 3
            [pin] => 3
            [pod] => 3
            [sdram] => 3
            [system] => 3
            [unbuff] => 3
            [canon] => 2
            [corsair] => 2
            [drive] => 2
            [hard] => 2
            [mb] => 2
            [n] => 2
            [power] => 2
            [retail] => 2
            [video] => 2
            [x] => 2
        )
)

Example #10 简单 Facet 实例 - 使用可选字段重写 mincount

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
);

$client = new SolrClient($options);

$query = new SolrQuery('*:*');

$query->setFacet(true);

$query->addFacetField('cat')->addFacetField('name')->setFacetMinCount(2)->setFacetMinCount(4, 'name');

$updateResponse = $client->query($query);

$response_array = $updateResponse->getResponse();

$facet_data = $response_array->facet_counts->facet_fields;

print_r($facet_data);

?>

以上例程的输出类似于:

SolrObject Object
(
    [cat] => SolrObject Object
        (
            [electronics] => 14
            [memory] => 3
            [Lucene] => 2
            [Software] => 2
            [card] => 2
            [connector] => 2
            [drive] => 2
            [graphics] => 2
            [hard] => 2
            [monitor] => 2
            [search] => 2
            [software] => 2
        )

    [name] => SolrObject Object
        (
            [gb] => 6
        )

)

Example #11 连接到 SSL-Enabled 服务器

<?php

include "bootstrap.php";

$options = array
(
    'hostname' => SOLR_SERVER_HOSTNAME,
    'login'    => SOLR_SERVER_USERNAME,
    'password' => SOLR_SERVER_PASSWORD,
    'port'     => SOLR_SERVER_PORT,
    'timeout'  => SOLR_SERVER_TIMEOUT,
    'secure'   => SOLR_SECURE,
    'ssl_cert' => SOLR_SSL_CERT_ONLY,
    'ssl_key'  => SOLR_SSL_KEY,
    'ssl_keypassword' => SOLR_SSL_KEYPASSWORD,
    'ssl_cainfo' => SOLR_SSL_CAINFO,
);

$client = new SolrClient($options);

$query = new SolrQuery('*:*');

$query->setFacet(true);

$query->addFacetField('cat')->addFacetField('name')->setFacetMinCount(2)->setFacetMinCount(4, 'name');

$updateResponse = $client->query($query);

$response_array = $updateResponse->getResponse();

$facet_data = $response_array->facet_counts->facet_fields;

print_r($facet_data);

?>

以上例程的输出类似于:

SolrObject Object
(
    [cat] => SolrObject Object
        (
            [electronics] => 14
            [memory] => 3
            [Lucene] => 2
            [Software] => 2
            [card] => 2
            [connector] => 2
            [drive] => 2
            [graphics] => 2
            [hard] => 2
            [monitor] => 2
            [search] => 2
            [software] => 2
        )

    [name] => SolrObject Object
        (
            [gb] => 6
        )

)

更多类及方法的使用详见http://www.php.net/manual/zh/book.solr.php

编程技巧