码迷,mamicode.com
首页 > 其他好文 > 详细

sphinx配置增量索引和索引合并

时间:2017-07-15 10:00:09      阅读:143      评论:0      收藏:0      [点我收藏+]

标签:lin   binlog   ima   type   ota   父类   小吃   com   test   

 

配置增量索引

1,配置csft.conf文件。

  其中base为父类,scr1和tmp_src1都是他的子类,相应配置如下。

searchd{
        listen = 9312
        listen = 9306:mysql41
        read_timeout =5
        max_children = 30
        max_matches = 1000
        seamless_rotate = 0
        preopen_indexes = 0
        unlink_old = 1
        pid_file = /usr/local/coreseek/var/log/searchd.pid
        log = /usr/local/coreseek/var/log/searchd.log
        query_log = /usr/local/coreseek/var/log/query.log
        binlog_path = 
}
#全局配置
source base
{
        type         = mysql
        sql_host     = 127.0.0.1
        sql_user     = root
        sql_pass    = 
        sql_db         = test
        sql_port     = 3306
        sql_query_pre = SET NAMES utf8
        sql_query        =
}

source src1: base
{
        sql_query            = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents
        sql_attr_uint        = id
        sql_attr_uint        = group_id
        sql_attr_timestamp     = date_added
        sql_field_string    = title
        sql_field_string    = content
        sql_query_info_pre    = SET NAMES utf8

        
}

index src1{
        source = src1
        path = /usr/local/coreseek/var/data/test1
        docinfo = extern
        mlock =0
        morphology = none
        min_word_len =1
        html_strip =0
        #index_sp          = 1
        charset_type = zh_cn.utf-8
        charset_dictpath = /usr/local/mmseg3/etc/
}

source tmp_src1 : base
{
        sql_query            = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents WHERE id>( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
        sql_attr_uint        = id
        sql_attr_uint        = group_id
        sql_attr_timestamp     = date_added
        sql_field_string    = title
        sql_field_string    = content  
}

index tmp_src1{
        source = tmp_src1        
        path = /usr/local/coreseek/var/data/tmp_src1        
        docinfo = extern
        mlock =0
        morphology = none
        min_word_len =1
        html_strip =0
        charset_type = zh_cn.utf-8
        charset_dictpath = /usr/local/mmseg3/etc/
}

 

 

对应的sql文件为:

SET FOREIGN_KEY_CHECKS=0;

-- ----------------------------
-- Table structure for documents
-- ----------------------------
DROP TABLE IF EXISTS `documents`;
CREATE TABLE `documents` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `group_id` int(11) NOT NULL,
  `date_added` datetime NOT NULL,
  `title` varchar(255) CHARACTER SET utf8 DEFAULT NULL,
  `content` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8mb4;

-- ----------------------------
-- Records of documents
-- ----------------------------
INSERT INTO `documents` VALUES (‘1‘, ‘1‘, ‘2017-06-06 21:45:58‘, ‘中国‘, ‘中国真美,地大物博‘);
INSERT INTO `documents` VALUES (‘2‘, ‘1‘, ‘2017-06-06 21:45:58‘, ‘中国美食‘, ‘台北小吃,各地美食‘);
INSERT INTO `documents` VALUES (‘3‘, ‘2‘, ‘2017-06-06 21:45:58‘, ‘美女之家‘, ‘美女之国‘);
INSERT INTO `documents` VALUES (‘4‘, ‘2‘, ‘2017-06-06 21:45:58‘, ‘hello‘, ‘this is to test groups‘);
INSERT INTO `documents` VALUES (‘5‘, ‘3‘, ‘2017-07-27 17:00:09‘, ‘熊猫‘, ‘中国国宝‘);
INSERT INTO `documents` VALUES (‘6‘, ‘3‘, ‘2017-07-14 17:00:04‘, ‘竹子‘, ‘熊猫吃竹子‘);
INSERT INTO `documents` VALUES (‘7‘, ‘4‘, ‘2017-07-14 17:30:36‘, ‘猫科动物‘, ‘老虎吃人‘);
INSERT INTO `documents` VALUES (‘8‘, ‘4‘, ‘2017-07-14 17:30:36‘, ‘猫科动物2‘, ‘东北虎‘);
INSERT INTO `documents` VALUES (‘9‘, ‘5‘, ‘2017-07-14 17:34:24‘, ‘动物园‘, ‘老鼠‘);
SET FOREIGN_KEY_CHECKS=1;






SET FOREIGN_KEY_CHECKS=0;

-- ----------------------------
-- Table structure for sph_counter
-- ----------------------------
DROP TABLE IF EXISTS `sph_counter`;
CREATE TABLE `sph_counter` (
  `counter_id` int(11) NOT NULL AUTO_INCREMENT,
  `max_doc_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`counter_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4;

-- ----------------------------
-- Records of sph_counter
-- ----------------------------
INSERT INTO `sph_counter` VALUES (‘1‘, ‘6‘);
SET FOREIGN_KEY_CHECKS=1;

 

 

 

其中,sph_counter表中,的max_doc_id为当前在coreseek已经存放的索引的最大值。而配置文件tmp_src1中有这样一句话

sql_query            = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content FROM documents WHERE id>( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )

意思是将新更新的部分加入到tmp_src1中。

索引合并

将在mysql中的最新数据加入到tmp_scr1中
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate

 技术分享

进行索引合并

/usr/local/coreseek/bin/indexer --merge src1 tmp_src1 --merge-dst-range deleted 0 0

 技术分享

这样,新增加的数据就到了src1的索引内。

技术分享



sphinx配置增量索引和索引合并

标签:lin   binlog   ima   type   ota   父类   小吃   com   test   

原文地址:http://www.cnblogs.com/ziyiang/p/7181344.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!