This article was tested against HBase 0.96.0; in principle it applies to HBase 0.94 and later.
HBase has two kinds of coprocessors:
1. RegionObserver: analogous to a trigger in a relational database.
2. Endpoint: analogous to a stored procedure in a relational database. This article covers this kind of coprocessor.
An Endpoint lets you define your own dynamic RPC protocol between clients and region servers. A coprocessor runs in the same process space as the region server, so you can define your own region-side methods (endpoints) and push computation down to the regions, cutting network overhead. Endpoints are commonly used to extend HBase with operations such as count and sum.
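To see why pushing the count down to the regions helps, here is a rough back-of-envelope sketch. The row size and region count below are illustrative assumptions, not measurements: a client-side scan ships every row over the network, while a counting endpoint ships only one 8-byte long per region.

```java
public class TransferEstimate {

    /** Bytes shipped by a client-side scan: every row crosses the network. */
    public static long scanBytes(long rows, long bytesPerRow) {
        return rows * bytesPerRow;
    }

    /** Bytes shipped by a counting endpoint: one 8-byte count per region. */
    public static long endpointBytes(long regions) {
        return regions * 8L;
    }

    public static void main(String[] args) {
        // Illustrative numbers: 10 million rows of ~100 bytes across 20 regions.
        System.out.println(scanBytes(10_000_000L, 100L)); // prints 1000000000
        System.out.println(endpointBytes(20L));           // prints 160
    }
}
```

The ratio, not the absolute numbers, is the point: the endpoint's traffic grows with the number of regions, the scan's with the number of rows.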
Taking count as the example, this article walks through implementing a custom endpoint.
1. Define a protocol buffer Service
1.1 Install protobuf
Download protoc-2.5.0-win32.zip (pick the build matching your OS) and unzip it;
Copy protoc.exe from protoc-2.5.0-win32 into c:\windows\system32;
Copy protoc.exe into the XXX\protobuf-2.5.0\src directory of the unzipped source.
Reference: http://shuofenglxy.iteye.com/blog/1512980
1.2 Define the .proto file, which declares the messages and the service.
CXKTest.proto:
```protobuf
option java_package = "com.cxk.coprocessor.test.generated";
option java_outer_classname = "CXKTestProtos";
option java_generic_services = true;
option java_generate_equals_and_hash = true;
option optimize_for = SPEED;

message CountRequest {
}

message CountResponse {
  required int64 count = 1 [default = 0];
}

service RowCountService {
  rpc getRowCount(CountRequest)
    returns (CountResponse);
}
```
Reference: https://developers.google.com/protocol-buffers/docs/proto#services
Run the command: `protoc --java_out=. CXKTest.proto`
This generates the CXKTestProtos class under com.cxk.coprocessor.test.generated.
2. Define your own Endpoint class (implementing the service method)
RowCountEndpoint.java:
```java
package com.cxk.coprocessor.test;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.Coprocessor;
import org.apache.hadoop.hbase.CoprocessorEnvironment;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.coprocessor.CoprocessorException;
import org.apache.hadoop.hbase.coprocessor.CoprocessorService;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.protobuf.ResponseConverter;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.util.Bytes;

import com.cxk.coprocessor.test.generated.CXKTestProtos;
import com.google.protobuf.RpcCallback;
import com.google.protobuf.RpcController;
import com.google.protobuf.Service;

public class RowCountEndpoint extends CXKTestProtos.RowCountService
        implements Coprocessor, CoprocessorService {

    private RegionCoprocessorEnvironment env;

    public RowCountEndpoint() {
    }

    @Override
    public Service getService() {
        return this;
    }

    /**
     * Counts the total number of rows in this region.
     */
    @Override
    public void getRowCount(RpcController controller, CXKTestProtos.CountRequest request,
            RpcCallback<CXKTestProtos.CountResponse> done) {
        Scan scan = new Scan();
        scan.setFilter(new FirstKeyOnlyFilter());
        CXKTestProtos.CountResponse response = null;
        InternalScanner scanner = null;
        try {
            scanner = env.getRegion().getScanner(scan);
            List<Cell> results = new ArrayList<Cell>();
            boolean hasMore = false;
            byte[] lastRow = null;
            long count = 0;
            do {
                hasMore = scanner.next(results);
                for (Cell kv : results) {
                    byte[] currentRow = CellUtil.cloneRow(kv);
                    // Count a row only when the row key changes.
                    if (lastRow == null || !Bytes.equals(lastRow, currentRow)) {
                        lastRow = currentRow;
                        count++;
                    }
                }
                results.clear();
            } while (hasMore);
            response = CXKTestProtos.CountResponse.newBuilder()
                    .setCount(count).build();
        } catch (IOException ioe) {
            ResponseConverter.setControllerException(controller, ioe);
        } finally {
            if (scanner != null) {
                try {
                    scanner.close();
                } catch (IOException ignored) {
                }
            }
        }
        done.run(response);
    }

    @Override
    public void start(CoprocessorEnvironment env) throws IOException {
        if (env instanceof RegionCoprocessorEnvironment) {
            this.env = (RegionCoprocessorEnvironment) env;
        } else {
            throw new CoprocessorException("Must be loaded on a table region!");
        }
    }

    @Override
    public void stop(CoprocessorEnvironment env) throws IOException {
        // nothing to do
    }
}
```
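The scan loop counts a row only when the row key changes, rather than counting returned cells directly. That deduplication logic is plain Java and can be checked in isolation; the minimal sketch below mirrors it with HBase's Cell type replaced by a list of raw row keys (an assumption made purely to keep the example self-contained):

```java
import java.util.Arrays;
import java.util.List;

public class RowCountSketch {

    /** Count distinct consecutive row keys, mirroring the loop in getRowCount. */
    public static long countDistinctRows(List<byte[]> rowKeys) {
        byte[] lastRow = null;
        long count = 0;
        for (byte[] currentRow : rowKeys) {
            // Only a change of row key starts a new row.
            if (lastRow == null || !Arrays.equals(lastRow, currentRow)) {
                lastRow = currentRow;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Two cells from row "r1", one from row "r2": two distinct rows.
        List<byte[]> cells = Arrays.asList(
                "r1".getBytes(), "r1".getBytes(), "r2".getBytes());
        System.out.println(countDistinctRows(cells)); // prints 2
    }
}
```

This only works because a region scanner returns cells in row-key order, so equal row keys are always adjacent.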
3. Implement your own client
TestEndPoint.java:
```java
package com.test;

import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.coprocessor.Batch;
import org.apache.hadoop.hbase.ipc.BlockingRpcCallback;
import org.apache.hadoop.hbase.ipc.ServerRpcController;

import com.cxk.coprocessor.test.generated.CXKTestProtos;
import com.cxk.coprocessor.test.generated.CXKTestProtos.RowCountService;
import com.google.protobuf.ServiceException;

public class TestEndPoint {

    /**
     * @param args args[0] = hbase master ip, args[1] = zookeeper quorum, args[2] = table name
     * @throws ServiceException
     * @throws Throwable
     */
    public static void main(String[] args) throws ServiceException, Throwable {
        System.out.println("begin.....");
        long beginTime = System.currentTimeMillis();
        Configuration config = HBaseConfiguration.create();
        // String masterIp = "192.168.150.128";
        String masterIp = args[0];
        String zkIp = args[1];
        String tableName = args[2];
        config.set("hbase.zookeeper.property.clientPort", "2181");
        config.set("hbase.zookeeper.quorum", zkIp);
        config.set("hbase.master", masterIp + ":60000");
        final CXKTestProtos.CountRequest request = CXKTestProtos.CountRequest.getDefaultInstance();
        HTable table = new HTable(config, tableName);
        // null start/stop row: invoke the endpoint on every region of the table.
        Map<byte[], Long> results = table.coprocessorService(RowCountService.class,
                null, null,
                new Batch.Call<CXKTestProtos.RowCountService, Long>() {
                    public Long call(CXKTestProtos.RowCountService counter) throws IOException {
                        ServerRpcController controller = new ServerRpcController();
                        BlockingRpcCallback<CXKTestProtos.CountResponse> rpcCallback =
                                new BlockingRpcCallback<CXKTestProtos.CountResponse>();
                        counter.getRowCount(controller, request, rpcCallback);
                        CXKTestProtos.CountResponse response = rpcCallback.get();
                        if (controller.failedOnException()) {
                            throw controller.getFailedOn();
                        }
                        return (response != null && response.hasCount()) ? response.getCount() : 0;
                    }
                });
        table.close();
        if (results.size() > 0) {
            System.out.println(results.values());
        } else {
            System.out.println("no results returned");
        }
        long endTime = System.currentTimeMillis();
        System.out.println("end:" + (endTime - beginTime));
    }
}
```
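coprocessorService returns one Map entry per region, so the client above prints a list of per-region counts rather than a single total. Summing the values yields the table-wide row count. A minimal sketch of that aggregation step (the byte[] keys stand in for region names; no HBase types are needed):

```java
import java.util.HashMap;
import java.util.Map;

public class CountAggregator {

    /** Sum the per-region counts returned by coprocessorService. */
    public static long total(Map<byte[], Long> perRegion) {
        long sum = 0;
        for (long regionCount : perRegion.values()) {
            sum += regionCount;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<byte[], Long> results = new HashMap<byte[], Long>();
        results.put("region-a".getBytes(), 100L);
        results.put("region-b".getBytes(), 42L);
        System.out.println(total(results)); // prints 142
    }
}
```

In the real client this loop would replace the `System.out.println(results.values())` call.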
4. Deploy the endpoint
There are two ways to deploy an endpoint: the first edits hbase-site.xml, which loads the endpoint for every table; the second alters a single table, so the endpoint is loaded only for that table.
4.1 Modify hbase-site.xml
Add the following to hbase-site.xml (the endpoint class must also be on the region servers' classpath, and the region servers restarted for the change to take effect):
```xml
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>com.cxk.coprocessor.test.RowCountEndpoint</value>
  <description>A comma-separated list of coprocessors that are loaded by
    default on all tables. For any override of a coprocessor method from
    RegionObserver or Coprocessor, these classes' implementations are
    called in order. After implementing your own coprocessor, put it on
    HBase's classpath and add the fully qualified class name here.
  </description>
</property>
```
4.2 Alter the table in the hbase shell
A. Package CXKTestProtos.java and RowCountEndpoint.java into a jar and upload it to HDFS;
B. Disable the table:

```
disable 'test'
```

C. Attach the coprocessor and re-enable the table (the attribute value is 'jar path|class name|priority|arguments'):

```
alter 'test','coprocessor'=>'hdfs:///user/hadoop/test/coprocessor/cxkcoprocessor.1.01.jar|com.cxk.coprocessor.test.RowCountEndpoint|1001|arg1=1,arg2=2'
enable 'test'
```
5. Run the client
Package TestEndPoint.java into a jar and run it with:

```
java -jar test.cxk.endpiont.1.03.jar ip1 ip2 test
```

Note: if your Eclipse setup can run Hadoop code directly, you can also run the test class straight from the IDE.
References:
http://hbase.apache.org/devapidocs/index.html