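Compression formats supported by HBase:
HBase supports the following compression formats: GZ (GZIP), LZO, LZ4, and Snappy.
GZ: suited to cold data. Compared with Snappy and LZO, GZIP achieves a higher compression ratio, but it uses more CPU and compresses and decompresses more slowly.
Snappy and LZO: suited to hot data. They use less CPU and compress and decompress faster than GZ, but their compression ratio is lower.
Snappy vs. LZO: Snappy performs better overall. Its compression ratio is lower than LZO's, but it compresses and decompresses faster.
LZ4 vs. LZO: the compression ratios are about the same, but LZ4 compresses and decompresses faster.
In most cases Snappy or LZO is a good choice, because their compression overhead is low and they still save space.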
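To get a feel for the ratio-versus-CPU trade-off described above, here is a minimal Python sketch. Snappy, LZO and LZ4 are not in Python's standard library, so it uses zlib (the DEFLATE algorithm behind GZIP) at a fast level and at a dense level as a stand-in. The `measure` function and the sample data are made up for illustration; the numbers it prints say nothing about what HBase itself would achieve.

```python
import time
import zlib
from typing import Tuple


def measure(data: bytes, level: int) -> Tuple[float, float]:
    """Compress data with zlib at the given level; return (ratio, seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip check: decompression must reproduce the input exactly.
    assert zlib.decompress(compressed) == data
    return len(data) / len(compressed), elapsed


if __name__ == "__main__":
    # Hypothetical sample: repetitive, row-like bytes loosely resembling cell data.
    sample = b"".join(
        b"row-%06d|f1:status=active|f1:region=eu-west\n" % i for i in range(20000)
    )
    # Level 1 favours speed, level 9 favours compression ratio.
    for level in (1, 9):
        ratio, seconds = measure(sample, level)
        print(f"level={level} ratio={ratio:.1f}x time={seconds * 1000:.1f} ms")
```

Running it on data that resembles your own tables is a quick way to see how much extra CPU time a denser setting costs before picking a codec for a column family.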
Specifying the compression format when creating a table
    hbase(main):013:0> create 'test3', {NAME => 'f1'}, {NAME => 'f2', COMPRESSION => 'Snappy'}
    0 row(s) in 1.2740 seconds

    => Hbase::Table - test3

    hbase(main):014:0> desc 'test3'
    Table test3 is ENABLED
    test3
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    2 row(s) in 0.0300 seconds

    hbase(main):002:0> create 'test4', {NAME => 'f1'}, {NAME => 'f2', COMPRESSION => 'GZ'}
    0 row(s) in 1.4900 seconds

    => Hbase::Table - test4

    hbase(main):003:0> desc 'test4'
    Table test4 is ENABLED
    test4
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'GZ', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    2 row(s) in 0.1290 seconds
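Changing a column family's compression format after the table is created
The correct procedure is to disable the table first, alter the column family's compression format, enable the table again, and then run major_compact.
For example: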
    hbase(main):004:0> desc 'test1'
    Table test1 is ENABLED
    test1
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    2 row(s) in 0.0230 seconds

    hbase(main):005:0> disable 'test1'
    0 row(s) in 2.2870 seconds

    hbase(main):006:0> alter 'test1', {NAME => 'f1', COMPRESSION => 'Snappy'}
    Updating all regions with the new schema...
    1/1 regions updated.
    Done.
    0 row(s) in 1.9510 seconds

    hbase(main):007:0> enable 'test1'
    0 row(s) in 1.2820 seconds

    hbase(main):008:0> desc 'test1'
    Table test1 is ENABLED
    test1
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    2 row(s) in 0.0310 seconds

    hbase(main):009:0> major_compact 'test1'
    0 row(s) in 0.1380 seconds

    hbase(main):010:0> desc 'test1'
    Table test1 is ENABLED
    test1
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'f1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    2 row(s) in 0.0260 seconds
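When the same change has to be applied to several tables, the sequence above can be generated rather than typed by hand. The following is only a sketch: `alter_compression_script` and its `offline` flag are hypothetical names introduced here, and the function does nothing more than build the shell command strings used in the transcript above.

```python
from typing import List


def alter_compression_script(table: str, family: str, codec: str,
                             offline: bool = True) -> List[str]:
    """Return HBase shell commands that switch one column family's codec."""
    codec = codec.upper()  # desc reports codec names in upper case, e.g. SNAPPY
    commands = []
    if offline:
        # Same order as the transcript: take the table offline first.
        commands.append(f"disable '{table}'")
    commands.append(
        f"alter '{table}', {{NAME => '{family}', COMPRESSION => '{codec}'}}"
    )
    if offline:
        commands.append(f"enable '{table}'")
    # Final step of the procedure: rewrite the store files with a major compaction.
    commands.append(f"major_compact '{table}'")
    return commands


if __name__ == "__main__":
    print("\n".join(alter_compression_script("test1", "f1", "Snappy")))
```

The printed lines can be saved to a file and passed to `hbase shell`, or pasted into an interactive session. Review them first, since `disable` makes the table unavailable until `enable` completes.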
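However, the column family's compression format was also changed successfully without disabling the table or running major_compact (I don't yet know why). A likely explanation is that this HBase version allows online schema changes on an enabled table, and that desc only reports the schema setting: store files already on disk keep their old compression until a compaction rewrites them.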
    hbase(main):001:0> desc 'test'
    Table test is ENABLED
    test
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'fam1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    1 row(s) in 0.3680 seconds

    hbase(main):002:0> alter 'test', {NAME => 'fam1', COMPRESSION => 'LZ4'}
    Updating all regions with the new schema...
    1/1 regions updated.
    Done.
    0 row(s) in 2.0460 seconds

    hbase(main):003:0> desc 'test'
    Table test is ENABLED
    test
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'fam1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'LZ4', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    1 row(s) in 0.0280 seconds
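Checking the COMPRESSION value by eye in wide desc output is error-prone, especially when the terminal wraps a line in the middle of a word. As a rough sketch, the captured output can be parsed with two regular expressions. The function name `family_compression` is hypothetical, and the code assumes the `{NAME => '...', KEY => 'value'}` layout shown above; other HBase versions may print the description differently.

```python
import re
from typing import Dict

FAMILY_RE = re.compile(r"\{[^{}]*\}")             # one {...} block per column family
ATTR_RE = re.compile(r"(\w+)\s*=>\s*'([^']*)'")   # KEY => 'value' pairs


def family_compression(describe_output: str) -> Dict[str, str]:
    """Map each column family name to its COMPRESSION value."""
    # Assumption: a line break inside a {...} block is only terminal wrapping,
    # so dropping it (and adjacent whitespace) rejoins split words.
    joined = re.sub(r"\s*\n\s*", "", describe_output)
    result = {}
    for block in FAMILY_RE.findall(joined):
        attrs = dict(ATTR_RE.findall(block))
        if "NAME" in attrs:
            # Assumption: a family with no COMPRESSION attribute is uncompressed.
            result[attrs["NAME"]] = attrs.get("COMPRESSION", "NONE")
    return result


if __name__ == "__main__":
    sample = (
        "COLUMN FAMILIES DESCRIPTION\n"
        "{NAME => 'fam1', TTL => 'FOREVER', COMPRE\n"
        "SSION => 'LZ4', MIN_VERSIONS => '0'}\n"
    )
    print(family_compression(sample))
```

This only confirms the schema setting reported by the shell. It does not show which codec the store files already on disk were written with.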
Reposted from 天黑顺路's 51CTO blog. Original link: http://blog.51cto.com/mjal01/1963644. To repost this article, please contact the original author.