这篇文章将为大家详细讲解有关Hive中matadata怎么用,小编觉得挺实用的,因此分享给大家做个参考,希望大家阅读完这篇文章后可以有所收获。
创新互联专注于绩溪企业网站建设,成都响应式网站建设公司,商城系统网站开发。绩溪网站建设公司,为绩溪等地区提供建站服务。全流程定制制作,专业设计,全程项目跟踪,创新互联专业和态度为您提供的服务
Hive元数据存储在MySQL库里,数据存储HDFS上;查看元数据库存放地址,查看Hive配置文件路径查看:
$HIVE_HOME/conf/hive-site.xml
javax.jdo.option.ConnectionURL jdbc:mysql://localhost:3306/hive_data?createDatabaseIfNotExist=true&characterEncoding=latin1
查看MySQL元数据:
mysql> use hive_data
Database changed
mysql> show tables;
+---------------------------+ | Tables_in_hive_data | +---------------------------+ | bucketing_cols | | cds | | columns_v2 | | database_params | | db_privs | | dbs | | func_ru | | funcs | | global_privs | | idxs | | index_params | | part_col_privs | | part_col_stats | | part_privs | | partition_key_vals | | partition_keys | | partition_params | | partitions | | roles | | sd_params | | sds | | sequence_table | | serde_params | | serdes | | skewed_col_names | | skewed_col_value_loc_map | | skewed_string_list | | skewed_string_list_values | | skewed_values | | sort_cols | | tab_col_stats | | table_params | | tbl_col_privs | | tbl_privs | | tbls | | version | +---------------------------+
----------------------------------------------------------------
1. Hive版本version表有且只有一条数据,多一条数据Hive会无法启动
mysql> select * from version; +--------+----------------+---------------------------------------+ | VER_ID | SCHEMA_VERSION | VERSION_COMMENT | +--------+----------------+---------------------------------------+ | 1 | 1.1.0 | Set by MetaStore hadoop@192.168.0.129 | +--------+----------------+---------------------------------------+
2.Hive数据库元数据表【dbs】【database_params】
mysql> select DB_ID,DB_LOCATION_URI,NAME from dbs
+-------+----------------------------------------------------------------+---------------+ | db_id | DB_LOCATION_URI | NAME | +-------+----------------------------------------------------------------+---------------+ | 1 | hdfs://192.168.0.129:9000/user/hive/warehouse | default | | 3 | hdfs://192.168.0.129:9000/user/hive/warehouse/hive_data2.db | hive_data2 | | 6 | hdfs://192.168.0.129:9000/user/hive/warehouse/ruozedata_job.db | ruozedata_job | +-------+----------------------------------------------------------------+---------------+
DB_ID -- 数据库ID 【tbls】
DB_LOCATION_URI -- HDFD存放路径
NAME -- 数据库名
3.Hive表内容、结构、属性
mysql> select TBL_ID,CREATE_TIME,DB_ID,SD_ID,TBL_NAME,TBL_TYPE from tbls;
+--------+-------------+-------+-------+---------------+----------------+ | TBL_ID | CREATE_TIME | DB_ID | SD_ID | TBL_NAME | TBL_TYPE | +--------+-------------+-------+-------+---------------+----------------+ | 7 | 1528299941 | 3 | 7 | emp | EXTERNAL_TABLE | | 10 | 1528311773 | 3 | 10 | emp_bak | MANAGED_TABLE | | 11 | 1528312267 | 3 | 11 | emp1 | EXTERNAL_TABLE | | 16 | 1528403085 | 3 | 16 | dual | MANAGED_TABLE | | 17 | 1528484818 | 3 | 17 | json | MANAGED_TABLE | | 22 | 1529454293 | 3 | 22 | emp_partition | MANAGED_TABLE | | 26 | 1529459118 | 3 | 31 | emp_sqoop111 | MANAGED_TABLE | | 34 | 1529530688 | 6 | 39 | user_click | EXTERNAL_TABLE | | 38 | 1529537107 | 6 | 44 | product_info | EXTERNAL_TABLE | | 39 | 1529593387 | 6 | 45 | city_info | MANAGED_TABLE | | 41 | 1529606647 | 6 | 46 | product_hot | MANAGED_TABLE | +--------+-------------+-------+-------+---------------+----------------+
TBL_ID -- 表ID 【table_params】【partitions】【partition_keys】
DB_ID -- 库ID
SD_ID -- 序列化ID 【sds】【partitions】
4.Hive文件存储相关元数据:【sds】【serdes】【serde_params】
mysql> select * from sds;
CD_ID --表列ID 【columns_v2】【cds】
SERDE_ID --序列化列ID 【serdes】【serde_params】
mysql> select * from columns_v2;
+-------+---------+--------------+-----------+-------------+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
+-------+---------+--------------+-----------+-------------+
| 7 | NULL | comm | double | 6 |
| 7 | NULL | deptno | int | 7 |
| 7 | NULL | empno | int | 0 |
| 7 | NULL | ename | string | 1 |
| 7 | NULL | hiredate | string | 4 |
| 7 | NULL | job | string | 2 |
| 7 | NULL | mgr | int | 3 |
| 7 | NULL | salary | double | 5 |
| 10 | NULL | comm | double | 6 |
| 10 | NULL | deptno | int | 7 |
| 10 | NULL | empno | int | 0 |
| 10 | NULL | ename | string | 1 |
| 10 | NULL | hiredate | string | 4 |
| 10 | NULL | job | string | 2 |
| 10 | NULL | mgr | int | 3 |
| 10 | NULL | salary | double | 5 |
5.Hive表分区相关的元数据表
mysql> select * from partitions;
+---------+-------------+------------------+-----------------+-------+--------+ | PART_ID | CREATE_TIME | LAST_ACCESS_TIME | PART_NAME | SD_ID | TBL_ID | +---------+-------------+------------------+-----------------+-------+--------+ | 1 | 1529456274 | 0 | pt=2018-06-19 | 26 | 22 | | 6 | 1529530704 | 0 | data=2018-06-20 | 40 | 34 |
PART_ID -- 分区ID 【partition_key_vasls】【partition_params】
mysql> select * from partition_key_vals;
+---------+--------------+-------------+
| PART_ID | PART_KEY_VAL | INTEGER_IDX |
+---------+--------------+-------------+
| 1 | 2018-06-19 | 0 |
| 6 | 2018-06-20 | 0 |
+---------+--------------+-------------+
PART_ID -- 分区ID
PART_KEY_VAL -- 分区字段值
mysql> select * from partition_params;
+---------+-----------------------+-------------+
| PART_ID | PARAM_KEY | PARAM_VALUE |
+---------+-----------------------+-------------+
| 1 | COLUMN_STATS_ACCURATE | true |
| 1 | numFiles | 2 |
| 1 | numRows | 0 |
| 1 | rawDataSize | 0 |
| 1 | totalSize | 734 |
| 1 | transient_lastDdlTime | 1529456274 |
| 6 | COLUMN_STATS_ACCURATE | true |
| 6 | numFiles | 1 |
| 6 | numRows | 0 |
| 6 | rawDataSize | 0 |
| 6 | totalSize | 725264 |
| 6 | transient_lastDdlTime | 1529530704 |
PARAM_KEY -- 分区属性
PARAM_VALUE -- 分区属性值
总结:
关于“Hive中matadata怎么用”这篇文章就分享到这里了,希望以上内容可以对大家有一定的帮助,使各位可以学到更多知识,如果觉得文章不错,请把它分享出去让更多的人看到。