Hive Study Notes: a 63-page digest from Alibaba Group, B2B Technology Department, Data Product Platform
1. Hive Structure
  1.1 Hive Architecture
  1.2 Relationship Between Hive and Hadoop
  1.3 Similarities and Differences Between Hive and Conventional Relational Databases
  1.4 Hive Metadata Database
    1.4.1 Derby
    1.4.2 MySQL
  1.5 Hive Data Storage
  1.6 Other Hive Operations
2. Basic Hive Operations
  2.1 Create Table
    2.1.1 Overview
    2.1.2 Syntax
    2.1.3 Basic Examples
    2.1.4 Creating Partitions
    2.1.5 Other Examples
  2.2 Alter Table
    2.2.1 Add Partitions
    2.2.2 Drop Partitions
    2.2.3 Rename Table
    2.2.4 Change Column
    2.2.5 Add/Replace Columns
  2.3 Create View
  2.4 Show
  2.5 Load
  2.6 Insert
    2.6.1 Inserting data into Hive Tables from queries
    2.6.2 Writing data into filesystem from queries
  2.7 Cli
    2.7.1 Hive Command line Options
    2.7.2 Hive interactive Shell Command
    2.7.3 Hive Resources
    2.7.4 Calling Python, shell, and other languages
  2.8 DROP
  2.9 Miscellaneous
    2.9.1 Limit
    2.9.2 Top k
    2.9.3 REGEX Column Specification
3. Hive Select
  3.1 Group By
  3.2 Order/Sort By
4. Hive Join
5. Hive Parameter Settings
6. Hive UDF
  6.1 Basic Functions
    6.1.1 Relational Operators
    6.1.2 Arithmetic Operators
    6.1.3 Logical Operators
    6.1.4 Complex Type Operators
    6.1.5 Built-in Functions
    6.1.6 Mathematical Functions
    6.1.7 Collection Functions
    6.1.8 Type Conversion
    6.1.9 Date Functions
    6.1.10 Conditional Functions
    6.1.11 String Functions
  6.2 UDTF
    6.2.1 Explode
7. Hive MAP/REDUCE
  7.1 JOIN
  7.2 GROUP BY
  7.3 DISTINCT
8. Points to Note When Using Hive
  8.1 Character Sets
  8.2 Compression
  8.3 count(distinct)
  8.4 JOIN
  8.5 DML Operations
  8.6 HAVING
  8.7 Subqueries
  8.8 Semantic Differences in Handling null Values in Join
9. Optimization and Tips
  9.1 Total Ordering
    9.1.1 Example 1
    9.1.2 Example 2
  9.2 How to Do a Cartesian Product
  9.3 How to Write exist/in Clauses
  9.4 How to Decide the Number of Reducers
  9.5 Merging MapReduce Operations
  9.6 Bucket and Sampling
  9.7 Partition
  9.8 JOIN
    9.8.1 JOIN Principles
    9.8.2 Map Join
    9.8.3 Data Skew in Large-Table Joins
  9.9 Merging Small Files
  9.10 Group By
10. Hive FAQ
Posted on 2015-1-28 00:08:37