Hadoop Big Data

Uploaded by Ho van Lam
Pages: 647   |   Views: 4433   |   Downloads: 4
THIRD EDITION

Hadoop: The Definitive Guide

Tom White

Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo



Hadoop: The Definitive Guide, Third Edition
by Tom White

Revision History for the :
2012-01-27
Early release revision 1
See  for release details.

ISBN: 978-1-449-31152-0
1327616795



For Eliane, Emilia, and Lottie





Table of Contents

Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
1. Meet Hadoop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Data! . . . 1
Data Storage and Analysis . . . 3
Comparison with Other Systems . . . 4
RDBMS . . . 4
Grid Computing . . . 6
Volunteer Computing . . . 8
A Brief History of Hadoop . . . 9
Apache Hadoop and the Hadoop Ecosystem . . . 12
Hadoop Releases . . . 13
What’s Covered in this Book . . . 14
Compatibility . . . 15

2. MapReduce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
A Weather Dataset . . . 17
Data Format . . . 17
Analyzing the Data with Unix Tools . . . 19
Analyzing the Data with Hadoop . . . 20
Map and Reduce . . . 20
Java MapReduce . . . 22
Scaling Out . . . 30
Data Flow . . . 31
Combiner Functions . . . 34
Running a Distributed MapReduce Job . . . 37
Hadoop Streaming . . . 37
Ruby . . . 37
Python . . . 40
Hadoop Pipes . . . 41
Compiling and Running . . . 42

3. The Hadoop Distributed Filesystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
The Design of HDFS . . . 45
HDFS Concepts . . . 47
Blocks . . . 47
Namenodes and Datanodes . . . 48
HDFS Federation . . . 49
HDFS High-Availability . . . 50
The Command-Line Interface . . . 51
Basic Filesystem Operations . . . 52
Hadoop Filesystems . . . 54
Interfaces . . . 55
The Java Interface . . . 57
Reading Data from a Hadoop URL . . . 57
Reading Data Using the FileSystem API . . . 59
Writing Data . . . 62
Directories . . . 64
Querying the Filesystem . . . 64
Deleting Data . . . 69
Data Flow . . . 69
Anatomy of a File Read . . . 69
Anatomy of a File Write . . . 72
Coherency Model . . . 75
Parallel Copying with distcp . . . 76
Keeping an HDFS Cluster Balanced . . . 78
Hadoop Archives . . . 78
Using Hadoop Archives . . . 79
Limitations . . . 80

4. Hadoop I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ....