Hadoop: The Definitive Guide

Author: Tom White
Publisher: O'Reilly Media
Subtitle: 4th Edition
Publication date: 2015-4-11
Pages: 756
Price: USD 49.99
Binding: Paperback
ISBN: 9781491901632

Synopsis

Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters. Using Hadoop 2 exclusively, author Tom Wh...

About the Author

Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net and IBM's devel...

Table of Contents

Hadoop Fundamentals
Chapter 1 Meet Hadoop
    Data!; Data Storage and Analysis; Querying All Your Data; Beyond Batch; Comparison with Other Systems; A Brief History of Apache Hadoop; What’s in This Book?
Chapter 2 MapReduce
    A Weather Dataset; Analyzing the Data with Unix Tools; Analyzing the Data with Hadoop; Scaling Out; Hadoop Streaming
Chapter 3 The Hadoop Distributed Filesystem
    The Design of HDFS; HDFS Concepts; The Command-Line Interface; Hadoop Filesystems; The Java Interface; Data Flow; Parallel Copying with distcp
Chapter 4 YARN
    Anatomy of a YARN Application Run; YARN Compared to MapReduce 1; Scheduling in YARN; Further Reading
Chapter 5 Hadoop I/O
    Data Integrity; Compression; Serialization; File-Based Data Structures

MapReduce
Chapter 1 Developing a MapReduce Application
    The Configuration API; Setting Up the Development Environment; Writing a Unit Test with MRUnit; Running Locally on Test Data; Running on a Cluster; Tuning a Job; MapReduce Workflows
Chapter 2 How MapReduce Works
    Anatomy of a MapReduce Job Run; Failures; Shuffle and Sort; Task Execution
Chapter 3 MapReduce Types and Formats
    MapReduce Types; Input Formats; Output Formats
Chapter 4 MapReduce Features
    Counters; Sorting; Joins; Side Data Distribution; MapReduce Library Classes

Hadoop Operations
Chapter 1 Setting Up a Hadoop Cluster
    Cluster Specification; Cluster Setup and Installation; Hadoop Configuration; Security; Benchmarking a Hadoop Cluster
Chapter 2 Administering Hadoop
    HDFS; Monitoring; Maintenance

Related Projects
Chapter 1 Avro
    Avro Data Types and Schemas; In-Memory Serialization and Deserialization; Avro Datafiles; Interoperability; Schema Resolution; Sort Order; Avro MapReduce; Sorting Using Avro MapReduce; Avro in Other Languages
Chapter 2 Parquet
    Data Model; Parquet File Format; Parquet Configuration; Writing and Reading Parquet Files; Parquet MapReduce
Chapter 3 Flume
    Installing Flume; An Example; Transactions and Reliability; The HDFS Sink; Fan Out; Distribution: Agent Tiers; Sink Groups; Integrating Flume with Applications; Component Catalog; Further Reading
Chapter 4 Sqoop
    Getting Sqoop; Sqoop Connectors; A Sample Import; Generated Code; Imports: A Deeper Look; Working with Imported Data; Importing Large Objects; Performing an Export; Exports: A Deeper Look; Further Reading
Chapter 5 Pig
    Installing and Running Pig; An Example; Comparison with Databases; Pig Latin; User-Defined Functions; Data Processing Operators; Pig in Practice; Further Reading
Chapter 6 Hive
    Installing Hive; An Example; Running Hive; Comparison with Traditional Databases; HiveQL; Tables; Querying Data; User-Defined Functions; Further Reading
Chapter 7 Crunch
    An Example; The Core Crunch API; Pipeline Execution; Crunch Libraries; Further Reading
Chapter 8 Spark
    Installing Spark; An Example; Resilient Distributed Datasets; Shared Variables; Anatomy of a Spark Job Run; Executors and Cluster Managers; Further Reading
Chapter 9 HBase
    HBasics; Concepts; Installation; Clients; Building an Online Query Application; HBase Versus RDBMS; Praxis; Further Reading
Chapter 10 ZooKeeper
    Installing and Running ZooKeeper; An Example; The ZooKeeper Service; Building Applications with ZooKeeper; ZooKeeper in Production; Further Reading

Case Studies
Chapter 1 Composable Data at Cerner
    From CPUs to Semantic Integration; Enter Apache Crunch; Building a Complete Picture; Integrating Healthcare Data; Composability over Frameworks; Moving Forward
Chapter 2 Biological Data Science: Saving Lives with Software
    The Structure of DNA; The Genetic Code: Turning DNA Letters into Proteins; Thinking of DNA as Source Code; The Human Genome Project and Reference Genomes; Sequencing and Aligning DNA; ADAM, A Scalable Genome Analysis Platform; From Personalized Ads to Personalized Medicine; Join In
Chapter 3 Cascading
    Fields, Tuples, and Pipes; Operations; Taps, Schemes, and Flows; Cascading in Practice; Flexibility; Hadoop and Cascading at ShareThis; Summary

Appendix Installing Apache Hadoop
    Prerequisites; Installation; Configuration
Appendix Cloudera’s Distribution Including Apache Hadoop
Appendix Preparing the NCDC Weather Data
Appendix The Old and New Java MapReduce APIs
Easier to understand
Thoughtful
No sense of a cultural gap
Hahahahahaha