Apache Hadoop is an open-source software framework for the distributed storage and processing of large data sets. Its design was inspired by Google's published work on MapReduce and the Google File System. Released under a free software license, Hadoop supports distributed applications and can work with thousands of nodes and petabytes of data. It is a top-level Apache project, is written in the Java programming language, and is supported by a wide range of contributors, the largest of which has been Yahoo. Hadoop is widely used in business.
In the Hadoop Fundamentals course, you will become familiar with the key features of this powerful software framework.
Course topics:
– Understanding the main Hadoop components, such as HDFS and MapReduce
– Setting up the Hadoop development environment
– Working with the Hadoop file system
– Running and monitoring Hadoop jobs
– Getting to know Hive and HBase
– The Pig tool
– Creating workflows
– Using libraries such as Impala, Mahout and Storm
– Introduction to Spark
– Visualizing the output of Hadoop
– And …