Project author: Oprishri

Project description:
Hands-on Hadoop practical assignments, run on a single-node Hadoop cluster.
Primary language: Python
Repository URL: git://github.com/Oprishri/Hadoop.git
Created: 2020-10-11T11:05:21Z
Project community: https://github.com/Oprishri/Hadoop

License:



Hadoop Practicals

  • Hadoop is a framework for distributed storage and processing.
  • Core components of Hadoop include HDFS for storage, YARN for cluster-resource
    management, and MapReduce or Spark for processing.
  • The Hadoop ecosystem includes multiple components that support each stage of
    big data processing:

    • Flume and Sqoop ingest data
    • HDFS and HBase store data
    • Spark and MapReduce process data
    • Pig, Hive, and Impala analyze data
    • Hue and Cloudera Search help to explore data
    • Oozie manages the workflow of Hadoop tasks
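
To make the MapReduce processing stage more concrete, here is a minimal sketch of the map, shuffle, and reduce steps applied to a word count. It runs in plain Python in a single process and does not use Hadoop itself; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are illustrative assumptions, not code from this repository.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key and sort the keys, mimicking shuffle-and-sort."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Sum the counts collected for each word, as a reducer would."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Chain the three phases in one process (hypothetical helper)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Illustrative input only; a real job would read its input from HDFS.
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
```

On a real cluster the mapper and reducer would typically run as separate tasks on different nodes, with the framework performing the shuffle between them; the sketch only shows how the data flows from one phase to the next.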


Table of Contents

  1. On a single-node Hadoop cluster
  • Word Count Practical
  • Trending Word Count
  • Database Join
  • Vector Multiplication
  • Matrix Multiplication