Friday, February 8, 2013

Hadoop Ecosystem

 

The diagram above shows the Hadoop ecosystem. It is a combination of different technologies, each of which does a different type of work.
The Hadoop ecosystem consists of:

      Name            Purpose

    1. Hive         (Data Warehouse)
    2. Pig          (Text Mining)
    3. HBase        (Random Read/Write Operations)
    4. Sqoop        (Export and Import)
    5. Flume        (Streaming Data)
    6. Oozie        (Scheduler and Workflow Design)
    7. ZooKeeper    (State Maintenance)


1. Hive:
  • Hive is a data warehouse in the Hadoop environment.
  • It is used to process structured and semi-structured data, and it can also handle unstructured data.
  • Unstructured data is processed by first converting it into structured data.
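
The conversion step mentioned above can be sketched in a few lines of Python. This is only an illustration, not how Hive itself is implemented: the log format, the `LOG_PATTERN` regex, and the function names are assumptions made up for this example. The idea is that raw text lines are turned into tab-separated rows, which a Hive table could then read as ordinary columns.

```python
import re

# Hypothetical log format, assumed only for this illustration:
#   2013-02-08 10:15:32 ERROR Disk quota exceeded on node7
LOG_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.*)$")


def to_structured_row(line):
    """Return a tab-separated row (date, time, level, message),
    or None if the line does not match the assumed format."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    return "\t".join(match.groups())


def convert(lines):
    """Convert raw lines to rows, dropping lines that cannot be parsed."""
    rows = (to_structured_row(line) for line in lines)
    return [row for row in rows if row is not None]
```

Each returned row has four fields, so it would line up with a four-column table (date, time, level, message) stored as tab-delimited text.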


2. Pig:
  • Pig is used for text analytics (mining).
  • Pig can process XML and JSON data.
  • Even structured data that is difficult to process with Hive (HQL) can be processed with Pig.
  • Additional functionality that is not built into Pig can be added with UDFs (User Defined Functions).
  • Pig UDFs can be written in the following languages: Java, Python (Jython), Ruby (JRuby), JavaScript, and Groovy.
  • Hive also supports UDFs. Native Hive UDFs are written in Java; scripts in other languages such as Python, Ruby, or R can be plugged in through Hive's TRANSFORM (streaming) clause.
  • When you run a Pig script, the framework automatically compiles it into Java MapReduce jobs and submits them to the cluster, where they run in JVMs.
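
As a rough sketch of what a Pig UDF in Python can look like, here is a small word-counting function. It assumes the script is registered as a streaming Python UDF, where Pig's `pig_util` module provides the `outputSchema` decorator; the function name `count_words` and the schema name `word_count` are made up for this example. The fallback decorator is there only so the file can be run and tested outside Pig.

```python
try:
    # Available when Pig runs this file as a streaming Python UDF (assumption).
    from pig_util import outputSchema
except ImportError:
    # Fallback so the file can also be imported and tested outside Pig.
    def outputSchema(schema):
        def decorator(func):
            return func
        return decorator


@outputSchema("word_count:int")
def count_words(text):
    """Return the number of whitespace-separated words in a chararray.
    A null input (None) is treated as zero words."""
    if text is None:
        return 0
    return len(text.split())
```

In a Pig script, such a file would typically be registered with a statement along the lines of `REGISTER 'udfs.py' USING streaming_python AS myfuncs;` (the file name and alias are hypothetical) and then called as `myfuncs.count_words(line)` inside a `FOREACH ... GENERATE`.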






