Some open source projects and software components that serve the Big Data landscape:

HDFS: a distributed file system that manages data files across a cluster.

MapReduce: a framework for processing data stored in HDFS. Although it is technically a part of Hadoop, its capabilities warrant a separate mention.

HBase: the Hadoop database, which provides read/write access to data managed in large, loosely structured tables.

Hive: a framework with a SQL-like query language for querying Hadoop data.

Pig: a framework for processing data in Hadoop using MapReduce.

Oozie: an application for chaining MapReduce jobs into a workflow.

Sqoop: a tool for importing data from an RDBMS into Hadoop.

Hue: the Hadoop User Experience project, which provides a web UI for Hadoop.

Avro: a data serialization system.

Thrift: a simple, straightforward interface definition language used to define and create services for numerous languages.
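
To make the MapReduce entry in the list above a little more concrete, here is a minimal word-count sketch in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. It simply mimics, in a single process, the map, group-by-key, and reduce steps that a MapReduce framework would distribute across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, imitating the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Combine all counts emitted for one word into a single total."""
    return word, sum(counts)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

In a real cluster the input lines would come from files in HDFS and each phase would run in parallel on many machines; this sketch only shows how the work is split between the phases.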



Let us bring in the sunshine

Customized Big Data Training!

We are proud to provide customized training for you and your organization. Our trainers are software consultants working in the retail, finance, and health industries. We offer a range of topics and case studies from which you can build a training package tailored to your needs.

Example Trainer Profile: Gaurav Gupta