Hadoop


Also found in: Dictionary, Thesaurus, Wikipedia.

Hadoop

An open source big data framework from the Apache Software Foundation designed to handle huge amounts of data on clusters of servers. The storage is handled by the Hadoop Distributed File System (HDFS), and the data are sorted and summarized in parallel by Hadoop MapReduce, a version of Google's MapReduce. Required Java files are included in Hadoop Common, and Hadoop YARN provides the cluster management.

Originally written for the Nutch Web crawler for spidering the Web, in 2008, Yahoo!'s Search Webmap was the first very large implementation of Hadoop running on 10,000 Linux servers. Search Webmap ran in a third less time than Yahoo!'s previous search engine.

The Hadoop name comes from a favorite stuffed elephant of the son of the developer Doug Cutting. See Google File System, MapReduce and Spark.
References in periodicals archive ?
WiseGuyReports.Com Publish a New Market Research Report On -" Hadoop Big Data Analytics Market 2019 : Discovers the Opportunities,Trends,Risk,Simulation,Management to 2023".
With the slowdown in world economic growth, the Hadoop Big Data Analytics industry has also suffered a certain impact, but still maintained a relatively optimistic growth, the past four years, Hadoop Big Data Analytics market size to maintain the average annual growth rate of 31.46% from 3310 million $ in 2015 to 7520 million $ in 2018, BisReport analysts believe that in the next few years, Hadoop Big Data Analytics market size will be further expanded, we expect that by 2023, The market size of the Hadoop Big Data Analytics will reach 38300 million $.
In this paper, we suggest a solution to improve Hadoop's system performance by optimizing its heartbeat periods.
Next-generation data technology initiatives explored in the survey include data lakes, machine learning, Hadoop, Spark, object storage, and the platforms and environments that are supporting them.
The open source Hadoop platform is widely used for such analytics, however as many servers are required to raise processing speed, issues existed in terms of equipment and management costs.
It is composed of the Hadoop ecosystem enterprise version, a large-scale parallel processing database, and a big data cloud service.
Then even at fairly large scales of data it's rare that people are using Hadoop as a substitute for a parallel database.
The Hadoop framework is an open source implementation of the MapReduce model and has been widely adopted due to its remarkable features such as high scalability, fault-tolerance, and computational parallelization [8, 9].
Section 2 gives the survey of related work in the area of image processing and Hadoop. Hadoop, as one of the leading platforms is described in Section 3.
Aqui es donde Hadoop (Hadoop, 2015), un framework opensource basado en MapReduce (Dean y Ghemawat, 2008) es de gran importancia y utilidad.
The announcement was made at Strata + Hadoop World, a deep-immersion event where data scientists, innovators, and executives get up to speed on emerging techniques and technologies.
Continuum Analytics has unveiled advancements in Anaconda bringing high performance advanced analytics to Hadoop. In addition to providing Python and R packages alongside Hadoop clusters, Anaconda will include a distributed processing framework for Hadoop that interacts directly with HDFS and YARN.