Industries use Hadoop extensively to analyze their data sets. The reason is that the Hadoop framework is based on a simple programming model, MapReduce, and it enables a computing solution that is scalable, flexible, fault-tolerant, and cost-effective. Here, the main concern is maintaining speed when processing large datasets, in terms of both the waiting time between queries and the waiting time to run a program.

Spark was introduced by the Apache Software Foundation to speed up Hadoop's computational process.

Against a common belief, Spark is not a modified version of Hadoop, nor is it really dependent on Hadoop, because it has its own cluster management. Hadoop is just one of the ways to deploy Spark.

Spark can use Hadoop in two ways: one is storage and the second is processing. Since Spark has its own cluster management for computation, it typically uses Hadoop for storage purposes only.

Apache Spark

Apache Spark is a lightning-fast cluster computing technology, designed for fast computation. It is based on the Hadoop MapReduce idea and extends the MapReduce model to efficiently support more types of computations, including interactive queries and stream processing.
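To make the Hadoop-for-storage-only point concrete, here is a minimal sketch of the classic MapReduce word count expressed in Spark's Scala API. The master URL (`local[*]`) and the HDFS path (`hdfs://namenode:9000/data/input.txt`) are hypothetical placeholders; in a real cluster you would point them at your own resource manager and data.

```scala
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    // Spark manages the computation itself; local[*] is a placeholder
    // master for running on a single machine.
    val spark = SparkSession.builder()
      .appName("WordCount")
      .master("local[*]")
      .getOrCreate()

    // Read a text file from HDFS (hypothetical path): Spark does the
    // processing, while Hadoop supplies only the storage layer.
    val lines = spark.sparkContext.textFile("hdfs://namenode:9000/data/input.txt")

    // The map and reduce phases of classic MapReduce, written as
    // Spark transformations over an in-memory RDD.
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.take(10).foreach(println)
    spark.stop()
  }
}
```

Because the intermediate results stay in memory as RDDs rather than being written back to disk between the map and reduce phases, the same pattern extends naturally to the interactive queries and stream processing mentioned above.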