Are there any use cases where Hadoop MapReduce performs better than Apache Spark?

I agree that iterative and interactive programming paradigms are handled much better by Spark than by MapReduce. I also agree that we can use HDFS, or any Hadoop datastore such as HBase, as the storage tier for Spark.

So my question is: are there any real-world use cases showing that Hadoop MapReduce is better than Apache Spark in these contexts? Here "better" is used in terms of performance, throughput, and latency. Is Hadoop MapReduce still better than Spark for handling BATCH workloads?

If so, can anyone tell us the advantages of Hadoop MapReduce over Apache Spark? Please keep the entire discussion focused on the COMPUTATION LAYER.



1 answer


As you said, for iterative and interactive programming, Spark is better than Hadoop. But Spark has a huge memory requirement: if memory is not sufficient, it will easily throw an OOM exception, whereas Hadoop handles that situation well because it has a good failover mechanism.
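One practical mitigation, for illustration: Spark's executor memory and its split between execution and storage are tunable, and cached data can be allowed to spill to disk instead of causing an OOM. The values below are hypothetical tuning choices, not recommendations from this answer:

```
# spark-defaults.conf (illustrative values only)
spark.executor.memory         8g
spark.executor.memoryOverhead 1g
spark.memory.fraction         0.6
spark.memory.storageFraction  0.5
```

In addition, persisting RDDs with `StorageLevel.MEMORY_AND_DISK` (rather than the memory-only default) lets Spark evict partitions to disk under memory pressure instead of failing the job.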

Secondly, if data skew occurs, the Spark job may also fail. I am comparing Spark and Hadoop in terms of system reliability, because that decides the success of a job.
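For context, a common workaround for data skew (in either engine) is key salting: spreading a hot key across several artificial sub-keys so that no single partition or reducer receives all of its records. The sketch below is a minimal, engine-agnostic illustration in plain Python; the key names and salt count are made up for the example.

```python
import random
from collections import Counter

def salt_key(key: str, num_salts: int = 4) -> str:
    # Append a random suffix so one hot key spreads across several buckets.
    return f"{key}_{random.randrange(num_salts)}"

# 100 records share one hot key; unsalted, they would all land in one partition.
records = [("hot", 1)] * 100 + [("cold", 1)] * 5

buckets = Counter(salt_key(k) for k, v in records if k == "hot")

# The hot key is now spread over up to 4 salted buckets instead of 1.
assert sum(buckets.values()) == 100
assert set(buckets) <= {f"hot_{i}" for i in range(4)}
```

After aggregating per salted key, a second aggregation pass strips the suffix and combines the partial results into the final value for the original key.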



I have been benchmarking Spark and Hadoop performance lately. According to the results, Spark's performance is no better than Hadoop's under some workloads, for example k-means and PageRank. Perhaps memory is the limiting factor for Spark.


