Are there any use cases where Hadoop MapReduce can perform better than Apache Spark?
I agree that iterative and interactive programming paradigms work better with Spark than with MapReduce. I also agree that we can use HDFS, or any Hadoop datastore such as HBase, as the storage tier for Spark.
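For context, a minimal sketch of what "HDFS as the storage tier for Spark" looks like in practice; the namenode address and path below are hypothetical placeholders, not part of the question:

```scala
// Assumes a spark-shell session, so `sc` (the SparkContext) is already defined.
// The namenode address and path are placeholders for illustration only.
val logs = sc.textFile("hdfs://namenode:8020/data/access-logs")

// The data lives in HDFS; Spark is only the computation layer on top of it.
println(s"records read from HDFS: ${logs.count()}")
```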
So my question is: are there any real-world use cases showing that Hadoop MapReduce is better than Apache Spark in these contexts? Here "better" is meant in terms of performance, throughput, and latency. Is Hadoop MR still better for handling BATCH workloads than Spark?
If so, can anyone list the advantages of Hadoop MR over Apache Spark?
Please keep the entire scope of the discussion to the COMPUTATION LAYER.
As you said, for iterative and interactive workloads Spark is better than Hadoop. But Spark has a huge memory requirement: if memory is not sufficient, it will easily throw an OOM exception, whereas Hadoop handles that situation very well because it has a good failover mechanism.
Secondly, if data skew ("data tilt") occurs, Spark may also collapse. I am comparing Spark and Hadoop from the standpoint of system reliability, because that decides whether the job succeeds.
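As an aside (again my addition, not from the answer), data skew on the Spark side is often mitigated by "salting" the hot keys before aggregating; a rough spark-shell sketch, where the skewed pair RDD and the salt factor of 10 are made up for illustration:

```scala
import scala.util.Random

// Hypothetical skewed pair RDD of (key, count); one "hot" key dominates.
val pairs = sc.parallelize(Seq.fill(100000)(("hotKey", 1)) ++ Seq(("rareKey", 1)))

// Step 1: salt the keys so the hot key is spread across several reducers.
val salted  = pairs.map { case (k, v) => (s"${Random.nextInt(10)}_$k", v) }
val partial = salted.reduceByKey(_ + _)

// Step 2: strip the salt and combine the partial sums per original key.
val totals = partial
  .map { case (saltedKey, v) => (saltedKey.split("_", 2)(1), v) }
  .reduceByKey(_ + _)

totals.collect().foreach(println)
```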
I have recently been benchmarking Spark and Hadoop performance, and according to the results, Spark's performance is no better than Hadoop's under some workloads, for example k-means and PageRank. Perhaps memory is the limiting factor for Spark.