Spark SQL in real time on Hive

I am trying to work out how to use Spark SQL with Hive for real-time analysis. I know that Hive was built for batch processing and that Spark can run queries quickly.

But will using Spark SQL on top of Hive let me run real-time queries? Or will it just make the queries faster, without being real time? Should I use another datastore instead of Hive, such as HBase?

Thanks in advance, Florian


1 answer


While Spark can be a lot faster than Hive, it is still probably not an ideal solution for serving a website. Whether Spark SQL can execute queries "in real time" therefore depends largely on what latency you consider real time, whether your dataset is small enough to be cached in memory, and whether your queries can take advantage of partitioning.
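
To make the caching and partitioning points a bit more concrete, here is a rough sketch of how you might measure the latency yourself. It is only an illustration under assumptions: the table name `events` and the partition column `event_date` are hypothetical, as are the helper names, and it assumes a PySpark installation (Spark 2.0 or later) that can reach your Hive metastore. The numbers you get will depend entirely on your data and cluster.

```python
import time


def partition_filtered_query(table, partition_col, value):
    """Build a query that filters on the partition column.

    Filtering on the column the Hive table is partitioned by lets Spark
    skip partitions that cannot match. Names are interpolated directly,
    so only pass trusted values (this is a sketch, not production code).
    """
    return f"SELECT COUNT(*) AS n FROM {table} WHERE {partition_col} = '{value}'"


def timed_cached_query(spark, table, query):
    """Cache `table` in memory, then time `query` against the cached data."""
    # CACHE TABLE is eager unless LAZY is specified, so the (slow) load
    # happens here and is not counted in the measured time below.
    spark.sql(f"CACHE TABLE {table}")
    start = time.perf_counter()
    rows = spark.sql(query).collect()
    return rows, time.perf_counter() - start


def demo():
    """Run against a real cluster; requires pyspark and a Hive metastore."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("latency-check")
        .enableHiveSupport()
        .getOrCreate()
    )
    # `events` and `event_date` are hypothetical; substitute your own table.
    query = partition_filtered_query("events", "event_date", "2015-06-01")
    rows, seconds = timed_cached_query(spark, "events", query)
    print(rows, f"{seconds:.3f}s")
```

Calling `demo()` on your own cluster gives you a number to compare against whatever latency you consider "real time". If the table does not fit in memory, or the query cannot filter on a partition column, expect the result to be considerably slower.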


