PySpark: show a DataFrame as a horizontally scrolling table in an IPython notebook
A pyspark.sql.DataFrame displays messily with DataFrame.show(): long lines wrap instead of scrolling horizontally.
The same data displays properly with pandas.DataFrame.head, though.
I have tried these options:
import IPython
IPython.auto_scroll_threshold = 9999
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
from IPython.display import display
but no luck. Scrolling does work, however, when the notebook is used in the Atom editor with the Jupyter plugin.
3 answers
I created the small function below, and it works great:
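from IPython.display import HTML

def printDf(sprkDF):
    # toPandas() collects every row to the driver, so keep the input small
    newdf = sprkDF.toPandas()
    # Render the pandas DataFrame as an HTML table in the notebook
    return HTML(newdf.to_html())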
You can use it directly on your Spark queries if you like, or on any Spark DataFrame:
printDf(spark.sql('''
select * from employee
'''))
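If the rendered table still wraps rather than scrolls, one option is to wrap the generated HTML in a container that scrolls horizontally. The sketch below is an assumption, not part of the original answer: the helper name scrollable and its CSS are hypothetical, and notebook front ends may differ in how they honor inline styles.

```python
def scrollable(table_html, max_width="100%"):
    """Wrap an HTML table string in a div that scrolls horizontally.

    `scrollable` is a hypothetical helper, not a Spark or IPython API.
    """
    # overflow-x: auto adds a horizontal scrollbar when the table is too wide;
    # white-space: nowrap keeps cell contents from wrapping onto new lines.
    style = f"overflow-x: auto; white-space: nowrap; max-width: {max_width};"
    return f'<div style="{style}">{table_html}</div>'
```

In a notebook it could be combined with the function above, for example HTML(scrollable(sprkDF.limit(100).toPandas().to_html())), where limit(100) is only there to keep the collected data small.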
This is now possible natively as of Spark 2.4.0 by setting the parameter spark.sql.repl.eagerEval.enabled to True.
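A minimal sketch of how that setting might be applied when building the session. The helper build_session and the dictionary name are illustrative assumptions. The two extra keys are related Spark SQL options, included here with what I understand to be their default values.

```python
# Eager-evaluation options for notebook REPLs (available since Spark 2.4.0).
EAGER_EVAL_CONF = {
    "spark.sql.repl.eagerEval.enabled": "true",   # render DataFrames as HTML tables
    "spark.sql.repl.eagerEval.maxNumRows": "20",  # rows shown per DataFrame
    "spark.sql.repl.eagerEval.truncate": "20",    # max characters shown per cell
}

def build_session(conf=EAGER_EVAL_CONF):
    """Hypothetical helper: create (or reuse) a SparkSession with the options above."""
    from pyspark.sql import SparkSession  # imported lazily; requires pyspark

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With such a session, leaving a DataFrame as the last expression in a notebook cell (for example just df, without calling show()) should render it as an HTML table.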