Is it the driver or the workers that read a text file when using sc.textFile?
The driver looks at the file's metadata: it checks that the file exists, checks which files are in the directory (if it is a directory), and checks their sizes. It then sends tasks to the workers, which actually read the file's contents. The communication is essentially "you read this file, starting at this offset, for this length."
HDFS splits large files into blocks, and Spark will (usually / often) split tasks to line up with those blocks, so seeking to a given offset is efficient.
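A quick way to see this in practice (a minimal sketch, assuming a spark-shell session where sc is already defined; the HDFS path is hypothetical):

```scala
// The driver only plans the splits here; no file contents are read yet.
val rdd = sc.textFile("hdfs:///data/large-file.txt")

// Typically one partition (and hence one worker task) per HDFS block.
println(s"partitions: ${rdd.partitions.length}")

// Only when an action runs do the workers open the file at their
// assigned offsets and read their splits.
println(s"lines: ${rdd.count()}")
```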
Other file systems tend to work similarly, though not always. Compression can also interfere with this process if the codec is not splittable.
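The non-splittable case is easy to observe: gzip, for example, cannot be decompressed starting from an arbitrary offset, so the whole file ends up in a single partition (a sketch with made-up file names):

```scala
// gzip is not a splittable codec: the whole file becomes a single
// partition, read end-to-end by one task.
val gzipped = sc.textFile("hdfs:///data/logs.txt.gz")
println(gzipped.partitions.length)  // 1

// The same data uncompressed can be split into one partition per block.
val plain = sc.textFile("hdfs:///data/logs.txt")
println(plain.partitions.length)    // usually > 1 for a large file
```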
textFile creates an RDD, as the documentation states:
Text file RDDs can be created using SparkContext's textFile method.
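For reference, a minimal usage sketch (again assuming a spark-shell session and a hypothetical path); textFile also accepts an optional minPartitions argument to hint how many splits to create:

```scala
// Lazily plans the read; nothing is fetched until an action runs.
val lines = sc.textFile("hdfs:///data/input.txt")

// An optional second argument suggests a minimum number of partitions.
val moreSplits = sc.textFile("hdfs:///data/input.txt", minPartitions = 8)
```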
There is also this note:
If using a path on the local filesystem, the file must also be accessible at the same path on worker nodes. Either copy the file to all workers or use a network-mounted shared file system.
which implies that your assumption that the driver parses the file and then propagates the data to the workers is incorrect.
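To illustrate the note, a hedged sketch (the paths are made up): with a file:// URI, every worker opens the path on its own local filesystem, whereas with an HDFS path the blocks are already distributed:

```scala
// file:// path: each worker resolves this on its *own* filesystem,
// so the file must exist at the same path on every node
// (copied there, or on an NFS-style shared mount).
val localLines = sc.textFile("file:///shared/data.txt")

// hdfs:// path: workers fetch their assigned blocks from the
// distributed filesystem; no per-node copy is needed.
val hdfsLines = sc.textFile("hdfs:///data/data.txt")
```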