Spark Streaming example does not work for me: NetworkWordCount (maybe no data is being streamed)

  1. Started the master and a worker.
  2. Opened a console and ran

nc -lk 9999

  3. Ran the network word count example

./bin/run-example streaming.NetworkWordCount localhost 9999

  4. Typed

"Hello world Hello"

in the netcat console.

  5. The console where I ran the program did not show any computed data (maybe the stream never received the data). Only when I stopped the program did it print the counts:

(Hello,2)
(world,1)

+3




3 answers


I faced the same issue and spent last weekend getting this simple streaming example to work. In the end I was able to run the NetworkWordCount program successfully. I am using Spark 1.5.2 on Ubuntu 14.

There are several ways to solve this problem; use any one of them:

  • Update the NetworkWordCount.scala code (in examples/src/main/scala/org/apache/spark/examples/streaming/) and add setMaster("local[2]") when creating the SparkConf, as follows. At least two local threads are needed because the socket receiver occupies one of them, leaving the other to process the received data.

    new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")

The drawback of this approach is that you have to recompile the modified code for the change to take effect, which can be another hurdle for people who have just started learning Spark and only want to run this simple example. For them, the option below is easier.



  1. The simplest solution is to set the MASTER variable to local[2], like this:

    a. Go to the conf directory of your SPARK_HOME.

    b. Create spark-env.sh from the provided template:

     cp spark-env.sh.template spark-env.sh

    c. Open spark-env.sh and add the following setting to it:

     MASTER=local[2]

  2. Now open the first terminal and run the netcat utility

    nc -lk 9999

  3. Open a second terminal and run the NetworkWordCount program

    ./bin/run-example streaming.NetworkWordCount localhost 9999

The program will now print a continuous stream of word counts, one batch at a time:

-------------------------------------------
Time: 1450077999000 ms
-------------------------------------------
(are,12)
(am,6)
(how,6)
(rashmit,6)
(apache,6)
(hello,5)
(spark,5)
(you,12)
(i,6)
(sparkhello,1)
...

-------------------------------------------
Time: 1450078000000 ms
-------------------------------------------
(are,2)
(am,1)
(how,1)
(rashmit,1)
(apache,1)
(hello,1)
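
The spark-env.sh steps above can also be scripted. The sketch below is one way to do it and is not something shipped with Spark: `set_local_master` is a hypothetical helper name, and it assumes the conf directory contains the `spark-env.sh.template` file mentioned above. It appends the `MASTER` line only when none is present, so running it twice is harmless.

```shell
# Sketch: make the examples use two local threads via conf/spark-env.sh.
# set_local_master is a hypothetical helper name, not part of Spark.
set_local_master() {
  local conf_dir="$1"        # e.g. "$SPARK_HOME/conf"
  local threads="${2:-2}"    # default 2: one thread for the receiver, one for processing
  local env_file="$conf_dir/spark-env.sh"

  # Create spark-env.sh from the shipped template only if it does not exist yet.
  if [ ! -f "$env_file" ]; then
    cp "$conf_dir/spark-env.sh.template" "$env_file" || return 1
  fi

  # Append the setting once; an existing MASTER= line is left untouched.
  if ! grep -q '^MASTER=' "$env_file"; then
    printf 'MASTER=local[%s]\n' "$threads" >> "$env_file"
  fi
}

# Usage (assuming SPARK_HOME is set):
#   set_local_master "$SPARK_HOME/conf"
```

Because an existing `MASTER=` line is never overwritten, a value you set by hand earlier stays in effect.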

      

+4




This should work:



./bin/run-example --master local[4] streaming.NetworkWordCount localhost 9999

      

0




No code or configuration changes are needed to run this example. I ran into this problem while trying to run the example in a virtual machine; it does not occur when the same example is run directly on the host machine. Thanks to Rashmit Rathod for the hints on this issue.

The solution is to add "--master local[2]" to the command line that runs the example, as follows:

./bin/run-example --master local[2] org.apache.spark.examples.streaming.NetworkWordCount localhost 9999
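
Why a virtual machine behaves differently is not stated here. One plausible explanation, offered as an assumption: the socket receiver occupies one thread for as long as the stream runs, so a local master with a single thread (as could happen on a single-core VM if the default master follows the core count) has nothing left to process the batches. A minimal sketch that picks a thread count of at least 2; `pick_local_master` is a hypothetical helper, not part of Spark:

```shell
# Sketch: choose a local master string with at least two threads.
# pick_local_master is a hypothetical helper name, not part of Spark.
pick_local_master() {
  local cores
  # getconf _NPROCESSORS_ONLN is widely available (Linux, macOS);
  # fall back to 1 if the core count cannot be read.
  cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  case "$cores" in
    ''|*[!0-9]*) cores=1 ;;   # treat any non-numeric answer as unknown
  esac
  # The receiver holds one thread, so never go below 2.
  if [ "$cores" -lt 2 ]; then
    cores=2
  fi
  printf 'local[%s]\n' "$cores"
}

# Usage, following the command above:
#   ./bin/run-example --master "$(pick_local_master)" org.apache.spark.examples.streaming.NetworkWordCount localhost 9999
```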

      

-1

