Fixed connection failure error: java.net.ConnectException - localhost

Question

I am running Hadoop and Spark on the same machine (Ubuntu 14.04). The jps command gives me the output below:

hduser@ubuntu:~$ jps
4370 HRegionServer
6568 Jps
5555 RunJar
3744 TaskTracker
5341 RunJar
4120 HQuorumPeer
5790 SparkSubmit
3308 DataNode
4203 HMaster
3469 SecondaryNameNode
3079 NameNode
3587 JobTracker

I created a simple CSV file in HDFS. Here are the file details:

hduser@ubuntu:~$ hadoop fs -ls /user/hduser/file_for_spark/spark1.csv
Warning: $HADOOP_HOME is deprecated.

Found 1 items
-rw-r--r--   1 hduser supergroup        174 2015-04-16 08:14 /user/hduser/file_for_spark/spark1.csv

But when I try to access the file from Spark, I get a connection error: java.net.ConnectException: Connection refused

      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.3.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_21)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
SQL context available as sqlContext.

scala> val pagecount=sc.textFile("hdfs://localhost:3079/user/hduser/file_for_spark/spark1.csv")
pagecount: org.apache.spark.rdd.RDD[String] = hdfs://localhost:3079/user/hduser/file_for_spark/spark1.csv MapPartitionsRDD[1] at textFile at <console>:21

scala> pagecount.count()
java.net.ConnectException: Call to localhost/127.0.0.1:3079 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
    at org.apache.hadoop.ipc.Client.call(Client.java:1075)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy11.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1511)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1006)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:35)
    at $iwC$$iwC$$iwC.<init>(<console>:37)
    at $iwC$$iwC.<init>(<console>:39)
    at $iwC.<init>(<console>:41)
    at <init>(<console>:43)
    at .<init>(<console>:47)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:656)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:664)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:669)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:996)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:692)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
    at org.apache.hadoop.ipc.Client.call(Client.java:1050)
    ... 73 more

EDIT 1:

@Paul, the command gives the following output:

hduser@ubuntu:~$ lsof -p 3079 -a -i
COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    3079 hduser   65u  IPv4  17260      0t0  TCP *:33736 (LISTEN)
java    3079 hduser   75u  IPv4  17341      0t0  TCP localhost:54310 (LISTEN)
java    3079 hduser   85u  IPv4  22609      0t0  TCP *:50070 (LISTEN)
java    3079 hduser   89u  IPv4  59998      0t0  TCP localhost:54310->localhost:46507 (ESTABLISHED)

Could you please let me know what exactly is happening on port 46507?

hadoop apache-spark

Koushik chandra Apr 16 '15 at 15:53

2 answers

Good that you found the answer yourself! :)

3079 is the PID (process ID) of the NameNode, not a port. More details on the jps output can be found in the jps documentation. By the way, you can check which ports the NameNode is listening on by passing its PID to a command like:

lsof -p 3079 -a -i
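
As a rough illustration of the point that the first jps column is a PID, the sketch below pulls the PID for a given daemon name out of jps-style text. The helper name pid_of_daemon is hypothetical, and the sample lines are copied from the question rather than taken from a live system.

```shell
# Print the PID (first column) of the jps line whose second column
# matches the given daemon name. Reads jps-style output from stdin.
# pid_of_daemon is a hypothetical helper, not part of the JDK or Hadoop.
pid_of_daemon() {
    awk -v name="$1" '$2 == name { print $1; exit }'
}

# Sample jps lines from the question, used here instead of a live jps call.
namenode_pid=$(pid_of_daemon NameNode <<'EOF'
3308 DataNode
3469 SecondaryNameNode
3079 NameNode
3587 JobTracker
EOF
)

echo "NameNode PID: $namenode_pid"
```

On a machine where the daemons are actually running, the same helper could presumably feed the lsof command shown above, for example lsof -p "$(jps | pid_of_daemon NameNode)" -a -i.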

In the command output:

java    3079 hduser   89u  IPv4  59998      0t0  TCP localhost:54310->localhost:46507 (ESTABLISHED)

This line indicates that another process has established a TCP connection to the NameNode. The NameNode's end of the connection is port 54310, and 46507 is the ephemeral port on the client's side. The client could be the resource manager (RM), a DataNode, etc.
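
One way to read such lsof lines is sketched below: the endpoint to the left of "->" belongs to the process being inspected, and the endpoint to the right belongs to its peer. The helper name describe_sockets is hypothetical, and the input lines are the ones posted in the question.

```shell
# Summarise "lsof -i" lines read from stdin:
#   (LISTEN)       -> the port the inspected process is listening on
#   (ESTABLISHED)  -> its local port and the peer's port
# describe_sockets is a hypothetical helper written for this example.
describe_sockets() {
    awk '
        $NF == "(LISTEN)" {
            n = split($(NF-1), a, ":")
            print "listening on port " a[n]
        }
        $NF == "(ESTABLISHED)" {
            split($(NF-1), ends, "->")      # ends[1] = local, ends[2] = peer
            nl = split(ends[1], loc, ":")
            nr = split(ends[2], rem, ":")
            print "local port " loc[nl] " <- peer port " rem[nr]
        }
    '
}

describe_sockets <<'EOF'
java    3079 hduser   75u  IPv4  17341      0t0  TCP localhost:54310 (LISTEN)
java    3079 hduser   89u  IPv4  59998      0t0  TCP localhost:54310->localhost:46507 (ESTABLISHED)
EOF
# prints:
#   listening on port 54310
#   local port 54310 <- peer port 46507
```

So for PID 3079, port 54310 is the NameNode's own RPC port, while 46507 is just the temporary source port picked by whichever client connected to it.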

Paul H. Apr 16 '15 at 17:49

Koushik chandra · Accepted Answer · 2015-04-16T16:14:31+0000

Sorry... the NameNode actually listens on a different port, which I found in core-site.xml. It works for me after switching to port 54310.

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
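
To avoid guessing the port, the URI can be read from the configuration instead. The following is only a minimal sketch: get_hadoop_property is a hypothetical helper, it assumes the name and value tags sit on separate lines as in the snippet above, and it runs against a temporary sample file rather than the real core-site.xml.

```shell
# Print the <value> that follows the given <name> in a Hadoop-style XML file.
# $1 = property name, $2 = path to the config file.
# get_hadoop_property is a hypothetical helper, not a Hadoop command.
get_hadoop_property() {
    awk -v key="$1" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$2"
}

# Temporary sample file standing in for the real core-site.xml.
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
<configuration>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
</property>
</configuration>
EOF

namenode_uri=$(get_hadoop_property fs.default.name "$sample_conf")
rm -f "$sample_conf"

# Build the full path to pass to sc.textFile(...) in the Spark shell.
echo "${namenode_uri}/user/hduser/file_for_spark/spark1.csv"
# prints: hdfs://localhost:54310/user/hduser/file_for_spark/spark1.csv
```

The printed URI is the one that worked in the Spark shell here, in place of hdfs://localhost:3079/... from the question.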

Can someone please tell me what 3079 in the jps output means here?
