Inconsistent connector state: ConnectException: The task already exists on this worker

I am using Confluent Platform 3.2, running three Kafka Connect workers on three separate EC2 instances.

I had a connector (a Debezium MySQL source) that I deleted and re-created a few minutes later. The new connector failed to start with the error below: the connector itself reports RUNNING, but its task is in a FAILED state. I had to restart the workers to fix the problem.

Is this some kind of caching issue? How can I resolve it without restarting the workers? Any help is appreciated.

    {
      "name": "debezium-connector",
      "connector": {
        "state": "RUNNING",
        "worker_id": "xx.xx.xx.xxx:8083"
      },
      "tasks": [
        {
          "state": "FAILED",
          "trace": "org.apache.kafka.connect.errors.ConnectException: Task already exists in this worker: debezium-connector-0\n\tat org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:308)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:834)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1500(DistributedHerder.java:101)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:848)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:844)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\n",
          "id": 0,
          "worker_id": "xx.xx.xx.xxx:8083"
        }
      ]
    }
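
For reference, these are the Connect REST calls I would normally use to recover a failed task without restarting the workers (xx.xx.xx.xxx stands for a worker address, masked as above); a sketch only:

    # Check connector and task status (this is what produced the JSON above)
    curl -s http://xx.xx.xx.xxx:8083/connectors/debezium-connector/status

    # Restart only the failed task (id 0 in the status output)
    curl -s -X POST http://xx.xx.xx.xxx:8083/connectors/debezium-connector/tasks/0/restart

    # Or restart the connector as a whole
    curl -s -X POST http://xx.xx.xx.xxx:8083/connectors/debezium-connector/restart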

1 answer


Hmm. I had the same error and found that one of the Kafka brokers had run out of disk space, so the Kafka cluster was not functioning properly. I don't know all the details, but Kafka Connect stores its connector and task state in internal Kafka topics (config, offset, and status topics), and if the brokers are unhealthy, a stale entry for the old task can apparently linger and block the new one.
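
If you want to see what Connect still has recorded, you can read its internal status topic directly; a rough sketch (the topic name is whatever status.storage.topic is set to in your worker config, connect-status is just a common default, and broker:9092 is a placeholder):

    # Dump Connect's internal status topic to look for stale task entries.
    kafka-console-consumer --bootstrap-server broker:9092 \
        --topic connect-status --from-beginning \
        --property print.key=true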

Sharing in case it helps anyone else.



EDIT:

I have since noticed that this happens to my Kafka nodes from time to time, bringing the entire cluster into an unusable state. Restarting the problematic node fixes the problem.
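
A simple check of broker disk usage catches this before the cluster degrades; a sketch (the data path comes from log.dirs in server.properties, /var/lib/kafka is just an assumption):

    # Free space on the Kafka data volume (path from log.dirs)
    df -h /var/lib/kafka

    # On newer brokers (Kafka 1.0+) you can also query log-dir sizes remotely:
    kafka-log-dirs --bootstrap-server broker:9092 --describe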
