Debugging a multithreaded server

I was asked about this in an interview and now I'm curious because I don't think the interviewer was satisfied with my answer. Here's the question:

The multithreaded server application stops working and the last log message from the application is:

"Some Server Related Message..."


The code looks like this:

CalledFunc ()
    Code ...

    Acquiring Thread lock
    Line printing "Some Server Related Message..."
    Releasing Thread Lock


  • What should a programmer do to debug this?
  • What went wrong in Func()?
  • If an exception is thrown in Func(), what should be done to fix the problem?


4 answers

Reason #1: Database problem. It may sound strange, but the main reason an application server hangs is often not directly related to the application server itself. The place where the symptom shows up is rarely the place where the underlying cause lives. The following scenario is fairly common:

The database becomes a bottleneck, so queries run slower than usual. Requests that used to take 1 second now take 5 seconds. The average number of concurrent requests slowly increases because of the backlog. The server runs out of threads and the application server hangs. If you can manage to get a thread dump, you will see only a bunch of waiting threads and another group that is actually running. Another possibility is that the waiting threads (or the requests in the queue) use up all available memory and eventually cause an OutOfMemory error.
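The arithmetic behind this scenario follows from Little's law: the average number of requests in flight equals the arrival rate multiplied by the average latency. Below is a minimal sketch; the arrival rate of 40 requests per second, the pool of 100 threads, and the function names are all assumptions made up for illustration, not figures from the question.

```cpp
#include <cstddef>

// Little's law: average number of requests in flight
// = arrival rate (requests/second) * average latency (seconds).
double requests_in_flight(double arrivals_per_second, double latency_seconds) {
    return arrivals_per_second * latency_seconds;
}

// True when steady-state demand exceeds the number of worker threads,
// which is the point where requests start to queue and the server
// appears to hang.
bool pool_exhausted(double arrivals_per_second, double latency_seconds,
                    std::size_t pool_size) {
    return requests_in_flight(arrivals_per_second, latency_seconds) >
           static_cast<double>(pool_size);
}
```

With these assumed numbers, a 1-second latency keeps about 40 threads busy, while a 5-second latency needs about 200, which no longer fits in a 100-thread pool.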

Reason #2: Deadlocks. If the app server seems to be doing nothing, look for deadlocks. It could be database locks that cause your SQL queries or update statements to hang. For example, a transaction log that is written to the database for every request can easily hang the entire application if the log table is locked. Also check shared resources, such as an operating system file that is written from multiple threads at the same time.

Reason #3: Runaway threads. In cases where the application server really is to blame, you should look for a runaway thread. These are difficult to spot because they barely appear in the logs: log entries are usually written only after the request has completed. A runaway thread probably won't return until it has affected the whole application, so the offending request will not be logged. These "runaway threads" typically involve endless loops or code that consumes too much heap memory, resulting in an out-of-memory condition. For example, a query whose results page has no paging suddenly has to display a very large number of results. The page takes forever to render and chokes the application server, causing it to hang.



It is likely that:

  • Func() tries to acquire the lock again (easy to check), or
  • Func() threw an exception while the lock was held (more likely and more subtle)


  • Check the code of Func() to verify that the lock is released on every possible path (exceptions included)
  • One of the two options above
  • Release the lock before throwing the exception, or catch the exception in CalledFunc() and release the lock
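The exception case and the catch-and-release fix from the last bullet can be sketched as follows. This is only an illustration under assumptions: `TrackedLock`, its `held` flag, and the always-throwing `Func()` are hypothetical stand-ins, and the flag exists purely so a single-threaded demo can observe whether the lock was left held.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical wrapper around std::mutex that records whether it is held.
// The flag is demo instrumentation only; it is not a thread-safe query.
struct TrackedLock {
    std::mutex m;
    bool held = false;
    void lock()   { m.lock(); held = true; }
    void unlock() { held = false; m.unlock(); }
};

TrackedLock g_thread_lock;  // stands in for the "Thread lock" in the question

// Hypothetical Func(): always throws, to model the failing path.
void Func() { throw std::runtime_error("failure inside Func()"); }

// Manual lock/unlock with no cleanup on the exception path.
void CalledFuncLeaky() {
    g_thread_lock.lock();
    Func();                  // throws, so the unlock below is skipped
    g_thread_lock.unlock();
}

// Catch the exception in CalledFunc(), release the lock, then rethrow.
void CalledFuncFixed() {
    g_thread_lock.lock();
    try {
        Func();
    } catch (...) {
        g_thread_lock.unlock();
        throw;
    }
    g_thread_lock.unlock();
}
```

After `CalledFuncLeaky()` throws, the lock stays held, so every later caller blocks on it. After `CalledFuncFixed()` throws, the lock is free again and the exception still reaches the caller.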


To fix the problem with the exception in Func(), you can use a scoped lock. RAII is a good way to stay exception-safe and to prevent leaks in general. The linked page also uses a mutex as its example.
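A minimal sketch of the RAII approach, assuming the server is written in C++ and the lock is a `std::mutex` (the question does not say which language or lock type is used). `g_server_lock` and the always-throwing `Func()` are hypothetical names for illustration.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex g_server_lock;  // assumed lock type; the question does not name one

// Hypothetical Func(): always throws, to model the failing path.
void Func() { throw std::runtime_error("failure inside Func()"); }

void CalledFunc() {
    // The guard locks the mutex here and unlocks it in its destructor,
    // which also runs while the stack unwinds after an exception.
    std::lock_guard<std::mutex> guard(g_server_lock);
    Func();
}
```

As long as the exception is caught somewhere up the call stack, the guard's destructor releases the mutex, so no explicit unlock or catch block is needed in `CalledFunc()`.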

Also, seeing this line as the last entry in the log does not mean that the problem originates in that part of the code.



I think they are looking for this:

What must a programmer do to debug this?

Get a dump of the hung process and then use WinDbg to figure out the reason; for example, if it is a deadlock, that will be obvious from the dump.

What went wrong in Func()?

From what is asked in the next question, we can assume that either it threw an exception at some point, with the result that the lock was never released, or it tried to acquire the same lock again, causing a deadlock.
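The second possibility, acquiring a lock that the caller already holds, can be illustrated without actually hanging the program. The sketch below uses a hand-rolled, non-reentrant lock written only for this example (`NonReentrantLock` and both function names are hypothetical); it reports failure at the point where a blocking acquire would wait forever.

```cpp
#include <atomic>

// Minimal non-reentrant lock, written only for this illustration.
class NonReentrantLock {
public:
    // Returns true if the lock was acquired, false if it is already held.
    bool try_acquire() {
        return !held_.exchange(true, std::memory_order_acquire);
    }
    void release() { held_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> held_{false};
};

NonReentrantLock g_demo_lock;

// Hypothetical Func() that needs the same lock as its caller.
bool FuncTriesToLockAgain() {
    if (!g_demo_lock.try_acquire()) {
        return false;  // a blocking acquire would wait forever here
    }
    g_demo_lock.release();
    return true;
}

// Holds the lock while calling the function above, as CalledFunc() might.
bool CalledFuncHoldingLock() {
    if (!g_demo_lock.try_acquire()) {
        return false;
    }
    bool inner_acquired = FuncTriesToLockAgain();
    g_demo_lock.release();
    return inner_acquired;
}
```

Called on its own, `FuncTriesToLockAgain()` succeeds; called through `CalledFuncHoldingLock()`, its inner acquire fails because the caller still holds the lock.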

If an exception is thrown in Func(), what should be done to fix the problem?

Use RAII for exception safety and better, cleaner code.


