[SPARK-51011][CORE] Add logging for whether a task is going to be interrupted when killed

### What changes were proposed in this pull request?

We now log the value of `interruptThread` when a `TaskRunner`'s `kill` method is called. This should help when debugging potential zombie Spark tasks that do not seem to be exiting.

### Why are the changes needed?

Today, it's tricky to debug why a task is not exiting (and thus, why executors might be getting lost) without knowing for sure if it was issued a Java interrupt.
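
To illustrate why the flag matters, here is a minimal, self-contained sketch that does not use Spark at all. `SketchRunner`, `InterruptSketch`, and the task name and reason strings are hypothetical stand-ins, not Spark's `TaskRunner`. It models a task stuck in a blocking call: the thread exits promptly only when the kill request also issues a Java interrupt, which is the kind of case the new log field is meant to make visible.

```scala
// Hypothetical stand-in for a task runner; this is NOT Spark's TaskRunner.
final class SketchRunner(name: String, body: () => Unit) {
  private val thread = new Thread(() => body(), name)
  thread.setDaemon(true) // do not keep the JVM alive if the task never exits

  def start(): Unit = thread.start()

  // Plain-string approximation of the fields carried by the new log line.
  def killMessage(interruptThread: Boolean, reason: String): String =
    s"Executor is trying to kill $name, " +
      s"interruptThread: $interruptThread, reason: $reason"

  def kill(interruptThread: Boolean, reason: String): Unit = {
    println(killMessage(interruptThread, reason))
    // Only issue a Java interrupt when the caller asked for one.
    if (interruptThread) thread.interrupt()
  }

  // Waits up to waitMs for the thread to finish, then reports whether it is still running.
  def isAliveAfter(waitMs: Long): Boolean = {
    thread.join(waitMs)
    thread.isAlive
  }

  // Unconditional interrupt, used only to clean up the sketch.
  def forceStop(): Unit = thread.interrupt()
}

object InterruptSketch {
  // A body that blocks and never polls a kill flag: it can only exit on interrupt.
  private def blockingBody(): Unit =
    try Thread.sleep(60000L)
    catch { case _: InterruptedException => () }

  // Returns true if the task is still running shortly after being killed.
  def stillAliveAfterKill(interruptThread: Boolean): Boolean = {
    val runner = new SketchRunner("sketch-task", () => blockingBody())
    runner.start()
    runner.kill(interruptThread, "sketch reason")
    val alive = runner.isAliveAfter(200L)
    runner.forceStop()
    alive
  }
}
```

In this sketch, a kill with `interruptThread = false` leaves the blocked thread running, which is the "zombie" shape described above. Seeing the flag's value in the log lets you distinguish that case from a task that was interrupted but ignored or swallowed the interrupt.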

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Ran `org.apache.spark.executor.ExecutorSuite` and verified the log looked as expected.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49699 from neilramaswamy/spark-51011.

Lead-authored-by: Neil Ramaswamy <[email protected]>
Co-authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
neilramaswamy and HyukjinKwon committed Feb 3, 2025
1 parent 8fc6a20 commit f03f45e
Showing 2 changed files with 4 additions and 2 deletions.
@@ -331,6 +331,7 @@ private[spark] object LogKeys {
   case object INPUT extends LogKey
   case object INPUT_SPLIT extends LogKey
   case object INTEGRAL extends LogKey
+  case object INTERRUPT_THREAD extends LogKey
   case object INTERVAL extends LogKey
   case object INVALID_PARAMS extends LogKey
   case object ISOLATION_LEVEL extends LogKey
5 changes: 3 additions & 2 deletions core/src/main/scala/org/apache/spark/executor/Executor.scala
@@ -504,8 +504,9 @@ private[spark] class Executor(
     @volatile var task: Task[Any] = _
 
     def kill(interruptThread: Boolean, reason: String): Unit = {
-      logInfo(log"Executor is trying to kill ${LogMDC(TASK_NAME, taskName)}," +
-        log" reason: ${LogMDC(REASON, reason)}")
+      logInfo(log"Executor is trying to kill ${LogMDC(TASK_NAME, taskName)}, " +
+        log"interruptThread: ${LogMDC(INTERRUPT_THREAD, interruptThread)}, " +
+        log"reason: ${LogMDC(REASON, reason)}")
       reasonIfKilled = Some(reason)
       if (task != null) {
         synchronized {
