Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.4.8, 3.3.4
- Fix Version/s: None
- Component/s: None
Description
While running Spark Streaming applications on YARN in cluster mode, a reboot/shutdown of the node hosting the ApplicationMaster causes the application to stop the SparkContext and report a final status of SUCCEEDED.
The reboot/shutdown command sends a graceful stop signal ("kill -15", i.e., SIGTERM) to every process on the machine. In the Spark Streaming code path, this signal makes the application believe it has terminated normally. The log is as follows:
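As a minimal illustration of why the signal is indistinguishable from a clean stop (a standalone sketch, not Spark's actual ApplicationMaster code): the JVM runs shutdown hooks both on normal exit and on SIGTERM, so any status-reporting logic placed in such a hook cannot tell an external "kill -15" apart from an orderly shutdown.

```scala
object ShutdownHookDemo {
  def main(args: Array[String]): Unit = {
    // The JVM runs shutdown hooks for a normal exit AND for SIGTERM (kill -15),
    // so this hook cannot tell whether the process ended cleanly or was killed.
    Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
      override def run(): Unit =
        println("shutdown hook fired: a status reporter here would see a 'normal' stop")
    }))
    Thread.sleep(600000) // simulate a long-running streaming driver; try `kill -15 <pid>`
  }
}
```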
But in most cases a reboot/shutdown happens because of operator error, because other services need to be restarted, or because the operating system itself needs a restart. Is it appropriate for Spark to report a "SUCCEEDED" status in these situations?
This matters especially because many scheduling systems decide whether to restart a Spark job based on its final status, for example restarting only on "FAILED". When Spark Streaming reports "SUCCEEDED" instead, the scheduling system has no way to tell that the application was killed and should be resubmitted (see the sketch below).
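For context, such a scheduler typically polls YARN for the final status along these lines. This is a sketch using the standard Hadoop YarnClient API; the application ID is a placeholder, and the restart policy shown is assumed, not taken from any particular scheduler.

```scala
import org.apache.hadoop.yarn.api.records.{ApplicationId, FinalApplicationStatus}
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

object RestartChecker {
  def main(args: Array[String]): Unit = {
    val yarnClient = YarnClient.createYarnClient()
    yarnClient.init(new YarnConfiguration())
    yarnClient.start()

    // Placeholder application ID; a real scheduler would track this per job.
    val appId = ApplicationId.fromString("application_1700000000000_0001")
    val status = yarnClient.getApplicationReport(appId).getFinalApplicationStatus

    // A streaming app killed by a node reboot reports SUCCEEDED, so this
    // restart-on-FAILED policy never fires for it, which is exactly the problem.
    if (status == FinalApplicationStatus.FAILED) {
      println(s"$appId failed; resubmitting...")
    }
    yarnClient.stop()
  }
}
```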
Moreover, a Spark Streaming application is expected to run indefinitely, so a "SUCCEEDED" final status is ambiguous in any case.